Mediation is a commonly used statistical analysis technique, especially in psychological research. A mediator is a mechanism that helps explain the relationship between two variables: a predictor variable (X) influences an outcome variable (Y) through a mediator (M). There is no limit to how many mediator variables can be included, but researchers need a good theoretical reason for including each one. The figure shows this relationship for one mediating variable. An example of a mediated process could be how good grades (X) predict happiness (Y) indirectly, through self-esteem (M). That would mean that the better a person's grades, the more likely they are to have increased self-esteem, which in turn increases happiness.
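To make the grades, self-esteem, and happiness example concrete, here is a minimal sketch in Python that simulates such a process and estimates the paths with ordinary least squares. Everything in it is an assumption for illustration: the data are simulated, the path values (0.5 and 0.7) are made up, and the helper names (ols, simulate_mediation, mediation_estimates) are hypothetical. It uses the common product-of-coefficients idea, where the indirect effect is the X-to-M slope multiplied by the M-to-Y slope, and it is not a full mediation analysis (no confidence intervals or significance tests).

```python
import numpy as np

def ols(y, *predictors):
    """Least-squares coefficients for y on the predictors, intercept first."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def simulate_mediation(n=5000, a=0.5, b=0.7, direct=0.0, seed=0):
    """Simulated data only: grades (X) -> self-esteem (M) -> happiness (Y)."""
    rng = np.random.default_rng(seed)
    grades = rng.normal(size=n)
    self_esteem = a * grades + rng.normal(size=n)
    happiness = b * self_esteem + direct * grades + rng.normal(size=n)
    return grades, self_esteem, happiness

def mediation_estimates(x, m, y):
    """Estimate the mediation paths with three regressions."""
    a_hat = ols(m, x)[1]                  # X -> M
    _, direct_hat, b_hat = ols(y, x, m)   # X -> Y and M -> Y, each given the other
    total_hat = ols(y, x)[1]              # X -> Y ignoring M
    return {
        "a": a_hat,
        "b": b_hat,
        "direct": direct_hat,
        "indirect": a_hat * b_hat,
        "total": total_hat,
    }

x, m, y = simulate_mediation()
est = mediation_estimates(x, m, y)
for name, value in est.items():
    print(f"{name}: {value:.3f}")
```

With these simulated values the estimated indirect effect should land near 0.5 * 0.7 = 0.35 and the direct effect near zero. In linear least-squares models like this one, the total effect equals the direct effect plus the indirect effect.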
For examples and a more in-depth look at mediation, see these articles:
So now that you’ve decided mediation is the statistical analysis you want to use, there are a few things you must consider. First and foremost, mediation assumes causation. That means we have to believe that X causes Y through M. In other words, you need to know that your predictor variable actually causes the outcome through the mediator, rather than being simply correlated with the outcome.
How can we be sure? Here are three criteria that must be met to show causation:
Covariation. This is how two variables change or vary together. If we examine the relationship between two variables, as one variable increases for some people, the covariation tells us how much we should expect the other variable to change for those same people on average. This is the easiest condition to meet since three common research methods all cover this requirement. Cross-sectional data collection analyzed using linear regression (collecting data once), longitudinal studies (following people over time and collecting data at multiple points), and experimental manipulation (collecting data just once but randomly assigning participants to different conditions to test your theory) all show correlation.
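As a rough illustration of covariation, the sketch below computes the covariance, the correlation, and the regression slope between two simulated variables. The data and the true slope of 0.5 are assumptions made up for this example, and the variable names simply reuse the grades and self-esteem example from above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Simulated data only: self-esteem rises by 0.5 per unit of grades, plus noise.
grades = rng.normal(size=n)
self_esteem = 0.5 * grades + rng.normal(size=n)

# Covariance: how the two variables vary together, in their original units.
cov_xm = np.cov(grades, self_esteem)[0, 1]

# Correlation: the same information rescaled to lie between -1 and 1.
corr_xm = np.corrcoef(grades, self_esteem)[0, 1]

# Regression slope: expected change in self-esteem per one-unit change in grades.
slope = cov_xm / np.var(grades, ddof=1)

print(f"covariance:  {cov_xm:.3f}")
print(f"correlation: {corr_xm:.3f}")
print(f"slope:       {slope:.3f}")
```

The slope is the covariance divided by the variance of the predictor, which is why covariation tells us how much change to expect in one variable, on average, as the other changes. On its own it says nothing about which variable comes first or whether one causes the other.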
Temporal ordering. This is a fancy way of saying we need to be sure we know what comes first: the cause should come before the effect. Longitudinal studies do this well, because if something that had not occurred the first time we collected data has occurred by the second time we collect data on the same people, we know something about the order in which it happens. Experimental manipulation does this too, because if we see a change in the outcome variable for one group but not the other, we can assume it was because of our manipulation.
Elimination of competing explanations. This is the trickiest condition, and it is usually met only in a true experimental manipulation. But even experimentally manipulating our X variable does not always remove this problem in mediation, because we’re also assuming our mediator (M) causes the outcome (Y), and M is typically measured rather than manipulated. There is always the potential for something called spurious association, where another variable we don’t know about causes change in both X and Y.
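Spurious association is easy to demonstrate with simulated data. In the sketch below, a hypothetical hidden variable Z drives both X and Y, while X has no effect on Y at all. The coefficients (0.8) and the helper name slope_of_x are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Simulated data only: Z is a hidden common cause of both X and Y.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)   # X depends on Z
y = 0.8 * z + rng.normal(size=n)   # Y depends on Z, and not on X

def slope_of_x(y, x, *controls):
    """Least-squares slope of y on x, optionally adjusting for control variables."""
    design = np.column_stack([np.ones(len(y)), x, *controls])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

naive = slope_of_x(y, x)         # ignores Z: X appears to predict Y
adjusted = slope_of_x(y, x, z)   # controls for Z: the apparent effect vanishes

print(f"slope of X ignoring Z:      {naive:.3f}")
print(f"slope of X adjusting for Z: {adjusted:.3f}")
```

The naive slope is clearly positive even though X never enters the equation for Y, and it shrinks toward zero once Z is included. In real data the catch is that we can only adjust for variables we thought to measure, which is why random assignment is so valuable.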
Can I still use mediation analysis if I don’t meet all three of the conditions above? Yes, but keep in mind that causation is one of the main assumptions of mediation, so you should have good theory to support how you ordered your variables. Often it takes an entire field working together to establish cause, with each study bringing us one step closer to meeting all of the requirements.
MacKinnon, D. P., Fairchild, A. J., & Fritz, M. S. (2007). Mediation analysis. Annual Review of Psychology, 58, 593-614. https://doi.org/10.1146/annurev.psych.58.110405.085542