Published on May 1st, 2013
Simultaneous Equations – An Introduction
By Tor G. Jakobsen
Many models in social science are simultaneous in nature. This is especially true for economics, where supply and demand more often than not affect each other. In other words, if you suspect that your dependent variable (Y) influences one or more independent variables (X), then you might want to consider simultaneous equations.
It is all about the classical question: Which came first, the chicken or the egg? This is a common problem in econometrics: demand influences supply, but supply also influences demand.
If we were to perform an ordinary least squares (OLS) regression on the demand for chicken, we would breach one of the assumptions of OLS regression, namely that the X-variables must not be correlated with the error term. Using OLS directly on structural equations therefore gives us biased coefficient estimates.
One solution is to employ simultaneous equations, that is, a system of equations. We can assume there is dual causality regarding supply and demand for chicken. Notice that Y1 and Y2 are on the right-hand side of each other’s equations.
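A small simulation can make the problem visible. The sketch below is not from the original article: it assumes a hypothetical two-equation system with made-up coefficients (a1, a2, a3, b1, b2), in which quantity (Y1) depends on price (Y2) and price in turn depends on quantity, and then runs OLS on the first equation alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical structural coefficients (illustrative values only)
a1, a2, a3 = -0.5, 1.0, 0.5   # Y1 = a1*Y2 + a2*X1 + a3*X2 + e1
b1, b2 = 0.8, 1.0             # Y2 = b1*Y1 + b2*X3 + e2

# Exogenous variables and error terms, all standard normal
x1, x2, x3 = rng.normal(size=(3, n))
e1, e2 = rng.normal(size=(2, n))

# Solve the two equations jointly so that both hold for every observation
d = 1.0 - a1 * b1
y1 = (a2 * x1 + a3 * x2 + a1 * b2 * x3 + e1 + a1 * e2) / d
y2 = b1 * y1 + b2 * x3 + e2

# OLS on the first equation alone, treating Y2 as if it were exogenous
X = np.column_stack([np.ones(n), y2, x1, x2])
coef, *_ = np.linalg.lstsq(X, y1, rcond=None)
ols_a1 = coef[1]

# Y2 is correlated with the error term e1, which is what breaks OLS
corr_y2_e1 = np.corrcoef(y2, e1)[0, 1]

print("true a1:", a1)
print("OLS estimate of a1:", round(ols_a1, 3))
print("corr(Y2, e1):", round(corr_y2_e1, 3))
```

Under these assumed values the OLS estimate of the coefficient on Y2 ends up far from the value used to generate the data, even with a large sample, because Y2 carries part of e1 with it.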
Y1 = Amount of chicken available
Y2 = Price of chicken
X1 = Income of the consumer
X2 = Price of beef
X3 = Price of chicken feed
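As a sketch, assuming linear relationships, a system of this kind could be written as follows. The coefficient symbols and the assignment of X1 and X2 to the first equation and X3 to the second are assumptions based on the variable definitions above, not a reproduction of the original equations; e1 and e2 are the error terms.

```latex
\begin{aligned}
Y_1 &= \alpha_0 + \alpha_1 Y_2 + \alpha_2 X_1 + \alpha_3 X_2 + e_1 \\
Y_2 &= \beta_0 + \beta_1 Y_1 + \beta_2 X_3 + e_2
\end{aligned}
```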
The top equation characterizes the behavior of consumers of the product, while the one below is the behavior of suppliers. Researchers must view the entire system in order to see all feedback loops involved.
By employing simultaneous equations economists can find the equilibrium condition (Q) for the pricing of products. The graph illustrates an increasing (positive) demand function and a declining (negative) production function. Here we can find the correct price of chicken.
A practical example of two stage least squares (2SLS)
In its simplest form we can perform such analyses using statistical software like SPSS; more complex structural models require more advanced programs like AMOS, LISREL, or Stata.
Our first step is to find an instrumental variable, that is, one that correlates strongly with Y2, yet is not correlated with e1.
A variable which is correlated with the error term in a regression equation is called an endogenous variable. A variable which is not correlated with the error term is called an exogenous variable. One can test for endogeneity by regressing the suspected variable on the exogenous variables (including the instrument), saving the residuals, and including them in the original model to see if they influence the dependent variable. If they do, then we can assume that the suspected variable is correlated with the error term.
Take the example of how education and skills influence a person’s salary. We know that skills influence salary. But skills also influence education. And we often do not have a variable for skills; thus, education will be correlated with the error term.
Here we can use a person’s father’s education as an instrument for education, as it is correlated with education but not (necessarily) with skills.
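The residual-based test described above can be sketched in Python. Everything here is illustrative: the data are simulated from a hypothetical version of the chicken system with made-up coefficients, X3 (price of chicken feed) is assumed to be a valid instrument for Y2, and `ols` is a helper written for this sketch only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical structural coefficients (illustrative values only)
a1, a2, a3 = -0.5, 1.0, 0.5   # Y1 = a1*Y2 + a2*X1 + a3*X2 + e1
b1, b2 = 0.8, 1.0             # Y2 = b1*Y1 + b2*X3 + e2

x1, x2, x3 = rng.normal(size=(3, n))
e1, e2 = rng.normal(size=(2, n))
d = 1.0 - a1 * b1
y1 = (a2 * x1 + a3 * x2 + a1 * b2 * x3 + e1 + a1 * e2) / d
y2 = b1 * y1 + b2 * x3 + e2


def ols(X, y):
    """Return OLS coefficients, their t-statistics, and the residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, coef / se, resid


const = np.ones(n)

# Step 1: regress the suspected variable Y2 on all exogenous variables
# (including the instrument X3) and save the residuals
_, _, v_hat = ols(np.column_stack([const, x1, x2, x3]), y2)

# Step 2: add the saved residuals to the original equation for Y1
coef, t_stats, _ = ols(np.column_stack([const, y2, x1, x2, v_hat]), y1)
resid_t = t_stats[-1]

print("t-statistic on saved residuals:", round(resid_t, 1))
print("coefficient on Y2 with residuals included:", round(coef[1], 3))
```

A large t-statistic on the saved residuals suggests that Y2 is correlated with e1. In this simulated setup the coefficient on Y2 in the second regression also moves back toward the value used to generate the data.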
The output can be interpreted roughly similarly to a regular regression model, and we have now hopefully removed any correlation between education and the error term.
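To make the two stages concrete, here is a minimal sketch using simulated data rather than a real survey. The data-generating values (a true return to education of 0.10, the effect of unobserved skills, the sample size) are assumptions chosen for illustration, father's education is assumed to be unrelated to skills, and `fit` is a helper defined for this sketch only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Simulated data; every number below is an illustrative assumption
skills = rng.normal(size=n)                    # unobserved in practice
father_educ = rng.normal(12.0, 3.0, size=n)    # the instrument
educ = 6.0 + 0.5 * father_educ + 1.0 * skills + rng.normal(size=n)
true_return = 0.10
log_wage = (1.0 + true_return * educ + 0.3 * skills
            + rng.normal(0.0, 0.5, size=n))


def fit(X, y):
    """Return OLS coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


const = np.ones(n)

# Naive OLS: skills are omitted, so education is correlated with the error
beta_ols = fit(np.column_stack([const, educ]), log_wage)[1]

# Stage 1: regress education on the instrument and keep the fitted values
Z = np.column_stack([const, father_educ])
educ_hat = Z @ fit(Z, educ)

# Stage 2: regress salary on the fitted values from stage 1
beta_2sls = fit(np.column_stack([const, educ_hat]), log_wage)[1]

print("true return to education:", true_return)
print("OLS estimate:", round(beta_ols, 3))
print("2SLS estimate:", round(beta_2sls, 3))
```

Running the second stage by hand like this gives the 2SLS coefficient, but the standard errors from that second regression are not valid; dedicated 2SLS routines in statistical packages correct them.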
Studenmund, A. H. (2011) Using Econometrics: A Practical Guide. Boston, MA: Pearson.