Programs offer Stata users more flexibility than loops, which require users to specify in advance the elements or values over which to loop. Programs also let users pass discrete pieces of information as arguments, which Stata stores in numbered local macros (i.e., `1', `2', etc.) that act as buckets. Some programs contain loops. For an example of this, check out the section entitled "A Program to Demean Data" featured in this web resource from the Social Science Computing Cooperative. Overall, programs help users reduce errors by packaging repetitive processes that would normally require dozens or even hundreds of lines of code into a single reusable command.
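To make the "bucket" idea concrete, here is a minimal sketch of how the numbered macros get filled. The program name showmean is hypothetical and chosen only for illustration; it assumes Stata's bundled auto.dta data set.

```stata
sysuse auto, clear
capture program drop showmean      // allow the snippet to be re-run

* Hypothetical demo program: `1' holds the first word typed after the
* command name, `2' holds the second.
program showmean
    summarize `1', meanonly
    display "mean of `1': " r(mean)
    summarize `2', meanonly
    display "mean of `2': " r(mean)
end

* Here `1' becomes price and `2' becomes mpg.
showmean price mpg
```

Calling the program again with different variable names reuses the same lines of code without any edits to the program itself.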
To better demonstrate the utility of programs, I've created a short program using Stata's auto.dta data set. This program, which I've called "make," creates a variable for the mean price of each car make. The do-file then consolidates these variables into one variable containing the mean price of each car make.

Let's walk through this program line by line. In line 1, I declare and name the program. In line 2, I instruct Stata to generate a variable for price called `1'_price and set it equal to missing. Notice the local macro `1' in this line. Stata will replace each instance of `1' with the argument I pass to the program. In line 3, I instruct Stata to replace the `1'_price variable created in line 2 with the value contained in the original price variable, conditional on the presence of argument `1' in the string variable called make. Line 4 generates a new variable for the mean price of the car make (i.e., `1'_mnprice_temp), which Stata calculates using the variable created in lines 2 and 3. Line 5 tells Stata to replace values of `1'_mnprice_temp with zero in rows that don't correspond to the argument (i.e., the car make) on which Stata is currently operating. Line 6 drops the variable `1'_price, and line 7 declares the end of the program.
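For readers following along without the original do-file, the walkthrough above can be sketched roughly as follows. This is a reconstruction from the description, not necessarily the exact code. In particular, using strpos() to test whether the argument appears in make is an assumption; regexm() or another string function would fit the description equally well. The seven lines from "program make" through "end" correspond to lines 1 through 7 of the walkthrough.

```stata
sysuse auto, clear
capture program drop make          // allow the do-file to be re-run

* Create an empty price variable, fill it for matching rows, take its
* mean, zero out non-matching rows, and drop the helper variable.
program make
    generate `1'_price = .
    replace `1'_price = price if strpos(make, "`1'") > 0
    egen `1'_mnprice_temp = mean(`1'_price)
    replace `1'_mnprice_temp = 0 if strpos(make, "`1'") == 0
    drop `1'_price
end

* One call creates Buick_mnprice_temp: the mean Buick price in Buick
* rows and zero everywhere else.
make Buick
```

Because the argument becomes part of a variable name, it has to be a valid name stem: a make stored with a period, such as "Cad.", would need to be passed as Cad.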
Next, I begin passing arguments to the program. Each argument I pass represents a specific car make. Notably, make is a string variable in this data set, and it contains both the make and model of each vehicle, which is why the program checks whether the argument appears within make rather than testing for an exact match. In the last lines of the do-file, I instruct Stata to generate a new variable called make_mnprice that holds the row total across all of the `1'_mnprice_temp variables created by the program. Because each row is nonzero in at most one of these variables, the row total is simply the mean price for that row's make. I then drop all of the variables ending in _temp, since I only want one variable containing the mean price of every make of car (i.e., make_mnprice).
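The calling and consolidation steps can be sketched as below. As before, this is an illustration under assumptions rather than the original do-file: the program body is a reconstruction that uses strpos() for the matching step, it is repeated here only so the snippet runs on its own, and the three makes passed are an arbitrary subset.

```stata
sysuse auto, clear
capture program drop make

* Reconstructed from the walkthrough (an assumption, not the original code).
program make
    generate `1'_price = .
    replace `1'_price = price if strpos(make, "`1'") > 0
    egen `1'_mnprice_temp = mean(`1'_price)
    replace `1'_mnprice_temp = 0 if strpos(make, "`1'") == 0
    drop `1'_price
end

* Pass one car make per call; each call adds one *_mnprice_temp variable.
make Buick
make Datsun
make Toyota

* Each row is nonzero in at most one *_mnprice_temp variable, so the
* row total equals the mean price for that row's make.
egen make_mnprice = rowtotal(*_mnprice_temp)
drop *_temp
```

With only three makes passed, make_mnprice is zero for every other make; passing the remaining makes fills in the rest.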
Finally, I check this variable transformation using the list command.