Tuesday, April 9, 2013

A Beginner's Observations with STATA

This year, I have been occupied with a real estate finance project through the Undergraduate Research Opportunities Program here at the University of Michigan. Below are some observations about STATA that may be of use for some people.


1. Macros are useful, but rather peculiar. As a first-time user, I found it very difficult to tell when I should use "`variable'" and when I should use `variable'. From what I can see, any time the macro is meant to stand for a string, it should be enclosed in double quotes as "`variable'", whereas if it is supposed to be a variable name or something of that sort, it should be written as `variable'.
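A minimal sketch of that distinction, assuming Stata's bundled auto dataset is available through sysuse. The macro names depvar and label are made up for illustration.

```stata
* Sketch only: depvar and label are made-up macro names.
sysuse auto, clear

local depvar price
local label "Price in dollars"

* Bare macro: expands to: summarize price
* so the contents are treated as a variable name
summarize `depvar'

* Quoted macro: expands to: display "Price in dollars"
* so the contents are treated as a string
display "`label'"

* Quoted macro holding a name: prints the word price, not any data
display "`depvar'"
```

In each case the macro is replaced by its contents before the command runs, so the question is whether the expanded line still makes sense with or without the double quotes.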

2. STATA is programmable, but not necessarily a 'programming' language. When working in STATA, you don't really have the flexibility that you would have in R or Python. It's difficult to write arbitrary functions, and sometimes I just want to say "int x = 4;", but the nearest equivalents, local macros and scalars, don't feel as natural. Another STATA construct that I had to familiarize myself with was the "foreach" command. It does offer a natural way of iterating over a list, much like looping over an array in other languages, but I often get caught typing "for" by mistake.

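For comparison, here is roughly how an assignment and two kinds of loop could look. This is only a sketch; the macro names x, v and i are illustrative.

```stata
* Sketch only: x, v and i are illustrative macro names.

* Closest thing to "int x = 4;" is a local macro
local x = 4
display `x' + 1

* foreach walks over a list of words
foreach v in price mpg weight {
    display "`v'"
}

* forvalues walks over a numeric range
forvalues i = 1/3 {
    display `i' * `x'
}
```

Note that the loop keywords are foreach and forvalues rather than a plain for, which is part of what trips up habits carried over from other languages.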
3. If only I understood if's, my life would be easier. In STATA, there are two kinds of if's: one is a qualifier that restricts a command to certain observations, and the other is the standard programming if that changes the flow of the program. For example, if I were to say

    count if smsa == 0320

it would count the number of observations whose smsa variable entry was 0320. On the other hand, if I had

    if (`nat') {
        di "Merging with smsa control"
        merge 1:1 smsa control using "tmortg`y'_`d'"
    }
    else {
        di "Merging with control"
        merge 1:1 control using "tmortg`y'_`d'"
    }

the program would merge the files on smsa and control if `nat' is true (nonzero), and on control alone otherwise.
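The two kinds of if can be put side by side in a self-contained sketch. It assumes the bundled auto dataset, and the local nat is set by hand here purely for illustration.

```stata
* Sketch only: nat is set by hand for illustration.
sysuse auto, clear

* if qualifier: the command runs once, restricted to matching observations
count if foreign == 1

* if command: the condition is evaluated once and picks which block runs
local nat = 1
if `nat' {
    display "nat is nonzero, taking the first branch"
}
else {
    display "nat is zero, taking the second branch"
}
```

One pitfall worth knowing: if a dataset variable is used in the if command (for example, if foreign == 1 followed by a block), STATA evaluates it for the first observation only, not row by row. Row-by-row filtering is what the qualifier form is for.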

4. On the above note, it may be useful to put a few display statements in your code. It's helpful to see in the log where the program went, especially when there are flow control issues like these.

5. While writing functions may be hard, it is certainly not impossible. To this end, the guide by Roy Mill was invaluable for me. I was able to more effectively abstract my code and reconcile it with my programming instincts.

6. Be careful with do-files and local variables (local macros, in STATA's terms). After I declared locals in my do-files, I could not access them once I was back in interactive mode, because a local exists only within the do-file or program that defines it. However, I had no problem with those locals while I was working within the do-file.
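A small sketch of the scoping behavior, assuming a hypothetical do-file saved as demo.do. The macro names year and projyear are made up. The second part is written as comments because it is meant to be typed interactively after the do-file has finished.

```stata
* Part 1: contents of a hypothetical do-file, saved as demo.do
local year 2013
global projyear 2013
display "inside the do-file: `year'"

* Part 2: typed interactively after running: do demo
* display "`year'"       shows an empty line, the local no longer exists
* display "$projyear"    shows 2013, because globals persist for the session
```

Globals get around the problem, but they are visible everywhere for the rest of the session, so locals are usually the safer default inside a do-file.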

And for those STATA veterans among you, is there any way to do error handling? In Java, C++, Python, and R, there are try/catch-style constructs (tryCatch in R, try/except in Python) that can keep the program running even if a variable is missing. This would be useful because it would let the program try something and, if that doesn't work, go down a different path.
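A pattern that seems to fit this description is the capture prefix combined with the _rc return code. The sketch below assumes the bundled auto dataset, which has no variable called smsa, so the fallback branch is the one that runs.

```stata
* Sketch only: the auto dataset has no variable called smsa.
sysuse auto, clear

* capture suppresses the error and stores the return code in _rc
capture confirm variable smsa

if _rc == 0 {
    display "smsa found, taking the first path"
}
else {
    display "smsa not found (return code " _rc "), taking the fallback path"
}
```

Any command can follow capture, so the same check on _rc could in principle wrap a merge or a file load instead of confirm variable.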

Any additional advice on STATA would be appreciated. Hope my observations can help some others avoid (too much) frustration.

1 comment:

  1. Ignore errors with the "capture" prefix:

    http://www.stata.com/help.cgi?capture

    Btw, if you already know R, why use STATA?
