A suggestion on where to get extensive fundamental data cheaply. This is a response to Mark's Nov 23rd message.
Consider data from the American Association of Individual Investors’ (AAII) Stock Investor Pro (SIP) software. I’ve had a lifetime membership in the AAII for many years. For the additional, but more than reasonable, price of $198/yr, one can license SIP. What makes this source valuable is that it is survivorship-bias-free historical data. Subscribers have access to the old software and data as it was when it was distributed, going back to 2003. The data include balance sheet, income statement, cash flow, price, and many calculated fields. The list of fields <https://www.aaii.com/files/sipro/Stock%20Investor%20Pro%20Field%20List.pdf> runs to 22 pages. In 2003, over 8,500 companies were covered. For info on SIP, check out the AAII <http://www.aaii.com> webpage and this presentation <http://www.aaii.com/files/presentations/2011/20%20Joe%20Lan%20-%20Introduction%20to%20Stock%20Investor%20Pro.pdf>.

I downloaded about 150 install files from the AAII archives <http://www.aaii.com/stock-investor-pro/archives> page, access to which requires membership ($29) and a subscription. I installed them one by one, putting each into its own directory. I downloaded the month-end updates, though weekly data was sometimes available. I watched an entire season of Friends while doing this and probably lost three IQ points. Each install includes about 7 years of annual data and 8 quarters of quarterly data.

The AAII data files are in a FoxPro/DBF format. Fortunately, R has the read.dbf <https://stat.ethz.ch/R-manual/R-devel/library/foreign/html/read.dbf.html> function in the foreign package to handle this.

Let me emphasize that this data is (almost) free of survivorship and look-ahead biases. You are getting the data as SIP released it back in the day. So companies that were around in 2003 but no longer exist are still in the data set. Each release contains only data that was available at the time of the release, so there is no look-ahead problem. I added "almost" to cover 2 caveats.
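Before getting to those caveats, here is a minimal sketch of the read.dbf step mentioned above. It is not the exact code I ran. It assumes one subdirectory per install, named by release date (e.g. "2003-01-31"); that layout, the helper name read_snapshots, the SNAPSHOT column, and the file name "demo.dbf" are all made up for illustration and are not SIP's real names. The demo writes two tiny DBF files to a temp directory so the sketch runs without any SIP data.

```r
library(foreign)  # provides read.dbf() and write.dbf()

# Read the same DBF table from every install directory under `root` and
# stack the results, tagging each row with the release it came from.
# ASSUMPTION: directory names are release dates in YYYY-MM-DD form.
read_snapshots <- function(root, dbf_name) {
  dirs <- list.dirs(root, recursive = FALSE)
  out <- lapply(dirs, function(d) {
    path <- file.path(d, dbf_name)
    if (!file.exists(path)) return(NULL)   # skip installs lacking this table
    df <- read.dbf(path, as.is = TRUE)     # as.is = TRUE keeps strings as character
    df$SNAPSHOT <- as.Date(basename(d))    # release date taken from the directory name
    df
  })
  do.call(rbind, Filter(Negate(is.null), out))
}

# Demo on fabricated data (hypothetical tickers and values, not SIP data).
root <- file.path(tempdir(), "sip_demo")
for (d in c("2003-01-31", "2003-02-28")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.dbf(data.frame(TICKER = c("AAA", "BBB"), PE = c(12.5, 30.1),
                       stringsAsFactors = FALSE),
            file.path(root, d, "demo.dbf"))
}

snap <- read_snapshots(root, "demo.dbf")
print(snap)
```

Keeping the release date on every row makes it possible to restrict any later calculation to what a given release actually contained. Note that rbind() needs identical columns in every release; if the field list differs between installs, it is probably simpler to keep the per-install data frames in a list instead of stacking them.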
The first caveat is survivorship: if you use the first install (end of 2002) to figure out which companies had P/Es less than X in 2001, you've got a survivorship-bias problem, because that install only covers companies that were still around at the end of 2002. The second is timing: the SIP data is available pretty much at month end, but you won't be able to trade at month end. If you assume that you can, you have a look-ahead bias. Weekly data is available beginning in 2005.

I hope this helps. If you use SIP and find data errors, I'd like to know about them.

_______________________________________________
R-SIG-Finance@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.