Michael Hoffman wrote:
> Talbot Katz wrote:
>
>> I hope you'll indulge an ignorant outsider. I work at a financial
>> software firm, and the tool I currently use for my research is R, a
>> software environment for statistical computing and graphics. R is
>> designed with matrix manipulation in mind, and it's very easy to do
>> regression and time series modeling, and to plot the results and test
>> hypotheses. The kinds of functionality we rely on the most are standard
>> and robust versions of regression and principal component / factor
>> analysis, bayesian methods such as Gibbs sampling and shrinkage, and
>> optimization by linear, quadratic, newtonian / nonlinear, and genetic
>> programming; frequently used graphics include QQ plots and histograms.
>> In R, these procedures are all available as functions (some of them are
>> in auxiliary libraries that don't come with the standard distribution,
>> but are easily downloaded from a central repository).
>
> I use both R and Python for my work. I think R is probably better for
> most of the stuff you are mentioning. I do any sort of heavy
> lifting--database queries/tabulation/aggregation in Python and load the
> resulting data frames into R for analysis and graphics.

I would second that. It is not either/or: use Python, including NumPy,
matplotlib and packages from SciPy, for some things, and R for others.
You can even embed R in Python using RPy; see
http://rpy.sourceforge.net/

We use the combination of Python, NumPy (actually the older Numeric
Python package, but soon to be converted to NumPy), RPy and R in our
NetEpi Analysis project, which does exploratory epidemiological analysis
of large data sets; see http://sourceforge.net/projects/netepi

It is a good combination: Python for the Web interface, data
manipulation and data heavy lifting, and for some of the more elementary
statistics, and R for more involved statistical analysis and graphics
(with the option of using matplotlib or other Python-based graphics
packages for some tasks if we wish).

The main thing to remember, though, is that indexing is zero-based in
Python and 1-based in R...

Tim C
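
P.S. To make the indexing caveat concrete, here is a minimal sketch in plain Python. The helper names (r_index_to_python, python_index_to_r, r_range_to_slice) are made up for illustration and are not part of RPy or any other package. Note also that an R range such as x[2:3] includes both ends, while a Python slice excludes its stop value.

```python
def r_index_to_python(i):
    """Convert a 1-based R index to a 0-based Python index (hypothetical helper)."""
    if i < 1:
        raise ValueError("R indices start at 1")
    return i - 1


def python_index_to_r(i):
    """Convert a 0-based Python index to a 1-based R index (hypothetical helper)."""
    if i < 0:
        raise ValueError("negative Python indices have no direct R equivalent")
    return i + 1


def r_range_to_slice(start, end):
    """Convert an inclusive R range start:end to the equivalent Python slice."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # R includes both ends; Python excludes the stop, so only start shifts.
    return slice(start - 1, end)


values = [10.5, 11.2, 9.8, 12.1]

# What R calls values[2] is values[1] in Python.
second = values[r_index_to_python(2)]

# What R calls values[2:3] is values[1:3] in Python.
middle = values[r_range_to_slice(2, 3)]
```

The same off-by-one adjustment applies to row and column positions when handing positions (rather than names) across the Python/R boundary.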