Thanks so much for the many responses on and off this email list! I think they've helped me reach a decision. I'm going to use R on my current small project for something relatively self-contained, such as some final tests and graphics. I figure this will help me learn some basics about interacting with R without getting too bogged down in setting up a new database system -- which sounds like a job all on its own. After I understand a bit more about R's capabilities, I'll ease into the database part.
Cheers,
Martin

On 8/19/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
>
> Regarding RODBC vs. DBI-based packages (RSQLite, RMySQL, etc.), it's
> my perception, possibly mistaken, that apart from any consideration of
> the R packages themselves, ODBC (which originated in the Windows world)
> is more widely used on Windows than UNIX. Also, ODBC has the problem
> that one must configure it, which puts an extra step into the process.
> Clear documentation on how to do such ODBC configuration may be
> difficult to find.
>
> On the other hand, the RODBC package itself seems to be maintained
> very well and is typically available for new versions of R before the
> DBI-based packages.
>
> On 8/19/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> > Some additional comments on the DBMS front.
> >
> > (a) SPSS is not a DBMS, so it is not clear that you need this. But if
> > you do and are storing valuable data in a DBMS, a lot of further
> > questions come into play, like how you are going to do backups. I'd
> > say PostgreSQL was really only for professional-level administrators.
> > My sysadmins recommend MySQL for most people. We do also run
> > PostgreSQL and they find it a lot trickier to maintain.
> >
> > 'dozens of columns and thousands of rows' is not big. A data frame
> > with 50 columns and 5000 rows would only take 2 MB to store, and R
> > will easily handle 100x that with 4 GB of RAM (and if you have less,
> > get 4 GB). So storing data in .rda (R's save() format) is most likely
> > viable. R's indexing etc. operations make it good at data
> > manipulation, and using a DBMS will involve learning SQL, a
> > non-trivial cost.
> >
> > (b) You have a choice of interfaces to a DBMS, RODBC and the DBI+
> > family, e.g. DBI+RMySQL and DBI+RSQLite. I'm biased, but I find RODBC
> > more intuitive, and many people have reported it to be faster. If all
> > you want is non-permanent storage for manipulation of large data sets,
> > consider also SQLiteDF.
> >
> > On Sat, 18 Aug 2007, Duncan Murdoch wrote:
> >
> > > Martin Brown wrote:
> > >> [I sent this message earlier but apparently should have sent it in
> > >> plain text, as follows.]
> > >>
> > >> Hi there,
> > >>
> > >> I would like some advice, not so much about how to use R, but about
> > >> software that I need to complement R. I've rooted around in the FAQs
> > >> and done a few searches on this mailing list but haven't quite found
> > >> the perspective I need.
> > >>
> > >> I am an experienced data analyst in my field (forest ecology and
> > >> ecological monitoring) but new to R. I am a long-time user of SPSS
> > >> and have gotten pretty handy with it. However, I am frustrated with
> > >> SPSS for several reasons: There's the cost (I'm a freelancer; I pay
> > >> for my software myself); the Windows dependence (I use Kubuntu as my
> > >> usual OS now, and switching back and forth is a pain); the horrible
> > >> inefficiency when I do certain types of file manipulations; and the
> > >> inability to do the kind of publication-quality graphs I want... I've
> > >> usually ended up using a commercial graphing program (another source
> > >> of expense and limitation).
> > >>
> > >> I'd like to switch to using R on Kubuntu, for all those reasons. In
> > >> addition I think the mathematical formality that R encourages might
> > >> be good for me.
> > >>
> > >> However, reviewing the FAQs on the R project web site makes me
> > >> realize that I've been using SPSS as three kinds of software really:
> > >> a DBMS; a statistical analysis package; and a graphing package. It
> > >> looks like moving to R might involve learning three kinds of
> > >> software, not just one. I wonder:
> > >>
> > >> 1) What open-source DBMS works most seamlessly with R? I have seen
> > >> MySQL recommended but wonder if there are alternatives. I sometimes
> > >> need to handle big data files.
> > >> In fact, a lot of my work involves exploratory and descriptive
> > >> analyses of rather large and messy databases from ecological
> > >> monitoring, rather than statistical tests per se. In SPSS the data
> > >> files I have been generating have dozens of columns and thousands of
> > >> rows, often with value and variable labels helpful for documenting
> > >> my work.
> >
> > See above.
> >
> > >
> > > I think you won't find much difference in the R interface between
> > > MySQL, PostgreSQL, or SQLite. The choice should be made based on the
> > > qualities of the database (and I don't know enough about the
> > > differences to give a recommendation).
> > >
> > >> 2) For the purpose of creating publication-quality graphs, do R
> > >> users typically need to go outside of the R system? If so, what
> > >> open-source programs would you all recommend?
> > >>
> > > R is great for this, but you might need to go outside for some
> > > specialized stuff (e.g. medical imaging).
> > >
> > >> 3) Any other software I need to learn that would make my work in R
> > >> more productive? (For example, a code editor.)
> > >
> > > A lot of people are happy with ESS mode in Emacs.
> > >
> > > Duncan Murdoch
> > >
> > > ______________________________________________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> > --
> > Brian D. Ripley, [EMAIL PROTECTED]
> > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford, Tel: +44 1865 272861 (self)
> > 1 South Parks Road, +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK Fax: +44 1865 272595