Thanks for your answer. I've thought of the possibility of an index when using a SQLite database, but as you mentioned, I don't really benefit from it in regard of rather small data sets. What I am considering as a problem when using csv files is the occurrence of data redundancy: When I wan't to print a table in a document, it is often not necessary to insert all columns of a table. You might choose to leave out certain parameters that are aren't of interest for the reader. I feared that the generation of a second table with less information would be unavoidable in this case before including it in LaTeX with Sweave. However, after having tried to get started with sqldf, this powerful addition to R's data analysis and manipulation abilities is probably sufficient for my purposes.
Best regards, Juliet > RSQLite is actually much more than a "driver for data import". It > basically allows any manipulation of SQLite databases, including > reading, writing, or updating and munging data within the database. > > > Regarding the original question of data import: > > I use csv and sqlite files interchangeably to store my data, and > converting between one and the other is generally trivial (read one > format into a data.frame and then write it into the other). > > For me, the key determinant is whether a given data set is so big that > reading it into a data.frame with read.csv() every time I need it is a > pain. I usually keep all my original data in csv files and write > routines to write the contents of huge files into sqlite databases. > > sqldf is a great package as well, but it definitely does not obviate the > need to actually be able to manipulate data base files. For one thing, > you cannot index a csv file or a data.frame. If you have to repeatedly > select subsets of your large data set, creating an index on the relevant > column in the sqlite table is an absolute life saver. > > (However, with a 1000x20 data set, you will probably not have to worry > about the speed of selecting from the table. Unless you need to use > bigger data sets, the simplest way is probably to just use csv files, > read the contents into a data.frame with read.csv, and then use sqldf on > the data.frame if you need to do complicated subsetting) > > Best, > Magnus ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.