Stella,

A few brief words of advice:
1. Work through your code a line at a time, making sure that each step gives you what you expect. I think some of your later problems are a result of something early on not being as expected. For example, if read.delim() is in fact not giving you what you expect, stop there before moving on. I suspect some funny character(s) or character encodings might be the problem (a short checking sketch follows at the end of this message).

2. 32-bit Windows can be limiting. With 2 GB of RAM, you're probably not going to be able to work effectively in native R with objects over 200-300 MB, and the error indicates that something (you, or a package you're using) has simply run out of memory. So...

3. Consider more RAM (and preferably 64-bit R). Other solutions might be possible, such as using a database to handle the data transition into R (also sketched at the end of this message). 2.5 million rows by 18 columns is apt to be around 360 MB. Although you can afford one (or a few) copies of this, it doesn't leave you much room for the memory overhead of working with such an object.

Part of the original message is below.

Jay

-------------------------------------------------------------
Date: Mon, 19 Apr 2010 22:07:03 +0200
From: Stella Pachidi <stella.pach...@gmail.com>
To: r-h...@stat.math.ethz.ch
Subject: [R] Huge data sets and RAM problems

Dear all,

....

I am using R 2.10.1 on a laptop with a 32-bit Windows 7 system, 2 GB RAM and a 2 GHz Intel Core Duo CPU.

.....

Finally, another problem arises when I perform association mining on the data set using the package arules: I turn the data frame into a transactions table and then run the apriori algorithm. When I set the support too low (in order to find the rules I need), the vector of rules becomes too big and I run into memory problems such as:

Error: cannot allocate vector of size 923.1 Mb
In addition: Warning messages:
1: In items(x) : Reached total allocation of 153Mb: see help(memory.size)

Could you please help me with how I could allocate more RAM? Or do you think there is a way to process the data by loading them into a document instead of loading everything into RAM? Do you know how I could manage to read my whole data set?

I would really appreciate your help.

Kind regards,
Stella Pachidi

--
John W. Emerson (Jay)
Associate Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay
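P.S. A few sketches to make the advice concrete. These are minimal illustrations under stated assumptions, not tested against your data.

On point 1, checking read.delim() one step at a time; the file name "data.txt" and the "latin1" encoding are placeholders for whatever you actually have:

x <- read.delim("data.txt", stringsAsFactors = FALSE)
dim(x)   # do the row and column counts match what you expect?
str(x)   # did a stray character silently turn a numeric column into character?
head(x)  # eyeball a few rows for garbled text

## If odd characters show up, try declaring the file's encoding
## explicitly (the argument is passed through to read.table):
x <- read.delim("data.txt", stringsAsFactors = FALSE,
                fileEncoding = "latin1")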
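On points 2 and 3, the back-of-the-envelope arithmetic, plus the Windows-only helpers that your warning message itself points to (the size requested below is illustrative):

## 2.5 million rows x 18 columns of doubles, at 8 bytes each:
2.5e6 * 18 * 8 / 1e6       # = 360 MB, before any working copies

memory.size()              # MB currently used by R (Windows only)
memory.limit()             # current allocation cap in MB
memory.limit(size = 2500)  # request a larger cap, if the OS allows it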
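One way to realize the database suggestion in point 3, here with RSQLite; the package choice, table name, and chunk size are my assumptions, not something from this thread. Reading in chunks means no full copy of the data ever has to sit in RAM:

library(RSQLite)
con <- dbConnect(SQLite(), dbname = "stella.db")

hdr   <- names(read.delim("data.txt", nrows = 1))  # just the column names
skip  <- 1      # skip the header line on every subsequent read
first <- TRUE
repeat {
  x <- tryCatch(read.delim("data.txt", header = FALSE, col.names = hdr,
                           skip = skip, nrows = 250000,
                           stringsAsFactors = FALSE),
                error = function(e) NULL)   # read.delim errors at end of file
  if (is.null(x) || nrow(x) == 0) break
  dbWriteTable(con, "mydata", x, append = !first)  # first call creates the table
  first <- FALSE
  skip  <- skip + nrow(x)
}

## Later, pull only the rows/columns a given step actually needs:
sub <- dbGetQuery(con, "SELECT * FROM mydata LIMIT 100000")
dbDisconnect(con)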
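And on the arules error in the quoted message: rather than allocating more RAM, consider bounding the search so the rule set stays small. The thresholds below are purely illustrative, and myDataFrame stands in for your own 18-column data frame (with columns coerced to factors, which the coercion to "transactions" requires):

library(arules)
trans <- as(myDataFrame, "transactions")
rules <- apriori(trans,
                 parameter = list(support    = 0.01,  # raising support shrinks the output
                                  confidence = 0.5,
                                  maxlen     = 4))    # cap the rule length as well
summary(rules)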