On Apr 20, 2013, at 2:19 PM, Benjamin Caldwell wrote:

> Dear R helpers
>
> Reproducible example:
>
> #warning - this causes a hard freeze on the machines I've tried it on
> matrix.holder<- matrix(rnorm(150), nrow=30, ncol=5)
>
> Out=
> expand.grid(matrix.holder[,1],matrix.holder[,2],matrix.holder[,3],matrix.holder[,4],
> matrix.holder[,5])
>

On my machine:

object.size(Out)
972014344 bytes

So with proper setup you might be able to work with this on a 4GB machine, but you are not likely to be able to do so on the machine you describe below.

> Problem:
>
> I'm running an analysis that I would like to do using a matrix containing
> all the possible combinations of the elements in a [30,5] matrix. Briefly,
> each possible combination is used to index and subset another matrix. I
> then run some models on the data in the subsetted matrix and then sometimes
> export the model results based on a couple criteria. 24,300,000
> combinations seems to be too big for R on my computer (Intel i5, about 2.5
> GB RAM free, 4 GB total, Rx64 2.15 ) to handle.
>
> Requests:
>
> 1. Can you tell me how I can estimate the amount of memory a matrix will
> require before I create it?

Roughly: (number of columns) * 8 * (number of rows), which for this grid is

5*8*(30^5)   # 5 columns, 8 bytes per double, 30^5 rows

5*8*(30^5)/972014344
[1] 0.9999852

# so my estimate was accurate on a ratio basis to 5 decimal places.

>
> 2. Do you have recommendations for packages that allow the user to send an
> object directly to the hard drive? I guess it would have to be partially
> created in RAM and then dumped to the HD, but the point is that there isn't
> room for whole thing to be created and then written in pieces to the HD
> (which even I think I could do). And then of course if it was written as
> one big piece to the HD, I would need to be able to read it in piece by
> piece.
>
> 3. I also see packages out there to connect R to C. Anyone have ideas for
> one designed or containing functions designed for this type of problem?

Are you saying you have facility with C programming? (And you really have not described the problem. Perhaps a redesign of the solution could accommodate your limited computing resources.)

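A minimal sketch of both points, assuming the grid is always built from the columns of a numeric matrix as in the example above. The names estimate_grid_bytes() and grid_rows() are hypothetical helpers written for illustration, not functions from base R or any package. The first applies the size estimate before anything is allocated; the second is one possible redesign that rebuilds any block of rows of the expand.grid() result from the row numbers alone, so the full 24,300,000-row object never has to exist in RAM.

```r
# Hypothetical helper: approximate bytes needed for the expand.grid() result
# of a numeric matrix with n_levels rows and n_vars columns
# (n_levels^n_vars rows, n_vars columns, 8 bytes per double).
estimate_grid_bytes <- function(n_levels, n_vars, bytes_per_cell = 8) {
  n_vars * bytes_per_cell * n_levels^n_vars
}

estimate_grid_bytes(30, 5)   # 972,000,000 bytes, a little under 1 GB

# Hypothetical helper: return rows `idx` of
# expand.grid(m[,1], m[,2], ..., m[,k]) without building the full grid.
# expand.grid() varies its first argument fastest, so the zero-based row
# number can be read as a base-nrow(m) number whose j-th digit selects
# the element taken from column j.
grid_rows <- function(m, idx) {
  n <- nrow(m)
  k <- ncol(m)
  out <- matrix(NA_real_, nrow = length(idx), ncol = k)
  rem <- idx - 1                       # zero-based row numbers
  for (j in seq_len(k)) {
    out[, j] <- m[rem %% n + 1, j]     # j-th digit picks the row of column j
    rem <- rem %/% n                   # shift to the next digit
  }
  out
}

# Walk through all combinations one block at a time.
m <- matrix(rnorm(150), nrow = 30, ncol = 5)
total <- nrow(m)^ncol(m)               # 24,300,000 combinations
chunk_size <- 1e6                      # assumed block size; tune to free RAM
for (start in seq(1, total, by = chunk_size)) {
  idx <- start:min(start + chunk_size - 1, total)
  combos <- grid_rows(m, idx)          # about 40 MB per block of doubles
  # ... subset the other matrix, fit the models, keep only wanted results ...
}
```

Because each block depends only on its own row numbers, the blocks are independent of one another, which should also make it easier to hand them to one of the parallel packages than five nested loops would be.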
>
> Background:
>
> When I tried to throw expand.grid() at a matrix of size [30,5] (24,300,000
> combinations), my computer choked (I assume due to RAM memory limits, but
> it might be that doing that just takes a long time and I wasn't ready to
> stare at a frozen computer for very long).

It took 7 seconds on my 6 year-old MacPro.

> I'm currently working around the
> problem with five nested loops, with all the drawbacks of and limits
> imposed by that approach (the biggest for me is that I'd like to attempt to
> multithread with some of the packages that exist for that).
>
> I don't have any formal training in computer science, and the only
> programming language I use enough to do something of this complexity is R,
> so programming the whole thing in C (which all the remote sensing folks
> across the hall said would make creation of this matrix trivial) isn't an
> easy alternative for me.

There are many threads on Rhelp and advice in various manuals about how to avoid memory limitations.

>
> Thanks!
>
> Ben Caldwell
>
> Graduate Fellow
> University of California, Berkeley

--
David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.