I should also have noted in my email how I allocated memory, along with an error that appears.

I'm using Windows, so as in FAQ 2.2 I did

"C:\Program Files\R\R-2.2.0\bin\Rgui.exe" --sdi --max-mem-size=2Gb

# Check memory size in R
> example(memory.size)

mmry.s> memory.size()
[1] 11894064

mmry.s> memory.size(TRUE)
[1] 12500992

mmry.s> round(memory.limit()/1048576, 2)
[1] 2048

An interesting issue appears after trying to import the subset of the larger file (which is a csv file of 75,238 KB). R indicates it has run out of memory:

Error: vector memory exhausted (limit reached?)
Error: vector memory exhausted (limit reached?)

When I then try to quit R, it doesn't allow me to. Here is a copy and paste from my workspace.

> quit()
Error: vector memory exhausted (limit reached?)
> quit()
Error: recursive default argument reference
> quit()
Error: vector memory exhausted (limit reached?)
>

Clearly, enough memory is allocated to handle this file. But I also wonder why R then locks and I need to do a forced shut down.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
Sent: Tuesday, December 13, 2005 5:33 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Technique for reading large sparse fwf data file

Dear list:

A datafile was sent to me that is very large (92890 x 1620) and is *very* sparse. Instead of leaving the entries with missing data blank, each cell with missing data contains a dot (.). The data are binary in almost all columns, with only a few columns containing whole numbers, which I believe requires 2 bytes for the binary and 4 for the others. So, by my calculations (assuming 4 bytes for all cells to create an upper bound) I should need around 92890 * 1620 * 4 = 574MB to read in these data and about twice that for analyses. My computer has 3GB. But I am unable to read in the file even though I have allocated sufficient memory to R for this.

My first question is: do the dots in the empty cells consume additional memory? I am assuming the answer is yes and believe I should remove them before I do the read in.
Because my data are in a fixed width format file, I can open the file in a text editor and find and replace all dots with nothing. Then should I retry the read-in process? Maybe this will work?

I created a smaller data file (~ 14000 * 1620) in SAS and tried to import this subset (it still had the dots), but R still would not allow me to do so.

I could use a little guidance, as I think I have allocated sufficient memory to read in a datafile, assuming my calculations are right. Does anyone have any thoughts on a strategy?

Harold

[[alternative HTML version deleted]]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
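
For reference, here is a minimal sketch of the memory arithmetic from the message and of one way the dots might be handled at read time instead of in a text editor. It is not the poster's method. The three-column toy file, its widths, and the names `tmp` and `dat` are made up for illustration. Two things worth keeping in mind: R stores integer (and logical) values in 4 bytes and doubles in 8 bytes, so there is no 2-byte saving for binary columns; and deleting the dots from a fixed-width file would shift every later column out of position, whereas declaring them as missing leaves the layout intact.

```r
# Upper-bound estimates for a 92890 x 1620 table held in memory.
n_rows <- 92890
n_cols <- 1620
mb_int <- n_rows * n_cols * 4 / 1024^2  # about 574 MB if every column is integer
mb_dbl <- n_rows * n_cols * 8 / 1024^2  # about twice that if columns end up as double

# Toy fixed-width file standing in for the real data (hypothetical layout:
# three one-character fields per record, "." marking a missing value).
tmp <- tempfile(fileext = ".txt")
writeLines(c("1.0", ".11", "0.."), tmp)

# na.strings and colClasses are passed through read.fwf() to read.table():
# "." becomes NA, and every column is stored as integer rather than
# being guessed from the contents.
dat <- read.fwf(tmp,
                widths = c(1, 1, 1),
                na.strings = ".",
                colClasses = "integer")
unlink(tmp)

print(dat)
```

Whether this is enough for the full file depends on the working copies R makes while reading, so the estimates above are a floor for planning, not a guarantee that the import will fit.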