Since you are reading it in chunks, I assume that you are writing out each segment as you read it in. How are you writing it out? Does the time you quote cover both the reading and the writing? If so, can you break down how long each of these operations is taking?
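One way to get that breakdown is to time the reads and the writes separately with system.time(). A minimal, self-contained sketch (the sample data, file paths, and chunk size are invented for illustration; substitute your own files):

```r
## Create a small sample file, then copy it in chunks, accumulating
## the elapsed time of the reads and the writes separately.
in.path  <- tempfile()
out.path <- tempfile()
writeLines(sprintf("row %d", 1:50000), in.path)

infile  <- file(in.path, "r")
outfile <- file(out.path, "w")
read.time <- write.time <- 0
chunk.size <- 10000   # lines per chunk; tune for your file

repeat {
  t1 <- system.time(lines <- readLines(infile, n = chunk.size))
  read.time <- read.time + t1[["elapsed"]]
  if (length(lines) == 0L) break
  t2 <- system.time(writeLines(lines, outfile))
  write.time <- write.time + t2[["elapsed"]]
}
close(infile); close(outfile)
cat(sprintf("reading: %.2f sec; writing: %.2f sec\n", read.time, write.time))
```

If one of the two numbers dominates, that tells you where to focus.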
How do you plan to use the data? Is it all numeric? Are you keeping it in a data frame? Have you considered using 'scan' to read in the data and to specify what the columns are? If you would like some more help, the answers to these questions will help.

On Sat, May 9, 2009 at 10:09 PM, Rob Steele <freenx.10.robste...@xoxy.net> wrote:

> Thanks guys, good suggestions. To clarify, I'm running on a fast
> multi-core server with 16 GB RAM under 64 bit CentOS 5 and R 2.8.1.
> Paging shouldn't be an issue since I'm reading in chunks and not trying
> to store the whole file in memory at once. Thanks again.
>
> Rob Steele wrote:
> > I'm finding that readLines() and read.fwf() take nearly two hours to
> > work through a 3.5 GB file, even when reading in large (100 MB) chunks.
> > The unix command wc by contrast processes the same file in three
> > minutes. Is there a faster way to read files in R?
> >
> > Thanks!
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
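To make the scan() suggestion concrete, here is a minimal sketch of reading from a connection with explicit column types, which skips the type guessing that the read.* functions do. The sample data, column names, and column layout are invented for illustration; adapt them to your file:

```r
## Write a tiny whitespace-separated sample file, then read it in
## chunks via scan() with a 'what' list that fixes each column's type.
path <- tempfile()
writeLines(c("1 2.5 a", "2 3.7 b", "3 9.1 c"), path)

con  <- file(path, "r")
cols <- list(id = integer(0), x = numeric(0), tag = character(0))

## Each call to scan() on an open connection picks up where the last
## one left off, so this reads the first two lines as one chunk.
chunk <- scan(con, what = cols, nlines = 2, quiet = TRUE)
close(con)

str(chunk)   # a named list of typed vectors, one per column
```

Because the connection stays open between calls, you can loop, calling scan() with nlines set to your chunk size until it returns zero rows, processing each chunk as it arrives.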