Since you are reading it in chunks, I assume that you are writing out each
segment as you read it in.  How are you writing it out to save it?  Is the
time you are quoting both the reading and the writing?  If so, can you break
down how long each of those operations takes?

How do you plan to use the data?  Is it all numeric?  Are you keeping it in
a dataframe?  Have you considered using 'scan' to read in the data and to
specify what the columns are?  If you would like more help, the answers to
these questions will be useful.
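For example, here is a minimal sketch of what I mean by using 'scan' with the
column types specified up front (the file name and column layout here are made
up -- substitute your own 'what' list and file):

```r
## Create a small whitespace-delimited example file (stand-in for your 3.5 GB file).
infile <- tempfile()
writeLines(c("1 2.5 a",
             "2 3.5 b",
             "3 4.5 c"), infile)

## Open a connection so that repeated scan() calls continue where the
## previous one stopped -- that is how you read in chunks.
con <- file(infile, open = "r")

## Telling scan() the column types avoids the type-guessing that slows
## read.table(); nmax caps the number of records read per chunk.
chunk <- scan(con,
              what = list(id = integer(), x = numeric(), tag = character()),
              nmax = 2, quiet = TRUE)
close(con)

chunk <- as.data.frame(chunk)   # only if you actually need a dataframe
```

A second scan() on the still-open connection would pick up at record 3, so you
can loop until scan() returns zero records.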

On Sat, May 9, 2009 at 10:09 PM, Rob Steele <freenx.10.robste...@xoxy.net> wrote:

> Thanks guys, good suggestions.  To clarify, I'm running on a fast
> multi-core server with 16 GB RAM under 64 bit CentOS 5 and R 2.8.1.
> Paging shouldn't be an issue since I'm reading in chunks and not trying
> to store the whole file in memory at once.  Thanks again.
>
> Rob Steele wrote:
> > I'm finding that readLines() and read.fwf() take nearly two hours to
> > work through a 3.5 GB file, even when reading in large (100 MB) chunks.
> >  The unix command wc by contrast processes the same file in three
> > minutes.  Is there a faster way to read files in R?
> >
> > Thanks!
> >
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


