The read.table.ffdf function in the ff package can read delimited files 
in chunks and store them on disk as individual columns.  The ffbase package 
provides additional data management and analytic functionality.  I have used 
these packages on 15 GB files of 18 million rows and 250 columns.
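
For what it's worth, here is a minimal sketch of that workflow, assuming the ff package is installed. The file, the column names, and the chunk sizes are all made up for illustration; a real file would need its own colClasses (ff stores text columns as factors, not character vectors).

```r
library(ff)  # assumed to be installed; provides read.table.ffdf

# Hypothetical stand-in for a large tab-delimited file.
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(id  = 1:1000,
                  grp = sample(letters[1:4], 1000, replace = TRUE),
                  x   = rnorm(1000))
write.table(toy, tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the file in chunks: the first chunk fixes the column types,
# later chunks are appended to the on-disk columns.
big <- read.table.ffdf(file = tmp, header = TRUE, sep = "\t",
                       colClasses = c("integer", "factor", "numeric"),
                       first.rows = 100,   # arbitrary chunk sizes
                       next.rows  = 500)

dim(big)          # rows and columns of the on-disk data frame
xs <- big$x[]     # pull a single column into RAM as an ordinary vector
mean(xs)
```

Only the column you index with [] is loaded into memory; the rest stays on disk.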


On Tuesday, August 5, 2014 1:39:03 PM UTC-5, David Winsemius wrote:
>
>
> On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote: 
>
> >      What tools do you like for working with tab delimited text files up 
> > to 1.5 GB (under Windows 7 with 8 GB RAM)? 
>
> ?data.table::fread 
>
> >      Standard tools for smaller data sometimes grab all the available 
> > RAM, after which CPU usage drops to 3% ;-) 
> > 
> > 
> >      The "bigmemory" project won the 2010 John Chambers Award but "is 
> > not available (for R version 3.1.0)". 
> > 
> > 
> >      findFn("big data", 999) downloaded 961 links in 437 packages. That 
> > contains tools for data PostgreSQL and other formats, but I couldn't find 
> > anything for large tab delimited text files. 
> > 
> > 
> >      Absent a better idea, I plan to write a function getField to 
> > extract a specific field from the data, then use that to split the data 
> > into 4 smaller files, which I think should be small enough that I can do 
> > what I want. 
>
> There is the colbycol package with which I have no experience, but I 
> understand it is designed to partition data into column sized objects. 
> #--- from its help file----- 
> cbc.get.col {colbycol}        R Documentation 
> Reads a single column from the original file into memory 
>
> Description 
>
> Function cbc.read.table reads a file, stores it column by column in disk 
> file and creates a colbycol object. Function cbc.get.col queries this object 
> and returns a single column. 
>
> >      Thanks, 
> >      Spencer 
> > 
> > ______________________________________________ 
> > r-h...@r-project.org mailing list 
> > https://stat.ethz.ch/mailman/listinfo/r-help 
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html 
> > and provide commented, minimal, self-contained, reproducible code. 
>
> David Winsemius 
> Alameda, CA, USA 
>