The read.table.ffdf function in the ff package can read delimited files in chunks and store them on disk column by column, so the full data set never has to fit in RAM. The ffbase package provides additional data management and analytic functionality. I have used these packages on 15 GB files with 18 million rows and 250 columns.
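A minimal sketch of that approach, assuming a tab-delimited file with a header row. The file, the column names (id, value), and the chunk sizes are made up for illustration, and a small temporary file stands in for the large one:

```r
library(ff)

# Hypothetical stand-in for the large tab-delimited file.
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:1000, value = rnorm(1000)),
            file = path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read in chunks: first.rows sets the size of the first chunk and
# next.rows the size of each later chunk. Extra arguments such as
# header, sep and colClasses are passed through to read.table.
# Text columns would need to be read as factors, since ff does not
# store character vectors.
big <- read.table.ffdf(file = path, header = TRUE, sep = "\t",
                       first.rows = 100, next.rows = 500,
                       colClasses = c("integer", "numeric"))

dim(big)             # rows and columns of the on-disk ffdf
mean(big$value[])    # [] pulls one column into RAM as an ordinary vector
```

For a real file, larger chunk sizes are usually worth trying; the values above are only there to make the chunking visible on a small example.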

On Tuesday, August 5, 2014 1:39:03 PM UTC-5, David Winsemius wrote:
>
> On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote:
>
> > What tools do you like for working with tab delimited text files up
> > to 1.5 GB (under Windows 7 with 8 GB RAM)?
>
> ?data.table::fread
>
> > Standard tools for smaller data sometimes grab all the available
> > RAM, after which CPU usage drops to 3% ;-)
> >
> > The "bigmemory" project won the 2010 John Chambers Award but "is
> > not available (for R version 3.1.0)".
> >
> > findFn("big data", 999) downloaded 961 links in 437 packages. That
> > contains tools for data PostgreSQL and other formats, but I couldn't find
> > anything for large tab delimited text files.
> >
> > Absent a better idea, I plan to write a function getField to
> > extract a specific field from the data, then use that to split the data
> > into 4 smaller files, which I think should be small enough that I can do
> > what I want.
>
> There is the colbycol package with which I have no experience, but I
> understand it is designed to partition data into column sized objects.
> #--- from its help file-----
> cbc.get.col {colbycol} R Documentation
> Reads a single column from the original file into memory
>
> Description
>
> Function cbc.read.table reads a file, stores it column by column in disk
> file and creates a colbycol object. Function cbc.get.col queries this object
> and returns a single column.
>
> > Thanks,
> > Spencer
>
> David Winsemius
> Alameda, CA, USA

______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.