Jeff,

If you're willing to educate, I'd be happy to learn what wide vs long format means. I'll give rbind a shot in the meantime.

Ben

On Nov 2, 2012 4:31 PM, "Jeff Newmiller" <jdnew...@dcn.davis.ca.us> wrote:
> I would first confirm that you need the data in wide format... many
> algorithms are more efficient in long format anyway, and rbind is way more
> efficient than merge.
>
> If you feel this is not negotiable, you may want to consider sqldf. Yes,
> you need to learn a bit of SQL, but it is very well integrated into R.
> ---------------------------------------------------------------------------
> Jeff Newmiller
> DCN: <jdnew...@dcn.davis.ca.us>
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Benjamin Caldwell <btcaldw...@berkeley.edu> wrote:
>
> >Dear R help;
> >
> >I'm currently trying to combine a large number (about 30 x 30) of large
> >.csvs together (each at least 10000 records). They are organized by plots,
> >hence 30 x 30, with each group of csvs in a folder which corresponds to the
> >plot. The unmerged csvs all have the same number of columns (5). The fifth
> >column has a different name for each csv. The number of rows is different.
> >
> >The combined csvs are of course quite large, and the code I'm running is
> >quite slow - I'm currently running it on a computer with 10 GB RAM, an SSD,
> >and a quad-core 2.3 GHz processor; it's taken 8 hours and it's only 75% of
> >the way through (it's been hung up on one of the largest data groupings
> >for an hour now, and is using 3.5 GB of RAM).
> >
> >I know that R isn't the most efficient way of doing this, but I'm not
> >familiar with SQL or C. I wonder if anyone has suggestions for a different
> >way to do this in the R environment. For instance, the key function now is
> >merge, but I haven't tried join from the plyr package or rbind from base.
> >I'm willing to provide a Dropbox link to a couple of these files if you'd
> >like to see the data. My code is as follows:
> >
> ># multmerge is based on code by Tony Cookson,
> ># http://www.r-bloggers.com/merging-multiple-data-files-into-one-data-frame/
> ># The function takes a path. This path should be the name of a folder that
> ># contains all of the files you would like to read and merge together and
> ># only those files you would like to merge.
> >
> >multmerge = function(mypath){
> >  filenames=list.files(path=mypath, full.names=TRUE)
> >  datalist = try(lapply(filenames,
> >                        function(x){read.csv(file=x,header=T)}))
> >  try(Reduce(function(x,y) {merge(x, y, all=TRUE)}, datalist))
> >}
> >
> ># this function renames columns using a fixed list and outputs a .csv
> >
> >merepk <- function (path, nf.name) {
> >
> >  output<-multmerge(mypath=path)
> >  name <- list("x", "y", "z", "depth", "amplitude")
> >  try(names(output) <- name)
> >
> >  write.csv(output, nf.name)
> >}
> >
> ># assumes all folders are in the same directory, with nothing else there
> >
> >merge.by.folder <- function (folderpath){
> >
> >  foldernames<-list.files(path=folderpath)
> >  n<- length(foldernames)
> >  setwd(folderpath)
> >
> >  for (i in 1:n){
> >    path<-paste(folderpath,foldernames[i], sep="\\")
> >    nf.name <- as.character(paste(foldernames[i],".csv", sep=""))
> >    merepk (path,nf.name)
> >  }
> >}
> >
> >folderpath <- "yourpath"
> >
> >merge.by.folder(folderpath)
> >
> >
> >Thanks for looking, and happy Friday!
> >
> >*Ben Caldwell*
> >
> >PhD Candidate
> >University of California, Berkeley
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
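On the wide vs long question raised at the top of this thread, a minimal sketch in base R may help. The data frames a and b and all of their column names are made up for illustration; they are not taken from the files described above.

```r
# Two hypothetical measurement tables that share the key columns x and y.
a <- data.frame(x = c(1, 2), y = c(1, 1), amp_a = c(0.5, 0.7))
b <- data.frame(x = c(1, 2), y = c(1, 1), amp_b = c(0.2, 0.9))

# Wide format: one row per (x, y) location, one column per measurement.
# merge() matches rows on the columns the two tables have in common.
wide <- merge(a, b, all = TRUE)

# Long format: one row per (x, y, measurement). The measurement name moves
# into a "variable" column and the tables are simply stacked with rbind().
long <- rbind(
  data.frame(x = a$x, y = a$y, variable = "amp_a", value = a$amp_a),
  data.frame(x = b$x, y = b$y, variable = "amp_b", value = b$amp_b)
)

# If a wide table is needed later, base reshape() can spread the long one.
back <- reshape(long, idvar = c("x", "y"), timevar = "variable",
                direction = "wide")
```

In this toy case wide has 2 rows and 4 columns, while long has 4 rows and 4 columns. Adding another measurement adds a column to the wide table but only more rows to the long one, which is why stacking tends to be cheaper than repeated merging.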
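The rbind suggestion in the quoted reply could look roughly like the sketch below. It assumes that the first four columns carry the same names in every file of a folder and that only the fifth differs, as described in the original post. The function name stack_folder and the column names value and variable are hypothetical, not part of the posted code.

```r
# Hypothetical helper: stack every CSV in one folder into a long data frame.
# Assumes columns 1-4 are named identically across files (not verified here).
stack_folder <- function(mypath) {
  filenames <- list.files(path = mypath, pattern = "\\.csv$",
                          full.names = TRUE)
  pieces <- lapply(filenames, function(f) {
    d <- read.csv(file = f, header = TRUE)
    # Keep the file-specific name of column 5 as data, then give the
    # column itself a common name so all pieces line up for rbind().
    d$variable <- rep(names(d)[5], nrow(d))
    names(d)[5] <- "value"
    d
  })
  # One rbind over the whole list instead of a chain of pairwise merges.
  do.call(rbind, pieces)
}
```

Because every piece then has identical column names, a single do.call(rbind, ...) stacks them without the key matching that merge performs on each step. The result is in long format, with the original fifth-column name preserved in variable.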