Hi again,

Below are two versions, depending on whether you want to use scan or
read.table:

## with scan
library(plyr)     # llply
library(reshape)  # melt

listOfFiles <- list.files()
d <- llply(listOfFiles, scan)
names(d) <- basename(listOfFiles)
melt(d)

## with read.table
listOfFiles <- list.files()
names(listOfFiles) <- basename(listOfFiles)

library(plyr)
ldply(listOfFiles, read.table)

Note, I tested this code with the following files,

dir.create("dummy")
setwd("dummy")

files <- replicate(5, rnorm(sample(3:20, 1)), simplify=FALSE)
names <- paste("datafile", letters[1:5], ".txt", sep="")
l_ply(seq_along(files), function(ii, ...)
      write.table(x=files[[ii]], file=names[ii], ... ),
      row.names = FALSE, col.names = FALSE)

HTH,

baptiste

On 30 January 2010 14:23, Maxim <deeeperso...@googlemail.com> wrote:
> Hi,
>
> my data is really not spectacular, each of the 6 files (later several
> hundred) contains correlation coefficients in plain text format like:
>
> 0.923960073
> 0.923960073
> 0.612571344
> 0.064183275
> 0.007733399
> -0.315444372
> -0.064591277
> -0.268336142
> ...........
>
> with between 1000-13000 rows.
>
> Scanning from the directory works, as this script:
>
> comb<-data.frame()
> count<-0
> files <- list.files() # all files in the working directory
> for(i in files) {
>   count<-count+1
>
>   tmp <- scan(i)
>   assign(files[count], tmp)
>
>   if (i ==1)
>     comb<-data.frame(dats=c(tmp), index=c(rep(files[1], length(tmp))))
>   else
>     combadd<-data.frame(dats=c(tmp), index=c(rep(files[count],
>       length(tmp))))
>   comb<-rbind(comb,combadd)
>
> }
> boxplot(dats ~ index, data = comb)
>
>
> works just great. There is no additional files in the folder. But look, how
> much code for such a simple task. I'd definitely prefer the plyr solution.
>
> Maxim
>
>
> 2010/1/30 baptiste auguie <baptiste.aug...@googlemail.com>
>>
>> Why don't you post an example of what your input files look like? (to
>> the list, not just to me!) A reproducible example is always required
>> if you want a good answer.
>>
>> Note that if you are scanning *all* files in the working directory,
>> you may also be scanning the R file containing your instructions which
>> won't have the correct format, obviously.
>>
>> Best,
>>
>> baptiste
>>
>> On 30 January 2010 13:52, Maxim <deeeperso...@googlemail.com> wrote:
>> > Hi,
>> >
>> > thanks, that looks much more elegant than what I managed to accomplish in
>> > meantime:
>> >
>> > count<-1
>> > files <- list.files() # all files in the working directory
>> > for(i in files) {
>> >
>> >   tmp <- scan(i)
>> >   assign(files[count], tmp)
>> >
>> >   if (i ==1)
>> >     comb<-data.frame(dats=c(tmp), index=c(rep(files[1],
>> >       length(tmp))))
>> >   else
>> >     combadd<-data.frame(dats=c(tmp), index=c(rep(files[count],
>> >       length(tmp))))
>> >   comb<-rbind(comb,combadd)
>> >
>> >   count<-count+1
>> > }
>> > boxplot(dats ~ index, data = comb)
>> >
>> >
>> > This code works, unfortunately the plots get plotted in a different order
>> > than expected (appears to be more or less random to me). Why is this?
>> >
>> >
>> > Concerning your code: I get an error like:
>> >
>> > Read 2652 items
>> > Read 3310 items
>> > Read 1096 items
>> > Read 2177 items
>> > Read 11387 items
>> > Read 12503 items
>> > Error in list_to_dataframe(res, attr(.data, "split_labels")) :
>> >   Results are not equal lengths
>> >
>> > hmmh?
>> >
>> > Maxim
>> >
>> >
>> > 2010/1/30 baptiste auguie <baptiste.aug...@googlemail.com>
>> >>
>> >> Hi,
>> >>
>> >> Hadley recently proposed a strategy using plyr for a very similar
>> >> problem,
>> >>
>> >> listOfFiles <- list.files()
>> >> names(listOfFiles) <- basename(listOfFiles)
>> >>
>> >> library(plyr)
>> >> d <- ldply(listOfFiles, scan)
>> >>
>> >> Even if you don't want to use plyr, it's always better to group things
>> >> in a list rather than clutter your workspace with lots of assign()ed
>> >> variables.
>> >>
>> >> HTH,
>> >>
>> >> baptiste
>> >>
>> >>
>> >> On 30 January 2010 13:19, Maxim <deeeperso...@googlemail.com> wrote:
>> >> > Hi,
>> >> >
>> >> > I have many files containing one column of data. I like to use the scan
>> >> > function to parse the data. Next I like to bind to a large vector.
>> >> > I try this like:
>> >> >
>> >> > count<-1
>> >> > files <- list.files() # all files in the working directory
>> >> > for(i in files) {
>> >> >
>> >> >   tmp <- scan(i)
>> >> >   assign(files[count], tmp)
>> >> >   count<-count+1
>> >> > }
>> >> >
>> >> > This part works!
>> >> >
>> >> > Now I like to plot the data in a boxplot.
>> >> >
>> >> > Usually I do this from individual vectors like:
>> >> >
>> >> > comb <- data.frame(dat = c(vector1, vector2 ......), ind =
>> >> >   c(rep('vector1', length(vector1)).......))
>> >> > boxplot(dat ~ ind, data = comb)
>> >> >
>> >> > But how do I do this i a loop?
>> >> >
>> >> > I know the vector names (according to the filenames in the working
>> >> > directory), but I do not how to access them in my R code after having
>> >> > assigned the names.
>> >> >
>> >> > I guess the "lapply" or "dply" from the plyr library can do this, but I
>> >> > seem not to be able to do it.
>> >> >
>> >> > Is there a way to do this?
>> >> >
>> >> > gma
>> >> >
>> >> > [[alternative HTML version deleted]]
>> >> >
>> >> > ______________________________________________
>> >> > R-help@r-project.org mailing list
>> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> > PLEASE do read the posting guide
>> >> > http://www.R-project.org/posting-guide.html
>> >> > and provide commented, minimal, self-contained, reproducible code.
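
For reference, here is a minimal base-R sketch of the same idea without plyr, assuming each file holds a single numeric column. The "Results are not equal lengths" error quoted in the thread presumably arises because the scanned vectors differ in length and so cannot be bound as rows of one data frame; stacking them into long format avoids that. The temporary directory, the demo data, and the names demoDir, fileNames, and lens are made up for illustration and do not come from the thread.

```r
## Sketch only: write a few demo files into a temporary directory
## (hypothetical data, similar in spirit to the test files in this thread).
set.seed(1)
demoDir <- file.path(tempdir(), "dummy")
dir.create(demoDir, showWarnings = FALSE)
fileNames <- file.path(demoDir,
                       paste("datafile", letters[1:5], ".txt", sep = ""))
lens <- c(3, 7, 12, 5, 20)   # deliberately unequal lengths
for (ii in seq_along(fileNames)) {
  write.table(rnorm(lens[ii]), file = fileNames[ii],
              row.names = FALSE, col.names = FALSE)
}

## Read only the .txt files, so a stray R script in the folder is not scanned.
listOfFiles <- list.files(demoDir, pattern = "\\.txt$", full.names = TRUE)

## One numeric vector per file, kept together in a named list
## instead of separate assign()ed variables.
d <- lapply(listOfFiles, scan, quiet = TRUE)
names(d) <- basename(listOfFiles)

## Long format: one row per value, labelled with the file it came from.
## Setting the factor levels explicitly fixes the order of the boxes.
comb <- data.frame(dats  = unlist(d, use.names = FALSE),
                   index = factor(rep(names(d), times = sapply(d, length)),
                                  levels = names(d)))

boxplot(dats ~ index, data = comb)
```

boxplot() draws the boxes in the order of the factor levels of index, so passing the file names to levels= in the order you want is one way to control the plot order.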