On Mon, 9 Aug 2004, F Duan wrote:

> Dear R people,
>
> I have a very big tab-delim txt file with header and I only want to import
> several columns into R. I checked the options for “read.table” and only
> found “nrows” which lets you specify the maximum number of rows to read in.
> Although I can use some text editors (e.g., wordpad) to edit the txt file first
> before running R, I feel it’s not very convenient. The reason for me to do this
> is that if I import the whole file into R, it will eat up too much of my
> system’s memory. Even after I remove it later, I still can’t release the memory.
>
You can't avoid reading the whole file, but you can avoid having all of it in memory at once.

I'll assume you know how many data lines are in the file, call it N (this isn't necessary, but it is tidier), and that you are interested in columns 10 and 110, both numeric.

If you do something like

    inputfile <- file("inputfile.txt", open = "r")
    # consume the header line once so that later reads start at the data
    header <- readLines(inputfile, n = 1)
    result <- data.frame(col10 = numeric(N), col110 = numeric(N))
    chunksize <- 1000
    nchunks <- ceiling(N / chunksize)
    for (i in seq_len(nchunks)) {
        # the connection stays open, so each call continues where the last stopped
        chunk <- read.table(inputfile, nrows = chunksize, sep = "\t")
        # the last chunk may hold fewer than chunksize rows
        rows <- (i - 1) * chunksize + seq_len(nrow(chunk))
        result[rows, ] <- chunk[, c(10, 110)]
    }
    close(inputfile)

you can choose the chunk size so that the memory use is not too bad.

There are also more efficient ways that make you do more of the work (e.g., read in lines of text with readLines and use regular expressions to extract the columns you need).

    -thomas
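
The readLines route mentioned above could look roughly like the sketch below. It is only an illustration under some assumptions: the helper name read_columns and its arguments are made up for this example, the wanted columns are assumed to be numeric, and it splits each line on tabs with strsplit instead of using a regular expression, which is simpler for a tab-delimited file. N does not need to be known in advance here.

```r
# Hypothetical helper (not part of base R): read selected numeric columns
# from a tab-delimited file with a header line, holding at most
# `chunksize` lines of text in memory at a time.
read_columns <- function(path, cols, chunksize = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # The first line holds the column names.
  header <- strsplit(readLines(con, n = 1), "\t", fixed = TRUE)[[1]]

  # Start with an empty matrix so a file with no data lines still works.
  pieces <- list(matrix(numeric(0), ncol = length(cols)))

  repeat {
    lines <- readLines(con, n = chunksize)
    if (length(lines) == 0) break   # end of file

    # Split every line into fields and keep only the wanted ones.
    fields <- strsplit(lines, "\t", fixed = TRUE)
    wanted <- unlist(lapply(fields, `[`, cols))

    # One matrix row per input line, one column per wanted field.
    pieces[[length(pieces) + 1]] <-
      matrix(as.numeric(wanted), ncol = length(cols), byrow = TRUE)
  }

  result <- as.data.frame(do.call(rbind, pieces))
  names(result) <- header[cols]
  result
}

# Small demonstration on a temporary three-column file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5\t6", "7\t8\t9"), tmp)
demo <- read_columns(tmp, cols = c(1, 3), chunksize = 2)
print(demo)
```

Only the selected fields of each chunk are kept, so the peak memory use is roughly the final result plus one chunk of raw text. For the file in the question the call would be something like read_columns("inputfile.txt", cols = c(10, 110)).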