2009/3/5 ling ling <metal_lical...@live.com>:
>
> Dear all,
>
> I am a newcomer to R programming, I met the problem:
>
> I have a lot of .txt files in my directory.
>
> Firstly, I check whether the file satisfies the conditions:
> 1.empty
> 2.the "Rep" column of the file has no "useractivity_idle" or
> "useractivity_act"
> 3.even The "rep" has both of them, numbers of "useractivity_idle"==numbers of
> "useractivity_act"==1
> If the file has one of those conditions, skip this file, jump to and read the
> next .txt file:
> I made the programming as:
>
> name<-list.files(path = ".", pattern = NULL, all.files = FALSE,
>            full.names = FALSE, recursive = FALSE,
>            ignore.case = FALSE)
>
> for(k in 1:length(name)){
>
> log1<-read.table(name[k],header=TRUE,stringsAsFactors=FALSE)
>
> x<-which(log1$Rep=="useractivity_act")
> y<-which(log1$Rep=="useractivity_idle")
>
> while(all(log1$Rep!="useractivity_act")||all(log1$Rep!="useractivity_idle")||(length(x)==1
> && length(y)==1)||(file.info(name[k])$size== 0)){
> k=k+1
> log1<-read.table(name[k],header=TRUE,stringsAsFactors=FALSE)
> }
>
> ........
>
> }
>
> But I always get the following information:
> Error in file(file, "r") : cannot open the connection
> In addition: Warning message:
> In file(file, "r") : cannot open file 'NA': No such file or directory
>
>
> I have been exploring this for long time, any help would be appreciated.
> Thanks a lot!
You are trying to read one more file than you have! Simplified, your code
looks like this:

    name = list.files(...)
    for(k in 1:length(name)){
      log1 = read.table(name[k],....)
      while(something){
        k = k + 1
        log1 = read.table(name[k],...) # 1
      }
    }

What will happen is that when the last file is read at point #1, the
'while' loop goes round again, k becomes more than the length of name, and
it will fail at #1 again. At that point name[k] is NA, which is the 'NA' in
your warning message.

I think you've overcomplicated it. You just need one loop with an 'if' in
it. I'd write it as:

    processFiles = function(){
      name<-list.files(path = ".", pattern = NULL, all.files = FALSE,
                       full.names = FALSE, recursive = FALSE,
                       ignore.case = FALSE)

      ## seq_along() gives an empty loop if there are no files at all
      for(k in seq_along(name)){
        log1<-read.table(name[k],header=TRUE,stringsAsFactors=FALSE)
        if(testCondition(log1)){
          cat("Processing ",name[k],"\n")
          processLog(log1)
        }else{
          cat("Skipping ",name[k],"\n")
        }
      }
    }

Then you need two more functions, testCondition and processLog.

testCondition takes a data frame and decides whether you want to process it
or not. I'm not sure I've got the test logic right here, but you should get
the idea:

    testCondition <- function(log1){
      ## test for Rep column:
      if(!any(names(log1)=="Rep"))return(FALSE)

      ## test active/idle count
      nAct = sum(log1$Rep == "useractivity_act")
      nIdle = sum(log1$Rep == "useractivity_idle")

      ## if we have no active or idle, return False
      if(nAct + nIdle == 0)return(FALSE)

      ## if we only have one of either, return False
      if(nAct == 1 || nIdle == 1) return(FALSE)

      ## maybe some other tests here?

      return(TRUE)
    }

Here is a simple processLog function that just prints the summary of the
data frame. Put whatever you want in here:

    processLog <- function(log1){
      ## for example:
      print(summary(log1))
    }

How's that? Note the use of comments and breaking the code up into small,
independent, testable functions.
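If you want the test to follow the three skip conditions in your post more
literally, something like the following might do. It is only a sketch: the
name readIfUsable is made up for illustration, and I am assuming that
condition 3 means both counts are exactly 1. It takes the file name rather
than the data frame so that an empty file can be caught before read.table()
is called, since read.table() stops with an error on an empty file.

```r
## Hypothetical helper: returns the data frame if the file should be
## processed, or NULL if it should be skipped.
readIfUsable <- function(fileName){
  ## condition 1: empty file; check the size first, because
  ## read.table() stops with an error on an empty file
  if(file.info(fileName)$size == 0) return(NULL)

  log1 <- read.table(fileName, header=TRUE, stringsAsFactors=FALSE)

  ## no Rep column at all: nothing to test, so skip
  if(!("Rep" %in% names(log1))) return(NULL)

  ## count the two values, ignoring missing entries
  nAct  <- sum(log1$Rep == "useractivity_act",  na.rm=TRUE)
  nIdle <- sum(log1$Rep == "useractivity_idle", na.rm=TRUE)

  ## condition 2: one (or both) of the two values never appears
  if(nAct == 0 || nIdle == 0) return(NULL)

  ## condition 3 (as I read it): both appear, but exactly once each
  if(nAct == 1 && nIdle == 1) return(NULL)

  log1
}
```

In the loop you would then call log1 = readIfUsable(name[k]) and skip the
file whenever is.null(log1) is TRUE, instead of reading and testing in two
separate steps.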
Barry