Gabor wrote:
>Assuming that the problem is that your input file has
>additional embedded characters added by the data base
>program you could try extracting just the text using
>the UNIX strings program:
>
>   strings myfile.csv > myfile.txt

Spencer wrote:
>"strsplit" can break character strings into single
>characters, and "%in%" can be used to classify them.

The first suggestion helped me identify and remove some of the
embedded characters, namely "^K".  Many more remained hidden.
The second suggestion gave me the idea of splitting the string
on whitespace first and seeing whether the embedded character
problem would go away along with the "blank" spaces.  It did.
In the snippet below, x is the character variable I am trying
to process:

   # split on runs of whitespace, then rejoin with single spaces
   str.vec <- strsplit(x, "\\s+", perl=TRUE)[[1]]
   if(length(str.vec) > 0) {
     x <- paste(str.vec, collapse=" ")
     x <- gsub("^\\s+", "", x, perl=TRUE)  # trim leading whitespace
     x <- gsub("\\s+$", "", x, perl=TRUE)  # trim trailing whitespace
   }

There were no problems in processing x thereafter.

Thank you, gentlemen.

Scott Waichler
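
Note: the two ideas in this post (classifying single characters with
strsplit and "%in%", and collapsing whitespace) can be sketched as small
functions. This is only an illustration under stated assumptions, not the
code used above: the names odd_chars and squeeze_ws and the default
"allowed" set are hypothetical, and the POSIX class [[:space:]] is used
instead of "\\s" because it is documented to include the vertical tab
("^K", written "\v" in R).

```r
# Hypothetical helper: report the distinct characters of a single string
# that fall outside an allowed set (the strsplit / %in% idea).
odd_chars <- function(x, allowed = c(letters, LETTERS, 0:9, " ")) {
  chars <- strsplit(x, "")[[1]]          # one element per character
  unique(chars[!(chars %in% allowed)])   # keep only the unexpected ones
}

# Hypothetical helper: collapse every run of whitespace (including tabs
# and vertical tabs) to one space, then trim both ends.
squeeze_ws <- function(x) {
  x <- gsub("[[:space:]]+", " ", x)
  gsub("^ | $", "", x)
}

x <- "  alpha\vbeta \t gamma  "   # made-up input with an embedded ^K
odd_chars(x)    # the vertical tab and the tab
squeeze_ws(x)   # "alpha beta gamma"
```

Unlike the split-and-paste version, squeeze_ws works on a whole character
vector at once, which may be convenient when cleaning a full column.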