A starting point might be the string-splitting function strsplit. For example,
> X = c("1,4,5", "1,2,5", "5,1,2")
> strsplit(X, ",")
[[1]]
[1] "1" "4" "5"

[[2]]
[1] "1" "2" "5"

[[3]]
[1] "5" "1" "2"

This returns a list of the parsed vectors. Next you can do something like:

> Z = data.frame(matrix(unlist(strsplit(X, ",")), nrow = 3, byrow = TRUE))
> Z
  X1 X2 X3
1  1  4  5
2  1  2  5
3  5  1  2

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: 26 August 2003 09:00
To: R-help
Subject: Re: [R] R tools for large files

This has been an interesting thread!

My first reaction to Murray's query was to think "use standard Unix tools, especially awk", 'awk' being a compact, fast, efficient program with great powers for processing lines of data files (and in particular extracting, subsetting and transforming database-like files, e.g. CSV-type). Of course, that became a sub-thread in its own right.

But -- and here I know I'm missing a trick, which is why I'm responding now so that someone who knows the trick can tell me -- while I normally use 'awk' "externally" (i.e. I filter a data file through an 'awk' program outside of R and then read the resulting file into R), I began to think about doing it from within R.

Something on the lines of

X <- system("cat raw_data | awk '...' ", intern=TRUE)

would create an object X which is a character vector, each element of which is one line from the output of the command "cat ...... ".

E.g. if "raw_data.csv" starts out as

1,2,3,4,5
1,3,4,2,5
5,4,3,2,1
5,3,4,1,2

then

X <- system("cat raw_data.csv | awk 'BEGIN{FS=\",\"}{if($3>$2){print $1 \",\" $4 \",\" $5}}'", intern=TRUE)

gives

> X
[1] "1,4,5" "1,2,5" "5,1,2"

Now my question: how do I convert X into the dataframe I would have got if I had read this output from a file instead of into the character vector X? In other words, how do I convert a vector of character strings, each of which is in comma-separated format as above, into the rows of a data frame (or matrix, come to that)?

With thanks,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 167 1972
Date: 26-Aug-03                                       Time: 08:59:48
------------------------------ XFMail ------------------------------

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
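
As a footnote to the question in the quoted message (turning the character vector X into a data frame), here is a minimal sketch of two ways it might be done. It assumes every element of X has the same number of comma-separated fields and that all fields are numeric. The names parts, M, Z1 and Z2 are made up for illustration and do not come from the thread. The second approach relies on the text argument of read.csv, which is available in reasonably recent versions of R.

```r
# X is the character vector from the thread: one comma-separated record per element.
X <- c("1,4,5", "1,2,5", "5,1,2")

# Approach 1: split each string on commas, then fill a numeric matrix row by row.
parts <- strsplit(X, ",", fixed = TRUE)
M <- matrix(as.numeric(unlist(parts)), nrow = length(X), byrow = TRUE)
Z1 <- as.data.frame(M)   # columns are named V1, V2, V3

# Approach 2: let read.csv parse the strings as if they were lines of a file.
# Assumption: there is no header line in X, hence header = FALSE.
Z2 <- read.csv(text = X, header = FALSE)   # columns are named V1, V2, V3
```

The first approach keeps the conversion explicit, so a non-numeric field shows up as NA with a warning. The second reuses the same parsing rules as reading the output from a file, which is probably closest to what the question asks for.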