Here is one way of doing it using the reshape package:

> # test data from email
> x <- "$$ Experiment Number:
+ $$ Associated Data:
+
+ FUNCTION 1
+
+ Scan 1
+ Retention Time 0.017
+
+ 399.8112 184
+ 399.8742 0
+ 399.9372 152
+ ....
+
+ Scan 2
+ Retention Time 0.021
+
+ 399.8112 181
+ 399.8742 1
+ 399.9372 153
+ .....
+ "
> # read in the vector
> x.in <- readLines(textConnection(x))
> result <- list()  # output list
> i.result <- 1
> # process each line
> for (i in x.in){
+     # if Retention, pick off the time
+     if (regexpr("^Retention", i) > 0){
+         time <- gsub("^Ret.*?([0-9.]+)", "\\1", i, perl=TRUE)
+     } else if (regexpr("^\\d+", i, perl=TRUE) > 0){
+         # if data, parse it and store in result
+         idVal <- strsplit(i, "\\s+")
+         result[[i.result]] <- c(time, idVal[[1]])
+         i.result <- i.result + 1
+     }
+ }
> # create data frame
> df <- as.data.frame(do.call(rbind, result))
> colnames(df) <- c('time', 'id', 'value')
> require(reshape)  # use reshape package
Loading required package: reshape
[1] TRUE
> y <- melt(df)
> # cast to wide format: one row per time, one column per id
> cast(y, time ~ id)
   time X399.8112 X399.8742 X399.9372
1 0.017       184         0       152
2 0.021       181         1       153
>

Jim Holtman

"What is the problem you are trying to solve?"

----- Original Message ----
From: Gabor Grothendieck <[EMAIL PROTECTED]>
To: Bart Joosen <[EMAIL PROTECTED]>
Cc: r-help@stat.math.ethz.ch
Sent: Thursday, March 1, 2007 12:35:43 PM
Subject: Re: [R] How to read in this data format?

Read in the data using readLines, extract all desired lines (namely
those containing only numbers, dots and spaces, or those with the word
Time) and remove Retention from all lines so that all remaining lines
have two fields. Now that we have the desired lines and all lines have
two fields, read them in using read.table.

Finally, split them into groups and restructure them using "by";
in the last line we convert the "by" output to a data frame.

At the end we display an alternate function f for use with "by", should
we wish to generate long rather than wide output (using the
terminology of the reshape command).

Lines <- "$$ Experiment Number:
$$ Associated Data:

FUNCTION 1

Scan 1
Retention Time 0.017

399.8112 184
399.8742 0
399.9372 152
....

Scan 2
Retention Time 0.021

399.8112 181
399.8742 1
399.9372 153
"

# replace next line with: Lines. <- readLines("myfile.dat")
Lines. <- readLines(textConnection(Lines))
Lines. <- grep("^[1-9][0-9. ]*$|Time", Lines., value = TRUE)
Lines. <- gsub("Retention", "", Lines.)

DF <- read.table(textConnection(Lines.), as.is = TRUE)
closeAllConnections()

f <- function(x) c(id = x[1,2], structure(x[-1,2], .Names = x[-1,1]))
out.by <- by(DF, cumsum(DF[,1] == "Time"), f)
as.data.frame(do.call("rbind", out.by))

We could alternately produce long format by replacing the
function f with:

f <- function(x) data.frame(x[-1,], id = x[1,2])

On 3/1/07, Bart Joosen <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I received an ASCII file containing the following information:
>
> $$ Experiment Number:
> $$ Associated Data:
>
> FUNCTION 1
>
> Scan 1
> Retention Time 0.017
>
> 399.8112 184
> 399.8742 0
> 399.9372 152
> ....
>
> Scan 2
> Retention Time 0.021
>
> 399.8112 181
> 399.8742 1
> 399.9372 153
> .....
>
>
> I would like to import this data in R into a dataframe, where there is a
> column time, the first numbers as column names, and the second numbers as
> data in the dataframe:
>
> Time   399.8112  399.8742  399.9372
> 0.017  184       0         152
> 0.021  181       1         153
>
> I did take a look at read.table, read.delim, scan, ... but I have no idea
> how to solve this problem.
>
> Anyone?
>
>
> Thanks
>
> Bart
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
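
Both replies rest on the same idea: tag every data line with the retention time of the scan it belongs to, then spread the ids into columns. That can also be sketched in base R alone, without the reshape package. The code below is only an illustration, not code from the thread: the function name parse_scans and its regular expressions are assumptions, and it assumes each id occurs at most once per scan.

```r
# Hypothetical helper (not from the thread): turn the scan blocks into a
# wide data frame with one row per retention time and one column per id.
parse_scans <- function(lines) {
  lines <- trimws(lines)

  # A "Retention Time" line opens a new scan; cumsum() then labels every
  # following line with the index of the scan it belongs to.
  is_time <- grepl("^Retention Time", lines)
  scan_id <- cumsum(is_time)

  # Data lines are two numeric fields; ignore any that precede the first scan.
  is_data <- grepl("^[0-9][0-9.]*\\s+[0-9][0-9.]*$", lines) & scan_id > 0

  times  <- as.numeric(sub("^Retention Time\\s+", "", lines[is_time]))
  fields <- strsplit(lines[is_data], "\\s+")

  # Long format: one row per (time, id, value) triple.
  long <- data.frame(
    time  = times[scan_id[is_data]],
    id    = vapply(fields, `[`, character(1), 1),
    value = as.numeric(vapply(fields, `[`, character(1), 2)),
    stringsAsFactors = FALSE
  )

  # Wide format: tapply() builds a time-by-id matrix of values
  # (assumption: one value per cell, so the first one is taken).
  wide <- tapply(long$value, list(long$time, long$id), function(v) v[1])
  out <- data.frame(time = as.numeric(rownames(wide)), wide,
                    check.names = FALSE)
  rownames(out) <- NULL
  out
}

# Usage with the sample data from the original question.
sample_lines <- c(
  "$$ Experiment Number:", "$$ Associated Data:", "", "FUNCTION 1", "",
  "Scan 1", "Retention Time 0.017", "",
  "399.8112 184", "399.8742 0", "399.9372 152", "....", "",
  "Scan 2", "Retention Time 0.021", "",
  "399.8112 181", "399.8742 1", "399.9372 153", "....."
)
print(parse_scans(sample_lines))
```

For a real file, sample_lines would be replaced by the result of readLines() on that file. Setting check.names = FALSE keeps the ids as column names exactly as they appear in the file, rather than the X-prefixed names that data.frame produces by default.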