Sebastian,

There is rarely a completely free lunch, but fortunately for us R has some wonderful tools to make this possible. R supports regular expressions through functions like grep(), gsub(), strsplit(), and others documented on their help pages. It's just a matter of constructing an algorithm that does the job. In your case, for example (though please note there are probably many different, completely reasonable approaches in R):
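A quick illustration of those base-R regex tools, on toy strings of my own invention (not from your log file):

```r
s <- c("GET /index.html", "POST /form", "GET /about.html")

grep("GET", s)              # indices of the matching elements: 1 3
grepl("GET", s)             # the same as a logical mask: TRUE FALSE TRUE
gsub("^GET ", "", s[1])     # strip a leading "GET ": "/index.html"
strsplit(s[1], " ")[[1]]    # split one string into fields: "GET" "/index.html"
```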
x <- scan("logfilename", what = "", sep = "\n")

should give you a vector of character strings, one line per element. Now, lines containing "GET" seem to identify the interesting lines, so

x <- x[grep("GET", x)]

should trim it to only those. If you want information from other lines, you'll have to treat them separately.

Next, you might try

y <- strsplit(x, "[[:space:]]+")

which splits each line on runs of whitespace (note that strsplit() requires an explicit split pattern; there is no default), returning a list (one component per line) of vectors based on the split. Try it. If it looks good, you might check

lapply(y, length)

to see if all lines contain the same number of fields. If so, you can then get them quickly into a matrix:

z <- matrix(unlist(y), ncol = K, byrow = TRUE)

where K is the common length you just observed.

If you think this is cool, great! If not, well... hire a programmer, or if you're lucky Microsoft or Apache may have tools to help you with this. There might be something in the Perl/Python world. Or maybe there's a package in R designed just for this, but I encourage students to develop the raw skills...

Jay

--
John W. Emerson (Jay)
Associate Professor of Statistics
Department of Statistics
Yale University
http://www.stat.yale.edu/~jay

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
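P.S. Putting the steps above together into one self-contained sketch, with a few fabricated Apache-style lines standing in for the real file (the 9-field layout is my assumption; adjust the split pattern and K to your actual format):

```r
# Fabricated stand-ins for the real log file; in practice you would use
# x <- scan("logfilename", what = "", sep = "\n")
log_lines <- c(
  '127.0.0.1 - - [10/Oct/2023:13:55:36] "GET /index.html HTTP/1.1" 200 2326',
  '127.0.0.1 - - [10/Oct/2023:13:55:40] "POST /form HTTP/1.1" 200 512',
  '10.0.0.5 - - [10/Oct/2023:13:56:02] "GET /about.html HTTP/1.1" 404 209'
)

x <- log_lines
x <- x[grep("GET", x)]              # keep only the lines mentioning "GET"

y <- strsplit(x, "[[:space:]]+")    # split each line on runs of whitespace

# verify every line produced the same number of fields before reshaping
stopifnot(length(unique(lengths(y))) == 1)
K <- lengths(y)[1]

# one row per log line, K columns of character data
z <- matrix(unlist(y), ncol = K, byrow = TRUE)
dim(z)                              # 2 rows (the GET lines), 9 columns
```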