[R] Reading a file w/ two delimiters

2011-11-18 Thread Langston, Jim
Hi all, I've been scratching and poking, but basically, the file I need to read has two delimiters that I need to contend with. The first is that the file contains tabs (\t) instead of newlines (\n), and the second is that the fields have | for the separators. I can easily do a read if I first

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread Paul Hiemstra
Hi Jim, You can read the text file using readLines. This puts each line in the file into an element of a character vector. Then you can go through the lines manually (e.g. using grep, sub, strsplit) and create your data.frame. cheers, Paul On 11/18/2011 12:37 PM, Langston, Jim wrote: Hi all, I've been
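
A minimal sketch of the readLines/strsplit route Paul describes, not code from the thread. It assumes the sample data from jim holtman's reply, a hypothetical temporary file standing in for the real input, and that every record has the same number of fields.

```r
# Hypothetical sample file: records separated by tabs, fields by "|".
path <- tempfile(fileext = ".txt")
writeLines("sadf|asdf|asdf\tqwer|qwer|qwer\tzxcv|zxcv|zxfcgv", path)

# readLines() returns a character vector, one element per physical line.
lines <- readLines(path)

# Split each line on tabs to get one string per record.
records <- unlist(strsplit(lines, "\t", fixed = TRUE))

# Split each record on "|"; fixed = TRUE keeps "|" from being read as a regex.
fields <- strsplit(records, "|", fixed = TRUE)

# Bind the pieces row-wise (assumes equal field counts in every record).
df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)

unlink(path)
```

The resulting columns are all character; type conversion (e.g. with type.convert) would be a separate step.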

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread Langston, Jim
Thanks Paul, That's the path I was marching down, I was hoping for something a little cleaner, I do the same with Perl or Java. Jim On 11/18/11 8:35 AM, Paul Hiemstra paul.hiems...@knmi.nl wrote: Hi Jim, You can read the text file using readLines. This puts each line in the file into an

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread jim holtman
It is pretty straightforward in R: x <- readLines(textConnection("sadf|asdf|asdf\tqwer|qwer|qwer\tzxcv|zxcv|zxfcgv")) closeAllConnections() # convert tabs to newlines x <- gsub("\t", "\n", x) # write out to a temp file and then read in as a data frame myFile <- tempfile() writeLines(x, con =

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread David Winsemius
On Nov 18, 2011, at 9:13 AM, Langston, Jim wrote: Thanks Paul, That's the path I was marching down, I was hoping for something a little cleaner, I do the same with Perl or Java. tesfil <- "aa|bb|cc\tdd|ee|ff\t" read.table(textConnection(gsub("\\\t", "\n", scan(
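
David's one-liner is cut off in this preview, so the following is only a sketch in the same spirit, not his code. It assumes scan() is used to split the input on tabs and swaps the textConnection for the `text =` argument of scan() and read.table(); the variable names are illustrative.

```r
# Sample string from the message above: tab-separated records, "|" fields.
testfile <- "aa|bb|cc\tdd|ee|ff\t"

# scan() with sep = "\t" yields one string per tab-separated record.
records <- scan(text = testfile, what = "", sep = "\t", quiet = TRUE)

# Drop any empty record produced by the trailing tab.
records <- records[nzchar(records)]

# Each element of 'records' is treated as one input line by read.table().
df <- read.table(text = records, sep = "|", stringsAsFactors = FALSE)
```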

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread jim holtman
The thing to watch out for is if your file is large, 'textConnection' is very slow at providing the data stream for something like read.table. It is usually much faster to read in the file with 'readLines', preprocess the data, write it out to a tempfile and then read it back in with
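
The tempfile workflow described here can be sketched as follows. This is an illustration under assumptions, not the full code from the thread: the input file is a hypothetical temporary file holding the sample data from the earlier message, and read.table with sep = "|" is assumed for the final step.

```r
# Hypothetical input file with tab-separated records and "|" fields.
input <- tempfile(fileext = ".txt")
writeLines("sadf|asdf|asdf\tqwer|qwer|qwer\tzxcv|zxcv|zxfcgv", input)

# Read the raw lines, then turn every tab into a newline.
x <- readLines(input)
x <- gsub("\t", "\n", x, fixed = TRUE)

# Write the preprocessed text to a temp file and read it back as a data frame.
myFile <- tempfile()
writeLines(x, con = myFile)
df <- read.table(myFile, sep = "|", stringsAsFactors = FALSE)

# Clean up both temporary files.
unlink(c(input, myFile))
```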

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread David Winsemius
On Nov 18, 2011, at 9:28 AM, jim holtman wrote: It is pretty straightforward in R: x <- readLines(textConnection("sadf|asdf|asdf\tqwer|qwer|qwer\tzxcv|zxcv|zxfcgv")) closeAllConnections() # convert tabs to newlines x <- gsub("\t", "\n", x) Did the rules get liberalized for escaping patterns? Or

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread Gabor Grothendieck
On Fri, Nov 18, 2011 at 10:26 AM, David Winsemius dwinsem...@comcast.net wrote: On Nov 18, 2011, at 9:28 AM, jim holtman wrote: It is pretty straightforward in R: x <- readLines(textConnection("sadf|asdf|asdf\tqwer|qwer|qwer\tzxcv|zxcv|zxfcgv")) closeAllConnections() # convert tabs to

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread Bert Gunter
David: As you now realize \t etc. is a perfectly legal single tab character. Now consider: Error in gsub(\\, a, \\) : invalid regular expression '\', reason 'Trailing backslash' BUT gsub(,a,\\) [1] a ??? The issue is there are two levels of escapes here -- the R parser's and the reg
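
The two levels of escaping can be illustrated with a short sketch. These calls are written for illustration and are not copied from Bert's message, whose code is garbled in this preview; the variable names are hypothetical.

```r
# Level 1, the R parser: "\t" is a single tab character, "\\t" is two
# characters (a backslash followed by "t").
n_tab     <- nchar("\t")
n_escaped <- nchar("\\t")

# Level 2, the regex engine: given the two characters \t it also matches a
# tab, so both patterns replace the tab in "a\tb".
r1 <- gsub("\t", "\n", "a\tb")
r2 <- gsub("\\t", "\n", "a\tb")

# A pattern of one backslash is an incomplete regex ("Trailing backslash").
res_err <- tryCatch(suppressWarnings(gsub("\\", "a", "\\")),
                    error = function(e) e)

# Matching a literal backslash takes four backslashes in R source: the
# parser reduces them to two, which the regex engine reads as one.
res_ok <- gsub("\\\\", "a", "\\")
```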

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread Bert Gunter
... I failed to correctly paste the first line of an example: On Fri, Nov 18, 2011 at 10:44 AM, Bert Gunter bgun...@gene.com wrote: David: As you now realize \t etc. is a perfectly legal single tab character. Now consider: - left this out -- gsub("\\","a","\\")

Re: [R] Reading a file w/ two delimiters

2011-11-18 Thread Bert Gunter
... and yet another line I left out below! I apologize for this baloney! On Fri, Nov 18, 2011 at 10:48 AM, Bert Gunter bgun...@gene.com wrote: ... I failed to correctly paste the first line of an example: On Fri, Nov 18, 2011 at 10:44 AM, Bert Gunter bgun...@gene.com wrote: David: As you