The thing to watch out for is that if your file is large, 'textConnection'
is very slow at providing the data stream for something like
'read.table'.  It is usually much faster to read in the file with
'readLines', preprocess the data, write it out to a tempfile, and
then read it back in with 'read.table'.
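
A minimal sketch of that approach, assuming the layout from the original
question (tabs where the newlines should be, '|' between fields). The sample
data and the names infile, records, tmp and dat are made up for illustration;
substitute your own file for infile:

```r
# Hypothetical sample file standing in for the real data: records are
# separated by tabs, fields by "|".
infile <- tempfile(fileext = ".txt")
writeLines("aa|bb|cc\tdd|ee|ff\t", infile)

# 1. Read the raw text; each physical line becomes one string.
raw <- readLines(infile, warn = FALSE)

# 2. Preprocess: split on tabs so each record is its own element,
#    then drop any empty pieces.
records <- unlist(strsplit(raw, "\t", fixed = TRUE))
records <- records[nzchar(records)]

# 3. Write the cleaned records to a temporary file, one per line.
tmp <- tempfile(fileext = ".txt")
writeLines(records, tmp)

# 4. Read the temporary file back in with '|' as the field separator.
dat <- read.table(tmp, sep = "|", stringsAsFactors = FALSE)

# Remove the temporary files once the data frame is built.
unlink(c(infile, tmp))
print(dat)
```

With many files, the same steps can be wrapped in a function and applied
to each file name in turn.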

On Fri, Nov 18, 2011 at 9:52 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
> On Nov 18, 2011, at 9:13 AM, Langston, Jim wrote:
>
>> Thanks Paul,
>>
>> That's the path I was marching down. I was hoping for something
>> a little cleaner; I do the same with Perl or Java.
>
>> tesfil <- "aa|bb|cc\tdd|ee|ff\t"
>
>> read.table(textConnection(gsub("\\\t", "\n", scan(
>               textConnection(tesfil), # substitute your file here
>               what="character")) ), sep="|")
> Read 2 items
>  V1 V2 V3
> 1 aa bb cc
> 2 dd ee ff
>
>>
>> Jim
>>
>> On 11/18/11 8:35 AM, "Paul Hiemstra" <paul.hiems...@knmi.nl> wrote:
>>
>>> Hi Jim,
>>>
>>> You can read the text file using readLines. This puts each line in the
>>> file into an element of a character vector. Then you can go through the
>>> lines manually (e.g. using grep, sub, strsplit) and create your data.frame.
>>>
>>> cheers,
>>> Paul
>>>
>>> On 11/18/2011 12:37 PM, Langston, Jim wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I've been scratching and poking, but basically, the file I need to read
>>>> has two delimiters that I need to contend with. The first is that the
>>>> file contains tabs (\t) instead of newlines (\n), and the second is that
>>>> the fields have | for the separators. I can easily do a read if I first
>>>> convert the \t to \n and then use read.table to get the file read with
>>>> the | separator. But what I would really like to do is do this all
>>>> within R. I have a lot of files to read and do analysis on.
>>>>
>>>> I can read the data into a table using \t as the delimiter, but can't
>>>> figure out how to take that table data and use the | for separation.
>>>> I've looked at string splits, etc., but haven't figured out how to split
>>>> the whole table.
>>>>
>>>> Any thoughts? Hints?
>>>>
>>>> Thanks,
>>>>
>>>> Jim
>>>>
>>>>
>>>>
>>>>
>> The contents of this e-mail are intended for the named addressee only. It
>> contains information that may be confidential. Unless you are the named
>> addressee or an authorized designee, you may not copy or use it, or disclose
>> it to anyone else. If you received it in error please notify us immediately
>> and then destroy it.
>>
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>> --
>>> Paul Hiemstra, Ph.D.
>>> Global Climate Division
>>> Royal Netherlands Meteorological Institute (KNMI)
>>> Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
>>> P.O. Box 201 | 3730 AE | De Bilt
>>> tel: +31 30 2206 494
>>>
>>> http://intamap.geo.uu.nl/~paul
>>> http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
>>>
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

