The software attempts to read the registry and temporarily augment the
path in case you have Rtools installed, so that the filter can access
all the tools that Rtools provides.  I am not sure why it's failing on
your system, but there is evidently some difference between systems
here, and I have added some code to trap and bypass that portion in
case it fails.  I have added the new version to the svn repository, so
try this:

library(sqldf)
# overwrite with development version
source("http://sqldf.googlecode.com/svn/trunk/R/sqldf.R";)
# your code to call read.csv.sql
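
For example, a minimal sketch (using the file and filter names from your
earlier message; adjust to your actual file and script) would be:

test_df <- read.csv.sql(file = "3wkoutstatfcst_small.dat",
                        filter = "perl parse_3wkout.pl", eol = "\n")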


On Sat, Feb 6, 2010 at 5:18 PM, Vadlamani, Satish {FLNA}
<satish.vadlam...@fritolay.com> wrote:
>
> Gabor:
> Here is the update. As you can see, I got the same error as before
> (item 1 below).
>
> 1. Error
>  test_df <- read.csv.sql(file="out_small.txt", sql = "select * from file", 
> header = TRUE, sep = ",", filter="perl parse_3wkout.pl", eol="\n")
> Error in readRegistry(key, maxdepth = 3) :
>  Registry key 'SOFTWARE\R-core' not found
>
> 2. But loading the bigger file was successful, as you can see below:
> 857 MB, 333,250 rows, 227 columns. This is good.
>
> I will just have to do an inline edit in Perl to change the file to csv
> from within R, and then call read.csv.sql.
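>
> For example, something along these lines might work (just a sketch that
> writes a separate csv rather than editing in place; it assumes the script
> reads stdin and writes csv to stdout, and "out_small.csv" is only a
> placeholder name):
>
> shell("perl parse_3wkout.pl < 3wkoutstatfcst_small.dat > out_small.csv")
> test_df <- read.csv.sql(file = "out_small.csv", eol = "\n")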
>
> If you have any suggestions to fix 1, I would like to try them.
>
>  system.time(test_df <- read.csv.sql(file="out.txt"))
>   user  system elapsed
>  192.53   15.50  213.68
> Warning message:
> closing unused connection 3 (out.txt)
>
> Thanks again.
>
> Satish
>
> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Saturday, February 06, 2010 3:02 PM
> To: Vadlamani, Satish {FLNA}
> Cc: r-help@r-project.org
> Subject: Re: [R] Reading large files
>
> Note that you can shorten #1 to read.csv.sql("out.txt") since your
> other arguments are the default values.
>
> For the second one, use read.csv.sql, eliminate the arguments that are
> defaults anyway (they should not cause a problem, but they are
> error-prone), and add an explicit eol= argument, since SQLite can have
> problems with end of line in some cases.  Also test your perl script
> separately from R first to ensure that it works:
>
> test_df <- read.csv.sql(file="3wkoutstatfcst_small.dat", filter="perl
> parse_3wkout.pl", eol = "\n")
>
> SQLite has some known problems with end-of-line handling, so try it
> with and without the eol= argument just in case.  When I put together
> the following gawk example I noticed that I did need to specify the
> eol= argument.
>
> Also I have added a complete example using gawk as Example 13c on the
> home page just now:
> http://code.google.com/p/sqldf/#Example_13._read.csv.sql_and_read.csv2.sql
>
>
> On Sat, Feb 6, 2010 at 3:52 PM, Vadlamani, Satish {FLNA}
> <satish.vadlam...@fritolay.com> wrote:
>> Gabor:
>>
>> I had success with the following.
>> 1. I created a csv file called "out.txt" with a perl script, and then ran
>> the following successfully:
>> library("sqldf")
>> test_df <- read.csv.sql(file="out.txt", sql = "select * from file", header = 
>> TRUE, sep = ",", dbname = tempfile())
>>
>> 2. I did not have success with the following. Could you tell me what I may
>> be doing wrong? I can paste the perl script if necessary. In the perl
>> script, I read the file, create the csv records, print them one by one,
>> and then exit.
>>
>> Thanks.
>>
>> The call below did not succeed:
>> #test_df <- read.csv2.sql(file="3wkoutstatfcst_small.dat", sql = "select * 
>> from file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl", dbname 
>> = tempfile())
>> test_df
>>
>> Error message below:
>> test_df <- read.csv2.sql(file="3wkoutstatfcst_small.dat", sql = "select * 
>> from file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl", dbname 
>> = tempfile())
>> Error in readRegistry(key, maxdepth = 3) :
>>  Registry key 'SOFTWARE\R-core' not found
>> In addition: Warning messages:
>> 1: closing unused connection 14 (3wkoutstatfcst_small.dat)
>> 2: closing unused connection 13 (3wkoutstatfcst_small.dat)
>> 3: closing unused connection 11 (3wkoutstatfcst_small.dat)
>> 4: closing unused connection 9 (3wkoutstatfcst_small.dat)
>> 5: closing unused connection 3 (3wkoutstatfcst_small.dat)
>>> test_df <- read.csv2.sql(file="3wkoutstatfcst_small.dat", sql = "select * 
>>> from file", header = TRUE, sep = ",", filter="perl parse_3wkout.pl", dbname 
>>> = tempfile())
>> Error in readRegistry(key, maxdepth = 3) :
>>  Registry key 'SOFTWARE\R-core' not found
>>
>> -----Original Message-----
>> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>> Sent: Saturday, February 06, 2010 12:14 PM
>> To: Vadlamani, Satish {FLNA}
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Reading large files
>>
>> No.
>>
>> On Sat, Feb 6, 2010 at 1:01 PM, Vadlamani, Satish {FLNA}
>> <satish.vadlam...@fritolay.com> wrote:
>>> Gabor:
>>> Can I pass colClasses as a vector to read.csv.sql? Thanks.
>>> Satish
>>>
>>>
>>> -----Original Message-----
>>> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>>> Sent: Saturday, February 06, 2010 9:41 AM
>>> To: Vadlamani, Satish {FLNA}
>>> Cc: r-help@r-project.org
>>> Subject: Re: [R] Reading large files
>>>
>>> It's just any Windows batch command string that filters stdin to
>>> stdout.  What the command consists of should not be important.  An
>>> invocation of perl that runs a perl script filtering stdin to stdout
>>> might look like this:
>>>  read.csv.sql("myfile.dat", filter = "perl myprog.pl")
>>>
>>> For an actual example see the source of read.csv2.sql which defaults
>>> to using a Windows vbscript program as a filter.
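>>>
>>> As a quick sketch, once the package is loaded you can print that
>>> function at the prompt to look at its source:
>>>
>>> library(sqldf)
>>> read.csv2.sql   # printing the function shows how its default filter is set up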
>>>
>>> On Sat, Feb 6, 2010 at 10:16 AM, Vadlamani, Satish {FLNA}
>>> <satish.vadlam...@fritolay.com> wrote:
>>>> Jim, Gabor:
>>>> Thanks so much for the suggestions that I can use read.csv.sql and embed
>>>> Perl (or gawk). I just want to mention that I am running on Windows. I am
>>>> going to read the documentation for the filter argument and see if it can
>>>> take a decent-sized Perl script and use its output as input.
>>>>
>>>> Suppose that I write a Perl script that parses this fwf file and creates
>>>> a CSV file. Can I embed this within the read.csv.sql call, or can the
>>>> filter only be a single statement? If you know the answer, please let me
>>>> know. Otherwise, I will try a few things and report back the results.
>>>>
>>>> Thanks again.
>>>> Satish
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: jim holtman [mailto:jholt...@gmail.com]
>>>> Sent: Saturday, February 06, 2010 6:16 AM
>>>> To: Gabor Grothendieck
>>>> Cc: Vadlamani, Satish {FLNA}; r-help@r-project.org
>>>> Subject: Re: [R] Reading large files
>>>>
>>>> In perl, the 'unpack' function makes it very easy to parse fixed-field
>>>> data.
>>>>
>>>> On Fri, Feb 5, 2010 at 9:09 PM, Gabor Grothendieck
>>>> <ggrothendi...@gmail.com> wrote:
>>>>> Note that the filter= argument on read.csv.sql can be used to pass the
>>>>> input through a filter written in perl, [g]awk or other language.
>>>>> For example: read.csv.sql(..., filter = "gawk -f myfilter.awk")
>>>>>
>>>>> gawk has the FIELDWIDTHS variable for automatically parsing
>>>>> fixed-width fields (see e.g.
>>>>> http://www.delorie.com/gnu/docs/gawk/gawk_44.html), which makes this
>>>>> very easy, but perl or whatever you are most used to would be fine too.
>>>>>
>>>>> On Fri, Feb 5, 2010 at 8:50 PM, Vadlamani, Satish {FLNA}
>>>>> <satish.vadlam...@fritolay.com> wrote:
>>>>>> Hi Gabor:
>>>>>> Thanks. My files are all in fixed-width format, and there are a lot of
>>>>>> them. It would take me some effort to convert them to CSV. I guess this
>>>>>> cannot be avoided? I can write some Perl scripts to convert the
>>>>>> fixed-width format to CSV and then start with your suggestion. Could
>>>>>> you let me know your thoughts on this approach?
>>>>>> Satish
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
>>>>>> Sent: Friday, February 05, 2010 5:16 PM
>>>>>> To: Vadlamani, Satish {FLNA}
>>>>>> Cc: r-help@r-project.org
>>>>>> Subject: Re: [R] Reading large files
>>>>>>
>>>>>> If your problem is just how long it takes to load the file into R, try
>>>>>> read.csv.sql in the sqldf package.  A single read.csv.sql call can
>>>>>> create an SQLite database and table layout for you, read the file into
>>>>>> the database (without going through R, so R can't slow this down),
>>>>>> extract all or a portion into R based on the sql argument you give it,
>>>>>> and then remove the database.  See the examples on the home page:
>>>>>> http://code.google.com/p/sqldf/#Example_13._read.csv.sql_and_read.csv2.sql
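>>>>>>
>>>>>> For instance, as a sketch (the file name here is just a placeholder),
>>>>>> you can pull only a portion of the rows into R instead of the whole
>>>>>> file:
>>>>>>
>>>>>> library(sqldf)
>>>>>> DF <- read.csv.sql("myfile.csv", sql = "select * from file limit 10000")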
>>>>>>
>>>>>> On Fri, Feb 5, 2010 at 2:11 PM, Satish Vadlamani
>>>>>> <satish.vadlam...@fritolay.com> wrote:
>>>>>>>
>>>>>>> Matthew:
>>>>>>> If it is going to help, here is the explanation. I have an end state
>>>>>>> in mind; it is given below under the "End State" header. In order to
>>>>>>> get there, I need to start somewhere, right? I started with an 850 MB
>>>>>>> file and could not load it in what I think is a reasonable time (I
>>>>>>> waited for an hour).
>>>>>>>
>>>>>>> There are references to 64-bit. How will that help? It is a machine
>>>>>>> with 4 GB of RAM, and there is no paging activity when loading the
>>>>>>> 850 MB file.
>>>>>>>
>>>>>>> I have seen other threads on the same types of questions. I did not
>>>>>>> see any clear-cut answers, or errors that I could have been making in
>>>>>>> the process. If I am missing something, please let me know. Thanks.
>>>>>>> Satish
>>>>>>>
>>>>>>>
>>>>>>> End State
>>>>>>>> Satish wrote: "at one time I will need to load say 15GB into R"
>>>>>>>
>>>>>>>
>>>>>>> -----
>>>>>>> Satish Vadlamani
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jim Holtman
>>>> Cincinnati, OH
>>>> +1 513 646 9390
>>>>
>>>> What is the problem that you are trying to solve?
>>>>
>>>
>>
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
