You will probably have to contact the maintainer of package sqldf (G. Grothendieck <ggrothendi...@gmail.com>) as it appears that the package does not support missing values. I tried stripping out the NAs so that the fields were empty, but the results are the same. Fields specified as numeric are converted to character. Inserting a complete record at the top of the data prevents the conversion to text, but NA's are still converted to zeros.
As a workaround, you could replace NA's with 9999 before reading the file and then convert this value to NA within R. ---------------------------------------------- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352 > -----Original Message----- > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- > project.org] On Behalf Of David Reiner > Sent: Wednesday, September 05, 2012 4:40 PM > To: r-help@r-project.org > Subject: Re: [R] using sqldf's read.csv.sql to read a file with "NA" > for missing > > I see the attachment didn't go through. > here is the text: > > 2012-07-01,15:50:00,NA,0,NA,0,NA > 2012-07-01,15:51:00,NA,0,NA,0,NA > 2012-07-01,15:52:00,NA,0,NA,0,NA > 2012-07-01,15:53:00,NA,0,NA,0,NA > 2012-07-01,15:54:00,NA,0,NA,0,NA > 2012-07-01,15:55:00,NA,0,NA,0,NA > 2012-07-01,15:56:00,NA,0,NA,0,NA > 2012-07-01,15:57:00,NA,0,NA,0,NA > 2012-07-01,15:58:00,NA,0,NA,0,NA > 2012-07-01,15:59:00,NA,0,NA,0,NA > 2012-07-01,16:00:00,1358.5,1,1358.5,4,NA > 2012-07-01,16:01:00,NA,0,1358.25,2,NA > 2012-07-01,16:02:00,NA,0,1358.25,2,NA > 2012-07-01,16:03:00,NA,0,1358.25,2,NA > 2012-07-01,16:04:00,NA,0,1358.25,2,NA > 2012-07-01,16:05:00,NA,0,1358.25,2,NA > 2012-07-01,16:06:00,1358.25,2,1358.25,2,NA > 2012-07-01,16:07:00,1358.25,2,1358.25,2,NA > 2012-07-01,16:08:00,1358,3,1358,2,NA > 2012-07-01,16:09:00,1358.25,2,1358.25,2,NA > 2012-07-01,16:10:00,1358.25,2,1358.25,2,NA > 2012-07-01,16:11:00,1358.25,2,1358.25,2,NA > 2012-07-01,16:12:00,1357.5,3,1357.5,3,NA > 2012-07-01,16:13:00,1357.25,10,1357.25,3,1357.25 > 2012-07-01,16:14:00,NA,0,1357.25,10,1357.125 > 2012-07-01,16:15:00,1357,20,1357,18,1357 > 2012-07-01,16:16:00,1357,20,1357,18,1357 > 2012-07-01,16:17:00,1357,20,1357,18,1357 > 2012-07-01,16:18:00,1357,20,1357,18,1357 > 2012-07-01,16:19:00,1357,20,1357,18,1357 > 2012-07-01,16:20:00,1357,20,1357,18,1357 > 2012-07-01,16:21:00,1357,20,1357,18,1357 > > -- David > > > -----Original Message----- > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- > project.org] On Behalf Of David Reiner > Sent: Wednesday, September 05, 2012 4:32 PM > To: r-help@r-project.org > Subject: [R] using sqldf's read.csv.sql to read a file with "NA" for > missing > > I'm trying to use sqdf's function read.csv.sql to read CSV files in > which the missing values are represented by NA's. > Plain old read.csv works fine on these files, but they are rather large > and I'd like to filter using sql-like statements. > However, even if I specify field.types correctly and nrows=-1, it still > turns the columns with NA's into chars or 0. > I'm trying to make this OS independent, so I don't think I can use a > filter to convert the NA's to NULL's or whatever SQLite would > understand. > I can accept it everything has to be read in as char and then convert > to doubles with as.numeric, but I'm looking for speed. > > Here is code I thought would read the file (I've attached a small > sample.) > It almost works if there are no NA's in the initial rows, but it still > turns NA's into 0's instead of NA or something I can change into NA; > and it returns characters if there are NA's in the initial rows. > (0 is a possible value so I can't filter out the 0's.) > > field.types <- list(V1='char', V2='char', V3='real', V4='int', > V5='real', V6='int', V7='real') > dtst <- read.csv.sql("./tmp.csv", header=FALSE, > field.types=field.types, nrows=-1) > str(dtst) > > 'data.frame': 32 obs. of 7 variables: > $ V1: chr "2012-07-01" "2012-07-01" "2012-07-01" "2012-07-01" ... > $ V2: chr "15:50:00" "15:51:00" "15:52:00" "15:53:00" ... > $ V3: chr "NA" "NA" "NA" "NA" ... > $ V4: int 0 0 0 0 0 0 0 0 0 0 ... > $ V5: chr "NA" "NA" "NA" "NA" ... > $ V6: int 0 0 0 0 0 0 0 0 0 0 ... > $ V7: chr "NA" "NA" "NA" "NA" ... > > Any suggestions? > Thanks, > -- David L. Reiner > > > > > This e-mail and any materials attached hereto, including, without > limitation, all content hereof and thereof (collectively, "XR Content") > are confidential and proprietary to XR Trading, LLC ("XR") and/or its > affiliates, and are protected by intellectual property laws. Without > the prior written consent of XR, the XR Content may not (i) be > disclosed to any third party or (ii) be reproduced or otherwise used by > anyone other than current employees of XR or its affiliates, on behalf > of XR or its affiliates. > > THE XR CONTENT IS PROVIDED AS IS, WITHOUT REPRESENTATIONS OR WARRANTIES > OF ANY KIND. TO THE MAXIMUM EXTENT PERMISSIBLE UNDER APPLICABLE LAW, > XR HEREBY DISCLAIMS ANY AND ALL WARRANTIES, EXPRESS AND IMPLIED, > RELATING TO THE XR CONTENT, AND NEITHER XR NOR ANY OF ITS AFFILIATES > SHALL IN ANY EVENT BE LIABLE FOR ANY DAMAGES OF ANY NATURE WHATSOEVER, > INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL > AND PUNITIVE DAMAGES, LOSS OF PROFITS AND TRADING LOSSES, RESULTING > FROM ANY PERSON'S USE OR RELIANCE UPON, OR INABILITY TO USE, ANY XR > CONTENT, EVEN IF XR IS ADVISED OF THE POSSIBILITY OF SUCH DAMAGES OR IF > SUCH DAMAGES WERE FORESEEABLE. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.