On 23/09/15 10:00, Therneau, Terry M., Ph.D. wrote:
> I have a csv file from an automatic process (so this will happen
> thousands of times), for which the first row is a vector of variable
> names and the second row often starts something like this:
>
> 5724550,"000202075214",2005.02.17,2005.02.17,"F", .....
>
> Notice the second variable, which is
>   - a character string (note the quotation marks)
>   - a sequence of numeric digits
>   - one in which leading zeros are significant
>
> The read.csv function insists on turning this into a numeric.  Is there
> any simple set of options that will turn this behavior off?  I'm looking
> for a way to tell it to "obey the bloody quotes" -- I still want the
> first, third, etc columns to become numeric.  There can be more than one
> variable like this, and not always in the second position.
>
> This happens deep inside the httr library; there is an easy way for me
> to add more options to the read.csv call but it is not so easy to
> replace it with something else.

IMHO this is a bug in read.csv().

A possible workaround:

ccc <- c("integer","character",rep(NA,k))
X   <- read.csv("melvin.csv",colClasses=ccc)

where "melvin.csv" is the file from which you are attempting to read and
where k+2 = the number of columns in that file.  The NA entries leave
the remaining k columns to be converted automatically, as usual.
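
As a minimal sketch of this workaround, assuming a five-column file
shaped like the row quoted above: the column names and the inline text
are made up for illustration, and the text= argument of read.csv stands
in for the real file.

```r
# Hypothetical sample modelled on the row in the question;
# the header names are invented for this example.
csv_text <- paste(
  'id,code,start,end,sex',
  '5724550,"000202075214",2005.02.17,2005.02.17,"F"',
  sep = "\n"
)

# Default call: the quoted digit string is converted to a number,
# so the leading zeros are lost.
default <- read.csv(text = csv_text)
is.numeric(default$code)

# Workaround: fix the classes of the first two columns and let
# read.csv guess the remaining k columns (NA means "convert as usual").
k   <- 3
ccc <- c("integer", "character", rep(NA, k))
X   <- read.csv(text = csv_text, colClasses = ccc)
X$code
```

With colClasses set this way, X$code should stay a character string with
its leading zeros intact, while X$id is still read as an integer.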

Kludgey, but it might work.
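
Since the quoted columns are not always in the second position, the
colClasses vector could perhaps be built from the first data row rather
than typed by hand.  The following is only a sketch: the helper name
quoted_col_classes is hypothetical, the sample file is invented, and it
assumes that no quoted field contains a comma or a line break.

```r
# Hypothetical helper: mark every column whose first data value starts
# with a double quote as "character"; leave the others as NA so that
# read.csv converts them automatically.
quoted_col_classes <- function(path) {
  first_row <- readLines(path, n = 2)[2]
  # quote = "" keeps the quote characters so they can be detected
  fields <- scan(text = first_row, what = "", sep = ",",
                 quote = "", quiet = TRUE)
  ifelse(grepl('^\\s*"', fields), "character", NA)
}

# Invented sample file modelled on the row in the question
tmp <- tempfile(fileext = ".csv")
writeLines(c('id,code,start,end,sex',
             '5724550,"000202075214",2005.02.17,2005.02.17,"F"'),
           tmp)

cc <- quoted_col_classes(tmp)
Z  <- read.csv(tmp, colClasses = cc)
Z$code
```

Note that this treats every quoted column as character, so the 5th
column here would come back as "F" rather than as a logical.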

Another workaround is to specify quote="", but this has the side effect
of making the 5th column character rather than logical.
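
A small sketch of that second workaround, again on made-up sample data
with invented column names.  As far as I can tell the quote characters
themselves are also kept as part of the values, so they would need to be
stripped afterwards; stringsAsFactors = FALSE is set only so that the
result does not depend on the R version.

```r
# Hypothetical sample modelled on the row in the question
csv_text <- paste(
  'id,code,start,end,sex',
  '5724550,"000202075214",2005.02.17,2005.02.17,"F"',
  sep = "\n"
)

# quote = "" switches off quote handling altogether
Y <- read.csv(text = csv_text, quote = "", stringsAsFactors = FALSE)

class(Y$sex)    # character, not logical
Y$code          # digits preserved, but still wrapped in quote characters

# One possible clean-up step: drop the literal double quotes
code_clean <- gsub('"', '', Y$code, fixed = TRUE)
```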

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
