On 03/12/10 14:33, Duncan Murdoch wrote:
On 02/12/2010 8:04 PM, Peter Ehlers wrote:
On 2010-12-02 16:26, Rolf Turner wrote:
On 3/12/2010, at 1:08 PM, Phil Spector wrote:

Rolf -
      I'd suggest using

       junk <- read.csv("junk.csv", header = TRUE, fill = FALSE)

if you don't want the behaviour you're seeing.

The point is not that I don't want this kind of behaviour.
The point is that it seems to me to be unexpected and dangerous.

I can indeed take precautions against it, now that I know about it,
by specifying fill=FALSE, provided that I remember to do so.

Now that you've pointed it out I can see that this is the reason
for the different behaviour between read.table() and read.csv();
in read.table() fill=FALSE is effectively the default.

Having fill=TRUE as the default in read.csv() strikes me as
counter-intuitive and dangerous.
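
To make the difference concrete, here is a minimal sketch. The file and its contents are made up for illustration (three columns, with the last data row one field short); this is not the junk.csv from the original post.

```r
# Hypothetical three-column file whose last data row is one field short.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5"), path)

# Default read.csv() (fill = TRUE): the short row is padded with NA,
# and no warning is given.
filled <- read.csv(path, header = TRUE)

# fill = FALSE: the same short row makes the read fail instead.
# The error message is captured here so the script keeps running.
strict <- tryCatch(
  read.csv(path, header = TRUE, fill = FALSE),
  error = function(e) conditionMessage(e)
)

print(filled)
print(strict)
```

With the default, the short row comes back as an ordinary row with a missing value, which is easy to overlook in a large file.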

Rolf,
This is not to argue with your point re counter-intuitive,
but I always run a count.fields() first if I haven't seen
(or can't easily see) the file in my editor. I must have
learned that the hard way a long time ago.
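
A count.fields() check along those lines might look like the sketch below. The file is a made-up stand-in for one that has not been inspected, and comparing every line against the header's field count is just one possible rule.

```r
# Hypothetical file with one short row, standing in for an unseen CSV.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5"), path)

# Number of comma-separated fields on each line, header included.
n_fields <- count.fields(path, sep = ",")

# Lines whose field count differs from the header's are suspect.
bad_lines <- which(n_fields != n_fields[1])

print(table(n_fields))
print(bad_lines)
```

Note that count.fields() and read.csv() have different defaults for quote and comment.char, so for files containing quotes or '#' characters it may be worth passing the same values to both.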

I think the fill=TRUE option arrived about 10 years ago, in R 1.2.0.
The comment in the NEWS file suggests it was in response to some strange
csv file coming out of Excel.

The real problem with the CSV format is that there really isn't a well
defined standard for it.  The first RFC about it was published in 2005,
and it doesn't claim to be authoritative.  Excel is kind of a standard,
but it does some very weird things.  (For example:  enter the string 01
into a field.  To keep the leading 0, you need to type it as '01.  Save
the file, read it back:  goodbye 0.  At least that's what a website I
was just on says about Excel, and what OpenOffice does.)
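
The same loss of a leading zero can happen on the R side when such a file is read with default type conversion. A small sketch, using a made-up one-column file passed in as text, with colClasses as one way to keep the field as written:

```r
# Hypothetical one-column CSV holding a code with a leading zero.
csv_text <- "code\n01\n"

# Default type conversion turns the field into a number.
as_number <- read.csv(text = csv_text)

# Reading the column as character keeps the field as written.
as_text <- read.csv(text = csv_text, colClasses = "character")

print(as_number$code)
print(as_text$code)
```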

I've been burned so many times by storing data in .csv files that I
just avoid them whenever I can.

Absolutely agree with this, Duncan. Playing around with .csv files is
like playing with some sort of unstable explosive. I also avoid them as
much as possible.

David Scott


Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
_________________________________________________________________
David Scott     Department of Statistics
                The University of Auckland, PB 92019
                Auckland 1142,    NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics
