On Dec 2, 2010, at 8:33 PM, Duncan Murdoch wrote:

snipped

I think the fill=TRUE option arrived about 10 years ago, in R 1.2.0. The comment in the NEWS file suggests it was in response to some strange csv file coming out of Excel.

The real problem with the CSV format is that there really isn't a well-defined standard for it. The first RFC about it was published in 2005, and it doesn't claim to be authoritative. Excel is kind of a standard, but it does some very weird things. (For example: enter the string 01 into a field. To keep the leading 0, you need to type it as '01. Save the file, read it back: goodbye 0. At least that's what a website I was just on says about Excel, and what OpenOffice does.)

In both Excel and OO.org you can select a column (or any other range) and set its format to text. (The default is numeric, not that different from read.table()'s default behavior.) Once a format has been set, you no longer need leading quotes. I just created a small example in OO.org Calc, entering values with leading "0"s and no leading quotes, and this code runs as desired after copying the three cells to the clipboard:

> read.table(pipe("pbpaste"), colClasses="character")
    V1
1   01
2  004
3 0005
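
For anyone without pbpaste (it is a Mac utility), the same point can be sketched with inline text instead of the clipboard. This is only an illustration: it assumes a version of R whose read.table() accepts the text= argument, and the three sample values are simply typed in to match the cells above.

```r
# Sample values with leading zeros, one per line (typed in for illustration).
csv_text <- "01\n004\n0005"

# Default behaviour: read.table() guesses a numeric column, so the zeros vanish.
as_numeric <- read.table(text = csv_text)

# Forcing the column class to character keeps each field exactly as written.
as_text <- read.table(text = csv_text, colClasses = "character")

print(as_numeric$V1)
print(as_text$V1)
```

The first print shows plain integers, the second the original strings, which is the same contrast as between a numeric and a text-formatted column in a spreadsheet.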

The same applies to date fields in both OO.org and Excel. In this regard, it is simply a matter of understanding what the defined behavior of your software is and how one can manipulate it. This is no different from learning R's classes, coercing them to your ends, and dealing with other formatting issues.


I've been burned so many times by storing data in .csv files, that I just avoid them whenever I can.

No argument there. I know one physician whose weapon of choice is Stata who always uses "|" as his separator, but that's perhaps because he works entirely in Windows. I imagine that might not be the most uncommon character in *NIXen.
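
A rough sketch of what that looks like from the R side, in case it is useful. The data frame, its column names and the temporary file are all made up for the example; the only real point is that sep="|" lets a field contain commas without any quoting.

```r
# Made-up data: one field contains a comma, one has a leading zero.
df <- data.frame(id = c("01", "002"),
                 note = c("a, b", "c"),
                 stringsAsFactors = FALSE)

# Write with "|" as the separator and no quoting at all.
tmp <- tempfile(fileext = ".txt")
write.table(df, tmp, sep = "|", quote = FALSE, row.names = FALSE)

# Read it back, again with "|" and with quote handling switched off,
# keeping every column as character so leading zeros survive.
back <- read.table(tmp, sep = "|", header = TRUE,
                   colClasses = "character", quote = "")
unlink(tmp)

print(back)
```

This only works as long as "|" itself never appears in the data, which is the catch with any single-character separator.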

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
