Re: [R] SAS "datalines" or "cards" statement equivalent in R?

David Winsemius Mon, 07 Dec 2009 09:52:50 -0800


On Dec 7, 2009, at 12:37 PM, Marshall Feldman wrote:

I totally agree with Barry, although it's sometimes convenient to
include data with analysis code for debugging and/or documentation
purposes. However, the example actually applies equally to separate
data files. In fact, the example is from the U.S. Bureau of Labor
Statistics at ftp://ftp.bls.gov/pub/time.series/sm/, which contains
nothing but data and documentation files. At issue is not where the
data come from, but rather how to parse relatively complex data
organized inconsistently. SAS has the built-in ability to parse five
different organizations of data: list (delimited), modified list,
column, formatted, and mixed (see http://www.masil.org/sas/input.html).
It seems R can parse such data, but only with considerable work by the
user.

It is hard to know what you mean, because you have not specified what
you mean by "with considerable work by the user."

It would be great to have a function/package that implements
something as easy (hah!) and flexible as SAS.

In particular, it is not clear whether you were anticipating using
read.fwf() and why you think that requires "considerable more work"
than a SAS INPUT statement. The output of read.fwf gets passed to
read.table, so that help page would document your options regarding
definition of classes of input variables.
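
As a rough illustration of that point (this is not code from the
thread), the sketch below reads the sample record in two passes: first
as tab-delimited fields with read.table(), then splitting the
fixed-width series ID with read.fwf(), whose extra arguments such as
colClasses are passed through to read.table(). The tab separators, the
column names, and the object names raw, dat, id, and sms are
assumptions made for illustration. The widths follow the quoted INPUT
statement through industry; datatype (columns 21-22) is left out
because the ID in the sample line is only 20 characters long.

```r
# Sample record from the thread, kept inline much as SAS "datalines" would.
# Tab separators are assumed from the poster's description.
raw <- "SMS11000000000000001\t1990\tM01\t688.0"

# Step 1: read the delimited fields. colClasses keeps the ID as text so its
# leading zeros survive; fill = TRUE pads the usually absent footnote field.
con <- textConnection(raw)
dat <- read.table(con, sep = "\t", header = FALSE, fill = TRUE,
                  col.names  = c("series_id", "year", "period",
                                 "value", "footnote"),
                  colClasses = c("character", "integer", "character",
                                 "numeric", "character"))
close(con)

# Step 2: split the fixed-width ID with read.fwf(). Extra arguments such as
# colClasses are handed on to read.table().
con <- textConnection(dat$series_id)
id <- read.fwf(con, widths = c(2, 1, 2, 5, 2, 8),
               col.names  = c("survey", "seasonal", "state", "area",
                              "supersector", "industry"),
               colClasses = "character")
close(con)

# Character codes become factors, as the poster wanted.
id[] <- lapply(id, factor)

# Combine the ID parts with the remaining fields and inspect the result.
sms <- cbind(id, dat[c("year", "period", "value", "footnote")])
str(sms)
```

For a real file, the same two calls would take a file name in place of
the text connections. Whether this counts as more work than the INPUT
statement is a judgment call; the widths vector carries the same
information as the SAS column ranges.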


--
David


   Marsh

Barry Rowlingson wrote:

On Mon, Dec 7, 2009 at 3:53 PM, Marshall Feldman <ma...@uri.edu> wrote:
Regarding the various methods people have suggested, what if a typical
tab-delimited data line looks like:

   SMS11000000000000001 1990 M01 688.0

and the SAS INPUT statement is
INPUT survey $ 1-2 seasonal $ 3 state $ 4-5 area $ 6-10
      supersector $ 11-12 @13 industry $8. datatype $ 21-22
      year period $ value footnote $ ;
Note that most data lines have no footnote item, as in the sample.
Here (I think) we'd want all the character variables to be read as
factors, possibly "year" as a date, and "value" as numeric.
Actually I'm surprised that nobody has yet said what a clearly
bonkers thing it is to mix up your data and your analysis code in a
single file. Now suppose you have another set of data you want to
analyse with the same code? Are you going to create a new file and
paste the new data in? You've now got two copies of your analysis code
- good luck keeping corrections to that code synchronised.

This just seems like horrendously bad practice, which is one reason
it's kludgy in R. If it was good practice, someone would surely have
written a way to do it neatly.

Keep your data in data files, and your functions in .R function
files. You'll thank me later.

Barry




______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
