On Feb 9, 2012, at 3:51 PM, barny wrote:
I've been trying to get some data from the National Survey for Family
Growth into R. However, the data is in a .dat file, and the data I need
doesn't have any spaces or commas separating fields. Instead, you have to
look in the codebook to find how many characters along the line the data
you need sits.
The data I want are the following, where 1,12,int means that the data I'm
interested in starts in column 1, finishes in column 12, and is an integer.
('caseid', 1, 12, int),
('nbrnaliv', 22, 22, int),
('babysex', 56, 56, int),
('birthwgt_lb', 57, 58, int),
('birthwgt_oz', 59, 60, int),
('prglength', 275, 276, int),
('outcome', 277, 277, int),
('birthord', 278, 279, int),
('agepreg', 284, 287, int),
('finalwgt', 423, 440, float)
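That's not the way read.fwf is set up to accept data: it wants a vector
of field widths, not start and end columns. You will need to loop over
that list of fields and apply logic like:
vec <- numeric(0)
nams <- character(0)
# for each field (first, last), with first.next the start of the next field:
getwidth <- last - first + 1
vec <- c(vec, getwidth)
nams <- c(nams, <whatever>)
getblank <- first.next - last - 1
if (getblank > 0) { vec <- c(vec, getblank); nams <- c(nams, <junk-name>) }
With no zero-width entries left in it, vec will be your vector of widths
and nams your vector of col.names.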
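A minimal sketch of that logic in R, under a few assumptions. The field positions are copied from the list in the question. The helper name fwf_widths and the file name are hypothetical. Instead of giving the gaps junk names, this version passes them to read.fwf as negative widths, which read.fwf treats as columns to skip, so col.names only has to cover the fields that are kept.

```r
# Field layout copied from the question (1-based, inclusive column positions).
spec <- data.frame(
  name  = c("caseid", "nbrnaliv", "babysex", "birthwgt_lb", "birthwgt_oz",
            "prglength", "outcome", "birthord", "agepreg", "finalwgt"),
  first = c(1, 22, 56, 57, 59, 275, 277, 278, 284, 423),
  last  = c(12, 22, 56, 58, 60, 276, 277, 279, 287, 440),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert first/last positions into read.fwf() widths.
# Positive entries are fields to keep; negative entries are gaps to skip.
fwf_widths <- function(first, last) {
  keep <- last - first + 1                      # width of each wanted field
  prev_last <- c(0, last[-length(last)])        # end of the previous field (0 before the first)
  gap <- first - prev_last - 1                  # unread columns in front of each field
  w <- as.vector(rbind(-gap, keep))             # interleave: skip, keep, skip, keep, ...
  w[w != 0]                                     # drop zero-length gaps
}

widths <- fwf_widths(spec$first, spec$last)

# "nsfg_pregnancy.dat" is a placeholder file name, not taken from this thread.
# preg <- read.fwf("nsfg_pregnancy.dat", widths = widths, col.names = spec$name)
```

The fields must be listed in left-to-right order and must not overlap for the gap arithmetic to hold. The column types are left to read.fwf to infer, so finalwgt should come back as numeric if it contains a decimal point.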
How can I do this using R? I've written a Python programme which basically
does it, but it'd be nicer if I could skip the Python bit and just do it
using R. Cheers for any help.
--
View this message in context:
http://r.789695.n4.nabble.com/Getting-codebook-data-into-R-tp4374331p4374331.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT