You might want to look at Anthony Damico's work at

http://www.asdfree.com/search/label/behavioral%20risk%20factor%20surveillance%20system%20%28brfss%29
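
On the .ASC file: import.asc() and read.asc() expect an Arc/Info ASCII raster grid, whereas the BRFSS .ASC file appears to be a plain fixed-width text file. The value in your scan() error looks like several adjacent fixed-width fields run together. If that is right, something like read.fwf() is the tool to try. Below is only a minimal sketch under that assumption: the column names and widths are made up for illustration, and the real ones would have to come from the CDC's 2006 variable layout. It reads a tiny stand-in file so that it runs without the download.

```r
# Hypothetical layout: these three columns and their widths are invented.
# Replace them with the positions given in the CDC variable layout.
layout <- data.frame(
  name  = c("state", "month", "weight"),
  width = c(2, 2, 10),
  stringsAsFactors = FALSE
)

# Tiny stand-in for CDBRFS06.ASC (two records, 14 characters each).
tmp <- tempfile(fileext = ".asc")
writeLines(c("0101 1191.8808", "0212  943.3820"), tmp)

# Read fixed-width records; keep codes as character so leading zeros survive.
brfss <- read.fwf(tmp, widths = layout$width, col.names = layout$name,
                  colClasses = c("character", "character", "numeric"))
print(brfss)
```

read.fwf() can be slow on a file of about 500mb, so if this approach works on a few lines, a faster fixed-width reader may be worth a look for the full file.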

--
Better name for the general practitioner might be multispecialist. 
~Martin H. Fischer (1879-1962)


-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Torvon
Sent: Tuesday, February 23, 2016 2:13 PM
To: r-help@r-project.org
Subject: [R] Loading large .pxt and .asc datasets causes issues.

Hi,

I want to load a dataset into R. This dataset is available in two formats:
.XPT and .ASC. The dataset is available at 
http://www.cdc.gov/brfss/annual_data/annual_2006.htm.

They are about 40mb zipped, and about 500mb unzipped.

I can get the .xpt data to load, using:

> library(Hmisc)
> data <- sasxport.get("CDBRFS06.XPT")

The data look fine, and there are no error messages. However, the data contain
only 302 columns, fewer than the documentation says there should be. My
variables of interest are missing, so either the documentation or the data
file is wrong, and I want to make sure it's not the data file.

Hence I wanted to see if I get the same results loading the .ASC file.
However, multiple ways to do so have failed.

> library(adehabitat)
> import.asc("CDBRFS06.asc")

Results in:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  scan() expected 'a real', got '1191.8808943.38209868648.960119'

> library(SDMTools)
> read.asc("CDBRFS06.asc")

Results in:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  scan() expected 'a real', got '1191.8808943.38209868648.960119'
In addition: Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  number of items read is not a multiple of the number of columns
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  number of items read is not a multiple of the number of columns
3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  number of items read is not a multiple of the number of columns
4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  number of items read is not a multiple of the number of columns
5: In scan(file, nmax = nl * nc, skip = 6, quiet = TRUE) :
  NAs introduced by coercion to integer range

Thank you for your help.
   Eiko

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
