Dear Rolf,
I'll look into how to fix it tomorrow; your proposal is very welcome,
of course.
Best,
Uwe
On 18.01.2016 00:01, Rolf Turner wrote:
On 18/01/16 10:48, Uwe Ligges wrote:
This is not a tab delimited file (as you apparently assume given the
code), but a fixed width format, hence I'd try:
url <- "http://data.princeton.edu/wws509/datasets/divorce.dat"
widths <- c(9, 13, 10, 8, 10, 6)
f5 <- read.fwf(url, widths = widths, skip = 1, strip.white = TRUE)
names(f5) <- as.character(unlist(read.fwf(url, widths = widths,
strip.white=TRUE, n=1)))
Not sure why reading it simply with header=TRUE does not work, but no
time to investigate this now.
Dear Uwe,
I have fiddled around a bit, and the situation looks to me like a bug
in read.fwf. It would seem that in order for header=TRUE to work, the
entries of the header need to be separated by the sep delimiter, which
defaults to "\t". In the case in question the entries are separated by
blanks, so presumably the header gets read in as a single entity,
rather than 6 such, leading to a mismatch between the length of the
header and the number of columns.
It seems that the specified widths get ignored when the header line is
dealt with.
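
A small self-contained sketch of this behaviour, using a made-up
three-column file rather than the divorce data (the column names, widths
and values below are assumptions chosen purely for illustration). The
first file pads its header with blanks; the second delimits the header
with tabs, which is what header=TRUE appears to require:

```r
# Hypothetical fixed-width data: widths 4, 10 and 5 (not the divorce data).
widths <- c(4, 10, 5)
body <- c("1   Ann Lee   3.5",
          "2   Bob       4.0")

# Header padded with blanks, like the file under discussion.
f_blank <- tempfile(fileext = ".dat")
writeLines(c("id  name      score", body), f_blank)

# Header delimited by tabs, i.e. by the default sep = "\t".
f_tab <- tempfile(fileext = ".dat")
writeLines(c("id\tname\tscore", body), f_tab)

# Capture the condition instead of stopping, so both cases can be compared.
res_blank <- tryCatch(
  read.fwf(f_blank, widths = widths, header = TRUE, strip.white = TRUE),
  error = function(e) e
)
res_tab <- read.fwf(f_tab, widths = widths, header = TRUE, strip.white = TRUE)

print(inherits(res_blank, "error"))  # did the blank-padded header fail?
print(names(res_tab))                # names taken from the tab-delimited header
```

If the diagnosis above is right, the blank-padded header should fail
while the tab-delimited one should be read with three column names.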
It also seems that if one specifies sep="" then the header gets read
correctly but then strings of blanks get interpreted as field separators
throughout and then blanks within the fields result in the
wrong number of columns.
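
To illustrate why sep="" does not look like a general workaround, here
is a minimal sketch that counts whitespace-separated fields per line
with count.fields(). The file contents are invented for the example,
and the sketch only shows the splitting rule, not the internals of
read.fwf:

```r
# Hypothetical fixed-width lines; "Ann Lee" contains a blank inside a field.
lines <- c("id  name      score",
           "1   Ann Lee   3.5",
           "2   Bob       4.0")
f <- tempfile(fileext = ".dat")
writeLines(lines, f)

# With sep = "", any run of white space separates fields.
n_fields <- count.fields(f, sep = "")
print(n_fields)
```

The header and the "Bob" row split into three fields, but the row with
the embedded blank splits into four, so the rows no longer agree on the
number of columns.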
I think that the code of read.fwf is easy enough to fix; a slight
adjustment will make the header get treated the same way as the body of
the file.
I don't see any problems or drawbacks with doing so, and experimenting
with my modified function resulted in the divorce data being read in
with header=TRUE with no problems.
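
For anyone needing something in the meantime, the following is a rough
sketch of a wrapper that treats the header the same way as the body. It
is not the modified read.fwf mentioned above; it simply packages the
two-pass approach quoted earlier in this thread. The function name
read_fwf_header and the test data are hypothetical:

```r
# Hypothetical helper: parse the header line with the same widths as the body.
read_fwf_header <- function(file, widths, ...) {
  # Pass 1: read only the first line, as character, to get the names.
  hdr <- read.fwf(file, widths = widths, n = 1, strip.white = TRUE,
                  colClasses = "character")
  # Pass 2: read the body, skipping the header line.
  dat <- read.fwf(file, widths = widths, skip = 1, strip.white = TRUE, ...)
  names(dat) <- make.names(unlist(hdr, use.names = FALSE))
  dat
}

# Made-up fixed-width file with a blank-padded header.
f <- tempfile(fileext = ".dat")
writeLines(c("id  name      score",
             "1   Ann Lee   3.5",
             "2   Bob       4.0"), f)

dat <- read_fwf_header(f, widths = c(4, 10, 5))
print(dat)
```

This reads the file twice, which is wasteful for large files, but it
avoids any dependence on sep for the header line.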
If this mod is made, I see no reason to keep the "sep" argument in
read.fwf --- except maybe for backward compatibility issues, and I don't
think there would be any since it never worked properly anyhow.
cheers,
Rolf
P. S. I can send you my modified version of read.fwf off-list if this
would be of any use to you.
R.
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.