On Jul 17, 2009, at 5:25 PM, Ulrike Grömping wrote:

David Winsemius wrote:


On Jul 17, 2009, at 3:24 PM, Ulrike Grömping wrote:


David,

Thanks. Your explanation does not quite fit, though, as it refers to
using function data.frame, while I assigned the new column with $<-.
poly() does return an object of classes poly and matrix, not
model.matrix,

But model.matrix is not a class as far as I can tell. It has no
"is.<>" function, and examining a sample model matrix does not
indicate that it carries a special class attribute.

It is a class all right, but is apparently not per default assigned to
objects generated with function model.matrix. Try
mm <- model.matrix(lm(swiss))
str(data.frame(swiss,mm))
class(mm) <- c("model.matrix","matrix")
str(data.frame(swiss,mm))


Not what I expected. Seems odd to need to assert a class in order to get that effect.
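
A minimal sketch of why asserting the class changes the result, assuming a current base R installation: base R has an as.data.frame method for class "model.matrix", but model.matrix() does not set that class on the object it returns. The names d_split and d_single are illustrative only and do not appear in the thread.

```r
mm <- model.matrix(lm(swiss))

# The result is a plain numeric matrix; no "model.matrix" class is set.
inherits(mm, "model.matrix")

# Base R nevertheless provides an as.data.frame method for that class.
is.function(getS3method("as.data.frame", "model.matrix", optional = TRUE))

# Without the class, data.frame() spreads the matrix over several columns.
d_split <- data.frame(swiss, mm)

# With the class asserted, the matrix is kept as one matrix-valued column.
class(mm) <- c("model.matrix", "matrix")
d_single <- data.frame(swiss, mm)

c(split = ncol(d_split), single = ncol(d_single))
```

Comparing the two column counts shows the effect more compactly than reading the two str() listings.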


David Winsemius wrote:

...
It is just the assignment with "$" that does behave differently - and
not only for poly objects but for any matrix object. After I eventually
remembered how to get to the documentation of extractors
(?"$<-.data.frame"), I found this behavior documented there in the
section on Coercion. Nevertheless, this does seem to contradict the
understanding of what a data frame is. I am aware that data frames are
lists, but they are of course special lists, requiring that all list
elements have the same number of rows. So far I thought that all list
elements also have the same number of columns, namely just one. In
fact, the documentation of function data.frame states that

"A data frame is a list of variables of the same length with unique row
names, given class "data.frame".",

which would imply such a rule.

Except that the same page asserts:

"Note that when the replacement value is an array (including a matrix)
it is not treated as a series of columns (as data.frame and
as.data.frame do) but inserted as a single column."

This is the piece on coercion in the extract documentation I was also
referring to.
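
The difference described in that passage can be sketched as follows. This is an illustration under assumed names (dat, p, d_cols and d_one are made up here), not code from the thread.

```r
dat <- data.frame(X1 = 1:10, X2 = LETTERS[1:10])
p <- poly(dat$X1, 3)          # a 10 x 3 matrix that also carries class "poly"

# data.frame() treats the matrix as a series of columns.
d_cols <- data.frame(dat, p)
ncol(d_cols)

# `$<-` inserts the same matrix as a single, matrix-valued column.
d_one <- dat
d_one$X1poly <- p
ncol(d_one)
dim(d_one$X1poly)
```

The second data frame still has 10 rows; only the number of list elements differs between the two constructions.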


David Winsemius wrote:

... which is more on point documentation than what I offered earlier.
I also found that the <-I() construct within data.frame() would
replicate the behavior of df$x <- <mtx> (as was documented in
data.frame's help):
dat2 <- data.frame(X1=1:10, X2=LETTERS[1:10], X1poly <- I(poly(dat$X1,3)) )
length(dat2)
[1] 3
dat2[1,3]
              1        2          3
[1,] -0.4954337 0.522233 -0.4534252
attr(,"class")
[1] "poly"   "matrix"
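
A self-contained variant of the same idea, offered as a sketch: it assumes a helper vector x (a made-up name) in place of the dat object, which is not defined in this message, and names the column with `=` so that it is called X1poly.

```r
x <- 1:10

# I() protects the matrix, so data.frame() keeps it as one column.
dat2 <- data.frame(X1 = x, X2 = LETTERS[1:10], X1poly = I(poly(x, 3)))

length(dat2)       # number of list elements, i.e. data frame columns
names(dat2)
dim(dat2$X1poly)   # the third column is itself a 10 x 3 matrix
```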
The possibility of a matrix with more than one column being a column of
the data frame contradicts this piece of documentation, since the
length of the matrix is not the same as the length of the other columns
(e.g. length(poly(dat$X1,3)) is 30, not 10 like for the other
variables). Or would one consider the columns of the matrix X1poly the
variables, but X1poly a column? I'm not trying to be difficult, I just
find this quite confusing and wonder about the consequences when using
such a data frame in analyses.
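
To make the length question concrete, here is a small sketch. The response y is random, made-up data added only so that a model can be fitted; it says nothing about which behavior is preferable.

```r
set.seed(1)                       # arbitrary seed for the made-up response
dat <- data.frame(X1 = 1:10)
dat$X1poly <- poly(dat$X1, 3)     # matrix-valued column via `$<-`
dat$y <- rnorm(10)                # hypothetical response, illustration only

length(dat)          # list elements (columns of the data frame)
length(dat$X1poly)   # elements of the matrix: rows times columns
NROW(dat$X1poly)     # number of rows, which matches the other columns

# In a model formula the matrix column contributes one term per matrix column.
fit <- lm(y ~ X1poly, data = dat)
names(coef(fit))
```

In this sketch the row count, not length(), is what agrees across the columns, and lm() accepts the matrix column as a single term with several coefficients.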

There could be unforeseen consequences, but I am not the right person to
answer for all of those possibilities. I can see another instance
where it would be desirable to have tuples included in data.frames as
arrays and that is in the representation of complex numbers, but it
appears that the internal representation of complex numbers is more
completely hidden from casual view than is the capacity of data.frames
to carry matrices. If you have a compelling argument to change the
behavior of [<-.data.frame, you will need to take it up with the
developers.

I have no idea which behavior is more useful; also, if this behavior has
been around for a long time, changing it would presumably break some
code. I suppose I would just opt for clearer documentation of the data
frame class. The bugs interface is currently down; I may file a
documentation wish or documentation bug later.

Best regards, Ulrike


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
