Heinz Tuechler tuech...@gmx.at
on Sat, 07 Aug 2010 01:01:24 +0100 writes:
Also Surv objects are matrices and they share the same problem when
rbind-ing data.frames.
If contained in a data.frame, Surv objects lose their class after
rbind and therefore no longer represent Surv objects afterwards.
Using rbind with Surv objects outside of data.frames shows a similar
problem, but not the same column names.
In conclusion, yes, matrices are common in data.frames, but not
without problems.
My understanding (20 yr long S and R experience) has been that
a data frame definitely can have matrix-like components,
and as Bill Dunlap (with equal S and R experience) has just
explained, that's actually more common than you have thought.
Having *data frame*s as components instead of simple matrices should
be much less common, and I'm not sure it's a good idea.
But getting back to 'matrices',
I think they should work without problems, at least for basic
R operations such as rbind().
I don't have time to analyze the Surv example below,
but at the moment I think that we'd be interested in
fixing the problems.
Martin Maechler, ETH Zurich
Heinz
## example
library(survival)
## create example data
starttime <- rep(0,5)
stoptime <- 1:5
event <- c(1,0,1,1,1)
group <- c(1,1,1,2,2)
## build Surv object
survobj <- Surv(starttime, stoptime, event)
## build data.frame with Surv object
df.test <- data.frame(survobj, group)
df.test
## rbind data.frames
rbind(df.test, df.test)
## rbind Surv objects
rbind(survobj, survobj)
At 06.08.2010 09:34 -0700, William Dunlap wrote:
-Original Message-
From: r-devel-boun...@r-project.org
[mailto:r-devel-boun...@r-project.org] On Behalf Of Nicholas
L Crookston
Sent: Friday, August 06, 2010 8:35 AM
To: Michael Lachmann
Cc: r-devel-boun...@r-project.org; r-devel@r-project.org
Subject: Re: [Rd] rbind on data.frame that contains a column
that is also a data.frame
OK...I'll put in my 2 cents worth.
It seems to me that the problem is with this line:
b$a=a, where a is something other than a vector with
length equal to nrow(b).
I had no idea that a dataframe could hold a dataframe. It is not just
rbind(b,b) that fails, apply(b,1,sum) fails and so does plot(b). I'll
bet other R commands fail as well.
My point of view is that a dataframe is a list of vectors
of equal length and various types (this is not exactly what the help
page says, but it is what it suggests to me).
Hum, I wonder how much code is based on the idea that a dataframe can hold
a dataframe.
I used to think that non-vectors in data.frames were
pretty rare things but when I started looking into
the details of the modelling code I discovered that
matrices in data.frames are common. E.g.,
library(splines)
sapply(model.frame(data=mtcars, mpg~ns(hp)+poly(disp,2)), class)
$mpg
[1] "numeric"
$`ns(hp)`
[1] "ns"     "basis"  "matrix"
$`poly(disp, 2)`
[1] "poly"   "matrix"
You may not see these things because you don't call model.frame()
directly, but most modelling functions (e.g., lm() and glm())
do call it and use the grouping provided by the matrices to encode
how the columns of the design matrix are related to one another.
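A minimal sketch of the point above, using only base R. The names m and d are made up for illustration and do not come from the thread. It shows that a matrix stored in a data.frame is a single component, even though it prints as several columns.

```r
# Hypothetical example data: a 3 x 2 matrix stored as ONE data.frame component
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("u", "v")))
d <- data.frame(id = 1:3)
d$m <- m            # assign the whole matrix as a single column

print(d)            # prints as id, m.u, m.v
ncol(d)             # counts components, not printed columns
names(d)            # the matrix appears under the single name "m"
is.matrix(d$m)      # the component keeps its matrix structure
dim(d$m)            # and its original dimensions
```

This is the same grouping that model.frame() relies on: the component name ties the matrix columns together.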
If matrices are allowed, shouldn't data.frames be allowed as well?
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
15 years of using R just isn't enough! But, I can say that not one
line of code I've written expects a dataframe to hold a dataframe.
Hi,
The following was already a topic on r-help, but after understanding
what is going on, I think it fits better in r-devel.
The problem is this:
When a data.frame has another data.frame in it, rbind doesn't work well.
Here is an example:
--
a=data.frame(x=1:10,y=1:10)
b=data.frame(z=1:10)
b$a=a
b
    z a.x a.y
1   1   1   1
2   2   2   2
3   3   3   3
4   4   4   4
5   5   5   5
6   6   6   6
7   7   7   7
8   8   8   8
9   9   9   9
10 10  10  10
rbind(b,b)
Error in `row.names<-.data.frame`(`*tmp*`, value = c(1, 2, 3, 4, :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': '1', '10', '2', '3', '4',
'5', '6', '7', '8', '9'
--
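The printed output above hides the nesting, so a short base-R sketch may help show what b actually contains. The flattening step at the end is only one possible workaround and an assumption of this sketch, not something proposed in the thread. The name flat is hypothetical.

```r
# Rebuild the example from the post
a <- data.frame(x = 1:10, y = 1:10)
b <- data.frame(z = 1:10)
b$a <- a                 # nested data.frame stored as ONE component

ncol(b)                  # two components, although three columns are printed
names(b)                 # "a.x" and "a.y" are display names only
class(b$a)               # the component is still a data.frame

# Assumed workaround: splice the nested columns into top-level columns,
# so that rbind() only sees plain vectors.
flat <- data.frame(z = b$z, b$a)
names(flat)
nrow(rbind(flat, flat))
```

Flattening loses the grouping of x and y under a, so it only suits code that does not depend on that grouping.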
Looking at the code of rbind.data.frame, the error comes from the
lines:
--
xij <- xi[[j]]
if (has.dim[jj]) {