On Sat, 15 Feb 2003 14:41:09 +0100
Fan [EMAIL PROTECTED] wrote:
Thanks to Frank for pointing that out. There is so much miscellany in the
Hmisc package that I haven't yet explored all of its functionality!
The implementation of mChoice / summary() is very interesting, and it could
be a good starting point for adding more functionality to the mChoice class.
I have a small question about using summary.formula() in Hmisc:
how can I get the cross-tabulation result as an array, as xtabs does?
For example, suppose titanic is a data set like the following:
str(titanic)
`data.frame': 1313 obs. of 11 variables:
$ pclass : Factor w/ 3 levels 1st,2nd,3rd: 1 1 1 1 1 1 1 1 1 1 ...
$ survived : int 1 0 0 0 1 1 1 0 1 0 ...
$ sex : Factor w/ 2 levels female,male: 1 1 2 1 2 2 1 2 1 2 ...
$ age : num 29.000 2.000 30.000 25.000 0.917 ...
...
ftable(xtabs( ~ sex + pclass + survived, data=titanic))
              survived   0   1
sex    pclass
female 1st               9 134
       2nd              13  94
       3rd             134  79
male   1st             120  59
       2nd             148  25
       3rd             440  58
My question is how to get that with summary() from Hmisc?
(survived could be an mChoice variable)
Thanks in advance
--
Fan
On Mon, 10 Feb 2003 21:51:50 +0100
Fan [EMAIL PROTECTED] wrote:
Hi,
I've read in the past a thread on the R discussion list
about multi-valued variables (what was called a checklist).
At that time Gregory intended to add some general code
to his gregmisc package.
I'm wondering whether any general code or package is available?
A general class that handles this type of variable
would be very useful in survey processing,
as multi-response questions are often used.
The simple operations applied to these variables are hole counts,
cross tabulations with other variables, transformation to single-coded
variables such as the number of responses, etc.
Thanks in advance for any help
--
Fan
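As an illustration of the operations listed in the message above (hole counts, cross tabulations, number of responses), here is a minimal base R sketch. It assumes the multi-response variable is stored as a logical matrix with one column per option; the data and the names resp, region, holecount, by_region, and n_resp are hypothetical and do not come from gregmisc or Hmisc.

```r
# Hypothetical survey: 5 respondents, one multi-response question, 3 options.
resp <- matrix(c(TRUE,  FALSE, TRUE,
                 FALSE, TRUE,  FALSE,
                 TRUE,  TRUE,  TRUE,
                 FALSE, FALSE, FALSE,
                 TRUE,  FALSE, FALSE),
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, c("tv", "radio", "web")))
region <- factor(c("north", "south", "north", "south", "north"))

# Hole count: how many respondents ticked each option.
holecount <- colSums(resp)

# Cross tabulation with another variable: ticks per option within each region.
by_region <- rowsum(resp * 1L, group = region)

# Single-coded transformation: number of responses per respondent.
n_resp <- rowSums(resp)
```

Because a respondent can tick several options, the rows of by_region count responses rather than respondents, so they need not sum to the group sizes.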
Fan, take a look at pp. 38-44 of
http://hesweb1.med.virginia.edu/biostat/s/doc/summary.pdf where examples of the
mChoice (multiple choice) function in Hmisc are given.
Hello Fan,
[This reminds me that I forgot to mail you a paper I promised. I will do that on
Monday. Sorry.] For cross-classification, summarize in Hmisc is favored over
summary(..., method='cross'). Also, summary(..., method='cross') does not handle
mChoice variables until I make a small change to use the new function described
next. If you
define
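as.character.mChoice <- function(x) {
  lev <- dimnames(x)[[2]]   # choice labels are the column names
  d <- dim(x)
  w <- rep('', d[1])        # one label string per row
  for(j in 1:d[2]) {
    # add a comma only when the row already has a label and choice j is selected
    w <- paste(w, ifelse(w != '' & x[,j], ',', ''),
               ifelse(x[,j], lev[j], ''), sep='')
  }
  w
}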
you can add the line
  if(inherits(xi,'mChoice')) xi <- as.character(xi) else
before
  if(is.matrix(xi) && ncol(xi) > 1)
in summary.formula and obtain an (ugly) output with method='cross'. Defining
as.character.mChoice will fix summarize (here I'm using the titanic3 data frame):
n <- nrow(titanic3)
set.seed(1)
w <- c('good','bad','ugly')
a <- factor(sample(w,n,TRUE))
b <- factor(sample(w,n,TRUE))
m <- mChoice(a,b)
table(as.character(m))
      bad  bad,good  bad,ugly      good good,ugly      ugly
      146       275       284       150       319       135
attach(titanic3)
summarize(survived,llist(sex,pclass,m),
function(y)c(died=sum(y==0),lived=sum(y==1)))
      sex pclass         m survived lived
1  female    1st       bad        0    14
2  female    1st  bad,good        1    28
3  female    1st  bad,ugly        0    34
4  female    1st      good        3    21
5  female    1st good,ugly        1    33
6  female    1st      ugly        0     9
7  female    2nd       bad        2    13
8  female    2nd  bad,good        1    28
9  female    2nd  bad,ugly        4    13
10 female    2nd      good        1     9
11 female    2nd good,ugly        4    19
. . . .
Here m is the multiple choice variable, not survived, but you get the idea.
These changes will be in the next version of Hmisc.
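
To see what the collapsing step does without Hmisc installed, the same idea can be sketched in base R. This assumes, as the function above implies, that the multiple-choice variable behaves like a logical matrix with one column per choice. The names collapse_choices, combo, and sex and the toy matrix are hypothetical and are not part of Hmisc. The result of xtabs() is an array-like table, which is the form asked about earlier in the thread.

```r
# Toy stand-in for a multiple-choice variable: a logical matrix,
# one column per choice (hypothetical data, not the titanic3 example).
m <- matrix(c(TRUE,  FALSE, FALSE,
              TRUE,  TRUE,  FALSE,
              FALSE, FALSE, TRUE,
              FALSE, TRUE,  TRUE),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("bad", "good", "ugly")))

# Collapse each row to a comma-separated label of the selected choices.
collapse_choices <- function(x) {
  lev <- colnames(x)
  apply(x, 1, function(row) paste(lev[row], collapse = ","))
}

combo <- collapse_choices(m)
sex <- factor(c("female", "male", "female", "male"))

# Cross-tabulate the collapsed labels against another variable.
tab <- xtabs(~ sex + combo)
```

A single cell can then be read with tab["female", "bad"], and ftable(tab) prints the flattened layout.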
--
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat
__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help