Re: [R] Type of multi-valued variable

2003-02-15 Thread Fan
Thanks to Frank for pointing that out. There is so much miscellany in the
Hmisc package that I haven't yet explored all of its functionality!

The implementation of mChoice / summary() is very interesting, and it could
be a good starting point for adding more functionality to the mChoice class.

I have a small question about the use of summary.formula() in Hmisc:
how can I get the cross-tabulation result as an array, as xtabs does?

For example, suppose titanic is a data frame like the following:
> str(titanic)
`data.frame':   1313 obs. of  11 variables:
 $ pclass   : Factor w/ 3 levels 1st,2nd,3rd: 1 1 1 1 1 1 1 1 1 1 ...
 $ survived : int  1 0 0 0 1 1 1 0 1 0 ...
 $ sex      : Factor w/ 2 levels female,male: 1 1 2 1 2 2 1 2 1 2 ...
 $ age      : num  29.000  2.000 30.000 25.000  0.917 ...
 ...

> ftable(xtabs( ~ sex + pclass + survived, data=titanic))
              survived   0   1
sex    pclass
female 1st               9 134
       2nd              13  94
       3rd             134  79
male   1st             120  59
       2nd             148  25
       3rd             440  58

My question is: how can I get that with Hmisc's summary()?
(survived could be an mChoice variable)
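
For reference, here is a minimal base R sketch of the array-like result asked about above. It assumes the built-in Titanic table from R's datasets package, not the titanic data frame shown in this message, so the variable names (Sex, Class, Survived, Freq) and the counts differ from the output above.

```r
# Base R only. Uses the built-in 'Titanic' table (an assumption made for
# illustration), not the 'titanic' data frame from this message.
tit <- as.data.frame(Titanic)   # long format: Class, Sex, Age, Survived, Freq

# xtabs() returns a contingency table, i.e. an array with dimnames
tab <- xtabs(Freq ~ Sex + Class + Survived, data = tit)
dim(tab)                        # one extent per factor in the formula
tab["Female", "1st", "Yes"]     # can be indexed like any array

# ftable() only flattens that array for printing
ftable(tab)
```

The point of the sketch is that the xtabs result can be indexed, summed over margins, or passed to apply(), whereas ftable is a display layer on top of it.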

Thanks in advance
--
Fan

Frank E Harrell Jr wrote:
>
> On Mon, 10 Feb 2003 21:51:50 +0100
> Fan [EMAIL PROTECTED] wrote:
>
> > Hi,
> >
> > I read a thread on the R discussion list some time ago
> > about multi-valued variables (what was called a checklist).
> > At the time, Gregory intended to add some general code
> > to his gregmisc package.
> >
> > I'm wondering whether any general code or packages are available?
> >
> > A general class that handles this type of variable
> > would be very useful in survey processing,
> > where multiple-response questions are common.
> > The simple operations applied to these variables are hole counts,
> > cross tabulations with other variables, transformation to single-coded
> > variables such as the number of responses, etc.
> >
> > Thanks in advance for any help
> > --
> > Fan
>
>
> Fan, take a look at pp. 38-44 of
> http://hesweb1.med.virginia.edu/biostat/s/doc/summary.pdf where examples of the
> mChoice (multiple choice) function in Hmisc are given.
>
> --
> Frank E Harrell Jr              Prof. of Biostatistics & Statistics
> Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
> U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] Type of multi-valued variable

2003-02-15 Thread Frank E Harrell Jr
On Sat, 15 Feb 2003 14:41:09 +0100
Fan [EMAIL PROTECTED] wrote:

Hello Fan,

[This reminds me that I forgot to mail you a paper I promised. I will do that on
Monday. Sorry.] For cross-classification, summarize in Hmisc is favored over
summary(..., method='cross'). Also, summary(..., method='cross') will not handle
mChoice variables until I make a small change to use the new function described
below. If you define

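# Collapse an mChoice object (a logical matrix, one column per choice)
# into one comma-separated string of the selected choices per row.
as.character.mChoice <- function(x) {
  lev <- dimnames(x)[[2]]          # choice labels
  d <- dim(x)
  w <- rep('', d[1])
  for(j in seq_len(d[2])) {
    # add a comma only if something has already been written for that row
    w <- paste(w, ifelse(w != '' & x[,j], ',', ''),
               ifelse(x[,j], lev[j], ''), sep='')
  }
  w
}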

you can add the line
  if(inherits(xi,'mChoice')) xi <- as.character(xi) else
before
  if(is.matrix(xi) && ncol(xi) > 1)
in summary.formula and obtain an (ugly) output with method='cross'. Defining
as.character.mChoice will fix summarize (here I'm using the titanic3 data frame):

n <- nrow(titanic3)
set.seed(1)
w <- c('good','bad','ugly')
a <- factor(sample(w,n,TRUE))
b <- factor(sample(w,n,TRUE))
m <- mChoice(a,b)
table(as.character(m))

      bad  bad,good  bad,ugly      good good,ugly      ugly 
      146       275       284       150       319       135 

attach(titanic3)
summarize(survived,llist(sex,pclass,m),
  function(y)c(died=sum(y==0),lived=sum(y==1)))

      sex pclass         m survived lived
1  female    1st       bad        0    14
2  female    1st  bad,good        1    28
3  female    1st  bad,ugly        0    34
4  female    1st      good        3    21
5  female    1st good,ugly        1    33
6  female    1st      ugly        0     9
7  female    2nd       bad        2    13
8  female    2nd  bad,good        1    28
9  female    2nd  bad,ugly        4    13
10 female    2nd      good        1     9
11 female    2nd good,ugly        4    19
. . . .

Here m is the multiple choice variable, not survived, but you get the idea.
These changes will be in the next version of Hmisc.
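
The collapsing idea above can be tried without Hmisc. The following is only a sketch under stated assumptions: collapse_choices is a hypothetical helper that mirrors the paste logic of as.character.mChoice, the data are made up, and the multiple-choice variable is assumed to be a logical matrix with column names, as in the code in this message. A plain table() call stands in for summarize.

```r
# Hypothetical helper (not part of Hmisc): same paste logic as the
# as.character.mChoice definition above, for a named logical matrix.
collapse_choices <- function(x) {
  lev <- colnames(x)
  w <- rep("", nrow(x))
  for (j in seq_len(ncol(x))) {
    w <- paste0(w, ifelse(w != "" & x[, j], ",", ""),
                ifelse(x[, j], lev[j], ""))
  }
  w
}

# Made-up data: 6 subjects, 3 choices, and a 0/1 outcome
x <- matrix(c(TRUE,  TRUE,  FALSE,
              FALSE, TRUE,  FALSE,
              TRUE,  FALSE, TRUE,
              TRUE,  TRUE,  FALSE,
              FALSE, TRUE,  FALSE,
              TRUE,  FALSE, TRUE),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("bad", "good", "ugly")))
survived <- c(1, 0, 0, 1, 1, 0)

m <- collapse_choices(x)       # one string per subject, e.g. "bad,good"

# Outcome counts for each combination of choices
counts <- table(m, survived)
counts
```

Once the choices are collapsed to a character vector, any grouping function that accepts a factor-like variable can cross-classify on it, which is why defining the as.character method is enough for summarize.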
-- 
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat




[R] Type of multi-valued variable

2003-02-10 Thread Fan
Hi,

I read a thread on the R discussion list some time ago
about multi-valued variables (what was called a checklist).
At the time, Gregory intended to add some general code
to his gregmisc package.

I'm wondering whether any general code or packages are available?

A general class that handles this type of variable
would be very useful in survey processing,
where multiple-response questions are common.
The simple operations applied to these variables are hole counts,
cross tabulations with other variables, transformation to single-coded
variables such as the number of responses, etc.
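
To make the three operations concrete, here is a small base R sketch. It assumes a multiple-response question stored as a logical indicator matrix (one row per respondent, one column per answer option); the option names, the sex variable, and the data are all made up for illustration.

```r
# Made-up multiple-response question: which media does each respondent use?
resp <- matrix(c(TRUE,  FALSE, TRUE,
                 FALSE, TRUE,  TRUE,
                 TRUE,  TRUE,  FALSE,
                 FALSE, FALSE, TRUE),
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, c("tv", "radio", "web")))
sex <- factor(c("female", "male", "female", "male"))

# Hole count: number of respondents who ticked each option
holecount <- colSums(resp)

# Cross tabulation of every option with another variable
cross <- rowsum(resp * 1L, group = sex)

# Transformation to a single-coded variable: number of responses per respondent
n_resp <- rowSums(resp)

holecount
cross
n_resp
```

A dedicated class would mainly add convenience on top of this representation, for example printing methods and labels, rather than change the underlying computations.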

Thanks in advance for any help
--
Fan
