large N, categorical outcomes, significance?

2001-08-17 Thread JDriscoll

I have a large dataset (N ranges from 2,000 to 9,000) with
mostly categorical outcome variables.  Any
chi-square is significant, with residuals of 100+
for tiny differences.  I know one can determine
effect size for continuous variables and show that a
result is significant only because of the size of N, but how
do I do this for categorical outcome variables?
Thanks!


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: large N, categorical outcomes, significance?

2001-08-18 Thread Donald Burrill

One approach:  (I assume that by "residual" you mean (O-E)/sqrt(E) for 
each cell of a two-way frequency table, where O=observed frequency and 
E=expected frequency under the null hypothesis).  For the several (or 
the single) largest residual(s), report O and E as proportions (of total 
N).  Express the residual in terms of proportions, which will turn out 
to include N (or its square root) as a factor.  Show that the residual 
can be whatever it was (105.6, say) only if N is as large as it is in 
your dataset, and that the same proportions for some smaller (more 
reasonable?) N would _not_ produce a "significant" residual.
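
A minimal sketch of the rewriting described above, assuming O = N*p_obs 
and E = N*p_exp, so that (O-E)/sqrt(E) = sqrt(N)*(p_obs-p_exp)/sqrt(p_exp). 
The function name and the cell proportions (26% observed, 25% expected) 
are made up for illustration and are not from the original dataset.

```python
import math

def pearson_residual(p_obs, p_exp, n):
    """(O - E) / sqrt(E), rewritten with O = n * p_obs and E = n * p_exp."""
    return math.sqrt(n) * (p_obs - p_exp) / math.sqrt(p_exp)

# Hypothetical cell: same observed and expected proportions, different N.
for n in (200, 2000, 9000):
    print(n, round(pearson_residual(0.26, 0.25, n), 3))
```

The proportions stay fixed while the residual grows with sqrt(N), which 
is the point of the exercise.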

For purposes of this exercise, you could express the total chi-square 
in terms of proportions and N, and show that for the observed proportions 
only values of N larger than some value would produce a "significant" 
result;  or you could take, for any single cell, a critical value for 
chi-square with one d.f. and compare the squared residual with it.
 (One could argue for d.f. = (r-1)(c-1)/(rc), since the table has rc 
cells but only (r-1)(c-1) d.f., but 1 d.f. is arguably "conservative", 
and finding critical values for fractional d.f. may be difficult.) 
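
The total chi-square version can be sketched as follows, assuming the 
usual independence null, so that chi-square = N * sum((p_obs-p_exp)^2/p_exp). 
The sum is the square of what is sometimes called Cohen's w.  The 
function names and the 2x2 counts are hypothetical; the critical values 
are the standard tabled ones for alpha = 0.05.

```python
import math

# Tabled chi-square critical values at alpha = 0.05, keyed by d.f.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def w_squared(table):
    """sum((p_obs - p_exp)^2 / p_exp) over all cells; chi-square = N * this."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            p_obs = obs / n
            p_exp = row_tot[i] * col_tot[j] / n ** 2
            total += (p_obs - p_exp) ** 2 / p_exp
    return total

def min_n_for_significance(table):
    """Smallest N at which these proportions reach the 0.05 critical value."""
    df = (len(table) - 1) * (len(table[0]) - 1)
    return math.ceil(CHI2_CRIT_05[df] / w_squared(table))

# Hypothetical 2x2 table of counts (N = 4000), not from the original post.
counts = [[1050, 950], [950, 1050]]
print(w_squared(counts), min_n_for_significance(counts))
```

Any N below the value returned would leave the same proportions short of 
"significance" at the 0.05 level.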

On 17 Aug 2001, JDriscoll wrote:

> I have a large dataset (N ranges from 2,000 to 9,000) with
> mostly categorical outcome variables.  Any
> chi-square is significant, with residuals of 100+
> for tiny differences.  I know one can determine
> effect size for continuous variables and show that a
> result is significant only because of the size of N, but how
> do I do this for categorical outcome variables?
> Thanks!

 
 Donald F. Burrill [EMAIL PROTECTED]
 184 Nashua Road, Bedford, NH 03110  603-471-7128






Re: large N, categorical outcomes, significance?

2001-08-19 Thread Rich Ulrich

On 17 Aug 2001 10:14:04 -0700, [EMAIL PROTECTED] (JDriscoll) wrote:

> I have a large dataset (N ranges from 2,000 to 9,000) with
> mostly categorical outcome variables.  Any
> chi-square is significant, with residuals of 100+
> for tiny differences.  I know one can determine
> effect size for continuous variables and show that a
> result is significant only because of the size of N, but how
> do I do this for categorical outcome variables?
> Thanks!

The "contingency coefficient" is one old summary measure of association for RxK tables.

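For reference, a minimal sketch of the contingency coefficient, assuming
the usual Pearson definition C = sqrt(chi2 / (chi2 + N)).  The function
names and the 2x3 counts are hypothetical.

```python
import math

def chi_square(table):
    """Pearson chi-square for a two-way table of counts (independence null)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat

def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (chi2 + N)); does not grow with N alone."""
    n = sum(sum(row) for row in table)
    chi2 = chi_square(table)
    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical 2x3 table of counts, not from the original post.
counts = [[700, 650, 650], [650, 700, 650]]
print(chi_square(counts), contingency_coefficient(counts))
```

Multiplying every cell by the same factor multiplies chi-square by that
factor but leaves C unchanged.
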
In the examples I have dealt with, I am usually interested in
various 2x2 sub-tables.  If that suits you, you have a slew of
options:  symmetric measures, e.g., the phi coefficient (the Pearson
correlation for two dichotomous variables), kappa, and the odds ratio;
or measures that depend on the labeling and roles of the variables,
e.g., sensitivity and specificity.
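
A minimal sketch of the three symmetric measures for one 2x2 sub-table,
assuming cell counts laid out as [[a, b], [c, d]] and the standard
textbook formulas.  The function names and the counts are hypothetical.

```python
import math

def phi_coefficient(a, b, c, d):
    """Pearson correlation between two dichotomous variables."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio; undefined if b or c is zero."""
    return (a * d) / (b * c)

def cohen_kappa(a, b, c, d):
    """Agreement beyond chance, treating the diagonal cells as 'agree'."""
    n = a + b + c + d
    p_obs = (a + d) / n
    p_chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical 2x2 counts (N = 4000), not from the original post.
a, b, c, d = 1050, 950, 950, 1050
print(phi_coefficient(a, b, c, d), odds_ratio(a, b, c, d), cohen_kappa(a, b, c, d))
```

For a 2x2 table, N * phi^2 equals the Pearson chi-square, so phi shows
how small the association is once N is taken out.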

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html

