On Sun, 14 Mar 2004 16:16:52 +0000, [EMAIL PROTECTED] wrote:

> I'm trying to do a cluster analysis with a data set that is in the
> form of a contingency table (i.e. cross tabulation of counts in
> various categories). I wanted to use k-means but I'm not sure that
> this is a valid thing to do. Has anyone got any opinions as to whether
> I should use just hierarchical or k-means.
> 

So you would be looking at distances between cases
based on dichotomous dummy variables? That does not
seem very promising. Either there will be an enormous
number of tied distances, or there will be a large number
of options to explore in selecting a distance metric.
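
To make the tied-distance point concrete, here is a minimal sketch in Python with NumPy. It assumes the table is expanded to one case per count and each category is coded as a 0/1 dummy variable; the 3 x 4 table and the helper name one_hot are made up for illustration and are not from the original question.

```python
import numpy as np

def one_hot(codes, n_levels):
    """Dummy-code integer category codes as 0/1 indicator columns (hypothetical helper)."""
    out = np.zeros((len(codes), n_levels), dtype=int)
    out[np.arange(len(codes)), codes] = 1
    return out

# Made-up 3 x 4 contingency table of counts (illustration only).
table = np.array([[5, 2, 0, 1],
                  [3, 4, 2, 0],
                  [1, 0, 6, 3]])

# Expand the table to one case per counted observation.
rows, cols = np.nonzero(table)
counts = table[rows, cols]
row_codes = np.repeat(rows, counts)
col_codes = np.repeat(cols, counts)

# Each case becomes a vector of dichotomous dummy variables.
X = np.hstack([one_hot(row_codes, table.shape[0]),
               one_hot(col_codes, table.shape[1])])

# Squared Euclidean distance between every pair of cases.
diff = X[:, None, :] - X[None, :, :]
d2 = (diff ** 2).sum(axis=2)

# Keep each unordered pair once and look at the distinct values.
upper = np.triu_indices(len(X), k=1)
distinct = np.unique(d2[upper])

print("cases:", len(X), "pairs:", len(upper[0]))
print("distinct squared distances:", distinct)
```

With two categorical variables, the squared distance is just twice the number of variables on which two cases differ, so every pair falls on one of three values. Other binary distance measures (simple matching, Jaccard, and so on) give different values but the same heavy ties.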

If you have multiple levels of contingency tables, I 
wonder if Correspondence Analysis might fit your
needs better.
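
For a single two-way table, simple correspondence analysis can be sketched as a singular value decomposition of the standardized residuals. This is a minimal illustration with NumPy, not a full implementation: the table is made up, the variable names are my own, and multi-way tables would need multiple correspondence analysis or a dedicated package instead.

```python
import numpy as np

# Made-up 3 x 4 contingency table of counts (illustration only).
N = np.array([[5, 2, 0, 1],
              [3, 4, 2, 0],
              [1, 0, 6, 3]], dtype=float)

n = N.sum()
P = N / n                 # correspondence matrix (proportions)
r = P.sum(axis=1)         # row masses
c = P.sum(axis=0)         # column masses

# Standardized residuals: (P - r c^T) scaled by 1/sqrt(r_i c_j).
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

# The SVD gives the principal axes; squared singular values are the inertias.
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Principal coordinates of the row and column categories.
row_coords = (U * sv) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]

# Total inertia equals the Pearson chi-square statistic divided by n.
inertia = (sv ** 2).sum()
expected = n * np.outer(r, c)
chi2 = ((N - expected) ** 2 / expected).sum()

print("principal inertias:", sv ** 2)
print("total inertia:", inertia, "chi2 / n:", chi2 / n)
print("row coordinates (first two axes):")
print(row_coords[:, :2])
```

The first two columns of row_coords and col_coords are what one would usually plot. At most min(rows, columns) - 1 axes carry any inertia, so the third singular value here is zero up to rounding.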

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
 - I need a new job, after March 31.  Openings? -
