I've given more detail in a follow-up to Dr Kendall's query, which I hope explains the problem more fully.

Thanks
Keith

On Sun, 14 Mar 2004 17:56:49 -0500, Rich Ulrich <[EMAIL PROTECTED]> wrote:

>On Sun, 14 Mar 2004 16:16:52 +0000, [EMAIL PROTECTED] wrote:
>
>> I'm trying to do a cluster analysis with a data set that is in the
>> form of a contingency table (i.e. cross tabulation of counts in
>> various categories). I wanted to use k-means but I'm not sure that
>> this is a valid thing to do. Has anyone got any opinions as to whether
>> I should use just hierarchical or k-means.
>>
>
>So, you would be looking at distances between cases
>based on dichotomous dummy variables? That does
>not seem very promising. Either there will be an enormous
>number of tied distances, or there will be a large number
>of options to explore, for selecting a distance-metric.
>
>If you have multiple levels of contingency tables, I
>wonder if Correspondence Analysis might fit your
>needs better.
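
To make the tied-distances concern in the quoted reply concrete, here is a minimal Python sketch. It assumes each count in a two-way table is expanded into that many cases, with each categorical variable coded as dichotomous dummy (one-hot) variables. The 2x3 table and the names `expand_to_cases` and `squared_distance` are hypothetical, chosen only for illustration, and are not data or code from this thread.

```python
from itertools import combinations

# Hypothetical 2x3 contingency table of counts (illustration only).
table = [[3, 1, 2],
         [2, 4, 1]]

def expand_to_cases(table):
    """Expand each cell count into that many dummy-coded (one-hot) cases."""
    n_rows, n_cols = len(table), len(table[0])
    cases = []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            row_dummies = [1 if k == i else 0 for k in range(n_rows)]
            col_dummies = [1 if k == j else 0 for k in range(n_cols)]
            cases.extend([tuple(row_dummies + col_dummies)] * count)
    return cases

def squared_distance(a, b):
    """Squared Euclidean distance between two dummy-coded cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

cases = expand_to_cases(table)
distances = [squared_distance(a, b) for a, b in combinations(cases, 2)]
n_pairs = len(distances)
distinct = sorted(set(distances))

print(f"{n_pairs} pairwise distances, distinct values: {distinct}")
```

With two dummy-coded variables, the squared distance between two cases is just twice the number of variables on which they differ, so every pair falls on one of only three values (0, 2 or 4). Under this coding, a distance-based method such as k-means or hierarchical clustering therefore has very little to separate cases on.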
