Hello all,
I wonder if someone might advise me on approaches to selecting
variables when clustering cases. I'm using several methods (Ward,
bagged k-means, etc.) and am working with a large number of apparently
relevant variables. I fear that "noisy" or irrelevant variables may be
weakening my analysis, and I would like to refine the input space by
identifying and then eliminating any nuisance variables.
My concern is to locate procedures (hopefully implemented in software)
to select a best subset of variables as input, and then to refine the
input space following my initial exploratory clustering. I can use
discriminant analysis or simply examine univariate F ratios computed
against the first cluster solution, but wouldn't this simply bias any
subsequent runs towards the classification structure produced by the
first analysis?
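To make the univariate screening concrete, here is a minimal sketch in
plain Python. The function name and the toy data are hypothetical and
not taken from any package. It computes a one-way ANOVA F ratio for each
variable against a given set of cluster labels. Because the labels come
from the first clustering, a high F only says that the variable agrees
with that partition, which is exactly the circularity I am worried
about.

```python
def univariate_f_ratios(rows, labels):
    """Return one F ratio per variable (column) of `rows`.

    rows   : list of cases, each a list of numeric variable values
    labels : cluster label of each case (here: from a first clustering)
    """
    n = len(rows)
    groups = sorted(set(labels))
    k = len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more cases than clusters")

    ratios = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        grand_mean = sum(column) / n

        ss_between = 0.0  # spread of cluster means around the grand mean
        ss_within = 0.0   # spread of cases around their own cluster mean
        for g in groups:
            values = [x for x, lab in zip(column, labels) if lab == g]
            mean_g = sum(values) / len(values)
            ss_between += len(values) * (mean_g - grand_mean) ** 2
            ss_within += sum((x - mean_g) ** 2 for x in values)

        if ss_within == 0.0:
            # No within-cluster variation: F is undefined or unbounded.
            ratios.append(float("inf") if ss_between > 0.0 else float("nan"))
        else:
            ratios.append((ss_between / (k - 1)) / (ss_within / (n - k)))
    return ratios


# Toy data (invented): variable 0 separates the two clusters,
# variable 1 has identical cluster means and so carries no signal.
rows = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0],
        [5.0, 4.0], [5.2, 5.0], [4.8, 3.0]]
labels = [0, 0, 0, 1, 1, 1]

for j, f in enumerate(univariate_f_ratios(rows, labels)):
    print("variable", j, "F =", f)
```

Variables with a small F would be the candidates for removal, but only
relative to the cluster labels that were supplied.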
Are there any other procedures for estimating the relative power of the
input variables and then refining the input space?
Regards
Tim Brennan
--
Replies will be sent to the list.
For more information visit http://www.morphometrics.org