In article <[EMAIL PROTECTED]>, Richard Ulrich
<[EMAIL PROTECTED]> writes

>Here is an opinion, which I wonder if there is much
>objection to -- 
>Clustering by computer is a moderately useless enterprise 
>even in moderately skilled hands.
>

I think there are two situations where clustering is sensible.

1. You have some data sets which have been clustered (perhaps by a
person) and which are considered correctly clustered. You can then look
for an automatic way of achieving similar results on these data sets by
trying different algorithms, then cross your fingers and hope the chosen
algorithm does well on new but 'similar' data sets.

2. You have some end-goal, some reason for clustering, which you can
express in terms of a function (of a clustering) which is to be minimised.
For example you might want to compress some data by replacing points by
their nearest cluster centres, minimising the size of the data and some
measure of how badly the centres approximate the points. Or you might
aim to improve the accuracy of a classifier by clustering within
individual classes.
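
To make both situations a bit more concrete, here is a minimal sketch in
Python. It is illustration only, not a recommendation of any particular
method. The function names rand_index and compression_cost, the use of
the Rand index as the measure of agreement in situation 1, and the
weight parameter that trades size against approximation error in
situation 2 are all my own assumptions. Any other agreement measure or
cost function could be substituted.

```python
import math
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Situation 1: fraction of point pairs on which two clusterings agree.

    A pair counts as agreement when both clusterings put the two points
    together, or both put them apart. Cluster names themselves are ignored.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def compression_cost(points, centres, weight=1.0):
    """Situation 2: a function of a clustering which is to be minimised.

    Cost = number of centres (a crude stand-in for the size of the
    compressed data) + weight * mean squared distance from each point
    to its nearest centre (how badly the centres approximate the points).
    """
    distortion = sum(
        min(math.dist(p, c) ** 2 for c in centres) for p in points
    ) / len(points)
    return len(centres) + weight * distortion


# Hypothetical toy data: two tight pairs of points, far apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
reference = [0, 0, 1, 1]    # the clustering considered correct
candidate = [1, 1, 0, 0]    # same grouping, different cluster names

agreement = rand_index(reference, candidate)
cost_two = compression_cost(points, [(0, 1), (10, 1)])
cost_one = compression_cost(points, [(5, 1)])
```

On this toy data the candidate agrees with the reference on every pair,
and two centres give a lower cost than one. With a small enough weight
the single centre would win instead, which is the point: the end-goal,
not the algorithm, decides which clustering is better.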

I am not sure if I am agreeing or disagreeing with your opinion. I think
you (and probably the OP) are talking about using clustering as some
kind of data exploration or visualisation tool. In that context, I agree
with you.

I would be interested to know if anyone thinks there is a good reason to
use a clustering algorithm besides the two above.

-- 
Graham Jones
http://www.visiv.co.uk
Emails to [EMAIL PROTECTED] may be deleted as spam
Please add a j just before the @ to ensure delivery

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
