We agree; I was just explaining things vaguely. The bottom line is that a lot depends on what you plan to do with the clusters, and the methodology should suit that purpose.
Dawid

On Mon, Jan 4, 2010 at 8:53 PM, Ted Dunning <[email protected]> wrote:
> I think I agree with this for clusters that are intended for human
> consumption, but I am sure that I disagree with this if you are looking to
> use the clusters internally for machine learning purposes.
>
> The basic idea for the latter is that the distances to a bunch of clusters
> can be used as a description of a point. This description in terms of
> distances to cluster centroids can make some machine learning tasks vastly
> easier.
>
> On Mon, Jan 4, 2010 at 11:44 AM, Dawid Weiss <[email protected]> wrote:
>
>> What's worse -- neither method is "better". We at Carrot2 have a
>> strong feeling that clusters should be described properly in order to
>> be useful, but one may argue that in many, many applications of
>> clustering, the labels are _not_ important and just individual
>> features of clusters (like keywords or even documents themselves) are
>> enough.
>>
>
> --
> Ted Dunning, CTO
> DeepDyve
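
For illustration, the idea quoted above (describing a point by its distances to cluster centroids) could be sketched roughly as follows. This is only a minimal sketch, not anything from Mahout or Carrot2: the function name `centroid_distance_features`, the choice of Euclidean distance, and the centroid values are all assumptions made for the example, and the centroids are taken as already computed by some clustering run.

```python
import math

def centroid_distance_features(point, centroids):
    """Describe a point by its Euclidean distance to each centroid.

    Hypothetical helper: the result has one entry per centroid and could
    be fed to a downstream learner in place of (or alongside) the raw
    coordinates.
    """
    return [math.dist(point, centroid) for centroid in centroids]

# Made-up centroids, standing in for the output of any clustering algorithm.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

features = centroid_distance_features((3.0, 4.0), centroids)
print(features)  # [5.0, 0.0, 5.0]
```

Note that nothing here needs a cluster label; only the centroid positions matter, which is why labeling quality is irrelevant for this kind of internal use.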
