Since k-means is a hard clustering, each document lands in exactly one
cluster, so a term that occurs only twice should appear in no more than
2 clusters, and even that is very unlikely.  It is also very unlikely that
the cluster explanation would return that term as a top term, even if it
appeared in just one cluster.
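
A rough way to check that bound on your own data (a sketch only, not
Mahout code; `doc_terms` and `assignments` are hypothetical stand-ins for
your tokenized documents and the k-means output):

```python
def clusters_containing(term, doc_terms, assignments):
    """Return the set of cluster ids holding at least one document with `term`.

    doc_terms:   dict of doc id -> set of terms in that document (hypothetical input)
    assignments: dict of doc id -> single cluster id (hard clustering)
    """
    return {assignments[doc] for doc, terms in doc_terms.items() if term in terms}


# Toy data: "foo" occurs in exactly two documents.
doc_terms = {
    "d1": {"foo", "bar"},
    "d2": {"baz"},
    "d3": {"foo"},
    "d4": {"bar", "baz"},
}
assignments = {"d1": 0, "d2": 1, "d3": 2, "d4": 1}

hits = clusters_containing("foo", doc_terms, assignments)
# A term found in n documents can never land in more than n clusters.
assert len(hits) <= 2
```

If the same check on the real data gives a count larger than the number of
documents containing the term, the inputs to the check are inconsistent
with each other, which points at the bookkeeping rather than at k-means.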

This could be some confusion in turning the ids back into terms.  It
definitely does indicate a serious problem somewhere.
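
One way to test the id mapping, sketched under the assumption that you can
load the dictionary used when the vectors were built and the one used when
the clusters were explained as plain term -> id maps (the function names
and both dictionaries below are hypothetical):

```python
def mapping_problems(vectorize_dict, dump_dict):
    """Compare two term -> id dictionaries and list terms whose ids disagree.

    Returns a list of (term, id_at_vectorization, id_at_dump) tuples; an id
    is None when the term is missing from that side.
    """
    problems = []
    for term in sorted(set(vectorize_dict) | set(dump_dict)):
        a = vectorize_dict.get(term)
        b = dump_dict.get(term)
        if a != b:
            problems.append((term, a, b))
    return problems


def duplicate_ids(term_to_id):
    """Return ids shared by more than one term (the mapping is not invertible)."""
    seen = {}
    for term, idx in term_to_id.items():
        seen.setdefault(idx, []).append(term)
    return {idx: sorted(terms) for idx, terms in seen.items() if len(terms) > 1}


# Toy example: two ids were swapped between the two runs.
vectorize_dict = {"apple": 0, "banana": 1, "cherry": 2}
dump_dict = {"apple": 0, "banana": 2, "cherry": 1}

swapped = mapping_problems(vectorize_dict, dump_dict)
shared = duplicate_ids({"a": 0, "b": 0, "c": 1})
```

Any non-empty result from either function would be enough to explain a
rare term showing up as a top term where it does not belong.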

On Sat, Jan 2, 2010 at 10:27 AM, Bogdan Vatkov <[email protected]> wrote:

> How is this even possible - for 23,000 docs and for a term which is
> mentioned only 2 times I have it as a top term in 9 clusters? I definitely
> did something wrong, do you have an idea what that could be?
>



-- 
Ted Dunning, CTO
DeepDyve
