On Wed, Aug 12, 2009 at 10:50 AM, Ted Dunning<[email protected]> wrote:
> Whoa....
>
> No.  It sounds like I have muddied things thoroughly.  What I was saying is
> that there are times that tf.idf and llr agree and times that tf.idf and llr
> disagree.  In my experience, most of the second category are cases where
> tf.idf is over-weighting coincidental cases or where both scores are
> producing poor results.
>
> If a phrase or term is marked as good by LLR and is a prominent feature of
> the centroid, that is fine.
>

Thanks for the explanation, Ted.

Is this a necessary and sufficient condition for a good cluster label?

On a different note, is there any way to identify relationships among
the top labels of a cluster? For example, if I have a cluster related to
automobiles, I may get the companies (GM, Ford, Toyota) along with
their popular models (Corolla, Cadillac) as top labels. How can I
figure out that Toyota and Corolla are strongly related?
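
To make the question concrete, here is a minimal sketch of one possible
approach, not something confirmed in this thread: score each pair of labels
with the same LLR test, computed on document-level co-occurrence counts.
The function names and the toy documents below are hypothetical, made up
purely for illustration.

```python
import math

def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    # Log-likelihood ratio (G^2) for a 2x2 contingency table:
    #   k11 = docs with both labels, k12 = first label only,
    #   k21 = second label only,     k22 = neither label
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding
    return max(0.0, 2.0 * (row + col - mat))

def cooccurrence_llr(docs, a, b):
    # docs: iterable of sets of labels, one set per document (assumed input)
    docs = list(docs)
    k11 = sum(1 for d in docs if a in d and b in d)
    k12 = sum(1 for d in docs if a in d and b not in d)
    k21 = sum(1 for d in docs if a not in d and b in d)
    k22 = len(docs) - k11 - k12 - k21
    return llr(k11, k12, k21, k22)

# Hypothetical toy cluster: each set holds the labels found in one document
toy_docs = [
    {"Toyota", "Corolla"},
    {"Toyota", "Corolla"},
    {"Toyota", "Corolla", "Ford"},
    {"GM", "Cadillac"},
    {"GM", "Cadillac"},
    {"Ford", "GM"},
    {"Ford"},
    {"GM", "Cadillac", "Toyota"},
]

toyota_corolla = cooccurrence_llr(toy_docs, "Toyota", "Corolla")
toyota_ford = cooccurrence_llr(toy_docs, "Toyota", "Ford")
print("Toyota/Corolla:", toyota_corolla)
print("Toyota/Ford:   ", toyota_ford)
```

One caveat with this sketch: LLR measures departure from independence in
either direction, so a pair that almost never co-occurs can also score high.
Comparing k11 against its expected value under independence would tell the
two cases apart.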

--shashi
