Not sure, but it should work.  It would save you the effort of normalizing
the vectors yourself.
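
To illustrate the normalization point: a cosine distance divides by the vector lengths, so rescaling either input leaves the result unchanged. The sketch below is standalone Python, not Mahout's CosineDistanceMeasure; the function names, the dict-based sparse vectors, and the sample weights are all assumptions made for the example.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle) between two sparse vectors stored as {index: weight}."""
    dot = sum(w * b[i] for i, w in a.items() if i in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention picked for this sketch: empty vectors are maximally distant
    return 1.0 - dot / (norm_a * norm_b)

def scaled(v, factor):
    """Return a copy of v with every weight multiplied by factor."""
    return {i: w * factor for i, w in v.items()}

# Hypothetical term-weight vectors.
doc_a = {0: 3.0, 2: 1.0}
doc_b = {0: 1.0, 1: 2.0}

d_raw = cosine_distance(doc_a, doc_b)
# Rescale each vector by a different factor; the distance should not move.
d_scaled = cosine_distance(scaled(doc_a, 10.0), scaled(doc_b, 0.5))
```

Here d_raw and d_scaled agree up to floating-point rounding, which is why no separate normalization pass is needed before clustering with this measure.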

The canopy and k-means stuff should be good to go.

Any other clustering algorithm, such as spectral or agglomerative clustering,
will need all of the pairwise distances to be computed.  In that case, it is
important to use the cooccurrence trick so that you wind up with a sparse
similarity matrix.
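
One possible reading of the cooccurrence trick, sketched in standalone Python: build an inverted index from terms to documents and accumulate dot products only for document pairs that share at least one term.  Pairs that share nothing never appear, so the similarity matrix stays sparse.  This is an assumption about what is meant, not Mahout code; the function name and the sample documents are hypothetical.

```python
from collections import defaultdict
import math

def sparse_cosine_similarities(docs):
    """docs: list of {term: weight}.
    Returns {(i, j): cosine similarity} for i < j, only for pairs
    that have at least one term in common."""
    # Inverted index: term -> [(doc_id, weight), ...], in increasing doc_id order.
    postings = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        for term, weight in doc.items():
            if weight != 0.0:
                postings[term].append((doc_id, weight))

    # Accumulate dot products over cooccurring pairs only.
    dots = defaultdict(float)
    for entries in postings.values():
        for x in range(len(entries)):
            i, wi = entries[x]
            for y in range(x + 1, len(entries)):
                j, wj = entries[y]
                dots[(i, j)] += wi * wj

    norms = [math.sqrt(sum(w * w for w in doc.values())) for doc in docs]
    return {(i, j): dot / (norms[i] * norms[j]) for (i, j), dot in dots.items()}

# Hypothetical documents: 0 and 1 share "b"; 2 shares nothing with anyone.
docs = [{"a": 1.0, "b": 1.0}, {"b": 1.0, "c": 1.0}, {"d": 1.0}]
sims = sparse_cosine_similarities(docs)
```

Only the pair (0, 1) gets an entry in sims; the pairs involving document 2 are never touched.  Note that a very common term still produces a long posting list and many pairs, so in practice such terms would probably need to be dropped or down-weighted first.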

On Thu, May 28, 2009 at 5:16 PM, Grant Ingersoll <[email protected]> wrote:

>
>> cosine norm.
>>
>
> o.a.mahout.utils.CosineDistanceMeasure?




-- 
Ted Dunning, CTO
DeepDyve
