(Back to user@ for the benefit of the list.)

I see, so you wish to cluster movies -- by attributes, by ratings, or both? Cosine similarity would only make sense in the context of ratings.
I just want to make sure you don't mean you're producing recommendations.

On Tue, May 10, 2011 at 5:14 PM, Abin Varghese <[email protected]> wrote:
> Hi Sean,
>
> I have ordered the book (Mahout in action ) today, but that would be
> another 2-3 days, before which I could not look the right API.
> Let me be specific.
>
> I have a set of items vector.
>
> Movie1 - [ 0,1,1,0,0,0,0,0,0,1]
> Movie2 - [1,0, 0,0,0,0,0,0,0,0]
> Movie3 - [0, 0,0,1,0,0,0,0,0,1]
> Movie4 - [1,0, 0,0,0,0,0,1,0,] ..etc
>
> where each of the movies has a attribute vector, denoting the category to
> which it belongs.
> I am looking for the right OOB clustering API, rather writing my own
> Distance Measure / Cosine similarity.
> Or should I write one ?
>
>
> Abin
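
For reference, the cosine similarity mentioned in the quoted message is the dot product of two vectors divided by the product of their lengths. The sketch below is plain Java with no Mahout dependency; the class name MovieCosine and the method cosine are made up for illustration and are not part of any library. It uses the Movie1, Movie2 and Movie3 vectors as quoted (Movie4 is left out because its vector is incomplete in the message). Whether this is the right measure for category vectors, as opposed to ratings, is the open question above.

```java
public class MovieCosine {

    // Hypothetical helper: cosine similarity = dot(a, b) / (|a| * |b|).
    // Returns 0.0 when either vector is all zeros, to avoid dividing by zero.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        // Category vectors copied from the quoted message.
        double[] movie1 = {0, 1, 1, 0, 0, 0, 0, 0, 0, 1};
        double[] movie2 = {1, 0, 0, 0, 0, 0, 0, 0, 0, 0};
        double[] movie3 = {0, 0, 0, 1, 0, 0, 0, 0, 0, 1};

        // Movie1 and Movie3 share one category: 1 / sqrt(3 * 2), roughly 0.408.
        System.out.println("Movie1 vs Movie3: " + cosine(movie1, movie3));
        // Movie1 and Movie2 share no category, so the similarity is 0.
        System.out.println("Movie1 vs Movie2: " + cosine(movie1, movie2));
    }
}
```

On 0/1 vectors like these, the value reduces to the number of shared categories divided by the square root of the product of each movie's category count, so it only reflects category overlap.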
