Thanks Vivek,

We do not have predefined clusters/groups. We expect the groups to mutate as more history (data) is accumulated.

A simple use case is as follows: John has viewed a pair of jeans, a cowboy hat, a red shirt and a pair of boots. Scott has viewed a pair of jeans, a cowboy hat, a red shirt and a pocket watch. Larry has viewed a pair of jeans, a cowboy hat and a red shirt.
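To make the use case concrete, here is a rough Python sketch of the viewing data and the kind of neighbor lookup I have in mind. It is only an illustration: the function names, the Jaccard overlap measure and the 0.5 similarity threshold are assumptions made up for this example, not how our engine works today.

```python
from typing import Dict, List, Set

# Viewing history taken from the use case in this message.
views: Dict[str, Set[str]] = {
    "John": {"jeans", "cowboy hat", "red shirt", "boots"},
    "Scott": {"jeans", "cowboy hat", "red shirt", "pocket watch"},
    "Larry": {"jeans", "cowboy hat", "red shirt"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two item sets: |a & b| / |a | b| (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, views: Dict[str, Set[str]],
              min_similarity: float = 0.5) -> List[str]:
    """Suggest items viewed by users who are 'like' the given user.

    The 'cluster' is recomputed on every call from the current data,
    so its members change as more history accumulates.
    The 0.5 threshold is an arbitrary, hypothetical cutoff.
    """
    seen = views[user]
    scores: Dict[str, float] = {}
    for other, items in views.items():
        if other == user:
            continue
        sim = jaccard(seen, items)
        if sim < min_similarity:
            continue  # not similar enough to count as a neighbor
        for item in items - seen:
            # Weight each unseen item by how similar its viewer is.
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))


print(recommend("Larry", views))
```

Nothing is stored as a fixed cluster here; the set of similar users is derived from whatever history exists at query time, which is the behavior we are after.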
When we send Larry and his items into our reco engine, we would expect a pair of boots and a pocket watch to be recommended. We'd expect this because we've determined that John and Scott are 'like' Larry and thus are in the same cluster. Again, we fully expect the cluster members to change as user/item data accumulates.

On Tue, Jun 22, 2010 at 4:37 PM, Vivek Khanna <[email protected]> wrote:

> Hi,
>
> For your clustering/grouping, what is your expectation? Do you have
> pre-defined clusters/groups that you want to cluster the items within those,
> or do you envision a system where clusters/groups will change and evolve as
> the data changes?
>
> In each case, it seems you are looking for unsupervised approaches. Is that
> correct?
>
> I am new to this email list, so pardon my ignorance, but from what work I
> have done in the past with IR, ML (clustering, More like this,
> categorization, topic detection etc.), my advice to you is to identify your
> requirements, use cases and page flow interactions as the first step. :)
>
> Hope this helps!
>
> Vivek.
>
> > Date: Tue, 22 Jun 2010 15:50:18 -0700
> > Subject: User/Items Reco Engine clustering
> > From: [email protected]
> > To: [email protected]
> >
> > I'm looking to enhance a product recommendation engine. It currently works
> > with all data as a whole. I want to introduce clustering/grouping. Its
> > model based and the relationship is the common User-Items relationship.
> > Originally I was thinking of using a Canopy / kmeans cluster. And then
> > determine which cluster a user is in and execute Item Similarity against
> > only that cluster of items. However I can't figure out how to build a
> > SequenceFile using vectors with the User/Items relationship. I don't know
> > which data points to feed the vector. So I scratched that idea and turned
> > my attention to Lucene, thinking that this is really a document issue. Where
> > users are documents and the items are the content. I should be able to ask
> > Lucene, give me documents that look like this "productId3 productId9056
> > productId234".
> >
> > I'm looking for any and all feedback from those experienced in the
> > recommendation world, specifically with the grouping of users and items.
> >
> > Thanks,
> > -Jay
