On Sat, Sep 7, 2013 at 2:35 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
> ...
> > Clustering can be done by doing SVD or ALS on the user x thing matrix
> > first or by directly clustering the columns of the user x thing matrix
> > after some kind of IDF weighting. I think that only the streaming
> > k-means currently does well on sparse vectors.
>
> Was thinking about filtering out all but the top x% of items to get things
> the user is likely to have heard about if not seen. Do this before any
> factorizing or clustering.
>

Hmm... My reflex would be to trim *after* clustering so that clustering has
the benefit of the long-tail.

> ...
> > For #2, I think that this is a great example of multi-modal
> > recommendations. You have browsing behavior and your tomatoes-reviews
> > behavior. Combining that allows you to recommend for people who have
> > only one kind of behavior. Of course, our viewing behavior will be very
> > sparse to start.
>
> Yes, that's why I'm not convinced it will be useful but an interesting
> experiment now that we have the online Solr recommender. Soon we'll have
> category and description metadata from the crawler. We can experiment with
> things like category boosting if a category trend emerges during the
> browsing session and I suspect it often does--maybe release date etc. The
> ease of mixing metadata with behavior is another thing worth experimenting
> with.
>

Cool.

And remember meta-data becomes behavior when you interact with an item since
you have just interacted with the meta-data as well.

Btw... I am spinning up a team internally and a team at a partner site to
help with the Mahout demo. I am trying to generate realistic music
consumption data this weekend as well.
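
For concreteness, here is a rough sketch of the ordering suggested above: IDF-weight the item columns, cluster on the full catalog, and only then trim to the most popular items. It is plain Python rather than Mahout code, and every name in it (`idf_weights`, `item_popularity`, `trim_after_clustering`, `keep_fraction`) is hypothetical. The clustering step itself is left out; the cluster assignments are assumed to come from whatever clusterer is used (streaming k-means, for instance).

```python
import math
from collections import Counter


def item_popularity(user_items):
    """Count how many users interacted with each item.

    user_items: dict mapping user id -> set of item ids (hypothetical layout).
    """
    return Counter(item for items in user_items.values() for item in items)


def idf_weights(user_items):
    """Plain IDF weight per item column: log(number of users / users with item)."""
    n_users = len(user_items)
    counts = item_popularity(user_items)
    return {item: math.log(n_users / count) for item, count in counts.items()}


def trim_after_clustering(clusters, popularity, keep_fraction):
    """Drop unpopular items from clusters that were learned on ALL items.

    clusters: dict mapping cluster id -> list of item ids, assumed to be the
        output of a clustering step that saw the long tail as well.
    popularity: dict mapping item id -> interaction count.
    keep_fraction: share of the catalog to keep, e.g. 0.1 for the top 10%.
    """
    # Rank by popularity, breaking ties by item id so the result is stable.
    ranked = sorted(popularity, key=lambda item: (-popularity[item], item))
    n_keep = max(1, math.ceil(len(ranked) * keep_fraction))
    keep = set(ranked[:n_keep])
    return {cid: [item for item in items if item in keep]
            for cid, items in clusters.items()}
```

The point of doing it in this order is that the long-tail items still shape the clusters even though they are never shown to the user.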