On Sat, Sep 7, 2013 at 2:35 PM, Pat Ferrel <p...@occamsmachete.com> wrote:

> ...
> >
> > Clustering can be done by doing SVD or ALS on the user x thing matrix first
> > or by directly clustering the columns of the user x thing matrix after some
> > kind of IDF weighting.  I think that only the streaming k-means currently
> > does well on sparse vectors.
> >
>
> Was thinking about filtering out all but the top x% of items to get things
> the user is likely to have heard about if not seen. Do this before any
> factorizing or clustering.
>

Hmm...

My reflex would be to trim *after* clustering so that the clustering still
has the benefit of the long tail.
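
To make the ordering concrete, here is a rough sketch of "cluster first, trim
afterwards". Everything in it is an assumption for illustration: the toy
interaction data, the log(N / df) + 1 IDF variant, the seed items, and the
tiny cosine k-means-style loop, which merely stands in for a real clusterer
and is not Mahout's streaming k-means.

```python
import math
from collections import Counter

# Hypothetical toy data (not from this thread): user -> items they touched.
interactions = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b"},
    "u3": {"a", "d", "e"},
    "u4": {"d", "e"},
    "u5": {"a", "e", "f"},
}

def item_columns(interactions):
    """Build IDF-weighted sparse columns of the user x thing matrix."""
    counts = Counter(i for items in interactions.values() for i in items)
    n_users = len(interactions)
    cols = {}
    for user, items in interactions.items():
        for item in items:
            # Assumed weighting: log(N / df) + 1, one common IDF variant.
            cols.setdefault(item, {})[user] = math.log(n_users / counts[item]) + 1.0
    return cols, counts

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0

def cluster(cols, seeds, iters=5):
    """Tiny k-means-style loop using cosine similarity; seeds are item ids."""
    centers = [dict(cols[s]) for s in seeds]
    assign = {}
    for _ in range(iters):
        # Assign every item column to its most similar center.
        assign = {
            item: max(range(len(centers)), key=lambda c: cosine(vec, centers[c]))
            for item, vec in cols.items()
        }
        # Recompute each center as the mean of its member columns.
        for c in range(len(centers)):
            members = [cols[i] for i, a in assign.items() if a == c]
            if members:
                users = set().union(*members)
                centers[c] = {
                    u: sum(m.get(u, 0.0) for m in members) / len(members)
                    for u in users
                }
    return assign

def trim(assign, counts, keep_fraction):
    """Keep only the most popular items, applied after clustering."""
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    keep = set(ranked[: math.ceil(keep_fraction * len(ranked))])
    clusters = {}
    for item, c in assign.items():
        if item in keep:
            clusters.setdefault(c, set()).add(item)
    return clusters

cols, counts = item_columns(interactions)
assign = cluster(cols, seeds=["b", "d"])   # clustering sees all six items
trimmed = trim(assign, counts, keep_fraction=0.5)
```

The rare items ("c" and "f" here) still shape the centers, and only the
presentation step drops them. Trimming first would remove those columns before
the clusterer ever saw them.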


> ...
> >
> > For #2, I think that this is a great example of multi-modal
> > recommendations.  You have browsing behavior and your tomatoes-reviews
> > behavior.  Combining that allows you to recommend for people who have only
> > one kind of behavior.  Of course, our viewing behavior will be very sparse
> > to start.
>
> Yes, that's why I'm not convinced it will be useful but an interesting
> experiment now that we have the online Solr recommender. Soon we'll have
> category and description metadata from the crawler. We can experiment with
> things like category boosting if a category trend emerges during the
> browsing session and I suspect it often does--maybe release date etc. The
> ease of mixing metadata with behavior is another thing worth experimenting
> with.
>

Cool.

And remember, meta-data becomes behavior when you interact with an item,
since you have just interacted with that item's meta-data as well.
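
A minimal sketch of that idea, assuming a made-up item catalog and field names
(none of this comes from the crawler or the Solr recommender): each item
interaction is expanded into one event per (field, value) pair of that item's
meta-data, and those counts can then be treated like any other behavior.

```python
from collections import Counter

# Hypothetical catalog; the item ids, fields, and values are illustrative only.
item_metadata = {
    "m1": {"category": "sci-fi", "decade": "1980s"},
    "m2": {"category": "sci-fi", "decade": "1990s"},
    "m3": {"category": "comedy", "decade": "1990s"},
}

def metadata_events(history, item_metadata):
    """Count the (field, value) pairs a user touched via item interactions."""
    events = Counter()
    for item in history:
        # Items without known meta-data contribute nothing.
        for field, value in item_metadata.get(item, {}).items():
            events[(field, value)] += 1
    return events

# A browsing session that touched two items.
session = ["m1", "m2"]
events = metadata_events(session, item_metadata)
top_pair, top_count = events.most_common(1)[0]
```

In this toy session the category "sci-fi" is touched twice while each decade is
touched once, which is the kind of within-session trend that category boosting
could key on.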

Btw... I am spinning up a team internally and a team at a partner site to
help with the Mahout demo.  I am trying to generate realistic music
consumption data this weekend as well.
