Hi! As a starting point, I remember this conversation covering both elements (although the reconstruction part is rather small, hint!):
http://markmail.org/message/5cfewal3oyt6vw2k

On Tue, May 7, 2013 at 1:00 AM, Dominik Hübner <cont...@dhuebner.com> wrote:

> One more thing for now @Ted:
> What do you refer to with sparsification and reconstruction?
>
> On May 7, 2013, at 12:19 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > Truly cold start is best handled by recommending the most popular items.
> >
> > If you know *anything* at all such as geo or browser or OS, then you can
> > use that to recommend using conventional techniques (that is, you can
> > recommend for the characteristics rather than for the person).
> >
> > Within a very few interactions, however, real recommendations will
> > kick in.
> >
> > My lately preferred approach is to derive indicators using sparsification
> > or ALS+reconstruction. These indicators can be historical items or
> > static items such as geo information. These indicators can be combined
> > in a single step using a search engine.
> >
> > On Mon, May 6, 2013 at 2:58 PM, Dominik Hübner <cont...@dhuebner.com>
> > wrote:
> >
> >> The cluster was mostly intended for tackling the cold start problem for
> >> new users.
> >> I want to build a recommender based on existing components or to be
> >> precise a combination of them.
> >>
> >> Unfortunately, the only product meta-data I currently have is the
> >> product price. Furthermore, this is a project
> >> I am working on alone. As a consequence, the approaches I can examine in
> >> the given time are limited.
> >>
> >> Would using ALS and ranking its outcome by e.g. frequent item set
> >> algorithms be something worth looking into?
> >> Or did you mean something different?
> >>
> >> My personal goal is to build a recommender providing acceptable results
> >> using the data I currently have available.
> >> Of course, this will only serve as a basis for further improvements
> >> where necessary or if further information can be obtained.
> >>
> >> On May 6, 2013, at 11:21 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >>
> >>> Are you looking to build a product recommender based on your own
> >>> design?
> >>> Or do you want to build one based on existing methods?
> >>>
> >>> If you want to use existing methods, clustering has essentially no
> >>> role.
> >>>
> >>> I think that composite approaches that use item meta-data and different
> >>> kinds of behavioral cues are important to best performance.
> >>>
> >>> On Mon, May 6, 2013 at 12:35 PM, Dominik Hübner <cont...@dhuebner.com>
> >>> wrote:
> >>>
> >>>> Well, as you already might have guessed, I am building a product
> >>>> recommender system for my thesis.
> >>>>
> >>>> I am planning to evaluate ALS (both, implicit and explicit) as well as
> >>>> item-similarity recommendation for users with at least a few known
> >>>> products. Nevertheless, the majority of users only has seen a single
> >>>> (or 2-3) product(s). I want to recommend them the most popular items
> >>>> from clusters, their only product comes from (as a workaround for the
> >>>> cold-start problem). Furthermore, I expect to be able to see which
> >>>> "kind" of products users like. This might provide me some information
> >>>> about how well ALS and similarity recommenders fit the user's area of
> >>>> interest (an early evaluation) or at least to estimate if the chosen
> >>>> approach will work in some way.
> >>>>
> >>>> On May 6, 2013, at 9:09 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >>>>
> >>>>> I don't even think that clustering is all that necessary.
> >>>>>
> >>>>> The reduced cooccurrence matrix will give you items related to each
> >>>>> item.
> >>>>>
> >>>>> You can use something like PCA, but SVD is just as good here due to
> >>>>> near zero mean. You could SSVD or ALS from Mahout to do this analysis
> >>>>> and then use k-means on the right singular vectors (aka item
> >>>>> representation).
> >>>>>
> >>>>> What is the high level goal that you are trying to solve with this
> >>>>> clustering?
> >>>>>
> >>>>> On Mon, May 6, 2013 at 12:01 PM, Dominik Hübner <cont...@dhuebner.com>
> >>>>> wrote:
> >>>>>
> >>>>>> And running the clustering on the cooccurrence matrix or doing PCA
> >>>>>> by removing eigenvalues/vectors?
> >>>>>>
> >>>>>> On May 6, 2013, at 8:52 PM, Ted Dunning <ted.dunn...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On Mon, May 6, 2013 at 11:29 AM, Dominik Hübner
> >>>>>>> <cont...@dhuebner.com> wrote:
> >>>>>>>
> >>>>>>>> Oh, and I forgot how the views and sales are used to build product
> >>>>>>>> vectors. As of now, I implemented binary vectors, vectors counting
> >>>>>>>> the number of views and sales (e.g 1view=1count, 1sale=10counts)
> >>>>>>>> and ordinary vectors (view => 1, sale => 5).
> >>>>>>>>
> >>>>>>>
> >>>>>>> I would recommend just putting the view and sale in different
> >>>>>>> columns and doing cooccurrence analysis on this.
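
For what it's worth, here is a rough sketch of how I read that last suggestion in the thread (views and sales kept as separate columns, followed by plain cooccurrence counting). It is only an illustration in pure Python, not Mahout code: the function name `cooccurrence`, the (user, item, action) tuple layout and the toy data are all my own assumptions, and a real setup would add some sparsification step on top of the raw counts.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(interactions):
    """Count, for each ordered pair of (item, action) columns,
    how many users have an entry in both columns."""
    # One row per user: the set of (item, action) columns that user touched.
    # Keeping the action in the column key is what puts views and sales
    # into different columns instead of merging them into one weight.
    rows = defaultdict(set)
    for user, item, action in interactions:
        rows[user].add((item, action))

    counts = defaultdict(int)
    for columns in rows.values():
        for a, b in combinations(sorted(columns), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the matrix symmetric
    return dict(counts)


# Hypothetical toy data, made up for this example.
interactions = [
    ("u1", "A", "view"), ("u1", "B", "sale"),
    ("u2", "A", "view"), ("u2", "B", "sale"),
    ("u3", "A", "view"), ("u3", "C", "view"),
]
counts = cooccurrence(interactions)
print(counts[(("A", "view"), ("B", "sale"))])  # -> 2
```

Here two users who viewed A also bought B, so the (A, view) and (B, sale) columns cooccur twice, while (A, view) and (C, view) cooccur only once.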