I think that payloads are a bad idea here. My rationale is that you really want to index these signals if at all possible.
Also, payloads (as of a while ago) were not accessed very efficiently. This can massively slow down scoring.

On Mon, Nov 5, 2012 at 7:01 AM, shubham srivastava <shubha...@gmail.com> wrote:

> http://sujitpal.blogspot.in/2011/01/payloads-with-solr.html
>
> On Fri, Nov 2, 2012 at 12:13 PM, Johannes Schulte <johannes.schu...@gmail.com> wrote:
>
> > Hi,
> >
> > I can also encourage going the simple way with a Solr or Lucene index.
> > It gives you almost unlimited possibilities when you want to include
> > new "relevance signals" and, even more important, when you have
> > business requirements like filtering.
> >
> > I'm using a plain Lucene index to combine stuff. The pre-calculated
> > item-to-item similarities are stored as payload fields so the
> > similarities can be used in the scoring process. This way you can
> > easily issue a query like "contains x and is similar to items a,b,c".
> >
> > You can even boost different parts of the query to fade between the
> > signals. The only question is how much you can achieve "by hand". You
> > probably want to somehow learn which weights on the signals perform
> > best. Maybe this blog article by Netflix is a good start:
> >
> > http://techblog.netflix.com/2012/06/netflix-recommendations-beyond-5-stars.html
> >
> > Cheers,
> > Johannes
> >
> > On Fri, Nov 2, 2012 at 6:21 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >
> > > Speaking with no principles in hand at all, I find that it is
> > > possible to encode multiple item similarity matrices together in a
> > > Solr instance and then do very nice coordinated recommendations from
> > > multiple sources of information.
> > >
> > > Abusing a text retrieval engine this way has only a vague basis in
> > > theory, but it can be particularly nice from a practical point of
> > > view.
> > >
> > > On Thu, Nov 1, 2012 at 10:41 AM, Sean Owen <sro...@gmail.com> wrote:
> > >
> > > > There is not a very direct way to do this in Mahout, but you can
> > > > piece together a solution that reuses a lot of what Mahout has.
> > > >
> > > > It sounds like you should look at this as an item-item
> > > > similarity-based recommender to start. You have two sources of
> > > > similarity. First is based on interactions (no ratings); for this,
> > > > you can use LogLikelihoodSimilarity and an existing DataModel.
> > > > This much is straightforward.
> > > >
> > > > You can also make an ItemSimilarity based on item properties.
> > > > There is no pre-packaged solution for this. You can make up a
> > > > similarity metric, or export some similarities based on, say,
> > > > descriptions, maybe from Solr, yes.
> > > >
> > > > Then you can combine them. There's no great principled answer. You
> > > > could make an ItemSimilarity that just returns the product of
> > > > these two similarity measures (assuming they are both >= 0).
> > > >
> > > > And then the rest is a matter of using GenericItemBasedRecommender
> > > > with your hybrid ItemSimilarity.
> > > >
> > > > This isn't a distributed solution but is a good start.
> > > >
> > > > Sean
> > > >
> > > > On Thu, Nov 1, 2012 at 5:33 PM, shubham srivastava <shubha...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am looking into designing and implementing a recommendation
> > > > > engine with the use cases below. Users give no explicit ratings
> > > > > for the items they access.
> > > > >
> > > > > 1. Items viewed by other users who viewed this particular item
> > > > >
> > > > > 2. Items booked by other users who viewed this particular item
> > > > >
> > > > > 3. Most viewed item(s) viewed by other users who viewed this
> > > > > particular item
> > > > >
> > > > > The idea behind it is the following:
> > > > >
> > > > > 1. I want to interpret user behavior, where the recommendation
> > > > > would be based on other users' patterns, which falls into the
> > > > > bracket of CF (item-based or user-based similarities).
> > > > >
> > > > > 2. I want to exploit item-item similarity based on N attributes.
> > > > > The attributes can be, say: price, location, features (1...n),
> > > > > and so on.
> > > > >
> > > > > The recommendation should be a mix of both of the above.
> > > > >
> > > > > A) For 1, given that I don't have explicit ratings, my initial
> > > > > thought was to derive ratings from what the user does with a
> > > > > product, e.g.:
> > > > >
> > > > > If he only views it, I give a rating of 1
> > > > > If he further sees the details, I give a rating of 2
> > > > > If he goes to the booking page, I give a rating of 3
> > > > > If he books it, I give a rating of 4, etc.
> > > > >
> > > > > Once I have these, I would go for a standard CF item-item
> > > > > similarity implemented through Mahout.
> > > > >
> > > > > B) For 2, I was looking into our search framework (Solr) to
> > > > > provide this, i.e. Solr's MoreLikeThis feature. Carrot also
> > > > > seems to make it better, but I don't know how scalable that
> > > > > would be.
> > > > >
> > > > > The idea is to take an intersection of A and B to get started
> > > > > with. I also need to figure out the processing and latency side
> > > > > of getting the results.
> > > > >
> > > > > I guess the group's users must have solved a similar problem
> > > > > more efficiently and could advise better.
> > > > >
> > > > > Please let me know.
> > > > >
> > > > > Regards,
> > > > > Shubham
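
For what it's worth, the "product of the two similarity measures" idea from Sean's message quoted above can be sketched in a few lines. This is only an illustration in plain Python, not Mahout code: the names `from_table`, `product_similarity` and `most_similar`, and the toy similarity values, are all made up for the example. It assumes both measures are symmetric and non-negative, as Sean's message requires.

```python
from typing import Callable, Dict, List, Tuple

# A similarity is any function (item_a, item_b) -> non-negative score.
Similarity = Callable[[str, str], float]


def from_table(table: Dict[Tuple[str, str], float]) -> Similarity:
    """Wrap a precomputed, symmetric pair table; missing pairs score 0.0."""
    def sim(a: str, b: str) -> float:
        if a == b:
            return 1.0
        return table.get((a, b), table.get((b, a), 0.0))
    return sim


def product_similarity(first: Similarity, second: Similarity) -> Similarity:
    """Hybrid similarity: the product of two non-negative measures."""
    def sim(a: str, b: str) -> float:
        s1, s2 = first(a, b), second(a, b)
        if s1 < 0 or s2 < 0:
            raise ValueError("both similarities must be >= 0")
        return s1 * s2
    return sim


def most_similar(item: str, candidates: List[str],
                 sim: Similarity, k: int) -> List[Tuple[str, float]]:
    """Top-k candidates by similarity to `item`, ties broken by item id."""
    scored = [(c, sim(item, c)) for c in candidates if c != item]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy data (hypothetical): behavior-based and attribute-based similarities.
behavior = from_table({("a", "b"): 0.8, ("a", "c"): 0.5, ("a", "d"): 0.9})
content = from_table({("a", "b"): 0.5, ("a", "c"): 1.0})

hybrid = product_similarity(behavior, content)
top = most_similar("a", ["a", "b", "c", "d"], hybrid, k=2)
```

Note that with a product, an item that scores zero on either signal (item "d" here, which has no attribute similarity to "a") drops to zero overall, so this behaves much like the "intersection of A and B" asked for in the original question. A weighted sum would be the alternative if one signal alone should be enough to recommend an item.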