Ted,
Thank you very much. This is very insightful.
The log scaling is definitely an intuitive way of building the meta model.
Not much disagreement about the uselessness of predicting ratings.
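
A minimal sketch of the quantile and log-odds rescaling described in the
quoted message below, plus the log-of-rank score it also mentions. Everything
named here (to_quantiles, log_odds, rescale, blend, log_rank, the sample
scores and the weights) is a hypothetical placeholder and not part of Mahout
or Lucene. The weights are made up for illustration; as Ted notes, they would
have to be learned.

```python
import math
from bisect import bisect_right


def to_quantiles(scores):
    """Map each score to its empirical quantile, strictly inside (0, 1)."""
    ordered = sorted(scores)
    n = len(ordered)
    # Subtracting 0.5 keeps the estimate away from exactly 0 or 1,
    # so the log-odds below is always finite.
    return [(bisect_right(ordered, s) - 0.5) / n for s in scores]


def log_odds(q):
    """Convert a quantile in (0, 1) to the log-odds scale."""
    return math.log(q / (1.0 - q))


def rescale(scores):
    """Quantile, then log-odds: puts any recommender's scores on one scale."""
    return [log_odds(q) for q in to_quantiles(scores)]


def blend(score_lists, weights):
    """Weighted linear combination of the rescaled scores, item by item."""
    rescaled = [rescale(scores) for scores in score_lists]
    n_items = len(score_lists[0])
    return [
        sum(w * r[i] for w, r in zip(weights, rescaled))
        for i in range(n_items)
    ]


def log_rank(scores):
    """Log of each item's rank in the result list (rank 1 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    out = [0.0] * len(scores)
    for rank, i in enumerate(order, start=1):
        out[i] = math.log(rank)  # lower value means a better position
    return out


# Hypothetical scores for the same four items from two recommenders
# that live on very different scales.
view_scores = [120.0, 45.0, 300.0, 8.0]
purchase_scores = [0.9, 0.2, 0.4, 0.1]

# Placeholder weights, not learned from data.
blended = blend([view_scores, purchase_scores], [0.3, 0.7])
```

Because only the ordering of each recommender's scores survives the
rescaling, a recommender with large raw values cannot dominate the blend
simply because of its scale.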


On Fri, May 31, 2013 at 4:00 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> In my case, I put all the indicators from all different sources in the same
> Solr/Lucene index.  Recommendation consists of making a single query to
> Solr/Lucene with as much data as I have or want to include.
>
> At the point that this query is done, there are no weights on the
> indicators ... merely presence or absence in a field or query.  The weights
> that I typically use are computed on the fly by Lucene's default similarity
> score and the results tend to be very good.  There is no issue of combining
> scores on different scales since there is only one composite score.
>
> If you *really* want to build multiple models using different technologies
> and combine them, you need a so-called meta-model.  There are many ways to
> build such a beast.  A very simple way is to reduce all scores to quantiles
> then to a log-odds scale (taking care not to ever estimate a quantile as
> either 0 or 1).  A linear combination of these rescaled scores can work
> pretty well although you do have to learn the linear weights.
>
> Sometimes scores vary strongly from query to query.  In such cases,
> reducing a score to being some kind of rank statistic can be helpful. For
> instance, you may want to have a score that is the log of the rank that an
> item appears at in the results list.  You might also be able to normalize
> scores based on properties of the query. Such rank-based or normalized
> scores can then often be combined by any meta-model, including the one I
> mentioned above.
>
> You should also look at the Netflix papers, especially the one describing
> the winning entry for more ideas on model combination.  The major
> difference there is that they were trying to predict a rating which is a
> task that I find essentially useless since ranking is so much more
> important in most real-world applications.  Others may dispute my
> assessment on this, of course.
>
> There are many ways of building the meta-model that you need, but one
> over-riding thought that I have is that the deviations from ideal in all
> real cases will be large enough that theory should not be taken too
> literally here, but rather should be used as a weak, though still useful,
> inspirational guide.
>
>
> On Fri, May 31, 2013 at 3:18 PM, Koobas <koo...@gmail.com> wrote:
>
> > I am also very interested in the answer to this question.
> > Just to reiterate, if you use different recommenders, e.g.,
> > kNN user-based, kNN item-based, ALS, each one produces
> > recommendations on a different scale. So how do you combine them?
> >
> >
> > On Fri, May 31, 2013 at 3:07 PM, Dominik Hübner <cont...@dhuebner.com>
> > wrote:
> >
> > > Hey,
> > > I have implemented a cross recommender based on the approach Ted Dunning
> > > proposed (cannot find the original post, but here is a follow up
> > > http://www.mail-archive.com/user@mahout.apache.org/msg12983.html).
> > > Currently I am struggling with the last step of blending the initial
> > > recommendations.
> > >
> > > My current approach:
> > > 1. Compute a cooccurrence matrix for each useful combination of
> > > user-product interaction (e.g. which product views and purchases appear
> > > in common …)
> > > 2. Perform initial recommendation based on each matrix and the required
> > > type of user vector (e.g. a user's history of views OR purchases) (like
> > > the item-based recommender implemented in Mahout)
> > >
> > > In step 2, I adapted the AggregateAndRecommendReducer of Mahout, which
> > > normalizes vectors while building the sum of weighted similarities or,
> > > in this case, cooccurrences.
> > >
> > > Now I end up with multiple recommendations for each product, but all of
> > > them are on a different scale.
> > > How can I convert them to have the same scale, in order to be able to
> > > weight them and build the linear combinations of initial recommendations
> > > as Ted proposed?
> > > Would it make sense to normalize user vectors (before multiplying) as
> > > well?
> > >
> > > Otherwise views would have a much higher influence than purchases due to
> > > their plain characteristics (they just appear way more frequently). Or is
> > > this the reason for weighting purchases higher and views lower? If so, I
> > > think it's sort of inconvenient. Wouldn't it be much more favorable to get
> > > each type of interaction within the same scale and use the weights just to
> > > control each type's influence on the final recommendation?
> > >
> > > Thanks in advance for any suggestions!
> > >
> > >
> > >
> > > Regards
> > > Dominik
> > >
> > > Sent from my iPhone
> >
>
