You are making recommendations, and you want to do this via
clustering. OK, that's fine. How you implement it isn't so important
-- what matters is that you have some parameters to tune and want to
know how well any given configuration performs.

To start, I'd imagine you just want some standard recommender metrics.
If you're estimating ratings, use the root mean squared error between
estimated and actual ratings on held-out test data. Otherwise you can
fall back to precision, recall, and nDCG as ranking-quality scores.
So, yes, there are definitely well-established approaches here.
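
For example, if you're using the Taste APIs, an offline evaluation run
is only a few lines. A minimal sketch (the ratings.csv path and the
user-based recommender built inside the RecommenderBuilder are just
placeholders for whatever you actually run):

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class OfflineEval {
  public static void main(String[] args) throws Exception {
    // userID,itemID,rating per line -- placeholder file name
    DataModel model = new FileDataModel(new File("ratings.csv"));

    // Placeholder recommender; swap in whatever you're actually tuning
    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel dataModel) throws TasteException {
        PearsonCorrelationSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
        NearestNUserNeighborhood neighborhood =
            new NearestNUserNeighborhood(25, similarity, dataModel);
        return new GenericUserBasedRecommender(dataModel, neighborhood, similarity);
      }
    };

    // RMSE between estimated and actual ratings; train on 90%, test on the held-out 10%
    RecommenderEvaluator rmseEvaluator = new RMSRecommenderEvaluator();
    double rmse = rmseEvaluator.evaluate(builder, null, model, 0.9, 1.0);
    System.out.println("RMSE: " + rmse);

    // Precision and recall at 10, with the relevance threshold chosen automatically
    GenericRecommenderIRStatsEvaluator irEvaluator = new GenericRecommenderIRStatsEvaluator();
    IRStatistics stats = irEvaluator.evaluate(builder, null, model, null, 10,
        GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
    System.out.println("Precision@10: " + stats.getPrecision());
    System.out.println("Recall@10: " + stats.getRecall());
  }
}

The last two numeric arguments control how much data is held out and
how many users are sampled, so you can trade accuracy of the estimate
against run time.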

I have the sense that you're saying you have business users who will
measure some real-world metric (conversion rate, uplift,
click-through) and guess at changes to the algorithm parameters that
might improve it. If you have *that* kind of feedback -- much better.
It is a far more realistic metric. Of course, it's much harder to
experiment with, since you have to run the algorithm in production for
a day or so to collect data.
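
If you go that way, the usual pattern is to split users
deterministically across parameter variants and attribute the online
metric (click-through, conversion) to each variant afterwards. A
hypothetical sketch -- nothing here is a Mahout API, and the class and
variant names are made up:

import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Assigns each user to a parameter variant so online metrics can be compared per variant.
public class VariantAssigner {

  private final String[] variants;

  public VariantAssigner(String... variants) {
    this.variants = variants;
  }

  // Deterministic: the same user always lands in the same variant,
  // which keeps a day's worth of click/conversion logs attributable.
  public String assign(long userId) {
    CRC32 crc = new CRC32();
    crc.update(Long.toString(userId).getBytes(StandardCharsets.UTF_8));
    return variants[(int) (crc.getValue() % variants.length)];
  }

  public static void main(String[] args) {
    VariantAssigner assigner = new VariantAssigner("currentWeights", "candidateWeights");
    System.out.println(assigner.assign(12345L));
  }
}

Then it's just a matter of logging impressions and conversions per
variant for a day or more and comparing the rates before anyone
touches the weights again.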

It's a separate question, but I don't know whether a business user can
meaningfully decide weights on feature vectors in the end. I mean, I
couldn't eyeball those kinds of values. It may just be how you need to
do things, but I would double-check that everyone has a similar and
reasonable expectation about what these inputs are and what they do.
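
Concretely, weights like that usually just rescale each feature before
the distance measure sees it. A minimal sketch of what the business
users would really be controlling (plain Java; feature names, order,
and values are illustrative only):

import java.util.Arrays;

// Illustrative only: shows what per-feature weights actually do to a feature vector.
public final class FeatureWeighting {

  private FeatureWeighting() {}

  // Element-wise product; the weights must be in the same feature order as the vector.
  public static double[] applyWeights(double[] features, double[] weights) {
    if (features.length != weights.length) {
      throw new IllegalArgumentException("feature/weight length mismatch");
    }
    double[] weighted = new double[features.length];
    for (int i = 0; i < features.length; i++) {
      weighted[i] = features[i] * weights[i];
    }
    return weighted;
  }

  public static void main(String[] args) {
    // [income, age, diningPreference] -- made-up values on made-up scales
    double[] user = {72000.0, 34.0, 0.8};
    double[] businessWeights = {0.2, 0.5, 2.0};
    System.out.println(Arrays.toString(applyWeights(user, businessWeights)));
  }
}

Even that example hints at the problem: a weight of 2.0 means nothing
sensible next to a raw income value unless the features are normalized
first, which is exactly the kind of expectation mismatch worth
checking for.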


On Tue, Apr 10, 2012 at 3:23 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
>
> I see the architecture as similar to the following:
>
> Asynchronously: Given a set of feature vectors, run
> clustering/classification algorithms for each of our feature vectors to
> create the appropriate buckets for the set of users, and feed the result of
> these computations into the synchronous database.
> Synchronously: For each bucket, run item-similarity recommendation algorithms
> to display a real-time set of recommendations for each user.
>
> For the asynchronous computations we need the ability to tweak the weights
> associated with each feature of the feature vectors (typical features might
> include income, age, dining preferences, etc.), and we need the business folks
> to adjust the weights associated with each of these to regenerate the async
> buckets.
>
> So given the above architecture, we need the ability for the async
> computations to judge which algorithm to use based on a set of
> performance-measuring criteria. That was the heart of my initial question:
> whether folks have built this sort of framework, and what are some things to
> think about when building this.
> Thanks for your feedback
>
>
>
>> Date: Tue, 10 Apr 2012 14:33:56 -0500
>> Subject: Re: Evaluation of recommenders
>> From: sro...@gmail.com
>> To: user@mahout.apache.org
>>
>> You're talking about recommendations now... are we talking about a
>> clustering, classification or recommender system?
>>
>> In general I don't know if it makes sense for business users to be
>> deciding aspects of the internal model. At most someone should input
>> the tradeoffs -- how important is accuracy vs speed? those kinds of
>> things. Then it's an optimization problem. But, understood, maybe you
>> need to let people explore these things manually at first.
>>
>> On Tue, Apr 10, 2012 at 2:21 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
>> >
>> > The question really is: what are some tried approaches for measuring the
>> > quality of a set of algorithms currently being used for
>> > clustering/classification?
>> >
>> > And in thinking about this some more, we also need to be able to regenerate
>> > models as soon as the business users tweak the weights associated with
>> > features inside a feature vector, and we need to figure out a way to
>> > efficiently tie this into our online workflow, which could show updated
>> > recommendations every few hours.
>> >
>> > When I say picking an algorithm on the fly what I mean is that we need to 
>> > continuously test our basket of algorithms based on a new set of training 
>> > data and make the determination offline as to which of the algorithms to 
>> > use at that moment to regenerate our recommendations.
>> >> Date: Tue, 10 Apr 2012 14:08:17 -0500
>> >> Subject: Re: Evaluation of recommenders
>> >> From: sro...@gmail.com
>> >> To: user@mahout.apache.org
>> >>
>> >> Picking an algorithm 'on the fly' is almost surely not realistic --
>> >> well, I am not sure what eval process you would run in milliseconds.
>> >> But it's also unnecessary; you usually run evaluations offline on
>> >> training/test data that reflects real input, and then, the resulting
>> >> tuning should be fine for that real input that comes the next day.
>> >>
>> >> Is that really the question, or are you just asking about how you
>> >> measure the quality of clustering or a classifier?
>> >>
>> >> On Tue, Apr 10, 2012 at 10:41 AM, Saikat Kanjilal <sxk1...@hotmail.com> 
>> >> wrote:
>> >> >
>> >> > Hi everyone, we're looking at building out some clustering and
>> >> > classification algorithms using Mahout, and one of the things we're also
>> >> > looking at doing is to build performance metrics around each of these
>> >> > algorithms as we go down the path of choosing the best model in an
>> >> > iterative closed feedback loop (i.e. our business users manipulate
>> >> > weights for each attribute of our feature vectors -> we use these
>> >> > changes to regenerate an asynchronous model using the appropriate
>> >> > clustering/classification algorithms and then replenish our online
>> >> > component with this newly recalculated data for fresh recommendations).
>> >> > So our end goal is to have a basket of algorithms and use a set of
>> >> > performance metrics to pick and choose the right algorithm on the fly.
>> >> > I was wondering if anyone has done this type of analysis before and, if
>> >> > so, are there approaches that have worked well and approaches that
>> >> > haven't when it comes to measuring the "quality" of each of the
>> >> > recommendation algorithms.
>> >> > Regards
>> >
>
