Yes, we have business users who are measuring a real-world metric and, in turn, 
providing that feedback by putting weights on certain algorithm parameters to 
tweak the results; the recommendations should change and will be driven by this 
feedback.
Thanks again for your insight on recommender metrics; we will look at 
implementing these, and will post more as we get this off the ground and run 
into challenging scenarios.

> Date: Tue, 10 Apr 2012 16:34:33 -0500
> Subject: Re: Evaluation of recommenders
> From: sro...@gmail.com
> To: user@mahout.apache.org
> 
> You are making recommendations, and you want to do this via
> clustering. OK, that's fine. How you implement it isn't so important
> -- it's that you have some parameters to change and want to know how
> any given process does.
> 
> You just want to use some standard recommender metrics, to start, I'd
> imagine. If you're estimating ratings -- root mean squared error of
> the difference between estimate and actual on the training data. Or
> you can fall back to precision, recall, and nDCG as a form of score.
> So, yes, definitely well-established approaches here.
> 
> I have this sense that you are saying you have business users who are
> going to measure some real-world metric (conversion rate, uplift,
> clickthrough), and guess at some changes to algorithm parameters that
> might make them better. If you have *that* kind of feedback -- much
> better. That is a far more realistic metric. Of course, it's much
> harder to experiment when using that metric since you have to run the
> algo for a day or something to collect data.
> 
> It's a separate question, but I don't know if in the end a business
> user can meaningfully decide weights on feature vectors. I mean, I
> couldn't eyeball those kinds of things. It may just be how you need to
> do things, but would double-check that everyone has a similar and
> reasonable expectation about what these inputs are and what they do.
> 
> 
> On Tue, Apr 10, 2012 at 3:23 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
> >
> > I see the architecture as similar to the following:
> >
> > Asynchronously: given a set of feature vectors, run
> > clustering/classification algorithms over them to create the appropriate
> > buckets for the set of users, then feed the results of these computations
> > into the synchronous database.
> > Synchronously: for each bucket, run item-similarity recommendation
> > algorithms to display a real-time set of recommendations for each user.
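
For that synchronous piece, something along the lines of Mahout's Taste 
item-based recommender is what I am picturing. A minimal sketch only -- the 
per-bucket preference file, the log-likelihood similarity and the user ID are 
assumptions for illustration:

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class BucketRecommender {
  public static void main(String[] args) throws Exception {
    // One preference file per bucket is assumed here; in reality this
    // would be backed by the synchronous database.
    DataModel bucketModel = new FileDataModel(new File("bucket-42-prefs.csv"));
    ItemSimilarity similarity = new LogLikelihoodSimilarity(bucketModel);
    Recommender recommender = new GenericItemBasedRecommender(bucketModel, similarity);

    long userId = 12345L;  // placeholder user in this bucket
    List<RecommendedItem> items = recommender.recommend(userId, 10);
    for (RecommendedItem item : items) {
      System.out.println(item.getItemID() + " : " + item.getValue());
    }
  }
}
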
> >
> > For the asynchronous computations we need the ability to tweak the weights
> > associated with each feature of the feature vectors (typical features might
> > include income, age, dining preferences, etc.), and we need the business
> > folks to adjust these weights to regenerate the async buckets.
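
To make the weight tweaking concrete, the rough idea is to scale each feature 
by its business-supplied weight before the vectors go into clustering. A 
sketch only, with made-up feature values and weights:

import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;

public class FeatureWeighting {
  public static void main(String[] args) {
    // Feature vector for one user: [income, age, dining-preference score].
    // In practice these would be normalized first; the values are made up.
    Vector features = new DenseVector(new double[] {0.72, 0.34, 0.80});

    // Business-supplied weights; changing these regenerates the async buckets.
    Vector weights = new DenseVector(new double[] {1.0, 0.5, 2.0});

    // Element-wise scaling before the vector is handed to the clustering job.
    Vector weighted = features.times(weights);
    System.out.println(weighted);
  }
}

If the clustering job takes a custom DistanceMeasure, Mahout's 
WeightedEuclideanDistanceMeasure, which accepts a weight vector directly, 
might be a cleaner fit than scaling the vectors by hand.
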
> >
> > So, given the above architecture, we need the async computations to be able
> > to judge which algorithm to use based on a set of performance-measurement
> > criteria. That was the heart of my initial question: whether folks have
> > built this sort of framework, and what some of the things to think about
> > are when building it.
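
By "framework" I mean something roughly like the following: loop over the 
basket of candidate algorithms offline, score each one with the same 
evaluator, and record the winner for the next regeneration. The builder 
basket and the choice of RMSE as the deciding metric are placeholders:

import java.util.Map;

import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.model.DataModel;

public class AlgorithmSelector {

  // Returns the name of the lowest-RMSE candidate in the basket.
  // The basket (user-based, item-based, SVD, ...) is supplied by the
  // async pipeline; evaluation runs entirely offline.
  public static String pickBest(Map<String, RecommenderBuilder> basket, DataModel model)
      throws Exception {
    RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
    String bestName = null;
    double bestScore = Double.MAX_VALUE;
    for (Map.Entry<String, RecommenderBuilder> entry : basket.entrySet()) {
      // 90% train / 10% test split; lower RMSE is better.
      double score = evaluator.evaluate(entry.getValue(), null, model, 0.9, 1.0);
      System.out.println(entry.getKey() + " -> " + score);
      if (score < bestScore) {
        bestScore = score;
        bestName = entry.getKey();
      }
    }
    return bestName;
  }
}

The winning name would then be written somewhere the async job can read it, so 
the buckets are regenerated with that algorithm until the next offline run.
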
> > Thanks for your feedback
> >
> >
> >
> >> Date: Tue, 10 Apr 2012 14:33:56 -0500
> >> Subject: Re: Evaluation of recommenders
> >> From: sro...@gmail.com
> >> To: user@mahout.apache.org
> >>
> >> You're talking about recommendations now... are we talking about a
> >> clustering, classification or recommender system?
> >>
> >> In general I don't know if it makes sense for business users to be
> >> deciding aspects of the internal model. At most someone should input
> >> the tradeoffs -- how important is accuracy vs speed? those kinds of
> >> things. Then it's an optimization problem. But, understood, maybe you
> >> need to let people explore these things manually at first.
> >>
> >> On Tue, Apr 10, 2012 at 2:21 PM, Saikat Kanjilal <sxk1...@hotmail.com> 
> >> wrote:
> >> >
> >> > The question really is: what are some tried approaches for measuring the
> >> > quality of a set of algorithms currently being used for
> >> > clustering/classification?
> >> >
> >> > And in thinking about this some more, we also need to be able to
> >> > regenerate models as soon as the business users tweak the weights
> >> > associated with the features inside a feature vector; we need to figure
> >> > out a way to efficiently tie this into our online workflow, which could
> >> > show updated recommendations every few hours.
> >> >
> >> > When I say picking an algorithm on the fly, what I mean is that we need
> >> > to continuously test our basket of algorithms against new training data
> >> > and determine offline which algorithm to use at that moment to regenerate
> >> > our recommendations.
> >> >> Date: Tue, 10 Apr 2012 14:08:17 -0500
> >> >> Subject: Re: Evaluation of recommenders
> >> >> From: sro...@gmail.com
> >> >> To: user@mahout.apache.org
> >> >>
> >> >> Picking an algorithm 'on the fly' is almost surely not realistic --
> >> >> well, I am not sure what eval process you would run in milliseconds.
> >> >> But it's also unnecessary; you usually run evaluations offline on
> >> >> training/test data that reflects real input, and then, the resulting
> >> >> tuning should be fine for that real input that comes the next day.
> >> >>
> >> >> Is that really the question, or are you just asking about how you
> >> >> measure the quality of clustering or a classifier?
> >> >>
> >> >> On Tue, Apr 10, 2012 at 10:41 AM, Saikat Kanjilal <sxk1...@hotmail.com> 
> >> >> wrote:
> >> >> >
> >> >> > Hi everyone,
> >> >> > We're looking at building out some clustering and classification
> >> >> > algorithms using Mahout, and one of the things we're also looking at
> >> >> > doing is building performance metrics around each of these algorithms
> >> >> > as we go down the path of choosing the best model in an iterative,
> >> >> > closed feedback loop (i.e., our business users manipulate the weights
> >> >> > for each attribute of our feature vectors -> we use these changes to
> >> >> > regenerate an asynchronous model using the appropriate
> >> >> > clustering/classification algorithms, and then replenish our online
> >> >> > component with the newly recalculated data for fresh recommendations).
> >> >> > So our end goal is to have a basket of algorithms and use a set of
> >> >> > performance metrics to pick and choose the right algorithm on the fly.
> >> >> > I was wondering if anyone has done this type of analysis before and,
> >> >> > if so, whether there are approaches that have worked well and
> >> >> > approaches that haven't when it comes to measuring the "quality" of
> >> >> > each of the recommendation algorithms.
> >> >> > Regards
> >> >
> >