The question really is: what are some tried approaches for measuring the 
quality of a set of algorithms currently being used for 
clustering/classification?
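
For concreteness, on the recommender side the kind of offline measurement I 
have in mind is a hold-out evaluation with Mahout's Taste RecommenderEvaluator 
(for a classifier the analogue would be a held-out confusion matrix or AUC, 
and for clustering something like intra- vs. inter-cluster distance). A 
minimal sketch; "ratings.csv" and the user-based similarity/neighborhood 
choices are just placeholders, not our real setup:

import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class OfflineEval {

  public static void main(String[] args) throws IOException, TasteException {
    // "ratings.csv" is a placeholder for whatever preference data gets exported.
    DataModel model = new FileDataModel(new File("ratings.csv"));

    // One candidate from the basket: a user-based recommender.
    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel dataModel) throws TasteException {
        UserSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(25, similarity, dataModel);
        return new GenericUserBasedRecommender(dataModel, neighborhood, similarity);
      }
    };

    // Train on 70% of each user's preferences, test on the rest, over all users.
    // Lower score = smaller average prediction error on the held-out data.
    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    double score = evaluator.evaluate(builder, null, model, 0.7, 1.0);
    System.out.println("Average absolute difference: " + score);
  }
}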

Thinking about this some more, we also need to be able to regenerate models 
as soon as the business users tweak the weights associated with the features 
inside a feature vector, and we need to figure out a way to tie this 
efficiently into our online workflow, which could show updated 
recommendations every few hours.
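
To make the reweighting step concrete, here is a rough sketch of what I 
picture happening before the offline job retrains; the class and method names 
are hypothetical, and I'm assuming the weights get applied as a simple 
element-wise scaling of each feature vector:

import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;

/**
 * Hypothetical helper: applies the business users' per-feature weights to a
 * raw feature vector before the offline clustering/classification job
 * rebuilds its model.
 */
public final class FeatureReweighter {

  // One weight per feature, edited by the business users.
  private final Vector weights;

  public FeatureReweighter(double[] userSuppliedWeights) {
    this.weights = new DenseVector(userSuppliedWeights);
  }

  // Element-wise product: reweighted[i] = rawFeatures[i] * weights[i]
  public Vector reweight(Vector rawFeatures) {
    return rawFeatures.times(weights);
  }
}

The offline job would then rebuild the model from the reweighted vectors and 
push the result to the online component on its few-hour refresh cadence.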

When I say "picking an algorithm on the fly", what I mean is that we need to 
continuously test our basket of algorithms against each new set of training 
data and determine offline which of the algorithms to use at that moment to 
regenerate our recommendations.
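
Roughly, I picture something like the following running inside the offline 
regeneration job (class and method names are mine, not Mahout's): score every 
candidate builder against the freshest training data and let the lowest-error 
one drive that run. Whether lowest RMSE is actually the right selection 
criterion is part of what I'm asking about.

import java.util.Map;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.model.DataModel;

public final class AlgorithmPicker {

  /**
   * Scores every candidate builder against the latest training data
   * (80% train / 20% test) and returns the name of the one with the
   * lowest error, i.e. the algorithm the next regeneration run should use.
   */
  public static String pickBest(Map<String, RecommenderBuilder> candidates,
                                DataModel latestData) throws TasteException {
    RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
    String best = null;
    double bestScore = Double.POSITIVE_INFINITY;
    for (Map.Entry<String, RecommenderBuilder> entry : candidates.entrySet()) {
      double score = evaluator.evaluate(entry.getValue(), null, latestData, 0.8, 1.0);
      if (score < bestScore) {
        bestScore = score;
        best = entry.getKey();
      }
    }
    return best;
  }
}
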
> Date: Tue, 10 Apr 2012 14:08:17 -0500
> Subject: Re: Evaluation of recommenders
> From: sro...@gmail.com
> To: user@mahout.apache.org
> 
> Picking an algorithm 'on the fly' is almost surely not realistic --
> well, I am not sure what eval process you would run in milliseconds.
> But it's also unnecessary; you usually run evaluations offline on
> training/test data that reflects real input, and then, the resulting
> tuning should be fine for that real input that comes the next day.
> 
> Is that really the question, or are you just asking about how you
> measure the quality of clustering or a classifier?
> 
> On Tue, Apr 10, 2012 at 10:41 AM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
> >
> > Hi everyone,
> >
> > We're looking at building out some clustering and classification 
> > algorithms using Mahout, and one of the things we're also looking at 
> > doing is building performance metrics around each of these algorithms 
> > as we go down the path of choosing the best model in an iterative, 
> > closed feedback loop (i.e. our business users manipulate weights for 
> > each attribute of our feature vectors -> we use these changes to 
> > regenerate an asynchronous model using the appropriate 
> > clustering/classification algorithms, and then replenish our online 
> > component with this newly recalculated data for fresh recommendations). 
> > So our end goal is to have a basket of algorithms and use a set of 
> > performance metrics to pick and choose the right algorithm on the fly. 
> > I was wondering if anyone has done this type of analysis before and, 
> > if so, whether there are approaches that have worked well and 
> > approaches that haven't when it comes to measuring the "quality" of 
> > each of the recommendation algorithms.
> >
> > Regards
                                          
