Complementary Naive Bayes is designed for unbalanced datasets and is
available in Mahout. See the relevant section of the Rennie paper on this
subject: http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf
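
To make the idea concrete, here is a minimal sketch of the complement weight
estimate as I read it from the Rennie paper. It is not Mahout's code: the
class name, method names, and the smoothing parameter alphaI are made up for
illustration. Each class is scored against the term counts of all the *other*
classes, so a small class still gets estimates backed by plenty of data.

```java
// Illustrative sketch only; names are hypothetical and not taken from Mahout.
final class ComplementNaiveBayesSketch {

  // counts[c][i] = total occurrences of term i in training documents of class c.
  // Returns w[c][i] = log((N_{~c,i} + alphaI) / (N_{~c} + alphaI * numTerms)),
  // where "~c" means every class except c.
  static double[][] complementWeights(double[][] counts, double alphaI) {
    int numClasses = counts.length;
    int numTerms = counts[0].length;
    double[] termTotals = new double[numTerms];
    double[] classTotals = new double[numClasses];
    double grandTotal = 0.0;
    for (int c = 0; c < numClasses; c++) {
      for (int i = 0; i < numTerms; i++) {
        termTotals[i] += counts[c][i];
        classTotals[c] += counts[c][i];
        grandTotal += counts[c][i];
      }
    }
    double[][] weights = new double[numClasses][numTerms];
    for (int c = 0; c < numClasses; c++) {
      double denominator = (grandTotal - classTotals[c]) + alphaI * numTerms;
      for (int i = 0; i < numTerms; i++) {
        double complementCount = termTotals[i] - counts[c][i];
        weights[c][i] = Math.log((complementCount + alphaI) / denominator);
      }
    }
    return weights;
  }

  // Picks the class whose complement explains the document worst,
  // i.e. the class with the smallest sum of termFrequency * weight.
  static int classify(double[][] weights, double[] termFrequencies) {
    int best = -1;
    double bestScore = Double.POSITIVE_INFINITY;
    for (int c = 0; c < weights.length; c++) {
      double score = 0.0;
      for (int i = 0; i < termFrequencies.length; i++) {
        score += termFrequencies[i] * weights[c][i];
      }
      if (score < bestScore) {
        bestScore = score;
        best = c;
      }
    }
    return best;
  }
}
```

With counts {{8, 2}, {1, 9}} and alphaI = 1, a document containing only term 0
is assigned to class 0 and a document containing only term 1 to class 1.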

The code for Theta Normalization seems complete, so I'm not sure why it's
still commented out (it has been that way since Mahout 0.7).

We still need to verify that its behavior is correct, though.
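
For anyone checking the behavior, here is a rough sketch of what I would
expect, assuming Theta Normalization corresponds to the weight normalization
step in the Rennie paper (each class's log weights divided by the sum of their
absolute values). That mapping is my assumption and is not confirmed against
the Mahout code; the class and method names are hypothetical.

```java
// Illustrative sketch only; names are hypothetical and not taken from Mahout.
final class WeightNormalizationSketch {

  // logWeights[c][i] = log weight of term i for class c.
  // Returns a new array where each row is divided by the sum of the absolute
  // values of its entries, so every class's weight vector has L1 norm 1.
  static double[][] normalize(double[][] logWeights) {
    double[][] normalized = new double[logWeights.length][];
    for (int c = 0; c < logWeights.length; c++) {
      double norm = 0.0;
      for (double w : logWeights[c]) {
        norm += Math.abs(w);
      }
      normalized[c] = new double[logWeights[c].length];
      for (int i = 0; i < logWeights[c].length; i++) {
        // Leave an all-zero row as zeros rather than dividing by zero.
        normalized[c][i] = norm == 0.0 ? 0.0 : logWeights[c][i] / norm;
      }
    }
    return normalized;
  }
}
```

A simple check for a test case: after normalization, the absolute values of
each class's weights should sum to 1, so no class dominates the score just
because its weights are larger in magnitude.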

On Sunday, February 23, 2014 5:46 PM, qiaoresearcher <qiaoresearc...@gmail.com> 
wrote:
 
Suneel and Andrew,

Many thanks for the clarification. I did include the -c option when
training the naive bayes classifier. I will debug the code later on to
find out more details.

A general question: what options are available in Mahout when we have
very imbalanced data sets?

Regards,




On Fri, Feb 21, 2014 at 12:09 AM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> Complementary Naive Bayes does exist in Mahout (invoked with the -c option
> when running BayesDriver).
>
> The code for the ThetaSummer job does exist. It is still commented out
> (and has been since Mahout 0.7), either due to oversight or because
> Theta Normalization was never tested thoroughly.
>
> There's a JIRA already open for this, see MAHOUT-1369. Robin Anil, could
> you explain whether this code can be uncommented or if it's still not
> functional?
>
> For whoever would like to work on this, it would be great to add
> code comments (presently missing from this code) and also to refer to the
> original paper (see below).
>
> For reference, the Mahout Naive Bayes (and complementary Naive Bayes)
> classifier implementations are based on the Rennie paper on this subject -
> http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf
>
>
> On Thursday, February 20, 2014 11:40 PM, Andrew Musselman <
> andrew.mussel...@gmail.com> wrote:
>
> It's an option when you run the examples as I recall.  Search in
> examples/bin and you can trace it out.
>
>
> > On Feb 20, 2014, at 8:02 PM, qiaoresearcher <qiaoresearc...@gmail.com>
> wrote:
> >
> > Does Mahout have a complementary naive bayes implementation available?
> > I checked the Mahout source code, and it seems the author has not
> > finished it yet: as shown below, the thetaSummer job is never submitted.
> >
> > public final class TrainNaiveBayesJob extends AbstractJob {
> >
> > ....
> >
> >
> > thetaSummer.getConfiguration().setBoolean(ThetaMapper.TRAIN_COMPLEMENTARY,
> >     trainComplementary);
> > /* TODO(robinanil): Enable this when thetanormalization works.
> >    succeeded = thetaSummer.waitForCompletion(true);
> >    if (!succeeded) {
> >      return -1;
> >    }*/
> >
> > .....
> >
> > }
> >
> > Any comments will be appreciated.
>
