Naive Bayes Classifier Bug ?

2014-06-21 Thread toyoharu ogihara
Hi Mahout, In Naive Bayes, I think that a term that does not exist in the training data should not affect the score. What do you think? org.apache.mahout.classifier.naivebayes.AbstractNaiveBayesClassifier Before: protected double getScoreForLabelInstance(int label, Vector instance) { double

RE: Naive Bayes Classifier Bug ?

2014-06-21 Thread Andrew Palumbo
Hi Toyoharu, Mahout Naive Bayes uses Laplace smoothing (the alpha_I parameter, default = 1) to deal with terms not seen in the training set. See Rennie et al. sec. 2.3 [1]. Your modification will certainly work, and may in fact give better results for the problem that you're working on.
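
To make the two behaviours discussed in this thread concrete, here is a minimal, self-contained sketch. It is not Mahout's code: the class name, method names, and toy counts are all hypothetical, and it assumes a plain multinomial model in which a term's weight for a label is log((count + alpha) / (labelTotal + alpha * vocabSize)). The first scoring method applies smoothing to every term in the instance, so an unseen term still shifts the score. The second is one possible reading of the proposed change, skipping terms that never occurred in training under any label.

```java
// Hypothetical sketch for illustration only; these names are not part of Mahout's API.
class LaplaceSmoothingSketch {

    /** Log of the Laplace-smoothed estimate (termCount + alpha) / (labelTotal + alpha * vocabSize). */
    static double smoothedLogWeight(double termCount, double labelTotal, double alpha, int vocabSize) {
        return Math.log((termCount + alpha) / (labelTotal + alpha * vocabSize));
    }

    /** Sums all entries of an array of counts. */
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Scores an instance for one label, smoothing every term, seen or unseen. */
    static double scoreWithSmoothing(double[] instance, double[] labelCounts, double alpha) {
        double labelTotal = sum(labelCounts);
        double score = 0.0;
        for (int i = 0; i < instance.length; i++) {
            if (instance[i] != 0.0) {
                score += instance[i]
                        * smoothedLogWeight(labelCounts[i], labelTotal, alpha, labelCounts.length);
            }
        }
        return score;
    }

    /** Same score, but terms with zero count across the whole training corpus are skipped. */
    static double scoreSkippingUnseen(double[] instance, double[] labelCounts,
                                      double[] corpusCounts, double alpha) {
        double labelTotal = sum(labelCounts);
        double score = 0.0;
        for (int i = 0; i < instance.length; i++) {
            if (instance[i] != 0.0 && corpusCounts[i] > 0.0) {
                score += instance[i]
                        * smoothedLogWeight(labelCounts[i], labelTotal, alpha, labelCounts.length);
            }
        }
        return score;
    }

    public static void main(String[] args) {
        // Toy data (assumed): three terms, the third never appears in training.
        double[] labelA = {3.0, 1.0, 0.0};
        double[] labelB = {1.0, 7.0, 0.0};
        double[] corpus = {4.0, 8.0, 0.0};
        double[] instance = {1.0, 0.0, 1.0};
        double alpha = 1.0;

        System.out.println("A smoothed: " + scoreWithSmoothing(instance, labelA, alpha));
        System.out.println("B smoothed: " + scoreWithSmoothing(instance, labelB, alpha));
        System.out.println("A skipping: " + scoreSkippingUnseen(instance, labelA, corpus, alpha));
        System.out.println("B skipping: " + scoreSkippingUnseen(instance, labelB, corpus, alpha));
    }
}
```

In this sketch the unseen term contributes log(alpha / (labelTotal + alpha * vocabSize)), which depends on the label's total count, so with smoothing it can change the gap between labels; in the skipping variant it contributes nothing to any label.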

Re: Naive Bayes Classifier Bug ?

2014-06-21 Thread toyoharu ogihara
Hi Andy, Thanks for your response. I must read those documents. There are lots of things I have to learn about Naive Bayes. Toyoharu 2014-06-22 6:15 GMT+09:00 Andrew Palumbo ap@outlook.com: Hi Toyoharu, Mahout Naive Bayes uses Laplace smoothing (the alpha_I parameter with default=1)