RE: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

Chandler Burgess Fri, 28 Mar 2014 14:10:30 -0700

OK, so should I remove it? There are about two dozen lines of code in 
TestNaiveBayesDriver for running sequentially.


-----Original Message-----
From: Suneel Marthi [mailto:suneel_mar...@yahoo.com] 
Sent: Friday, March 28, 2014 3:51 PM
To: dev@mahout.apache.org
Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
classification commented out?

Naive Bayes doesn't have a non-MapReduce implementation, so the -seq flag 
wouldn't work.

Sent from my iPhone

> On Mar 28, 2014, at 4:16 PM, Chandler Burgess <cburg...@icontrolesi.com> 
> wrote:
> 
> Well, maybe someone can correct me but this seems disappointing. I 
> uncommented the code in NaiveBayesModel, BayesUtil and TrainNaiveBayesJob, 
> added some trace statements in ComplementaryThetaMapper and 
> ComplementaryNaiveBayesClassifier to verify they were being called, and then 
> ran some tests using trainnb/testnb. There was not a single difference in the 
> classifications when train/testcomplementary was specified vs standard naïve 
> bayes.
> 
> Also, running testnb with the -seq flag doesn't appear to work.
> 
> -----Original Message-----
> From: Chandler Burgess [mailto:cburg...@icontrolesi.com]
> Sent: Thursday, March 27, 2014 5:17 PM
> To: dev@mahout.apache.org
> Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes 
> classification commented out?
> 
> The program I wrote didn't use a model that was trained with Cbayes. After 
> looking at the scorers in SNB and CNB, I figured they would give different 
> results even on a model not trained with CNB. That could very well be 
> ignorance on my part as to the math. 
> 
> However, I did some command line tests using -c on both training and testing 
> and didn't see any difference in the testnb output.
> ________________________________________
> From: Suneel Marthi <suneel_mar...@yahoo.com>
> Sent: Thursday, March 27, 2014 5:12 PM
> To: dev@mahout.apache.org
> Cc: s...@apache.org
> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
> classification commented out?
> 
> Just checking: you are testing Cbayes on a model that's already been trained 
> using Cbayes, correct?
> 
> Also, the JIRA I mentioned earlier was fixed for 0.9, so you should be 
> good. No code changes have been made to naive Bayes since 0.9.
> 
> 
> Sent from my iPhone
> 
>> On Mar 27, 2014, at 6:01 PM, Chandler Burgess <cburg...@icontrolesi.com> 
>> wrote:
>> 
>> OK, I'll uncomment those lines and see. I also have plenty of test data 
>> available (I'm doing document classification with unbalanced classes), 
>> so I'll see if it improves there as well.
>> 
>> Also, I'll try to make some time in the next week and go over the algorithm 
>> in detail compared with the paper as an extra check.
>> 
>> Thanks,
>> Chandler
>> ________________________________________
>> From: Sebastian Schelter <s...@apache.org>
>> Sent: Thursday, March 27, 2014 4:01 PM
>> To: dev@mahout.apache.org
>> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes 
>> classification commented out?
>> 
>> Hi Chandler,
>> 
>> I think a good way to go would be to reenable theta normalization and 
>> run the classification examples that we already have to see how it 
>> affects the result (and make sure it improves the result).
>> 
>> Would be great to have this fixed. I'm also planning to port NB to 
>> our Spark DSL very soon (should be just a few lines of code).
>> 
>> --sebastian
>> 
>> 
>>> On 03/27/2014 09:07 PM, Suneel Marthi wrote:
>>> Which Mahout version are you running? While it's true that ThetaNormalizer 
>>> is still disabled today, MAHOUT-1389 fixes a bug wherein Complementary NB 
>>> wasn't being called when invoked.
>>> 
>>> Please test with Mahout 0.9 or trunk.
>>> 
>>> 
>>> 
>>> 
>>> On Thursday, March 27, 2014 3:53 PM, Chandler Burgess 
>>> <cburg...@icontrolesi.com> wrote:
>>> 
>>> Hello all,
>>> 
>>> It seems Robin Anil hasn't responded, and no one is sure of the status on 
>>> this. What needs to be done on this, and/or what can I do to help? I'm no 
>>> ML expert, but I do have the paper and should be able to verify/fix the 
>>> implementation. I'm REALLY interested in using the CNB classifier, since it 
>>> seems well suited to the problem I'm trying to tackle, before I give up and 
>>> use something else.
>>> 
>>> I've done tests and see no difference when -c is passed on the command line 
>>> for training or testing. I also wrote a program to print the scores using 
>>> StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier in a 
>>> binary classification problem and see no difference between the scores, so 
>>> it seems complementary naïve bayes is completely disabled.
>>> 
>>> Thanks,
>>> Chandler Burgess
>>
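
For anyone checking the implementation against the paper (as far as I know, Rennie et al., 2003), the sketch below shows where theta (weight) normalization enters complement naive Bayes. It is a plain-Python illustration under my own assumptions, not Mahout code: the function names, the toy counts, and the smoothing value alpha=1.0 are all hypothetical. It also illustrates one property worth keeping in mind for binary tests like the one in the first message: with only two classes, no priors, and no normalization, the complement of one class is simply the other class, so standard and complement NB pick the same label, and the per-label scores are just swapped.

```python
import math


def standard_weights(counts, alpha=1.0):
    """counts[c][i] is the total count of term i in class c.

    Returns log(theta_ci), the smoothed multinomial log-probabilities.
    """
    weights = []
    for row in counts:
        total = sum(row) + alpha * len(row)
        weights.append([math.log((x + alpha) / total) for x in row])
    return weights


def complement_weights(counts, alpha=1.0, normalize=False):
    """Log-probabilities estimated from every class EXCEPT c.

    With normalize=True each class's weight vector is divided by the sum
    of its absolute values (the weight/theta normalization step).
    """
    n_terms = len(counts[0])
    col_totals = [sum(row[i] for row in counts) for i in range(n_terms)]
    weights = []
    for row in counts:
        # Term counts over all other classes.
        comp = [col_totals[i] - row[i] for i in range(n_terms)]
        total = sum(comp) + alpha * n_terms
        w = [math.log((x + alpha) / total) for x in comp]
        if normalize:
            norm = sum(abs(v) for v in w)
            w = [v / norm for v in w]
        weights.append(w)
    return weights


def score(weights, doc):
    """Dot product of the document's term counts with each class's weights."""
    return [sum(f * w for f, w in zip(doc, ws)) for ws in weights]


def classify_standard(counts, doc):
    """Pick the class with the highest score (class priors ignored)."""
    s = score(standard_weights(counts), doc)
    return max(range(len(s)), key=lambda c: s[c])


def classify_complement(counts, doc, normalize=False):
    """Pick the class whose complement fits the document worst."""
    s = score(complement_weights(counts, normalize=normalize), doc)
    return min(range(len(s)), key=lambda c: s[c])


if __name__ == "__main__":
    # Hypothetical, deliberately unbalanced two-class term counts.
    toy_counts = [[8, 1, 1], [30, 40, 30]]
    toy_doc = [2, 1, 0]
    print(score(standard_weights(toy_counts), toy_doc))
    print(score(complement_weights(toy_counts), toy_doc))
    print(score(complement_weights(toy_counts, normalize=True), toy_doc))
```

With three or more classes, or with normalize=True, the two classifiers can disagree, so a multi-class data set is probably a more telling check of whether the complementary path is actually active.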
