Ok, then I should remove it? There's about 2 dozen lines of code in TestNaiveBayesDriver for running sequentially.
-----Original Message-----
From: Suneel Marthi [mailto:suneel_mar...@yahoo.com]
Sent: Friday, March 28, 2014 3:51 PM
To: dev@mahout.apache.org
Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes classification commented out?

Bayes doesn't have a non-mapreduce impl so -seq flag wouldn't work.

Sent from my iPhone

> On Mar 28, 2014, at 4:16 PM, Chandler Burgess <cburg...@icontrolesi.com>
> wrote:
>
> Well, maybe someone can correct me but this seems disappointing. I
> uncommented the code in NaiveBayesModel, BayesUtil and TrainNaiveBayesJob,
> added some trace statements in ComplementaryThetaMapper and
> ComplementaryNaiveBayesClassifier to verify they were being called, and then
> ran some tests using trainnb/testnb. There was not a single difference in the
> classifications when train/testcomplementary was specified vs standard naïve
> bayes.
>
> Also, running testnb with the -seq flag doesn't appear to work.
>
> -----Original Message-----
> From: Chandler Burgess [mailto:cburg...@icontrolesi.com]
> Sent: Thursday, March 27, 2014 5:17 PM
> To: dev@mahout.apache.org
> Subject: RE: MAHOUT-1369 - Why does theta normalization for naive bayes
> classification commented out?
>
> The program I wrote didn't use a model that was trained with Cbayes. After
> looking at the scorers in SNB and CNB, I figured they would give different
> results even on a model not trained with CNB. That could very well be
> ignorance on my part as to the math.
>
> However, I did some command line tests using -c on both training and testing
> and didn't see any difference in the testnb output.
> ________________________________________
> From: Suneel Marthi <suneel_mar...@yahoo.com>
> Sent: Thursday, March 27, 2014 5:12 PM
> To: dev@mahout.apache.org
> Cc: s...@apache.org
> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes
> classification commented out?
>
> Just checking , u r testing Cbayes on a model that's already been trained
> using Cbayes correct?
>
> Also the jira I mentioned earlier was fixed for .9, so u should be
> good. No code changes were done to naive bayes since .9
>
>
> Sent from my iPhone
>
>> On Mar 27, 2014, at 6:01 PM, Chandler Burgess <cburg...@icontrolesi.com>
>> wrote:
>>
>> Ok, I'll uncomment those lines and see. I also have plenty of test data
>> available too (I'm doing document classification with unbalanced classes),
>> so I'll see if it improves there as well.
>>
>> Also, I'll try to make some time in the next week and go over the algorithm
>> in detail compared with the paper as an extra check.
>>
>> Thanks,
>> Chandler
>> ________________________________________
>> From: Sebastian Schelter <s...@apache.org>
>> Sent: Thursday, March 27, 2014 4:01 PM
>> To: dev@mahout.apache.org
>> Subject: Re: MAHOUT-1369 - Why does theta normalization for naive bayes
>> classification commented out?
>>
>> Hi Chandler,
>>
>> I think a good way to go would be to reenable theta normalization and
>> run the classification examples that we already have to see how it
>> affects the result (and make sure it improves the result).
>>
>> Would be great to have this fixed. I'm also planning to port NB to
>> our Spark DSL very soon (should be just a few lines of code).
>>
>> --sebastian
>>
>>
>>> On 03/27/2014 09:07 PM, Suneel Marthi wrote:
>>> Which Mahout version r u running? While its true that ThetaNormalizer is
>>> still disabled today, Mahout-1389 fixes a bug wherein Complementary NB
>>> wasn't being called when invoked.
>>>
>>> Please test with Mahout 0.9 or trunk.
>>>
>>>
>>>
>>>
>>> On Thursday, March 27, 2014 3:53 PM, Chandler Burgess
>>> <cburg...@icontrolesi.com> wrote:
>>>
>>> Hello all,
>>>
>>> It seems Robin Anil hasn't responded, and no one is sure of the status on
>>> this. What needs to be done on this, and/or what can I do to help? I'm no
>>> ML expert, but I do have the paper and should be able to verify/fix the
>>> implementation. I'm REALLY interested in using the CNB classifier, since it
>>> seems well suited to the problem I'm trying to tackle, before I give up and
>>> use something else.
>>>
>>> I've done tests and see no difference when -c is passed on the command line
>>> for training or testing. I also wrote a program to print the scores using
>>> StandardNaiveBayesClassifier and ComplementaryNaiveBayesClassifier in a
>>> binary classification problem and see no difference between the scores, so
>>> it seems complementary naïve bayes is completely disabled.
>>>
>>> Thanks,
>>> Chandler Burgess
>>
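
For anyone following the thread, the standard vs. complementary comparison can be tried outside Mahout on toy data. The following is only a minimal sketch and is NOT Mahout's code: the class NaiveBayesSketch and every method name in it are hypothetical. It assumes plain multinomial term counts, Laplace smoothing (alpha), and no class priors, and it uses the complement weights and per-class weight normalization as commonly described for complement naive Bayes (Rennie et al., 2003, presumably the paper referred to above).

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Mahout.
class NaiveBayesSketch {

    /** Smoothed log P(term | class). counts[c][t] = count of term t in class c. */
    public static double[][] standardWeights(double[][] counts, double alpha) {
        int numTerms = counts[0].length;
        double[][] w = new double[counts.length][numTerms];
        for (int c = 0; c < counts.length; c++) {
            double classTotal = Arrays.stream(counts[c]).sum();
            for (int t = 0; t < numTerms; t++) {
                w[c][t] = Math.log((counts[c][t] + alpha) / (classTotal + alpha * numTerms));
            }
        }
        return w;
    }

    /** Smoothed log P(term | every class except c), optionally normalized per class. */
    public static double[][] complementWeights(double[][] counts, double alpha, boolean normalize) {
        int numTerms = counts[0].length;
        double[] termTotals = new double[numTerms];
        double grandTotal = 0.0;
        for (double[] row : counts) {
            for (int t = 0; t < numTerms; t++) {
                termTotals[t] += row[t];
                grandTotal += row[t];
            }
        }
        double[][] w = new double[counts.length][numTerms];
        for (int c = 0; c < counts.length; c++) {
            double complementTotal = grandTotal - Arrays.stream(counts[c]).sum();
            double absSum = 0.0;
            for (int t = 0; t < numTerms; t++) {
                double complementCount = termTotals[t] - counts[c][t];
                w[c][t] = Math.log((complementCount + alpha) / (complementTotal + alpha * numTerms));
                absSum += Math.abs(w[c][t]);
            }
            if (normalize) {
                // Weight normalization: divide by the sum of absolute weights of this class.
                for (int t = 0; t < numTerms; t++) {
                    w[c][t] /= absSum;
                }
            }
        }
        return w;
    }

    /** Dot product of a weight row with a document's term frequencies. */
    public static double score(double[] weights, double[] doc) {
        double s = 0.0;
        for (int t = 0; t < doc.length; t++) {
            s += weights[t] * doc[t];
        }
        return s;
    }

    /** Standard rule: pick the class with the highest score. */
    public static int classifyStandard(double[][] w, double[] doc) {
        int best = 0;
        for (int c = 1; c < w.length; c++) {
            if (score(w[c], doc) > score(w[best], doc)) best = c;
        }
        return best;
    }

    /** Complement rule: pick the class whose complement fits the document worst. */
    public static int classifyComplement(double[][] w, double[] doc) {
        int best = 0;
        for (int c = 1; c < w.length; c++) {
            if (score(w[c], doc) < score(w[best], doc)) best = c;
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] counts = {{3, 0, 1}, {1, 4, 1}}; // two classes, three terms (made-up data)
        double[] doc = {0, 2, 1};
        System.out.println("standard:   " + classifyStandard(standardWeights(counts, 1.0), doc));
        System.out.println("complement: " + classifyComplement(complementWeights(counts, 1.0, false), doc));
        System.out.println("normalized: " + classifyComplement(complementWeights(counts, 1.0, true), doc));
    }
}
```

One thing the sketch makes visible: with exactly two classes and no priors, the complement of one class is simply the other class, so the unnormalized complement rule always picks the same label as the standard rule, even though the per-class scores are different numbers. In this simplified model, any difference in labels on a binary problem would have to come from weight normalization or class priors; otherwise three or more classes are needed. Identical classifications on a binary test are therefore not, on their own, proof that the complementary path is disabled.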