[ https://issues.apache.org/jira/browse/MAHOUT-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Oleg Kalnichevski resolved MAHOUT-562.
--------------------------------------
Resolution: Invalid
Apparently I used the wrong model, one produced with the 'bayes' algorithm type. My
bad. Apologies for the noise.
Oleg
> Results produced by Complementary Bayes Classifier seem odd
> -----------------------------------------------------------
>
> Key: MAHOUT-562
> URL: https://issues.apache.org/jira/browse/MAHOUT-562
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Affects Versions: 0.4
> Reporter: Oleg Kalnichevski
>
> The 20newsgroups example produces the expected results (95% correctness rate)
> when using the Naive Bayes algorithm. When the algorithm is switched to
> Complementary Bayes while all other parameters remain the same, the rate of
> correctly classified documents drops to 5%. This seems odd to me.
> I admit I know next to nothing about Bayes' theorem, and possibly my
> expectations are totally off.
> ---
> Dec 11, 2010 8:47:47 PM org.apache.mahout.classifier.bayes.TestClassifier classifySequential
> INFO: Loading model from: {basePath=/home/oleg/data/mahout/20news-bayes-model, classifierType=cbayes, alpha_i=1, dataSource=hdfs, gramSize=1, verbose=false, encoding=UTF-8, defaultCat=unknown, testDirPath=/home/oleg/data/mahout/20news-bayes-train-input}
> Dec 11, 2010 8:47:47 PM org.apache.mahout.classifier.bayes.TestClassifier classifySequential
> INFO: Testing Complementary Bayes Classifier
> ...
> INFO: =======================================================
> Summary
> -------------------------------------------------------
> Correctly Classified Instances : 578 5.1087%
> Incorrectly Classified Instances : 10736 94.8913%
> Total Classified Instances : 11314
> =======================================================
> Confusion Matrix
> -------------------------------------------------------
> a b c d e f g h i j k l m   n o p q r s t <--Classified as
> 0 0 0 0 0 0 0 0 0 0 0 0 0 597 0 0 0 0 0 0 | 597 a = rec.sport.baseball
> 0 0 0 0 0 0 0 0 0 0 0 0 0 595 0 0 0 0 0 0 | 595 b = sci.crypt
> 0 0 0 0 0 0 0 0 0 0 0 0 0 600 0 0 0 0 0 0 | 600 c = rec.sport.hockey
> 0 0 0 0 0 0 0 0 0 0 0 0 0 546 0 0 0 0 0 0 | 546 d = talk.politics.guns
> 0 0 0 0 0 0 0 0 0 0 0 0 0 599 0 0 0 0 0 0 | 599 e = soc.religion.christian
> 0 0 0 0 0 0 0 0 0 0 0 0 0 591 0 0 0 0 0 0 | 591 f = sci.electronics
> 0 0 0 0 0 0 0 0 0 0 0 0 0 591 0 0 0 0 0 0 | 591 g = comp.os.ms-windows.misc
> 0 0 0 0 0 0 0 0 0 0 0 0 0 585 0 0 0 0 0 0 | 585 h = misc.forsale
> 0 0 0 0 0 0 0 0 0 0 0 0 0 377 0 0 0 0 0 0 | 377 i = talk.religion.misc
> 0 0 0 0 0 0 0 0 0 0 0 0 0 480 0 0 0 0 0 0 | 480 j = alt.atheism
> 0 0 0 0 0 0 0 0 0 0 0 0 0 593 0 0 0 0 0 0 | 593 k = comp.windows.x
> 0 0 0 0 0 0 0 0 0 0 0 0 0 564 0 0 0 0 0 0 | 564 l = talk.politics.mideast
> 0 0 0 0 0 0 0 0 0 0 0 0 0 590 0 0 0 0 0 0 | 590 m = comp.sys.ibm.pc.hardware
> 0 0 0 0 0 0 0 0 0 0 0 0 0 578 0 0 0 0 0 0 | 578 n = comp.sys.mac.hardware
> 0 0 0 0 0 0 0 0 0 0 0 0 0 593 0 0 0 0 0 0 | 593 o = sci.space
> 0 0 0 0 0 0 0 0 0 0 0 0 0 598 0 0 0 0 0 0 | 598 p = rec.motorcycles
> 0 0 0 0 0 0 0 0 0 0 0 0 0 594 0 0 0 0 0 0 | 594 q = rec.autos
> 0 0 0 0 0 0 0 0 0 0 0 0 0 584 0 0 0 0 0 0 | 584 r = comp.graphics
> 0 0 0 0 0 0 0 0 0 0 0 0 0 465 0 0 0 0 0 0 | 465 s = talk.politics.misc
> 0 0 0 0 0 0 0 0 0 0 0 0 0 594 0 0 0 0 0 0 | 594 t = sci.med
> Default Category: unknown: 20
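
For anyone reading the quoted summary: the 5.1087% figure follows directly from the confusion matrix, in which every document lands in column n (comp.sys.mac.hardware). The following is only a minimal sketch in Python, not part of Mahout; the variable names are illustrative assumptions, and the counts are copied from the matrix rows above.

```python
# Per-class document counts, copied from the row totals of the quoted
# confusion matrix (rows a..t). Names below are illustrative, not Mahout API.
row_totals = {
    "rec.sport.baseball": 597,
    "sci.crypt": 595,
    "rec.sport.hockey": 600,
    "talk.politics.guns": 546,
    "soc.religion.christian": 599,
    "sci.electronics": 591,
    "comp.os.ms-windows.misc": 591,
    "misc.forsale": 585,
    "talk.religion.misc": 377,
    "alt.atheism": 480,
    "comp.windows.x": 593,
    "talk.politics.mideast": 564,
    "comp.sys.ibm.pc.hardware": 590,
    "comp.sys.mac.hardware": 578,
    "sci.space": 593,
    "rec.motorcycles": 598,
    "rec.autos": 594,
    "comp.graphics": 584,
    "talk.politics.misc": 465,
    "sci.med": 594,
}

# The only non-zero column in the matrix: every document was assigned here.
predicted_class = "comp.sys.mac.hardware"

total = sum(row_totals.values())
# With a single predicted class, only that class's own documents are correct.
correct = row_totals[predicted_class]
accuracy = 100.0 * correct / total
# Share of the data held by an average class when there are 20 classes.
uniform_share = 100.0 / len(row_totals)

print(f"total={total} correct={correct} accuracy={accuracy:.4f}%")
print(f"uniform share per class={uniform_share:.1f}%")
```

A classifier that assigns every input to one label can only score that label's share of the test set, which is why the rate sits near 1/20 here. A result like this points at a model/classifier mismatch rather than at the algorithm itself.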
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.