I'm working on a naive Bayes classifier in a case where a few
categories are much less common than the rest.  In the latest run, no
instances of one of these rare categories ended up in the test set.
As a result, testnb failed with the following error (actual name of
the label elided):

Exception in thread "main" java.lang.IllegalArgumentException: Label not found: LabelXYZ
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
        at org.apache.mahout.classifier.ConfusionMatrix.getCount(ConfusionMatrix.java:102)
        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:122)
        at org.apache.mahout.classifier.ConfusionMatrix.incrementCount(ConfusionMatrix.java:126)
        at org.apache.mahout.classifier.ConfusionMatrix.addInstance(ConfusionMatrix.java:94)
        at org.apache.mahout.classifier.ResultAnalyzer.addInstance(ResultAnalyzer.java:71)
        at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.analyzeResults(TestNaiveBayesDriver.java:158)
        at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.run(TestNaiveBayesDriver.java:124)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.classifier.naivebayes.test.TestNaiveBayesDriver.main(TestNaiveBayesDriver.java:65)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I see why this is happening, but I'm not sure it makes sense for the
test to fail entirely rather than just fill that column in the
confusion matrix with zeroes.  Before I dive into the ConfusionMatrix
code to deal with this, is there a reason I'm missing for this
behavior?
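
To make the alternative concrete, here is a minimal sketch of the
zero-filling behavior asked about above. It is not Mahout's
ConfusionMatrix and does not use its API: the class name
LenientConfusionMatrix and all of its methods are hypothetical, chosen
only for illustration. The assumption is that a label seen for the first
time is simply registered, and that any cell never incremented reads as
zero instead of raising "Label not found".

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Mahout.
class LenientConfusionMatrix {
    // Outer key: actual label (row). Inner key: predicted label (column).
    private final Map<String, Map<String, Integer>> counts =
        new LinkedHashMap<String, Map<String, Integer>>();
    // Every label known so far, in first-seen order.
    private final Set<String> labels = new LinkedHashSet<String>();

    LenientConfusionMatrix(Collection<String> initialLabels) {
        labels.addAll(initialLabels);
    }

    // Record one classified instance. Labels not seen before are
    // registered on the fly instead of being rejected.
    void addInstance(String actual, String predicted) {
        labels.add(actual);
        labels.add(predicted);
        Map<String, Integer> row = counts.get(actual);
        if (row == null) {
            row = new LinkedHashMap<String, Integer>();
            counts.put(actual, row);
        }
        Integer old = row.get(predicted);
        row.put(predicted, old == null ? 1 : old + 1);
    }

    // Cells that were never incremented read as zero.
    int getCount(String actual, String predicted) {
        Map<String, Integer> row = counts.get(actual);
        if (row == null) {
            return 0;
        }
        Integer count = row.get(predicted);
        return count == null ? 0 : count;
    }

    Set<String> getLabels() {
        return Collections.unmodifiableSet(labels);
    }

    public static void main(String[] args) {
        // The test set contains only A and B, but the model predicts LabelXYZ.
        LenientConfusionMatrix matrix =
            new LenientConfusionMatrix(Arrays.asList("A", "B"));
        matrix.addInstance("A", "A");
        matrix.addInstance("B", "LabelXYZ");
        for (String actual : matrix.getLabels()) {
            StringBuilder line = new StringBuilder(actual);
            for (String predicted : matrix.getLabels()) {
                line.append('\t').append(matrix.getCount(actual, predicted));
            }
            System.out.println(line);
        }
    }
}
```

The trade-off in a design like this is that a misspelled or otherwise
unexpected label is silently accepted, which may be what the strict
check is guarding against.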

-- 
Andrea Leistra
aleis...@gmail.com
