[ https://issues.apache.org/jira/browse/MAHOUT-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastian Schelter updated MAHOUT-1391:
---------------------------------------
    Resolution: Not a Problem
        Status: Resolved  (was: Patch Available)

If you have labels in your test set that are not in your training set, then your setup is flawed and you should not run that test.

> Possibility to disable confusion matrix in naive bayes
> ------------------------------------------------------
>
>                 Key: MAHOUT-1391
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1391
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.8
>            Reporter: Mansur Iqbal
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1391.patch
>
>
> Sometimes the confusion matrix is too big and not really necessary.
> There is also another case for this option:
> If you split a dataset with many labels into a test dataset and a training
> dataset by random selection percentage, it can happen that some
> classes/labels in the test data do not appear in the training data. The
> label index created when building a model from the training data then does
> not include those labels. If you then test the model with the test data,
> Mahout tries to create a confusion matrix with test-data labels that are
> not in the label index and throws an exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
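
For illustration only, the situation described in this issue (labels that occur in the test data but not in the training data) could be detected before running the test with a check along these lines. This is a minimal sketch in plain Java, not Mahout code: the class name LabelCoverageCheck, the method unseenLabels, and the sample labels are all hypothetical, and it assumes the labels of both splits are already available as string collections.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Mahout: finds test labels that the
// training split never saw, so the split can be fixed before testing.
final class LabelCoverageCheck {

    // Returns the labels present in testLabels but absent from trainLabels,
    // in sorted order. An empty result means every test label is covered.
    static Set<String> unseenLabels(Collection<String> trainLabels,
                                    Collection<String> testLabels) {
        Set<String> known = new HashSet<>(trainLabels);
        Set<String> unseen = new TreeSet<>();
        for (String label : testLabels) {
            if (!known.contains(label)) {
                unseen.add(label);
            }
        }
        return unseen;
    }

    public static void main(String[] args) {
        // Made-up labels standing in for the two splits of a dataset.
        List<String> train = Arrays.asList("sports", "politics", "sports");
        List<String> test = Arrays.asList("politics", "science");

        Set<String> unseen = unseenLabels(train, test);
        if (!unseen.isEmpty()) {
            System.out.println("Test labels missing from training set: " + unseen);
        }
    }
}
```

If the returned set is not empty, the split could be redone (for example per label rather than over the whole dataset) so that every label in the test data also appears in the training data.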