[ https://issues.apache.org/jira/browse/MAHOUT-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994054#comment-12994054 ]
Robin Swezey commented on MAHOUT-605:
-------------------------------------
Ted: Yes, that is my question. If the weights are indeed the normal weight and
the complementary weight, does this mean that, by design, we need to reverse
the array again when we use CNB, so that the _real_ most probable class (not a
class weighted by its complement) comes first, followed by the others?
For example, in public ClassifierResult classifyDocument(String[]
document, Datastore datastore, String defaultCategory), this would also imply:
-- if (max < prob) {
++ if (max > prob) {
if I understand you correctly.
Robin A: Yes, we are using the cbayes option. If the values do indeed come out
differently for NB and CNB, then I think the matter is settled.
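To make the question concrete, here is a minimal Java sketch of the two
orderings. It is not Mahout code: Result is a hypothetical stand-in for
ClassifierResult, and the lowerIsBetter flag encodes the assumption under
discussion (that a CNB score is a weight on the complement of a class, so a
smaller score would mean a more likely class). The three scores are taken from
the example in the report; the labels are romanized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NBestOrdering {

    /** Hypothetical stand-in for ClassifierResult: a category and its score. */
    static final class Result {
        final String category;
        final double score;

        Result(String category, double score) {
            this.category = category;
            this.score = score;
        }
    }

    /**
     * Returns a copy of the results ordered best-first.
     * lowerIsBetter = true models the assumed complementary-weight case;
     * lowerIsBetter = false models a plain "higher score wins" case.
     */
    static List<Result> bestFirst(List<Result> results, boolean lowerIsBetter) {
        Comparator<Result> byScore = Comparator.comparingDouble(r -> r.score);
        List<Result> sorted = new ArrayList<>(results);
        sorted.sort(lowerIsBetter ? byScore : byScore.reversed());
        return sorted;
    }

    /**
     * Single-best selection. The comparison direction plays the role of
     * the "max < prob" versus "max > prob" change quoted above.
     */
    static Result best(List<Result> results, boolean lowerIsBetter) {
        Result best = null;
        for (Result r : results) {
            boolean better = best == null
                    || (lowerIsBetter ? r.score < best.score : r.score > best.score);
            if (better) {
                best = r;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Result> results = Arrays.asList(
                new Result("Tokyo", 32.49189358054859),
                new Result("Kagawa", 32.28281232047167),
                new Result("Hokkaido", 32.49811200756193));

        // Under the complementary-weight assumption the smallest score comes first.
        for (Result r : bestFirst(results, true)) {
            System.out.println(r.category + " " + r.score);
        }
        System.out.println("best (lower is better): " + best(results, true).category);
        System.out.println("best (higher is better): " + best(results, false).category);
    }
}
```

If the assumption holds, the ascending array in the report would already be
best-first for CNB, and only the NB path would need the opposite order.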
> Array returned by classifier.bayes.algorithm.CBayesAlgorithm.classifyDocument
> is sorted ascendant
> -------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-605
> URL: https://issues.apache.org/jira/browse/MAHOUT-605
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Affects Versions: 0.4
> Environment: Linux
> Reporter: Robin Swezey
> Assignee: Robin Anil
> Priority: Minor
> Labels: bayesian, classification
> Fix For: 0.5
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The array returned for an n-best call to classifyDocument is sorted in
> ascending order instead of descending order.
> Ex:
> {quote}
> 47-best: [ClassifierResult{category='香川県', score=32.28281232047167},
> ClassifierResult{category='宮崎県', score=32.28969992600906}, ......,
> ClassifierResult{category='愛知県', score=32.487981016587796},
> ClassifierResult{category='東京都', score=32.49189358054859},
> ClassifierResult{category='北海道', score=32.49811200756193}]
> {quote}
> (classification of documents for Japanese prefectures)
> Inside the classifyDocument method, just before the return statement we found
> this line:
> {quote}
> Collections.reverse(result);
> {quote}
> Is this a mistake or a design choice? (We are not sure, hence the "Minor"
> priority.)
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira