[ 
https://issues.apache.org/jira/browse/MAHOUT-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994046#comment-12994046
 ] 

Robin Swezey commented on MAHOUT-605:
-------------------------------------

Robin

Thank you for your very quick answer. As shown in the example in the first 
post, we currently have 47 classes (Japanese prefectures), but we want to use 
the classifier on more than 1,700 classes (Japanese cities). This is why we 
need CNB (Complementary Naive Bayes): the Japanese Wikipedia corpus contains 
little information on small cities.

I have a paper under review that uses this feature of Mahout and explains it 
in more detail, in case you need it.

If there is a mistake in the code, it could help explain the current 
performance of our classifier, which is not very good at the moment.
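
To make the ordering question concrete, here is a minimal sketch of how the 
order of an n-best list can be checked and, if needed, fixed on the caller 
side. It does not use Mahout: the Result class is a hypothetical stand-in for 
ClassifierResult, the helper names are made up for illustration, and the 
category labels are romanized forms of the ones in the issue description. It 
also assumes that a higher score means a better match, which has not been 
confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class NBestOrdering {

    // Hypothetical stand-in for Mahout's ClassifierResult (not the real class).
    static final class Result {
        final String category;
        final double score;

        Result(String category, double score) {
            this.category = category;
            this.score = score;
        }
    }

    // Returns a copy of the list sorted by score, highest first.
    // Assumption: a higher score is a better match.
    static List<Result> sortDescending(List<Result> results) {
        List<Result> sorted = new ArrayList<>(results);
        sorted.sort((a, b) -> Double.compare(b.score, a.score));
        return sorted;
    }

    // True if every score is greater than or equal to the one after it.
    static boolean isDescending(List<Result> results) {
        for (int i = 1; i < results.size(); i++) {
            if (results.get(i - 1).score < results.get(i).score) {
                return false;
            }
        }
        return true;
    }

    // The five results quoted in the issue, in the order they were returned.
    static List<Result> reportedOrder() {
        List<Result> reported = new ArrayList<>();
        reported.add(new Result("Kagawa", 32.28281232047167));
        reported.add(new Result("Miyazaki", 32.28969992600906));
        reported.add(new Result("Aichi", 32.487981016587796));
        reported.add(new Result("Tokyo", 32.49189358054859));
        reported.add(new Result("Hokkaido", 32.49811200756193));
        return reported;
    }

    public static void main(String[] args) {
        List<Result> reported = reportedOrder();
        System.out.println(isDescending(reported));                  // false
        System.out.println(isDescending(sortDescending(reported)));  // true
        System.out.println(sortDescending(reported).get(0).category); // Hokkaido
    }
}
```

Sorting with an explicit comparator on the caller side makes the result 
independent of whatever order classifyDocument returns, so it keeps working 
whether the reverse call turns out to be a bug or a design choice.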

> Array returned by classifier.bayes.algorithm.CBayesAlgorithm.classifyDocument 
> is sorted ascendant
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-605
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-605
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.4
>         Environment: Linux
>            Reporter: Robin Swezey
>            Assignee: Robin Anil
>            Priority: Minor
>              Labels: bayesian, classification
>             Fix For: 0.5
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The array returned for an n-best call to classifyDocument is sorted in 
> ascending order instead of descending order. 
> Ex:
> {quote}
> 47-best: [ClassifierResult{category='香川県', score=32.28281232047167},
> ClassifierResult{category='宮崎県', score=32.28969992600906}, ......,
> ClassifierResult{category='愛知県', score=32.487981016587796},
> ClassifierResult{category='東京都', score=32.49189358054859},
> ClassifierResult{category='北海道', score=32.49811200756193}]
> {quote}
> (classification of documents for Japanese prefectures)
> Inside the classifyDocument method, just before the return statement we found 
> this line:
> {quote}
> Collections.reverse(result);
> {quote}
> Is this a mistake or a design choice? (we are not sure, hence the "Minor" 
> priority)

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

