Gergő Törcsvári created LUCENE-5699:
---------------------------------------

             Summary: Lucene classification score calculation normalize and 
return lists
                 Key: LUCENE-5699
                 URL: https://issues.apache.org/jira/browse/LUCENE-5699
             Project: Lucene - Core
          Issue Type: Sub-task
          Components: modules/classification
            Reporter: Gergő Törcsvári


Currently the classifiers return only the single "best matching" class. Anyone 
who needs the second- or third-ranked classes for a more complex task has to 
modify the classifier classes. If returning a list is possible and cheap, why 
not do it? (The classifiers already iterate over a list of candidates anyway.)

The Bayes classifier returns very small score values, and there was a bug where 
the floats became zero. That was fixed by using logarithms. It would be nice to 
scale the class scores so that they sum to one; then the scores and relevance 
returned for two documents could be compared. (Without this, the word count of 
the test documents affects the result score.)
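
One way the normalization could look is sketched below. It assumes the 
classifier has a log-space score for each class and rescales them to sum to one 
by subtracting the maximum before exponentiating, so the values do not 
underflow. The class name ScoreNormalizer and the Map-based signature are 
made up for illustration and are not part of the Lucene classification module.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an existing Lucene class.
final class ScoreNormalizer {

    /**
     * Turns per-class log scores into values that sum to 1.
     * Subtracting the largest log score first keeps Math.exp from
     * underflowing to zero for very negative inputs.
     * Assumes at least one score is finite.
     */
    static Map<String, Double> normalize(Map<String, Double> logScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores.values()) {
            max = Math.max(max, s);
        }

        double sum = 0.0;
        for (double s : logScores.values()) {
            sum += Math.exp(s - max);
        }

        // LinkedHashMap keeps the input iteration order of the classes.
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : logScores.entrySet()) {
            normalized.put(e.getKey(), Math.exp(e.getValue() - max) / sum);
        }
        return normalized;
    }
}
```

With scores normalized like this, a longer document that pushes every log 
score further down by a similar amount would still yield comparable values, 
because only the differences between the class scores matter.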

In bullet points:
* In the Bayes classifier, normalize the score values and return them as a 
result list.
* In the KNN classifier, make it possible to return a result list.
* Make ClassificationResult Comparable so that result lists can be sorted.
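
A minimal sketch of the last point, assuming a result object that holds a class 
label and a score. ScoredClass and RankedResults are stand-in names invented 
for this example; the real ClassificationResult may differ in fields and 
generics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for a comparable classification result (hypothetical).
final class ScoredClass implements Comparable<ScoredClass> {
    final String label;
    final double score;

    ScoredClass(String label, double score) {
        this.label = label;
        this.score = score;
    }

    // Higher scores sort first, so a plain sort yields best-to-worst order.
    @Override
    public int compareTo(ScoredClass other) {
        return Double.compare(other.score, this.score);
    }
}

// Hypothetical helper that returns the n best results instead of only one.
final class RankedResults {
    static List<ScoredClass> topN(List<ScoredClass> all, int n) {
        List<ScoredClass> sorted = new ArrayList<>(all); // leave the input untouched
        Collections.sort(sorted);
        int end = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, end));
    }
}
```

A caller that only wants the best match can still take the first element, 
while one that needs the second and third candidates asks for topN(results, 3).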



--
This message was sent by Atlassian JIRA
(v6.2#6252)

