Re: Doccat evaluator

Jörn Kottmann Thu, 10 Apr 2014 15:38:29 -0700

When I implemented that, I thought it should be done similarly to the way POS tags are measured.

A confusion matrix might also be helpful to see which categories are more difficult for the system to classify.
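
For illustration only, here is a minimal sketch of such a confusion matrix, assuming categories are plain strings. It is not OpenNLP code: the class name ConfusionMatrixSketch and its methods are hypothetical. Rows hold the reference category and columns the predicted one, so large off-diagonal counts point to the categories that get mixed up.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of OpenNLP.
final class ConfusionMatrixSketch {

    // Outer key: reference (gold) category. Inner key: predicted category.
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    // Record one evaluated document.
    public void add(String reference, String predicted) {
        counts.computeIfAbsent(reference, k -> new TreeMap<>())
              .merge(predicted, 1, Integer::sum);
    }

    // Number of documents with the given reference category
    // that were assigned the given predicted category.
    public int count(String reference, String predicted) {
        Map<String, Integer> row = counts.get(reference);
        if (row == null) {
            return 0;
        }
        return row.getOrDefault(predicted, 0);
    }

    public static void main(String[] args) {
        ConfusionMatrixSketch matrix = new ConfusionMatrixSketch();
        matrix.add("sports", "sports");
        matrix.add("sports", "politics");
        matrix.add("politics", "politics");

        // Print every non-zero cell as "reference -> predicted: count".
        matrix.counts.forEach((reference, row) ->
            row.forEach((predicted, n) ->
                System.out.println(reference + " -> " + predicted + ": " + n)));
    }
}
```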


Jörn

On 04/10/2014 03:00 PM, William Colen wrote:

Actually, since we always add a tag to each document, accuracy makes sense.
We could implement F-1 for the individual categories.
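
For illustration only, here is a minimal sketch of accuracy alongside per-category F1, assuming exactly one predicted category per document. It is not the DocumentCategorizerEvaluator API: the class name PerCategoryF1Sketch and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP.
final class PerCategoryF1Sketch {

    private final Map<String, Integer> truePositives = new HashMap<>();
    private final Map<String, Integer> predictedCounts = new HashMap<>();
    private final Map<String, Integer> referenceCounts = new HashMap<>();
    private int total;
    private int correct;

    // Record one evaluated document with its gold and predicted category.
    public void add(String reference, String predicted) {
        total++;
        referenceCounts.merge(reference, 1, Integer::sum);
        predictedCounts.merge(predicted, 1, Integer::sum);
        if (reference.equals(predicted)) {
            correct++;
            truePositives.merge(reference, 1, Integer::sum);
        }
    }

    // Fraction of documents whose predicted category matches the reference.
    public double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }

    // Of the documents predicted as this category, the fraction that were right.
    public double precision(String category) {
        int predicted = predictedCounts.getOrDefault(category, 0);
        return predicted == 0 ? 0.0
            : (double) truePositives.getOrDefault(category, 0) / predicted;
    }

    // Of the documents that truly belong to this category, the fraction found.
    public double recall(String category) {
        int reference = referenceCounts.getOrDefault(category, 0);
        return reference == 0 ? 0.0
            : (double) truePositives.getOrDefault(category, 0) / reference;
    }

    // Harmonic mean of precision and recall; 0 when both are 0.
    public double f1(String category) {
        double p = precision(category);
        double r = recall(category);
        return p + r == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        PerCategoryF1Sketch eval = new PerCategoryF1Sketch();
        eval.add("sports", "sports");
        eval.add("sports", "politics");
        eval.add("politics", "politics");
        eval.add("politics", "politics");
        System.out.println("accuracy: " + eval.accuracy());
        System.out.println("F1(sports): " + eval.f1("sports"));
        System.out.println("F1(politics): " + eval.f1("politics"));
    }
}
```

Accuracy gives one number for the whole test set, while the per-category values show where a rarely used category is being missed.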

2014-04-09 17:23 GMT-03:00 William Colen <[email protected]>:

Hello,

I was checking if there is any open issue related to Doccat, and I found
this one -

OPENNLP-81: Add a cli tool for the doccat evaluation support

I noticed that there is already a class
named DocumentCategorizerEvaluator, which is not used anywhere internally.
It evaluates performance in terms of accuracy, but I believe it would
be better to do it in terms of F-measure.

Any thoughts?

As we are working on a major version, I think it would be OK to change it.


Thank you,
William
