> Add -misclassified true
Very handy
> To evaluate you need an annotated corpus...
This was my problem.
Now that I can run it I see measurements of 0.99XXXXX, but I noticed that the
better models (as determined by my separate unit tests, which check what was
actually classified) have lower measurements.
According to my test cases this is a very good model:
Precision: 0.9905921169966114
Recall: 0.9946277476832162
F-Measure: 0.9926058304478945
While this one is not so great:
Precision: 0.9951354487436962
Recall: 0.9982540179970453
F-Measure: 0.9966922939388522
Am I missing something here?
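For what it's worth, the printed F-Measure in both cases is just the harmonic
mean of the printed precision and recall, so each report is at least internally
consistent. A quick check in plain Java, with the values copied from the two
runs above:

public class FMeasureCheck {
    // Harmonic mean of precision and recall (the usual F1 definition).
    static double f1(double precision, double recall) {
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        System.out.println(f1(0.9905921169966114, 0.9946277476832162)); // ~0.99260583
        System.out.println(f1(0.9951354487436962, 0.9982540179970453)); // ~0.99669229
    }
}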
Thanks
On Wednesday, February 25, 2015 11:48 PM, William Colen
<[email protected]> wrote:
Add -misclassified true to the command to output what was misclassified. But I
have a guess. To evaluate you need an annotated corpus. Is the file
/tmp/db-raw.txt annotated? It should look like this:

<START:person> Pierre Vinken <END> , 61 years old , will join the board as a
nonexecutive director Nov. 29 .
Mr . <START:person> Vinken <END> is chairman of Elsevier N.V. , the Dutch
publishing group .
<START:person> Rudolph Agnew <END> , 55 years old and former chairman of
Consolidated Gold Fields PLC ,
was named a director of this British industrial conglomerate .
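If it helps, the same evaluation can also be run programmatically, which makes
it easier to inspect what is actually read from the data file. This is only a
rough sketch against the 1.5.x API as I remember it (the file names are
placeholders and the exact constructor signatures may differ in your version):

import java.io.FileInputStream;
import java.io.InputStreamReader;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderEvaluator;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.eval.FMeasure;

public class EvalSketch {
    public static void main(String[] args) throws Exception {
        // Load the trained name finder model (placeholder path).
        TokenNameFinderModel model =
                new TokenNameFinderModel(new FileInputStream("a-model.bin"));

        // Read the evaluation corpus: one sentence per line, names marked
        // with <START:type> ... <END> as in the sample above.
        ObjectStream<String> lines = new PlainTextByLineStream(
                new InputStreamReader(new FileInputStream("/tmp/db-raw.txt"), "ISO-8859-1"));
        ObjectStream<NameSample> samples = new NameSampleDataStream(lines);

        // Run the model over the gold annotations and collect the scores.
        TokenNameFinderEvaluator evaluator =
                new TokenNameFinderEvaluator(new NameFinderME(model));
        evaluator.evaluate(samples);

        FMeasure result = evaluator.getFMeasure();
        System.out.println("Precision: " + result.getPrecisionScore());
        System.out.println("Recall: " + result.getRecallScore());
        System.out.println("F-Measure: " + result.getFMeasure());
    }
}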
Regards,
William

2015-02-26 1:24 GMT-03:00 Richard Head Jr. <[email protected]>:
> Are you using 1.5.3?
Yes.
> Can you send a small sample?
I can't send the model. Any other options? What format is the file given to the
-data option supposed to be in?
Thanks
On Friday, February 20, 2015 2:14 PM, William Colen
<[email protected]> wrote:
Are you using 1.5.3? Can you send a small sample?
On Monday, February 16, 2015, Richard Head Jr.
<[email protected]> wrote:
I ran the command-line evaluator several times on tokenized/untokenized and
large/small input but got no results (see below). The model appears to be
finding tokens quite well, I'd just like to evaluate *how* well:
opennlp TokenNameFinderEvaluator -data some-data.txt -model a-model.bin
Loading Token Name Finder model ... done (0.111s)
Average: 104.2 sent/s
Total: 15 sent
Runtime: 0.144s
Precision: 0.0
Recall: 0.0
F-Measure: -1.0
Now on a larger set of data:
opennlp TokenNameFinderEvaluator -encoding latin1 -data /tmp/db-raw.txt -model a-model.bin
Loading Token Name Finder model ... done (0.156s)
current: 364.9 sent/s avg: 364.9 sent/s total: 366 sent
current: 427.4 sent/s avg: 396.1 sent/s total: 793 sent
Average: 477.7 sent/s
Total: 1434 sent
Runtime: 3.002s
Precision: 0.0
Recall: 0.0
F-Measure: -1.0
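In case it is relevant, a quick way to check how many annotated name spans the
evaluator is actually reading from the -data file would be something like the
sketch below (assuming the 1.5.x API; paths as in the run above). If it reports
zero gold spans, that would be consistent with the 0.0 recall:

import java.io.FileInputStream;
import java.io.InputStreamReader;

import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

public class CountGoldNames {
    public static void main(String[] args) throws Exception {
        // Parse the data file the same way the evaluator does and count the
        // gold <START:...> ... <END> spans it contains.
        ObjectStream<NameSample> samples = new NameSampleDataStream(
                new PlainTextByLineStream(new InputStreamReader(
                        new FileInputStream("/tmp/db-raw.txt"), "ISO-8859-1")));

        int spans = 0;
        NameSample sample;
        while ((sample = samples.read()) != null) {
            spans += sample.getNames().length;
        }
        System.out.println("Gold name spans in the data: " + spans);
    }
}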
What am I doing wrong?
Thanks
--
William Colen