[ 
https://issues.apache.org/jira/browse/OPENNLP-9?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970017#action_12970017
 ] 

Jörn Kottmann commented on OPENNLP-9:
-------------------------------------

The training data contains only tokens which are
the beginning or a continuation of a name, and no "other"
tokens at all.

If the name finder is trained like this, it will always
estimate that these are the only two valid outcomes. That should
actually be possible (though maybe not useful).
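
To illustrate the point about outcomes, here is a minimal standalone
sketch (not OpenNLP code) that labels each token of some training lines
and counts the labels. It assumes the tags are separated from the tokens
by whitespace, and the label strings "start", "cont" and "other" as well
as the class name OutcomeCounter are chosen for illustration only; the
strings used internally by the name finder may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OutcomeCounter {

    // Illustrative outcome labels (an assumption, not taken from OpenNLP).
    static final String START = "start";
    static final String CONT = "cont";
    static final String OTHER = "other";

    /** Counts per-token outcomes over whitespace-tokenized training lines. */
    static Map<String, Integer> countOutcomes(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            boolean inName = false; // true between <START...> and <END>
            boolean first = false;  // true if the next token opens a name
            for (String tok : line.trim().split("\\s+")) {
                if (tok.isEmpty()) {
                    continue; // blank line
                }
                if (tok.startsWith("<START") && tok.endsWith(">")) {
                    inName = true;
                    first = true;
                } else if (tok.equals("<END>")) {
                    inName = false;
                } else {
                    String outcome = !inName ? OTHER : (first ? START : CONT);
                    first = false;
                    counts.merge(outcome, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Names only: every token is the start or continuation of a name.
        List<String> namesOnly = List.of(
            "<START:person> Neil Abercrombie <END>",
            "<START:person> Gary Ackerman <END>");
        // A name inside a sentence also produces "other" tokens.
        List<String> withContext = List.of(
            "<START:person> Gary Ackerman <END> spoke on Tuesday .");

        System.out.println(countOutcomes(namesOnly));   // {start=2, cont=2}
        System.out.println(countOutcomes(withContext)); // {start=1, cont=1, other=4}
    }
}
```

With names-only input no "other" label is ever produced, so a model
trained on such events has no evidence for that outcome.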

I haven't looked at the source code, but I guess the error is caused by
a bug in the outcome validation code. We should add your case
to the unit tests and fix the problem.

> Training name finder only with names fails
> ------------------------------------------
>
>                 Key: OPENNLP-9
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-9
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Name Finder
>    Affects Versions: 1.5.0
>            Reporter: Jörn Kottmann
>
> A. Allen:
> "I followed the instructions
> in the wiki and used pieces of the sample code, but keep getting the
> following:
> Indexing events using cutoff of 5
> Computing event counts...  done. 29376 events
> Indexing...  done.
> Sorting and merging events... done. Reduced 29376 events to 8313.
> Done indexing.
> Incorporating indexed data for training...
> done.
> Number of Event Tokens: 8313
>     Number of Outcomes: 1
>   Number of Predicates: 11869
> ...done.
> Computing model parameters...
> Performing 100 iterations.
>   1:  .. loglikelihood=0.0 1.0
>   2:  .. loglikelihood=0.0 1.0
> Exception in thread "main" java.lang.IllegalArgumentException: Model not compatible with name finder!
>     at opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:50)
>     at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:350)
>     at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:356)
>     at NameTrainer.main(NameTrainer.java:21)
> My training data looks like this:
> <START:person>Neil Abercrombie<END>
> <START:person>Anibal Acevedo-Vila<END>
> <START:person>Gary Ackerman<END>
> <START:person>Robert Aderholt<END>
> <START:person>Daniel Akaka<END>
> <START:person>Todd Akin<END>
> <START:person>Lamar Alexander<END>
> <START:person>Rodney Alexander<END>
> "

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
