On 11/28/2013 02:59 PM, Rahil Bohra wrote:
Hey Everyone.

I am trying to train the opennlp name finder, here is the structure of my
training data:

Upon hearing of <START:author>Italo Calvino<END>’s death in September of
1985, <START:author>John Updike<END> commented, “<START:author>Calvino<END>
was a genial as well as brilliant writer.

What is the nature of your dreams? Are you more interested in Jung than you
are in Freud?

Once after reading <START:author>Freud<END>’s <START:title>The
Interpretation of Dreams<END> I went to bed.

I dreamt.

Unfortunately, when I run the trainer with "opennlp TokenNameFinderTrainer
-lang en -encoding UTF-8 -data en-author-person.train -model
en-author-person.bin", the output is as follows:

Indexing events using cutoff of 5

Computing event counts...  done. 27904 events
Indexing...  done.
Sorting and merging events... done. Reduced 27904 events to 26448.
Done indexing.
Incorporating indexed data for training...
done.
Number of Event Tokens: 26448
     Number of Outcomes: 1
   Number of Predicates: 7748
...done.
Computing model parameters ...
Performing 100 iterations.
   1:  ... loglikelihood=0.0 1.0
   2:  ... loglikelihood=0.0 1.0
Exception in thread "main" java.lang.IllegalArgumentException: Model not
compatible with name finder!

What am I doing wrong? I read that I need spaces between the token and the
tag, but when I added these, the output was the same.


OpenNLP doesn't fail nicely when there are fundamental issues with the training data.
So what is wrong in your case?

This output line

"Number of Outcomes: 1"

usually indicates that the trainer did not find a single name annotation in your training data. The trained classification model then has only one class (outcome). The name finder model has a check which rejects such a model, because it is not valid for name finding.
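
One quick way to see whether the trainer can find any annotations is to scan the training file before training. The following is only a rough sketch in Python, not part of OpenNLP: it assumes that the name finder training format is split on whitespace, so that <START:type> and <END> only count as tags when they stand alone as tokens, and that each sentence sits on its own line. The function names check_line and check_file are made up for this illustration.

```python
import re

# Assumption: a tag only counts when it is a whole whitespace-delimited token.
START_TAG = re.compile(r"^<START(:[^:>\s]*)?>$")
END_TAG = "<END>"


def check_line(line):
    """Return (complete_spans, glued_tokens) for one training sentence."""
    spans = 0
    glued = []
    open_span = False
    for token in line.split():
        if START_TAG.match(token):
            open_span = True
        elif token == END_TAG:
            if open_span:
                spans += 1
            open_span = False
        elif "<START" in token or "<END>" in token:
            # Tag text fused with a word, e.g. "<START:author>Italo".
            glued.append(token)
    return spans, glued


def check_file(path, encoding="utf-8"):
    """Print suspicious tokens per line and return the total span count."""
    total = 0
    with open(path, encoding=encoding) as handle:
        for number, line in enumerate(handle, start=1):
            spans, glued = check_line(line)
            total += spans
            if glued:
                print(f"line {number}: tags not separated by spaces: {glued}")
    return total
```

Calling check_file("en-author-person.train") on the data above would report tokens such as "<START:author>Italo" and return a total of 0, which would match the single outcome seen in the training log. If the total is still 0 after adding spaces, it may be worth checking that the edited file is the one actually passed to -data.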

We should open a Jira issue and fix this, so that the name finder trainer fails nicely with an exception which indicates
the actual problem.

Jörn

