Hey Rahil,

I can see what's wrong with your input data.  You need spaces around the
tags.

Right now, you have:

some text <START:Entity>blah<END>some text

Instead, you need:

some text <START:Entity> blah <END> some text
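
If the file is large, a small script can add the spaces for you. Below is a
minimal Python sketch, not anything that ships with OpenNLP: the function name
pad_tags and the regular expression are my own illustration. It only pads
the <START:...> and <END> tags with whitespace; it does not tokenize the rest
of the text (punctuation and so on), which you may still need to do.

```python
import re

# Matches an opening tag such as <START:author>, or the closing tag <END>,
# together with any whitespace already around it.
TAG = re.compile(r"\s*(<START:[^>\s]+>|<END>)\s*")


def pad_tags(line: str) -> str:
    """Return the line with whitespace on both sides of every tag.

    Hypothetical helper for illustration; not part of OpenNLP.
    """
    padded = TAG.sub(r" \1 ", line)
    # Adjacent tags would otherwise end up with two spaces between them.
    padded = re.sub(r" {2,}", " ", padded)
    return padded.strip()


if __name__ == "__main__":
    print(pad_tags("some text <START:Entity>blah<END>some text"))
```

You could run it over each line of en-author-person.train and write the result
to a new file before training again. Running it on a line that is already
spaced correctly leaves that line unchanged.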



On Thu, Nov 28, 2013 at 6:47 AM, Jörn Kottmann <[email protected]> wrote:

> On 11/28/2013 02:59 PM, Rahil Bohra wrote:
>
>> Hey Everyone.
>>
>> I am trying to train the opennlp name finder, here is the structure of my
>> training data:
>>
>> Upon hearing of <START:author>Italo Calvino<END>’s death in September of
>> 1985, <START:author>John Updike<END> commented,
>> “<START:author>Calvino<END>
>> was a genial as well as brilliant writer.
>>
>> What is the nature of your dreams? Are you more interested in Jung than
>> you
>> are in Freud?
>>
>> Once after reading <START:author>Freud<END>’s <START:title>The
>> Interpretation of Dreams<END> I went to bed.
>>
>> I dreamt.
>>
>> Unfortunately, when I run the trainer with "opennlp TokenNameFinderTrainer
>> -lang en -encoding UTF-8 -data en-author-person.train -model
>> en-author-person.bin", the output is as follows;
>>
>> Indexing events using cutoff of 5
>>
>> Computing event counts...  done. 27904 events
>> Indexing...  done.
>> Sorting and merging events... done. Reduced 27904 events to 26448.
>> Done indexing.
>> Incorporating indexed data for training...
>> done.
>> Number of Event Tokens: 26448
>>      Number of Outcomes: 1
>>    Number of Predicates: 7748
>> ...done.
>> Computing model parameters ...
>> Performing 100 iterations.
>>    1:  ... loglikelihood=0.0 1.0
>>    2:  ... loglikelihood=0.0 1.0
>> Exception in thread "main" java.lang.IllegalArgumentException: Model not
>> compatible with name finder!
>>
>> What am I doing wrong? I read that I need spaces between the token and the
>> tag, but when these were added, the output is the same.
>>
>
>
> OpenNLP doesn't fail nicely if there are fundamental issues with the
> training data.
> What is wrong in your case?
>
> This output line
>
> "Number of Outcomes: 1"
>
> usually indicates that you don't have a single name annotation in your
> training data. The trained classification
> model has only one class. The name finder model has a check which fails,
> because that is not a valid model.
>
> We should open a Jira issue and fix this, so that the name finder trainer
> fails nicely with an exception which indicates
> the actual problem.
>
> Jörn
>
>
>
