Oh, I forgot one thing: does the order of the surrounding tokens matter? I
mean, if I train with:

my name is PERSON

when it searches for the entity, will it match "name is" exactly, or is
"is name" treated as the same thing? (Or maybe I need to write the "negative"
version of it too?)
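
To make the question concrete, here is a rough sketch of how a window of 2
tokens on each side could be turned into features. It assumes each context
token is tagged with its offset from the current token; the function name
window_features and the feature names (p1w=, n1w=, ...) are made up for
illustration and are not taken from the OpenNLP source. Under that assumption
"name is" and "is name" would give different features, and since a maxent
model weighs features rather than matching strings exactly, the swapped order
would not count as the same context.

```python
def window_features(tokens, index, prev_size=2, next_size=2):
    """Return position-tagged context features for tokens[index].

    Hypothetical helper: the feature naming is an assumption for
    illustration, not OpenNLP's actual implementation.
    """
    # Feature for the current token itself.
    features = ["w=" + tokens[index].lower()]
    # Tokens to the left, tagged with their distance (p1 = nearest).
    for distance in range(1, prev_size + 1):
        pos = index - distance
        if pos >= 0:
            features.append(f"p{distance}w={tokens[pos].lower()}")
    # Tokens to the right, tagged with their distance (n1 = nearest).
    for distance in range(1, next_size + 1):
        pos = index + distance
        if pos < len(tokens):
            features.append(f"n{distance}w={tokens[pos].lower()}")
    return features


# Same words around the name, different order.
trained = window_features("my name is John".split(), 3)
swapped = window_features("my is name John".split(), 3)
shared = set(trained) & set(swapped)
```

With offset-tagged features, the two contexts only share the feature for the
name token itself, so what was learned from "name is" would not carry over to
"is name" in this sketch.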

2016-08-12 16:51 GMT+02:00 Damiano Porta <damianopo...@gmail.com>:

> Ok thank you so much guys!
>
> 2016-08-12 16:43 GMT+02:00 William Colen <william.co...@gmail.com>:
>
>> You need to train with a corpus that is as close as possible to your
>> runtime corpus. If your runtime corpus looks like that, I think it is OK.
>> Otherwise, the model can learn that entities occur too often, for example
>> that there is an entity in the middle of every window.
>>
>>
>> 2016-08-12 11:35 GMT-03:00 Damiano Porta <damianopo...@gmail.com>:
>>
>> > OK, but why not just ignore all the other tokens? I mean, when I write
>> > 2 TOKENS + ENTITY + 2 TOKENS, I am interested in finding the entity with
>> > these surrounding tokens, so the other "cases" could be ignored. No?
>> >
>> > Why do I need to write all the other cases when they must be ignored?
>> >
>> > 2016-08-12 16:26 GMT+02:00 William Colen <william.co...@gmail.com>:
>> >
>> > > You also need examples of what is not an entity.
>> > >
>> > >
>> > > 2016-08-12 11:21 GMT-03:00 Damiano Porta <damianopo...@gmail.com>:
>> > >
>> > > > Hello everyone,
>> > > > pardon the stupid question, but I really do not get the point of
>> > > > training a maxent model with complete sentences.
>> > > >
>> > > > For example:
>> > > >
>> > > > <START:person> Pierre Vinken <END> , 61 years old , will join the
>> > > > board as a nonexecutive director Nov. 29 .
>> > > >
>> > > > It has ~20 tokens.
>> > > > As described here:
>> > > > https://opennlp.apache.org/documentation/1.6.0/manual/opennlp.html#tools.namefind.training.featuregen
>> > > > the default window should be 2 tokens on the left and 2 tokens on the
>> > > > right of the entity. So, what's the point of writing the entire
>> > > > sentence if there are no other entities?
>> > > >
>> > > > As far as I understand it, it should take into account
>> > > > Pierre Vinken (as the entity name) and "," "61" as the next 2 tokens.
>> > > > So, why do we need "*years old , will join the board as a
>> > > > nonexecutive*"?
>> > > >
>> > > > Thank you in advance for the clarification!
>> > > >
>> > > > Best
>> > > > Damiano
>> > > >
>> > >
>> >
>>
>
>
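
On the point about needing examples of what is not an entity: a small sketch
of how a fully annotated sentence might be expanded into one labelled example
per token. The function label_tokens is hypothetical, and the outcome names
(person-start, person-cont, other) are an assumption about how a name finder
could label tokens, not a statement of what OpenNLP does internally. The idea
is that every token outside the <START:...> ... <END> span becomes a negative
example, which is what the rest of the sentence contributes.

```python
def label_tokens(annotated):
    """Turn an annotated sentence into (token, outcome) pairs.

    Hypothetical helper; outcome names are assumed for illustration.
    """
    pairs = []
    entity_type = None  # type of the entity we are currently inside, if any
    first = False       # True until the first token of the entity is emitted
    for tok in annotated.split():
        if tok.startswith("<START:") and tok.endswith(">"):
            entity_type = tok[len("<START:"):-1]
            first = True
        elif tok == "<END>":
            entity_type = None
        elif entity_type is None:
            # Outside any annotation: a negative ("not an entity") example.
            pairs.append((tok, "other"))
        else:
            suffix = "start" if first else "cont"
            pairs.append((tok, f"{entity_type}-{suffix}"))
            first = False
    return pairs


sentence = ("<START:person> Pierre Vinken <END> , 61 years old , will join "
            "the board as a nonexecutive director Nov. 29 .")
pairs = label_tokens(sentence)
negatives = [tok for tok, outcome in pairs if outcome == "other"]
```

In this sketch the sentence from the thread yields two positive examples and
many "other" examples; cutting the sentence down to the entity plus 2 tokens
on each side would discard most of the negatives.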
