Re: Problem with openNLP Name Finder API....

Joern Kottmann Wed, 08 Feb 2012 07:32:50 -0800

Have a look at our documentation. The NER code you see there is correct.
If you have problems detecting multi-token names, I suspect that something
is wrong with your training data.
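
One way to check the training data: the documented name finder format is one
whitespace-tokenized sentence per line, with each name wrapped in
<START:type> ... <END> and every tag separated from the tokens by spaces.
Below is a minimal sketch of a checker under that assumption. The names
TrainingLineSketch, NameSpan and parse, and the "untyped" label, are made up
for illustration and are not part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: checks one line of name finder training data and lists its name spans.
final class TrainingLineSketch {

    /** One annotated name covering tokens [start, end). */
    static final class NameSpan {
        final int start;
        final int end;
        final String type;

        NameSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /** Returns the name spans of one annotated sentence, or throws if the markup is malformed. */
    static List<NameSpan> parse(String line) {
        List<NameSpan> names = new ArrayList<>();
        int tokenCount = 0;   // number of real tokens seen so far (tags are not tokens)
        int openStart = -1;   // token index where the currently open name began, -1 if none
        String openType = null;

        for (String piece : line.trim().split("\\s+")) {
            if (piece.isEmpty()) {
                continue; // blank input line
            }
            boolean typedStart = piece.startsWith("<START:") && piece.endsWith(">");
            if (piece.equals("<START>") || typedStart) {
                if (openStart >= 0) {
                    throw new IllegalArgumentException("<START> inside an open name: " + line);
                }
                openStart = tokenCount;
                // "untyped" is a label made up for this sketch.
                openType = typedStart ? piece.substring(7, piece.length() - 1) : "untyped";
            } else if (piece.equals("<END>")) {
                if (openStart < 0 || openStart == tokenCount) {
                    throw new IllegalArgumentException("<END> without a non-empty open name: " + line);
                }
                names.add(new NameSpan(openStart, tokenCount, openType));
                openStart = -1;
            } else if (piece.contains("<START") || piece.contains("<END>")) {
                // e.g. "<START:person>Pierre" or "Vinken<END>": a tag glued to a token.
                throw new IllegalArgumentException("tag not separated by whitespace: " + piece);
            } else {
                tokenCount++;
            }
        }
        if (openStart >= 0) {
            throw new IllegalArgumentException("missing <END>: " + line);
        }
        return names;
    }

    public static void main(String[] args) {
        String line = "<START:person> Pierre Vinken <END> , 61 years old , will join the board .";
        for (NameSpan name : parse(line)) {
            System.out.println(name.type + " covers tokens [" + name.start + ", " + name.end + ")");
        }
    }
}
```

A multi-token name should come out as one span covering several tokens, here
[0, 2) for "Pierre Vinken". If every span in your corpus covers only a single
token, or lines are rejected because a tag is glued to a word, the annotation
is the first place to look.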


The Name Finder takes one tokenized sentence at a time. After you are done
with a document, you should clear the adaptive data. POS tags are not used
by the Name Finder and cannot be passed to it.
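
The calling pattern can be sketched roughly as follows. To keep the example
runnable without a trained model, it uses a stand-in interface
(SentenceNameFinder) and a toy finder (CapitalizedRunFinder); both are
hypothetical and only mimic the shape of the real calls. In OpenNLP itself
the corresponding methods are NameFinderME.find(String[] tokens), which
returns Span objects with a start and an exclusive end token index, and
NameFinderME.clearAdaptiveData().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PerDocumentLoopSketch {

    /** Stand-in with the shape of the two calls discussed here; this is not the OpenNLP class. */
    interface SentenceNameFinder {
        int[][] find(String[] tokens); // each entry is {firstToken, endTokenExclusive}
        void clearAdaptiveData();
    }

    /** Toy finder for the demo: every run of capitalized tokens counts as one name. */
    static final class CapitalizedRunFinder implements SentenceNameFinder {
        int clearCalls = 0;

        @Override
        public int[][] find(String[] tokens) {
            List<int[]> spans = new ArrayList<>();
            int start = -1;
            for (int i = 0; i <= tokens.length; i++) {
                boolean cap = i < tokens.length && !tokens[i].isEmpty()
                        && Character.isUpperCase(tokens[i].charAt(0));
                if (cap && start < 0) {
                    start = i;                        // a run of capitalized tokens begins
                } else if (!cap && start >= 0) {
                    spans.add(new int[] {start, i});  // the run ended just before token i
                    start = -1;
                }
            }
            return spans.toArray(new int[0][]);
        }

        @Override
        public void clearAdaptiveData() {
            clearCalls++;
        }
    }

    /** Runs the finder over one document and returns the names as plain strings. */
    static List<String> namesInDocument(SentenceNameFinder finder, String[][] sentences) {
        List<String> names = new ArrayList<>();
        for (String[] tokens : sentences) {           // one tokenized sentence per call
            for (int[] span : finder.find(tokens)) {
                // A multi-token name is a single span over several tokens.
                names.add(String.join(" ", Arrays.copyOfRange(tokens, span[0], span[1])));
            }
        }
        finder.clearAdaptiveData();                   // once, after the whole document
        return names;
    }

    public static void main(String[] args) {
        String[][] document = {
            {"the", "chairman", "Pierre", "Vinken", "spoke", "."},
            {"he", "met", "Rudolph", "Agnew", "."}
        };
        System.out.println(namesInDocument(new CapitalizedRunFinder(), document));
    }
}
```

Note that only token arrays go in; there is no parameter for POS tags.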

Jörn

On Wed, Feb 8, 2012 at 3:02 PM, Jim - FooBar(); <[email protected]> wrote:

> Any chance you remember whether you tokenized the sentences *and
> pos-tagged the tokens* before feeding them to the maxent NER model? I'm
> asking because the docs say you *ONLY* need to tokenize sentences before
> sending them over to the trained model. However, I just stumbled upon this
> website: http://tech.knime.org/named-entity-recognizer-and-tag-cloud-example
>
> which states:
>
> " After pos tagging, the names entities can be tagged. The "OpenNLP NE
> tagger" node uses an OpenNLP NER model to tag the data. It is suggested to
> apply the NE tagger nodes after the pos tagger, in order to keep the named
> entities consisting of multi-words. "
>
> According to this, I must pos-tag the tokens and NOT SIMPLY tokenize them
> if I want to keep multi-word entities as one!!! Could this be the case? Can
> you remember the details from your case?
>
> Regards,
> Jim
>
>
>
>
> On 08/02/12 11:44, Aliaksandr Autayeu wrote:
>
>> Yes, we had multiword entities. Actually, the dataset was quite "dirty"
>> and "funny" - there were names like "al`XXX" and "al XXX" and some others
>> where the separator was some funny unicode character. But I don't remember
>> any problems similar to those you have (I followed the thread). But that
>> was OpenNLP 1.4.0 or 1.4.3, somewhere in that range. I don't have exact
>> figures now, but I've fished out a precision (for one class) from an old
>> email: 80.98%
>>
>> Aliaksandr
>>
>> On Wed, Feb 8, 2012 at 11:45 AM, Jim - FooBar(); <[email protected]> wrote:
>>
>>> Hi there Autayeu,
>>>
>>> Did you have any multi-word entities in your annotated corpus?
>>> If yes, how did the maxent NER model perform? Could it find them or was
>>> it just finding single-word entities?
>>> If you don't understand why I'm asking, have a look at the previous
>>> messages....
>>>
>>> I really appreciate the help...
>>>
>>> Regards,
>>> Jim
>>>
>>>
>>>
>>> On 08/02/12 10:39, Aliaksandr Autayeu wrote:
>>>
>>>>> p.s: have you ever done any serious NER (not for demonstration purposes)
>>>>> using openNLP?
>>>>
>>>> I did experiments (more than a year ago, with 1.4.3) for standard three
>>>> classes, got the state of the art for our private corpus, but then we
>>>> changed approach.
>>>>
>>>> Aliaksandr
>>>>
>>>>
>>>>
>
