Have a look at our documentation. The NER code you see there is correct. If you have problems detecting multi-token names, I suspect that something is wrong with your training data.
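One quick way to check the training data is to list the multi-token names it actually contains. The sketch below assumes the usual name finder training format: one whitespace-tokenized sentence per line, with each name wrapped as `<START:type> ... <END>`. `TrainingDataCheck` and `multiTokenNames` are hypothetical names for illustration, not part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: lists the annotated names in one
// line of name finder training data that span more than one token.
final class TrainingDataCheck {

    static List<String> multiTokenNames(String line) {
        List<String> names = new ArrayList<>();
        List<String> current = null; // non-null while inside <START...> ... <END>
        for (String token : line.trim().split("\\s+")) {
            if (token.startsWith("<START") && token.endsWith(">")) {
                // Opening marker, with or without a type such as ":person".
                current = new ArrayList<>();
            } else if (token.equals("<END>")) {
                // Closing marker: keep the name only if it has several tokens.
                if (current != null && current.size() > 1) {
                    names.add(String.join(" ", current));
                }
                current = null;
            } else if (current != null) {
                current.add(token);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String line = "<START:person> Pierre Vinken <END> , 61 years old , will join "
                + "<START:organization> Elsevier <END> as a director .";
        System.out.println(multiTokenNames(line));
    }
}
```

If this turns up few or no multi-token names across your training file, or names that are split differently from how your tokenizer splits them at run time, that is the first thing to fix.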
The Name Finder takes one tokenized sentence at a time. After you are done with a document you should clear the adaptive data. POS tags are not used by the Name Finder and cannot be passed to it.

Jörn

On Wed, Feb 8, 2012 at 3:02 PM, Jim - FooBar(); <[email protected]> wrote:

> Any chance you remember whether you tokenized the sentences *and
> pos-tagged the tokens* before feeding them to the maxent NER model? I'm
> asking because the docs say you *ONLY* need to tokenize sentences before
> sending them over to the trained model. However, i just stumbled upon this
> website: http://tech.knime.org/named-entity-recognizer-and-tag-cloud-example
>
> which states:
>
> " After pos tagging, the names entities can be tagged. The "OpenNLP NE
> tagger" node uses an OpenNLP NER model to tag the data. It is suggested to
> apply the NE tagger nodes after the pos tagger, in order to keep the named
> entities consisting of multi-words. "
>
> According to this i must pos-tag the tokens and NOT SIMPLY tokenize them
> if i want to keep multi-word entities as one!!! Could this be the case? Can
> you remember the details from your case?
>
> Regards,
> Jim
>
> On 08/02/12 11:44, Aliaksandr Autayeu wrote:
>
>> Yes, we had multiword entities. Actually, the dataset was quite "dirty" and
>> "funny" - there were names like "al`XXX" and "al XXX" and some other where
>> the separator was some funny unicode character. But I don't remember any
>> problems similar to those you have (I followed the thread). But that was
>> OpenNLP 1.4.0 or 1.4.3, somewhere in that range. I don't have exact figures
>> now, but I've fished out a precision (for one class) from an old
>> email: 80.98%
>>
>> Aliaksandr
>>
>> On Wed, Feb 8, 2012 at 11:45 AM, Jim - FooBar(); <[email protected]> wrote:
>>
>>> Hi there Autayeu,
>>>
>>> Did you have any multi-word entities in your annotated corpus?
>>> If yes, how did the maxent NER model perform? Could it find them or was it
>>> just finding single-word entities?
>>> If you don't understand why i'm asking have a look at the previous
>>> messages....
>>>
>>> I really appreciate the help...
>>>
>>> Regards,
>>> Jim
>>>
>>> On 08/02/12 10:39, Aliaksandr Autayeu wrote:
>>>
>>>>> p.s: have you ever done any serious NER (not for demonstration purposes)
>>>>> using openNLP?
>>>>
>>>> I did experiments (more than a year ago, with 1.4.3) for standard three
>>>> classes, got the state of the art for our private corpus, but then we
>>>> changed approach.
>>>>
>>>> Aliaksandr
