Re: Getting started with OpenNLP

Jörn Kottmann Mon, 27 Oct 2014 01:25:15 -0700

On 10/24/2014 06:20 PM, [email protected] wrote:

Hello all,


First off, thanks to all who contribute to this project! I've been tasked with doing some 
research on Apache Stanbol, which uses OpenNLP, to see if it can fill some roles in a few 
company projects. I've been reading about how to train a model for named entity 
recognition and it seems like the simplest case of "I have a list of n proper nouns, 
please just recognize them directly and nothing else" isn't addressed in the 
documentation. Is this too simple a use case? Would I be doing better to just use a 
simple substring match on a phrase then? I would later like to extend the model to 
recognize things other than just simple proper nouns, but for now, that is the simplest 
case I can think of.

The name finder is intended to find entities which are embedded in a text, e.g. news articles, medical records or company filings. It can even recognize names which it hasn't seen before by evaluating the context the entity appears in.

If you just have a list of proper nouns you might be better off using the doccat package instead of the name finder. The doccat component tries to assign categories to the entire input text, whereas the name finder labels each input token.
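
For the "I have a list of n proper nouns, just recognize them directly" case from the question, a plain lookup over tokens may already be enough. The following is only a minimal sketch in plain Java, not OpenNLP API: the class name ProperNounLookup, the method findNames and the whitespace tokenization are all assumptions made for illustration. It labels token spans the way a name finder would, but only by exact match against the list, so it cannot recognize unseen names from context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: exact-match lookup of known names.
public class ProperNounLookup {

    // Returns [start, end) token spans where a known name occurs.
    // When several names match at the same position, the longest one wins.
    static List<int[]> findNames(String[] tokens, List<String[]> names) {
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int best = 0;
            for (String[] name : names) {
                if (name.length > best && matchesAt(tokens, i, name)) {
                    best = name.length;
                }
            }
            if (best > 0) {
                spans.add(new int[] {i, i + best});
                i += best; // continue after the matched name
            } else {
                i++;
            }
        }
        return spans;
    }

    // True if every token of the name equals the input tokens starting at 'start'.
    static boolean matchesAt(String[] tokens, int start, String[] name) {
        if (start + name.length > tokens.length) {
            return false;
        }
        for (int k = 0; k < name.length; k++) {
            if (!tokens[start + k].equals(name[k])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumption: whitespace splitting stands in for a real tokenizer.
        String[] tokens = "Apache Stanbol uses OpenNLP for text analysis".split(" ");
        List<String[]> names = Arrays.asList(
            new String[] {"Apache", "Stanbol"},
            new String[] {"OpenNLP"});
        for (int[] span : findNames(tokens, names)) {
            System.out.println(String.join(" ", Arrays.copyOfRange(tokens, span[0], span[1])));
        }
    }
}
```

Matching on tokens rather than raw substrings avoids hits inside longer words. Once the list needs to grow beyond exact matches, a trained name finder model is the more natural next step.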


HTH,
Jörn
