On 06/24/2014 01:10 PM, Vivekanand Ittigi wrote:
Hi Jorn,

I read the documentation at
http://opennlp.apache.org/documentation/manual/opennlp.html#tools.namefind.recognition.cmdline
but I felt I needed more information before I could put it into code.

I understand that we need to train the model, but I could not work out how.
Could you please explain it so that I can start implementing it?

Thanks,
Vivek



On Tue, Jun 24, 2014 at 3:28 PM, Jörn Kottmann <[email protected]> wrote:

On 06/24/2014 09:44 AM, Vivekanand Ittigi wrote:

Hi,

If I run a command like this on the command line

./opennlp TokenNameFinder en-ner-person.bin < input.txt > output.txt

I get person names printed in output.txt, but I want to write my own models
so that it prints my own entities.

E.g.

1. what is the risk value on icm2500.
2. Delivery of prd_234 will be arrived late.
3. Watson is handling router_34.

If I pass in these lines, it should parse them and extract the product
entities: icm2500, prd_234, router_34, etc. These are all products (we can
save this information in a file and use it as a kind of lookup for the models
or OpenNLP).

Can anyone please tell me how to do this?


You need to train your own model. To do that you have to collect some of the
texts and annotate them with the entities you wish to detect.

Have a look at the documentation about the name finder. It explains how the
training works.

For the training you need to produce annotated texts like the sample in the
documentation. If you have a training data file in that format you can use the
command line interface to actually train a model.
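
As a rough sketch of those two steps (annotate, then train from the command
line), assuming the product names are kept in a lookup file as suggested
earlier in the thread: the file names products.txt, sentences.txt,
en-ner-product.train and en-ner-product.bin are made up for illustration, and
the entity type "product" is an assumption. The awk pass only matches
single-token names, so it is a shortcut for bootstrapping annotations that
still need a manual review, not a replacement for annotating by hand. The
trainer options follow the example in the name finder section of the manual.

```shell
# Hypothetical inputs: one product name per line, and one
# whitespace-tokenized sentence per line (the examples from this thread).
printf '%s\n' icm2500 prd_234 router_34 > products.txt
cat > sentences.txt <<'EOF'
what is the risk value on icm2500 .
Delivery of prd_234 will be arrived late .
Watson is handling router_34 .
EOF

# First pass (NR == FNR) loads the product names; the second pass wraps
# every token that matches a product name in <START:product> ... <END>.
awk 'NR == FNR { products[$1] = 1; next }
     {
       line = ""
       for (i = 1; i <= NF; i++) {
         tok = ($i in products) ? "<START:product> " $i " <END>" : $i
         line = line (i > 1 ? " " : "") tok
       }
       print line
     }' products.txt sentences.txt > en-ner-product.train

cat en-ner-product.train

# Train only if the opennlp launcher is on the PATH. Three sentences show
# the format but are far too few to train a usable model.
if command -v opennlp >/dev/null 2>&1; then
  opennlp TokenNameFinderTrainer -model en-ner-product.bin -lang en \
    -data en-ner-product.train -encoding UTF-8
fi
```

With a realistically sized training file (the manual suggests roughly 15000
sentences for a well-performing model), the resulting model can be used in
place of en-ner-person.bin:
opennlp TokenNameFinder en-ner-product.bin < input.txt > output.txt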

The latest trunk version of OpenNLP can also be trained on files in the brat
data format, which can easily be created with brat.

Have a look here:
http://brat.nlplab.org/index.html

In my experience brat works quite well in the latest trunk version.

To train with brat you need to add a suffix to the training command, like this:

bin/opennlp TokenNameFinderTrainer.brat

That command will print a help message explaining the inputs it needs.

There is no need to write code to train a name finder model.

Jörn



