Hi Jorn,

Let me try training the model itself.
Let me just say what I've done so far:

1. I wrote the following text into a file called test.train:

   <START:Product_entities>icm2500<END>
   <START:Product_entities>prd_234<END>
   .
   .
   .

2. I ran the following:

   ./opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data test.train -model en-ner-person.bin

3. I added the lines below to "sometext.txt":

   What is the risk value on icm2500.
   Delivery of prd_234 will be arrived late.
   Watson is handling router_34.

4. I ran the command:

   ./opennlp TokenNameFinder en-ner-person.bin < sometext.txt > output/output4.txt

Result: it printed the same lines back unchanged, instead of

   What is the risk value on <START:Product_entities>icm2500<END>
   Delivery of <START:Product_entities>prd_234<END> will be arrived late.
   ...

Please tell me what I am doing wrong.

Thanks,
Vivek

On Tue, Jun 24, 2014 at 5:06 PM, Jörn Kottmann <[email protected]> wrote:

> On 06/24/2014 01:10 PM, Vivekanand Ittigi wrote:
>
>> Hi Jorn,
>>
>> I read the document
>> http://opennlp.apache.org/documentation/manual/opennlp.html#tools.namefind.recognition.cmdline
>> But I felt I needed more information to put it in code.
>>
>> I got to know that we need to train the model, but I could not work out how.
>> Can you please explain it so that I can start implementing it?
>>
>> Thanks,
>> Vivek
>>
>> On Tue, Jun 24, 2014 at 3:28 PM, Jörn Kottmann <[email protected]> wrote:
>>
>>> On 06/24/2014 09:44 AM, Vivekanand Ittigi wrote:
>>>
>>>> Hi,
>>>>
>>>> If I use a query like this on the command line
>>>>
>>>> ./opennlp TokenNameFinder en-ner-person.bin <input.txt> <output.txt>
>>>>
>>>> I'll get person names printed in output.txt, but I want to write my own
>>>> models so that I can print my own entities.
>>>>
>>>> E.g.
>>>>
>>>> 1. what is the risk value on icm2500.
>>>> 2. Delivery of prd_234 will be arrived late.
>>>> 3. Watson is handling router_34.
>>>>
>>>> If I pass these lines, it should parse and extract product_entities.
>>>> icm2500, prd_234, router_34, etc. are all products (we can save this
>>>> information in a file and use it as a kind of lookup for the models or
>>>> OpenNLP).
>>>>
>>>> Can anyone please tell me how to do this?
>>>
>>> You need to train your own model. To do that you have to collect some of
>>> the texts and annotate them with the entities you wish to detect.
>>>
>>> Have a look at the documentation about the name finder. It explains how
>>> the training works.
>
> For the training you need to produce annotated texts like the sample in
> the documentation.
> If you have a training data file in that format you can use the command
> line interface to actually train a model.
>
> The latest trunk version of OpenNLP can also be trained on files in the
> brat data format; those can be easily created with brat.
>
> Have a look here:
> http://brat.nlplab.org/index.html
>
> In my experience brat works quite well in the latest trunk version.
>
> To train with brat you need to suffix the training command like this:
> bin/opennlp TokenNameFinderTrainer.brat
> That command will print a help message explaining the inputs it needs.
>
> There is no need to write code to train a name finder model.
>
> Jörn
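
For reference, a minimal sketch of the "annotated texts" format the quoted reply refers to, as far as I understand the name finder documentation: one whole sentence per line, tokens separated by whitespace, and each entity wrapped in <START:type> ... <END> tags that are themselves separated from the tokens by spaces. This differs from the test.train above, which lists only the bare entity names with no surrounding sentence and no spaces inside the tags. The three sentences are just the examples from this thread and are assumed to be far too few to train a usable model; the documentation suggests on the order of thousands of sentences. The sample writes "will arrive late" where the thread's test input says "will be arrived late".

```shell
# Write a tiny sample in the annotated-sentence format (illustration only).
# Each line is one tokenized sentence; the tags are space-separated tokens.
cat > test.train <<'EOF'
What is the risk value on <START:Product_entities> icm2500 <END> .
Delivery of <START:Product_entities> prd_234 <END> will arrive late .
Watson is handling <START:Product_entities> router_34 <END> .
EOF

# Sanity check: count the lines that contain a whitespace-delimited START tag.
grep -c '<START:Product_entities> ' test.train

# The training and tagging commands are the ones already used in this thread,
# shown as comments because they need a local OpenNLP install and a much
# larger training file:
#   ./opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data test.train -model en-ner-person.bin
#   ./opennlp TokenNameFinder en-ner-person.bin < sometext.txt > output/output4.txt
```

If the trainer accepts the file but the name finder still echoes the input unchanged, too little training data is the most likely remaining cause.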
