I recently wrote up a post on doing this in Java rather than from the command line. Maybe it will help; it includes Java code samples. http://johnmiedema.com/?p=744
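
To illustrate Mark's point in the quoted thread below (the name finder needs entities annotated inside the sentences they occur in, not as bare tokens), here is a minimal sketch that turns plain sentences plus a list of known product names into training lines. It assumes the training format shown in the OpenNLP manual: one whitespace-tokenized sentence per line, with spaces around the <START:type> and <END> tags. The class name AnnotateTrainingData, the type name "product", and the naive punctuation handling are assumptions made for illustration; this is not code from the linked post or from OpenNLP itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: builds name finder training lines from raw sentences.
public class AnnotateTrainingData {

    // Wraps every token that appears in `entities` with <START:type> ... <END>.
    // One trailing punctuation character is split into its own token first,
    // so "icm2500." still matches the entity "icm2500".
    static String annotate(String sentence, Set<String> entities, String type) {
        List<String> out = new ArrayList<>();
        for (String raw : sentence.trim().split("\\s+")) {
            String token = raw;
            String trailing = "";
            if (token.length() > 1
                    && ".,?!;:".indexOf(token.charAt(token.length() - 1)) >= 0) {
                trailing = token.substring(token.length() - 1);
                token = token.substring(0, token.length() - 1);
            }
            if (entities.contains(token)) {
                out.add("<START:" + type + "> " + token + " <END>");
            } else {
                out.add(token);
            }
            if (!trailing.isEmpty()) {
                out.add(trailing);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // Example product names and sentences taken from the thread.
        Set<String> products = Set.of("icm2500", "prd_234", "router_34");
        String[] sentences = {
            "What is the risk value on icm2500.",
            "Delivery of prd_234 will be arrived late.",
            "Watson is handling router_34."
        };
        for (String s : sentences) {
            System.out.println(annotate(s, products, "product"));
        }
    }
}
```

The printed lines could be saved as test.train and passed to TokenNameFinderTrainer as in the thread. Three sentences are only enough to show the format; a usable model would need many more annotated sentences.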

On Tue, Jun 24, 2014 at 8:53 AM, Vivekanand Ittigi <[email protected]> wrote:

> It means you want me to write small story integrating these entities.?
>
>
> On Tue, Jun 24, 2014 at 5:59 PM, Mark G <[email protected]> wrote:
>
> > Hello, you need to annotate the entity within some of the sentences it
> > occurs in. The name finder needs context. It's giving you the same
> > sentence back because it was trained to find any token anywhere.
> > Mg
> >
> >
> > > On Jun 24, 2014, at 8:12 AM, Vivekanand Ittigi <[email protected]> wrote:
> > >
> > > Hi Jorn,
> > >
> > > Let me use training model itself.
> > >
> > > Let me just say what i've done so far
> > >
> > > 1. I've written the following text into a file called test.train
> > > <START:Product_entities>icm2500<END>
> > > <START:Product_entities>prd_234<END>
> > > .
> > > .
> > > .
> > >
> > > 2. i ran the following
> > >
> > > ./opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data test.train -model en-ner-person.bin
> > >
> > > 3. I've added the below line in "sometext.txt"
> > >
> > > What is the risk value on icm2500. Delivery of prd_234 will be arrived
> > > late. Watson is handling router_34.
> > >
> > > 4. I ran the command
> > >
> > > ./opennlp TokenNameFinder en-ner-person.bin <sometext.txt> output/output4.txt
> > >
> > > result: It threw me the same line instead of What is the risk value on
> > > <START:Product_entities>icm2500<END> Delivery of
> > > <START:Product_entities>prd_234<END> will be arrived late.......
> > >
> > > Please tell me what am i doing wrong??????
> > >
> > > Thanks,
> > > Vivek
> > >
> > >
> > >> On Tue, Jun 24, 2014 at 5:06 PM, Jörn Kottmann <[email protected]> wrote:
> > >>
> > >>> On 06/24/2014 01:10 PM, Vivekanand Ittigi wrote:
> > >>>
> > >>> Hi Jorn,
> > >>>
> > >>> I read the document
> > >>> http://opennlp.apache.org/documentation/manual/opennlp.html#tools.namefind.recognition.cmdline
> > >>> But i felt i needed more information to put it in code.
> > >>>
> > >>> I got to know that we need to train the model. But could not get it.
> > >>> Can you please explain it. so that i could start implementing it.
> > >>>
> > >>> Thanks,
> > >>> Vivek
> > >>>
> > >>>
> > >>> On Tue, Jun 24, 2014 at 3:28 PM, Jörn Kottmann <[email protected]> wrote:
> > >>>
> > >>>> On 06/24/2014 09:44 AM, Vivekanand Ittigi wrote:
> > >>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> If i use a query like this in command line
> > >>>>>
> > >>>>> ./opennlp TokenNameFinder en-ner-person.bin <input.txt> > <output.txt>
> > >>>>>
> > >>>>> I'll get person names printed in output.txt but I want to write own
> > >>>>> models such that i should print my own entities.
> > >>>>>
> > >>>>> E.g.
> > >>>>>
> > >>>>> 1. what is the risk value on icm2500.
> > >>>>> 2. Delivery of prd_234 will be arrived late.
> > >>>>> 3. Watson is handling router_34.
> > >>>>>
> > >>>>> If i pass these lines, it should parse and extract product_entities.
> > >>>>> icm2500, prd_234, router_34... etc these are all Products( we can save
> > >>>>> this information in a file and we can use it as look up kind of for
> > >>>>> models or openNLP).
> > >>>>>
> > >>>>> Can anyone please tell me how to do this ?
> > >>>>>
> > >>>>
> > >>>> You need to train your own model. To do that you have to collect some
> > >>>> of the texts and annotate them with the entities you wish to detect.
> > >>>>
> > >>>> Have a look at the documentation about the name finder. It explains
> > >>>> how the training works.
> > >>>
> > >> For the training you need to produce annotated texts like the sample in
> > >> the documentation.
> > >> If you have a training data file in that format you can use the command
> > >> line interface to actually train a model.
> > >>
> > >> The latest trunk version of OpenNLP can also be trained on files in the
> > >> brat data format, those can be easily created with brat.
> > >>
> > >> Have a look here:
> > >> http://brat.nlplab.org/index.html
> > >>
> > >> In my experience brat works quite well in the latest trunk version.
> > >>
> > >> To train with brat you need to suffix the training command like this
> > >> bin/opennlp TokenNameFinderTrainer.brat
> > >> That command will print a help message explaining the inputs it needs.
> > >>
> > >> There is no need to write code to train a name finder model.
> > >>
> > >> Jörn
> > >>
> >
>

--
_________________________________________
johnmiedema.com
