Hi Jorn,

I'll go with training a model myself.

Here is what I've done so far:

1. I've written the following text into a file called test.train
<START:Product_entities>icm2500<END>
<START:Product_entities>prd_234<END>
.
.
.

2. I ran the following:

./opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data test.train -model en-ner-person.bin

3. I've added the line below to "sometext.txt":

 What is the risk value on icm2500. Delivery of prd_234 will be arrived
late. Watson is handling router_34.

4. I ran the command

./opennlp TokenNameFinder en-ner-person.bin < sometext.txt > output/output4.txt

Result: it returned the same line unchanged, instead of
What is the risk value on <START:Product_entities>icm2500<END> Delivery of
<START:Product_entities>prd_234<END> will be arrived late.......

Please tell me what I am doing wrong.

Thanks,
Vivek





On Tue, Jun 24, 2014 at 5:06 PM, Jörn Kottmann <[email protected]> wrote:

> On 06/24/2014 01:10 PM, Vivekanand Ittigi wrote:
>
>> Hi Jorn,
>>
>> I read the documentation at
>> http://opennlp.apache.org/documentation/manual/opennlp.html#tools.namefind.recognition.cmdline
>> but I felt I needed more information to put it into code.
>>
>> I understand that we need to train the model, but I could not follow how.
>> Can you please explain it so that I can start implementing it?
>>
>> Thanks,
>> Vivek
>>
>>
>> On Tue, Jun 24, 2014 at 3:28 PM, Jörn Kottmann <[email protected]>
>> wrote:
>>
>>> On 06/24/2014 09:44 AM, Vivekanand Ittigi wrote:
>>>
>>>> Hi,
>>>>
>>>> If I run a command like this on the command line
>>>>
>>>> ./opennlp TokenNameFinder en-ner-person.bin < input.txt > output.txt
>>>>
>>>> I get person names printed in output.txt, but I want to build my own
>>>> models so that my own entities are printed.
>>>>
>>>> E.g.
>>>>
>>>> 1. what is the risk value on icm2500.
>>>> 2. Delivery of prd_234 will be arrived late.
>>>> 3. Watson is handling router_34.
>>>>
>>>> If I pass in these lines, it should parse them and extract the
>>>> product_entities. icm2500, prd_234, router_34, etc. are all products (we
>>>> can save this information in a file and use it as a kind of lookup for
>>>> the models or OpenNLP).
>>>>
>>>> Can anyone please tell me how to do this?
>>>>
>>>>
>>> You need to train your own model. To do that you have to collect some
>>> of the texts and annotate them with the entities you wish to detect.
>>>
>>> Have a look at the documentation about the name finder. It explains how
>>> the training works.
>>>
>>
> For the training you need to produce annotated texts like the sample in
> the documentation.
> If you have a training data file in that format you can use the command
> line interface to actually train a model.
>
> The latest trunk version of OpenNLP can also be trained on files in the
> brat data format; those can be easily created with brat.
>
> Have a look here:
> http://brat.nlplab.org/index.html
>
> In my experience brat works quite well in the latest trunk version.
>
> To train with brat you need to add a suffix to the training command, like this:
> bin/opennlp TokenNameFinderTrainer.brat
> That command will print a help message explaining the inputs it needs.
>
> There is no need to write code to train a name finder model.
>
> Jörn
>
>
>
>
>
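
For reference, a minimal sketch of the annotated format Jörn describes,
assuming the conventions of the name finder sample in the OpenNLP manual:
one whitespace-tokenized sentence per line, with each <START:type> and
<END> tag separated from the entity tokens by spaces. The file name
product.train and the model name in the comments are hypothetical, and the
three sentences are only the ones from this thread; as far as I know the
manual recommends far more (on the order of 15000 sentences) for a model
that performs well. The grep is only an illustrative check, not part of
OpenNLP.

```shell
#!/bin/sh
# Write a tiny training file (hypothetical name: product.train).
# One tokenized sentence per line; tags are separated from tokens by spaces.
cat > product.train <<'EOF'
What is the risk value on <START:Product_entities> icm2500 <END> .
Delivery of <START:Product_entities> prd_234 <END> will be arrived late .
Watson is handling <START:Product_entities> router_34 <END> .
EOF

# Illustrative sanity check: list lines where a tag touches a token.
# grep exits non-zero when nothing matches, which means the file looks fine.
if grep -nE '<START:[^>]*>[^[:space:]]|[^[:space:]]<END>' product.train; then
  echo "tags must be separated from tokens by spaces"
else
  echo "format looks ok"
fi

# Training and tagging would then follow the commands used in this thread
# (the model name is hypothetical):
#   ./opennlp TokenNameFinderTrainer -encoding UTF-8 -lang en -data product.train -model en-ner-product.bin
#   ./opennlp TokenNameFinder en-ner-product.bin < sometext.txt > output/output4.txt
```

Run against the test.train shown at the top of this message, the same grep
would flag both lines, because there the tags are glued to the entity text
and the lines carry no surrounding sentence.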
