I presume that context is the text category (e.g. manuals, news texts and so 
forth); you can apply your model to any source text, but your recall and 
precision will suffer if your training corpus differs too much from the text.

The easiest method for generating a training corpus is to search the internet 
for sentences in which your entities occur and manually mark each entity 
according to its type.
You can either do this yourself or use a service such as Amazon Mechanical 
Turk.
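
As a rough illustration of the marking step, here is a minimal Python sketch 
that pre-annotates collected sentences from a list of known names, so that a 
human only has to review the result. It assumes the OpenNLP name finder 
training format (one whitespace-tokenized sentence per line, with the tags 
separated from the tokens by spaces). The ENTITIES table, the crude tokenizer 
and the annotate() helper are made up for this example and are not part of 
OpenNLP.

```python
import re

# Hypothetical entity list: surface form -> entity type.
# Alternate spellings are simply listed as extra surface forms.
ENTITIES = {
    "The Home Depot": "company",
    "Black and Decker": "company",
    "Black & Decker": "company",
    "Ryobi": "company",
}


def tokenize(text):
    """Crude tokenizer: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def annotate(sentence, entities=ENTITIES):
    """Return the sentence as space-separated tokens with known names tagged."""
    tokens = tokenize(sentence)
    # Try longer names first so a long name wins over a shorter overlapping one.
    names = sorted(
        ((tokenize(name), etype) for name, etype in entities.items()),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    out = []
    i = 0
    while i < len(tokens):
        for name_tokens, etype in names:
            n = len(name_tokens)
            if tokens[i:i + n] == name_tokens:
                out.append("<START:%s>" % etype)
                out.extend(name_tokens)
                out.append("<END>")
                i += n
                break
        else:
            # No known name starts at this position; keep the token as-is.
            out.append(tokens[i])
            i += 1
    return " ".join(out)


if __name__ == "__main__":
    print(annotate("I bought a Ryobi drill at The Home Depot."))
    print(annotate("Black & Decker makes saws."))
```

The point of the sketch is that each training line is a whole sentence with 
the name embedded in its surrounding words, rather than the bare name alone.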

Given a sufficiently large training corpus for your domain you can then use 
word distance for entity disambiguation.
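
One possible reading of "word distance" is string edit distance between a 
found mention and a list of canonical names; that reading is an assumption, 
and other distance measures would fit the same slot. Under it, a minimal 
Python sketch could look like the following. The normalize() rule, the 
closest() helper and the max_distance threshold are illustrative choices, not 
anything prescribed by OpenNLP.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def normalize(name):
    """Hypothetical normalization: lowercase and spell out '&'."""
    return name.lower().replace("&", "and")


def closest(mention, canonical_names, max_distance=2):
    """Return the nearest canonical name, or None if nothing is close enough."""
    def dist(candidate):
        return edit_distance(normalize(mention), normalize(candidate))

    best = min(canonical_names, key=dist)
    return best if dist(best) <= max_distance else None


if __name__ == "__main__":
    known = ["The Home Depot", "Black and Decker", "Ryobi"]
    for mention in ["Black & Decker", "Ryobii", "Makita"]:
        print(mention, "->", closest(mention, known))
```

The threshold trades false merges against missed variants and would need 
tuning on real data.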

Sincerely

Alexander


> 5 nov 2014 kl. 16:26 skrev <[email protected]>:
> 
> I've tried two formats so far:
> 
> <START>The Home Depot<END>.
> <START>Black and Decker<END>.
> <START>Ryobi<END>.
> 
> And:
> 
> <START:company>The Home Depot<END>.
> <START:company>Black and Decker<END>.
> <START:company>Ryobi<END>.
> 
> All that I'm trying to achieve is company names and other specific words 
> being marked as such. I'm beginning to think that maybe I should be aiming 
> towards regular expressions, but I want to capture alternate spellings like 
> "Black & Decker", which is why I thought NER might help. I don't really quite 
> understand how I am supposed to supply a "corpus" that includes these words 
> in some kind of context; mostly, I just need context-free matching...
> 
> Patrick Baggett
> Online Engineer - Search Team
> e: [email protected]
> p: +1 (214) 202-8964
> 
> -----Original Message-----
> From: Rodrigo Agerri [mailto:[email protected]]
> Sent: Wednesday, November 05, 2014 12:47 AM
> To: [email protected]
> Subject: Re: How to resolve "Model not compatible with name finder!"
> 
> Hi Patrick,
> 
> This error sometimes is due to mal-formed training data. Can you please send 
> a sample of your training data?
> 
> R
> 
>> On Tue, Nov 4, 2014 at 6:05 PM,  <[email protected]> wrote:
>> I'm using the OpenNLP SVN code and attempting to train a model for the name 
>> finder doesn't work. I'm not sure I conceptually understand the problem yet. 
>> It doesn't appear that my training data is even utilized yet.
>> 
>> opennlp TokenNameFinderTrainer -encoding UTF-8 -model
>> namefinder-test.bin -lang en -data namefinder-test.model
>> 
>> Exception in thread "main" java.lang.IllegalArgumentException: Model not 
>> compatible with name finder!
>>        at 
>> opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:81)
>>        at 
>> opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:106)
>>        at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:374)
>>        at opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:403)
>>        at 
>> opennlp.tools.cmdline.namefind.TokenNameFinderTrainerTool.run(TokenNameFinderTrainerTool.java:179)
>>        at opennlp.tools.cmdline.CLI.main(CLI.java:222)
>> 
>> A similar exception happens with using the training API:
>> 
>> Exception in thread "main" java.lang.IllegalArgumentException: Model not 
>> compatible with name finder!
>>                at 
>> opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:107)
>>                at
>> opennlp.tools.namefind.NameFinderME.train(NameFinderME.java:362)
>> 
>> Patrick Baggett
> 
> 
