Hi Rodrigo,

By extending the model I meant combining the base corpora (used to train the existing model) with additional annotated text and retraining the model. Apart from licensing, this is one of the reasons I am interested in knowing the source/base corpora used for training the name finder models.

Thanks,
Raj

-----Original Message-----
From: Rodrigo Agerri [mailto:[email protected]]
Sent: Wednesday, November 5, 2014 12:16 PM
To: [email protected]
Subject: Re: Corpora used for training OpenNLP english models

Hi Raj,

I do not know which license the models on SourceForge are distributed under. But you cannot extend the existing English models. You need to train new ones for your domain based on annotated data.

Best,

R

On Tue, Nov 4, 2014 at 7:05 PM, Raj Kiran <[email protected]> wrote:
> Hi All,
>
> We want to use OpenNLP for NER and other capabilities in commercial
> software (English only). It looks like the existing OpenNLP English models
> available on SourceForge might have some license restrictions. Is there any
> information available on the source corpora used for training the existing
> OpenNLP English models?
>
> Apart from purchasing the source corpora, this information would help us to
> enhance the existing models by adding more training data.
>
> Thanks and Regards,
> Raj
