Re: how to train sentence detector without losing previous data?

Sam Li Mon, 20 Aug 2012 19:29:24 -0700

I see. Would love to see a feature that allows additive training that doesn't 
require having the original corpus. That way in the future if there are new 
special cases that users want to add to the model, it would be easier. What do 
you think?


Is a feature like this in the works?

-Sam

On Aug 20, 2012, at 3:44 PM, Jörn Kottmann <[email protected]> wrote:

> On 08/17/2012 08:15 AM, Sam Li wrote:
>> Right now I'm using the English sentence model provided on sourceforge. I 
>> would like to append additional data to it.
>> But this means I need the original source of the model, right? If so, how do 
>> I get that?
> 
> The original data is copyright protected (it is from the MUC corpus), so we 
> cannot distribute it with OpenNLP. But you can use other English resources 
> for training. You need data which is sentence segmented, such as CoNLL-2000.
> 
> Jörn
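
For anyone following the suggestion above, here is a minimal sketch of turning sentence-segmented data into the one-sentence-per-line text that the OpenNLP sentence detector trainer reads. It assumes the input uses the CoNLL-2000 layout (one "token POS chunk" row per token, a blank row between sentences). The function name conll_to_sentence_lines is made up for illustration and is not part of OpenNLP. Note that the output is space-tokenized ("He reckons ."), so some detokenization is probably needed before the text resembles what the detector will see in practice.

```python
from typing import Iterable, List


def conll_to_sentence_lines(lines: Iterable[str]) -> List[str]:
    """Turn CoNLL-2000 style rows ("token POS chunk") into one sentence per line.

    Hypothetical helper for illustration; not part of OpenNLP.
    """
    sentences: List[str] = []
    tokens: List[str] = []
    for raw in lines:
        row = raw.strip()
        if not row:
            # A blank row closes the current sentence.
            if tokens:
                sentences.append(" ".join(tokens))
                tokens = []
            continue
        # The token itself is the first whitespace-separated column.
        tokens.append(row.split()[0])
    # Flush the last sentence if the input did not end with a blank row.
    if tokens:
        sentences.append(" ".join(tokens))
    return sentences


if __name__ == "__main__":
    sample = [
        "He PRP B-NP",
        "reckons VBZ B-VP",
        ". . O",
        "",
        "Chancellor NNP O",
        "spoke VBD B-VP",
        ". . O",
    ]
    for sentence in conll_to_sentence_lines(sample):
        print(sentence)
```

Writing the returned lines to a text file, one per line, gives a file in the shape the trainer expects. The result is a model trained only on the new corpus, not an extension of the existing English model.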
