Re: how to train sentence detector without losing previous data?

Yuan Luo Thu, 04 Oct 2012 17:49:13 -0700

Hi Jörn,
Is all of the original training data from MUC? And would you mind
listing which MUC corpora you used, or did you use all of them? I am
thinking of getting them from MUC directly, provided you didn't make
custom changes to those corpora.


Best,
Yuan

On Mon, Aug 20, 2012 at 3:44 AM, Jörn Kottmann <[email protected]> wrote:
> On 08/17/2012 08:15 AM, Sam Li wrote:
>>
>> Right now I'm using the English sentence model provided on sourceforge. I
>> would like to append additional data to it.
>> But this means I need the original source of the model, right? If so, how
>> do I get that?
>
>
> The original data is copyright protected (it comes from the MUC corpus), so
> we cannot distribute it with OpenNLP. But you can use other English
> resources for training. You need data that is sentence segmented, such as
> CoNLL-2000.
>
> Jörn
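
For anyone following Jörn's suggestion, here is a rough sketch of how CoNLL-2000 style data could be turned into sentence-segmented text. It assumes the usual CoNLL-2000 layout (one token per line as "word POS chunk", with a blank line between sentences) and writes one sentence per line, which is the layout the OpenNLP sentence detector trainer reads as far as I know. The function name conll_to_sentences and the sample text are made up for illustration.

```python
def conll_to_sentences(lines):
    """Yield one space-joined sentence per blank-line-delimited block.

    Assumes CoNLL-2000 style input: "word POS chunk" on each line,
    with an empty line separating sentences.
    """
    tokens = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if there is one.
            if tokens:
                yield " ".join(tokens)
                tokens = []
            continue
        # The first column is the word; the POS and chunk tags are ignored.
        tokens.append(line.split()[0])
    # Flush the last sentence when the input does not end with a blank line.
    if tokens:
        yield " ".join(tokens)


# Hypothetical sample in CoNLL-2000 layout, not taken from the real corpus.
sample = """\
Confidence NN B-NP
in IN B-PP
the DT B-NP
pound NN I-NP
. . O

Chancellor NNP O
spoke VBD B-VP
. . O
"""

sentences = list(conll_to_sentences(sample.splitlines()))
print("\n".join(sentences))
```

Note that the tokens stay separated by spaces ("pound ." rather than "pound."), so a model trained on this output sees tokenized text. If your real input is untokenized, you would probably want a detokenization step before training.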
