I suspected this might be the case. What about the tools used to
generate the model? Are those freely available or part of OpenNLP?

I tried searching through OpenNLP's codebase, but I'm still new to it,
so I'm not really sure what I'm looking for.

Regards,
Chris

On Mon, Feb 14, 2011 at 5:58 PM, James Kosin <[email protected]> wrote:
> Chris,
>
> Unfortunately, most, if not all, of the training data is not free or
> openly available because of copyright.  If you would like to start a
> group to collect non-copyrighted text and parse it by hand, you are
> more than welcome and encouraged to do so.
> Jorn or Jason may have a more complete set of training data and could
> help if you pass on your samples.
>
> James
>
> On 2/13/2011 11:03 PM, Chris Spencer wrote:
>> Where would we download the source data and tools used to generate the
>> pretrained models available at
>> http://opennlp.sourceforge.net/models-1.5/, specifically for the
>> English Treebank Parser?
>>
>> I have a large corpus of hand-corrected sentence/parse-tree pairs, as
>> well as an extended lexicon, and I'd like to incorporate these into
>> the training data and retrain a new parser better fitted for my
>> domain.
>>
>> Regards,
>> Chris
>
>
