Chris,

Unfortunately, most, if not all, of the training data is not free or
openly available because of copyright restrictions. If you would like
to start a group to collect non-copyrighted text and parse it by hand,
you are more than welcome and encouraged to do so.
Jorn or Jason may have a more complete set of training data and could
help if you pass on your samples.

James

On 2/13/2011 11:03 PM, Chris Spencer wrote:
> Where would we download the source data and tools used to generate the
> pretrained models available at
> http://opennlp.sourceforge.net/models-1.5/, specifically for the
> English Treebank Parser?
>
> I have a large corpus of hand-corrected sentence/parse-tree pairs, as
> well as an extended lexicon, and I'd like to incorporate these into
> the training data and retrain a new parser better fitted for my
> domain.
>
> Regards,
> Chris
