I ran the OpenNlp stock models on wikipedia text at one time.  You may be
able to use this.

https://gist.github.com/3291931

If you create superior models without licensing restrictions, please share.

Peace.  Michael


On Wed, Mar 13, 2013 at 12:25 PM, John Helmsen <
[email protected]> wrote:

> Gentlemen and Ladies,
>
> Currently, my group is undertaking a project that involves performing
> english understanding of sentence fragments.  While the Apache parser with
> the pre-trained binary is very good, we anticipate the need to retrain the
> parser eventually on our own data sets to handle special terms and
> idiosyncrasies that may arise in our particular context.
>
> The best way to retrain the parser is to mix our parsing solutions in with
> the existing parser training set, so that we enhance the already good
> performance of the parser in the direction of our particular input.
>
> Unfortunately, it seems that the online documentation for
> en-parser-chunking.bin does not include links to the training sets that
> were used.  Do any of you good people know what these might be? Thanks!
>
> John Helmsen
>

Reply via email to