I ran the OpenNLP stock models on Wikipedia text at one time. You may be able to use this.
https://gist.github.com/3291931

If you create superior models without licensing restrictions, please share.

Peace.
Michael

On Wed, Mar 13, 2013 at 12:25 PM, John Helmsen <[email protected]> wrote:

> Gentlemen and Ladies,
>
> Currently, my group is undertaking a project that involves performing
> English understanding of sentence fragments. While the Apache parser with
> the pre-trained binary is very good, we anticipate the need to retrain the
> parser eventually on our own data sets to handle special terms and
> idiosyncrasies that may arise in our particular context.
>
> The best way to retrain the parser is to mix our parsing solutions in with
> the existing parser training set, so that we enhance the already good
> performance of the parser in the direction of our particular input.
>
> Unfortunately, it seems that the online documentation for
> en-parser-chunking.bin does not include links to the training sets that
> were used. Do any of you good people know what these might be? Thanks!
>
> John Helmsen
