On 11/8/11 3:43 PM, John Stewart wrote:
> Thanks Jörn. Was it trained on the whole Penn treebank? And do you
> happen to know if this means there are licensing restrictions on the
> use of the parser, that say would need to be resolved via the LDC?
Using the models in a commercial application is a grey area.
The models do not break the copyright of the original corpus, because
it is not possible to reproduce the corpus from them. Therefore I doubt
that the LDC can restrict their usage.
We haven't spent time resolving these issues yet, and that is also the
reason why the models are still distributed via our old SourceForge page.
I don't know if the training file contains the entire Treebank; it has
around 60K sentences. I believe section 23 is used for testing.
Jörn