On 07/11/2012 10:35 AM, Chi Dat Nguyen wrote:
I would like to ask which dataset OpenNLP uses to train the provided English Name Finder models (i.e. en-ner-organization.bin, en-ner-person.bin, ...)? And may I know where I can download it?
The models are trained on MUC 6 / 7 data with manual fixes. The data can be purchased from the LDC. We cannot distribute it because it is copyright protected. Jörn
