I am using OpenNLP in my research to extract terms from an educational
corpus, and I have a question about the pre-trained OpenNLP models
(chunker, sentence detector, tokenizer, and maximum-entropy POS tagger):
what training data sets were used to build them? It is clearly stated
that CoNLL-2000 was used to train the chunker; however, no information
is provided about the training data behind the sentence detector, the
tokenizer, or the maximum-entropy POS tagger.