Hi, I've used OpenNLP for a few years--in particular the chunker, POS tagger, and tokenizer. We're grateful for a high-performance library with an Apache license, but one of our greatest complaints is the quality of the models. Yes--we're aware we can train our own--but most people are looking for something that is good enough out of the box (we aim for this with our products). I'm not surprised that volunteer engineers don't want to spend their time annotating data ;-)
I'm curious what other people see as the biggest shortcomings of OpenNLP, or the most important next steps for it. I may have an opportunity to contribute to the project, and I'm trying to figure out where the community thinks the biggest impact could be made.

Peace.
Michael Schmitz
