Hi everyone,

OpenNLP became a top-level Apache project around 10 years ago, and since then there have been many 1.x releases. In the past few years the NLP community has seen transformative changes, primarily centered around the Python ecosystem. While OpenNLP still performs well and is used by many projects, I think it would be good for the community to plan for OpenNLP 2.0 and decide how the project can adapt and continue its goal of being a capable Java NLP library.
For OpenNLP to minimize dependencies, support newer NLP architectures, and remain backward compatible, I believe some refactoring is needed. Moving the opennlp-tools interfaces to an "opennlp-common" project can provide the decoupling necessary to incorporate deep learning capabilities in 2.x, preserve backward compatibility, and avoid introducing burdensome dependencies for users of opennlp-tools.

I have diagrammed these changes here:
https://cwiki.apache.org/confluence/display/OPENNLP/OpenNLP+2.x

I am proposing that this refactoring take place in a 1.9.x release to set the stage for 2.0. I will volunteer to take on the refactoring coding effort.

Please share your thoughts on this proposal and anything else you would like to see in an OpenNLP 2.0 release. (This proposal doesn't specify any deep learning capabilities - we'll figure that out later. This is just to get ready for it.)

Thanks for everyone's support of OpenNLP over the years!

Jeff