The Apache OpenNLP team is pleased to announce the release of version 2.0.0 of Apache OpenNLP. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing.
The OpenNLP 2.0.0 binary and source distributions are available for download from our download page:

https://opennlp.apache.org/download.html

The OpenNLP library is distributed by Maven Central as well. See the Maven Dependency page for more details:

http://opennlp.apache.org/maven-dependency.html

Changes in this version:

- Now built using Java 11
- Supports model inference using the ONNX Runtime
- Adds MASC format support
- Made NameSample overlap exception more helpful
- Tokenizers can now output a new line token
- Adding missing charset to DictionaryLemmatizer
- Updated documentation to fix training API sample code
- Fixed build issues with Java 17
- Adds ability to download models from within Apache OpenNLP

For a complete list of fixed bugs and improvements please see the RELEASE_NOTES file included in the distribution.

The Apache OpenNLP Team