The Apache OpenNLP team is pleased to announce the release of version 2.5.0 of Apache OpenNLP.
The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing.

The OpenNLP 2.5.0 binary and source distributions are available for download from our download page:
https://opennlp.apache.org/download.html

The latest set of pre-trained model files for 23 languages is available at the models page:
https://opennlp.apache.org/models.html

The OpenNLP library is distributed by Maven Central as well. See the Maven Dependency page for more details:
https://opennlp.apache.org/maven-dependency.html

Changes in this version:

In total, this release tackles 62 issues and brings several dependency updates, bug fixes, substantial additions and some corrections for the API!

OpenNLP version 2.5.0 supports thread-safe sentence detection, tokenization and POS-tagging (see: OPENNLP-936). With this release, there is the possibility to disable the POS tag mapper (see: OPENNLP-1600) to achieve a custom mapping. Furthermore, it relies on opennlp-models in version 1.1 which got substantially extended by models for 18 new languages (see: OPENNLP-1615) as listed on the Model page:
https://opennlp.apache.org/models.html

The OpenNLP Brat Annotator component has been moved to the OpenNLP sandbox repository due to limited quality and usability concerns (see: OPENNLP-1634). Thereby, several compile and runtime dependencies could be dropped (Jackson, Jersey, etc.) and are thus no longer shipped with the "bin" artifacts.

For a full list of improvements, please see the items in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12354554

The Apache OpenNLP Team