The Apache OpenNLP team is pleased to announce the release of version 2.5.9 of Apache OpenNLP. The Apache OpenNLP library is a machine-learning-based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing.

The OpenNLP 2.5.9 binary and source distributions are available for download from our download page:
https://opennlp.apache.org/download.html

The OpenNLP library is distributed by Maven Central as well. See the Maven Dependency page for more details:
https://opennlp.apache.org/maven-dependency.html

Changes in this version:

This is a maintenance and security release on the 2.x line. It addresses three security issues (also fixed in 3.0.0-M3) and refreshes several dependencies.

Security fixes:
• OPENNLP-1819: Fix XXE vulnerability in DictionaryEntryPersistor by aligning XML parsing with XmlUtil (secure processing enabled, DOCTYPE disallowed)
• OPENNLP-1820: Restrict ExtensionLoader to an allowlisted set of package prefixes, preventing arbitrary class initialization from crafted model archives
• OPENNLP-1821: Prevent OutOfMemoryError in AbstractModelReader by bounding count fields read from binary models before array allocation

Dependency updates:
• OPENNLP-1817: Update log4j2 to 2.25.4
• OPENNLP-1822: Update ONNX runtime to 1.25.0

For a complete list of fixed bugs and improvements please see the RELEASE_NOTES file included in the distribution.

The Apache OpenNLP Team
