[X] Both source (tar.gz/zip) and binary artifacts (tar.gz/zip) are present, 
along with .asc and .sha512 files for each.
[X] PGP signatures are valid for the release artifacts using the KEYS file from 
dist.apache.org
[X] SHA512 checksums are correct and verified.
[X] LICENSE and NOTICE files exist and are accurate.
[X] No unexpected binary files in the source release.
[X] All source files have appropriate ASF headers (excluding generated files 
and legacy files).
[X] Build completes successfully from source and the instructions to do so are 
clear.
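
For anyone who wants to repeat the SHA512 item above without shell tools, here is a minimal Java sketch using only the JDK. It is not the tooling used for this vote. The class name Sha512Check and its helper methods are hypothetical, and it assumes the .sha512 file holds the hex digest as its first whitespace-separated token (optionally followed by a file name).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper, not part of OpenNLP or of the release process.
public class Sha512Check {

    // Render a byte array as lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Digest of an in-memory byte array.
    static String sha512Hex(byte[] data) {
        return toHex(newDigest().digest(data));
    }

    // Digest of a file, streamed so large archives are not loaded into memory.
    static String sha512Hex(Path file) throws IOException {
        MessageDigest md = newDigest();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    // Assumption: the digest is the first whitespace-separated token.
    static String expectedHash(String checksumFileContent) {
        return checksumFileContent.trim().split("\\s+")[0].toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) throws IOException {
        Path artifact = Path.of(args[0]);      // e.g. the source tar.gz
        Path checksumFile = Path.of(args[1]);  // the matching .sha512 file
        String expected = expectedHash(
                new String(Files.readAllBytes(checksumFile), StandardCharsets.UTF_8));
        String actual = sha512Hex(artifact);
        System.out.println(expected.equals(actual) ? "OK" : "MISMATCH");
    }
}
```

Run it with the artifact path as the first argument and the .sha512 path as the second. The PGP signature item still needs gpg and the KEYS file.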

+1 (binding)

> On 25.06.2026 at 13:58, Martin Wiesner <[email protected]> wrote:
> 
> Hi all,
> 
> I have posted a first release candidate for the Apache OpenNLP 3.0.0-M4 
> release and it is ready for testing.
> 
> The 3.x release line of Apache OpenNLP introduces no known breaking changes 
> while significantly modularizing the project to improve library usage and 
> future extensibility.
> The core API remains stable and fully compatible with 2.x, so existing 
> projects can continue using the opennlp-tools artifact without modifications.
> 
> Key Highlights:
>   • New Features: 
>       • Include list of stop words for various languages (OPENNLP-660)
>       • Add SymSpell-based spell correction component (OPENNLP-1832)
>       • Add BertTokenizer with BERT basic tokenization (OPENNLP-1837)
>   • Bug Fixes:
>       • This release ships four bug fixes for: OPENNLP-1826, OPENNLP-1836, 
> OPENNLP-1839, and OPENNLP-1840
>   • Improvements:
>       • Harden SvmDoccatModel.deserialize() with ObjectInputFilter and 
> resource limits (OPENNLP-1823)
>       • Tolerate unsupported XML parser security options (OPENNLP-1835)
>       • Fix NameFinderDL only worked with Person, expand to all types 
> (OPENNLP-1846)
>       • Several dependencies were updated; see the Jira release notes 
> (URL below)
>       • Some minor tasks have been completed
>   • IMPORTANT Changes: 
>       • The ONNX input encoding in SentenceVectorsDL was fixed, which changes 
> the produced sentence vectors. Any embeddings persisted with the old encoding 
> are not comparable to the new output and must be re-generated. (OPENNLP-1836 
> - PR #1072)
>       • WordpieceTokenizer (public API, used by opennlp-dl) now splits 
> punctuation runs into single tokens, collapses partially-matched words to a 
> single [UNK], and throws from tokenizePos instead of returning null. These 
> changes alter the tokenization output for existing callers. (OPENNLP-1837 - PR #1073)
>       • NameFinderDL now decodes all BIO entity types (PER/ORG/LOC/…) instead 
> of only persons. Span.getType() now returns the entity label rather than the 
> covered text, which is a contract change for existing callers. (OPENNLP-1846 
> - PR #1086)
>       • The opennlp-dl components are now thread-safe; as part of this, 
> loadVocab became public static (source- and binary-incompatible) and 
> AbstractDL's implicit no-arg constructor was removed. Both affect downstream 
> code that calls loadVocab or extends AbstractDL. (OPENNLP-1844 - PR #1084)
> 
> Thank you to everyone who contributed to this release, including all of our 
> users and the people who submitted bug reports, contributed code or 
> documentation enhancements.
> 
> The release was made using the OpenNLP release process, documented on the 
> website:
> https://opennlp.apache.org/release.html
> 
> Maven Repo:
> https://repository.apache.org/content/repositories/orgapacheopennlp-1070
> 
> <repositories>
> <repository>
> <id>opennlp-3.0.0-M4-RC1</id>
> <name>Testing OpenNLP 3.0.0-M4 release candidate</name>
> <url>
> https://repository.apache.org/content/repositories/orgapacheopennlp-1070
> </url>
> </repository>
> </repositories>
> 
> Binaries & Source:
> 
> https://dist.apache.org/repos/dist/dev/opennlp/opennlp-3.0.0-M4-rc1/
> 
> Tag:
> 
> https://github.com/apache/opennlp/releases/tag/opennlp-3.0.0-M4
> 
> Tag Hash: 1e05d1ef5a7c35b83015ebce87bb9a43c55e2226
> 
> Release notes:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12356941
> 
> The results of the eval tests for the aforementioned tag can be found
> here: https://ci-builds.apache.org/job/OpenNLP/job/eval-tests-releases/35/
> 
> Reminder: The up-to-date KEYS file for signature verification can be
> found here: https://dist.apache.org/repos/dist/release/opennlp/KEYS
> 
> Checklist for reference:
> 
> [ ] Both source (tar.gz/zip) and binary artifacts (tar.gz/zip) are present, 
> along with .asc and .sha512 files for each.
> [ ] PGP signatures are valid for the release artifacts using the KEYS file 
> from dist.apache.org
> [ ] SHA512 checksums are correct and verified.
> [ ] LICENSE and NOTICE files exist and are accurate.
> [ ] No unexpected binary files in the source release.
> [ ] All source files have appropriate ASF headers (excluding generated files 
> and legacy files).
> [ ] Build completes successfully from source and the instructions to do so are 
> clear.
> 
> Please vote on releasing these packages as Apache OpenNLP 3.0.0-M4
> The vote is open for at least the next 72 hours.
> 
> Only votes from OpenNLP PMC are binding, but everyone is welcome to
> check the release candidate and vote.
> The vote passes if at least three binding +1 votes are cast.
> 
> Please VOTE
> 
> [+1] go ship it
> [+0] meh, don't care
> [-1] stop, there is a ${showstopper}
> 
> Thanks!
> Martin | mawiesne
