Hi folks,

I've posted a first release candidate for the pre-trained Apache OpenNLP Models release, version 1.3, and it is ready for testing.

Here are the changes compared to version 1.2:

- All models have been trained on the Universal Dependencies corpus version 2.16 (May 15, 2025) using Apache OpenNLP 2.5.4 (and should work with OpenNLP versions >= 1.0).
- Training was conducted with 300 iterations.
- Please note:
  - The Czech models required a change in training, namely a different treebank, 'PDTC'; see: https://github.com/UniversalDependencies/UD_Czech-PDTC/blob/master/README.md
- We now provide models for the following new languages:
  - "Afrikaans|af|AfriBooms"
  - "Indonesian|id|GSD"
  - "Irish|ga|IDT"
  - "Persian|fa|PerDT"
- Refer to the opennlp-training-eval-logs-1.3-2.5.4.zip file for the individual model training and evaluation logs.

In total, the OpenNLP project now provides pre-trained models for 36 languages.

The release candidate was prepared using the OpenNLP release process, documented on the website:
https://opennlp.apache.org/release-model.html

Models:
https://dist.apache.org/repos/dist/dev/opennlp/models/ud-models-1.3/

The results of the eval tests for each model are contained in "opennlp-training-eval-logs-1.3-2.5.4.zip" on dist/dev.

Reminder: The up-to-date KEYS file for signature verification can be found here:
https://dist.apache.org/repos/dist/release/opennlp/KEYS

Please vote on releasing these packages as pre-trained Apache OpenNLP Models v1.3. The vote is open for at least the next 72 hours.

Only votes from the OpenNLP PMC are binding, but everyone is welcome to check the release candidate and vote. The vote passes if at least three binding +1 votes are cast.

Please VOTE

[+1] go ship it
[+0] meh, don't care
[-1] stop, there is a ${showstopper}

Thanks!
Martin