Re: [VOTE] Apache OpenNLP Models 1.0.1

2024-07-25 Thread Jeff Zemerick
+1 PR looked great. Thanks, Jeff On Thu, Jul 25, 2024 at 3:28 AM Richard Zowalla wrote: > Hi folks, > > I have posted a first release candidate for the Apache OpenNLP Models > 1.0.1 release and it is ready for testing. > > This is a minor patch release of OpenNLP (UD) models via Maven with on

Re: [VOTE] OpenNLP 2.4.0

2024-07-18 Thread Jeff Zemerick
+1 Build and tests look good. Thanks for putting the release together. Jeff On Thu, Jul 18, 2024 at 6:18 AM Martin Wiesner wrote: > Hi everybody > > here’s my +1 (binding). > > Procedure: > (a) checked the contents of the source (and binary) artifact, > (b) verified the signature of the rele

Re: Release 2.3.3 ?

2024-06-27 Thread Jeff Zemerick
+1 to the release and also to "esteemed Elves." Thanks, Jeff On Thu, Jun 27, 2024 at 7:53 AM Richard Zowalla wrote: > Dear Devs of the OpenNLP Realm, > > Greetings once again, noble Lords and Ladies, esteemed Elves, and diligent > Dwarves of the OpenNLP Kingdom! > > I write to you to clarify a

Re: [VOTE] Apache OpenNLP Models 1.0.0 RC1

2024-06-26 Thread Jeff Zemerick
+1 Thanks for putting all this together! On Wed, Jun 26, 2024 at 2:20 PM Richard Zowalla wrote: > Here is my own +1 > > > Am 26.06.2024 um 13:19 schrieb Martin Wiesner : > > > > +1 > > > > > > Thanks @Richard. > > > > > >> Am 24.06.2024 um 22:39 schrieb Bruno Kinoshita >: > >> > >> [+1] go shi

Re: Release of OpenNLP Models 1.0.0 ?

2024-06-19 Thread Jeff Zemerick
None from me! +1 Thanks, Jeff On Wed, Jun 19, 2024 at 3:12 AM Martin Wiesner wrote: > I think we are good to go. +1 > > Best > Martin > > > Am 18.06.2024 um 11:26 schrieb Bruno Kinoshita >: > > > > No objections from me. Thanks! > > > > On Tue, 18 Jun 2024, 08:55 Richard Zowalla, wrote: > >

Re: [DISCUSS] Version Scheme for OpenNLP Models

2024-05-28 Thread Jeff Zemerick
I favor option (a) because we likely won't release models as frequently but we will have to keep track of what's compatible with what. There is a manifest file inside the model files and it contains the version number of OpenNLP that trained the model. It's used to check if the version of OpenNLP

Re: [VOTE] Apache OpenNLP 2.3.3 (rc1)

2024-04-22 Thread Jeff Zemerick
+1 Thanks! On Mon, Apr 22, 2024 at 4:27 AM Tommaso Teofili wrote: > +1 > > tag builds ok, sigs ok. > > Regards, > Tommaso > > On Mon, 22 Apr 2024 at 08:56, Bruno Kinoshita > wrote: > > > +1 > > > > Tag is building fine on my env: > > > > Apache Maven 3.8.5 (3599d3414f046de2324203b78ddcf9b5e438

Re: [VOTE] Apache OpenNLP 2.3.2 (RC1)

2024-02-03 Thread Jeff Zemerick
+1 Thanks for putting the build together! Thanks, Jeff On Sat, Feb 3, 2024 at 5:46 AM Martin Wiesner wrote: > Hi everybody > > here’s my +1 (binding). > > Comments: > I checked the tar.gz source artifact, ran a local build of this release > candidate (passing), and checked the corresponding E

Re: A new OpenNLP release (2.3.2) soon?

2024-01-28 Thread Jeff Zemerick
+1 to a release. I can RM but am happy to let someone else, too. Thanks, Jeff On Sun, Jan 28, 2024 at 3:14 PM Richard Zowalla wrote: > +1 > > (happy to help, if someone wants to act as release manager for the > first time ;-) ) > > Am Sonntag, dem 28.01.2024 um 20:54 +0100 schrieb Martin Wiesn

POS tags

2023-12-29 Thread Jeff Zemerick
Hi all, The following two issues were recently discovered (credit to Martin Wiesner): https://issues.apache.org/jira/browse/OPENNLP-1538 - Ensure all integration tests are run during Eval build It was recently learned that *IT.java tests are not being run during the build. One of those tests, th

Re: [VOTE] Apache OpenNLP 2.3.1 Release Candidate

2023-11-24 Thread Jeff Zemerick
+1 Thanks, Jeff On Thu, Nov 23, 2023 at 9:03 AM Tommaso Teofili wrote: > +1 > > Tommaso > > On Thu, 23 Nov 2023 at 12:05, Richard Zowalla wrote: > > > +1 (binding) > > > > > > (We should create an issue for the year in the NOTICE file though) > > > > Am Mittwoch, dem 22.11.2023 um 15:12 +0100

Re: Potential 2.3.1 release?

2023-11-22 Thread Jeff Zemerick
for the task. > > Best, > Martin > -- > > Am Mittwoch, November 01, 2023 19:58 CET, schrieb Richard Zowalla < > r...@apache.org>: > https://issues.apache.org/jira/browse/OPENNLP-1457 might be helpful as > well ;-) > > Am Dienstag, dem 31.10.2023 um 21:

Re: Voice Of Apache (Formerly Feathercast) podcast request

2023-11-14 Thread Jeff Zemerick
Hi Rich, Sure, be glad to! Thanks, Jeff On Fri, Nov 10, 2023 at 10:04 AM Rich Bowen wrote: > Hi, folks, > > I was wondering if one of you fine people (preferably a Committer or PMC > member) would be willing to do a brief (30 minutes ish) interview with me > for the Voice Of Apache podcast? >

Re: Potential 2.3.1 release?

2023-10-31 Thread Jeff Zemerick
I can volunteer to be the Release Manager after a 6 yrs hiatus since I > did this the last time. > > On Tue, Oct 31, 2023 at 9:29 AM Eric Pugh > > wrote: > > > +1 > > > > > On Oct 31, 2023, at 9:20 AM, Richard Zowalla wrote: > > > > > >

Potential 2.3.1 release?

2023-10-31 Thread Jeff Zemerick
Hi all, It looks like it might be a good time for a 2.3.1 release? We have had a few pull requests. Thoughts? Thanks, Jeff

Re: ONNX Runtime GPU/CPU in OpenNLP

2023-10-13 Thread Jeff Zemerick
Adding to my previous reply, it would be great to also get OPENNLP-1384 into the minor release. I will try to get a PR for it up this weekend. https://issues.apache.org/jira/browse/OPENNLP-1384 Thanks, Jeff On Fri, Oct 13, 2023 at 8:50 AM Jeff Zemerick wrote: > I think it would be good to

Re: ONNX Runtime GPU/CPU in OpenNLP

2023-10-13 Thread Jeff Zemerick
Thu, 12 Oct 2023, 21:50 Richard Zowalla, wrote: > > > I am fine with the suggested approach. > > > > Maybe, we should do a minor release after this is merged in order to > > unblock the SOLR folks? > > > > Gruß > > Richard > > > >

ONNX Runtime GPU/CPU in OpenNLP

2023-10-12 Thread Jeff Zemerick
Hi all, I created OPENNLP-1515 to change the ONNX Runtime dependency from onnxruntime-gpu to onnxruntime. This change will remove GPU support and cause OpenNLP to always use CPU for inference. The reason for this change is the onnxruntime dependency supports Linux, Windows, and Mac x64, and the on

Re: [VOTE] Apache OpenNLP 2.3.0 Release Candidate

2023-07-31 Thread Jeff Zemerick
+1 Built and tested from tag. Apache Maven 3.6.3 Maven home: /usr/share/maven Java version: 17.0.7, vendor: Private Build, runtime: /usr/lib/jvm/java-17-openjdk-amd64 Default locale: en_US, platform encoding: UTF-8 OS name: "linux", version: "5.19.0-50-generic", arch: "amd64", family: "unix" Tha

Re: OpenNLP 2.3.0 ?

2023-07-30 Thread Jeff Zemerick
Sorry, I'm late, but it sounds awesome! Thanks, Jeff On Sun, Jul 30, 2023 at 6:49 AM Richard Zowalla wrote: > Thanks for the responses. > > I will try to prepare a 2.3.0 release candidate within the next days > > Gruß > Richard > > > > > Am Dienstag, dem 25.07.2023 um 16:05 +0200 schrieb Tommas

Re: Feedback & opinions on: "OPENNLP-1496 Migrate OpenNLP towards JDK 17"

2023-06-04 Thread Jeff Zemerick
I'm also a +1. Thanks, Jeff On Sun, Jun 4, 2023 at 7:08 AM Atita Arora wrote: > +1 from my side ! > Let's move with time, :) I wonder if we have use cases where we need to > support backward compatibility. > > > > On Sun, Jun 4, 2023 at 12:31 PM Martin Wiesner > wrote: > > > Hi everybody, > >

[ANNOUNCE] Apache OpenNLP 2.2.0 released

2023-05-03 Thread Jeff Zemerick
The Apache OpenNLP team is pleased to announce the release of version 2.2.0 of Apache OpenNLP. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-spee

Re: [VOTE] Apache OpenNLP 2.2.0 Release Candidate

2023-04-20 Thread Jeff Zemerick
k, that only one binding is missing. > > > > Gruß > > Richard > > > > Am Montag, dem 10.04.2023 um 07:10 -0400 schrieb Jeff Zemerick: > > > Hi folks, > > > > > > I have posted a release candidate for the Apache OpenNLP 2.2.0 > > > rel

Re: [VOTE] Apache OpenNLP 2.2.0 Release Candidate

2023-04-10 Thread Jeff Zemerick
he-maven-3.8.5 > Java version: 17.0.6, vendor: Private Build, runtime: > /usr/lib/jvm/java-17-openjdk-amd64 > Default locale: en_US, platform encoding: UTF-8 > OS name: "linux", version: "5.15.0-69-generic", arch: "amd64", family: > "unix" >

[VOTE] Apache OpenNLP 2.2.0 Release Candidate

2023-04-10 Thread Jeff Zemerick
Hi folks, I have posted a release candidate for the Apache OpenNLP 2.2.0 release and it is ready for testing. The distributables can be downloaded from: https://repository.apache.org/content/repositories/orgapacheopennlp-1033/org/apache/opennlp/opennlp-distr/2.2.0/ The release was made from the

Re: Thoughts on an OpenNLP 2.2.0 release?

2023-04-04 Thread Jeff Zemerick
Thanks everyone. I will plan to kick off the release around this coming weekend. Thanks, Jeff On Tue, Apr 4, 2023 at 8:11 AM Martin Wiesner wrote: > > +1 > > On 2023/04/03 19:03:06 Jeff Zemerick wrote: > > Hi all, > > > > I would like to start a discussion fo

Re: Thoughts on an OpenNLP 2.2.0 release?

2023-04-03 Thread Jeff Zemerick
llest ones, are very much welcome and appreciated! Thanks, Jeff [1] https://issues.apache.org/jira/projects/OPENNLP/issues/OPENNLP-1163?filter=allopenissues On Mon, Apr 3, 2023 at 3:47 PM Bruno Kinoshita wrote: > > +1 > > On Mon, 3 Apr 2023, 9:03 pm Jeff Zemerick, wrote: > > >

Thoughts on an OpenNLP 2.2.0 release?

2023-04-03 Thread Jeff Zemerick
Hi all, I would like to start a discussion for the release of OpenNLP 2.2.0. Since the last release of 2.1.1, there have been a couple significant additions and a variety of improvements. Some notable changes include the introduction of SLF4J for logging instead of System.out, sentence-transformer

Re: Eval Tests are now run on ASF Jenkins CI

2023-03-06 Thread Jeff Zemerick
This is great -- thanks Richard and Bruno! I can take a look at those failing tests to see what's going on. Thanks, Jeff On Sat, Mar 4, 2023 at 2:49 AM Richard Zowalla wrote: > > Hi all, > > I've setup a job to run our eval tests on the ASF infrastructure [1]. > It will run daily. It takes arou

New committer: Atita Arora

2023-03-01 Thread Jeff Zemerick
The Project Management Committee (PMC) for Apache OpenNLP has invited Atita Arora to become a committer and we are pleased to announce that they have accepted. Atita, thanks for your contributions to OpenNLP and we look forward to your future contributions! Thanks, Jeff

[ANNOUNCE] OpenNLP 2.1.1 released

2023-02-27 Thread Jeff Zemerick
The Apache OpenNLP team is pleased to announce the release of version 2.1.1 of Apache OpenNLP. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-spee

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate 2

2023-02-22 Thread Jeff Zemerick
y for the late reply. +1 from me. > I run test and checked some signatures of the release. Thanks! > > Koji > > 2023年2月20日(月) 23:23 Jeff Zemerick : > > > > Hi everyone, > > > > We are still looking for a third +1 vote (or a -1 vote and a reason to > not >

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate 2

2023-02-20 Thread Jeff Zemerick
Hi everyone, We are still looking for a third +1 vote (or a -1 vote and a reason to not proceed with the release). Thanks, Jeff On Sat, Feb 18, 2023 at 9:26 AM Jeff Zemerick wrote: > Hi Atita, > > Yes, I think it is an issue with the M1 Mac. Looking at onnxruntime it > says

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate 2

2023-02-18 Thread Jeff Zemerick
Hi Atita, Yes, I think it is an issue with the M1 Mac. Looking at onnxruntime it says it supports Windows x64, Linux x64, macOS x64. ( https://onnxruntime.ai/docs/get-started/with-java.html) Thanks, Jeff On Fri, Feb 17, 2023 at 4:47 PM Atita Arora wrote: > Hi, > > Thank you Jeff for sharing t

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate 2

2023-02-17 Thread Jeff Zemerick
ion ... SKIPPED > > > > [INFO] Apache OpenNLP Distribution SKIPPED > > > > [INFO] Apache OpenNLP DL .. SKIPPED > > > > [INFO] > >

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate 2

2023-02-08 Thread Jeff Zemerick
+1 from me as well. The eval tests all passed. Thanks, Jeff On Tue, Feb 7, 2023 at 3:34 PM Bruno Kinoshita wrote: > [x] +1 Release the packages as Apache OpenNLP 2.1.1 > > Thank you! > > On Mon, 6 Feb 2023 at 17:28, Jeff Zemerick wrote: > > Hi folks, > > >

[VOTE] Apache OpenNLP 2.1.1 Release Candidate 2

2023-02-06 Thread Jeff Zemerick
Hi folks, I have posted a 2nd release candidate for the Apache OpenNLP 2.1.1 release and it is ready for testing. The distributables can be downloaded from: https://repository.apache.org/content/repositories/orgapacheopennlp-1032/org/apache/opennlp/opennlp-distr/2.1.1/ The release was made from

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate

2023-02-01 Thread Jeff Zemerick
:08 PM Jeff Zemerick wrote: > Richard, yes, it is possible that the values in the tests should be > updated, and in this case that looks likely. > > The affecting change is this: > https://github.com/apache/opennlp/pull/442/files > > Both tests pass when that change is reverte

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate

2023-01-30 Thread Jeff Zemerick
te simple > via git bisect? If we have the commit, we can see, what has changed and > find a solution for it. > > Gruß > Richard > > > Am Montag, dem 30.01.2023 um 09:05 -0500 schrieb Jeff Zemerick: > > Good catch, Bruno. I wrote the NOTICE file date up as > >

Re: [VOTE] Apache OpenNLP 2.1.1 Release Candidate

2023-01-30 Thread Jeff Zemerick
to 2023? > > Cheers > Bruno > > On Sat, 28 Jan 2023 at 16:07, Jeff Zemerick wrote: > > > Hi folks, > > > > I have posted a first release candidate for the Apache OpenNLP 2.1.1 > > release and it is ready for testing. > > > > There were 60 Jira i

[VOTE] Apache OpenNLP 2.1.1 Release Candidate

2023-01-28 Thread Jeff Zemerick
Hi folks, I have posted a first release candidate for the Apache OpenNLP 2.1.1 release and it is ready for testing. There were 60 Jira issues addressed in this version. Most of these issues were improvements like code refactoring and unit tests. The full list is available in Jira at: https://issu

Re: Thoughts on an Apache OpenNLP 2.1.1 release

2023-01-23 Thread Jeff Zemerick
e. > > > > If we are fine to ship these changes in a patch version (and not in a > > new minor version), we can go with 2.1.1 > > > > Gruß > > Richard > > > > > > > > Am Mittwoch, dem 11.01.2023 um 10:29 -0500 schrieb Jeff Zemerick: > >

Thoughts on an Apache OpenNLP 2.1.1 release

2023-01-11 Thread Jeff Zemerick
Hi everyone, Since the last release (2.1.0) back in November, we have had 56 closed Jira tickets. Scrolling through them, I see lots of improvements to documentation, Javadocs, tests, code quality, refactoring, and a few minor fixes. Because I don't see any new features, I would like to propose a

Re: Logging in OpenNLP

2023-01-11 Thread Jeff Zemerick
Sounds great, Richard! Thanks for taking this one on. Thanks, Jeff On Tue, Jan 10, 2023 at 9:19 AM Richard Zowalla wrote: > Hi all, > > any additional thoughts? > > At the moment, most people tend towards slf4j (c). > > If there is no other opinion or a hard veto on that code change, I > would

Re: Renaming of master branch to main

2022-12-21 Thread Jeff Zemerick
Infra has renamed master to main per https://issues.apache.org/jira/browse/INFRA-24022. Thanks, Jeff On Tue, Dec 20, 2022 at 11:21 AM Jeff Zemerick wrote: > I created the Infra ticket - > https://issues.apache.org/jira/browse/INFRA-24022 > > Richard, thanks for volunteering to mak

Re: Renaming of master branch to main

2022-12-20 Thread Jeff Zemerick
Zowalla wrote: > The renaming operation itself requires a INFRA ticket via a PMC member. > Afterwards it is rather easy to adjust the CI / website things. I can > do that. > > Gruß > Richard > > Am Montag, dem 19.12.2022 um 14:34 -0500 schrieb Jeff Zemerick: > > I don'

Re: Logging in OpenNLP

2022-12-19 Thread Jeff Zemerick
Thanks for bringing this to the mailing list, Richard. I think your findings from digging into the history are accurate. Of your given options, which seem comprehensive, I think my preference is option D, followed by C, but I'm very interested in what others think. With option D, if we implement

Re: Renaming of master branch to main

2022-12-19 Thread Jeff Zemerick
> We only need to adjust build bot and github actions > > > > Gruß > > Richard > > > > On 2022/07/28 13:22:29 Jeff Zemerick wrote: > > > Hi all, > > > > > > I want to see about the community's input as to renaming the master > branch >

New Committer: Richard Zowalla

2022-12-08 Thread Jeff Zemerick
The Project Management Committee (PMC) for Apache OpenNLP has invited Richard Zowalla to become a committer and we are pleased to announce that they have accepted. Welcome, Richard! Your contributions to Apache OpenNLP are much appreciated. Thanks, Jeff

New Committer: Martin Wiesner

2022-12-08 Thread Jeff Zemerick
The Project Management Committee (PMC) for Apache OpenNLP has invited Martin Wiesner to become a committer and we are pleased to announce that they have accepted. Welcome, Martin! Your contributions to Apache OpenNLP are much appreciated. Thanks, Jeff

[ANNOUNCE] OpenNLP 2.1.0 released

2022-11-28 Thread Jeff Zemerick
The Apache OpenNLP team is pleased to announce the release of version 2.1.0 of Apache OpenNLP. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-spee

OpenNLP snapshot artifacts now in the snapshot repository

2022-11-25 Thread Jeff Zemerick
Hi all, Just wanted to let dev@ know that the OpenNLP GitHub Action now uploads snapshot artifacts after building. https://repository.apache.org/content/repositories/snapshots/org/apache/opennlp/ Thanks Richard Zowalla for the PR ( https://github.com/apache/opennlp/pull/433) to make this happen.

Re: [VOTE] Apache OpenNLP 2.1.0 Release Candidate

2022-11-23 Thread Jeff Zemerick
encoding: UTF-8 > > OS name: "linux", version: "5.14.0-1054-oem", arch: "amd64", family: > "unix" > > > > Thanks! > > > > On 2022/11/07 13:39:32 Jeff Zemerick wrote: > > > Hi folks, > > > > > > I have posted

Re: [VOTE] Apache OpenNLP 2.1.0 Release Candidate

2022-11-13 Thread Jeff Zemerick
" in /issuesFixed in the ZIP > > doesn't contain any items. Don't know, if this is intended, but no > > blocker - guess it needs to be removed or fixed. > > > > Gruß > > Richard > > > > > > Am Donnerstag, dem 10.11.2022 um 10:32 -0500 schrie

Re: [VOTE] Apache OpenNLP 2.1.0 Release Candidate

2022-11-10 Thread Jeff Zemerick
] On Mon, Nov 7, 2022 at 8:39 AM Jeff Zemerick wrote: > Hi folks, > > I have posted a release candidate for the Apache OpenNLP 2.1.0 release and > it is ready for testing. > > Changes in this version: > https://issues.apache.org/jira/browse/OPENNLP-1370?jql=pr

OpenNLP 2.1.0 Release Helper Guide

2022-11-10 Thread Jeff Zemerick
Hi all, To help with validating releases, I would like the project to have a documented list of common release validation steps that can be referenced in the VOTE email. Here's a draft of a guide for the 2.1.0 release. Please feel free to suggest changes to it. Perhaps we can make the finished ve

[VOTE] Apache OpenNLP 2.1.0 Release Candidate

2022-11-07 Thread Jeff Zemerick
Hi folks, I have posted a release candidate for the Apache OpenNLP 2.1.0 release and it is ready for testing. Changes in this version: https://issues.apache.org/jira/browse/OPENNLP-1370?jql=project%20%3D%20OPENNLP%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20in%20(2.1.0)%20OR

Re: Training of MaxEnt Model with large corpora fails with java.io.UTFDataFormatException

2022-10-28 Thread Jeff Zemerick
> av > > > a:312) > > > at > > > opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:242) > > > at > > > opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:176) > > > at > > > opennlp.tools.lemmatizer.LemmatizerModel.<init>(Lemm

Re: Suggestion to rename 'isAlphaNumopt' param

2022-10-27 Thread Jeff Zemerick
, Jeff [1] https://issues.apache.org/jira/browse/OPENNLP-1387 On Wed, Oct 26, 2022 at 11:46 AM Atita Arora wrote: > Hi all, > This is my first time writing to the forum and made a few small > contributions to the project under the due guidance of my colleague Jeff > Zemerick. >

OpenNLP 2.1.0 Release

2022-10-17 Thread Jeff Zemerick
Hi everyone, We have some fixed Jira issues [1] and this might be a good time to do a 2.1.0 release before the upcoming holidays when it's harder to find time for releasing and voting. If anyone has any thoughts please reply! If there aren't any objections I will try to initiate a vote in the com

Span.getCoveredText() returning string based on character positions

2022-10-13 Thread Jeff Zemerick
Hi All, The NameFinder implementations create spans based on the entity's token-based start/end indexes. But, Span.getCoveredText() gets the covered text based on the character start/end instead of the token start/end. An example: sentence = "Neil Abercrombie Anibal Acevedo-Vila Gary Ackerman"
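The mismatch described in this message can be illustrated without OpenNLP itself. The following is a minimal sketch, assuming a hypothetical `SpanOffsets` helper class and plain whitespace tokenization; neither is OpenNLP's API. The sentence comes from the message, while the `[0, 2)` span is an assumed example because the original text is truncated.

```java
import java.util.Arrays;

public class SpanOffsets {
    // Interprets [start, end) as character offsets into the sentence.
    public static String coveredByChars(String sentence, int start, int end) {
        return sentence.substring(start, end);
    }

    // Interprets [start, end) as token indexes into the token array.
    public static String coveredByTokens(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        String sentence = "Neil Abercrombie Anibal Acevedo-Vila Gary Ackerman";
        String[] tokens = sentence.split(" "); // assumption: whitespace tokens

        // Hypothetical span covering the first two tokens.
        int start = 0;
        int end = 2;

        // The same numbers give different text depending on the index space.
        System.out.println(coveredByChars(sentence, start, end));  // Ne
        System.out.println(coveredByTokens(tokens, start, end));   // Neil Abercrombie
    }
}
```

The point of the sketch is only that a start/end pair is ambiguous unless the caller knows whether it counts tokens or characters.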

OPENNLP-1386: Making parameters be not case sensitive

2022-10-06 Thread Jeff Zemerick
Hi all, The training parameters like "cutoff" are case-sensitive when training models in the parameters file, so cutoff != CUTOFF != Cutoff. I made a PR to make them not be case sensitive and if anybody has any comments please check out the PR: https://github.com/apache/opennlp/pull/423 The idea
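One way to get the behaviour described here is to store parameters in a map whose key comparison ignores case. This is a sketch under that assumption only; the `CaseInsensitiveParams` class is hypothetical and does not reflect how the pull request implements the change.

```java
import java.util.Map;
import java.util.TreeMap;

public class CaseInsensitiveParams {
    // Copies the raw parameters into a map that compares keys without regard to case.
    public static Map<String, String> normalize(Map<String, String> raw) {
        Map<String, String> params = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        params.putAll(raw);
        return params;
    }

    public static void main(String[] args) {
        // Hypothetical parameter value; "cutoff" is the name used in the message.
        Map<String, String> params = normalize(Map.of("CUTOFF", "5"));

        // All spellings resolve to the same entry.
        System.out.println(params.get("cutoff")); // 5
        System.out.println(params.get("Cutoff")); // 5
    }
}
```

A design caveat: if the raw input contains both `cutoff` and `CUTOFF`, only one value survives the copy, so a real implementation would probably want to detect and report that conflict.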

Renaming of master branch to main

2022-07-28 Thread Jeff Zemerick
Hi all, I want to see about the community's input as to renaming the master branch to main for the opennlp and opennlp-site repositories. I see quite a few ASF projects have made the change and I personally think we should, too, for the same reason of promoting inclusivity. I think we would have

Re: Training of MaxEnt Model with large corpora fails with java.io.UTFDataFormatException

2022-07-27 Thread Jeff Zemerick
Hi Richard, I know it's been a while but I wanted to circle back to this to see if there are any updates. Thanks, Jeff On Mon, Apr 25, 2022 at 1:48 PM Richard Eckart de Castilho wrote: > Hi, > > > On 11. Apr 2022, at 14:50, Zowalla, Richard < > richard.zowa...@hs-heilbronn.de> wrote: > > > > T

Re: Unit tests in CLITest fail with openjdk 11+

2022-07-27 Thread Jeff Zemerick
Hi Bertrand, Thanks for reporting this. This is the right place! Could you please open a JIRA ticket for it? In the ticket, could you please provide the output of the "mvn -v" command? In version 2 we moved to Java 11 so I'm surprised to see the tests fail for it. I am not, however, surprised to

Re: "Reuters" data for the CONLL 2003 task

2022-07-24 Thread Jeff Zemerick
Hi Bertrand, This probably shouldn't be considered factual advice, but I think as an individual you can "accept" it yourself. The folks at NIST ( reuters-requ...@nist.gov) can likely give a definitive answer to that. Thanks, Jeff On Sun, Jul 24, 2022 at 3:28 PM Bertrand Rigaldies wrote: > Hi

Re: [ANNOUNCE] OpenNLP 2.0.0 released

2022-06-13 Thread Jeff Zemerick
rote: > Thanks Jeff! > > But i could not find the RELEASE_NOTES anywhere in source or binary > distribution. The issuesFixed HTML file in the binary distribution is > empty. Am i looking in the wrong direction? > > Thanks > > > Op ma 6 jun. 2022 om 15:25 schreef Jef

Re: [VOTE] Apache OpenNLP 2.0.0 Release Candidate

2022-06-07 Thread Jeff Zemerick
ra URL which may or may not be an issue. > > Cheers, Paul. > > On 2022/05/08 12:26:38 Jeff Zemerick wrote: > > Hi folks, > > > > I have posted a first release candidate for the Apache OpenNLP 2.0.0 > > release and it is ready for testing. > > > > The dist

[ANNOUNCE] OpenNLP 2.0.0 released

2022-06-06 Thread Jeff Zemerick
The Apache OpenNLP team is pleased to announce the release of version 2.0.0 of Apache OpenNLP. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-spee

Re: [VOTE] Apache OpenNLP 2.0.0 Release Candidate

2022-06-05 Thread Jeff Zemerick
Cheers, > Jörn > > > On Wed, Jun 1, 2022 at 9:57 PM Suneel Marthi wrote: > > > +1 binding > > > > On Wed, Jun 1, 2022 at 3:12 PM Jeff Zemerick > wrote: > > > > > Just pinging folks on the thread about the active vote. The project > has a > >

Re: [VOTE] Apache OpenNLP 2.0.0 Release Candidate

2022-06-01 Thread Jeff Zemerick
Just pinging folks on the thread about the active vote. The project has a board report due in a week - it would be awesome to get this release in that report. Thanks, Jeff On Thu, May 26, 2022 at 9:39 AM Jeff Zemerick wrote: > I created a JIRA task to update the NOTICE file. > > I re-

Re: [VOTE] Apache OpenNLP 2.0.0 Release Candidate

2022-05-26 Thread Jeff Zemerick
I created a JIRA task to update the NOTICE file. I re-ran build tests and eval tests and am +1 to release as 2.0.0 Thanks, Jeff On Tue, May 10, 2022 at 8:37 AM Jeff Zemerick wrote: > Bruno, > > Good catch. Does updating the date require a new RC? > > Thanks for the rem

Re: [VOTE] Apache OpenNLP 2.0.0 Release Candidate

2022-05-10 Thread Jeff Zemerick
est/models that others used to > run for other releases. In case you know how to run that, it'd be good if > you could post in your vote saying whether everything worked fine. > Otherwise check with another PMC/committer about it. Since it's a 2.0 > release I expect a few user

[VOTE] Apache OpenNLP 2.0.0 Release Candidate

2022-05-08 Thread Jeff Zemerick
Hi folks, I have posted a first release candidate for the Apache OpenNLP 2.0.0 release and it is ready for testing. The distributables can be downloaded from: https://repository.apache.org/content/repositories/orgapacheopennlp-1029/org/apache/opennlp/opennlp-distr/2.0.0/ The release was made fro

OpenNLP 2.x supporting 1.x models

2022-04-20 Thread Jeff Zemerick
There is a block in BaseModel.java that checks to see if the model's major version number matches OpenNLP's major version number. Running this with an OpenNLP 2.0.0 version causes the check to fail (2 != 1). I have made a pull request [1] to change that check to just make sure the major version nu
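The failing comparison described here can be sketched as a strict major-version equality test. The `MajorVersionCheck` class and its method names are hypothetical and are not the actual code in BaseModel.java; the relaxed check proposed in the pull request is not shown because the message is truncated.

```java
public class MajorVersionCheck {
    // Extracts the leading major number from a dotted version string such as "1.9.4".
    public static int major(String version) {
        return Integer.parseInt(version.split("\\.")[0]);
    }

    // Strict check: the model is accepted only if both major versions are equal.
    public static boolean strictMatch(String modelVersion, String runtimeVersion) {
        return major(modelVersion) == major(runtimeVersion);
    }

    public static void main(String[] args) {
        // A model trained with a 1.x release, loaded by a 2.0.0 runtime: 1 != 2.
        System.out.println(strictMatch("1.9.4", "2.0.0")); // false
        // Same major version on both sides.
        System.out.println(strictMatch("2.0.0", "2.1.0")); // true
    }
}
```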

Re: Training of MaxEnt Model with large corpora fails with java.io.UTFDataFormatException

2022-04-18 Thread Jeff Zemerick
ieb Zowalla, > > > Richard: > > > > Hi Jeff, > > > > > > > > thanks for the update. > > > > > > > > We will give the change a try with a SNAPSHOT build including the > > > > potential patch and start a run on the cluste

Re: Training of MaxEnt Model with large corpora fails with java.io.UTFDataFormatException

2022-04-12 Thread Jeff Zemerick
purpose-of-the-encoded-string-too-long-restriction [2] https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html#writeUTF(java.lang.String) On Mon, Apr 11, 2022 at 1:41 PM Jeff Zemerick wrote: > Great, thanks. I was able to reproduce the problem. I'll take a look and > kee

Re: Training of MaxEnt Model with large corpora fails with java.io.UTFDataFormatException

2022-04-11 Thread Jeff Zemerick
> > It basically boils down to a size limitation in the JDK's > DataOutputStream. > > Gruß > Richard > > Am Montag, dem 11.04.2022 um 10:13 -0400 schrieb Jeff Zemerick: > > Hi Richard, > > > > Thanks for reporting this. A Jira issue with steps to rep
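The size limitation mentioned here can be reproduced with the JDK alone. The sketch below assumes the relevant call is `DataOutputStream.writeUTF`, which stores the encoded length in two bytes and throws `UTFDataFormatException` once the encoded string exceeds 65535 bytes. The `WriteUtfLimit` class is hypothetical and is not taken from the OpenNLP code base.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;

public class WriteUtfLimit {
    // Returns true if writeUTF accepts a string of `length` ASCII characters
    // (one encoded byte each), false if it rejects the string as too long.
    public static boolean canWrite(int length) throws IOException {
        String value = "a".repeat(length);
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(value);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canWrite(65535)); // true: exactly at the limit
        System.out.println(canWrite(65536)); // false: one byte over the limit
    }
}
```

Because the limit applies to encoded bytes rather than characters, strings with non-ASCII content reach it with fewer characters.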

Re: Training of MaxEnt Model with large corpora fails with java.io.UTFDataFormatException

2022-04-11 Thread Jeff Zemerick
Hi Richard, Thanks for reporting this. A Jira issue with steps to reproduce it would be fantastic. https://issues.apache.org/jira/projects/OPENNLP Please create one and reply back here with its ID once you do. I can take a look and see what can be done. Thanks, Jeff On Mon, Apr 11, 2022 at 8:47

Re: OpenNLP 2.0 release discussion

2022-04-06 Thread Jeff Zemerick
> > > > Bruno > > > > On Wednesday, 6 April 2022, 02:34:35 am NZST, Jeff Zemerick < > > jzemer...@apache.org> wrote: > > > > Hi all, > > > > I would like to propose an OpenNLP 2.0 release for the following reasons: > > > > - Th

OpenNLP 2.0 release discussion

2022-04-05 Thread Jeff Zemerick
Hi all, I would like to propose an OpenNLP 2.0 release for the following reasons: - There are a few significant changes: Building using Java 11, support for ONNX models, automatic model downloading - User activity has been somewhat low and a 2.0 release might help bring attention to these new fea

OPENNLP-1185: Tokenizers should be able to output a new line token

2022-03-29 Thread Jeff Zemerick
There is a JIRA task [1] that Jörn wrote a few years ago that calls for allowing the tokenizers to output new line tokens and there is a PR [2] for it. The PR does not change the interfaces and just adds a keepNewLines boolean to the tokenizers. It doesn't look like this change would affect any ex

OPENNLP-1318 - reopened PR for automatic model downloading

2022-03-25 Thread Jeff Zemerick
I have reopened the pull request [1] for automatic model downloading OPENNLP-1318 [2]. I updated the code to download the models trained on UD instead of the SourceForge models. [1] https://github.com/apache/opennlp/pull/383 [2] https://issues.apache.org/jira/projects/OPENNLP/issues/OPENNLP-1318

Re: Moving OpenNLP to Java 11

2022-03-24 Thread Jeff Zemerick
to Java 11 from me. > > Koji > > > On 2022/03/22 1:43, Jeff Zemerick wrote: > > OpenNLP is built targeting Java 8. I would like to propose changing it to > > Java 11, eyeing an eventual 2.0 release. > > > > I have tested the build on 11 without issues. > >

Moving OpenNLP to Java 11

2022-03-21 Thread Jeff Zemerick
OpenNLP is built targeting Java 8. I would like to propose changing it to Java 11, eyeing an eventual 2.0 release. I have tested the build on 11 without issues. Does anyone have any thoughts/concerns about changing to 11 from 8? Thanks, Jeff

updating documentation

2022-02-17 Thread Jeff Zemerick
The OpenNLP developer documentation looks nice and integrates in the build nicely, but editing it can be a bit of a chore since it is XML documents. Does anyone have any suggestions of other documentation tools that can fit nicely in our build but be a little bit more developer-friendly? I don't m

Re: OpenNLP support for ONNX models pull request

2022-01-10 Thread Jeff Zemerick
00 Thanks, Jeff On Tue, Jan 4, 2022 at 10:35 AM Jeff Zemerick wrote: > I have created a pull request to add OpenNLP implementations of > DocumentCategorizer and TokenNameFinder for ONNX models. The goal of this > PR is to enable the use of Huggingface transformers models from OpenN

Re: [Build] Move from Travis CI to GH Actions?

2022-01-10 Thread Jeff Zemerick
Bruno, I don't have any experience with GH Actions but I'm fine with a move since other ASF projects have adopted it and it seems easy to do. Thanks, Jeff On Fri, Jan 7, 2022 at 3:44 AM Bruno P. Kinoshita wrote: > Hi, > > I was reviewing a simple PR but after squashing commits and rebasing, >

OpenNLP support for ONNX models pull request

2022-01-04 Thread Jeff Zemerick
I have created a pull request to add OpenNLP implementations of DocumentCategorizer and TokenNameFinder for ONNX models. The goal of this PR is to enable the use of Huggingface transformers models from OpenNLP. Look at the README.md under the opennlp-dl project for info on how to convert a transfo

Re: [ANNOUNCE] OpenNLP 1.9.4 released

2021-11-11 Thread Jeff Zemerick
'd like to retweet it. > > Koji > > On 2021/11/10 2:49, Jeff Zemerick wrote: > > The Apache OpenNLP team is pleased to announce the release of version > 1.9.4 > > of Apache OpenNLP. The Apache OpenNLP library is a machine learning based > > toolkit for the

[ANNOUNCE] OpenNLP 1.9.4 released

2021-11-09 Thread Jeff Zemerick
The Apache OpenNLP team is pleased to announce the release of version 1.9.4 of Apache OpenNLP. The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-spee

Re: [VOTE] Apache OpenNLP 1.9.4 Release Candidate

2021-11-08 Thread Jeff Zemerick
3, 2021 at 8:29 AM Jeff Zemerick wrote: > Thanks Bruno and Koji for voting! With three +1 binding votes and no -1 > votes the vote passes to release as OpenNLP 1.9.4. The process to release > 1.9.4 will continue as documented in the Release Guide [1]. > > Thanks, >

Re: [VOTE] Apache OpenNLP 1.9.4 Release Candidate

2021-11-03 Thread Jeff Zemerick
Jeff Zemerick wrote: > +1. I ran the eval tests and they all passed > > On Thu, Oct 28, 2021 at 12:45 AM Koji Sekiguchi < > koji.sekigu...@rondhuit.com> wrote: > >> +1 to release. >> >> I ran mvn install and checked some checksums of the distribution. Look

Re: [VOTE] Apache OpenNLP 1.9.4 Release Candidate

2021-10-29 Thread Jeff Zemerick
+1. I ran the eval tests and they all passed On Thu, Oct 28, 2021 at 12:45 AM Koji Sekiguchi wrote: > +1 to release. > > I ran mvn install and checked some checksums of the distribution. Looked > fine. > > Koji > > On 2021/10/27 22:13, Jeff Zemerick wrote: > > Hi

[VOTE] Apache OpenNLP 1.9.4 Release Candidate

2021-10-27 Thread Jeff Zemerick
Hi folks, I have posted a 1st release candidate for the Apache OpenNLP 1.9.4 release and it is ready for testing. Issues resolved in this version: https://issues.apache.org/jira/browse/OPENNLP-1311?jql=project%20%3D%20OPENNLP%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%2

OpenNLP 1.9.4 and 2.0.0

2021-10-08 Thread Jeff Zemerick
Team, There have been some minor issues [1] that have been fixed for an upcoming 1.9.4 release. The last release, 1.9.3, was over a year ago so I think we are due for a release. Please let me know if you have any thoughts about 1.9.4. If I don't hear anything in the next few days I will begin the

Re: Planning for OpenNLP 2.0

2021-06-14 Thread Jeff Zemerick
quest. Please take a look at the branch comparison and share any thoughts/suggestions/concerns. Thanks, Jeff On Tue, Jun 8, 2021 at 10:18 AM Jeff Zemerick wrote: > Hi everyone, > > OpenNLP became a top-level Apache project around 10 years ago and since > then there have been lots of 1.x

Planning for OpenNLP 2.0

2021-06-08 Thread Jeff Zemerick
Hi everyone, OpenNLP became a top-level Apache project around 10 years ago and since then there have been lots of 1.x releases. In the past few years the NLP community has seen transformative changes primarily centered around the Python ecosystem. While OpenNLP still performs and is used by many p

Re: [VOTE] Apache OpenNLP Models 1.0

2021-05-23 Thread Jeff Zemerick
With three +1 binding votes the vote passes to release the OpenNLP models as version 1.0. An announcement email will be sent once the models are available on the website. Thanks, Jeff On Thu, May 20, 2021 at 12:34 PM Jeff Zemerick wrote: > I have re-reviewed the artifacts and am voting

Re: [VOTE] Apache OpenNLP Models 1.0

2021-05-20 Thread Jeff Zemerick
r existed or had no issues. > > Everything OK. > > > > The NOTICE file says 2017, and in Commons we try to update that to the > > year of the release of the component, but that shouldn't be a blocker I > > think. > > > > +1 > > > > Thank yo
