Joey, thanks for the comments/questions.

- The models can be named with the 1.9.3 version to show that is what
  trained them, but we should be careful not to give the impression that
  the models *only* work with that version. I think that can be made
  sufficiently clear in the documentation (see the loading sketch after
  this list).
- My opinion is that it would be best if the models were not tied to the
  OpenNLP release lifecycle. I would like the project to be able to release
  new models independently of OpenNLP releases. So I hope we can do the
  latter: train the models from an official OpenNLP release, vote, and
  publish.
- I feel that the models fall somewhere between a direct binary artifact
  and something more derivative like a Docker container, because a model
  needs to be evaluated before it is made available. Contrast that with a
  Docker container, which either works or it doesn't. More things (how it
  was trained, performance, etc.) should be considered when voting on a
  model release than just whether it works.
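To make the compatibility point concrete, here is a minimal, untested
sketch of loading one of the models from Java. The file name and the choice
of a POS model are placeholders of mine based on the proposed naming scheme,
not the final artifact names:

    import java.io.File;
    import opennlp.tools.postag.POSModel;
    import opennlp.tools.postag.POSTaggerME;

    public class ModelCompatibilityCheck {
        public static void main(String[] args) throws Exception {
            // Hypothetical file name following the proposed convention:
            // model version 1.0, trained with OpenNLP 1.9.3.
            File modelFile = new File("opennlp-en-ud-pos-1.0-1.9.3.bin");

            // The model loads through the same API whichever OpenNLP
            // version is on the classpath; training with 1.9.3 is not
            // intended to pin consumers to 1.9.3 (it is just the only
            // version the models have been tested with).
            POSModel model = new POSModel(modelFile);
            POSTaggerME tagger = new POSTaggerME(model);

            String[] tokens = {"OpenNLP", "models", "are", "convenience", "binaries", "."};
            String[] tags = tagger.tag(tokens);

            for (int i = 0; i < tokens.length; i++) {
                System.out.println(tokens[i] + "/" + tags[i]);
            }
        }
    }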
From that page (https://incubator.apache.org/guides/distribution.html):

- Convenience binaries must be made from IPMC approved ASF releases.
- Convenience binaries need to follow licensing policy and not include any
  category X licensed software.
- Convenience binaries should be signed and have hashes to verify their
  contents.

I think we are OK on those three points. I will update the naming of the
models as Joey suggested (to include the OpenNLP version that created them)
and update the README to explain that 1.9.3 is the version that created
them but that they should work with all OpenNLP versions (they have only
been tested with 1.9.3).

Are there any concerns about the model release process given my responses
to your questions?

Thanks,
Jeff

On Mon, Mar 15, 2021 at 7:49 PM Joey Frazee <joey.fra...@icloud.com.invalid>
wrote:

> Jeff, in the other thread you mentioned “I personally have been thinking
> of the models as convenience binaries”.
>
> I think that’s the most obvious answer and is what I’d think too.
>
> If that’s the case, then the policies suggest that the version needs to
> match the version they’re created from. So should this be something like
> opennlp-ud-models-1.0-1.9.3 or similar?
>
> The other thing, which is murky in practice: do the models need to be
> voted on concurrently with a release, or just created by the PMC from an
> official release and published on Apache-supported infrastructure?
>
> Direct binary artifacts are almost always evaluated at the time of a
> release vote, but more derivative ones often aren’t. E.g., a lot of
> projects publish Docker images from approved releases but not with an
> independent vote. Which are these?
>
> The Incubator recently published some helpful guidelines which clarify
> related matters for the podlings:
>
> https://incubator.apache.org/guides/distribution.html
>
> -joey
>
> > On Mar 15, 2021, at 3:26 PM, Jeff Zemerick <jzemer...@apache.org> wrote:
> >
> > Before starting a second release vote thread for the OpenNLP models,
> > and since this is the first release of pretrained OpenNLP models, I
> > would like to pause and solicit feedback from the community on the
> > release configuration.
> >
> > - The files are staged on the ASF dev SVN at
> > https://dist.apache.org/repos/dist/dev/opennlp/ud-models-1.0/.
> > - The model files are signed and hashed.
> > - Includes README, CHANGES, NOTICE, and LICENSE files.
> > - The training and evaluation outputs are in the training-eval-logs.zip
> > file (also signed and hashed).
> >
> > Please let me know if anything is missing or should be changed. Once
> > things are in a good state I will make a PR to document the steps on
> > the website (OPENNLP-1328) and start a vote thread.
> >
> > ASF Release Creation Process:
> > https://infra.apache.org/release-publishing.html
> >
> > Thanks,
> > Jeff
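P.S. For anyone reviewing the staged files, a rough sketch of checking a
SHA-512 hash in Java. The artifact and sidecar names are placeholders (the
.sha512 file naming is my assumption), and the GPG signatures would still
need to be verified separately:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.security.MessageDigest;

    public class HashCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder names; substitute the actual staged artifact and
            // its hash file from dist/dev/opennlp/ud-models-1.0/.
            Path artifact = Path.of("training-eval-logs.zip");
            Path hashFile = Path.of("training-eval-logs.zip.sha512");

            byte[] digest = MessageDigest.getInstance("SHA-512")
                    .digest(Files.readAllBytes(artifact));

            StringBuilder computed = new StringBuilder();
            for (byte b : digest) {
                computed.append(String.format("%02x", b));
            }

            // The hash file may contain "<hash>  <filename>"; compare the
            // hash portion only.
            String expected = Files.readString(hashFile).trim().split("\\s+")[0];

            System.out.println(computed.toString().equalsIgnoreCase(expected)
                    ? "SHA-512 OK" : "SHA-512 MISMATCH");
        }
    }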