Joey, thanks for the comments/questions.

- The models can be named with the 1.9.3 version to show that is what
  trained them, but we should be careful not to give the impression that
  the models *only* work with that version. I think that can be made
  sufficiently clear in the documentation (see the loading sketch after
  this list).
- My opinion is that it would be best if the models were not tied to the
  OpenNLP release lifecycle. I would like the project to be able to release
  new models independently of OpenNLP releases. So I hope we can do the
  latter: train the models from an official OpenNLP release, vote, and
  publish.
- I feel that the models fall somewhere between a direct binary artifact
  and something more derivative like a Docker container, because a model
  needs to be evaluated before it is made available. Contrast that with a
  Docker container, which either works or it doesn't. More things (how it
  was trained, performance, etc.) should be considered when voting on a
  model release than just whether it works.
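To make the compatibility point concrete, here is a minimal, untested
sketch of loading one of the models from Java. The file name and the choice
of a POS model are placeholders of mine based on the proposed naming scheme,
not the final artifact names:

    import java.io.File;
    import opennlp.tools.postag.POSModel;
    import opennlp.tools.postag.POSTaggerME;

    public class ModelCompatibilityCheck {
        public static void main(String[] args) throws Exception {
            // Hypothetical file name following the proposed convention:
            // model version 1.0, trained with OpenNLP 1.9.3.
            File modelFile = new File("opennlp-en-ud-pos-1.0-1.9.3.bin");

            // The model loads through the same API whichever OpenNLP
            // version is on the classpath; training with 1.9.3 is not
            // intended to pin consumers to 1.9.3 (it is just the only
            // version the models have been tested with).
            POSModel model = new POSModel(modelFile);
            POSTaggerME tagger = new POSTaggerME(model);

            String[] tokens = {"OpenNLP", "models", "are", "convenience", "binaries", "."};
            String[] tags = tagger.tag(tokens);

            for (int i = 0; i < tokens.length; i++) {
                System.out.println(tokens[i] + "/" + tags[i]);
            }
        }
    }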
From that page (https://incubator.apache.org/guides/distribution.html):

- Convenience binaries must be made from IPMC approved ASF releases.
- Convenience binaries need to follow licensing policy and not include any
  category X licensed software.
- Convenience binaries should be signed and have hashes to verify their
  contents.

I think we are OK on those three points. I will update the naming of the
models as Joey suggested (to include the OpenNLP version that created them)
and update the README to explain that 1.9.3 is the version that created
them but that they should work with all OpenNLP versions (they have only
been tested with 1.9.3).

Are there any concerns about the model release process given my responses
to your questions?

Thanks,
Jeff

On Mon, Mar 15, 2021 at 7:49 PM Joey Frazee <joey.fra...@icloud.com.invalid>
wrote:

> Jeff, in the other thread you mentioned “I personally have been thinking
> of the models as convenience binaries”.
>
> I think that’s the most obvious answer and is what I’d think too.
>
> If that’s the case, then the policies suggest that the version needs to
> match the version they’re created from. So should this be something like
> opennlp-ud-models-1.0-1.9.3 or similar?
>
> The other thing, which is murky in practice: do the models need to be
> voted on concurrently with a release, or just created by the PMC from an
> official release and published on Apache-supported infrastructure?
>
> Direct binary artifacts are almost always evaluated at the time of a
> release vote, but more derivative ones often aren’t. E.g., a lot of
> projects publish Docker images from approved releases but not with an
> independent vote. Which are these?
>
> The Incubator recently published some helpful guidelines which clarify
> related matters for the podlings:
>
> https://incubator.apache.org/guides/distribution.html
>
> -joey
>
> > On Mar 15, 2021, at 3:26 PM, Jeff Zemerick <jzemer...@apache.org> wrote:
> >
> > Before starting a second release vote thread for the OpenNLP models,
> > and since this is the first release of pretrained OpenNLP models, I
> > would like to pause and solicit feedback from the community on the
> > release configuration.
> >
> > - The files are staged on the ASF dev SVN at
> > https://dist.apache.org/repos/dist/dev/opennlp/ud-models-1.0/.
> > - The model files are signed and hashed.
> > - Includes README, CHANGES, NOTICE, and LICENSE files.
> > - The training and evaluation outputs are in the training-eval-logs.zip
> > file (also signed and hashed).
> >
> > Please let me know if anything is missing or should be changed. Once
> > things are in a good state I will make a PR to document the steps on
> > the website (OPENNLP-1328) and start a vote thread.
> >
> > ASF Release Creation Process:
> > https://infra.apache.org/release-publishing.html
> >
> > Thanks,
> > Jeff
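P.S. For anyone reviewing the staged files, a rough sketch of checking a
SHA-512 hash in Java. The artifact and sidecar names are placeholders (the
.sha512 file naming is my assumption), and the GPG signatures would still
need to be verified separately:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.security.MessageDigest;

    public class HashCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder names; substitute the actual staged artifact and
            // its hash file from dist/dev/opennlp/ud-models-1.0/.
            Path artifact = Path.of("training-eval-logs.zip");
            Path hashFile = Path.of("training-eval-logs.zip.sha512");

            byte[] digest = MessageDigest.getInstance("SHA-512")
                    .digest(Files.readAllBytes(artifact));

            StringBuilder computed = new StringBuilder();
            for (byte b : digest) {
                computed.append(String.format("%02x", b));
            }

            // The hash file may contain "<hash>  <filename>"; compare the
            // hash portion only.
            String expected = Files.readString(hashFile).trim().split("\\s+")[0];

            System.out.println(computed.toString().equalsIgnoreCase(expected)
                    ? "SHA-512 OK" : "SHA-512 MISMATCH");
        }
    }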