Hello,

The Apache OpenNLP project only distributes models that are licensed under the Apache License 2.0, or models that comply with the strict licensing requirements at Apache. So far, the only model we release at the Apache OpenNLP project is a language detection model.

The OpenNLP project was previously hosted at SourceForge, and back then various pre-trained models were also released, but the license situation for these models is unclear to me. The problem is that the models are derived from copyright-protected corpora, and depending on the source, the corpus license may contain a clause about works derived from it. The project back then was under the LGPL license, and I believe the intention was to release the models under the same license (the copyright holders of the corpora never complained, but they certainly wouldn't agree).

Today we can train models on UT and release them under an open source license, but this hasn't been done yet due to a lack of contributions and of time from the maintainers.

Jörn

On Mon, Dec 30, 2019 at 9:33 PM Andrej Shadura <[email protected]> wrote:
>
> Hi,
>
> There’s a bunch of pre-trained 1.5 models available for download at the
> OpenNLP website, but they lack licensing information. Someone reuploaded
> them as Java JAR files to MvnRepository stating they’re Apache-2.0
> licensed, but I’m not sure that’s correct.
>
> I’m concerned because LanguageTool depends on these models, and I’m
> packaging it for Debian, and I need license clarity, since Debian
> doesn’t accept non-free files or files with unclear licensing.
>
> Could please somebody clarify this?
>
> Thanks!
>
> References
> [1]: http://opennlp.sourceforge.net/models-1.5/
> [2]: https://mvnrepository.com/artifact/edu.washington.cs.knowitall
>
> --
> Cheers,
> Andrej
