Re: License for NE model?

Thomas Zastrow Wed, 30 Oct 2013 05:54:39 -0700

Dear all,

Thanks for the information.



On 30.10.2013 13:20, Jörn Kottmann wrote:

On 10/30/2013 12:03 PM, Nils Reiter wrote:
I guess the question is whether a trained model is an “adaptation” of the work according to the license. If that’s the case, you’re bound to using Creative Commons, I think.

I want to publish both: the binary model and the raw, manually annotated texts. The latter is a derivative work of Wikipedia: you can still read the articles, which just have some annotations in between. So for those file(s), it will be the original Wikipedia license.

The model does not contain the original texts; it contains the words and bigrams,
but that is nothing the original author has a copyright on.

Hmm, that's the point: I know from other contexts that models trained from treebanks also have to be under the same conditions as the original treebank. So I'm not sure whether I'm free to use another license for the binary file. And I don't know about the other models on the OpenNLP page: I used the German tokenizer and sentence-detector models, together with the OpenNLP tools. At the least, my binary model is a mixture of CC, Apache License and whatever is used for the already existing models.

Any interest in contributing your work back to OpenNLP? It would really be a great start for us to finally have some annotated data as proper Open Source as well. The Wikipedia effort can probably
easily be replicated for other languages.

Yes, of course. I built this model for my own hobby project, but I always had in mind to give it away for free. I also implemented a graphical user interface for doing manual NE annotation ... all the OpenNLP tools are integrated, and it can now be seen as a generic graphical user interface for OpenNLP. That tool is far from being perfect, but I think I will publish a "beta of a pre-alpha version" in the next few days :-)

I also found out that the tokenizer and sentence models for German are ... not the best ones. I don't know who made them, but they are lacking some very common features of German texts.

Last but not least, I'm working on some converters for the OpenNLP formats, because I need the output to be TCF. I still haven't found the hook in the code, if and where that would fit.


Best,

Tom


--
Dr. Thomas Zastrow
Riemerfeldring 7a

85748 Garching
Tel.: 0162 422 8029
www.thomas-zastrow.de
