Dear all,

I have now created a named entity model for German. It is trained on 5,000 manually annotated sentences; its performance is not perfect, but it is already usable. I will continue with more texts.

I used only texts from Wikipedia and Wikinews, so in my view distributing the model shouldn't be a problem. But I'm not sure which license would be a good choice: OpenNLP uses the Apache License, whereas Wikipedia content is under a Creative Commons license. In addition, because I have the "raw" training data, it would be easy to train other NE detectors with it.

The OpenNLP page doesn't say anything about the licenses of the models that are already available there.

So, which license do you think would be best for

a) a trained model

and

b) the raw data, which consists entirely of Wikipedia content?

Thanks in advance and best regards,

Tom


--
Dr. Thomas Zastrow
Riemerfeldring 7a

85748 Garching
Tel.: 0162 422 8029
www.thomas-zastrow.de
