Has Penn said explicitly that they'll never release the Treebank under an open license?
jds

On Thu, Jul 19, 2012 at 10:22 PM, James Kosin <[email protected]> wrote:
> On 7/19/2012 2:07 AM, Lance Norskog wrote:
>> What is the legitimacy of data which is tagged using an encumbered
>> model? I mean, if I tag documents with OpenNLP's non-free models on
>> sourceforge, the tagged output is a "derived work". Is this tagged
>> output considered free? Does this depend on the license of the
>> original data?
>>
>>
> Lance,
>
> The problem is two-fold.
>
> (1) We would like to distribute the models on Apache. Unfortunately,
> to do so would mean the models and source used to create the models
> would have to be under the Apache license to be distributed. We don't
> see any way around this than to generate our own training data with an
> open license compatible with the Apache license.
> Jorn is getting the groundwork done for this with the tagging server
> to allow us to hand-tag and correct data for our own training data. I
> know it is re-doing work that already has been done; but, the benefits
> will be large in the long run. Anyone could download the training data
> and add/remove/etc all they want to customize the training set to
> various situations without the worry of a copyright issue.
> The down side, we have a lot of work to do to get there.
>
> (2) The models themselves although available on sourceforge are for
> research purposes ONLY. The copyright and contract with those holding
> the copyright for the original works have stated so. I've asked many on
> this point. We are not helping by breaking the law on this, nor do we
> suggest anyone to do this.
> The next problem is we can't distribute the training data for the
> models.... so, modifications to the models are next to impossible to add
> training for other situations. The data used to train are mainly from
> news sources and that limits some of the usefulness for some.
>
> .....
> I guess I'll have to get the FAQ section on our web-site done soon.
>
> Thanks,
> James
