Hi Lane,

I'm not really involved with Moses or NLTK and never meant to take that on personally. However, it still seems to me like a reasonable and achievable goal.

matt

> On May 29, 2018, at 4:53 PM, Lane Schwartz <dowob...@gmail.com> wrote:
>
> Matt,
>
> Did you ever track down the people who contributed to the tokenizer? It seems
> like we should be able to dual license that script. It would be very nice to
> be able to include the Moses tokenizer and detokenizer as part of NLTK.
>
> Lane
>
>
> On Fri, Apr 20, 2018 at 12:38 AM, liling tan <alvati...@gmail.com> wrote:
> Dear Moses Devs and Community,
>
> Sorry for the delayed response.
>
> We've repackaged the MosesTokenizer Python code as a library and made it
> pip-able.
> https://github.com/alvations/sacremoses
>
> I hope that's okay with the Moses community and the license compliance is
> good with this now.
>
> Regards,
> Liling
>
>
> On Wed, Apr 11, 2018 at 1:41 AM, Matt Post <p...@cs.jhu.edu> wrote:
> Seems worth a shot. I suggest contacting each of them with individual emails
> until (and if) you get a “no”.
>
> matt (from my phone)
>
> On 10 Apr 2018 at 19:26, liling tan <alvati...@gmail.com> wrote:
>
>> @Matt I'm not sure whether that'll work.
>>
>> For tokenizer, that'll include:
>>
>> phikoehn <https://github.com/phikoehn>
>> hieuhoang <https://github.com/hieuhoang>
>> bhaddow <https://github.com/bhaddow>
>> jimregan <https://github.com/jimregan>
>> kpu <https://github.com/kpu>
>> ugermann <https://github.com/ugermann>
>> pjwilliams <https://github.com/pjwilliams>
>> jgwinnup <https://github.com/jgwinnup>
>> mhuck <https://github.com/mhuck>
>> tofula <https://github.com/tofula>
>> a455bcd9 <https://github.com/a455bcd9>
>>
>> And these for the detokenizer:
>>
>> phikoehn <https://github.com/phikoehn>
>> flammie <https://github.com/flammie>
>> hieuhoang <https://github.com/hieuhoang>
>> pjwilliams <https://github.com/pjwilliams>
>> bhaddow <https://github.com/bhaddow>
>> alvations <https://github.com/alvations>
>>
>> Not sure if everyone agrees though.
>>
>> Regards,
>> Liling
>>
>> On Wed, Apr 11, 2018 at 12:39 AM, Matt Post <p...@cs.jhu.edu> wrote:
>> Liling, would it work to get the permission of just those people who are in
>> the commit log of the specific scripts you want to port?
>>
>> matt (from my phone)
>>
>> On 10 Apr 2018 at 18:19, liling tan <alvati...@gmail.com> wrote:
>>
>>> Got it.
>>>
>>> So I think we'll just remove the MosesTokenizer and MosesDetokenizer
>>> functions from NLTK and maybe create a PR to put them in
>>> mosesdecoder/scripts/tokenizer
>>>
>>> Thank you for the clarification!
>>> Liling
>>>
>>> On Wed, Apr 11, 2018 at 12:17 AM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>> Still the same problem - everyone owns Moses so you need everyone's
>>> permission, not just mine. So no
>>>
>>> Hieu Hoang
>>> http://moses-smt.org/
>>>
>>>
>>> On 10 April 2018 at 17:13, liling tan <alvati...@gmail.com> wrote:
>>> I understand.
>>>
>>> Could we have permission that it's okay to derive work from Moses with
>>> respect to the (de-)tokenizer and possibly other scripts under an
>>> MIT/Apache tool?
>>>
>>> Legally it's a restriction but I think, for what it's worth, having mutual
>>> agreement between the OSS is sufficient to still keep any port of LGPL work
>>> until someone starts to enforce legal actions and I think it's safe to back
>>> off to taking down these functionalities in the Apache/MIT code.
>>>
>>> Regards,
>>> Liling
>>>
>>> On Wed, Apr 11, 2018 at 12:09 AM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>> we can't change the license, or dual license it, without the agreement of
>>> everyone who's contributed to Moses. Too much work
>>>
>>> Hieu Hoang
>>> http://moses-smt.org/
>>>
>>>
>>> On 10 April 2018 at 15:47, liling tan <alvati...@gmail.com> wrote:
>>> Dear Moses Dev,
>>>
>>> NLTK has a Python port of the word tokenizer in Moses. The tokenizer works
>>> well in Python and creates a good synergy to bridge Python users to the code
>>> that Moses developers have spent years to hone.
>>>
>>> But it seemed to have hit a wall with some licensing issues.
>>> https://github.com/nltk/nltk/issues/2000
>>>
>>> General port of LGPL code is considered derivative and is incompatible with
>>> Apache or MIT license. I understand that LGPL keeps derivatives from being
>>> proprietary but it's a little less permissive than non-copyleft licenses
>>> like Apache and MIT licenses.
>>>
>>> Note that this licensing issue might also affect Marian which is MIT
>>> license and also incompatible with LGPL so although technically users can
>>> chain the code from different libraries, Marian couldn't have any
>>> dependencies on the Moses components.
>>> (But we do know that none of our
>>> models built with Marian would work without the Moses tokenizer which is in
>>> LGPL).
>>>
>>> Would there be a possibility to dual license the Moses repository with LGPL
>>> and Apache/BSD/MIT license? I'm not sure whether it's allowed to have dual
>>> licenses with LGPL and Apache/BSD/MIT license though. Might have to check
>>> with some proper legal personnel though.
>>>
>>> If dual license is not possible would it be possible to relicense the code
>>> under BSD/Apache/MIT license? That way it's more permissive for derivative
>>> work?
>>>
>>> I think the last scenario is for NLTK to drop the Python port of Moses code
>>> entirely from Apache license repository but I think that'll remove the
>>> synergy between various OSS.
>>>
>>> Hope to hear from Moses devs soon!
>>>
>>> Regards,
>>> Liling
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> Moses-support@mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> When a place gets crowded enough to require ID's, social collapse is not
> far away. It is time to go elsewhere. The best thing about space travel
> is that it made it possible to go elsewhere.
> -- R.A. Heinlein, "Time Enough For Love"
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support