Okay, skipping over all the noise on this thread, I think we should take Ross up on his generous offer. I agree Parashuram would be in a good position as well, but he is not yet on the PMC. Starting a private vote thread.

Cheers,
  Jesse

@purplecabbage
risingj.com

On Fri, May 30, 2014 at 12:06 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> On Fri, May 30, 2014 at 1:29 AM, Josh Soref <jso...@blackberry.com> wrote:
>
> > Ted Dunning wrote:
> > > Also, if you choose to switch to a different translator at some point, it
> > > is likely that they will use the previous translations as the base for a
> > > translation memory even if humans are doing the translation. That counts
> > > as the project using the text to train a translation engine.
> >
> > I don't think that counts.
> >
> > ...
> >
> > If "my translation app" takes your X->Y and uses it to apply to the next
> > application it sees, then it's opening itself up to some really bad
> > poisoning models. Because there's a lot of garbage that will be uploaded
> > into translation engines. I'd be shocked if anyone actually did this.
>
> Google translate does this. They detect parallel text on the web and build
> language and translation models using techniques that go back, more or
> less, to the Candide work at IBM. The really big addition that Google made
> is that they can and do detect parallel text that is not explicitly marked
> as parallel.
>
> This means that if somebody translated the Cordova stuff later using Google
> translate, it would likely include this earlier Bing content.
>
> > And yes, I do maintain translation tools. My tools certainly wouldn't do
> > this. I maintain translation tools because I've seen the quality of
> > translations, and they're awful.
>
> I trust you about your tools. That doesn't imply others avoid doing this.
>
> > The goal of such a restriction is to prevent someone from using this
> > output as a basis for making another generic translation tool.
> >
> > If someone takes a document from Spanish, and uses Bing to translate it
> > into French, Microsoft is not going to complain if someone later takes
> > that (French) document and translates it into Italian, no matter who does
> > the translation.
> >
> > They're only concerned if someone takes the mapping between Spanish words
> > and French words and uses it on an unrelated corpus / to improve their
> > handling of unrelated corpora.
>
> Do you say this from actual knowledge of Microsoft's intent? Or are you
> depending on what you read here?