Re: [Moses-support] Multilingually Sentence-Aligned Corpora

2016-01-22 Thread Jörg Tiedemann
The DGT translation memories are truly multilingually aligned: https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory Otherwise you would have several multilingual corpora in OPUS even though they are all bilingually aligned. In most cases it is quite straightforward to combin

Re: [Moses-support] Multilingually Sentence-Aligned Corpora

2016-01-22 Thread Graham Neubig
Hi Marcin, Wow, that would be really excellent. I'm looking forward to it! Graham On Fri, Jan 22, 2016 at 10:36 AM, Marcin Junczys-Dowmunt wrote: > Hi Graham, > At the UN we are now working to release an official version of our data. > As a bonus to the pair-wise alignment, it will contain a 6

Re: [Moses-support] Multilingually Sentence-Aligned Corpora

2016-01-22 Thread Lane Schwartz
Graham, I'm in the process of developing a multi-lingual sentence aligner. I'm planning to use it on Europarl, which is currently NOT sentence-aligned in any multi-lingual way. Lane On Fri, Jan 22, 2016 at 9:26 AM, Graham Neubig wrote: > Dear Moses Mailing List, > > This is not directly rela

Re: [Moses-support] Multilingually Sentence-Aligned Corpora

2016-01-22 Thread Lane Schwartz
Marcin, That sounds great! Yes, please do make an announcement. I would definitely make use of such a multi-aligned corpus. Lane On Fri, Jan 22, 2016 at 9:36 AM, Marcin Junczys-Dowmunt wrote: > Hi Graham, > At the UN we are now working to release an official version of our data. > As a bonus

Re: [Moses-support] Multilingually Sentence-Aligned Corpora

2016-01-22 Thread Marcin Junczys-Dowmunt
Hi Graham, At the UN we are now working to release an official version of our data. As a bonus to the pair-wise alignment, it will contain a 6-way fully aligned subcorpus for English, French, Spanish, Russian, Chinese, Arabic; about 13M segments per language. We are waiting for some LREC feedb

[Moses-support] Multilingually Sentence-Aligned Corpora

2016-01-22 Thread Graham Neubig
Dear Moses Mailing List, This is not directly related to Moses, but I was wondering if there are any high-quality, multi-lingually sentence aligned corpora available (i.e. 3 or more languages with aligned sentences). We're aware of the Europarl and Bible corpora, but Europarl only covers European