Graham,

I'm in the process of developing a multi-lingual sentence aligner. I'm planning to use it on Europarl, which is currently NOT sentence-aligned in any multi-lingual way.
Lane

On Fri, Jan 22, 2016 at 9:26 AM, Graham Neubig <neu...@is.naist.jp> wrote:

> Dear Moses Mailing List,
>
> This is not directly related to Moses, but I was wondering if there are
> any high-quality, multi-lingually sentence aligned corpora available (i.e.
> 3 or more languages with aligned sentences). We're aware of the Europarl
> and Bible corpora, but Europarl only covers European languages, and the
> Bible corpus is quite small in MT terms.
>
> TED and MULTI-UN are options, but as far as I know the data is only
> bilingually aligned at the moment, and it can be a bit hard to get a clean
> multi-lingual corpus from them. If anyone has any experience with this, or
> resource available, I'd love some info.
>
> Thanks in advance,
> Graham
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>

--
When a place gets crowded enough to require ID's, social collapse is not far away. It is time to go elsewhere. The best thing about space travel is that it made it possible to go elsewhere. -- R.A. Heinlein, "Time Enough For Love"