Hi Miguel,

Two tools you can try are:

Hunalign: https://github.com/danielvarga/hunalign
Bleualign: https://github.com/rsennrich/Bleualign

I don’t know exactly what the effect of wildly different sentence lengths is, though.

Regards,
Mathias

> On 20 Apr 2018, at 09:24, Miguel Domingo <mido...@prhlt.upv.es> wrote:
>
> Good morning,
>
> I have two documents which have the same text (in different languages) but
> different structure (one language was written using very short sentences
> while the other was written using longer sentences). Does anybody know of a
> tool with which to align the sentences to obtain a parallel corpus suitable
> for MT? (So far I've tried Gargantua, but it's deleting most of the text.)
>
> Thanks in advance,
>
> Miguel
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support