Hi All,

I apologise in advance if this kind of question has no place on this list; it
is more Europarl-related than Moses-related.

When I run the sentence-align script that ships with the source release of
version 3 of Europarl, I get a lot of 'different number of paragraphs' messages.
Does anybody know why differing paragraph counts are so common? 1-n sentence
alignment is understandable, but I was unaware that 1-n paragraph matching was
such a common thing.

Does anybody know of any attempts to automatically align paragraphs in the
corpus? It seems a shame to filter out so much text just because the numbers
of paragraphs don't match.
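
To make the question concrete, the kind of thing I have in mind is a simple
length-based alignment that allows 1-1, 1-2 and 2-1 paragraph matches. The
sketch below is only an illustration under my own assumptions: the function
name align_paragraphs, the cost function and the 0.1 merge penalty are all
hypothetical, the paragraphs are assumed to be already split into Python
lists, and none of this is what the Europarl script itself does.

```python
def align_paragraphs(src, tgt):
    """Align two lists of paragraphs by character length (hypothetical sketch).

    Allowed matches are 1-1, 1-2 and 2-1. Returns a list of
    (source_indices, target_indices) pairs, or None if the two sides
    cannot be covered with those match types.
    """
    inf = float("inf")
    n, m = len(src), len(tgt)
    src_len = [len(p) for p in src]
    tgt_len = [len(p) for p in tgt]

    def cost(a, b, merged):
        # Relative length difference, plus an assumed small penalty
        # so that 1-1 matches are preferred when lengths agree.
        return abs(a - b) / max(a + b, 1) + (0.1 if merged else 0.0)

    moves = [(1, 1), (1, 2), (2, 1)]
    best = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    # Forward dynamic programme over prefixes of both paragraph lists.
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == inf:
                continue
            for di, dj in moves:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                step = cost(sum(src_len[i:ni]), sum(tgt_len[j:nj]),
                            (di, dj) != (1, 1))
                if best[i][j] + step < best[ni][nj]:
                    best[ni][nj] = best[i][j] + step
                    back[ni][nj] = (i, j)

    if best[n][m] == inf:
        return None

    # Walk the back-pointers from the end to recover the matches.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads
```

For example, if one side has a long paragraph that the other side splits in
two, align_paragraphs(["aaaa aaaa aaaa aaaa", "bbbb"], ["aaaa aaaa",
"aaaa aaaa", "bbbb"]) pairs source paragraph 0 with target paragraphs 0 and 1,
and source paragraph 1 with target paragraph 2. Length alone is obviously a
crude signal, which is why I am asking whether something better already exists.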
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
