Dear friends,

Thank you very much for your earlier help; I hope you can help me again. I am trying to build an Arabic-English SMT system with Moses, but during training GIZA++ does not produce the alignment. I believe the cause is that the UN ar-en corpus is not well cleaned: the two sides are not parallel, because they do not have the same number of lines. I am working with the 2000 directory (2000ar and 2000en). Has anyone worked with the UN ar-en corpus? How can I make the Arabic and English sides of 2000 have the same number of lines, so that I can get past the cleaning step?
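To narrow the problem down, a first step could be to find which files actually differ in line count. The following is only a minimal sketch: it assumes that 2000ar and 2000en contain files with matching names on both sides (I have not confirmed this for the UN corpus), and the helper names `count_lines` and `mismatched_pairs` are made up for illustration. It only reports mismatches; it does not repair the alignment.

```python
import os


def count_lines(path):
    """Count newline characters in a file, like `wc -l`.

    The file is read as bytes, so the Arabic side does not need decoding.
    A final line without a trailing newline is not counted.
    """
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total


def mismatched_pairs(ar_dir, en_dir):
    """Return (name, ar_lines, en_lines) for same-named files whose counts differ.

    Assumption: a file in ar_dir is paired with the file of the same name
    in en_dir. Names present on only one side are ignored here.
    """
    common = sorted(set(os.listdir(ar_dir)) & set(os.listdir(en_dir)))
    result = []
    for name in common:
        ar_path = os.path.join(ar_dir, name)
        en_path = os.path.join(en_dir, name)
        if not (os.path.isfile(ar_path) and os.path.isfile(en_path)):
            continue  # skip subdirectories and other non-regular entries
        ar_n, en_n = count_lines(ar_path), count_lines(en_path)
        if ar_n != en_n:
            result.append((name, ar_n, en_n))
    return result
```

Calling `mismatched_pairs("2000ar", "2000en")` from the corpus directory would list the file pairs that need attention. Pairs that differ would then have to be re-aligned at the sentence level or left out before the cleaning step.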
Thank you in advance; I hope you can answer my question.
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
