Hi, 2 million segments is plenty.
This question is generally hard to answer: the more data you have, the better. There has been some success with as little as 1 million words in narrow domains, while systems for news translation typically have at least an order of magnitude more than that.

-phi

On Tue, Feb 17, 2015 at 4:00 AM, Ihab Ramadan <[email protected]> wrote:
> Dears,
>
> I just wonder how much data should I use to say I have enough data to
> build a qualified MT
>
> For example, if I have 2 million segments in the parallel files, is that
> enough?
>
> Thanks
>
> Regards,
> Ihab Ramadan | Senior Developer | Saudisoft-Egypt
> Tel: +2 023 303 2037 - ext 128 | M +2 01007570826 | Fax +2 023 303 2036
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
