Hey Evelyn,
A thread from a few days ago on Chinese tokenization may be useful to you:
http://article.gmane.org/gmane.comp.nlp.moses.user/2563
As Miles mentioned, tokenization is critical to good MT performance.
Many people would be very grateful if you could integrate some good
tokenizers into the Moses toolkit. If you're feeling ambitious, create a
C++ framework where people can more easily plug in new tokenizers.
On 15/02/2010 21:26, Evelyn Teo wrote:
Hello Moses Team,
My name is Evelyn Teo. I am a graduate student in the Translation and
Localization Management program at the Monterey Institute of
International Studies in California. My working languages are English,
Chinese and Bahasa Indonesia.
I am currently working on my research paper on the role of MT in the
industry and would love to participate in the Moses projects. Do let
me know if you have any projects that I can contribute to!
--Evelyn
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support