You need to find out which encoding the output is in. This page says MeCab
outputs EUC-JP:
   https://forum.koohii.com/thread-8952.html
But running
   iconv -f EUCJP  -t UTF-8 temp_rs
doesn't seem to fix it.
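
One way to narrow it down, as a rough sketch rather than a definitive answer: read the raw bytes of temp_rs and see which encodings can decode them without errors. The helper names (guess_encodings, to_utf8) and the candidate list below are my own assumptions, not anything provided by MeCab or Moses.

```python
# Hypothetical helpers for inspecting MeCab output; not part of MeCab or Moses.
# The candidate list is an assumption: encodings commonly used for Japanese text.
CANDIDATES = ["utf-8", "euc_jp", "shift_jis", "iso2022_jp"]


def guess_encodings(data, candidates=CANDIDATES):
    """Return the candidate encodings that decode `data` without an error."""
    usable = []
    for encoding in candidates:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        usable.append(encoding)
    return usable


def to_utf8(data, source_encoding):
    """Re-encode bytes from `source_encoding` to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")
```

To try it, read the file in binary mode, e.g. data = open("temp_rs", "rb").read(), and pass data to guess_encodings. More than one candidate can succeed (pure ASCII decodes under all of them), so treat the result as a hint and check the decoded text by eye before converting.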

You should ask the MeCab developers to get a definitive answer.

Hieu Hoang
http://moses-smt.org/


On 22 June 2017 at 05:44, Ngô Thị Vinh <ntv...@ictu.edu.vn> wrote:

> Hi all,
> Has anyone ever run Japanese-English experiments using
> mecab to tokenize the Japanese side?
> I tested mecab on a simple file, temp, from the terminal like this:
> mecab -O wakati temp -o temp_rs
> but temp_rs does not display correctly (it looks like a font problem),
> even though I have installed a Japanese font.
> Please help me solve the problem if you know how.
> Thank you!
>
> --
> *Ngô Thị Vinh*
> Faculty of Electronics and Communications,
> Thai Nguyen University of Information and Communication Technology (ICTU).
> TEL: 0987 706 830
> Email:   *ntv...@ictu.edu.vn <ptng...@ictu.edu.vn>*
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>