It was because the content I read from the file is encoded in UTF-8. I used Text.decode to decode it back into a plain text string, and the problem is gone now.
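
For reference, here is a minimal sketch of the idea, not necessarily the exact code used. It relies only on the JDK so that it runs without a Hadoop installation: `new String(bytes, StandardCharsets.UTF_8)` stands in for Hadoop's `Text.decode(byte[])`, which serves the same purpose of turning UTF-8 bytes into a `String`. The class name `Utf8DecodeDemo`, the helper `decodeUtf8`, and the sample content are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

class Utf8DecodeDemo {
    // Turn raw UTF-8 bytes into a String.
    // JDK stand-in for Hadoop's Text.decode(byte[]) (assumption: same intent, not the same code).
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for bytes read from a file in HDFS.
        byte[] raw = "key\t42\n".getBytes(StandardCharsets.UTF_8);

        // Printing the array itself shows its type and identity hash, not the file content.
        System.out.println(raw);

        // Decoding first yields the text itself, which can then be printed or compared.
        System.out.print(decodeUtf8(raw));
    }
}
```

Once the bytes are decoded, the result is an ordinary `String`, so comparisons with `equals` work as expected.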
-Gang

----- Original Message -----
From: Gang Luo <lgpub...@yahoo.com.cn>
To: common-user@hadoop.apache.org
Sent: 2009/11/10 (Tue) 12:14:44 AM
Subject: Re: Re: how to read file in hadoop

I downloaded it to my local filesystem. The content is correct; I can see it either from the command line or in a text editor. So I think the file itself has no problem.

--Gang

----- Original Message -----
From: Jeff Zhang <zjf...@gmail.com>
To: common-user@hadoop.apache.org
Sent: 2009/11/9 (Mon) 11:58:22 PM
Subject: Re: Re: how to read file in hadoop

Maybe you can download the file to your local filesystem to see what content is there.

Jeff Zhang

2009/11/10 Gang Luo <lgpub...@yahoo.com.cn>

> Since there has been no response to this question so far, I'd like to
> describe it in more detail.
>
> I am trying to read a file in HDFS and copy it to another file. That works
> well, and the content shown by 'cat' is what it is supposed to be. The only
> problem is that when I read it into a byte[] and print it to stdout, it is
> NOT what it should be. Thus, I cannot do anything with it (e.g. comparison)
> except write it directly to another file.
>
> I guess this problem may be due to the file format setting (text or binary)
> or the encoding (e.g. UTF-8). Can someone give me some ideas?
>
> --Gang
>
> ----- Original Message -----
> From: Gang Luo <lgpub...@yahoo.com.cn>
> To: common-user@hadoop.apache.org
> Sent: 2009/11/9 (Mon) 11:47:02 AM
> Subject: how to read file in hadoop
>
> Hi all,
> I want to use the HDFS IO API to read a result file of the previous MapReduce
> job. But what I read is not what is in that file: the content I print to
> stdout is different from what I get on the console with the 'cat' command. I
> guess there may be some problem with the file format (binary or text). Can
> anyone give me some hints?
>
> Gang Luo