[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788489#action_12788489
 ] 

ZhuGuanyin commented on MAPREDUCE-1277:
---------------------------------------

This patch changes

System.err.println(lineStr);

to
 
System.err.write(line.getBytes(),0,line.getLength());
System.err.println();

I think this can be verified by review; it is not very easy to write a 
test case for this issue.
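
To help with review, the effect of the change can be sketched in plain Java 
without Hadoop. This is only an illustration, not the patch itself: the class 
and method names are hypothetical, and it assumes the old lineStr was produced 
by decoding the raw bytes as UTF-8. The input bytes are the GBK encoding of 
two Chinese characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Hadoop or of the patch.
public class StderrBytesDemo {

    // Builds a PrintStream over an in-memory sink, standing in for System.err.
    static PrintStream utf8Stream(ByteArrayOutputStream sink) {
        try {
            return new PrintStream(sink, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Old-style path (assumed): decode the bytes as UTF-8, then print the String.
    static byte[] viaString(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(sink);
        out.print(new String(raw, StandardCharsets.UTF_8));
        out.flush();
        return sink.toByteArray();
    }

    // Patched-style path: copy the bytes through without decoding them.
    static byte[] viaRawWrite(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(sink);
        out.write(raw, 0, raw.length);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // GBK bytes for two Chinese characters; not valid UTF-8.
        byte[] gbk = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        System.out.println("string path preserved bytes: "
                + Arrays.equals(viaString(gbk), gbk));
        System.out.println("raw write preserved bytes:   "
                + Arrays.equals(viaRawWrite(gbk), gbk));
    }
}
```

The String path replaces the malformed sequences during decoding, so the 
original bytes cannot be recovered; the raw write leaves them untouched.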

Manual steps to check this:

1) Copy a small file to HDFS.

2) Run a streaming job using the following mapper:

#!/bin/sh
cat >/dev/null

echo "㊣ ?※" >&2
echo "礙骯襖壩闆辦" >&2

3) Check the task stderr output; the logs will be corrupted.

4) Apply the patch and run the streaming job again; the task stderr will be 
fine.

This patch is useful when a user needs to write debug messages, for example 
input records that might be encoded in Big5, GBK, and so on.

> Streaming job should support other characterset in user's stderr log, not 
> only utf8
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1277
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1277
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.21.0
>            Reporter: ZhuGuanyin
>             Fix For: 0.21.0
>
>         Attachments: streaming-1277.patch
>
>
> The current streaming implementation supports only UTF-8 encoded user stderr 
> logs; it should be encoding-agnostic so that other character sets are supported.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
