[ 
https://issues.apache.org/jira/browse/PIG-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164724#comment-13164724
 ] 

Ashutosh Chauhan commented on PIG-2374:
---------------------------------------

We should instead push for Hadoop to keep getBytes() backward compatible. The
fix in this patch forces an extra buffer copy in Pig, which is an unnecessary
performance hit.
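
A minimal sketch of the kind of copy being discussed, not taken from the patch. The thread does not say which class's getBytes() is involved or how its behavior changed, so this assumes the common case where getBytes() returns a reused backing array that can be longer than the valid data. GrowableBuffer and validBytesWithCopy are hypothetical names for illustration only; neither is a Hadoop or Pig API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BufferCopySketch {

    /** Hypothetical stand-in for a reusable byte buffer; NOT a Hadoop class. */
    static final class GrowableBuffer {
        private byte[] data = new byte[0];
        private int length = 0;

        /** Stores src, growing the backing array when needed but never shrinking it. */
        void set(byte[] src) {
            if (src.length > data.length) {
                data = new byte[src.length];
            }
            System.arraycopy(src, 0, data, 0, src.length);
            length = src.length;
        }

        /** Returns the backing array, which may hold stale bytes past getLength(). */
        byte[] getBytes() {
            return data;
        }

        int getLength() {
            return length;
        }
    }

    /** Defensive caller: one extra allocation and copy per record. */
    static byte[] validBytesWithCopy(GrowableBuffer buf) {
        return Arrays.copyOf(buf.getBytes(), buf.getLength());
    }

    public static void main(String[] args) {
        GrowableBuffer buf = new GrowableBuffer();
        buf.set("longer-record".getBytes(StandardCharsets.US_ASCII));
        buf.set("short".getBytes(StandardCharsets.US_ASCII));

        // Treating the whole backing array as valid leaks stale bytes.
        System.out.println(new String(buf.getBytes(), StandardCharsets.US_ASCII));
        // prints "shortr-record"

        // Copying only the valid prefix is correct but costs an extra buffer.
        System.out.println(new String(validBytesWithCopy(buf), StandardCharsets.US_ASCII));
        // prints "short"
    }
}
```

If getBytes() itself guaranteed an array of exactly the valid length, callers would not need the second step, which is the extra copy the comment objects to.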
                
> streaming regression with dotNext
> ---------------------------------
>
>                 Key: PIG-2374
>                 URL: https://issues.apache.org/jira/browse/PIG-2374
>             Project: Pig
>          Issue Type: Bug
>         Environment: hadoop
> Apache Pig version 0.9.2.1111101150 (r1200499)
> compiled Nov 10 2011, 19:50:15
>  -bash-3.1$ hadoop version
> Hadoop 0.23.0.1111080202
> Subversion 
> http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.23.0/hadoop-common-project/hadoop-common
>  -r 1196973
> Compiled by hadoopqa on Tue Nov  8 02:12:04 PST 2011
> From source with checksum 4e42b2d96c899a98a8ab8c7cc23f27ae
>            Reporter: Araceli Henley
>            Assignee: Daniel Dai
>              Labels: hadoop2.0
>             Fix For: 0.9.2
>
>         Attachments: PIG-2374-1.patch
>
>
> Streaming seems to be broken in dotNext; several tests are failing.
> In the script below, the output of C is clean, but the output of D, which is
> streamed through CMD, contains control characters in some of the output.
> define CMD `perl GroupBy.pl '\t' 0` 
> ship('/homes/monster/pigtest/pigtest_next/pigharness/dist/pig_harness/libexec/PigTest/GroupBy.pl');
> A = load '/user/user1/pig/tests/data/singlefile/studenttab10k';
> B = group A by $0;
> C = foreach B generate flatten(A);
> D = stream C through CMD;
> store C into '/user/user1/pig/out/user1.1321117428/ComputeSpec_7_C.out';
> store D into '/user/user1/pig/out/user1.1321117428/ComputeSpec_7_D.out';
> Other streaming tests that fail with control characters:
> TEST FAILED <ComputeSpec_7>
> TEST FAILED <ComputeSpec_8>
> TEST FAILED <ComputeSpec_10>
> TEST FAILED <ComputeSpec_11>
> TEST FAILED <ComputeSpec_12>
> TEST FAILED <JobManagement_2>
> TEST FAILED <JobManagement_3>
> TEST FAILED <StreamingIO_4>
> TEST FAILED <NonStreaming_1>
> TEST FAILED <MultiQuery_21>
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
