[ 
https://issues.apache.org/jira/browse/PIG-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168014#comment-13168014
 ] 

xuting zhao commented on PIG-2291:
----------------------------------

I think this problem is caused by the special nature of the DUMP operation. 
The Pig tutorial says:

With multi-query execution, you want to use STORE to save (persist) your 
results. You do not want to use DUMP as it will disable multi-query execution 
and is likely to slow down execution. (If you have included DUMP statements in 
your scripts for debugging purposes, you should remove them.)

DUMP Example: In this script, because the DUMP command is interactive, the 
multi-query execution will be disabled and two separate jobs will be created to 
execute this script. The first job will execute A > B > DUMP while the second 
job will execute A > B > C > STORE.

A = LOAD 'input' AS (x, y, z);
B = FILTER A BY x > 5;
DUMP B;
C = FOREACH B GENERATE y, z;
STORE C INTO 'output';

Similarly, adding dump B to the embedded script leads to two logical plans 
being executed. The second logical plan is empty and returns an unsuccessful 
status to the exec function in BoundScript. I am not sure we need to fix this, 
because it seems to be "expected" behavior of the dump operation.
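
To make the explanation above concrete, here is a toy model in plain Python. It is only a sketch of the behavior described in this comment, not Pig's actual parser or planner: the function name split_into_batches is hypothetical, and it tracks only the statements newly added after each DUMP (it does not model the second job re-running A > B as in the tutorial example). It assumes every DUMP closes the current batch, so a script ending in dump leaves an empty trailing batch, which corresponds to the empty second logical plan.

```python
def split_into_batches(statements):
    """Group statements into batches, closing a batch at every DUMP.

    Toy model only: it mirrors the description in this comment (each DUMP
    forces the statements so far to run as their own plan). It is not how
    Pig itself parses or plans a script.
    """
    batches = []
    current = []
    for stmt in statements:
        current.append(stmt)
        if stmt.strip().lower().startswith("dump "):
            batches.append(current)
            current = []
    # Whatever follows the last DUMP forms the final batch, possibly empty.
    batches.append(current)
    return batches


# The a.pig script from the issue description.
a_pig = [
    "A = LOAD 'a1' USING PigStorage(',') AS (f1:chararray,f2:chararray);",
    "B = GROUP A by f1;",
    "dump B;",
]

batches = split_into_batches(a_pig)
for i, batch in enumerate(batches, 1):
    print("plan %d: %d new statement(s)" % (i, len(batch)))
```

In this model the script from the issue yields a second batch with no statements, while the tutorial example (DUMP followed by FOREACH and STORE) yields a second batch with two statements.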
                
> PigStats.isSuccessful returns false if embedded pig script has dump
> -------------------------------------------------------------------
>
>                 Key: PIG-2291
>                 URL: https://issues.apache.org/jira/browse/PIG-2291
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Vivek Padmanabhan
>            Assignee: xuting zhao
>             Fix For: 0.11
>
>
> Below is my Python script:
> {code}
> #! /usr/bin/python
> from  org.apache.pig.scripting import Pig
> P = Pig.compileFromFile("""a.pig""")
> result = P.bind().runSingle()
> if result.isSuccessful():
>     print 'Pig job succeeded'
> else:
>     print 'Pig job failed'
> {code}
> Below is the embedded Pig script (a.pig):
> A = LOAD 'a1' USING PigStorage(',') AS (f1:chararray,f2:chararray);
> B = GROUP A by f1;
> dump B;
> For this script execution, even though the job is successful, the output 
> printed is 'Pig job failed'.
> This is because result.isSuccessful() returns false whenever the Pig 
> script has a dump statement.
> If I run the Pig script alone, the returned error code is correct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
