[ 
https://issues.apache.org/jira/browse/PIG-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510109#comment-15510109
 ] 

Koji Noguchi commented on PIG-4976:
-----------------------------------

Nandor, granted I've never used this feature myself, but from 
http://pig.apache.org/docs/r0.16.0/basic.html#define-udfs
I'm guessing {{define CMD `perl kk.pl` output('foo')}} means the streaming 
command (here, perl) writes all of its output to the file 'foo'.  
PigStreaming would then deserialize that output into tuples and pass them on 
to the next step.

It is the responsibility of the streaming process to create the file.  I don't 
want the framework to create an empty output file and risk false positives.  
(With the current code it should still fail, but why risk it?)
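
To make the point above concrete, here is a rough sketch in Python.  It is not 
Pig's code: {{read_streaming_output}} and its parameters are hypothetical 
names, and it only assumes the behaviour I guessed at above (the command writes 
its results to a named file and the caller reads them back afterwards).  The 
caller never creates the file itself, so a command that dies early cannot be 
mistaken for one that produced empty output.

```python
import os
import subprocess
import sys
import tempfile


def read_streaming_output(argv, output_path, stdin_data=b""):
    """Run a streaming command and return the bytes it wrote to output_path.

    Hypothetical helper for illustration only; not part of Pig.
    """
    result = subprocess.run(argv, input=stdin_data, capture_output=True)

    # A non-zero exit status (e.g. a script with a syntax error) is a failure.
    if result.returncode != 0:
        raise RuntimeError(
            "streaming command exited with status %d" % result.returncode
        )

    # Deliberately do NOT create output_path here. If the command did not
    # write it, treat that as a failure instead of as empty output.
    if not os.path.exists(output_path):
        raise RuntimeError("streaming command did not create %s" % output_path)

    with open(output_path, "rb") as handle:
        return handle.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        out = os.path.join(workdir, "foo")
        # Stand-in for a broken script: exits non-zero without writing 'foo'.
        broken = [sys.executable, "-c", "import sys; sys.exit(2)"]
        try:
            read_streaming_output(broken, out)
        except RuntimeError as err:
            print("failed as expected:", err)
```

If the framework pre-created 'foo', the existence check would pass even when 
the command never ran properly, which is the false positive I would like to 
avoid.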

> streaming job with store clause stuck if the script fail
> --------------------------------------------------------
>
>                 Key: PIG-4976
>                 URL: https://issues.apache.org/jira/browse/PIG-4976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.17.0
>
>         Attachments: PIG-4976-1.patch, PIG-4976-2.patch, PIG-4976-3.patch, 
> PIG-4976-4.patch, PIG-4976-5-knoguchi.patch
>
>
> When investigating PIG-4972, I also noticed that a Pig job gets stuck when 
> the Perl script has a syntax error. This happens if we have an output clause 
> in the stream specification (meaning a file is used as staging). The bug 
> exists in both Tez and MR, and it is not a regression.
> Here is an example:
> {code}
> define CMD `perl kk.pl` output('foo') ship('kk.pl');
> A = load 'studenttab10k' as (name, age, gpa);
> B = foreach A generate name;
> C = stream B through CMD;
> store C into 'ooo';
> {code}
> kk.pl is any Perl script containing a syntax error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
