[
https://issues.apache.org/jira/browse/PIG-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507822#comment-15507822
]
Koji Noguchi commented on PIG-4976:
-----------------------------------
bq. Ok, so in this case it seems that the file is not getting created in
FileOutputHandler:
The file not being created is expected. The streaming process failed, so I'd
rather keep it that way than risk a false positive from an empty output
file.
Trying to sort out the issue. So far, I've only tested on my macbook.
I saw two code paths for the error.
(1) With small input (including the test from Daniel's patch),
it fails in close() at
{code}
308 if (inp != null && inp.returnStatus == POStatus.STATUS_EOP) {
309     // signal cleanup in ExecutableManager
310     close();
311     return;
312 }
{code}
which then calls
-> ExecutableManager.close():{{114 inputHandler.close(process);}}
which somehow passes, and then
-> ExecutableManager.close():{{160 outputHandler.bindTo("", null, 0, -1);}}
fails with
{noformat}
2016-09-20 16:59:42,425 [Thread-30] ERROR org.apache.pig.impl.streaming.ExecutableManager - Error while reading from POStream and passing it to thes
java.io.FileNotFoundException: foo (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.pig.impl.streaming.FileOutputHandler.bindTo(FileOutputHandler.java:57)
at org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:160)
at org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:131)
at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:310)
{noformat}
The outer catch block then calls killProcess, which also fails with
{noformat}
Exception in thread "Thread-30" java.lang.NullPointerException
at org.apache.pig.impl.streaming.OutputHandler.close(OutputHandler.java:178)
at org.apache.pig.impl.streaming.ExecutableManager.killProcess(ExecutableManager.java:184)
at org.apache.pig.impl.streaming.ExecutableManager.access$200(ExecutableManager.java:52)
at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:368)
{noformat}
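To make the failure sequence in (1) concrete, here is a minimal sketch, not Pig's actual code. It assumes the NullPointerException comes from closing a stream field that was never set because the earlier bind failed; that cause is a guess from the two traces, and SketchOutputHandler and its members are hypothetical names.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for an output handler; not Pig's actual class.
class SketchOutputHandler implements Closeable {
    private InputStream in; // stays null if bindTo() fails

    // Opens the staging file. FileInputStream throws FileNotFoundException
    // (an IOException) if the streaming process never created the file.
    void bindTo(File stagingFile) throws IOException {
        in = new FileInputStream(stagingFile);
    }

    boolean isBound() {
        return in != null;
    }

    // Null-safe close: safe to call from a cleanup path even when
    // bindTo() never succeeded.
    @Override
    public void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;
        }
    }
}
```

With a guard like this, a cleanup call after a failed bind would be a no-op instead of raising a second exception that hides the original FileNotFoundException.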
(2) When the input is large, {{326 inputHandler.putNext(t);}} threw an
exception
{code}
324 try {
325     t = (Tuple) inp.result;
326     inputHandler.putNext(t);
327 } catch (IOException e) {
328     // if input type is synchronous then it could
329     // be related to the process terminating
330     if(inputHandler.getInputType() == InputType.SYNCHRONOUS) {
331         LOG.warn("Exception while trying to write to stream binary's input", e);
...
343         close();
344         return;
{code}
{noformat}
2016-09-20 17:07:12,362 [Thread-30] WARN org.apache.pig.impl.streaming.ExecutableManager - Exception while trying to write to stream binary's input
java.io.IOException: Stream closed
at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
at java.io.OutputStream.write(OutputStream.java:116)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.pig.impl.streaming.InputHandler.putNext(InputHandler.java:72)
at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:326)
{noformat}
Then the {{343 close();}} call inside the catch block failed at
ExecutableManager.close():{{114 inputHandler.close(process);}} with
{noformat}
2016-09-20 17:07:12,365 [Thread-30] ERROR org.apache.pig.impl.streaming.ExecutableManager - Error while reading from POStream and passing it to thes
java.io.IOException: Stream closed
at java.lang.ProcessBuilder$NullOutputStream.write(ProcessBuilder.java:433)
at java.io.OutputStream.write(OutputStream.java:116)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at org.apache.pig.impl.streaming.InputHandler.close(InputHandler.java:93)
at org.apache.pig.impl.streaming.DefaultInputHandler.close(DefaultInputHandler.java:50)
at org.apache.pig.impl.streaming.ExecutableManager.close(ExecutableManager.java:114)
at org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager.close(HadoopExecutableManager.java:131)
at org.apache.pig.impl.streaming.ExecutableManager$ProcessInputThread.run(ExecutableManager.java:343)
{noformat}
Then you would also see killProcess fail with a NullPointerException, just as
in (1).
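In both paths one failing close step aborts the rest of the cleanup. As an illustration only, and not taken from any of the attached patches, a cleanup helper can attempt every close and report the first failure afterwards. CleanupSketch and closeAll are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper; not part of Pig.
class CleanupSketch {
    // Closes every non-null resource even if an earlier one throws.
    // Returns the first IOException (later ones attached as suppressed),
    // or null if everything closed cleanly.
    static IOException closeAll(Closeable... resources) {
        IOException first = null;
        for (Closeable r : resources) {
            if (r == null) {
                continue; // never opened, nothing to close
            }
            try {
                r.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        return first;
    }
}
```

A caller could log or rethrow the returned exception once all handlers have been given a chance to close, so a "Stream closed" error on the input side would not prevent the output side from being released.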
Daniel's {{PIG-4976-1.patch}} and {{PIG-4976-2.patch}} both handle issue (2),
at different levels. I am not sure why the provided testcase is hitting (1) in
my environment but (2) in Daniel's environment.
> streaming job with store clause stuck if the script fail
> --------------------------------------------------------
>
> Key: PIG-4976
> URL: https://issues.apache.org/jira/browse/PIG-4976
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.17.0
>
> Attachments: PIG-4976-1.patch, PIG-4976-2.patch, PIG-4976-3.patch,
> PIG-4976-4.patch
>
>
> When investigating PIG-4972, I also noticed that the Pig job gets stuck when
> the perl script has a syntax error. This happens if we have an output clause
> in the stream specification (meaning a file is used as staging). The bug
> exists in both Tez and MR, and it is not a regression.
> Here is an example:
> {code}
> define CMD `perl kk.pl` output('foo') ship('kk.pl');
> A = load 'studenttab10k' as (name, age, gpa);
> B = foreach A generate name;
> C = stream B through CMD;
> store C into 'ooo';
> {code}
> kk.pl is any perl script containing a syntax error.