[ https://issues.apache.org/jira/browse/PIG-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892625#comment-13892625 ]
Cheolsoo Park commented on PIG-3679:
------------------------------------
[~daijy], unfortunately, your patch introduces another issue. The e2e test still
fails, but for a different reason. Here is the stack trace:
{code}
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Failed adding input to inputQueue
at org.apache.pig.impl.builtin.StreamingUDF.getOutput(StreamingUDF.java:328)
at org.apache.pig.impl.builtin.StreamingUDF.exec(StreamingUDF.java:150)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:328)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDouble(POUserFunc.java:394)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:318)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.NullPointerException
at java.util.concurrent.ArrayBlockingQueue.checkNotNull(ArrayBlockingQueue.java:145)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:319)
at org.apache.pig.impl.builtin.StreamingUDF.getOutput(StreamingUDF.java:326)
... 17 more
{code}
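For reference, the root cause at the bottom of the trace is standard java.util.concurrent behavior: ArrayBlockingQueue does not accept null elements, so put(null) throws a NullPointerException. The standalone sketch below reproduces only that behavior. The class name NullPutDemo and the helper putThrowsNpe are made up for illustration; this is not the StreamingUDF code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class NullPutDemo {
    // Hypothetical helper: returns true if putting the element into a fresh
    // ArrayBlockingQueue throws NullPointerException.
    static boolean putThrowsNpe(Object element) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(1);
        try {
            queue.put(element);
            return false;
        } catch (NullPointerException e) {
            // ArrayBlockingQueue rejects null elements.
            return true;
        } catch (InterruptedException e) {
            // Not expected here: the queue has free capacity, so put() does not block.
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("put(null) throws NPE: " + putThrowsNpe(null));
        System.out.println("put(\"x\") throws NPE: " + putThrowsNpe("x"));
    }
}
```

So any code path that hands a null value to the queue would need a null check, or the null would need to be handled before it gets that far.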
To be clear, I don't think POUserFunc filtered out nulls before PIG-3568,
because PIG-3568 didn't modify POUserFunc. I believe (?) it was one of the
expression operators in the following plan that filtered out nulls:
{code}
|   POUserFunc(org.apache.pig.builtin.DoubleRound)[long] - scope-25
|   |
|   |---Multiply[double] - scope-24
|       |
|       |---POBinCond[double] - scope-21
|       |   |
|       |   |---Equal To[boolean] - scope-15
|       |   |   |
|       |   |   |---Project[chararray][2] - scope-13
|       |   |   |
|       |   |   |---Constant(true) - scope-14
|       |   |
|       |   |---Project[double][1] - scope-16
|       |   |
|       |   |---Add[double] - scope-20
|       |       |
|       |       |---Project[double][1] - scope-17
|       |       |
|       |       |---Cast[double] - scope-19
|       |           |
|       |           |---Constant(1) - scope-18
|       |
|       |---Cast[double] - scope-23
|           |
|           |---Constant(10000) - scope-22
{code}
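As a rough illustration of how a null gpa could travel up a plan of this shape and reach the UDF at the top, here is a small Java model. It assumes Pig's documented rule that an arithmetic expression with a null operand evaluates to null. The class NullPropagationSketch and all of its methods are hypothetical stand-ins for the operators in the plan, not Pig's actual operator classes, and the null guard in round is only one possible way to handle the input.

```java
public class NullPropagationSketch {
    // Stand-in for Add[double]: a null operand makes the result null.
    static Double add(Double a, Double b) {
        return (a == null || b == null) ? null : a + b;
    }

    // Stand-in for Multiply[double]: a null operand makes the result null.
    static Double multiply(Double a, Double b) {
        return (a == null || b == null) ? null : a * b;
    }

    // Stand-in for POBinCond[double]: a null condition is treated as a null result.
    static Double binCond(Boolean cond, Double ifTrue, Double ifFalse) {
        if (cond == null) {
            return null;
        }
        return cond ? ifTrue : ifFalse;
    }

    // Stand-in for a rounding UDF that guards against null input.
    static Long round(Double d) {
        return d == null ? null : Math.round(d);
    }

    // Models ROUND((instate=='true' ? gpa : gpa+1) * 10000).
    static Long evaluate(String instate, Double gpa) {
        Boolean cond = (instate == null) ? null : instate.equals("true");
        Double chosen = binCond(cond, gpa, add(gpa, 1.0));
        return round(multiply(chosen, 10000.0));
    }

    public static void main(String[] args) {
        System.out.println(evaluate("true", 3.5));
        System.out.println(evaluate("false", 3.5));
        System.out.println(evaluate("true", null));
    }
}
```

In this model nothing below the UDF drops the null, so the function at the top has to cope with it. Whether that matches what the real operators did before PIG-3568 is exactly the open question.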
I am not entirely sure whether we can bring back the old behavior in every case
where PIG-3568 breaks backward compatibility. We discovered this particular
one, but we don't know how many cases like this exist.
> e2e StreamingPythonUDFs_10 fails in trunk
> -----------------------------------------
>
> Key: PIG-3679
> URL: https://issues.apache.org/jira/browse/PIG-3679
> Project: Pig
> Issue Type: Bug
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.13.0
>
> Attachments: PIG-3679-1.patch, PIG-3679-2.patch, PIG-3679-3.patch
>
>
> The e2e test StreamingPythonUDFs_10 fails in trunk with an NPE:
> {code}
> Caused by: java.lang.NullPointerException
> at org.apache.pig.builtin.DoubleRound.exec(DoubleRound.java:45)
> {code}
> The test query is as follows:
> {code}
> a = load '/user/pig/tests/data/singlefile/allscalar10k' using PigStorage() as (name:chararray, age:int, gpa:double, instate:chararray);
> b = foreach a generate name, ((double)ROUND((instate=='true'?gpa:gpa+1)*10000)) / 10000.0;
> store b into '/user/pig/out/cheolsoop-1390330024-nightly.conf-StreamingPythonUDFs/StreamingPythonUDFs_10_benchmark.out';
> {code}
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)