[ 
https://issues.apache.org/jira/browse/PIG-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727017#comment-16727017
 ] 

Rohini Palaniswamy commented on PIG-5372:
-----------------------------------------

This failure is only with Tez or mapreduce as well? I am not able to tell 
exactly why Daniel is doing that in 
https://issues.apache.org/jira/browse/PIG-1467 as the description of jira does 
not have proper stacktrace.  In Tez, we read the quantile file from memory 
(broadcast edge) instead of local file (distributed cache) like in mapreduce. 
So we can override setConf() in SkewedPartitionerTez if this issue is specific 
to Tez.

> SAMPLE/RANDOM(udf) before skewed join failing with NPE
> ------------------------------------------------------
>
>                 Key: PIG-5372
>                 URL: https://issues.apache.org/jira/browse/PIG-5372
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.16.0
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5372-v1.patch
>
>
> Sample short code like below
> {code}
> A = LOAD 'input.txt' AS (a1:int, a2:chararray, a3:int);
> B = LOAD 'input.txt' AS (b1:int, b2:chararray, b3:int);
> A2 = FOREACH A generate *, RANDOM() as randnum;
> D = join A2 by a1, B by b1 using 'skewed' parallel 2;
> store D into '$output';
> {code}
> Fails with NPE. 
> {noformat}
> 2018-12-12 16:06:04,860 [Dispatcher thread: Central] INFO  
> org.apache.tez.dag.history.HistoryEventHandler - 
> [HISTORY][DAG:dag_1544648742542_0001_1][Event:TASK_FINISHED]: 
> vertexName=scope-55, taskId=task_1544648742542_0001_1_02_000000, 
> startTime=1544648745036, finishTime=1544648764857, timeTaken=19821, 
> status=KILLED, successfulAttemptID=null, diagnostics=TaskAttempt 0 failed, 
> info=[Error: Failure while running 
> task:org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing (Name: Local Rearrange[tuple]{int}(false) - scope-29 ->       
> scope-58 Operator Key: scope-29): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing [POUserFunc (Name: 
> POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: 
> scope-40) children: null at []]: java.lang.NullPointerException
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:315)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:287)
>         at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:131)
>         at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:420)
>         at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:282)
>         at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>         at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> Exception while executing [POUserFunc (Name: 
> POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: 
> scope-40) children: null at []]: java.lang.NullPointerException
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:367)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:408)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
>         ... 17 more
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:51)
>         at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:37)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:332)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDouble(POUserFunc.java:396)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:343)
>         ... 20 more
> ]
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to