[
https://issues.apache.org/jira/browse/HIVE-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gunther Hagleitner updated HIVE-8628:
-------------------------------------
Fix Version/s: 0.14.0
> NPE in case of shuffle join in tez
> ----------------------------------
>
> Key: HIVE-8628
> URL: https://issues.apache.org/jira/browse/HIVE-8628
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.14.0
> Reporter: Vikram Dixit K
> Assignee: Vikram Dixit K
> Priority: Critical
> Fix For: 0.14.0
>
> Attachments: HIVE-8628.1.patch
>
>
> The test throws a NullPointerException:
> {noformat}
> Vertex failed, vertexName=Reducer 2, vertexId=vertex_1413774081318_0803_5_03,
> diagnostics=[Task failed, taskId=task_1413774081318_0803_5_03_000000,
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running
> task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime
> Error while closing operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:187)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing
> operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:218)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:178)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:368)
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:310)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:200)
> ... 14 more
> ], TaskAttempt 1 failed, info=[Error: Failure while running
> task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime
> Error while closing operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:187)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing
> operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:218)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:178)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:368)
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:310)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:200)
> ... 14 more
> ], TaskAttempt 2 failed, info=[Error: Failure while running
> task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime
> Error while closing operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:187)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing
> operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:218)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:178)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:368)
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:310)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:200)
> ... 14 more
> ], TaskAttempt 3 failed, info=[Error: Failure while running
> task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime
> Error while closing operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:187)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
> at
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing
> operators: null
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:218)
> at
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:178)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:368)
> at
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:310)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:200)
> ... 14 more
> ]], Vertex failed as one or more tasks failed. failedTasks:1]
> Vertex killed, vertexName=Reducer 3, vertexId=vertex_1413774081318_0803_5_04,
> diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as
> other vertex failed. failedTasks:0]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:1
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.tez.TezTask
> {noformat}
> Query:
> {noformat}
> set hive.auto.convert.join=false;
> select c.c_first_name, c.c_last_name, cd.cd_gender, hd.hd_buy_potential from
> customer c left outer join customer_demographics cd on cd.cd_demo_sk =
> c.c_current_cdemo_sk left outer join household_demographics hd on
> hd.hd_demo_sk = c.c_current_hdemo_sk where c.c_customer_sk < 1000;
> {noformat}
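> For reference, a sketch of the session setup that the plan header below describes. It assumes the stock Hive property names for the execution engine, vectorization and CBO; the values are read off that header, not captured from the failing session:
> {noformat}
> -- Assumed property names; values mirror the plan header (tez, vectorization on, cbo on).
> set hive.execution.engine=tez;
> set hive.vectorized.execution.enabled=true;
> set hive.cbo.enable=true;
> -- Disabling map-join conversion forces the shuffle (merge) join path.
> set hive.auto.convert.join=false;
> -- EXPLAIN prints the plan without running the DAG.
> explain
> select c.c_first_name, c.c_last_name, cd.cd_gender, hd.hd_buy_potential
> from customer c
> left outer join customer_demographics cd on cd.cd_demo_sk = c.c_current_cdemo_sk
> left outer join household_demographics hd on hd.hd_demo_sk = c.c_current_hdemo_sk
> where c.c_customer_sk < 1000;
> {noformat}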
> Plan (auto.convert.join=false, vectorization on, cbo on, execution.engine=tez):
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
>         Reducer 3 <- Map 5 (SIMPLE_EDGE), Reducer 2 (SIMPLE_EDGE)
>       DagName: hrt_qa_20141020210707_53fdc731-2d96-455e-8c20-ce7fec75f01f:6
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: c
>                   filterExpr: (c_customer_sk < 1000) (type: boolean)
>                   Statistics: Num rows: 100000 Data size: 4679516 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: (c_customer_sk < 1000) (type: boolean)
>                     Statistics: Num rows: 33333 Data size: 1559823 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: c_current_cdemo_sk (type: int), c_current_hdemo_sk (type: int), c_first_name (type: string), c_last_name (type: string)
>                       outputColumnNames: _col1, _col2, _col3, _col4
>                       Statistics: Num rows: 33333 Data size: 1559823 Basic stats: COMPLETE Column stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col1 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col1 (type: int)
>                         Statistics: Num rows: 33333 Data size: 1559823 Basic stats: COMPLETE Column stats: NONE
>                         value expressions: _col2 (type: int), _col3 (type: string), _col4 (type: string)
>             Execution mode: vectorized
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: cd
>                   Statistics: Num rows: 1920800 Data size: 5893494 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: cd_demo_sk (type: int), cd_gender (type: string)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 1920800 Data size: 5893494 Basic stats: COMPLETE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: _col0 (type: int)
>                       sort order: +
>                       Map-reduce partition columns: _col0 (type: int)
>                       Statistics: Num rows: 1920800 Data size: 5893494 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: _col1 (type: string)
>             Execution mode: vectorized
>         Map 5
>             Map Operator Tree:
>                 TableScan
>                   alias: hd
>                   Statistics: Num rows: 7200 Data size: 840 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: hd_demo_sk (type: int), hd_buy_potential (type: string)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 7200 Data size: 840 Basic stats: COMPLETE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: _col0 (type: int)
>                       sort order: +
>                       Map-reduce partition columns: _col0 (type: int)
>                       Statistics: Num rows: 7200 Data size: 840 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: _col1 (type: string)
>             Execution mode: vectorized
>         Reducer 2
>             Reduce Operator Tree:
>               Merge Join Operator
>                 condition map:
>                      Right Outer Join0 to 1
>                 condition expressions:
>                   0 {VALUE._col0}
>                   1 {VALUE._col1} {VALUE._col2} {VALUE._col3}
>                 outputColumnNames: _col1, _col4, _col5, _col6
>                 Statistics: Num rows: 2112880 Data size: 6482843 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: _col4 (type: int)
>                   sort order: +
>                   Map-reduce partition columns: _col4 (type: int)
>                   Statistics: Num rows: 2112880 Data size: 6482843 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col1 (type: string), _col5 (type: string), _col6 (type: string)
>         Reducer 3
>             Reduce Operator Tree:
>               Merge Join Operator
>                 condition map:
>                      Left Outer Join0 to 1
>                 condition expressions:
>                   0 {VALUE._col1} {VALUE._col4} {VALUE._col5}
>                   1 {VALUE._col0}
>                 outputColumnNames: _col1, _col5, _col6, _col8
>                 Statistics: Num rows: 2324168 Data size: 7131127 Basic stats: COMPLETE Column stats: NONE
>                 Select Operator
>                   expressions: _col5 (type: string), _col6 (type: string), _col1 (type: string), _col8 (type: string)
>                   outputColumnNames: _col0, _col1, _col2, _col3
>                   Statistics: Num rows: 2324168 Data size: 7131127 Basic stats: COMPLETE Column stats: NONE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 2324168 Data size: 7131127 Basic stats: COMPLETE Column stats: NONE
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink