Jagruti Varia created HIVE-10885:
------------------------------------
Summary: with vectorization enabled join operation involving
interval_day_time fails
Key: HIVE-10885
URL: https://issues.apache.org/jira/browse/HIVE-10885
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Jagruti Varia
Assignee: Matt McCline
When vectorization is enabled, a join operation involving the interval_day_time
type throws the following error:
{noformat}
Status: Failed
Vertex failed, vertexName=Map 2, vertexId=vertex_1432858236614_0247_1_01,
diagnostics=[Task failed, taskId=task_1432858236614_0247_1_01_000000,
diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
    ... 14 more
Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
    at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
    ... 15 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
    ... 14 more
Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
    at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
    ... 15 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
    ... 14 more
Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
    at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
    ... 15 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
    ... 14 more
Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
    at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
    ... 15 more
{noformat}
Query run:
{noformat}
select
  v1.s,
  v2.s,
  v1.intrvl1
from
  ( select
      s,
      (cast(dt as date) - cast(ts as date)) as intrvl1
    from
      vectortab10korc ) v1
join
  ( select
      s,
      (cast(dt as date) - cast(ts as date)) as intrvl2
    from
      vectorparttab10korc ) v2
on v1.intrvl1 = v2.intrvl2
and v1.s = v2.s;
{noformat}
Explain plan:
{noformat}
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 2 <- Map 1 (BROADCAST_EDGE)
      DagName: hrt_qa_20150601024305_7745bc8f-169f-45c6-8856-7391eef0d819:3
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: vectortab10korc
                  filterExpr: s is not null (type: boolean)
                  Statistics: Num rows: 10000 Data size: 4597592 Basic stats: COMPLETE Column stats: PARTIAL
                  Filter Operator
                    predicate: s is not null (type: boolean)
                    Statistics: Num rows: 10000 Data size: 1340000 Basic stats: COMPLETE Column stats: PARTIAL
                    Select Operator
                      expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
                      Filter Operator
                        predicate: _col1 is not null (type: boolean)
                        Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
                        Reduce Output Operator
                          key expressions: _col1 (type: interval_day_time), _col0 (type: string)
                          sort order: ++
                          Map-reduce partition columns: _col1 (type: interval_day_time), _col0 (type: string)
                          Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
                        Select Operator
                          expressions: _col0 (type: string)
                          outputColumnNames: _col0
                          Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
                          Group By Operator
                            keys: _col0 (type: string)
                            mode: hash
                            outputColumnNames: _col0
                            Statistics: Num rows: 5000 Data size: 470000 Basic stats: COMPLETE Column stats: PARTIAL
                            Dynamic Partitioning Event Operator
                              Target Input: vectorparttab10korc
                              Partition key expr: s
                              Statistics: Num rows: 5000 Data size: 470000 Basic stats: COMPLETE Column stats: PARTIAL
                              Target column: s
                              Target Vertex: Map 2
            Execution mode: vectorized
        Map 2
            Map Operator Tree:
                TableScan
                  alias: vectorparttab10korc
                  filterExpr: s is not null (type: boolean)
                  Statistics: Num rows: 10000 Data size: 3656191 Basic stats: COMPLETE Column stats: PARTIAL
                  Select Operator
                    expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time)
                    outputColumnNames: _col0, _col1
                    Statistics: Num rows: 10000 Data size: 1840000 Basic stats: COMPLETE Column stats: PARTIAL
                    Filter Operator
                      predicate: _col1 is not null (type: boolean)
                      Statistics: Num rows: 10000 Data size: 1840000 Basic stats: COMPLETE Column stats: PARTIAL
                      Map Join Operator
                        condition map:
                          Inner Join 0 to 1
                        keys:
                          0 _col1 (type: interval_day_time), _col0 (type: string)
                          1 _col1 (type: interval_day_time), _col0 (type: string)
                        outputColumnNames: _col0, _col1, _col2
                        input vertices:
                          0 Map 1
                        Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
                        HybridGraceHashJoin: true
                        Select Operator
                          expressions: _col0 (type: string), _col2 (type: string), _col1 (type: interval_day_time)
                          outputColumnNames: _col0, _col1, _col2
                          Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
                          File Output Operator
                            compressed: false
                            Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
                            table:
                                input format: org.apache.hadoop.mapred.TextInputFormat
                                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
            Execution mode: vectorized

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 0.402 seconds, Fetched: 91 row(s)
{noformat}
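Possible workaround (an untested sketch, not verified in this report): the exception is raised from VectorMapJoinCommonOperator during operator initialization, so running the same query with vectorization turned off for the session should avoid that code path, at the cost of vectorized execution. The setting below is Hive's standard session-level switch for vectorization; that it fully sidesteps this failure on 1.2.0 is an assumption.
{noformat}
-- Assumption: with vectorization disabled, the map join no longer goes through
-- the vectorized map-join operator that fails for interval_day_time.
set hive.vectorized.execution.enabled=false;
-- Then re-run the failing join query shown above in the same session.
{noformat}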