Hi,
I'm trying out a JOIN between two tables, using a map value (a
BIGINT) from the first table and a BIGINT column from the second. Here is
the schema:

CREATE TABLE test_table0 (userid BIGINT, mapCol MAP<STRING, BIGINT>)
COMMENT 'Test table 0'
STORED AS SEQUENCEFILE;

CREATE TABLE test_table1 (userid BIGINT, col1 STRING, col2 STRING)
COMMENT 'Test table 1'
STORED AS SEQUENCEFILE;

with these rows:

INSERT INTO TABLE test_table0 VALUES (1, map('a', 1, 'b', 2));
INSERT INTO TABLE test_table0 VALUES (2, map('c', 3, 'd', 4));
INSERT INTO TABLE test_table1 VALUES (1, 'mycol1', 'mycol2');

and here is the query:

SELECT a.*, b.*
FROM test_table0 a INNER JOIN test_table1 b ON a.mapCol['a'] = b.userid;

which fails with this error:

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1546408189013_0167_9_01, diagnostics=[Task failed, taskId=task_1546408189013_0167_9_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1546408189013_0167_9_01_000000_0:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column vector type MAP
    at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:302)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:419)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.initializeOp(VectorMapJoinGenerateResultOperator.java:115)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:572)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:524)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:335)
    ... 17 more
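
From the stack trace, the failure seems to happen while the vectorized map
join is being set up: VectorCopyRow.init() hits a MAP column vector it
doesn't know how to copy. I'm guessing (this is only my assumption) that
forcing the query back to row-mode execution would work around it, along
these lines:

-- Assumption on my part: disabling vectorized execution for the session
-- avoids the vectorized map-join path that chokes on the MAP column vector.
SET hive.vectorized.execution.enabled=false;

SELECT a.*, b.*
FROM test_table0 a INNER JOIN test_table1 b ON a.mapCol['a'] = b.userid;

But I'd rather not turn vectorization off just for this.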

But this query runs successfully (I just dropped the map column
"mapCol" from the select list):

SELECT a.userid, b.*
FROM test_table0 a INNER JOIN test_table1 b ON a.mapCol['a'] = b.userid;
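
In case it helps narrow things down, I'd also guess that projecting
individual map entries as primitive columns works, since only BIGINT
vectors would then have to cross the join output, e.g.:

-- Assumption: selecting map entries (BIGINTs) instead of the MAP column
-- itself keeps complex types out of the join's output row.
SELECT a.userid, a.mapCol['a'] AS map_a, a.mapCol['b'] AS map_b, b.*
FROM test_table0 a INNER JOIN test_table1 b ON a.mapCol['a'] = b.userid;

What I really want, though, is the whole map in the result.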

Btw, I'm running this query on an HDP 3.0.1 cluster with Apache Hive 3.1.0.

Thanks,
Jan Charles
