[ https://issues.apache.org/jira/browse/HIVE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916794#action_12916794 ]
Amareshwari Sriramadasu commented on HIVE-1678:
-----------------------------------------------
The same query succeeds when no MapJoin is used.
It looks like plan generation went wrong in MapJoinProcessor. Here is the
EXPLAIN output for the query:
{noformat}
explain
select /*+MAPJOIN(src, myinput1) */ count(srcpart.key) from srcpart join src on
(srcpart.value=src.value) join myinput1 on (srcpart.key=myinput1.key);
OK
ABSTRACT SYNTAX TREE:
(TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF srcpart) (TOK_TABREF
src) (= (. (TOK_TABLE_OR_COL srcpart) value) (. (TOK_TABLE_OR_COL src) value)))
(TOK_TABREF myinput1) (= (. (TOK_TABLE_OR_COL srcpart) key) (.
(TOK_TABLE_OR_COL myinput1) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR
TOK_TMP_FILE)) (TOK_SELECT (TOK_HINTLIST (TOK_HINT TOK_MAPJOIN (TOK_HINTARGLIST
src myinput1))) (TOK_SELEXPR (TOK_FUNCTION count (. (TOK_TABLE_OR_COL srcpart)
key))))))
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-2 depends on stages: Stage-1
Stage-3 depends on stages: Stage-2
Stage-0 is a root stage
STAGE PLANS:
Stage: Stage-1
Map Reduce
Alias -> Map Operator Tree:
srcpart
TableScan
alias: srcpart
Common Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {key}
1
handleSkewJoin: false
keys:
0 [Column[value]]
1 [Column[value]]
outputColumnNames: _col0
Position of Big Table: 0
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Local Work:
Map Reduce Local Work
Alias -> Map Local Tables:
src
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
src
TableScan
alias: src
Common Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {key}
1
handleSkewJoin: false
keys:
0 [Column[value]]
1 [Column[value]]
outputColumnNames: _col0
Position of Big Table: 0
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-2
Map Reduce
Alias -> Map Operator Tree:
hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-10-01_11-24-15_198_629889728835692043/-mr-10002
Select Operator
expressions:
expr: _col0
type: int
outputColumnNames: _col0
Common Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {_col0}
1
handleSkewJoin: false
keys:
0 [Column[_col0]]
1 [Column[key]]
outputColumnNames: _col0
Position of Big Table: 0
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Local Work:
Map Reduce Local Work
Alias -> Map Local Tables:
myinput1
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
myinput1
TableScan
alias: myinput1
Common Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {_col0}
1
handleSkewJoin: false
keys:
0 [Column[_col0]]
1 [Column[key]]
outputColumnNames: _col0
Position of Big Table: 0
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Stage: Stage-3
Map Reduce
Alias -> Map Operator Tree:
hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-10-01_11-24-15_198_629889728835692043/-mr-10002
Select Operator
expressions:
expr: _col0
type: int
outputColumnNames: _col0
Common Join Operator
condition map:
Inner Join 0 to 1
condition expressions:
0 {_col0}
1
handleSkewJoin: false
keys:
0 [Column[_col0]]
1 [Column[key]]
outputColumnNames: _col0
Position of Big Table: 0
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
Reduce Operator Tree:
Group By Operator
aggregations:
expr: count(VALUE._col0)
bucketGroup: false
mode: mergepartial
outputColumnNames: _col0
Select Operator
expressions:
expr: _col0
type: bigint
outputColumnNames: _col0
File Output Operator
compressed: false
GlobalTableId: 0
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Stage: Stage-0
Fetch Operator
limit: -1
Time taken: 0.202 seconds
{noformat}
If I'm not wrong, the join operation should not be there in the 3rd stage.
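To make the stage-by-stage check easier to repeat, here is a rough sketch (not part of Hive) that scans saved EXPLAIN text and lists the stages whose plan mentions a join operator. The helper name `stages_with_join` is hypothetical, and it assumes that each stage plan starts with a `Stage: Stage-N` line and that joins appear as a line containing `Join Operator`, as in the output above. The `sample` string is a trimmed-down copy of that output, not a new plan.

```python
import re

# Assumption: each stage plan starts with a line like "Stage: Stage-2" and a
# join shows up as a line containing "Join Operator", as in the output above.
STAGE_RE = re.compile(r"^\s*Stage:\s*(Stage-\d+)\s*$")


def stages_with_join(explain_text):
    """Return the names of stages whose plan mentions a join operator."""
    current = None
    found = []
    for line in explain_text.splitlines():
        match = STAGE_RE.match(line)
        if match:
            # A new "Stage: Stage-N" header starts the next stage plan.
            current = match.group(1)
        elif current and "Join Operator" in line and current not in found:
            found.append(current)
    return found


# Trimmed-down copy of the EXPLAIN output above (operator details omitted).
sample = """\
STAGE PLANS:
Stage: Stage-1
Map Reduce
Common Join Operator
Stage: Stage-2
Map Reduce
Common Join Operator
Stage: Stage-3
Map Reduce
Common Join Operator
Reduce Operator Tree:
Group By Operator
Stage: Stage-0
Fetch Operator
"""

print(stages_with_join(sample))
```

On the plan above this reports Stage-3 alongside Stage-1 and Stage-2, which is the part that looks wrong to me.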
> NPE in MapJoin
> ---------------
>
> Key: HIVE-1678
> URL: https://issues.apache.org/jira/browse/HIVE-1678
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Amareshwari Sriramadasu
>
> The query with two map joins and a group by fails with the following NPE:
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:177)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:464)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.