[ https://issues.apache.org/jira/browse/HIVE-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HIVE-15235:
------------------------------------
    Description:
The relevant fragment of the reduce plan of a Tez job is as follows:
{noformat}
<MERGEJOIN>Id =13
  <Children>
    <SEL>Id =12
      <Children>
        <MAPJOIN>Id =10
          <Children>
          ...
          <\Children>
          <Parent>Id = 12 nullId = 9
            <HASHTABLEDUMMY>Id =9
              <Children>null
              <\Children>
              <Parent><\Parent>
            <\HASHTABLEDUMMY><\Parent>
        <\MAPJOIN>
      <\Children>
      <Parent>Id = 13 null<\Parent>
    <\SEL>
  <\Children>
{noformat}
When sortmergejoin is enabled, during initialization, dummy operators are not initialized (presumably, they are not present in the work); that results in MapJoin not being initialized, even though its proper parent is.
Manifests as an NPE
{noformat}
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:350)
{noformat}

  was:
{noformat}
<MERGEJOIN>Id =13
  <Children>
    <SEL>Id =12
      <Children>
        <MAPJOIN>Id =10
          <Children>
            <FS>Id =11
              <Children>
              <\Children>
              <Parent>Id = 10 null<\Parent>
            <\FS>
          <\Children>
          <Parent>Id = 12 nullId = 9
            <HASHTABLEDUMMY>Id =9
              <Children>null
              <\Children>
              <Parent><\Parent>
            <\HASHTABLEDUMMY><\Parent>
        <\MAPJOIN>
      <\Children>
      <Parent>Id = 13 null<\Parent>
    <\SEL>
  <\Children>
{noformat}
This only happens when sortmergejoin is enabled. This is on reduce size of a Tez job; during initialization, dummy operators are not initialized (presumably, they are not present in the work); that results in MapJoin not being initialized, even though its proper parent is.
Manifests as an NPE
{noformat}
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:350)
{noformat}


> sortmergejoin can produce incorrect plan wrt dummy operators
> ------------------------------------------------------------
>
>                 Key: HIVE-15235
>                 URL: https://issues.apache.org/jira/browse/HIVE-15235
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>
> The relevant fragment of the reduce plan of a Tez job is as follows:
> {noformat}
> <MERGEJOIN>Id =13
>   <Children>
>     <SEL>Id =12
>       <Children>
>         <MAPJOIN>Id =10
>           <Children>
>           ...
>           <\Children>
>           <Parent>Id = 12 nullId = 9
>             <HASHTABLEDUMMY>Id =9
>               <Children>null
>               <\Children>
>               <Parent><\Parent>
>             <\HASHTABLEDUMMY><\Parent>
>         <\MAPJOIN>
>       <\Children>
>       <Parent>Id = 13 null<\Parent>
>     <\SEL>
>   <\Children>
> {noformat}
> When sortmergejoin is enabled, during initialization, dummy operators are not
> initialized (presumably, they are not present in the work); that results in
> MapJoin not being initialized, even though its proper parent is.
> Manifests as an NPE
> {noformat}
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:350)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
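
For illustration only, the failure mode described above can be modelled with a toy operator graph. This is a minimal sketch and not Hive's code: the class `DummyParentInitSketch`, the nested `Op` type, and the method `mapJoinInitialized` are hypothetical names, and the sketch assumes (as the description implies) that an operator initializes itself only once every one of its parents has been initialized.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not Hive code) of an operator graph in which an
// operator is initialized only after ALL of its parents are initialized.
public class DummyParentInitSketch {

    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        boolean initialized = false;

        Op(String name) {
            this.name = name;
        }

        // Link parent -> child in both directions.
        static void connect(Op parent, Op child) {
            parent.children.add(child);
            child.parents.add(parent);
        }

        // Initialize this operator if every parent is ready, then try children.
        void initialize() {
            if (initialized) {
                return;
            }
            for (Op p : parents) {
                if (!p.initialized) {
                    return; // still waiting on at least one parent
                }
            }
            initialized = true;
            for (Op c : children) {
                c.initialize();
            }
        }
    }

    // Builds MERGEJOIN -> SEL -> MAPJOIN with HASHTABLEDUMMY as a second
    // parent of MAPJOIN, mirroring the plan fragment in the description.
    // initDummies controls whether the dummy operator is initialized too.
    static boolean mapJoinInitialized(boolean initDummies) {
        Op mergeJoin = new Op("MERGEJOIN");
        Op sel = new Op("SEL");
        Op mapJoin = new Op("MAPJOIN");
        Op dummy = new Op("HASHTABLEDUMMY");
        Op.connect(mergeJoin, sel);
        Op.connect(sel, mapJoin);
        Op.connect(dummy, mapJoin);

        mergeJoin.initialize();   // the "proper" parent chain
        if (initDummies) {
            dummy.initialize();   // the step that is skipped in the bug
        }
        return mapJoin.initialized;
    }

    public static void main(String[] args) {
        System.out.println("dummies skipped:     MAPJOIN initialized = "
                + mapJoinInitialized(false));
        System.out.println("dummies initialized: MAPJOIN initialized = "
                + mapJoinInitialized(true));
    }
}
```

In this model, skipping the dummy operator leaves MAPJOIN uninitialized even though SEL is ready, which is consistent with a later `process` call hitting uninitialized state.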