[ 
https://issues.apache.org/jira/browse/HIVE-23907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-23907:
---------------------------------------
    Description: 
For some joins, such as anti join and semi join, a hash set is used instead of 
a hash table. These joins do not emit the right-side columns, so an existence 
check is enough to evaluate the join. When the table size is checked during 
map join conversion, this information is not considered. The hash table for 
these joins will be considerably smaller, so the hash table for a bigger table 
can still fit into memory.
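
The idea can be illustrated with a minimal sketch. This is not Hive's actual 
code: the class, enum, method names, and the per-row byte figures below are all 
hypothetical, and the estimate ignores hash structure overhead and key 
deduplication. It only shows how a size check could count key bytes alone when 
the join needs just an existence check.

```java
import java.util.EnumSet;

public class MapJoinSizeEstimate {
    // Hypothetical join kinds for this sketch; not Hive's own join type enum.
    public enum JoinKind { INNER, LEFT_OUTER, LEFT_SEMI, ANTI }

    // Joins that only need an existence check on the hashed side,
    // so a hash set of keys is enough (no value columns are stored).
    private static final EnumSet<JoinKind> KEY_ONLY =
        EnumSet.of(JoinKind.LEFT_SEMI, JoinKind.ANTI);

    // Rough in-memory size of the hashed side: keys only for semi/anti joins,
    // keys plus value columns for every other join kind.
    public static long estimateHashSideBytes(JoinKind kind, long rowCount,
                                             long avgKeyBytes, long avgValueBytes) {
        long perRow = KEY_ONLY.contains(kind)
            ? avgKeyBytes
            : avgKeyBytes + avgValueBytes;
        return rowCount * perRow;
    }

    // Map join conversion is allowed when the estimate fits under the threshold.
    public static boolean fitsInMemory(JoinKind kind, long rowCount, long avgKeyBytes,
                                       long avgValueBytes, long thresholdBytes) {
        return estimateHashSideBytes(kind, rowCount, avgKeyBytes, avgValueBytes)
            <= thresholdBytes;
    }

    public static void main(String[] args) {
        // Assumed figures: 1M rows, 8-byte key, 92 bytes of other columns, 50 MB limit.
        long rows = 1_000_000L, keyBytes = 8L, valueBytes = 92L, limit = 50_000_000L;
        System.out.println(fitsInMemory(JoinKind.INNER, rows, keyBytes, valueBytes, limit));
        System.out.println(fitsInMemory(JoinKind.ANTI, rows, keyBytes, valueBytes, limit));
    }
}
```

With these assumed figures, the inner join estimate is 100 MB and exceeds the 
limit, while the anti join estimate is 8 MB and fits, so the same table would 
qualify for map join conversion only when the hash table type is considered.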

 

  was:
For Anti Join, we emit the records if the join condition is not satisfied. In 
the case of the PK-FK rule, we have to explore whether this can be exploited to 
speed up Anti Join processing.

 


> Hash table type should be considered for calculating the Map join table size
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-23907
>                 URL: https://issues.apache.org/jira/browse/HIVE-23907
>             Project: Hive
>          Issue Type: Bug
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>
> For some joins, such as anti join and semi join, a hash set is used instead 
> of a hash table. These joins do not emit the right-side columns, so an 
> existence check is enough to evaluate the join. When the table size is 
> checked during map join conversion, this information is not considered. The 
> hash table for these joins will be considerably smaller, so the hash table 
> for a bigger table can still fit into memory.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
