[ https://issues.apache.org/jira/browse/HIVE-23609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal Vijayaraghavan updated HIVE-23609:
----------------------------------------
    Description: 
For self-joins, several other heuristics applied to Semijoins don't apply as 
the difference between rows on either side is likely to result in an exact 
reduction of rows scanned.

This change results in slightly different Tez priorities for self-joins which 
are heavily filtered on one side over the other, which helps ensure the smaller 
table is completed before the bigger table consumes resources.

  was:
For self-joins, several other heuristics applied to Semijoins don't apply as 
the difference between rows on either side is likely to result in an actual 
reduction of rows scanned.

This change results in slightly different Tez priorities for self-joins which 
are heavily filtered on one side over the other, which helps ensure the smaller 
table is completed before the bigger table consumes resources.


> SemiJoin: Relax big table size check for self-joins
> ---------------------------------------------------
>
>                 Key: HIVE-23609
>                 URL: https://issues.apache.org/jira/browse/HIVE-23609
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Gopal Vijayaraghavan
>            Priority: Major
>
> For self-joins, several other heuristics applied to Semijoins don't apply as 
> the difference between rows on either side is likely to result in an exact 
> reduction of rows scanned.
> This change results in slightly different Tez priorities for self-joins which 
> are heavily filtered on one side over the other, which helps ensure the 
> smaller table is completed before the bigger table consumes resources.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
