[ 
https://issues.apache.org/jira/browse/HIVE-21074?focusedWorklogId=706234&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-706234
 ]

ASF GitHub Bot logged work on HIVE-21074:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Jan/22 15:15
            Start Date: 10/Jan/22 15:15
    Worklog Time Spent: 10m 
      Work Description: szlta opened a new pull request #2931:
URL: https://github.com/apache/hive/pull/2931


   This PR fixes a simple case where bucket pruning does not work: 
whenever a bucket column appears in an AND condition together with other 
expressions, pruning should still proceed. Currently, if any of those other 
expressions is not at the leaf level, pruning is disabled unnecessarily.
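
   As a rough illustration of the idea (not the actual Hive optimizer code), 
the sketch below models a predicate tree in Python and compares the behaviour 
described in the issue with the intended one. The class names (Eq, Other, And) 
and both functions are hypothetical and exist only for this example; the 
mapping from a literal to a bucket number is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Eq:
    """Leaf predicate of the form `column = literal`."""
    column: str
    literal: int


@dataclass(frozen=True)
class Other:
    """Any expression the pruner cannot interpret, e.g. NOT(IS_NULL(col))."""
    text: str


@dataclass(frozen=True)
class And:
    """Conjunction of child predicates."""
    children: Tuple["Expr", ...]


Expr = Union[Eq, Other, And]


def prunable_literals(expr: Expr, bucket_col: str) -> Optional[Set[int]]:
    """Return the literals the bucket column is restricted to, or None
    when the predicate gives no usable restriction (scan all buckets)."""
    if isinstance(expr, Eq):
        return {expr.literal} if expr.column == bucket_col else None
    if isinstance(expr, And):
        result: Optional[Set[int]] = None
        for child in expr.children:
            found = prunable_literals(child, bucket_col)
            if found is None:
                # This child says nothing about the bucket column. Under AND
                # the restriction from the other children still holds.
                continue
            result = found if result is None else result & found
        return result
    # Other (and anything else not modelled here): no restriction known.
    return None


def prunable_literals_strict(expr: Expr, bucket_col: str) -> Optional[Set[int]]:
    """Model of the behaviour reported in the issue: a single non-leaf
    child under AND turns pruning off for the whole condition."""
    if isinstance(expr, And) and any(not isinstance(c, Eq) for c in expr.children):
        return None
    return prunable_literals(expr, bucket_col)


# WHERE bucketed_col = 1 AND some_other_col IS NOT NULL
query = And((Eq("bucketed_col", 1), Other("NOT(IS_NULL(some_other_col))")))
print(prunable_literals_strict(query, "bucketed_col"))
print(prunable_literals(query, "bucketed_col"))
```

   In this model the strict variant finds no restriction for the example 
query, while the relaxed variant still recovers the literal 1 from the 
bucket column, which is the case this PR targets.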



Issue Time Tracking
-------------------

            Worklog Id:     (was: 706234)
    Remaining Estimate: 0h
            Time Spent: 10m

> Hive bucketed table query pruning does not work for IS NOT NULL condition
> -------------------------------------------------------------------------
>
>                 Key: HIVE-21074
>                 URL: https://issues.apache.org/jira/browse/HIVE-21074
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 3.1.0, 3.0.0, 3.1.1
>            Reporter: Thai Bui
>            Assignee: Ádám Szita
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21074.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current version of bucket pruning skips all the predicates when it 
> detects that one of the predicates is a compound type (e.g. NOT(IS_NULL) ) 
> when evaluating AND logical operators.
> This logic is faulty: as long as one of the AND operands is an equality on a 
> bucketed column (_col_ = *literal*), the *literal* value of that _col_ should 
> be considered in the bucket pruning optimization no matter what. For example:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND (some_compound_expr)
> Then the value '*1*' should be considered for pruning in the query plan. 
> This limitation also shows up in a simpler case: a table that I am 
> trying to optimize using bucketing gets no benefit when IS NOT 
> NULL is used. Since IS NOT NULL is parsed into NOT(IS_NULL) (a compound 
> expression), the pruning phase is completely skipped, causing unnecessary 
> tasks to be spawned. For instance:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND some_other_col IS NOT NULL
> This does not trigger the bucket pruning logic and performs a full table scan.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
