[jira] [Resolved] (HIVE-21074) Hive bucketed table query pruning does not work for IS NOT NULL condition

2022-01-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita resolved HIVE-21074.
---
Resolution: Fixed

This is now committed to master. Thanks to [~thaibui] as the original 
contributor, and [~pvary] for reviewing.

> Hive bucketed table query pruning does not work for IS NOT NULL condition
> -
>
>                 Key: HIVE-21074
>                 URL: https://issues.apache.org/jira/browse/HIVE-21074
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 3.1.0, 3.0.0, 3.1.1
>            Reporter: Thai Bui
>            Assignee: Ádám Szita
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21074.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When evaluating an AND expression, the current version of bucket pruning 
> skips all of the predicates as soon as it detects that one of them is a 
> compound expression (e.g. NOT(IS_NULL)).
> This logic is faulty: as long as one of the AND operands is an equality on a 
> bucketed column (_col_ = *literal*), the *literal* value of that _col_ should 
> be considered in the bucket pruning optimization no matter what. For example:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND (some_compound_expr)
> Here the value *1* should be considered for pruning in the query plan. 
> This limitation shows up in a simpler case: a table that I am trying to 
> optimize with bucketing gets no benefit when IS NOT NULL is used. Since IS 
> NOT NULL is parsed into NOT(IS_NULL) (a compound expression), the pruning 
> phase is completely skipped, causing unnecessary tasks to be spawned. For 
> instance:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND some_other_col IS NOT NULL
> does not trigger the bucket pruning logic and performs a full table scan.
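
The behaviour the description asks for can be sketched in a few lines of Python. This is only an illustration of the idea, not Hive's optimizer code: the Eq, Compound and And classes, the function names, and the "literal % num_buckets" bucket mapping are all hypothetical stand-ins (Hive uses its own expression tree and hash function). The point is that an AND operand which yields no constraint is skipped on its own and does not cancel the constraint found in a sibling operand.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Eq:
    """Hypothetical node for `column = literal`."""
    column: str
    literal: int


@dataclass(frozen=True)
class Compound:
    """Hypothetical opaque node, e.g. NOT(IS_NULL(col))."""
    text: str


@dataclass(frozen=True)
class And:
    """Hypothetical node for a conjunction of operands."""
    operands: tuple


def prunable_literals(expr, bucketed_col: str) -> Optional[Set[int]]:
    """Return the literals the bucketed column is restricted to,
    or None when the expression gives no usable restriction."""
    if isinstance(expr, Eq):
        return {expr.literal} if expr.column == bucketed_col else None
    if isinstance(expr, And):
        result = None
        for operand in expr.operands:
            literals = prunable_literals(operand, bucketed_col)
            if literals is None:
                # No constraint from this operand: ignore it, but keep
                # whatever the other operands contribute.
                continue
            result = literals if result is None else result & literals
        return result
    # Compound or unknown expressions give no constraint by themselves.
    return None


def buckets_to_scan(expr, bucketed_col: str, num_buckets: int) -> Set[int]:
    """Map the restricted literals to bucket ids.
    ASSUMPTION: `literal % num_buckets` stands in for the real hash."""
    literals = prunable_literals(expr, bucketed_col)
    if literals is None:
        return set(range(num_buckets))  # no pruning possible: scan all
    return {literal % num_buckets for literal in literals}


# bucketed_col = 1 AND some_other_col IS NOT NULL
query = And((Eq("bucketed_col", 1), Compound("NOT(IS_NULL(some_other_col))")))
print(buckets_to_scan(query, "bucketed_col", 4))
```

Under these assumptions the IS NOT NULL operand no longer forces a scan of every bucket; only an expression with no equality on the bucketed column at all falls back to all buckets.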



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-21074) Hive bucketed table query pruning does not work for IS NOT NULL condition

2018-12-28 Thread Thai Bui (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thai Bui resolved HIVE-21074.
-
   Resolution: Duplicate
Fix Version/s: 4.0.0

This is fixed in https://jira.apache.org/jira/browse/HIVE-19097

> Hive bucketed table query pruning does not work for IS NOT NULL condition
> -
>
>                 Key: HIVE-21074
>                 URL: https://issues.apache.org/jira/browse/HIVE-21074
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 3.0.0, 3.1.0, 3.1.1
>            Reporter: Thai Bui
>            Assignee: Thai Bui
>            Priority: Minor
>             Fix For: 4.0.0
>
>
> When evaluating an AND expression, the current version of bucket pruning 
> skips all of the predicates as soon as it detects that one of them is a 
> compound expression (e.g. NOT(IS_NULL)).
> This logic is faulty: as long as one of the AND operands is an equality on a 
> bucketed column (_col_ = *literal*), the *literal* value of that _col_ should 
> be considered in the bucket pruning optimization no matter what. For example:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND (some_compound_expr)
> Here the value *1* should be considered for pruning in the query plan. 
> This limitation shows up in a simpler case: a table that I am trying to 
> optimize with bucketing gets no benefit when IS NOT NULL is used. Since IS 
> NOT NULL is parsed into NOT(IS_NULL) (a compound expression), the pruning 
> phase is completely skipped, causing unnecessary tasks to be spawned. For 
> instance:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND some_other_col IS NOT NULL
> does not trigger the bucket pruning logic and performs a full table scan.


