[ 
https://issues.apache.org/jira/browse/HIVE-25867?focusedWorklogId=745642&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-745642
 ]

ASF GitHub Bot logged work on HIVE-25867:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Mar/22 08:15
            Start Date: 22/Mar/22 08:15
    Worklog Time Spent: 10m 
      Work Description: pvary commented on pull request #2947:
URL: https://github.com/apache/hive/pull/2947#issuecomment-1074862624


   Thanks for the explanation @zzzzming95! This was a misunderstanding on my 
side.
   
   What happens if your query is like this - would the filtering work in this 
case?
   ```
   select * from src_play_day WHERE dt='20211125' AND u_gtype='activity_workshop' limit 10;
   ```
   Notice that I used `'`-s to convert 20211125 to string.
   
   Another question: how does this work with the other supported backend 
databases?
   The issue with directSql is that we have to handle the differences between 
the SQL engines ourselves.
   
   Also what happens when we have different types, like timestamp/binary/struct?
   
   Thanks for investigating this issue, and coming up with possible solutions!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



Issue Time Tracking
-------------------

    Worklog Id:     (was: 745642)
    Time Spent: 1h 50m  (was: 1h 40m)

> Partition filter condition should pushed down to metastore query if it is 
> equivalence Predicate
> -----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25867
>                 URL: https://issues.apache.org/jira/browse/HIVE-25867
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: shezm
>            Assignee: shezm
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When the column type of a partition differs from the type used in the HQL 
> query, the metastore does not push the filter down to the RDBMS. Instead it 
> fetches every PARTITIONS.PART_NAME of the Hive table and then filters them 
> against the HQL expression. 
> https://github.com/apache/hive/blob/5b112aa6dcc4e374c0a7c2b24042f24ae6815da1/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L1316
> If the Hive table has many partitions and several HQL queries run at the 
> same time, this increases CPU IO_WAIT on the RDBMS and degrades performance.
> If the partition filter condition in the HQL is an equality predicate, the 
> metastore should push it down to the RDBMS, which improves query 
> performance on large Hive tables.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
