[ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20683:
-------------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Pushed to master, thanks [~nishantbangarwa] (and [~bslim] for reviewing).

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> ------------------------------------------------------------------
>
>                 Key: HIVE-20683
>                 URL: https://issues.apache.org/jira/browse/HIVE-20683
>             Project: Hive
>          Issue Type: New Feature
>          Components: Druid integration
>            Reporter: Nishant Bangarwa
>            Assignee: Nishant Bangarwa
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20683.1.patch, HIVE-20683.10.patch, 
> HIVE-20683.2.patch, HIVE-20683.3.patch, HIVE-20683.4.patch, 
> HIVE-20683.5.patch, HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> To optimize joins, Hive generates a BETWEEN filter (with min and max values) 
> and a BLOOM filter for filtering one side of a semi-join.
> Druid 0.13.0 will support Bloom filters (added via 
> https://github.com/apache/incubator-druid/pull/6222).
> Implementation details:
> # Hive generates the filters and passes them as part of 'filterExpr' in the 
> TableScan. 
> # DruidQueryBasedRecordReader receives this filter as part of the conf. 
> # During the execution phase, before sending the query to Druid, 
> DruidQueryBasedRecordReader will deserialize this filter, translate it 
> into a DruidDimFilter, and add it to the existing DruidQuery. The Tez 
> executor already ensures that all dynamic values are initialized by the 
> time we start reading results from the record reader. 
> # Explaining a Druid query also prints the query sent to Druid as 
> {{druid.json.query}}, so the printed query must be updated with the 
> filters as well. During explain the actual dynamic values are not yet 
> available, so the dynamic expression itself will be printed in their place 
> as part of the Druid query. 
> Note: This work requires Druid to be updated to version 0.13.0.
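
As a rough illustration of step 3 above, the sketch below shows what the translated filters might look like once added to the native Druid query JSON. It is not the code from the patch: the class DruidFilterSketch and its methods are hypothetical names, the JSON is built with plain string formatting (no escaping) for brevity, and the field names ("bound", "bloom", "bloomKFilter", "and") are assumed to follow Druid's native filter documentation for 0.13.0. The actual implementation presumably works with Druid's Java filter objects rather than strings.

```java
import java.util.Arrays;
import java.util.List;

public class DruidFilterSketch {

    // BETWEEN min AND max as a Druid "bound" filter (inclusive on both ends,
    // since lowerStrict/upperStrict are left at their default of false).
    static String boundFilter(String dimension, String lower, String upper) {
        return String.format(
            "{\"type\":\"bound\",\"dimension\":\"%s\",\"lower\":\"%s\","
                + "\"upper\":\"%s\",\"ordering\":\"numeric\"}",
            dimension, lower, upper);
    }

    // Bloom filter as a Druid "bloom" filter. The second argument stands in
    // for the base64-encoded serialized filter; producing it is out of scope.
    static String bloomFilter(String dimension, String base64Filter) {
        return String.format(
            "{\"type\":\"bloom\",\"dimension\":\"%s\",\"bloomKFilter\":\"%s\"}",
            dimension, base64Filter);
    }

    // Combine several filters so that all of them must match.
    static String andFilter(List<String> fields) {
        return "{\"type\":\"and\",\"fields\":[" + String.join(",", fields) + "]}";
    }

    public static void main(String[] args) {
        // Hypothetical dimension name and values, for illustration only.
        String combined = andFilter(Arrays.asList(
            boundFilter("customer_id", "100", "2500"),
            bloomFilter("customer_id", "BASE64_PLACEHOLDER")));
        System.out.println(combined);
    }
}
```

At explain time (step 4) the min, max, and Bloom filter values are not yet known, so the printed query would carry the dynamic expressions in place of the literal values used here.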



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
