[ 
https://issues.apache.org/jira/browse/HIVE-22157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyan updated HIVE-22157:
--------------------------
    Description: 
Currently, Hive cannot push down an aggregation spec when one wants to use a 
custom Druid extension for the execution.

When using Hive, the query below is rewritten with no aggregations defined:

Explain  select  floor_day(`__time`),count(distinct visitor_id) as uv from 
druid group by floor_day(`__time`);

.....

"limitSpec":{"type":"default"},

"aggregations":[],

"intervals":["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]

...... 

 

But what one really needs is:

 

"aggregations": [

{ "type": "distinctCount", "name": "uv", "fieldName": "visitor_id" }

]

 

This aggregations spec uses the {{druid-distinctcount}} extension.

 

If we could call Druid's UDAF from Hive SQL and push that call into the 
execution plan, so that the UDAF runs on the Druid data source, this would be a 
nice way to strengthen the Hive-Druid integration.
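
To make the request concrete, here is a minimal Python sketch of what the desired Druid native query could look like once the aggregation is pushed down. It is only an illustration and not what Hive generates today: the helper name build_group_by_query is hypothetical, "granularity": "day" is an assumed stand-in for floor_day(`__time`), and the empty "dimensions" list is also an assumption. The data source name, the intervals and the distinctCount aggregator are taken from the description above.

```python
import json


def build_group_by_query(data_source, aggregations, intervals, granularity="day"):
    """Assemble a minimal Druid native groupBy query as a plain dict.

    Hypothetical helper for illustration only; Hive's real planner output
    contains more fields than this sketch.
    """
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        # Assumption: day granularity stands in for floor_day(`__time`).
        "granularity": granularity,
        "dimensions": [],
        "limitSpec": {"type": "default"},
        "aggregations": list(aggregations),
        "intervals": list(intervals),
    }


# Aggregator provided by the druid-distinctcount extension, as quoted above.
distinct_count_uv = {
    "type": "distinctCount",
    "name": "uv",
    "fieldName": "visitor_id",
}

query = build_group_by_query(
    data_source="druid",
    aggregations=[distinct_count_uv],
    intervals=["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"],
)

print(json.dumps(query, indent=2))
```

The point of the sketch is only that "aggregations" carries the extension aggregator instead of being empty; how Hive would map a SQL-level function call onto it is left open.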

 

  was:
Currently Hive can not push aggr spec if one want to use customized extension 
in druid for the execution

when using Hive, below query is been rewritten with no aggr defined 

Explain  select  floor_day(`__time`),count(distinct visitor_id) as uv from 
druid group by floor_day(`__time`);

.....

"limitSpec":\{"type":"default"},

"aggregations":[],

"intervals":["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]

...... 

 

But what one really need was 

 

"aggregations": [ { "type": "distinctCount", "name": "uv", "fieldName": 
"visitor_id" } ]

 

and aggregations spec is using the {{druid-distinctcount}} extension.  

 

It's we can call Druid's UDAF from HiveSQL and been able push that call into 
the execution plan to use that UDAF on Druid DataSource, this would be a nice 
thing to power up the Hive-Druid Integration.

 


> Hive Pushing Aggr extension to Druid
> ------------------------------------
>
>                 Key: HIVE-22157
>                 URL: https://issues.apache.org/jira/browse/HIVE-22157
>             Project: Hive
>          Issue Type: Wish
>          Components: Druid integration
>    Affects Versions: 3.0.0, 3.1.0, 3.1.1, 3.1.2
>            Reporter: liuyan
>            Priority: Minor
>
> Currently, Hive cannot push down an aggregation spec when one wants to use a 
> custom Druid extension for the execution.
> When using Hive, the query below is rewritten with no aggregations defined:
> Explain  select  floor_day(`__time`),count(distinct visitor_id) as uv from 
> druid group by floor_day(`__time`);
> .....
> "limitSpec":{"type":"default"},
> "aggregations":[],
> "intervals":["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]
> ...... 
>  
> But what one really needs is:
>  
> "aggregations": [
> { "type": "distinctCount", "name": "uv", "fieldName": "visitor_id" }
> ]
>  
> This aggregations spec uses the {{druid-distinctcount}} extension.
>  
> If we could call Druid's UDAF from Hive SQL and push that call into the 
> execution plan, so that the UDAF runs on the Druid data source, this would be 
> a nice way to strengthen the Hive-Druid integration.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
