[jira] [Commented] (CALCITE-1828) Push the FILTER clause into Druid as a Filtered Aggregator

Jesus Camacho Rodriguez (JIRA) Sat, 03 Jun 2017 02:40:23 -0700

    [ 
https://issues.apache.org/jira/browse/CALCITE-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035907#comment-16035907
 ]


Jesus Camacho Rodriguez commented on CALCITE-1828:
--------------------------------------------------

[~zhumayun], it would help if you could post a sample query together with the 
plan generated by Calcite, so we can see what the filter predicates for the 
query look like. Extending {{DruidAggregateProjectRule}} should be fine, but it 
would be good to understand the scope of the extension first; the logical plan 
will help with that.

FWIW, many databases do not support the {{sum("col1") FILTER (WHERE 
<condition1>)}} construct. In those cases, the same behavior is often 
reproduced with {{CASE}} expressions, e.g., {{sum(CASE WHEN <condition1> THEN 
"col1" END)}}. It would be good to know the plan that is generated in those 
cases too, mainly to see whether it is similar (since CASE might be transformed 
into AND/OR clauses by Calcite) or whether there is a gap that we could close 
in a follow-up JIRA.
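
To make the FILTER/CASE equivalence above concrete, here is a minimal sketch in 
plain Python (no Calcite or Druid involved). It models SQL {{SUM}} as ignoring 
NULLs and returning NULL when it sees no non-NULL input. The row data, the 
column name "country", and the helper names are hypothetical and only serve 
the illustration.

```python
def sql_sum(values):
    """Model of SQL SUM: skips NULL (None); yields None if nothing is left."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sum_filter(rows, col, cond):
    # sum(col) FILTER (WHERE cond): only rows satisfying cond reach the aggregate.
    return sql_sum(r[col] for r in rows if cond(r))


def sum_case(rows, col, cond):
    # sum(CASE WHEN cond THEN col END): non-matching rows contribute NULL,
    # which SUM then ignores.
    return sql_sum(r[col] if cond(r) else None for r in rows)


# Hypothetical sample data.
rows = [
    {"country": "US", "col1": 10},
    {"country": "CA", "col1": 5},
    {"country": "US", "col1": None},
    {"country": "US", "col1": 7},
]


def is_us(r):
    return r["country"] == "US"


def is_mx(r):
    return r["country"] == "MX"


print(sum_filter(rows, "col1", is_us), sum_case(rows, "col1", is_us))  # 17 17
print(sum_filter(rows, "col1", is_mx), sum_case(rows, "col1", is_mx))  # None None
```

The two forms agree for {{SUM}} in this model, including the case where no row 
matches. Whether Calcite produces similar plans for them is exactly the open 
question above.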

> Push the FILTER clause into Druid as a Filtered Aggregator 
> -----------------------------------------------------------
>
>                 Key: CALCITE-1828
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1828
>             Project: Calcite
>          Issue Type: Improvement
>          Components: druid
>    Affects Versions: 1.12.0
>            Reporter: Zain Humayun
>            Assignee: Zain Humayun
>
> Druid has support for a special aggregator it calls the [Filtered 
> Aggregator|http://druid.io/docs/latest/querying/aggregations.html] that 
> allows aggregations to occur with filters independent of other filters in the 
> Druid query.
> An example where the filtered aggregator is useful:
> {code:sql}
> SELECT 
> sum("col1") FILTER (WHERE <condition1>),
> sum("col2") FILTER (WHERE <condition2>)
> FROM "table"; 
> {code} 
> Currently, Calcite will scan Druid, then do the filtering and aggregation 
> itself. With filtered aggregators, both the filter and the aggregation can be 
> pushed into Druid. 
> *A few comments/questions:*
> 1) If all conditions in the filter clauses are the same, then instead of 
> pushing filtered aggregators individually, it would make more sense to push a 
> single filter into the Druid query, i.e., the filters can be factored out into 
> one filter. I don't see Calcite currently doing this; does it have such a rule 
> in place already?
> 2) The filters can/should only be pushed if they are filtering on dimension 
> columns
> 3) Currently, the above query would create the following relation: 
> DruidQuery -> Project -> Aggregate. There is already a rule called 
> {{DruidAggregateProjectRule}} which matches the previous relation. Is it 
> better to add logic to that rule, or to create a new rule that also matches 
> that relation?
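
As a rough illustration of the Druid-side target described in the quoted text, 
the sketch below builds the JSON for a timeseries query with one filtered 
aggregator per {{FILTER}} clause, and factors the filter out to the query level 
when all clauses are identical (point 1). It is not Calcite code. It assumes 
each condition is a simple equality on a dimension and each metric is summed 
with {{longSum}}; the dimension names, output names, and interval are 
placeholders.

```python
import json


def selector(dimension, value):
    """Druid selector filter: dimension = value."""
    return {"type": "selector", "dimension": dimension, "value": value}


def long_sum(name, field):
    """Plain Druid longSum aggregator."""
    return {"type": "longSum", "name": name, "fieldName": field}


def filtered(flt, aggregator):
    """Druid filtered aggregator wrapping another aggregator."""
    return {"type": "filtered", "filter": flt, "aggregator": aggregator}


def build_query(data_source, sums):
    """sums: list of (output_name, metric_column, filter_dict) tuples."""
    query = {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": ["1900-01-01/3000-01-01"],  # placeholder interval
    }
    filters = [f for _, _, f in sums]
    if filters and all(f == filters[0] for f in filters):
        # Every aggregate shares one condition: push a single query-level filter.
        query["filter"] = filters[0]
        query["aggregations"] = [long_sum(n, c) for n, c, _ in sums]
    else:
        # Conditions differ: each aggregate carries its own filter.
        query["aggregations"] = [filtered(f, long_sum(n, c)) for n, c, f in sums]
    return query


# Hypothetical stand-ins for <condition1> and <condition2>.
query = build_query("table", [
    ("s1", "col1", selector("dim1", "a")),
    ("s2", "col2", selector("dim2", "b")),
])
hoisted = build_query("table", [
    ("s1", "col1", selector("dim1", "a")),
    ("s2", "col2", selector("dim1", "a")),
])
print(json.dumps(query, indent=2))
```

Factoring the filter out is only shown here for an ungrouped query; with a 
GROUP BY, a query-level filter would also drop groups that have no matching 
rows, which the per-aggregate form keeps.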



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
