[jira] [Comment Edited] (CALCITE-732) Implement multiple distinct-COUNT using GROUPING SETS

Julian Hyde (JIRA) Fri, 15 May 2015 15:58:25 -0700

    [ 
https://issues.apache.org/jira/browse/CALCITE-732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546322#comment-14546322
 ]


Julian Hyde edited comment on CALCITE-732 at 5/15/15 10:57 PM:
---------------------------------------------------------------

The query{code}select "department_id" as d,
  count(distinct "gender", "education_level") as c0,
  count(distinct "gender") as c1,
  count(distinct "education_level") as c2
from foodmart_clone."employee"
group by "department_id";{code} can be evaluated using an equivalent query that 
uses grouping sets:{code}select d,
  count(case i when 0 then 1 else null end) as c0,
  count(case i when 1 then 1 else null end) as c1,
  count(case i when 2 then 1 else null end) as c2
from (
  select "department_id" as d,
    grouping_id("gender", "education_level") as i,
   "gender" as g,
   "education_level" as e
  from foodmart_clone."employee"
  group by "department_id",
    grouping sets ("gender", "education_level", ("gender", "education_level")))
group by d;{code}


was (Author: julianhyde):
The query{code}select "department_id" as d,
  count(distinct "gender", "education_level") as c0,
  count(distinct "education_level") as c1,
  count(distinct "gender") as c2
from foodmart_clone."employee"
group by "department_id";{code} can be evaluated using an equivalent query that 
uses grouping sets:{code}select d,
  count(case i when 0 then 1 else null end) as c0,
  count(case i when 1 then 1 else null end) as c1,
  count(case i when 2 then 1 else null end) as c2
from (
  select "department_id" as d,
    grouping_id("gender", "education_level") as i,
   "gender" as g,
   "education_level" as e
  from foodmart_clone."employee"
  group by "department_id",
    grouping sets ("gender", "education_level", ("gender", "education_level")))
group by d;{code}

> Implement multiple distinct-COUNT using GROUPING SETS
> -----------------------------------------------------
>
>                 Key: CALCITE-732
>                 URL: https://issues.apache.org/jira/browse/CALCITE-732
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>
> Currently if a query has COUNT(DISTINCT x) and COUNT(DISTINCT y) we compute 
> the distinct counts separately and combine them using a join. The join isn't 
> too expensive (because usually the GROUP BY has only a few keys) but we make 
> multiple scans over the base table.
> I think we could translate multiple distinct-counts into a GROUPING SETS 
> query (i.e. an Aggregate with more than one element in the groupSets field). 
> If the underlying engine can evaluate that efficiently, then we have saved 
> ourselves a join and several scans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (CALCITE-732) Implement multiple distinct-COUNT using GROUPING SETS

Reply via email to