[jira] [Commented] (CALCITE-1824) GROUP_ID returns wrong result

Vladimir Sitnikov (Jira) Mon, 06 Jan 2020 22:59:30 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009408#comment-17009408
 ]


Vladimir Sitnikov commented on CALCITE-1824:
--------------------------------------------

How about rewriting non-zero grouping cases with a help of UNION ALL?

For instance:

{code:sql}SELECT deptno, GROUP_ID() AS g
FROM Emp
GROUP BY GROUPING SETS (deptno, (), ());{code}
=>
{code:sql}SELECT deptno, 0 AS g
FROM Emp
GROUP BY GROUPING SETS (deptno, ())
union all
SELECT null, 1 AS g
FROM Emp
GROUP BY GROUPING SETS (())
;{code}

It looks to be simpler to reason about, and it would probably be easier for 
pattern-matching (e.g. materialization review)


> GROUP_ID returns wrong result
> -----------------------------
>
>                 Key: CALCITE-1824
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1824
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Priority: Major
>
> We implemented the {{GROUP_ID()}} function in CALCITE-512 but we got the 
> specification wrong, and it returns the wrong result.
> {{GROUP_ID}} is not in the SQL standard. It is implemented only by Oracle.
> I mistakenly believed that {{GROUP_ID()}} is equivalent to {{GROUPING_ID(g1, 
> ..., gn)}} (in a query with {{GROUP BY g1, ..., gn}}). In fact, {{GROUP_ID}} 
> is useful only if you have duplicate grouping sets. If grouping sets are 
> distinct, {{GROUP_ID()}} will always return zero.
> Example 1
> {code}SELECT deptno, job, GROUP_ID() AS g
> FROM Emp
> GROUP BY ROLLUP(deptno, job)
>     DEPTNO JOB                      G
> ---------- --------- ----------
>       10 CLERK              0
>       10 MANAGER            0
>       10 PRESIDENT          0
>       10                    0
>       20 CLERK              0
>       20 ANALYST            0
>       20 MANAGER            0
>       20                    0
>       30 CLERK              0
>       30 MANAGER            0
>       30 SALESMAN           0
>       30                    0
>                             0
> {code} produces grouping sets (deptno, job), (deptno), (). These are 
> distinct, so GROUP_ID() is 0 for all rows.
> Example 2
> {code}SELECT deptno, GROUP_ID() AS g
> FROM Emp
> GROUP BY GROUPING SETS (deptno, (), ());
>     DEPTNO        G
> ---------- ----------
>       10          0
>       20          0
>       30          0
>                   0
>                   1
> {code}
> As you can see, the grouping set () occurs twice. So there is one row in the 
> result for each occurrence: the first occurrence has g = 0; the second has g 
> = 1.
> In my fix for CALCITE-1069, I will change GROUP_ID() to always return 0. This 
> is wrong, but nevertheless closer to the required behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-1824) GROUP_ID returns wrong result

Reply via email to