Suppose we have one row that represents the total for department 10, and 
another that represents the grand total of all departments. Which row would we 
say that department is “grouped” (in Calcite’s parlance) or “aggregated” in 
(Hive’s parlance)?

I find the terms confusing. It’s possible that Calcite has them “wrong”. 

I would simply run a query like

  SELECT deptno, job, GROUPING(deptno, job), COUNT(*)
  FROM emp
  GROUP BY CUBE (deptno, job)

and see whether Hive and Calcite return the same result. 

> On Apr 20, 2020, at 6:58 PM, ZZY <joa...@gmail.com> wrote:
> 
> Hi, Hyde:
> It's confused me that some annotations in
> Calcite(org.apache.calcite.sql.fun.SqlGroupingFunction.java) :
> /**
> * The {@code GROUPING} function.
> *
> * <p>Accepts 1 or more arguments.
> * Example: {@code GROUPING(deptno, gender)} returns
> * 3 if both deptno and gender are being grouped,
> * 2 if only deptno is being grouped,
> * 1 if only gender is being groped,
> * 0 if neither deptno nor gender are being grouped.
> *
> * <p>This function is defined in the SQL standard.
> * {@code GROUPING_ID} is a non-standard synonym.
> *
> * <p>Some examples are in {@code agg.iq}.
> */
> 
> The annotations above seems conflicts with other implementations like Hive(
> https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction
> )
> 
> Notice that: "The grouping function indicates whether an expression in a
> GROUP BY clause is aggregated or not for a given row. The value 0
> represents a column that is part of the grouping set, while the value 1
> represents a column that is not part of the grouping set. "
> 
> 
> It is clearly that 0 and 1 bit have different interpretation  between
> annotations in Calcite and in Hive. And I did not figure out why...
> 
> Any feedback can give me on this would be highly appreciated.
> 
> Best regards!

Reply via email to