[ 
https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730760#comment-17730760
 ] 

Will Noble commented on CALCITE-5767:
-------------------------------------

I updated the description and summary, but kept the verbose prose description 
because it's really a complicated problem with seemingly no ideal solution. 
Really, there are 4 potential "problems" that could be blamed for this. The 
first 3 are MSSQL's seemingly non-standard design decisions outlined in the 
ticket description (each of these is not necessarily something that needs 
changing, but if any one of these were different, there would be no problem). 
The fourth potential target for blame is the fact that {{RelBuilder}} uses a 
"default" null direction by default instead of {{NullDirection.UNSPECIFIED}} 
which, if used, would also eliminate this weird issue, and allow us to return 
{{null}} from {{emulateNullDirection}} and not have to include an effectively 
constant expression in the {{ORDER BY}} list.

But, changing the default null direction in {{RelBuilder}} from NULLS-high to 
unspecified would be a major breaking change, so I'm assuming that's a 
non-starter.

> Calcite's MSSQL dialect should not give GROUPING special treatment when 
> emulating NULL direction
> ------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5767
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5767
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Assignee: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>
> {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} 
> calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
>  This seems to be an optimization attempt since {{GROUPING}} is known to 
> never return {{{}NULL{}}}, and therefore needs no null direction emulation. 
> However, this causes problems because:
> 1. MSSQL does not have the same default null direction as Calcite's 
> {{RelBuilder}} (and most other dialects).
> 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
> 3. MSSQL does not allow sorting on the same field twice, even though there is 
> no theoretical issue with this.
> Each of these properties must be present for the problem to occur, so it's a 
> bit niche and specific to MSSQL. Seems like the best solution is to simply 
> eliminate the special-case treatment for {{GROUPING}} in 
> {{{}emulateNullDirection{}}}, which is currently creating problems due to 
> property #3.
> {*}More in-depth explanation{*}:
> In {{{}RelBuilder.collation{}}}, we use the "default null direction" to 
> insert rex nodes as sorting expressions, but this is only the default null 
> direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem 
> because MSSQL has special-case logic for emulating null direction of GROUPING 
> calls, whereby it effectively duplicates the expression. Really, 
> {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning 
> {{null}} instead, signalling to callers that no null-direction emulation is 
> necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes 
> another problem due to property #2 above when the null direction is 
> non-default as is caused simply by using {{RelBuilder.collation}} as 
> described above (it should be noted that this method takes rex nodes instead 
> of {{RelFieldCollation}} object, so there is no way to specify null 
> direction) because the non-default null direction is not expanded into a 
> {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} 
> syntax).
> Here's a test illustrating the problem:
> Input SQL (default dialect)
> {code:xml}
> select "product_class_id", "brand_name", GROUPING("brand_name")
> from "product"
> group by GROUPING SETS (("product_class_id", "brand_name"), 
> ("product_class_id"))
> order by 3, 2, 1
> {code}
> Current behavior for unparsing as MSSQL (incorrect because it orders by the 
> same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail 
> if you try to actually run this against a real MSSQL database, even though it 
> seems like it shouldn't):
> {code:xml}
> SELECT [product_class_id], [brand_name], GROUPING([brand_name])
> FROM [foodmart].[product]
> GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
> ORDER BY
>          GROUPING([brand_name]),
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns 
> {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS 
> LAST}} syntax):
> {code:xml}
> ...
> ORDER BY
>          3 NULLS LAST,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is 
> effectively ordering by a constant, this will at least run and produce the 
> correct results):
> {code:xml}
> ...
> ORDER BY
>          CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to