[ 
https://issues.apache.org/jira/browse/CALCITE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17735913#comment-17735913
 ] 

Will Noble commented on CALCITE-5775:
-------------------------------------

Filed CALCITE-5793.

Calcite already is so inclined to group and order by ordinals for BQ. It just 
doesn't work for null direction emulation since that requires adding an 
addition {{IS NULL}} operator, and the only operators you can apply to sorting 
ordinals is {{DESC}} or {{NULLS FIRST}} / {{LAST}}, so it has to expand it into 
the full expression in that case. This could potentially be mitigated by 
introducing a subquery with the {{IS NULL}} expression in the select list. This 
may be worth considering.

I believe that previous bug you're talking about is CALCITE-5767. I'm not sure 
if that's related to the claims that you think are red herrings, but the 
problems outlined in this bug mean that only a single null direction (the 
default for whatever dialect is being unparsed) can possibly work. The user 
does not have the ability to use the non-default null direction if it has to be 
emulated with an extra {{IS NULL}} sort.

> Null direction emulation broken for complex expressions on some dialects
> ------------------------------------------------------------------------
>
>                 Key: CALCITE-5775
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5775
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>
> This is a problem in BigQuery, and may be a problem in other dialects as 
> well. Consider the following piece of valid BQ SQL:
> {code:sql}
> SELECT REPEAT(first_name, 2),
>        COUNT(id)
> FROM looker_test.users
> GROUP BY REPEAT(first_name, 2)
> ORDER BY 1
> {code}
> Now consider a version where the {{ORDER BY}} clause is changed to this:
> {code:sql}
> ORDER BY REPEAT(first_name, 2)
> {code}
> This is logically the same query, because the expression in the {{ORDER}} 
> clause is the same as the one in the {{SELECT}} / {{GROUP}} clauses. BigQuery 
> is sophisticated enough to match the select to the group expression in both 
> queries, but cannot match either with the order expression. It gives this 
> error: _ORDER BY clause expression references column first_name which is 
> neither grouped nor aggregated_.
> So, when sorting by complex expressions in BQ, Calcite relies on either:
> * No null direction emulation required. 
> * sorting by alias or ordinal, which is a problem with current null direction 
> emulation because it adds an extra complex sort expression, or
> * having a query that just happens to also have the underlying field in the 
> {{GROUP BY}} clause by itself, which seems to actually happen pretty often in 
> my testing, but obviously shouldn't be a constraint.
> As I wrote that, I realized this may be easily fixable for BQ since it added 
> support for {{NULLS FIRST}} / {{LAST}} in 2020 and it seems Calcite has not 
> caught up yet. Consider this rel node:
> {code}
> LogicalSort(sort0=[$0], dir0=[ASC])
>   LogicalAggregate(group=[{0}], cent=[COUNT($1)])
>     LogicalProject($f0=[CASE(IS NULL($4), 0, 1)], MGR=[$3])
>       JdbcTableScan(table=[[JDBC_SCOTT, EMP]])
> {code}
> Calcite would convert it to this in BigQuery due to null direction emulation:
> {code:sql}
> SELECT CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END AS `$f0`, COUNT(MGR) AS 
> cent
> FROM SCOTT.EMP
> GROUP BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END
> ORDER BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END IS NULL, 1
> {code}
> Which of course triggers the problem described above. This may be a problem 
> for MSSQL as well since it doesn't support {{NULLS LAST}}. The fix for BQ, at 
> least, may be to just support {{NULLS LAST}} and sort by ordinal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to