[ 
https://issues.apache.org/jira/browse/CALCITE-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189745#comment-17189745
 ] 

Vladimir Sitnikov commented on CALCITE-4215:
--------------------------------------------

[~julianhyde], I've prepared a PR that makes {{Statistic}} methods 
{{return null}} by default (especially for the 
{{org.apache.calcite.schema.Statistics#UNKNOWN}} instance), and it resulted in 
quite a few NPEs.

Some observations:
1) RelTrait does not support null values. Both collation and distribution 
traits exist in core, and they fail with NPE when rules attempt to replace 
"collation trait with null". I don't think we want to support {{null}} trait 
values yet, do we?

2) There is code that treats 
{{org.apache.calcite.sql.validate.SqlMonotonicity}} as an exhaustive enum. 
This change 
[https://github.com/apache/calcite/pull/2136/commits/bce22a73f3f244f82facd495ecac9c91c9bb0bb9]
 was hard to figure out. Without it, OVER expressions are converted 
with an extra order column.

3) There are non-trivial {{collation}} usages: 
[https://github.com/apache/calcite/pull/2136/commits/cb2e93ba8acb510303ed36e6aec1dbb8ce68bea1]

Frankly speaking, I don't think there's much difference between "I don't know 
if the table has any collation" and "I know the table has no collation". At the 
end of the day, if the collation is not known, it is unlikely we could figure 
it out. Then it is indistinguishable from "table is completely unsorted".

Tracking a {{null}} value for {{SqlMonotonicity}} would be hard for both Calcite 
core (unless we go with Kotlin or checkerframework) and Calcite users.
What if we add a {{SqlMonotonicity.UNKNOWN}} enum value and 
{{RelDistributions.UNKNOWN}}?
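
To make the sentinel idea concrete, here is a minimal sketch. The {{Monotonicity}} enum and {{canSkipSort}} method below are hypothetical stand-ins invented for illustration; they are not Calcite's {{SqlMonotonicity}} or any existing Calcite API, and the real enum has more values.

```java
public class UnknownSentinelSketch {
  /** Hypothetical stand-in for SqlMonotonicity; NOT the real Calcite enum. */
  enum Monotonicity { INCREASING, DECREASING, CONSTANT, NOT_MONOTONIC, UNKNOWN }

  /**
   * Hypothetical helper: returns true only when an ordering is positively
   * known. UNKNOWN is listed explicitly, so "not known" is a visible case
   * rather than a null that every caller has to remember to check.
   */
  static boolean canSkipSort(Monotonicity m) {
    switch (m) {
      case INCREASING:
      case DECREASING:
      case CONSTANT:
        return true;
      case NOT_MONOTONIC:
      case UNKNOWN:
        return false;
      default:
        // Reached only if a new enum value is added without updating this switch.
        throw new AssertionError("unhandled value: " + m);
    }
  }

  public static void main(String[] args) {
    System.out.println(canSkipSort(Monotonicity.INCREASING)); // prints "true"
    System.out.println(canSkipSort(Monotonicity.UNKNOWN));    // prints "false"
  }
}
```

With a sentinel, callers never dereference {{null}}, but every existing switch or comparison still has to decide what {{UNKNOWN}} means for it.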

Of course, adding explicit {{@Nullable}} annotations to Calcite would help to 
detect such NPEs; however, it would not help us detect cases like 
{{if (monotonicity != NOT_MONOTONIC)}}.
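
A small sketch of why a negative check slips through: comparing a reference with {{!=}} never dereferences it, so a {{null}} value passes silently instead of throwing. The enum and method names here are hypothetical and only mirror the shape of the check quoted above; they are not taken from Calcite.

```java
public class NegativeCheckSketch {
  /** Hypothetical two-value enum, for illustration only. */
  enum Monotonicity { INCREASING, NOT_MONOTONIC }

  /** Negative check: anything that is not NOT_MONOTONIC passes, including null. */
  static boolean looksMonotonic(Monotonicity m) {
    return m != Monotonicity.NOT_MONOTONIC;
  }

  /** Positive check: only an explicitly known value passes. */
  static boolean isKnownMonotonic(Monotonicity m) {
    return m == Monotonicity.INCREASING;
  }

  public static void main(String[] args) {
    // No NPE is thrown in either call: reference comparison does not dereference m.
    System.out.println(looksMonotonic(null));   // prints "true"
    System.out.println(isKnownMonotonic(null)); // prints "false"
  }
}
```

The same applies to a new sentinel value: a check written as "not {{NOT_MONOTONIC}}" would treat it as monotonic unless the check is rewritten in the positive form.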

> Ensure org.apache.calcite.schema.Statistic uses null vs emptyList 
> appropriately
> -------------------------------------------------------------------------------
>
>                 Key: CALCITE-4215
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4215
>             Project: Calcite
>          Issue Type: Sub-task
>          Components: core
>    Affects Versions: 1.25.0
>            Reporter: Vladimir Sitnikov
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> null: statistic is *not* *known*
> emptyList: statistic is *known*, and the value is *empty* (e.g. no unique 
> keys in the table)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
