[ 
https://issues.apache.org/jira/browse/CALCITE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189069#comment-17189069
 ] 

Julian Hyde commented on CALCITE-4199:
--------------------------------------

Please don't change anything without review.

Statistic is an SPI. Users can provide their own implementation, and we have to 
deal with it. You can't tighten the contract.

> Add nullability annotations to the methods and fields, ensure consistency 
> with checkerframework
> -----------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4199
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4199
>             Project: Calcite
>          Issue Type: New Feature
>    Affects Versions: 1.25.0
>            Reporter: Vladimir Sitnikov
>            Priority: Major
>
> The current codebase uses jsr305 javax.annotation.Nullable / NonNull which 
> helps to catch bugs while developing Calcite and libraries.
> Unfortunately, the current annotation is not enforced, and it lacks support 
> for generics.
> jsr305.jar is probably abandoned (see 
> https://github.com/google/guava/issues/2960), so we should probably migrate 
> to something else.
> https://checkerframework.org/ is a solid framework for machine verification 
> which is tailored to Java.
> The releases are quite frequent: 
> https://github.com/typetools/checker-framework/releases
> Their annotations are recognized by major IDEs.
> So I see the following options:
> a) Leave the code as is
> b) Annotate (gradually?) the code with checkerframework annotations
> c) Migrate (gradually?) the code to Kotlin
> I've created a PR to verify which changes would be needed, verify if CI is 
> able to type check in a reasonable time, and so on: 
> https://github.com/apache/calcite/pull/1929
> My findings regarding checkerframework so far are:
> 0) It does detect NPEs (which were hidden in the current code), and it does 
> detect initialized files in the constructors
> 1) Checkerframework takes ~2 minutes for {{:linq4j}}, and 9min for {{:core}} 
> in GitHub Actions CI
> 2) "non-nullable by default" is quite close to the current Calcite 
> conventions.
> 3) There are cases when javadoc comments says "or null", however, the code 
> reads much easier if {{@Nullable}} nullable appears in the signature
> 4) If a third-party library supplies invalid type annotations, there's a way 
> to fix that. For instance, Guava's Function is annotated as "always 
> nullable", and we can overrule that (so the nullability is taken from generic 
> signature rather than "always nullable"). The override files are placed to 
> src/main/config/checkerframework
> 5) Generic-heavy code might be challenging (they are always like that), 
> however, in the most obscure cases there's always a way to suppress the 
> warning
> 6) I've filed a Gradle improvement so it schedules recently modified files 
> first for the compilation: https://github.com/gradle/gradle/issues/14332
> ---
> h2. Option a: leave the code as is
> Typically, Calcite code does have decent documentation if {{null}} is 
> permissible, however, I often need to check the existing calls in order to 
> figure out if {{null}} is expected.
> Note: nullability does not mean just "null vs non-null" value, but it is 
> related to other common pitfalls like "failing to initialize the value in 
> constructor", "calling non-final method in the constructor"
> h2. Option b: annotate (gradually?) the code with checkerframework annotations
> Pros:
> * Checkerframework verifies one method at a time, so the resulting code is 
> easy to reason about. For instance, it requires that {{boolean 
> equals(@Nullable Object other)}} overloads must explicitly declare 
> {{@Nulable}} annotation. That might sound too verbose at first, however, it 
> helps to read the code: you don't need to analyze class/interface 
> declarations in order to figure out if nulls are expected or not.
> * Checkerframework supports nullability with generics. For instance, {{class 
> Wrapper<T>}} means T can be used as either non-nullable or nullable. {{class 
> Wrapper<T>}} means T is always nullable (all implementations can assume T is 
> always nullable), and {{class Wrapper<T extends @NonNull Object>}} (nonnull 
> is redundant in this case and can be omitted) means T can't be null.
> * Checkerframework supports nullability with arrays. {{@Nullable Object[]}} 
> is non-nullable array with nullable elements, {{Object @Nullable []}} is 
> nullable array with non-nullable elements.
> * The verifier sees untested code
> * checkerframework has a coherent set of verifiers, and it is pluggable
> * checkerframework compiler understand lots of different nullability 
> annotations, so we could use jsr305 nullability annotations instead of the 
> ones from checkerframework
> * Nullability annotations might make it easier to contribute: one has to 
> reason about nulls anyway, and the annotations make it simpler.
> Cons:
> * Nullability annotations add an extra burden on developers. Checkerframework 
> documentation is good, however, it might be challenging to figure out the 
> proper annotations especially when generics and/or inheritance is involved
> * Verification takes significant time, which is especially sad for local 
> development. For instance, {{core}} verification takes ~10 minutes. Note: 
> extra time is spent only in case checkerframework verification is activated, 
> and the verification itself is NOT required for regular Calcite operation 
> (e.g. development, testing, etc). The nullability verification is similar to 
> code style checks like Checkstyle: it can be executed as a completely 
> separate task.
> * IDEs do not have full support of checkerframework annotations. For 
> instance, they can't infer nullability from generic signatures, so the code 
> might look OK in IDE, and it would fail verification. At the same time, IDE 
> might show false positives (warning in IDE while the verification passes)
> * Other languages might miss nullability inference from checkerframework 
> annotations. For instance, Kotlin compiler can infer nullability from jsr305 
> annotations, however, it does not understand checkerframework yet
> * Checkerframework verifies one method at a time, so does not know if certain 
> methods are used in a specific order only. One of the cases is {{Enumerator}} 
> interface with {{current}}, {{moveNext}}, and {{close}} methods. Checker 
> assumes the clients might call the methods in any order, so it enforces that 
> null-checks must be placed in all the methods. It might sound like "adding 
> verbosity to make checker happy"
> * Checkerframework is not suitable for test code, as it would take 
> significant effort to write nullability annotations, and it would make test 
> code much harder to write. For instance, checkerframework knows nothing on 
> JUnit's {{@Before...}} annotations, so it assumes the fields are not 
> initialized in the test methods
> * Nullability annotations might sound like adding a new language. For 
> instance, the difference between {{<T> public T get()}}, {{<@Nullable T> 
> public T get()}}, {{<T> public @Nullable T get()}}, {{<T extends @Nullable 
> Object> public T get()}}, {{<@PolyNull T> public @PolyNull T get()}} might be 
> not that easy to understand.
> * Java's type system is unsound, so it is hard to write code without 
> suppression. For instance: {{Map.get(...)}} sometimes permits nulls 
> (HashMap), and sometimes it does not (ImmutableMap). That is one of the 
> reasons {{Map.get(...)}} and {{Map.remove(..)}} are annotated to require 
> non-nullable argument.
> * Nullability annotations might make it harder to contribute: 
> checkerframework might be unfamiliar to the contributors
> * Nullability annotations do not replace runtime checks, so we still need to 
> use {{requireNonNull(...)}} to verify inputs in the runtime
> * There's a risk of human error while migrating the code: a wrong assertion 
> might be added, so the code might start failing
> h2. Option b: migrate (gradually?) the code to Kotlin
> Pros:
> * Better IDE support than the one of checkerframework
> * Standard library contains lots of collection manipulation code, so 
> typical-for-Calcite Itarables.transform. Lists.transform, and so on would be 
> easier to read and write. Streams in Java are still too verbose, especially 
> for one-time mapping.
> For instance,
> {noformat}
>     return columns.stream()
>         .map(columnOrdinal -> table.getRowType().getFieldNames()
>             .get(columnOrdinal))
>         .collect(Collectors.toList());{noformat}
> becomes
> {noformat}
>     return columns.map { columnOrdinal ->
>         table.getRowType().getFieldNames().get(columnOrdinal)
>     }
> {noformat}
> or even
> {noformat}
>     return columns.map { table.getRowType().getFieldNames().get(it) }
> {noformat}
> * Faster (?) compilation times. Even though Kotlin is slower to compile than 
> regular Java, Kotlin is way faster than checkerframework compiler. So 
> edit-compile cycle would likely be much faster
> * Some type annotations would work for Kotlin as well. For instance, 
> {{@CheckReturnValue}} would work in IDEs even for Kotlin code
> * It might attract contributors. StackOverflow 2020 survey suggests [Kotlin 
> is the 4th loved (and 6th wanted) programming 
> language|https://insights.stackoverflow.com/survey/2020#technology-most-loved-dreaded-and-wanted-languages-loved]
> * Kotlin integrates with Java nicely: Java classes can extend Kotlin classes 
> and vice versa
> * Kotlin is safe and pleasant to use for test code as well. For instance, 
> {{lateinit var}} enables to have non-nullable reference and initialize it in 
> {{@Before..}}.
> * Migration to Kotlin can be fully backward compatible with Java (see OkHttp 
> [migration|https://square.github.io/okhttp/upgrading_to_okhttp_4/] 
> [case|https://cashapp.github.io/2019-06-26/okhttp-4-goes-kotlin])
> * Kotlin advanced deprecation features simplify migrations for the clients. 
> For instance, {{@Deprecated}} annotation in Kotlin can suggest the 
> replacements, so IDEs would suggest replacing the deprecations. With Java we 
> have to guess the alternatives
> * Better "public API" support: Kotlin code fails to compile if a public class 
> inherits from a private (or package-private) one.
> * Non-nullable arguments would automatically be verified, so hand-crafted 
> {{Objects.requireNonNull(...)}} could be removed
> Cons:
> * Kotlin might make it harder to contribute: Kotlin might be unfamiliar to 
> the contributors
> * There's a risk of human error while migrating the code: a wrong assertion 
> might be added, so the code might start failing
> * Kotlin standard library becomes an extra 1MiB dependency



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to