[ https://issues.apache.org/jira/browse/CALCITE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188704#comment-17188704 ]
Julian Hyde commented on CALCITE-4199: -------------------------------------- I'm going to stop arguing now. > Add nullability annotations to the methods and fields, ensure consistency > with checkerframework > ----------------------------------------------------------------------------------------------- > > Key: CALCITE-4199 > URL: https://issues.apache.org/jira/browse/CALCITE-4199 > Project: Calcite > Issue Type: New Feature > Affects Versions: 1.25.0 > Reporter: Vladimir Sitnikov > Priority: Major > > The current codebase uses jsr305 javax.annotation.Nullable / NonNull which > helps to catch bugs while developing Calcite and libraries. > Unfortunately, the current annotation is not enforced, and it lacks support > for generics. > jsr305.jar is probably abandoned (see > https://github.com/google/guava/issues/2960), so we should probably migrate > to something else. > https://checkerframework.org/ is a solid framework for machine verification > which is tailored to Java. > The releases are quite frequent: > https://github.com/typetools/checker-framework/releases > Their annotations are recognized by major IDEs. > So I see the following options: > a) Leave the code as is > b) Annotate (gradually?) the code with checkerframework annotations > c) Migrate (gradually?) the code to Kotlin > I've created a PR to verify which changes would be needed, verify if CI is > able to type check in a reasonable time, and so on: > https://github.com/apache/calcite/pull/1929 > My findings regarding checkerframework so far are: > 0) It does detect NPEs (which were hidden in the current code), and it does > detect initialized files in the constructors > 1) Checkerframework takes ~2 minutes for {{:linq4j}}, and 9min for {{:core}} > in GitHub Actions CI > 2) "non-nullable by default" is quite close to the current Calcite > conventions. > 3) There are cases when javadoc comments says "or null", however, the code > reads much easier if {{@Nullable}} nullable appears in the signature > 4) If a third-party library supplies invalid type annotations, there's a way > to fix that. For instance, Guava's Function is annotated as "always > nullable", and we can overrule that (so the nullability is taken from generic > signature rather than "always nullable"). The override files are placed to > src/main/config/checkerframework > 5) Generic-heavy code might be challenging (they are always like that), > however, in the most obscure cases there's always a way to suppress the > warning > 6) I've filed a Gradle improvement so it schedules recently modified files > first for the compilation: https://github.com/gradle/gradle/issues/14332 > --- > h2. Option a: leave the code as is > Typically, Calcite code does have decent documentation if {{null}} is > permissible, however, I often need to check the existing calls in order to > figure out if {{null}} is expected. > Note: nullability does not mean just "null vs non-null" value, but it is > related to other common pitfalls like "failing to initialize the value in > constructor", "calling non-final method in the constructor" > h2. Option b: annotate (gradually?) the code with checkerframework annotations > Pros: > * Checkerframework verifies one method at a time, so the resulting code is > easy to reason about. For instance, it requires that {{boolean > equals(@Nullable Object other)}} overloads must explicitly declare > {{@Nulable}} annotation. That might sound too verbose at first, however, it > helps to read the code: you don't need to analyze class/interface > declarations in order to figure out if nulls are expected or not. > * Checkerframework supports nullability with generics. For instance, {{class > Wrapper<T>}} means T can be used as either non-nullable or nullable. {{class > Wrapper<T>}} means T is always nullable (all implementations can assume T is > always nullable), and {{class Wrapper<T extends @NonNull Object>}} (nonnull > is redundant in this case and can be omitted) means T can't be null. > * Checkerframework supports nullability with arrays. {{@Nullable Object[]}} > is non-nullable array with nullable elements, {{Object @Nullable []}} is > nullable array with non-nullable elements. > * The verifier sees untested code > * checkerframework has a coherent set of verifiers, and it is pluggable > * checkerframework compiler understand lots of different nullability > annotations, so we could use jsr305 nullability annotations instead of the > ones from checkerframework > * Nullability annotations might make it easier to contribute: one has to > reason about nulls anyway, and the annotations make it simpler. > Cons: > * Nullability annotations add an extra burden on developers. Checkerframework > documentation is good, however, it might be challenging to figure out the > proper annotations especially when generics and/or inheritance is involved > * Verification takes significant time, which is especially sad for local > development. For instance, {{core}} verification takes ~10 minutes. Note: > extra time is spent only in case checkerframework verification is activated, > and the verification itself is NOT required for regular Calcite operation > (e.g. development, testing, etc). The nullability verification is similar to > code style checks like Checkstyle: it can be executed as a completely > separate task. > * IDEs do not have full support of checkerframework annotations. For > instance, they can't infer nullability from generic signatures, so the code > might look OK in IDE, and it would fail verification. At the same time, IDE > might show false positives (warning in IDE while the verification passes) > * Other languages might miss nullability inference from checkerframework > annotations. For instance, Kotlin compiler can infer nullability from jsr305 > annotations, however, it does not understand checkerframework yet > * Checkerframework verifies one method at a time, so does not know if certain > methods are used in a specific order only. One of the cases is {{Enumerator}} > interface with {{current}}, {{moveNext}}, and {{close}} methods. Checker > assumes the clients might call the methods in any order, so it enforces that > null-checks must be placed in all the methods. It might sound like "adding > verbosity to make checker happy" > * Checkerframework is not suitable for test code, as it would take > significant effort to write nullability annotations, and it would make test > code much harder to write. For instance, checkerframework knows nothing on > JUnit's {{@Before...}} annotations, so it assumes the fields are not > initialized in the test methods > * Nullability annotations might sound like adding a new language. For > instance, the difference between {{<T> public T get()}}, {{<@Nullable T> > public T get()}}, {{<T> public @Nullable T get()}}, {{<T extends @Nullable > Object> public T get()}}, {{<@PolyNull T> public @PolyNull T get()}} might be > not that easy to understand. > * Java's type system is unsound, so it is hard to write code without > suppression. For instance: {{Map.get(...)}} sometimes permits nulls > (HashMap), and sometimes it does not (ImmutableMap). That is one of the > reasons {{Map.get(...)}} and {{Map.remove(..)}} are annotated to require > non-nullable argument. > * Nullability annotations might make it harder to contribute: > checkerframework might be unfamiliar to the contributors > * Nullability annotations do not replace runtime checks, so we still need to > use {{requireNonNull(...)}} to verify inputs in the runtime > * There's a risk of human error while migrating the code: a wrong assertion > might be added, so the code might start failing > h2. Option b: migrate (gradually?) the code to Kotlin > Pros: > * Better IDE support than the one of checkerframework > * Standard library contains lots of collection manipulation code, so > typical-for-Calcite Itarables.transform. Lists.transform, and so on would be > easier to read and write. Streams in Java are still too verbose, especially > for one-time mapping. > For instance, > {noformat} > return columns.stream() > .map(columnOrdinal -> table.getRowType().getFieldNames() > .get(columnOrdinal)) > .collect(Collectors.toList());{noformat} > becomes > {noformat} > return columns.map { columnOrdinal -> > table.getRowType().getFieldNames().get(columnOrdinal) > } > {noformat} > or even > {noformat} > return columns.map { table.getRowType().getFieldNames().get(it) } > {noformat} > * Faster (?) compilation times. Even though Kotlin is slower to compile than > regular Java, Kotlin is way faster than checkerframework compiler. So > edit-compile cycle would likely be much faster > * Some type annotations would work for Kotlin as well. For instance, > {{@CheckReturnValue}} would work in IDEs even for Kotlin code > * It might attract contributors. StackOverflow 2020 survey suggests [Kotlin > is the 4th loved (and 6th wanted) programming > language|https://insights.stackoverflow.com/survey/2020#technology-most-loved-dreaded-and-wanted-languages-loved] > * Kotlin integrates with Java nicely: Java classes can extend Kotlin classes > and vice versa > * Kotlin is safe and pleasant to use for test code as well. For instance, > {{lateinit var}} enables to have non-nullable reference and initialize it in > {{@Before..}}. > * Migration to Kotlin can be fully backward compatible with Java (see OkHttp > [migration|https://square.github.io/okhttp/upgrading_to_okhttp_4/] > [case|https://cashapp.github.io/2019-06-26/okhttp-4-goes-kotlin]) > * Kotlin advanced deprecation features simplify migrations for the clients. > For instance, {{@Deprecated}} annotation in Kotlin can suggest the > replacements, so IDEs would suggest replacing the deprecations. With Java we > have to guess the alternatives > * Better "public API" support: Kotlin code fails to compile if a public class > inherits from a private (or package-private) one. > * Non-nullable arguments would automatically be verified, so hand-crafted > {{Objects.requireNonNull(...)}} could be removed > Cons: > * Kotlin might make it harder to contribute: Kotlin might be unfamiliar to > the contributors > * There's a risk of human error while migrating the code: a wrong assertion > might be added, so the code might start failing > * Kotlin standard library becomes an extra 1MiB dependency -- This message was sent by Atlassian Jira (v8.3.4#803005)