> I'd argue there's value in asking Beam developers to make that change. It makes us acknowledge that there's a possibility myNullableObject is null. If myNullableObject being null is something relevant to the user we would likely want to wrap it in some other exception or provide a more useful message than just NPE(!!).

Agreed, if we can throw a better exception, that is what should be done. That's what I meant by "tell the user what specifically is missing or going wrong". I'm not sure this solution would be applicable in all places, though.

> The reason there are so many suppressions is that fixing all the errors is a monumental task. Rather than addressing them all, Kenn programmatically added suppressions for classes that failed the checks (https://github.com/apache/beam/pull/13261). This allowed us to start running the checker on the code that passes it while the errors are fixed.

That's understandable as well. Essentially, what I think is the most questionable part is that _the checker modified source code_ by adding @Nonnull annotations (and others). That is something that can annoy users (warnings in IDEs, or source code full of annotations) even more than (mostly) quite easy-to-debug NPEs do. I'm not saying it is doing more harm than good, I'm just saying we don't know that.

Is it an option for the checker to only check and not modify the source code? If so, I think it might be a good idea to switch to that mode.

 Jan

On 1/22/21 7:37 PM, Brian Hulette wrote:


On Fri, Jan 22, 2021 at 1:18 AM Jan Lukavský <[email protected]> wrote:

    Hi,

    I'll give my two cents here.

    I'm not 100% sure that the 1-5% of bugs are as severe as other
    types of bugs. Yes, throwing NPEs at the user is not very polite. On
    the other hand, many of these actually boil down to user errors.
    Then we might ask what a correct solution would be. If we manage
    to figure out what the actual problem is and tell the user what
    specifically is missing or going wrong, that would be just
    awesome. On the other hand, if a tool used for avoiding
    "unexpected" NPEs forces us to write

       Object value = Objects.requireNonNull(myNullableObject); // or similar using Preconditions
       value.doStuff();

    instead of just

      myNullableObject.doStuff();

    what we have actually done is a) made the framework happy, and
    b) shifted the line at which the NPE is thrown by one (and yes,
    internally prevented the JVM from throwing SIGSEGV at itself, but
    that is a deeply technical detail). Nothing changed semantically
    from the user's perspective.

I'd argue there's value in asking Beam developers to make that change. It makes us acknowledge that there's a possibility myNullableObject is null. If myNullableObject being null is something relevant to the user we would likely want to wrap it in some other exception or provide a more useful message than just NPE(!!).
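As a sketch of that idea (the names `coder` and `setCoder` here are illustrative, not a specific Beam API), a null check can carry an actionable message instead of a bare NPE:

```java
import java.util.Objects;

class CoderCheckExample {
    // Instead of letting a bare NPE escape later, fail early with a message
    // that tells the user what is actually missing and how to fix it.
    static String describe(Object coder) {
        Object value = Objects.requireNonNull(
                coder, "No coder was set for this transform; call setCoder(...) first");
        return value.toString();
    }
}
```

`Objects.requireNonNull(obj, message)` throws an NPE whose message names the missing piece; Guava's `Preconditions.checkNotNull` works the same way.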


    Now, given that the framework significantly increases compile time
    (due to all the checks), causes many "bugs" to be reported by
    static code analysis tools (because the framework adds @Nonnull
    default annotations everywhere, even where Beam's code actually
    relies on the nullability of a field), and given how much we
    currently suppress these checks ($ git grep BEAM-10402 | wc -l ->
    1981), I'd say this deserves a deeper discussion.

The reason there are so many suppressions is that fixing all the errors is a monumental task. Rather than addressing them all, Kenn programmatically added suppressions for classes that failed the checks (https://github.com/apache/beam/pull/13261). This allowed us to start running the checker on the code that passes it while the errors are fixed.

     Jan


    On 1/20/21 10:48 PM, Kenneth Knowles wrote:
    Yes, completely sound nullability checking has been added to the
    project via checkerframework, based on a large number of NPE bugs
    (1-5% depending on how you search, but many other bugs likely
    attributable to nullness-based design errors) which are extra
    embarrassing because NPEs were essentially solved, even in
    practice for Java, well before Beam existed.

    Checker framework is a pluggable type system analysis with some
    amount of control flow sensitivity. Every value has a type that
    is either nullable or not, and certain control structures (like
    checking for null) can alter the type inside a scope. The best
    way to think about it is to consider every value in the program
    as either nullable or not, much like you think of every value as
    either a string or not, and to view method calls as inherently
    stateful/nondeterministic. This can be challenging in esoteric
    cases, but usually makes the overall code health better anyhow.
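A minimal sketch of that flow sensitivity, with the checker's view noted in comments (plain Java here, so it compiles without the checkerframework dependency on the classpath):

```java
import java.util.Map;

class RefinementExample {
    static String lookup(Map<String, String> m, String key) {
        String v = m.get(key);  // the checker sees this as @Nullable String
        if (v == null) {        // the null check narrows the type...
            return "default";
        }
        return v.trim();        // ...so v is @NonNull String on this path
    }
}
```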

    Your example illustrates how problematic the design of the Java
    language is: the analysis cannot assume that `getDescription` is
    a pure function, and neither should you. Even if it is aware of
    boolean-short-circuit it would not be sound to accept this code.
    There is an annotation for this in the cases where it is true
    (like proto-generated getters):
    https://checkerframework.org/api/org/checkerframework/dataflow/qual/Pure.html
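For example (a sketch; the annotation is left in a comment so the snippet compiles without checkerframework on the classpath, and `Status` here is a hypothetical class, not the real gRPC one):

```java
class Status {
    private final String description; // may legitimately be null

    Status(String description) {
        this.description = description;
    }

    // @org.checkerframework.dataflow.qual.Pure
    // Marking the getter @Pure tells the checker that repeated calls return
    // the same value, so a null check on one call also covers the next call.
    String getDescription() {
        return description;
    }
}
```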

    The refactor for cases like this is trivial so there isn't a lot
    of value to thinking too hard about it.

    if (statusCode.equals(Code.INVALID_ARGUMENT)) {
      @Nullable String desc = statusCode.toStatus().getDescription();
      if (desc != null && desc.contains("finalized")) {
        return false;
      }
    }

    To a casual eye, this may look like a noop change. To the
    analysis it makes all the difference. And IMO this difference is
    real. Humans may assume it is a noop and humans would be wrong.
    So many times when you think/expect/hope that `getXYZ()` is a
    trivial getter method you later learn that it was tweaked for
    some odd reason. I believe this code change makes the code
    better. Suppressing these errors should be exceptionally rare,
    and never in normal code. It is far better to improve your code
    than to suppress the issue.

    It would be very cool for the proto compiler to annotate enough
    that new-and-improved type checkers could make things more
    convenient.

    Kenn

    On Mon, Jan 11, 2021 at 8:53 PM Reuven Lax <[email protected]> wrote:

        I can use that trick. However I'm surprised that the check
        appears to be so simplistic.

        For example, the following code triggers the check on
        getDescription().contains(), since getDescription returns a
        Nullable string. However even a simplistic analysis should
        realize that getDescription() was checked for null first! I
        have a dozen or so cases like this, and I question how useful
        such a simplistic check it will be.

        if (statusCode.equals(Code.INVALID_ARGUMENT)
            && statusCode.toStatus().getDescription() != null
            && statusCode.toStatus().getDescription().contains("finalized")) {
          return false;
        }


        On Mon, Jan 11, 2021 at 8:32 PM Boyuan Zhang <[email protected]> wrote:

            Yeah, it seems like the checker is enabled:
            https://issues.apache.org/jira/browse/BEAM-10402. I used
            @SuppressWarnings({"nullness"}) to suppress the error
            when I think it's not really a concern.
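For reference, a sketch of what such a suppression looks like (the method is hypothetical), kept on the smallest scope possible and paired with a note on why it is safe:

```java
class SuppressionExample {
    @SuppressWarnings("nullness") // safe: callers guarantee a non-null argument here
    static String unwrap(String maybeNull) {
        return maybeNull.trim();
    }
}
```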

            On Mon, Jan 11, 2021 at 8:28 PM Reuven Lax <[email protected]> wrote:

                Has extra Nullable checking been enabled in the Beam
                project? I have a PR that was on hold for several
                months, and I'm now struggling with compilation failing
                due to complaints about assigning something that is
                nullable to something that is not nullable, even when
                the immediate control flow makes it absolutely
                impossible for the variable to be null.

                Has something changed here?

                Reuven
