> On the other hand, modifying core is a less common developer use case so passing a couple flags to skip it seems manageable for those people who are touching the core.
Every time I check out a new git branch that has a modified core compared to the previous branch (i.e. almost always) it needs to be rebuilt, and runs the null checker again. I usually use IntelliJ to run tests, and I haven't figured out how to make it pass skipCheckerFramework by default, so I've resorted to just setting "skipCheckerFramework=true" in my gradle.properties.

On Fri, Mar 12, 2021 at 2:53 PM Kenneth Knowles <[email protected]> wrote:

> I'm OK with this as long as they are treated as strictly merge blocking.
>
> On the other hand, modifying core is a less common developer use case, so passing a couple of flags to skip it seems manageable for those people who are touching the core.
>
> Kenn
>
> On Fri, Mar 12, 2021 at 1:19 PM Pablo Estrada <[email protected]> wrote:
>
>> Does it make sense to add a Jenkins precommit suite that runs null checking exclusively?
>>
>> On Fri, Mar 12, 2021 at 11:57 AM Kyle Weaver <[email protected]> wrote:
>>
>>> I don't mind fixing my code or suppressing nullness errors, but the cost of the null check itself is hurting my productivity. In the best case, null checks add about ten minutes to the build time for large modules like :sdks:java:core. In the worst case, they cause my build to fail altogether, because the framework logs a warning that "Memory constraints are impeding performance," which is interpreted by -Wall as an error. It might not be a problem on big machines with a lot of memory, but on my MacBook and on the GitHub Actions executors it is a real problem. https://issues.apache.org/jira/browse/BEAM-11837
>>>
>>> Since nullness checks seem to work fine for now on Jenkins, I propose making them opt-in rather than opt-out, and only running them on Jenkins by default.
>>>
>>> On Tue, Mar 2, 2021 at 12:13 PM Kyle Weaver <[email protected]> wrote:
>>>
>>>> Can you try adding the generated classes to generatedClassPatterns in the JavaNatureConfiguration?
>>>>
>>>> https://github.com/apache/beam/blob/03b883b415d27244ddabb17a0fb5bab147b86f89/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L92
>>>>
>>>> On Tue, Mar 2, 2021 at 12:05 AM Reuven Lax <[email protected]> wrote:
>>>>
>>>>> I'm running into a problem with this check. I added a protocol-buffer file to a module (google-cloud-platform) that previously did not have any protobuf files in it. The generated files contain lines that violate this null checker, so they won't compile. I can't annotate the files, because they are codegen files. I tried adding the package to spotbugs-filter.xml, but that didn't help.
>>>>>
>>>>> Any suggestions?
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Fri, Jan 22, 2021 at 10:38 AM Brian Hulette <[email protected]> wrote:
>>>>>
>>>>>> On Fri, Jan 22, 2021 at 1:18 AM Jan Lukavský <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'll give my two cents here.
>>>>>>>
>>>>>>> I'm not 100% sure that the 1-5% of bugs are as severe as other types of bugs. Yes, throwing NPEs at the user is not very polite. On the other hand, many of these actually boil down to user errors. Then we might ask what a correct solution would be. If we manage to figure out what the actual problem is and tell the user what specifically is missing or going wrong, that would be just awesome. On the other hand, if a tool used for avoiding "unexpected" NPEs forces us to code
>>>>>>>
>>>>>>> Object value = Objects.requireNonNull(myNullableObject); // or similar using Preconditions
>>>>>>> value.doStuff();
>>>>>>>
>>>>>>> instead of just
>>>>>>>
>>>>>>> myNullableObject.doStuff()
>>>>>>>
>>>>>>> then what we actually did is a) made the framework happy, and b) changed the line at which the NPE is thrown by 1 (and yes, internally prevented the JVM from throwing SIGSEGV at itself, but that is a deeply technical thing). Nothing changed semantically, from the user's perspective.
>>>>>>
>>>>>> I'd argue there's value in asking Beam developers to make that change. It makes us acknowledge that there's a possibility myNullableObject is null. If myNullableObject being null is something relevant to the user we would likely want to wrap it in some other exception or provide a more useful message than just NPE(!!).
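For concreteness, a minimal sketch of the difference being discussed here. The `Worker` type, the `setUp()` hint, and the message text are invented for this example; only `Objects.requireNonNull` and the checker's `@Nullable` annotation are real APIs:

    import java.util.Objects;
    import org.checkerframework.checker.nullness.qual.Nullable;

    class NullnessMessageSketch {
      // Hypothetical nullable field, standing in for Jan's myNullableObject.
      private @Nullable Worker myNullableObject;

      void doStuffBare() {
        // Rejected by the nullness checker: dereference of a @Nullable value.
        // myNullableObject.doStuff();
      }

      void doStuffWithMessage() {
        // The same dereference, but a failure now says what was missing
        // instead of surfacing as a bare NullPointerException.
        Worker value =
            Objects.requireNonNull(
                myNullableObject, "worker was not initialized; call setUp() first");
        value.doStuff();
      }

      interface Worker {
        void doStuff();
      }
    }

Nothing in this sketch is Beam-specific; it is just the shape of change the checker pushes toward, with or without a descriptive message attached.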
>>>>>>> Now, given that the framework significantly raises compile time (due to all the checks), causes many "bugs" to be reported by static code analysis tools (because the framework adds @Nonnull default annotations everywhere, even when Beam's code actually relies on the nullability of a field), and given how much we currently suppress these checks ($ git grep BEAM-10402 | wc -l -> 1981), I'd say this deserves a deeper discussion.
>>>>>>
>>>>>> The reason there are so many suppressions is that fixing all the errors is a monumental task. Rather than addressing them all, Kenn programmatically added suppressions for classes that failed the checks (https://github.com/apache/beam/pull/13261). This allowed us to start running the checker on the code that passes it while the errors are fixed.
>>>>>>
>>>>>>> Jan
>>>>>>>
>>>>>>> On 1/20/21 10:48 PM, Kenneth Knowles wrote:
>>>>>>>
>>>>>>> Yes, completely sound nullability checking has been added to the project via checkerframework, based on a large number of NPE bugs (1-5% depending on how you search, but many other bugs likely attributable to nullness-based design errors), which are extra embarrassing because NPEs were essentially solved, even in practice for Java, well before Beam existed.
>>>>>>>
>>>>>>> Checker Framework is a pluggable type system analysis with some amount of control flow sensitivity. Every value has a type that is either nullable or not, and certain control structures (like checking for null) can alter the type inside a scope. The best way to think about it is to consider every value in the program as either nullable or not, much like you think of every value as either a string or not, and to view method calls as inherently stateful/nondeterministic. This can be challenging in esoteric cases, but usually makes the overall code health better anyhow.
>>>>>>>
>>>>>>> Your example illustrates how problematic the design of the Java language is: the analysis cannot assume that `getDescription` is a pure function, and neither should you. Even if it is aware of boolean short-circuiting it would not be sound to accept this code. There is an annotation for this in the cases where it is true (like proto-generated getters): https://checkerframework.org/api/org/checkerframework/dataflow/qual/Pure.html
>>>>>>>
>>>>>>> The refactor for cases like this is trivial, so there isn't a lot of value in thinking too hard about it:
>>>>>>>
>>>>>>> if (statusCode.equals(Code.INVALID_ARGUMENT)) {
>>>>>>>   @Nullable String desc = statusCode.toStatus().getDescription();
>>>>>>>   if (desc != null && desc.contains("finalized")) {
>>>>>>>     return false;
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>> To a casual eye, this may look like a noop change. To the analysis it makes all the difference. And IMO this difference is real. Humans may assume it is a noop, and humans would be wrong. So many times when you think/expect/hope that `getXYZ()` is a trivial getter method, you later learn that it was tweaked for some odd reason. I believe this code change makes the code better. Suppressing these errors should be exceptionally rare, and never in normal code. It is far better to improve your code than to suppress the issue.
>>>>>>>
>>>>>>> It would be very cool for the proto compiler to annotate enough that new-and-improved type checkers could make things more convenient.
>>>>>>>
>>>>>>> Kenn
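For reference, a minimal sketch of the `@Pure` approach Kenn mentions. `FakeStatus` is an invented stand-in for the third-party class that owns `getDescription()` (which would need to ship the annotation itself); only the `@Pure` and `@Nullable` annotations are real Checker Framework APIs:

    import org.checkerframework.checker.nullness.qual.Nullable;
    import org.checkerframework.dataflow.qual.Pure;

    // Invented stand-in for the class that owns getDescription().
    class FakeStatus {
      private final @Nullable String description;

      FakeStatus(@Nullable String description) {
        this.description = description;
      }

      // @Pure marks the method as deterministic and side-effect free, so the
      // checker may assume repeated calls return the same value.
      @Pure
      @Nullable String getDescription() {
        return description;
      }
    }

    class Caller {
      boolean isFinalized(FakeStatus status) {
        // With @Pure on getDescription(), the null check on the first call
        // also refines the second call to non-null, so this type-checks as
        // written. Without it, use the local-variable refactor Kenn shows.
        return status.getDescription() != null
            && status.getDescription().contains("finalized");
      }
    }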
>>>>>>> On Mon, Jan 11, 2021 at 8:53 PM Reuven Lax <[email protected]> wrote:
>>>>>>>
>>>>>>>> I can use that trick. However I'm surprised that the check appears to be so simplistic.
>>>>>>>>
>>>>>>>> For example, the following code triggers the check on getDescription().contains(), since getDescription returns a Nullable string. However, even a simplistic analysis should realize that getDescription() was checked for null first! I have a dozen or so cases like this, and I question how useful such a simplistic check will be.
>>>>>>>>
>>>>>>>> if (statusCode.equals(Code.INVALID_ARGUMENT) &&
>>>>>>>>     statusCode.toStatus().getDescription() != null &&
>>>>>>>>     statusCode.toStatus().getDescription().contains("finalized")) {
>>>>>>>>   return false;
>>>>>>>> }
>>>>>>>>
>>>>>>>> On Mon, Jan 11, 2021 at 8:32 PM Boyuan Zhang <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Yeah, it seems like the checker is enabled: https://issues.apache.org/jira/browse/BEAM-10402. I used @SuppressWarnings({"nullness"}) to suppress the error when I think it's not really a concern.
>>>>>>>>>
>>>>>>>>> On Mon, Jan 11, 2021 at 8:28 PM Reuven Lax <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Has extra Nullable checking been enabled in the Beam project? I have a PR that was on hold for several months, and I'm struggling now with the compile failing due to complaints about assigning something that is nullable to something that is not nullable. Even when the immediate control flow makes it absolutely impossible for the variable to be null.
>>>>>>>>>>
>>>>>>>>>> Has something changed here?
>>>>>>>>>>
>>>>>>>>>> Reuven
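Finally, a minimal sketch of the kind of targeted suppression Boyuan describes. The class, method, and comment text are invented; only `@SuppressWarnings("nullness")` and the BEAM-10402 reference come from the discussion above:

    import org.checkerframework.checker.nullness.qual.Nullable;

    class LegacyAdapter {
      // Keep the suppression as narrow as possible (method, not class) and
      // leave a breadcrumb so it can be found and removed later, the same way
      // the bulk suppressions are findable by grepping for BEAM-10402.
      @SuppressWarnings("nullness") // TODO(BEAM-10402): description is never null in practice here.
      String describe(@Nullable String description) {
        return description.trim();
      }
    }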
