> Date: Sun, 22 Mar 2020 00:03:57 +0100
> From: Kamil Rytarowski <n...@gmx.com>
>
> I propose to change the fun(pointer + 0) logic with fun(pointer, 0).
I don't think this is a good approach -- it requires modifying code further and further away from the relevant part.

But let's step back a moment. So far, after several weeks of discussion, nobody has presented a case that there is a credible threat of a compiler actually misbehaving in this scenario. Yes, it is technically undefined behaviour -- nobody disputes that -- but so is lots of other stuff that NetBSD relies on, such as inline asm. In C++, it is explicitly _not_ undefined behaviour; Clang specifically stopped short of exploiting it as undefined behaviour in C; and nobody has presented a reason why it should be undefined in C but defined in C++.

So this all serves to work around false alarms from ubsan -- not actual bugs, just noise from the tool. Presumably the reason we use ubsan at all is that it helps find actual bugs -- true alarms. There are a few ways we might approach the false alarms:

(a) Ignore them. It's what we've been doing so far.

(b) Change how we invoke ubsan and the compiler, by passing -fno-delete-null-pointer-checks to clang. joerg objected to this, but I don't recall the details off the top of my head; joerg, can you expand on your argument against this, and say which alternative you would prefer?

(c) Change how we invoke ubsan, but just ubsan -- not the compiler. There's a cost to this: if we diverge from how we invoke the compiler, we might be disabling true alarms from ubsan that reflect how the compiler will actually behave.

(d) Patch ubsan to disable the false alarms. This incurs a maintenance burden, but maybe leaves less of a sharp edge to trip on than setting compiler flags in the makefile -- updates that change the behaviour are more likely to lead to merge conflicts.

(e) Patch our own code to suppress the false alarms.
The cost of this churn is that it can introduce bugs of its own, make the code harder to understand, and leave complexity that becomes obsolete in the next version of the tool but remains a Chesterton's fence (`why is there a dummy argument in all these functions? can we get rid of it?').

(f) Ditch ubsan. Does it have enough _true_ alarms, detecting actual bugs, to make it worth keeping, or are we wasting a lot of time to appease a tool that isn't actually giving us much value?

From memory, most of the changes I've seen to appease ubsan replace (1 << n) by (1u << n) when n=31. Some projects take advantage of -ftrapv and use signed integer types to express that overflow is a mistake. Some projects want two's complement everywhere and use -fwrapv. We're not committed one way or another; maybe we should keep it that way, and ubsan helps us to do so; maybe it would be easier to commit to -fwrapv. There is a cost to patching all of these shifts -- it makes merging from upstreams more painful.

Other issues that I've seen raised by ubsan in practice, from memory: memcpy(NULL,NULL,0), floating-point division by zero, and alignment issues. The first two are non-issues in real implementations; the last suggests that maybe the tests we're running on, e.g., sparc64 aren't very extensive.

Maybe someone can present an argument that ubsan is worth the pain of patching and working around false alarms and merging local changes into upstream updates. It's not a priori clear to me. I'm not proposing that we ditch ubsan; I'm just saying I'm not the one to argue the case that we should use it as a bug-finding tool and take on the costs of working around its many false alarms as a rule. First, I'd like to see a clearer argument about these options before we move forward with more churn to suppress false alarms.
_After_ that argument, if we choose (e), patching our code, I would like to see a way to get the static type checking in the pslist macros (which has detected real bugs, at least in the container_of version and probably also in the pslist macros) that doesn't require nonlocal changes.

Originally (for container_of) I did something like (sizeof(&a - &b), ...) to verify that a and b have compatible types, but some compilers (gcc, I think) objected to a left-hand expression with no side effects in a comma operator. So I switched to (expression yielding a pointer) + 0*sizeof(&a - &b) to suppress that warning. However, coverity didn't like something about the sizeof, and patching coverity is not an option for us, so we defined __validate_container_of(...) to expand to 0 for coverity and to 0*sizeof(&a - &b) otherwise -- which is fine because we still get the type-checking for normal builds.

This is still a bit of a kludge -- it only works when some subexpression is a pointer to a complete type, although since the point is to assert that types are compatible, that can generally be arranged. But it's a kludge that's limited in scope to the specific place where we want to assert that two expressions have compatible types, it works for real compilers and for static analyzers like coverity, and it has no adverse consequences _except_ the false alarms in ubsan.

So if we're going to find a replacement kludge, I'd like to keep it localized at the point where we want to assert type compatibility, and make sure it works with all our compilers and coverity. Maybe we could find a way to generically write _ASSERT_TYPES_COMPATIBLE(P, Q, NEXT) so that it expands into an expression equivalent to NEXT but, as a compile-time side effect, causes an error if P and Q have incompatible types; and similarly _ASSERT_PTR_MEMBER_TYPE(P, T, F, NEXT) or something to assert that the member T::F has a type compatible with *P.