Re: Avoid UB in pslist.h (NULL + 0)

2020-03-21 Thread Kamil Rytarowski
On 22.03.2020 01:50, Taylor R Campbell wrote:
>> Date: Sun, 22 Mar 2020 00:03:57 +0100
>> From: Kamil Rytarowski 
>>
>> I propose to change the fun(pointer + 0) logic with fun(pointer, 0).
> 
> I don't think this is a good approach -- it requires modifying code
> further and further away from the relevant part.
> 
> 
> But let's step back a moment.
> 
> So far, after several weeks of discussion, nobody has presented a case
> that there is a credible thread of a compiler actually misbehaving in
> this scenario.
> 

There are no public declarations (that was requested on a local ML) but
according to my IRC talks, there were plans to start optimizing on NULL
+ 0 operations, but is was/is considered as a chicken-egg issue. There
is a time span now to alert users and

It was also confirmed by two people that clang or a C compiler in
general is allowed to optimize the code.

> Yes, it is technically undefined behaviour -- nobody disputes that --
> but so is lots of other stuff that NetBSD relies on such as inline
> asm.  In C++, it is explicitly _not_ undefined behaviour; Clang
> specifically stopped short of exploiting it as undefined behaviour in
> C; nobody presented reasons why it should be undefined in C but
> defined in C++.
> 

Our kernel/rump code is mostly in C or at most a common subset of C and
C++. As the languages are not fully compatible in details, it's safe to
assume the C behavior that is valid for every C++ program, rather than
C++ one that is not valid for C.

Both languages can be synced here as the incompatibility was revealed
and discussed. Personally I wouldn't be surprised to see adoption of the
C behavior as nullptr + 0 is invalid in C++ (and it will be likely
invalid in future C).

> So this all serves to work around false alarms from ubsan -- not
> actual bugs, just noise from the tool.  Presumably the reason we use
> ubsan at all is that it helps find actual bugs -- true alarms.
> There's a few ways we might approach the false alarms:
> 

Out of 3 cases that I was requested, 3 of them were confirmed to be real
bugs (alignment, memcpy(NULL,NULL, 0), ptr + 0). Two of the cases can
generate crashing code now. One of them was researched by the clang
developers and C committee and confirmed that a compiler can impose
assumptions that would break misdesigned code.

I presented examples for the two UB issues that they are harmful now.

So far in reply I get 'just noise from the tool' so maybe my messages
were not clear enough and reaching authorities (from the clang community
and C committee) not credible enough.

> (a) Ignore them.  It's what we've been doing so far.
> 

This is not true. We already run our kernel with GCC with 0 UBSan
reports and perform syzbot fuzzing. We catch real problems there.

It already happened.

We are reaching to the point of running all ATF reports under UBSan.

If we cannot suppress noise from a tool (any kind of it), then the tool
is not much usable for signaling real problems.

> (b) Change how we invoke ubsan and the compiler by passing
> -fno-delete-null-pointer-checks to clang.  joerg objected to this
> but I don't recall the details off the top of my head; joerg, can
> you expand on your argument against this, and which alternative
> you would prefer?
> 

I tried to enable -fno-delete-null-pointer-check for RUMPKERNEL, but I
was asked to revert... after first getting acknowledge from you that it
is a good idea...

> (c) Change how we invoke ubsan, but just ubsan -- not the compiler.
> There's a cost to this: if we diverge from how we invoke the
> compiler, we might be disabling true alarms from ubsan that
> reflect how the compiler will actually behave.
> 

UBSan is an integral part of a compiler.

> (d) Patch ubsan to disable the false alarms.  This incurs a
> maintenance burden, but maybe leaves less of a sharp edge to trip
> on than setting compiler flags in the makefile -- updates that
> change the behaviour are more likely to lead to merge conflicts.
> 

I'm fine to patch the tool to disable false alarms or rather report them
upstream.

There are some indeed false positives sometimes when we depend on
assumptions that are imposed after compilation, during linking phase. In
these cases we disable sanitizing as there is no way to teach the tool.

> (e) Patch our own code to suppress the false alarms.  The cost to this
> churn is that it can introduce bugs of its own, and make the code
> harder to understand, and the complexity may become obsolete in
> the next version of the tool but will remain a Chesterton's fence
> (`why is there a dummy argument in all these functions? can we get
> rid of it?').
> 

Every single commit can introduce bugs on their own.

If we introduce code that relies on UB assumptions it is rather opposite
of comprehensible code. Handling UB imposes handling corner-cases that
is rather a good style of programming.

"complexity may become obsolete in the next version of 

Re: Avoid UB in pslist.h (NULL + 0)

2020-03-21 Thread Joerg Sonnenberger
On Sun, Mar 22, 2020 at 12:50:16AM +, Taylor R Campbell wrote:
> (b) Change how we invoke ubsan and the compiler by passing
> -fno-delete-null-pointer-checks to clang.  joerg objected to this
> but I don't recall the details off the top of my head; joerg, can
> you expand on your argument against this, and which alternative
> you would prefer?

I objected to using it *in general*. The committed change *always*
adding it, not just when using sanitizers.

Joerg


Re: Avoid UB in pslist.h (NULL + 0)

2020-03-21 Thread Taylor R Campbell
> Date: Sun, 22 Mar 2020 00:03:57 +0100
> From: Kamil Rytarowski 
> 
> I propose to change the fun(pointer + 0) logic with fun(pointer, 0).

I don't think this is a good approach -- it requires modifying code
further and further away from the relevant part.


But let's step back a moment.

So far, after several weeks of discussion, nobody has presented a case
that there is a credible thread of a compiler actually misbehaving in
this scenario.

Yes, it is technically undefined behaviour -- nobody disputes that --
but so is lots of other stuff that NetBSD relies on such as inline
asm.  In C++, it is explicitly _not_ undefined behaviour; Clang
specifically stopped short of exploiting it as undefined behaviour in
C; nobody presented reasons why it should be undefined in C but
defined in C++.

So this all serves to work around false alarms from ubsan -- not
actual bugs, just noise from the tool.  Presumably the reason we use
ubsan at all is that it helps find actual bugs -- true alarms.
There's a few ways we might approach the false alarms:

(a) Ignore them.  It's what we've been doing so far.

(b) Change how we invoke ubsan and the compiler by passing
-fno-delete-null-pointer-checks to clang.  joerg objected to this
but I don't recall the details off the top of my head; joerg, can
you expand on your argument against this, and which alternative
you would prefer?

(c) Change how we invoke ubsan, but just ubsan -- not the compiler.
There's a cost to this: if we diverge from how we invoke the
compiler, we might be disabling true alarms from ubsan that
reflect how the compiler will actually behave.

(d) Patch ubsan to disable the false alarms.  This incurs a
maintenance burden, but maybe leaves less of a sharp edge to trip
on than setting compiler flags in the makefile -- updates that
change the behaviour are more likely to lead to merge conflicts.

(e) Patch our own code to suppress the false alarms.  The cost to this
churn is that it can introduce bugs of its own, and make the code
harder to understand, and the complexity may become obsolete in
the next version of the tool but will remain a Chesterton's fence
(`why is there a dummy argument in all these functions? can we get
rid of it?').

(f) Ditch ubsan.  Does it have enough _true_ alarms, detecting actual
bugs, to make it worth keeping, or are we wasting a lot of time to
appease a tool that isn't actually giving us much value?

From memory, most of the changes I've seen to appease ubsan are to
replace (1 << n) by (1u << n) when n=31.  Some projects take
advantage of -ftrapv and use signed integer types to express that
overflow is a mistake.  Some projects want two's complement
everywhere and use -fwrapv.

We're not committed one way or another; maybe we should keep it
that way, and ubsan helps us to do so; maybe it would be easier to
commit to -fwrapv.  There is a cost to patching all of these
shifts -- it makes merging from upstreams more painful.

Other issues that I've seen raised by ubsan in practice, from
memory: memcpy(NULL,NULL,0), floating-point division by zero,
alignment issues.  The first two are non-issues in real
implementations; the last suggests that maybe the tests we're
running on, e.g., sparc64 aren't very extensive.

Maybe someone can present an argument that ubsan is worth the pain
of patching and working around false alarms and merging local
changes into upstream updates.  It's not a priori clear to me --
I'm not proposing that we ditch ubsan; I'm just saying I'm not the
one to argue the case that we should use it as a bug-finding tool
and take on the costs of working around its many false alarms as a
rule.

First, I'd like to see a clearer argument about these options before
we move forward with more churn to suppress false alarms.


_After_ that argument, if we choose (e), patching our code, I would
like to see a way to get the static type checking in the pslist macros
(which has detected real bugs, at least in the container_of version
and probably also for the pslist macros) that doesn't require nonlocal
changes.  Originally (for container_of) I did something like

(sizeof( - ), ...)

to verify that a and b have compatible types, but some compilers (gcc,
I think) objected to a left-hand expression with no side effects in a
comma operator.  So I switched to

(expression yielding a pointer) + 0*sizeof( - )

to suppress that warning.  However, coverity didn't like something
about the sizeof and patching coverity is not an option for us, so we
defined __validate_container_of(...) to expand to 0 for coverity and
to 0*sizeof( - ) for non-coverity -- which is fine because we
still get the type-checking for normal builds.

This is still a bit of a kludge -- it only works when some
subexpression is a pointer to a complete type, although since the
point is to assert that types are compatible that 

Avoid UB in pslist.h (NULL + 0)

2020-03-21 Thread Kamil Rytarowski
I propose to change the fun(pointer + 0) logic with fun(pointer, 0).

We still maintain the sanity checks and we can optimize the code to
generate not worse code than before. We can pass the "0" argument only
with DIAGNOSTIC or DEBUG kernel.

http://netbsd.org/~kamil/patch-00242-pslist-avoid-null-pointer-arithmetic.txt

I confirm that with this change there are no longer any reports UBSan
reports.

I'm open to tune this patch to expected coding style.



signature.asc
Description: OpenPGP digital signature