> Date: Sun, 22 Mar 2020 00:03:57 +0100
> From: Kamil Rytarowski <n...@gmx.com>
>
> I propose to change the fun(pointer + 0) logic with fun(pointer, 0).
I don't think this is a good approach -- it requires modifying code further and further away from the relevant part.

But let's step back a moment. So far, after several weeks of discussion, nobody has presented a case that there is a credible threat of a compiler actually misbehaving in this scenario. Yes, it is technically undefined behaviour -- nobody disputes that -- but so is lots of other stuff that NetBSD relies on, such as inline asm. In C++, it is explicitly _not_ undefined behaviour; Clang specifically stopped short of exploiting it as undefined behaviour in C; and nobody has presented a reason why it should be undefined in C but defined in C++.

So this all serves to work around false alarms from ubsan -- not actual bugs, just noise from the tool. Presumably the reason we use ubsan at all is that it helps find actual bugs -- true alarms. There are a few ways we might approach the false alarms:

(a) Ignore them. It's what we've been doing so far.

(b) Change how we invoke ubsan and the compiler, by passing -fno-delete-null-pointer-checks to clang. joerg objected to this, but I don't recall the details off the top of my head; joerg, can you expand on your argument against this, and say which alternative you would prefer?

(c) Change how we invoke ubsan, but just ubsan -- not the compiler. There's a cost to this: if we diverge from how we invoke the compiler, we might be disabling true alarms from ubsan that reflect how the compiler will actually behave.

(d) Patch ubsan to disable the false alarms. This incurs a maintenance burden, but maybe leaves less of a sharp edge to trip on than setting compiler flags in the makefile -- updates that change the behaviour are more likely to lead to merge conflicts.

(e) Patch our own code to suppress the false alarms.
The cost of this churn is that it can introduce bugs of its own, make the code harder to understand, and leave complexity that becomes obsolete in the next version of the tool but remains a Chesterton's fence (`why is there a dummy argument in all these functions? can we get rid of it?').

(f) Ditch ubsan. Does it have enough _true_ alarms, detecting actual bugs, to make it worth keeping, or are we wasting a lot of time to appease a tool that isn't actually giving us much value?

From memory, most of the changes I've seen to appease ubsan replace (1 << n) by (1u << n) when n=31. Some projects take advantage of -ftrapv and use signed integer types to express that overflow is a mistake. Some projects want two's complement everywhere and use -fwrapv. We're not committed one way or another; maybe we should keep it that way, and ubsan helps us to do so; maybe it would be easier to commit to -fwrapv. There is a cost to patching all of these shifts -- it makes merging from upstreams more painful.

Other issues that I've seen raised by ubsan in practice, from memory: memcpy(NULL,NULL,0), floating-point division by zero, and alignment issues. The first two are non-issues in real implementations; the last suggests that maybe the tests we're running on, e.g., sparc64 aren't very extensive.

Maybe someone can present an argument that ubsan is worth the pain of patching and working around false alarms and merging local changes into upstream updates. It's not a priori clear to me. I'm not proposing that we ditch ubsan; I'm just saying I'm not the one to argue the case that we should use it as a bug-finding tool and take on the costs of working around its many false alarms as a rule. First, I'd like to see a clearer argument about these options before we move forward with more churn to suppress false alarms.
_After_ that argument, if we choose (e), patching our code, I would like to see a way to get the static type checking in the pslist macros (which has detected real bugs, at least in the container_of version and probably also in the pslist macros) that doesn't require nonlocal changes.

Originally (for container_of) I did something like (sizeof(&a - &b), ...) to verify that a and b have compatible types, but some compilers (gcc, I think) objected to a left-hand expression with no side effects in a comma operator. So I switched to (expression yielding a pointer) + 0*sizeof(&a - &b) to suppress that warning. However, coverity didn't like something about the sizeof, and patching coverity is not an option for us, so we defined __validate_container_of(...) to expand to 0 for coverity and to 0*sizeof(&a - &b) otherwise -- which is fine because we still get the type-checking for normal builds.

This is still a bit of a kludge -- it only works when some subexpression is a pointer to a complete type, although since the point is to assert that types are compatible, that can generally be arranged. But it's a kludge that's limited in scope to the specific place where we want to assert that two expressions have compatible types, it works for real compilers and for static analyzers like coverity, and it has no adverse consequences _except_ the false alarms in ubsan.

So if we're going to find a replacement kludge, I'd like to keep it localized at the point where we want to assert type compatibility, and make sure it works with all our compilers and coverity. Maybe we could find a way to generically write _ASSERT_TYPES_COMPATIBLE(P, Q, NEXT) so that it expands into an expression equivalent to NEXT but, as a compile-time side effect, causes an error if P and Q have incompatible types; and similarly _ASSERT_PTR_MEMBER_TYPE(P, T, F, NEXT) or something to assert that the member T::F has a type compatible with *P.