> Date: Mon, 24 Feb 2020 11:42:01 +0100
> From: Kamil Rytarowski <n...@gmx.com>
>
> Forbidding NULL pointer arithmetic is not just for C purists trolls. It
> is now in C++ mainstream and already in C2x draft.
>
> The newer C standard will most likely (already accepted by the
> committee) adopt nullptr on par with nullptr from C++. In C++ we can
> "#define NULL nullptr" and possibly the same will be possible in C.
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2394.pdf
>
> This will change all arithmetic code operating on NULL into syntax error.
Arithmetic on bare NULL is already an error, flagged by the options
-Wpointer-arith -Werror which we already use, and arithmetic on the
proposed nullptr will remain so.  This question is not about that, or
about syntax.

The question is whether it is realistic to imagine that a compiler we
will ever use to build the kernel -- particularly with the option
-fno-delete-null-pointer-checks as we already use to build the kernel
with gcc -- will actually meaningfully distinguish the fragments

	char *x = NULL;
	return x;

and

	char *x = NULL;
	return x + 0;

Will two programs that differ only by this fragment actually behave
differently on any serious C implementation we use in NetBSD, ignoring
the pedantry of ubsan?

(The question is the same if you substitute the proposed nullptr for
NULL; it's about the meaning of + on a null pointer, not whether the
program is syntactically written with the letters `NULL' or `nullptr'.)

The second program technically has undefined behaviour because in,
e.g., C99 6.5.6 `Additive operators', the meaning of + is defined on
pointer/integer operands only when the pointer is to an object in an
array and the sum stays within the array or points one past the end --
in other words, there's nothing in C99 formally defining what x + 0
means when x is a null pointer.

Why is the standard written this way?  I surmise that it's because
technically there exist implementations such as Zeta-C where a
`pointer' is not simply a virtual address in a machine register but
actually a pair of a Lisp array and an index into it.  NetBSD does not
run on such implementations.  Corners of the standard that serve
_only_ to accommodate such implementations are not relevant to NetBSD
on their own.

The standard is also technically written so that a null pointer is not
necessarily stored as all bits zero in memory, so

	char *x;
	memset(&x, 0, sizeof x);
	return x;

is not guaranteed to return a null pointer.
However, NetBSD only runs on C implementations where it actually is
guaranteed to return a null pointer, and we rely on this pervasively.

If we make _only_ the assumptions that the standard formally
guarantees, then ubsan would be right to object that

	char *x;
	memset(&x, 0, sizeof x);
	return x == NULL ? 0 : *(char *)x;

has undefined behaviour.  But in NetBSD this is guaranteed to return 0,
and so if ubsan flagged it we would treat that as a useless false alarm
that detracts from the value of ubsan as a tool.

If you can present a compelling argument that C implementations which
are _relevant to NetBSD_ -- not merely technically allowed by the
letter of the standard like Zeta-C -- will actually behave differently
from how I described, please present that.  Otherwise please find a way
to suppress the false alarm in the tool so it doesn't waste any more
time.

(And please do the same for memcpy(x,NULL,0)/memcpy(NULL,y,0)!)