On Thu, 19 Jul 2018, Joerg Schilling wrote:

> Signed integer overflow can only trigger undefined behavior in case that more 
> than a single specific architecture could be envolved.

No, that signed integer overflow is undefined is part of the contract 
between a C programmer and the implementation.  A C programmer guarantees 
to the implementation 
that the execution of their code according to the semantics of the 
abstract machine defined by the C standard will not result in undefined 
behavior, and the C implementation uses that information to optimize the 
code (for example, determining that the result of a call to abs must be 
non-negative, and that the product of two non-negative signed integers 
must be non-negative, because by using signed operations the programmer 
has given those guarantees to the implementation).
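
To make the abs case concrete, here is a minimal sketch.  The function 
names are hypothetical, chosen only for illustration, and whether a given 
compiler actually folds the first check away depends on the compiler and 
the optimization level.  The point is only that it is permitted to.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: a check the implementation is entitled to remove.
   abs(INT_MIN) is undefined, so the compiler may assume abs(x) >= 0 and
   reduce this function to "return 0". */
int is_abs_negative(int x)
{
    return abs(x) < 0;
}

/* Hypothetical helper: the well-defined alternative.  Test for the
   problematic input before operating, rather than inspecting the result
   afterwards.  Returns 1 and stores |x| in *out on success, 0 if |x| is
   not representable as an int. */
int safe_abs(int x, int *out)
{
    if (x == INT_MIN)
        return 0;
    *out = x < 0 ? -x : x;
    return 1;
}
```

A caller that needs to handle INT_MIN should use a check of the second 
kind; code relying on the first kind can silently lose the check.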

C semantics are always those of the high-level language as defined by the 
standard, not those of particular machine instructions for the 
architecture in use; there hasn't really been a correspondence to machine 
instruction semantics for decades.

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
https://blog.regehr.org/archives/213

(and see parts 2 and 3 of both those posts as well).  This is critical 
understanding for any systems-level C programmer in the modern world; the 
optimizations are not new, but have been around for decades in major 
implementations.

> Since POSIX de-facto only allowed two's complement machines since several 
> years already (the current change is just fixing the slipped parts in the 
> standard), it is now well defined what happens in case of an integer 
> overflow.

No, it is very definitely undefined.  It's true that anyone programming in 
C on POSIX systems for about the past 20 years has probably in fact only 
needed to care about two's complement systems.  But it's also true that 
programming in C for about the past 20 years without a proper modern 
understanding of undefined behavior as discussed in the above blog posts 
(or otherwise avoiding anything the C standard says is undefined) is a 
rapid route to code that does not work correctly and introduces security 
holes.

All that's meant by two's complement, as far as C is concerned, is certain 
constraints on what the byte representations of integers look like, if you 
inspect or modify those representations using lvalues of type unsigned 
char (which is a legitimate thing to do in standard C).  It in no way 
constrains overflow behavior.  The same implementations that have been 
ubiquitously two's complement for the past 20 years have also been 
ubiquitously optimizing on the basis of undefined signed integer overflow 
for the past 20 years.
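
The distinction can be sketched as follows.  The first function looks at 
the representation through unsigned char, which is what two's complement 
actually constrains (the sketch assumes int has no padding bits).  The 
second shows that arithmetic still has to avoid overflow by checking 
before the operation.  Both function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if -1 is stored with every byte equal
   to UCHAR_MAX, as expected for two's complement without padding bits.
   Inspecting an object through unsigned char is valid standard C. */
int minus_one_is_all_ones(void)
{
    int v = -1;
    const unsigned char *p = (const unsigned char *)&v;
    for (size_t i = 0; i < sizeof v; i++)
        if (p[i] != UCHAR_MAX)
            return 0;
    return 1;
}

/* Hypothetical helper: returns 1 if a + b would overflow int.  The test
   itself never overflows, so it stays defined on every implementation,
   whatever the representation. */
int add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}
```

Knowing the first function returns 1 tells you nothing about what 
INT_MAX + 1 does; that remains undefined.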

POSIX is generally meant to standardize existing APIs rather than invent 
things that haven't been tried in practice.  That makes it entirely 
appropriate to require two's complement representation (and thus two's 
complement sets of values for signed integer types) rather than attempting 
to come up with a purely-invented specification for how certain interfaces 
would work with other historical integer representations.  But it also 
makes it entirely *inappropriate* to invent specifications for functions 
such as abs that are contrary to well-established implementation practice 
for those functions over the past few decades.  The specification in this 
issue8-tagged issue is such a completely inappropriate and ill-thought-out 
invention taking no account of very longstanding implementation practice 
and providing no benefits to users.

If POSIX wishes to provide a mode that profiles the C standard to define 
certain cases of signed integer overflow that are otherwise undefined, it 
should standardize the -fwrapv compiler flag (using that existing, de 
facto standard, spelling).  In accordance with existing practice, that 
mode should not be the default.
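
As a rough illustration of what such a mode changes (function names are 
hypothetical): without -fwrapv a compiler may fold the first function to 
a constant, while with -fwrapv in GCC or Clang the addition wraps and the 
comparison is false for INT_MAX.  The second function is a sketch of how 
to get wrapping without the flag, assuming the implementation converts 
out-of-range unsigned values to int by modulo reduction, which is 
implementation-defined rather than undefined and is what GCC documents.

```c
#include <limits.h>

/* Hypothetical helper: in default mode x + 1 cannot be assumed to wrap,
   so the compiler may reduce this to "return 1".  Under -fwrapv the
   addition wraps, and the result is 0 when x == INT_MAX. */
int next_is_greater(int x)
{
    return x + 1 > x;
}

/* Hypothetical helper: wrapping addition without -fwrapv.  Unsigned
   arithmetic is defined to wrap; the conversion back to int is
   implementation-defined (assumed here to be modulo reduction). */
int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```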

-- 
Joseph S. Myers
jos...@codesourcery.com
