Re: _Bool and trap representations

Alexander Cherepanov Mon, 13 Jun 2016 12:02:59 -0700

On 2016-06-08 17:37, Martin Sebor wrote:

On 06/08/2016 12:36 AM, Alexander Cherepanov wrote:

Hi!


If a variable of type _Bool contains something different from 0 and 1,
its use amounts to UB in gcc and clang. There are a couple of examples in
[1] ([2] is also interesting).

[1] https://github.com/TrustInSoft/tis-interpreter/issues/39
[2] https://github.com/TrustInSoft/tis-interpreter/issues/100

But my question is about the following example:

----------------------------------------------------------------------
#include <stdio.h>

int main()
{
   _Bool b;
   *(char *)&b = 123;
   printf("%d\n", *(char *)&b);
}
----------------------------------------------------------------------

Results:

----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
123

$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
1
----------------------------------------------------------------------

gcc version: gcc (GCC) 7.0.0 20160604 (experimental)


Similar example with long double:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71522

It seems that padding in _Bool is treated as permanently unspecified. Is
this behavior intentional? What's the theory behind it?

One possible explanation is C11, 6.2.6.2p1, which reads: "The values of
any padding bits are unspecified." But it's somewhat a stretch to
conclude from it that the values of padding bits cannot be specified
even with explicit assignment.

Another possible approach is to refer to Committee Response for Question
1 in DR 260 which reads: "Values may have any bit-pattern that validly
represents them and the implementation is free to move between alternate
representations (for example, it may normalize pointers, floating-point
representations etc.). [...] the actual bit-pattern may change without
direct action of the program."


There has been quite a bit of discussion among the committee on
this subject lately (the last part is the subject of DR #451,
though it's discussed in the context of uninitialized objects
with indeterminate values).


Are there notes from these discussions or something?

I would hesitate to call it
consensus but I think it would be fair to say that the opinion
of the vocal majority is that implementations aren't intended
to spontaneously change valid (i.e., determinate) representations
of objects in the absence of an access to the value of the object.

Thanks for the info. IMHO this part of DR 260 has even more serious
consequences than the part about pointer provenance. It effectively
prohibits manual byte-by-byte (or any non-atomic) copying of objects for
types like long double. If an implementation decides to normalize a
value in a variable during copying it will see an inconsistent
representation, e.g. a trap representation. It's a sure way to get total
garbage. I don't know if allowing implementations to normalize values is
useful but the current language in DR 260 allows too much.
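
For concreteness, here is a minimal sketch of the kind of manual copy
meant above. The function names copy_bytes and
long_double_copy_preserves_value are made up for illustration. The
comparison is expected to succeed on ordinary implementations; the
concern is only that the DR 260 wording, read literally, would let the
source be renormalized between two iterations of the loop.

```c
#include <stddef.h>

/* Hypothetical helper: copy n bytes one at a time through
   unsigned char lvalues, i.e. a non-atomic copy of the object
   representation. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Hypothetical check: returns 1 if a byte-wise copy of a long double
   compares equal to the original value, 0 otherwise. */
int long_double_copy_preserves_value(long double x)
{
    long double y;

    copy_bytes(&y, &x, sizeof y);
    return y == x;
}
```

Calling long_double_copy_preserves_value(3.25L) is the intended use. If
an implementation changed the bytes of x in the middle of the loop, y
would end up as a mix of two representations.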

As for valid/determinate representation this is another place where
the distinction between a value and a representation is worth stressing.
Uninitialized variables are a clear case -- both their value and
representation are indeterminate. But what if we set some part of the
representation of a variable -- it doesn't yet have a determinate value
but we want the part that we have set to be preserved. Another
interesting example is a pointer after free() -- its representation is
kinda determinate but its value is indeterminate.
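
A minimal sketch of the "set some part of representation" case,
assuming nothing beyond standard C. The function name
partial_write_is_preserved is made up for illustration. Only one byte
of the object is written, so the long double as a whole has no
determinate value, yet the byte that was written is expected to read
back unchanged.

```c
/* Hypothetical check: write one byte of an otherwise uninitialized
   long double and read the same byte back through unsigned char.
   Returns 1 if the written byte is preserved, 0 otherwise. */
int partial_write_is_preserved(void)
{
    long double x;                        /* value is indeterminate */
    unsigned char *p = (unsigned char *)&x;

    p[0] = 0xAB;                          /* set part of the representation */
    return p[0] == 0xAB;                  /* only that byte is read */
}
```

The object x is never read as a long double here, so the indeterminate
value itself is never used.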


--
Alexander Cherepanov
