Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On Wed, Mar 4, 2015 at 12:38 AM, Jeff Law wrote:
> On 03/03/15 12:57, Martin Sebor wrote:
>> As a data point(*) it might be interesting to note that GCC itself
>> relies on memcpy providing stronger guarantees than the C standard
>> requires it to by emitting calls to the function for large structure
>> self-assignments (which are strictly conforming, as discussed in bug
>> 65029).
>
> Right.  I actually spent quite a bit of time struggling with this a while
> back in a different context.  The only case I could come up with where GCC
> would generate an overlapping memcpy was self assignment, but even that was
> bad and while we ultimately punted, I've always considered it a wart.

  struct A { int large[100]; };
  void foo (struct A *x, struct A *y) { *x = *y; }

call it as foo (&a, &a); (on x86 you need -mstringop-strategy=libcall,
even at -O0, to emit a memcpy call)

The self-assignment doesn't have to be visible to the compiler - so to fix
this we'd have to assume pointer equality everywhere and either emit a
conditional call to memcpy or always emit a call to memmove.

Richard.

>> [*] IMO, one in favor of tightening up the memcpy specification
>> to require implementations to provide the expected semantics.
>
> That works for me :-)
>
> The things done in glibc's memcpy are a bit on the absurd side and the pain
> caused by the changes over time is almost impossible to overstate.  If the
> Austin group tightens memcpy to require fewer surprises I think most
> developers would ultimately be happy with the result -- a few would complain
> about the performance impacts for specific workloads, but I suspect they'd
> be in the minority.
>
> jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 03/03/15 12:57, Martin Sebor wrote:
> As a data point(*) it might be interesting to note that GCC itself
> relies on memcpy providing stronger guarantees than the C standard
> requires it to by emitting calls to the function for large structure
> self-assignments (which are strictly conforming, as discussed in bug
> 65029).

Right.  I actually spent quite a bit of time struggling with this a while
back in a different context.  The only case I could come up with where GCC
would generate an overlapping memcpy was self assignment, but even that was
bad and while we ultimately punted, I've always considered it a wart.

> [*] IMO, one in favor of tightening up the memcpy specification
> to require implementations to provide the expected semantics.

That works for me :-)

The things done in glibc's memcpy are a bit on the absurd side and the pain
caused by the changes over time is almost impossible to overstate.  If the
Austin group tightens memcpy to require fewer surprises I think most
developers would ultimately be happy with the result -- a few would complain
about the performance impacts for specific workloads, but I suspect they'd
be in the minority.

jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/2015 10:01 AM, Jeff Law wrote:
> On 02/20/15 05:10, Jakub Jelinek wrote:
>> On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
>>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>>> Hmmm, Passing the additional option in user code would be one thing,
>>>> but what about library code?  E.g., using memcpy (either explicitly or
>>>> implicitly for a structure copy)?
>>>
>>> The memcpy problem isn't restricted to embedded architectures.
>>>
>>>   size_t size;
>>>   const unsigned char *source;
>>>   std::vector<unsigned char> vec;
>>>   …
>>>   vec.resize(size);
>>>   memcpy(vec.data(), source, size);
>>>
>>> std::vector::data() can return a null pointer if the vector is empty,
>>> which means that this code is invalid for empty inputs.
>>>
>>> I think the C standard is wrong here.  We should extend it, as a QoI
>>> matter, and support null pointers for variable-length inputs and outputs
>>> if the size is 0.  But I suspect this is still a minority view.
>>
>> I disagree.  If you want a function that will have that different
>> property, don't call it memcpy.
>
> Right.  If someone wants to take it up with the Austin group, that's fine.
> But until/unless the Austin group blesses, I don't think we should extend
> as a QoI matter.

As a data point(*) it might be interesting to note that GCC itself relies
on memcpy providing stronger guarantees than the C standard requires it to
by emitting calls to the function for large structure self-assignments
(which are strictly conforming, as discussed in bug 65029).

Martin

[*] IMO, one in favor of tightening up the memcpy specification to require
implementations to provide the expected semantics.
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 28/02/2015 9:12 am, Manuel López-Ibáñez wrote:
> On 02/19/15 14:56, Chris Johns wrote:
>> My main concern is not knowing the trap has been added to the code. If I
>> could build an application and audit it somehow then I can manage it. We
>> have a similar issue with the possible use of FP registers being used in
>> general code (ISR save/restore trade off).
>>
>> Can the ELF be annotated in some GCC specific way that makes it to the
>> final executable to flag this is happening ? We can then create tools to
>> audit the executables.
>
> Simply ignore me if I'm misunderstanding the issue: Couldn't GCC generate,
> instead of a trap, a call to a noinline noreturn cold weak function
> __gcc_is_a_trap that by default calls the trap? Then, audit tools can
> inspect the code and see if such a function call appears and even override
> it with something else.

Yes it could and this is a nice idea.

> Chris, wouldn't that be enough for your purposes?

I think it does because we can scan an executable for the call locations
and audit them.

Chris
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/19/15 14:56, Chris Johns wrote:
> My main concern is not knowing the trap has been added to the code. If I
> could build an application and audit it somehow then I can manage it. We
> have a similar issue with the possible use of FP registers being used in
> general code (ISR save/restore trade off).
>
> Can the ELF be annotated in some GCC specific way that makes it to the
> final executable to flag this is happening ? We can then create tools to
> audit the executables.

Simply ignore me if I'm misunderstanding the issue: Couldn't GCC generate,
instead of a trap, a call to a noinline noreturn cold weak function
__gcc_is_a_trap that by default calls the trap? Then, audit tools can
inspect the code and see if such a function call appears and even override
it with something else.

Chris, wouldn't that be enough for your purposes?

Cheers, Manuel.
[RFC/patch for stage1] Embed compiler dumps into generated .o files (was Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference)
On Thu, 2015-02-26 at 11:17 -0500, David Malcolm wrote: > On Fri, 2015-02-20 at 10:29 -0700, Jeff Law wrote: > > On 02/19/15 14:56, Chris Johns wrote: > > > On 20/02/2015 8:23 am, Joel Sherrill wrote: > > >> > > >> On 2/19/2015 2:56 PM, Sandra Loosemore wrote: > > >>> Jakub Jelinek wrote: > > On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote: > > > Starting with gcc 4.9, -O2 implicitly invokes > > > > > > -fisolate-erroneous-paths-dereference: > > > > > > which > > > > > > https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html > > > > > > documents as > > > > > > Detect paths that trigger erroneous or undefined behavior due to > > > dereferencing a null pointer. Isolate those paths from the > > > main control > > > flow and turn the statement with erroneous or undefined > > > behavior into a > > > trap. This flag is enabled by default at -O2 and higher. > > > > > > This results in a sizable number of previously working embedded > > > programs mysteriously > > > crashing when recompiled under gcc 4.9. The problem is that embedded > > > programs will often have ram starting at address zero (think > > > hardware-defined > > > interrupt vectors, say) which gets initialized by code which the > > > -fisolate-erroneous-paths-deference logic can recognize as reading > > > and/or > > > writing address zero. > > If you have some pages mapped at address 0, you really should > > compile your > > code with -fno-delete-null-pointer-checks, otherwise you can run > > into tons > > of other issues. > > >>> H, Passing the additional option in user code would be one thing, > > >>> but what about library code? E.g., using memcpy (either explicitly or > > >>> implicitly for a structure copy)? > > >>> > > >>> It looks to me like cr16 and avr are currently the only architectures > > >>> that disable flag_delete_null_pointer_checks entirely, but I am sure > > >>> that this issue affects other embedded targets besides nios2, too. E.g. 
> > >>> scanning Mentor's ARM board support library, I see a whole pile of > > >>> devices that have memory mapped at address zero (TI Stellaris/Tiva, > > >>> Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, ). Plus our simulator > > >>> BSPs assume a flat address space starting at address 0. > > >> I forwarded this to the RTEMS list and was promptly pointed to a patch > > >> on a Coldfire BSP where someone worked around this behavior. > > >> > > >> We are discussing how to deal with this. It is likely OK in user code but > > >> horrible in BSP and driver code. We don't have a solution ourselves. We > > >> just recognize it impacts a number of targets. > > >> > > > > > > My main concern is not knowing the trap has been added to the code. If I > > > could build an application and audit it somehow then I can manage it. We > > > have a similar issue with the possible use of FP registers being used in > > > general code (ISR save/restore trade off). > > > > > > Can the ELF be annotated in some GCC specific way that makes it to the > > > final executable to flag this is happening ? We can then create tools to > > > audit the executables. > > Not really, for a variety of reasons. > > Is information on this reaching the pass-specific dumpfile? I don't see > any explicit dumping in gimple-ssa-isolate-paths.c, but I guess that > insert_trap_and_remove_trailing_statements could log itself to the > dumpfile, or use the statistics framework (which itself also reaches a > dumpfile). > > Assuming the info is reaching a dumpfile, could gcc have an option to > write its dumpfiles into a special ELF section in the .s, rather than to > disk? > > Then (given a suitable new option to e.g. eu-readelf) you'd be able to > read the dumpfiles from a .o file, and (handwaving about linkage) from > an execuable or library. > > Not that I'm volunteering... 
Perhaps foolishly I had a go at prototyping this; attached is a
proof-of-concept patch (albeit with FIXMEs and no ChangeLog or testsuite
coverage).

When writing out the final asm, each dumpfile is read, and embedded into
its own section.  Manual review of the built .o file shows that the
dumpfiles make it into them, e.g.:

$ eu-readelf -x .note.GNU-dump.tree-switchconv smoketest.o|head

Hex dump of section [28] '.note.GNU-dump.tree-switchconv', 2698 bytes at offset 0xf021:
  0x0000 0a3b3b20 46756e63 74696f6e 20746573 .;; Function tes
  0x0010 745f7068 69202874 6573745f 7068692c t_phi (test_phi,
  0x0020 2066756e 63646566 5f6e6f3d 302c2064  funcdef_no=0, d
  0x0030 65636c5f 7569643d 31383332 2c206367 ecl_uid=1832, cg
  0x0040 72617068 5f756964 3d302c20 73796d62 raph_uid=0, symb
  0x0050 6f6c5f6f 72646572 3d30290a 0a746573 ol_order=0)..tes
  0x0060 745f7068 69202869 6e742069 2c20696e t_phi (int i, in
  0x0070 74206a29 0a7b0a20 20696e74 206b3b0a t j).{.  int k;.
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On Fri, 2015-02-20 at 10:29 -0700, Jeff Law wrote: > On 02/19/15 14:56, Chris Johns wrote: > > On 20/02/2015 8:23 am, Joel Sherrill wrote: > >> > >> On 2/19/2015 2:56 PM, Sandra Loosemore wrote: > >>> Jakub Jelinek wrote: > On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote: > > Starting with gcc 4.9, -O2 implicitly invokes > > > > -fisolate-erroneous-paths-dereference: > > > > which > > > > https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html > > > > documents as > > > > Detect paths that trigger erroneous or undefined behavior due to > > dereferencing a null pointer. Isolate those paths from the > > main control > > flow and turn the statement with erroneous or undefined > > behavior into a > > trap. This flag is enabled by default at -O2 and higher. > > > > This results in a sizable number of previously working embedded > > programs mysteriously > > crashing when recompiled under gcc 4.9. The problem is that embedded > > programs will often have ram starting at address zero (think > > hardware-defined > > interrupt vectors, say) which gets initialized by code which the > > -fisolate-erroneous-paths-deference logic can recognize as reading > > and/or > > writing address zero. > If you have some pages mapped at address 0, you really should > compile your > code with -fno-delete-null-pointer-checks, otherwise you can run > into tons > of other issues. > >>> H, Passing the additional option in user code would be one thing, > >>> but what about library code? E.g., using memcpy (either explicitly or > >>> implicitly for a structure copy)? > >>> > >>> It looks to me like cr16 and avr are currently the only architectures > >>> that disable flag_delete_null_pointer_checks entirely, but I am sure > >>> that this issue affects other embedded targets besides nios2, too. E.g. 
> >>> scanning Mentor's ARM board support library, I see a whole pile of > >>> devices that have memory mapped at address zero (TI Stellaris/Tiva, > >>> Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, ). Plus our simulator > >>> BSPs assume a flat address space starting at address 0. > >> I forwarded this to the RTEMS list and was promptly pointed to a patch > >> on a Coldfire BSP where someone worked around this behavior. > >> > >> We are discussing how to deal with this. It is likely OK in user code but > >> horrible in BSP and driver code. We don't have a solution ourselves. We > >> just recognize it impacts a number of targets. > >> > > > > My main concern is not knowing the trap has been added to the code. If I > > could build an application and audit it somehow then I can manage it. We > > have a similar issue with the possible use of FP registers being used in > > general code (ISR save/restore trade off). > > > > Can the ELF be annotated in some GCC specific way that makes it to the > > final executable to flag this is happening ? We can then create tools to > > audit the executables. > Not really, for a variety of reasons. Is information on this reaching the pass-specific dumpfile? I don't see any explicit dumping in gimple-ssa-isolate-paths.c, but I guess that insert_trap_and_remove_trailing_statements could log itself to the dumpfile, or use the statistics framework (which itself also reaches a dumpfile). Assuming the info is reaching a dumpfile, could gcc have an option to write its dumpfiles into a special ELF section in the .s, rather than to disk? Then (given a suitable new option to e.g. eu-readelf) you'd be able to read the dumpfiles from a .o file, and (handwaving about linkage) from an execuable or library. Not that I'm volunteering... 
> However, the compiler can do
> better for warning about some of these kinds of things -- but we
> certainly can't guarantee we catch all of them as there are cases where
> the point where we determine a property (such as non-nullness) may be
> very different from the point where we exploit that property.
>
> I did propose some patches to improve the warnings back in the 4.9 time
> frame, but they never got reviewed.  See BZ 16351.  We'll have to
> revisit them during the next open development period.
>
> Jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
> On Feb 20, 2015, at 12:01 PM, Jeff Law wrote:
>
> ...
> Regardless, the right thing to do is to disable elimination of NULL pointer
> checks on targets where page 0 is mapped and thus a reference to *0 may not
> fault.  In my mind this is an attribute of both the processor (see H8 above)
> and/or the target OS.
>
> On those targets the C-runtime had better also ensure that its headers
> aren't decorated with non-null attributes, particularly for the mem* and
> str* functions.

pdp11 is an example of such a target, independent of OS (with only 8 pages,
clearly no one is going to unmap page 0).  Fortunately there one is unlikely
to find C program data structures at 0 (instead, vectors if kernel, stack
limit if user mode).  So no fault is correct there, but no null (0 bits)
pointer is also, for practical purposes, a valid assumption.
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
> On Feb 20, 2015, at 12:01 PM, Jeff Law wrote:
>
> On 02/20/15 04:43, Jonathan Wakely wrote:
>>> ...
>>
>> I'm inclined to agree.
>>
>> Most developers aren't aware of the preconditions on memcpy, but GCC
>> optimizes aggressively based on those preconditions, so we have a
>> large and potentially dangerous gap between what developers expect and
>> what actually happens.
>
> But that's always true -- this isn't any different than aliasing,
> arithmetic overflow, etc.  The standards define the contract between the
> compiler/library implementors and the developers.  Once the contract is
> broken, all bets are off.

True.  The unfortunate problem with C, and even more so with C++, is that
the contract is so large and complex that few, if any, are skilled enough
language lawyers to know what exactly it says.  For that matter, the
contract (the standard) is large and complex enough that it has bugs and
ambiguities, so the contract is not in fact precisely defined.  There's a
nice paper that drives this home:
http://people.csail.mit.edu/akcheung/papers/apsys12.pdf

For example, while most people know about the "no overlaps" rule of memcpy,
stuff like aliasing is far more obscure.  Or the exact meaning (if there is
one) of "volatile".

It also doesn't help that a large fraction of the contract is unenforced.
You only find out about it when a new version of the compiler starts using
a particular rule to make an optimization that suddenly breaks 10 year old
code.  I remember some heated debates between Linux folks and compiler
builders when such things strike the Linux kernel.

	paul
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/15 10:09, Florian Weimer wrote:
> On 02/20/2015 06:01 PM, Jeff Law wrote:
>> But that's always true -- this isn't any different than aliasing,
>> arithmetic overflow, etc.  The standards define the contract between the
>> compiler/library implementors and the developers.  Once the contract is
>> broken, all bets are off.
>
> What I don't like about this case (std::vector::data() returning nullptr
> vs memcpy/memcmp/qsort non-null assertions) is that it is internally
> non-composing in a totally non-obvious way.  data() is explicitly intended
> to cover interoperability with these older C functions, and it fails.

And that's precisely why I consider this class of issues the most
problematical.

Jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/19/15 14:56, Chris Johns wrote:
> On 20/02/2015 8:23 am, Joel Sherrill wrote:
>> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
>>> Jakub Jelinek wrote:
>>>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>>>
>>>>>   -fisolate-erroneous-paths-dereference:
>>>>>
>>>>> which
>>>>>
>>>>>   https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>>>
>>>>> documents as
>>>>>
>>>>>   Detect paths that trigger erroneous or undefined behavior due to
>>>>>   dereferencing a null pointer.  Isolate those paths from the main
>>>>>   control flow and turn the statement with erroneous or undefined
>>>>>   behavior into a trap.  This flag is enabled by default at -O2 and
>>>>>   higher.
>>>>>
>>>>> This results in a sizable number of previously working embedded
>>>>> programs mysteriously crashing when recompiled under gcc 4.9.  The
>>>>> problem is that embedded programs will often have ram starting at
>>>>> address zero (think hardware-defined interrupt vectors, say) which
>>>>> gets initialized by code which the
>>>>> -fisolate-erroneous-paths-dereference logic can recognize as reading
>>>>> and/or writing address zero.
>>>>
>>>> If you have some pages mapped at address 0, you really should compile
>>>> your code with -fno-delete-null-pointer-checks, otherwise you can run
>>>> into tons of other issues.
>>>
>>> Hmmm, Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>>
>>> It looks to me like cr16 and avr are currently the only architectures
>>> that disable flag_delete_null_pointer_checks entirely, but I am sure
>>> that this issue affects other embedded targets besides nios2, too.
>>> E.g., scanning Mentor's ARM board support library, I see a whole pile
>>> of devices that have memory mapped at address zero (TI Stellaris/Tiva,
>>> Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, ...).  Plus our simulator
>>> BSPs assume a flat address space starting at address 0.
>>
>> I forwarded this to the RTEMS list and was promptly pointed to a patch
>> on a Coldfire BSP where someone worked around this behavior.
>>
>> We are discussing how to deal with this.  It is likely OK in user code
>> but horrible in BSP and driver code.  We don't have a solution
>> ourselves.  We just recognize it impacts a number of targets.
>
> My main concern is not knowing the trap has been added to the code.  If I
> could build an application and audit it somehow then I can manage it.  We
> have a similar issue with the possible use of FP registers being used in
> general code (ISR save/restore trade off).
>
> Can the ELF be annotated in some GCC specific way that makes it to the
> final executable to flag this is happening ?  We can then create tools to
> audit the executables.

Not really, for a variety of reasons.  However, the compiler can do better
for warning about some of these kinds of things -- but we certainly can't
guarantee we catch all of them as there are cases where the point where we
determine a property (such as non-nullness) may be very different from the
point where we exploit that property.

I did propose some patches to improve the warnings back in the 4.9 time
frame, but they never got reviewed.  See BZ 16351.  We'll have to revisit
them during the next open development period.

Jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/15 04:43, Jonathan Wakely wrote:
> On 20 February 2015 at 11:06, Florian Weimer wrote:
>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>> Hmmm, Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>
>> The memcpy problem isn't restricted to embedded architectures.
>>
>>   size_t size;
>>   const unsigned char *source;
>>   std::vector<unsigned char> vec;
>>   …
>>   vec.resize(size);
>>   memcpy(vec.data(), source, size);
>>
>> std::vector::data() can return a null pointer if the vector is empty,
>> which means that this code is invalid for empty inputs.
>>
>> I think the C standard is wrong here.  We should extend it, as a QoI
>> matter, and support null pointers for variable-length inputs and outputs
>> if the size is 0.  But I suspect this is still a minority view.
>
> I'm inclined to agree.
>
> Most developers aren't aware of the preconditions on memcpy, but GCC
> optimizes aggressively based on those preconditions, so we have a
> large and potentially dangerous gap between what developers expect and
> what actually happens.

But that's always true -- this isn't any different than aliasing,
arithmetic overflow, etc.  The standards define the contract between the
compiler/library implementors and the developers.  Once the contract is
broken, all bets are off.

jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/2015 06:01 PM, Jeff Law wrote:
> But that's always true -- this isn't any different than aliasing,
> arithmetic overflow, etc.  The standards define the contract between the
> compiler/library implementors and the developers.  Once the contract is
> broken, all bets are off.

What I don't like about this case (std::vector::data() returning nullptr
vs memcpy/memcmp/qsort non-null assertions) is that it is internally
non-composing in a totally non-obvious way.  data() is explicitly intended
to cover interoperability with these older C functions, and it fails.

But you are right about overflows.  I think we should give up and just
enable -fwrapv by default in Fedora and downstream.  This issue has been
explicitly documented since 2002 at least (explicitly with security-related
checks in mind), and programmers still write overflow checks which are only
correct with -fwrapv, and it passes code review.  I fear that's not going
to change, ever.

-- 
Florian Weimer / Red Hat Product Security
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/15 05:10, Jakub Jelinek wrote:
> On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>> Hmmm, Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>
>> The memcpy problem isn't restricted to embedded architectures.
>>
>>   size_t size;
>>   const unsigned char *source;
>>   std::vector<unsigned char> vec;
>>   …
>>   vec.resize(size);
>>   memcpy(vec.data(), source, size);
>>
>> std::vector::data() can return a null pointer if the vector is empty,
>> which means that this code is invalid for empty inputs.
>>
>> I think the C standard is wrong here.  We should extend it, as a QoI
>> matter, and support null pointers for variable-length inputs and outputs
>> if the size is 0.  But I suspect this is still a minority view.
>
> I disagree.  If you want a function that will have that different
> property, don't call it memcpy.

Right.  If someone wants to take it up with the Austin group, that's fine.
But until/unless the Austin group blesses, I don't think we should extend
as a QoI matter.

jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/15 04:06, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>> Hmmm, Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>
> The memcpy problem isn't restricted to embedded architectures.
>
>   size_t size;
>   const unsigned char *source;
>   std::vector<unsigned char> vec;
>   …
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
>
> std::vector::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.
>
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.

And it's these kinds of uses that scare me the most.

jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/15 04:45, Florian Weimer wrote:
> On 02/20/2015 10:30 AM, Andrew Haley wrote:
>> I doubt that such a thing is ever going to be safe.  The idea that a
>> null pointer points to nothing is so hard-baked into the design of C
>> that you can't get away from it.  Also, almost every C programmer and
>> especially library writer "knows" that a null pointer points to
>> nothing.
>
> NULL pointer dereferences (or NULL pointer with small offsets) were
> common programming idioms in the DOS days because the interrupt vector
> table was located at this address.
>
> Quite a few systems once had a readable page zero, and (manual, I
> assume) optimizations for list traversal
> (p != NULL && p->next != NULL → p->next != NULL) were commonly used on
> these systems.

True, but thankfully this isn't blessed anymore.

> I think the treatment of pointers not as addresses, but something that
> has type information and provenance associated with it, came much later,
> when most of the design was already settled.

We still have targets where page0 is special.  The H8 for example comes to
mind.  Folks regularly place data into page0 and mark it as special so the
compiler emits more efficient sequences to access that data.

Regardless, the right thing to do is to disable elimination of NULL pointer
checks on targets where page 0 is mapped and thus a reference to *0 may not
fault.  In my mind this is an attribute of both the processor (see H8
above) and/or the target OS.

On those targets the C-runtime had better also ensure that its headers
aren't decorated with non-null attributes, particularly for the mem* and
str* functions.

jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/2015 11:06 AM, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>> Hmmm, Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>
> The memcpy problem isn't restricted to embedded architectures.
>
>   size_t size;
>   const unsigned char *source;
>   std::vector<unsigned char> vec;
>   …
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
>
> std::vector::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.

Sure, but if that's a bug then it's a bug in the definition of memcpy(),
not in the definition of the properties of a null pointer.  If the size is
zero then it really shouldn't matter if the destination address is null.

Andrew.
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On Fri, Feb 20, 2015 at 12:06:28PM +0100, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
> > Hmmm, Passing the additional option in user code would be one thing,
> > but what about library code?  E.g., using memcpy (either explicitly or
> > implicitly for a structure copy)?
>
> The memcpy problem isn't restricted to embedded architectures.
>
>   size_t size;
>   const unsigned char *source;
>   std::vector<unsigned char> vec;
>   …
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
>
> std::vector::data() can return a null pointer if the vector is empty,
> which means that this code is invalid for empty inputs.
>
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.

I disagree.  If you want a function that will have that different property,
don't call it memcpy.

	Jakub
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/2015 12:43 PM, Jonathan Wakely wrote:
> On 20 February 2015 at 11:06, Florian Weimer wrote:
>> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>>> Hmmm, Passing the additional option in user code would be one thing,
>>> but what about library code?  E.g., using memcpy (either explicitly or
>>> implicitly for a structure copy)?
>>
>> The memcpy problem isn't restricted to embedded architectures.
>>
>>   size_t size;
>>   const unsigned char *source;
>>   std::vector<unsigned char> vec;
>>   …
>>   vec.resize(size);
>>   memcpy(vec.data(), source, size);
>>
>> std::vector::data() can return a null pointer if the vector is empty,
>> which means that this code is invalid for empty inputs.
>>
>> I think the C standard is wrong here.  We should extend it, as a QoI
>> matter, and support null pointers for variable-length inputs and outputs
>> if the size is 0.  But I suspect this is still a minority view.
>
> I'm inclined to agree.
>
> Most developers aren't aware of the preconditions on memcpy, but GCC
> optimizes aggressively based on those preconditions, so we have a
> large and potentially dangerous gap between what developers expect and
> what actually happens.

Maybe we can add, as a compromise, an always-inline wrapper like this?

  void *memcpy(void *dst, const void *src, size_t size)
  {
    if (__builtin_constant_p(size > 0) && size > 0) {
      // Or whatever else is needed as non-null assertions.
      *(char *)dst;
      *(const char *)src;
    }
    return memcpy_real(dst, src, size); // Without non-null assertion.
  }

Then we'll still get the non-NULL optimization for the common positive
size case.

-- 
Florian Weimer / Red Hat Product Security
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/20/2015 10:30 AM, Andrew Haley wrote:
> I doubt that such a thing is ever going to be safe.  The idea that a
> null pointer points to nothing is so hard-baked into the design of C
> that you can't get away from it.  Also, almost every C programmer and
> especially library writer "knows" that a null pointer points to
> nothing.

NULL pointer dereferences (or NULL pointers with small offsets) were
common programming idioms in the DOS days because the interrupt vector
table was located at this address.  Quite a few systems once had a
readable page zero, and (manual, I assume) optimizations for list
traversal (p != NULL && p->next != NULL → p->next != NULL) were commonly
used on these systems.

I think the treatment of pointers not as addresses, but as something
that has type information and provenance associated with it, came much
later, when most of the design was already settled.

-- Florian Weimer / Red Hat Product Security
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 20 February 2015 at 11:06, Florian Weimer wrote:
> On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
>> Hmm, Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>
> The memcpy problem isn't restricted to embedded architectures.
>
>   size_t size;
>   const unsigned char *source;
>   std::vector<unsigned char> vec;
>   …
>   vec.resize(size);
>   memcpy(vec.data(), source, size);
>
> std::vector<unsigned char>::data() can return a null pointer if the
> vector is empty, which means that this code is invalid for empty inputs.
>
> I think the C standard is wrong here.  We should extend it, as a QoI
> matter, and support null pointers for variable-length inputs and outputs
> if the size is 0.  But I suspect this is still a minority view.

I'm inclined to agree.

Most developers aren't aware of the preconditions on memcpy, but GCC
optimizes aggressively based on those preconditions, so we have a
large and potentially dangerous gap between what developers expect and
what actually happens.
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/19/2015 09:56 PM, Sandra Loosemore wrote:
> Hmm, Passing the additional option in user code would be one thing,
> but what about library code?  E.g., using memcpy (either explicitly or
> implicitly for a structure copy)?

The memcpy problem isn't restricted to embedded architectures.

  size_t size;
  const unsigned char *source;
  std::vector<unsigned char> vec;
  …
  vec.resize(size);
  memcpy(vec.data(), source, size);

std::vector<unsigned char>::data() can return a null pointer if the
vector is empty, which means that this code is invalid for empty inputs.

I think the C standard is wrong here.  We should extend it, as a QoI
matter, and support null pointers for variable-length inputs and outputs
if the size is 0.  But I suspect this is still a minority view.

-- Florian Weimer / Red Hat Product Security
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 18/02/15 19:21, Jeff Prothero wrote:
> BTW, I'd also be curious to know what is regarded as engineering
> best practice for writing a value to address zero when this is
> architecturally required by the hardware platform at hand.
> Obviously one can do various things to obscure the process
> sufficiently that the current gcc implementation won't detect it and
> complain, but as gcc gets smarter about optimization those are at
> risk of failing in a future release.  It would be nice to have a
> guaranteed-to-work future-proof idiom for doing this.  Do we have
> one, short of retreating to assembly code?

I doubt that such a thing is ever going to be safe.  The idea that a
null pointer points to nothing is so hard-baked into the design of C
that you can't get away from it.  Also, almost every C programmer and
especially library writer "knows" that a null pointer points to nothing.

The only way to initialize an interrupt vector table at address zero is
to go outside the language.  While it may be possible to use some
proprietary GCC options to get around this, it's never going to be a
good idea, because some GCC author (or indeed library author) may make
the "mistake" of assuming that null pointers point to nothing.

Using C to initialize anything at address zero is so dangerous that we
shouldn't tell people that they can use such-and-such a GCC option to
support it; we should warn them never to do it, and provide as many
warnings as we can.

IMO, engineering best practice is to use assembly code.

Andrew.
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
(Thanks to everyone for the helpful feedback!)

Daniel Gutson wrote:
> what about then two warnings (disabled by default), one intended to
> tell the user each time the compiler removes a conditional
> (-fdelete-null-pointer-checks)
> and another intended to tell the user each time the compiler adds a
> trap due to dereference an address 0?
>
> E.g.
>   -Wnull-pointer-check-deleted
>   -Wnull-dereference-considered-erroneous

I very much like the idea of such warnings.  I'm not clear why one would
not warn by default when detecting non-standards-conformant code and
producing code guaranteed not to do what the programmer intended.  But
presumably most sane engineers these days compile with -Wall.  :-)

-Jeff
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 02/19/2015 03:09 PM, Jakub Jelinek wrote:
> If you have hw where NULL is mapped and you know your code violates the
> C/C++ standards by placing objects at that address, simply do use the
> option that is designed for that purpose.

As I pointed out earlier, though, that won't help you if your program
(perhaps implicitly) uses library code that's been built without the
magic option.

I thought that declaring an object explicitly placed at address 0 as
"volatile" ought to be sufficient to prevent writes to it from being
optimized away and replaced with traps, but alas, that is not the case.
Also, a structure copy to such a volatile object is implicitly invoking
the regular memcpy function without complaining about casting away
volatile-ness, as an explicit call would do.

I've attached a couple quick test cases and .s output from
arm-none-eabi-gcc -O2, using a mainline build from last night.

-Sandra

  static int * const x0 = (int *) 0x;
  static volatile int * const xv = (int *) 0x;
  static int * const xn = (int *) 0x1000;

  void f0 (int *data) { *x0 = *data; }
  void fv (int *data) { *xv = *data; }
  void fn (int *data) { *xn = *data; }

	.cpu arm7tdmi
	.fpu softvfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 2
	.eabi_attribute 34, 0
	.eabi_attribute 18, 4
	.arm
	.syntax divided
	.file	"test.c"
	.text
	.align	2
	.global	f0
	.type	f0, %function
f0:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r3, #0
	str	r3, [r3]
	.inst	0xe7f000f0
	.size	f0, .-f0
	.align	2
	.global	fv
	.type	fv, %function
fv:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r3, #0
	str	r3, [r3]
	.inst	0xe7f000f0
	.size	fv, .-fv
	.align	2
	.global	fn
	.type	fn, %function
fn:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	mov	r3, #268435456
	ldr	r2, [r0]
	str	r2, [r3]
	bx	lr
	.size	fn, .-fn
	.ident	"GCC: (GNU) 5.0.0 20150219 (experimental)"

  struct big { int a, b; char c [100]; };

  static struct big * const x0 = (struct big *) 0x;
  static volatile struct big * const xv = (struct big *) 0x;
  static struct big * const xn = (struct big *) 0x1000;

  void f0 (struct big *data) { *x0 = *data; }
  void fv (struct big *data) { *xv = *data; }
  void fn (struct big *data) { *xn = *data; }

	.cpu arm7tdmi
	.fpu softvfp
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 2
	.eabi_attribute 34, 0
	.eabi_attribute 18, 4
	.arm
	.syntax divided
	.file	"test2.c"
	.text
	.align	2
	.global	f0
	.type	f0, %function
f0:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	mov	r1, r0
	stmfd	sp!, {r4, lr}
	mov	r2, #108
	mov	r0, #0
	bl	memcpy
	.inst	0xe7f000f0
	.size	f0, .-f0
	.align	2
	.global	fv
	.type	fv, %function
fv:
	@ Function supports interworking.
	@ Volatile: function does not return.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	mov	r1, r0
	stmfd	sp!, {r4, lr}
	mov	r2, #108
	mov	r0, #0
	bl	memcpy
	.inst	0xe7f000f0
	.size	fv, .-fv
	.align	2
	.global	fn
	.type	fn, %function
fn:
	@ Function supports interworking.
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	stmfd	sp!, {r4, lr}
	mov	r1, r0
	mov	r2, #108
	mov	r0, #268435456
	bl	memcpy
	ldmfd	sp!, {r4, lr}
	bx	lr
	.size	fn, .-fn
	.ident	"GCC: (GNU) 5.0.0 20150219 (experimental)"
Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On Thu, Feb 19, 2015 at 06:16:05PM -0300, Daniel Gutson wrote: > what about then two warnings (disabled by default), one intended to > tell the user each time the compiler removes a conditional > (-fdelete-null-pointer-checks) > and another intended to tell the user each time the compiler adds a trap due > to > dereference an address 0? > > E.g. >-Wnull-pointer-check-deleted >-Wnull-dereference-considered-erroneous > > or alike That would be extremely difficult. The -fdelete-null-pointer-checks option is used in many places, like the path isolation, value range propagation, alias oracle, number of iteration analysis etc. E.g. in case of value range propagation, it is really hard to warn if something has been optimized some way because of it, because you really don't know the reason why after all the propagation some SSA_NAME got certain range, to warn you'd essentially have to do all of VRP twice, once with -fdelete-null-pointer-checks and once without, and then compare that when actually performing optimizations. But some optimizations are also done far later than directly in the VRP pass. If you have hw where NULL is mapped and you know your code violates the C/C++ standards by placing objects at that address, simply do use the option that is designed for that purpose. Jakub
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 20/02/2015 8:23 am, Joel Sherrill wrote:

On 2/19/2015 2:56 PM, Sandra Loosemore wrote:

Jakub Jelinek wrote:

On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:

Starting with gcc 4.9, -O2 implicitly invokes
-fisolate-erroneous-paths-dereference, which
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html documents as:

Detect paths that trigger erroneous or undefined behavior due to
dereferencing a null pointer.  Isolate those paths from the main control
flow and turn the statement with erroneous or undefined behavior into a
trap.  This flag is enabled by default at -O2 and higher.

This results in a sizable number of previously working embedded programs
mysteriously crashing when recompiled under gcc 4.9.  The problem is
that embedded programs will often have ram starting at address zero
(think hardware-defined interrupt vectors, say) which gets initialized
by code which the -fisolate-erroneous-paths-dereference logic can
recognize as reading and/or writing address zero.

If you have some pages mapped at address 0, you really should compile
your code with -fno-delete-null-pointer-checks, otherwise you can run
into tons of other issues.

Hmm, Passing the additional option in user code would be one thing, but
what about library code?  E.g., using memcpy (either explicitly or
implicitly for a structure copy)?

It looks to me like cr16 and avr are currently the only architectures
that disable flag_delete_null_pointer_checks entirely, but I am sure
that this issue affects other embedded targets besides nios2, too.  E.g.
scanning Mentor's ARM board support library, I see a whole pile of
devices that have memory mapped at address zero (TI Stellaris/Tiva,
Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, …).  Plus our simulator BSPs
assume a flat address space starting at address 0.

I forwarded this to the RTEMS list and was promptly pointed to a patch
on a Coldfire BSP where someone worked around this behavior.  We are
discussing how to deal with this.  It is likely OK in user code but
horrible in BSP and driver code.  We don't have a solution ourselves.
We just recognize it impacts a number of targets.

My main concern is not knowing the trap has been added to the code.  If
I could build an application and audit it somehow then I can manage it.
We have a similar issue with the possible use of FP registers being used
in general code (ISR save/restore trade off).

Can the ELF be annotated in some GCC-specific way that makes it to the
final executable to flag this is happening?  We can then create tools to
audit the executables.

Chris
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
> Jakub Jelinek wrote:
>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>
>>> -fisolate-erroneous-paths-dereference:
>>>
>>> which
>>>
>>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>
>>> documents as
>>>
>>> Detect paths that trigger erroneous or undefined behavior due to
>>> dereferencing a null pointer.  Isolate those paths from the main
>>> control flow and turn the statement with erroneous or undefined
>>> behavior into a trap.  This flag is enabled by default at -O2 and
>>> higher.
>>>
>>> This results in a sizable number of previously working embedded
>>> programs mysteriously crashing when recompiled under gcc 4.9.  The
>>> problem is that embedded programs will often have ram starting at
>>> address zero (think hardware-defined interrupt vectors, say) which
>>> gets initialized by code which the
>>> -fisolate-erroneous-paths-dereference logic can recognize as reading
>>> and/or writing address zero.
>>
>> If you have some pages mapped at address 0, you really should compile
>> your code with -fno-delete-null-pointer-checks, otherwise you can run
>> into tons of other issues.
>
> Hmm, Passing the additional option in user code would be one thing,
> but what about library code?  E.g., using memcpy (either explicitly or
> implicitly for a structure copy)?
>
> It looks to me like cr16 and avr are currently the only architectures
> that disable flag_delete_null_pointer_checks entirely, but I am sure
> that this issue affects other embedded targets besides nios2, too.  E.g.
> scanning Mentor's ARM board support library, I see a whole pile of
> devices that have memory mapped at address zero (TI Stellaris/Tiva,
> Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, …).  Plus our simulator
> BSPs assume a flat address space starting at address 0.

I forwarded this to the RTEMS list and was promptly pointed to a patch
on a Coldfire BSP where someone worked around this behavior.  We are
discussing how to deal with this.  It is likely OK in user code but
horrible in BSP and driver code.  We don't have a solution ourselves.
We just recognize it impacts a number of targets.

> I can see both sides of the issue here.  On the one hand, you get
> better code for 99.99% of situations by enabling
> -fdelete-null-pointer-checks, but if it makes GCC unusable in the other
> .01% case, that is a problem for the users for whom that case is
> critical.  :-S  So I think Jeff's request here is something that
> deserves an answer:

Which side is benefiting?  The embedded side or self-hosted side?  What
is the actual benefit?  Can the option be disabled/enabled on a per
target basis?  I honestly could see disabling it for *-rtems* and on
purely embedded architectures.

>>> BTW, I'd also be curious to know what is regarded as engineering best
>>> practice for writing a value to address zero when this is
>>> architecturally required by the hardware platform at hand.  Obviously
>>> one can do various things to obscure the process sufficiently that
>>> the current gcc implementation won't detect it and complain, but as
>>> gcc gets smarter about optimization those are at risk of failing in a
>>> future release.  It would be nice to have a guaranteed-to-work
>>> future-proof idiom for doing this.  Do we have one, short of
>>> retreating to assembly code?

Or GCC specific code.  I considered a special memcpy clone for our BSPs
which was not inlined and used a pragma to turn this optimization off.
Then it would just be a matter of using it in the right places.  But
knowing all the right places is a hard problem by itself.  :(

> -Sandra

--
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985
Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
(Hi Sandra, so long!)

On Thu, Feb 19, 2015 at 5:56 PM, Sandra Loosemore wrote:
> Jakub Jelinek wrote:
>>
>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>>
>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>
>>> -fisolate-erroneous-paths-dereference:
>>>
>>> which
>>>
>>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>
>>> documents as
>>>
>>> Detect paths that trigger erroneous or undefined behavior due to
>>> dereferencing a null pointer.  Isolate those paths from the main
>>> control flow and turn the statement with erroneous or undefined
>>> behavior into a trap.  This flag is enabled by default at -O2 and
>>> higher.
>>>
>>> This results in a sizable number of previously working embedded
>>> programs mysteriously crashing when recompiled under gcc 4.9.  The
>>> problem is that embedded programs will often have ram starting at
>>> address zero (think hardware-defined interrupt vectors, say) which
>>> gets initialized by code which the
>>> -fisolate-erroneous-paths-dereference logic can recognize as reading
>>> and/or writing address zero.
>>
>> If you have some pages mapped at address 0, you really should compile
>> your code with -fno-delete-null-pointer-checks, otherwise you can run
>> into tons of other issues.
>
> Hmm, Passing the additional option in user code would be one thing, but
> what about library code?  E.g., using memcpy (either explicitly or
> implicitly for a structure copy)?
>
> It looks to me like cr16 and avr are currently the only architectures
> that disable flag_delete_null_pointer_checks entirely, but I am sure
> that this issue affects other embedded targets besides nios2, too.  E.g.
> scanning Mentor's ARM board support library, I see a whole pile of
> devices that have memory mapped at address zero (TI Stellaris/Tiva,
> Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, …).  Plus our simulator BSPs
> assume a flat address space starting at address 0.
>
> I can see both sides of the issue here.  On the one hand, you get
> better code for 99.99% of situations by enabling
> -fdelete-null-pointer-checks, but if it makes GCC unusable in the other
> .01% case, that is a problem for the users for whom that case is
> critical.  :-S  So I think Jeff's request here is something that
> deserves an answer:

what about then two warnings (disabled by default), one intended to tell
the user each time the compiler removes a conditional
(-fdelete-null-pointer-checks) and another intended to tell the user
each time the compiler adds a trap due to dereference of address 0?

E.g.
  -Wnull-pointer-check-deleted
  -Wnull-dereference-considered-erroneous

or alike?

Daniel.

>>> BTW, I'd also be curious to know what is regarded as engineering best
>>> practice for writing a value to address zero when this is
>>> architecturally required by the hardware platform at hand.  Obviously
>>> one can do various things to obscure the process sufficiently that
>>> the current gcc implementation won't detect it and complain, but as
>>> gcc gets smarter about optimization those are at risk of failing in a
>>> future release.  It would be nice to have a guaranteed-to-work
>>> future-proof idiom for doing this.  Do we have one, short of
>>> retreating to assembly code?
>
> -Sandra

--
Daniel F. Gutson
Chief Technology Officer, SPD
San Lorenzo 47, 3rd Floor, Office 5
Córdoba, Argentina
Phone: +54 351 4217888 / +54 351 4218211
Skype: dgutson
LinkedIn: http://ar.linkedin.com/in/danielgutson
Re: Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
Jakub Jelinek wrote:
> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>> Starting with gcc 4.9, -O2 implicitly invokes
>> -fisolate-erroneous-paths-dereference, which
>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html documents as:
>>
>> Detect paths that trigger erroneous or undefined behavior due to
>> dereferencing a null pointer.  Isolate those paths from the main
>> control flow and turn the statement with erroneous or undefined
>> behavior into a trap.  This flag is enabled by default at -O2 and
>> higher.
>>
>> This results in a sizable number of previously working embedded
>> programs mysteriously crashing when recompiled under gcc 4.9.  The
>> problem is that embedded programs will often have ram starting at
>> address zero (think hardware-defined interrupt vectors, say) which
>> gets initialized by code which the
>> -fisolate-erroneous-paths-dereference logic can recognize as reading
>> and/or writing address zero.
>
> If you have some pages mapped at address 0, you really should compile
> your code with -fno-delete-null-pointer-checks, otherwise you can run
> into tons of other issues.

Hmm, Passing the additional option in user code would be one thing, but
what about library code?  E.g., using memcpy (either explicitly or
implicitly for a structure copy)?

It looks to me like cr16 and avr are currently the only architectures
that disable flag_delete_null_pointer_checks entirely, but I am sure
that this issue affects other embedded targets besides nios2, too.  E.g.
scanning Mentor's ARM board support library, I see a whole pile of
devices that have memory mapped at address zero (TI Stellaris/Tiva,
Energy Micro EFM32Gxxx, Atmel AT91SAMxxx, …).  Plus our simulator BSPs
assume a flat address space starting at address 0.

I can see both sides of the issue here.  On the one hand, you get better
code for 99.99% of situations by enabling -fdelete-null-pointer-checks,
but if it makes GCC unusable in the other .01% case, that is a problem
for the users for whom that case is critical.  :-S  So I think Jeff's
request here is something that deserves an answer:

> BTW, I'd also be curious to know what is regarded as engineering best
> practice for writing a value to address zero when this is
> architecturally required by the hardware platform at hand.  Obviously
> one can do various things to obscure the process sufficiently that the
> current gcc implementation won't detect it and complain, but as gcc
> gets smarter about optimization those are at risk of failing in a
> future release.  It would be nice to have a guaranteed-to-work
> future-proof idiom for doing this.  Do we have one, short of retreating
> to assembly code?

-Sandra
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On Wed, Feb 18, 2015 at 11:21 AM, Jeff Prothero wrote:
>
> Starting with gcc 4.9, -O2 implicitly invokes
>
> -fisolate-erroneous-paths-dereference:
>
> which
>
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>
> documents as
>
> Detect paths that trigger erroneous or undefined behavior due to
> dereferencing a null pointer.  Isolate those paths from the main control
> flow and turn the statement with erroneous or undefined behavior into a
> trap.  This flag is enabled by default at -O2 and higher.
>
> This results in a sizable number of previously working embedded programs
> mysteriously crashing when recompiled under gcc 4.9.  The problem is
> that embedded programs will often have ram starting at address zero
> (think hardware-defined interrupt vectors, say) which gets initialized
> by code which the -fisolate-erroneous-paths-dereference logic can
> recognize as reading and/or writing address zero.

You should have used -fno-delete-null-pointer-checks, which has been
doing this optimization for a long time now; it just got better with the
-fisolate-erroneous-paths-dereference pass.

Thanks,
Andrew Pinski

> What happens then is that the previously running program compiles
> without any warnings, but then typically locks up mysteriously (often
> disabling the remote debug link) due to the trap not being gracefully
> handled by the embedded runtime.
>
> Granted, such code is out-of-spec wrt the C standards.
>
> Nonetheless, the problem is quite painful to track down and unexpected.
>
> Is there any good reason the
>
> -fisolate-erroneous-paths-dereference
>
> logic could not issue a compile-time warning or error, instead of just
> silently generating code virtually certain to crash at runtime?
>
> Such a warning/error would save a lot of engineers significant amounts
> of time, energy and frustration tracking down this problem.
>
> I would like to think that the spirit of gcc is about helping engineers
> efficiently correct nonstandard pain, rather than inflicting maximal
> pain upon engineers violating C standards.  :-)
>
> -Jeff
>
> BTW, I'd also be curious to know what is regarded as engineering best
> practice for writing a value to address zero when this is
> architecturally required by the hardware platform at hand.  Obviously
> one can do various things to obscure the process sufficiently that the
> current gcc implementation won't detect it and complain, but as gcc
> gets smarter about optimization those are at risk of failing in a
> future release.  It would be nice to have a guaranteed-to-work
> future-proof idiom for doing this.  Do we have one, short of retreating
> to assembly code?
Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
> Starting with gcc 4.9, -O2 implicitly invokes
>
> -fisolate-erroneous-paths-dereference:
>
> which
>
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>
> documents as
>
> Detect paths that trigger erroneous or undefined behavior due to
> dereferencing a null pointer.  Isolate those paths from the main control
> flow and turn the statement with erroneous or undefined behavior into a
> trap.  This flag is enabled by default at -O2 and higher.
>
> This results in a sizable number of previously working embedded programs
> mysteriously crashing when recompiled under gcc 4.9.  The problem is
> that embedded programs will often have ram starting at address zero
> (think hardware-defined interrupt vectors, say) which gets initialized
> by code which the -fisolate-erroneous-paths-dereference logic can
> recognize as reading and/or writing address zero.

If you have some pages mapped at address 0, you really should compile
your code with -fno-delete-null-pointer-checks, otherwise you can run
into tons of other issues.

Also, there is -fsanitize=undefined that allows discovery of such
invalid calls at runtime, though admittedly it isn't supported for all
targets.

	Jakub
Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference
Starting with gcc 4.9, -O2 implicitly invokes

  -fisolate-erroneous-paths-dereference:

which

  https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

documents as

  Detect paths that trigger erroneous or undefined behavior due to
  dereferencing a null pointer.  Isolate those paths from the main
  control flow and turn the statement with erroneous or undefined
  behavior into a trap.  This flag is enabled by default at -O2 and
  higher.

This results in a sizable number of previously working embedded programs
mysteriously crashing when recompiled under gcc 4.9.  The problem is
that embedded programs will often have ram starting at address zero
(think hardware-defined interrupt vectors, say) which gets initialized
by code which the -fisolate-erroneous-paths-dereference logic can
recognize as reading and/or writing address zero.

What happens then is that the previously running program compiles
without any warnings, but then typically locks up mysteriously (often
disabling the remote debug link) due to the trap not being gracefully
handled by the embedded runtime.

Granted, such code is out-of-spec wrt the C standards.  Nonetheless, the
problem is quite painful to track down and unexpected.

Is there any good reason the -fisolate-erroneous-paths-dereference logic
could not issue a compile-time warning or error, instead of just
silently generating code virtually certain to crash at runtime?  Such a
warning/error would save a lot of engineers significant amounts of time,
energy and frustration tracking down this problem.

I would like to think that the spirit of gcc is about helping engineers
efficiently correct nonstandard pain, rather than inflicting maximal
pain upon engineers violating C standards.  :-)

-Jeff

BTW, I'd also be curious to know what is regarded as engineering best
practice for writing a value to address zero when this is
architecturally required by the hardware platform at hand.
Obviously one can do various things to obscure the process sufficiently that the current gcc implementation won't detect it and complain, but as gcc gets smarter about optimization those are at risk of failing in a future release. It would be nice to have a guaranteed-to-work future-proof idiom for doing this. Do we have one, short of retreating to assembly code?