Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)

2018-08-24 Thread Richard Biener
On Wed, Aug 15, 2018 at 5:42 PM Jeff Law wrote:
>
> On 08/15/2018 08:47 AM, Martin Sebor wrote:
> > On 08/15/2018 12:02 AM, Jeff Law wrote:
> >> On 08/13/2018 03:23 PM, Martin Sebor wrote:
> >>> To make reviewing the changes easier I've split up the patch
> >>> into a series:
> >> [ ... ]
> >> I'm about done for the night and thus won't get into the series (and as
> >> you know Bernd has a competing patch in this space).  But I did want to
> >> chime in on two things...
> >>
> >>>
> >>> There are many more string functions where unterminated arrays (constant
> >>> or otherwise) should be diagnosed.  I plan to continue to work on
> >>> those (with the constant ones first)  but I want to post this
> >>> updated patch for review now, mainly so that the wrong code bug
> >>> (PR 86711) can be resolved and the basic detection infrastructure
> >>> agreed on.
> >> Yes, I think we definitely want to focus on the wrong code bug first.
> >>
> >>>
> >>> An open question in my mind is what should GCC do with such calls
> >>> after issuing a warning: replace them with traps?  Fold them into
> >>> constants?  Or continue to pass them through to the corresponding
> >>> library functions?
> >> My personal preference is to turn them into traps.  I don't think we
> >> have to preserve the call itself in this case.   I think the sequencing
> >> is to insert the trap before the call point, split the block after the
> >> trap, remove the outgoing edges, let DCE clean up the rest.  At least I
> >> think that's the sequencing.
> >
> > That sounds fine to me.  It would be close in its effects to
> > what _FORTIFY_SOURCE does.
> The bad guys are exceedingly resourceful in how they exploit undefined
> behavior.  By trapping immediately they don't have any window to do
> anything nefarious.
>
> >
> > It would be helpful to get a broader consensus on this and start
> > adopting the same consistent solution in all contexts.  The question
> > has come up a few times, most recently also in PR 86519 (folding
> > memcmp(a, "a", 3)) where GCC ends up calling the library function.
> Yup.  We've got a mish-mash of strategies here.

Folding cannot easily turn something "regular" like memcmp into a noreturn
thing.  At least not all callers expect that to happen.  So what you'd need
to do is ensure GF_CALL_CTRL_ALTERING is not set on the replacement
trap().  The next fixup_cfg () pass will fix things for you then.

> >
> > FWIW, if there are other preferences it might be worthwhile to
> > consider providing an option to control the behavior in these
> > cases.  There may also be interactions with or implications for
> > the sanitizers to consider.
> There are some (Marc Glisse IIRC) that would prefer to see the control
> path to the undefined behavior zapped entirely.  We didn't initially do
> that because the path may have other observable side effects.  However,
> there may be cases where it makes sense.

You can't remove observable side-effects, and given that there exist
things like signal handlers for SIGSEGV, even changing a memcmp
to __builtin_trap() may change observable behavior.

This is why some places in GCC simply refuse to optimize "broken"
cases but keep calling the library.
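
To make the signal-handler point concrete, here is a minimal sketch,
assuming a POSIX system and a GCC-compatible compiler.  The name
trap_signal is hypothetical and nothing like it exists in GCC.  It runs
__builtin_trap() in a child process and reports which signal ended it.
On x86 GNU/Linux the trap is typically a ud2 instruction delivered as
SIGILL, not SIGSEGV, so a program that installed a SIGSEGV handler
would not see it.  Other targets may behave differently.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run __builtin_trap () in a child process and
   return the number of the signal that terminated the child, 0 if the
   child exited normally, or -1 if fork or waitpid failed.  */
int
trap_signal (void)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    __builtin_trap ();   /* Child: does not return.  */

  int status;
  if (waitpid (pid, &status, 0) < 0)
    return -1;

  return WIFSIGNALED (status) ? WTERMSIG (status) : 0;
}
```

Comparing the value returned by trap_signal () with SIGSEGV and SIGILL
on a given target shows whether an existing SIGSEGV handler could still
observe the failure after such a replacement.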

Richard.

> >
> > Once there is agreement on what the solution should be I can look
> > into implementing it at some point in the future.
> ACK.  Certainly lower priority than the stuff in flight right now.
>
> jeff


Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)

2018-08-15 Thread Jeff Law
On 08/15/2018 08:47 AM, Martin Sebor wrote:
> On 08/15/2018 12:02 AM, Jeff Law wrote:
>> On 08/13/2018 03:23 PM, Martin Sebor wrote:
>>> To make reviewing the changes easier I've split up the patch
>>> into a series:
>> [ ... ]
>> I'm about done for the night and thus won't get into the series (and as
>> you know Bernd has a competing patch in this space).  But I did want to
>> chime in on two things...
>>
>>>
>>> There are many more string functions where unterminated arrays (constant
>>> or otherwise) should be diagnosed.  I plan to continue to work on
>>> those (with the constant ones first)  but I want to post this
>>> updated patch for review now, mainly so that the wrong code bug
>>> (PR 86711) can be resolved and the basic detection infrastructure
>>> agreed on.
>> Yes, I think we definitely want to focus on the wrong code bug first.
>>
>>>
>>> An open question in my mind is what should GCC do with such calls
>>> after issuing a warning: replace them with traps?  Fold them into
>>> constants?  Or continue to pass them through to the corresponding
>>> library functions?
>> My personal preference is to turn them into traps.  I don't think we
>> have to preserve the call itself in this case.   I think the sequencing
>> is to insert the trap before the call point, split the block after the
>> trap, remove the outgoing edges, let DCE clean up the rest.  At least I
>> think that's the sequencing.
> 
> That sounds fine to me.  It would be close in its effects to
> what _FORTIFY_SOURCE does.
The bad guys are exceedingly resourceful in how they exploit undefined
behavior.  By trapping immediately they don't have any window to do
anything nefarious.

> 
> It would be helpful to get a broader consensus on this and start
> adopting the same consistent solution in all contexts.  The question
> has come up a few times, most recently also in PR 86519 (folding
> memcmp(a, "a", 3)) where GCC ends up calling the library function.
Yup.  We've got a mish-mash of strategies here.

> 
> FWIW, if there are other preferences it might be worthwhile to
> consider providing an option to control the behavior in these
> cases.  There may also be interactions with or implications for
> the sanitizers to consider.
There are some (Marc Glisse IIRC) that would prefer to see the control
path to the undefined behavior zapped entirely.  We didn't initially do
that because the path may have other observable side effects.  However,
there may be cases where it makes sense.

> 
> Once there is agreement on what the solution should be I can look
> into implementing it at some point in the future.
ACK.  Certainly lower priority than the stuff in flight right now.

jeff


Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)

2018-08-15 Thread Martin Sebor

On 08/15/2018 12:02 AM, Jeff Law wrote:
> On 08/13/2018 03:23 PM, Martin Sebor wrote:
>> To make reviewing the changes easier I've split up the patch
>> into a series:
> [ ... ]
> I'm about done for the night and thus won't get into the series (and as
> you know Bernd has a competing patch in this space).  But I did want to
> chime in on two things...
>
>> There are many more string functions where unterminated arrays (constant
>> or otherwise) should be diagnosed.  I plan to continue to work on
>> those (with the constant ones first)  but I want to post this
>> updated patch for review now, mainly so that the wrong code bug
>> (PR 86711) can be resolved and the basic detection infrastructure
>> agreed on.
> Yes, I think we definitely want to focus on the wrong code bug first.
>
>> An open question in my mind is what should GCC do with such calls
>> after issuing a warning: replace them with traps?  Fold them into
>> constants?  Or continue to pass them through to the corresponding
>> library functions?
> My personal preference is to turn them into traps.  I don't think we
> have to preserve the call itself in this case.   I think the sequencing
> is to insert the trap before the call point, split the block after the
> trap, remove the outgoing edges, let DCE clean up the rest.  At least I
> think that's the sequencing.


That sounds fine to me.  It would be close in its effects to
what _FORTIFY_SOURCE does.
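
As a rough source-level illustration of that effect (a sketch only, not
how _FORTIFY_SOURCE or the proposed GCC change is implemented), a
checked length function could look like the following.  The helper name
strlen_or_trap and its explicit OBJSIZE parameter are assumptions made
for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of the string stored in the
   OBJSIZE-byte object S.  If no nul is found within the object, trap
   instead of reading past its end.  */
size_t
strlen_or_trap (const char *s, size_t objsize)
{
  const char *nul = memchr (s, '\0', objsize);
  if (!nul)
    __builtin_trap ();
  return (size_t) (nul - s);
}
```

For a properly terminated array such as "abc" this returns 3.  For an
array like { 'a', 'b', 'c' } it stops the program at the call rather
than letting the read run on.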

It would be helpful to get a broader consensus on this and start
adopting the same consistent solution in all contexts.  The question
has come up a few times, most recently also in PR 86519 (folding
memcmp(a, "a", 3)) where GCC ends up calling the library function.
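
For reference, the literal "a" occupies only two bytes, so a length of
3 asks memcmp to look past its end.  A minimal sketch of a comparison
that clamps the length to both object sizes is below.  The helper
bounded_memcmp is hypothetical and is not what the PR proposes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare at most N bytes of A and B, but never
   more than either object (ASIZE and BSIZE bytes) actually holds.  */
int
bounded_memcmp (const void *a, size_t asize,
                const void *b, size_t bsize, size_t n)
{
  if (n > asize)
    n = asize;
  if (n > bsize)
    n = bsize;
  return memcmp (a, b, n);
}
```

With this helper, bounded_memcmp (a, sizeof a, "a", sizeof "a", 3)
compares at most two bytes regardless of the requested length.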

FWIW, if there are other preferences it might be worthwhile to
consider providing an option to control the behavior in these
cases.  There may also be interactions with or implications for
the sanitizers to consider.

Once there is agreement on what the solution should be I can look
into implementing it at some point in the future.

Martin


Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)

2018-08-15 Thread Jeff Law
On 08/13/2018 03:23 PM, Martin Sebor wrote:
> To make reviewing the changes easier I've split up the patch
> into a series:
[ ... ]
I'm about done for the night and thus won't get into the series (and as
you know Bernd has a competing patch in this space).  But I did want to
chime in on two things...

> 
> There are many more string functions where unterminated arrays (constant
> or otherwise) should be diagnosed.  I plan to continue to work on
> those (with the constant ones first)  but I want to post this
> updated patch for review now, mainly so that the wrong code bug
> (PR 86711) can be resolved and the basic detection infrastructure
> agreed on.
Yes, I think we definitely want to focus on the wrong code bug first.

> 
> An open question in my mind is what should GCC do with such calls
> after issuing a warning: replace them with traps?  Fold them into
> constants?  Or continue to pass them through to the corresponding
> library functions?
My personal preference is to turn them into traps.  I don't think we
have to preserve the call itself in this case.   I think the sequencing
is to insert the trap before the call point, split the block after the
trap, remove the outgoing edges, let DCE clean up the rest.  At least I
think that's the sequencing.

Jeff


[PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)

2018-08-13 Thread Martin Sebor

To make reviewing the changes easier I've split up the patch
into a series:

1. Detection of constant arrays with no terminating nul to prevent early
   folding.  This resolves PR 86711 - wrong folding of memchr,
   and prevents PR 86714 - tree-ssa-forwprop.c confused by too
   long initializer, but doesn't warn.

2. Warn for reads past unterminated constant character arrays.
   This adds warnings for string functions called with such arrays
   to resolve PR 86552 - missing warning for reading past the end
   of non-string arrays.  Now that GCC transforms braced-initializer
   lists into STRING_CSTs (even those with no nul), the warning is
   capable of diagnosing even those.

   2.1 strlen
   2.2 strcpy
   2.3 sprintf
   2.4 stpcpy
   2.5 strnlen
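
For readers unfamiliar with the cases involved, here is a small sketch
of the kinds of constant arrays in question.  The array names and the
is_nul_terminated helper are made up for illustration and are not part
of the patch.  Passing either of the first two arrays to strlen, strcpy
and the other functions listed above reads past the end of the array.

```c
#include <stddef.h>
#include <string.h>

/* Braced initializer: 3 bytes, no terminating nul.  */
const char braced[] = { 'a', 'b', 'c' };

/* String literal that exactly fills the array: valid C (not C++),
   3 bytes, and the terminating nul is dropped.  */
const char exact[3] = "abc";

/* Ordinary string: 4 bytes including the terminating nul.  */
const char literal[] = "abc";

/* Hypothetical helper: return nonzero if the SIZE-byte array A
   contains a nul somewhere within its bounds.  */
int
is_nul_terminated (const char *a, size_t size)
{
  return memchr (a, '\0', size) != NULL;
}
```

Only the last array is safe to hand to a string function;
is_nul_terminated (braced, sizeof braced) and
is_nul_terminated (exact, sizeof exact) both return 0.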

There are many more string functions where unterminated arrays (constant
or otherwise) should be diagnosed.  I plan to continue to work on
those (with the constant ones first)  but I want to post this
updated patch for review now, mainly so that the wrong code bug
(PR 86711) can be resolved and the basic detection infrastructure
agreed on.

An open question in my mind is what should GCC do with such calls
after issuing a warning: replace them with traps?  Fold them into
constants?  Or continue to pass them through to the corresponding
library functions?

Martin

On 07/25/2018 05:38 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html
>
> The fix for bug 86532 has been checked in so this enhancement
> can now be applied on top of it (with only minor adjustments).
>
> On 07/19/2018 02:08 PM, Martin Sebor wrote:
>> In the discussion of my patch for pr86532 Bernd noted that
>> GCC silently accepts constant character arrays with no
>> terminating nul as arguments to strlen (and other string
>> functions).
>>
>> The attached patch is a first step in detecting these kinds
>> of bugs in strlen calls by issuing -Wstringop-overflow.
>> The next step is to modify all other handlers of built-in
>> functions to detect the same problem (not part of this patch).
>> Yet another step is to detect these problems in arguments
>> initialized using the non-string form:
>>
>>   const char a[] = { 'a', 'b', 'c' };
>>
>> This patch is meant to apply on top of the one for bug 86532
>> (I tested it with an earlier version of that patch so there
>> is code in the context that does not appear in the latest
>> version of the other diff).
>>
>> Martin