Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)
On Wed, Aug 15, 2018 at 5:42 PM Jeff Law wrote: > > On 08/15/2018 08:47 AM, Martin Sebor wrote: > > On 08/15/2018 12:02 AM, Jeff Law wrote: > >> On 08/13/2018 03:23 PM, Martin Sebor wrote: > >>> To make reviewing the changes easier I've split up the patch > >>> into a series: > >> [ ... ] > >> I'm about done for the night and thus won't get into the series (and as > >> you know Bernd has a competing patch in this space). But I did want to > >> chime in on two things... > >> > >>> > >>> There are many more string functions where unterminated (constant > >>> or otherwise) should be diagnosed. I plan to continue to work on > >>> those (with the constant ones first) but I want to post this > >>> updated patch for review now, mainly so that the wrong code bug > >>> (PR 86711) can be resolved and the basic detection infrastructure > >>> agreed on. > >> Yes, I think we definitely want to focus on the wrong code bug first. > >> > >>> > >>> An open question in my mind is what should GCC do with such calls > >>> after issuing a warning: replace them with traps? Fold them into > >>> constants? Or continue to pass them through to the corresponding > >>> library functions? > >> My personal preference is to turn them into traps. I don't think we > >> have to preserve the call itself in this case. I think the sequencing > >> is to insert the trap before the call point, split the block after the > >> trap, remove the outgoing edges, let DCE clean up the rest. At least I > >> think that's the sequencing. > > > > That sounds fine to me. It would be close in its effects to > > what _FORTIFY_SOURCE does. > The bad guys are exceedingly resourceful in how they exploit undefined > behavior. By trapping immediately they don't have any window to do > anything nefarious. > > > > > It would be helpful to get a broader consensus on this and start > > adopting the same consistent solution in all contexts. The question > > has come up a few times, most recently also in PR 86519 (folding > > memcmp(a, "a", 3)) where GCC ends up calling the library function. > Yup. We've got a mish-mash of strategies here. Folding cannot easily make sth "regular" as memcmp a noreturn thing. At least not all callers expect that to happen. So what you'd need to do is ensure GF_CALL_CTRL_ALTERING is not set on the replacement trap(). The next fixup_cfg () pass will fix things for you then. > > > > FWIW, if there are other preferences it might be worthwhile to > > consider providing an option to control the behavior in these > > cases. There may also be interactions with or implications for > > the sanitizers to consider. > There's some (Marc Glisse IIRC) that would prefer to see the control > path to the undefined behavior zapped entirely. We didn't initially do > that because the path my have other observable side effects. However, > there may be cases where it makes sense. You can't remove observable side-effects and given that there exist things like signal handlers for SIGSEGV even changing a memcmp to __builtin_trap() may change observable behavior. This is why some places in GCC simply refuse to optimize "broken" cases but keep calling the library. Richard. > > > > Once there is agreement on what the solution should be I can look > > into implementing it at some point in the future. > ACK. Certainly lower priority than the stuff in flight right now. > > jeff
Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)
On 08/15/2018 08:47 AM, Martin Sebor wrote: > On 08/15/2018 12:02 AM, Jeff Law wrote: >> On 08/13/2018 03:23 PM, Martin Sebor wrote: >>> To make reviewing the changes easier I've split up the patch >>> into a series: >> [ ... ] >> I'm about done for the night and thus won't get into the series (and as >> you know Bernd has a competing patch in this space). But I did want to >> chime in on two things... >> >>> >>> There are many more string functions where unterminated (constant >>> or otherwise) should be diagnosed. I plan to continue to work on >>> those (with the constant ones first) but I want to post this >>> updated patch for review now, mainly so that the wrong code bug >>> (PR 86711) can be resolved and the basic detection infrastructure >>> agreed on. >> Yes, I think we definitely want to focus on the wrong code bug first. >> >>> >>> An open question in my mind is what should GCC do with such calls >>> after issuing a warning: replace them with traps? Fold them into >>> constants? Or continue to pass them through to the corresponding >>> library functions? >> My personal preference is to turn them into traps. I don't think we >> have to preserve the call itself in this case. I think the sequencing >> is to insert the trap before the call point, split the block after the >> trap, remove the outgoing edges, let DCE clean up the rest. At least I >> think that's the sequencing. > > That sounds fine to me. It would be close in its effects to > what _FORTIFY_SOURCE does. The bad guys are exceedingly resourceful in how they exploit undefined behavior. By trapping immediately they don't have any window to do anything nefarious. > > It would be helpful to get a broader consensus on this and start > adopting the same consistent solution in all contexts. The question > has come up a few times, most recently also in PR 86519 (folding > memcmp(a, "a", 3)) where GCC ends up calling the library function. Yup. We've got a mish-mash of strategies here. > > FWIW, if there are other preferences it might be worthwhile to > consider providing an option to control the behavior in these > cases. There may also be interactions with or implications for > the sanitizers to consider. There's some (Marc Glisse IIRC) that would prefer to see the control path to the undefined behavior zapped entirely. We didn't initially do that because the path my have other observable side effects. However, there may be cases where it makes sense. > > Once there is agreement on what the solution should be I can look > into implementing it at some point in the future. ACK. Certainly lower priority than the stuff in flight right now. jeff
Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)
On 08/15/2018 12:02 AM, Jeff Law wrote: On 08/13/2018 03:23 PM, Martin Sebor wrote: To make reviewing the changes easier I've split up the patch into a series: [ ... ] I'm about done for the night and thus won't get into the series (and as you know Bernd has a competing patch in this space). But I did want to chime in on two things... There are many more string functions where unterminated (constant or otherwise) should be diagnosed. I plan to continue to work on those (with the constant ones first) but I want to post this updated patch for review now, mainly so that the wrong code bug (PR 86711) can be resolved and the basic detection infrastructure agreed on. Yes, I think we definitely want to focus on the wrong code bug first. An open question in my mind is what should GCC do with such calls after issuing a warning: replace them with traps? Fold them into constants? Or continue to pass them through to the corresponding library functions? My personal preference is to turn them into traps. I don't think we have to preserve the call itself in this case. I think the sequencing is to insert the trap before the call point, split the block after the trap, remove the outgoing edges, let DCE clean up the rest. At least I think that's the sequencing. That sounds fine to me. It would be close in its effects to what _FORTIFY_SOURCE does. It would be helpful to get a broader consensus on this and start adopting the same consistent solution in all contexts. The question has come up a few times, most recently also in PR 86519 (folding memcmp(a, "a", 3)) where GCC ends up calling the library function. FWIW, if there are other preferences it might be worthwhile to consider providing an option to control the behavior in these cases. There may also be interactions with or implications for the sanitizers to consider. Once there is agreement on what the solution should be I can look into implementing it at some point in the future. Martin
Re: [PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)
On 08/13/2018 03:23 PM, Martin Sebor wrote: > To make reviewing the changes easier I've split up the patch > into a series: [ ... ] I'm about done for the night and thus won't get into the series (and as you know Bernd has a competing patch in this space). But I did want to chime in on two things... > > There are many more string functions where unterminated (constant > or otherwise) should be diagnosed. I plan to continue to work on > those (with the constant ones first) but I want to post this > updated patch for review now, mainly so that the wrong code bug > (PR 86711) can be resolved and the basic detection infrastructure > agreed on. Yes, I think we definitely want to focus on the wrong code bug first. > > An open question in my mind is what should GCC do with such calls > after issuing a warning: replace them with traps? Fold them into > constants? Or continue to pass them through to the corresponding > library functions? My personal preference is to turn them into traps. I don't think we have to preserve the call itself in this case. I think the sequencing is to insert the trap before the call point, split the block after the trap, remove the outgoing edges, let DCE clean up the rest. At least I think that's the sequencing. Jeff
[PATCH 0/6] improve handling of char arrays with missing nul (PR 86552, 86711, 86714)
To make reviewing the changes easier I've split up the patch into a series: 1. Detection of nul-terminated constant arrays to prevent early folding. This resolves PR 86711 - wrong folding of memchr, and prevents PR 86714 - tree-ssa-forwprop.c confused by too long initializer, but doesn't warn. 2. Warn for reads past unterminated constant character arrays. This adds warnings for string functions called with such arrays to resolve PR 86552 - missing warning for reading past the end of non-string arrays. Now that GCC transforms braced-initializer lists into STRING_CSTs (even those with no nul), the warning is capable of diagnosing even those. 2.1 strlen 2.2 strcpy 2.3 sprintf 2.4 stpcpy 2.5 strnlen There are many more string functions where unterminated (constant or otherwise) should be diagnosed. I plan to continue to work on those (with the constant ones first) but I want to post this updated patch for review now, mainly so that the wrong code bug (PR 86711) can be resolved and the basic detection infrastructure agreed on. An open question in my mind is what should GCC do with such calls after issuing a warning: replace them with traps? Fold them into constants? Or continue to pass them through to the corresponding library functions? Martin On 07/25/2018 05:38 PM, Martin Sebor wrote: Ping: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01124.html The fix for bug 86532 has been checked in so this enhancement can now be applied on top of it (with only minor adjustments). On 07/19/2018 02:08 PM, Martin Sebor wrote: In the discussion of my patch for pr86532 Bernd noted that GCC silently accepts constant character arrays with no terminating nul as arguments to strlen (and other string functions). The attached patch is a first step in detecting these kinds of bugs in strlen calls by issuing -Wstringop-overflow. The next step is to modify all other handlers of built-in functions to detect the same problem (not part of this patch). Yet another step is to detect these problems in arguments initialized using the non-string form: const char a[] = { 'a', 'b', 'c' }; This patch is meant to apply on top of the one for bug 86532 (I tested it with an earlier version of that patch so there is code in the context that does not appear in the latest version of the other diff). Martin