[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 Martin Sebor changed: What|Removed |Added Keywords||diagnostic Status|NEW |RESOLVED Known to work||10.0, 7.3.0, 8.3.0, 9.1.0 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=86349 Resolution|--- |DUPLICATE Known to fail||6.3.0 --- Comment #18 from Martin Sebor --- Starting with r240298, GCC 7 and later diagnose the buffer overflow in both calls so this request can be resolved as a duplicate of pr49905. The output with the current trunk at -O0 is below. $ gcc -S -Wall pr54582.c pr54582.c: In function ‘f’: pr54582.c:8:18: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 0 [-Wformat-overflow=] 8 | sprintf(buf, "ab%d", n); | ^~ pr54582.c:8:2: note: ‘sprintf’ output between 4 and 14 bytes into a destination of size 2 8 | sprintf(buf, "ab%d", n); | ^~~ pr54582.c:11:18: warning: ‘sprintf’ writing a terminating nul past the end of the destination [-Wformat-overflow=] 11 | sprintf(buf, "ab"); | ^ pr54582.c:11:2: note: ‘sprintf’ output 3 bytes into a destination of size 2 11 | sprintf(buf, "ab"); | ^~ With -D_FORTIFY_SOURCE, both calls are still instrumented. ;; Function f (f, funcdef_no=23, decl_uid=2524, cgraph_uid=24, symbol_order=23) f (int n) { char buf[2]; [local count: 1073741824]: __builtin___sprintf_chk (&buf, 1, 2, "ab%d", n_2(D)); __builtin_puts (&buf); __builtin___sprintf_chk (&buf, 1, 2, "ab"); __builtin_puts (&buf); buf ={v} {CLOBBER}; return; } What still doesn't work is the overflow detection/prevention for dynamically allocated objects and VLAs. Bug 86349 records this limitation. The strlen pass tracks dynamic allocation and uses it for optimization but not yet to detect buffer overflow. Making use of it is a relatively simple enhancement. Now that the sprintf checker runs as part of the strlen pass, with the enhancement in place, exposing the dynamic allocation sizes to it to detect the overflow should be straightforward (and is on my list of things to do). $ cat t.c && gcc -O2 -D_FORTIFY_SOURCE=2 -S -Wall -fdump-tree-optimized=/dev/stdout t.c #include int n = 2; void f (int n) { char buf[n]; sprintf (buf, "abcdef"); printf ("%s\n", buf); } ;; Function f (f, funcdef_no=23, decl_uid=2525, cgraph_uid=24, symbol_order=24) f (int n) { char[0:D.2530] * buf.1; sizetype _1; [local count: 1073741824]: _1 = (sizetype) n_2(D); buf.1_7 = __builtin_alloca_with_align (_1, 8); __builtin_memcpy (buf.1_7, "abcdef", 7); __builtin_puts (buf.1_7); return; } *** This bug has been marked as a duplicate of bug 49905 ***
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 Eric Gallager changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #17 from Eric Gallager --- Martin Sebor might be interested in this
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #16 from David Binderman --- Created attachment 30287 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30287&action=edit C++ source code Latest version understands NUM1.NUM2, 53 formatting specifiers and a bunch of prefixes like %# It seems to work well over the source code of Fedora Linux.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #15 from David Binderman 2013-03-08 08:48:47 UTC --- Created attachment 29615 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29615 C++ source code This one now understands 40 common formats and can properly interpret field widths. The next version will understand the NUM1.NUM2 format.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #14 from David Binderman 2013-02-14 19:06:54 UTC --- Created attachment 29453 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29453 C++ source code Old version understood about a dozen formats, this later version understands 34 formats. Next version will understand some field widths.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #13 from David Binderman 2013-02-07 21:21:56 UTC --- Created attachment 29391 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29391 C++ source code This code doesn't understand all sprintf % specifiers. It is future work to make it understand more specifiers.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #12 from David Binderman 2013-02-07 21:19:15 UTC --- I coded up some of my ideas into the attached C++ source code. Then I modified builtins.c and did a bootstrap. My small patch seems to be working well. I'm sure someone could convert my C++ code into something that follows the compiler coding conventions, add test cases etc.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #11 from David Binderman 2013-02-06 18:47:37 UTC --- >It isn't that easy. For %'s you really have to parse all the characters after >% and figure out where the format specifier ends. Agreed. But it doesn't have to be perfect, it just has to be better than the previous solution. This is only a warning we are discussing. It can be ignored. The Werror folks are, IMHO, far too keen for their own good. I wrote: >All the numeric specifiers (%d, %u etc) all produce at least one >character, so gcc could take account of this in checking buffer lengths. In computing a lower bound, all % specifiers could be assumed to produce at least one byte of output. Parsing the % specifier doesn't have to be perfect, it just has to understand the common cases (%s, %d, %u etc) and punt on the rest. More understanding can be added in later versions.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 Florian Weimer changed: What|Removed |Added CC||fweimer at redhat dot com --- Comment #10 from Florian Weimer 2013-02-06 14:28:35 UTC --- (In reply to comment #9) > 1) this is -D_FORTIFY_SOURCE warning, you can invent other warnings elsewhere > 2) with -D_FORTIFY_SOURCE, e.g. sprintf is an inline function, so the FE sees > it as a call to an inline function with some argument, you need to inline it, > figure out what the inline does, then fold the builtins used in the inline. > Also consider > char buf[2]; > char *p; > p = buf; > sprintf(buf, "ab%d", n); I think you mean sprintf(p, ...). > Unless you move the optimization passes into the FE, you aren't going to warn > about this properly in the FE. Insisting on a FE warning in this case is just > dumb. We could duplicate optimizations in the front end (like others do). More seriously, I think this is a case where a layering violation makes perfect sense.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #9 from Jakub Jelinek 2013-02-06 13:46:05 UTC --- 1) this is -D_FORTIFY_SOURCE warning, you can invent other warnings elsewhere 2) with -D_FORTIFY_SOURCE, e.g. sprintf is an inline function, so the FE sees it as a call to an inline function with some argument, you need to inline it, figure out what the inline does, then fold the builtins used in the inline. Also consider char buf[2]; char *p; p = buf; sprintf(buf, "ab%d", n); Unless you move the optimization passes into the FE, you aren't going to warn about this properly in the FE. Insisting on a FE warning in this case is just dumb.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #8 from Manuel López-Ibáñez 2013-02-06 13:39:32 UTC --- (In reply to comment #7) > Because object sizes are finalized only during the objsz pass, after lots of > optimization passes. Note, as I said earlier, what matters most is that the > check is performed at runtime in that case and thus the source code bug can't > be exploited. The warning is just to let the user know earlier than at > runtime, when easily possible. Maybe we are talking about different things. cppcheck seems to be able to give the warning without any optimizations. The FE doesn't know that buf[2] is length 2 and "ab" is length 3?
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #7 from Jakub Jelinek 2013-02-06 13:25:29 UTC --- Because object sizes are finalized only during the objsz pass, after lots of optimization passes. Note, as I said earlier, what matters most is that the check is performed at runtime in that case and thus the source code bug can't be exploited. The warning is just to let the user know earlier than at runtime, when easily possible. -D_FORTIFY_SOURCE{,=2} is done using inline functions, so the FE pretty much never knows the object size, you need inlining and various propagations (plus for many cases also the objsz pass that propagates the object size properties through the IL). In the FE you could do it only if all the fortification functions were preprocessor macros, and handle only the most simple cases.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 Manuel López-Ibáñez changed: What|Removed |Added CC||manu at gcc dot gnu.org --- Comment #6 from Manuel López-Ibáñez 2013-02-06 13:13:52 UTC --- (In reply to comment #5) > So, if we are going to do something about this, either we could do something > very simple, like strchr (str, '%') - str as low bound guess, or reuse the > c-format tables somehow (but they are in the FE, while this is in middle-end), And why is this check in the middle-end?
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #5 from Jakub Jelinek 2013-02-06 12:41:09 UTC --- It isn't that easy. For %'s you really have to parse all the characters after % and figure out where the format specifier ends. Users can have printf hooks installed, so it certainly needs to give up any time it sees something it doesn't fully understand. In that case I guess it could safely just assume the lower bound as if the string ended on the % after which it doesn't understand the letters. Note, that this is just about the compile time warning, the code will fail at runtime the same way in the first as in the second case. So, if we are going to do something about this, either we could do something very simple, like strchr (str, '%') - str as low bound guess, or reuse the c-format tables somehow (but they are in the FE, while this is in middle-end), or write a simple parse of few most common formatting specifiers and give up on anything else.
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 --- Comment #4 from Richard Biener 2013-02-06 12:21:12 UTC --- (In reply to comment #3) > Code is (maybe_emit_sprintf_chk_warning): > > /* If the format doesn't contain % args or %%, we know its size. */ > if (strchr (fmt_str, target_percent) == 0) > len = build_int_cstu (size_type_node, strlen (fmt_str)); > /* If the format is "%s" and first ... argument is a string literal, > we know it too. */ > else if (fcode == BUILT_IN_SPRINTF_CHK >&& strcmp (fmt_str, target_percent_s) == 0) > ... > else > return; > > so it lacks a way to compute an upper bound for the format which I guess > we can always compute (just not account all %'s at all?). lower bound of course
[Bug middle-end/54582] gap in FORTIFY checking of buffer lengths
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54582 Richard Biener changed: What|Removed |Added Component|c |middle-end Severity|minor |enhancement --- Comment #3 from Richard Biener 2013-02-06 12:18:21 UTC --- Code is (maybe_emit_sprintf_chk_warning): /* If the format doesn't contain % args or %%, we know its size. */ if (strchr (fmt_str, target_percent) == 0) len = build_int_cstu (size_type_node, strlen (fmt_str)); /* If the format is "%s" and first ... argument is a string literal, we know it too. */ else if (fcode == BUILT_IN_SPRINTF_CHK && strcmp (fmt_str, target_percent_s) == 0) ... else return; so it lacks a way to compute an upper bound for the format which I guess we can always compute (just not account all %'s at all?).