[Bug c/71613] Useful warnings silenced when macros from system headers are used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71613

Tavian Barnes changed:

What|Removed |Added
CC||tavianator at gmail dot com

--- Comment #11 from Tavian Barnes ---
Mostly commenting to make this searchable in case someone else runs into it:
-Wtsan is also affected by this:

$ cat foo.c
#include <stdatomic.h>

int main(void) {
        atomic_thread_fence(memory_order_relaxed);
        __atomic_thread_fence(__ATOMIC_RELAXED);
        return 0;
}
$ gcc -c -fsanitize=thread foo.c
foo.c: In function ‘main’:
foo.c:5:9: warning: ‘atomic_thread_fence’ is not supported with ‘-fsanitize=thread’ [-Wtsan]
    5 |         __atomic_thread_fence(__ATOMIC_RELAXED);
      |         ^~~
[Bug sanitizer/113430] Trivial program segfaults intermittently with ASAN with large CONFIG_ARCH_MMAP_RND_BITS in kernel configuration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113430 --- Comment #5 from Tavian Barnes --- (In reply to Xi Ruoyao from comment #3) > Updated the title to make it more precise. > > Note that even with Linux 6.7 the default value of CONFIG_ARCH_MMAP_RND_BITS > is still 28 (32 is set by some distro maintainer who apparently does not > know this will hit the sanitizer runtime), so "since Linux 6.7" is just > misleading. Yep agreed. I didn't expect such a patch from Arch, so I assumed it was a change in the default kernel config. For completeness, here's the Arch bug: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/20
[Bug sanitizer/113430] New: Trivial program segfaults intermittently with ASAN since Linux 6.7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113430 Bug ID: 113430 Summary: Trivial program segfaults intermittently with ASAN since Linux 6.7 Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: tavianator at gmail dot com CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org Target Milestone: --- Since updating to Linux 6.7, I'm getting intermittent segfaults with ASAN and ASLR enabled. $ cat foo.c int main(void) { return 0; } $ gcc -fsanitize=address foo.c -o foo $ while ./foo; do :; done AddressSanitizer:DEADLYSIGNAL = ==337494==ERROR: AddressSanitizer: SEGV on unknown address 0x636c68879e78 (pc 0x7dde493b538f bp 0x sp 0x7ffc78949970 T0) ==337494==The signal is caused by a READ memory access. AddressSanitizer:DEADLYSIGNAL AddressSanitizer: nested bug in the same thread, aborting. tavianator@graphene $ gcc --version gcc (GCC) 13.2.1 20230801 Copyright (C) 2023 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ uname -a Linux graphene 6.7.0-arch3-1 #1 SMP PREEMPT_DYNAMIC Sat, 13 Jan 2024 14:37:14 + x86_64 GNU/Linux Here's the backtrace: (gdb) set disable-randomization off (gdb) run Starting program: /home/tavianator/code/bfs/foo [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 
do_lookup_x (undef_name=undef_name@entry=0x761941b3e6d8 "_thread_db_sizeof_pthread", new_hash=new_hash@entry=3872132951, old_hash=old_hash@entry=0x716f0cc8, ref=0x0, result=result@entry=0x716f0cd0, scope=, i=0, version=0x0, flags=3, skip=, type_class=0, undef_map=) at dl-lookup.c:405 405 const ElfW(Sym) *symtab = (const void *) D_PTR (map, l_info[DT_SYMTAB]); (gdb) bt #0 do_lookup_x (undef_name=undef_name@entry=0x761941b3e6d8 "_thread_db_sizeof_pthread", new_hash=new_hash@entry=3872132951, old_hash=old_hash@entry=0x716f0cc8, ref=0x0, result=result@entry=0x716f0cd0, scope=, i=0, version=0x0, flags=3, skip=, type_class=0, undef_map=) at dl-lookup.c:405 #1 0x7619421e20b8 in _dl_lookup_symbol_x (undef_name=0x761941b3e6d8 "_thread_db_sizeof_pthread", undef_map=, ref=0x716f0d58, symbol_scope=, version=0x0, type_class=0, flags=3, skip_map=0x0) at dl-lookup.c:793 #2 0x76194197300e in do_sym (handle=, name=0x761941b3e6d8 "_thread_db_sizeof_pthread", who=0x761941afffb3 <__sanitizer::ThreadDescriptorSize()+35>, vers=vers@entry=0x0, flags=flags@entry=2) at dl-sym.c:146 #3 0x761941973331 in _dl_sym (handle=, name=, who=) at dl-sym.c:195 #4 0x7619418a6ae8 in dlsym_doit (a=a@entry=0x716f0fc0) at dlsym.c:40 #5 0x7619421d94e1 in __GI__dl_catch_exception (exception=exception@entry=0x716f0f20, operate=0x7619418a6ad0 , args=0x716f0fc0) at dl-catch.c:237 #6 0x7619421d9603 in _dl_catch_error (objname=0x716f0f78, errstring=0x716f0f80, mallocedp=0x716f0f77, operate=, args=) at dl-catch.c:256 #7 0x7619418a64f7 in _dlerror_run (operate=operate@entry=0x7619418a6ad0 , args=args@entry=0x716f0fc0) at dlerror.c:138 #8 0x7619418a6b75 in dlsym_implementation (dl_caller=, name=, handle=) at dlsym.c:54 #9 ___dlsym (handle=, name=) at dlsym.c:68 #10 0x761941afffb3 in __sanitizer::ThreadDescriptorSize () at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:298 #11 0x761941b017ae in __sanitizer::ThreadDescriptorSize () at 
/usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:294 #12 __sanitizer::GetTls (size=0x716f1098, addr=0x7619421b0040) at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:498 #13 __sanitizer::GetThreadStackAndTls (main=true, stk_addr=stk_addr@entry=0x7619421b0020, stk_size=stk_size@entry=0x716f10a0, tls_addr=tls_addr@entry=0x7619421b0040, tls_size=tls_size@entry=0x716f1098) at /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:595 #14 0x761941af0ff4 in __asan::AsanThread::SetThreadStackAndTls (this=this@entry=0x7619421b, options=) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.h:77 #15 0x761941af14ee in __asan::AsanThread::Init (this=this@entry=0x7619421b, options=options@entry=0x0) at /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_thread.cpp:234 #16 0x00
[Bug middle-end/112748] memmove(ptr, ptr, n) call optimized out even at -O0 with -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112748 --- Comment #2 from Tavian Barnes --- (In reply to Andrew Pinski from comment #1) > Does -fsanitize=address remove it? Yes, it's still removed with -fsanitize=address. While ASAN is necessary to check that the memory is really allocated, UBSAN should at least check that ptr is not NULL. So it shouldn't be removed in either case.
[Bug c/112748] New: memmove(ptr, ptr, n) call optimized out even at -O0 with -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112748

            Bug ID: 112748
           Summary: memmove(ptr, ptr, n) call optimized out even at -O0 with -fsanitize=undefined
           Product: gcc
           Version: 13.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

This is counter-productive, as I wrote the memmove() specifically to get the
sanitizers to check that ptr really points to a big enough allocation.

$ cat foo.c
typedef __SIZE_TYPE__ size_t;

void *memmove(void *dest, void *src, size_t n);

void foo(void *ptr, size_t n) {
        memmove(ptr, ptr, n);
}
$ gcc -O0 -fsanitize=undefined -S foo.c
$ cat foo.s
        .file   "foo.c"
        .text
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movq    %rdi, -8(%rbp)
        movq    %rsi, -16(%rbp)
        nop
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 13.2.1 20230801"
        .section        .note.GNU-stack,"",@progbits
[Bug middle-end/94787] Failure to detect single bit popcount pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94787 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #4 from Tavian Barnes --- (In reply to Wilco from comment #3) > I actually posted a patch for this and popcount(x) > 1 given the reverse > transformation is faster on all targets - even if they have popcount > instruction (since they are typically more expensive). This is true on x86 > as well, (x-1) u (x ^ -x)
[Bug target/106952] Missed optimization: x < y ? x : y not lowered to minss
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952 --- Comment #2 from Tavian Barnes --- (In reply to Alexander Monakov from comment #1) > Note, your 'max' function is the same as 'min' (the issue remains with that > corrected). Whoops, thanks. Also I just noticed that GCC 12.2 does better (but not perfect) with #define min(x, y) ((x) < (y) ? (x) : (y)) #define max(x, y) ((x) > (y) ? (x) : (y)) instead of the inline functions. Doesn't seem to help GCC trunk though.
[Bug target/106952] New: Missed optimization: x < y ? x : y not lowered to minss
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952

            Bug ID: 106952
           Summary: Missed optimization: x < y ? x : y not lowered to minss
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

Created attachment 53580
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53580&action=edit
Assembly from gcc -O3 -S bug.c

The following is an implementation of a ray/axis-aligned box intersection test:

struct ray {
    float origin[3];
    float dir_inv[3];
};

struct box {
    float min[3];
    float max[3];
};

static inline float min(float x, float y) {
    return x < y ? x : y;
}

static inline float max(float x, float y) {
    return x < y ? x : y;
}

_Bool intersection(const struct ray *ray, const struct box *box) {
    float tmin = 0.0, tmax = 1.0 / 0.0;

    for (int i = 0; i < 3; ++i) {
        float t1 = (box->min[i] - ray->origin[i]) * ray->dir_inv[i];
        float t2 = (box->max[i] - ray->origin[i]) * ray->dir_inv[i];

        tmin = min(max(t1, tmin), max(t2, tmin));
        tmax = max(min(t1, tmax), min(t2, tmax));
    }

    return tmin < tmax;
}

However, gcc -O3 doesn't use minss/maxss for every min()/max(). Instead, some
of them are lowered to conditional jumps, which regresses performance
significantly since the branches are unpredictable.

Simpler variants like

        tmin = max(tmin, min(t1, t2));
        tmax = min(tmax, max(t1, t2));

get the desired codegen, but that behaves differently if t1 or t2 is NaN.

"Bisecting" with godbolt.org, it seems this is an old regression: 4.8.5 was
good, but 4.9.0 was bad.
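
To make the NaN remark concrete, here is a small sketch (not part of the original report). The names min_sel, max_sel, tmin_robust and tmin_simple are hypothetical, and max_sel uses the corrected `>` comparison pointed out in comment #1. It assumes default IEEE semantics, i.e. no -ffast-math. Because `x < y` and `x > y` are false whenever either operand is NaN, both helpers return their second argument in that case, which is also what minss/maxss do.

```c
#include <math.h>

/* Ternary forms as in the report (max written with '>'). When either
   operand is NaN the comparison is false, so the second operand is
   returned. */
float min_sel(float x, float y) { return x < y ? x : y; }
float max_sel(float x, float y) { return x > y ? x : y; }

/* The form used in the report's loop: a NaN t1 or t2 is replaced by the
   running tmin before the outer min sees it. */
float tmin_robust(float tmin, float t1, float t2)
{
    return min_sel(max_sel(t1, tmin), max_sel(t2, tmin));
}

/* The "simpler variant": a NaN t1 is dropped by min_sel(t1, t2) instead,
   so t2 alone decides the result. */
float tmin_simple(float tmin, float t1, float t2)
{
    return max_sel(tmin, min_sel(t1, t2));
}
```

For example, with tmin = 0, t1 = NaN (0 * inf) and t2 = +inf, the first form keeps tmin at 0 while the second moves it to +inf, so the two are not interchangeable.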
[Bug tree-optimization/65752] Too strong optimizations int -> pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #61 from Tavian Barnes --- (In reply to rguent...@suse.de from comment #44) > The other (unfortunate) thing is that in GCC pointer subtraction > is always performed on integers, thus for the C source code > > int idx = ptr1 - ptr2; > > we internally have sth like > > int idx = ((long)ptr1 - (long)ptr2) / 4; > > so you can't really treat pointers as "escaped" here without loss. It seems possible to distinguish between ptr-to-int casts that actually occur in the source, from ptr-to-int casts that are generated for other reasons by the compiler.
[Bug middle-end/93742] New: Optimization request: pattern-match typical addition overflow checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93742

            Bug ID: 93742
           Summary: Optimization request: pattern-match typical addition overflow checks
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

It would be nice if GCC could pattern-match the typical pattern for
overflow-safe addition and optimize it to just branch on the overflow flag.
Right now, this:

_Bool add_overflow(int a, int b, int *res) {
        if ((a > 0 && b > INT_MAX - a) || (a < 0 && b < INT_MIN - a)) {
                return 1;
        } else {
                *res = a + b;
                return 0;
        }
}

compiles to this:

        testl   %edi, %edi
        jle     .L2
        movl    $2147483647, %ecx
        movl    $1, %eax
        subl    %edi, %ecx
        cmpl    %esi, %ecx
        jl      .L12
.L4:
        addl    %esi, %edi
        xorl    %eax, %eax
        movl    %edi, (%rdx)
        ret
.L12:
        ret
.L2:
        je      .L4
        movl    $-2147483648, %ecx
        movl    $1, %eax
        subl    %edi, %ecx
        cmpl    %esi, %ecx
        jle     .L4
        ret

#48580 is similar but about multiplication, for which the overflow-safe
pattern is a lot more complicated. But the addition one above is a lot simpler
and pretty widespread I think. For example, it's what CERT recommends:
https://wiki.sei.cmu.edu/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow
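
For comparison, a sketch (not from the report) that puts the portable pre-check next to the __builtin_add_overflow builtin available in GCC 5 and later and in Clang. The function names add_overflow_portable and add_overflow_builtin are hypothetical. The report asks for the first form to be recognized automatically; the second is how one would request the overflow-flag form explicitly today.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Portable pre-check, same logic as in the report: the sum is only
   evaluated once it is known not to overflow. */
bool add_overflow_portable(int a, int b, int *res)
{
    assert(res != NULL);
    if ((a > 0 && b > INT_MAX - a) || (a < 0 && b < INT_MIN - a)) {
        return true;
    }
    *res = a + b;
    return false;
}

/* Same contract via the builtin. The builtin always stores the wrapped
   result, so a temporary keeps *res untouched on overflow, matching the
   portable version. */
bool add_overflow_builtin(int a, int b, int *res)
{
    assert(res != NULL);
    int tmp;
    if (__builtin_add_overflow(a, b, &tmp)) {
        return true;
    }
    *res = tmp;
    return false;
}
```

Both functions return true on overflow and leave *res unmodified in that case, so they should be interchangeable for callers.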
[Bug tree-optimization/93681] Wrong optimization: instability of x87 floating-point results leads to nonsense
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #2 from Tavian Barnes --- Similar to (dupe of?) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85957
[Bug tree-optimization/90766] strlen(a + i) missing range for arrays of unknown bound with strings of known length and variable i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90766 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #2 from Tavian Barnes --- If char b[] = "123\01234"; then it can't be folded to false for i == 4, right? Presumably g() is only folded because you have &s[i] not &a[i].
[Bug libstdc++/90246] std::bad_variant_access messages are not useful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90246 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #2 from Tavian Barnes --- > It's too late to change this now, but we could still improve the messages Would it be ABI compatible to make a new exception type that derives from std::bad_variant_access, and throw that instead?
[Bug c++/90205] Wformat-signedness detects %d and suggests %d fixit hint
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90205 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #8 from Tavian Barnes --- Maybe "argument 2 has type 'double' (promoted from 'float')"?
[Bug c/87806] Option -Wall should warn about unused structs, typdefs, enums, etc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87806 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #4 from Tavian Barnes --- Perhaps this is reasonable for types that are defined in the file itself, not in an included header?
[Bug middle-end/87647] New: ICE on valid code in decode_addr_const, at varasm.c:2958
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87647 Bug ID: 87647 Summary: ICE on valid code in decode_addr_const, at varasm.c:2958 Product: gcc Version: 8.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: tavianator at gmail dot com Target Milestone: --- $ cat test.c struct a { }; struct a *const b = &(struct a){}; int main() { struct { char *s; struct a *t; } a[] = {"", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b}; } $ gcc -O1 test.c test.c: In function ‘main’: test.c:8:5: internal compiler error: in decode_addr_const, at varasm.c:2958 } a[] = {"", b, "", b, "", b, "", b, "", b, "", b, "", b, "", b, "", ^ Please submit a full bug report, with preprocessed source if appropriate. See <https://bugs.archlinux.org/> for instructions. $ gcc --version gcc (GCC) 8.2.1 20180831 Copyright (C) 2018 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Reduced from a reddit post: https://www.reddit.com/r/C_Programming/comments/9p44be/internal_compiler_error/
[Bug tree-optimization/86029] gcc -O3 make very slow product
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86029

--- Comment #5 from Tavian Barnes ---
(In reply to Zsolt from comment #3)
> What is the difference between gcc's and clang's __mulsc3?

The important difference is that Clang (and GCC trunk) expand the fastpath
inline, and fall back on __mulsc3 for the more complicated cases. So instead of

    complex float a = b * c;

expanding to

    complex float a = __mulsc3(crealf(b), cimagf(b), crealf(c), cimagf(c));

it's more like

    complex float a = (crealf(b)*crealf(c) - cimagf(b)*cimagf(c))
                      + I*(crealf(b)*cimagf(c) + cimagf(b)*crealf(c));
    if (isunordered(crealf(a), cimagf(a))) {
        a = __mulsc3(crealf(b), cimagf(b), crealf(c), cimagf(c));
    }

The fastpath and unlikely branch tend to be faster than the function call.
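
A compilable sketch of the same shape, for illustration only. The name mul_fastpath is hypothetical, the plain `b * c` on the slow path stands in for the library call rather than invoking __mulsc3 directly, and CMPLXF assumes a C11 <complex.h>. It is not what the compiler emits, just the control flow described above written out by hand.

```c
#include <complex.h>
#include <math.h>

/* Hand-written version of the inline expansion: compute the textbook
   product first, and only take the slow path when a NaN shows up. */
float complex mul_fastpath(float complex b, float complex c)
{
    float br = crealf(b), bi = cimagf(b);
    float cr = crealf(c), ci = cimagf(c);

    float re = br * cr - bi * ci;
    float im = br * ci + bi * cr;

    if (isunordered(re, im)) {
        /* Rare case (NaN in either part): defer to the full
           multiplication, which handles infinities per Annex G. */
        return b * c;
    }
    /* CMPLXF builds the value without the 0 * inf pitfalls of re + im*I. */
    return CMPLXF(re, im);
}
```

For finite inputs the branch is never taken, which is why the inline form tends to win over an unconditional call.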
[Bug tree-optimization/86029] gcc -O3 make very slow product
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86029 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #1 from Tavian Barnes --- Maybe a dupe of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70291? In the -O3 version, __mulsc3() dominates the profile. │ for(int i=0; i<=decimate_taps_length; i++) decim += samplebuf[i] * decimate_taps[i]; 0.20 │430:┌─→vmovss 0x4(%r13,%rbx,1),%xmm1 3.63 ││ vmovss 0x0(%r13,%rbx,1),%xmm0 12.35 ││ vmovss 0x4(%r12,%rbx,1),%xmm3 0.31 ││ vmovss (%r12,%rbx,1),%xmm2 0.02 ││ add$0x8,%rbx 36.48 ││→ callq __mulsc3 0.01 ││ vmovss -0x78(%rbp),%xmm6 0.00 ││ vmovss -0x80(%rbp),%xmm4 23.70 ││ vmovq %xmm0,-0x68(%rbp) 14.25 ││ vaddss -0x68(%rbp),%xmm6,%xmm5 1.54 ││ vaddss -0x64(%rbp),%xmm4,%xmm7 0.48 ││ vmovss %xmm5,-0x78(%rbp) 5.92 ││ vmovss %xmm7,-0x80(%rbp) │├──cmp$0x2590,%rbx 0.01 │└──jne430 At -Ofast, │ for(int i=0; i<=decimate_taps_length; i++) decim += samplebuf[i] * decimate_taps[i]; 9.36 │5e0: vpermilps $0xf5,(%r12,%rax,1),%ymm0 15.56 │ vpermilps $0xa0,(%r12,%rax,1),%ymm1 11.24 │ vmulps (%rbx,%rax,1),%ymm0,%ymm0 17.55 │ vpermilps $0xb1,(%rbx,%rax,1),%ymm4 3.31 │ add$0x20,%rax 2.11 │ vmovaps %ymm1,%ymm3 6.62 │ vfmadd132ps %ymm4,%ymm0,%ymm3 3.79 │ vfmsub231ps %ymm4,%ymm1,%ymm0 2.91 │ vblendps $0xaa,%ymm0,%ymm3,%ymm0 10.75 │ vaddps %ymm0,%ymm6,%ymm6 │ cmp$0x2580,%rax 5.59 │ ↑ jne5e0 0.01 │ vmovss 0x258c(%rbx),%xmm0 0.01 │ vmovss -0x70(%rbp),%xmm7 0.01 │ vmovss %xmm5,-0xd0(%rbp) 0.05 │ vextractf128 $0x1,%ymm6,%xmm3 0.01 │ vmovss 0x2588(%rbx),%xmm8 0.03 │ vshufps $0xff,%xmm3,%xmm3,%xmm13
[Bug c++/85958] Make const qualifier error clear
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85958 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #4 from Tavian Barnes --- IMHO "discards qualifiers" and even "discards const qualifier" are still confusing. Making it clearly counterfactual, as in "...would discard (const) qualifier(s)...," would be an improvement. But I'd further argue that "discarding qualifiers" is not really how most people think of this kind of error. When a minor tries to get into a bar, they are not told that "entering this bar discards your age," they are told that "minors aren't allowed." So I think "cannot bind 'const int' to non-const reference type 'int&'" would be more intuitive phrasing.
[Bug preprocessor/81515] C pre-processor allows invalid words
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81515 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #2 from Tavian Barnes --- Is this report about BADTEST or #ele?
[Bug middle-end/61118] Spurious -Wclobbered warning generated by gcc 4.9.0 for pthread_cleanup_push
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118 --- Comment #10 from Tavian Barnes --- > I think it is - __cancel_arg is assigned inside a while loop Specifically a do { } while(0); loop, which obviously has only one iteration.
[Bug middle-end/61118] Spurious -Wclobbered warning generated by gcc 4.9.0 for pthread_cleanup_push
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118 --- Comment #7 from Tavian Barnes --- > the warning is issued for variables which are alive after return from longjmp > but not marked as volatile. Such variables will have undefined value > according to C standard > (http://pubs.opengroup.org/onlinepubs/7908799/xsh/longjmp.html). But this condition is not met: > - They are changed between the setjmp() invocation and longjmp() call.
[Bug c/78584] Bug in GCC argument parser expandargv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78584 Tavian Barnes changed: What|Removed |Added CC||tavianator at gmail dot com --- Comment #1 from Tavian Barnes --- Can't reproduce with GCC 6.2.1. The `@file` syntax is used to read command line options from `file`. You can probably write `./@.` if you want to compile a file with that name.
[Bug middle-end/71177] Spurious -Waggressive-loop-optimizations warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71177 --- Comment #4 from Tavian Barnes --- > I remember seeing a similar bugreport. PR57199 is very similar, pretty much an exact dupe actually.
[Bug middle-end/71177] Spurious -Waggressive-loop-optimizations warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71177 Tavian Barnes changed: What|Removed |Added Attachment #38516|0 |1 is obsolete|| --- Comment #2 from Tavian Barnes --- Created attachment 38520 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38520&action=edit Further reduced testcase > Please don't set regression markers if you don't know what you are doing (but > feel free to suggest that something should have regression markers). > > Happens also in 5.3.0, 4.9.3 and 7.0, so not a regression. Oops, sorry! This only showed up in my actual build on upgrading to GCC 6, but I didn't check the reduced version against older GCCs. > If you could reduce the testcase further and point out at a missing > optimization, the chances of getting this fixed would increase. Otherwise, > someone needs to look at the dump files and figure out why the ranges are not > affected by the check and whether they should. I've attached a further reduced testcase. > The 'else' block is not unreachable, but loop optimizations look at the > possible > ranges within the loop and those may not be taking into account any range > information derived from the caller. The else block is unreachable *from within normalize()*. And since the warning only shows up due to constant propagation from normalize(), it would be nice to avoid it. Specifically, normalize() calls resize(length - 2) which calls (in the unreached else block) append(n - length). (n - length) gets constant-folded to (size_t)-2 which then overflows a ptrdiff_t later.
[Bug middle-end/71177] New: [6 Regression] Spurious -Waggressive-loop-optimizations warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71177

            Bug ID: 71177
           Summary: [6 Regression] Spurious -Waggressive-loop-optimizations warning
           Product: gcc
           Version: 6.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

Created attachment 38516
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38516&action=edit
Reduced testcase

The following use of boost::container::string gives a warning since GCC 6:

$ cat stringbug.cpp
#include <boost/container/string.hpp>

using boost::container::string;

string normalize(string token) {
    if (token.length() >= 2) {
        token.resize(token.length() - 2);
    }
    return token;
}
$ g++ -Wall -O3 -c stringbug.cpp
In function ‘boost::container::string normalize(boost::container::string)’:
cc1plus: warning: iteration 9223372036854775807 invokes undefined behavior [-Waggressive-loop-optimizations]
In file included from stringbug.cpp:1:0:
/usr/include/boost/container/string.hpp:2608:10: note: within this loop
          for (; first != last; ++dest, ++first, ++constructed){
          ^~~
cc1plus: warning: iteration 9223372036854775807 invokes undefined behavior [-Waggressive-loop-optimizations]
/usr/include/boost/container/string.hpp:2626:7: note: within this loop
       for ( ; first != last; ++first, ++result)
       ^~~

What's happening is that resize() is implemented like this:

    void resize(size_type n, CharT c)
    {
        if (n <= this->size())
            this->erase(this->begin() + n, this->end());
        else
            this->append(n - this->size(), c);
    }

After inlining/constant propagation, the else block contains undefined
behaviour for resize(length - 2). But the else block is also unreachable due
to the if (length >= 2) check.

Reduced testcase attached.
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #11 from Tavian Barnes --- Yeah I reported the Boost bug as https://svn.boost.org/trac/boost/ticket/12183.
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #5 from Tavian Barnes --- > But if it is not POD then assuming it gets copied correctly when > init-constructing a POD union where they placed such object is > an interesting assumption... Hrm? They seem to always copy it manually with long_t's copy constructor: ::new(&m_repr.long_repr()) long_t(other.m_repr.long_repr());
[Bug middle-end/71002] [6/7 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002 --- Comment #3 from Tavian Barnes --- Because their long_t is not POD. I don't know why that is though. It could be POD if they removed the default/copy constructors and assignment operator. Actually they're probably worried about custom allocators where the pointer type is not POD. So it couldn't be POD in general, and thus can't appear in a union directly (in C++03).
[Bug middle-end/71002] New: [6 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71002

            Bug ID: 71002
           Summary: [6 Regression] -fstrict-aliasing breaks Boost's short string optimization implementation
           Product: gcc
           Version: 6.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

Created attachment 38438
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38438&action=edit
Reduced test case

(This is the miscompilation I reported seeing in PR70054. I have a reduced
test case for it now.)

Since GCC 6, the following simple use of boost::container::string is broken:

$ cat foo.cpp
#include <boost/container/string.hpp>
#include <cstdlib>
#include <utility>

using boost::container::string;

struct foo
{
    __attribute__((noinline))
    foo(string str)
        : m_str{std::move(str)},
          m_len{m_str.length()}
    {
    }

    string m_str;
    std::size_t m_len;
};

int main() {
    foo f{"the quick brown fox jumps over the lazy dog"};
    if (f.m_len == 0) {
        std::abort();
    }
    return 0;
}
$ g++ -O2 -Wall foo.cpp -o foo && ./foo
[1]    12277 abort (core dumped)  ./foo

It works with -fno-strict-aliasing. I reduced the problem to the attached
standalone test case. Boost's code doesn't seem to be 100% compliant, but the
worst thing it does is access a non-active union member (the is_short
bitfield). As far as I know, GCC permits that as an extension.
[Bug middle-end/70054] New: GCC 6 gives a strict-aliasing warning on use of std::aligned_storage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70054

            Bug ID: 70054
           Summary: GCC 6 gives a strict-aliasing warning on use of std::aligned_storage
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com
  Target Milestone: ---

GCC 6 warns on this code; GCC 5 didn't. boost::container::string uses this
pattern as part of its SSO implementation, so I'm getting a lot of new warnings
with GCC 6 as a result. As well, I'm seeing what appears to be a
miscompilation as a result, but I haven't reduced that yet.

$ cat foo.cpp
#include <type_traits>

struct foo {
  std::aligned_storage<sizeof(long)>::type raw;

  long& cooked() {
    return *static_cast<long*>(static_cast<void*>(&raw));
  }
};
$ g++ -O2 -Wall -c foo.cpp
foo.cpp: In member function ‘long int& foo::cooked()’:
foo.cpp:7:56: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     return *static_cast<long*>(static_cast<void*>(&raw));
[Bug c++/64372] Spurious warning with throw in ternary operator returning const reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64372 --- Comment #4 from Tavian Barnes --- I meant to include -std=c++11 in the OP, it still happens with that flag set. Good suggestion for the replacement though. The actual code was closer to i < length ? a[i] : throw ... but I guess that can become (i < length ? a : throw ...)[i].
[Bug c++/64372] New: Spurious warning with throw in ternary operator returning const reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64372

            Bug ID: 64372
           Summary: Spurious warning with throw in ternary operator returning const reference
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com

The following program produces a spurious warning:

$ cat ternary-warning.cpp
const char&
foo(const char* ptr)
{
  return ptr ? *ptr : throw ptr;
}
$ g++ -Wall -c ternary-warning.cpp
ternary-warning.cpp: In function ‘const char& foo(const char*)’:
ternary-warning.cpp:4:29: warning: returning reference to temporary [-Wreturn-local-addr]
   return ptr ? *ptr : throw ptr;
                             ^
[Bug c++/63723] Narrowing conversion allowed in braced init list in SFINAE context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63723 --- Comment #8 from Tavian Barnes --- Actually it is a regression: according to http://gcc.godbolt.org/, GCC 4.6.4 compiled it correctly with -std=c++0x and typedef decltype(helper(0)) type; instead of using type = decltype(helper(0)); while still allowing int x{1.0};
[Bug tree-optimization/64308] Missed optimization: 64-bit divide used when 32-bit divide would work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64308

--- Comment #3 from Tavian Barnes ---
@Richard Biener: Yes the range for _16 could be [0, 4294967294]. Why can't VRP
assume division by zero doesn't occur? If it can, then it could say anything
mod [a, b] fits in [0, b - 1].

That's a reasonable improvement by itself, but it's not enough to optimize this
PR, because to use divl for (ret * b % m), you need (ret * b / m) to fit in
[0, 4294967295] as well. And to know that, as Marc Glisse suggests, you'd need
symbolic ranges.

@Marc Glisse: Is there currently no support at all for symbolic ranges? If you
can infer that b < m is an invariant then that's all you need.

Formally it's something like this: If x, y, and z are 32-bit unsigned integers,
and x <= z || y <= z, then (uint64_t)x * (uint64_t)y % z can be computed with
mull and divl because x * y / z is always <= max(x, y), which fits in 32 bits.
[Bug tree-optimization/64308] New: Missed optimization: 64-bit divide used when 32-bit divide would work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64308

            Bug ID: 64308
           Summary: Missed optimization: 64-bit divide used when 32-bit divide would work
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tavianator at gmail dot com

Created attachment 34280
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34280&action=edit
Test case

The following is a fairly typical implementation of exponentiation modulo m:

$ cat ipowm.c
// Computes (b**e) % m
unsigned int
ipowm(unsigned int b, unsigned int e, unsigned int m)
{
  unsigned int ret;
  b %= m;
  for (ret = m > 1; e; e >>= 1) {
    if ((e & 1) == 1) {
      ret = (unsigned long long)ret * b % m;
    }
    b = (unsigned long long)b * b % m;
  }
  return ret;
}

Unfortunately, GCC emits a 64-bit multiply and divide for both "... * b % m"
expressions on x86 and x86-64, where a 32-bit multiply and divide would be
equivalent and faster.

$ gcc -std=c11 -O3 -Wall -S -save-temps ipowm.c
$ cat ipowm.s
...
        imulq   %rdi, %rax
        divq    %rcx
...
        imulq   %rdi, %rax
        divq    %rcx
...

The pattern

        mull    %edi
        divl    %ecx

would be much faster. They're equivalent because b is always reduced mod m, so
b < m and therefore (for any unsigned int x), x * b / m <= x * m / m == x, thus
the quotient will always fit in 32 bits.
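
The quotient bound can be checked numerically with a short sketch. The helper quotient_fits_32 is a hypothetical name introduced here, and the assert on m is an addition that the report's code does not have. ipowm is otherwise the routine from the report.

```c
#include <assert.h>
#include <stdint.h>

/* The routine from the report: computes (b**e) % m. */
unsigned int ipowm(unsigned int b, unsigned int e, unsigned int m)
{
    assert(m != 0); /* added here; m == 0 would divide by zero */
    unsigned int ret;
    b %= m;
    for (ret = m > 1; e; e >>= 1) {
        if ((e & 1) == 1) {
            ret = (unsigned long long)ret * b % m;
        }
        b = (unsigned long long)b * b % m;
    }
    return ret;
}

/* Returns 1 when the quotient of the 64-bit product x * b divided by m
   fits in 32 bits, which is the condition divl needs to avoid a fault. */
int quotient_fits_32(uint32_t x, uint32_t b, uint32_t m)
{
    assert(m != 0);
    uint64_t q = (uint64_t)x * b / m;
    return q <= UINT32_MAX;
}
```

Whenever b < m the helper returns 1 for any x, in line with the argument above; with b >= m it can return 0, which is why the compiler would have to prove the b < m invariant before using divl.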
[Bug c++/57510] initializer_list memory leak
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57510 --- Comment #4 from Tavian Barnes ---

I have a testing tool that automatically inserts operator new failures, to help test exception safety and check for leaks. This bug causes all kinds of spurious failures that I have to work around, since anything like this

    vector<string> vec = {"some", "strings"};

leaks memory if the second string constructor fails. Usually it can be worked around like this

    string arr[] = {"some", "strings"};

but it's still quite annoying. If someone can point me in the right direction or give an outline of how to fix this bug, I'm happy to try and write up a patch myself. Thanks!
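For anyone searching for the workaround, a fuller version might look like the sketch below. The comment only shows the array declaration; filling the vector from the array through an iterator range is my assumption about the remaining step, and make_strings is a made-up name.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: build the strings in a plain array first, then copy
// them into the vector, so no std::initializer_list temporary is involved.
std::vector<std::string> make_strings() {
    std::string arr[] = {"some", "strings"};
    return std::vector<std::string>(std::begin(arr), std::end(arr));
}
```

The cost is one extra copy of each string, which is usually acceptable in test code.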
[Bug libstdc++/63840] New: std::function copy constructor deletes an uninitialized pointer if new fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63840

Bug ID: 63840
Summary: std::function copy constructor deletes an uninitialized pointer if new fails
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com

Created attachment 33953
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33953&action=edit
Reproducer

std::function's copy constructor looks like this:

  template<typename _Res, typename... _ArgTypes>
    function<_Res(_ArgTypes...)>::
    function(const function& __x)
    : _Function_base()
    {
      if (static_cast<bool>(__x))
        {
          _M_invoker = __x._M_invoker;
          _M_manager = __x._M_manager;
          __x._M_manager(_M_functor, __x._M_functor, __clone_functor);
        }
    }

_M_manager(..., __clone_functor) calls _M_clone, which looks like this when the functor is stored on the heap:

  static void
  _M_clone(_Any_data& __dest, const _Any_data& __source, false_type)
  {
    __dest._M_access<_Functor*>() =
      new _Functor(*__source._M_access<_Functor*>());
  }

If operator new or the copy-constructor throws, __dest._M_pod_data remains uninitialized. Then the stack unwinds, and ~_Function_base() gets called:

  ~_Function_base()
  {
    if (_M_manager)
      _M_manager(_M_functor, _M_functor, __destroy_functor);
  }

Which ultimately calls

  static void
  _M_destroy(_Any_data& __victim, false_type)
  {
    delete __victim._M_access<_Functor*>();
  }

Which deletes _M_pod_data. A simple fix could be:

  template<typename _Res, typename... _ArgTypes>
    function<_Res(_ArgTypes...)>::
    function(const function& __x)
    : _Function_base()
    {
      if (static_cast<bool>(__x))
        {
+         __x._M_manager(_M_functor, __x._M_functor, __clone_functor);
          _M_invoker = __x._M_invoker;
          _M_manager = __x._M_manager;
-         __x._M_manager(_M_functor, __x._M_functor, __clone_functor);
        }
    }

I have a test case attached.
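The ordering issue can be shown outside libstdc++ with a stripped-down model. Everything here (Payload, Base, Holder, the two globals) is hypothetical and only mimics the shape of the code above: a base class whose destructor frees the payload only if a "manager" flag is set, and a copy constructor that sets that flag after the clone has succeeded, as in the proposed fix.

```cpp
#include <cassert>
#include <stdexcept>

static bool g_fail_copy = false;  // test knob: make Payload's copy throw
static int g_deletes = 0;         // counts how often ~Base frees a payload

struct Payload {
    Payload() {}
    Payload(const Payload&) {
        if (g_fail_copy) throw std::runtime_error("copy failed");
    }
};

// Stand-in for _Function_base: the destructor only touches ptr when
// has_manager is set.
struct Base {
    Payload* ptr;
    bool has_manager;
    Base() : ptr(nullptr), has_manager(false) {}
    ~Base() {
        if (has_manager) { delete ptr; ++g_deletes; }
    }
};

// Stand-in for std::function.
struct Holder : Base {
    Holder() { ptr = new Payload(); has_manager = true; }
    Holder(const Holder& other) : Base() {
        if (other.has_manager) {
            ptr = new Payload(*other.ptr);  // may throw; flag is still false
            has_manager = true;             // set only after the clone worked
        }
    }
};
```

If the two statements in the copy constructor were swapped, a throwing copy would leave has_manager set while ptr was never assigned by the clone, and ~Base would run delete on it during unwinding. That is the situation the report describes, except that there the pointer is genuinely uninitialized.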
[Bug c++/63723] Narrowing conversion allowed in braced init list in SFINAE context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63723 --- Comment #4 from Tavian Barnes --- Still happens with 4.9.2 though. Is a backport of the fix possible?
[Bug c++/63723] Narrowing conversion allowed in braced init list in SFINAE context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63723 --- Comment #2 from Tavian Barnes --- It's decltype(requires_conversion<To2>({std::declval<From2>()})). Not sure why it says <missing>.
[Bug c++/63723] New: Narrowing conversion allowed in braced init list in SFINAE context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63723

Bug ID: 63723
Summary: Narrowing conversion allowed in braced init list in SFINAE context
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com

Created attachment 33877
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33877&action=edit
Preprocessed source

GCC allows narrowing conversions in braced init lists with only a warning. The documentation suggests that this extension doesn't apply in SFINAE contexts, but it appears to be wrong:

$ cat foo.cpp
#include <type_traits>
#include <utility>

template <typename From, typename To>
class is_list_convertible_helper
{
  template <typename To2>
  static void requires_conversion(To2 t);

  template <typename From2, typename To2,
            typename = decltype(requires_conversion<To2>({std::declval<From2>()}))>
            //                                           ^ Braced initializer
  static std::true_type helper(int);

  template <typename From2, typename To2>
  static std::false_type helper(...);

public:
  using type = decltype(helper<From, To>(0));
};

template <typename From, typename To>
class is_list_convertible
  : public is_list_convertible_helper<From, To>::type
{ };

static_assert(!is_list_convertible<double, int>::value,
              "double -> int is narrowing!");

$ g++ -std=c++11 foo.cpp
foo.cpp: In substitution of ‘template<class From2, class To2, class> static std::true_type is_list_convertible_helper<From, To>::helper(int) [with From2 = double; To2 = int; <template-parameter-1-3> = <missing>]’:
foo.cpp:18:31:   required from ‘class is_list_convertible_helper<double, int>’
foo.cpp:22:7:   required from ‘class is_list_convertible<double, int>’
foo.cpp:26:48:   required from here
foo.cpp:10:46: warning: narrowing conversion of ‘std::declval<double>()’ from ‘double’ to ‘int’ inside { } [-Wnarrowing]
             typename = decltype(requires_conversion<To2>({std::declval<From2>()}))>
                                                          ^
foo.cpp:26:1: error: static assertion failed: double -> int is narrowing!
 static_assert(!is_list_convertible<double, int>::value,
 ^
[Bug tree-optimization/63537] [4.8/4.9/5 Regression] Missed optimization: Loop unrolling adds extra copy when returning aggregate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63537 --- Comment #2 from Tavian Barnes ---

Is it possible to make SRA work even if the loop isn't unrolled? If the array size is increased to 4 then -O2 doesn't unroll the loop at all, resulting in:

    movq    %rdi, %rax
    xorl    %edx, %edx
.L3:
    movsd   8(%rsp,%rdx), %xmm1
    mulsd   %xmm0, %xmm1
    movsd   %xmm1, -40(%rsp,%rdx)
    addq    $8, %rdx
    cmpq    $32, %rdx
    jne     .L3
    movq    -40(%rsp), %rdx
    movq    %rdx, (%rax)
    movq    -32(%rsp), %rdx
    movq    %rdx, 8(%rax)
    movq    -24(%rsp), %rdx
    movq    %rdx, 16(%rax)
    movq    -16(%rsp), %rdx
    movq    %rdx, 24(%rax)
    ret

which would be a lot prettier as something like:

    movq    %rdi, %rax
    xorl    %edx, %edx
.L3:
    movsd   8(%rsp,%rdx), %xmm1
    mulsd   %xmm0, %xmm1
    movsd   %xmm1, (%rax,%rdx)
    addl    $8, %edx
    cmpl    $32, %edx
    jne     .L3
    ret
[Bug tree-optimization/63537] New: Missed optimization: Loop unrolling adds extra copy when returning aggregate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63537

Bug ID: 63537
Summary: Missed optimization: Loop unrolling adds extra copy when returning aggregate
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com

Created attachment 33715
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33715&action=edit
Reproducer

At -O2 and above on x86_64, this manually unrolled loop generates much better code than the automatically unrolled one:

struct vec {
  double n[3];
};

struct vec
mul_unrolled(struct vec lhs, double rhs)
{
  struct vec ret;
  ret.n[0] = lhs.n[0]*rhs;
  ret.n[1] = lhs.n[1]*rhs;
  ret.n[2] = lhs.n[2]*rhs;
  return ret;
}

This generates the beautiful:

    movsd   16(%rsp), %xmm2
    movq    %rdi, %rax
    movsd   24(%rsp), %xmm1
    mulsd   %xmm0, %xmm2
    mulsd   %xmm0, %xmm1
    mulsd   8(%rsp), %xmm0
    movsd   %xmm2, 8(%rdi)
    movsd   %xmm1, 16(%rdi)
    movsd   %xmm0, (%rdi)
    ret

In contrast, at -O2 this:

struct vec
mul_loop(struct vec lhs, double rhs)
{
  struct vec ret;
  for (int i = 0; i < 3; ++i) {
    ret.n[i] = lhs.n[i]*rhs;
  }
  return ret;
}

generates this:

    movsd   8(%rsp), %xmm1
    movq    %rdi, %rax
    mulsd   %xmm0, %xmm1
    movsd   %xmm1, -40(%rsp)
    movq    -40(%rsp), %rdx
    movsd   16(%rsp), %xmm1
    mulsd   %xmm0, %xmm1
    movq    %rdx, (%rdi)
    mulsd   24(%rsp), %xmm0
    movsd   %xmm1, -32(%rsp)
    movq    -32(%rsp), %rdx
    movsd   %xmm0, -24(%rsp)
    movq    %rdx, 8(%rdi)
    movq    -24(%rsp), %rdx
    movq    %rdx, 16(%rdi)
    ret

which puts the result in -40(%rsp) and then copies it to (%rdi). At -O3 it gets vectorized but the extra copy is still there:

    movapd  %xmm0, %xmm1
    mulsd   24(%rsp), %xmm0
    movupd  8(%rsp), %xmm2
    movq    %rdi, %rax
    unpcklpd        %xmm1, %xmm1
    mulpd   %xmm1, %xmm2
    movsd   %xmm0, -24(%rsp)
    movaps  %xmm2, -40(%rsp)
    movq    -40(%rsp), %rdx
    movq    %rdx, (%rdi)
    movq    -32(%rsp), %rdx
    movq    %rdx, 8(%rdi)
    movq    -24(%rsp), %rdx
    movq    %rdx, 16(%rdi)
[Bug c++/63323] New: "confused by earlier errors, bailing out" with no other errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63323

Bug ID: 63323
Summary: "confused by earlier errors, bailing out" with no other errors
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com

Created attachment 33529
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33529&action=edit
Preprocessed source

The following file gives a "confused by earlier errors, bailing out" error message, with no previous errors printed, when compiled with -fsanitize=undefined:

$ cat confused.cpp
#include <map>
#include <memory>

int
main()
{
  std::unique_ptr<int> ptr{new int};
  std::map<int, std::unique_ptr<int>> map;
  map.insert({0, std::move(ptr)});
  return 0;
}

$ g++ -std=c++11 -fsanitize=undefined confused.cpp
‘
/usr/include/c++/4.9.1/ext/new_allocator.h:120: confused by earlier errors, bailing out

Without the -fsanitize=undefined, it points to the actual error, which is fixed by using std::make_pair instead of a braced initializer as the argument to insert().
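For reference, the std::make_pair variant mentioned at the end would look roughly like this. It is a sketch: the function name build_map, the int key type and the stored value 42 are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

// Hypothetical helper showing the make_pair form of the insert.
std::map<int, std::unique_ptr<int>> build_map() {
    std::unique_ptr<int> ptr{new int(42)};
    std::map<int, std::unique_ptr<int>> map;
    // make_pair yields a pair<int, unique_ptr<int>> rvalue, which insert()
    // converts to the map's value_type by moving, so nothing is copied.
    map.insert(std::make_pair(0, std::move(ptr)));
    return map;
}
```

The braced form fails because it selects an insert() overload that has to copy the pair, and std::unique_ptr is not copyable.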
[Bug c/61957] New: Wrong -Warray-bounds warning depending on parameter types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61957

Bug ID: 61957
Summary: Wrong -Warray-bounds warning depending on parameter types
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com

Created attachment 33205
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33205&action=edit
Test case

On x86-64 Linux, the attached testcase produces a duplicated spurious warning with -O3 and -Wall:

$ gcc -std=c11 -O3 -Wall -c array-bounds.c
array-bounds.c: In function 'cofactor':
array-bounds.c:9:10: warning: array subscript is above array bounds [-Warray-bounds]
     n[k] = A[i][j];
          ^
array-bounds.c:9:10: warning: array subscript is above array bounds [-Warray-bounds]

Interestingly, if the 'row' and 'col' parameters are changed to 'unsigned long', the warning goes away.
[Bug c/61118] Spurious -Wclobbered warning generated by gcc 4.9.0 for pthread_cleanup_push
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118 --- Comment #1 from Tavian Barnes --- Created attachment 32763 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32763&action=edit Preprocessed source file
[Bug c/61118] New: Spurious -Wclobbered warning generated by gcc 4.9.0 for pthread_cleanup_push
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118

Bug ID: 61118
Summary: Spurious -Wclobbered warning generated by gcc 4.9.0 for pthread_cleanup_push
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com

Created attachment 32762
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32762&action=edit
Source file

On gcc 4.9.0 (but not on 4.8.2 or earlier), the attached code gives two spurious warnings:

$ gcc -std=gnu99 -Wclobbered -O2 -c clobber_warning.c -pthread -save-temps
clobber_warning.c: In function ‘dmnsn_future_wait’:
clobber_warning.c:23:54: warning: variable ‘__cancel_routine’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
 pthread_cleanup_push(cleanup_fn, &future->mutex);
 ^
clobber_warning.c:23:103: warning: variable ‘__cancel_arg’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Wclobbered]
 pthread_cleanup_push(cleanup_fn, &future->mutex);

Those variables come from the expansion of the pthread_cleanup_{push,pop} macros, which look like this (reformatted a little):

do {
  __pthread_unwind_buf_t __cancel_buf;
  void (*__cancel_routine) (void *) = (cleanup_fn);
  void *__cancel_arg = (&future->mutex);
  int __not_first_call = __sigsetjmp ((struct __jmp_buf_tag *) (void *) __cancel_buf.__cancel_jmp_buf, 0);
  if (__builtin_expect ((__not_first_call), 0)) {
    __cancel_routine (__cancel_arg);
    __pthread_unwind_next (&__cancel_buf);
  }
  __pthread_register_cancel (&__cancel_buf);
  do {;
    pthread_cond_wait(&future->cond, &future->mutex);
    do { } while (0);
  } while (0);
  __pthread_unregister_cancel (&__cancel_buf);
  if (0) __cancel_routine (__cancel_arg);
} while (0);

The __cancel_routine and __cancel_arg variables are never modified, so I don't see how they can be clobbered.
[Bug c/43904] Wrong code with -foptimize-sibling-calls and memcpy on x86_64
--- Comment #2 from tavianator at gmail dot com 2010-04-26 23:47 ---
Created an attachment (id=20497)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=20497&action=view)
Full testcase

Proper output:

Stored: 0x40071c
Got:    0x40071c
Hello world!

Output with -O -foptimize-sibling-calls:

Stored: 0x40070c
Got:    0x1
[1]    15940 segmentation fault (core dumped)  ./a.out

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43904
[Bug c/43904] New: Wrong code with -foptimize-sibling-calls and memcpy on x86_64
On x86_64, gcc 4.5.0, this code generates bad assembly:

--- C code ---
typedef unsigned long size_t;

void *memcpy(void *dest, const void *src, size_t n);

void
buggy_init(void *ptr, size_t size)
{
  const char *str = "Hello world!";
  memcpy(ptr, &str, size);
}
--

Compiled with gcc -O -foptimize-sibling-calls, the generated assembly looks like this:

--
buggy_init:
    movq    %rsi, %rdx
    movq    $.LC0, -16(%rsp)
    leaq    -16(%rsp), %rsi
    jmp     memcpy
--

which passes rsp-16 as memcpy's second argument. memcpy overwrites this part of the stack, and copies the wrong value, which causes a crash later.

--
Summary: Wrong code with -foptimize-sibling-calls and memcpy on x86_64
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: tavianator at gmail dot com
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43904