[Bug tree-optimization/57359] wrong code for union access at -O3 on x86_64-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hstong at ca dot ibm.com

--- Comment #15 from Richard Biener ---
*** Bug 81028 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/57359] wrong code for union access at -O3 on x86_64-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359

--- Comment #14 from Richard Biener ---
Testcase from PR81028

extern void abort();

typedef int A;
typedef float B;

void __attribute__((noinline,noclone))
foo(A *p, B *q, long unk)
{
  for (long i = 0; i < unk; ++i) {
    *p = 1;
    q[i] = 42;
  }
}

int main(void)
{
  union { A x; B f; } u;
  foo(&u.x, &u.f, 1);
  if (u.f != 42) abort();
  return 0;
}
[Bug tree-optimization/57359] wrong code for union access at -O3 on x86_64-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359

--- Comment #13 from rguenther at suse dot de ---
On Tue, 24 Oct 2017, ch3root at openwall dot com wrote:

> I've also converted the testcase to allocated memory:
>
> --
> #include <stdlib.h>
> #include <stdio.h>
>
> __attribute__((__noinline__,__noclone__))
> void test(int *pi, long *pl, int k, int *pa)
> {
>   for (int i = 0; i < 3; i++) {
>     pl[k] = // something that doesn't change but has to be calculated
>       *pa; // something that potentially can be changed by assignment to *pi
>     *pi = 0;
>   }
> }
>
> int main(void)
> {
>   int *pi = malloc(10);
>   int a = 1;
>
>   test(pi, (void *)pi, 0, &a);
>
>   printf("%d\n", *pi);
> }

Thanks for the unobfuscated testcase, this indeed shows exactly the
issue I mention (store motion sinking a store across another store).
The *pa load prevents store-motion from also moving the *pi store,
which would mitigate this issue (in a much earlier fix we ensured
that the stores on exit are done in the original order).

Testcase for the testsuite that should fail on both big- and
little-endian:

__attribute__((__noinline__,__noclone__))
void test(__INT32_TYPE__ *pi, __INT64_TYPE__ *pl, int k, __INT32_TYPE__ *pa)
{
  for (int i = 0; i < 3; i++) {
    pl[k] = *pa;
    *pi = 1;
  }
}

int main()
{
  __INT32_TYPE__ *pi = __builtin_malloc (10);
  __INT32_TYPE__ a = 2;
  test(pi, (void *)pi, 0, &a);
  if (*pi != 1)
    __builtin_abort ();
  return 0;
}
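
The effect of sinking one store across another can be modelled by hand.
The following is only a sketch and not GCC's actual output: the helper
names (store32, store64, load32, run_as_written, run_with_sunk_store)
are hypothetical, and all accesses go through memcpy so that the
illustration itself does not depend on type-based alias analysis. The
first function performs the stores in source order; the second assumes
the 64-bit store has been moved out of the loop and is performed last.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: byte-wise accesses via memcpy. */
static void store32(void *p, int32_t v) { memcpy(p, &v, sizeof v); }
static void store64(void *p, int64_t v) { memcpy(p, &v, sizeof v); }
static int32_t load32(const void *p)
{
  int32_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* Store order as written in the testcase: the 32-bit store comes last. */
int32_t run_as_written(int32_t a)
{
  unsigned char buf[16] = {0};
  for (int i = 0; i < 3; i++) {
    store64(buf, a);   /* models pl[k] = *pa; */
    store32(buf, 1);   /* models *pi = 1;     */
  }
  return load32(buf);
}

/* Assumed result of the bad transform: the 64-bit store is sunk past
   the loop, so it now overwrites the 32-bit store. */
int32_t run_with_sunk_store(int32_t a)
{
  unsigned char buf[16] = {0};
  int64_t pending = 0;
  for (int i = 0; i < 3; i++) {
    pending = a;       /* value of the sunk store is only remembered */
    store32(buf, 1);   /* models *pi = 1;                            */
  }
  store64(buf, pending); /* sunk store executes after the loop */
  return load32(buf);
}
```

With a = 2, run_as_written returns 1, while run_with_sunk_store returns
a value other than 1 regardless of byte order, which is why the
testsuite testcase checks *pi != 1.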
[Bug tree-optimization/57359] wrong code for union access at -O3 on x86_64-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359

Alexander Cherepanov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ch3root at openwall dot com

--- Comment #12 from Alexander Cherepanov ---
Still reproducible if the number of iterations is changed to 3.

I've also converted the testcase to allocated memory:

--
#include <stdlib.h>
#include <stdio.h>

__attribute__((__noinline__,__noclone__))
void test(int *pi, long *pl, int k, int *pa)
{
  for (int i = 0; i < 3; i++) {
    pl[k] = // something that doesn't change but has to be calculated
      *pa; // something that potentially can be changed by assignment to *pi
    *pi = 0;
  }
}

int main(void)
{
  int *pi = malloc(10);
  int a = 1;

  test(pi, (void *)pi, 0, &a);

  printf("%d\n", *pi);
}
--

Results:

--
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
0
$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
1
--

gcc version: gcc (GCC) 8.0.0 20171024 (experimental)
[Bug tree-optimization/57359] wrong code for union access at -O3 on x86_64-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57359

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
------------------------------------------------------------------------------
           Keywords|                              |wrong-code
             Status|RESOLVED                      |ASSIGNED
          Component|rtl-optimization              |tree-optimization
         Resolution|INVALID                       |---
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note that the GIMPLE/RTL IL does not have all the restrictions of C, so
even if the testcase is invalid C the generated GIMPLE IL may be valid
and thus there may still be a bug in GCC. In particular, the middle-end
memory model does not require the effective type change to go through
a union type.

And indeed the bug is in store-motion, which sinks *pll = a across
*pii = 0. Replace u with anonymous storage and use placement-new to
properly construct a new type in it and you get a valid C++ testcase
that is miscompiled.

-fno-tree-loop-im fixes it.

I will have a look.
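
The comment above suggests placement-new for a C++ variant. As a rough
C analogue (a sketch under assumptions, not the testcase from this
report), allocated storage has no declared type, so a store through a
non-character lvalue sets its effective type without any union being
involved. The function name retype_without_union is hypothetical; the
pointer names pii and pll merely follow the comment.

```c
#include <stdlib.h>

/* Hypothetical helper: store to the same allocated storage first as
   int, then as long long, and read it back with the last stored type. */
long long retype_without_union(long long a)
{
  void *mem = malloc(sizeof(long long));
  if (mem == NULL)
    abort();

  int *pii = mem;
  long long *pll = mem;

  *pii = 0;               /* effective type of the storage is now int */
  *pll = a;               /* this store changes it to long long       */

  long long result = *pll; /* read matches the last store: valid C    */
  free(mem);
  return result;
}
```

Because the second store legitimately changes the effective type, an
optimization that reorders the two stores changes what a conforming
program observes.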