[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #25 from Sean --- Ah, that makes sense, thank you Richard. I didn't pay as close attention to the actual swap() code and casting going on there. Apparently unrelated, but perhaps worth noting the reason this come up on my radar is because the latest compilers now also issue a warning on the subtraction: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction]
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #24 from Richard Biener --- (In reply to Sean from comment #23) > Sorry for digging up the past, but recently ran into a related > implementation issue in OpenBSD's qsort implementation and came across this > discussion. While I understand the proposed patch was rejected, it looks > like it also degraded the implementation to single byte swaps. > > Admittedly untested, but a possible solution that I think would make LTO > happy (avoids aliasing the pointer) and doesn't change the implementation > profile would be to cast it differently since it's always dealing with a > simple type alignment check on long/int/char. Something like this: > > #define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = (uintptr_t)a % > sizeof(TYPE) || es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1; > > Maybe a more acceptable alternative compromise since it would have a nearly > identical performance profile with it still swap-optimizing int and long. Note the issue is not the way the type is computed but the memory accesses done via #define swap(a, b) \ if (swaptype_long == 0) { \ long t = *(long *)(a); \ *(long *)(a) = *(long *)(b);\ *(long *)(b) = t; \ } else if (swaptype_int == 0) { \ int t = *(int *)(a);\ *(int *)(a) = *(int *)(b); \ *(int *)(b) = t;\ there exists non-standard ways to preserve the access pattern like typedef int notbaa_int __attribute__((may_alias)); typedef long notbaa_long __attribute__((may_alias)); #define swap(a, b) \ if (swaptype_long == 0) { \ long t = *(notbaa_long *)(a); \ *(notbaa_long *)(a) = *(notbaa_long *)(b);\ *(notbaa_long *)(b) = t; \ } else if (swaptype_int == 0) { \ int t = *(notbaa_int *)(a);\ *(notbaa_int *)(a) = *(notbaa_int *)(b); \ *(notbaa_int *)(b) = t;\ or portable ones like doing long t; memcpy (, a, sizeof (long)); memcpy (a, b, sizeof (long)); memcpy (b, , sizeof (long)); and hoping for the compiler to optimize this well (GCC will do). Note at time I proposed the change to the originating project (freebsd?) and it was accepted there.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 Sean changed: What|Removed |Added CC||brlcad at mac dot com --- Comment #23 from Sean --- Sorry for digging up the past, but recently ran into a related implementation issue in OpenBSD's qsort implementation and came across this discussion. While I understand the proposed patch was rejected, it looks like it also degraded the implementation to single byte swaps. Admittedly untested, but a possible solution that I think would make LTO happy (avoids aliasing the pointer) and doesn't change the implementation profile would be to cast it differently since it's always dealing with a simple type alignment check on long/int/char. Something like this: #define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = (uintptr_t)a % sizeof(TYPE) || es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1; Maybe a more acceptable alternative compromise since it would have a nearly identical performance profile with it still swap-optimizing int and long.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #22 from Martin Liška --- (In reply to john henning from comment #21) > Comments from SPEC CPU Documentation Guy: > > As described at > https://www.spec.org/cpu2017/Docs/faq.html#Miscompare.07 > https://www.spec.org/cpu2017/Docs/benchmarks/505.mcf_r.html#Portability > SPEC's recommendation is to use >-fno-strict-aliasing > > The patch attached to this bug is not approved by SPEC. > > SPEC generally does not change benchmarks after release (no moving targets); > more detail is at the above links. > >John Henning >SPEC CPU Subcommittee Secretary Note that the same function is used by other benchmarks as well: 502.gcc_r, 511.povray_r, 527.cam4_r That means the same miscompilation can happen for them as well :/
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 john henning changed: What|Removed |Added CC||mailboxnotfound at yahoo dot com --- Comment #21 from john henning --- Comments from SPEC CPU Documentation Guy: As described at https://www.spec.org/cpu2017/Docs/faq.html#Miscompare.07 https://www.spec.org/cpu2017/Docs/benchmarks/505.mcf_r.html#Portability SPEC's recommendation is to use -fno-strict-aliasing The patch attached to this bug is not approved by SPEC. SPEC generally does not change benchmarks after release (no moving targets); more detail is at the above links. John Henning SPEC CPU Subcommittee Secretary
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 Jeffrey A. Law changed: What|Removed |Added Status|NEW |RESOLVED CC||law at redhat dot com Resolution|--- |INVALID --- Comment #20 from Jeffrey A. Law --- Based on c#18.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #19 from rguenther at suse dot de --- On December 19, 2017 5:42:07 PM GMT+01:00, "pthaugen at gcc dot gnu.org"wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 > >--- Comment #18 from Pat Haugen --- >(In reply to Martin Liška from comment #16) >> (In reply to Richard Biener from comment #15) >> > SWAPINIT should end up with swaptype_long == 1 I think and >swaptype_int == 1 >> > for the cases in question. Enforcing swaptype_int = swaptype_long >= 2 >> > should make it work (scratch SWAPINIT calls). >> >> I can confirm that. > >Yes, that fixes the problem for me on PowerPC also. I can pass along >the info >to our SPEC rep. > > >Richi, >I'm curious if the alias violations were apparent in a dump file, or >did you >just happened to spot them looking through the source? I just looked at the source seeing *(long *) ptr and bells went off. Richard.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #18 from Pat Haugen --- (In reply to Martin Liška from comment #16) > (In reply to Richard Biener from comment #15) > > SWAPINIT should end up with swaptype_long == 1 I think and swaptype_int == 1 > > for the cases in question. Enforcing swaptype_int = swaptype_long = 2 > > should make it work (scratch SWAPINIT calls). > > I can confirm that. Yes, that fixes the problem for me on PowerPC also. I can pass along the info to our SPEC rep. Richi, I'm curious if the alias violations were apparent in a dump file, or did you just happened to spot them looking through the source?
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #17 from Martin Liška --- Created attachment 42919 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42919=edit Patch that removes violation of aliasing rules
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #16 from Martin Liška --- (In reply to Richard Biener from comment #15) > SWAPINIT should end up with swaptype_long == 1 I think and swaptype_int == 1 > for the cases in question. Enforcing swaptype_int = swaptype_long = 2 > should make it work (scratch SWAPINIT calls). I can confirm that.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #15 from Richard Biener --- SWAPINIT should end up with swaptype_long == 1 I think and swaptype_int == 1 for the cases in question. Enforcing swaptype_int = swaptype_long = 2 should make it work (scratch SWAPINIT calls). #define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = \ ((char *)a - (char *)0) % sizeof(TYPE) || \ es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1; static INLINE void swapfunc(char *a, char *b, int n, int swaptype_long, int swaptype_int) { if (swaptype_long <= 1) swapcode(long, a, b, n) else if (swaptype_int <= 1) swapcode(int, a, b, n) else swapcode(char, a, b, n) } #define swap(a, b) \ if (swaptype_long == 0) { \ long t = *(long *)(a); \ *(long *)(a) = *(long *)(b);\ *(long *)(b) = t; \ } else if (swaptype_int == 0) { \ int t = *(int *)(a);\ *(int *)(a) = *(int *)(b); \ *(int *)(b) = t;\ } else \ swapfunc((char *)a, (char *)b, es, swaptype_long, swaptype_int) ... loop: SWAPINIT(long, a, es); SWAPINIT(int, a, es);
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #14 from Richard Biener --- Copies in more benchmarks: > find benchspec -name spec_qsort.c benchspec/common/spec_qsort/spec_qsort.c benchspec/CPU/505.mcf_r/src/spec_qsort/spec_qsort.c benchspec/CPU/502.gcc_r/src/spec_qsort/spec_qsort.c benchspec/CPU/511.povray_r/src/spec_qsort/spec_qsort.c benchspec/CPU/527.cam4_r/src/spec_qsort/spec_qsort.c ...
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #13 from Richard Biener --- Likely invalid. spec_qsort is full of alias violations. We sort typedef struct basket { arc_t *a; cost_t cost; cost_t abs_cost; LONG number; } BASKET; and spec_qsort does stuff like if (n < 7) { for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es) for (pl = pm; pl > (char *)a && cmp(pl - es, pl) > 0; pl -= es) swap(pl, pl - es); return; with #define swap(a, b) \ if (swaptype_long == 0) { \ long t = *(long *)(a); \ *(long *)(a) = *(long *)(b);\ *(long *)(b) = t; \ } else if (swaptype_int == 0) { \ int t = *(int *)(a);\ *(int *)(a) = *(int *)(b); \ *(int *)(b) = t;\ } else \ swapfunc((char *)a, (char *)b, es, swaptype_long, swaptype_int) eh... (no, swapfunc isn't any better). Anybody up to reporting this to SPEC?
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #12 from rguenther at suse dot de --- On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 > > --- Comment #11 from Martin Liška --- > (In reply to rguent...@suse.de from comment #10) > > On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 > > > > > > --- Comment #9 from Martin Liška --- > > > Using just a single ltrans, I see first divergence in > > > mcf_r.ltrans0.088t.dom1. > > > Richi, how possible is the revision real culprit? > > > > Unlikely, it probably triggers a latent issue... > > That's not much positive :/ Ideas what to do with that? Given 505.mcf_r isn't too large figuring out the miscompiled function shouldn't be too hard (heh, well ...). The key is probably to somehow reproduce without FDO ... like just do -fprofile-use? What other flags did you use on x86_64?
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #11 from Martin Liška --- (In reply to rguent...@suse.de from comment #10) > On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 > > > > --- Comment #9 from Martin Liška --- > > Using just a single ltrans, I see first divergence in > > mcf_r.ltrans0.088t.dom1. > > Richi, how possible is the revision real culprit? > > Unlikely, it probably triggers a latent issue... That's not much positive :/ Ideas what to do with that?
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #10 from rguenther at suse dot de --- On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 > > --- Comment #9 from Martin Liška --- > Using just a single ltrans, I see first divergence in mcf_r.ltrans0.088t.dom1. > Richi, how possible is the revision real culprit? Unlikely, it probably triggers a latent issue...
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #9 from Martin Liška --- Using just a single ltrans, I see first divergence in mcf_r.ltrans0.088t.dom1. Richi, how possible is the revision real culprit?
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 Martin Liška changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #8 from Martin Liška --- But my bisection script points to r217827 as first bad revision: 2014-11-20 Richard BienerPR tree-optimization/63677 * tree-ssa-dom.c: Include gimplify.h for unshare_expr. (avail_exprs_stack): Make a vector of pairs. (struct hash_expr_elt): Replace stmt member with vop member. (expr_elt_hasher::equal): Simplify. (initialize_hash_element): Adjust. (initialize_hash_element_from_expr): Likewise. (dom_opt_dom_walker::thread_across_edge): Likewise. (record_cond): Likewise. (dom_opt_dom_walker::before_dom_children): Likewise. (print_expr_hash_elt): Likewise. (remove_local_expressions_from_table): Restore previous state if requested. (record_equivalences_from_stmt): Record + CST as constant [, CST] for further propagation. (vuse_eq): New function. (lookup_avail_expr): For loads use the alias oracle to see whether a candidate from the expr hash is usable. (avail_expr_hash): Do not hash VUSEs. * gcc.dg/tree-ssa/ssa-dom-cse-2.c: New testcase. * gcc.dg/tree-ssa/ssa-dom-cse-3.c: Likewise.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 Martin Liška changed: What|Removed |Added Target|powerpc64le-unknown-linux-g |powerpc64le-unknown-linux-g |nu |nu, x86_64-linux-gnu Status|UNCONFIRMED |NEW Last reconfirmed||2017-12-19 Host|powerpc64le-unknown-linux-g |powerpc64le-unknown-linux-g |nu |nu, x86_64-linux-gnu Ever confirmed|0 |1 Build|powerpc64le-unknown-linux-g |powerpc64le-unknown-linux-g |nu |nu, x86_64-linux-gnu --- Comment #7 from Martin Liška --- Confirmed on x86_64-linux-gnu with current trunk. I'll try to reproduce with the suspected revision.
[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201 --- Comment #6 from Pat Haugen --- So I did a bisect of trunk during the GCC 7 development timeframe (r235035-r247017) and it pointed to r236878 as the point where the failure started. +++ gcc/ChangeLog (revision 236878) @@ -1,3 +1,9 @@ +2016-05-30 Jan Hubicka+ + * tree-ssa-loop-ivcanon.c (try_peel_loop): Correctly set wont_exit + for peeled copies; avoid underflow when updating estimates; correctly + scale loop profile. +