[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2022-03-02 Thread brlcad at mac dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #25 from Sean  ---
Ah, that makes sense, thank you Richard.  I didn't pay as close attention to
the actual swap() code and casting going on there.

Apparently unrelated, but perhaps worth noting the reason this come up on my
radar is because the latest compilers now also issue a warning on the
subtraction:

error: performing pointer subtraction with a null pointer has undefined
behavior [-Werror,-Wnull-pointer-subtraction]

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2022-03-02 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #24 from Richard Biener  ---
(In reply to Sean from comment #23)
> Sorry for digging up the past, but recently ran into a related
> implementation issue in OpenBSD's qsort implementation and came across this
> discussion.  While I understand the proposed patch was rejected, it looks
> like it also degraded the implementation to single byte swaps.
> 
> Admittedly untested, but a possible solution that I think would make LTO
> happy (avoids aliasing the pointer) and doesn't change the implementation
> profile would be to cast it differently since it's always dealing with a
> simple type alignment check on long/int/char.  Something like this:
> 
> #define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = (uintptr_t)a %
> sizeof(TYPE) || es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1;
> 
> Maybe a more acceptable alternative compromise since it would have a nearly
> identical performance profile with it still swap-optimizing int and long.

Note the issue is not the way the type is computed but the memory
accesses done via

#define swap(a, b)  \
if (swaptype_long == 0) {   \
long t = *(long *)(a);  \
*(long *)(a) = *(long *)(b);\
*(long *)(b) = t;   \
} else if (swaptype_int == 0) { \
int t = *(int *)(a);\
*(int *)(a) = *(int *)(b);  \
*(int *)(b) = t;\

there exists non-standard ways to preserve the access pattern like

typedef int notbaa_int __attribute__((may_alias));
typedef long notbaa_long __attribute__((may_alias));
#define swap(a, b)  \
if (swaptype_long == 0) {   \
long t = *(notbaa_long *)(a);  \
*(notbaa_long *)(a) = *(notbaa_long *)(b);\
*(notbaa_long *)(b) = t;   \
} else if (swaptype_int == 0) { \
int t = *(notbaa_int *)(a);\
*(notbaa_int *)(a) = *(notbaa_int *)(b);  \
*(notbaa_int *)(b) = t;\

or portable ones like doing

  long t;
  memcpy (, a, sizeof (long));
  memcpy (a, b, sizeof (long));
  memcpy (b, , sizeof (long));

and hoping for the compiler to optimize this well (GCC will do).

Note at time I proposed the change to the originating project (freebsd?) and
it was accepted there.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2022-03-02 Thread brlcad at mac dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

Sean  changed:

   What|Removed |Added

 CC||brlcad at mac dot com

--- Comment #23 from Sean  ---
Sorry for digging up the past, but recently ran into a related implementation
issue in OpenBSD's qsort implementation and came across this discussion.  While
I understand the proposed patch was rejected, it looks like it also degraded
the implementation to single byte swaps.

Admittedly untested, but a possible solution that I think would make LTO happy
(avoids aliasing the pointer) and doesn't change the implementation profile
would be to cast it differently since it's always dealing with a simple type
alignment check on long/int/char.  Something like this:

#define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = (uintptr_t)a % sizeof(TYPE)
|| es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1;

Maybe a more acceptable alternative compromise since it would have a nearly
identical performance profile with it still swap-optimizing int and long.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2018-06-06 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #22 from Martin Liška  ---
(In reply to john henning from comment #21)
> Comments from SPEC CPU Documentation Guy:
> 
> As described at 
> https://www.spec.org/cpu2017/Docs/faq.html#Miscompare.07
> https://www.spec.org/cpu2017/Docs/benchmarks/505.mcf_r.html#Portability
> SPEC's recommendation is to use
>-fno-strict-aliasing
> 
> The patch attached to this bug is not approved by SPEC.
> 
> SPEC generally does not change benchmarks after release (no moving targets);
> more detail is at the above links.
> 
>John Henning
>SPEC CPU Subcommittee Secretary

Note that the same function is used by other benchmarks as well: 502.gcc_r,
511.povray_r, 527.cam4_r

That means the same miscompilation can happen for them as well :/

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2018-05-09 Thread mailboxnotfound at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

john henning  changed:

   What|Removed |Added

 CC||mailboxnotfound at yahoo dot 
com

--- Comment #21 from john henning  ---
Comments from SPEC CPU Documentation Guy:

As described at 
https://www.spec.org/cpu2017/Docs/faq.html#Miscompare.07
https://www.spec.org/cpu2017/Docs/benchmarks/505.mcf_r.html#Portability
SPEC's recommendation is to use
   -fno-strict-aliasing

The patch attached to this bug is not approved by SPEC.

SPEC generally does not change benchmarks after release (no moving targets);
more detail is at the above links.

   John Henning
   SPEC CPU Subcommittee Secretary

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2018-01-06 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||law at redhat dot com
 Resolution|--- |INVALID

--- Comment #20 from Jeffrey A. Law  ---
Based on c#18.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #19 from rguenther at suse dot de  ---
On December 19, 2017 5:42:07 PM GMT+01:00, "pthaugen at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201
>
>--- Comment #18 from Pat Haugen  ---
>(In reply to Martin Liška from comment #16)
>> (In reply to Richard Biener from comment #15)
>> > SWAPINIT should end up with swaptype_long == 1 I think and
>swaptype_int == 1
>> > for the cases in question.  Enforcing swaptype_int = swaptype_long
>= 2
>> > should make it work (scratch SWAPINIT calls).
>> 
>> I can confirm that.
>
>Yes, that fixes the problem for me on PowerPC also. I can pass along
>the info
>to our SPEC rep.
>
>
>Richi,
>I'm curious if the alias violations were apparent in a dump file, or
>did you
>just happened to spot them looking through the source?

I just looked at the source seeing *(long *) ptr and bells went off. 

Richard.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread pthaugen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #18 from Pat Haugen  ---
(In reply to Martin Liška from comment #16)
> (In reply to Richard Biener from comment #15)
> > SWAPINIT should end up with swaptype_long == 1 I think and swaptype_int == 1
> > for the cases in question.  Enforcing swaptype_int = swaptype_long = 2
> > should make it work (scratch SWAPINIT calls).
> 
> I can confirm that.

Yes, that fixes the problem for me on PowerPC also. I can pass along the info
to our SPEC rep.


Richi,
  I'm curious if the alias violations were apparent in a dump file, or did you
just happened to spot them looking through the source?

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #17 from Martin Liška  ---
Created attachment 42919
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42919=edit
Patch that removes violation of aliasing rules

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #16 from Martin Liška  ---
(In reply to Richard Biener from comment #15)
> SWAPINIT should end up with swaptype_long == 1 I think and swaptype_int == 1
> for the cases in question.  Enforcing swaptype_int = swaptype_long = 2
> should make it work (scratch SWAPINIT calls).

I can confirm that.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #15 from Richard Biener  ---
SWAPINIT should end up with swaptype_long == 1 I think and swaptype_int == 1
for the cases in question.  Enforcing swaptype_int = swaptype_long = 2
should make it work (scratch SWAPINIT calls).

#define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE =   \
((char *)a - (char *)0) % sizeof(TYPE) ||   \
es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1;

static INLINE void
swapfunc(char *a, char *b, int n, int swaptype_long, int swaptype_int)
{
if (swaptype_long <= 1)
swapcode(long, a, b, n)
else if (swaptype_int <= 1)
swapcode(int, a, b, n)
else
swapcode(char, a, b, n)
}

#define swap(a, b)  \
if (swaptype_long == 0) {   \
long t = *(long *)(a);  \
*(long *)(a) = *(long *)(b);\
*(long *)(b) = t;   \
} else if (swaptype_int == 0) { \
int t = *(int *)(a);\
*(int *)(a) = *(int *)(b);  \
*(int *)(b) = t;\
} else  \
swapfunc((char *)a, (char *)b, es, swaptype_long, swaptype_int)

...


loop:   SWAPINIT(long, a, es);
SWAPINIT(int, a, es);

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #14 from Richard Biener  ---
Copies in more benchmarks:

> find benchspec -name spec_qsort.c
benchspec/common/spec_qsort/spec_qsort.c
benchspec/CPU/505.mcf_r/src/spec_qsort/spec_qsort.c
benchspec/CPU/502.gcc_r/src/spec_qsort/spec_qsort.c
benchspec/CPU/511.povray_r/src/spec_qsort/spec_qsort.c
benchspec/CPU/527.cam4_r/src/spec_qsort/spec_qsort.c

...

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #13 from Richard Biener  ---
Likely invalid.  spec_qsort is full of alias violations.  We sort

typedef struct basket
{
arc_t *a;
cost_t cost;
cost_t abs_cost;
LONG number;
} BASKET;

and spec_qsort does stuff like

if (n < 7) {
for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es)
for (pl = pm;
 pl > (char *)a && cmp(pl - es, pl) > 0;
 pl -= es)
swap(pl, pl - es);
return;

with

#define swap(a, b)  \
if (swaptype_long == 0) {   \
long t = *(long *)(a);  \
*(long *)(a) = *(long *)(b);\
*(long *)(b) = t;   \
} else if (swaptype_int == 0) { \
int t = *(int *)(a);\
*(int *)(a) = *(int *)(b);  \
*(int *)(b) = t;\
} else  \
swapfunc((char *)a, (char *)b, es, swaptype_long, swaptype_int)

eh... (no, swapfunc isn't any better).

Anybody up to reporting this to SPEC?

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #12 from rguenther at suse dot de  ---
On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201
> 
> --- Comment #11 from Martin Liška  ---
> (In reply to rguent...@suse.de from comment #10)
> > On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201
> > > 
> > > --- Comment #9 from Martin Liška  ---
> > > Using just a single ltrans, I see first divergence in 
> > > mcf_r.ltrans0.088t.dom1.
> > > Richi, how possible is the revision real culprit?
> > 
> > Unlikely, it probably triggers a latent issue...
> 
> That's not much positive :/ Ideas what to do with that?

Given 505.mcf_r isn't too large figuring out the miscompiled function
shouldn't be too hard (heh, well ...).  The key is probably to somehow
reproduce without FDO ... like just do -fprofile-use?  What other flags
did you use on x86_64?

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #11 from Martin Liška  ---
(In reply to rguent...@suse.de from comment #10)
> On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201
> > 
> > --- Comment #9 from Martin Liška  ---
> > Using just a single ltrans, I see first divergence in 
> > mcf_r.ltrans0.088t.dom1.
> > Richi, how possible is the revision real culprit?
> 
> Unlikely, it probably triggers a latent issue...

That's not much positive :/ Ideas what to do with that?

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #10 from rguenther at suse dot de  ---
On Tue, 19 Dec 2017, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201
> 
> --- Comment #9 from Martin Liška  ---
> Using just a single ltrans, I see first divergence in mcf_r.ltrans0.088t.dom1.
> Richi, how possible is the revision real culprit?

Unlikely, it probably triggers a latent issue...

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #9 from Martin Liška  ---
Using just a single ltrans, I see first divergence in mcf_r.ltrans0.088t.dom1.
Richi, how possible is the revision real culprit?

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

Martin Liška  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #8 from Martin Liška  ---
But my bisection script points to r217827 as first bad revision:

2014-11-20   Richard Biener  

PR tree-optimization/63677
* tree-ssa-dom.c: Include gimplify.h for unshare_expr.
(avail_exprs_stack): Make a vector of pairs.
(struct hash_expr_elt): Replace stmt member with vop member.
(expr_elt_hasher::equal): Simplify.
(initialize_hash_element): Adjust.
(initialize_hash_element_from_expr): Likewise.
(dom_opt_dom_walker::thread_across_edge): Likewise.
(record_cond): Likewise.
(dom_opt_dom_walker::before_dom_children): Likewise.
(print_expr_hash_elt): Likewise.
(remove_local_expressions_from_table): Restore previous state
if requested.
(record_equivalences_from_stmt): Record  + CST as constant
[, CST] for further propagation.
(vuse_eq): New function.
(lookup_avail_expr): For loads use the alias oracle to see
whether a candidate from the expr hash is usable.
(avail_expr_hash): Do not hash VUSEs.

* gcc.dg/tree-ssa/ssa-dom-cse-2.c: New testcase.
* gcc.dg/tree-ssa/ssa-dom-cse-3.c: Likewise.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

Martin Liška  changed:

   What|Removed |Added

 Target|powerpc64le-unknown-linux-g |powerpc64le-unknown-linux-g
   |nu  |nu, x86_64-linux-gnu
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-12-19
   Host|powerpc64le-unknown-linux-g |powerpc64le-unknown-linux-g
   |nu  |nu, x86_64-linux-gnu
 Ever confirmed|0   |1
  Build|powerpc64le-unknown-linux-g |powerpc64le-unknown-linux-g
   |nu  |nu, x86_64-linux-gnu

--- Comment #7 from Martin Liška  ---
Confirmed on x86_64-linux-gnu with current trunk. I'll try to reproduce with
the suspected revision.

[Bug lto/83201] [7/8 Regression] SPEC CPU2017 505.mcf_r produces incorrect output when built with -flto and FDO

2017-12-15 Thread pthaugen at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #6 from Pat Haugen  ---
So I did a bisect of trunk during the GCC 7 development timeframe
(r235035-r247017) and it pointed to r236878 as the point where the failure
started.


+++ gcc/ChangeLog   (revision 236878)
@@ -1,3 +1,9 @@
+2016-05-30  Jan Hubicka  
+
+   * tree-ssa-loop-ivcanon.c (try_peel_loop): Correctly set wont_exit
+   for peeled copies; avoid underflow when updating estimates; correctly
+   scale loop profile.
+