[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #25 from Sergei Trofimovich  ---
(In reply to Sergei Trofimovich from comment #22)
> (In reply to Martin Liška from comment #17)
> > For me tree optimized dump is correct, so likely a target issue.
> 
> Yeah, I agree. I finally understood why memory loads disappear (duh!).
> 
> > @Sergei: Is GCC 9 working properly?
> > Would it be possible to bisect that?
> 
> gcc-9 seems to work, but I'm not sure whether that's intentional or whether
> unrelated optimization passes just change the code enough.
> 
> I'll try to cook up an even smaller example, given that delayed-branch
> scheduling seems to be the culprit, and then bisect gcc.

Bisected down to:

$ git bisect good
8c3785c43d490d4f234e21c9dee6bb1bb8d1dbdf is the first bad commit
commit 8c3785c43d490d4f234e21c9dee6bb1bb8d1dbdf
Author: Martin Liska 
Date:   Wed Dec 4 11:13:49 2019 +0100

Initialize a BB count in switch lowering.

2019-12-04  Martin Liska  

* tree-switch-conversion.c
(switch_decision_tree::try_switch_expansion):
Initialize count of newly created BB.

From-SVN: r278959

 gcc/ChangeLog| 5 +
 gcc/tree-switch-conversion.c | 1 +
 2 files changed, 6 insertions(+)

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #24 from Sergei Trofimovich  ---
(In reply to Sergei Trofimovich from comment #23)
> cvise managed to shrink example down to the following:

For completeness, the difference in assembly output is very clear now:

$ hppa2.0-unknown-linux-gnu-gcc -O2 -S ../bug_test.c -o bug.S
a:
bv %r0(%r2)
ldi 0,%r28

$ hppa2.0-unknown-linux-gnu-gcc -O2 -S ../bug_test.c -o bug.S -fno-delayed-branch
a:
comclr,<> %r26,%r25,%r0
b,n .L11
nop
.L4:
.L12:
ldil L'.L6,%r28
ldo R'.L6(%r28),%r28
ldwx,s %r24(%r28),%r28
bv,n %r0(%r28)
.section .rodata
.align 4
.L6:
.begin_brtab
.word .L8
.word .L5
.word .L4
.word .L5
.word .L4
.word .L5
.end_brtab
.text
.L5:
ldi 10,%r28
bv,n %r0(%r2)
.L11:
ldi 0,%r28
bv,n %r0(%r2)
.L8:
ldi 1,%r28
bv,n %r0(%r2)

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #23 from Sergei Trofimovich  ---
cvise managed to shrink example down to the following:

"""
int b, c;
int a() __attribute__((noipa));
int a(int *d, int *f, int g) {
  int e;
  if (d == f)
    e = 0;
  else
    e = 1;
  switch (g) {
  case 0:
    return e;
  case 1:
  case 3:
  case 5:
    if (e)
      return 10;
  default:
    __builtin_unreachable();
  }
}
int main() { return a(&b, &c, 0); }
"""


$ hppa2.0-unknown-linux-gnu-gcc -O2 bug_test.c -o bad; ./bad; echo $?
0
$ hppa2.0-unknown-linux-gnu-gcc -O2 bug_test.c -o good -fno-delayed-branch; ./good; echo $?
1

gcc-9.2.0 returns '1' in both cases. I'll bisect gcc against this example.
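
For reference, a minimal sketch of what the reduced function is expected to compute, written so it can be checked on any host. The name expected_a and the -1 sentinel are illustrative assumptions and not part of the original test; in the original, every path that reaches 'default' is undefined behaviour because of __builtin_unreachable().

```c
/* Reference model of a() from the reduced test above.
 * expected_a and the -1 sentinel are hypothetical, for illustration only. */
int expected_a(const int *d, const int *f, int g)
{
    int e = (d == f) ? 0 : 1;   /* pointer comparison only, no loads */

    switch (g) {
    case 0:
        return e;
    case 1:
    case 3:
    case 5:
        if (e)
            return 10;
        /* fall through: the original reaches __builtin_unreachable() here */
    default:
        return -1;              /* sentinel standing in for "unreachable" */
    }
}
```

With two distinct pointers and g == 0 this yields 1, which matches the exit status of the 'good' binary and of gcc-9.2.0; the 'bad' binary returns 0 instead.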

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #22 from Sergei Trofimovich  ---
(In reply to Martin Liška from comment #17)
> For me tree optimized dump is correct, so likely a target issue.

Yeah, I agree. I finally understood why memory loads disappear (duh!).

> @Sergei: Is GCC 9 working properly?
> Would it be possible to bisect that?

gcc-9 seems to work, but I'm not sure whether that's intentional or whether
unrelated optimization passes just change the code enough.

I'll try to cook up an even smaller example, given that delayed-branch
scheduling seems to be the culprit, and then bisect gcc.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #21 from Sergei Trofimovich  ---
(In reply to Eric Botcazou from comment #18)
> If the control flow goes through .L12:
> 
> .L12:
> b .L3; return 0; (not interesting, fall through)
>   ldi 1,%r28
> 
> the return value will be 1 since ldi is in the delay slot of the branch.
> 
> What happens if you compile with -O2 -fno-delayed-branch instead?

Oh, -fno-delayed-branch makes the test magically pass! Attaching both bad.S and
good.S for comparison.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #20 from Sergei Trofimovich  ---
Created attachment 48828
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48828&action=edit
good.S

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #19 from Sergei Trofimovich  ---
Created attachment 48827
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48827&action=edit
bad.S

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

Eric Botcazou  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ebotcazou at gcc dot gnu.org

--- Comment #18 from Eric Botcazou  ---
If the control flow goes through .L12:

.L12:
b .L3; return 0; (not interesting, fall through)
ldi 1,%r28

the return value will be 1 since ldi is in the delay slot of the branch.

What happens if you compile with -O2 -fno-delayed-branch instead?
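
A minimal sketch of the delay-slot behaviour described above, assuming a toy three-instruction machine. The names op, insn, bad_fragment and run are hypothetical; real PA-RISC has many more instructions, and the return sequence (which has its own delay slot) is collapsed into a single OP_RET here.

```c
#include <stddef.h>

enum op { OP_LDI, OP_B, OP_RET };

struct insn {
    enum op op;
    int imm;        /* immediate for OP_LDI */
    size_t target;  /* branch target index for OP_B */
};

/* Fragment modelled on bad.S:
 *   [0] .L15: ldi 0,%r28
 *   [1] .L3:  return (epilogue elided)
 *   [2] .L12: b .L3
 *   [3]       ldi 1,%r28   <- delay slot of the branch
 */
static const struct insn bad_fragment[] = {
    { OP_LDI, 0, 0 },
    { OP_RET, 0, 0 },
    { OP_B,   0, 1 },
    { OP_LDI, 1, 0 },
};

/* Runs from index pc until OP_RET and returns the modelled %r28. */
int run(const struct insn *prog, size_t pc)
{
    int r28 = 0;

    for (;;) {
        const struct insn *i = &prog[pc];

        switch (i->op) {
        case OP_LDI:
            r28 = i->imm;
            pc++;
            break;
        case OP_B:
            /* The delay-slot instruction executes before control
             * reaches the branch target (only ldi is modelled). */
            if (prog[pc + 1].op == OP_LDI)
                r28 = prog[pc + 1].imm;
            pc = i->target;
            break;
        case OP_RET:
            return r28;
        }
    }
}
```

Entering at .L12 (index 2) gives 1, while entering at .L15 (index 0) gives 0, so which label the branch table selects decides the return value.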

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-02 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-07-02

--- Comment #17 from Martin Liška  ---
Well, I'm looking at the optimized tree dump for hppa and it seems fine to me:

__attribute__((noipa, noinline, noclone, no_icf))
long_richcompare (int * self, int * other, int op)
{
  int _1;
  int _2;
  int _5;
  int prephitmp_6;

   [local count: 1073741823]:
  _1 = yes ();
  if (_1 == 0)
goto ; [51.12%]
  else
goto ; [48.88%]

   [local count: 524844999]:
  _2 = yes ();
  if (_2 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]

   [local count: 346397698]:
  if (self_11(D) == other_12(D))
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 103919309]:
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [33.33%], case 3:  [33.33%], case 5:  [16.67%]>

   [local count: 17319885]:
:
  goto ; [100.00%]

   [local count: 115465900]:
  # prephitmp_6 = PHI <1(5), op_14(D)(11)>
:
  goto ; [100.00%]

   [count: 0]:
:
  __builtin_unreachable ();

   [local count: 727344125]:

   [local count: 1073741824]:
  # _5 = PHI <1(6), prephitmp_6(7), 0(9), 0(5), 0(11)>
:
  return _5;

   [local count: 242478389]:
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [50.00%], case 3:  [50.00%], case 5:  [50.00%]>

}

we go to bb_2, then as yes() == 0 is false, to bb_3 and bb_4.
In bb_4 we jump to bb_11, from which we go to L6 (aka bb_7).
# prephitmp_6 = PHI <1(5), op_14(D)(11)>

here we set prephitmp_6 = op_14 = 0;
go to bb_10, here _5 = prephitmp_6 = 0;
return _5. So the function properly returns 0.
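
The walk above can be sketched in C by treating each PHI as a selection on the predecessor block. This is only an illustration: the function names are hypothetical, and the block numbers are taken from the PHI arguments and the prose above.

```c
/* prephitmp_6 = PHI <1(5), op_14(D)(11)> */
int phi_prephitmp_6(int pred_bb, int op)
{
    return pred_bb == 5 ? 1 : op;   /* otherwise the edge from bb 11 */
}

/* _5 = PHI <1(6), prephitmp_6(7), 0(9), 0(5), 0(11)> */
int phi_5(int pred_bb, int prephitmp_6)
{
    switch (pred_bb) {
    case 6:
        return 1;
    case 7:
        return prephitmp_6;
    default:
        return 0;                   /* edges from bb 9, 5 and 11 */
    }
}

/* Path from the walk-through: bb 11 -> bb 7 -> return block. */
int walk_result(int op)
{
    int p6 = phi_prephitmp_6(11, op);
    return phi_5(7, p6);
}
```

For op == 0 the sketch yields 0, which is the value the walk-through arrives at.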

For me tree optimized dump is correct, so likely a target issue.

@Sergei: Is GCC 9 working properly?
Would it be possible to bisect that?

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #16 from Sergei Trofimovich  ---
If I look at bad-bug.c.190t.dse3, I see that 'self' and 'other' refer to the same
.MEM_10 memory location in 'basic block 5'. I think they should not: 'basic block
4' jumps into bb5 only when self != other. Am I reading it correctly?

;;   basic block 4, loop depth 0, count 346397698 (estimated locally), maybe hot
;;prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [66.0% (guessed)]  count:346397697 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  if (self_11(D) == other_12(D))
goto ; [30.00%]
  else
goto ; [70.00%]
;;succ:   7 [30.0% (guessed)]  count:103919308 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;5 [70.0% (guessed)]  count:242478390 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 5, loop depth 0, count 242478389 (estimated locally), maybe hot
;;prev block 4, next block 6, flags: (NEW, REACHABLE, VISITED)
;;pred:   4 [70.0% (guessed)]  count:242478390 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  # VUSE <.MEM_10>
  _13 = *self_11(D);
  # VUSE <.MEM_10>
  _16 = *other_12(D);
  sign_17 = _13 - _16;
  if (sign_17 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]
;;succ:   13 [34.0% (guessed)]  count:82442653 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;6 [66.0% (guessed)]  count:160035736 (estimated locally) (FALSE_VALUE,EXECUTABLE)

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #15 from Sergei Trofimovich  ---
Created attachment 48822
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48822&action=edit
bad-bug.c.190t.dse3

bad-bug.c.190t.dse3 is the dump of the previous tree pass, for comparison.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #14 from Sergei Trofimovich  ---
Created attachment 48821
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48821&action=edit
bad-bug.c.191t.cddce3

bad-bug.c.191t.cddce3 is the full file generated by -fdump-tree-all-all.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #13 from Jeffrey A. Law  ---
Hmm, there's a control dependency though in bb13:

   [local count: 242478389]:
  # result_21 = PHI <1(5), sign_17(6)>
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [50.00%], case 3:  [50.00%], case 5:  [50.00%]>
}

So I'd hazard a guess that sign_17 either has the value 1 here or that
result_21 is unused, otherwise you're right that cddce shouldn't remove the
block.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #12 from Jeffrey A. Law  ---
The block in question goes away because it serves no purpose:

   [local count: 242478389]:
  _13 = *self_11(D);
  _16 = *other_12(D);
  sign_17 = _13 - _16;
  if (sign_17 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]

   [local count: 160035736]:
  goto ; [100.00%]


Note that bb6 just transfers control to bb13 with no other side effects.  As a
result bb5 is equivalent to:

   [local count: 242478389]:
  _13 = *self_11(D);
  _16 = *other_12(D);
  sign_17 = _13 - _16;
  if (sign_17 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]


With both arms of the conditional going to the same place and no other uses of
sign_17 the whole block just turns into

  goto ;

I see nothing wrong with what was done by DCE.  The problem must be earlier in
the optimizer pipeline.
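
A minimal C sketch of the argument, assuming (as stated above) that sign has no other uses. The function names are hypothetical and the 'return 0' merely stands in for "continue at the common successor".

```c
/* Shape of bb5 before cddce3 (simplified): two loads feed a branch
 * whose arms both reach the same place. */
int bb5_before(const int *self, const int *other)
{
    int sign = *self - *other;

    if (sign == 0)
        goto join;
    else
        goto join;
join:
    return 0;   /* placeholder for the common successor */
}

/* What is left once the branch and the now-unused loads are removed. */
int bb5_after(const int *self, const int *other)
{
    (void)self;
    (void)other;
    return 0;
}
```

Both functions are observably equivalent: plain loads have no side effects, so once nothing depends on sign they can go away together with the branch.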

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #11 from Sergei Trofimovich  ---
Looking at -fdump-tree-all:
$gcc/xgcc -B$gcc -lm -Wsign-compare -Wall -fno-PIE -no-pie -fno-stack-protector -O2 -S bug_test.c -o bad-bug.S -fdump-tree-all

I see that the loads are eliminated at the 'bad-bug.c.191t.cddce3' stage:

Was (at bad-bug.c.190t.dse3):

"""
__attribute__((noipa, noinline, noclone, no_icf))
long_richcompare (int * self, int * other, int op)
{
  int sign;
  int result;
  int _1;
  int _2;
  int _5;
  int prephitmp_6;
  int _13;
  int _16;

   [local count: 1073741823]:
  _1 = yes ();
  if (_1 == 0)
goto ; [51.12%]
  else
goto ; [48.88%]

   [local count: 524844999]:
  _2 = yes ();
  if (_2 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]

   [local count: 346397698]:
  if (self_11(D) == other_12(D))
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 242478389]:
  _13 = *self_11(D);
  _16 = *other_12(D);
  sign_17 = _13 - _16;
  if (sign_17 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]

   [local count: 160035736]:
  goto ; [100.00%]

   [local count: 103919309]:
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [33.33%], case 3:  [33.33%], case 5:  [16.67%]>

   [local count: 23093180]:
:
  goto ; [100.00%]

   [local count: 115465900]:
  # prephitmp_6 = PHI <0(13), 1(7)>
:
  goto ; [100.00%]

   [count: 0]:
:
  __builtin_unreachable ();

   [local count: 727344125]:

   [local count: 1073741824]:
  # _5 = PHI <0(13), prephitmp_6(9), 0(11), 0(8), 1(7)>
:
  return _5;

   [local count: 242478389]:
  # result_21 = PHI <1(5), sign_17(6)>
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [50.00%], case 3:  [50.00%], case 5:  [50.00%]>
}
"""

Became (at bad-bug.c.191t.cddce3):

"""
Removing basic block 5
__attribute__((noipa, noinline, noclone, no_icf))
long_richcompare (int * self, int * other, int op)
{
  int sign;
  int result;
  int _1;
  int _2;
  int _5;
  int prephitmp_6;

   [local count: 1073741823]:
  _1 = yes ();
  if (_1 == 0)
goto ; [51.12%]
  else
goto ; [48.88%]

   [local count: 524844999]:
  _2 = yes ();
  if (_2 == 0)
goto ; [34.00%]
  else
goto ; [66.00%]

   [local count: 346397698]:
  if (self_11(D) == other_12(D))
goto ; [30.00%]
  else
goto ; [70.00%]

   [local count: 103919309]:
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [33.33%], case 3:  [33.33%], case 5:  [16.67%]>

   [local count: 23093180]:
:
  goto ; [100.00%]

   [local count: 115465900]:
  # prephitmp_6 = PHI <0(11), 1(5)>
:
  goto ; [100.00%]

   [count: 0]:
:
  __builtin_unreachable ();

   [local count: 727344125]:

   [local count: 1073741824]:
  # _5 = PHI <0(11), prephitmp_6(7), 0(9), 0(6), 1(5)>
:
  return _5;

   [local count: 242478389]:
  switch (op_14(D))  [33.33%], case 0:  [16.67%], case 1:
 [50.00%], case 3:  [50.00%], case 5:  [50.00%]>

}
"""


Note: the following block disappeared completely:
"""
   [local count: 242478389]:
  _13 = *self_11(D);
  _16 = *other_12(D);
"""

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #10 from Sergei Trofimovich  ---
Created attachment 48820
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48820&action=edit
good-bug.S

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #9 from Sergei Trofimovich  ---
(In reply to Martin Liška from comment #7)
> There's ASM diff in between GCC 9 and 10 version:
> 
> diff -u good.s bad.s
> --- good.s  2020-07-01 15:04:58.315839436 +0200
> +++ bad.s   2020-07-01 15:04:30.684040487 +0200

Hm, interesting! I think both these files are broken. Let me try to elaborate
by annotating bad-bug.S. All the test does is print the result of comparing
'2' and '1' stored in memory:

*lhs = 2; *rhs = 1;
int sign = *lhs - *rhs;
return sign;

But in bad-bug.S we never read from memory! (IPA is disabled to make functions
somewhat opaque):

"""
main:
ldi 2,%r28
stw %r2,-20(%r30)
ldo 64(%r30),%r30
stw %r28,-52(%r30) ; store '2' in RAM
ldo -56(%r30),%r19 ; get RAM address
ldi 1,%r28
ldi 0,%r24 ; arg2 = '0'
stw %r28,-56(%r30) ; store '1' in RAM
copy %r19,%r25 ; arg1 = &2
bl long_richcompare,%r2
ldo -52(%r30),%r26 ; arg0 = &1 (delay slot, executed before branch)
... (all ok so far)
long_richcompare:
stw %r2,-20(%r30)
stwm %r5,64(%r30)
copy %r26,%r5; arg0 = &1
stw %r4,-60(%r30)
copy %r25,%r4; arg1 = &2
stw %r3,-56(%r30)
bl yes,%r2
copy %r24,%r3; arg2 = 0 (delay slot, is it safe in general to 
comiclr,= 0,%r28,%r0 ; if (!yes()) ...
b,n .L22 ; go to actual comparison
.L15:
ldi 0,%r28   ; fall through to 'return 0;' (not interesting)
.L3:
.L26:
ldw -84(%r30),%r2
ldw -60(%r30),%r4
ldw -56(%r30),%r3
bv %r0(%r2)
ldwm -64(%r30),%r5
.L22:
bl yes,%r2
nop
comib,=,n 0,%r28,.L26 ; if ( .. || !yes()) return 0; (not interesting)
ldi 0,%r28
comiclr,<< 5,%r3,%r0 ; check if 'arg2 < 5' to fit into jump table,
                     ; otherwise skip (nullify) next instruction and run .L3
b,n .L25 ; handle jump table
.L12:
b .L3; return 0; (not interesting, fall through)
ldi 1,%r28
.L25:
ldil L'.L8,%r28;
ldo R'.L8(%r28),%r28   ; load jump table address
ldwx,s %r3(%r28),%r28  ; load target at .L8[arg2 * 4]
bv,n %r0(%r28) ; jump on target, should be .L12
.section .rodata
.align 4
.L8:
.begin_brtab
.word .L12
.word .L15
.word .L12
.word .L15
.word .L12
.word .L12
.end_brtab
"""

Note: at no point during execution does 'long_richcompare()' try to
dereference its arg0 and arg1 inputs (%r4, %r5 registers).
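
The tail of the annotated listing can be modelled in C as follows. This is a sketch that follows the annotations above, not verified against real hardware; label, brtab_L8, at_L12, at_L15 and bad_tail are hypothetical names.

```c
enum label { L12, L15 };

/* .L8 branch table from the listing, indexed by arg2 (0..5). */
static const enum label brtab_L8[6] = { L12, L15, L12, L15, L12, L12 };

/* .L12: b .L3 with ldi 1,%r28 in the delay slot. */
static int at_L12(void) { return 1; }

/* .L15: ldi 0,%r28, then fall into .L3. */
static int at_L15(void) { return 0; }

/* Tail of long_richcompare() in bad-bug.S after both yes() calls. */
int bad_tail(unsigned int arg2)
{
    if (arg2 > 5u)          /* comiclr,<< 5,%r3,%r0 nullifies "b,n .L25" */
        return at_L12();    /* execution falls through into .L12 */

    return brtab_L8[arg2] == L12 ? at_L12() : at_L15();
}
```

The model takes no pointer arguments at all, which is the point of the note above: the result depends only on arg2.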

For comparison, compiling with -O1 keeps the loads around:

good-bug.S:

"""
main:   ; same as above
stw %r2,-20(%r30)
ldo 64(%r30),%r30
ldi 2,%r28
stw %r28,-56(%r30)
ldi 1,%r28
ldo -52(%r30),%r19
stw %r28,-52(%r30)
ldi 0,%r24
copy %r19,%r25
bl long_richcompare,%r2
ldo -56(%r30),%r26
...
long_richcompare:
stw %r2,-20(%r30)
stwm %r5,64(%r30)
stw %r4,-60(%r30)
stw %r3,-56(%r30)
copy %r26,%r4   ; arg0
copy %r25,%r3   ; arg1
bl yes,%r2
copy %r24,%r5   ; arg2
or,= %r28,%r0,%r28  ; result = 0
b,n .L11; 
.L2:
ldw -84(%r30),%r2
.L12:
ldw -60(%r30),%r4
ldw -56(%r30),%r3
bv %r0(%r2) ; return
ldwm -64(%r30),%r5
.L11:
bl yes,%r2
nop
movb,= %r28,%r28,.L12 ; if(!yes()) return ...
ldw -84(%r30),%r2
comb,=,n %r3,%r4,.L9  ; if(arg0 == arg1) (at branch) diff = 0;
ldw 0(%r4),%r28   
ldw 0(%r3),%r19
sub %r28,%r19,%r28; diff = *arg0 - *arg1
comiclr,<> 0,%r28,%r0
ldi 1,%r28
.L4:
comiclr,>>= 5,%r5,%r0
b,n .L6
ldil L'.L7,%r19
ldo R'.L7(%r19),%r19
ldwx,s %r5(%r19),%r19
bv,n %r0(%r19); handle jump table, at .L8
.section .rodata
.align 4
.L7:
.begin_brtab
.word .L8
.word .L10
.word .L6
.word .L10
.word .L6
.word .L6
.end_brtab
.text
.L9:
b .L4
ldi 0,%r28
.L8:
comiclr,<> 0,%r28,%r28; if (result == 0)
ldi 1,%r28; result = 1;
b .L12; return
ldw -84(%r30),%r2
.L6:
comiclr,<> 0,%r28,%r28
ldi 1,%r28
b .L12
ldw -84(%r30),%r2
.L10:
b .L2
ldi 0,%r28
"""

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #8 from Martin Liška  ---
And the first change happens in pr96015.c.299r.bbro, which is likely the reason
why the jump table is partially copied.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #7 from Martin Liška  ---
Here's the ASM diff between the GCC 9 and GCC 10 versions:

diff -u good.s bad.s
--- good.s  2020-07-01 15:04:58.315839436 +0200
+++ bad.s   2020-07-01 15:04:30.684040487 +0200
@@ -30,7 +30,7 @@
 .L15:
ldi 0,%r28
 .L3:
-.L25:
+.L26:
ldw -84(%r30),%r2
ldw -60(%r30),%r4
ldw -56(%r30),%r3
@@ -39,16 +39,14 @@
 .L22:
bl yes,%r2
nop
-   comib,=,n 0,%r28,.L25
+   comib,=,n 0,%r28,.L26
ldi 0,%r28
-   comclr,<> %r4,%r5,%r0
-   b,n .L23
comiclr,<< 5,%r3,%r0
-   b,n .L24
-.L6:
-.L23:
-   comib,<< 5,%r3,.L26
+   b,n .L25
+.L12:
+   b .L3
ldi 1,%r28
+.L25:
ldil L'.L8,%r28
ldo R'.L8(%r28),%r28
ldwx,s %r3(%r28),%r28
@@ -65,34 +63,6 @@
.word .L12
.end_brtab
.text
-.L12:
-   ldi 1,%r28
-.L26:
-   ldw -84(%r30),%r2
-   ldw -60(%r30),%r4
-   ldw -56(%r30),%r3
-   bv %r0(%r2)
-   ldwm -64(%r30),%r5
-.L24:
-   ldil L'.L11,%r28
-   ldo R'.L11(%r28),%r28
-   ldwx,s %r3(%r28),%r28
-   bv,n %r0(%r28)
-   .section.rodata
-   .align 4
-.L11:
-   .begin_brtab
-   .word .L14
-   .word .L15
-   .word .L6
-   .word .L15
-   .word .L6
-   .word .L15
-   .end_brtab
-   .text
-.L14:
-   b .L3
-   copy %r3,%r28
.EXIT
.PROCEND
.size   long_richcompare, .-long_richcompare
@@ -143,4 +113,4 @@
.EXIT
.PROCEND
.size   main, .-main
-   .ident  "GCC: (SUSE Linux) 9.3.1 20200406 [revision
6db837a5288ee3ca5ec504fbd5a765817e556ac2]"
+   .ident  "GCC: (SUSE Linux) 10.1.1 20200625 [revision
c91e43e9363bd119a695d64505f96539fa451bf2]"

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #6 from Sergei Trofimovich  ---
Created attachment 48816
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48816&action=edit
bad-bug.S

bad-bug.S is the miscompiled file generated by main gcc (it is not clear yet
what is wrong).

Generated as:
$gcc/xgcc -B$gcc/gcc -lm -Wsign-compare -Wall -fno-PIE -no-pie -fno-stack-protector -O2 -S bug_test.c -o bad-bug.S

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread slyfox at inbox dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

--- Comment #5 from Sergei Trofimovich  ---
I ran the test in qemu-hppa (qemu user binary emulation) against Gentoo's
hppa2.0 root system as:
/usr/bin/qemu-hppa -L /usr/hppa2.0-unknown-linux-gnu/ "$@"
where /usr/hppa2.0-unknown-linux-gnu/ is a hppa SYSROOT.

Cross-compiler is generated with Gentoo's 'crossdev' tool as:
   # crossdev hppa2.0-unknown-linux-gnu
The command builds cross-binutils, cross-gcc with
--sysroot=/usr/hppa2.0-unknown-linux-gnu/ and puts glibc into
/usr/hppa2.0-unknown-linux-gnu/.

Full native root system is also at
http://distfiles.gentoo.org/releases/hppa/autobuilds/current-stage3-hppa2.0/
(stage3-hppa2.0-*.tar.bz2 tarballs). Should be good enough to be used for
qemu-hppa as-is.

I also plan to go through the assembly dump this evening to get an idea of
where the incorrect code gets generated, to spare you the debugging.

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #4 from Martin Liška  ---
Thank you for the report.
What system do you use that can cross compile (and link) the test-case in order
to run it in qemu?

[Bug target/96015] [10/11 Regression] gcc-10.1.0 miscompiles Python on hppa

2020-07-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96015

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.2
           Keywords|                            |wrong-code
            Summary|[regression] gcc-10.1.0     |[10/11 Regression]
                   |miscompiles Python on hppa  |gcc-10.1.0 miscompiles
                   |                            |Python on hppa