[Bug target/105325] power10: Error: operand out of range

2023-01-26 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105325

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 CC||acsawdey at gcc dot gnu.org

--- Comment #12 from acsawdey at gcc dot gnu.org ---
I do have a patch for this one that has been sitting around and that I forgot
about. I am looking at reviving it so I can at least post it.

[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte

2021-11-16 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Bisection reveals that this starts with this commit:

20d70cd2719815d9ea853314775ae5787648ece5 is the first bad commit
commit 20d70cd2719815d9ea853314775ae5787648ece5
Author: Alan Modra 
Date:   Thu May 9 08:37:26 2019 +0930

[RS6000] PR89271, gcc.target/powerpc/vsx-simode2.c

This patch makes a number of corrections to rs6000_register_move_cost,
adds a new register union class, GEN_OR_VSX_REGS, and adjusts insn
alternative costs to suit.

[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte

2021-11-15 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197

--- Comment #4 from acsawdey at gcc dot gnu.org ---
I was compiling with -mcpu=power9, yes:

/home2/sawdey/work/gcc/trunk/build/gcc/xgcc
-B/home2/sawdey/work/gcc/trunk/build/gcc -O3 -mcpu=power9 bug2.c

[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte

2021-11-11 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197

--- Comment #2 from acsawdey at gcc dot gnu.org ---
From the reload dump:

0 Non input pseudo reload: reject++
1 Non-pseudo reload: reject+=2
1 Non input pseudo reload: reject++
  alt=0,overall=16,losers=2,rld_nregs=2
0 Non input pseudo reload: reject++
  alt=1,overall=7,losers=1,rld_nregs=1
  alt=2,overall=6,losers=1,rld_nregs=0
[...]
 Choosing alt 2 in insn 9:  (0) wa  (1) Z {*movqi_internal}

The addressing for insn 9 is just reg+const so why did it think it would have
to reload one register for alt 1 (d-form) and 0 for alt 2 which is x-form?
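
The three "overall" values in that dump are consistent with a score of the
form overall = 6 * losers + reject. The sketch below is a hypothetical model,
not code from lra-constraints.c; the factor 6 is an assumption inferred from
these three lines, and alt_overall is an illustrative name.

```c
#include <assert.h>

/* Assumption: loser weight inferred from the dump above
   (16 = 2*6 + 4, 7 = 1*6 + 1, 6 = 1*6 + 0).  */
enum { LOSER_COST_FACTOR = 6 };

/* Hypothetical model of the "overall" score printed per alternative.
   LOSERS is the number of operands needing a reload, REJECT is the sum
   of the reject++ / reject+=2 penalties logged for that alternative.  */
static int
alt_overall (int losers, int reject)
{
  assert (losers >= 0 && reject >= 0);
  return losers * LOSER_COST_FACTOR + reject;
}
```

Under that reading, alt 1 and alt 2 both have one loser, and alt 2 wins only
because alt 1 picked up a single reject++ for the non input pseudo reload.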

[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte

2021-11-11 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197

--- Comment #1 from acsawdey at gcc dot gnu.org ---
Looking at trunk, after expand we have this:

(note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 5 3 2 (set (reg/v/f:DI 117 [ a ])
(reg:DI 3 3 [ a ])) "bug2.c":3:1 -1
 (nil))
(insn 3 2 4 2 (set (reg/v/f:DI 118 [ b ])
(reg:DI 4 4 [ b ])) "bug2.c":3:1 -1
 (nil))
(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 4 9 2 (set (reg:DI 119)
(mem:DI (reg/v/f:DI 118 [ b ]) [0 MEM  [(void *)b_3(D)]+0 S8
A8])) "bug2.c":4:3 -1
 (nil))
(insn 9 7 8 2 (set (reg:QI 120)
(mem:QI (plus:DI (reg/v/f:DI 118 [ b ])
(const_int 8 [0x8])) [0 MEM  [(void *)b_3(D)]+8 S1
A8])) "bug2.c":4:3 -1
 (nil))
(insn 8 9 10 2 (set (mem:DI (reg/v/f:DI 117 [ a ]) [0 MEM  [(void
*)a_2(D)]+0 S8 A8])
(reg:DI 119)) "bug2.c":4:3 -1
 (nil))
(insn 10 8 0 2 (set (mem:QI (plus:DI (reg/v/f:DI 117 [ a ])
(const_int 8 [0x8])) [0 MEM  [(void *)a_2(D)]+8 S1
A8])
(reg:QI 120)) "bug2.c":4:3 -1
 (nil))

Which is the expected code, DI and QI loads/stores that should produce D-form
instructions.

But it looks like reload put the QI into hard reg 32 which is a fp reg:

(insn 9 17 8 2 (set (reg:QI 32 0 [orig:120 MEM  [(void *)b_3(D)]+8 ]
[120])
(mem:QI (reg:DI 10 10 [124]) [0 MEM  [(void *)b_3(D)]+8 S1
A8])) "bug2.c":4:3 549 {*movqi_internal}

which leads to the lxsibzx/stxsibx on output.

[Bug target/103197] New: ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte

2021-11-11 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197

Bug ID: 103197
   Summary: ppc inline expansion of memcpy/memmove should not use
lxsibzx/stxsibx for a single byte
   Product: gcc
   Version: 10.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

This got broken sometime in the gcc 10 timeframe. For this test case:

#include <string.h>
void m(char *a, char *b)
{
  memcpy(a,b,9);
}

AT13 (gcc 9.3.1) produces:

m:
.LFB0:
.cfi_startproc
ld 10,0(4)
lbz 9,8(4)
std 10,0(3)
stb 9,8(3)
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.cfi_endproc

which is the expected code to copy 9 bytes.

AT14 (gcc 10.3.1), gcc 11, and current trunk all produce:

m:
.LFB0:
.cfi_startproc
addi 10,4,8
ld 9,0(4)
lxsibzx 0,0,10
std 9,0(3)
addi 9,3,8
stxsibx 0,0,9
blr
.long 0
.byte 0,0,0,0,0,0,0,0
.cfi_endproc

which is really bad, mixing gpr and vsx. The inline expansion code in
expand_block_move() does not attempt to generate vsx code at all unless the
size is at least 16 bytes.
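
For reference, the gcc 9 sequence amounts to one doubleword move plus one byte
move, all through integer registers. A minimal C sketch of that shape follows;
copy9_gpr_style is a hypothetical helper for illustration, not anything in
expand_block_move().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not GCC code: copy exactly 9 bytes as one
   doubleword plus one byte, mirroring the ld/lbz/std/stb sequence
   from the gcc 9 output.  */
static void
copy9_gpr_style (char *dst, const char *src)
{
  uint64_t dword;      /* stands in for the GPR holding bytes 0..7 */
  unsigned char byte;  /* stands in for the GPR holding byte 8 */

  assert (dst != NULL && src != NULL);
  memcpy (&dword, src, sizeof dword);  /* ld  10,0(4) */
  byte = (unsigned char) src[8];       /* lbz 9,8(4)  */
  memcpy (dst, &dword, sizeof dword);  /* std 10,0(3) */
  dst[8] = (char) byte;                /* stb 9,8(3)  */
}
```

Both loads are issued before either store, as in the assembly above, and no
vector or floating point register is involved.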

[Bug target/100996] rs6000 p10 vector add-add fusion should work with -m32 but doesn't

2021-06-09 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100996

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Target||powerpc-*-*-*
   Last reconfirmed||2021-06-09
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

[Bug target/100996] New: rs6000 p10 vector add-add fusion should work with -m32 but doesn't

2021-06-09 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100996

Bug ID: 100996
   Summary: rs6000 p10 vector add-add fusion should work with -m32
but doesn't
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

The fusion-p10-addadd.c test case does not get vector add-add fusion when
compiling with -m32:

/home/sawdey/work/gcc/trunk/build/gcc/xgcc
-B/home/sawdey/work/gcc/trunk/build/gcc/
/home/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.target/powerpc/fusion-p10-addadd.c
 -m32  -fdiagnostics-plain-output  -mcpu=power10 -O3 -dap -fno-ident -S

typedef vector long vlong;
vlong vaddadd(vlong a, vlong b, vlong c)
{
  return a+b+c;
}

vaddadd:
.LFB3:
.cfi_startproc
vadduwm 2,2,3    # 8    [c=4 l=4]  addv4si3
vadduwm 2,2,4    # 14   [c=4 l=4]  addv4si3
blr              # 24   [c=4 l=4]  simple_return
.cfi_endproc

[Bug target/97926] ICE in patch_jump_insn, at cfgrtl.c:1298

2021-03-18 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97926

--- Comment #3 from acsawdey at gcc dot gnu.org ---
So the underlying problem here is that the unordered comparisons are not
allowed with -ffinite-math-only due to this predicate:

;; Return 1 if OP is a comparison operation that is valid for a branch
;; instruction.  We check the opcode against the mode of the CC value.
;; validate_condition_mode is an assertion.
(define_predicate "branch_comparison_operator"
   (and (match_operand 0 "comparison_operator")
        (match_test "GET_MODE_CLASS (GET_MODE (XEXP (op, 0))) == MODE_CC")
        (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCFPmode")
          (if_then_else (match_test "flag_finite_math_only")
            (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
            (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
          (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
        (match_test "validate_condition_mode (GET_CODE (op),
                                              GET_MODE (XEXP (op, 0))),
                     1")))


But ubsan_instrument_float_cast() generates this:

  t = fold_build2 (UNLE_EXPR, boolean_type_node, expr, min);
  tt = fold_build2 (UNGE_EXPR, boolean_type_node, expr, max);

which eventually leads to the ICE. Even if this branch wasn't rewritten by
patch_jump_insn() it would not be recognized and would eventually ICE.

Segher is working on a change to that predicate for PR98092, though, which may
serve as a workaround for this.
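
To make the mismatch concrete, here is a small C model of just the CCFPmode
arm of that predicate. It is a sketch for illustration only: the function
name ccfp_branch_code_ok is hypothetical, the comparison codes are passed as
strings rather than rtx codes, and the validate_condition_mode check is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the CCFPmode arm of branch_comparison_operator.
   CODE is a lower-case RTL comparison name such as "unle".  The two
   lists are copied from the match_code strings in the predicate.  */
static bool
ccfp_branch_code_ok (const char *code, bool finite_math_only)
{
  static const char *const finite[] =
    { "lt", "le", "gt", "ge", "eq", "ne", "unordered", "ordered", NULL };
  static const char *const ieee[] =
    { "lt", "gt", "eq", "unordered", "unge", "unle", "ne", "ordered", NULL };
  const char *const *list = finite_math_only ? finite : ieee;

  assert (code != NULL);
  for (; *list != NULL; list++)
    if (strcmp (*list, code) == 0)
      return true;
  return false;
}
```

With finite_math_only set, "unle" and "unge" are both rejected, which is
exactly the pair that ubsan_instrument_float_cast() builds.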

[Bug target/97926] ICE in patch_jump_insn, at cfgrtl.c:1298

2021-03-16 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97926

--- Comment #2 from acsawdey at gcc dot gnu.org ---
patch_jump_insn() is running into a land mine -- the insn before modification
is invalid:

(gdb) p insn_invalid_p(insn, true)
$4 = 1
(gdb) pr insn
(jump_insn 18 17 114 6 (set (pc)
(if_then_else (unle (reg:CCFP 131)
(const_int 0 [0]))
(label_ref 21)
(pc)))
"../../gcc/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-11.c":9:10 -1
 (nil)
 -> 21)

So verify_changes() fails because the same insn with a different label_ref
inserted is also invalid.

[Bug target/99070] ICE in extract_constrain_insn, at recog.c:2670

2021-03-08 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99070

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from acsawdey at gcc dot gnu.org ---
Fixed in trunk.

[Bug target/99070] ICE in extract_constrain_insn, at recog.c:2670

2021-02-11 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99070

--- Comment #4 from acsawdey at gcc dot gnu.org ---
OK, I see the fail with -mcpu=power9. Looks like I botched something with
addressing and allowed D-form addresses when it should be DS-form. On power10
this would result in selection of a prefix D-form load, which then causes the
ld-cmpi to be split. On anything previous we just ICE.
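
As a reminder of the distinction, a D-form displacement is any signed 16-bit
value, a DS-form displacement must additionally be a multiple of 4, and a
power10 prefixed load takes a signed 34-bit displacement. The helpers below
are a hypothetical sketch of those range checks, not the predicates used in
the rs6000 backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* D-form (e.g. lbz, lwz): signed 16-bit displacement.  */
static bool
d_form_offset_ok (int64_t off)
{
  return off >= -32768 && off <= 32767;
}

/* DS-form (e.g. ld, std): signed 16-bit displacement whose low two
   bits are zero, i.e. a multiple of 4.  */
static bool
ds_form_offset_ok (int64_t off)
{
  return d_form_offset_ok (off) && (off % 4) == 0;
}

/* Prefixed D-form (e.g. pld on power10): signed 34-bit displacement.  */
static bool
prefixed_offset_ok (int64_t off)
{
  return off >= -(INT64_C (1) << 33) && off < (INT64_C (1) << 33);
}
```

An offset such as 6 (an illustrative value, not taken from the failing test)
passes the D-form check but not the DS-form one, so it is only usable for a
doubleword load if a prefixed instruction is available.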

[Bug target/99070] ICE in extract_constrain_insn, at recog.c:2670

2021-02-11 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99070

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org

--- Comment #2 from acsawdey at gcc dot gnu.org ---
What are the build flags for this compiler?

[Bug rtl-optimization/98692] Unitialized Values reported only with -Os

2021-01-18 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98692

--- Comment #7 from acsawdey at gcc dot gnu.org ---
The inline expansion should be disabled by -Os, the patterns for cmpstr[n]si
both have this:

  if (optimize_insn_for_size_p ())
    FAIL;

[Bug target/98688] C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad

2021-01-15 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98688

--- Comment #3 from acsawdey at gcc dot gnu.org ---
Yeah it's pretty clear that something needs to be output, as with that code I
get an error like this:

In module imported at mma-module-2.C:1:1:
mma_foo0: In function ‘int bar(__vector_quad*, vec_t*, __vector_pair*)’:
mma_foo0: error: failed to read compiled module cluster 2: Bad file data
mma_foo0: note: compiled module file is ‘gcm.cache/mma_foo0.gcm’
mma-module-2.C:7:5: fatal error: failed to load binding ‘::foo0@mma_foo0’
7 | foo0 (dst, vec, pvecp);
  | ^~~~


This is with a little test case of two files:

export module mma_foo0;

typedef unsigned char  vec_t __attribute__((vector_size(16)));

export void
foo0 (__vector_quad *dst, vec_t *vec, __vector_pair *pvecp)
{
  __vector_quad acc;
  __vector_pair vecp0 = *pvecp;
  vec_t vec1 = vec[1];

  __builtin_mma_xvf64ger (&acc, vecp0, vec1);
  __builtin_mma_xvf64gerpp (&acc, vecp0, vec1);
  __builtin_mma_xvf64gerpn (&acc, vecp0, vec1);
  dst[0] = acc;
}



import mma_foo0;

typedef unsigned char  vec_t __attribute__((vector_size(16)));

int bar(__vector_quad *dst, vec_t *vec, __vector_pair *pvecp)
{
    foo0 (dst, vec, pvecp);
}

[Bug target/98688] C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad

2021-01-14 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98688

--- Comment #1 from acsawdey at gcc dot gnu.org ---
I don't know if this is the right thing to do, but ignoring the opaque type
here makes the ICE go away. I suspect I need to construct a module test case
using vector_pair/vector_quad to really test this though.


diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index d2093916c9e..3ec0b04def3 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8831,6 +8831,10 @@ trees_out::type_node (tree type)
   }
   break;

+    case OPAQUE_TYPE:
+      /* No additional data.  */
+      break;
+
     case OFFSET_TYPE:
       tree_node (TYPE_OFFSET_BASETYPE (type));
       break;

[Bug target/98688] New: C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad

2021-01-14 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98688

Bug ID: 98688
   Summary: C++ modules support does not work on PowerPC with
opaque MMA types vector_pair/vector_quad
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

Similar to PR98645, we run into trouble if we try to compile code using
vector_pair/vector_quad using -fmodule-header:

/home/sawdey/work/gcc/trunk2/build/gcc/xg++
-B/home/sawdey/work/gcc/trunk2/build/gcc/
/home/sawdey/work/gcc/trunk2/gcc/gcc/testsuite/gcc.target/powerpc/mma-builtin-2.c
   -std=c++2a -fmodule-header  -S -o mb2.s
/home/sawdey/work/gcc/trunk2/gcc/gcc/testsuite/gcc.target/powerpc/mma-builtin-2.c:
internal compiler error: in type_node, at cp/module.cc:8779
0x10468287 trees_out::type_node(tree_node*)
../../gcc/gcc/cp/module.cc:8779
0x1046446b trees_out::tree_node(tree_node*)
../../gcc/gcc/cp/module.cc:9106
0x10467dcb trees_out::type_node(tree_node*)
../../gcc/gcc/cp/module.cc:8773
0x1046446b trees_out::tree_node(tree_node*)
../../gcc/gcc/cp/module.cc:9106
0x10465d57 trees_out::core_vals(tree_node*)
../../gcc/gcc/cp/module.cc:6088
0x1046783b trees_out::tree_node_vals(tree_node*)
../../gcc/gcc/cp/module.cc:7141
0x1046783b trees_out::fn_parms_init(tree_node*)
../../gcc/gcc/cp/module.cc:10037
0x10461833 trees_out::decl_value(tree_node*, depset*)
../../gcc/gcc/cp/module.cc:7738
0x1046e163 depset::hash::find_dependencies()
../../gcc/gcc/cp/module.cc:13199
0x1046eae7 module_state::write(elf_out*, cpp_reader*)
../../gcc/gcc/cp/module.cc:17568
0x10470313 finish_module_processing(cpp_reader*)
../../gcc/gcc/cp/module.cc:19747
0x103ae82f c_parse_final_cleanups()
../../gcc/gcc/cp/decl2.c:5178
0x1072cad7 c_common_parse_file()
../../gcc/gcc/c-family/c-opts.c:1233
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

It looks like trees_out::type_node() needs to understand opaque type, and
possibly whatever reads that in needs to understand it on the way in as well.

[Bug c++/97947] [11 Regression] ICE in digest_init_r, at cp/typeck2.c:1145

2020-12-01 Thread acsawdey at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97947

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org

--- Comment #3 from acsawdey at gcc dot gnu.org ---
I'll take a look at this. Probably missed something when adding OPAQUE_TYPE.

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2020-09-10 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #10 from acsawdey at gcc dot gnu.org ---
For now, disabling use of POImode for expansion of memcpy/memmove to avoid this
problem while we figure out the real fix:

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553672.html

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2020-09-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #9 from acsawdey at gcc dot gnu.org ---
I did post a small patch that fixes this, but more for the purpose of provoking
discussion than because I am sure it is the right way to fix this.

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553523.html

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2020-09-02 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #8 from acsawdey at gcc dot gnu.org ---
Another small test case, reduced from my compile failure of c/c-typeck.c and
modified to provoke truncation from POImode to various other modes:

typedef int *a;
struct b { a ba; };
enum c { c1=1 };
struct e {
  union eu {
    char f_char;
    short f_short;
    int f_int;
    long f_long;
    int *f_ptr;
    long long f_ll;
  } u;
  c g;
  a h;
  b i;
};
a d(bool, bool, bool);
e j(int, e, bool, bool);
void k() {
  int l;
  for (;;) {
    e expr;
    l = sizeof(struct e);
    expr = j(l, expr, true, false);
    d(expr.u.f_char, false, __null);
    d(expr.u.f_short, false, __null);
    d(expr.u.f_int, false, __null);
    d(expr.u.f_long, false, __null);
    d(expr.u.f_ptr, false, __null);
    d(expr.u.f_ll, false, __null);
  }
}

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2020-08-31 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #7 from acsawdey at gcc dot gnu.org ---
I wonder if this other case works properly when compiled with -m64. Trying to
generate a stxvp with a 32-bit address seems odd.

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2020-08-27 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Nope, this is my patch that added vector pair to memcpy/memmove expansion. We
apparently don't have the right patterns defined for this to extract things
from the POImode reg that it uses.

This is the code in expr.c:

  if (GET_MODE_CLASS (from_mode) == MODE_PARTIAL_INT)
    {
      rtx new_from;
      scalar_int_mode full_mode
        = smallest_int_mode_for_size (GET_MODE_BITSIZE (from_mode));
      convert_optab ctab = unsignedp ? zext_optab : sext_optab;
      enum insn_code icode;

      icode = convert_optab_handler (ctab, full_mode, from_mode);
      gcc_assert (icode != CODE_FOR_nothing);

convert_optab_handler doesn't find anything to go from POImode to DImode, so
the assert fires.

[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412

2020-08-27 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791

--- Comment #3 from acsawdey at gcc dot gnu.org ---
This also requires -mbig which may be implicit in the original poster's build.
But I see it failing as well.

[Bug target/96787] rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure

2020-08-25 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787

--- Comment #3 from acsawdey at gcc dot gnu.org ---
Never mind that, all I'm seeing is the lack of save/restore of r2 in the
power10 version.

[Bug target/96787] rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure

2020-08-25 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787

--- Comment #2 from acsawdey at gcc dot gnu.org ---
I'm seeing some load-past-store code motion that happens when compiling for
power10 vs power9 that makes me suspicious.

[Bug target/96787] rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure

2020-08-25 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787

--- Comment #1 from acsawdey at gcc dot gnu.org ---
Created attachment 49123
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49123&action=edit
hashtab.c with target power9 attribute on htab_delete()

[Bug target/96787] New: rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure

2020-08-25 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787

Bug ID: 96787
   Summary: rs6000 mcpu=power10 miscompiles libiberty
htab_delete() causing bootstrap failure
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

Created attachment 49122
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49122&action=edit
hashtab.c with no added attributes

Building r11-2827 configured using --with-cpu=power10, I am seeing some kind of
compile failure in libiberty hashtab.o function htab_delete(). Putting
__attribute__ ((target("cpu=power9"))) in front of that function clears the
problem.

The manifestation is that genmddeps segfaults.

I've attached asm output with (.fixed.s) and without (.broken.s) the attribute
on htab_delete().

[Bug c/96151] bootstrap fails due to ICE in c_omp_split_clauses

2020-07-10 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96151

--- Comment #1 from acsawdey at gcc dot gnu.org ---
This compile is successful like this but fails if I add -mcpu=power9.

/home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/xg++
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/
-B/opt/binutils-gcc-p10/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs

-I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu

-I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include
 -I/home2/sawdey/work/gcc/mamboCI/gcc-master/libstdc++-v3/libsupc++
-L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs
-L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c  -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2
-fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -Ic-family
-I../../gcc-master/gcc -I../../gcc-master/gcc/c-family
-I../../gcc-master/gcc/../include -I../../gcc-master/gcc/../libcpp/include 
-I../../gcc-master/gcc/../libdecnumber
-I../../gcc-master/gcc/../libdecnumber/dpd -I../libdecnumber
-I../../gcc-master/gcc/../libbacktrace   -o c-family/c-omp.o -MT
c-family/c-omp.o -MMD -MP -MF c-family/.deps/c-omp.TPo
../../gcc-master/gcc/c-family/c-omp.c

[Bug c/96151] New: bootstrap fails due to ICE in c_omp_split_clauses

2020-07-10 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96151

Bug ID: 96151
   Summary: bootstrap fails due to ICE in c_omp_split_clauses
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

Started to see this on trunk last night. Tested again and still see it with
r11-2018.

configured with: 

/home2/sawdey/work/gcc/mamboCI/gcc-master/configure
--prefix=/opt/binutils-gcc-p10 --enable-languages=all --enable-bootstrap
--with-cpu=power9


/home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/xg++
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/
-B/opt/binutils-gcc-p10/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs

-I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu

-I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include
 -I/home2/sawdey/work/gcc/mamboCI/gcc-master/libstdc++-v3/libsupc++
-L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs
-L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c  -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2
-fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -Ic-family
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../include
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libcpp/include 
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libdecnumber
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libdecnumber/dpd
-I../libdecnumber
-I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libbacktrace   -o
c-family/c-omp.o -MT c-family/c-omp.o -MMD -MP -MF c-family/.deps/c-omp.TPo
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family/c-omp.c
during RTL pass: expand
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family/c-omp.c: In function
‘void c_omp_split_clauses(location_t, tree_code, omp_clause_mask, tree,
tree_node**)’:
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family/c-omp.c:1561:1: internal
compiler error: in reduce_to_bit_field_precision, at expr.c:11530
 1561 | c_omp_split_clauses (location_t loc, enum tree_code code,
  | ^~~
0x10decc9f reduce_to_bit_field_precision
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:11530
0x10dde35b expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8786
0x10de5443 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:10152
0x10ddc6c7 expand_expr_real(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8469
0x10db55f7 expand_expr
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.h:282
0x10ddac03 expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**,
rtx_def**, expand_modifier)
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8065
0x10ddc8f7 expand_cond_expr_using_cmove
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8518
0x10de3c97 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:9869
0x10b79587 expand_gimple_stmt_1
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:3786
0x10b798d3 expand_gimple_stmt
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:3847
0x10b83887 expand_gimple_basic_block
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:5888
0x10b861d7 execute
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:6572
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
Makefile:1124: recipe for target 'c-family/c-omp.o' failed

[Bug target/95347] rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references

2020-06-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95347

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from acsawdey at gcc dot gnu.org ---
This is fixed now.

[Bug target/95347] rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references

2020-06-02 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95347

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2020-06-02
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from acsawdey at gcc dot gnu.org ---
Turns out that lfs/plfs has the same problem. Patch for that coming shortly.

[Bug target/95347] New: rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references

2020-05-26 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95347

Bug ID: 95347
   Summary: rs6000 mcpu=future generating stfs instead of pstfs
for pc-relative references
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

Problem exists in r11-639.

/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/xgcc
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/testsuite/gcc.c-torture/execute/pr79354.c
-mcpu=future -mpcrel -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never -O1 -w -lm -o ./pr79354.exe --save-temps

./pr79354.s: Assembler messages:
./pr79354.s:31: Error: missing operand

The relevant piece of the asm output:

xscvuxdsp 0,32
pstfs 0,.LANCHOR0+16@pcrel
stfs 0,.LANCHOR0+20@pcrel
lwa 10,0(3)
pstw 10,.LANCHOR0+20@pcrel

The extended mnemonic "pstfs Fx,value" is equivalent to "pstfs Fx,value(0),1"
and is only valid for pstfs not stfs.

[Bug target/94740] ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-23 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94740

--- Comment #1 from acsawdey at gcc dot gnu.org ---
Reduced test case:



struct __attribute__((scalar_storage_order("big-endian"))) {
  int a;
  int b[];
} c;
int d;
int e() { d = c.b[0]; }

[Bug target/94740] New: ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-23 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94740

Bug ID: 94740
   Summary: ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future
-mpcrel -O1
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

Compiler is trunk 3bcdb5dec72b6d7b197821c2b814bc9fc07f4628 on ppc64le power9
host.

~/work/gcc/trunk/build/gcc/xgcc -B/home2/sawdey/work/gcc/trunk/build/gcc
/home2/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.dg/sso/t5.c -mcpu=future
-mpcrel -O1 -lm -o ./t5.exe
during RTL pass: reload
/home2/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.dg/sso/t5.c: In function
‘main’:
/home2/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.dg/sso/t5.c:73:1: internal
compiler error: in set_address_disp, at rtlanal.c:6254
   73 | }
  | ^
0x10a50ca3 set_address_disp
../../gcc/gcc/rtlanal.c:6254
0x10a50ca3 set_address_disp
../../gcc/gcc/rtlanal.c:6252
0x10a50ca3 decompose_automod_address
../../gcc/gcc/rtlanal.c:6297
0x10a50ca3 decompose_address(address_info*, rtx_def**, machine_mode, unsigned
char, rtx_code)
../../gcc/gcc/rtlanal.c:6457
0x10887973 process_address_1
../../gcc/gcc/lra-constraints.c:3367
0x10889b9b process_address
../../gcc/gcc/lra-constraints.c:3641
0x10889b9b curr_insn_transform
../../gcc/gcc/lra-constraints.c:3956
0x1088f95f lra_constraints(bool)
../../gcc/gcc/lra-constraints.c:5029
0x1087119f lra(_IO_FILE*)
../../gcc/gcc/lra.c:2440
0x10810b9b do_reload
../../gcc/gcc/ira.c:5523
0x10810b9b execute
../../gcc/gcc/ira.c:5709
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel

2020-04-22 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Fixed in trunk.

[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel

2020-04-21 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622

--- Comment #3 from acsawdey at gcc dot gnu.org ---
I'm wondering if the same problem exists for atomic_store, store_quadpti,
and pstq vs stq?

[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel

2020-04-20 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622

--- Comment #2 from acsawdey at gcc dot gnu.org ---
Solution is going to be to always use plq if prefixed, which makes sense anyway
for little endian because it avoids the ugly doubleword swap.

[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel

2020-04-17 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622

--- Comment #1 from acsawdey at gcc dot gnu.org ---
Compiling with -dap we see:

sync # 7[c=12 l=4]  *hwsync
plq 8,.LANCHOR0@pcrel# 8[c=8 l=12]  load_quadpti
mr 10,9  # 9[c=4 l=4]  *movdi_internal64/2
mr 11,8  # 10   [c=4 l=4]  *movdi_internal64/2

I think the problem is that atomic_load thinks it always needs to do a
doubleword swap if little endian for TImode, which is true for lq, but not for
plq.
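
A toy model of the effect, as a sketch only. It assumes that on little endian
lq delivers the two doublewords in the opposite order from what TImode expects,
while plq delivers them in the expected order. The names dw_pair, model_lq,
model_plq and swap_dw are made up for illustration and are not GCC code.

```c
#include <stdint.h>

/* Toy register pair standing in for a TImode value; purely illustrative.  */
struct dw_pair { uint64_t lo, hi; };

/* Hypothetical model of lq on little endian: the halves arrive swapped.  */
static struct dw_pair model_lq (struct dw_pair mem)
{
  struct dw_pair r = { mem.hi, mem.lo };
  return r;
}

/* Hypothetical model of plq: the halves arrive in the expected order.  */
static struct dw_pair model_plq (struct dw_pair mem)
{
  return mem;
}

/* The doubleword swap emitted after the load.  */
static struct dw_pair swap_dw (struct dw_pair v)
{
  struct dw_pair r = { v.hi, v.lo };
  return r;
}
```

In this model swap_dw (model_lq (m)) gives back m, while swap_dw (model_plq (m))
returns the halves exchanged, which matches the two mr instructions after the
plq in the dump above.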

[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel

2020-04-16 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2020-04-16
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

[Bug target/94622] New: testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel

2020-04-16 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622

Bug ID: 94622
   Summary: testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on
powerpc64le with -mpcrel
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

Compile command:
/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/xgcc
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/
/home2/sawdey/work/gcc/mamboCI/pike-trunk/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/powerpc64le-unknown-linux-gnu/./libatomic/
-L/home2/sawdey/work/gcc/mamboCI/build-mambo/powerpc64le-unknown-linux-gnu/./libatomic/.libs
-latomic -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -fdiagnostics-urls=never -O1 -std=c11
-pedantic-errors -lm -mpcrel -mcpu=future -o c11-atomic-exec-1.exe

Compiler is trunk from about a week ago.

Reduced test case:

extern void abort (void);
extern void exit (int);
static void
test_simple_assign (void)
{
  do {
    do { static volatile _Atomic (long double) b = (long double) ((1));
      if (b != ((long double) ((1)))) abort ();
    } while (0);
  } while (0);
}

int
main (void)
{
  test_simple_assign ();
  exit (0);
}

The problem seems to be that with -mpcrel, we generate a plq for the load of
the long double constant and are swapping around the doublewords, which is only
needed for lq not plq.

The generated code with -mpcrel:

plq 8,.LANCHOR0@pcrel
mr 10,9
mr 11,8
cmpw 0,10,10
bne- 0,$+4
isync
std 9,32(1)
std 8,40(1)
plfd 0,.LC0@pcrel
plfd 1,.LC0+8@pcrel
lfd 12,32(1)
lfd 13,40(1)
fcmpu 0,12,0
bne 0,$+8
fcmpu 0,13,1
bne 0,.L4

And with -mno-pcrel:

addis 9,2,.LANCHOR0@toc@ha
addi 9,9,.LANCHOR0@toc@l
lq 10,0(9)
mr 8,10
mr 9,11
mr 10,11
mr 11,8
cmpw 0,10,10
bne- 0,$+4
isync
std 9,32(1)
std 8,40(1)
addis 9,2,.LC0@toc@ha
addi 9,9,.LC0@toc@l
lfd 0,0(9)
lfd 1,8(9)
lfd 12,32(1)
lfd 13,40(1)
fcmpu 0,12,0
bne 0,$+8
fcmpu 0,13,1
bne 0,.L4

[Bug target/94542] test gcc/testsuite/gcc.dg/tls/pr24428-2.c generates incorrect code on ppc64le with -mpcrel -mcpu=future -O2

2020-04-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94542

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-04-09
 Ever confirmed|0   |1

[Bug target/94542] New: test gcc/testsuite/gcc.dg/tls/pr24428-2.c generates incorrect code on ppc64le with -mpcrel -mcpu=future -O2

2020-04-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94542

Bug ID: 94542
   Summary: test gcc/testsuite/gcc.dg/tls/pr24428-2.c generates
incorrect code on ppc64le with -mpcrel -mcpu=future
-O2
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---

The test case is:

__thread double thrtest[81];
int main ()
{
  double *p, *e;
  e = &thrtest[81];
  for (p = &thrtest[0]; p < e; ++p)
    *p = 1.0;
  return 0;
}

Generated code for p and e is

paddi 9,13,thrtest@tprel
pla 8,thrtest+648@pcrel

The second should also be using a @tprel relocation. Because it didn't, the
loop runs off the end of allocated memory and segfaults. This test runs
correctly when compiled with -O0.
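
For reference, the +648 in the second instruction is just the byte offset of
&thrtest[81] from the start of the array. A minimal sketch, assuming
sizeof (double) is 8 and using a non-TLS stand-in array (thrtest_model and
end_offset are names invented here):

```c
#include <stddef.h>

/* Same array shape as the test case, minus __thread so the sketch
   builds anywhere; only the arithmetic is being illustrated.  */
static double thrtest_model[81];

/* Byte distance from the first element to one past the last.  */
static size_t end_offset (void)
{
  return (size_t) ((char *) &thrtest_model[81] - (char *) &thrtest_model[0]);
}
```

So e is the same object as p plus a constant, and both addresses need to be
formed relative to the thread pointer.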

Compile command:

/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/xgcc
-B/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/
/home2/sawdey/work/gcc/mamboCI/pike-trunk/gcc/testsuite/gcc.dg/tls/pr24428-2.c
-O2 -mpcrel -mcpu=future -S -o pr24428-2.exe.s

[Bug target/92379] rs6000.c:5598:13: runtime error: shift exponent 64 is too large for 64-bit type 'long int'

2020-03-16 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92379

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from acsawdey at gcc dot gnu.org ---
Fix checked in to trunk.

[Bug target/92379] rs6000.c:5598:13: runtime error: shift exponent 64 is too large for 64-bit type 'long int'

2020-03-06 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92379

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||acsawdey at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org

--- Comment #6 from acsawdey at gcc dot gnu.org ---
I've reproduced this with current trunk, going to see if I can cook up a patch
quick.

[Bug target/93129] PPC memset not using vector instruction on >= Power8

2020-01-06 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93129

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2020-01-06
 CC||acsawdey at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/93130] PPC simple memset not inlined

2020-01-06 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93130

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2020-01-06
 CC||acsawdey at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug jit/87808] gcc_lib_dir is missing from libgccjit's search path when driver is not installed

2019-07-22 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87808

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-07-22
 CC||acsawdey at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #6 from acsawdey at gcc dot gnu.org ---
I'm also seeing this same problem simply from not having the gcc driver in
PATH.

Using the example from downstream redhat BZ 1566178:

[sawdey@marlin trunk]$ /home2/sawdey/work/gcc/trunk/install/bin/gcc 
-Wl,-rpath,/home2/sawdey/work/gcc/trunk/install/lib -g -Wall -Werror t.c
-lgccjit
[sawdey@marlin trunk]$ ./a.out
ld: cannot find crtbeginS.o: No such file or directory
ld: cannot find -lgcc
ld: cannot find -lgcc_s
libgccjit.so: error: error invoking gcc driver
gcc_jit_result_get_code: NULL result
Segmentation fault (core dumped)
[sawdey@marlin trunk]$ PATH=/home2/sawdey/work/gcc/trunk/install/bin:$PATH
./a.out
hello foo

Using strace the failing version makes this ld command, with no path for
crtbeginS.o:
/usr/bin/ld --eh-frame-hdr -shared -m elf64lppc -o
/tmp/libgccjit-lMbVEL/fake.so /usr/lib/powerpc64le-linux-gnu/crti.o crtbeginS.o
-L/lib/powerpc64le-linux-gnu -L/lib/../lib64 -L/usr/lib/powerpc64le-linux-gnu
/tmp/cchtuxH0.o -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc
--push-state --as-needed -lgcc_s --pop-state crtendS.o
/usr/lib/powerpc64le-linux-gnu/crtn.o

When it can find the driver, this is the ld command, with the full path to the
correct installed bits:
/usr/bin/ld --eh-frame-hdr -shared -m elf64lppc -o
/tmp/libgccjit-t81bpM/fake.so /usr/lib/powerpc64le-linux-gnu/crti.o
/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/crtbeginS.o
-L/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0
-L/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/../../../../lib64
-L/lib/powerpc64le-linux-gnu -L/lib/../lib64 -L/usr/lib/powerpc64le-linux-gnu
-L/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/../../..
/tmp/ccD4Cbi4.o -lgcc --push-state --as-needed -lgcc_s -pop-state -lc -lgcc
--push-state --as-needed -lgcc_s --pop-state
/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/crtendS.o
/usr/lib/powerpc64le-linux-gnu/crtn.o


This is with trunk, configured with

--disable-bootstrap --enable-languages=c,c++,jit --enable-host-shared
--prefix=/home2/sawdey/work/gcc/trunk/install

[Bug rtl-optimization/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309

2019-02-15 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from acsawdey at gcc dot gnu.org ---
Fixed in trunk.

[Bug rtl-optimization/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309

2019-02-15 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308

--- Comment #6 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Fri Feb 15 15:41:25 2019
New Revision: 268942

URL: https://gcc.gnu.org/viewcvs?rev=268942&root=gcc&view=rev
Log:
2019-02-15  Aaron Sawdey  

PR rtl-optimization/88308
* shrink-wrap.c (move_insn_for_shrink_wrap): Fix LABEL_NUSES counts
on copied instruction.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/shrink-wrap.c

[Bug target/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309

2019-02-13 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308

--- Comment #5 from acsawdey at gcc dot gnu.org ---
After some more digging, it appears that the problem is
move_insn_for_shrink_wrap() is deleting and re-creating insns to move them from
one BB to another. The label reference count gets decremented in delete_insn()
but does not get re-incremented when the new insn is created in a different BB.

If you add -fno-shrink-wrap, the ICE does not occur.
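
The bookkeeping problem can be pictured with a toy model. This is only a
sketch: toy_label, toy_insn and the three functions are invented for
illustration and do not correspond to GCC's real rtx structures or to the
actual patch.

```c
/* Toy stand-ins, not GCC's real data structures.  */
struct toy_label { int nuses; };
struct toy_insn { struct toy_label *label_ref; int bb; };

/* Deleting an insn drops its reference to the label.  */
static void toy_delete_insn (struct toy_insn *insn)
{
  if (insn->label_ref)
    insn->label_ref->nuses--;
}

/* Re-create the insn in another block without touching the count
   (the shape of the problem).  */
static struct toy_insn toy_recreate_buggy (const struct toy_insn *old,
                                           int new_bb)
{
  struct toy_insn copy = { old->label_ref, new_bb };
  return copy;
}

/* Re-create it and restore the reference (one way to keep the count
   consistent).  */
static struct toy_insn toy_recreate_fixed (const struct toy_insn *old,
                                           int new_bb)
{
  struct toy_insn copy = toy_recreate_buggy (old, new_bb);
  if (copy.label_ref)
    copy.label_ref->nuses++;
  return copy;
}
```

With a label that starts at one use, delete followed by the buggy re-create
leaves the count at zero even though the moved insn still refers to the label.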

[Bug target/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309

2019-02-12 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308

--- Comment #4 from acsawdey at gcc dot gnu.org ---
Tracked down the difference between -m32 and -m64. In the -m64 case,
rs6000_emit_move calls force_const_mem and that will set LABEL_PRESERVE_P on a
label_ref that it finds, which is what marks the jump table label for
preservation. In the -m32 case, none of the if tests succeed inside the case
E_SImode/E_DImode and as a result rs6000_emit_move does not call
force_const_mem.

It really seems to me like the label for the jump table should be marked for
preservation somewhere more definite than this.

[Bug target/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309

2019-02-12 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308

--- Comment #3 from acsawdey at gcc dot gnu.org ---
One difference between compiling this -m32 and -m64 is that the label for the
jump table is marked /s in the 64-bit version:

(code_label/s 22 21 23 4 (nil) [4 uses])
(jump_table_data 23 22 24 (addr_diff_vec:SI (label_ref:DI 22)
 [
(label_ref:DI 25)
(label_ref:DI 54)
(label_ref:DI 83)

In the 32-bit version it is not, and that label plus the jump_table_data insn
that follows are not present in the dumps after split2:

(code_label 23 22 24 4 (nil) [4 uses])
;; Insn is not within a basic block
(jump_table_data 24 23 25 (addr_diff_vec:SI (label_ref:SI 23)
 [
(label_ref:SI 26)
(label_ref:SI 64)
(label_ref:SI 102)
(label_ref:SI 140)
(label_ref:SI 178)

The significance of this is that tablejump_p() looks at the next insn to
determine if it is in fact a tablejump:

bool
tablejump_p (const rtx_insn *insn, rtx_insn **labelp,
             rtx_jump_table_data **tablep)
{
  if (!JUMP_P (insn))
    return false;

  rtx target = JUMP_LABEL (insn);
  if (target == NULL_RTX || ANY_RETURN_P (target))
    return false;

  rtx_insn *label = as_a <rtx_insn *> (target);
  rtx_insn *table = next_insn (label);
  if (table == NULL_RTX || !JUMP_TABLE_DATA_P (table))
    return false;

Since the label insn and jump table insn seem to be gone, this returns false.

Then in create_trace_edges() we end up in the final stanza of the
if (JUMP_P (insn)):

  else
    {
      rtx_insn *lab = JUMP_LABEL_AS_INSN (insn);
      gcc_assert (lab != NULL);
      maybe_record_trace_start (lab, insn);
    }

And so we try to create a trace for the jump table label which leads to the
ICE.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from acsawdey at gcc dot gnu.org ---
This is fixed in trunk and gcc-8-branch. Hopefully I got this into 8 in time
for it to get into 8.3.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #10 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Sat Feb  9 17:11:06 2019
New Revision: 268725

URL: https://gcc.gnu.org/viewcvs?rev=268725&root=gcc&view=rev
Log:
2019-02-09  Aaron Sawdey  

Backported from mainline
2019-02-05  Aaron Sawdey  

PR target/89112
* config/rs6000/rs6000.md (tf_): Generate a local label
for the long branch case.

2019-02-05  Aaron Sawdey  

PR target/89112
* config/rs6000/rs6000-string.c (do_ifelse, expand_cmp_vec_sequence,
expand_compare_loop, expand_block_compare_gpr,
expand_strncmp_align_check, expand_strncmp_gpr_sequence): Insert
REG_BR_PROB notes in inline expansion of memcmp/strncmp. Add
#include "profile-count.h" and "predict.h" for types and functions
needed to work with REG_BR_PROB notes.

2019-02-09  Aaron Sawdey  

* config/rs6000/rs6000-string.c (expand_compare_loop,
expand_block_compare): Insert REG_BR_PROB notes in inline expansion of
memcmp/strncmp.



Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/rs6000/rs6000-string.c
branches/gcc-8-branch/gcc/config/rs6000/rs6000.md

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-05 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #9 from acsawdey at gcc dot gnu.org ---
The fixes for this are in trunk now. I will backport to gcc-8-branch in a week
and then this can be closed.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-05 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #8 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Feb  5 16:32:06 2019
New Revision: 268547

URL: https://gcc.gnu.org/viewcvs?rev=268547&root=gcc&view=rev
Log:
2019-02-05  Aaron Sawdey  

PR target/89112
* config/rs6000/rs6000-string.c (do_ifelse, expand_cmp_vec_sequence,
expand_compare_loop, expand_block_compare_gpr,
expand_strncmp_align_check, expand_strncmp_gpr_sequence): Insert
REG_BR_PROB notes in inline expansion of memcmp/strncmp. Add
#include "profile-count.h" and "predict.h" for types and functions
needed to work with REG_BR_PROB notes.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000-string.c

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-05 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #7 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Feb  5 16:30:45 2019
New Revision: 268546

URL: https://gcc.gnu.org/viewcvs?rev=268546&root=gcc&view=rev
Log:
2019-02-05  Aaron Sawdey  

PR target/89112
* config/rs6000/rs6000.md (tf_): Generate a local label
for the long branch case.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.md

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-01 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #5 from acsawdey at gcc dot gnu.org ---
This patch fixes the issue on trunk:

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 268403)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -12639,8 +12639,8 @@
   else
 {
   static char seq[96];
-  char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
-  sprintf(seq, " $+12\;%s;b %%l0", bcs);
+  char *bcs = output_cbranch (operands[3], ".L%=", 1, insn);
+  sprintf(seq, " .L%%=\;%s\;b %%l0\;.L%%=:", bcs);
   return seq;
 }
 }

I'm testing now, I will get this posted. Once approved for backport I'll apply
the same thing to gcc-8-branch for inclusion in the next 8 release.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-01 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #4 from acsawdey at gcc dot gnu.org ---
Well I can't blame this one on the linker or optimization. The splitting for
the case where the branch destination is too far is wrong in tf_:

  static char seq[96];
  char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
  sprintf(seq, " $+12\;%s;b %%l0", bcs);
  return seq;

This is wrong in both gcc 8 and 9. I'll get this fixed right away.

The longer term question is how do I convince gcc to keep the code for a memcmp
expansion together? I think this is happening because it thinks some of the
code is cold and is throwing it at the end of the function.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-02-01 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #3 from acsawdey at gcc dot gnu.org ---
It appears that gcc decided to split the bdnzt generated by the memcmp
expansion because the destination was out of range, and produced this:

bdz $+12
beq 0,$+8
b $+8;b .L939
bne 0,.L937 ; --> to setb code

So after the second iteration the bdz should branch to the bne which branches
to a setb if there was a difference or falls through and does an overlapping
compare to get the last 4 bytes of the 36 being compared.

But the disassembly when I look at things in gdb has an extra branch in there
which messes things up:

   0x10008b90 : bdz 0x10008b9c 
   0x10008b94 : beq 0x10008b9c 
   0x10008b98 : b   0x10008ba0 
   0x10008b9c : b   0x10008b8c
   0x10008ba0 : bne 0x10008bac

So now the bdz branches to a branch to b8c which is back to the top of the loop
to compare another 16 bytes which is of course wrong.

It's possible this all happened because I didn't generate labels in the
splitter, so multiple conditional branches had to be split because they were
out of range.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-01-30 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

--- Comment #2 from acsawdey at gcc dot gnu.org ---
I'm seeing this on both gcc-8-branch and trunk, but only with -mcpu=power9.
I'll figure out what happened here and get it fixed in trunk then back ported
to 8.

[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion

2019-01-30 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-01-30
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/88027] PowerPC generates slightly weird code for memset

2019-01-04 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from acsawdey at gcc dot gnu.org ---
Backport to 8 tested ok and is now checked in as 267580.

[Bug target/88027] PowerPC generates slightly weird code for memset

2019-01-03 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-01-03
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #5 from acsawdey at gcc dot gnu.org ---
This is fixed in trunk but I should backport to 8 now too.

[Bug target/88027] PowerPC generates slightly weird code for memset

2018-11-14 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027

--- Comment #3 from acsawdey at gcc dot gnu.org ---
This appears to have to do with alignment. In this test case,
expand_block_clear() sees alignment of only 8 bits for the pointer p. If you
declare a local struct st and pass that to __builtin_memset, it sees alignment
of 128 bits and generates 4 stxv or stvx.

There is a bug here though:

  for (offset = 0; bytes > 0; offset += clear_bytes, bytes -= clear_bytes)
    {
      machine_mode mode = BLKmode;
      rtx dest;

      if (TARGET_ALTIVEC
          && ((bytes >= 16 && align >= 128)
              || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))

The intention here was to only do unaligned VSX if there were at least 32 bytes
to clear. However because bytes is decremented, what this actually does is to
always do the last 16 bytes using std if it is unaligned. This doesn't make a
lot of sense and would be an easy fix.
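
A simplified model of that loop shows the effect. This is a sketch only: it
keeps just the quoted condition, assumes every non-vector step is an 8-byte
store, and leaves out TARGET_ALTIVEC and the other cases in the real
expand_block_clear. The names store_counts and model_clear are invented.

```c
/* Simplified model of the loop condition quoted above; it only counts
   store widths and is not the real expand_block_clear.  */
struct store_counts { int vec16; int gpr8; };

static struct store_counts model_clear (int bytes, int align_bits,
                                        int efficient_unaligned_vsx)
{
  struct store_counts c = { 0, 0 };
  int clear_bytes;

  for (; bytes > 0; bytes -= clear_bytes)
    {
      if ((bytes >= 16 && align_bits >= 128)
          || (bytes >= 32 && efficient_unaligned_vsx))
        {
          clear_bytes = 16;   /* 16-byte vector store */
          c.vec16++;
        }
      else if (bytes >= 8)
        {
          clear_bytes = 8;    /* 8-byte GPR store */
          c.gpr8++;
        }
      else
        clear_bytes = bytes;  /* Tail smaller than 8 bytes, not counted.  */
    }
  return c;
}
```

For 64 bytes with 8-bit alignment and efficient unaligned VSX, the model does
three vector stores and then two 8-byte stores for the last 16 bytes. With
128-bit alignment it does four vector stores.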

[Bug target/88027] PowerPC generates slightly weird code for memset

2018-11-14 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 CC||acsawdey at gcc dot gnu.org

--- Comment #2 from acsawdey at gcc dot gnu.org ---
What can I say? expand_block_clear() steps through the block to be cleared,
using smaller writes at the end if necessary. The rtx is generated for the
write by:

  dest = adjust_address (orig_dest, mode, offset);

  emit_move_insn (dest, CONST0_RTX (mode));

My guess is scheduling moved the gpr stores up.

[Bug target/87474] ICE in extract_insn, at recog.c:2305

2018-10-02 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87474

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Fixed on trunk.

[Bug target/87474] ICE in extract_insn, at recog.c:2305

2018-10-02 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87474

--- Comment #4 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Oct  2 17:31:53 2018
New Revision: 264799

URL: https://gcc.gnu.org/viewcvs?rev=264799&root=gcc&view=rev
Log:
2018-10-02  Aaron Sawdey  

PR target/87474
* config/rs6000/rs6000-string.c (expand_strn_compare): Check that both
P8_VECTOR and VSX are enabled.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000-string.c

[Bug target/87474] ICE in extract_insn, at recog.c:2305

2018-10-01 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87474

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org

--- Comment #3 from acsawdey at gcc dot gnu.org ---
This looks like I screwed up the conditions, clearly it shouldn't be trying to
generate the vector/vsx strncmp expansion with -mno-power8-vector.

[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1

2018-06-26 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from acsawdey at gcc dot gnu.org ---
Fix committed to trunk and gcc-8-branch.

[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1

2018-06-26 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222

--- Comment #6 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Jun 26 16:43:38 2018
New Revision: 262157

URL: https://gcc.gnu.org/viewcvs?rev=262157&root=gcc&view=rev
Log:
2018-06-26  Aaron Sawdey  

Backport from trunk
2018-06-22  Aaron Sawdey  

PR target/86222
* config/rs6000/rs6000-string.c (expand_strn_compare): Handle -m32
correctly.



Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/rs6000/rs6000-string.c

[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1

2018-06-22 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Fri Jun 22 15:37:36 2018
New Revision: 261906

URL: https://gcc.gnu.org/viewcvs?rev=261906&root=gcc&view=rev
Log:
Forgot PR target/86222 in ChangeLog

Modified:
trunk/gcc/ChangeLog

[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1

2018-06-21 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222

--- Comment #4 from acsawdey at gcc dot gnu.org ---
Well when compiling this with -m32 -mcpu=power[6789] I get this for the rtx of
the length argument:

(const_int -2147483648 [0xffffffff80000000])

So when I am doing UINTVAL (bytes_rtx) I get 0xffffffff80000000 and things go
awry.

In the tree optimized dump I am seeing this, as Martin did:

  _2 = strncmpD.898 (&aD.2760, &bD.2761, 2147483648); [tail call]

So somewhere in between something appears to be gratuitously sign extending
this.
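
The arithmetic can be reproduced outside the compiler. A minimal sketch,
assuming a 64-bit host integer and the usual two's complement conversion from
uint32_t to int32_t (the function names are invented here):

```c
#include <stdint.h>

/* Treat the 32-bit length as signed first, then widen: bit 31 is
   copied into the upper 32 bits.  */
static uint64_t as_unsigned_after_sign_extend (uint32_t len32)
{
  int64_t widened = (int64_t) (int32_t) len32;
  return (uint64_t) widened;
}

/* Widen the 32-bit length as unsigned: the upper 32 bits stay zero.  */
static uint64_t as_unsigned_after_zero_extend (uint32_t len32)
{
  return (uint64_t) len32;
}
```

For 2147483648 the first gives 0xffffffff80000000 and the second gives
0x80000000, which is the difference between what UINTVAL sees and what the
source asked for.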

[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1

2018-06-21 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222

--- Comment #3 from acsawdey at gcc dot gnu.org ---
OK, so this requires -m32 and also -mcpu=power6 or higher. I have reproduced it
so should have a fix shortly.

[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1

2018-06-21 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-23 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

--- Comment #19 from acsawdey at gcc dot gnu.org ---
Backported to gcc 7 and 6. Closing again.

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-23 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

--- Comment #18 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Apr 24 00:19:43 2018
New Revision: 259590

URL: https://gcc.gnu.org/viewcvs?rev=259590&root=gcc&view=rev
Log:
2018-04-23  Aaron Sawdey  

Backport from mainline
2018-04-16  Aaron Sawdey  

PR target/83660
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Mark
vec_extract expression as having side effects to make sure it gets
a cleanup point.

2018-04-23  Aaron Sawdey  

Backport from mainline
2018-04-16  Aaron Sawdey  

PR target/83660
* gcc.target/powerpc/pr83660.C: New test.



Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr83660.C
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/rs6000/rs6000-c.c
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-23 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

--- Comment #17 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Apr 24 00:14:21 2018
New Revision: 259586

URL: https://gcc.gnu.org/viewcvs?rev=259586&root=gcc&view=rev
Log:

2018-04-23  Aaron Sawdey  

Backport from mainline
2018-04-16  Aaron Sawdey  

PR target/83660
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Mark
vec_extract expression as having side effects to make sure it gets
a cleanup point.

2018-04-23  Aaron Sawdey  

Backport from mainline
2018-04-16  Aaron Sawdey  

PR target/83660
* gcc.target/powerpc/pr83660.C: New test.


Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr83660.C
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/rs6000/rs6000-c.c
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug target/85436] [7 Regression] ICE compiling go code with -mcpu=power9

2018-04-17 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85436

--- Comment #1 from acsawdey at gcc dot gnu.org ---
Created attachment 43966
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43966&action=edit
shorter reduced test case

I've further reduced the test case and now it's only 38 lines, so it should be
easier to work with.

[Bug target/85436] New: [7 Regression] ICE compiling go code with -mcpu=power9

2018-04-17 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85436

Bug ID: 85436
   Summary: [7 Regression] ICE compiling go code with -mcpu=power9
   Product: gcc
   Version: 7.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
CC: bergner at gcc dot gnu.org, segher at gcc dot gnu.org,
wschmidt at gcc dot gnu.org
  Target Milestone: ---
Target: powerpc64le-linux-gnu

Created attachment 43964
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43964&action=edit
reduced test case

To reproduce:

Build gcc-7-branch using:

configure --disable-bootstrap --enable-languages=c,c++,go
--with-long-double-128 --enable-secureplt --disable-multilib --without-ppl
--without-cloog --without-libelf

gcc/gccgo -Bgcc -O3 -Lpowerpc64le-linux-gnu/libgo -S bug_reduced.go
-mcpu=power9

This affects 259009 through the head of the gcc-7-branch.

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-16 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #16 from acsawdey at gcc dot gnu.org ---
Possibly need backports to both 7 and 6.

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-16 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from acsawdey at gcc dot gnu.org ---
Fixed in 259403.

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-16 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

--- Comment #14 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Mon Apr 16 14:50:06 2018
New Revision: 259403

URL: https://gcc.gnu.org/viewcvs?rev=259403&root=gcc&view=rev
Log:
2018-04-16  Aaron Sawdey  

PR target/83660
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Mark
vec_extract expression as having side effects to make sure it gets
a cleanup point.

2018-04-16  Aaron Sawdey  

PR target/83660
* gcc.target/powerpc/pr83660.C: New test.



Added:
trunk/gcc/testsuite/gcc.target/powerpc/pr83660.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000-c.c
trunk/gcc/testsuite/ChangeLog

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-13 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

--- Comment #12 from acsawdey at gcc dot gnu.org ---
This function is called from cp/semantics.c maybe_cleanup_point_expr()

tree
fold_build_cleanup_point_expr (tree type, tree expr)
{
  /* If the expression does not have side effects then we don't have to wrap
     it with a cleanup point expression.  */
  if (!TREE_SIDE_EFFECTS (expr))
    return expr;

In the vec_extract case it bails out due to no side effects and does not put in
the cleanup point.

So in fact a more minimal version of Jakub's patch also works. If you mark that
this has side effects, then the cleanup point is added for us by the existing
code:

Index: config/rs6000/rs6000-c.c
===
--- config/rs6000/rs6000-c.c(revision 259353)
+++ config/rs6000/rs6000-c.c(working copy)
@@ -6704,6 +6704,8 @@
   stmt = convert (innerptrtype, stmt);
   stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1);
   stmt = build_indirect_ref (loc, stmt, RO_NULL);
+  if (c_dialect_cxx ())
+   TREE_SIDE_EFFECTS (stmt) = 1;

   return stmt;
 }

Any comments on whether this is the right way to fix this? I think the
vec_insert case does not need to be changed because the MODIFY_EXPR used there
will mark that there are side effects for us.
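
As an illustration of the early return quoted above, here is a minimal toy
model in C. It is not GCC code: struct toy_expr, its fields, and
toy_build_cleanup_point are hypothetical stand-ins for the tree node,
TREE_SIDE_EFFECTS, and fold_build_cleanup_point_expr. It only shows why setting
the side-effects flag is enough for the existing code to add the wrapper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GCC tree node; not the real data structure.  */
struct toy_expr
{
  bool side_effects;          /* models TREE_SIDE_EFFECTS (expr) */
  bool is_cleanup_point;      /* true only for the wrapper node */
  struct toy_expr *operand;   /* expression wrapped by a cleanup point */
};

/* Models the quoted logic: an expression without side effects is returned
   unchanged, otherwise it is wrapped.  WRAPPER is caller-provided storage
   for the wrapper node (an assumption made to avoid allocation).  */
static struct toy_expr *
toy_build_cleanup_point (struct toy_expr *expr, struct toy_expr *wrapper)
{
  assert (expr != NULL && wrapper != NULL);

  if (!expr->side_effects)
    return expr;

  wrapper->side_effects = true;
  wrapper->is_cleanup_point = true;
  wrapper->operand = expr;
  return wrapper;
}
```

In this model an expression with side_effects left false comes back unwrapped,
which corresponds to the vec_extract case before the patch; once the flag is
set, the same call returns the wrapper.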

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-12 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |acsawdey at gcc dot gnu.org

[Bug target/83660] ICE with vec_extract inside expression statement

2018-04-12 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 CC||acsawdey at gcc dot gnu.org

--- Comment #11 from acsawdey at gcc dot gnu.org ---
Looking at the dump of an analogous test case for vec_insert:

#include <altivec.h>

typedef __vector unsigned int  uvec32_t  __attribute__((__aligned__(16)));

uvec32_t get_word(uvec32_t v)
{ 
  return({const unsigned _B1 = 32;
  vec_insert(10, (uvec32_t)v, 2);});
}

It seems that we do get an additional cleanup_point like you are proposing to
add for vec_extract, which is maybe why that does not get into trouble:

;; Function __vector(4) unsigned int get_word(__vector(4) unsigned int) (null)
;; enabled by -tree-original


{
  < = {
const unsigned int _B1 = 32;

<>;
<> + 8) = 10;, D.3231>>;
  }>>;
}

I've gotten as far as seeing that something is calling
fold_build_cleanup_point_expr an additional time compared to the vec_extract
example.

[Bug target/85321] Missing documentation and option misc for ppc64le

2018-04-11 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85321

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 CC||acsawdey at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #6 from acsawdey at gcc dot gnu.org ---
All fixed in 259324.

[Bug target/85321] Missing documentation and option misc for ppc64le

2018-04-11 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85321

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Wed Apr 11 15:25:42 2018
New Revision: 259324

URL: https://gcc.gnu.org/viewcvs?rev=259324&root=gcc&view=rev
Log:
2018-04-11  Aaron Sawdey  

PR target/85321
* doc/invoke.texi (RS/6000 and PowerPC Options): Document options
-mcall- and -mtraceback=. Remove options -mabi=spe and -mabi=no-spe
from PowerPC section.
* config/rs6000/sysv4.opt (mcall-): Improve help text.
* config/rs6000/rs6000.opt (mblock-compare-inline-limit=): Trim
help text that is too long.
* config/rs6000/rs6000.opt (mblock-compare-inline-loop-limit=): Trim
help text that is too long.
* config/rs6000/rs6000.opt (mstring-compare-inline-limit=): Trim
help text that is too long.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.opt
trunk/gcc/config/rs6000/sysv4.opt
trunk/gcc/doc/invoke.texi

[Bug target/85321] Missing documentation and option misc for ppc64le

2018-04-10 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85321

--- Comment #4 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Apr 10 22:05:41 2018
New Revision: 259302

URL: https://gcc.gnu.org/viewcvs?rev=259302&root=gcc&view=rev
Log:
2018-04-10  Aaron Sawdey  

PR target/85321
* doc/invoke.texi (RS/6000 and PowerPC Options): Document options
-mblock-compare-inline-limit, -mblock-compare-inline-loop-limit,
and -mstring-compare-inline-limit.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/doc/invoke.texi

[Bug target/83822] trunk/gcc/config/rs6000/rs6000-string.c:970]: (style) Redundant condition

2018-04-02 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83822

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Fixed in 258975.

[Bug target/83822] trunk/gcc/config/rs6000/rs6000-string.c:970]: (style) Redundant condition

2018-03-30 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83822

--- Comment #4 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Fri Mar 30 12:17:31 2018
New Revision: 258975

URL: https://gcc.gnu.org/viewcvs?rev=258975&root=gcc&view=rev
Log:

2018-03-30  Aaron Sawdey  

PR target/83822
* config/rs6000/rs6000-string.c (expand_compare_loop): Fix redundant
condition.
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Fix redundant
condition.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000-c.c
trunk/gcc/config/rs6000/rs6000-string.c

[Bug target/83707] g++.dg/eh/simd-3.C fails on power7 -m32

2018-03-29 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83707

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from acsawdey at gcc dot gnu.org ---
Apparently fixed so closing.

[Bug target/83707] g++.dg/eh/simd-3.C fails on power7 -m32

2018-03-29 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83707

--- Comment #5 from acsawdey at gcc dot gnu.org ---
I can also confirm with trunk 258957 I do not see this fail with -m32
-mcpu=power7.

[Bug target/84743] default widths for parallel reassociation now hurt rather than help

2018-03-13 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from acsawdey at gcc dot gnu.org ---
Updated parallel reassociation widths, which perform better than disabling
parallel reassociation, are now checked in, so this can be closed.

[Bug target/84743] default widths for parallel reassociation now hurt rather than help

2018-03-13 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743

--- Comment #5 from acsawdey at gcc dot gnu.org ---
Author: acsawdey
Date: Tue Mar 13 16:28:09 2018
New Revision: 258495

URL: https://gcc.gnu.org/viewcvs?rev=258495&root=gcc&view=rev
Log:
2018-03-13  Aaron Sawdey  

PR target/84743
* config/rs6000/rs6000.c (rs6000_reassociation_width): Disable parallel
reassociation for int modes.



Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c

[Bug target/84743] default widths for parallel reassociation now hurt rather than help

2018-03-09 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743

acsawdey at gcc dot gnu.org changed:

   What|Removed |Added

   Priority|P1  |P3

--- Comment #4 from acsawdey at gcc dot gnu.org ---
This turned out to be a system with a bad clock. The reassociation widths still
need to be checked and corrected but the performance differences are mostly in
the 0.5% range with just one that is about 2% (xz).

[Bug target/84743] default widths for parallel reassociation now hurt rather than help

2018-03-07 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743

--- Comment #3 from acsawdey at gcc dot gnu.org ---
Yes I'm digging into this now and omnetpp is at the top of the list. I can see
if there is a difference between cpu2006 and 2017 as well. For gcc7 I used 2006
to determine the widths.

[Bug target/84743] New: default widths for parallel reassociation now hurt rather than help

2018-03-06 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743

Bug ID: 84743
   Summary: default widths for parallel reassociation now hurt
rather than help
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: acsawdey at gcc dot gnu.org
  Reporter: acsawdey at gcc dot gnu.org
  Target Milestone: ---
Target: powerpc64le power8

The width settings in rs6000_reassociation_width() were chosen to help
performance for SPEC CPU in gcc 7.

Testing on power8 shows that in gcc 8 (258101) there are now major degradations
in CPU2017 int with the default reassociation widths as compared to using
--param tree-reassoc-width=1 to disable reassociation.

Benchmark         Change
500.perlbench_r   -5.98%
502.gcc_r         -1.16%
505.mcf_r         -12.44%
520.omnetpp_r     -39.00%
523.xalancbmk_r   -9.78%
525.x264_r        -1.76%
531.deepsjeng_r   -4.23%
548.exchange2_r   -0.66%
557.xz_r          -2.04%
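
For readers unfamiliar with the term, the following is a small sketch in C of
what a reassociation width means for a chain of integer adds. The function
names are hypothetical and this is not taken from rs6000_reassociation_width();
it only contrasts a serial dependency chain (width 1) with a rebalanced form
that exposes two independent adds (width 2). Unsigned arithmetic is used so
that both orderings are well defined and give the same result even on
wraparound. An optimizing compiler may well emit the same code for both
functions; the point is the shape of the dependencies.

```c
#include <assert.h>
#include <limits.h>

/* Serial chain: every add depends on the previous one (width 1).  */
static unsigned
sum_serial (unsigned a, unsigned b, unsigned c, unsigned d)
{
  return ((a + b) + c) + d;
}

/* Rebalanced: (a + b) and (c + d) do not depend on each other, so two
   adds could issue in parallel before the final add (width 2).  */
static unsigned
sum_width2 (unsigned a, unsigned b, unsigned c, unsigned d)
{
  unsigned left = a + b;
  unsigned right = c + d;
  return left + right;
}

/* Unsigned addition is associative modulo 2^N, so both forms agree,
   including when an intermediate sum wraps around.  */
static void
check_same_result (void)
{
  assert (sum_serial (1u, 2u, 3u, 4u) == sum_width2 (1u, 2u, 3u, 4u));
  assert (sum_serial (UINT_MAX, 1u, 2u, 3u)
          == sum_width2 (UINT_MAX, 1u, 2u, 3u));
}
```

Whether the rebalanced form is faster depends on the target and on register
pressure, which is what the measurements above are probing.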

[Bug middle-end/84433] gcc 7 and before miscompile loop and remove exit due to incorrect range calculation

2018-02-19 Thread acsawdey at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84433

--- Comment #8 from acsawdey at gcc dot gnu.org ---
It looks like both gcc 7 and 8 assume that the statement 

  ptrA->sA[ptrB->int1].zt = parm1;

will only be executed 14+1 times because of the declaration sA[15].

However gcc 7 assumes the whole loop will only execute that number of times:

Statement ptrA_14(D)->sA[ptrB__int1_lsm.11_22].zt = _34;
 is executed at most 14 (bounded by 14) + 1 times in loop 1.
Analyzing # of iterations of loop 1
  exit condition [15, + , 4294967295] != 0
  bounds on difference of bases: -15 ... -15
  result:
# of iterations 15, bounded by 15
Loop 1 iterates 15 times.
Loop 1 iterates at most 14 times.
Loop 1 likely iterates at most 14 times.
Analyzing # of iterations of loop 1
  exit condition [15, + , 4294967295] != 0
  bounds on difference of bases: -15 ... -15
  result:
# of iterations 15, bounded by 15
Removed pointless exit: if (ivtmp_24 != 0)

whereas gcc 8 does not:

Statement ptrA_13(D)->sA[ptrB__int1_lsm.5_22].zt = _20;
 is executed at most 14 (bounded by 14) + 1 times in loop 1.
Analyzing # of iterations of loop 1
  exit condition [15, + , 4294967295] != 0
  bounds on difference of bases: -15 ... -15
  result:
# of iterations 15, bounded by 15
Loop 1 iterates 15 times.
Loop 1 iterates at most 15 times.
Loop 1 likely iterates at most 15 times.

Neither gcc 7 nor gcc 8 produces any warnings for the revised test case with -Wall.
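
To make the "14 + 1" bound concrete, here is a hedged sketch in C of the kind
of loop being analyzed. It is not the test case from this bug: the struct
layouts, the function name fill, and the loop shape are assumptions, and only
the member names sA, zt, and int1 and the array size 15 come from the text
above. Because sA has 15 elements, the store can execute at most 15 times
without indexing out of bounds, and that is the fact the compiler uses to
bound the statement.

```c
#include <assert.h>

#define N_ELEMS 15

/* Hypothetical layouts; only the member names appear in the dumps above.  */
struct elem { int zt; };
struct a_t { struct elem sA[N_ELEMS]; };
struct b_t { int int1; };

/* Stores PARM1 into successive elements while an unsigned counter runs
   from 15 down to 0, and returns the number of stores performed.  */
static unsigned
fill (struct a_t *ptrA, struct b_t *ptrB, int parm1)
{
  unsigned stores = 0;

  /* Assumed precondition: starting at index 0 keeps all 15 stores in
     bounds (indices 0 through 14).  */
  assert (ptrB->int1 == 0);

  for (unsigned ivtmp = N_ELEMS; ivtmp != 0; ivtmp--)
    {
      ptrA->sA[ptrB->int1].zt = parm1;
      ptrB->int1++;
      stores++;
    }
  return stores;
}
```

In this sketch all 15 iterations are valid, so a compiler that capped the loop
at fewer iterations, or removed its exit test on that basis, would change its
behavior.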
