[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-29 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pan2.li at intel dot com
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Keywords||patch
   Last reconfirmed||2024-07-29

--- Comment #9 from Thomas Schwinge  ---
(In reply to Li Pan from comment #7)
> confirm with you all related failures are covered.

Yes, the testing state is restored to what it was before, thanks!


Before 'git push', please note Richard Sandiford's comment:
.

[Bug fortran/20585] [meta-bug] Fortran 2003 support

2024-07-29 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20585
Bug 20585 depends on bug 105361, which changed state.

Bug 105361 Summary: Incorrect end-of-file condition for derived-type I/O
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105361

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

[Bug libfortran/105361] Incorrect end-of-file condition for derived-type I/O

2024-07-29 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105361

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org
 Resolution|FIXED   |---
   Last reconfirmed||2024-07-29
 Ever confirmed|0   |1
 Status|RESOLVED|REOPENED

--- Comment #9 from Thomas Schwinge  ---
In some of my test runs (I have not yet been able to deduce any pattern), I'm
seeing this new test case FAIL its execution test:

At line 33 of file [...]/source-gcc/gcc/testsuite/gfortran.dg/pr105361.f90
(unit = 10, file = 'fort.10')
Fortran runtime error: Bad real number in item 2 of list input

Error termination. Backtrace:
#0  0xf7f114e1 in read_real
at [...]/source-gcc/libgfortran/io/list_read.c:2080
#1  0xf7f12a87 in list_formatted_read_scalar
at [...]/source-gcc/libgfortran/io/list_read.c:2236
#2  0xf7f182bd in wrap_scalar_transfer
at [...]/source-gcc/libgfortran/io/transfer.c:2591
#3  0x8049486 in ???
#4  0x8049695 in ???
#5  0xf7996ed4 in ???
#6  0x8049155 in ???
#7  0x in ???

There also appear to be similar "twinkling" PASS <-> FAIL reports in
 posts.

[Bug target/116104] [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104

--- Comment #2 from Thomas Schwinge  ---
Created attachment 58772
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58772&action=edit
'2629-1.i'

Jeff, you're of course very welcome to have a look, but note that I didn't
assign you this PR -- as I said:

(In reply to myself from comment #0)
> I can't tell if this is an issue in 'RTL pass: ext_dce' or in the GCN back 
> end.

Andrew (in CC) knows the latter one best.

That said, I'm attaching a simple reproducer, per
'gcc.c-torture/compile/2629-1.c'.  With a '--target=amdgcn-amdhsa' build of
'make all-gcc', run:

$ gcc/cc1 -fpreprocessed 2629-1.i -quiet -march=gfx908 -O2 -w -version -o 2629-1.s

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103

--- Comment #5 from Thomas Schwinge  ---
(In reply to Li Pan from comment #3)
> best practice of cross
> compile gfx908 in x86 linux?

If you only need the 'cc1' (and no assembler, linker, libc), the following
should do:

$ [...]/configure --target=amdgcn-amdhsa --enable-languages=c
$ make -j12 all-gcc
$ gcc/cc1 [...]

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103

--- Comment #4 from Thomas Schwinge  ---
(In reply to Richard Biener from comment #2)
>   if (VECTOR_BOOLEAN_TYPE_P (type)
>   && SCALAR_INT_MODE_P (TYPE_MODE (type)))
> return true;

>   && TYPE_PRECISION (TREE_TYPE (type)) == 1

> Thomas, does that resolve the issue?

Thanks, it does: restores the original '*.s' exactly.  (Assuming that's the
desired outcome, Andrew?)

[Bug target/116103] [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103

--- Comment #1 from Thomas Schwinge  ---
Similarly for '-march=gfx908':

(In reply to myself from comment #0)
> [-PASS:-]{+FAIL:+} gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not 
> ivopts "zero if "

..., and:

> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times 
> \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80

> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times 
> \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80

> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umax_1.c scan-assembler-times 
> \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56

> [-PASS:-]{+FAIL:+} gcc.target/gcn/cond_umin_1.c scan-assembler-times 
> \\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56

..., but not the following ones:

> [-PASS:-]{+FAIL:+} gcc.target/gcn/smax_1.c scan-assembler-times 
> \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80

> [-PASS:-]{+FAIL:+} gcc.target/gcn/smin_1.c scan-assembler-times 
> \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80

> [-PASS:-]{+FAIL:+} gcc.target/gcn/umax_1.c scan-assembler-times 
> \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56

> [-PASS:-]{+FAIL:+} gcc.target/gcn/umin_1.c scan-assembler-times 
> \\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 56

..., as these already did FAIL before (for '-march=gfx1030', '-march=gfx1100').
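
The status flips quoted above use the word-diff notation '[-old-]{+new+}'.  As
a minimal sketch (not part of the GCC testsuite tooling; the function and
pattern names are made up for illustration), such lines could be pulled out of
a test-summary comparison as follows, assuming each result has first been
rejoined onto a single line:

```python
import re

# Status change, e.g. "[-PASS:-]{+FAIL:+} some/test.c scan-assembler-times ..."
CHANGE_RE = re.compile(
    r"^\[-(?P<old>[A-Z]+):-\]\{\+(?P<new>[A-Z]+):\+\}\s*(?P<test>.*)$"
)
# Newly appearing result, e.g. "{+FAIL: some/test.c (internal compiler error)+}"
ADDED_RE = re.compile(r"^\{\+(?P<new>[A-Z]+):\s*(?P<test>.*?)\+\}$")


def status_changes(lines):
    """Yield (old_status, new_status, test_name) for each changed result.

    old_status is None for results that only exist in the new run.
    Unchanged lines (plain "PASS: ..." and so on) are skipped.
    """
    for line in lines:
        line = line.strip()
        match = CHANGE_RE.match(line)
        if match:
            yield match.group("old"), match.group("new"), match.group("test")
            continue
        match = ADDED_RE.match(line)
        if match:
            yield None, match.group("new"), match.group("test")
```

Filtering the result for 'new == "FAIL"' then gives the list of candidate
regressions; whether a given FAIL is really new still has to be checked against
other configurations, as with the 'gfx1030'/'gfx1100' cases above.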

[Bug ipa/116055] [14/15 Regression] ICE from gcc.c-torture/unsorted/dump-noaddr.c after "Fix modref's iteraction with store merging"

2024-07-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116055

--- Comment #5 from Thomas Schwinge  ---
(In reply to Jan Hubicka from comment #4)
> * ipa-modref.cc (analyze_function): Do not ICE when flags regress.

> Does it help?

Yes, confirming this does resolve the issue for GCN.

[Bug target/116104] New: [15 Regression] GCN vs. "[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce"

2024-07-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116104

Bug ID: 116104
   Summary: [15 Regression] GCN vs. "[rtl-optimization/116037]
Explicitly track if a destination was skipped in
ext-dce"
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: build, ice-checking
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, law at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

As of recent PR116037 commit r15-2275-g679086172b84be18c55fdbb9cda7e97806e7c083
"[rtl-optimization/116037] Explicitly track if a destination was skipped in
ext-dce", '--target=amdgcn-amdhsa' ICEs during build of GCC target libraries. 
I can't tell if this is an issue in 'RTL pass: ext_dce' or in the GCN back end.
(This isn't fixed by the subsequent commit
r15-2321-g34fb0feca71f763b2fbe832548749666d34a4a76 "[PR
rtl-optimization/116039] Fix life computation for promoted subregs".)

during RTL pass: ext_dce
[...]/source-gcc/libgcc/libgcc2.c: In function ‘__negti2’:
[...]/source-gcc/libgcc/libgcc2.c:71:1: internal compiler error: RTL check:
expected code 'const_int', have 'const_vector' in carry_backpropagate, at
ext-dce.cc:497
   71 | }
  | ^
0x20f9815 internal_error(char const*, ...)
[...]/source-gcc/gcc/diagnostic-global-context.cc:491
0x86460a rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
[...]/source-gcc/gcc/rtl.cc:770
0xa52e15 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:497
0x1dfbb46 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:659
0x1dfbb46 ext_dce_process_uses
[...]/source-gcc/gcc/ext-dce.cc:679
0x1dfc825 ext_dce_process_bb
[...]/source-gcc/gcc/ext-dce.cc:836
0x1dfc825 ext_dce_rd_transfer_n
[...]/source-gcc/gcc/ext-dce.cc:978
0xcb6f5e df_worklist_propagate_backward
[...]/source-gcc/gcc/df-core.cc:970
0xcb6f5e df_worklist_dataflow_doublequeue
[...]/source-gcc/gcc/df-core.cc:1054
0xcb6f5e df_worklist_dataflow(dataflow*, bitmap_head*, int*, int)
[...]/source-gcc/gcc/df-core.cc:1132
0x1dfcd67 ext_dce_execute()
[...]/source-gcc/gcc/ext-dce.cc:1007
0x1dfd0fc execute
[...]/source-gcc/gcc/ext-dce.cc:1047

during RTL pass: ext_dce
[...]/source-gcc/libgcc/libgcc2.c: In function ‘__popcountti2’:
[...]/source-gcc/libgcc/libgcc2.c:870:1: internal compiler error: RTL
check: expected code 'const_int', have 'const_vector' in carry_backpropagate,
at ext-dce.cc:505
  870 | }
  | ^
0x20f9815 internal_error(char const*, ...)
[...]/source-gcc/gcc/diagnostic-global-context.cc:491
0x86460a rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
[...]/source-gcc/gcc/rtl.cc:770
0xa52d63 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:505
0x1dfbb46 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:659
[...]

during RTL pass: ext_dce
[...]/source-gcc/libgcc/libgcc2.c: In function ‘__udivti3’:
[...]/source-gcc/libgcc/libgcc2.c:1301:1: internal compiler error: RTL
check: expected code 'const_int', have 'const_vector' in carry_backpropagate,
at ext-dce.cc:497
 1301 | }
  | ^
0x20f9815 internal_error(char const*, ...)
[...]/source-gcc/gcc/diagnostic-global-context.cc:491
0x86460a rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
[...]/source-gcc/gcc/rtl.cc:770
0xa52e15 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:497
0x1dfbb46 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:659
[...]

during RTL pass: ext_dce
[...]/source-gcc/libgcc/libgcc2.c: In function ‘__udivmodti4’:
[...]/source-gcc/libgcc/libgcc2.c:1205:1: internal compiler error: RTL
check: expected code 'const_int', have 'const_vector' in carry_backpropagate,
at ext-dce.cc:497
 1205 | }
  | ^
0x20f9815 internal_error(char const*, ...)
[...]/source-gcc/gcc/diagnostic-global-context.cc:491
0x86460a rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
[...]/source-gcc/gcc/rtl.cc:770
0xa52e15 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:497
0x1dfbb46 carry_backpropagate(unsigned long, rtx_code, rtx_def*)
[...]/source-gcc/gcc/ext-dce.cc:659
[...]


[Bug target/116103] New: [15 Regression] GCN vs. "Internal-fn: Only allow modes describe types for internal fn[PR115961]"

2024-07-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103

Bug ID: 116103
   Summary: [15 Regression] GCN vs. "Internal-fn: Only allow modes
describe types for internal fn[PR115961]"
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, pan2.li at intel dot com
  Target Milestone: ---
Target: GCN

With recent commit r15-2241-g905973410957891fec8a3e42eeefa4618780e0ce
"Internal-fn: Only allow modes describe types for internal fn[PR115961]", we've
got a few regressions for '--target=amdgcn-amdhsa' (tested '-march=gfx908'). 
From a quick glance, I can't tell if this is worse or just different code
generation.  (Andrew?)

PASS: gcc.dg/tree-ssa/loop-bound-2.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump ivopts "bounded by 254"
PASS: gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not ivopts "bounded by
255"
[-PASS:-]{+FAIL:+} gcc.dg/tree-ssa/loop-bound-2.c scan-tree-dump-not ivopts
"zero if "

Note that 'scan-tree-dump ivopts "bounded by 254"' already did FAIL before, but
the FAIL of 'scan-tree-dump-not ivopts "zero if "' is new:

--- G/loop-bound-2.c.188t.ivopts   2024-07-26 09:34:22.838958365 +0200
+++ B/loop-bound-2.c.188t.ivopts   2024-07-26 09:47:10.822525365 +0200
@@ -5,15 +5,22 @@
 ;; Loop 1
 ;;  header 3, latch 6
 ;;  depth 1, outer 0, finite_p
-;;  niter scev_not_known
+;;  niter (unsigned short) bnd.8_23 + 63 > 63 ? ((unsigned short) bnd.8_23
+ 65535) / 64 : 0
 ;;  upper_bound 3
 ;;  likely_upper_bound 3
 ;;  iterations by profile: 3.00 (unreliable) entry count:105119324
(estimated locally, freq 0.8900)
 ;;  nodes: 3 6
 Processing loop 1 at
source-gcc/gcc/testsuite/gcc.dg/tree-ssa/loop-bound-2.c:14
-  single exit 3 -> 8, exit condition if (next_mask_38 != { 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0 })
+  single exit 3 -> 8, exit condition if (ivtmp_35 > 64)


+Analyzing # of iterations of loop 1
+  exit condition 64 < [(unsigned short) bnd.8_23, + , 65472]
+  bounds on difference of bases: -64 ... 65471
+  result:
+zero if (unsigned short) bnd.8_23 + 63 <= 63
+# of iterations ((unsigned short) bnd.8_23 + 65535) / 64, bounded by
1023
+  number of iterations ((unsigned short) bnd.8_23 + 65535) / 64; zero if
(unsigned short) bnd.8_23 + 63 <= 63

[...]


And then, a number of regressions of 'scan-assembler-times
\\tv_cmp_gt_i32\\tvcc, [...]' and 'scan-assembler-times \\tv_cmpx_gt_i32\\tvcc,
[...]':

@@ -125843,7 +125901,7 @@ PASS: gcc.target/gcn/cond_smax_1.c
scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo,
PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not
\\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not
\\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-not smaxv64si3/0
[-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smax_1.c scan-assembler-times
\\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times
\\tv_cmp_gt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times
\\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
PASS: gcc.target/gcn/cond_smax_1.c scan-assembler-times smaxv64si3_exec 30
@@ -125854,7 +125912,7 @@ PASS: gcc.target/gcn/cond_smin_1.c
scan-assembler-not \\ts_cmpk_lg_u32\\tvcc_lo,
PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not
\\tv_cmpx_gt_i32\\tvcc, s[0-9]+, v[0-9]+
PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not
\\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-not sminv64si3/0
[-PASS:-]{+FAIL:+} gcc.target/gcn/cond_smin_1.c scan-assembler-times
\\tv_cmp_gt_i32\\tvcc, s[0-9]+, v[0-9]+ 80
PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times
\\tv_cmp_lt_i64\\tvcc, v[[0-9]+:[0-9]+], v[[0-9]+:[0-9]+] 10
PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times
\\tv_cmp_ne_u64\\ts\\[[0-9]+:[0-9]+\\], v\\[[0-9]+:[0-9]+\\], -1 10
PASS: gcc.target/gcn/cond_smin_1.c scan-assembler-times sminv64si3_exec 30
@@ -125864,7 +125922,7 @@ PASS: gcc.target/gcn/cond_umax_1.c (test for
excess errors)
PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not
\\ts_cmpk_lg_u32\\tvcc_lo, 0
PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not
\\tv_writelane_b32\\tv[0-9]+, vcc_??, 0
PASS: gcc.target/gcn/cond_umax_1.c scan-assembler-not umaxv64si3/0
[-PASS:-]{+FAIL:+} 

[Bug ipa/116055] [14/15 Regression] ICE from gcc.c-torture/unsorted/dump-noaddr.c after "Fix modref's iteraction with store merging"

2024-07-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116055

Thomas Schwinge  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-07-24
Summary|[14 regression] ICE from|[14/15 Regression] ICE from
   |gcc.c-torture/unsorted/dump |gcc.c-torture/unsorted/dump
   |-noaddr.c after |-noaddr.c after "Fix
   |r14-10495-g9ddd5f88e60972   |modref's iteraction with
   ||store merging"
 CC||tschwinge at gcc dot gnu.org
 Target|powerpc64le-linux-gnu   |powerpc64le-linux-gnu GCN
 Status|UNCONFIRMED |NEW

--- Comment #2 from Thomas Schwinge  ---
In addition to commit r14-10495-g9ddd5f88e60972147dff74b48658e2b12040d468 "Fix
modref's iteraction with store merging" mentioned before, there's also the
original commit r15-2205-g14074773350ffed7efdebbc553adf0f23b572e87 "Fix
modref's iteraction with store merging", where I see the same thing also for
'--target=amdgcn-amdhsa' (tested '-march=gfx908').

[Bug target/116044] [15 Regression] GCN vs. rtl-ssa: Avoid using a stale splay tree root [PR116009]

2024-07-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116044

--- Comment #1 from Thomas Schwinge  ---
Recent commit r15-2199-g34f33ea801563e2eabb348e8d3e9344a91abfd48 "rtl-ssa:
Avoid using a stale splay tree root [PR116009]" is causing one regression for
'--target=amdgcn-amdhsa' (tested '-march=gfx908'):

PASS: g++.dg/torture/pr81987.C   -O0  (test for excess errors)
PASS: g++.dg/torture/pr81987.C   -O1  (test for excess errors)
PASS: g++.dg/torture/pr81987.C   -O2  (test for excess errors)
PASS: g++.dg/torture/pr81987.C   -O3 -g  (test for excess errors)
{+FAIL: g++.dg/torture/pr81987.C   -Os  (internal compiler error: in
merge_clobber_groups, at rtl-ssa/accesses.cc:764)+}
[-PASS:-]{+FAIL:+} g++.dg/torture/pr81987.C   -Os  (test for excess errors)

during RTL pass: late_combine
[...]/source-gcc/gcc/testsuite/g++.dg/torture/pr81987.C: In function 'void
foo()':
[...]/source-gcc/gcc/testsuite/g++.dg/torture/pr81987.C:61:1: internal
compiler error: in merge_clobber_groups, at rtl-ssa/accesses.cc:764
0x2406235 internal_error(char const*, ...)
[...]/source-gcc/gcc/diagnostic-global-context.cc:491
0xb80382 fancy_abort(char const*, int, char const*)
[...]/source-gcc/gcc/diagnostic.cc:1725
0xb68dc1
rtl_ssa::function_info::merge_clobber_groups(rtl_ssa::clobber_info*,
rtl_ssa::clobber_info*, rtl_ssa::def_info*)
[...]/source-gcc/gcc/rtl-ssa/accesses.cc:764
0x226d49e rtl_ssa::function_info::remove_def(rtl_ssa::def_info*)
[...]/source-gcc/gcc/rtl-ssa/accesses.cc:1036
0x2274463
rtl_ssa::function_info::change_insns(array_slice)
[...]/source-gcc/gcc/rtl-ssa/changes.cc:852
0x222672a run
[...]/source-gcc/gcc/late-combine.cc:452
0x222672a combine_into_uses
[...]/source-gcc/gcc/late-combine.cc:683
0x2226c1d execute
[...]/source-gcc/gcc/late-combine.cc:711
0x2226c1d execute
[...]/source-gcc/gcc/late-combine.cc:760

[Bug target/116044] New: [15 Regression] GCN vs. rtl-ssa: Avoid using a stale splay tree root [PR116009]

2024-07-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116044

Bug ID: 116044
   Summary: [15 Regression] GCN vs. rtl-ssa: Avoid using a stale
splay tree root [PR116009]
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code, testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

[Bug tree-optimization/116000] gcc.dg/vect/vect-reduc-chain-dot-slp-1.c etc. FAIL

2024-07-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116000

Thomas Schwinge  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=114440
 CC||ams at gcc dot gnu.org,
   ||fxue at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org
 Target|sparc-sun-solaris2.11   |sparc-sun-solaris2.11 GCN

--- Comment #2 from Thomas Schwinge  ---
For '--target=amdgcn-amdhsa' (tested '-march=gfx908'):

FAIL: gcc.dg/vect/vect-reduc-chain-dot-slp-1.c scan-tree-dump-times vect
"vectorizing statement: \\S+ = DOT_PROD_EXPR" 16

gcc.dg/vect/vect-reduc-chain-dot-slp-1.c: pattern found 0 times

..., and:

FAIL: gcc.dg/vect/vect-reduc-chain-dot-slp-2.c scan-tree-dump-times vect
"vectorizing statement: \\S+ = DOT_PROD_EXPR" 5

gcc.dg/vect/vect-reduc-chain-dot-slp-2.c: pattern found 0 times

No 'DOT_PROD_EXPR' mentioned in 'vect-reduc-chain-dot-slp-{1,2}.c.180t.vect'.

Everything else PASSes for 'gcc.dg/vect/vect-reduc-chain-dot-slp-{1,2,3,4}.c'.

[Bug testsuite/115989] [15 regression] libgomp.oacc-fortran/privatized-ref-2.f90 fails after r15-2135-gc3aa339ea50f05

2024-07-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115989

Thomas Schwinge  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=108889
   Keywords||patch, testsuite-fail
   Last reconfirmed||2024-07-19
 Status|UNCONFIRMED |NEW
 Target|powerpc64le-linux-gnu   |
 CC||haochen.jiang at intel dot com,
   ||tschwinge at gcc dot gnu.org
   Host|powerpc64le-linux-gnu   |
  Build|powerpc64le-linux-gnu   |

--- Comment #1 from Thomas Schwinge  ---
Patch:
.

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934

Thomas Schwinge  changed:

   What|Removed |Added

 CC||sayle at gcc dot gnu.org

--- Comment #6 from Thomas Schwinge  ---
Tamar, Richard, thanks for having a look.

(In reply to Tamar Christina from comment #4)
> This one looks a bit like costing, [...]

I see.  So we (I) shall later re-visit this PR in the context of

"[nvptx PATCH] Implement rtx_costs target hook for nvptx backend", and, if
necessary, follow-up work:

> I don't however see an implementation of TARGET_ADDRESS_COST for the target.

[Bug target/115936] New: [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936

Bug ID: 115936
   Summary: [15 Regression] GCN vs. ivopts: replace
constant_multiple_of with
aff_combination_constant_multiple_p [PR114932]
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: ice-checking, ice-on-valid-code, testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

Recent commit r15-1809-g735edbf1e2479fa2323a2b4a9714fae1a0925f74 "ivopts:
replace constant_multiple_of with aff_combination_constant_multiple_p
[PR114932]" is causing one regression for '--target=amdgcn-amdhsa' (tested
'-march=gfx908', '-march=gfx1100'):

@@ -98531,8 +98547,9 @@ PASS: gcc.dg/torture/pr101173.c   -O0  (test for
excess errors)
PASS: gcc.dg/torture/pr101173.c   -O0  execution test
PASS: gcc.dg/torture/pr101173.c   -O1  (test for excess errors)
PASS: gcc.dg/torture/pr101173.c   -O1  execution test
{+FAIL: gcc.dg/torture/pr101173.c   -O2  (internal compiler error:
verify_gimple failed)+}
[-PASS:-]{+FAIL:+} gcc.dg/torture/pr101173.c   -O2  (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/torture/pr101173.c   -O2  [-execution
test-]{+compilation failed to produce executable+}
PASS: gcc.dg/torture/pr101173.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
PASS: gcc.dg/torture/pr101173.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
PASS: gcc.dg/torture/pr101173.c   -O3 -g  (test for excess errors)

[...]/source-gcc/gcc/testsuite/gcc.dg/torture/pr101173.c: In function
'main':
[...]/source-gcc/gcc/testsuite/gcc.dg/torture/pr101173.c:5:5: error:
invalid (pointer) operands 'plus_expr'
ivtmp.39_65 = ivtmp.39_59 + 0B;
during GIMPLE pass: ivopts
[...]/source-gcc/gcc/testsuite/gcc.dg/torture/pr101173.c:5:5: internal
compiler error: verify_gimple failed
0x20dcb22 internal_error(char const*, ...)
[...]/source-gcc/gcc/diagnostic-global-context.cc:491
0x11fe23e verify_gimple_in_cfg(function*, bool, bool)
[...]/source-gcc/gcc/tree-cfg.cc:5678
0x1092710 execute_function_todo
[...]/source-gcc/gcc/passes.cc:2089
0x1092c5b execute_todo
[...]/source-gcc/gcc/passes.cc:2143

[Bug target/115934] [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934

--- Comment #2 from Thomas Schwinge  ---
The simplest one: '--target=nvptx-none'.  :-)

[Bug target/115934] New: [15 Regression] nvptx vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115934

Bug ID: 115934
   Summary: [15 Regression] nvptx vs. ivopts: replace
constant_multiple_of with
aff_combination_constant_multiple_p [PR114932]
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: tnfchris at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

Recent commit r15-1809-g735edbf1e2479fa2323a2b4a9714fae1a0925f74 "ivopts:
replace constant_multiple_of with aff_combination_constant_multiple_p
[PR114932]" is causing one regression for nvptx target:

PASS: gcc.dg/tree-ssa/pr43378.c (test for excess errors)
PASS: gcc.dg/tree-ssa/pr43378.c scan-tree-dump-times ivopts "rite_[0-9]* =
rite_[0-9]* - element" 1
[-PASS:-]{+FAIL:+} gcc.dg/tree-ssa/pr43378.c scan-tree-dump-times ivopts
"left_[0-9]* = left_[0-9]* \\+ element|left_[0-9]* = element_[0-9]*\\(D\\) \\+
left" 1

Before (or, with this commit locally reverted), we have no 'diff' between the
previous dump file: 'pr43378.c.186t.slp1' vs. 'pr43378.c.188t.ivopts', but now
there's this:

--- pr43378.c.186t.slp1   2024-07-15 11:48:57.498943077 +0200
+++ pr43378.c.188t.ivopts 2024-07-15 11:48:57.498943077 +0200
@@ -3,6 +3,18 @@

 void foo (int left, int rite, int element)
 {
+  unsigned int _1;
+  unsigned int _2;
+  unsigned int _11;
+  unsigned int _12;
+  unsigned int _13;
+  unsigned int _17;
+  unsigned int _19;
+  unsigned int _20;
+  unsigned int _21;
+  unsigned int _22;
+  unsigned int _23;
+  
[local count: 118111600]:
   if (left_4(D) <= rite_5(D))
 goto ; [89.00%]
@@ -12,12 +24,23 @@
[local count: 105119324]:

[local count: 955630224]:
-  # left_14 = PHI 
   # rite_15 = PHI 
+  _17 = (unsigned int) left_4(D);
+  _2 = (unsigned int) rite_5(D);
+  _1 = _2 + _17;
+  _13 = (unsigned int) rite_15;
+  _11 = -_13;
+  _12 = _1 + _11;
+  left_14 = (int) _12;
   rite_8 = rite_15 - element_7(D);
   bar (left_14, rite_8, element_7(D));
-  left_10 = element_7(D) + left_14;
-  if (rite_8 >= left_10)
+  _19 = (unsigned int) left_4(D);
+  _20 = (unsigned int) rite_5(D);
+  _21 = _19 + _20;
+  _22 = (unsigned int) rite_8;
+  _23 = _21 - _22;
+  left_18 = (int) _23;
+  if (rite_8 >= left_18)
 goto ; [89.00%]
   else
 goto ; [11.00%]

(I've not yet looked any deeper.)

[Bug c/88737] RFE: Track ownership moves

2024-07-05 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88737

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #11 from Thomas Schwinge  ---
This should be interesting to reference here:

The in-development GCC/Rust implementation now has an even-more-in-development
implementation of Rust borrow checking, using the Polonius engine,
.  See
 etc.  (This work
is not yet in upstream GCC; awaiting completion of work on GCC build system
integration.)

That work was done by Jakub Dupák for his Master's Thesis "Memory Safety
Analysis in Rust GCC", .  See the thesis
for a description of design and implementation.

In particular, this is currently implemented in the GCC/Rust front end, not
middle end (supposedly summarized as: "for practical reasons"?), and therefore
not readily usable by other front ends.

[Bug testsuite/113005] 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2024-07-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

--- Comment #16 from Thomas Schwinge  ---
Even with 'call omp_set_num_threads(8)' added, 'libgomp.fortran/rwlock_1.f90'
still takes ~1 min to execute with working directory on NFS, compared to almost
instantaneous via local disk.  (I've not observed any other irregularities in
NFS access.)

[Bug testsuite/113005] 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2024-07-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed|2023-12-21 00:00:00 |2024-7-2

--- Comment #15 from Thomas Schwinge  ---
First, I apologize for the delay in continuing the discussion here.  Siemens
decided to cancel their Open Source Toolchains business (and more), so my group
had to find a new home...

At BayLibre, I at first didn't run into this issue anymore.  However, today I've
done some "environmental changes", and it's back: I switched from "local disk"
to "NFS" testing, that is, the testing directory is NFS-mounted.

(In reply to Lipeng Zhu from comment #14)
> (In reply to Lipeng Zhu from comment #13)
> > OK, I think I find the root cause of this error, when thread number greater
> > than 1000, the file_name = 1000_tst.dat, character(11) will overflow. This
> > will generate the same file_name like ***_tst.dat. 

That's not the issue I've been running into, so:

> Can you help to verify if this draft patch will fix the error on your side?

No, that doesn't help in my case.  (..., but supposedly is still necessary for
the "greater than 1000" case.)
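
For reference, the "greater than 1000" collision quoted above can be
illustrated with a small emulation of Fortran integer output.  This is only a
sketch: it assumes the test builds the name from a three-digit integer field
followed by '_tst.dat', which the discussion here suggests but does not show,
and both helper names are hypothetical.  What it relies on is standard Fortran
behaviour: a value that does not fit the field width is written as asterisks.

```python
def fortran_iw(value, width, min_digits=1):
    """Emulate Fortran 'Iw.m' output for a non-negative integer.

    If the digits do not fit into 'width' characters, Fortran fills the
    whole field with asterisks instead of truncating or widening it.
    """
    digits = str(value).zfill(min_digits)
    if len(digits) > width:
        return "*" * width
    return digits.rjust(width)


def file_name(thread_id):
    # Assumed naming scheme: 'I3.3' plus "_tst.dat", 11 characters in total.
    return fortran_iw(thread_id, 3, 3) + "_tst.dat"
```

Under that assumption, every thread number from 1000 upwards maps to the same
'***_tst.dat', so those threads would all operate on one file.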

The problem I'm running into, on the following system:

$ grep ^cpu < /proc/cpuinfo | uniq -c
128 cpu : POWER9, altivec supported

..., when running 'libgomp.fortran/rwlock_1.f90' via 'strace -o s -ff
[...]/rwlock_1.exe', that produces 25984 (!) 's.*' files (process 'clone's --
OpenMP threads, I suppose; I didn't try to understand that number in more
detail), and in total:

$ cat s.* | grep -F '_tst.dat' | wc -l
51712

... 51712 operations on '*_tst.dat' files (multiplied by the number of
operations on the respective opened file descriptors), and I assume that's what
overwhelms the NFS subsystem.
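
The same count can be reproduced without the shell pipeline.  A minimal sketch,
assuming the per-process 'strace -ff' output files sit in one directory and are
named 's.*' as above (the function name and its defaults are illustrative):

```python
from pathlib import Path


def count_matching_lines(directory, pattern="s.*", needle="_tst.dat"):
    """Return (number of files, number of lines containing 'needle').

    Rough equivalent of: cat s.* | grep -F '_tst.dat' | wc -l
    plus a count of the trace files themselves.
    """
    files = 0
    total = 0
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        files += 1
        # strace output may contain arbitrary bytes; don't fail on decoding.
        with path.open(errors="replace") as handle:
            total += sum(1 for line in handle if needle in line)
    return files, total
```

Calling 'count_matching_lines(".")' in the directory holding the traces gives
both numbers in one pass, which makes it easy to compare runs with different
'OMP_NUM_THREADS' settings.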

I don't think there's really any kind of existing mechanism/precedent for test
cases to open files outside of their current working directory (local disk, for
example: '/tmp/' instead of NFS in my case), or is there?


Are these 'libgomp.fortran/rwlock_{1,2,3}.f90' test cases intended to be
correctness test cases (and therefore may be limiting themselves to some
suitable lower 'OMP_NUM_THREADS', for example via 'num_threads' clauses, as
discussed before), or performance test cases that really need to exercise all
cores, for example?

[Bug tree-optimization/115652] [15 Regression] GCN: FAIL: gcc.dg/vect/pr70138-{1,2}.c (internal compiler error: verify_ssa failed)

2024-06-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115652

Thomas Schwinge  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #8 from Thomas Schwinge  ---
Richard, thanks for looking into this!

I confirm 'gcc.dg/vect/pr70138-1.c', 'gcc.dg/vect/pr70138-2.c' restored to
PASSing, but as of commit r15-1653-gf80db5495d5f8455b3003951727eb6c8dc67d81d
"tree-optimization/115652 - adjust insertion gsi for SLP", we've now got a
bunch of other ICEs for GCN target (tested '-march=gfx908').  (These are not
addressed by follow-up commit
r15-1670-gc7cb0dd94589ab501bca27f93641b4074e5a2e99 "tree-optimization/115652 -
amend last fix".)

PASS: gcc.dg/torture/pr111614.c   -O0  (test for excess errors)
PASS: gcc.dg/torture/pr111614.c   -O1  (test for excess errors)
{+FAIL: gcc.dg/torture/pr111614.c   -O2  (internal compiler error:
Segmentation fault)+}
[-PASS:-]{+FAIL:+} gcc.dg/torture/pr111614.c   -O2  (test for excess
errors)
PASS: gcc.dg/torture/pr111614.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
PASS: gcc.dg/torture/pr111614.c   -O3 -g  (test for excess errors)
PASS: gcc.dg/torture/pr111614.c   -Os  (test for excess errors)

during GIMPLE pass: vect
[...]/gcc/testsuite/gcc.dg/torture/pr111614.c: In function 'main':
[...]/gcc/testsuite/gcc.dg/torture/pr111614.c:19:5: internal compiler
error: Segmentation fault
0x20dcac2 internal_error(char const*, ...)
[...]/gcc/diagnostic-global-context.cc:491
0x11af353 crash_signal
[...]/gcc/toplev.cc:319
0x7f8fc0faf51f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x14e6513 gimple_bb(gimple const*)
[...]/gcc/gimple.h:1909
0x14e6513 vect_stmt_dominates_stmt_p(gimple*, gimple*)
[...]/gcc/tree-vectorizer.cc:775
0x14bf1e5 vect_schedule_slp_node
[...]/gcc/tree-vect-slp.cc:9754
0x14d609f vect_schedule_slp_node
[...]/gcc/tree-vect-slp.cc:9583
0x14d609f vect_schedule_scc
[...]/gcc/tree-vect-slp.cc:10025
0x14d6038 vect_schedule_scc
[...]/gcc/tree-vect-slp.cc:10006
0x14d6038 vect_schedule_scc
[...]/gcc/tree-vect-slp.cc:10006
0x14d6038 vect_schedule_scc
[...]/gcc/tree-vect-slp.cc:10006
0x14d6038 vect_schedule_scc
[...]/gcc/tree-vect-slp.cc:10006
0x14d6797 vect_schedule_slp(vec_info*, vec<_slp_instance*, va_heap, vl_ptr>
const&)
[...]/gcc/tree-vect-slp.cc:10170
0x14a2a41 vect_transform_loop(_loop_vec_info*, gimple*)
[...]/gcc/tree-vect-loop.cc:12114
0x14e7614 vect_transform_loops
[...]/gcc/tree-vectorizer.cc:1007
0x14e7ce3 try_vectorize_loop_1
[...]/gcc/tree-vectorizer.cc:1153
0x14e7ce3 try_vectorize_loop
[...]/gcc/tree-vectorizer.cc:1183
0x14e836c execute
[...]/gcc/tree-vectorizer.cc:1299

Similarly:

{+FAIL: gcc.dg/vect/ggc-pr37574.c (internal compiler error: Segmentation
fault)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/ggc-pr37574.c (test for excess errors)

{+FAIL: gcc.dg/vect/no-scevccp-outer-13.c (internal compiler error:
Segmentation fault)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-13.c (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/vect/no-scevccp-outer-13.c [-execution
test-]{+compilation failed to produce executable+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-13.c scan-tree-dump-times
vect "OUTER LOOP VECTORIZED." 1

{+FAIL: gcc.dg/vect/no-scevccp-outer-18.c (internal compiler error:
Segmentation fault)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-18.c (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/vect/no-scevccp-outer-18.c [-execution
test-]{+compilation failed to produce executable+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-18.c scan-tree-dump-times
vect "OUTER LOOP VECTORIZED." 1

{+FAIL: gcc.dg/vect/no-scevccp-outer-7.c (internal compiler error:
Segmentation fault)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-7.c (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/vect/no-scevccp-outer-7.c [-execution
test-]{+compilation failed to produce executable+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/no-scevccp-outer-7.c scan-tree-dump-times
vect "OUTER LOOP VECTORIZED." 1
PASS: gcc.dg/vect/no-scevccp-outer-7.c scan-tree-dump-times vect
"vect_recog_widen_mult_pattern: detected(?:(?!Analysis failed).)*Analysis
succeeded" 1

{+FAIL: gcc.dg/vect/vect-outer-4i.c (internal compiler error: Segmentation
fault)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/vect-outer-4i.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/vect/vect-outer-4i.c [-execution
test-]{+compilation failed to 

[Bug target/115682] New: nvptx vs. "fwprop: invoke change_is_worthwhile to judge if a replacement is worthwhile"

2024-06-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115682

Bug ID: 115682
   Summary: nvptx vs. "fwprop: invoke change_is_worthwhile to
judge if a replacement is worthwhile"
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: sayle at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

I've observed/bisected that "fwprop: invoke change_is_worthwhile to judge if a
replacement is worthwhile" (a) slightly improves and in some cases also (b)
slightly regresses nvptx target code generation (supposedly; not actually
benchmarked).  That is, in a few cases, it introduces additional register usage
together with (a) different choice of PTX instructions (presumably beneficial),
or (b) redundant computations (presumably bad).  It's possible that the PTX JIT
later is able to re-optimize this, but we can't be sure -- and would like good
PTX code emitted by GCC if only for our own reading pleasure.

I've only looked at PTX code generated for GCC target libraries, and only
looked at those where the 'diff' was less than one screenful, roughly.

A few examples of category (a):

'nvptx-none/libgfortran/generated/_dim_i16.o:_gfortran_specific__dim_i16':

[...]
+.reg .u64 %r42;
[...]
-mov.u64 %r54,%r32;
-sub.u64 %r55,%r39,%r35;
-setp.ge.s64 %r45,%r55,0;
-@ %r45 bra $L2;
-mov.u64 %r54,0;
-mov.u64 %r55,%r54;
-$L2:
+sub.u64 %r42,%r39,%r35;
+setp.ge.s64 %r45,%r42,0;
+selp.u64 %r54,%r32,0,%r45;
+selp.u64 %r55,%r42,0,%r45;
[...]

That is, replace conditional 'bra' by two 'selp's.
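
To make the equivalence concrete, here is a small Python model of the two variants of the '_dim_i16' fragment above.  It is only a sketch: it assumes the usual PTX meaning of 'setp.ge.s64' (signed compare) and 'selp' (pick the first source if the predicate is true), models 64-bit registers with masking, and treats '%r32', '%r35', '%r39' as plain inputs:

```python
MASK64 = (1 << 64) - 1


def as_signed64(value):
    """Interpret a 64-bit register pattern as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def old_branch_form(r32, r35, r39):
    r54 = r32                          # mov.u64 %r54,%r32
    r55 = (r39 - r35) & MASK64         # sub.u64 %r55,%r39,%r35
    r45 = as_signed64(r55) >= 0        # setp.ge.s64 %r45,%r55,0
    if not r45:                        # @ %r45 bra $L2 skips the zeroing
        r54 = 0                        # mov.u64 %r54,0
        r55 = r54                      # mov.u64 %r55,%r54
    return r54, r55


def new_select_form(r32, r35, r39):
    r42 = (r39 - r35) & MASK64         # sub.u64 %r42,%r39,%r35
    r45 = as_signed64(r42) >= 0        # setp.ge.s64 %r45,%r42,0
    r54 = r32 if r45 else 0            # selp.u64 %r54,%r32,0,%r45
    r55 = r42 if r45 else 0            # selp.u64 %r55,%r42,0,%r45
    return r54, r55
```

Both functions return the same pair for any inputs; the new form needs the extra '%r42' because the difference must stay available for the second 'selp'.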

Similar, 'nvptx-none/newlib/libc/stdio/libc_a-makebuf.o:__swhatbuf_r':

[...]
 setp.ne.u16 %r41,%r39,0;
-@ %r41 bra $L22;
 mov.u32 %r23,0;
-mov.u64 %r31,1024;
+selp.u64 %r31,64,1024,%r41;
 mov.u32 %r32,%r23;
 bra $L20;
 $L19:
@@ -325,11 +324,6 @@
 .loc 2 123 10
 mov.u64 %r31,1024;
 mov.u32 %r32,2048;
-bra $L20;
-$L22:
-mov.u32 %r23,0;
-mov.u64 %r31,64;
-mov.u32 %r32,%r23;
 $L20:
[...]

Differently, 'nvptx-none/newlib/libc/search/libc_a-hash_page.o:__addel':

[...]
-add.u64 %r38,%r37,%r37;
+shl.b64 %r38,%r37,1;
[...]

A few examples of category (b):

'nvptx-none/newlib/libc/stdlib/libc_a-gdtoa-gdtoa.o:__gdtoa':

[...]
+.reg .u64 %r263;
[...]
-add.u64 %r337,%r391,24;
+add.u64 %r263,%r391,24;
+mov.u64 %r337,%r263;
[...]

New unnecessary intermediate 'u64 %r263'.

Similarly, 'nvptx-none/newlib/libc/stdlib/libc_a-mprec.o:__d2b':

[...]
+.reg .u32 %r89;
[...]
-cvt.u64.u32 %r90,%r39;
+mov.u32 %r89,%r39;
+cvt.u64.u32 %r90,%r89;
[...]

'nvptx-none/libgfortran/generated/cshift0_c4.o:_gfortrani_cshift0_c4' (full
diff with a bit of unchanged context added):

[...]
 .reg .u64 %r131;
[...]
 .reg .u32 %r177;
[...]
+.reg .u32 %r266;
 .reg .u64 %r267;
[...]
 mov.u32 %r177,%ar3;
[...]
 cvt.s64.s32 %r131,%r177;
[...]
-cvt.u64.u32 %r267,%r177;
+cvt.u32.u64 %r266,%r131;
+cvt.u64.u32 %r267,%r266;
[...]

In the new code, the 'u32' '%r177' is interpreted as 's32', converted into
's64', and stored into 'u64' '%r131' (which is present also in the old code for
other reasons; I didn't try to understand that in more detail).  The 'u64'
'%r131' is then converted into 'u32' and stored in the new 'u32 %r266', and
then again converted into 'u64', and stored in '%r267'.

In the old code, the 'u32' '%r177' was converted into 'u64', and stored into
the 'u64' '%r267' directly.

A lot more instances of this in other files.
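
The 'cshift0_c4' conversion chain described above can be modelled as follows.  This is a sketch only, assuming that PTX 'cvt.s64.s32' sign-extends, 'cvt.u32.u64' keeps the low 32 bits, and 'cvt.u64.u32' zero-extends; the helper names are made up:

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def cvt_s64_s32(x):
    """Sign-extend the low 32 bits into a 64-bit register pattern."""
    x &= MASK32
    return (x | (MASK64 ^ MASK32)) if x & 0x80000000 else x


def cvt_u32_u64(x):
    """Narrow a 64-bit value to 32 bits by keeping the low half."""
    return x & MASK32


def cvt_u64_u32(x):
    """Zero-extend a 32-bit value to 64 bits."""
    return x & MASK32


def old_sequence(r177):
    return cvt_u64_u32(r177)           # cvt.u64.u32 %r267,%r177


def new_sequence(r177):
    r131 = cvt_s64_s32(r177)           # cvt.s64.s32 %r131,%r177
    r266 = cvt_u32_u64(r131)           # cvt.u32.u64 %r266,%r131
    return cvt_u64_u32(r266)           # cvt.u64.u32 %r267,%r266
```

Under these assumptions both sequences leave the same value in '%r267', so the new code is not wrong, just longer by one 'cvt' and one register.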

'nvptx-none/libgfortran/generated/matmul_i1.o:_gfortran_matmul_i1' (full diff
with a bit of unchanged context added):

[...]
 .reg .u64 %r299;
[...]
 .reg .u64 %r347;
 .reg .u64 %r349;
[...]
 .reg .u64 %r432;
[...]
 .reg .u64 %r1004;
[...]
 ['%r299' initialized via different code paths]
[...]
 ld.u64 %r347,[%r733];
[...]
 add.u64 %r1004,%r299,1;
[...]
 sub.u64 %r349,%r347,%r1004;
[...]
-mov.u64 %r432,%r347;
+add.u64 %r432,%r349,%r1004;
[...]

That is, instead of just using '%r347' ('mov') for '%r432', we now prefer
re-computing it ('add').
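
A small Python model of the 'matmul_i1' fragment, as a sketch with 64-bit wrap-around emulated by masking ('%r347' and '%r299' are treated as plain inputs here):

```python
MASK64 = (1 << 64) - 1


def old_copy(r347, r299):
    r1004 = (r299 + 1) & MASK64        # add.u64 %r1004,%r299,1
    r349 = (r347 - r1004) & MASK64     # sub.u64 %r349,%r347,%r1004
    r432 = r347                        # mov.u64 %r432,%r347
    return r349, r432


def new_recompute(r347, r299):
    r1004 = (r299 + 1) & MASK64        # add.u64 %r1004,%r299,1
    r349 = (r347 - r1004) & MASK64     # sub.u64 %r349,%r347,%r1004
    r432 = (r349 + r1004) & MASK64     # add.u64 %r432,%r349,%r1004
    return r349, r432
```

Since the subtraction and the addition wrap modulo 2^64 in the same way, '%r432' ends up equal to '%r347' in both variants; the difference is only that the new code makes '%r432' depend on '%r349' and '%r1004' instead of being a plain copy.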

Similarly in 'nvptx-none/libgfortran/io/transfer.o:bswap_array' -- even more
concisely:

[...]
 add.u64 %r68,%r92,%r65;
-mov.u64 %r53,%r92;
+sub.u64 %r53,%r68,%r65;
[...]

..., and both these items in combination (?) in
'nvptx-none/newlib/libc/search/libc_a-hash.o:__expand_table', for example,
where we now use two additional registers, plus additional 'sub' plus
additional 'cvt'.

Now, I suppose, we'll quickly conclude this is due to instruction costing (...,
and, in 

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-26 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640

--- Comment #11 from Thomas Schwinge  ---
(In reply to Richard Biener from comment #9)
> Created attachment 58519 [details]
> patch
> 
> I think this fixes it, but I cannot validate.

Yes, it does, thanks!

[Bug target/115652] New: [15 Regression] GCN: FAIL: gcc.dg/vect/pr70138-{1,2}.c (internal compiler error: verify_ssa failed)

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115652

Bug ID: 115652
   Summary: [15 Regression] GCN: FAIL: gcc.dg/vect/pr70138-{1,2}.c
(internal compiler error: verify_ssa failed)
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: ice-checking, ice-on-valid-code, testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

As of commit r15-1056-g4653b682ef161c3c2fc7bf8462b8f9206a1349e6 "Allow
single-lane SLP in-order reductions" we've got a '-fchecking' ICE regression
for GCN target (tested '-march=gfx908'):

{+FAIL: gcc.dg/vect/pr70138-1.c (internal compiler error: verify_ssa
failed)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/pr70138-1.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/vect/pr70138-1.c [-execution
test-]{+compilation failed to produce executable+}

[...]/source-gcc/gcc/testsuite/gcc.dg/vect/pr70138-1.c: In function 'foo':
[...]/source-gcc/gcc/testsuite/gcc.dg/vect/pr70138-1.c:6:1: error:
definition in block 3 follows the use
for SSA_NAME: stmp_c_17.9_153 in statement:
c_17 = stmp_c_17.9_153 + stmp_c_17.9_154;
during GIMPLE pass: vect
dump file: ./pr70138-1.c.180t.vect
[...]/source-gcc/gcc/testsuite/gcc.dg/vect/pr70138-1.c:6:1: internal
compiler error: verify_ssa failed
0x142272d verify_ssa(bool, bool)
[...]/source-gcc/gcc/tree-ssa.cc:1203

{+FAIL: gcc.dg/vect/pr70138-2.c (internal compiler error: verify_ssa
failed)+}
[-PASS:-]{+FAIL:+} gcc.dg/vect/pr70138-2.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/vect/pr70138-2.c [-execution
test-]{+compilation failed to produce executable+}

[...]/source-gcc/gcc/testsuite/gcc.dg/vect/pr70138-2.c: In function 'foo':
[...]/source-gcc/gcc/testsuite/gcc.dg/vect/pr70138-2.c:6:1: error:
definition in block 3 follows the use
for SSA_NAME: stmp_c_15.9_152 in statement:
c_15 = stmp_c_15.9_152 + stmp_c_15.9_153;
during GIMPLE pass: vect
dump file: ./pr70138-2.c.180t.vect
[...]/source-gcc/gcc/testsuite/gcc.dg/vect/pr70138-2.c:6:1: internal
compiler error: verify_ssa failed
0x142272d verify_ssa(bool, bool)
[...]/source-gcc/gcc/tree-ssa.cc:1203

[Bug target/112363] GCN: 'FAIL: gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c execution test'

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112363

Thomas Schwinge  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=115382

--- Comment #4 from Thomas Schwinge  ---
(In reply to Thomas Schwinge from comment #3)
> Something in the last few weeks' worth of commits made this go back to PASS:
> 
> PASS: gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c (test for
> excess errors)
> [-FAIL:-]{+PASS:+} gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
> execution test

For posterity: it was commit r15-1187-g2b438a0d2aa80f051a09b245a58f643540d4004b
"vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382]" that
fixed this for GCN target, too.

[Bug target/115640] [15 Regression] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640

Thomas Schwinge  changed:

   What|Removed |Added

Summary|GCN: FAIL:  |[15 Regression] GCN: FAIL:
   |gfortran.dg/vect/pr115528.f |gfortran.dg/vect/pr115528.f
   |  -O  execution test|  -O  execution test
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=114107

--- Comment #5 from Thomas Schwinge  ---
Turns out, this is a regression after all: before commit
r15-1238-g1fe55a1794863b5ad9eeca5062782834716016b2 "tree-optimization/114107 -
avoid peeling for gaps in more cases", this test case
'gfortran.dg/vect/pr115528.f' (which, of course, wasn't in-tree back then) does
PASS its execution test for GCN target (tested '-march=gfx908').

[Bug middle-end/115574] [OpenMP] Nested C function – 'declare target link(var)' leads to "referenced in offloaded code but hasn't been marked to be included in the offloaded code"

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115574

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #1 from Thomas Schwinge  ---
(Just noting that fixing the code à la '-Wunknown-pragmas' doesn't seem to make
a difference re the 'error' diagnostics.)

[Bug target/115648] New: [15 Regression] GCN: [-PASS:-]{+FAIL:+} gcc.dg/hoist-register-pressure-{2, 3}.c scan-rtl-dump hoist "PRE/HOIST: end of bb .* copying expression"

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115648

Bug ID: 115648
   Summary: [15 Regression] GCN: [-PASS:-]{+FAIL:+}
gcc.dg/hoist-register-pressure-{2,3}.c scan-rtl-dump
hoist "PRE/HOIST: end of bb .* copying expression"
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, guihaoc at gcc dot gnu.org
  Target Milestone: ---

Yesterday's commit r15-1575-gea8061f46a301797e7ba33b52e3b4713fb8e6b48 "fwprop:
invoke change_is_worthwhile to judge if a replacement is worthwhile" regresses
GCN target (tested '-march=gfx908'):

PASS: gcc.dg/hoist-register-pressure-1.c (test for excess errors)
PASS: gcc.dg/hoist-register-pressure-2.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/hoist-register-pressure-2.c scan-rtl-dump hoist
"PRE/HOIST: end of bb .* copying expression"
PASS: gcc.dg/hoist-register-pressure-3.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/hoist-register-pressure-3.c scan-rtl-dump hoist
"PRE/HOIST: end of bb .* copying expression"

There is a moderate code generation difference; I can't tell whether before or
after is better.  Does the compiler or the test cases or the compiler flags
need to be adjusted?  Re the latter, with '-fno-forward-propagate', we're back
to PASSing.

[Bug target/115640] GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640

--- Comment #1 from Thomas Schwinge  ---
For GCN target (tested '-march=gfx908'), we've got identical code
('pr115528.exe') before vs. after the commit
r15-1582-g2f83ea87ee328d337f87d4430861221be9babe1e "tree-optimization/115528 -
fix vect alignment analysis for outer loop vect" code changes, so it's likely
an unrelated issue?

[Bug target/115640] New: GCN: FAIL: gfortran.dg/vect/pr115528.f -O execution test

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115640

Bug ID: 115640
   Summary: GCN: FAIL: gfortran.dg/vect/pr115528.f   -O  execution
test
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

The new PR115528 test case 'gfortran.dg/vect/pr115528.f' FAILs its execution
test for GCN target (tested '-march=gfx908'):

+PASS: gfortran.dg/vect/pr115528.f   -O  (test for excess errors)
+FAIL: gfortran.dg/vect/pr115528.f   -O  execution test

spawn -ignore SIGHUP [...]/build-gcc/gcc/gcn-run ./pr115528.exe
Memory access fault by GPU node-2 (Agent handle: 0x1834c30) on address
0x7f67e6dff000. Reason: Page not present or supervisor privilege.
FAIL: gfortran.dg/vect/pr115528.f   -O  execution test

[Bug target/112363] GCN: 'FAIL: gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c execution test'

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112363

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Thomas Schwinge  ---
Something in the last few weeks' worth of commits made this go back to PASS:

PASS: gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c (test for excess
errors)
[-FAIL:-]{+PASS:+} gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
execution test

[Bug target/115633] [15 Regression] powerpc64le: "relocation truncated to fit: R_PPC64_TOC16 against `.rodata.cst4'" with (default) '-flate-combine-instructions' since r15-1579-g792f97b44ffc5e6a967292

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115633

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
   See Also|https://gcc.gnu.org/bugzill |
   |a/show_bug.cgi?id=115612|

--- Comment #4 from Thomas Schwinge  ---
Fixed.

[Bug target/115622] gcc.dg/ipa/iinline-attr.c fails after r15-1579-g792f97b44ffc5e

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115622

Thomas Schwinge  changed:

   What|Removed |Added

   See Also|https://gcc.gnu.org/bugzill |
   |a/show_bug.cgi?id=115612|
   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot gnu.org
 Resolution|--- |FIXED
 Status|NEW |RESOLVED
  Component|other   |target

--- Comment #3 from Thomas Schwinge  ---
Fixed.

[Bug target/115633] New: [15 Regression] powerpc64le: "relocation truncated to fit: R_PPC64_TOC16 against `.rodata.cst4'" with (default) '-flate-combine-instructions'

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115633

Bug ID: 115633
   Summary: [15 Regression] powerpc64le: "relocation truncated to
fit: R_PPC64_TOC16 against `.rodata.cst4'" with
(default) '-flate-combine-instructions'
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: rsandifo at gcc dot gnu.org, seurer at gcc dot gnu.org
  Target Milestone: ---
Target: powerpc64le-unknown-linux-gnu

With commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7 "Add a
late-combine pass [PR106594]", I see on powerpc64le-unknown-linux-gnu a number
of "relocation truncated to fit: R_PPC64_TOC16 against `.rodata.cst4'" etc.
with (default) '-flate-combine-instructions':

[-PASS:-]{+FAIL:+} gcc.dg/pr91734.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/pr91734.c [-execution test-]{+compilation
failed to produce executable+}

/tmp/ccOWP7Yx.o: in function `f2':
pr91734.c:(.text+0x38): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'
/tmp/ccOWP7Yx.o: in function `f3':
pr91734.c:(.text+0x68): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'
/tmp/ccOWP7Yx.o: in function `f4':
pr91734.c:(.text+0x98): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+4
/tmp/ccOWP7Yx.o: in function `f5':
pr91734.c:(.text+0xc8): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+8
/tmp/ccOWP7Yx.o: in function `f6':
pr91734.c:(.text+0xf8): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+c
/tmp/ccOWP7Yx.o: in function `f7':
pr91734.c:(.text+0x128): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+10
/tmp/ccOWP7Yx.o: in function `f8':
pr91734.c:(.text+0x158): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+14
/tmp/ccOWP7Yx.o: in function `f9':
pr91734.c:(.text+0x188): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+18
/tmp/ccOWP7Yx.o: in function `f10':
pr91734.c:(.text+0x1b8): relocation truncated to fit: R_PPC64_TOC16 against
`.rodata.cst4'+1c
collect2: error: ld returned 1 exit status

This is with the Debian GNU/Linux 12 (bookworm) binutils 2.40-2 package.  (...,
in case that's where the problem is...)

Similarly:

[-PASS:-]{+FAIL:+} gcc.dg/sinatan-1.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/sinatan-1.c [-execution test-]{+compilation
failed to produce executable+}

[-PASS:-]{+FAIL:+} gcc.dg/ipa/inline-8.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} gcc.dg/ipa/inline-8.c [-execution
test-]{+compilation failed to produce executable+}

..., and (roughly) all 'gcc.dg/vect/tsvc/[...]' test cases that don't use
'-flto'.

=== gcc Summary ===

# of expected passes[-179982-]{+179673+}
# of unexpected failures[-127-]{+282+}
# of unexpected successes   20
# of expected failures  1612
{+# of unresolved testcases 154+}
# of unsupported tests  4394
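
As a quick consistency check on the summary above, the drop in expected passes should be accounted for by the new FAILs plus the new UNRESOLVEDs.  A minimal Python sketch, taking the absent "unresolved testcases" line in the old summary as 0 (an assumption):

```python
# Counts copied from the '[-old-]{+new+}' summary lines.
before = {
    "expected passes": 179982,
    "unexpected failures": 127,
    "unresolved testcases": 0,   # assumed: line not present before
}
after = {
    "expected passes": 179673,
    "unexpected failures": 282,
    "unresolved testcases": 154,
}


def summary_delta(old, new):
    """Per-category change between two testsuite summaries."""
    return {key: new[key] - old[key] for key in old}


delta = summary_delta(before, after)
lost_passes = -delta["expected passes"]
new_problems = delta["unexpected failures"] + delta["unresolved testcases"]
print(delta)
print("lost passes:", lost_passes, "new FAIL + UNRESOLVED:", new_problems)
```

Both numbers come out as 309, which fits each affected test losing one "test for excess errors" PASS to FAIL and one "execution test" PASS to UNRESOLVED, give or take one.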

Additionally:

[-PASS:-]{+FAIL:+} libgomp.c/simd-math-1.c (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} libgomp.c/simd-math-1.c [-execution
test-]{+compilation failed to produce executable+}


I see a number of recent  emails by Bill Seurer
that contain a similar set of new FAILs, so putting you in CC here.

[Bug other/115622] gcc.dg/ipa/iinline-attr.c fails after r15-1579-g792f97b44ffc5e

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115622

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords||testsuite-fail
 Status|UNCONFIRMED |NEW
 CC||tschwinge at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2024-06-25

[Bug target/115631] New: [15 Regression] GCN: [-PASS:-]{+FAIL:+} c-c++-common/torture/builtin-arith-overflow-6.c -O2 execution test

2024-06-25 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115631

Bug ID: 115631
   Summary: [15 Regression] GCN: [-PASS:-]{+FAIL:+}
c-c++-common/torture/builtin-arith-overflow-6.c   -O2
execution test
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

With commit r15-1579-g792f97b44ffc5e6a967292b3747fd835e99396e7 "Add a
late-combine pass [PR106594]", I see the following regress for GCN target
(tested '-march=gfx908'), for both C and C++:

@@ -191300,7 +191300,7 @@ PASS:
c-c++-common/torture/builtin-arith-overflow-6.c   -O0  (test for excess er
PASS: c-c++-common/torture/builtin-arith-overflow-6.c   -O0  execution test
UNSUPPORTED: c-c++-common/torture/builtin-arith-overflow-6.c   -O1
PASS: c-c++-common/torture/builtin-arith-overflow-6.c   -O2  (test for
excess errors)
[-PASS:-]{+FAIL:+} c-c++-common/torture/builtin-arith-overflow-6.c   -O2 
execution test
UNSUPPORTED: c-c++-common/torture/builtin-arith-overflow-6.c   -O3 -g
UNSUPPORTED: c-c++-common/torture/builtin-arith-overflow-6.c   -Os

spawn -ignore SIGHUP [...]/build-gcc/gcc/gcn-run
./builtin-arith-overflow-6.exe
GCN Kernel Aborted
Kernel aborted
FAIL: c-c++-common/torture/builtin-arith-overflow-6.c   -O2  execution test

With '-fno-late-combine-instructions', it's back to PASS.

The diff between good ('-fno-late-combine-instructions') vs. bad
('-flate-combine-instructions') of 'builtin-arith-overflow-6.s' as well as
'-fdump-rtl-all' is big, so I'm not able to directly pinpoint one specific
issue.

I however do observe a number of instances as follows (good vs. bad):

[...]
s_mov_b64   exec, -1
[...]
-   s_mov_b32   s12, 0
-   v_writelane_b32 v0, s12, 0
s_mov_b64   exec, 1
+   v_mov_b32   v0, 0
flat_store_dword v[18:19], v0
[...]

Might that "move across 'exec'" be in error?

[Bug libgomp/105274] [libgomp][nvptx] Provide means to set the stack size on the device side (+ improve doc)

2024-06-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105274

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords||openacc
 CC||tschwinge at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-06-04
 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=85519

[Bug target/97385] [nvptx, docs] -msoft-stack-reserve-local= missing documentation

2024-06-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97385

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org
   Last reconfirmed||2024-06-04
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=97203,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=97384,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=105274

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304

--- Comment #5 from Thomas Schwinge  ---
Created attachment 58333
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58333&action=edit
'c2' GCN target ('-march=gfx908') 'slp-gap-1.c.179t.vect'

(In reply to r...@cebitec.uni-bielefeld.de from comment #3)
> > --- Comment #2 from Richard Biener  ---
> > It should only need vect32 - basically I assumed the target can compose the
> > 64bit vector from two 32bit elements.  But it might be that for this to work
> > the loads would need to be aligned.
> >
> > What is needed is char-to-short unpacking and vector composition.  Either
> > composing V2SImode or V8QImode from two V4QImode vectors.
> >
> > Does the following help?
> 
> Unfortunately not: makes no difference AFAICS.

Also doesn't resolve the issue for GCN target (tested '-march=gfx908'); see
attached 'c2' GCN target ('-march=gfx908') 'slp-gap-1.c.179t.vect'.

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304

--- Comment #4 from Thomas Schwinge  ---
Created attachment 58332
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58332&action=edit
GCN target ('-march=gfx908') 'slp-gap-1.c.179t.vect'

Similar (I suppose?) for GCN target (tested '-march=gfx908'):

+PASS: gcc.dg/vect/slp-gap-1.c (test for excess errors)
+FAIL: gcc.dg/vect/slp-gap-1.c scan-tree-dump-times vect "{_[0-9]+, 0" 6

[Bug tree-optimization/115304] gcc.dg/vect/slp-gap-1.c FAILs

2024-06-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115304

Thomas Schwinge  changed:

   What|Removed |Added

 Target|sparc*-sun-solaris2.11  |sparc*-sun-solaris2.11 GCN
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-06-03
 Ever confirmed|0   |1
 CC||ams at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"

2024-05-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254

--- Comment #7 from Thomas Schwinge  ---
(In reply to Richard Biener from comment #6)
> The following works for me - does it work for you?

> --- a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c

> -/* { dg-final { scan-tree-dump-not "vectorizing stmts using SLP" vect } } */
> +/* We do not want to see a two-lane .MASK_LOAD or .MASK_GATHER_LOAD since
> +   the gathers are different on each lane.  This is a bit fragile and
> +   should possibly be turned into a runtime test.  */
> +/* { dg-final { scan-tree-dump-not "stmt 1 \[^\r\n\]* = .MASK" vect } } */

Yes:

PASS: gcc.dg/vect/vect-gather-4.c (test for excess errors)
PASS: gcc.dg/vect/vect-gather-4.c scan-tree-dump-not vect "stmt 1 [^\r\n]*
= .MASK"

But indeed, "a bit fragile".  ;-)
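
To see what that scan pattern accepts, here is a rough Python equivalent.  It is only a sketch: it assumes that Python's 're' behaves like the Tcl regex for this simple pattern, and the dump lines are made up, not copied from a real '.vect' file:

```python
import re

# The Tcl-level escapes are gone here: "\[^\r\n\]*" in the test source
# reaches the regex engine as "[^\r\n]*".
PATTERN = re.compile(r"stmt 1 [^\r\n]* = .MASK")


def dump_has_two_lane_mask(dump_text):
    """True if the pattern matches, i.e. 'scan-tree-dump-not' would FAIL."""
    return PATTERN.search(dump_text) is not None


# Hypothetical dump excerpts for illustration only.
single_lane = "note:   stmt 0 _5 = .MASK_GATHER_LOAD (a, b, 4, 0, m);\n"
two_lane = single_lane + "note:   stmt 1 _9 = .MASK_GATHER_LOAD (c, d, 4, 0, m);\n"
unrelated = "note:   stmt 1 x_3 = yMASK_7;\n"

print(dump_has_two_lane_mask(single_lane))
print(dump_has_two_lane_mask(two_lane))
print(dump_has_two_lane_mask(unrelated))
```

The last case matches because the '.' before 'MASK' is an unescaped wildcard; that is one illustration of how such dump scans can be brittle, not necessarily the fragility meant above.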

[Bug target/115254] [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"

2024-05-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254

--- Comment #5 from Thomas Schwinge  ---
Created attachment 58300
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58300&action=edit
GCN target ('-march=gfx908') 'vect-gather-4.c.179t.vect'

(In reply to GCC Commits from comment #3)
> commit r15-859-geaaa4b88038d4d6eda1b20ab662f1568fd9be31f

> * gcc.dg/vect/slp-cond-2-big-array.c: Expect 4 times SLP.
> * gcc.dg/vect/slp-cond-2.c: Likewise.

ACK, again PASS for GCN target ('-march=gfx908'), thanks!


(In reply to Richard Biener from comment #4)
> The gcc.dg/vect/vect-gather-4.c FAIL should be still present.

Yes.

(In reply to Richard Biener from comment #2)
> Note for gcc.dg/vect/vect-gather-4.c with -mgather and gather support in the
> ISA on x86_64 I get two 'vectorizing stmts using SLP', for f1 and f2 only.
> 
> Does that match GCN?

In addition to 'f1', 'f2', GCN target ('-march=gfx908') apparently can do 'f3',
too:

[...]/gcc.dg/vect/vect-gather-4.c:37:21: note:   vectorizing stmts using
SLP.

Attaching that 'vect-gather-4.c.179t.vect'.

> We unfortunately cannot handle masked gathers as "emulated".
> 
> And we don't have good dejagnu target selectors for this either.

[Bug target/115259] New: [15 Regressions] GCN vs. "tree-optimization/115144 - improve sinking destination choice"

2024-05-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115259

Bug ID: 115259
   Summary: [15 Regressions] GCN vs. "tree-optimization/115144 -
improve sinking destination choice"
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

Per bisecting, I found commit r15-815-g5b9b3bae33cae7fca2e3c3e3028be6b8bee9b698
"tree-optimization/115144 - improve sinking destination choice" responsible for
a number of GCN target (tested '-march=gfx908) execution test regressions.  As
it seems, it's a mis-compilation of libgfortran code.  (That is, have to
rebuilt target libgfortran to make the issue appear/go away.)  Given that we're
not seeing such failures in any other configuration, it might be a latent issue
in GCC/GCN?

PASS: gfortran.dg/all_bounds_1.f90   -O0  (test for excess errors)
PASS: gfortran.dg/all_bounds_1.f90   -O0  execution test
[-PASS:-]{+FAIL:+} gfortran.dg/all_bounds_1.f90   -O0  output pattern test
PASS: gfortran.dg/all_bounds_1.f90   -O1  (test for excess errors)
PASS: gfortran.dg/all_bounds_1.f90   -O1  execution test
[-PASS:-]{+FAIL:+} gfortran.dg/all_bounds_1.f90   -O1  output pattern test
PASS: gfortran.dg/all_bounds_1.f90   -O2  (test for excess errors)
PASS: gfortran.dg/all_bounds_1.f90   -O2  execution test
[-PASS:-]{+FAIL:+} gfortran.dg/all_bounds_1.f90   -O2  output pattern test
PASS: gfortran.dg/all_bounds_1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess
errors)
PASS: gfortran.dg/all_bounds_1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
[-PASS:-]{+FAIL:+} gfortran.dg/all_bounds_1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  output pattern test
PASS: gfortran.dg/all_bounds_1.f90   -O3 -g  (test for excess errors)
PASS: gfortran.dg/all_bounds_1.f90   -O3 -g  execution test
[-PASS:-]{+FAIL:+} gfortran.dg/all_bounds_1.f90   -O3 -g  output pattern
test
PASS: gfortran.dg/all_bounds_1.f90   -Os  (test for excess errors)
PASS: gfortran.dg/all_bounds_1.f90   -Os  execution test
[-PASS:-]{+FAIL:+} gfortran.dg/all_bounds_1.f90   -Os  output pattern test

Fortran runtime error: Incorrect extent in return value of ALL intrinsic in
dimension 1: is 3, should be 140127603031922

Fortran runtime error: Incorrect extent in return value of ALL intrinsic in
dimension 1: is 3, should be 17179869188

Fortran runtime error: Incorrect extent in return value of ALL intrinsic in
dimension 1: is 3, should be 5648507875369387964

[Etc.]

Should match:
Fortran runtime error: Incorrect extent in return value of ALL intrinsic in
dimension 1: is 3, should be 2

(..., that is, the "should be" value is wrong?!)


PASS: gfortran.dg/allocated_4.f90   -O0  (test for excess errors)
PASS: gfortran.dg/allocated_4.f90   -O0  execution test
PASS: gfortran.dg/allocated_4.f90   -O1  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/allocated_4.f90   -O1  execution test
PASS: gfortran.dg/allocated_4.f90   -O2  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/allocated_4.f90   -O2  execution test
PASS: gfortran.dg/allocated_4.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/allocated_4.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: gfortran.dg/allocated_4.f90   -O3 -g  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/allocated_4.f90   -O3 -g  execution test
PASS: gfortran.dg/allocated_4.f90   -Os  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/allocated_4.f90   -Os  execution test

Operating system error: Not enough space
Integer overflow in xmallocarray


PASS: gfortran.dg/any_all_1.f90   -O0  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/any_all_1.f90   -O0  execution test
PASS: gfortran.dg/any_all_1.f90   -O1  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/any_all_1.f90   -O1  execution test
PASS: gfortran.dg/any_all_1.f90   -O2  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/any_all_1.f90   -O2  execution test
PASS: gfortran.dg/any_all_1.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/any_all_1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: 

[Bug target/115254] New: [15 Regression] GCN regressions from "Avoid splitting store dataref groups during SLP discovery"

2024-05-28 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115254

Bug ID: 115254
   Summary: [15 Regression] GCN regressions from "Avoid splitting
store dataref groups during SLP discovery"
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

If my tracking is to be believed, the recent commit
r15-812-gc71886f2ca2e46ce1449c7064d6f1b447d02fcba "Avoid splitting store
dataref groups during SLP discovery" is causing the following regressions for
GCN target testing (tested '-march=gfx908'):

PASS: gcc.dg/vect/slp-cond-2-big-array.c (test for excess errors)
PASS: gcc.dg/vect/slp-cond-2-big-array.c execution test
[-PASS:-]{+FAIL:+} gcc.dg/vect/slp-cond-2-big-array.c scan-tree-dump-times
vect "vectorizing stmts using SLP" 3

gcc.dg/vect/slp-cond-2-big-array.c: pattern found 4 times


PASS: gcc.dg/vect/slp-cond-2.c (test for excess errors)
PASS: gcc.dg/vect/slp-cond-2.c execution test
[-PASS:-]{+FAIL:+} gcc.dg/vect/slp-cond-2.c scan-tree-dump-times vect
"vectorizing stmts using SLP" 3

gcc.dg/vect/slp-cond-2.c: pattern found 4 times


PASS: gcc.dg/vect/vect-gather-4.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.dg/vect/vect-gather-4.c scan-tree-dump-not vect
"Loop contains only SLP stmts"

Per commit 85e2ce10f76aee93e43aab6558cf8e39cec911e4 "Fix
gcc.dg/vect/vect-gather-4.c for cascadelake", that one later additionally
changed:

FAIL: gcc.dg/vect/vect-gather-4.c scan-tree-dump-not vect [-"Loop contains
only SLP stmts"-]{+"vectorizing stmts using SLP"+}

..., but that's not relevant for the original regression.


In other words, if I locally revert that patch, these all PASS again.

[Bug testsuite/115140] [15 regression] libgomp.oacc-c++/../libgomp.oacc-c-c++-common/acc_prof-kernels-1.c excess errors after r15-579-ga9251ab3c91c8c

2024-05-24 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115140

Thomas Schwinge  changed:

   What|Removed |Added

   Host|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu,  |
   |*-*-solaris2.11 |
  Build|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu,  |
   |*-*-solaris2.11 |
 CC||tschwinge at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Target|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu,  |
   |*-*-solaris2.11 |
   Last reconfirmed||2024-05-24
 Ever confirmed|0   |1

--- Comment #3 from Thomas Schwinge  ---
So the PASS -> FAIL regressions are due to parloops (for '-O2') no longer
parallelizing the simple OpenACC 'kernels' construct at line 185 (and two
more):

int x[N];
#pragma acc kernels
{
  for (int i = 0; i < N; ++i)
    x[i] = i * i;
}

(In reply to Richard Biener from comment #1)
> Looks like a testsuite artifact?
> 
> volatile // TODO PR90488
> static int state = -1;
> 
> I've not looked as to why/how we are getting that to influence points-to
> solutions (note as we track also integers volatile on non-pointers can
> matter).

Yeah, it's not obvious to me how that 'state' variable would have such an
effect -- but I've not yet 'diff'ed the dumps.

On the other hand, it's highly likely that there is some relation, as no other
OpenACC 'kernels' test cases did regress.

[Bug c/114819] New: 'constructor', 'destructor' function attributes vs. function signature

2024-04-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114819

Bug ID: 114819
   Summary: 'constructor', 'destructor' function attributes vs.
function signature
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: diagnostic, documentation
  Severity: minor
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

In context of PR114818 "'constructor', 'destructor' function attributes vs.
'extern'", I also found that there's no user documentation that the
constructor, destructor function signature has to match 'void FN(void)', and
GCC currently doesn't check/diagnose this.

Should we update 'gcc/doc/extend.texi' for this, and implement a diagnostic
(warning or even error, enabled by default)?

I found that we only document in 'gcc/target.def':

/* Output a constructor for a symbol with a given priority.  */
DEFHOOK
(constructor,
 "If defined, a function that outputs assembler code to arrange to call\n\
the function referenced by @var{symbol} at initialization time.\n\
\n\
Assume that @var{symbol} is a @code{SYMBOL_REF} for a function taking\n\
no arguments and with no return value.  [...]

Note "a function taking no arguments and with no return value".
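
For illustration, a minimal sketch of a constructor that matches the 'void
FN(void)' shape discussed above.  The names 'setup' and 'constructor_calls'
are hypothetical and only serve this example; it assumes GCC's
'__attribute__((constructor))' extension.

```c
/* Counts how often the constructor below has run.  */
static int init_calls;

/* Matches 'void FN(void)': takes no arguments, returns no value.
   'setup' is a hypothetical name used only for this illustration.  */
__attribute__((constructor))
static void setup(void)
{
  ++init_calls;
}

/* Returns the number of constructor invocations seen so far.  */
int constructor_calls(void)
{
  return init_calls;
}
```

Calling 'constructor_calls' from 'main' is expected to return 1, as the
constructor runs once before 'main' is entered.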

[Bug c/114818] New: 'constructor', 'destructor' function attributes vs. 'extern'

2024-04-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114818

Bug ID: 114818
   Summary: 'constructor', 'destructor' function attributes vs.
'extern'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: diagnostic, documentation
  Severity: minor
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
  Target Milestone: ---

By chance, I noticed that when 'constructor', 'destructor' function attributes
appear on an 'extern' function declaration, then that is (a) accepted without
any diagnostic by the C, C++ front ends, but (b) no 'constructor', 'destructor'
calls are emitted.  (Doesn't matter whether the function does or doesn't get
linked in.)

Assuming that is the expected behavior, should we update 'gcc/doc/extend.texi'
for this, and implement a diagnostic (warning or even error, enabled by
default)?

I found that in 'gcc/doc/tm.texi', '@node Initialization' we state:

[...] Each
object file that defines an initialization function also puts a word in
the constructor section to point to that function.  [...]

Note "defines", which excludes 'extern'.

[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE

2024-04-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Keywords||openmp
   Last reconfirmed||2024-04-18
 CC||burnus at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #9 from Thomas Schwinge  ---
Reproduced with a native build on cfarm120.cfarm.net
(powerpc64le-unknown-linux-gnu, "POWER10 (architected), altivec supported"),
with '--enable-checking=yes,extra,rtl'.

And, 'libgomp.c/../libgomp.c-c++-common/ind-base-4.c',
'libgomp.c++/../libgomp.c-c++-common/ind-base-4.c' ICE in the same way.

[Bug libgomp/92840] [OpenACC] Disallow 'acc_unmap_data' for everything other than 'acc_map_data'

2024-04-16 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92840

--- Comment #5 from Thomas Schwinge  ---
As determined during patch review, there's still an unresolved issue:

On 2024-04-16T17:12:17+0800, Chung-Lin Tang  wrote:
> If we continue to use k->refcount itself as the flag holder of map type, I 
> guess we will not be able to directly determine whether it is a
> structured or dynamic adjustment at that point. Probably need a new field 
> entirely.

[Bug libgomp/92840] [OpenACC] Disallow 'acc_unmap_data' for everything other than 'acc_map_data'

2024-04-16 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92840

Thomas Schwinge  changed:

   What|Removed |Added

 CC||cltang at gcc dot gnu.org

--- Comment #4 from Thomas Schwinge  ---
commit r14-9991-ga7578a077ed8b64b94282aa55faf7037690abbc5 "OpenACC 2.7: Adjust
acc_map_data/acc_unmap_data interaction with reference counters"

[Bug driver/114717] '-fcf-protection' vs. offloading compilation

2024-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

--- Comment #5 from Thomas Schwinge  ---
Distributions injecting some '-fcf-protection' by default could also inject
'-foffload-options=amdgcn-amdhsa=-fno-cf-protection' (or similar) to keep the
default case of offloading compilation working, but then with explicit
user-specified '-fcf-protection', the user would still get an error for
offloading compilation -- which may actually be desirable (for some)?

Alternatively: yes, the 'mkoffload's could filter that out -- but there is a
policy question, whether 'mkoffload's are permitted to silently drop
user-requested '-f[...]' flags?  Probably that's OK if the '-fcf-protection'
documentation is updated accordingly?
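
To make the filtering alternative concrete, here is a rough sketch of such a
filter.  This is not the actual 'mkoffload' code: the function name
'drop_cf_protection' and its interface are assumptions made up for this
example.

```c
#include <string.h>

/* Copies the entries of ARGV into OUT, skipping every option that starts
   with "-fcf-protection" (with or without "=VALUE").  OUT must have room
   for ARGC entries.  Returns the number of entries kept.
   Hypothetical helper, not taken from any 'mkoffload'.  */
size_t drop_cf_protection(size_t argc, const char *const *argv,
                          const char **out)
{
  static const char prefix[] = "-fcf-protection";
  size_t kept = 0;

  for (size_t i = 0; i < argc; ++i)
    if (strncmp(argv[i], prefix, sizeof prefix - 1) != 0)
      out[kept++] = argv[i];

  return kept;
}
```

A real implementation would additionally have to settle the policy question
raised above, for example by emitting a note when it drops an option.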

I guess I don't have any strong preference.  ;-)

[Bug target/114718] New: GCN's '-march'es vs. default multilib

2024-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114718

Bug ID: 114718
   Summary: GCN's '-march'es vs. default multilib
   Product: gcc
   Version: 14.0
   URL: https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

When a specific multilib build for GCN's '-march'es is not available (has not
been 'configure'd/packaged), GCC resorts to the default multilib build -- which
in the case of GCN won't even link: 'ld: error: incompatible mach'.  Instead of
attempting the latter (default multilib build), should this case be diagnosed
properly, instead?

(Independent of the vague idea that multilib builds for GCN be made "more
permeable".)

Reported, for example, by Oscar Barenys in
.

[Bug target/114717] New: '-fcf-protection' vs. offloading compilation

2024-04-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114717

Bug ID: 114717
   Summary: '-fcf-protection' vs. offloading compilation
   Product: gcc
   Version: 14.0
   URL: https://github.com/gcc-mirror/gcc/commit/1bf18629c54adf4893c8db5227a36e1952ee69a3#commitcomment-140648051
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---
Target: GCN, nvptx

If '-fcf-protection' is in effect (as, for example, enabled by default in
certain distributions), that option gets forwarded to the offloading compilers,
but for both GCN and nvptx:

lto1: error: ‘-fcf-protection=full’ is not supported for this target

Originally reported by Oscar Barenys in
.

[Bug libgomp/114690] OpenMP 'indirect' clause: dynamic image loading/unloading

2024-04-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114690

--- Comment #1 from Thomas Schwinge  ---
I suggest that in the short term, we at least add a safeguard in the
'GOMP_OFFLOAD_load_image's to error out if 'GOMP_INDIRECT_ADDR_MAP' has already
been set (that should address (a), right?), and in the
'GOMP_OFFLOAD_unload_image's error out if 'GOMP_INDIRECT_ADDR_MAP' has been set
(that should address (b) -- right?).  (I'm assuming that stale mappings being
present may potentially be problematic?)

Those should be no-ops for the presumably common case that either dynamic
loading/unloading of images isn't used at all, or if it is, that no 'indirect'
clauses are actually present.

[Bug libgomp/114690] New: OpenMP 'indirect' clause: dynamic image loading/unloading

2024-04-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114690

Bug ID: 114690
   Summary: OpenMP 'indirect' clause: dynamic image
loading/unloading
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: burnus at gcc dot gnu.org, jakub at gcc dot gnu.org,
kcy at codesourcery dot com
  Target Milestone: ---

The OpenMP 'indirect' clause mapping table is not populated at image load time
(host-side), but upon the first device kernel invocation (device-side:
'build_indirect_map'), and is then immutable.

This is sufficient for a lot of cases, but breaks if additional images are
loaded after the first device kernel invocation (new mappings not added), or if
images get unloaded (stale mappings not retired).

Reference:

"[PATCH] openmp: Add support for the 'indirect' clause in C/C++", "Also, for my
understanding: [...]" ff.
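
The failure mode can be modeled with a toy example.  This is not libgomp
code: all names ('toy_load_image', 'toy_launch_and_lookup', and so on) are
hypothetical, and the "table" is just an array of integer IDs that gets
built once at the first simulated kernel launch and is never updated
afterwards.

```c
#include <stddef.h>

#define TOY_MAX 8

/* IDs provided by all "images" loaded so far.  */
static int toy_loaded[TOY_MAX];
static size_t toy_n_loaded;

/* Lookup table, built once at the first "kernel launch".  */
static int toy_table[TOY_MAX];
static size_t toy_n_table;
static int toy_table_built;

/* Simulates loading an image that provides one mapping, ID.  */
void toy_load_image(int id)
{
  if (toy_n_loaded < TOY_MAX)
    toy_loaded[toy_n_loaded++] = id;
}

/* Simulates a kernel launch that looks up ID.  The table is populated on
   the first call only; later loads are not picked up.  Returns 1 if ID is
   found, 0 otherwise.  */
int toy_launch_and_lookup(int id)
{
  if (!toy_table_built)
    {
      for (size_t i = 0; i < toy_n_loaded; ++i)
        toy_table[i] = toy_loaded[i];
      toy_n_table = toy_n_loaded;
      toy_table_built = 1;
    }

  for (size_t i = 0; i < toy_n_table; ++i)
    if (toy_table[i] == id)
      return 1;
  return 0;
}
```

In this model, a mapping loaded before the first launch is found, but one
loaded after it is not, which corresponds to the "new mappings not added"
case.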

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-03-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |burnus at gcc dot gnu.org
 CC||rguenth at gcc dot gnu.org
 Status|REOPENED|ASSIGNED

--- Comment #5 from Thomas Schwinge  ---
Tobias is working on this.

[Bug middle-end/112653] PTA should handle correctly escape information of values returned by a function

2024-03-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #16 from Thomas Schwinge  ---
By means of facilitating an additional '-Wuninitialized' diagnostic, this
commit r14-5879-gf7884f7673444b8a2c10ea0981d480f2e82dd16a
"tree-optimization/112653 - PTA and return" found a bug in GCC/Rust front end
C++ constructor code: see 
"`Block.Rust::AST::ExprWithoutBlock::Rust::AST::Expr.Rust::AST::Expr::node_id’
is used uninitialized [-Werror=uninitialized]`".  :-)

[Bug rust/113499] crab1 fails to link when configuring with --disable-plugin

2024-03-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113499

Thomas Schwinge  changed:

   What|Removed |Added

   See Also||https://github.com/Rust-GCC
   ||/gccrs/issues/2890
 CC||cohenarthur at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #4 from Thomas Schwinge  ---
If I understood Arthur correctly, GCC/Rust is going to effectively require
'dlopen' (and therefore '--enable-plugin'?).  So if the latter is not
available, do we have to auto-disable the Rust language front end when it was
enabled via '--enable-languages=all', vs. raise a 'configure'-time error when
it was enabled via explicit '--enable-languages=rust'?

Related is also  "Don't
hard-code `-ldl -lpthread` for `format_args`".

[Bug target/114302] New: [14 Regression] GCN regressions after: vect: Tighten vect_determine_precisions_from_range [PR113281]

2024-03-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302

Bug ID: 114302
   Summary: [14 Regression] GCN regressions after: vect: Tighten
vect_determine_precisions_from_range [PR113281]
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: minor
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

If my tracking/bisecting is to be believed, commit
r14-8492-g1a8261e047f7a2c2b0afb95716f7615cba718cd1 "vect: Tighten
vect_determine_precisions_from_range [PR113281]" is causing a few
'scan-assembler' regressions for GCN, for all '-march'es; see full list below. 
(No execution test regressions, so presumably not wrong-code.)  Due to lack of
knowledge of the relevant parts, I can't tell what needs to be adjusted.

For example, for '-march=gfx90a', 'gcc.target/gcn/simd-math-5-char-16.c' we get
before vs. after:

--- before/simd-math-5-char-16-march=gfx90a.s 2024-03-04
10:49:00.532961673 +0100
+++ after/simd-math-5-char-16-march=gfx90a.s  2024-03-04
11:02:31.409941756 +0100
@@ -269,18 +269,20 @@
v_addc_co_u32   v7, s[22:23], 0, v7, s[22:23]
flat_load_ubyte v16, v[6:7] offset:0
s_waitcnt   0
+   v_mov_b32_sdwa  v16, sext(v16) src0_sel:BYTE_0
v_add_co_u32v4, s[22:23], s34, v1
v_mov_b32   v5, s35
v_addc_co_u32   v5, s[22:23], 0, v5, s[22:23]
flat_load_ubyte v17, v[4:5] offset:0
s_waitcnt   0
+   v_mov_b32_sdwa  v17, sext(v17) src0_sel:BYTE_0
s_add_u32   s40, s14, 80
s_addc_u32  s41, s15, 0
s_getpc_b64 s[42:43]
s_add_u32   s42, s42, __divv16hi3@rel32@lo+4
s_addc_u32  s43, s43, __divv16hi3@rel32@hi+4
-   v_mov_b32_sdwa  v9, sext(v17) src0_sel:BYTE_0
-   v_mov_b32_sdwa  v8, sext(v16) src0_sel:BYTE_0
+   v_mov_b32   v9, v17
+   v_mov_b32   v8, v16
s_swappc_b64s[18:19], s[42:43]
s_mov_b64   exec, 65535
v_mov_b32_sdwa  v8, v8 dst_sel:BYTE_0 dst_unused:UNUSED_PAD
src0_sel:WORD_0
@@ -291,12 +293,13 @@
s_add_u32   s38, s14, 64
s_addc_u32  s39, s15, 0
s_getpc_b64 s[44:45]
-   s_add_u32   s44, s44, __modv16qi3@rel32@lo+4
-   s_addc_u32  s45, s45, __modv16qi3@rel32@hi+4
-   v_mov_b32   v9, v17
-   v_mov_b32   v8, v16
+   s_add_u32   s44, s44, __modv16si3@rel32@lo+4
+   s_addc_u32  s45, s45, __modv16si3@rel32@hi+4
+   v_mov_b32_sdwa  v9, sext(v17) src0_sel:WORD_0
+   v_mov_b32_sdwa  v8, sext(v16) src0_sel:WORD_0
s_swappc_b64s[18:19], s[44:45]
s_mov_b64   exec, 65535
+   v_mov_b32_sdwa  v8, v8 dst_sel:BYTE_0 dst_unused:UNUSED_PAD
src0_sel:DWORD
v_add_co_u32v4, s[22:23], s38, v1
v_mov_b32   v5, s39
v_addc_co_u32   v5, s[22:23], 0, v5, s[22:23]
@@ -334,8 +337,11 @@
v_addc_co_u32   v5, s[22:23], 0, v5, s[22:23]
flat_load_ubyte v9, v[4:5] offset:0
s_waitcnt   0
+   v_mov_b32_sdwa  v9, sext(v9) src0_sel:BYTE_0
+   v_mov_b32_sdwa  v8, sext(v8) src0_sel:BYTE_0
s_swappc_b64s[18:19], s[44:45]
s_mov_b64   exec, 65535
+   v_mov_b32_sdwa  v8, v8 dst_sel:BYTE_0 dst_unused:UNUSED_PAD
src0_sel:DWORD
v_add_co_u32v4, s[22:23], s42, v1
v_mov_b32   v5, s43
v_addc_co_u32   v5, s[22:23], 0, v5, s[22:23]
@@ -557,5 +563,5 @@
 .LEFDE0:
.globl  __modsi3
.globl  __divsi3
-   .globl  __modv16qi3
+   .globl  __modv16si3
.globl  __divv16hi3

Due to no registers getting renamed, that one is the smallest of all before vs.
after 'diff's; but is illustrative of what generally happens, as far as I can
tell.

Let me know if you'd like me to provide any artifacts.

Full list:

@@ -607,28 +607,28 @@ PASS: gcc.target/gcn/simd-math-5-char-16.c (test for
excess errors)
XFAIL: gcc.target/gcn/simd-math-5-char-16.c scan-assembler-times
__divmod16.i4@rel32@lo 1
PASS: gcc.target/gcn/simd-math-5-char-16.c scan-assembler-times
__divv16hi3@rel32@lo 1
PASS: gcc.target/gcn/simd-math-5-char-16.c scan-assembler-times
__divv16qi3@rel32@lo 0
[-PASS:-]{+FAIL:+} gcc.target/gcn/simd-math-5-char-16.c
scan-assembler-times __modv16qi3@rel32@lo 1
PASS: gcc.target/gcn/simd-math-5-char-16.c scan-assembler-times
__udivv16qi3@rel32@lo 0
PASS: gcc.target/gcn/simd-math-5-char-16.c scan-assembler-times

[Bug target/113331] AMDGCN: Compilation failure due to duplicate .LEHB/.LEHE symbols

2024-03-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113331

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed|2024-02-20 00:00:00 |2024-3-6
 CC||ams at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Status|ASSIGNED|NEW

--- Comment #4 from Thomas Schwinge  ---
(I've not yet started working on this, but) I've noticed that we run into the
same issue for 'libgomp.c++/firstprivate-2.C' that Jakub recently added in
commit r14-9257-g4f82d5a95a244d0aa4f8b2541b47a21bce8a191b "OpenMP/C++: Fix
(first)private clause with member variables [PR110347]":

spawn -ignore SIGHUP g++
../source-gcc/libgomp/testsuite/libgomp.c++/firstprivate-2.C [...] -fopenmp -O2
-lm -o ./firstprivate-2.exe
/tmp/ccLrOMGJ.mkoffload.2.s:215:1: error: symbol '.LEHB0' is already
defined
.LEHB0:
^
/tmp/ccLrOMGJ.mkoffload.2.s:241:1: error: symbol '.LEHE0' is already
defined
.LEHE0:
^
/tmp/ccLrOMGJ.mkoffload.2.s:341:1: error: symbol '.LEHB0' is already
defined
.LEHB0:
^
/tmp/ccLrOMGJ.mkoffload.2.s:367:1: error: symbol '.LEHE0' is already
defined
.LEHE0:
^
/tmp/ccLrOMGJ.mkoffload.2.s:467:1: error: symbol '.LEHB0' is already
defined
.LEHB0:
^
/tmp/ccLrOMGJ.mkoffload.2.s:493:1: error: symbol '.LEHE0' is already
defined
.LEHE0:
^
gcn mkoffload: fatal error: x86_64-pc-linux-gnu-accel-amdgcn-amdhsa-gcc
returned 1 exit status
[...]
FAIL: libgomp.c++/firstprivate-2.C (test for excess errors)

Again, that's for GCN offloading compilation only, but not nvptx.

[Bug target/113331] AMDGCN: Compilation failure due to duplicate .LEHB/.LEHE symbols

2024-02-20 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113331

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-20
 Status|UNCONFIRMED |ASSIGNED
 CC||tschwinge at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Thomas Schwinge  ---
Planning to look into this as part of my ongoing GPU C++ support task.

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-01-24 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Thomas Schwinge  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|RESOLVED|REOPENED
   Last reconfirmed||2024-01-24
 Resolution|FIXED   |---

--- Comment #3 from Thomas Schwinge  ---
Tobias, thanks for fixing the easy part ('s%803%900' default) -- however, the
harder part still remains to be done; see this issue's initial comment.

[Bug target/113022] GCN offloading bricked by "amdgcn: Work around XNACK register allocation problem"

2024-01-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113022

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
   Assignee|unassigned at gcc dot gnu.org  |ams at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge  ---
Resolved via commit r14-6997-g78dff4c25c1b959e4682d7da50d00fb371849a46 "amdgcn:
Match new XNACK defaults in mkoffload".

[Bug target/112937] [14 Regression] GCN: FAILs due to unconditional 'f->use_flat_addressing = true;'

2024-01-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112937

--- Comment #3 from Thomas Schwinge  ---
The GCN offloading 'libgomp.fortran/target1.f90' regression has been cured by
commit r14-6996-gc5c3aab38132ea34dc1ee69d93fded787e6ac7a4 "amdgcn: Don't
double-count AVGPRs" (..., but not the GCN target regressions that I initially
reported here).

[Bug rust/113056] [14 regression] Build failure in libgrust

2024-01-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113056

Thomas Schwinge  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #14 from Thomas Schwinge  ---
Should be fixed.  If not, please re-open providing more data.

[Bug libstdc++/112997] _Unwind_Exception conflicts with void*. failed to build with clang

2024-01-08 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112997

Thomas Schwinge  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #11 from Thomas Schwinge  ---
ACK; I had the same change in my WIP tree:

In my GCN and nvptx target libstdc++ work (WIP), I see:

[...]/source-gcc/libstdc++-v3/libsupc++/eh_call.cc:39:1: error:
conflicting C language linkage declaration ‘void __cxa_call_terminate(void*)’
[-Werror]
   39 | __cxa_call_terminate(void* ue_header_in) throw ()
  | ^~~~
In file included from
[...]/source-gcc/libstdc++-v3/libsupc++/eh_call.cc:28:
[...]/source-gcc/libstdc++-v3/libsupc++/unwind-cxx.h:170:17: note:
previous declaration ‘void
__cxxabiv1::__cxa_call_terminate(_Unwind_Exception*)’
  170 | extern "C" void __cxa_call_terminate (_Unwind_Exception*) throw
()
  | ^~~~
cc1plus: all warnings being treated as errors
make[4]: *** [eh_call.lo] Error 1

[Bug libgomp/113192] [11/12/13/14 Regression] ERROR: couldn't execute "../../../gcc/libgomp/testsuite/flock": no such file or directory

2024-01-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113192

Thomas Schwinge  changed:

   What|Removed |Added

Summary|[14 Regression] ERROR:  |[11/12/13/14 Regression]
   |couldn't execute|ERROR: couldn't execute
   |"../../../gcc/libgomp/tests |"../../../gcc/libgomp/tests
   |uite/flock": no such file   |uite/flock": no such file
   |or directory|or directory
 Blocks||66005
Version|14.0|11.0

--- Comment #1 from Thomas Schwinge  ---
(In reply to John David Anglin from comment #0)
> HP-UX doesn't have flock but it does have perl. configure tries to create
> a fallback but a relative path to libgomp/testsuite/flock is generated.
> It is wrong when the testsuite is run.
> 
> AC_MSG_NOTICE([checking for flock implementation])
> AC_CHECK_PROGS(FLOCK, flock)
> # Fallback if 'perl' is available.
> if test -z "$FLOCK"; then
>   AC_CHECK_PROG(FLOCK, perl, $srcdir/testsuite/flock)
> fi

Aha, sorry.  Does it work if you change:

-AC_CHECK_PROG(FLOCK, perl, $srcdir/testsuite/flock)
+AC_CHECK_PROG(FLOCK, perl, $ac_abs_srcdir/testsuite/flock)

..., so that this:

> configure: checking for flock implementation
> checking for flock... no
> checking for perl... ../../../gcc/libgomp/testsuite/flock

... turns into an absolute path, to resolve:

> Running /home/dave/gnu/gcc/gcc/libgomp/testsuite/libgomp.c/c.exp ...
> ERROR: tcl error sourcing
> /home/dave/gnu/gcc/gcc/libgomp/testsuite/libgomp.c/c.exp.
> ERROR: tcl error code NONE
> ERROR: couldn't execute "../../../gcc/libgomp/testsuite/flock": no such file
> or directory

... this.

If that works, and you submit a patch, please in the commit log cite this
commit:

> This problem was introduced by the following commit:
> 
> commit 04abe1944d30eb18a2060cfcd9695d085f7b4752
> Author: Thomas Schwinge 
> Date:   Mon May 15 20:00:07 2023 +0200
> 
> Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

..., and also specify 'PR testsuite/66005' in the commit log.

If that suggestion doesn't easily resolve this issue, then I'll be able to look
into it next week.


> It appears this problem can be worked around by exporting FLOCK.

But only if that specifies an absolute path (or a "suitable" relative one), I
suppose?


I've also set this "11/12/13 Regression", as these branches use the exact same
code.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005
[Bug 66005] libgomp make check time is excessive

[Bug rtl-optimization/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

Thomas Schwinge  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113097,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113098
   Assignee|unassigned at gcc dot gnu.org  |vmakarov at gcc dot gnu.org

[Bug testsuite/113005] 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2023-12-21 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

Thomas Schwinge  changed:

   What|Removed |Added

  Component|libfortran  |testsuite
   Last reconfirmed||2023-12-21
 Target|powerpc64le-linux-gnu   |
 CC||burnus at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Thomas Schwinge  ---
Turns out, this isn't actually specific to powerpc64le-linux-gnu, but rather
the following: my testing where I saw the timeouts was not build-tree 'make
check' testing, but instead "installed" testing (where you invoke 'runtest' on
a 'make install'ed GCC tree).  In that case, r266482 "Tweak libgomp env vars in
parallel make check (take 2)" is not in effect, that is, there's no limiting to
'OMP_NUM_THREADS=8'.

For example, manually running the '-O0' variant of
'libgomp.fortran/rwlock_1.f90' on a "big-iron" x86_64-pc-linux-gnu system:

$ grep ^model\ name < /proc/cpuinfo | uniq -c
256 model name  : AMD EPYC 7V13 64-Core Processor
$ \time env OMP_NUM_THREADS=[...] LD_LIBRARY_PATH=[...] ./rwlock_1.exe

..., I produce the following data on an idle system:

'OMP_NUM_THREADS=8':

0.16user 0.56system 0:02.36elapsed 31%CPU (0avgtext+0avgdata 4452maxresident)k
0.17user 0.54system 0:02.30elapsed 30%CPU (0avgtext+0avgdata 4532maxresident)k

'OMP_NUM_THREADS=16':

0.40user 1.03system 0:04.52elapsed 31%CPU (0avgtext+0avgdata 5832maxresident)k
0.49user 0.99system 0:04.39elapsed 33%CPU (0avgtext+0avgdata 5876maxresident)k

'OMP_NUM_THREADS=32':

0.98user 2.36system 0:09.33elapsed 35%CPU (0avgtext+0avgdata 8528maxresident)k
0.98user 2.25system 0:09.02elapsed 35%CPU (0avgtext+0avgdata 8548maxresident)k

'OMP_NUM_THREADS=64':

1.82user 5.83system 0:18.44elapsed 41%CPU (0avgtext+0avgdata 13952maxresident)k
1.54user 6.03system 0:18.22elapsed 41%CPU (0avgtext+0avgdata 13996maxresident)k

'OMP_NUM_THREADS=128':

3.71user 12.41system 0:38.02elapsed 42%CPU (0avgtext+0avgdata 24376maxresident)k
3.96user 12.52system 0:39.34elapsed 41%CPU (0avgtext+0avgdata 24476maxresident)k

'OMP_NUM_THREADS=256' (or not set, for that matter):

9.65user 25.19system 1:20.93elapsed 43%CPU (0avgtext+0avgdata 45816maxresident)k
8.99user 25.82system 1:19.40elapsed 43%CPU (0avgtext+0avgdata 45636maxresident)k

For comparison, if I remove 'LD_LIBRARY_PATH', such that the system-wide GCC 10
libraries are used, I get for the latter case:

9.28user 24.54system 1:22.09elapsed 41%CPU (0avgtext+0avgdata 45588maxresident)k
11.26user 24.51system 1:24.32elapsed 42%CPU (0avgtext+0avgdata 45712maxresident)k

..., so only a little bit of an improvement of the new "rwlock" libgfortran vs.
old "mutex" GCC 10 one, curiously.  (But supposedly that depends on the
hardware or other factors?)

Anyway: should these test cases be limiting themselves to some lower
'OMP_NUM_THREADS', for example via 'num_threads' clauses?

The powerpc64le-linux-gnu systems:

$ grep ^cpu < /proc/cpuinfo | uniq -c

160 cpu : POWER8 (raw), altivec supported

152 cpu : POWER8NVL (raw), altivec supported

128 cpu : POWER9, altivec supported
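
As a rough cross-check of the x86_64 timings quoted in this comment, the following sketch (not part of the original report) computes how the mean elapsed time changes each time 'OMP_NUM_THREADS' doubles. The numbers are copied from the '\time' output above; the helper names 'mean' and 'doubling_ratios' are made up for illustration.

```python
# Elapsed wall-clock seconds from the '\time' runs quoted above
# (two runs per OMP_NUM_THREADS setting; "1:20.93" is 80.93 s).
elapsed = {
    8: [2.36, 2.30],
    16: [4.52, 4.39],
    32: [9.33, 9.02],
    64: [18.44, 18.22],
    128: [38.02, 39.34],
    256: [80.93, 79.40],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def doubling_ratios(samples):
    """Map each thread count to mean elapsed time relative to half as many threads."""
    ratios = {}
    for threads in sorted(samples):
        half = threads // 2
        if half in samples:
            ratios[threads] = mean(samples[threads]) / mean(samples[half])
    return ratios


ratios = doubling_ratios(elapsed)
for threads, ratio in ratios.items():
    print(f"{threads:3d} threads: {ratio:.2f}x the elapsed time of {threads // 2}")
```

On this data set every ratio comes out close to 2, that is, elapsed time grows roughly linearly with the number of threads. This only summarizes the measurements above and says nothing about the cause.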

[Bug rtl-optimization/112918] [m68k] [LRA] ICE: maximum number of generated reload insns per insn achieved (90)

2023-12-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #14 from Thomas Schwinge  ---
*** Bug 112265 has been marked as a duplicate of this bug. ***

[Bug target/112265] [14 Regression] GCN offloading 'libgomp.c-c++-common/for-5.c': 'internal compiler error: maximum number of generated reload insns per insn achieved (90)'

2023-12-19 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112265

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED
   Assignee|unassigned at gcc dot gnu.org  |vmakarov at gcc dot gnu.org

--- Comment #4 from Thomas Schwinge  ---
Resolved via commit r14-6667-g989e67f827b74b76e58abe137ce12d948af2290c
"[PR112918][LRA]: Fixing IRA ICE on m68k".

*** This bug has been marked as a duplicate of bug 112918 ***

[Bug rust/113056] [14 regression] Build failure in libgrust

2023-12-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113056

Thomas Schwinge  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #12 from Thomas Schwinge  ---
(In reply to Sam James from comment #0)
> checking for suffix of object files... configure: error: in
> `/var/tmp/portage/sys-devel/gcc-14.0.0_pre20231217/work/build/32/libgrust':
> configure: error: cannot compute suffix of object files: cannot compile
> See `config.log' for more details
> make[1]: *** [Makefile:16176: configure-libgrust] Error 1

Notice that this is the *host* libgrust build -- unexpectedly multilibbed. 
Please test

"libgrust: 'AM_ENABLE_MULTILIB' only for target builds [PR113056]".

[Bug rust/113056] [14 regression] Build failure in libgrust

2023-12-18 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113056

Thomas Schwinge  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |tschwinge at gcc dot gnu.org
 Status|WAITING |ASSIGNED
 CC||tschwinge at gcc dot gnu.org

--- Comment #11 from Thomas Schwinge  ---
Reproduced, and I think I know what's happening.

[Bug target/113022] New: GCN offloading bricked by "amdgcn: Work around XNACK register allocation problem"

2023-12-14 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113022

Bug ID: 113022
   Summary: GCN offloading bricked by "amdgcn: Work around XNACK
register allocation problem"
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: openacc, openmp, testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, jules at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

I've not seen a problem in GCN target testing, but GCN offloading -- at least
in my testing -- is bricked (for non-'-march=gfx90a'?) by the recent commit
r14-6503-g4c12bcbeb0c0fd6da4c56e7622814201daadd585 "amdgcn: Work around XNACK
register allocation problem":

/tmp/ccwsYf5g.mkoffload.2.s:1:17: error: .amdgcn_target directive's target
id amdgcn-unknown-amdhsa--gfx900:xnack- does not match the specified target id
amdgcn-unknown-amdhsa--gfx900
.amdgcn_target "amdgcn-unknown-amdhsa--gfx900:xnack-"
   ^
/tmp/ccwsYf5g.mkoffload.2.s:29:4: error: .amdhsa_reserve_xnack_mask does
not match target id
  .amdhsa_reserve_xnack_mask0
  ^~
[...]

Reverting that commit is my workaround for the time being.

Is maybe simply something missing in GCN 'mkoffload'?

[Bug libfortran/113005] New: 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test timeouts

2023-12-13 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005

Bug ID: 113005
   Summary: 'libgomp.fortran/rwlock_1.f90',
'libgomp.fortran/rwlock_3.f90' execution test timeouts
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: hjl at gcc dot gnu.org
  Target Milestone: ---
Target: powerpc64le-linux-gnu

On several of our "big iron" powerpc64le-linux-gnu systems, I'm seeing the new
test cases 'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' run
into execution test timeouts (300 s).  Those were added in commit
r14-6425-gb806c88fab3f9c6833563f9a44b608dd5dd14de9 "libgfortran: Replace mutex
with rwlock".

PASS: libgomp.fortran/rwlock_1.f90   -O0  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_1.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O1  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_1.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O2  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_1.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess
errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O3 -g  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_1.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_1.f90   -Os  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_1.f90   -Os  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess
errors)
PASS: libgomp.fortran/rwlock_2.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O3 -g  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_2.f90   -Os  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -Os  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O0  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_3.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O1  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_3.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O2  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_3.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess
errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_3.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O3 -g  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_3.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_3.f90   -Os  (test for excess errors)
WARNING: program timed out.
FAIL: libgomp.fortran/rwlock_3.f90   -Os  execution test

All-PASS on all x86_64-pc-linux-gnu systems that I've tested.

[Bug analyzer/112955] Valgrind error in ana::feasibility_state::maybe_update_for_edge

2023-12-12 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112955

Thomas Schwinge  changed:

   What|Removed |Added

 CC||danglin at gcc dot gnu.org

--- Comment #4 from Thomas Schwinge  ---
*** Bug 112704 has been marked as a duplicate of this bug. ***

[Bug analyzer/112704] FAIL: gcc.dg/analyzer/data-model-20.c (test for warnings, line 17)

2023-12-12 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112704

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #2 from Thomas Schwinge  ---
Should be resolved via commit
r14-6434-g6008b80b25d71827fb26ce49f49aae02b645bb12 "analyzer: fix uninitialized
bitmap [PR112955]".

*** This bug has been marked as a duplicate of bug 112955 ***

[Bug c++/112847] [14 Regression] nvptx: 'FAIL: g++.dg/cpp2a/concepts-explicit-inst1.C -std=c++20 scan-assembler _Z1gI1XEvT_', 'scan-assembler _Z1gI1YEvT_'

2023-12-12 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112847

Thomas Schwinge  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge  ---
This one (but not PR112846 "nvptx: 'FAIL: g++.dg/abi/anon6.C -std=c++20
scan-assembler
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv'",
which I'd filed at the same time) has been fixed by commit
r14-6432-g074c6f15f7a28c620c756f18c2a310961de00539 "testsuite: update
mangling", I presume.  (The new 'g++.dg/cpp2a/concepts-explicit-inst1a.C' also
is all-PASS.)

[Bug target/112937] [14 Regression] GCN: FAILs due to unconditional 'f->use_flat_addressing = true;'

2023-12-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112937

--- Comment #1 from Thomas Schwinge  ---
The unconditional GCN 'f->use_flat_addressing = true;' also has an effect on
one (only!) libgomp offloading test case, for
'-foffload-options=amdgcn-amdhsa=-march=gfx90a' (only!):

@@ -6188,11 +6188,11 @@ PASS: libgomp.fortran/target1.f90   -O1  execution
test
PASS: libgomp.fortran/target1.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/target1.f90   -O2  execution test
PASS: libgomp.fortran/target1.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
{+WARNING: program timed out.+}
[-PASS:-]{+FAIL:+} libgomp.fortran/target1.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: libgomp.fortran/target1.f90   -O3 -g  (test for excess errors)
{+WARNING: program timed out.+}
[-PASS:-]{+FAIL:+} libgomp.fortran/target1.f90   -O3 -g  execution test
PASS: libgomp.fortran/target1.f90   -Os  (test for excess errors)
PASS: libgomp.fortran/target1.f90   -Os  execution test

libgomp: GCN fatal error: Asynchronous queue error
Runtime message: HSA_STATUS_ERROR_INVALID_ISA: The instruction set
architecture is invalid.
[hangs]

Huh!?  That looks very odd (to me, at least).

Manually trying with simply '-O3', that appears to be 100 % reproducible on our
gfx90a systems -- but nowhere else.  ..., and disappears with the unconditional
GCN 'f->use_flat_addressing = true;' reverted.  (..., which of course would
regress other new test cases.)

[Bug libgomp/112264] Occasionally (but very rare): 'FAIL: libgomp.fortran/target-nowait-array-section.f90 -O execution test'

2023-12-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112264

Thomas Schwinge  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-12-09
 Ever confirmed|0   |1

[Bug target/112937] New: [14 Regression] GCN: FAILs due to unconditional 'f->use_flat_addressing = true;'

2023-12-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112937

Bug ID: 112937
   Summary: [14 Regression] GCN: FAILs due to unconditional
'f->use_flat_addressing = true;'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: ams at gcc dot gnu.org, jules at gcc dot gnu.org
  Target Milestone: ---
Target: GCN

The unconditional GCN 'f->use_flat_addressing = true;' applied as part of
commit r14-6226-ge7d6c277fa28c0b9b621d23c471e0388d2912644 "amdgcn, libgomp:
low-latency allocator" is causing a few regressions for GCN target (not
offloading) testing (tested '-march=gfx906', '-march=gfx90a'):

C:

[-PASS:-]{+FAIL:+} gcc.dg/pr64935-1.c (test for excess errors)

xgcc: error: [...]/gcc.dg/pr64935-1.c: '-fcompare-debug' failure (length)

Fortran:

PASS: gfortran.dg/coarray/fail_image_2.f08 -fcoarray=single  -O2  (test for
excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=single 
-O2  execution test

PASS: gfortran.dg/team_change_1.f90   -O0  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/team_change_1.f90   -O0  execution test
[Etc.]

PASS: gfortran.dg/team_end_1.f90   -O0  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/team_end_1.f90   -O0  execution test
[Etc.]

PASS: gfortran.dg/team_form_1.f90   -O0  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/team_form_1.f90   -O0  execution test
[Etc.]

PASS: gfortran.dg/team_number_1.f90   -O0  (test for excess errors)
[-PASS:-]{+FAIL:+} gfortran.dg/team_number_1.f90   -O0  execution test
[Etc.]

These execution test FAILs are generally of the form:

Memory access fault by GPU node-2 (Agent handle: 0x20a1d40) on address
0x7f56. Reason: Page not present or supervisor privilege.

Additionally, I'm seeing the following in my libstdc++ enablement tree:

PASS: std/ranges/iota/max_size_type.cc  -std=gnu++20 (test for excess
errors)
{+WARNING: std/ranges/iota/max_size_type.cc  -std=gnu++20 execution test
program timed out.+}
[-PASS:-]{+FAIL:+} std/ranges/iota/max_size_type.cc  -std=gnu++20 execution
test
PASS: std/ranges/iota/max_size_type.cc  -std=gnu++26 (test for excess
errors)
{+WARNING: std/ranges/iota/max_size_type.cc  -std=gnu++26 execution test
program timed out.+}
[-PASS:-]{+FAIL:+} std/ranges/iota/max_size_type.cc  -std=gnu++26 execution
test

(I guess I could provide pre-processed files for those if you'd like to
reproduce.)

To restore the Fortran test cases, just reverting the GCN back end change is
not sufficient: also need to rebuild GCN target libraries.  (That is, the GCN
back end change does affect code generated for GCN target libraries.)

[Bug libgcc/109289] Conflicting types for built-in functions in libgcc/emutls.c

2023-12-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109289

Thomas Schwinge  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Thomas Schwinge  ---
(In reply to myself from comment #2)
> Similarly seen for GCN target, and this is now fatal

Actually, sorry, that's not accurate; the "warning: conflicting types" didn't
turn fatal.

Anyway: 'libgcc/emutls.c' should now be clean to build; please re-open if not.

[Bug libstdc++/112858] [14 Regression] nvptx: 'unresolved symbol __cxa_thread_atexit_impl'

2023-12-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112858

--- Comment #5 from Thomas Schwinge  ---
(I did see that the '__cxa_thread_atexit_impl' issue has been resolved
differently, but there is a genuine GCC/nvptx issue here.)

(In reply to myself from comment #1)
> Indeed in 'build-gcc/nvptx-none/libstdc++-v3/libsupc++/atexit_thread.o' I
> see:
> 
> // BEGIN GLOBAL FUNCTION DECL: __cxa_thread_atexit_impl
> .extern .func (.param .u32 %value_out) __cxa_thread_atexit_impl (.param
> .u64 %in_ar0, .param .u64 %in_ar1, .param .u64 %in_ar2);
> 
> That is, '.extern' instead of '.weak' linking directive, huh.

That one indeed is a GCC/nvptx back end issue.  A fix might look similar to the
following:

--- gcc/config/nvptx/nvptx.cc
+++ gcc/config/nvptx/nvptx.cc
@@ -1001,10 +1001,11 @@ write_fn_proto_1 (std::stringstream &s, bool is_defn,
  const char *name, const_tree decl, bool force_public)
 {
-  if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL)
+  if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl)) == NULL
+  && !DECL_WEAK (decl))
 write_fn_marker (s, is_defn, TREE_PUBLIC (decl) || force_public,
name);

   /* PTX declaration.  */
   if (DECL_EXTERNAL (decl))
-s << ".extern ";
+s << (DECL_WEAK (decl) ? ".weak " : ".extern ");
   else if (TREE_PUBLIC (decl) || force_public)
 s << (DECL_WEAK (decl) ? ".weak " : ".visible ");

> ..., but still doing the NULL check: [...]

..., and that check ('if (__cxa_thread_atexit_impl)') then fails to assemble,
and thus the build (!) fails:

ptxas fatal   : Cannot take address of function '__cxa_thread_atexit_impl' 

Thus, more smarts are needed to make "weak, undefined" work.  (May be able to
fix this up in the linker, assuming seeing the whole program; similar to
PR105018 "[nvptx] Need better alias support" ideas?)  (For reference, "weak,
defined" does not run into this problem.)
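
To make the effect of the diff sketched above easier to see, here is a toy Python model (not GCC code) of which PTX linkage directive gets chosen before and after the proposed change. The function name 'ptx_linkage' and its boolean parameters are hypothetical; they merely stand in for 'DECL_EXTERNAL', 'TREE_PUBLIC || force_public' and 'DECL_WEAK'.

```python
def ptx_linkage(external, public, weak, patched):
    """Toy model of the linkage directive selection in write_fn_proto_1.

    external ~ DECL_EXTERNAL (decl)
    public   ~ TREE_PUBLIC (decl) || force_public
    weak     ~ DECL_WEAK (decl)
    patched  selects the behaviour from the diff sketched above.
    All of these names are illustrative only.
    """
    if external:
        if patched and weak:
            return ".weak"
        return ".extern"
    if public:
        return ".weak" if weak else ".visible"
    return ""  # no linkage directive


# The weak, undefined '__cxa_thread_atexit_impl' declaration:
print(ptx_linkage(external=True, public=True, weak=True, patched=False))
print(ptx_linkage(external=True, public=True, weak=True, patched=True))
```

In this model only the "weak, undefined" case changes (from '.extern' to '.weak'); "weak, defined" declarations already got '.weak' before. As noted above, emitting '.weak' alone is not enough, because the address-of check then fails in 'ptxas'.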

[Bug libgcc/109289] Conflicting types for built-in functions in libgcc/emutls.c

2023-12-06 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109289

Thomas Schwinge  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org

--- Comment #8 from Thomas Schwinge  ---
(In reply to myself from comment #2)
> [...]/source-gcc/libgcc/emutls.c: In function ‘__emutls_get_address’:
> [...]/source-gcc/libgcc/emutls.c:172:13: error: implicit declaration of
> function ‘calloc’ [-Wimplicit-function-declaration]
>   172 |   arr = calloc (size + 1, sizeof (void *));
>   | ^~
> [...]/source-gcc/libgcc/emutls.c:32:1: note: include ‘<stdlib.h>’ or
> provide a declaration of ‘calloc’
>31 | #include "gthr.h"
>   +++ |+#include <stdlib.h>
>32 |
> [...]/source-gcc/libgcc/emutls.c:172:13: warning: incompatible implicit
> declaration of built-in function ‘calloc’ [-Wbuiltin-declaration-mismatch]
>   172 |   arr = calloc (size + 1, sizeof (void *));
>   | ^~
> [...]/source-gcc/libgcc/emutls.c:172:13: note: include ‘<stdlib.h>’ or
> provide a declaration of ‘calloc’
> [...]/source-gcc/libgcc/emutls.c:184:13: error: implicit declaration of
> function ‘realloc’ [-Wimplicit-function-declaration]
>   184 |   arr = realloc (arr, (size + 1) * sizeof (void *));
>   | ^~~
> [...]/source-gcc/libgcc/emutls.c:184:13: note: include ‘<stdlib.h>’ or
> provide a declaration of ‘realloc’
> [...]/source-gcc/libgcc/emutls.c:184:13: warning: incompatible implicit
> declaration of built-in function ‘realloc’ [-Wbuiltin-declaration-mismatch]
> [...]/source-gcc/libgcc/emutls.c:184:13: note: include ‘<stdlib.h>’ or
> provide a declaration of ‘realloc’

> GCC's suggestion to "include ‘<stdlib.h>’" needs to be carefully reviewed,
> in case this is meant to be buildable in an environment without C library
> headers?

(In reply to Florian Weimer from comment #3)
> Thomas, the safe thing to do would be to use __builtin_calloc and
> __builtin_realloc in those spots because it avoids a dependency on an
> external header that might not exist at this point.

That part got resolved differently, in commit
r14-6207-g6e84dafcc72d1cd6d028b42f1801e092a91d3214 "tsystem.h: Declare
calloc/realloc #ifdef inhibit_libc".

[Bug libgcc/109289] Conflicting types for built-in functions in libgcc/emutls.c

2023-12-05 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109289

--- Comment #7 from Thomas Schwinge  ---
Created attachment 56805
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56805&action=edit
'0001-WIP-GCC-PR109289-Conflicting-types-for-built-in-func.patch'

Attaching my current WIP patch.  I may later bring this to completion
(properly), unless anyone gets there first.

[Bug target/112858] nvptx: 'unresolved symbol __cxa_thread_atexit_impl'

2023-12-05 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112858

--- Comment #2 from Thomas Schwinge  ---
Earlier in 'libstdc++-v3/libsupc++/atexit_thread.cc', we have:

[...]
#if _GLIBCXX_HAVE___CXA_THREAD_ATEXIT

// Libc provides __cxa_thread_atexit definition.

#elif _GLIBCXX_HAVE___CXA_THREAD_ATEXIT_IMPL

extern "C" int __cxa_thread_atexit_impl (void (_GLIBCXX_CDTOR_CALLABI
*func) (void *),
 void *arg, void *d);
extern "C" int
__cxxabiv1::__cxa_thread_atexit (void (_GLIBCXX_CDTOR_CALLABI *dtor)(void
*),
 void *obj, void *dso_handle)
  _GLIBCXX_NOTHROW
{
  return __cxa_thread_atexit_impl (dtor, obj, dso_handle);
}

#else /* _GLIBCXX_HAVE___CXA_THREAD_ATEXIT_IMPL */
[...]

..., that is, indeed a non-weak '__cxa_thread_atexit_impl' declaration --
however, that's active only if not '_GLIBCXX_HAVE___CXA_THREAD_ATEXIT' and not
'_GLIBCXX_HAVE___CXA_THREAD_ATEXIT_IMPL', but we have per
'include/nvptx-none/bits/c++config.h':

/* Define to 1 if you have the `__cxa_thread_atexit' function. */
/* #undef _GLIBCXX_HAVE___CXA_THREAD_ATEXIT */

/* Define to 1 if you have the `__cxa_thread_atexit_impl' function. */
/* #undef _GLIBCXX_HAVE___CXA_THREAD_ATEXIT_IMPL */

[Bug target/112858] nvptx: 'unresolved symbol __cxa_thread_atexit_impl'

2023-12-05 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112858

--- Comment #1 from Thomas Schwinge  ---
Indeed in 'build-gcc/nvptx-none/libstdc++-v3/libsupc++/atexit_thread.o' I see:

// BEGIN GLOBAL FUNCTION DECL: __cxa_thread_atexit_impl
.extern .func (.param .u32 %value_out) __cxa_thread_atexit_impl (.param
.u64 %in_ar0, .param .u64 %in_ar1, .param .u64 %in_ar2);

That is, '.extern' instead of '.weak' linking directive, huh.

..., but still doing the NULL check:

[...]
.reg .u64 %r29;
.reg .pred %r30;
[...]
mov.u64 %r29,__cxa_thread_atexit_impl;
setp.eq.u64 %r30,%r29,0;
@ %r30 bra $L9;
.loc 2 156 37
{
.param .u32 %value_in;
.param .u64 %out_arg1;
st.param.u64 [%out_arg1],%r26;
.param .u64 %out_arg2;
st.param.u64 [%out_arg2],%r27;
.param .u64 %out_arg3;
st.param.u64 [%out_arg3],%r28;
call (%value_in),__cxa_thread_atexit_impl,(%out_arg1,%out_arg2,%out_arg3);
ld.param.u32 %r34,[%value_in];
}
mov.u32 %r25,%r34;
.loc 2 156 59
bra $L8;
$L9:
[...]

[Bug target/112858] New: nvptx: 'unresolved symbol __cxa_thread_atexit_impl'

2023-12-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112858

Bug ID: 112858
   Summary: nvptx: 'unresolved symbol __cxa_thread_atexit_impl'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: vries at gcc dot gnu.org
  Target Milestone: ---

With commit r14-6082-gf4dd9416843308d4ae519983415fe62212662536 "libsupc++: try
cxa_thread_atexit_impl at runtime", there's one regression in nvptx target
testing (only visible on top of my WIP C++ enablement changes):

[-PASS:-]{+FAIL:+} g++.dg/tls/thread_local6.C  -std=c++14 (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} g++.dg/tls/thread_local6.C  -std=c++14 [-execution
test-]{+compilation failed to produce executable+}
[-PASS:-]{+FAIL:+} g++.dg/tls/thread_local6.C  -std=c++17 (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} g++.dg/tls/thread_local6.C  -std=c++17 [-execution
test-]{+compilation failed to produce executable+}
[-PASS:-]{+FAIL:+} g++.dg/tls/thread_local6.C  -std=c++20 (test for excess
errors)
[-PASS:-]{+UNRESOLVED:+} g++.dg/tls/thread_local6.C  -std=c++20 [-execution
test-]{+compilation failed to produce executable+}
UNSUPPORTED: g++.dg/tls/thread_local6.C  -std=c++98

unresolved symbol __cxa_thread_atexit_impl
collect2: error: ld returned 1 exit status

Very likely, this isn't an issue with that commit, but rather due to GCC/nvptx'
deficient implementation of weak symbols.

[Bug c++/112847] [14 Regression] nvptx: 'FAIL: g++.dg/cpp2a/concepts-explicit-inst1.C -std=c++20 scan-assembler _Z1gI1XEvT_', 'scan-assembler _Z1gI1YEvT_'

2023-12-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112847

Thomas Schwinge  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org

--- Comment #1 from Thomas Schwinge  ---
Same as for PR112846, the issue disappears if I revert commit
r14-6064-gc3f281a0c1ca50e4df5049923aa2f5d1c3c39ff6 "c++: mangle function
template constraints".  I don't know yet (a) what that means, and (b) why nvptx
target behaves differently from everything else.

[Bug c++/112846] [14 Regression] nvptx: 'FAIL: g++.dg/abi/anon6.C -std=c++20 scan-assembler _Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv'

2023-12-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112846

Thomas Schwinge  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112847
 CC||jason at gcc dot gnu.org

--- Comment #1 from Thomas Schwinge  ---
Same as for PR112847, the issue disappears if I revert commit
r14-6064-gc3f281a0c1ca50e4df5049923aa2f5d1c3c39ff6 "c++: mangle function
template constraints".  I don't know yet (a) what that means, and (b) why nvptx
target behaves differently from everything else.

[Bug c++/112847] New: [14 Regression] nvptx: 'FAIL: g++.dg/cpp2a/concepts-explicit-inst1.C -std=c++20 scan-assembler _Z1gI1XEvT_', 'scan-assembler _Z1gI1YEvT_'

2023-12-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112847

Bug ID: 112847
   Summary: [14 Regression] nvptx: 'FAIL:
g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++20
scan-assembler _Z1gI1XEvT_', 'scan-assembler
_Z1gI1YEvT_'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

For nvptx target, something in Git
r14-5829-g449b6b817ed76173e6475debd02b195ea9dab0a0..r14-6074-gb74981b5cf32ebf4bfffd25e7174b5c80243447a
regresses:

UNSUPPORTED: g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++98
UNSUPPORTED: g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++14
UNSUPPORTED: g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++17
PASS: g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++20 (test for excess
errors)
[-PASS:-]{+FAIL:+} g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++20 
scan-assembler _Z1gI1XEvT_
[-PASS:-]{+FAIL:+} g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++20 
scan-assembler _Z1gI1YEvT_
PASS: g++.dg/cpp2a/concepts-explicit-inst1.C  -std=c++20  scan-assembler
_Z1gIiEvT_

--- concepts-explicit-inst1.s   2023-12-04 12:44:38.047527549 +0100
+++ concepts-explicit-inst1.s   2023-12-04 12:44:20.675703887 +0100
@@ -26,2 +26,2 @@
-// BEGIN GLOBAL FUNCTION DECL: _Z1gI1XEvT_
-.weak .func _Z1gI1XEvT_ (.param.u64 %in_ar0);
+// BEGIN GLOBAL FUNCTION DECL: _Z1gITk1D1XEvT_
+.weak .func _Z1gITk1D1XEvT_ (.param.u64 %in_ar0);
@@ -29,2 +29,2 @@
-// BEGIN GLOBAL FUNCTION DEF: _Z1gI1XEvT_
-.weak .func _Z1gI1XEvT_ (.param.u64 %in_ar0)
+// BEGIN GLOBAL FUNCTION DEF: _Z1gITk1D1XEvT_
+.weak .func _Z1gITk1D1XEvT_ (.param.u64 %in_ar0)
@@ -46,2 +46,2 @@
-// BEGIN GLOBAL FUNCTION DECL: _Z1gI1YEvT_
-.weak .func _Z1gI1YEvT_ (.param.u64 %in_ar0);
+// BEGIN GLOBAL FUNCTION DECL: _Z1gITk1C1YEvT_
+.weak .func _Z1gITk1C1YEvT_ (.param.u64 %in_ar0);
@@ -49,2 +49,2 @@
-// BEGIN GLOBAL FUNCTION DEF: _Z1gI1YEvT_
-.weak .func _Z1gI1YEvT_ (.param.u64 %in_ar0)
+// BEGIN GLOBAL FUNCTION DEF: _Z1gITk1C1YEvT_
+.weak .func _Z1gITk1C1YEvT_ (.param.u64 %in_ar0)
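
The following sketch (not part of the original report) illustrates why the 'scan-assembler' checks stop matching once the mangled names change as in the diff above. It uses Python's 're.search' as a rough stand-in for the DejaGnu directive, which is an assumption; the helper name 'scan_assembler' is made up for illustration.

```python
import re

# Symbol declarations taken from the assembly diff above.
old_asm = ".weak .func _Z1gI1XEvT_ (.param.u64 %in_ar0);"
new_asm = ".weak .func _Z1gITk1D1XEvT_ (.param.u64 %in_ar0);"


def scan_assembler(pattern, asm):
    """Rough stand-in for a 'scan-assembler' check: search asm for pattern."""
    return re.search(pattern, asm) is not None


print(scan_assembler("_Z1gI1XEvT_", old_asm))  # old mangling: pattern found
print(scan_assembler("_Z1gI1XEvT_", new_asm))  # new mangling: pattern not found
```

The old name is not a substring of the new one, because the added 'Tk1D' component sits in the middle of the symbol, so the unchanged test pattern no longer matches.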

[Bug c++/112846] New: [14 Regression] nvptx: 'FAIL: g++.dg/abi/anon6.C -std=c++20 scan-assembler _Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv'

2023-12-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112846

Bug ID: 112846
   Summary: [14 Regression] nvptx: 'FAIL: g++.dg/abi/anon6.C
-std=c++20  scan-assembler
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut
_Edi9RightNameLd405ec00000000000EEEEEEvv'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

For nvptx target, something in Git
r14-5829-g449b6b817ed76173e6475debd02b195ea9dab0a0..r14-6074-gb74981b5cf32ebf4bfffd25e7174b5c80243447a
regresses:

UNSUPPORTED: g++.dg/abi/anon6.C  -std=c++98
UNSUPPORTED: g++.dg/abi/anon6.C  -std=c++14
UNSUPPORTED: g++.dg/abi/anon6.C  -std=c++17
PASS: g++.dg/abi/anon6.C  -std=c++20 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/abi/anon6.C  -std=c++20  scan-assembler
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv

--- anon6.s 2023-12-04 12:22:40.631978250 +0100
+++ anon6.s 2023-12-04 12:22:21.592135699 +0100
@@ -8,2 +8,2 @@
-// BEGIN GLOBAL FUNCTION DECL:
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv
-.weak .func
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv;
+// BEGIN GLOBAL FUNCTION DECL:
_Z5dummyITnDaXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv
+.weak .func
_Z5dummyITnDaXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv;
@@ -11,2 +11,2 @@
-// BEGIN GLOBAL FUNCTION DEF:
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv
-.weak .func
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv
+// BEGIN GLOBAL FUNCTION DEF:
_Z5dummyITnDaXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv
+.weak .func
_Z5dummyITnDaXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv
@@ -26 +26 @@
-   call
_Z5dummyIXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv;
+   call
_Z5dummyITnDaXtl8wrapper1IdEtlNS1_Ut_Edi9RightNametlNS2_Ut_Edi9RightNameLd405ec00000000000EEEEEEvv;

[Bug modula2/112825] Modula 2 builds target objects as part of all-gcc

2023-12-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112825

Thomas Schwinge  changed:

   What|Removed |Added

 CC||tschwinge at gcc dot gnu.org

--- Comment #4 from Thomas Schwinge  ---
I too, months ago, had run into this problem -- via the nvptx 'as' failing on
'-o /dev/null'... (which I'll fix anyway).  After a quick look at
'gcc/m2/tools-src/makeSystem' I did presume that the compiler is invoked there
only for some side effects (remember, '-o /dev/null'), and thus fixing this
might be...

(In reply to Gaius Mulley from comment #3)
> Created attachment 56782 [details]
> Proposed fix
> 
> This patch changes all invocations of gm2 -c to gm2 -S to avoid invoking the
> target assembler (which might not be present if make all-gcc is run).

... as simple as this.  (Then, of course, interrupted by higher-priority
things, and I never got back to this.)

Therefore, for what it's worth, conceptual ACK to this proposed change (which
I've not yet tested, but will, eventually).

[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6

2023-12-03 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed|2023-12-01 00:00:00 |2023-12-3
 CC||tschwinge at gcc dot gnu.org
   Keywords||build

--- Comment #3 from Thomas Schwinge  ---
I'm actually running into this ICE already during powerpc64le-linux-gnu build:

libtool: compile:  [...]/build-gcc/./gcc/xgcc [...] -fchecking=1 [...]
-mabi=ibmlongdouble -mno-gnu-attribute -fcx-fortran-rules -ffunction-sections
-fdata-sections -mabi=ieeelongdouble -g -O2 -MT norm2_r17.lo -MD -MP -MF
.deps/norm2_r17.Tpo -c [...]/source-gcc/libgfortran/generated/norm2_r17.c 
-fPIC -DPIC -o .libs/norm2_r17.o
cc1: warning: Using IEEE extended precision ‘long double’ [-Wpsabi]
during GIMPLE pass: evrp
[...]/source-gcc/libgfortran/generated/norm2_r17.c: In function
‘norm2_r17’:
[...]/source-gcc/libgfortran/generated/norm2_r17.c:214:1: internal compiler
error: in fold_range, at range-op.cc:206
  214 | }
  | ^
0x10bf213b range_op_handler::fold_range(vrange&, tree_node*, vrange const&,
vrange const&, relation_trio) const
[...]/source-gcc/gcc/range-op.cc:206
0x11d8790b fold_using_range::range_of_range_op(vrange&,
gimple_range_op_handler&, fur_source&)
[...]/source-gcc/gcc/gimple-range-fold.cc:702
0x11d89873 fold_using_range::fold_stmt(vrange&, gimple*, fur_source&,
tree_node*)
[...]/source-gcc/gcc/gimple-range-fold.cc:602
0x11d89eaf fold_range(vrange&, gimple*, range_query*)
[...]/source-gcc/gcc/gimple-range-fold.cc:322
0x11d7c6e3 ranger_cache::get_global_range(vrange&, tree_node*, bool&)
[...]/source-gcc/gcc/gimple-range-cache.cc:1052
0x11d69577 gimple_ranger::range_of_stmt(vrange&, gimple*, tree_node*)
[...]/source-gcc/gcc/gimple-range.cc:311
0x11221b93 range_query::value_of_stmt(gimple*, tree_node*)
[...]/source-gcc/gcc/value-query.cc:113
0x111c6ae7 rvrp_folder::value_of_stmt(gimple*, tree_node*)
[...]/source-gcc/gcc/tree-vrp.cc:999
0x11044fbb
substitute_and_fold_dom_walker::before_dom_children(basic_block_def*)
[...]/source-gcc/gcc/tree-ssa-propagate.cc:820
0x11ce966b dom_walker::walk(basic_block_def*)
[...]/source-gcc/gcc/domwalk.cc:311
0x110436f3
substitute_and_fold_engine::substitute_and_fold(basic_block_def*)
[...]/source-gcc/gcc/tree-ssa-propagate.cc:999
0x111c1877 execute_ranger_vrp(function*, bool, bool)
[...]/source-gcc/gcc/tree-vrp.cc:1064
0x111c6a6b execute
[...]/source-gcc/gcc/tree-vrp.cc:1307
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[3]: *** [norm2_r17.lo] Error 1

[Bug libgcc/109289] Conflicting types for built-in functions in libgcc/emutls.c

2023-12-01 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109289

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-01
 CC||ams at gcc dot gnu.org,
   ||fw at gcc dot gnu.org,
   ||jules at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Thomas Schwinge  ---
Similarly seen for GCN target, and this is now fatal after Florian's recent
changes (I presume -- and I fully do support those, for avoidance of doubt):

[...]/source-gcc/libgcc/emutls.c:61:7: warning: conflicting types for
built-in function ‘__emutls_get_address’; expected ‘void *(void *)’
[-Wbuiltin-declaration-mismatch]
   61 | void *__emutls_get_address (struct __emutls_object *);
  |   ^~~~
[...]/source-gcc/libgcc/emutls.c:63:6: warning: conflicting types for
built-in function ‘__emutls_register_common’; expected ‘void(void *, unsigned
int,  unsigned int,  void *)’ [-Wbuiltin-declaration-mismatch]
   63 | void __emutls_register_common (struct __emutls_object *, word,
word, void *);
  |  ^~~~
[...]/source-gcc/libgcc/emutls.c:140:1: warning: conflicting types for
built-in function ‘__emutls_get_address’; expected ‘void *(void *)’
[-Wbuiltin-declaration-mismatch]
  140 | __emutls_get_address (struct __emutls_object *obj)
  | ^~~~
[...]/source-gcc/libgcc/emutls.c: In function ‘__emutls_get_address’:
[...]/source-gcc/libgcc/emutls.c:172:13: error: implicit declaration of
function ‘calloc’ [-Wimplicit-function-declaration]
  172 |   arr = calloc (size + 1, sizeof (void *));
  | ^~
[...]/source-gcc/libgcc/emutls.c:32:1: note: include ‘<stdlib.h>’ or
provide a declaration of ‘calloc’
   31 | #include "gthr.h"
  +++ |+#include <stdlib.h>
   32 |
[...]/source-gcc/libgcc/emutls.c:172:13: warning: incompatible implicit
declaration of built-in function ‘calloc’ [-Wbuiltin-declaration-mismatch]
  172 |   arr = calloc (size + 1, sizeof (void *));
  | ^~
[...]/source-gcc/libgcc/emutls.c:172:13: note: include ‘<stdlib.h>’ or
provide a declaration of ‘calloc’
[...]/source-gcc/libgcc/emutls.c:184:13: error: implicit declaration of
function ‘realloc’ [-Wimplicit-function-declaration]
  184 |   arr = realloc (arr, (size + 1) * sizeof (void *));
  | ^~~
[...]/source-gcc/libgcc/emutls.c:184:13: note: include ‘<stdlib.h>’ or
provide a declaration of ‘realloc’
[...]/source-gcc/libgcc/emutls.c:184:13: warning: incompatible implicit
declaration of built-in function ‘realloc’ [-Wbuiltin-declaration-mismatch]
[...]/source-gcc/libgcc/emutls.c:184:13: note: include ‘<stdlib.h>’ or
provide a declaration of ‘realloc’
[...]/source-gcc/libgcc/emutls.c: At top level:
[...]/source-gcc/libgcc/emutls.c:204:1: warning: conflicting types for
built-in function ‘__emutls_register_common’; expected ‘void(void *, unsigned
int,  unsigned int,  void *)’ [-Wbuiltin-declaration-mismatch]
  204 | __emutls_register_common (struct __emutls_object *obj,
  | ^~~~
make[2]: *** [[...]/source-gcc/libgcc/static-object.mk:17: emutls.o] Error
1

GCC's suggestion to "include ‘<stdlib.h>’" needs to be carefully reviewed, in
case this is meant to be buildable in an environment without C library headers?
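
A minimal sketch of the other option GCC's note mentions (providing
declarations instead of including the header), assuming a GCC-compatible
compiler that predefines '__SIZE_TYPE__'.  The names 'demo_size_t' and
'demo_grow' are hypothetical, and this is not the change made to
'libgcc/emutls.c'; it only illustrates that hand-written prototypes matching
the built-in signatures avoid both the implicit-declaration error and
'-Wbuiltin-declaration-mismatch':

```c
/* Assumption: '__SIZE_TYPE__' is predefined by the compiler (GCC, Clang),
   so no C library header is needed to name the size type.  */
typedef __SIZE_TYPE__ demo_size_t;

/* Hand-written prototypes; they must match the built-in signatures.  */
void *calloc (demo_size_t nmemb, demo_size_t size);
void *realloc (void *ptr, demo_size_t size);

/* Hypothetical helper: create or grow a pointer array, keeping every slot
   at index >= OLD_LEN zero-initialised.  Returns a null pointer if the
   allocation fails (the old array is then left untouched).  */
void **
demo_grow (void **arr, demo_size_t old_len, demo_size_t new_len)
{
  if (arr == 0)
    return calloc (new_len, sizeof (void *));

  void **grown = realloc (arr, new_len * sizeof (void *));
  if (grown == 0)
    return 0;

  /* 'realloc' does not clear the newly added tail, so do it by hand.  */
  for (demo_size_t i = old_len; i < new_len; i++)
    grown[i] = 0;
  return grown;
}
```

Whether hand-written declarations or a guarded '#include' is preferable
depends on how the file is expected to build without C library headers, which
is the open question above.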

[Bug target/112669] GCN: wrong 'LIBRARY_PATH' in presence of several different '-march=[...]' flags

2023-11-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112669

Thomas Schwinge  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Thomas Schwinge  ---
Fixed in master branch; not currently planning on backporting.

[Bug target/112725] New: [14 Regression] powerpc64le-linux-gnu: 'c-c++-common/builtin-classify-type-1.c:113:3: error: AltiVec argument passed to unprototyped function'

2023-11-27 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112725

Bug ID: 112725
   Summary: [14 Regression] powerpc64le-linux-gnu:
'c-c++-common/builtin-classify-type-1.c:113:3: error:
AltiVec argument passed to unprototyped function'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Test run for powerpc64le-linux-gnu, very likely commit
r14-5615-g509b470dcee9795887a60ddb32ab454f22e74411 "c, c++: Add new value for
vector types for __builtin_classify_type":

[-PASS:-]{+FAIL:+} c-c++-common/builtin-classify-type-1.c  -Wc++-compat 
(test for excess errors)
[-PASS:-]{+UNRESOLVED:+} c-c++-common/builtin-classify-type-1.c 
-Wc++-compat  [-execution test-]{+compilation failed to produce executable+}

[...]/c-c++-common/builtin-classify-type-1.c: In function 'main':
[...]/c-c++-common/builtin-classify-type-1.c:113:3: error: AltiVec argument
passed to unprototyped function
[...]/c-c++-common/builtin-classify-type-1.c:115:3: error: AltiVec argument
passed to unprototyped function
