from:"iii at linux dot ibm.com"

[Bug sanitizer/115461] lsan doesn't work on s390x

2024-06-13 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115461

--- Comment #6 from Ilya Leoshkevich  ---
Forgot to add: since the runtime is shared, this observation applies to both
GCC and LLVM.

$ gcc k.c -fsanitize=leak; ./a.out
0x5080
$ LSAN_OPTIONS=use_stacks=0 ./a.out
0x5080

=================================================================
==948446==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 123 byte(s) in 1 object(s) allocated from:
#0 0x3fff7a16caf in malloc (/lib64/liblsan.so.0+0x16caf) (BuildId:
58eab4a667c0b1f8c0ff7fe7ac931e0eaa86cd5e)
#1 0x1001219 in main (/tmp/a.out+0x1001219) (BuildId:
277d8d1498d2a3f76a547ae04af127173f8a2c76)

SUMMARY: LeakSanitizer: 123 byte(s) leaked in 1 allocation(s).

[Bug sanitizer/115461] lsan doesn't work on s390x

2024-06-13 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115461

--- Comment #5 from Ilya Leoshkevich  ---
The LLVM testsuite still passes.

Looking a bit deeper:

$ LSAN_OPTIONS=verbosity=1,log_pointers=1 ./a.out
[...]
0x5080
==1522380==LeakSanitizer: checking for leaks
[...]
==1522381==Scanning STACK range 0x03ffa3d8-0x03ffb000.
==1522381==0x03ffa820: found 0x5080 pointing into chunk
0x5080-0x5080007b of size 123.

So something spilled the pointer value onto the stack, and LSan thinks that it's
still referenced. And indeed, turning stack scanning off resolves the issue:

$ LSAN_OPTIONS=use_stacks=0 ./a.out
0x5080

=================================================================
==1522412==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 123 byte(s) in 1 object(s) allocated from:
#0 0x2aa00045bbd in malloc
[...]/llvm-project/compiler-rt/lib/lsan/lsan_interceptors.cpp:75:3
#1 0x2aa0004779d in main ([...]/llvm-project/build/a.out+0x4779d)

SUMMARY: LeakSanitizer: 123 byte(s) leaked in 1 allocation(s).
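
The log_pointers output above can be read as the result of a conservative scan:
every word in the stack range is treated as a potential pointer, and a word
whose value falls inside a live chunk keeps that chunk reachable. The sketch
below models only that check. The names `struct chunk`, `points_into_chunk` and
`scan_words` are hypothetical and are not LSan's API; the real runtime is
considerably more involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a heap chunk: start address and size in bytes. */
struct chunk {
    uintptr_t begin;
    size_t size;
};

/* Return 1 if `word`, read as an address, lies inside the chunk. */
int points_into_chunk(uintptr_t word, const struct chunk *c)
{
    return word >= c->begin && word - c->begin < c->size;
}

/* Scan `n` words; return the index of the first word that points into
   the chunk, or -1 if none does. */
long scan_words(const uintptr_t *words, size_t n, const struct chunk *c)
{
    for (size_t i = 0; i < n; i++)
        if (points_into_chunk(words[i], c))
            return (long)i;
    return -1;
}
```

A stale spill slot that still holds the old pointer value is enough for such a
scan to consider the chunk reachable, which is consistent with the leak only
being reported once use_stacks=0 is set.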

[Bug sanitizer/115461] lsan doesn't work on s390x

2024-06-13 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115461

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #4 from Ilya Leoshkevich  ---
It doesn't work for me anymore either.  I will take a look at both GCC and LLVM
issues.

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-04-04 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #77 from Ilya Leoshkevich  ---
Apparently fixing the message in GCC will produce maintenance overhead [1].  If
that's not very important to you, I'd rather leave this message as is.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648775.html

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-04-03 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #76 from Ilya Leoshkevich  ---
It's because the sanitizer runtime was copied from LLVM to GCC.  I will post a
patch removing the unsupported MSan and DFSan from the error message.

[Bug target/114404] [11] GCC reorders stores when it probably shouldn't

2024-03-20 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114404

--- Comment #4 from Ilya Leoshkevich  ---
Thanks, cherry-picking
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=a98d5130a6dcff2ed4db371e500550134777b8cf
helped both with the minimized testcase and the actual kernel bug.  What you
describe there - reassociation causing a wrong base term to be selected -
matches what I've seen during debugging as well.  So let's close this as a
duplicate.

[Bug c/114404] [11] GCC reorders stores when it probably shouldn't

2024-03-20 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114404

--- Comment #2 from Ilya Leoshkevich  ---
Created attachment 57745
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57745&action=edit
cc1 invocation

[Bug c/114404] [11] GCC reorders stores when it probably shouldn't

2024-03-20 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114404

--- Comment #1 from Ilya Leoshkevich  ---
Created attachment 57744
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57744&action=edit
preprocessed source

[Bug c/114404] New: [11] GCC reorders stores when it probably shouldn't

2024-03-20 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114404

            Bug ID: 114404
           Summary: [11] GCC reorders stores when it probably shouldn't
           Product: gcc
           Version: 11.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

Reproducible with gcc commit 1b5510a59163.
I'm writing this up as a result of the following Linux kernel discussion:

https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d...@gmail.com/
https://lore.kernel.org/bpf/20240320015515.11883-1-...@linux.ibm.com/

In the following code:

extern const char bpf_plt[];
extern const char bpf_plt_ret[];
extern const char bpf_plt_target[];
static void bpf_jit_plt(void *plt, void *ret, void *target)
{
        memcpy(plt, bpf_plt, BPF_PLT_SIZE);
        *(void **)((char *)plt + (bpf_plt_ret - bpf_plt)) = ret;
        *(void **)((char *)plt + (bpf_plt_target - bpf_plt)) = target ?: ret;
}

GCC 11's sched1 pass reorders memcpy() and assignments.  In GCC 12 this
behavior is gone after

commit 2e96b5f14e4025691b57d2301d71aa6092ed44bc 
Author: Aldy Hernandez
Date:   Tue Jun 15 12:32:51 2021 +0200  

Backwards jump threader rewrite with ranger.

but this seems to be accidental.  Internally, output_dependence() for the
respective mems returns false, because it believes that they are based on
different SYMBOL_REFs.  This may be because on the C level we are not allowed
to subtract pointers to different objects.

A possible workaround should be casting the pointers to longs, since the C
pointer subtraction rules would then no longer apply.  In practice, however,
this changes nothing.

In the attached minimized preprocessed source with long casts we get:

stg     %r3,232(%r2,%r15)
ltgr    %r11,%r11
locgrne %r3,%r11
stg     %r3,232(%r1,%r15)
la      %r2,0(%r1,%r9)
la      %r3,232(%r1,%r15)
mvc     232(16,%r15),0(%r5)
mvc     248(16,%r15),16(%r5)
lghi    %r4,8
brasl   %r14,s390_kernel_write@PLT

so the assignments are placed before the memcpy().
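
One way to avoid subtracting pointers to different objects altogether is to
describe the patched area as a single object and address its slots as members.
The following is only a sketch under assumptions: `struct plt_image`, its field
sizes, `plt_template` and `plt_fill` are hypothetical and are not taken from
the kernel discussion or from this report.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: code bytes followed by two pointer slots. */
struct plt_image {
    unsigned char code[16];
    void *ret;
    void *target;
};

/* Hypothetical template standing in for the bpf_plt blob. */
static const struct plt_image plt_template = { {0}, NULL, NULL };

/* Copy the template, then patch the two slots through member accesses,
   so both stores are visibly based on the same object as the copy. */
void plt_fill(struct plt_image *plt, void *ret, void *target)
{
    memcpy(plt, &plt_template, sizeof(*plt));
    plt->ret = ret;
    plt->target = target ? target : ret;
}
```

With this shape the copy and the two stores all refer to `*plt`, so no
cross-object pointer difference is involved. Whether that is enough to keep a
particular compiler version from reordering them is not something this sketch
can establish.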

[Bug sanitizer/113284] [14 regression] many failures in asan after r14-6946-ge66dc37b299cac

2024-01-09 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113284

--- Comment #6 from Ilya Leoshkevich  ---
Created attachment 57014
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57014&action=edit
patch v2

Thanks for the correction.  I've noticed the function label and got tunnel
vision; .opd does indeed contain no code, but only function and toc pointers,
and we don't want that in ASAN reports.  Would the attached patch be okay? 
It's basically your proposal, but with some code reuse.

[Bug sanitizer/113284] [14 regression] many failures in asan after r14-6946-ge66dc37b299cac

2024-01-09 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113284

--- Comment #4 from Ilya Leoshkevich  ---
Thanks, I can repro this on cfarm203 now.  Apparently I missed

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 94fbf46f2b6..fd9bb807957 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -21334,7 +21334,7 @@ rs6000_elf_declare_function_name (FILE *file, const
char *name, tree decl)
   if (TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
 {
   fputs ("\t.section\t\".opd\",\"aw\"\n\t.align 3\n", file);
-  ASM_OUTPUT_LABEL (file, name);
+  ASM_OUTPUT_FUNCTION_LABEL (file, name, decl);
   fputs (DOUBLE_INT_ASM_OP, file);
   rs6000_output_function_entry (file, name);
   fputs (",.TOC.@tocbase,0\n\t.previous\n", file);

in commit c659dd8bfb55 ("Implement ASM_DECLARE_FUNCTION_NAME using
ASM_OUTPUT_FUNCTION_LABEL").  I will start a full regtest and send a patch
shortly.

[Bug sanitizer/113284] [14 regression] many failures in asan after r14-6946-ge66dc37b299cac

2024-01-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113284

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #1 from Ilya Leoshkevich  ---
Could you please share the configure command that you use?  I originally
regtested that patch on cfarm120 (POWER10) with `./configure
--enable-checking=yes,rtl`, and I cannot reproduce the issue there.

[Bug target/113273] [14 Regression][x86][asan] ICE Segmentation fault since r14-6946-ge66dc37b299cac

2024-01-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113273

--- Comment #4 from Ilya Leoshkevich  ---
I've pushed the fix.  This can be closed as a duplicate of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113251.

[Bug target/113273] [14 Regression][x86][asan] ICE Segmentation fault since r14-6946-ge66dc37b299cac

2024-01-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113273

--- Comment #3 from Ilya Leoshkevich  ---
Thank you for the confirmation.  I will push the fix as soon as my regtests
finish.

[Bug target/113273] [14 Regression][x86][asan] ICE Segmentation fault since r14-6946-ge66dc37b299cac

2024-01-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113273

--- Comment #1 from Ilya Leoshkevich  ---
Hi, sorry about the regression.  Could you please check if
https://inbox.sourceware.org/gcc-patches/20240108092434.554918-1-...@linux.ibm.com/
fixes that for you?

[Bug sanitizer/99476] 'PATH_MAX' was not declared in this scope

2024-01-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99476

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #3 from Ilya Leoshkevich  ---
I had a similar issue when compiling GCC targeting i686-linux on x86_64 Debian,
and --includedir= helped, thanks! I had to do the following:

../configure --target=i686-linux-gnu --disable-bootstrap --prefix=/usr
--includedir=/usr/i686-linux-gnu/include

[Bug sanitizer/113251] [14 Regression] ICE on gcc.dg/asan/pr63845.c on i686-linux since r14-6946

2024-01-06 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113251

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #1 from Ilya Leoshkevich  ---
I can reproduce this manually and will work on a fix.

Surprisingly, this does not show in my test results. I.e.:

$ make check-gcc RUNTESTFLAGS="asan.exp=pr63845.c --debug"
=== gcc Summary ===

# of expected passes            7

$ cat gcc/testsuite/gcc/gcc.sum

PASS: gcc.dg/asan/pr63845.c   -O0  (test for excess errors)
PASS: gcc.dg/asan/pr63845.c   -O1  (test for excess errors)
PASS: gcc.dg/asan/pr63845.c   -O2  (test for excess errors)
PASS: gcc.dg/asan/pr63845.c   -O3 -g  (test for excess errors)
PASS: gcc.dg/asan/pr63845.c   -Os  (test for excess errors)
PASS: gcc.dg/asan/pr63845.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
PASS: gcc.dg/asan/pr63845.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)

But!

$ cat gcc/testsuite/gcc/dbg.log

expect: does "fPIC170653.c:3:13: internal compiler error: Segmentation
fault\r\n" (spawn_id exp7) match regular expression ".+"? (No Gate, RE only)
gate=yes re=yes

compiler exited with status 1

So the problem manifests itself during the test run, but the runner fails to
recognize it for some reason.
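
Until the runner is fixed, a log can be cross-checked independently of the
.sum file by searching it for the ICE marker. A minimal sketch, assuming the
log has already been read into a NUL-terminated string; `count_ice_reports` is
a hypothetical helper, not part of the GCC testsuite:

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of the ICE marker in a NUL-terminated log buffer. */
size_t count_ice_reports(const char *log)
{
    static const char needle[] = "internal compiler error";
    size_t count = 0;

    for (const char *p = strstr(log, needle); p != NULL;
         p = strstr(p + 1, needle))
        count++;
    return count;
}
```

A non-zero result for a test that the summary lists as PASS would point at the
same mismatch described above.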

[Bug target/112986] s390x gcc O2, O3: Incorrect logic operation in < comparison with the same values

2023-12-13 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112986

--- Comment #4 from Ilya Leoshkevich  ---
Hi,

Nina fixed this in v8.0.0
(https://gitlab.com/qemu-project/qemu/-/commit/54fce97cfcaf5463ee5f325bc1f1d4adc2772f38).
The fix was backported to v7.2.2
(https://gitlab.com/qemu-project/qemu/-/commit/17b032c6598ea756889f25e8d3e4cd9f2036669b),
but not to v6.

[Bug target/106342] [12/13/14 Regression] internal compiler error: in extract_insn, at recog.cc:2791 since r12-4240-g2b8453c401b699

2023-04-19 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106342

--- Comment #10 from Ilya Leoshkevich  ---
This bug was fixed and backported to gcc-12:

commit 06254d97b8fa3a5d1c8b6b4e091d851700801385
Author: Ilya Leoshkevich 
Date:   Fri Jul 29 16:14:10 2022 +0200

PR106342 - IBM zSystems: Provide vsel for all vector modes

dg.exp=pr104612.c fails with an ICE on s390x, because copysignv2sf3
produces an insn that vsel is supposed to recognize, but can't,
because it's not defined for V2SF.  Fix by defining it for all vector
modes supported by copysign3.

gcc/ChangeLog:

* config/s390/vector.md (V_HW_FT): New iterator.
* config/s390/vx-builtins.md (vsel): Use V_HW_FT instead
of V_HW.

(cherry picked from commit 2f17f489de47d46626ed85804c3b810547ef550e)

I think it should be closed.

[Bug target/93242] [MIPS] patchable-function-entry broken

2022-09-01 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93242

--- Comment #11 from Ilya Leoshkevich  ---
I see.  It would be good to update https://gcc.gnu.org/gcc-9/ then - e.g.
https://gcc.gnu.org/gcc-8/ says "This release series is no longer supported".

[Bug target/93242] [MIPS] patchable-function-entry broken

2022-09-01 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93242

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #9 from Ilya Leoshkevich  ---
Would it be possible to backport this to gcc-9?

The Linux kernel will start using patchable_function_entry soon, and there are
problems with s390x, which this patch fixes as well:
https://lore.kernel.org/bpf/9099057e-124c-8f30-c29d-54be85eee...@iogearbox.net/

There is a workaround for now, but it would be good to have this fixed in all
the maintained gccs (gcc-8 is no longer maintained as far as I can see, so this
leaves only gcc-9).

[Bug target/106342] [12/13 Regression] internal compiler error: in extract_insn, at recog.cc:2791 since r12-4240-g2b8453c401b699

2022-07-28 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106342

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #6 from Ilya Leoshkevich  ---
Maybe that's something obvious, but still: wouldn't adding V1SF, V2SF, and V1DF
to vsel also work? E.g. by changing it from using V_HW to using VT.

[Bug c++/100853] internal compiler error: in cp_tree_equal, at cp/tree.c:4148

2021-06-01 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100853

--- Comment #1 from Ilya Leoshkevich  ---
Created attachment 50903
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50903&action=edit
repro

[Bug c++/100853] New: internal compiler error: in cp_tree_equal, at cp/tree.c:4148

2021-06-01 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100853

            Bug ID: 100853
           Summary: internal compiler error: in cp_tree_equal, at
                    cp/tree.c:4148
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

$ cat cp-tree-equal.cpp 
typedef struct a *b;
template  struct c
{  d({
b *e; __typeof (*({
  __typeof *e f;
  f})).g const f (({  __typeof (*({
int h;
f} )).g const f (

$ gcc/cc1plus -fno-PIE -g -O2 -fno-checking -gtoggle -DIN_GCC
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables cp-tree-equal.cpp
...
cp-tree-equal.cpp:8:37: internal compiler error: in cp_tree_equal, at
cp/tree.c:4148
8 | f} )).g const f (
  | ^
0x7062e9 cp_tree_equal(tree_node*, tree_node*)
../../gcc/cp/tree.c:4148
0xbd1d3e cp_tree_equal(tree_node*, tree_node*)
../../gcc/cp/tree.c:4138
0xbd1d3e cp_tree_equal(tree_node*, tree_node*)
../../gcc/cp/tree.c:4138
0xbd1d3e cp_tree_equal(tree_node*, tree_node*)
../../gcc/cp/tree.c:4138
0xbd1d3e cp_tree_equal(tree_node*, tree_node*)
../../gcc/cp/tree.c:4138
0xbdc784 structural_comptypes
../../gcc/cp/typeck.c:1491
0xae2ffc check_local_shadow
../../gcc/cp/name-lookup.c:3264
0xae2ffc do_pushdecl
../../gcc/cp/name-lookup.c:3773
0xae39b4 pushdecl(tree_node*, bool)
../../gcc/cp/name-lookup.c:3852
0xa3995e start_decl(cp_declarator const*, cp_decl_specifier_seq*, int,
tree_node*, tree_node*, tree_node**)
../../gcc/cp/decl.c:5591
0xb2dd61 cp_parser_init_declarator
../../gcc/cp/parser.c:21802
0xb093cd cp_parser_simple_declaration
../../gcc/cp/parser.c:14487
0xb0b0a9 cp_parser_declaration_statement
../../gcc/cp/parser.c:13622
0xb0bc0b cp_parser_statement
../../gcc/cp/parser.c:11848
0xb0ce0e cp_parser_statement_seq_opt
../../gcc/cp/parser.c:12215
0xb0cee8 cp_parser_compound_statement
../../gcc/cp/parser.c:12164
0xb10473 cp_parser_statement_expr
../../gcc/cp/parser.c:5142
0xb10473 cp_parser_primary_expression
../../gcc/cp/parser.c:5549
0xb11b80 cp_parser_postfix_expression
../../gcc/cp/parser.c:7528
0xb243af cp_parser_unary_expression
../../gcc/cp/parser.c:8849

Found when reducing a testcase for another problem.

[Bug middle-end/100278] IBM Z: Segmentation fault when building valgrind with -march=z14

2021-06-01 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100278

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Ilya Leoshkevich  ---
Fixed, thanks!

[Bug middle-end/100278] New: IBM Z: Segmentation fault when building valgrind with -march=z14

2021-04-26 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100278

            Bug ID: 100278
           Summary: IBM Z: Segmentation fault when building valgrind with
                    -march=z14
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

Minimized test:

$ cat test.c
a() {
  register b asm("");
  if (b)
    b = 1;
  for (; b;)
    ;
}

$ $HOME/gcc/build/dist/bin/gcc -m64 -O2 -g -finline-functions
-fno-stack-protector -fno-builtin -fomit-frame-pointer -fstrict-aliasing
-march=z14 -c test.c
test.c:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
1 | a() {
  | ^
test.c: In function ‘a’:
test.c:2:12: warning: type defaults to ‘int’ in declaration of ‘b’
[-Wimplicit-int]
2 |   register b asm("");
  |^
during GIMPLE pass: pre
test.c:1:1: internal compiler error: Segmentation fault
1 | a() {
  | ^
0x1a33499 crash_signal
../../gcc/toplev.c:327
0x11e9bf2 contains_struct_check(tree_node*, tree_node_structure_enum, char
const*, int, char const*)
../../gcc/tree.h:3466
0x1c57673 compute_avail
../../gcc/tree-ssa-pre.c:4163
0x1c580d9 execute
../../gcc/tree-ssa-pre.c:4370

Bisect points to:

commit 577d05fc914338cd7ddc254f3bee4532331f5c13
Author: Richard Biener 
Date:   Tue Mar 9 09:29:29 2021 +0100

tree-optimization/99473 - more cselim

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-26 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #9 from Ilya Leoshkevich  ---
Created attachment 50679
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50679&action=edit
regtesting this patch now

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #8 from Ilya Leoshkevich  ---
Yeah, inline asm seems to be problematic:

/home/iii/gcc/build/gcc/xgcc -B/home/iii/gcc/build/gcc/
/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c   
-fdiagnostics-plain-output   -O2 -march=z14 -mzarch -S -o
long-double-asm-hardreg.s

with the patch from comment 2 produces:

foo:
.LFB0:
.cfi_startproc
larl    %r5,.L4
vl      %v0,.L5-.L4(%r5),3
#APP
# 10 "/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c" 1
# %v0
# 0 "" 2
#NO_APP
br      %r14

`vl  %v0,.L5-.L4(%r5),3` loads 1.0L into %v0[0:128]. However, it should be
loaded into %v0[0:64] . %v2[0:64].

With the patch from comment 3 I get:

foo:
.LFB0:
.cfi_startproc
larl    %r5,.L4
ld      %f0,.L5-.L4(%r5)
ld      %f2,.L5-.L4+8(%r5)
#APP
# 10 "/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c" 1
# %f0
# 0 "" 2
#NO_APP
br      %r14

which is correct, but in the general case the exact reg that the user requested
is not honored.
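
The difference between the two placements, %v0[0:128] versus
%v0[0:64] . %v2[0:64], can be modelled in plain C as one 16-byte container
versus two 8-byte halves. This is only an illustration of the data movement;
`struct vreg128`, `struct fpr_pair` and both helpers are hypothetical names and
do not correspond to anything in the s390 backend.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: one 128-bit register image. */
struct vreg128 { uint8_t bytes[16]; };

/* Hypothetical model: a register pair holding two 64-bit halves. */
struct fpr_pair { uint64_t first; uint64_t second; };

/* Split the 16-byte image into its first and second 8 bytes. */
struct fpr_pair vreg_to_pair(const struct vreg128 *v)
{
    struct fpr_pair p;
    memcpy(&p.first, v->bytes, 8);
    memcpy(&p.second, v->bytes + 8, 8);
    return p;
}

/* Reassemble the 16-byte image from the two halves. */
struct vreg128 pair_to_vreg(const struct fpr_pair *p)
{
    struct vreg128 v;
    memcpy(v.bytes, &p->first, 8);
    memcpy(v.bytes + 8, &p->second, 8);
    return v;
}
```

An inline asm that expects the pair form but is handed the single-register
form sees only the first half where it expects it, which is the mismatch in
the first listing above.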

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #5 from Ilya Leoshkevich  ---
That would be an ideal solution, but I wonder how to implement it? Suppose we
find a way to convince expand to pick FPRX2mode for such a long double. What if
the following comes up?

register long double x asm ("v0");  /* FPRX2mode */
long double y;  /* TFmode */
x += y; /* convert? */

Would it be feasible to also teach expand to do the mode conversions?



Another alternative might be to detect `register long double asm("fN")`
declarations and go back to using floating point register pairs for functions
that contain them.

[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552

2021-04-23 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #3 from Ilya Leoshkevich  ---
The main problem here is that `register long double f0 asm ("f0")` does not
make sense on z14 anymore. long doubles are stored in vector registers now, not
in floating-point register pairs. If we skip the hard reg, the code will end up
having the following semantics:

vr0[0:128] = 1.0L;
asm("/* expect the value in vr0[0:64] . vr2[0:64] */");

and fail at run time. So I think it's better to use the "best effort"
approach and force it into a pseudo, even if this would mean that the
user-specified register is not honored:

--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16814,6 +16814,12 @@ s390_md_asm_adjust (vec , vec
,
   gcc_assert (allows_reg);
   /* Copy input value from a vector register into a FPR pair.  */
   rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  if (REG_P (inputs[i]) && HARD_REGISTER_P (inputs[i]))
+   {
+ rtx orig_input = inputs[i];
+ inputs[i] = gen_reg_rtx (TFmode);
+ emit_move_insn (inputs[i], orig_input);
+   }
   emit_insn (gen_tf_to_fprx2 (fprx2, inputs[i]));
   inputs[i] = fprx2;
   input_modes[i] = FPRX2mode;

I need to check whether we can keep the output logic as is.

Ideally the code should be adapted and use the __LONG_DOUBLE_VX__ macro like
this:

#ifdef __LONG_DOUBLE_VX__
  register long double f0 asm ("v0");
#else
  register long double f0 asm ("f0");
#endif

  f0 = 1.0L;

#ifdef __LONG_DOUBLE_VX__
  asm("" : : "v" (f0));
#else
  asm("" : : "f" (f0));
#endif

Maybe a warning recommending this should be printed.

[Bug libgomp/98738] task-detach-6.f90 hangs intermittently

2021-01-18 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98738

--- Comment #1 from Ilya Leoshkevich  ---
I realized I didn't post the command line I used to build task-detach-6.exe
(there are multiple variants of this test); here it is:

gcc/build/x86_64-pc-linux-gnu/libgomp/testsuite$ ../../../../build/./gcc/xgcc
-B../../../../build/./gcc/ -B../../../../install/x86_64-pc-linux-gnu/bin/
-B../../../../install/x86_64-pc-linux-gnu/lib/ -isystem
../../../../install/x86_64-pc-linux-gnu/include -isystem
../../../../install/x86_64-pc-linux-gnu/sys-include -fchecking=1
../../../../libgomp/testsuite/libgomp.fortran/task-detach-6.f90
-B../../../../build/x86_64-pc-linux-gnu/./libgomp/
-B../../../../build/x86_64-pc-linux-gnu/./libgomp/.libs
-I../../../../build/x86_64-pc-linux-gnu/./libgomp
-I../../../../libgomp/testsuite/../../include
-I../../../../libgomp/testsuite/.. -fmessage-length=0
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp
-B../../../../build/x86_64-pc-linux-gnu/./libgomp/../libquadmath/.libs/ -O1
-B../../../../build/x86_64-pc-linux-gnu/./libgomp/../libgfortran/.libs
-fintrinsic-modules-path=../../../../build/x86_64-pc-linux-gnu/./libgomp
-L../../../../build/x86_64-pc-linux-gnu/./libgomp/.libs
-L../../../../build/x86_64-pc-linux-gnu/./libgomp/../libquadmath/.libs/
-L../../../../build/x86_64-pc-linux-gnu/./libgomp/../libgfortran/.libs
-lgfortran -foffload=-lgfortran -lquadmath -lm -o ./task-detach-6.exe

[Bug libgomp/98738] New: task-detach-6.f90 hangs intermittently

2021-01-18 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98738

            Bug ID: 98738
           Summary: task-detach-6.f90 hangs intermittently
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

I'm currently on commit 2e43880dbd4c.  Building task-detach-6.exe and running
it in a loop eventually leads to a hang (might take a while, during the last
run it happened after 7k runs):

gcc/build/x86_64-pc-linux-gnu/libgomp/testsuite$ while true; do
LD_LIBRARY_PATH=../../../../install/lib64 ./task-detach-6.exe; echo -n .; done

I first spotted this on s390 and then checked on x86_64; the issue is
reproducible on both.

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-14 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Ilya Leoshkevich  ---
I've committed the fix:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=057dc81f820b

I think I messed up the commit message and Bugzilla did not link the commit to
this bug.

Anyway, marking this as RESOLVED/FIXED now.

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-10 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

--- Comment #10 from Ilya Leoshkevich  ---
I've posted the combined fixincludes/tests/base/sys/types.h + genfixes patch
here: https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561601.html

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-09 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

--- Comment #8 from Ilya Leoshkevich  ---
Hm, can it be that fixincludes/tests/base/sys/types.h simply needs to be
updated?

For example, here is a similar commit:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=081b3517b4df826ac917147eb906bbb8fc6528b1

There, both fixincludes/inclhack.def and fixincludes/tests/base/sys/inttypes.h
are updated.
I tried the following and it helped:

diff --git a/fixincludes/tests/base/sys/types.h
b/fixincludes/tests/base/sys/types.h
index 683b5e9..a318f9b 100644
--- a/fixincludes/tests/base/sys/types.h
+++ b/fixincludes/tests/base/sys/types.h
@@ -9,6 +9,11 @@



+#if defined( AIX_PHYSADR_T_CHECK )
+typedef struct __physadr_s {
+#endif  /* AIX_PHYSADR_T_CHECK */
+
+
 #if defined( GNU_TYPES_CHECK )
 #if !defined(_GCC_PTRDIFF_T)
 #define _GCC_PTRDIFF_T

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-09 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

--- Comment #7 from Ilya Leoshkevich  ---
Still a similar error:



sys/types.h /home/iii/gcc/fixincludes/tests/base/sys/types.h differ: byte 243,
line 12
*** sys/types.h 2020-12-09 15:57:57.575959676 +
--- /home/iii/gcc/fixincludes/tests/base/sys/types.h2020-04-14
11:43:52.317860128 +
***
*** 9,20 



- #if defined( AIX_PHYSADR_T_CHECK )
- typedef struct __physadr_s { int r[1]; } *physadr_t;
- 
- #endif  /* AIX_PHYSADR_T_CHECK */
- 
- 
  #if defined( GNU_TYPES_CHECK )
  #if !defined(_GCC_PTRDIFF_T)
  #define _GCC_PTRDIFF_T
--- 9,14 

There were fixinclude test FAILURES



What I don't quite get is why this kicks in on Linux.
It seems to be gated by `mach  = "*-*-aix*"`, just like other similar fixes
which don't cause issues.

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-09 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

--- Comment #5 from Ilya Leoshkevich  ---
Oh, just in case: gcc121 is x86_64 CentOS Linux 7, not AIX.

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-09 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

--- Comment #4 from Ilya Leoshkevich  ---
Unfortunately not, with this patch I get:

sys/types.h gcc/fixincludes/tests/base/sys/types.h differ: byte 243, line 12
*** sys/types.h 2020-12-09 15:46:15.843503181 +
--- gcc/fixincludes/tests/base/sys/types.h  2020-04-14 11:43:52.317860128
+
***
*** 9,19 



- #if defined( AIX_PHYSADR_T_CHECK )
- typedef struct __physadr_s { random text } *physadr_t;
- #endif  /* AIX_PHYSADR_T_CHECK */
- 
- 
  #if defined( GNU_TYPES_CHECK )
  #if !defined(_GCC_PTRDIFF_T)
  #define _GCC_PTRDIFF_T
--- 9,14 

There were fixinclude test FAILURES

[Bug testsuite/98208] make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nathan at acm dot org

--- Comment #1 from Ilya Leoshkevich  ---
Bisect points to:

commit 92648faa1cb2b28685f3b3dccfdfc4b1ed2c5a7b
Author: Nathan Sidwell 
Date:   Wed Nov 18 10:33:30 2020 -0800

aix: Fixinclude

[Bug testsuite/98208] New: make check's check-fixincludes fails in sys/types.h around AIX_PHYSADR_T_CHECK

2020-12-08 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98208

            Bug ID: 98208
           Summary: make check's check-fixincludes fails in sys/types.h
                    around AIX_PHYSADR_T_CHECK
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

With the recent master (f1b6e17b3f75), make check fails on the gcc121 machine as
follows:

sys/types.h gcc/regtest-f1b6e17b3f75/fixincludes/tests/base/sys/types.h differ:
byte 243, line 12
*** sys/types.h 2020-12-08 20:08:54.944208838 +
--- gcc/regtest-f1b6e17b3f75/fixincludes/tests/base/sys/types.h  
2020-12-08 18:36:20.011729819 +
***
*** 9,19 



- #if defined( AIX_PHYSADR_T_CHECK )
- typedef struct __physadr_s {
- #endif  /* AIX_PHYSADR_T_CHECK */
- 
- 
  #if defined( GNU_TYPES_CHECK )
  #if !defined(_GCC_PTRDIFF_T)
  #define _GCC_PTRDIFF_T
--- 9,14 

There were fixinclude test FAILURES

Might be related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97865, but I
haven't bisected yet.

[Bug tree-optimization/98113] [11 Regression] popcnt is not vectorized on s390 since f5e18dd9c7da

2020-12-03 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98113

--- Comment #6 from Ilya Leoshkevich  ---
With the patch, vxe/popcount-1.c works on s390 again:

vpopctf:
.LFB2:
.cfi_startproc
vpopctf %v24,%v24
br  %r14

Thanks!

[Bug tree-optimization/98113] New: [11 Regression] popcnt is not vectorized on s390 since f5e18dd9c7da

2020-12-02 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98113

Bug ID: 98113
   Summary: [11 Regression] popcnt is not vectorized on s390 since
f5e18dd9c7da
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iii at linux dot ibm.com
  Target Milestone: ---

s390's vxe/popcount-1.c began to fail after PR96789 fix.

The reason is that for the following source code

uv4si __attribute__((noinline))
vpopctf (uv4si a)
{
  uv4si r;
  int i;

  for (i = 0; i < 4; i++)
    r[i] = __builtin_popcount (a[i]);

  return r;
}

FRE turned

  _4 = BIT_FIELD_REF ;
  _11 = __builtin_popcountD.1211 (_4);
  _18 = (unsigned intD.9) _11;
  BIT_FIELD_REF  = _18;
  i_20 = 1;
  ivtmp_21 = 3;
  _25 = VIEW_CONVERT_EXPR(aD.2283)[i_20];
  _26 = __builtin_popcountD.1211 (_25);
  _27 = (unsigned intD.9) _26;
  VIEW_CONVERT_EXPR(rD.2286)[i_20] = _27;
  i_29 = i_20 + 1;
  ivtmp_30 = ivtmp_21 + 4294967295;
  _34 = VIEW_CONVERT_EXPR(aD.2283)[i_29];
  _35 = __builtin_popcountD.1211 (_34);
  _36 = (unsigned intD.9) _35;
  VIEW_CONVERT_EXPR(rD.2286)[i_29] = _36;
  i_38 = i_29 + 1;
  ivtmp_39 = ivtmp_30 + 4294967295;
  _1 = VIEW_CONVERT_EXPR(aD.2283)[i_38];
  _2 = __builtin_popcountD.1211 (_1);
  _3 = (unsigned intD.9) _2;
  VIEW_CONVERT_EXPR(rD.2286)[i_38] = _3;
  i_10 = i_38 + 1;
  ivtmp_16 = ivtmp_39 + 4294967295;
  _7 = rD.2286;
  rD.2286 ={v} {CLOBBER};
  return _7;

into

  _4 = BIT_FIELD_REF ;
  _11 = __builtin_popcountD.1211 (_4);
  _18 = (unsigned intD.9) _11;
  r_14 = BIT_INSERT_EXPR ;
  _25 = BIT_FIELD_REF ;
  _26 = __builtin_popcountD.1211 (_25);
  _27 = (unsigned intD.9) _26;
  r_33 = BIT_INSERT_EXPR ;
  _34 = BIT_FIELD_REF ;
  _35 = __builtin_popcountD.1211 (_34);
  _36 = (unsigned intD.9) _35;
  r_32 = BIT_INSERT_EXPR ;
  _1 = BIT_FIELD_REF ;
  _2 = __builtin_popcountD.1211 (_1);
  _3 = (unsigned intD.9) _2;
  r_31 = BIT_INSERT_EXPR ;
  _7 = r_31;
  return _7;

that is, replaced a sequence of stores with a sequence of
BIT_INSERT_EXPRs.

slp1 now says: "missed:  not vectorized: no grouped stores in basic
block", presumably because it doesn't understand BIT_INSERT_EXPRs.
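
A minimal self-contained sketch of the same loop, for trying it outside the
s390 testsuite.  The uv4si typedef below is an assumption (spelled with the
generic GCC vector extension; the real test gets the type from its own
headers), and popcount_lane is a hypothetical helper that exists only so the
result can be inspected lane by lane.

```c
#include <assert.h>

/* Assumed definition: four 32-bit unsigned lanes, written with the generic
   GCC vector extension.  */
typedef unsigned int uv4si __attribute__ ((vector_size (16)));

/* Same loop as in the report: four element-wise stores into r.  */
uv4si __attribute__ ((noinline))
vpopctf (uv4si a)
{
  uv4si r;
  int i;

  for (i = 0; i < 4; i++)
    r[i] = __builtin_popcount (a[i]);

  return r;
}

/* Hypothetical helper: population count of a single lane of A.  */
unsigned int
popcount_lane (uv4si a, int lane)
{
  uv4si r;

  assert (lane >= 0 && lane < 4);
  r = vpopctf (a);
  return r[lane];
}
```

Compiling this with -O2 -fdump-tree-slp1-details (or the equivalent
-fopt-info-vec-missed) should show whether the block was vectorized on a given
target; the exact dump text depends on the compiler version.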

[Bug target/97866] [11 Regression] bootstrap error in libasan building a s390x-linux-gnu cross compiler

2020-11-17 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97866

--- Comment #3 from Ilya Leoshkevich  ---
I believe it's already fixed by:

commit 253c415a1acba50711c82693426391743ac18040
Author: Vladimir N. Makarov 
Date:   Sun Nov 15 11:22:19 2020 -0500

Do not put reload insns in the last empty BB.

gcc/
* lra.c (lra_process_new_insns): Don't put reload insns in the
last empty BB.

Cherry-picking it helps, and the comment from this commit describes what is
happening here: "Do not put reload insns if it is the last BB without actual
insns.  In this case the reload insns can get null BB after emitting".

[Bug target/97866] [11 Regression] bootstrap error in libasan building a s390x-linux-gnu cross compiler

2020-11-17 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97866

--- Comment #2 from Ilya Leoshkevich  ---
Never mind, I managed to reproduce it now:

ubuntu-focal-amd64$ git rev-parse --short HEAD
77f67db2a47

ubuntu-focal-amd64$ ../configure --target=s390x-linux-gnu --exec-prefix=/usr
--disable-bootstrap --disable-multilib --enable-languages=c,c++

ubuntu-focal-amd64$ cat test.cpp
typedef long a;
typedef void (*b)(a, a, void *);
class c {
  unsigned char *m_fn1();
  char d(a);
  a e(a);
  void f();
};
b g;
void *h;
void c::f() {
  for (a j; j < 6; j++) {
    unsigned char *flags = m_fn1();
    for (a i, k; i < k; i++) {
      if (flags)
        continue;
      int *ff = reinterpret_cast<int *>(d(i));
      g(a(ff), e(j), h);
    }
  }
}

ubuntu-focal-amd64$ gcc/xgcc -Bgcc -std=gnu++14 -O2 -c test.cpp
during RTL pass: reload
test.cpp: In member function ‘void c::f()’:
test.cpp:21:1: internal compiler error: in get_insn_freq, at lra.c:1554

[Bug target/97866] [11 Regression] bootstrap error in libasan building a s390x-linux-gnu cross compiler

2020-11-17 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97866

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #1 from Ilya Leoshkevich  ---
Could you please share your configure flags?

On x86_64 Ubuntu 20.04 the following worked fine:

../configure --target=s390x-linux-gnu --exec-prefix=/usr --disable-bootstrap
--disable-multilib --enable-languages=c,c++

[Bug rtl-optimization/97326] New: [11 Regression] s390: ICE in do_store_flag after 10843f830350

2020-10-07 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97326

Bug ID: 97326
   Summary: [11 Regression] s390: ICE in do_store_flag after
10843f830350
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iii at linux dot ibm.com
  Target Milestone: ---

The following (minimized from
gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq.c) produces
an ICE on s390:

build$ cat x.c
long *a;
double *b;
c;
d() {
  for (; c; c++)
    a[c] = 0 <= b[c] && 0 >= b[c];
}

build$ gcc/cc1 -O3 -march=z15 x.c
during RTL pass: expand

x.c: In function 'd':
x.c:4:1: internal compiler error: in do_store_flag, at expr.c:12388
0x150dd15 do_store_flag
../../gcc/expr.c:12388
0x1505e1b expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../gcc/expr.c:9621
0x14ec50d expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:10165
0x14ef981 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../gcc/expr.c:8480
0x1635191 expand_normal
../../gcc/expr.h:288
0x1635191 expand_vect_cond_optab_fn
../../gcc/internal-fn.c:2602
0x136d83d expand_call_stmt
../../gcc/cfgexpand.c:2612
0x136d83d expand_gimple_stmt_1
../../gcc/cfgexpand.c:3686
0x136d83d expand_gimple_stmt
../../gcc/cfgexpand.c:3851
0x1374f4b expand_gimple_basic_block
../../gcc/cfgexpand.c:5892
0x1377963 execute
../../gcc/cfgexpand.c:6576



Bisect points to:

commit 10843f8303509fcba880c6c05c08e4b4ccd24f36
Author: Richard Biener 
Date:   Thu Sep 24 10:14:33 2020 +0200

tree-optimization/97085 - fold some trivial bool vector ?:

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-07-22 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #17 from Ilya Leoshkevich  ---
Created attachment 48917
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48917&action=edit
aarch64 native build fix

Could you please try the attached patch? It fixed the issue for me, and
survived bootstrap/regtest on x86_64.

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-07-21 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #16 from Ilya Leoshkevich  ---
I finally managed to reproduce this by doing `./configure
--host=aarch64-none-linux-gnu` on gcc113. The problem is that `CXX_FOR_BUILD`
doesn't seem to be set correctly - normally it's `g++-4.8.1 -std=gnu++11`, but
in this case it's just `g++`. I'm currently trying to wrap my head around
autotools build setup in order to figure out where exactly things went wrong.

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-07-20 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #14 from Ilya Leoshkevich  ---
gcc113 has 4.8.4, which is a bit newer. But in any case, according to
https://gcc.gnu.org/projects/cxx-status.html, gcc should support nullptr since
4.6.

Could you please post the failing compiler invocation command?

In the meantime I will build gcc 4.8.1 on gcc113 and try to build master with
it.

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-07-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #12 from Ilya Leoshkevich  ---
I managed to bootstrap and regtest upstream commit 6e41c27bf549 on the gcc113
farm machine.

Two questions:

- What is your system compiler version? For GCC 11, a C++11 compiler is
required: https://gcc.gnu.org/install/prerequisites.html

- What exactly is "native aarch64 build" - is it simply building the compiler
on aarch64, or something else?

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-07-10 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #11 from Ilya Leoshkevich  ---
Sorry about that! I will have a look.

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-06-17 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #8 from Ilya Leoshkevich  ---
Created attachment 48750
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48750&action=edit
proposed patch (tests are running)

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-06-17 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #7 from Ilya Leoshkevich  ---
Would it be OK then to change the last arguments passed to functions with
__attribute__((sentinel)) from NULL to nullptr? I can make a patch for this.

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-06-16 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #5 from Ilya Leoshkevich  ---
I'm sorry, I should not have written (uintptr_t)0 - I just used it as a synonym
for a "pointer-sized int". Would allowing 0L as a sentinel value be a
reasonable thing?

[Bug c++/95700] read-md.c: "missing sentinel in function call" when building gcc with musl

2020-06-16 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

--- Comment #4 from Ilya Leoshkevich  ---
Created attachment 48740
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48740&action=edit
preprocessed output

In the preprocessed output I see that gcc's stddef.h is used, but most likely
`#define NULL 0L` comes from some other musl header, since musl defines it in
like 8 places.

[Bug c++/95700] New: read-md.c: "missing sentinel in function call" when building gcc with musl

2020-06-16 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95700

Bug ID: 95700
   Summary: read-md.c: "missing sentinel in function call" when
building gcc with musl
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iii at linux dot ibm.com
  Target Milestone: ---

I'm trying to bootstrap gcc on gcc301 with --disable-multilib
--build=x86_64-alpine-linux-musl. The following error occurs:

/home/iii/gcc/regtest-f8a59086423e/build/./prev-gcc/xg++
-B/home/iii/gcc/regtest-f8a59086423e/build/./prev-gcc/
-B/home/iii/gcc/regtest-f8a59086423e/install/x86_64-alpine-linux-musl/bin/
-nostdinc++
-B/home/iii/gcc/regtest-f8a59086423e/build/prev-x86_64-alpine-linux-musl/libstdc++-v3/src/.libs
-B/home/iii/gcc/regtest-f8a59086423e/build/prev-x86_64-alpine-linux-musl/libstdc++-v3/libsupc++/.libs

-I/home/iii/gcc/regtest-f8a59086423e/build/prev-x86_64-alpine-linux-musl/libstdc++-v3/include/x86_64-alpine-linux-musl

-I/home/iii/gcc/regtest-f8a59086423e/build/prev-x86_64-alpine-linux-musl/libstdc++-v3/include
 -I/home/iii/gcc/regtest-f8a59086423e/libstdc++-v3/libsupc++
-L/home/iii/gcc/regtest-f8a59086423e/build/prev-x86_64-alpine-linux-musl/libstdc++-v3/src/.libs
-L/home/iii/gcc/regtest-f8a59086423e/build/prev-x86_64-alpine-linux-musl/libstdc++-v3/libsupc++/.libs
-c   -g -O2 -fno-checking -gtoggle  -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H  -DGENERATOR_FILE
-fno-PIE -I. -Ibuild -I../../gcc -I../../gcc/build -I../../gcc/../include 
-I../../gcc/../libcpp/include  \
-o build/read-md.o ../../gcc/read-md.c
../../gcc/read-md.c: In member function ‘const char*
md_reader::join_c_conditions(const char*, const char*)’:
../../gcc/read-md.c:158:58: error: missing sentinel in function call
[-Werror=format=]
  158 |   result = concat ("(", cond1, ") && (", cond2, ")", NULL);
  |  ^
../../gcc/read-md.c: In member function ‘void
md_reader::handle_enum(file_location, bool)’:
../../gcc/read-md.c:947:58: error: missing sentinel in function call
[-Werror=format=]
  947 |value_name = concat (def->name, "_", name.string, NULL);
  |  ^
../../gcc/read-md.c: In member function ‘void
md_reader::handle_include(file_location)’:
../../gcc/read-md.c:1072:57: error: missing sentinel in function call
[-Werror=format=]
 1072 |pathname = concat (stackp->fname, sep, filename, NULL);
  | ^
../../gcc/read-md.c:1085:47: error: missing sentinel in function call
[-Werror=format=]
 1085 |  pathname = concat (m_base_dir, filename, NULL);
  |   ^
cc1plus: all warnings being treated as errors

musl has the following commit:
https://git.musl-libc.org/cgit/musl/commit/?id=c8a9c22173f485c8c053709e1dfa0a617cb6be1a,
which suggests that C++ (as opposed to plain C) should allow plain (uintptr_t)0
as a sentinel value.

gcc wants either a pointer or __null though:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c-family/c-common.c;h=b1379faa412e3646a443969c0067f5c4fb23e107;hb=929fd91ba975eebf9e57f7f092041271dcaf0c34#l5385

Would it be possible to allow (uintptr_t)0 as a valid sentinel value for C++?
Or is it musl that is wrong here?
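
A minimal sketch of the mechanism in plain C, under stated assumptions:
concat_sketch is a hypothetical stand-in for concat() (it is not the libiberty
implementation), declared with the sentinel attribute so that GCC checks the
last argument of every call.  The call in the usage note passes an explicitly
pointer-typed null, which is accepted regardless of how the C library defines
NULL.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for concat(): joins all string arguments up to a
   terminating null pointer.  The sentinel attribute makes GCC diagnose calls
   whose last argument is not a null pointer.  */
char *concat_sketch (const char *first, ...) __attribute__ ((sentinel));

char *
concat_sketch (const char *first, ...)
{
  va_list ap;
  size_t len = 0;
  const char *s;
  char *result;

  /* First pass: add up the lengths of all arguments.  */
  va_start (ap, first);
  for (s = first; s != NULL; s = va_arg (ap, const char *))
    len += strlen (s);
  va_end (ap);

  result = (char *) malloc (len + 1);
  assert (result != NULL);
  result[0] = '\0';

  /* Second pass: append each argument to the result.  */
  va_start (ap, first);
  for (s = first; s != NULL; s = va_arg (ap, const char *))
    strcat (result, s);
  va_end (ap);

  return result;
}
```

A call such as concat_sketch ("(", cond1, ") && (", cond2, ")", (char *) NULL)
satisfies the check.  Replacing the last argument with a plain 0L is the kind
of call that is reported as "missing sentinel in function call".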

[Bug tree-optimization/94792] New: Missed SLP optimization in pr65930-2.c variation

2020-04-27 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94792

Bug ID: 94792
   Summary: Missed SLP optimization in pr65930-2.c variation
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iii at linux dot ibm.com
  Target Milestone: ---

gcc commit cf3a909cf455. Consider the following variation of pr65930-2.c:

$ cat pr65930-2b.c
#include "tree-vect.h"

int __attribute__((noipa))
bar (unsigned int *x, int n)
{
  unsigned int sum = 4;
  x = __builtin_assume_aligned (x, __BIGGEST_ALIGNMENT__);
  for (int i = 0; i < n; ++i)
    sum += x[i*4+0]+ x[i*4 + 1] + x[i*4 + 2] + x[i*4 + 3];
  return sum;
}

int
main ()
{
  static int a[16] __attribute__((aligned(__BIGGEST_ALIGNMENT__)))
    = { 1, 3, 5, 8, 9, 10, 17, 18, 23, 29, 30, 55, 42, 2, 3, 1 };
  check_vect ();
  if (bar (a, 4) != 260)
    abort ();
  return 0;
}

This differs from pr65930-2.c only in that the type of sum is unsigned int,
which should mean one cast less. And yet:

$ gcc pr65930-2b.c -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never -msse2 -ftree-vectorize
-fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2
-fdump-tree-vect-details -lm -o ./pr65930-2.exe ; grep SLP
pr65930-2b.c.161t.vect | wc -l
0

whereas for the original version:

$ gcc pr65930-2.c -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never -msse2 -ftree-vectorize
-fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2
-fdump-tree-vect-details -lm -o ./pr65930-2.exe ; grep SLP
pr65930-2.c.161t.vect | wc -l
33

The resulting assembly is also noticeably larger and uses regular adds for at
least part of the data.

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-10 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #29 from Ilya Leoshkevich  ---
Created attachment 47463
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47463&action=edit
nop plugin

Hi Maxim,

Just to clear my conscience, could you please try the nop trick in your
setup?  I normally use the attached plugin for that.  Just build it and
add e.g.

-fplugin=$HOME/gcc-nop-plugin/libgcc_nop_plugin.so
-fplugin-arg-libgcc_nop_plugin-S_regmatch=4

to the compiler flags.

Best regards,
Ilya

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-09 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #27 from Ilya Leoshkevich  ---
With

-DSPEC_CPU -DNDEBUG -DPERL_CORE   -O3 -save-temps=obj -fopt-info-vec-optimized 
 -DSPEC_CPU_LP64 -DSPEC_CPU_LINUX_X64 -fgnu89-inline

on gcc113 I can see a 2% slowdown:

r277511 (without this fix): 880.09s
r277515 (with this fix):    897.85s

The function that degraded the most is indeed S_regmatch:

$ perf diff perf-9760321.data perf-44b2b4c.data
32.24%   exe[.] S_regmatch
 8.92%   exe[.] S_find_byclass.isra.0 
 6.80%   +0.28%  libc-2.19.so   [.] 0x0007dec0
 5.20%   exe[.] S_regtry  

However, the "shape" of S_regmatch did not change, that is, when all
offsets and register numbers are replaced with "x" in the objdump
output, the old and the new versions are identical.  This hints at some
microarchitectural effect - aliasing in the branch predictor maybe?
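
The "shape" comparison can be sketched roughly as below.  This is only an
illustration, not the script that was actually used: normalize_shape and
same_shape are hypothetical names, and the rule (every token that starts with
a decimal digit becomes a single "x") is a simplifying assumption.  It does
not handle hex numbers that begin with a letter, so it assumes addresses have
been stripped from the objdump output beforehand.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy one disassembly line from IN to OUT, replacing every token that
   starts with a decimal digit (e.g. "16", "0x7dec0", "4f8") with 'x'.
   The output is truncated to fit OUT_SIZE and always NUL-terminated.  */
void
normalize_shape (const char *in, char *out, size_t out_size)
{
  size_t o = 0;

  assert (out_size > 0);
  while (*in != '\0' && o + 1 < out_size)
    {
      if (isdigit ((unsigned char) *in))
        {
          /* Skip the whole numeric token.  */
          while (isalnum ((unsigned char) *in))
            in++;
          out[o++] = 'x';
        }
      else
        out[o++] = *in++;
    }
  out[o] = '\0';
}

/* Return nonzero if lines A and B are identical after normalization.  */
int
same_shape (const char *a, const char *b)
{
  char na[256], nb[256];

  normalize_shape (a, na, sizeof na);
  normalize_shape (b, nb, sizeof nb);
  return strcmp (na, nb) == 0;
}
```

With this rule, "ldr x3, [x29, #16]" and "ldr x5, [x29, #24]" compare equal,
while a changed mnemonic does not.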

From my perspective, this happens too often, so I use the following test
to rule this out: just add a nop at the beginning of the problematic
function. This changes all the offsets and makes the aliasing situation
completely different.  And indeed, by adding a single nop to S_regmatch,
I get wildly different results (for now this is just 1 repeat, I will
run best-of-3 overnight):

r277511 (without this fix): 929.1s
r277515 (with this fix):    931.48s

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-09 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #26 from Ilya Leoshkevich  ---
Whoops, I accidentally used a script I normally use for big-endian
machines (s390 actually).  But gcc113 is an APM X-Gene Mustang board.
I'll try again with your flags and see if I can reproduce the regression
there.

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-09 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #24 from Ilya Leoshkevich  ---
I got the following results on gcc113:

400.perlbench

Compiler flags: -DSPEC_CPU -DNDEBUG -DPERL_CORE   -march=native -g -O3
-funroll-loops -fopt-info-vec-optimized   -DSPEC_CPU -DNDEBUG -DPERL_CORE
-DSPEC_CPU_LINUX -DSPEC_CPU_BIGENDIAN -D_GNU_SOURCE -DSPEC_CPU_LP64
-fno-strict-aliasing -std=gnu90

r277511 (without this fix): 884.11s
r277515 (with this fix):    874.93s

Maxim, could you please share compiler flags with which you are seeing the
regression?

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-06 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #22 from Ilya Leoshkevich  ---
Hello Maxim,

Sorry about that!

I don't think it's possible to simply move jump threading back, since
it's not correct to have it where it used to be.  So I will build and
run the new and the old 400.perlbench on gcc compile farm and see what
else I can do about the difference.

Best regards,
Ilya

[Bug rtl-optimization/92430] [9/10 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp

2019-11-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430

--- Comment #3 from Ilya Leoshkevich  ---
Findings so far: when we forward an edge like this:

#0  redirect_edge_succ (e=0x76d73cc0, new_succ=0x76c2aa90) at
../.././gcc/cfg.c:368
#1  0x00a776ff in redirect_edge_succ_nodup (e=0x76d73cc0,
new_succ=0x76c2aa90) at ../.././gcc/cfghooks.c:469
#2  0x00a9c18a in cfg_layout_redirect_edge_and_branch
(e=0x76d73cc0, dest=0x76c2aa90) at ../.././gcc/cfgrtl.c:4500
#3  0x00a77419 in redirect_edge_and_branch (e=0x76d73cc0,
dest=0x76c2aa90) at ../.././gcc/cfghooks.c:373
#4  0x02496e8d in try_forward_edges (mode=40, b=0x76d86680) at
../.././gcc/cfgcleanup.c:563
#5  0x024a2654 in try_optimize_cfg (mode=40) at
../.././gcc/cfgcleanup.c:2961
#6  0x024a2d1a in cleanup_cfg (mode=40) at
../.././gcc/cfgcleanup.c:3175
#7  0x024a2f29 in (anonymous
namespace)::pass_jump_after_combine::execute (this=0x38a2b00) at
../.././gcc/cfgcleanup.c:3315

we don't seem to correctly update dominance info (if at all), making it
inconsistent with the actual CFG. In this particular case, inconsistency
makes the following call chain produce a loop in the dominator tree:

#3  0x00b37638 in redirect_immediate_dominators (dir=CDI_DOMINATORS,
bb=0x76c2ab60, to=0x76d867b8) at ../.././gcc/dominance.c:995
#4  0x00a7838c in merge_blocks (a=0x76d867b8, b=0x76c2ab60) at
../.././gcc/cfghooks.c:852
#5  0x024a1a1d in try_optimize_cfg (mode=40) at
../.././gcc/cfgcleanup.c:2825
#6  0x024a2d1a in cleanup_cfg (mode=40) at
../.././gcc/cfgcleanup.c:3175
#7  0x024a2f29 in (anonymous
namespace)::pass_jump_after_combine::execute (this=0x38a2b00) at
../.././gcc/cfgcleanup.c:3315

which ultimately leads to the hang that we are observing.
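
Why a loop in the dominator tree turns into a hang can be illustrated with a
toy model.  This is a sketch under assumptions, not GCC's dominance.c: the
tree is a plain array where idom[b] is the immediate dominator of block b and
-1 marks the entry block, and idom_walk_terminates is a hypothetical helper.
An unbounded walk towards the entry never finishes once the array contains a
cycle, so the sketch bounds the walk and reports the cycle instead.

```c
#include <assert.h>

/* Follow immediate dominators starting at BB.  Return 1 if the walk reaches
   the entry block (whose idom is -1) and 0 if it is still going after more
   steps than there are blocks, which can only happen if there is a cycle.  */
int
idom_walk_terminates (const int *idom, int n_blocks, int bb)
{
  int steps;

  assert (bb >= 0 && bb < n_blocks);
  for (steps = 0; steps <= n_blocks; steps++)
    {
      if (bb == -1)
        return 1;
      bb = idom[bb];
    }
  return 0;
}
```

For a consistent tree such as { -1, 0, 1, 2 } every walk terminates; for
{ -1, 2, 1, 2 }, where blocks 1 and 2 claim to dominate each other, a walk
starting at block 3 does not.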

[Bug rtl-optimization/92430] [9/10 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp

2019-11-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com,
                   |                            |krebbel1 at de dot ibm.com

--- Comment #2 from Ilya Leoshkevich  ---
I'm looking into this.

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-17 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #15 from Ilya Leoshkevich  ---
Created attachment 47059
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47059&action=edit
proposed fix (without renaming the pass so far)

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-17 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #14 from Ilya Leoshkevich  ---
Created attachment 47058
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47058&action=edit
temporary patch for finding out the number of threaded edges

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-17 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #12 from Ilya Leoshkevich  ---
> Well, it apparently has found new jump threading opportunities after
> partition_blocks.  Are such changes useful?  Does it happen often?

It's still combine that was responsible for this particular opportunity.
I've added a simple counter of threaded edges and built two compiler
versions: with and without the patch from comment 3. The value of the
counter is the same in both cases for the code from this bugreport.

Furthermore, I've built SPEC 2006 and SPEC 2017 with vanilla and patched
compilers and aggregated the counter values.

When doing jump threading right after reload, 3889 edges are threaded.
When doing jump threading right after combine, 3918 edges are threaded.

Both figures are more or less the same; we even end up losing a few
opportunities if we delay jump threading.

[Bug tree-optimization/92115] [10 Regression] ICE in gimple_cond_get_ops_from_tree, at gimple-expr.c:577

2019-10-17 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92115

--- Comment #6 from Ilya Leoshkevich  ---
> On 16.10.2019 at 16:32, asolokha at gmx dot com wrote:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92115
> 
> --- Comment #4 from Arseny Solokha  ---
> (In reply to Ilya Leoshkevich from comment #3)
>> Arseny, how did you find this? Did you just run the regtest? I wonder why
>> I didn't see it during my test runs.
> 
> My test harness continuously compiles a corpus of C, C++, and Fortran code
> with the latest weekly trunk snapshot, picking a random set of compiler
> options and parameters for each file. gcc and libstdc++ test suites
> constitute an important part of that corpus. When compiling files from these
> test suites, my test harness ignores compiler options specified there for
> DejaGNU and uses its own randomly chosen ones instead. Of course, this
> approach is not suitable for testing run-time correctness.
> 
> So, if there are no testcases in the test suite yet which could trigger that
> specific code path in gcc internals, probably due to an unusual set of
> compiler options, your testing won't reveal a problem reported here. That is
> probably OK, as regression testing has to be deterministic, after all.

Hi Arseny,

Did you by any chance open-source it?

Best regards,
Ilya

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-16 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #10 from Ilya Leoshkevich  ---
> Question is how to figure out which to do when.

I would always do the former before reload, and always the latter after
reload.

However, I have a concern regarding this approach: in more complicated
cases instead of just a single 11/COLD we might have a larger lump of
cold basic blocks.  In order to avoid introducing new crossing edges, we
would have to make them all hot (using e.g. a simple worklist
algorithm).  Is such an end result desirable?
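
The worklist idea can be sketched as follows.  This is a toy model, not GCC
code: struct toy_cfg, enum partition and make_cold_lump_hot are hypothetical
names, the CFG is a small adjacency matrix, and "lump" is assumed to mean all
cold blocks connected to the starting block through cold blocks only, with
edges followed in either direction.

```c
#include <assert.h>

#define MAX_BLOCKS 32

enum partition { HOT, COLD };

/* Toy CFG: edge[i][j] != 0 means there is an edge from block i to block j.  */
struct toy_cfg
{
  int n_blocks;
  unsigned char edge[MAX_BLOCKS][MAX_BLOCKS];
  enum partition part[MAX_BLOCKS];
};

/* Starting from cold block START, make hot every cold block connected to it
   through cold blocks only.  Return the number of blocks changed.  */
int
make_cold_lump_hot (struct toy_cfg *cfg, int start)
{
  int worklist[MAX_BLOCKS];
  int n_work = 0, n_changed = 0, bb, other;

  assert (cfg->n_blocks <= MAX_BLOCKS);
  assert (start >= 0 && start < cfg->n_blocks);
  if (cfg->part[start] != COLD)
    return 0;

  cfg->part[start] = HOT;
  worklist[n_work++] = start;
  n_changed++;

  while (n_work > 0)
    {
      bb = worklist[--n_work];
      for (other = 0; other < cfg->n_blocks; other++)
        {
          int adjacent = cfg->edge[bb][other] || cfg->edge[other][bb];

          if (adjacent && cfg->part[other] == COLD)
            {
              /* Mark before pushing, so each block is pushed at most once.  */
              cfg->part[other] = HOT;
              worklist[n_work++] = other;
              n_changed++;
            }
        }
    }
  return n_changed;
}
```

The return value makes the concern above visible: the larger the lump, the
more code moves from the cold section into the hot one.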

I'd also still like to understand the motivation behind doing this pass
after reload.  When I introduced it in r266734, the only goal was to
clean up the CFG after combine.  I was advised to put it where it is
now, and back then I did not see any downsides to doing so.  But now
that this problem has arisen - what is the advantage of doing this after
the following 16 additional passes?  What would be the downside of doing
it between pass_combine and pass_partition_blocks?

  NEXT_PASS (pass_combine);
--
  NEXT_PASS (pass_if_after_combine);
  NEXT_PASS (pass_partition_blocks);
  NEXT_PASS (pass_outof_cfg_layout_mode);
  NEXT_PASS (pass_split_all_insns);
  NEXT_PASS (pass_lower_subreg3);
  NEXT_PASS (pass_df_initialize_no_opt);
  NEXT_PASS (pass_stack_ptr_mod);
  NEXT_PASS (pass_mode_switching);
  NEXT_PASS (pass_match_asm_constraints);
  NEXT_PASS (pass_sms);
  NEXT_PASS (pass_live_range_shrinkage);
  NEXT_PASS (pass_sched);
  NEXT_PASS (pass_early_remat);
  NEXT_PASS (pass_ira);
  NEXT_PASS (pass_reload);
  NEXT_PASS (pass_postreload);
  PUSH_INSERT_PASSES_WITHIN (pass_postreload)
--
  NEXT_PASS (pass_postreload_jump);

[Bug tree-optimization/92115] [10 Regression] ICE in gimple_cond_get_ops_from_tree, at gimple-expr.c:577

2019-10-16 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92115

--- Comment #3 from Ilya Leoshkevich  ---
Thanks again, Jakub.

Arseny, how did you find this? Did you just run the regtest? I wonder why
I didn't see it during my test runs.

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #7 from Ilya Leoshkevich  ---
How can we do this here?

When we make a decision to eliminate bb 5, all the "nearby" edges are
hot.

Having eliminated bb 5, we cannot avoid making bb 6 cold, since this
would violate CFG integrity: as far as I understand, it's important to
maintain the property that cold bbs cannot dominate hot bbs.

So we would have to avoid eliminating bb 5 in the first place, and for
that we would need to analyze which consequences that would have w.r.t.
dominators and partitioning, and that might be costly.
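
The property mentioned above (cold blocks must not dominate hot blocks) can be
checked on a toy CFG like this.  The sketch is an illustration only: it uses
the textbook iterative dominator computation on bit masks rather than GCC's
dominance.c, and compute_dominators and find_hot_block_dominated_by_cold are
hypothetical names.  Block 0 is assumed to be the entry.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 32

/* pred[b] is a bit mask of the predecessors of block b.  On return, dom[b]
   is a bit mask of the blocks that dominate b (including b itself).  */
void
compute_dominators (int n_blocks, const uint32_t *pred, uint32_t *dom)
{
  uint32_t all;
  int changed = 1, b, p;

  assert (n_blocks > 0 && n_blocks <= MAX_BLOCKS);
  all = n_blocks == 32 ? UINT32_MAX : (UINT32_C (1) << n_blocks) - 1;

  dom[0] = 1;
  for (b = 1; b < n_blocks; b++)
    dom[b] = all;

  /* Iterate dom[b] = {b} | intersection of dom[p] over predecessors p.  */
  while (changed)
    {
      changed = 0;
      for (b = 1; b < n_blocks; b++)
        {
          uint32_t meet = all;

          for (p = 0; p < n_blocks; p++)
            if (pred[b] & (UINT32_C (1) << p))
              meet &= dom[p];
          meet |= UINT32_C (1) << b;
          if (meet != dom[b])
            {
              dom[b] = meet;
              changed = 1;
            }
        }
    }
}

/* Return the first hot block that has a cold dominator, or -1 if none.  */
int
find_hot_block_dominated_by_cold (int n_blocks, const uint32_t *dom,
                                  uint32_t cold_mask)
{
  int b;

  for (b = 0; b < n_blocks; b++)
    {
      uint32_t self = UINT32_C (1) << b;

      if (!(cold_mask & self) && (dom[b] & cold_mask))
        return b;
    }
  return -1;
}
```

In a three-block example with edges 0->1, 0->2 and 1->2, where only block 1
is cold, the check passes.  Removing the bypassing edge 0->2 makes cold block
1 dominate hot block 2, and the check reports block 2.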

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #5 from Ilya Leoshkevich  ---
+1 regarding renaming, I just wanted to keep it simple here.

Landing pad issue aside, I'm beginning to wonder if we can have a jump
pass after reload at all?  For example, if hotness of a basic block
changes, and a related jump becomes a crossing one: can it be that on
some targets we would have to change a "simple" branching instruction
to a sequence that first fetches a target address from a literal pool?
And then, since reload has already completed, how do we allocate a
register for that?

[Bug middle-end/92063] [10 Regression] ICE in operation_could_trap_p, at tree-eh.c:2528 when compiling Python's Python/_warnings.c

2019-10-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92063

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #10 from Ilya Leoshkevich  ---
Hi Jakub, thanks for fixing this!  FWIW the patch looks good to me.  I
have also run my signaling comparison tests on S/390 with it, and they
still work.

Is there something else I need to look at in the context of this problem?

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-11 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #3 from Ilya Leoshkevich  ---
Jump threading has converted this:

  +-- 2/HOT +
  | |
  v v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
|   ^
|   |
+---+

into this:

  +-- 2/HOT --+
  |   |
  v   v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

by eliminating bb 5.  This made bb 6 dominated by cold bb 11, and
because of this fixup_partitions made bb 6 cold as well, which in turn
made EH edge 6->16 a crossing one.

According to

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=176696

in this situation we need to create a cold landing pad.


But I wonder whether we could just do the following instead?

--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -439,6 +439,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ud_rtl_dce);
   NEXT_PASS (pass_combine);
   NEXT_PASS (pass_if_after_combine);
+  NEXT_PASS (pass_postreload_jump);
   NEXT_PASS (pass_partition_blocks);
   NEXT_PASS (pass_outof_cfg_layout_mode);
   NEXT_PASS (pass_split_all_insns);
@@ -455,7 +456,6 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_reload);
   NEXT_PASS (pass_postreload);
   PUSH_INSERT_PASSES_WITHIN (pass_postreload)
-  NEXT_PASS (pass_postreload_jump);
   NEXT_PASS (pass_postreload_cse);
   NEXT_PASS (pass_gcse2);
   NEXT_PASS (pass_split_after_reload);

This will fix this problem while retaining the benefits of the
additional jump threading pass.

[Bug target/91323] LTGT rtx produces UCOMISS instead of COMISS

2019-10-10 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91323

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #21 from Ilya Leoshkevich  ---
I'm happy with x86 and spec improvements; the latter has also helped me to make
progress with S/390 implementation of signaling comparisons. Thanks!

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-10-09 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #2 from Ilya Leoshkevich  ---
I will have a look at this and try to adjust the CLEANUP_THREADING code.

[Bug target/88082] ICE in change_address_1, at emit-rtl.c:2286

2019-08-27 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88082

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #1 from Ilya Leoshkevich  ---
Hello Martin, do you by any chance remember the failing revision?

With r274945 and stable gcc 9.1.1 it seems to work fine:

$ ./build/gcc/cc1 gcc/testsuite/c-c++-common/pr59037.c -Os -march=z14 ; echo $?
0

$ gcc-9 gcc/testsuite/c-c++-common/pr59037.c -Os -march=z14 ; echo $?
0

[Bug target/87206] Suboptimal code generation for __atomic_compare_exchange_n followed by a comparison

2019-08-26 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87206

--- Comment #1 from Ilya Leoshkevich  ---
Gentle ping.  Is there a way to make this work?  I could look into implementing
this if someone points me in the right direction.

[Bug target/91323] New: LTGT rtx produces UCOMISS instead of COMISS

2019-08-01 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91323

            Bug ID: 91323
           Summary: LTGT rtx produces UCOMISS instead of COMISS
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

I'm implementing signaling comparisons on S/390 and I'm trying to figure out
whether or not LTGT is supposed to be signaling.
I've decided to check what Intel does, and ran into what appears to be a bug.

Consider the following functions:

int f1(float a, float b) { return a < b || a > b; }
int f2(float a, float b) { return __builtin_isless(a, b) ||
__builtin_isgreater(a, b); }
int f3(float a, float b) { return __builtin_islessgreater(a, b); }

gcc creates LTGT rtx for f1 and UNEQ for f2 and f3.
However, for all 3 variants it then emits UCOMISS instruction.
I would expect f1 to be compiled to COMISS, since I believe that comparison
operators in C are supposed to be signaling.
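For reference, the three variants return the same values for every input; they are expected to differ only in whether a quiet NaN operand raises the invalid-operation exception.  A minimal sketch of the value-level equivalence follows (variants_agree and nan_case_agrees are hypothetical helpers added for illustration, and the exception flags themselves are deliberately not inspected here):

```c
#include <math.h>   /* NAN */

/* The three functions from the report, unchanged.  */
int f1(float a, float b) { return a < b || a > b; }
int f2(float a, float b) { return __builtin_isless(a, b) ||
                                  __builtin_isgreater(a, b); }
int f3(float a, float b) { return __builtin_islessgreater(a, b); }

/* Hypothetical helper: returns 1 when all three variants yield the
   same result for the pair (a, b).  */
int variants_agree(float a, float b)
{
  int r = f1(a, b);
  return r == f2(a, b) && r == f3(a, b);
}

/* Hypothetical helper: an unordered pair compares as "not less and
   not greater" in all three variants, so each returns 0.  */
int nan_case_agrees(void)
{
  return variants_agree(NAN, 1.0f) && f1(NAN, 1.0f) == 0;
}
```

Since the results agree, the choice between COMISS and UCOMISS is only observable through the floating-point exception flags.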

[Bug target/89233] [9 Regression] ICE in change_address_1, at emit-rtl.c:2286

2019-02-07 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89233

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #2 from Ilya Leoshkevich  ---
I'll look into this.

[Bug rtl-optimization/87902] [9 Regression] Shrink-wrapping multiple conditions

2018-11-23 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87902

--- Comment #7 from Ilya Leoshkevich  ---
Apparently, for this specific case, doing more hard register copy
propagation is enough.  I've just tried running pass_cprop_hardreg
before pass_thread_prologue_and_epilogue, and it helped.

So, would running a mini-cprop_hardreg instead of just
copyprop_hardreg_forward_bb_without_debug_insn (entry_block) be
reasonable here?  Something along the lines of:

- Do something like pre_and_rev_post_order_compute_fn (), but do not go
  further from bbs which contain insns satisfying
  requires_stack_frame_p (), since shrink-wrapping cannot happen past
  those anyway.

  Same for bbs which have more than 1 predecessor, since
  cprop_hardreg forgets everything it saw when it encounters those.  Not
  sure if a reasonable merge function can be defined for struct
  value_data to improve this?

  Maybe also stop completely when a certain number of bbs is found.

- Do something like pass_cprop_hardreg::execute (), but use only bbs
  computed during the previous step.  Btw, would reverse postorder be
  the "more intelligent queuing of blocks" mentioned in
  pass_cprop_hardreg::execute ()?
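
The block-selection step proposed above could look roughly like the following sketch on a toy CFG.  Everything here is hypothetical and is not GCC code: struct toy_bb, its needs_frame flag (standing in for "contains an insn satisfying requires_stack_frame_p ()"), and collect_candidate_bbs are made up for illustration.  One assumption worth stating: blocks that stop the walk are still recorded, they are just not expanded further.

```c
#include <stddef.h>

enum { MAX_BBS = 16, MAX_SUCCS = 2 };

/* Hypothetical toy basic block; indices into the cfg array play the
   role of basic block numbers.  */
struct toy_bb {
  int n_preds;            /* number of predecessors */
  int needs_frame;        /* nonzero if the block requires a stack frame */
  int n_succs;            /* number of valid entries in succs */
  int succs[MAX_SUCCS];   /* successor block indices */
};

/* Preorder walk that records BB and descends into its successors unless
   BB requires a frame or has more than one predecessor.  */
static void visit(const struct toy_bb *cfg, int bb, unsigned char *seen,
                  int *out, size_t *n, size_t limit)
{
  if (seen[bb] || *n >= limit)
    return;
  seen[bb] = 1;
  out[(*n)++] = bb;
  if (cfg[bb].needs_frame || cfg[bb].n_preds > 1)
    return;
  for (int i = 0; i < cfg[bb].n_succs; i++)
    visit(cfg, cfg[bb].succs[i], seen, out, n, limit);
}

/* Collects at most LIMIT candidate blocks reachable from ENTRY into OUT
   and returns how many were stored.  */
size_t collect_candidate_bbs(const struct toy_bb *cfg, int entry,
                             int *out, size_t limit)
{
  unsigned char seen[MAX_BBS] = {0};
  size_t n = 0;
  visit(cfg, entry, seen, out, &n, limit);
  return n;
}
```

The limit argument corresponds to the "stop completely when a certain number of bbs is found" idea; a second step would then run the copy propagation only over the blocks stored in out.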



When you say that what IRA does is not effective, do you mean just the
need to track indirect hard register copies, or can it be improved even
further?

[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x

2018-11-13 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762

--- Comment #5 from Ilya Leoshkevich  ---
Martin, I believe I fixed this one.  Could you please give it another try?

[Bug rtl-optimization/87902] [9 Regression] Shrink-wrapping multiple conditions

2018-11-09 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87902

--- Comment #5 from Ilya Leoshkevich  ---
By the time shrink-wrapping is performed, which is after LRA
(pass_thread_prologue_and_epilogue, to be precise), aren't all spilling
decisions already made?  Because if that's true, we have to be
conservative in prepare_shrink_wrap () anyway, and move down copies only
when the parameter register still contains the parameter value.

[Bug rtl-optimization/87902] [9 Regression] Shrink-wrapping multiple conditions

2018-11-08 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87902

--- Comment #3 from Ilya Leoshkevich  ---
Judging by the following comment in lra-coalesce.c, RA doesn't do this
intentionally:

   Here we coalesce only spilled pseudos.  Coalescing non-spilled
   pseudos (with different hard regs) might result in spilling
   additional pseudos because of possible conflicts with other
   non-spilled pseudos and, as a consequence, in more constraint
   passes and even LRA infinite cycling.  Trivial the same hard
   register moves will be removed by subsequent compiler passes.

In which cases would moving copies down in prepare_shrink_wrap () make
the code worse?

[Bug rtl-optimization/87902] [9 Regression] Shrink-wrapping multiple conditions

2018-11-06 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87902

--- Comment #1 from Ilya Leoshkevich  ---
Bisect points to r265398: combine: Do not combine moves from hard
registers.

I wonder what would be the best place to fix this?  I was thinking about
making shrink-wrapping try harder by not limiting the processing to the
first basic block.

[Bug rtl-optimization/87902] New: [9 Regression] Shrink-wrapping multiple conditions

2018-11-06 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87902

            Bug ID: 87902
           Summary: [9 Regression] Shrink-wrapping multiple conditions
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
                CC: krebbel at gcc dot gnu.org
  Target Milestone: ---
            Target: s390x-linux-gnu

I've noticed that r265830 fails to shrink-wrap multiple early returns in
gcc/testsuite/gcc.target/s390/nobp-return-mem-z900.c, while r264877 managed to
do so just fine.

After reload we end up with the following code for those conditions:

;; basic block 2
(note 5 1 3 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 3 5 2 2 NOTE_INSN_FUNCTION_BEG)
(insn 2 3 7 2 (set (reg/v:DI 12 %r12 [orig:63 aD.2191+-4 ] [63])
(reg:DI 2 %r2 [72]))
"gcc/testsuite/gcc.target/s390/nobp-return-mem-z900.c":14:1 1269 {*movdi_64}
 (nil))
(insn 7 2 8 2 (set (reg:CCZ 33 %cc)
(compare:CCZ (reg:SI 12 %r12 [orig:63 aD.2191 ] [63])
(const_int 42 [0x2a])))
"gcc/testsuite/gcc.target/s390/nobp-return-mem-z900.c":17:6 1232 {*cmpsi_cct}
 (nil))
(jump_insn 8 7 9 2 (set (pc)
(if_then_else (eq (reg:CCZ 33 %cc)
(const_int 0 [0]))
(label_ref:DI 33)
(pc))) "gcc/testsuite/gcc.target/s390/nobp-return-mem-z900.c":17:6
1896 {*cjump_64}
 (int_list:REG_BR_PROB 225163668 (nil))
 -> 33)

;; basic block 3
(note 9 8 12 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 12 9 13 3 (set (reg:CCS 33 %cc)
(compare:CCS (reg:SI 12 %r12 [orig:63 aD.2191 ] [63])
(const_int 0 [0])))
"gcc/testsuite/gcc.target/s390/nobp-return-mem-z900.c":20:3 1222
{*tstsi_cconly2}
 (nil))
(jump_insn 13 12 14 3 (set (pc)
(if_then_else (le (reg:CCS 33 %cc)
(const_int 0 [0]))
(label_ref:DI 33)
(pc))) "gcc/testsuite/gcc.target/s390/nobp-return-mem-z900.c":20:3
1896 {*cjump_64}
 (int_list:REG_BR_PROB 118111604 (nil))
 -> 33)

Note that comparisons use a copy in callee-saved %r12, and not %r2.  Then,
prepare_shrink_wrap () successfully propagates it to basic block 2. Basic block
3 is not affected - this seems to be by design, since prepare_shrink_wrap ()
only concerns itself with the first basic block.

In the past reload used to eliminate the copy and use %r2 directly in both
comparisons, but this seems to be no longer the case.

[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x

2018-10-29 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #2 from Ilya Leoshkevich  ---
This must have slipped through, because I tested the movdi patch on top of the
outdated trunk (r264877).

Candidate fix: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01793.html

[Bug bootstrap/87747] [9 regression] Bootstrap failure if using gcc-4.6 as stage1 compiler

2018-10-25 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87747

--- Comment #2 from Ilya Leoshkevich  ---
This is a little bit more complicated, because EQ_ATTR_ALT is valid only for
GENERATOR_FILEs.  The regtest has just finished, so I will post the patch to
the mailing list now.

[Bug tree-optimization/87687] New: s390x gcc 9 ICE in value_range::check

2018-10-22 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87687

            Bug ID: 87687
           Summary: s390x gcc 9 ICE in value_range::check
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---
            Target: s390x-redhat-linux

SVN r265373 / git f9fd74d64e9:

$ f9fd74d64e9-install/bin/gcc -x c -O2 -c -
void b() {
  int c = 1, d, e = 4096;
  for (; c; c--) {
    d = 1;
    for (; d; d--)
      e--;
  }
}
during GIMPLE pass: evrp
<stdin>: In function ‘b’:
<stdin>:8:1: internal compiler error: in check, at tree-vrp.c:155
0x1ab6019 value_range::check()
/home/iii/ibm/gcc-bisect/src/gcc/tree-vrp.c:155
0x1ab9a35 value_range::value_range(value_range_kind, tree_node*, tree_node*,
bitmap_head*)
/home/iii/ibm/gcc-bisect/src/gcc/tree-vrp.c:110
0x1ab9a35 set_value_range_with_overflow
/home/iii/ibm/gcc-bisect/src/gcc/tree-vrp.c:1422
0x1ab9a35 extract_range_from_binary_expr_1(value_range*, tree_code, tree_node*,
value_range const*, value_range const*)
/home/iii/ibm/gcc-bisect/src/gcc/tree-vrp.c:1679
0x1b48af7 vr_values::extract_range_from_binary_expr(value_range*, tree_code,
tree_node*, tree_node*, tree_node*)
/home/iii/ibm/gcc-bisect/src/gcc/vr-values.c:734
0x1b4b0d1 vr_values::extract_range_from_assignment(value_range*, gassign*)
/home/iii/ibm/gcc-bisect/src/gcc/vr-values.c:1389
0x1f03e29 evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool)
/home/iii/ibm/gcc-bisect/src/gcc/gimple-ssa-evrp-analyze.c:285
0x1f0228f evrp_dom_walker::before_dom_children(basic_block_def*)
/home/iii/ibm/gcc-bisect/src/gcc/gimple-ssa-evrp.c:139
0x1edb47d dom_walker::walk(basic_block_def*)
/home/iii/ibm/gcc-bisect/src/gcc/domwalk.c:353
0x1f02dc9 execute_early_vrp
/home/iii/ibm/gcc-bisect/src/gcc/gimple-ssa-evrp.c:311
0x1f02dc9 execute
/home/iii/ibm/gcc-bisect/src/gcc/gimple-ssa-evrp.c:348
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357

2018-09-24 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417

--- Comment #6 from Ilya Leoshkevich  ---
Candidate patch here:
  https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01382.html

[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357

2018-09-24 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417

--- Comment #5 from Ilya Leoshkevich  ---
Ok, makes sense.  I've just made a patch that adds the 5th, but it had to be
special-cased for GENERATOR_FILE, and thus doesn't look too nice.  FORMAT[0] ==
'w' looks much cleaner.

[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357

2018-09-24 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417

--- Comment #3 from Ilya Leoshkevich  ---
Valgrind has found an issue:

==12738== Invalid write of size 4
==12738==    at 0x804CC48: attr_rtx_1 (genattrtab.c:518)
==12738==    by 0x804CC48: attr_rtx(rtx_code, ...) (genattrtab.c:588)
==12738==    by 0x804EA6D: mk_attr_alt (genattrtab.c:2406)
==12738==    by 0x804EA6D: check_attr_test(file_location, rtx_def*, attr_desc*) (genattrtab.c:709)
==12738==    by 0x804EBBF: check_attr_value(file_location, rtx_def*, attr_desc*) (genattrtab.c:945)
==12738==    by 0x804A0AA: check_defs (genattrtab.c:1108)
==12738==    by 0x804A0AA: main (genattrtab.c:5253)
==12738==  Address 0x6d79aa0 is 0 bytes after a block of size 16 alloc'd
==12738==    at 0x402E27C: malloc (vg_replace_malloc.c:299)
==12738==    by 0x8064FD3: xmalloc (xmalloc.c:147)
==12738==    by 0x805233E: ggc_internal_alloc (ggc.h:130)
==12738==    by 0x805233E: ggc_alloc_rtx_def_stat (ggc.h:275)
==12738==    by 0x805233E: rtx_alloc_stat_v(rtx_code, int) (rtl.c:209)
==12738==    by 0x805236D: rtx_alloc(rtx_code) (rtl.c:233)
==12738==    by 0x804CC39: attr_rtx_1 (genattrtab.c:516)
==12738==    by 0x804CC39: attr_rtx(rtx_code, ...) (genattrtab.c:588)
==12738==    by 0x804EA6D: mk_attr_alt (genattrtab.c:2406)
==12738==    by 0x804EA6D: check_attr_test(file_location, rtx_def*, attr_desc*) (genattrtab.c:709)
==12738==    by 0x804EBBF: check_attr_value(file_location, rtx_def*, attr_desc*) (genattrtab.c:945)
==12738==    by 0x804A0AA: check_defs (genattrtab.c:1108)
==12738==    by 0x804A0AA: main (genattrtab.c:5253)

Apparently allocated EQ_ATTR_ALT is smaller than I expect: 16 bytes are clearly
not enough to contain rtx_def and 2 HOST_WIDE_INTs.
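
To illustrate the size mismatch, here is a toy calculation, not the real rtx_code_size logic.  It assumes an 8-byte header and 4-byte generic operand slots, figures chosen only because they reproduce the 16-byte block valgrind reports on i686; toy_rtx_size and both constants are hypothetical.  The format letters follow the RTL convention of 'i' for int and 'w' for HOST_WIDE_INT operands.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for a 32-bit host; illustrative only.  */
enum { TOY_HEADER_SIZE = 8, TOY_SLOT_SIZE = 4 };

/* Hypothetical size computation: when the first format letter is 'w',
   every operand slot is sized as a 64-bit wide integer, otherwise each
   slot gets the generic slot size.  */
size_t toy_rtx_size(const char *format)
{
  size_t n_operands = 0;
  while (format[n_operands] != '\0')
    n_operands++;

  if (format[0] == 'w')
    return TOY_HEADER_SIZE + n_operands * sizeof(int64_t);
  return TOY_HEADER_SIZE + n_operands * TOY_SLOT_SIZE;
}
```

Under these assumptions a two-operand "ii" node occupies 16 bytes while a "ww" node needs 24, so storing two wide integers into a node allocated for "ii" writes past the end of the block.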

[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357

2018-09-24 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417

--- Comment #2 from Ilya Leoshkevich  ---
Fails on i686-linux-gnu:

*** Error in `build/genattrtab': malloc(): memory corruption: 0x08e56da0 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x6738a)[0xf755c38a]
/lib/i386-linux-gnu/libc.so.6(+0x6dfc7)[0xf7562fc7]
/lib/i386-linux-gnu/libc.so.6(+0x6ff82)[0xf7564f82]
/lib/i386-linux-gnu/libc.so.6(__libc_malloc+0xc5)[0xf7566bf5]
build/genattrtab[0x8064fd4]
build/genattrtab[0x805233f]
build/genattrtab[0x805236e]
build/genattrtab[0x804cc3a]
build/genattrtab[0x804ea6e]
build/genattrtab[0x804ebc0]
build/genattrtab[0x804a0ab]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf6)[0xf750d286]
build/genattrtab[0x804ba27]
======= Memory map: ========
08048000-08091000 r-xp  00:21 29450225  
/Users/beep/ibm/gcc/host-x86_64-pc-linux-gnu/gcc/build/genattrtab
08092000-08093000 r--p 00049000 00:21 29450225  
/Users/beep/ibm/gcc/host-x86_64-pc-linux-gnu/gcc/build/genattrtab
08093000-08097000 rw-p 0004a000 00:21 29450225  
/Users/beep/ibm/gcc/host-x86_64-pc-linux-gnu/gcc/build/genattrtab
08097000-0809b000 rw-p  00:00 0
08452000-08eef000 rw-p  00:00 0  [heap]
f710-f7121000 rw-p  00:00 0
f7121000-f720 ---p  00:00 0
f72af000-f73b1000 rw-p  00:00 0
f73ce000-f73ea000 r-xp  00:2b 74
/lib/i386-linux-gnu/libgcc_s.so.1
f73ea000-f73eb000 r--p 0001b000 00:2b 74
/lib/i386-linux-gnu/libgcc_s.so.1
f73eb000-f73ec000 rw-p 0001c000 00:2b 74
/lib/i386-linux-gnu/libgcc_s.so.1
f73f1000-f74f5000 rw-p  00:00 0
f74f5000-f76a6000 r-xp  00:2b 43
/lib/i386-linux-gnu/libc-2.24.so
f76a6000-f76a8000 r--p 001b 00:2b 43
/lib/i386-linux-gnu/libc-2.24.so
f76a8000-f76a9000 rw-p 001b2000 00:2b 43
/lib/i386-linux-gnu/libc-2.24.so
f76a9000-f76ac000 rw-p  00:00 0
f76ac000-f76ff000 r-xp  00:2b 88
/lib/i386-linux-gnu/libm-2.24.so
f76ff000-f770 r--p 00052000 00:2b 88
/lib/i386-linux-gnu/libm-2.24.so
f770-f7701000 rw-p 00053000 00:2b 88
/lib/i386-linux-gnu/libm-2.24.so
f7705000-f7709000 rw-p  00:00 0
f7709000-f770b000 r--p  00:00 0  [vvar]
f770b000-f770c000 r-xp  00:00 0  [vdso]
f770c000-f772f000 r-xp  00:2b 36
/lib/i386-linux-gnu/ld-2.24.so
f772f000-f773 r--p 00022000 00:2b 36
/lib/i386-linux-gnu/ld-2.24.so
f773-f7731000 rw-p 00023000 00:2b 36
/lib/i386-linux-gnu/ld-2.24.so
fffc5000-fffe7000 rw-p  00:00 0 
[stack]

[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357

2018-09-24 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417

--- Comment #1 from Ilya Leoshkevich  ---
Ouch!  Somehow s2 got corrupted (the 2nd value can be either 0 or 1).  I'm
looking at this now.

[Bug tree-optimization/87309] [9 Regression] Spurious note: messages when building with -fopt-info-vec-optimized

2018-09-21 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87309

--- Comment #7 from Ilya Leoshkevich  ---
Thanks!

[Bug tree-optimization/87309] [9 Regression] Spurious note: messages when building with -fopt-info-vec-optimized

2018-09-19 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87309

--- Comment #4 from Ilya Leoshkevich  ---
Do we also need to test m_test_pp_flags?
At least dump_context::emit_item does it.

[Bug tree-optimization/87309] Spurious note: messages when building with -fopt-info-vec-optimized

2018-09-14 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87309

Ilya Leoshkevich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iii at linux dot ibm.com

--- Comment #1 from Ilya Leoshkevich  ---
Created attachment 44693
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44693&action=edit
patch

This fixes the problem for me, but I'm not sure if this is the right solution.

[Bug tree-optimization/87309] New: Spurious note: messages when building with -fopt-info-vec-optimized

2018-09-14 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87309

            Bug ID: 87309
           Summary: Spurious note: messages when building with
                    -fopt-info-vec-optimized
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
  Target Milestone: ---

$ cat test.cpp
void a() {}

$ g++ -c test.cpp -fopt-info-vec-optimized -O3
test.cpp:1:6: note: test.cpp:1:11: note:

This is coming from DUMP_VECT_SCOPE ("vect_analyze_data_refs"); in
vect_analyze_data_refs().  I suspect that alt_flags check around dump_loc call
is missing in dump_context::begin_scope.

[Bug tree-optimization/87206] New: Suboptimal code generation for __atomic_compare_exchange_n followed by a comparison

2018-09-03 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87206

            Bug ID: 87206
           Summary: Suboptimal code generation for
                    __atomic_compare_exchange_n followed by a comparison
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iii at linux dot ibm.com
                CC: krebbel at gcc dot gnu.org
  Target Milestone: ---

I tried to build the example #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080 on x86_64 and observed a
similar issue:

$ cat 1.c
extern void bar (int *);

void foo5(int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, (void *) &oldval, 1,
                               1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (oldval != 0)
    bar (mem);
}

$ gcc-8 -c 1.c -O3 -g

$ objdump -d 1.o
# skip
 <_foo5>:
   0:   31 c0                   xor    %eax,%eax
   2:   ba 01 00 00 00          mov    $0x1,%edx
   7:   f0 0f b1 17             lock cmpxchg %edx,(%rdi)
   b:   85 c0                   test   %eax,%eax
   d:   75 01                   jne    10 <_foo5+0x10>
   f:   c3                      retq
  10:   e9 00 00 00 00          jmpq   15 <_foo5+0x15>

We don't have to do "test %eax,%eax", because this information is already
available through ZF, which is set by CMPXCHG.

I wonder if it would be possible to come up with a common solution for all
architectures, including x86_64 and s390?
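
For comparison, a source-level variant that branches on the builtin's boolean result instead of re-testing oldval is sketched below.  This is only an illustration: foo5_retval is a hypothetical name, bar is replaced by a local stand-in that counts calls, and a strong compare-and-swap (weak argument 0) is assumed so that failure always means *mem differed from the expected 0.  Whether a given GCC version then drops the extra test instruction is not something the sketch demonstrates.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the external bar () from the report; it
   only counts calls so that the behaviour can be observed.  */
static int bar_calls;
static void bar (int *mem) { (void) mem; bar_calls++; }

/* Variant of foo5 that uses the return value of the builtin, which is
   true when the exchange happened and false otherwise.  */
void foo5_retval (int *mem)
{
  int oldval = 0;
  bool swapped = __atomic_compare_exchange_n (mem, &oldval, 1,
                                              0, __ATOMIC_ACQUIRE,
                                              __ATOMIC_RELAXED);
  if (!swapped)
    bar (mem);
}
```

With a strong compare-and-swap, !swapped and oldval != 0 describe the same condition, which is the equivalence the compiler would need to know about to reuse the flags set by CMPXCHG.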

[Bug target/80080] S390: Isses with emitted cs-instructions for __atomic builtins.

2018-08-22 Thread iii at linux dot ibm.com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #12 from Ilya Leoshkevich  ---
I've investigated foo3, foo4 and foo5, and came to the following
conclusions:

When foo3 is compiled with -march=z10 or later, cprop1 pass propagates
global's SYMBOL_REF value into UNSPECV_CAS.  On previous machines it
does not happen, because the result is rejected by insn_invalid_p ().
Then, reload realizes that SYMBOL_REF cannot be a legitimate UNSPECV_CAS
argument, and loads it into a pseudo right before.  The net result is
that loading of SYMBOL_REF is moved from outside of the loop into the
loop.  So we need to somehow inhibit constant propagation for this case.

Jump threading in foo4 does not work, because it's done only during
`jump' pass, at which point there are insns with side-effects in the
basic block of the 2nd jump.  They are later deleted by the `combine'
pass, but we don't request CLEANUP_THREADING after that.  I wonder if we
could introduce it?

In addition, when foo4 is compiled with -O2 or -O3, we don't use
conditional return, because our return sequence contains a PARALLEL,
which is rejected by bb_is_just_return ().  This can also be improved.

Finally, in foo5 `cs' is generated by s390_expand_cs_tdsi (), and
comparison is generated by common expansion logic, so it doesn't look
possible to improve the situation solely in the back-end.  We need to
somehow make gcc aware that (oldval == 0) and (retval != 0) are
equivalent after `cs', but I'm not sure at which point we could and
should do this - in theory doing this on tree rather than RTL level can
help other architectures.
