[Bug middle-end/103762] New: [12 Regression] glibc master branch is miscompiled by r12-897

2021-12-18 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103762

Bug ID: 103762
   Summary: [12 Regression] glibc master branch is miscompiled by
r12-897
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: luoxhu at cn dot ibm.com
  Target Milestone: ---
Target: x86-64

On x86-64, r12-897

commit de56f95afaaa22c67cbeec780921d63e8b34514e
Author: Xionghu Luo 
Date:   Tue May 18 21:34:18 2021 -0500

Run pass_sink_code once more before store_merging

Gimple sink code pass runs quite early, there may be some new
opportunities exposed by later gimple optimization passes, this patch
runs the sink code pass once more before store_merging.  For detailed
discussion, please refer to:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562352.html

Tested the SPEC2017 performance on P8LE: 544.nab_r is improved
by 2.43%, with no big changes to other cases; GEOMEAN is improved
slightly, by 0.25%.

gcc/ChangeLog:

2021-05-18  Xionghu Luo  

* passes.def: Add sink_code pass before store_merging.
* tree-ssa-sink.c (pass_sink_code:clone): New.

gcc/testsuite/ChangeLog:

2021-05-18  Xionghu Luo  

* gcc.dg/tree-ssa/ssa-sink-1.c: Adjust.
* gcc.dg/tree-ssa/ssa-sink-2.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-3.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-4.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-5.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-6.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-7.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-8.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-9.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-10.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-13.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-14.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-16.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-17.c: Ditto.
* gcc.dg/tree-ssa/ssa-sink-18.c: New.

miscompiled glibc master branch:

FAIL: elf/tst-env-setuid

[hjl@gnu-skx-1 build-x86_64-linux]$ env
GCONV_PATH=/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/iconvdata
LOCPATH=/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/localedata
LC_ALL=C MALLOC_CHECK_=2 MALLOC_MMAP_THRESHOLD_=4096 LD_HWCAP_MASK=0x1
/export/build/gnu/tools-build/glibc-test/build-x86_64-linux/elf/tst-env-setuid
Segmentation fault (core dumped)
[hjl@gnu-skx-1 build-x86_64-linux]$ 

(gdb) set env MALLOC_CHECK_=2
(gdb) r --direct
Starting program:
/export/build/gnu/tools-build/glibc-test/build-x86_64-linux/elf/tst-env-setuid
--direct

Program received signal SIGSEGV, Segmentation fault.
0x77f67053 in do_tunable_update_val (cur=cur@entry=0xefefe7f0,
valp=valp@entry=0x7fffdd38, minp=minp@entry=0x0, maxp=maxp@entry=0x0) at
dl-tunables.c:102
102   if (cur->type.type_code == TUNABLE_TYPE_STRING)
(gdb) bt
#0  0x77f67053 in do_tunable_update_val (cur=cur@entry=0xefefe7f0, 
valp=valp@entry=0x7fffdd38, minp=minp@entry=0x0, maxp=maxp@entry=0x0)
at dl-tunables.c:102
#1  0x77f6733c in tunable_initialize (strval=0x7fffefa7 "2", 
cur=) at dl-tunables.c:151
#2  __tunables_init (envp=0x7fffe058) at dl-tunables.c:349
#3  0x77f15f67 in __libc_start_main_impl (main=0x77f128f0 , 
argc=2, argv=0x7fffdea8, init=, fini=, 
rtld_fini=0x0, stack_end=0x7fffde98) at ../csu/libc-start.c:291
#4  0x77f129c5 in _start () at ../sysdeps/x86_64/start.S:115
(gdb)

[Bug middle-end/102080] [12 Regression] avx512vl related ICE, on firefox-92 gcc ICEs: in expand_insn, at optabs.c:7946 by r12-2679

2021-12-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102080

H.J. Lu  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #17 from H.J. Lu  ---
Fixed.

[Bug middle-end/103735] [12 Regression] Extra glibc "make check" failures by r12-4764

2021-12-16 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103735

H.J. Lu  changed:

   What|Removed |Added

Summary|[12 Regression] Extra glibc |[12 Regression] Extra glibc
   |"make check" failures   |"make check" failures by
   ||r12-4764
 CC||rguenth at gcc dot gnu.org

--- Comment #2 from H.J. Lu  ---
FAIL: math/test-float32-yn

is caused by

a84b9d5373c7e67fd0ab2a412c22162cdf969c91 is the first bad commit
commit a84b9d5373c7e67fd0ab2a412c22162cdf969c91
Author: Richard Biener 
Date:   Wed Oct 27 14:27:40 2021 +0200

middle-end/57245 - honor -frounding-math in real truncation

The following honors -frounding-math when converting a FP constant
to another FP type.

2021-10-27  Richard Biener  

PR middle-end/57245
* fold-const.c (fold_convert_const_real_from_real): Honor
-frounding-math if the conversion is not exact.
* simplify-rtx.c (simplify_const_unary_operation): Do not
simplify FLOAT_TRUNCATE with sign dependent rounding.

* gcc.dg/torture/fp-double-convert-float-1.c: New testcase.

[Bug middle-end/103735] [12 Regression] Extra glibc "make check" failures

2021-12-16 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103735

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-12-16
 Status|UNCONFIRMED |NEW

--- Comment #1 from H.J. Lu  ---
The glibc patch:

https://sourceware.org/pipermail/libc-alpha/2021-December/134254.html

fixed:

FAIL: elf/tst-env-setuid

[Bug c/103735] New: [12 Regression] Extra glibc "make check" failures

2021-12-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103735

Bug ID: 103735
   Summary: [12 Regression] Extra glibc "make check" failures
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, joseph at codesourcery dot com
  Target Milestone: ---

With GCC master branch:

commit aeedb00a1ae2ccd10b1a5f00ff466081aeadb54b
Author: Roger Sayle 
Date:   Thu Dec 9 10:45:28 2021 +0100

nvptx: Add (experimental) support for HFmode with -misa=sm_53

I got

FAIL: elf/tst-env-setuid
FAIL: math/test-float-clog10
FAIL: math/test-float-cos
FAIL: math/test-float-j0
FAIL: math/test-float-jn
FAIL: math/test-float-sin
FAIL: math/test-float-sincos
FAIL: math/test-float-y0
FAIL: math/test-float-y1
FAIL: math/test-float-yn
FAIL: math/test-float32-clog10
FAIL: math/test-float32-cos
FAIL: math/test-float32-j0
FAIL: math/test-float32-jn
FAIL: math/test-float32-sin
FAIL: math/test-float32-sincos
FAIL: math/test-float32-y0
FAIL: math/test-float32-y1
FAIL: math/test-float32-yn

in "make check" on glibc master branch on x86-64.  Some of the failures
are

testing float (without inline functions)
Failure: cos (qNaN): Exception "Inexact" set
Failure: cos (-qNaN): Exception "Inexact" set
Failure: cos_downward (qNaN): Exception "Inexact" set
Failure: cos_downward (-qNaN): Exception "Inexact" set
Failure: cos_towardzero (qNaN): Exception "Inexact" set
Failure: cos_towardzero (-qNaN): Exception "Inexact" set
Failure: cos_upward (qNaN): Exception "Inexact" set
Failure: cos_upward (-qNaN): Exception "Inexact" set

GCC 11:

[hjl@gnu-skx-1 build-x86_64-linux]$ env
GCONV_PATH=/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/iconvdata
LOCPATH=/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/localedata
LC_ALL=C MALLOC_CHECK_=2 MALLOC_MMAP_THRESHOLD_=4096 LD_HWCAP_MASK=0x1
/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/elf/tst-env-setuid
error: tst-env-setuid.c:99: SGID failed: GID and EGID match (1000)

[hjl@gnu-skx-1 build-x86_64-linux]$ echo $?
77

GCC 12:

[hjl@gnu-skx-1 build-x86_64-linux]$ env
GCONV_PATH=/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/iconvdata
LOCPATH=/export/build/gnu/tools-build/glibc-cet/build-x86_64-linux/localedata
LC_ALL=C MALLOC_CHECK_=2 MALLOC_MMAP_THRESHOLD_=4096 LD_HWCAP_MASK=0x1
/export/build/gnu/tools-build/glibc-test/build-x86_64-linux/elf/tst-env-setuid
Segmentation fault (core dumped)
[hjl@gnu-skx-1 build-x86_64-linux]$ 

(gdb) set env MALLOC_CHECK_=2
(gdb) r --direct
Starting program:
/export/build/gnu/tools-build/glibc-test/build-x86_64-linux/elf/tst-env-setuid
--direct

Program received signal SIGSEGV, Segmentation fault.
0x77f67053 in do_tunable_update_val (cur=cur@entry=0xefefe7f0,
valp=valp@entry=0x7fffdd38, minp=minp@entry=0x0, maxp=maxp@entry=0x0) at
dl-tunables.c:102
102   if (cur->type.type_code == TUNABLE_TYPE_STRING)
(gdb) bt
#0  0x77f67053 in do_tunable_update_val (cur=cur@entry=0xefefe7f0, 
valp=valp@entry=0x7fffdd38, minp=minp@entry=0x0, maxp=maxp@entry=0x0)
at dl-tunables.c:102
#1  0x77f6733c in tunable_initialize (strval=0x7fffefa7 "2", 
cur=) at dl-tunables.c:151
#2  __tunables_init (envp=0x7fffe058) at dl-tunables.c:349
#3  0x77f15f67 in __libc_start_main_impl (main=0x77f128f0 , 
argc=2, argv=0x7fffdea8, init=, fini=, 
rtld_fini=0x0, stack_end=0x7fffde98) at ../csu/libc-start.c:291
#4  0x77f129c5 in _start () at ../sysdeps/x86_64/start.S:115
(gdb)

[Bug target/103594] [12 Regression] ICE in get, at cgraph.h:1335 since r12-5771

2021-12-07 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103594

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from H.J. Lu  ---
Fixed.

[Bug target/103594] [12 Regression] ICE in get, at cgraph.h:1335 since r12-5771

2021-12-07 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103594

--- Comment #4 from H.J. Lu  ---
(In reply to H.J. Lu from comment #3)
> Why can't I reproduce it?
> 
> $ ./xgcc -B./ -S -O
> /export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.c-torture/compile/
> pr37433.c
> $

-fPIE/-fPIC is needed.

[Bug target/103594] [12 Regression] ICE in get, at cgraph.h:1335 since r12-5771

2021-12-07 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103594

H.J. Lu  changed:

   What|Removed |Added

 CC|hjl at gcc dot gnu.org |hjl.tools at gmail dot com

--- Comment #3 from H.J. Lu  ---
Why can't I reproduce it?

$ ./xgcc -B./ -S -O
/export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.c-torture/compile/pr37433.c
$

[Bug sanitizer/103466] [12 Regression] SIGILL on machine without avx support when using thread sanitizer

2021-12-06 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103466

--- Comment #7 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #6)
> I'd say we should patch away locally the initial v prefixes until the merge
> is done.

The patch is here:

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585755.html

[Bug target/103269] Enable ZMM in MOVE_MAX and STORE_MAX_PIECES without -mprefer-vector-width=512

2021-12-04 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103269

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from H.J. Lu  ---
Fixed for GCC 12.

[Bug bootstrap/103547] [12 Regression] Bootstrap failure with --with-cpu=skylake-avx512

2021-12-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103547

--- Comment #3 from H.J. Lu  ---
r12-5778 builds now.  It has happened once before.  I will leave it open
until we find out exactly what is going on.

[Bug bootstrap/103547] New: [12 Regression] Bootstrap failure

2021-12-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103547

Bug ID: 103547
   Summary: [12 Regression] Bootstrap failure
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: tamar.christina at arm dot com
  Target Milestone: ---
Target: x86-64

On Linux/x86-64, r12-5775 failed to bootstrap when configured

--with-arch=native --with-cpu=native --prefix=/usr/12.0.0 --enable-clocale=gnu
--with-system-zlib --enable-shared --enable-cet --with-demangler-in-ld
--enable-libmpx --with-multilib-list=m32,m64,mx32 --with-fpmath=sse

where native == skylake-avx512:

https://gcc.gnu.org/pipermail/gcc-regression/2021-December/075945.html


In file included from ../../src-master/gcc/gimple-ssa-strength-reduction.c:37:
In member function ‘hash_table::value_type*
hash_table::alloc_entries(size_t) const [with
Descriptor = hash_map::hash_entry; bool Lazy = false;
Allocator = xcallocator]’,
inlined from ‘void hash_table::expand() [with
Descriptor = hash_map::hash_entry; bool Lazy = false;
Allocator = xcallocator]’ at ../../src-master/gcc/hash-table.h:802:40:
../../src-master/gcc/system.h:784:34: error: section type conflict with ‘void
hash_table::expand() [with Descriptor =
hash_map::hash_entry; bool Lazy = false; Allocator =
xcallocator]’
  784 |((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0
: 0))
  |  ^~
../../src-master/gcc/hash-table.h:715:3: note: in expansion of macro
‘gcc_assert’
  715 |   gcc_assert (nentries != NULL);
  |   ^~
In file included from ../../src-master/gcc/coretypes.h:482,
 from ../../src-master/gcc/gimple-ssa-strength-reduction.c:38:
../../src-master/gcc/hash-table.h: In member function ‘void
hash_table::expand() [with Descriptor =
hash_map::hash_entry; bool Lazy = false; Allocator =
xcallocator]’:
../../src-master/gcc/hash-table.h:779:1: note: ‘void hash_table::expand() [with Descriptor = hash_map::hash_entry; bool Lazy = false; Allocator = xcallocator]’ was
declared here
  779 | hash_table::expand ()
  | ^~~
/export/gnu/import/git/gcc-test-master-intel64-native/bld/./prev-gcc/xg++
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/./prev-gcc/
-B/usr/12.0.0/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs

-I/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu

-I/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/include

-I/export/gnu/import/git/gcc-test-master-intel64-native/src-master/libstdc++-v3/libsupc++
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c   -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -I.
-I../../src-master/gcc -I../../src-master/gcc/.
-I../../src-master/gcc/../include -I../../src-master/gcc/../libcpp/include
-I../../src-master/gcc/../libcody  -I../../src-master/gcc/../libdecnumber
-I../../src-master/gcc/../libdecnumber/bid -I../libdecnumber
-I../../src-master/gcc/../libbacktrace   -o loop-invariant.o -MT
loop-invariant.o -MMD -MP -MF ./.deps/loop-invariant.TPo
../../src-master/gcc/loop-invariant.c
/export/gnu/import/git/gcc-test-master-intel64-native/bld/./prev-gcc/xg++
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/./prev-gcc/
-B/usr/12.0.0/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs

-I/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu

-I/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/include

-I/export/gnu/import/git/gcc-test-master-intel64-native/src-master/libstdc++-v3/libsupc++
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs

[Bug pch/71934] pch cannot be disabled so gcc cannot be position independent

2021-12-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71934

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2021-12-03

--- Comment #12 from H.J. Lu  ---
(In reply to Iain Sandoe from comment #11)
> So my guess is that the full answer is:
> 
> Yes, for hosts that can find a 512Mb or so space in the VMA that is
> confidently available - and no otherwise.

We can add a host option to enable PIE.

[Bug pch/71934] pch cannot be disabled so gcc cannot be position independent

2021-12-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71934

--- Comment #9 from H.J. Lu  ---
Can we enable PIE on gcc now?

[Bug target/83782] [9/10/11 Regression] Inconsistent address for hidden ifunc in a shared library

2021-12-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83782

H.J. Lu  changed:

   What|Removed |Added

 Status|REOPENED|NEW
Summary|[9/10/11/12 Regression] |[9/10/11 Regression]
   |Inconsistent address for|Inconsistent address for
   |hidden ifunc in a shared|hidden ifunc in a shared
   |library |library

--- Comment #4 from H.J. Lu  ---
Fixed for GCC 12 so far.

[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-25 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393

--- Comment #14 from H.J. Lu  ---
(In reply to Richard Earnshaw from comment #13)
> Also, note that the comment in gimple-fold.c prior to this change read:
> 
>   /* If we can perform the copy efficiently with first doing all loads
>  and then all stores inline it that way.  Currently efficiently
>  means that we can load all the memory into a single integer
>  register which is what MOVE_MAX gives us.  */
> 
> Which would imply that the AArch64 definition of MOVE_MAX is the correct one.

The GCC manual has

- Macro: MOVE_MAX
 The maximum number of bytes that a single instruction can move
 quickly between memory and registers or between two memory
 locations.
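
To make the effect of MOVE_MAX concrete, here is a rough model of how many moves an inline copy needs for a given limit.  It is only a sketch: count_piecewise_moves is a made-up helper, it assumes a greedy split into power-of-two chunks, and GCC's real by-pieces code is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: number of load/store pairs a greedy
   copy of SIZE bytes needs when the widest single move is MOVE_MAX bytes.
   MOVE_MAX is assumed to be a power of two.  */
static size_t
count_piecewise_moves (size_t size, size_t move_max)
{
  assert (move_max != 0 && (move_max & (move_max - 1)) == 0);

  size_t moves = 0;
  /* Use the widest chunk first, then halve it for the remainder.  */
  for (size_t chunk = move_max; chunk != 0; chunk /= 2)
    {
      moves += size / chunk;
      size %= chunk;
    }
  return moves;
}
```

Under this model a 32-byte copy takes four moves with a limit of 8 bytes, two with 16 and one with 32, which is why the choice between "integer register" and "any single instruction" changes the generated code.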

[Bug middle-end/103419] FAIL: gcc.target/i386/pr102566-10b.c with -mx32

2021-11-24 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103419

--- Comment #2 from H.J. Lu  ---
Created attachment 51871
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51871&action=edit
A patch

Hongtao, please take a look.

[Bug middle-end/103419] FAIL: gcc.target/i386/pr102566-10b.c with -mx32

2021-11-24 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103419

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
  Component|target  |middle-end
   Last reconfirmed||2021-11-25

--- Comment #1 from H.J. Lu  ---
(match (nop_atomic_bit_test_and_p @0 @1 @4)
 (bit_and (convert?@4 (ATOMIC_FETCH_OR_XOR_N @2 INTEGER_CST@0 @3))
   INTEGER_CST@1)
 (with {
 int ibit = tree_log2 (@0);
 int ibit2 = tree_log2 (@1);
   }
  (if (ibit == ibit2
  && ibit >= 0
  && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2))

(match (nop_atomic_bit_test_and_p @0 @1 @3)
 (bit_and (convert?@3 (SYNC_FETCH_OR_XOR_N @2 INTEGER_CST@0))
  INTEGER_CST@1)
 (with {
 int ibit = tree_log2 (@0);
 int ibit2 = tree_log2 (@1);
   }
  (if (ibit == ibit2
  && ibit >= 0
  && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2))

(match (nop_atomic_bit_test_and_p @0 @0 @4)
 (bit_and:c
  (convert1?@4
   (ATOMIC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@5 @6)) @3))
  (convert2? @0))
 (if (TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2)

(match (nop_atomic_bit_test_and_p @0 @0 @4)
 (bit_and:c
  (convert1?@4
   (SYNC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@3 @5
  (convert2? @0))
 (if (TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2)

are wrong.  Here "type" is an integer type, but TREE_TYPE (@2) is a pointer
type.  We should compare two integer types, not an integer type and a pointer
type.
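
A small model of why this only shows up with -mx32, where pointers are 32 bits wide but the atomic operand can be 64 bits.  The struct and both helpers are hypothetical, and comparing against the pointed-to integer is just one plausible reading of "compare two integer types"; it is not taken from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the precisions the match.pd condition reads.  */
struct atomic_op_types
{
  unsigned result_precision;   /* precision of "type", the integer result */
  unsigned pointer_precision;  /* precision of TREE_TYPE (@2), a pointer */
  unsigned pointee_precision;  /* precision of the integer @2 points to */
};

/* The condition as quoted: integer precision against pointer precision.  */
static bool
check_against_pointer (struct atomic_op_types t)
{
  assert (t.result_precision > 0);
  return t.result_precision <= t.pointer_precision;
}

/* Assumed alternative: integer precision against integer precision.  */
static bool
check_against_pointee (struct atomic_op_types t)
{
  assert (t.result_precision > 0);
  return t.result_precision <= t.pointee_precision;
}

/* A 64-bit atomic operation under -m64 and under -mx32.  */
static const struct atomic_op_types lp64_case = { 64, 64, 64 };
static const struct atomic_op_types x32_case = { 64, 32, 64 };
```

With these numbers the quoted condition holds for -m64 but rejects the same operation for -mx32, which would be consistent with the missing btrq/btsq in the failing tests.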

[Bug middle-end/103419] New: FAIL: gcc.target/i386/pr102566-10b.c with -mx32

2021-11-24 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103419

Bug ID: 103419
   Summary: FAIL: gcc.target/i386/pr102566-10b.c with -mx32
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com
  Target Milestone: ---
Target: x86-64

On Linux/x86-64, -mx32 fails the following tests:

FAIL: gcc.target/i386/pr102566-10b.c scan-assembler-not cmpxchg
FAIL: gcc.target/i386/pr102566-10b.c scan-assembler-times lock;?[ \t]*btrq 1
FAIL: gcc.target/i386/pr102566-3b.c scan-assembler-not cmpxchg
FAIL: gcc.target/i386/pr102566-3b.c scan-assembler-times lock;?[ \t]*btsq 1

[Bug target/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128

2021-11-24 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393

--- Comment #2 from H.J. Lu  ---
(In reply to Richard Biener from comment #1)
> It isn't the vectorizer but memmove inline expansion.  I'm not sure it's
> really a bug, but there isn't a way to disable %ymm use besides disabling
> AVX entirely.
> HJ?

YMM move is generated by loop distribution which doesn't check
TARGET_PREFER_AVX128.
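
For illustration, this is the kind of loop that loop distribution can recognize as a block copy.  It is a sketch rather than the testcase from this report, the names are made up, and whether a YMM move is actually emitted depends on the GCC revision and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative 32-byte object: eight floats fill one 256-bit register.  */
struct test_data
{
  float arr[8];
};

/* An element-wise copy loop with a constant trip count.  The pointers are
   not restrict-qualified, so a compiler that turns the loop into a block
   copy has to allow for DST and SRC overlapping.  */
void
copy_test_data (struct test_data *dst, const struct test_data *src)
{
  assert (dst != NULL && src != NULL);
  for (int i = 0; i < 8; ++i)
    dst->arr[i] = src->arr[i];
}
```

Compiling such a function with -O2 -mavx -mprefer-vector-width=128 and looking for %ymm in the assembly is one way to check for the behavior discussed here.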

[Bug middle-end/103364] s390x: TLS reference in /usr/lib64/libLLVM.so mismatches non-TLS reference in /usr/lib64/libLLVM.so

2021-11-23 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103364

--- Comment #8 from H.J. Lu  ---
Does it work on x86-64?

[Bug other/103335] new test case gcc.dg/tree-ssa/modref-dse-4.c fails

2021-11-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103335

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-20
 Target|powerpc64-linux-gnu |
 Ever confirmed|0   |1
   Host|powerpc64-linux-gnu |
  Build|powerpc64-linux-gnu |
 Status|UNCONFIRMED |NEW

--- Comment #1 from H.J. Lu  ---
It also fails on Linux/x86 with

make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-4.c
--target_board='unix{-m32\ -march=cascadelake}'"

[Bug target/103330] [12 Regression] FAIL: gcc.target/i386/avx512fp16-vector-complex-float.c by r12-5378

2021-11-19 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103330

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2021-11-19
 Ever confirmed|0   |1
   Target Milestone|--- |12.0
 Status|UNCONFIRMED |NEW

[Bug target/103330] New: [12 Regression] FAIL: gcc.target/i386/avx512fp16-vector-complex-float.c by r12-5378

2021-11-19 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103330

Bug ID: 103330
   Summary: [12 Regression] FAIL:
gcc.target/i386/avx512fp16-vector-complex-float.c by
r12-5378
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, tamar.christina at arm dot com
  Target Milestone: ---

On Linux/x86, r12-5378 caused:

FAIL: gcc.target/i386/avx512fp16-vector-complex-float.c scan-assembler-not
vfmadd[123]*ph[ \\t]
FAIL: gcc.target/i386/avx512fp16-vector-complex-float.c scan-assembler-times
vfcmaddcph[ \\t] 1
FAIL: gcc.target/i386/avx512fp16-vector-complex-float.c scan-assembler-times
vfmaddcph[ \\t] 1

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-11-18 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #29 from H.J. Lu  ---
Fixed for GCC 12.

[Bug middle-end/103309] [12 Regression] Random gcc/system.h:784:34: error: section type conflict

2021-11-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103309

--- Comment #5 from H.J. Lu  ---
(In reply to Andrew Pinski from comment #4)
> (In reply to H.J. Lu from comment #3)
> > I first ran into it with r12-5074.  I am using GCC 11.2.1 from Fedora 35
> > and binutils master branch.   For r12-5074, the only change on the machine
> > is the GCC source.
> 
> r12-5074 does not even touch anything used for x86_64 builds either ...

My tester doesn't check every commit.  This failure comes and goes at
random.  I have no idea when it was introduced.

[Bug middle-end/103309] [12 Regression] Random gcc/system.h:784:34: error: section type conflict

2021-11-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103309

--- Comment #3 from H.J. Lu  ---
I first ran into it with r12-5074.  I am using GCC 11.2.1 from Fedora 35
and binutils master branch.   For r12-5074, the only change on the machine
is the GCC source.

[Bug bootstrap/103309] [12 Regression] Random gcc/system.h:784:34: error: section type conflict

2021-11-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103309

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-11-17

--- Comment #1 from H.J. Lu  ---
The build error can happen on different source files.  But it always points
to

https://gcc.gnu.org/pipermail/gcc-regression/2021-November/075734.html

In member function ‘hash_table::value_type*
hash_table::alloc_entries(size_t) const [with
Descriptor = hash_map >::hash_entry; bool Lazy =
false; Allocator = xcallocator]’,
inlined from ‘void hash_table::expand() [with
Descriptor = hash_map >::hash_entry; bool Lazy =
false; Allocator = xcallocator]’ at ../../src-master/gcc/hash-table.h:802:40:
../../src-master/gcc/system.h:784:34: error: section type conflict with ‘void
hash_table::expand() [with Descriptor =
hash_map >::hash_entry; bool Lazy = false;
Allocator = xcallocator]’
  784 |((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0
: 0))
  |  ^~
../../src-master/gcc/hash-table.h:715:3: note: in expansion of macro
‘gcc_assert’
  715 |   gcc_assert (nentries != NULL);
  |   ^~

[Bug bootstrap/103309] New: [12 Regression] Random gcc/system.h:784:34: error: section type conflict

2021-11-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103309

Bug ID: 103309
   Summary: [12 Regression] Random gcc/system.h:784:34: error:
section type conflict
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---

On Linux/x86-64, I started getting:

/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/./prev-gcc/xg++
-B/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/./prev-gcc/
-B/usr/gcc-12.0.0-x86-64/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs

-I/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu

-I/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
 -I/export/gnu/import/git/gitlab/x86-gcc/libstdc++-v3/libsupc++
-L/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/export/build/gnu/tools-build/gcc-gitlab/build-x86_64-linux/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
 -fno-PIE -c   -g -O2 -fchecking=1 -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -I.
-I/export/gnu/import/git/gitlab/x86-gcc/gcc
-I/export/gnu/import/git/gitlab/x86-gcc/gcc/.
-I/export/gnu/import/git/gitlab/x86-gcc/gcc/../include
-I/export/gnu/import/git/gitlab/x86-gcc/gcc/../libcpp/include
-I/export/gnu/import/git/gitlab/x86-gcc/gcc/../libcody 
-I/export/gnu/import/git/gitlab/x86-gcc/gcc/../libdecnumber
-I/export/gnu/import/git/gitlab/x86-gcc/gcc/../libdecnumber/bid
-I../libdecnumber -I/export/gnu/import/git/gitlab/x86-gcc/gcc/../libbacktrace  
-o passes.o -MT passes.o -MMD -MP -MF ./.deps/passes.TPo
/export/gnu/import/git/gitlab/x86-gcc/gcc/passes.c
In file included from /export/gnu/import/git/gitlab/x86-gcc/gcc/passes.c:26:
In member function ‘hash_table::value_type*
hash_table::alloc_entries(size_t) const [with
Descriptor = hash_map::hash_entry; bool Lazy =
false; Allocator = xcallocator]’,
inlined from ‘void hash_table::expand() [with
Descriptor = hash_map::hash_entry; bool Lazy =
false; Allocator = xcallocator]’ at
/export/gnu/import/git/gitlab/x86-gcc/gcc/hash-table.h:802:40:
/export/gnu/import/git/gitlab/x86-gcc/gcc/system.h:784:34: error: section type
conflict with ‘void hash_table::expand() [with
Descriptor = hash_map::hash_entry; bool Lazy =
false; Allocator = xcallocator]’
  784 |((void)(!(EXPR) ? fancy_abort (__FILE__, __LINE__, __FUNCTION__), 0
: 0))
  |  ^~
/export/gnu/import/git/gitlab/x86-gcc/gcc/hash-table.h:715:3: note: in
expansion of macro ‘gcc_assert’
  715 |   gcc_assert (nentries != NULL);
  |   ^~
In file included from
/export/gnu/import/git/gitlab/x86-gcc/gcc/coretypes.h:482,
 from /export/gnu/import/git/gitlab/x86-gcc/gcc/passes.c:27:
/export/gnu/import/git/gitlab/x86-gcc/gcc/hash-table.h: In member function
‘void hash_table::expand() [with Descriptor =
hash_map::hash_entry; bool Lazy = false;
Allocator = xcallocator]’:
/export/gnu/import/git/gitlab/x86-gcc/gcc/hash-table.h:779:1: note: ‘void
hash_table::expand() [with Descriptor =
hash_map::hash_entry; bool Lazy = false;
Allocator = xcallocator]’ was declared here
  779 | hash_table::expand ()
  | ^~~
In file included from
/export/gnu/import/git/gitlab/x86-gcc/gcc/coretypes.h:474,
 from /export/gnu/import/git/gitlab/x86-gcc/gcc/expmed.c:26:
In function ‘poly_uint16 mode_to_bytes(machine_mode)’,
inlined from ‘typename if_nonpoly::type
GET_MODE_SIZE(const T&) [with T = scalar_int_mode]’ at
/export/gnu/import/git/gitlab/x86-gcc/gcc/machmode.h:647:24,
inlined from ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx,
machine_mode, int, int, machine_mode)’ at
/export/gnu/import/git/gitlab/x86-gcc/gcc/expmed.c:5724:56:
/export/gnu/import/git/gitlab/x86-gcc/gcc/machmode.h:550:49: warning:
‘*(unsigned int*)((char*)_mode + offsetof(scalar_int_mode,
scalar_int_mode::m_mode))’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
  550 |   ? mode_size_inline (mode) : mode_size[mode]);
  | ^~~~
/export/gnu/import/git/gitlab/x86-gcc/gcc/expmed.c: In function ‘rtx_def*

[Bug target/103307] New: Unused "%!" before return

2021-11-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103307

Bug ID: 103307
   Summary: Unused "%!" before return
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: ubizjak at gmail dot com
  Target Milestone: ---
Target: i386, x86-64

x86 backend has

i386.c:  output_asm_insn ("%!ret", NULL);
i386.c:return "%!ret";
i386.md:  "%!ret\t%0"

Before MPX was removed, "%!" was mapped to

    case '!':
      if (ix86_bnd_prefixed_insn_p (current_output_insn))
        fputs ("bnd ", file);
      return;

After CET was added and MPX was removed, "%!" was mapped to

    case '!':
      if (ix86_notrack_prefixed_insn_p (current_output_insn))
        fputs ("notrack ", file);
      return;

ix86_notrack_prefixed_insn_p always returns false on RET since the
notrack prefix is only for indirect branches.  Therefore, "%!" before
RET is unused.

[Bug target/103275] [11/12 Regression] don't generate kmov with IE model relocations

2021-11-17 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103275

--- Comment #11 from H.J. Lu  ---
(In reply to Hongtao.liu from comment #10)
> I'm working on a patch which adds a new memory constraint "Bk" which will
> exclude TLS UNSPECs for mask register alternative.
> 
> The UNSPEC i'm excluding is like below, any other UNSPEC needs to be added?
> 
> bool
> ix86_notls_memory (rtx mem)
> {
>   gcc_assert (MEM_P (mem));
> 
>   rtx addr = XEXP (mem, 0);
>   subrtx_var_iterator::array_type array;
>   FOR_EACH_SUBRTX_VAR (iter, array, addr, ALL)
> {
>   rtx op = *iter;
>   if (GET_CODE (op) == UNSPEC)
>   switch (XINT (op, 1))
> {
> case UNSPEC_GOTTPOFF:
> case UNSPEC_GOTNTPOFF:
> case UNSPEC_TP:
> case UNSPEC_TLS_GD:
> case UNSPEC_TLS_LD_BASE:
> case UNSPEC_TLSDESC:
> case UNSPEC_TLS_IE_SUN:

This doesn't look right.  For TARGET_64BIT, only

kmovq   foo@gottpoff(%rip), %k0
kmovq   foo@tlsld(%rip), %k0

should be disallowed.  For !TARGET_64BIT, only

kmovd   foo@gotntpoff(%eax), %k0
kmovd   foo@tpoff(%eax), %k0

should be disallowed.

>   return false;
> default:
>   break;
> }
>   /* Should iter.skip_subrtxes ();
>if there's no inner UNSPEC in addr???.  */
> }
> 
>   return true;
> }

[Bug target/103275] [11/12 Regression] don't generate kmov with IE model relocations

2021-11-16 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103275

--- Comment #8 from H.J. Lu  ---
(In reply to Hongtao.liu from comment #7)
> vmovd have same issue, for simplify should we disable 32bit load for
> sse/mask register when memory_operand has PIC address.

Please disable only the specific UNSPEC operands that this assembler change
disallows with mask registers:

https://sourceware.org/git/?p=binutils-gdb.git;a=patch;h=d7e3e627027fcf37d63e284144fe27ff4eba36b5

[Bug testsuite/103282] New test case gcc.dg/tree-ssa/modref-dse-5.c in r12-5292 fails

2021-11-16 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103282

H.J. Lu  changed:

   What|Removed |Added

  Build|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu   |
   Last reconfirmed||2021-11-16
 Status|UNCONFIRMED |NEW
 Target|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu   |
 CC||hjl.tools at gmail dot com
 Ever confirmed|0   |1
   Host|powerpc64-linux-gnu,|
   |powerpc64le-linux-gnu   |

--- Comment #1 from H.J. Lu  ---
I also saw it on Linux/x86.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-11-16 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

 Status|WAITING |NEW
 CC||ubizjak at gmail dot com

--- Comment #26 from H.J. Lu  ---
(In reply to peterz from comment #25)
> (In reply to H.J. Lu from comment #24)
> > Should I submit the current patches?
> 
> Yes, I'd say so. Once merged I'll send a kernel patch to use
> -mindirect-branch-cs-prefix for all RETPOLINE builds. I posted an SLS patch
> but haven't really had much feedback on that yet, we'll see.

Patches are at

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584636.html
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584637.html

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

--- Comment #17 from H.J. Lu  ---
(In reply to David Binderman from comment #0)
> For this C code:
> 
> long pscc_a_2_3;
> int pscc_a_1_4;
> void pscc()
> {
> pscc_a_1_4 = __sync_fetch_and_and(&pscc_a_2_3, 1);
> }
> 
> compiled by recent gcc trunk, does this:
> 
> $ /home/dcb/gcc/results/bin/gcc -c -O1 bug771.c
> during GIMPLE pass: fab
> bug771.c: In function ‘pscc’:
> bug771.c:3:6: internal compiler error: in optimize_atomic_bit_test_and, at
> tree-ssa-ccp.c:3626
> 3 | void pscc()
>   |  ^~~~
> 0xee6020 optimize_atomic_bit_test_and(gimple_stmt_iterator*, internal_fn,
> bool, bool)
>   ../../trunk.git/gcc/tree-ssa-ccp.c:3626
> 0xee389a (anonymous namespace)::pass_fold_builtins::execute(function*)
>   ../../trunk.git/gcc/tree-ssa-ccp.c:0
> 
> The bug first seems to occur sometime between git hash f2572a398d21fd52
> and a97fdde627e64202, a distance of some 60 commits.
> 
> In that range, commit fb161782545224f55ba26ba663889c5e6e9a04d1
> looks a likely candidate.

This is fixed by r12-5290.  But we should fix the missed optimization.

[Bug tree-optimization/103268] [12 regression] ICE om glib-2.10.1: internal compiler error: in optimize_atomic_bit_test_and, at tree-ssa-ccp.c:3626

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103268

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from H.J. Lu  ---
Fixed.

[Bug target/103269] Enable ZMM in MOVE_MAX and STORE_MAX_PIECES without -mprefer-vector-width=512

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103269

--- Comment #1 from H.J. Lu  ---
Created attachment 51803
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51803&action=edit
A patch

Please try this.

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

--- Comment #16 from H.J. Lu  ---
(In reply to Hongtao.liu from comment #15)
> (In reply to H.J. Lu from comment #13)
> > (In reply to Hongtao.liu from comment #8)
> > > unsigned long pscc_a_2_3;
> > > int pscc_a_1_4;
> > > unsigned long pc2;
> > > void pscc(int n)
> > > {
> > >   long mask = 1ll << n;
> > >   pc2 = __sync_fetch_and_or(&pscc_a_2_3, mask) & mask;
> > > }
> > > 
> > > void pscc1(int n)
> > > {
> > >   long mask = 1ll << 65;
> > >   pc2 = __sync_fetch_and_or(&pscc_a_2_3, mask) & mask;
> > > }
> > > 
> > > pscc and pscc1 have different behavior when n >= 64, It seems unsafe to
> > > optimize variable mask?
> > 
> > Is the behavior well defined for n >= 64? I got
> > 
> > foo.c:11:19: warning: left shift count >= width of type
> > [-Wshift-count-overflow]
> >11 |   long mask = 1ll << 65;
> >   |   ^~
> According to C99
> The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are
> filled with zeros. If E1 has an unsigned type, the value of the result is E1
> × 2^E2, reduced modulo one more than the maximum value representable in the
> result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is
> representable in the result type, then that is the resulting value;
> otherwise, the behavior is undefined.
> 
> So yes, it's well defined, and the result is zero.

This is the existing behavior since GCC 7.

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

--- Comment #14 from H.J. Lu  ---
Should we open a new bug for missed optimization?

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

--- Comment #13 from H.J. Lu  ---
(In reply to Hongtao.liu from comment #8)
> unsigned long pscc_a_2_3;
> int pscc_a_1_4;
> unsigned long pc2;
> void pscc(int n)
> {
>   long mask = 1ll << n;
>   pc2 = __sync_fetch_and_or(&pscc_a_2_3, mask) & mask;
> }
> 
> void pscc1(int n)
> {
>   long mask = 1ll << 65;
>   pc2 = __sync_fetch_and_or(&pscc_a_2_3, mask) & mask;
> }
> 
> pscc and pscc1 have different behavior when n >= 64, It seems unsafe to
> optimize variable mask?

Is the behavior well defined for n >= 64? I got

foo.c:11:19: warning: left shift count >= width of type
[-Wshift-count-overflow]
   11 |   long mask = 1ll << 65;
  |   ^~

[Bug tree-optimization/103268] [12 regression] ICE om glib-2.10.1: internal compiler error: in optimize_atomic_bit_test_and, at tree-ssa-ccp.c:3626

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103268

--- Comment #4 from H.J. Lu  ---
Created attachment 51802
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51802&action=edit
A patch

[Bug tree-optimization/103268] [12 regression] ICE om glib-2.10.1: internal compiler error: in optimize_atomic_bit_test_and, at tree-ssa-ccp.c:3626

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103268

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-11-16
 Ever confirmed|0   |1

--- Comment #3 from H.J. Lu  ---
It is related to PR 103194.

[Bug tree-optimization/103268] [12 regression] ICE om glib-2.10.1: internal compiler error: in optimize_atomic_bit_test_and, at tree-ssa-ccp.c:3626

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103268

H.J. Lu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com,
   ||hjl.tools at gmail dot com

--- Comment #2 from H.J. Lu  ---
A simplified test:


[hjl@gnu-cfl-2 pr103268]$ cat x.c
static int si;
long
test_types (long n)
{
  unsigned int u2 = __atomic_fetch_xor (&si, 0, 5);
  return u2;
}
[hjl@gnu-cfl-2 pr103268]$ make
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O2 -S
x.c
during GIMPLE pass: fab
x.c: In function ‘test_types’:
x.c:3:1: internal compiler error: in optimize_atomic_bit_test_and, at
tree-ssa-ccp.c:3645
3 | test_types (long n)
  | ^~
0x1515c9d optimize_atomic_bit_test_and
/export/gnu/import/git/gitlab/x86-gcc/gcc/tree-ssa-ccp.c:3645
0x151790a execute
/export/gnu/import/git/gitlab/x86-gcc/gcc/tree-ssa-ccp.c:4115
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make: *** [Makefile:43: x.s] Error 1
[hjl@gnu-cfl-2 pr103268]$

[Bug target/103269] New: Enable ZMM in MOVE_MAX and STORE_MAX_PIECES without -mprefer-vector-width=512

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103269

Bug ID: 103269
   Summary: Enable ZMM in MOVE_MAX and STORE_MAX_PIECES without
-mprefer-vector-width=512
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, wwwhhhyyy333 at gmail dot com
  Target Milestone: ---
Target: i386,x86-64

Need a way to enable ZMM in MOVE_MAX and STORE_MAX_PIECES without
-mprefer-vector-width=512.

[Bug middle-end/103184] [12 Regression] ICE caused by r12-5102

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103184

H.J. Lu  changed:

   What|Removed |Added

  Component|tree-optimization   |middle-end
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #9 from H.J. Lu  ---
Fixed.

[Bug middle-end/103262] [12 Regression] Random FAIL: gcc.c-torture/execute/20061220-1.c after r12-5242

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262

--- Comment #3 from H.J. Lu  ---
merge_call_side_effects has

  modref_parm_map chain_map;
...
  for (auto kill : saved_kills)
    {
      if (kill.parm_index >= (int)parm_map.length ())
        continue;
      modref_parm_map &m
        = kill.parm_index == MODREF_STATIC_CHAIN_PARM
          ? chain_map
          : parm_map[kill.parm_index];
      if (m.parm_index == MODREF_LOCAL_MEMORY_PARM
          || m.parm_index == MODREF_UNKNOWN_PARM
          || m.parm_index == MODREF_RETSLOT_PARM
          || !m.parm_offset_known)
        continue;
      modref_access_node n = kill;
      n.parm_index = m.parm_index;
      n.parm_offset += m.parm_offset;
      if (modref_access_node::insert_kill (cur_summary->kills, n,
                                           record_adjustments))
        changed = true;
    }

But chain_map is never initialized.

[Bug middle-end/103262] [12 Regression] Random FAIL: gcc.c-torture/execute/20061220-1.c after r12-5242

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262

--- Comment #2 from H.J. Lu  ---
May also need -mtune=generic -march=pentium4 with bootstrap.  Valgrind reports:

==2421026== Memcheck, a memory error detector
==2421026== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2421026== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==2421026== Command:
/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/cc1
-quiet -v -iprefix
/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/../lib/gcc/i686-linux/12.0.0/
-isystem
/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/include
-isystem
/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/include-fixed
/export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.c-torture/execute/20061220-1.c
-quiet -dumpbase 20061220-1.c -dumpbase-ext .c -mtune=generic -march=pentium4
-O1 -version -o 20061220-1.s
==2421026==
GNU C17 (GCC) version 12.0.0 2025 (experimental) (i686-linux)
compiled by GNU C version 12.0.0 2025 (experimental), GMP version
6.2.0, MPFR version 4.1.0-p13, MPC version 1.2.1, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
ignoring nonexistent directory
"/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/../lib/gcc/i686-linux/12.0.0/include"
ignoring nonexistent directory
"/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/../lib/gcc/i686-linux/12.0.0/include-fixed"
ignoring nonexistent directory
"/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/../lib/gcc/i686-linux/12.0.0/../../../../i686-linux/include"
ignoring nonexistent directory
"/usr/gcc-12.0.0-32bit/lib/gcc/i686-linux/12.0.0/include"
ignoring nonexistent directory "/usr/gcc-12.0.0-32bit/include"
ignoring nonexistent directory
"/usr/gcc-12.0.0-32bit/lib/gcc/i686-linux/12.0.0/include-fixed"
ignoring nonexistent directory "/usr/gcc-12.0.0-32bit/i686-linux/include"
#include "..." search starts here:
#include <...> search starts here:

/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/include

/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/include-fixed
 /usr/local/include
 /usr/include
End of search list.
GNU C17 (GCC) version 12.0.0 2025 (experimental) (i686-linux)
compiled by GNU C version 12.0.0 2025 (experimental), GMP version
6.2.0, MPFR version 4.1.0-p13, MPC version 1.2.1, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: b2e0b138055bd7ca0343d4cea39ca3ee
==2421026== Conditional jump or move depends on uninitialised value(s)
==2421026==at 0x88605D0: (anonymous
namespace)::merge_call_side_effects(modref_summary*, gimple*, modref_summary*,
bool, cgraph_node*, bool, bool) (ipa-modref.c:1037)
==2421026==by 0x8866A54: analyze_call (ipa-modref.c:1414)
==2421026==by 0x8866A54: (anonymous
namespace)::analyze_stmt(modref_summary*, modref_summary_lto*, gimple*, bool,
vec*, bool) (ipa-modref.c:1585)
==2421026==by 0x8868EB1: (anonymous namespace)::analyze_function(function*,
bool) (ipa-modref.c:2935)
==2421026==by 0x886AC6D: (anonymous
namespace)::pass_modref::execute(function*) (ipa-modref.c:3965)
==2421026==by 0x89BAAA0: execute_one_pass(opt_pass*) (passes.c:2567)
==2421026==by 0x89BB341: execute_pass_list_1(opt_pass*) (passes.c:2656)
==2421026==by 0x89BB354: execute_pass_list_1(opt_pass*) (passes.c:2657)
==2421026==by 0x89BB38C: execute_pass_list(function*, opt_pass*)
(passes.c:2667)
==2421026==by 0x89BBC52: do_per_function_toporder(void (*)(function*,
void*), void*) [clone .part.0] (passes.c:1773)
==2421026==by 0x89BBE74: do_per_function_toporder (passes.c:1740)
==2421026==by 0x89BBE74: execute_ipa_pass_list(opt_pass*) (passes.c:3001)
==2421026==by 0x86105E7: ipa_passes (cgraphunit.c:2154)
==2421026==by 0x86105E7: symbol_table::compile() [clone .part.0]
(cgraphunit.c:2289)
==2421026==by 0x8612DF2: compile (cgraphunit.c:2269)
==2421026==by 0x8612DF2: symbol_table::finalize_compilation_unit()
(cgraphunit.c:2537)
==2421026==
==2421026== Conditional jump or move depends on uninitialised value(s)
==2421026==at 0x88605D5: (anonymous
namespace)::merge_call_side_effects(modref_summary*, gimple*, modref_summary*,
bool, cgraph_node*, bool, bool) (ipa-modref.c:1037)
==2421026==by 0x8866A54: analyze_call (ipa-modref.c:1414)
==2421026==by 0x8866A54: (anonymous
namespace)::analyze_stmt(modref_summary*, modref_summary_lto*, gimple*, bool,
vec*, bool) (ipa-modref.c:1585)
==2421026==by 0x8868EB1: (anonymous namespace)::analyze_function(function*,
bool) (ipa-modref.c:2935)
==2421026==by 0x886AC6D: (anonymous
namespace)::pass_modref::execute(function*) (ipa-modref.c:3965)
==2421026==by 0x89BAAA0: 

[Bug middle-end/103262] [12 Regression] Random FAIL: gcc.c-torture/execute/20061220-1.c after r12-5242

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-11-15
 Status|UNCONFIRMED |NEW

--- Comment #1 from H.J. Lu  ---
It may need -m32.

[Bug middle-end/103262] New: [12 Regression] Random FAIL: gcc.c-torture/execute/20061220-1.c after r12-5242

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103262

Bug ID: 103262
   Summary: [12 Regression] Random FAIL:
gcc.c-torture/execute/20061220-1.c after r12-5242
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---

On Linux/x86-64, I got random failures:

spawn -ignore SIGHUP
/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/xgcc
-B/export/users/hjl/build/gnu/tools-build/gcc-32bit-gitlab/build-i686-linux/gcc/
/export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.c-torture/execute/20061220-1.c
-fdiagnostics-plain-output -O1 -w -lm -o ./20061220-1.exe
during GIMPLE pass: modref
/export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.c-torture/execute/20061220-1.c:
In function 'main':
/export/gnu/import/git/gitlab/x86-gcc/gcc/testsuite/gcc.c-torture/execute/20061220-1.c:73:1:
internal compiler error: in operator[], at vec.h:889
0x8306118 vec::operator[](unsigned int)
/export/gnu/import/git/gitlab/x86-gcc/gcc/vec.h:889
0x8306dbe vec::operator[](unsigned int)
/export/gnu/import/git/gitlab/x86-gcc/gcc/vec.h:1903
0x8306dbe vec::operator[](unsigned int)
/export/gnu/import/git/gitlab/x86-gcc/gcc/vec.h:1495
0x8306dbe merge_call_side_effects
/export/gnu/import/git/gitlab/x86-gcc/gcc/ipa-modref.c:1034
0x8866a54 analyze_call
/export/gnu/import/git/gitlab/x86-gcc/gcc/ipa-modref.c:1414
0x8866a54 analyze_stmt
/export/gnu/import/git/gitlab/x86-gcc/gcc/ipa-modref.c:1585
0x8868eb1 analyze_function
/export/gnu/import/git/gitlab/x86-gcc/gcc/ipa-modref.c:2935
0x886ac6d execute
/export/gnu/import/git/gitlab/x86-gcc/gcc/ipa-modref.c:3965
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
compiler exited with status 1
FAIL: gcc.c-torture/execute/20061220-1.c   -O1  (internal compiler error)
FAIL: gcc.c-torture/execute/20061220-1.c   -O1  (test for excess errors)

r12-5242 is OK and r12-5244 is not.  But it may be latent since it fails
at random.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #24 from H.J. Lu  ---
Should I submit the current patches?

[Bug target/103065] [meta] atomic operations aren't optimized

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103065
Bug 103065 depends on bug 103069, which changed state.

Bug 103069 Summary: cmpxchg isn't optimized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug target/103069] cmpxchg isn't optimized

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |12.0
 Resolution|--- |FIXED

--- Comment #4 from H.J. Lu  ---
Fixed for GCC 12.

[Bug tree-optimization/103184] [12 Regression] ICE caused by r12-5102

2021-11-15 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103184

--- Comment #7 from H.J. Lu  ---
The updated patch is at

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584464.html

[Bug tree-optimization/103184] [12 Regression] ICE caused by r12-5102

2021-11-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103184

H.J. Lu  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #6 from H.J. Lu  ---
A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584293.html

[Bug tree-optimization/103235] [12 Regression] Recent change to atomics triggers ICE

2021-11-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103235

--- Comment #3 from H.J. Lu  ---
Works for me:

[hjl@gnu-cfl-2 pr103235]$
/export/build/gnu/tools-build/gcc-gitlab-cross/build-csky-linux/gcc/cc1 -O2 
pthread_cancel.i -I./ -quiet -w
[hjl@gnu-cfl-2 pr103235]$
/export/build/gnu/tools-build/gcc-gitlab-cross/build-csky-linux/gcc/xgcc -v
Using built-in specs.
COLLECT_GCC=/export/build/gnu/tools-build/gcc-gitlab-cross/build-csky-linux/gcc/xgcc
Target: csky-linux
Configured with: /export/gnu/import/git/gitlab/x86-gcc/configure
--with-demangler-in-ld --target=csky-linux --prefix=/usr/gcc-12.0.0-csky-linux
--with-local-prefix=/usr/local --with-system-zlib --disable-libcc1
--disable-libcilkrts --disable-libsanitizer --disable-libmpx
--enable-languages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 2024 (experimental) (GCC) 
[hjl@gnu-cfl-2 pr103235]$

[Bug tree-optimization/103235] [12 Regression] Recent change to atomics triggers ICE

2021-11-14 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103235

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-11-14
 CC|hjl at gcc dot gnu.org |hjl.tools at gmail dot com
 Status|UNCONFIRMED |WAITING

--- Comment #1 from H.J. Lu  ---
I can't reproduce it with r12-5243 for csky-linux-gnu.

[Bug libgomp/103224] New: [12 Regression] FAIL: libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c (internal compiler error) by r12-5146

2021-11-13 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103224

Bug ID: 103224
   Summary: [12 Regression] FAIL:
libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c
(internal compiler error) by r12-5146
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

On Linux/x86, r12-5146 caused:

$ /export/project/git/gcc-bisect/master/r12-5147/bld/./gcc/xgcc
-B/export/project/git/gcc-bisect/master/r12-5147/bld/./gcc/
-B/export/project/git/gcc-bisect/master/r12-5147/usr/x86_64-pc-linux-gnu/bin/
-B/export/project/git/gcc-bisect/master/r12-5147/usr/x86_64-pc-linux-gnu/lib/
-isystem
/export/project/git/gcc-bisect/master/r12-5147/usr/x86_64-pc-linux-gnu/include
-isystem
/export/project/git/gcc-bisect/master/r12-5147/usr/x86_64-pc-linux-gnu/sys-include
-fchecking=1
../../../../../gcc/libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c
-B/export/project/git/gcc-bisect/master/r12-5147/bld/x86_64-pc-linux-gnu/./libgomp/
-B/export/project/git/gcc-bisect/master/r12-5147/bld/x86_64-pc-linux-gnu/./libgomp/.libs
-I/export/project/git/gcc-bisect/master/r12-5147/bld/x86_64-pc-linux-gnu/./libgomp
-I../../../../../gcc/libgomp/testsuite/../../include
-I../../../../../gcc/libgomp/testsuite/.. -fmessage-length=0
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -O2
-L/export/project/git/gcc-bisect/master/r12-5147/bld/x86_64-pc-linux-gnu/./libgomp/.libs
-lm -o ./target-in-reduction-2.exe
../../../../../gcc/libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c:
In function ‘foo’:
../../../../../gcc/libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/target-in-reduction-2.c:35:13:
internal compiler error: Segmentation fault
0xea0e2f crash_signal
../../../gcc/gcc/toplev.c:322
0xbc36ef omp_add_variable
../../../gcc/gcc/gimplify.c:7143
0xbc9793 gimplify_scan_omp_clauses
../../../gcc/gcc/gimplify.c:10070
0xbd0e32 gimplify_omp_workshare
../../../gcc/gcc/gimplify.c:13694
0xbd2060 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../../gcc/gcc/gimplify.c:15191
0xbd67e6 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7031
0xbd47bb gimplify_statement_list
../../../gcc/gcc/gimplify.c:2012
0xbd47bb gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../../gcc/gcc/gimplify.c:15105
0xbd67e6 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7031
0xbd4216 gimplify_and_add(tree_node*, gimple**)
../../../gcc/gcc/gimplify.c:494
0xbd4216 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../../gcc/gcc/gimplify.c:15276
0xbd67e6 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7031
0xbd47bb gimplify_statement_list
../../../gcc/gcc/gimplify.c:2012
0xbd47bb gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../../gcc/gcc/gimplify.c:15105
0xbd67e6 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7031
0xbd704f gimplify_bind_expr
../../../gcc/gcc/gimplify.c:1426
0xbd3107 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
../../../gcc/gcc/gimplify.c:14861
0xbed037 gimplify_stmt(tree_node**, gimple**)
../../../gcc/gcc/gimplify.c:7031
0xbed037 gimplify_body(tree_node*, bool)
../../../gcc/gcc/gimplify.c:15906
0xbed4ad gimplify_function_tree(tree_node*)
../../../gcc/gcc/gimplify.c:16060
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
$

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-13 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51784|0   |1
is obsolete||

--- Comment #6 from H.J. Lu  ---
Created attachment 51785
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51785&action=edit
The v2 incomplete patch

Hongtao, please finish it.  Thanks.

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-13 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

--- Comment #5 from H.J. Lu  ---
Created attachment 51784
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51784&action=edit
An incomplete patch

Hongtao, can you finish it?

[Bug sanitizer/102911] AddressSanitizer: CHECK failed: asan_malloc_linux.cpp:46

2021-11-13 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102911

--- Comment #4 from H.J. Lu  ---
Fixed for GCC 12.

[Bug libffi/102874] [12 regression] src/x86/win64.S doesn't assemble with Solaris as

2021-11-12 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102874

--- Comment #8 from H.J. Lu  ---
The proposed fix should be submitted to GCC and added to
libffi/LOCAL_PATCHES after it is checked in.

[Bug target/103205] [9/10/11/12 Regression] ICE Segmentation fault since r7-532

2021-11-12 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103205

H.J. Lu  changed:

   What|Removed |Added

Summary|[12 Regression] ICE |[9/10/11/12 Regression] ICE
   |Segmentation fault since|Segmentation fault since
   |r12-5102-gfb161782545224f5  |r7-532
 CC|hjl at gcc dot gnu.org |hjl.tools at gmail dot com,
   ||jakub at redhat dot com

--- Comment #2 from H.J. Lu  ---
[hjl@gnu-cfl-2 pr103205]$ cat s2.c
unsigned short sync_fetch_and_and_short_15_a;
unsigned short
__attribute__sync_fetch_and_and_short_15 (void)
{
  return __sync_fetch_and_and(&sync_fetch_and_and_short_15_a, ~1) & 1;
}
[hjl@gnu-cfl-2 pr103205]$ /usr/gcc-7.3.1-x32/bin/gcc -O2
-mtune-ctrl=^himode_math -S s2.c
s2.c: In function ‘__attribute__sync_fetch_and_and_short_15’:
s2.c:5:10: internal compiler error: Segmentation fault
   return __sync_fetch_and_and(&sync_fetch_and_and_short_15_a, ~1) & 1;
  ^~~~
unrecognized DWARF version in .debug_info at 6
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
[hjl@gnu-cfl-2 pr103205]$

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-11-12 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

--- Comment #4 from H.J. Lu  ---
This avoids the crash:

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 0f79e9f05bd..14c5ecdf119 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -3443,7 +3443,7 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
 ibit = 0;
   }
   else if (TYPE_PRECISION (TREE_TYPE (use_lhs))
- == TYPE_PRECISION (TREE_TYPE (use_rhs)))
+ <= TYPE_PRECISION (TREE_TYPE (use_rhs)))
   {
 gimple *use_nop_stmt;
 if (!single_imm_use (use_lhs, &use_p, &use_nop_stmt)

But nop_atomic_bit_test_and_p should handle the cast.  Hongtao, please take a look.

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and

2021-11-11 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-11-11
 Ever confirmed|0   |1

--- Comment #2 from H.J. Lu  ---
The narrowing cast isn't handled:

[hjl@gnu-cfl-2 pr102566]$ cat x3.c
#include <stdatomic.h>

int
foo (_Atomic long long int *v)
{
  return atomic_fetch_or_explicit (v, 1, memory_order_relaxed) & 1;
}
[hjl@gnu-cfl-2 pr102566]$ make x3.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O2 -S
x3.c
[hjl@gnu-cfl-2 pr102566]$ cat x3.s
.file   "x3.c"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
movq    (%rdi), %rax
.L2:
movq    %rax, %rcx
movq    %rax, %rdx
orq     $1, %rcx
lock cmpxchgq   %rcx, (%rdi)
jne     .L2
movl    %edx, %eax
andl    $1, %eax
ret
.cfi_endproc
.LFE0:
.size   foo, .-foo
.ident  "GCC: (GNU) 12.0.0 2021 (experimental)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 pr102566]$

[Bug libffi/102874] [12 regression] src/x86/win64.S doesn't assemble with Solaris as

2021-11-05 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102874

--- Comment #6 from H.J. Lu  ---
(In reply to Richard Biener from comment #5)
> Upstream bug was filed and fix proposed.  IMHO we don't need to wait and can
> pull this fix temporarily.

Is there a pull request to fix it in upstream?

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-11-05 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #24 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #23)
> Apologies for the delay, but I do now have a working prototype of Xen with
> CET-IBT active, using the current version of these patches.
> 
> The result actually builds back to older versions of GCCs, but the lack of
> cf_check-ness typechecking makes this a fragile activity.

Should I polish my patches and submit them now?

[Bug target/103066] __sync_val_compare_and_swap/__sync_bool_compare_and_swap aren't optimized

2021-11-05 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103066

--- Comment #7 from H.J. Lu  ---
Instead of generating:

movl    f(%rip), %eax
.L2:
movd    %eax, %xmm0
addss   .LC0(%rip), %xmm0
movd    %xmm0, %edx
lock cmpxchgl   %edx, f(%rip)
jne     .L2
ret

we want

movl    f(%rip), %eax
.L2:
movd    %eax, %xmm0
addss   .LC0(%rip), %xmm0
movd    %xmm0, %edx
cmpl    f(%rip), %eax
jne     .L2
lock cmpxchgl   %edx, f(%rip)
jne     .L2
ret

[Bug target/103066] __sync_val_compare_and_swap/__sync_bool_compare_and_swap aren't optimized

2021-11-05 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103066

--- Comment #4 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #3)
> If by fail you mean that it doesn't update the memory if the memory isn't
> equal to expected, sure, but do you mean it can fail spuriously, not update
> the memory even if the memory is equal to expected?
> Neither __sync_{bool,val}_compare_and_swap nor __atomic_compare_exchange_n
> with weak set to false can fail spuriously, __atomic_compare_exchange_n with
> weak set to true can.

If we generate

movl    m(%rip), %eax
cmpl    %edi, %eax
jne     .L1
movl    %edi, %eax
lock cmpxchgl   %esi, m(%rip)
.L1:
ret

is it a valid implementation of atomic_compare_exchange_strong?

[Bug target/103066] __sync_val_compare_and_swap/__sync_bool_compare_and_swap aren't optimized

2021-11-05 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103066

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2021-11-05

--- Comment #2 from H.J. Lu  ---
(In reply to Hongyu Wang from comment #1)
> __sync_val_compare_and_swap will be expanded to
> atomic_compare_exchange_strong
> by default, should we restrict the check and return under
> atomic_compare_exchange_weak which is allowed to fail spuriously?

On x86, "lock cmpxchgl" can fail.

[Bug target/103069] New: cmpxchg isn't optimized

2021-11-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069

Bug ID: 103069
   Summary: cmpxchg isn't optimized
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, wwwhhhyyy333 at gmail dot com
Blocks: 103065
  Target Milestone: ---
Target: i386,x86-64

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap grabs the cache line exclusively and causes
excessive cache line bouncing.

[hjl@gnu-cfl-2 pr102566]$ cat e.c 
int
f3 (int *a)
{
  return __atomic_fetch_or (a, 0x4000, __ATOMIC_RELAXED);
}
[hjl@gnu-cfl-2 pr102566]$ gcc -S -O2 x.c 
[hjl@gnu-cfl-2 pr102566]$ cat x.s 
.file   "x.c"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
movl    v(%rip), %eax
.L2:
movl    %eax, %ecx
movl    %eax, %edx
orl     $1, %ecx
lock cmpxchgl   %ecx, v(%rip)

GCC should first emit a normal load, check it, and jump to .L2 if cmpxchgl
may fail.  Before jumping to .L2, PAUSE should be inserted to yield the
CPU to another hyperthread and to save power.  It also serves to slightly
limit the rate of accesses on the processor interconnect.

jne     .L2
movl    %edx, %eax
andl    $1, %eax
ret
.cfi_endproc
.LFE0:
.size   foo, .-foo
.ident  "GCC: (GNU) 11.2.1 20211019 (Red Hat 11.2.1-6)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 pr102566]$


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103065
[Bug 103065] [meta] atomic operations aren't optimized

[Bug libgomp/103068] New: gomp_mutex_lock_slow isn't optimized

2021-11-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103068

Bug ID: 103068
   Summary: gomp_mutex_lock_slow isn't optimized
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, jakub at gcc dot gnu.org,
wwwhhhyyy333 at gmail dot com
  Target Milestone: ---
Target: i386,x86-64

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap grabs the cache line exclusively and causes
excessive cache line bouncing.

gomp_mutex_lock_slow has

void
gomp_mutex_lock_slow (gomp_mutex_t *mutex, int oldval)
{
  /* First loop spins a while.  */
  while (oldval == 1)
{   
  if (do_spin (mutex, 1)) 
{
  /* Spin timeout, nothing changed.  Set waiting flag.  */
  oldval = __atomic_exchange_n (mutex, -1, MEMMODEL_ACQUIRE);
  if (oldval == 0)
return;
  futex_wait (mutex, -1);
  break;
}
  else
{
  /* Something changed.  If now unlocked, we're good to go.  */
  oldval = 0;

Add a normal check for *mutex == 1 and continue if
__atomic_compare_exchange_n may fail.

  if (__atomic_compare_exchange_n (mutex, &oldval, 1, false,
   MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
return;
}
}

  /* Second loop waits until mutex is unlocked.  We always exit this
 loop with wait flag set, so next unlock will awaken a thread.  */
  while ((oldval = __atomic_exchange_n (mutex, -1, MEMMODEL_ACQUIRE)))
do_wait (mutex, -1);
}
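
A minimal sketch of the suggested check, assuming the mutex is a plain int
where 0 means unlocked (as the quoted code implies).  The type demo_mutex_t
and the function demo_mutex_trylock are hypothetical names, not libgomp API,
and __ATOMIC_ACQUIRE/__ATOMIC_RELAXED stand in for libgomp's MEMMODEL_*
constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for gomp_mutex_t: 0 = unlocked, 1 = locked,
   -1 = locked with waiters (inferred from the quoted code).  */
typedef int demo_mutex_t;

/* Try to take the mutex, reading it with a plain load first so that the
   locked compare-and-swap is only attempted when it can succeed.  */
static bool
demo_mutex_trylock (demo_mutex_t *mutex)
{
  assert (mutex != NULL);

  /* Plain load: if the mutex is not free the CAS would fail anyway.  */
  if (__atomic_load_n (mutex, __ATOMIC_RELAXED) != 0)
    return false;

  int expected = 0;
  return __atomic_compare_exchange_n (mutex, &expected, 1, false,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}
```

In the spin loop above, a false return would correspond to "continue"
instead of issuing the compare-and-swap.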

[Bug target/103066] New: __sync_val_compare_and_swap/__sync_bool_compare_and_swap aren't optimized

2021-11-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103066

Bug ID: 103066
   Summary: __sync_val_compare_and_swap/__sync_bool_compare_and_swap
aren't optimized
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, wwwhhhyyy333 at gmail dot com
Blocks: 103065
  Target Milestone: ---
Target: i386,x86-64

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap grabs the cache line exclusively and causes
excessive cache line bouncing.

[hjl@gnu-cfl-1 tmp]$ cat x.c
extern int m;

int test(int oldv, int newv)
{
  return __sync_val_compare_and_swap (&m, oldv, newv);
}
[hjl@gnu-cfl-1 tmp]$ gcc -S -O2 x.c
[hjl@gnu-cfl-1 tmp]$ cat x.s
.file   "x.c"
.text
.p2align 4
.globl  test
.type   test, @function
test:
.LFB0:
.cfi_startproc
movl    %edi, %eax
lock cmpxchgl   %esi, m(%rip)

GCC should first emit a normal load, check it against the expected value,
and return immediately if cmpxchgl may fail.

ret
.cfi_endproc
.LFE0:
.size   test, .-test
.ident  "GCC: (GNU) 11.2.1 20211019 (Red Hat 11.2.1-6)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 tmp]$
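
Written by hand in C, the requested sequence might look like the sketch
below.  The function name cas_val_precheck is hypothetical.  Note that the
early return skips the full barrier that __sync_val_compare_and_swap
normally provides, so whether this is acceptable for the __sync builtins is
an open question here, not a settled fact.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compare-and-swap that reads the location with a
   plain load first and returns early when the swap cannot succeed.  */
static int
cas_val_precheck (int *p, int oldv, int newv)
{
  assert (p != NULL);

  /* Plain load: does not require the cache line in exclusive state.  */
  int cur = __atomic_load_n (p, __ATOMIC_RELAXED);
  if (cur != oldv)
    return cur;   /* The locked cmpxchg is skipped on this path.  */

  /* Value matched: attempt the real, fully ordered compare-and-swap.  */
  return __sync_val_compare_and_swap (p, oldv, newv);
}
```

As with the builtin, the return value is the value observed in *p; the
swap happened only if that value equals oldv.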


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103065
[Bug 103065] [meta] atomic operations aren't optimized

[Bug target/103065] New: [meta] atomic operations aren't optimized

2021-11-03 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103065

Bug ID: 103065
   Summary: [meta] atomic operations aren't optimized
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: meta-bug
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, wwwhhhyyy333 at gmail dot com
  Target Milestone: ---
Target: i386,x86-64

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap grabs the cache line exclusively and causes
excessive cache line bouncing.

[Bug bootstrap/102675] [12 regression] Bootstrap fails in libsanitizer: 'MD5_DIGEST_STRING_LENGTH' was not declared in this scope

2021-10-30 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102675

H.J. Lu  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #15 from H.J. Lu  ---
A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582945.html

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-30 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #22 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #21)
> Another possibly-bug, but possibly mis-expectations on my behalf.
> 
> I've found some code in the depths of Xen which is causing a failure on
> final link due to a missing `__x86_indirect_thunk_nt_rax` symbol.
> 
>   $ cat fnptr-typeof.c
>   extern void (*fnptrs[])(char);
> 
>   void foo(int a)
>   {
>   typeof(foo) *bar = (void *)fnptrs[0];
>   bar(a);
>   }
> 
> I realise this is wildly undefined behaviour, and I will try to address it
> in due course.  However, the instruction generation is bizarre.
> 
> When I compile with -fcf-protection=branch -mmanual-endbr, I get a plain
> `jmp *fnptrs(%rip)` instruction.  (This is fine.)
> 
> When I compile with -fcf-check-attribute=no as well, then I get `notrack jmp
> *fnptrs(%rip)`.  I'm not sure why the notrack is warranted here; for all GCC
> knows, the target does have a suitable ENDBR64 instruction.
> 

From "info gcc":

 The 'nocf_check' attribute on a type of pointer to function is used
 to inform the compiler that a call through the pointer should not
 be instrumented when compiled with the '-fcf-protection=branch'
 option.  The compiler assumes that the function's address from the
 pointer is a valid target for a control-flow transfer.  A direct
 function call through a function name is assumed to be a safe call
 thus direct calls are not instrumented by the compiler.

That is why notrack is added.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #19 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #17)
> I think I've found a bug in the -fcf-check-attribute implementation.
> 

Please try the v5 patch.  BTW, do you have a testcase to show how
-fcf-check-attribute=yes is used?

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51696|0   |1
is obsolete||

--- Comment #18 from H.J. Lu  ---
Created attachment 51701
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51701&action=edit
The v5 patch to add -fcf-check-attribute=[yes|no]

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #23 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #22)
> One curious thing I have discovered.  While auditing the -mharden-sls=all
> code generation in Xen, I found examples where I got "ret int3 ret int3"
> with no intervening instructions.
> 
> It turns out this is not a regression in this change.  It is a pre-existing
> missing optimisation, which is made more obvious when every ret is extended
> with an int3.
> 
> It occurs for functions with either no stack frame at all, or functions
> which have an early exit before setting up the stack frame.  Some examples
> which occur at -O1 do not occur at -O2.
> 
> One curious example which does still repro at -O2 is this.  We have a hash
> lookup function:
> 
> struct context *sidtab_search(struct sidtab *s, u32 sid)
> {
> int hvalue;
> struct sidtab_node *cur;
> 
> if ( !s )
> return NULL;
> 
> hvalue = SIDTAB_HASH(sid);
> cur = s->htable[hvalue];
> while ( cur != NULL && sid > cur->sid )
> cur = cur->next;
> 
> if ( cur == NULL || sid != cur->sid )
> {
> /* Remap invalid SIDs to the unlabeled SID. */
> sid = SECINITSID_UNLABELED;
> hvalue = SIDTAB_HASH(sid);
> cur = s->htable[hvalue];
> while ( cur != NULL && sid > cur->sid )
> cur = cur->next;
> if ( !cur || sid != cur->sid )
> return NULL;
> }
> 
> return &cur->context;
> }
> 
> which compiles (reformatted a little for width - unmodified:
> https://paste.debian.net/hidden/7bf675d6/) to:
> 
> :
>  48 85 ff test   %rdi,%rdi
> /--- 74 63je 
> |48 8b 17 mov(%rdi),%rdx
> |89 f0mov%esi,%eax
> |83 e0 7f and$0x7f,%eax
> |48 8b 04 c2  mov(%rdx,%rax,8),%rax
> |48 85 c0 test   %rax,%rax
> |   /--- 75 13jne
> |  /|--- eb 17jmp
> |  ||0f 1f 84 00 00 00 00 nopl   0x0(%rax,%rax,1)
> |  ||00 
> |  ||/-> 48 8b 40 48  mov0x48(%rax),%rax
> |  |||   48 85 c0 test   %rax,%rax
> |  +||-- 74 06je 
> |  |\|-> 39 30cmp%esi,(%rax)
> |  | \-- 72 f3jb 
> | /| 74 24je 
> | |\---> 48 8b 42 28  mov0x28(%rdx),%rax
> | |  48 85 c0 test   %rax,%rax
> | | /--- 75 11jne
> |/|-|--- eb 32jmp // (1)
> ||| |66 0f 1f 44 00 00nopw   0x0(%rax,%rax,1)
> ||| |/-> 48 8b 40 48  mov0x48(%rax),%rax
> ||| ||   48 85 c0 test   %rax,%rax
> |||/||-- 74 17je  // (2)
> \|-> 83 38 04 cmpl   $0x4,(%rax)
>  \-- 76 f2jbe
>  83 38 05 cmpl   $0x5,(%rax)
> +||| 75 15jne
> ||\|---> 48 83 c0 08  add$0x8,%rax
> || | c3   retq   
> || | cc   int3   
> || | 0f 1f 80 00 00 00 00 nopl   0x0(%rax)
> || \---> c3   retq// Target of (2)
> ||   cc   int3   
> ||   66 0f 1f 44 00 00nopw   0x0(%rax,%rax,1)
> \|-> 31 c0xor%eax,%eax
>  |   c3   retq   
>  |   cc   int3   
>  \-> c3   retq// Target of (1)
>  cc   int3   
>  66 90xchg   %ax,%ax
> 
> There are 4 exits in total.  Two have to set up %eax, so they can't usefully
> be merged.
> 
> However, the unconditional jmp at (1) is 2 bytes, and could fully contain
> its target ret;int3 without even impacting the surrounding padding.  Whether
> it inlines or merges, this drops 4 bytes.
> 
> The conditional jump at (2) could be folded into any of the other exit
> paths, dropping 16 bytes from the total size.
> 
> I have no idea how easy/hard this may be to track down, or whether it is
> worth pursuing urgently, but it probably does want looking at, seeing as SLS
> hardening doubles the hit.

Please open a separate bug to track it.  Should shrink-wrap handle it?

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #16 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #14)
> (In reply to H.J. Lu from comment #13)
> > (In reply to Andrew Cooper from comment #11)
> > > 
> > > There should be a diagnostic, but it ought to include cf_check in the type
> > > it prints.
> > 
> > Try the v3 patch.
> 
> Thanks.  Now get:
> 
> proto.c:2:37: error: conflicting types with implied 'nocf_check' attribute
> for 'foo'; have 'void(void)'
> 2 | static void __attribute__((unused)) foo(void)
>   | ^~~
> proto.c:1:39: note: previous declaration of 'foo' with type 'void(void)'
> 1 | static void __attribute__((cf_check)) foo(void);
>   |   ^~~
> 
> which at least highlights the issue.  Any variant like this, but possibly
> even simply reporting 'void __attribute__((nocf_check))(void)' should be
> fine.

The v4 patch changed it to

bar1.c:2:37: error: conflicting types for ‘foo’; have ‘void(void)’ with implied
‘nocf_check’ attribute
2 | static void __attribute__((unused)) foo(void)
  | ^~~
bar1.c:1:39: note: previous declaration of ‘foo’ with type ‘void(void)’
1 | static void __attribute__((cf_check)) foo(void);
  |   ^~~
bar1.c:5:21: warning: initialization of ‘void (*)(void)’ from incompatible
pointer type ‘void (__attribute__((nocf_check)) *)(void)’
[-Wincompatible-pointer-types]
5 | void (*ptr)(void) = foo;
  | ^~~

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51693|0   |1
is obsolete||

--- Comment #15 from H.J. Lu  ---
Created attachment 51696
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51696&action=edit
The v4 patch to add -fcf-check-attribute=[yes|no]

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #13 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #11)
> 
> There should be a diagnostic, but it ought to include cf_check in the type
> it prints.

Try the v3 patch.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51687|0   |1
is obsolete||

--- Comment #12 from H.J. Lu  ---
Created attachment 51693
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51693&action=edit
The v3 patch to add -fcf-check-attribute=[yes|no]

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #10 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #8)
> Actually, there is a (possibly pre-existing) diagnostics issue:
> 
> $ cat proto.c
> static void __attribute__((cf_check)) foo(void);
> static void __attribute__((unused)) foo(void)
> {
> }
> void (*ptr)(void) = foo;
> 
> $ gcc -Wall -Os -fcf-protection=branch -mmanual-endbr
> -fcf-check-attribute=no -c proto.c -o proto.o
> proto.c:2:37: error: conflicting types for 'foo'; have 'void(void)'
> 2 | static void __attribute__((unused)) foo(void)
>   | ^~~
> proto.c:1:39: note: previous declaration of 'foo' with type 'void(void)'
> 1 | static void __attribute__((cf_check)) foo(void);
>   |   ^~~
> 
> 
> The diagnostic complaining that the forward declaration doesn't match the
> definition gives 'void(void)' as the type in both cases, leaving out the
> fact that they differ by cf_check-ness.

Please try the v2 patch.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51672|0   |1
is obsolete||

--- Comment #9 from H.J. Lu  ---
Created attachment 51687
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51687&action=edit
The v2 patch to add -fcf-check-attribute=[yes|no]

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51684|0   |1
is obsolete||

--- Comment #19 from H.J. Lu  ---
Created attachment 51685
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51685&action=edit
The v4 patch to add -mharden-sls=

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #17 from H.J. Lu  ---
[hjl@gnu-tgl-2 pr102952]$ cat z2.i
extern void (*fptr) (int, int);

void
foo (int x, int y)
{
  fptr (x, y);
}
[hjl@gnu-tgl-2 pr102952]$ make z2.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O2
-mindirect-branch=thunk -mindirect-branch-cs-prefix -mharden-sls=all -S z2.i
[hjl@gnu-tgl-2 pr102952]$ cat z2.s
.file   "z2.i"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
movq    fptr(%rip), %rax
jmp     __x86_indirect_thunk_rax

Is int3 needed here?

.cfi_endproc
.LFE0:
.size   foo, .-foo
.section        .text.__x86_indirect_thunk_rax,"axG",@progbits,__x86_indirect_thunk_rax,comdat
.globl  __x86_indirect_thunk_rax
.hidden __x86_indirect_thunk_rax
.type   __x86_indirect_thunk_rax, @function
__x86_indirect_thunk_rax:
.LFB1:
.cfi_startproc
call    .LIND1
.LIND0:
pause
lfence
jmp .LIND0
.LIND1:
.cfi_def_cfa_offset 16
mov %rax, (%rsp)
ret
int3    Is this needed?
.cfi_endproc
.LFE1:
.ident  "GCC: (GNU) 12.0.0 20211027 (experimental)"
.section        .note.GNU-stack,"",@progbits
[hjl@gnu-tgl-2 pr102952]$

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #16 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #15)
> So this is the irritating corner case where the two options are linked.
> 
> *If* we are using -mindirect-branch-cs-prefix, then we intend to rewrite
> `jmp __x86_indirect_thunk_*` to `jmp *%reg` or `lfence; jmp *%reg` based on
> boot time configuration/settings.
> 
> In this case, we still need to fit the `int3` for SLS protection in
> somewhere.
> 
> The two options are:
> 1) Special case `jmp __x86_indirect_thunk_*` as if it were an indirect jump
> and write out an `int3` directly, or

I can do this.

> 2) Pad one extra %cs prefix on the jmp, so we've got space to insert one at
> boot time.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #14 from H.J. Lu  ---
(In reply to peterz from comment #13)
> (In reply to H.J. Lu from comment #12)
> > (In reply to peterz from comment #9)
> > > Created attachment 51683 [details]
> > > kernel patch to test -mharden-sls=all
> > > 
> > > $ make O=defconfig CC=gcc-12.0.0 arch/x86/entry/common.o
> > > ...
> > > arch/x86/entry/common.o: warning: objtool: do_SYSENTER_32()+0x1b:
> > > unreachable instruction
> > 
> > Please try the v2 patch.
> 
> Per comment #6 this should be v3, no? Anyway, the good news is that I now
> seem to have a kernel image with lots of extra int3 instructions, but all in
> the right place.
> 
> *However*, I seem to be missing a few:
> 
>   36f4:   41 5f   pop%r15
>   36f6:   e9 00 00 00 00  jmp36fb
> <__do_set_cpus_allowed+0x5b>
> 36f7: R_X86_64_PLT32__x86_indirect_thunk_rax-0x4

This is a direct branch.

>   36fb:   48 8b 87 90 02 00 00mov0x290(%rdi),%rax
> 
> There should be one after the jmp __x86_indirect_thunk_* thingy. I'll do an
> objtool patch to search for missing int3, but that'll have to wait until
> tomorrow, it's past midnight.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #12 from H.J. Lu  ---
(In reply to peterz from comment #9)
> Created attachment 51683 [details]
> kernel patch to test -mharden-sls=all
> 
> $ make O=defconfig CC=gcc-12.0.0 arch/x86/entry/common.o
> ...
> arch/x86/entry/common.o: warning: objtool: do_SYSENTER_32()+0x1b:
> unreachable instruction

Please try the v2 patch.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51679|0   |1
is obsolete||

--- Comment #11 from H.J. Lu  ---
Created attachment 51684
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51684&action=edit
The v2 patch to add -mharden-sls=

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #8 from H.J. Lu  ---
(In reply to peterz from comment #7)
> (In reply to H.J. Lu from comment #3)
> > Created attachment 51678 [details]
> > A patch to add -mharden-sls=
> > 
> > x86: Add -mharden-sls=[none|all|return|indirect-branch]
> > 
> > Generate code to mitigate against straight line speculation.
> 
> I'm getting (a lot of) spurious int3 instructions with this, for example:
> 
> 0280 :
>  280:   48 81 8f 90 00 00 00 00 02 00 00orq$0x200,0x90(%rdi)
>  28b:   48 8b 47 20 mov0x20(%rdi),%rax
>  28f:   48 89 87 98 00 00 00mov%rax,0x98(%rdi)
>  296:   e9 75 ff ff ff  jmp210 
>  29b:   cc  int3
> 
> That's not an *indirect* jump there.

Please provide a testcase.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51678|0   |1
is obsolete||

--- Comment #6 from H.J. Lu  ---
Created attachment 51681
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51681&action=edit
The v2 patch to add -mharden-sls=

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

--- Comment #5 from H.J. Lu  ---
(In reply to Andrew Cooper from comment #0)
> Hello
> 
> [FYI, this is being cross-requested of Clang too]
> 
> Linux and other kernel level software makes use of
> -mindirect-branch=thunk-extern to be able to alter the handling of indirect
> branches at boot.  It turns out to be advantageous to inline the thunks when
> retpoline is not in use. 
> https://lore.kernel.org/lkml/20211026120132.613201...@infradead.org/ is some
> infrastructure to make this work.
> 
> In some cases, we want to be able to inline an `lfence; jmp *%reg` thunk. 
> This is fine for the low 8 registers, but not fine for %r{8..15} where the
> REX prefix pushes the replacement size to being 6 bytes.
> 
> It would be very useful to have a code-gen option to write out `call
> %cs:__x86_indirect_thunk_r{8..15}` where the redundant %cs prefix will
> increase the instruction length to 6, allowing the non-retpoline form to be
> inlined.
> 

-mindirect-branch-cs-prefix

> Relatedly, x86 straight line speculation has been discussed before, but
> without any action taken.  It would be helpful to have a code gen option
> which would emit `int3` following any `ret` instruction, and any indirect
> jump, as neither of these two cases have following architectural execution.
> 
> The reason these two are related is that if both options are in use, we want
> an extra byte of replacement space to be able to inline `lfence; jmp *%reg;
> int3`.
> 

-mharden-sls=[none|all|return|indirect-branch]

Let me know if they work.  I also need testcases.

> Third (and possibly only for future optimisations), Clang has been observed
> to spot conditional tail calls as `Jcc __x86_indirect_thunk_*`.  This is a 6
> byte source size, but needs up to 9 bytes of space for inlining including an
> `int3` for straight line speculation reasons (See
> https://lore.kernel.org/lkml/20211026120310.359986...@infradead.org/ for
> full details).  It might be enough to simply prohibit an optimisation like
> this when trying to pad retpolines for inlineability.

I don't think GCC does that at all.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

--- Comment #4 from H.J. Lu  ---
Created attachment 51679
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51679&action=edit
A patch to add -mindirect-branch-cs-prefix

It adds CS prefix to call and jmp to thunk when converting indirect call
and jump.

[Bug target/102952] New code-gen options for retpolines and straight line speculation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102952

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com
 Ever confirmed|0   |1
   Last reconfirmed||2021-10-27

--- Comment #3 from H.J. Lu  ---
Created attachment 51678
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51678&action=edit
A patch

x86: Add -mharden-sls=[none|all|return|indirect-branch]

Generate code to mitigate against straight line speculation.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-27 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

  Attachment #51670|0   |1
is obsolete||

--- Comment #6 from H.J. Lu  ---
Created attachment 51677
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51677&action=edit
An updated patch

[Bug tree-optimization/102950] [11/12 Regression] Dead Code Elimination Regression at -O3 (trunk&11.2.0 vs 10.3.0)

2021-10-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102950

--- Comment #2 from H.J. Lu  ---
r11-3685 is bad and r11-3683 is good.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #4 from H.J. Lu  ---
Please try these patches and let me know if they work.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

--- Comment #3 from H.J. Lu  ---
Created attachment 51672
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51672&action=edit
Add -fcf-check-attribute=[yes|no]

-fcf-check-attribute=[yes|no] implies "cf_check" or "nocf_check"
function attribute.

[Bug target/102953] Improvements to CET-IBT and ENDBR generation

2021-10-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102953

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2021-10-26
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com
 Status|UNCONFIRMED |NEW

--- Comment #2 from H.J. Lu  ---
Created attachment 51670
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51670&action=edit
Skip ENDBR when emitting direct call/jmp to local function
