Re: [PATCH v4] Add QI vector mode support to by-pieces for memset

2021-07-26 Thread H.J. Lu via Gcc-patches
On Mon, Jul 26, 2021 at 2:53 PM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford > > wrote: > >> > >> "H.J. Lu via Gcc-patches" writes: > >> > +to avoid st

Re: [PATCH v4] Add QI vector mode support to by-pieces for memset

2021-07-26 Thread H.J. Lu via Gcc-patches
On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > +/* Callback routine for store_by_pieces. Read GET_MODE_BITSIZE (MODE) > > + bytes from constant string DATA + OFFSET and return it as target > > + consta

Re: [PATCH] x86: Convert load from constand pool to SSE constant load

2021-07-22 Thread H.J. Lu via Gcc-patches
On Thu, Jul 22, 2021 at 7:17 PM Hongtao Liu wrote: > > On Thu, Jul 22, 2021 at 8:34 PM H.J. Lu via Gcc-patches > wrote: > > > > For > > > > (set (reg:V32QI 108) > > (const_vector:V32QI [ > > (const_int -1 [0x]

[PATCH v2] x86: Don't return hard register when LRA is in progress

2021-07-22 Thread H.J. Lu via Gcc-patches
Don't return hard register in ix86_gen_scratch_sse_rtx when LRA is in progress to avoid ICE when there are no available hard registers for LRA. gcc/ PR target/101504 * config/i386/i386.c (ix86_gen_scratch_sse_rtx): Don't return hard register when LRA is in progress. gcc/t

[PATCH] x86: Convert load from constand pool to SSE constant load

2021-07-22 Thread H.J. Lu via Gcc-patches
For (set (reg:V32QI 108) (const_vector:V32QI [ (const_int -1 [0x]) repeated x32 ])) "x.c":15:5 1573 {movv32qi_internal} (expr_list:REG_EQUIV (const_vector:V32QI [ (const_int -1 [0x]) repeated x32 ]

Re: [PATCH v3] Add QI vector mode support to by-pieces for memset

2021-07-21 Thread H.J. Lu via Gcc-patches
On Wed, Jul 21, 2021 at 12:42 PM Richard Sandiford wrote: > > Richard Sandiford writes: > > "H.J. Lu via Gcc-patches" writes: > >> On Wed, Jul 21, 2021 at 7:50 AM Richard Sandiford > >> wrote: > >>> > >>> "H.J. Lu"

[PATCH v4] Add QI vector mode support to by-pieces for memset

2021-07-21 Thread H.J. Lu via Gcc-patches
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces infrastructure to allow non-integer mode. 2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size to return QI vector mode for memset. 3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the smallest integ

Re: [PATCH v3] Add QI vector mode support to by-pieces for memset

2021-07-21 Thread H.J. Lu via Gcc-patches
On Wed, Jul 21, 2021 at 12:20 PM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > On Wed, Jul 21, 2021 at 7:50 AM Richard Sandiford > > wrote: > >> > >> "H.J. Lu" writes: > >> > diff --git a/gcc/buil

Re: [PATCH v3] Add QI vector mode support to by-pieces for memset

2021-07-21 Thread H.J. Lu via Gcc-patches
On Wed, Jul 21, 2021 at 7:50 AM Richard Sandiford wrote: > > "H.J. Lu" writes: > > diff --git a/gcc/builtins.c b/gcc/builtins.c > > index 39ab139b7e1..1972301ce3c 100644 > > --- a/gcc/builtins.c > > +++ b/gcc/builtins.c > > @@ -3890,13 +3890,16 @@ expand_builtin_strnlen (tree exp, rtx target, >

[PATCH] x86: Remove OPTION_MASK_ISA_SSE4_2 from CRC32 _builtin functions

2021-07-21 Thread H.J. Lu via Gcc-patches
Since commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd Author: H.J. Lu Date: Thu Apr 15 05:59:48 2021 -0700 x86: Use crc32 target option for CRC32 intrinsics enabled OPTION_MASK_ISA_CRC32 for -msse4 and removed TARGET_SSE4_2 check in sse4_2_crc32 patterns, remove OPTION_MASK_ISA_SSE4_2 from

Re: [PATCH 0/4] libsanitizer: Sync with upstream

2021-07-20 Thread H.J. Lu via Gcc-patches
On Tue, Jul 20, 2021 at 1:58 PM Jeff Law wrote: > > > > On 7/20/2021 2:55 PM, H.J. Lu via Gcc-patches wrote: > > 1. Sync with upstream commit 7704fedfff6ef5676adb6415f3be0ac927d1a746 > > 2. Apply local patches > > 3. Update LOCAL_PATCHES > > 4. Bump asan/t

[PATCH 2/4] libsanitizer: Apply local patches

2021-07-20 Thread H.J. Lu via Gcc-patches
--- libsanitizer/asan/asan_globals.cpp| 19 -- libsanitizer/asan/asan_interceptors.h | 7 ++- libsanitizer/asan/asan_mapping.h | 2 +- .../sanitizer_linux_libcdep.cpp | 4 .../sanitizer_common/sanitizer_mac.cpp| 12

[PATCH 0/4] libsanitizer: Sync with upstream

2021-07-20 Thread H.J. Lu via Gcc-patches
1. Sync with upstream commit 7704fedfff6ef5676adb6415f3be0ac927d1a746 2. Apply local patches 3. Update LOCAL_PATCHES 4. Bump asan/tsan versions for the upstream commit: commit acf0a6428681dccac803984bfbb1e3e54248f090 Author: Ilya Leoshkevich Date: Fri Jul 2 02:42:38 2021 +0200

[PATCH 4/4] libsanitizer: Bump asan/tsan versions

2021-07-20 Thread H.J. Lu via Gcc-patches
Bump asan/tsan versions for the upstream commit: commit acf0a6428681dccac803984bfbb1e3e54248f090 Author: Ilya Leoshkevich Date: Fri Jul 2 02:42:38 2021 +0200 [sanitizer] Fix __sanitizer_kernel_sigset_t endianness issue setuid(0) hangs on SystemZ under TSan because TSan's BackgroundThr

[PATCH 3/4] libsanitizer: Update LOCAL_PATCHES

2021-07-20 Thread H.J. Lu via Gcc-patches
* LOCAL_PATCHES: Update to the corresponding revision. --- libsanitizer/LOCAL_PATCHES | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/libsanitizer/LOCAL_PATCHES b/libsanitizer/LOCAL_PATCHES index b1969fc7882..d45655392b0 100644 --- a/libsanitizer/LOCAL_PATCHES +++ b/l

Re: [PATCH] Add QI vector mode support to by-pieces for memset

2021-07-20 Thread H.J. Lu via Gcc-patches
On Tue, Jul 20, 2021 at 8:12 AM Richard Sandiford wrote: > > Richard Sandiford via Gcc-patches writes: > > "H.J. Lu via Gcc-patches" writes: > >> On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford > >> wrote: >

[PATCH v3] Add QI vector mode support to by-pieces for memset

2021-07-20 Thread H.J. Lu via Gcc-patches
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces infrastructure to allow non-integer mode. 2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size to return QI vector mode for memset. 3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the smallest integ

Re: [PATCH] Add QI vector mode support to by-pieces for memset

2021-07-20 Thread H.J. Lu via Gcc-patches
On Tue, Jul 20, 2021 at 5:48 AM H.J. Lu wrote: > > On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford > wrote: > > > > "H.J. Lu via Gcc-patches" writes: > > >> > + { > > >> > + /* First generate subreg of word mode if the

Re: [PATCH] Add QI vector mode support to by-pieces for memset

2021-07-20 Thread H.J. Lu via Gcc-patches
On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > >> > + { > >> > + /* First generate subreg of word mode if the previous mode is > >> > + wider than word mode and word

[PATCH v2] Add QI vector mode support to by-pieces for memset

2021-07-19 Thread H.J. Lu via Gcc-patches
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces infrastructure to allow non-integer mode. 2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size to return QI vector mode for memset. 3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the smallest integ

Re: [PATCH] Add QI vector mode support to by-pieces for memset

2021-07-19 Thread H.J. Lu via Gcc-patches
On Mon, Jul 19, 2021 at 7:41 AM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > On Fri, Jul 16, 2021 at 7:15 AM H.J. Lu wrote: > >> > >> On Fri, Jul 16, 2021 at 6:24 AM Richard Sandiford > >> wrote: > >> > >

[PATCH] x86: Don't issue vzeroupper if callee returns AVX register

2021-07-18 Thread H.J. Lu via Gcc-patches
Don't issue vzeroupper before function call if callee returns AVX register since callee must be compiled with AVX. gcc/ PR target/101495 * config/i386/i386.c (ix86_check_avx_upper_stores): Moved before ix86_avx_u128_mode_needed. (ix86_avx_u128_mode_needed): Return

[PATCH v5] : Add pragma GCC target("general-regs-only")

2021-07-17 Thread H.J. Lu via Gcc-patches
On Thu, Apr 22, 2021 at 7:30 AM Richard Biener via Gcc-patches wrote: > > On Thu, Apr 22, 2021 at 2:52 PM Richard Biener > wrote: > > > > On Thu, Apr 22, 2021 at 2:22 PM Jakub Jelinek wrote: > > > > > > On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches > > > wrote: > > >

[PATCH] ix86: Enable the GPR only instructions for -mgeneral-regs-only

2021-07-17 Thread H.J. Lu via Gcc-patches
For -mgeneral-regs-only, enable the GPR only instructions which are enabled implicitly by SSE ISAs unless they have been disabled explicitly. gcc/ PR target/101492 * common/config/i386/i386-common.c (ix86_handle_option): For -mgeneral-regs-only, enable the GPR only instruc

[PATCH] Add QI vector mode support to by-pieces for memset

2021-07-16 Thread H.J. Lu via Gcc-patches
On Fri, Jul 16, 2021 at 7:15 AM H.J. Lu wrote: > > On Fri, Jul 16, 2021 at 6:24 AM Richard Sandiford > wrote: > > > > "H.J. Lu via Gcc-patches" writes: > > > On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford > > > wrote: > > >>

Re: [PATCH] Rewrite memset expanders with vec_duplicate

2021-07-16 Thread H.J. Lu via Gcc-patches
On Fri, Jul 16, 2021 at 6:24 AM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford > > wrote: > >> > >> "H.J. Lu via Gcc-patches" writes: > >> >

Re: [PATCH] Rewrite memset expanders with vec_duplicate

2021-07-16 Thread H.J. Lu via Gcc-patches
On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > 1. Rewrite builtin_memset_read_str and builtin_memset_gen_str with > > vec_duplicate_optab to duplicate QI value to TI/OI/XI value. > > 2. Add TARGET_GEN_MEMSET_S

Re: [PATCH 1/4][committed] testsuite: Fix testisms in scalar tests PR101457

2021-07-15 Thread H.J. Lu via Gcc-patches
On Thu, Jul 15, 2021 at 9:40 AM Tamar Christina via Gcc-patches wrote: > > Hi All, > > These testcases accidentally contain the wrong signs for the expected values > for the scalar code. The vector code however is correct. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Co

[PATCH v2] x86: Don't set AVX_U128_DIRTY when all bits are zero

2021-07-15 Thread H.J. Lu via Gcc-patches
On Thu, Jul 15, 2021 at 6:36 PM Hongtao Liu wrote: > > On Fri, Jul 16, 2021 at 1:30 AM H.J. Lu via Gcc-patches > wrote: > > > > In a single SET, all bits of the source YMM/ZMM register are zero when > > > > 1. The source is constant zero. > > 2. The source Y

[PATCH] x86: Don't set AVX_U128_DIRTY when all bits are zero

2021-07-15 Thread H.J. Lu via Gcc-patches
In a single SET, all bits of the source YMM/ZMM register are zero when 1. The source is constant zero. 2. The source YMM/ZMM operand is defined from constant zero. and we don't set AVX_U128_DIRTY. gcc/ PR target/101456 * config/i386/i386.c (ix86_avx_u128_mode_needed): Don't set

Re: GCC 11.1.1 Status Report (2021-07-06)

2021-07-15 Thread H.J. Lu via Gcc-patches
On Tue, Jul 6, 2021 at 12:00 AM Richard Biener wrote: > > > Status > == > > The GCC 11 branch is open for regression and documentation fixes. > It's time for a GCC 11.2 release and we are aiming for a release > candidate in about two weeks which would result in the GCC 11.2 > release about thr

Re: GCC 11.1.1 Status Report (2021-07-06)

2021-07-14 Thread H.J. Lu via Gcc-patches
On Tue, Jul 6, 2021 at 12:00 AM Richard Biener wrote: > > > Status > == > > The GCC 11 branch is open for regression and documentation fixes. > It's time for a GCC 11.2 release and we are aiming for a release > candidate in about two weeks which would result in the GCC 11.2 > release about thr

Re: ping-2: [PATCH] c-family: Add more predefined macros for math flags

2021-07-14 Thread H.J. Lu via Gcc-patches
On Wed, Jul 14, 2021 at 12:32 AM Matthias Kretz wrote: > > OK? > > On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote: > > Library code, especially in headers, sometimes needs to know how the > > compiler interprets / optimizes floating-point types and operations. > > This information c

Re: [PATCH] [i386] Remove pass_cpb which is related to enable avx512 embedded broadcast from constant pool.

2021-07-14 Thread H.J. Lu via Gcc-patches
On Tue, Jul 13, 2021 at 9:35 PM Hongtao Liu wrote: > > On Wed, Jul 14, 2021 at 10:34 AM liuhongt wrote: > > > > By optimizing vector movement to broadcast in ix86_expand_vector_move > > during pass_expand, pass_reload/LRA can automatically generate an avx512 > > embedded broadcast, pass_cpb is no

PING^1 [PATCH v2] x86: Check AVX512 without mask instructions

2021-07-14 Thread H.J. Lu via Gcc-patches
On Fri, Jun 25, 2021 at 5:39 AM H.J. Lu wrote: > > On Fri, Jun 25, 2021 at 12:50 AM Uros Bizjak wrote: > > > > On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu wrote: > > > > > > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches > > > wrote: > > > > > > > > On Thu, Jun 24, 2021 at 2:12 PM

[PATCH] Rewrite memset expanders with vec_duplicate

2021-07-13 Thread H.J. Lu via Gcc-patches
1. Rewrite builtin_memset_read_str and builtin_memset_gen_str with vec_duplicate_optab to duplicate QI value to TI/OI/XI value. 2. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard scratch register to avoid stack realignment when expanding memset. PR middle-end/90773
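As a rough illustration of item 1 above (not the GCC implementation, which operates on RTL through vec_duplicate_optab), the following C sketch shows the idea of duplicating one QI (byte) value into a wider piece and then storing that piece repeatedly. The names broadcast_byte and memset_by_pieces are hypothetical, and a 64-bit integer stands in for the TI/OI/XI modes mentioned in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replicate one byte into all 8 bytes of a
   64-bit word, the scalar analogue of a vector broadcast.  */
static uint64_t
broadcast_byte (unsigned char c)
{
  return (uint64_t) c * UINT64_C (0x0101010101010101);
}

/* Hypothetical by-pieces memset: store the widest piece while it
   fits, then finish the tail one byte at a time.  */
static void
memset_by_pieces (void *dst, unsigned char c, size_t n)
{
  assert (dst != NULL || n == 0);

  unsigned char *p = dst;
  uint64_t wide = broadcast_byte (c);

  while (n >= sizeof wide)
    {
      /* memcpy keeps the store valid for unaligned destinations.  */
      memcpy (p, &wide, sizeof wide);
      p += sizeof wide;
      n -= sizeof wide;
    }

  while (n > 0)
    {
      *p++ = c;
      n--;
    }
}
```

A real expander would pick the piece sizes from what the target supports; the fixed 8-byte piece here is only an assumption made to keep the sketch short.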

[PATCH v3] x86: Don't enable UINTR in 32-bit mode

2021-07-13 Thread H.J. Lu via Gcc-patches
On Mon, Jul 12, 2021 at 11:56 PM Jakub Jelinek wrote: > > On Mon, Jul 12, 2021 at 06:51:30PM -0700, H.J. Lu wrote: > > @@ -404,9 +404,18 @@ const char *host_detect_local_cpu (int argc, const > > char **argv) > >if (argc < 1) > > return NULL; > > I think it would be simpler to use 2 argum

Re: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-13 Thread H.J. Lu via Gcc-patches
On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers wrote: > > On Tue, 13 Jul 2021, H.J. Lu wrote: > > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei > > wrote: > > > > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers. > > > > > > Can you please explain the behavior here? Is

Re: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-13 Thread H.J. Lu via Gcc-patches
om: llvm-dev On Behalf Of H.J. Lu > > via llvm-dev > > Sent: Friday, July 2, 2021 6:28 AM > > To: Joseph Myers > > Cc: llvm-...@lists.llvm.org; GCC Patches ; > > GNU C Library ; IA32 System V Application > > Binary Interface > > Subject: Re: [llvm-

Re: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-13 Thread H.J. Lu via Gcc-patches
2 System V Application Binary > Interface > Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support > > On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers wrote: > > > > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote: > > > > > 2. Return _Float16 and _Comple

[PATCH v2] x86: Don't enable UINTR in 32-bit mode

2021-07-12 Thread H.J. Lu via Gcc-patches
UINTR is available only in 64-bit mode. Since the codegen target is unknown when the gcc driver is processing -march=native, to properly handle UINTR for -march=native: 1. Pass arch[32|64] and tune[32|64] to host_detect_local_cpu to indicate 32-bit and 64-bit codegen. 2. Change ix86_option_ov

PING^1 [PATCH v5 00/11] Allow TImode/OImode/XImode in op_by_pieces operations

2021-07-12 Thread H.J. Lu via Gcc-patches
On Thu, Jul 1, 2021 at 8:22 AM H.J. Lu wrote: > > Changes in the v5 patches: > > 1. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard > scratch register to avoid stack realignment when expanding memset. > 2. Use vec_duplicate, instead of adding TARGET_READ_MEMSET_VALUE and > TAR

Re: [PATCH v3 1/2] Add -f[no-]direct-extern-access

2021-07-12 Thread H.J. Lu via Gcc-patches
On Sun, Jul 11, 2021 at 11:13 PM Richard Biener wrote: > > On Fri, Jul 9, 2021 at 4:50 PM H.J. Lu wrote: > > > > -fdirect-extern-access is the default. With -fno-direct-extern-access: > > > > 1. Always use GOT to access undefined data and function symbols, > >including in PIE and non-PIE. T

[PATCH] x86: Don't enable UINTR in 32-bit mode

2021-07-11 Thread H.J. Lu via Gcc-patches
UINTR is available only in 64-bit mode. Since the codegen target is unknown when the gcc driver is processing -march=native, to properly handle UINTR for -march=native: 1. Add an undocumented option, -muintr-native. 2. Update the gcc driver to pass -muintr-native with -march=native if UINTR i

Re: [COMMITTED] Fix relation query of equivalences.

2021-07-10 Thread H.J. Lu via Gcc-patches
On Thu, Jun 24, 2021 at 10:36 AM Andrew MacLeod via Gcc-patches wrote: > > > When looking for relations between equivalencies, a typo was causing the > same bitmap to be checked for both operands, instead of the correct one > for each. This caused us to never notice relations between equivalence

Re: [PATCH] PR tree-opt/40210: Fold (bswap(X)>>C1)&C2 to (X>>C3)&C2 in match.pd

2021-07-10 Thread H.J. Lu via Gcc-patches
On Thu, Jul 8, 2021 at 2:51 AM Richard Biener via Gcc-patches wrote: > > On Thu, Jul 8, 2021 at 9:37 AM Roger Sayle wrote: > > > > > > Hi Richard, > > Thanks. Yep, you've correctly the diagnosed that the motivation for the > > get_builtin_precision helper function was that the TREE_TYPE of the >
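The fold named in this thread's subject can be checked with a small C sketch. It is an illustration only, under these assumptions: X is 32 bits wide, C1 is a multiple of 8, C2 fits in one byte, and C3 is taken to be 32 - 8 - C1. The function names are hypothetical, and a hand-written byte swap stands in for the bswap builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 32-bit byte swap, standing in for the bswap builtin.  */
static uint32_t
bswap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
	 | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Left-hand side of the fold: (bswap (X) >> C1) & C2.  */
static uint32_t
before_fold (uint32_t x, unsigned c1, uint32_t c2)
{
  assert (c1 % 8 == 0 && c1 <= 24 && c2 <= 0xff);
  return (bswap32 (x) >> c1) & c2;
}

/* Right-hand side: (X >> C3) & C2, with C3 = 32 - 8 - C1 (assumed).  */
static uint32_t
after_fold (uint32_t x, unsigned c1, uint32_t c2)
{
  assert (c1 % 8 == 0 && c1 <= 24 && c2 <= 0xff);
  unsigned c3 = 32 - 8 - c1;
  return (x >> c3) & c2;
}
```

Because the mask selects at most one byte, the byte that bswap moved into position C1 is the same byte that a plain shift by C3 brings down from X, so the swap itself is not needed.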

Re: [PATCH] libffi/x86: Always check __x86_64__ for x86 hosts

2021-07-09 Thread H.J. Lu via Gcc-patches
On Mon, Jul 5, 2021 at 6:00 PM H.J. Lu wrote: > > Since for gnux32 hosts, -m32 generates i386 codes, always check __x86_64__ > for x86 hosts. > > PR libffi/101336 > * configure.host: Always check __x86_64__ for x86 hosts. > --- > libffi/configure.host | 21 +++-- >

[PATCH v3 2/2] Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

2021-07-09 Thread H.J. Lu via Gcc-patches
Generate the marker for -fno-direct-extern-access to indicate that the object file uses GOT to access all external symbols. Access to protected symbols in the resulting shared library is treated as local, which requires canonical function pointers and cannot be used with copy relocation. GNU_PROP

[PATCH v3 0/2] Implement indirect external access

2021-07-09 Thread H.J. Lu via Gcc-patches
Changes in the v2 patch. 1. GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to GNU binutils 2.38. But the -z indirect-extern-access linker option is only available for Linux/x86. However, the --max-cache-size=SIZE linker option was also added within a day. --max-cache-size=

[PATCH v3 1/2] Add -f[no-]direct-extern-access

2021-07-09 Thread H.J. Lu via Gcc-patches
-fdirect-extern-access is the default. With -fno-direct-extern-access: 1. Always use GOT to access undefined data and function symbols, including in PIE and non-PIE. These will avoid copy relocations in executables. This is compatible with existing executables and shared libraries. 2.

[PATCH] soft-fp: Update soft-fp from glibc

2021-07-07 Thread H.J. Lu via Gcc-patches
From: liuhongt 1. Add __extendhfdf2/__extendhfsf2 to return an IEEE half converted to IEEE double/single. 2. Add __truncdfhf2/__truncsfhf2 to truncate IEEE double/single into IEEE half. 3. Add __eqhf2/__nehf2 to return 0 if a == b and a,b are not NAN, otherwise return 1. These are needed by x86
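To make the half-to-single conversion concrete, here is a minimal C sketch of what a routine such as __extendhfsf2 has to compute. It is not the soft-fp code, which is written with generic macros; the name half_bits_to_float is hypothetical, and the sketch assumes the half is passed as its raw 16-bit IEEE binary16 encoding and that float is IEEE binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen raw IEEE binary16 bits to a float.
   Layout: 1 sign bit, 5 exponent bits (bias 15), 10 fraction bits.  */
static float
half_bits_to_float (uint16_t h)
{
  assert (sizeof (float) == sizeof (uint32_t));

  uint32_t sign = (uint32_t) (h >> 15) << 31;
  uint32_t exp = (h >> 10) & 0x1f;
  uint32_t frac = h & 0x3ff;
  uint32_t bits;

  if (exp == 0x1f)
    /* Infinity or NaN: all-ones exponent, payload kept.  */
    bits = sign | 0x7f800000u | (frac << 13);
  else if (exp != 0)
    /* Normal number: rebias the exponent from 15 to 127.  */
    bits = sign | ((exp + 112) << 23) | (frac << 13);
  else if (frac == 0)
    /* Signed zero.  */
    bits = sign;
  else
    {
      /* Subnormal half: normalize, since float can represent it
	 as a normal number.  */
      exp = 113;
      while ((frac & 0x400) == 0)
	{
	  frac <<= 1;
	  exp--;
	}
      bits = sign | (exp << 23) | ((frac & 0x3ff) << 13);
    }

  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

Widening is exact, so no rounding is involved here; the truncating direction is where rounding modes start to matter.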

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-06 Thread H.J. Lu via Gcc-patches
On Tue, Jul 6, 2021 at 3:15 AM Richard Biener via Gcc-patches wrote: > > On Tue, Jul 6, 2021 at 10:46 AM Hongtao Liu wrote: > > > > On Thu, Jul 1, 2021 at 9:04 PM Jakub Jelinek via Gcc-patches > > wrote: > > > > > > On Thu, Jul 01, 2021 at 02:58:01PM +0200, Richard Biener wrote: > > > > > The ma

PING^1 [PATCH] Add TARGET_IFUNC_REF_LOCAL_OK

2021-07-06 Thread H.J. Lu via Gcc-patches
On Sat, Jun 19, 2021 at 7:10 PM H.J. Lu wrote: > > 1. On some targets, like PowerPC, reference to ifunc function resolver > must be non-local so that compiler will properly emit PLT call. Add > TARGET_IFUNC_REF_LOCAL_OK to allow binding indirect function resolver > locally for targets which don't

[PATCH] libffi/x86: Always check __x86_64__ for x86 hosts

2021-07-05 Thread H.J. Lu via Gcc-patches
Since for gnux32 hosts, -m32 generates i386 code, always check __x86_64__ for x86 hosts. PR libffi/101336 * configure.host: Always check __x86_64__ for x86 hosts. --- libffi/configure.host | 21 +++-- 1 file changed, 7 insertions(+), 14 deletions(-) diff --git a/

Re: Commit: Update libiberty sources

2021-07-04 Thread H.J. Lu via Gcc-patches
On Sun, Jul 4, 2021 at 5:56 AM Alan Modra wrote: > > On Sat, Jul 03, 2021 at 01:56:22PM +0100, Nick Clifton via Binutils wrote: > > Hi Guys, > > > > I am applying the attached file to synchronize our libiberty sources > > with gcc. > > This lost commit 50ad1254d503, "GCC: Pass --plugin to AR a

[PATCH] Sync with binutils: GCC: Pass --plugin to AR and RANLIB

2021-07-04 Thread H.J. Lu via Gcc-patches
Sync with binutils for building binutils with LTO: From 50ad1254d5030d0804cbf89c758359ae202e8d55 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Sat, 9 Jan 2021 06:43:11 -0800 Subject: [PATCH] GCC: Pass --plugin to AR and RANLIB Detect GCC LTO plugin. Pass --plugin to AR and RANLIB to support L

[PATCH] Don't use vec_duplicate on vector in CTOR expansion

2021-07-02 Thread H.J. Lu via Gcc-patches
Since vec_duplicate only works on scalar, don't use it on vector in store constructor expansion. gcc/ PR middle-end/101294 * expr.c (store_constructor): Don't use vec_duplicate on vector. gcc/testsuite/ PR middle-end/101294 * gcc.dg/pr101294.c: New test. --- gcc

Re: [PATCH] Add optional _Float16 support

2021-07-01 Thread H.J. Lu via Gcc-patches
On Thu, Jul 1, 2021 at 3:40 PM Joseph Myers wrote: > > On Thu, 1 Jul 2021, H.J. Lu wrote: > > > BTW, _Float16 software emulation may require more than just SSE > > since we need to do _Float16 load and store with XMM registers. > > There is no 16bit load/store for XMM registers without AVX512FP16.

Re: [PATCH] Add optional _Float16 support

2021-07-01 Thread H.J. Lu via Gcc-patches
On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers wrote: > > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote: > > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers. > > That restricts use of _Float16 to processors with SSE. Is that what we > wa

[PATCH] Add optional _Float16 support

2021-07-01 Thread H.J. Lu via Gcc-patches
1. Pass _Float16 and _Complex _Float16 values on stack. 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers. --- low-level-sys-info.tex | 57 +- 1 file changed, 40 insertions(+), 17 deletions(-) diff --git a/low-level-sys-info.tex b/low

Re: [PATCH] soft-fp: Update soft-fp from glibc

2021-07-01 Thread H.J. Lu via Gcc-patches
On Thu, Jul 1, 2021 at 1:12 PM Jeff Law wrote: > > > > On 7/1/2021 1:28 PM, H.J. Lu via Gcc-patches wrote: > > This patch is updating soft-fp from glibc: > > > > 1. Add __extendhfxf2 to return an IEEE half converted to IEEE extended. > > 2. Add __truncxfhf2

[PATCH] soft-fp: Update soft-fp from glibc

2021-07-01 Thread H.J. Lu via Gcc-patches
This patch is updating soft-fp from glibc: 1. Add __extendhfxf2 to return an IEEE half converted to IEEE extended. 2. Add __truncxfhf2 to truncate IEEE extended into IEEE half. These are needed by x86 _Float16 support. --- libgcc/soft-fp/extendhfxf2.c | 53 l

[PATCH v5 06/11] x86: Add tests for piecewise move and store

2021-07-01 Thread H.J. Lu via Gcc-patches
* gcc.target/i386/pieces-memcpy-10.c: New test. * gcc.target/i386/pieces-memcpy-11.c: Likewise. * gcc.target/i386/pieces-memcpy-12.c: Likewise. * gcc.target/i386/pieces-memcpy-13.c: Likewise. * gcc.target/i386/pieces-memcpy-14.c: Likewise. * gcc.targe

[PATCH v5 11/11] x86: Also pass -mno-sse to vect8-ret.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-sse to vect8-ret.c to disable XMM load/store when running GCC tests with "-march=x86-64 -m32". * gcc.target/i386/vect8-ret.c: Also pass -mno-sse. --- gcc/testsuite/gcc.target/i386/vect8-ret.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/

[PATCH v5 10/11] x86: Update gcc.target/i386/incoming-11.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Expect no stack realignment since we no longer realign stack when copying data. * gcc.target/i386/incoming-11.c: Expect no stack realignment. --- gcc/testsuite/gcc.target/i386/incoming-11.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386/i

[PATCH v5 09/11] x86: Also pass -mno-avx to sw-1.c for ia32

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-avx to sw-1.c for ia32 since copying data with YMM or ZMM registers disables shrink-wrapping when the second argument is passed on stack. * gcc.target/i386/sw-1.c: Also pass -mno-avx for ia32. --- gcc/testsuite/gcc.target/i386/sw-1.c | 1 + 1 file changed, 1 insertion(+) d

[PATCH v5 04/11] x86: Update piecewise move and store

2021-07-01 Thread H.J. Lu via Gcc-patches
We can use TImode/OImode/XImode integers for piecewise move and store. 1. Define MAX_MOVE_MAX to 64, which is the constant maximum number of bytes that a single instruction can move quickly between memory and registers or between two memory locations. 2. Define MOVE_MAX to MOVE_MAX_PIECES, which i

[PATCH v5 07/11] x86: Also pass -mno-avx to pr72839.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-avx to pr72839.c to avoid copying data with YMM or ZMM registers. * gcc.target/i386/pr72839.c: Also pass -mno-avx. --- gcc/testsuite/gcc.target/i386/pr72839.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386/pr72839.c b/gcc/

[PATCH v5 05/11] x86: Add AVX2 tests for PR middle-end/90773

2021-07-01 Thread H.J. Lu via Gcc-patches
PR middle-end/90773 * gcc.target/i386/pr90773-20.c: New test. * gcc.target/i386/pr90773-21.c: Likewise. * gcc.target/i386/pr90773-22.c: Likewise. * gcc.target/i386/pr90773-23.c: Likewise. * gcc.target/i386/pr90773-26.c: Likewise. --- gcc/testsuite/gc

[PATCH v5 00/11] Allow TImode/OImode/XImode in op_by_pieces operations

2021-07-01 Thread H.J. Lu via Gcc-patches
Changes in the v5 patches: 1. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard scratch register to avoid stack realignment when expanding memset. 2. Use vec_duplicate, instead of adding TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE, to expand memset if available. Change

[PATCH v5 08/11] x86: Also pass -mno-avx to cold-attribute-1.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-avx to cold-attribute-1.c to avoid copying data with YMM or ZMM registers. * gcc.target/i386/cold-attribute-1.c: Also pass -mno-avx. --- gcc/testsuite/gcc.target/i386/cold-attribute-1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386

[PATCH v5 02/11] x86: Add TARGET_GEN_MEMSET_SCRATCH_RTX

2021-07-01 Thread H.J. Lu via Gcc-patches
Define TARGET_GEN_MEMSET_SCRATCH_RTX to ix86_gen_scratch_sse_rtx to return a scratch SSE register for memset. gcc/ PR middle-end/90773 * config/i386/i386.c (TARGET_GEN_MEMSET_SCRATCH_RTX): New. gcc/testsuite/ PR middle-end/90773 * gcc.target/i386/pr90773-15.c: Ne

[PATCH v5 01/11] Rewrite memset with TARGET_GEN_MEMSET_SCRATCH_RTX

2021-07-01 Thread H.J. Lu via Gcc-patches
1. Rewrite builtin_memset_read_str/builtin_memset_gen_str to use vector broadcast to duplicate QI value to TI/OI/XI value for memset. 2. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard scratch register to avoid stack realignment when expanding memset. PR middle-end/90

[PATCH v5 03/11] x86: Avoid stack realignment when copying data

2021-07-01 Thread H.J. Lu via Gcc-patches
To avoid stack realignment, use SCRATCH_SSE_REG to copy data from one memory location to another. gcc/ * config/i386/i386-expand.c (ix86_expand_vector_move): Call ix86_gen_scratch_sse_rtx to get a scratch SSE register to copy data from one memory location to another. gcc/

Re: [PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-07-01 Thread H.J. Lu via Gcc-patches
Hi Uros, On Thu, Jul 1, 2021 at 1:32 AM Hongtao Liu wrote: > > On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > > operands to vector broadcast from an integer with AVX. > > 2. Add ix86_gen_scratch_sse_rtx to return a

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread H.J. Lu via Gcc-patches
On Thu, Jul 1, 2021 at 4:10 AM Uros Bizjak wrote: > > [Sorry for double post, gcc-patches address was wrong in original post] > > On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > > > Hi: > > AVX512FP16 is disclosed, refer to [1]. > > There're 100+ instructions for AVX512FP16, 67 gcc patches

Re: [PATCH v5 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-28 Thread H.J. Lu via Gcc-patches
On Sun, Jun 27, 2021 at 6:43 PM Hongtao Liu wrote: > > On Sun, Jun 27, 2021 at 4:02 AM H.J. Lu wrote: > > > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > > operands to vector broadcast from an integer with AVX2. > > 2. Add ix86_gen_scratch_sse_rtx to return a scratch

[PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-28 Thread H.J. Lu via Gcc-patches
1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR operands to vector broadcast from an integer with AVX. 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which won't increase stack alignment requirement and blocks transformation by the combine pass. A small be

[PATCH v6 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-28 Thread H.J. Lu via Gcc-patches
Changes in the v6 patch: 1. Update SI/DI broadcast with AVX. 2. Require non-standard SSE constant integer broadcast with AVX. 3. Use nonimmediate_operand in vec_duplicate and verify that it never fails. Changes in the v5 patch: 1. Allow AVX with SI/DI broadcast. 2. Add a comment for broadcasting

[PATCH v6 2/2] x86: Add vec_duplicate expander

2021-06-28 Thread H.J. Lu via Gcc-patches
Add vec_duplicate expander for SSE2 if we can move from GPR to SSE register directly. * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate): Make it global. * config/i386/i386-protos.h (ix86_expand_vector_init_duplicate): New prototype. * config/i3

Re: [PATCH v5 2/2] x86: Add vec_duplicate expander

2021-06-28 Thread H.J. Lu via Gcc-patches
On Mon, Jun 28, 2021 at 5:36 AM Richard Sandiford wrote: > > "H.J. Lu" writes: > > On Sun, Jun 27, 2021 at 2:00 PM Richard Sandiford > > wrote: > >> > >> "H.J. Lu via Gcc-patches" writes: > >> > On Sun, Jun 27, 2021 at 1:43 A

Re: [PATCH v5 2/2] x86: Add vec_duplicate expander

2021-06-28 Thread H.J. Lu via Gcc-patches
On Sun, Jun 27, 2021 at 2:00 PM Richard Sandiford wrote: > > "H.J. Lu via Gcc-patches" writes: > > On Sun, Jun 27, 2021 at 1:43 AM Richard Sandiford > > wrote: > >> > >> "H.J. Lu" writes: > >> > 1. Update vec_duplicate to

Re: [PATCH v5 2/2] x86: Add vec_duplicate expander

2021-06-27 Thread H.J. Lu via Gcc-patches
On Sun, Jun 27, 2021 at 1:43 AM Richard Sandiford wrote: > > "H.J. Lu" writes: > > 1. Update vec_duplicate to allow to fail so that backend can only allow > > broadcasting an integer constant to a vector when broadcast instruction > > is available. This can be used by memset expander to avoid ve

[PATCH v5 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-26 Thread H.J. Lu via Gcc-patches
1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR operands to vector broadcast from an integer with AVX2. 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which won't increase stack alignment requirement and blocks transformation by the combine pass. A small be

[PATCH v5 2/2] x86: Add vec_duplicate expander

2021-06-26 Thread H.J. Lu via Gcc-patches
1. Update vec_duplicate so that it is allowed to fail, which lets the backend permit broadcasting an integer constant to a vector only when a broadcast instruction is available. This can be used by memset expander to avoid vec_duplicate when loading from constant pool is more efficient. 2. Add vec_duplicate expander

[PATCH v5 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-26 Thread H.J. Lu via Gcc-patches
Changes in the v5 patch: 1. Allow AVX with SI/DI broadcast. 2. Add a comment for broadcasting to V64QI and V32HI with AVX512F, but without AVX512BW. --- 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR operands to vector broadcast from an integer with AVX2. 2. Add ix86_gen_s

Re: PING^1 [PATCH v4 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-26 Thread H.J. Lu via Gcc-patches
e:INT_BROADCAST_MODE > > > >+ (match_operand: 1 "general_operand")))] > > > >+ "TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC" > > > Why not directly use TARGET_AVX2 here, not in > > > ix86_expand_integer_vec_dupl

Re: PING^1 [PATCH v4 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-26 Thread H.J. Lu via Gcc-patches
d[1] to be const_int but > > not const0_rtx and constm1_rtx. > > w/ this, guess there's no need for FAIL? and you can directly > > call ix86_expand_vector_init_duplicate. general_operand is used to allow both variable and constant broadcast so that I can us

[PATCH v2] x86: Check AVX512 without mask instructions

2021-06-25 Thread H.J. Lu via Gcc-patches
On Fri, Jun 25, 2021 at 12:50 AM Uros Bizjak wrote: > > On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu wrote: > > > > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches > > wrote: > > > > > > On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu wrote: > > > > > > > > CPUID functions are used to detec

Re: [PATCH] x86: Compile CPUID functions with -mgeneral-regs-only

2021-06-24 Thread H.J. Lu via Gcc-patches
On Thu, Jun 24, 2021 at 9:12 AM Uros Bizjak wrote: > > On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu wrote: > > > > CPUID functions are used to detect CPU features. If vector ISAs > > are enabled, compiler is free to use them in these functions. Add > > __attribute__ ((target("general-regs-only"))) t

PING^1 [PATCH v4 0/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-06-24 Thread H.J. Lu via Gcc-patches
On Wed, Jun 9, 2021 at 4:39 PM H.J. Lu wrote: > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > operands to vector broadcast from an integer with AVX2. > 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which > won't increase stack alignment requirement a

[PATCH v2 2/2] Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

2021-06-24 Thread H.J. Lu via Gcc-patches
Generate the marker for -fno-direct-extern-access to indicate that the object file uses GOT to access all external symbols. Access to protected symbols in the resulting shared library is treated as local, which requires canonical function pointers and cannot be used with copy relocation.

[PATCH v2 0/2] Implement indirect external access

2021-06-24 Thread H.J. Lu via Gcc-patches
Changes in the v2 patch. 1. Rename the option to -fdirect-extern-access. --- On systems with copy relocation: * A copy in executable is created for the definition in a shared library at run-time by ld.so. * The copy is referenced by executable and shared libraries. * Executable can access the cop

[PATCH v2 1/2] Add -f[no-]direct-extern-access

2021-06-24 Thread H.J. Lu via Gcc-patches
-fdirect-extern-access is the default. With -fno-direct-extern-access: 1. Always use GOT to access undefined data and function symbols, including in PIE and non-PIE. These will avoid copy relocations in executables. This is compatible with existing executables and shared libraries. 2.

Re: [PATCH] x86: Compile CPUID functions with -mgeneral-regs-only

2021-06-24 Thread H.J. Lu via Gcc-patches
On Thu, Jun 24, 2021 at 5:47 AM Richard Biener wrote: > > On Thu, Jun 24, 2021 at 2:42 PM H.J. Lu wrote: > > > > On Thu, Jun 24, 2021 at 5:35 AM Richard Biener > > wrote: > > > > > > On Thu, Jun 24, 2021 at 2:13 PM H.J. Lu via Gcc-patches > > >

Re: [PATCH] x86: Compile CPUID functions with -mgeneral-regs-only

2021-06-24 Thread H.J. Lu via Gcc-patches
On Thu, Jun 24, 2021 at 5:35 AM Richard Biener wrote: > > On Thu, Jun 24, 2021 at 2:13 PM H.J. Lu via Gcc-patches > wrote: > > > > CPUID functions are used to detect CPU features. If vector ISAs > > are enabled, compiler is free to use them in these functions. Add

[PATCH] x86: Compile CPUID functions with -mgeneral-regs-only

2021-06-24 Thread H.J. Lu via Gcc-patches
CPUID functions are used to detect CPU features. If vector ISAs are enabled, compiler is free to use them in these functions. Add __attribute__ ((target("general-regs-only"))) to CPUID functions to avoid vector instructions. gcc/ PR target/101185 * config/i386/cpuid.h (__get_cpu

Re: [PATCH 2/2] elf: Add GNU_PROPERTY_1_NEEDED check

2021-06-22 Thread H.J. Lu via Gcc-patches
On Tue, Jun 22, 2021 at 11:15 AM Fangrui Song wrote: > > On 2021-06-22, H.J. Lu wrote: > >On Mon, Jun 21, 2021 at 10:46 PM Fangrui Song wrote: > >> > >> On 2021-06-21, H.J. Lu wrote: > >> >On Mon, Jun 21, 2021 at 9:16 PM Alan Modra wrote: > >> >> > >> >> On Mon, Jun 21, 2021 at 07:12:02PM -0700,

Re: [PATCH 2/2] elf: Add GNU_PROPERTY_1_NEEDED check

2021-06-22 Thread H.J. Lu via Gcc-patches
On Mon, Jun 21, 2021 at 10:46 PM Fangrui Song wrote: > > On 2021-06-21, H.J. Lu wrote: > >On Mon, Jun 21, 2021 at 9:16 PM Alan Modra wrote: > >> > >> On Mon, Jun 21, 2021 at 07:12:02PM -0700, H.J. Lu wrote: > >> > On Mon, Jun 21, 2021 at 5:06 PM Alan Modra wrote: > >> > > > >> > > On Mon, Jun 21

[PATCH 2/2] Add TARGET_ASM_EMIT_GNU_PROPERTY_NOTE

2021-06-20 Thread H.J. Lu via Gcc-patches
Generate the GNU_PROPERTY_1_NEEDED_SINGLE_GLOBAL_DEFINITION marker for -fsingle-global-definition to indicate that the object file requires canonical function pointers and cannot be used with copy relocation. * configure.ac (HAVE_LD_SINGLE_GLOBAL_DEFINITION_SUPPORT): New. Define to
