On Mon, Jul 26, 2021 at 2:53 PM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford
> > wrote:
> >>
> >> "H.J. Lu via Gcc-patches" writes:
> >> > +to avoid st
On Mon, Jul 26, 2021 at 11:42 AM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > +/* Callback routine for store_by_pieces. Read GET_MODE_BITSIZE (MODE)
> > + bytes from constant string DATA + OFFSET and return it as target
> > + consta
On Thu, Jul 22, 2021 at 7:17 PM Hongtao Liu wrote:
>
> On Thu, Jul 22, 2021 at 8:34 PM H.J. Lu via Gcc-patches
> wrote:
> >
> > For
> >
> > (set (reg:V32QI 108)
> > (const_vector:V32QI [
> > (const_int -1 [0x]
Don't return hard register in ix86_gen_scratch_sse_rtx when LRA is in
progress to avoid ICE when there are no available hard registers for
LRA.
gcc/
PR target/101504
* config/i386/i386.c (ix86_gen_scratch_sse_rtx): Don't return
hard register when LRA is in progress.
gcc/t
For
(set (reg:V32QI 108)
(const_vector:V32QI [
(const_int -1 [0x]) repeated x32
])) "x.c":15:5 1573 {movv32qi_internal}
(expr_list:REG_EQUIV (const_vector:V32QI [
(const_int -1 [0x]) repeated x32
]
On Wed, Jul 21, 2021 at 12:42 PM Richard Sandiford
wrote:
>
> Richard Sandiford writes:
> > "H.J. Lu via Gcc-patches" writes:
> >> On Wed, Jul 21, 2021 at 7:50 AM Richard Sandiford
> >> wrote:
> >>>
> >>> "H.J. Lu"
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces
infrastructure to allow non-integer mode.
2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size
to return QI vector mode for memset.
3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the
smallest integ
On Wed, Jul 21, 2021 at 12:20 PM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > On Wed, Jul 21, 2021 at 7:50 AM Richard Sandiford
> > wrote:
> >>
> >> "H.J. Lu" writes:
> >> > diff --git a/gcc/buil
On Wed, Jul 21, 2021 at 7:50 AM Richard Sandiford
wrote:
>
> "H.J. Lu" writes:
> > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > index 39ab139b7e1..1972301ce3c 100644
> > --- a/gcc/builtins.c
> > +++ b/gcc/builtins.c
> > @@ -3890,13 +3890,16 @@ expand_builtin_strnlen (tree exp, rtx target,
>
Since
commit 39671f87b2df6a1894cc11a161e4a7949d1ddccd
Author: H.J. Lu
Date: Thu Apr 15 05:59:48 2021 -0700
x86: Use crc32 target option for CRC32 intrinsics
enabled OPTION_MASK_ISA_CRC32 for -msse4 and removed TARGET_SSE4_2 check
in sse4_2_crc32 patterns, remove OPTION_MASK_ISA_SSE4_2 from
On Tue, Jul 20, 2021 at 1:58 PM Jeff Law wrote:
>
>
>
> On 7/20/2021 2:55 PM, H.J. Lu via Gcc-patches wrote:
> > 1. Sync with upstream commit 7704fedfff6ef5676adb6415f3be0ac927d1a746
> > 2. Apply local patches
> > 3. Update LOCAL_PATCHES
> > 4. Bump asan/t
---
libsanitizer/asan/asan_globals.cpp| 19 --
libsanitizer/asan/asan_interceptors.h | 7 ++-
libsanitizer/asan/asan_mapping.h | 2 +-
.../sanitizer_linux_libcdep.cpp | 4
.../sanitizer_common/sanitizer_mac.cpp| 12
1. Sync with upstream commit 7704fedfff6ef5676adb6415f3be0ac927d1a746
2. Apply local patches
3. Update LOCAL_PATCHES
4. Bump asan/tsan versions for the upstream commit:
commit acf0a6428681dccac803984bfbb1e3e54248f090
Author: Ilya Leoshkevich
Date: Fri Jul 2 02:42:38 2021 +0200
Bump asan/tsan versions for the upstream commit:
commit acf0a6428681dccac803984bfbb1e3e54248f090
Author: Ilya Leoshkevich
Date: Fri Jul 2 02:42:38 2021 +0200
[sanitizer] Fix __sanitizer_kernel_sigset_t endianness issue
setuid(0) hangs on SystemZ under TSan because TSan's BackgroundThr
* LOCAL_PATCHES: Update to the corresponding revision.
---
libsanitizer/LOCAL_PATCHES | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/libsanitizer/LOCAL_PATCHES b/libsanitizer/LOCAL_PATCHES
index b1969fc7882..d45655392b0 100644
--- a/libsanitizer/LOCAL_PATCHES
+++ b/l
On Tue, Jul 20, 2021 at 8:12 AM Richard Sandiford
wrote:
>
> Richard Sandiford via Gcc-patches writes:
> > "H.J. Lu via Gcc-patches" writes:
> >> On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford
> >> wrote:
>
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces
infrastructure to allow non-integer mode.
2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size
to return QI vector mode for memset.
3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the
smallest integ
On Tue, Jul 20, 2021 at 5:48 AM H.J. Lu wrote:
>
> On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford
> wrote:
> >
> > "H.J. Lu via Gcc-patches" writes:
> > >> > + {
> > >> > + /* First generate subreg of word mode if the
On Mon, Jul 19, 2021 at 11:38 PM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> >> > + {
> >> > + /* First generate subreg of word mode if the previous mode is
> >> > + wider than word mode and word
1. Replace scalar_int_mode with fixed_size_mode in the by-pieces
infrastructure to allow non-integer mode.
2. Rename widest_int_mode_for_size to widest_fixed_size_mode_for_size
to return QI vector mode for memset.
3. Add op_by_pieces_d::smallest_fixed_size_mode_for_size to return the
smallest integ
On Mon, Jul 19, 2021 at 7:41 AM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > On Fri, Jul 16, 2021 at 7:15 AM H.J. Lu wrote:
> >>
> >> On Fri, Jul 16, 2021 at 6:24 AM Richard Sandiford
> >> wrote:
> >> >
>
Don't issue vzeroupper before function call if callee returns AVX
register since callee must be compiled with AVX.
gcc/
PR target/101495
* config/i386/i386.c (ix86_check_avx_upper_stores): Moved before
ix86_avx_u128_mode_needed.
(ix86_avx_u128_mode_needed): Return
On Thu, Apr 22, 2021 at 7:30 AM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Apr 22, 2021 at 2:52 PM Richard Biener
> wrote:
> >
> > On Thu, Apr 22, 2021 at 2:22 PM Jakub Jelinek wrote:
> > >
> > > On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches
> > > wrote:
> > >
For -mgeneral-regs-only, enable the GPR only instructions which are
enabled implicitly by SSE ISAs unless they have been disabled explicitly.
gcc/
PR target/101492
* common/config/i386/i386-common.c (ix86_handle_option): For
-mgeneral-regs-only, enable the GPR only instruc
On Fri, Jul 16, 2021 at 7:15 AM H.J. Lu wrote:
>
> On Fri, Jul 16, 2021 at 6:24 AM Richard Sandiford
> wrote:
> >
> > "H.J. Lu via Gcc-patches" writes:
> > > On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford
> > > wrote:
> > >>
On Fri, Jul 16, 2021 at 6:24 AM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford
> > wrote:
> >>
> >> "H.J. Lu via Gcc-patches" writes:
> >> >
On Fri, Jul 16, 2021 at 4:38 AM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > 1. Rewrite builtin_memset_read_str and builtin_memset_gen_str with
> > vec_duplicate_optab to duplicate QI value to TI/OI/XI value.
> > 2. Add TARGET_GEN_MEMSET_S
On Thu, Jul 15, 2021 at 9:40 AM Tamar Christina via Gcc-patches
wrote:
>
> Hi All,
>
> These testcases accidentally contain the wrong signs for the expected values
> for the scalar code. The vector code however is correct.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Co
On Thu, Jul 15, 2021 at 6:36 PM Hongtao Liu wrote:
>
> On Fri, Jul 16, 2021 at 1:30 AM H.J. Lu via Gcc-patches
> wrote:
> >
> > In a single SET, all bits of the source YMM/ZMM register are zero when
> >
> > > 1. The source is constant zero.
> > 2. The source Y
In a single SET, all bits of the source YMM/ZMM register are zero when
1. The source is constant zero.
2. The source YMM/ZMM operand is defined from constant zero.
and we don't set AVX_U128_DIRTY.
gcc/
PR target/101456
* config/i386/i386.c (ix86_avx_u128_mode_needed): Don't set
On Tue, Jul 6, 2021 at 12:00 AM Richard Biener wrote:
>
>
> Status
> ==
>
> The GCC 11 branch is open for regression and documentation fixes.
> It's time for a GCC 11.2 release and we are aiming for a release
> candidate in about two weeks which would result in the GCC 11.2
> release about thr
On Wed, Jul 14, 2021 at 12:32 AM Matthias Kretz wrote:
>
> OK?
>
> On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote:
> > Library code, especially in headers, sometimes needs to know how the
> > compiler interprets / optimizes floating-point types and operations.
> > This information c
On Tue, Jul 13, 2021 at 9:35 PM Hongtao Liu wrote:
>
> On Wed, Jul 14, 2021 at 10:34 AM liuhongt wrote:
> >
> > By optimizing vector movement to broadcast in ix86_expand_vector_move
> > during pass_expand, pass_reload/LRA can automatically generate an avx512
> > embedded broadcast, pass_cpb is no
On Fri, Jun 25, 2021 at 5:39 AM H.J. Lu wrote:
>
> On Fri, Jun 25, 2021 at 12:50 AM Uros Bizjak wrote:
> >
> > On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu wrote:
> > >
> > > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches
> > > wrote:
> > > >
> > > > On Thu, Jun 24, 2021 at 2:12 PM
1. Rewrite builtin_memset_read_str and builtin_memset_gen_str with
vec_duplicate_optab to duplicate QI value to TI/OI/XI value.
2. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard
scratch register to avoid stack realignment when expanding memset.
PR middle-end/90773
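The idea of duplicating a QI (single-byte) value into a wider value before storing it can be illustrated in portable C. This is only a conceptual sketch, not the GCC implementation: it widens to a 64-bit integer rather than to TI/OI/XI vector modes, and the names broadcast_byte and fill_by_pieces are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all eight bytes of a 64-bit word.  */
uint64_t
broadcast_byte (uint8_t c)
{
  return (uint64_t) c * UINT64_C (0x0101010101010101);
}

/* Hypothetical helper: fill LEN bytes at DST with C, storing eight
   bytes at a time where possible and finishing with byte stores.  */
void
fill_by_pieces (unsigned char *dst, uint8_t c, size_t len)
{
  uint64_t wide = broadcast_byte (c);

  while (len >= sizeof wide)
    {
      /* memcpy keeps the store valid for unaligned destinations.  */
      memcpy (dst, &wide, sizeof wide);
      dst += sizeof wide;
      len -= sizeof wide;
    }

  /* Store the remaining tail one byte at a time.  */
  while (len-- > 0)
    *dst++ = c;
}
```

Calling fill_by_pieces (buf, 0x5A, 10) would perform one 8-byte store followed by two single-byte stores.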
On Mon, Jul 12, 2021 at 11:56 PM Jakub Jelinek wrote:
>
> On Mon, Jul 12, 2021 at 06:51:30PM -0700, H.J. Lu wrote:
> > @@ -404,9 +404,18 @@ const char *host_detect_local_cpu (int argc, const
> > char **argv)
> >if (argc < 1)
> > return NULL;
>
> I think it would be simpler to use 2 argum
On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers wrote:
>
> On Tue, 13 Jul 2021, H.J. Lu wrote:
>
> > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei
> > wrote:
> > >
> > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > >
> > > Can you please explain the behavior here? Is
From: llvm-dev On Behalf Of H.J. Lu
> > via llvm-dev
> > Sent: Friday, July 2, 2021 6:28 AM
> > To: Joseph Myers
> > Cc: llvm-...@lists.llvm.org; GCC Patches ;
> > GNU C Library ; IA32 System V Application
> > Binary Interface
> > Subject: Re: [llvm-
2 System V Application Binary
> Interface
> Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
>
> On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers wrote:
> >
> > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> >
> > > 2. Return _Float16 and _Comple
UINTR is available only in 64-bit mode. Since the codegen target is
unknown when the gcc driver is processing -march=native, to properly
handle UINTR for -march=native:
1. Pass arch[32|64] and tune[32|64] to host_detect_local_cpu to indicate
32-bit and 64-bit codegen.
2. Change ix86_option_ov
On Thu, Jul 1, 2021 at 8:22 AM H.J. Lu wrote:
>
> Changes in the v5 patches:
>
> 1. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard
> scratch register to avoid stack realignment when expanding memset.
> 2. Use vec_duplicate, instead of adding TARGET_READ_MEMSET_VALUE and
> TAR
On Sun, Jul 11, 2021 at 11:13 PM Richard Biener
wrote:
>
> On Fri, Jul 9, 2021 at 4:50 PM H.J. Lu wrote:
> >
> > -fdirect-extern-access is the default. With -fno-direct-extern-access:
> >
> > 1. Always use GOT to access undefined data and function symbols,
> >including in PIE and non-PIE. T
UINTR is available only in 64-bit mode. Since the codegen target is
unknown when the gcc driver is processing -march=native, to properly
handle UINTR for -march=native:
1. Add an undocumented option, -muintr-native.
2. Update the gcc driver to pass -muintr-native with -march=native if
UINTR i
On Thu, Jun 24, 2021 at 10:36 AM Andrew MacLeod via Gcc-patches
wrote:
>
>
> When looking for relations between equivalencies, a typo was causing the
> same bitmap to be checked for both operands, instead of the correct one
> for each. This caused us to never notice relations between equivalence
On Thu, Jul 8, 2021 at 2:51 AM Richard Biener via Gcc-patches
wrote:
>
> On Thu, Jul 8, 2021 at 9:37 AM Roger Sayle wrote:
> >
> >
> > Hi Richard,
> > Thanks. Yep, you've correctly diagnosed that the motivation for the
> > get_builtin_precision helper function was that the TREE_TYPE of the
>
On Mon, Jul 5, 2021 at 6:00 PM H.J. Lu wrote:
>
> Since for gnux32 hosts, -m32 generates i386 code, always check __x86_64__
> for x86 hosts.
>
> PR libffi/101336
> * configure.host: Always check __x86_64__ for x86 hosts.
> ---
> libffi/configure.host | 21 +++--
>
Generate the marker for -fno-direct-extern-access to indicate that the
object file uses GOT to access all external symbols. Access to protected
symbols in the resulting shared library is treated as local, which requires
canonical function pointers and cannot be used with copy relocation.
GNU_PROP
Changes in the v2 patch.
1. GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to
GNU binutils 2.38. But the -z indirect-extern-access linker option is
only available for Linux/x86. However, the --max-cache-size=SIZE linker
option was also added within a day. --max-cache-size=
-fdirect-extern-access is the default. With -fno-direct-extern-access:
1. Always use GOT to access undefined data and function symbols,
including in PIE and non-PIE. These will avoid copy relocations
in executables. This is compatible with existing executables and
shared libraries.
2.
From: liuhongt
1. Add __extendhfdf2/__extendhfsf2 to return an IEEE half converted to
IEEE double/single.
2. Add __truncdfhf2/__truncsfhf2 to truncate IEEE double/single into
IEEE half.
3. Add __eqhf2/__nehf2 to return 0 if a == b and a,b are not NAN,
otherwise return 1.
These are needed by x86
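The return convention described for __eqhf2/__nehf2 (0 when the operands compare equal and neither is a NaN, 1 otherwise) can be sketched on raw IEEE half bit patterns. This is a minimal illustration under stated assumptions, not the soft-fp code: the real routines take half-precision arguments, while this sketch takes uint16_t encodings, and the names is_nan_hf and eqhf2_sketch are hypothetical.

```c
#include <stdint.h>

/* An IEEE binary16 value is a NaN when its 5-bit exponent is all
   ones and its 10-bit fraction is nonzero.  */
int
is_nan_hf (uint16_t x)
{
  return (x & 0x7C00u) == 0x7C00u && (x & 0x03FFu) != 0;
}

/* Return 0 if A == B and neither is a NaN, otherwise return 1.  */
int
eqhf2_sketch (uint16_t a, uint16_t b)
{
  if (is_nan_hf (a) || is_nan_hf (b))
    return 1;

  /* +0 and -0 differ only in the sign bit but compare equal.  */
  if (((a | b) & 0x7FFFu) == 0)
    return 0;

  return a != b;
}
```

With this convention a caller tests the result against zero, so a NaN operand always reads as "not equal".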
On Tue, Jul 6, 2021 at 3:15 AM Richard Biener via Gcc-patches
wrote:
>
> On Tue, Jul 6, 2021 at 10:46 AM Hongtao Liu wrote:
> >
> > On Thu, Jul 1, 2021 at 9:04 PM Jakub Jelinek via Gcc-patches
> > wrote:
> > >
> > > On Thu, Jul 01, 2021 at 02:58:01PM +0200, Richard Biener wrote:
> > > > > The ma
On Sat, Jun 19, 2021 at 7:10 PM H.J. Lu wrote:
>
> 1. On some targets, like PowerPC, reference to ifunc function resolver
> must be non-local so that compiler will properly emit PLT call. Add
> TARGET_IFUNC_REF_LOCAL_OK to allow binding indirect function resolver
> locally for targets which don't
Since for gnux32 hosts, -m32 generates i386 code, always check __x86_64__
for x86 hosts.
PR libffi/101336
* configure.host: Always check __x86_64__ for x86 hosts.
---
libffi/configure.host | 21 +++--
1 file changed, 7 insertions(+), 14 deletions(-)
diff --git a/
On Sun, Jul 4, 2021 at 5:56 AM Alan Modra wrote:
>
> On Sat, Jul 03, 2021 at 01:56:22PM +0100, Nick Clifton via Binutils wrote:
> > Hi Guys,
> >
> > I am applying the attached file to synchronize our libiberty sources
> > with gcc.
>
> This lost commit 50ad1254d503, "GCC: Pass --plugin to AR a
Sync with binutils for building binutils with LTO:
From 50ad1254d5030d0804cbf89c758359ae202e8d55 Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Sat, 9 Jan 2021 06:43:11 -0800
Subject: [PATCH] GCC: Pass --plugin to AR and RANLIB
Detect GCC LTO plugin. Pass --plugin to AR and RANLIB to support L
Since vec_duplicate only works on scalar, don't use it on vector in
store constructor expansion.
gcc/
PR middle-end/101294
* expr.c (store_constructor): Don't use vec_duplicate on vector.
gcc/testsuite/
PR middle-end/101294
* gcc.dg/pr101294.c: New test.
---
gcc
On Thu, Jul 1, 2021 at 3:40 PM Joseph Myers wrote:
>
> On Thu, 1 Jul 2021, H.J. Lu wrote:
>
> > BTW, _Float16 software emulation may require more than just SSE
> > since we need to do _Float16 load and store with XMM registers.
> > There is no 16bit load/store for XMM registers without AVX512FP16.
On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers wrote:
>
> On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
>
> > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
>
> That restricts use of _Float16 to processors with SSE. Is that what we
> wa
1. Pass _Float16 and _Complex _Float16 values on stack.
2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
---
low-level-sys-info.tex | 57 +-
1 file changed, 40 insertions(+), 17 deletions(-)
diff --git a/low-level-sys-info.tex b/low
On Thu, Jul 1, 2021 at 1:12 PM Jeff Law wrote:
>
>
>
> On 7/1/2021 1:28 PM, H.J. Lu via Gcc-patches wrote:
> > This patch is updating soft-fp from glibc:
> >
> > 1. Add __extendhfxf2 to return an IEEE half converted to IEEE extended.
> > 2. Add __truncxfhf2
This patch is updating soft-fp from glibc:
1. Add __extendhfxf2 to return an IEEE half converted to IEEE extended.
2. Add __truncxfhf2 to truncate IEEE extended into IEEE half.
These are needed by x86 _Float16 support.
---
libgcc/soft-fp/extendhfxf2.c | 53
l
* gcc.target/i386/pieces-memcpy-10.c: New test.
* gcc.target/i386/pieces-memcpy-11.c: Likewise.
* gcc.target/i386/pieces-memcpy-12.c: Likewise.
* gcc.target/i386/pieces-memcpy-13.c: Likewise.
* gcc.target/i386/pieces-memcpy-14.c: Likewise.
* gcc.targe
Also pass -mno-sse to vect8-ret.c to disable XMM load/store when running
GCC tests with "-march=x86-64 -m32".
* gcc.target/i386/vect8-ret.c: Also pass -mno-sse.
---
gcc/testsuite/gcc.target/i386/vect8-ret.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/testsuite/
Expect no stack realignment since we no longer realign stack when
copying data.
* gcc.target/i386/incoming-11.c: Expect no stack realignment.
---
gcc/testsuite/gcc.target/i386/incoming-11.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/testsuite/gcc.target/i386/i
Also pass -mno-avx to sw-1.c for ia32 since copying data with YMM or ZMM
registers disables shrink-wrapping when the second argument is passed on
stack.
* gcc.target/i386/sw-1.c: Also pass -mno-avx for ia32.
---
gcc/testsuite/gcc.target/i386/sw-1.c | 1 +
1 file changed, 1 insertion(+)
d
We can use TImode/OImode/XImode integers for piecewise move and store.
1. Define MAX_MOVE_MAX to 64, which is the constant maximum number of
bytes that a single instruction can move quickly between memory and
registers or between two memory locations.
2. Define MOVE_MAX to MOVE_MAX_PIECES, which i
Also pass -mno-avx to pr72839.c to avoid copying data with YMM or ZMM
registers.
* gcc.target/i386/pr72839.c: Also pass -mno-avx.
---
gcc/testsuite/gcc.target/i386/pr72839.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/testsuite/gcc.target/i386/pr72839.c
b/gcc/
PR middle-end/90773
* gcc.target/i386/pr90773-20.c: New test.
* gcc.target/i386/pr90773-21.c: Likewise.
* gcc.target/i386/pr90773-22.c: Likewise.
* gcc.target/i386/pr90773-23.c: Likewise.
* gcc.target/i386/pr90773-26.c: Likewise.
---
gcc/testsuite/gc
Changes in the v5 patches:
1. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard
scratch register to avoid stack realignment when expanding memset.
2. Use vec_duplicate, instead of adding TARGET_READ_MEMSET_VALUE and
TARGET_GEN_MEMSET_VALUE, to expand memset if available.
Change
Also pass -mno-avx to cold-attribute-1.c to avoid copying data with YMM or ZMM
registers.
* gcc.target/i386/cold-attribute-1.c: Also pass -mno-avx.
---
gcc/testsuite/gcc.target/i386/cold-attribute-1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/testsuite/gcc.target/i386
Define TARGET_GEN_MEMSET_SCRATCH_RTX to ix86_gen_scratch_sse_rtx to
return a scratch SSE register for memset.
gcc/
PR middle-end/90773
* config/i386/i386.c (TARGET_GEN_MEMSET_SCRATCH_RTX): New.
gcc/testsuite/
PR middle-end/90773
* gcc.target/i386/pr90773-15.c: Ne
1. Rewrite builtin_memset_read_str/builtin_memset_gen_str to use vector
broadcast to duplicate QI value to TI/OI/XI value for memset.
2. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard
scratch register to avoid stack realignment when expanding memset.
PR middle-end/90
To avoid stack realignment, use SCRATCH_SSE_REG to copy data from one
memory location to another.
gcc/
* config/i386/i386-expand.c (ix86_expand_vector_move): Call
ix86_gen_scratch_sse_rtx to get a scratch SSE register to copy
data from one memory location to another.
gcc/
Hi Uros,
On Thu, Jul 1, 2021 at 1:32 AM Hongtao Liu wrote:
>
> On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote:
> >
> > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
> > operands to vector broadcast from an integer with AVX.
> > 2. Add ix86_gen_scratch_sse_rtx to return a
On Thu, Jul 1, 2021 at 4:10 AM Uros Bizjak wrote:
>
> [Sorry for double post, gcc-patches address was wrong in original post]
>
> On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote:
> >
> > Hi:
> > AVX512FP16 is disclosed, refer to [1].
> > There're 100+ instructions for AVX512FP16, 67 gcc patches
On Sun, Jun 27, 2021 at 6:43 PM Hongtao Liu wrote:
>
> On Sun, Jun 27, 2021 at 4:02 AM H.J. Lu wrote:
> >
> > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
> > operands to vector broadcast from an integer with AVX2.
> > 2. Add ix86_gen_scratch_sse_rtx to return a scratch
1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
operands to vector broadcast from an integer with AVX.
2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
won't increase stack alignment requirement and blocks transformation by
the combine pass.
A small be
Changes in the v6 patch:
1. Update SI/DI broadcast with AVX.
2. Require non-standard SSE constant integer broadcast with AVX.
3. Use nonimmediate_operand in vec_duplicate and verify that it
never fails.
Changes in the v5 patch:
1. Allow AVX with SI/DI broadcast.
2. Add a comment for broadcasting
Add vec_duplicate expander for SSE2 if we can move from GPR to SSE
register directly.
* config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
Make it global.
* config/i386/i386-protos.h (ix86_expand_vector_init_duplicate):
New prototype.
* config/i3
On Mon, Jun 28, 2021 at 5:36 AM Richard Sandiford
wrote:
>
> "H.J. Lu" writes:
> > On Sun, Jun 27, 2021 at 2:00 PM Richard Sandiford
> > wrote:
> >>
> >> "H.J. Lu via Gcc-patches" writes:
> >> > On Sun, Jun 27, 2021 at 1:43 A
On Sun, Jun 27, 2021 at 2:00 PM Richard Sandiford
wrote:
>
> "H.J. Lu via Gcc-patches" writes:
> > On Sun, Jun 27, 2021 at 1:43 AM Richard Sandiford
> > wrote:
> >>
> >> "H.J. Lu" writes:
> >> > 1. Update vec_duplicate to
On Sun, Jun 27, 2021 at 1:43 AM Richard Sandiford
wrote:
>
> "H.J. Lu" writes:
> > 1. Update vec_duplicate to allow it to fail so that backend can only allow
> > broadcasting an integer constant to a vector when broadcast instruction
> > is available. This can be used by memset expander to avoid ve
1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
operands to vector broadcast from an integer with AVX2.
2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
won't increase stack alignment requirement and blocks transformation by
the combine pass.
A small be
1. Update vec_duplicate to allow it to fail so that backend can only allow
broadcasting an integer constant to a vector when broadcast instruction
is available. This can be used by memset expander to avoid vec_duplicate
when loading from constant pool is more efficient.
2. Add vec_duplicate expander
Changes in the v5 patch:
1. Allow AVX with SI/DI broadcast.
2. Add a comment for broadcasting to V64QI and V32HI with AVX512F, but
without AVX512BW.
---
1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
operands to vector broadcast from an integer with AVX2.
2. Add ix86_gen_s
e:INT_BROADCAST_MODE
> > > >+ (match_operand: 1 "general_operand")))]
> > > >+ "TARGET_SSE2 && TARGET_INTER_UNIT_MOVES_TO_VEC"
> > > Why not directly use TARGET_AVX2 here, not in
> > > ix86_expand_integer_vec_dupl
d[1] to be const_int but
> > not const0_rtx and constm1_rtx.
> > w/ this, guess there's no need for FAIL? and you can directly
> > call ix86_expand_vector_init_duplicate.
general_operand is used to allow both variable and constant broadcast
so that I can us
On Fri, Jun 25, 2021 at 12:50 AM Uros Bizjak wrote:
>
> On Fri, Jun 25, 2021 at 4:51 AM Hongtao Liu wrote:
> >
> > On Fri, Jun 25, 2021 at 12:13 AM Uros Bizjak via Gcc-patches
> > wrote:
> > >
> > > On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu wrote:
> > > >
> > > > CPUID functions are used to detec
On Thu, Jun 24, 2021 at 9:12 AM Uros Bizjak wrote:
>
> On Thu, Jun 24, 2021 at 2:12 PM H.J. Lu wrote:
> >
> > CPUID functions are used to detect CPU features. If vector ISAs
> > are enabled, compiler is free to use them in these functions. Add
> > __attribute__ ((target("general-regs-only"))) t
On Wed, Jun 9, 2021 at 4:39 PM H.J. Lu wrote:
>
> 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
> operands to vector broadcast from an integer with AVX2.
> 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
> won't increase stack alignment requirement a
Generate the marker for -fno-direct-extern-access to indicate that the
object file uses GOT to access all external symbols. Access to protected
symbols in the resulting shared library is treated as local, which requires
canonical function pointers and cannot be used with copy relocation.
Changes in the v2 patch.
1. Rename the option to -fdirect-extern-access.
---
On systems with copy relocation:
* A copy in executable is created for the definition in a shared library
at run-time by ld.so.
* The copy is referenced by executable and shared libraries.
* Executable can access the cop
-fdirect-extern-access is the default. With -fno-direct-extern-access:
1. Always use GOT to access undefined data and function symbols,
including in PIE and non-PIE. These will avoid copy relocations
in executables. This is compatible with existing executables and
shared libraries.
2.
On Thu, Jun 24, 2021 at 5:47 AM Richard Biener
wrote:
>
> On Thu, Jun 24, 2021 at 2:42 PM H.J. Lu wrote:
> >
> > On Thu, Jun 24, 2021 at 5:35 AM Richard Biener
> > wrote:
> > >
> > > On Thu, Jun 24, 2021 at 2:13 PM H.J. Lu via Gcc-patches
> > >
On Thu, Jun 24, 2021 at 5:35 AM Richard Biener
wrote:
>
> On Thu, Jun 24, 2021 at 2:13 PM H.J. Lu via Gcc-patches
> wrote:
> >
> > CPUID functions are used to detect CPU features. If vector ISAs
> > are enabled, compiler is free to use them in these functions. Add
CPUID functions are used to detect CPU features. If vector ISAs
are enabled, compiler is free to use them in these functions. Add
__attribute__ ((target("general-regs-only"))) to CPUID functions
to avoid vector instructions.
gcc/
PR target/101185
* config/i386/cpuid.h (__get_cpu
On Tue, Jun 22, 2021 at 11:15 AM Fangrui Song wrote:
>
> On 2021-06-22, H.J. Lu wrote:
> >On Mon, Jun 21, 2021 at 10:46 PM Fangrui Song wrote:
> >>
> >> On 2021-06-21, H.J. Lu wrote:
> >> >On Mon, Jun 21, 2021 at 9:16 PM Alan Modra wrote:
> >> >>
> >> >> On Mon, Jun 21, 2021 at 07:12:02PM -0700,
On Mon, Jun 21, 2021 at 10:46 PM Fangrui Song wrote:
>
> On 2021-06-21, H.J. Lu wrote:
> >On Mon, Jun 21, 2021 at 9:16 PM Alan Modra wrote:
> >>
> >> On Mon, Jun 21, 2021 at 07:12:02PM -0700, H.J. Lu wrote:
> >> > On Mon, Jun 21, 2021 at 5:06 PM Alan Modra wrote:
> >> > >
> >> > > On Mon, Jun 21
Generate the GNU_PROPERTY_1_NEEDED_SINGLE_GLOBAL_DEFINITION marker for
-fsingle-global-definition to indicate that the object file requires
canonical function pointers and cannot be used with copy relocation.
* configure.ac (HAVE_LD_SINGLE_GLOBAL_DEFINITION_SUPPORT): New.
Define to