On Fri, Oct 20, 2023 at 8:54 AM liuhongt wrote:
>
> While working on enabling more 32/64-bit vectorization for _Float16,
> I noticed there's 1 redundant define_expand; the patch removes the expander.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
On Tue, Oct 17, 2023 at 9:05 PM Roger Sayle wrote:
>
>
> This patch contains clean-ups of the widening multiplication patterns in
> i386.md, and provides variants of the existing highpart multiplication
> peephole2 transformations (that tidy up register allocation after
> reload), and thereby fixe
On Tue, Oct 17, 2023 at 7:54 PM Roger Sayle wrote:
>
>
> Hi Uros,
> Thanks for the speedy review.
>
> > From: Uros Bizjak
> > Sent: 17 October 2023 17:38
> >
> > On Tue, Oct 17, 2023 at 3:08 PM Roger Sayle
> > wrote:
> > >
> > >
On Tue, Oct 17, 2023 at 3:08 PM Roger Sayle wrote:
>
>
> This patch is the backend piece of a solution to PRs 101955 and 106245,
> that adds a define_insn_and_split to the i386 backend, to perform sign
> extension of a single (least significant) bit using AND $1 then NEG.
>
> Previously, (x<<31)>>
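The equivalence this snippet relies on can be sketched in C. This is only an illustration, not code from the patch: the function names are hypothetical, a 32-bit operand is assumed, and the shift form depends on GCC's documented behavior for converting out-of-range values to signed types and for right-shifting negative values.

```c
#include <stdint.h>

/* Shift form: move bit 0 up to the sign position, then shift it back
   arithmetically.  The left shift is done on an unsigned value to avoid
   signed overflow.  The conversion to int32_t and the right shift of a
   negative value are implementation-defined in ISO C; GCC wraps modulo
   2^32 and shifts arithmetically.  */
static int32_t sext_bit0_shift(uint32_t x)
{
    return (int32_t)(x << 31) >> 31;
}

/* AND-then-negate form: 0 when bit 0 is clear, -1 when it is set.  */
static int32_t sext_bit0_and_neg(uint32_t x)
{
    return -(int32_t)(x & 1);
}
```

Both functions return -1 for odd inputs and 0 for even inputs, so a compiler is free to pick whichever form is cheaper on the target.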
On Mon, Oct 16, 2023 at 9:58 PM Fangrui Song wrote:
>
> On Mon, Oct 16, 2023 at 12:10 PM Uros Bizjak wrote:
> >
> > On Mon, Oct 16, 2023 at 8:24 PM Fangrui Song wrote:
> > >
> > > On 2023-10-16, Uros Bizjak wrote:
> > > >On Tue, Aug 1, 2023 at 9:5
On Mon, Oct 16, 2023 at 8:24 PM Fangrui Song wrote:
>
> On 2023-10-16, Uros Bizjak wrote:
> >On Tue, Aug 1, 2023 at 9:51 PM Fangrui Song wrote:
> >>
> >> When using -mcmodel=medium, large data objects larger than the
> >> -mlarge-data-threshold thresh
On Tue, Aug 1, 2023 at 9:51 PM Fangrui Song wrote:
>
> When using -mcmodel=medium, large data objects larger than the
> -mlarge-data-threshold threshold are placed into large data sections
> (.lrodata, .ldata, .lbss and some variants). GNU ld and ld.lld 17 place
> .l* sections into separate outpu
On Fri, Oct 6, 2023 at 3:59 PM Roger Sayle wrote:
>
>
> Grr! I've done it again. ENOPATCH.
>
> > -----Original Message-----
> > From: Roger Sayle
> > Sent: 06 October 2023 14:58
> > To: 'gcc-patches@gcc.gnu.org'
> > Cc: 'Uros Bizja
The stringop strategy selection algorithm falls back to a libcall strategy
when it exhausts its pool of available strategies. The memory area copy
function (memcpy) is not available from the system library for non-default
address spaces, so the compiler emits the most trivial byte-at-a-time
copy l
On Thu, Oct 5, 2023 at 1:45 PM Roger Sayle wrote:
>
> Doh! ENOPATCH.
>
> > -----Original Message-----
> > From: Roger Sayle
> > Sent: 05 October 2023 12:44
> > To: 'gcc-patches@gcc.gnu.org'
> > Cc: 'Uros Bizjak'
> > Subject: [X8
On Thu, Oct 5, 2023 at 11:06 AM Roger Sayle wrote:
>
>
> This patch avoids long lea instructions for performing x<<2 and x<<3
> by splitting them into shorter sal and move (or xchg instructions).
> Because this increases the number of instructions, but reduces the
> total size, it's suitable for -O
PR target/111340
gcc/ChangeLog:
* config/i386/i386.cc (output_pic_addr_const): Handle CONST_WIDE_INT.
Call output_addr_const for CASE_CONST_SCALAR_INT.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr111340.c: New test.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32
On Wed, Sep 6, 2023 at 9:43 PM Vladimir Makarov wrote:
>
>
> On 9/1/23 05:07, Hongyu Wang wrote:
> > Uros Bizjak via Gcc-patches wrote on Thu, Aug 31, 2023 at 18:16:
> >> On Thu, Aug 31, 2023 at 10:20 AM Hongyu Wang wrote:
> >>> From: Kong Lingling
> >>>
>
On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu wrote:
> > > > > > > I think there should be some constraint which explicitly has all
> > > > > > > the 32
> > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that
> > > > > > > regardless of
> > > > > > > -mapx-inline-asm-use-gpr32 one
On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu wrote:
>
> On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches
> wrote:
> >
> > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang wrote:
> > >
> > > Uros Bizjak via Gcc-patches wrote on Thu, Aug 31, 2023 at
> > > 18:
On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang wrote:
>
> Uros Bizjak via Gcc-patches wrote on Thu, Aug 31, 2023 at 18:01:
> >
> > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
> > wrote:
> > >
> > > On Thu, Aug 31, 2023 at 04:20:17PM +0800
On Thu, Aug 31, 2023 at 10:20 AM Hongyu Wang wrote:
>
> From: Kong Lingling
>
> Current reload infrastructure does not support selective base_reg_class
> for backend insn. Add insn argument to base_reg_class for
> lra/reload usage.
I don't think this is the correct approach. Ideally, a memory
co
On Thu, Aug 31, 2023 at 10:20 AM Hongyu Wang wrote:
>
> From: Kong Lingling
>
> These legacy insn in opcode map0/1 only support GPR16,
> and do not have vex/evex counterpart, directly adjust constraints and
> add gpr32 attr to patterns.
>
> insn list:
> 1. xsave/xsave64, xrstor/xrstor64
> 2. xsav
On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches
wrote:
>
> On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote:
> > From: Kong Lingling
> >
> > In inline asm, we do not know if the insn can use EGPR, so disable EGPR
> > usage by default from mapping the comm
gcc/fortran/ChangeLog:
* match.cc (gfc_match_equivalence): Rename TRUE/FALSE to true/false.
* module.cc (check_access): Ditto.
* primary.cc (match_real_constant): Ditto.
* trans-array.cc (gfc_trans_allocate_array_storage): Ditto.
(get_array_ctor_strlen): Ditto.
* trans-comm
gcc/c-family/ChangeLog:
* c-format.cc (read_any_format_width):
Rename TRUE/FALSE to true/false.
gcc/ChangeLog:
* caller-save.cc (new_saved_hard_reg):
Rename TRUE/FALSE to true/false.
(setup_save_areas): Ditto.
* gcc.cc (set_collect_gcc_options): Ditto.
(driver::build_
Add new pattern involving vec_merge RTX that is produced by combine from the
combination of sse4_1_pinsrq and *movdi_internal:
7: r86:DI=0
8: r85:V2DI=vec_merge(vec_duplicate(r86:DI),r87:V2DI,0x2)
REG_DEAD r87:V2DI
REG_DEAD r86:DI
Successfully matched this instruction:
(set (re
On Wed, Aug 9, 2023 at 8:19 PM Jakub Jelinek wrote:
>
> Hi!
>
> The following patch enables _BitInt support on x86-64, the only
> target which has _BitInt specified in psABI.
>
> 2023-08-09 Jakub Jelinek
>
> PR c/102989
> * config/i386/i386.cc (classify_argument): Handle BITINT_
Disable (=&r,m,m) alternative for 32-bit targets. The combination of two
memory operands (possibly with complex addressing mode), early clobbered
output, frame pointer and PIC registers uses too many registers on
a register constrained 32-bit target.
Also merge two similar patterns using DWIH mode
Partial vector src is forced to a register as ops[1], we can use it
instead of SRC in the call to ix86_expand_sse_cmp. This change avoids
forcing operand[1] to a register in sign/zero-extend expanders.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_sse_extend): Use ops[1]
inste
Implement vector extend and zero_extend functionality for TARGET_SSE2 using
PUNPCKL?? family of instructions. The code for e.g. zero-extend from V2SI to
V2DImode improves from:
movd %xmm0, %edx
pshufd $85, %xmm0, %xmm0
movd %xmm0, %eax
movq %rdx, (%rdi)
On Mon, Aug 14, 2023 at 4:46 AM liuhongt via Gcc-patches
wrote:
>
> vmovapd can enable register renaming and has the same code size as
> vmovsd. Similar for vmovsh vs vmovaps, vmovaps is 1 byte less than
> vmovsh.
>
> When TARGET_AVX512VL is not available, still generate
> vmovsd/vmovss/vmovsh to avo
On Thu, Aug 10, 2023 at 9:40 AM Richard Biener
wrote:
>
> On Thu, Aug 10, 2023 at 3:13 AM liuhongt wrote:
> >
> > Currently we have 3 different independent tunes for gather
> > "use_gather,use_gather_2parts,use_gather_4parts",
> > similar for scatter, there're
> > "use_scatter,use_scatter_2parts,
On Thu, Aug 10, 2023 at 3:13 AM liuhongt wrote:
>
> Currently we have 3 different independent tunes for gather
> "use_gather,use_gather_2parts,use_gather_4parts",
> similar for scatter, there're
> "use_scatter,use_scatter_2parts,use_scatter_4parts"
>
> The patch supports 2 standardizing options to
On Thu, Aug 10, 2023 at 2:49 AM liuhongt wrote:
>
> Also add ix86_partial_vec_fp_math to the condition of V2HF/V4HF named
> patterns in order to avoid generation of partial vector V8HFmode
> trapping instructions.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/
On Mon, Aug 7, 2023 at 1:20 PM Richard Biener
wrote:
> > Please also note the RFC patch [1] that relaxes clears for V2SFmode
> > with -fno-trapping-math. The patched compiler will then emit the same
> > code as clang does for -O2. Which raises another question - should gcc
> > default to -fno-tra
On Wed, Aug 9, 2023 at 8:38 AM Uros Bizjak wrote:
>
> On Wed, Aug 9, 2023 at 8:37 AM Liu, Hongtao wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Uros Bizjak
> > > Sent: Wednesday, August 9, 2023 2:33 PM
> > > To: Liu,
On Wed, Aug 9, 2023 at 8:37 AM Liu, Hongtao wrote:
>
>
>
> > -----Original Message-----
> > From: Uros Bizjak
> > Sent: Wednesday, August 9, 2023 2:33 PM
> > To: Liu, Hongtao
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH V2] [X86] Work
On Wed, Aug 9, 2023 at 3:48 AM liuhongt wrote:
>
> > Please rather do it in a more self-descriptive way, as proposed in the
> > attached patch. You won't need a comment then.
> >
>
> Adjusted in V2 patch.
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel
Also introduce -m[no-]partial-vector-fp-math option to disable trapping
V2SF named patterns in order to avoid generation of partial vector V4SFmode
trapping instructions.
The new option is enabled by default, because even with sanitization,
a small but consistent speed up of 2 to 3% with Polyhedro
On Tue, Aug 8, 2023 at 9:58 AM liuhongt wrote:
>
> Don't access leaf 7 subleaf 1 unless subleaf 0 says it is
> supported via EAX.
>
> Intel documentation says invalid subleaves return 0. We had been
> relying on that behavior instead of checking the max subleaf number.
>
> It appears that some Sand
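The guard described in this snippet can be sketched as follows. This is a hedged illustration rather than the GCC change itself: `struct cpuid_regs`, `cpuid_fn`, `query_leaf7_subleaf1` and both `fake_cpuid_*` functions are hypothetical names introduced here, the CPUID instruction is replaced by a callback so the logic can be exercised on any host, and the caller is assumed to have already checked that basic leaf 7 exists.

```c
#include <stdint.h>

/* Hypothetical register block for one CPUID query.  */
struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* Hypothetical callback standing in for the CPUID instruction.  */
typedef struct cpuid_regs (*cpuid_fn)(uint32_t leaf, uint32_t subleaf);

/* Read leaf 7 subleaf 1 only when subleaf 0 reports, in EAX, that the
   maximum supported subleaf is at least 1.  Otherwise return all zeros
   instead of trusting whatever the hardware hands back.  */
static struct cpuid_regs query_leaf7_subleaf1(cpuid_fn cpuid)
{
    const struct cpuid_regs zero = {0, 0, 0, 0};
    struct cpuid_regs sub0 = cpuid(7, 0);

    if (sub0.eax < 1)
        return zero;
    return cpuid(7, 1);
}

/* Test double: max subleaf is 0, and invalid subleaves return junk
   rather than zeros.  */
static struct cpuid_regs fake_cpuid_no_subleaf1(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = {0, 0, 0, 0};
    if (leaf == 7 && subleaf != 0)
        r.eax = r.ebx = r.ecx = r.edx = 0xdeadbeefu;
    return r;
}

/* Test double: subleaf 1 is advertised and carries a feature bit.  */
static struct cpuid_regs fake_cpuid_with_subleaf1(uint32_t leaf, uint32_t subleaf)
{
    struct cpuid_regs r = {0, 0, 0, 0};
    if (leaf == 7 && subleaf == 0)
        r.eax = 1;
    else if (leaf == 7 && subleaf == 1)
        r.eax = 0x10;
    return r;
}
```

With the first test double the query returns zeros even though the fake hardware would have produced junk; with the second it returns the advertised subleaf 1 contents.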
On Tue, Aug 8, 2023 at 12:08 PM Richard Biener wrote:
> > > > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > > > > > named patterns in order to avoid generation of partial vector
> > > > > > V4SFmode
> > > > > > trapping instructions.
> > > > > >
> > > > > > The new
On Tue, Aug 8, 2023 at 10:07 AM Richard Biener wrote:
>
> On Mon, 7 Aug 2023, Uros Bizjak wrote:
>
> > On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote:
> > >
> > > On Sun, 30 Jul 2023, Uros Bizjak wrote:
> > >
> > > > Also introduce
On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote:
>
> On Sun, 30 Jul 2023, Uros Bizjak wrote:
>
> > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
> > named patterns in order to avoid generation of partial vector V4SFmode
> > trapping in
ith and without --target_board=unix{-m32}
> with no new failures. Ok for mainline?
>
>
> 2023-08-07 Roger Sayle
> Uros Bizjak
>
> gcc/ChangeLog
> PR target/107671
> * config/i386/i386.md (*bt_setc_mask): Allow the
> shift count to
On Mon, Aug 7, 2023 at 10:57 AM liuhongt wrote:
>
> Similar to r14-2786-gade30fad6669e5, the patch is for V4HF/V2HFmode.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/110762
> * config/i386/mmx.md (3): Changed from
On Thu, Aug 3, 2023 at 9:10 AM Roger Sayle wrote:
>
>
> This patch is the final piece in the series to improve the ABI issues
> affecting PR 88873. The previous patches tackled inserting DFmode
> values into V2DFmode registers, by introducing insvti_{low,high}part
> patterns. This patch improves
On Thu, Aug 3, 2023 at 12:18 AM Roger Sayle wrote:
>
>
> This patch is a conservative fix for PR target/110792, a wrong-code
> regression affecting doubleword rotations by BITS_PER_WORD, which
> effectively swaps the highpart and lowpart words, when the source to be
> rotated resides in memory. Th
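The remark that a doubleword rotation by BITS_PER_WORD swaps the high and low words can be checked with a small C model. The sketch assumes a 32-bit word, so the doubleword is 64 bits; `rotl64` and `swap_words` are hypothetical helper names, not functions from the patch.

```c
#include <stdint.h>

/* Portable 64-bit rotate left; the n == 0 case avoids a shift by 64.  */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Exchange the two 32-bit halves of a 64-bit value.  */
static uint64_t swap_words(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    return ((uint64_t)lo << 32) | hi;
}
```

For any input, `rotl64(x, 32)` and `swap_words(x)` agree, which is why such a rotation needs no shift instructions at all, only two word moves.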
On Wed, Aug 2, 2023 at 3:33 AM liuhongt wrote:
>
> In [1], I proposed a patch to generate vmovdqu for all vlddqu intrinsics
> after AVX2; it was rejected as
> > The instruction is reachable only as __builtin_ia32_lddqu* (aka
> > _mm_lddqu_si*), so it was chosen by the programmer for a reason. I
> > t
Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF
named patterns in order to avoid generation of partial vector V4SFmode
trapping instructions.
The new option is enabled by default, because even with sanitization,
a small but consistent speed up of 2 to 3% with Polyhedron capaci
The testcase should use dg-additional-options instead of dg-options to
not overwrite default compile flags that include path for finding
the IEEE modules.
gcc/testsuite/ChangeLog:
* gfortran.dg/ieee/comparisons_3.F90: Use dg-additional-options
instead of dg-options.
Tested on x86_64-linu
Clear the upper half of a V4SFmode operand register in front of all
potentially trapping instructions. The testcase:
--cut here--
typedef float v2sf __attribute__((vector_size(8)));
typedef float v4sf __attribute__((vector_size(16)));
v2sf test(v4sf x, v4sf y)
{
v2sf x2, y2;
x2 = __builtin_s
On Sat, Jul 22, 2023 at 4:17 PM Roger Sayle wrote:
>
>
> This patch attempts to help with PR rtl-optimization/110587, a regression
> of -O0 compile time for the pathological pr28071.c. My recent patch helps
> a bit, but hasn't returned -O0 compile-time to where it was before my
> ix86_expand_move
On Sat, Jul 22, 2023 at 5:37 PM Roger Sayle wrote:
>
>
> As suggested by Uros, this patch changes the ZERO_EXTRACTs and SIGN_EXTRACTs
> in i386.md to consistently use QImode for bit offsets (i.e. third and fourth
> operands), matching the use of QImode for bit counts in shifts and rotates.
>
> The
When sign-extending the value in a double-word register pair using shift and
ashiftrt sequence with the same count immediate value less than word width,
there is no need to shift the lower word of the value. The sign-extension
could be limited to the upper word, but we uselessly shift the lower wor
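A C model of this observation, assuming a 64-bit doubleword built from two 32-bit words and a shift count below 32. The function names are hypothetical, and the code relies on GCC's behavior for unsigned-to-signed conversion and arithmetic right shift of negative values, both implementation-defined in ISO C.

```c
#include <stdint.h>

/* Reference: shift the whole doubleword left, then arithmetically right,
   by the same count c (0 <= c < 32).  */
static int64_t sext_full(int64_t x, unsigned c)
{
    return (int64_t)((uint64_t)x << c) >> c;
}

/* Same result computed by touching only the upper word; the lower word
   is passed through unchanged.  */
static int64_t sext_high_only(int64_t x, unsigned c)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)((uint64_t)x >> 32);
    int32_t hi_ext = (int32_t)(hi << c) >> c;

    return (int64_t)(((uint64_t)(uint32_t)hi_ext << 32) | lo);
}
```

Because the count is smaller than the word width, the bits shifted out of and back into the lower word are its own original bits, so only the upper word can change.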
On Thu, Jul 20, 2023 at 9:35 AM liuhongt wrote:
>
> For Intel processors, after TARGET_AVX, vmovdqu is optimized as fast
> as vlddqu, UNSPEC_LDDQU can be removed to enable more optimizations.
> Can someone confirm this with AMD folks?
> If AMD doesn't like such optimization, I'll put my optimizati
On Thu, Jul 20, 2023 at 9:44 AM Roger Sayle wrote:
>
>
> Hi Uros,
>
> > From: Uros Bizjak
> > Sent: 20 July 2023 07:50
> >
> > On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle
> > wrote:
> > >
> > > This patch is the next piece of a solut
On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle wrote:
>
>
> This patch is the next piece of a solution to the x86_64 ABI issues in
> PR 88873. This splits the *concat3_3 define_insn_and_split
> into two patterns, a TARGET_64BIT *concatditi3_3 and a !TARGET_64BIT
> *concatsidi3_3. This allows us to
tested reverting
> r13-2006-ga56c1641e9d25e successfully. Can we choose between the
> options please? Sorry I'm only bringing this up now but 13.2 RC is due
> tomorrow.
>
> Thank you,
> Richard.
>
> >
> >
> > 2023-06-10 Roger Sayle
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* dwarf2asm.cc: Change FALSE to false.
* dwarf2cfi.cc (execute_dwarf2_frame): Change return type to void.
* dwarf2out.cc (matches_main_base): Change return type from
int to bool. Change "l
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* combine.cc (struct reg_stat_type): Change last_set_invalid to bool.
(cant_combine_insn_p): Change return type from int to bool and adjust
function body accordingly.
(can_combine_p): Ditto
On Mon, Jul 17, 2023 at 10:28 AM Hongtao Liu wrote:
>
> I'd like to ping for this patch (only patch 1/2, for patch 2/2, I
> think that may not be necessary).
>
> On Mon, May 15, 2023 at 9:20 AM Hongtao Liu wrote:
> >
> > ping.
> >
> > On Fri, Apr 21, 2023 at 9:55 PM liuhongt wrote:
> > >
> > > >
On Mon, Jul 17, 2023 at 8:44 AM Hongtao Liu wrote:
>
> Ping.
>
> On Tue, Jul 11, 2023 at 5:16 PM liuhongt via Gcc-patches
> wrote:
> >
> > Similar to what we did for CMPXCHG, but extended to all
> > ix86_comparison_int_operator since CMPCCXADD sets EFLAGS exactly the same
> > as CMP.
> >
> > When operand
On Fri, Jul 14, 2023 at 11:44 AM Jan Beulich wrote:
>
> The corresponding insn serves this purpose quite fine, and leads to
> slightly less (generated) code. All we need is the insn to not have a
> leading * in its name, while retaining that * for "extendhfsf2".
> Introduce a mode attribute in exc
On Fri, Jul 14, 2023 at 11:27 AM Roger Sayle wrote:
>
>
> > From: Uros Bizjak
> > Sent: 13 July 2023 19:21
> >
> > On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle
> > wrote:
> > >
> > > This patch resolves PR target/110588 to catch another
On Fri, Jul 14, 2023 at 10:53 AM Richard Biener wrote:
>
> On Fri, 14 Jul 2023, Uros Bizjak wrote:
>
> > On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote:
> > >
> > > On Fri, 14 Jul 2023, Uros Bizjak wrote:
> > >
> > > > cprop1 pass
On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote:
>
> On Fri, 14 Jul 2023, Uros Bizjak wrote:
>
> > cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
> > that it equals 8 elements of HImode by setting REG_EQUAL note:
> >
> > (
On Thu, Jul 13, 2023 at 6:45 PM Roger Sayle wrote:
>
>
> This is the next piece towards a fix for (the x86_64 ABI issues affecting)
> PR 88873. This patch generalizes the recent tweak to ix86_expand_move
> for setting the highpart of a TImode reg from a DImode source using
> *insvti_highpart_1, t
On Fri, Jul 14, 2023 at 8:24 AM Haochen Jiang wrote:
>
> Hi all,
>
> This patch aims to auto vectorize usdot_prod and udot_prod with newly
> introduced AVX-VNNI-INT16.
>
> Also I refined the redundant mode iterator in the patch.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk after AVX-VNNI-INT
cprop1 pass does not consider paradoxical subreg and for (insn 22) claims
that it equals 8 elements of HImode by setting REG_EQUAL note:
(insn 21 19 22 4 (set (reg:V4QI 98)
(mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0 S4
A32])) "pr110206.c":12:42 1530 {*movv4qi_internal}
(
On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle wrote:
>
>
> This patch resolves PR target/110588 to catch another case in combine
> where the i386 backend should be generating a btl instruction. This adds
> another define_insn_and_split to recognize the RTL representation for this
> case.
>
> I also
PR target/106966
gcc/ChangeLog:
* config/alpha/alpha.cc (alpha_emit_set_long_const):
Always use DImode when constructing long const.
gcc/testsuite/ChangeLog:
* gcc.target/alpha/pr106966.c: New test.
Bootstrapped and regression tested by Matthias on alpha-linux-gnu.
Uros.
diff
gcc/ChangeLog:
* ira.cc (equiv_init_varies_p): Change return type from int to bool
and adjust function body accordingly.
(equiv_init_movable_p): Ditto.
(memref_used_between_p): Ditto.
* lra-constraints.cc (valid_address_p): Ditto.
Bootstrapped and regression tested on x86_64-l
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* ifcvt.cc (cond_exec_changed_p): Change variable to bool.
(last_active_insn): Change "skip_use_p" function argument to bool.
(noce_operand_ok): Change return type from int to bool.
(find_c
On Wed, Jul 12, 2023 at 12:58 PM Uros Bizjak wrote:
>
> On Wed, Jul 12, 2023 at 12:23 PM Richard Sandiford
> wrote:
> >
> > Richard Biener via Gcc-patches writes:
> > > On Mon, Jul 10, 2023 at 1:01 PM Uros Bizjak wrote:
> > >>
On Wed, Jul 12, 2023 at 12:23 PM Richard Sandiford
wrote:
>
> Richard Biener via Gcc-patches writes:
> > On Mon, Jul 10, 2023 at 1:01 PM Uros Bizjak wrote:
> >>
> >> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener
> >> wrote:
> >> >
On Tue, Jul 11, 2023 at 10:07 PM Roger Sayle wrote:
>
>
> The recent change in TImode parameter passing on x86_64 results in the
> FAIL of pr91681-1.c. The issue is that with the extra flexibility,
> the combine pass is now spoilt for choice between using either the
> *add3_doubleword_concat or t
On Tue, Jul 11, 2023 at 9:07 PM Roger Sayle wrote:
>
>
> This patch fixes the regression PR target/110598 caused by my recent
> addition of a peephole2. The intention of that optimization was to
> simplify zeroing a register, followed by an IOR, XOR or PLUS operation
> on it into a move, or as de
Also change some internal variables from int to bool.
gcc/ChangeLog:
* cfghooks.cc (verify_flow_info): Change "err" variable to bool.
* cfghooks.h (struct cfg_hooks): Change return type of
verify_flow_info from integer to bool.
* cfgrtl.cc (can_delete_note_p): Change return type f
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* reorg.cc (stop_search_p): Change return type from int to bool
and adjust function body accordingly.
(resource_conflicts_p): Ditto.
(insn_references_resource_p): Change return type from in
On Mon, Jul 10, 2023 at 11:47 AM Richard Biener
wrote:
>
> On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote:
> >
> > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener
> > wrote:
> > >
> > > On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak via Gcc-patches
>
On Mon, Jul 10, 2023 at 11:17 AM Richard Biener
wrote:
>
> On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak via Gcc-patches
> wrote:
> >
> > As shown in the PR, simplify_gen_subreg call in simplify_replace_fn_rtx:
> >
> > (gdb) list
> > 469 if (code =
On Sun, Jul 9, 2023 at 11:30 PM Roger Sayle wrote:
>
>
> This patch implements another of Uros' suggestions, to investigate an
> insvti_lowpart_1 pattern to improve TImode parameter passing on x86_64.
> In PR 88873, the RTL the middle-end expands for passing V2DF in TImode
> is subtly different fro
On Sun, Jul 9, 2023 at 10:35 PM Roger Sayle wrote:
>
>
> Following Uros' suggestion, this patch adds support for AVX512VL's
> vpro[lr][dq] instructions to the recently added scalar-to-vector (STV)
> enhancements to handle DImode and SImode rotations by a constant.
>
> For the test cases:
>
> unsig
As shown in the PR, simplify_gen_subreg call in simplify_replace_fn_rtx:
(gdb) list
469 if (code == SUBREG)
470 {
471 op0 = simplify_replace_fn_rtx (SUBREG_REG (x),
old_rtx, fn, data);
472 if (op0 == SUBREG_REG (x))
473 return x;
47
Also change some internal variables from int to bool.
gcc/ChangeLog:
* cprop.cc (reg_available_p): Change return type from int to bool.
(reg_not_set_p): Ditto.
(try_replace_reg): Ditto. Change "success" variable to bool.
(cprop_jump): Change return type from int to void
and a
Also change some internal variables and function arguments from int to bool.
gcc/ChangeLog:
* gcse.cc (expr_equiv_p): Change return type from int to bool.
(oprs_unchanged_p): Change return type from int to void
and adjust function body accordingly.
(oprs_anticipatable_p): Ditto.
On Fri, Jul 7, 2023 at 7:31 AM liuhongt wrote:
>
> > Please split the above pattern into two, one emitting UNSPEC_IEEE_MAX
> > and the other emitting UNSPEC_IEEE_MIN.
> Split.
>
> > The test involves blendv instruction, which is SSE4.1, so it is
> > pointless to test it without -msse4.1. Please
On Thu, Jul 6, 2023 at 3:48 PM Roger Sayle wrote:
>
> > On Thu, Jul 6, 2023 at 2:04 PM Roger Sayle
> > wrote:
> > >
> > >
> > > Passing 128-bit integer (TImode) parameters on x86_64 can sometimes
> > > result in surprising code. Consider the example below (from PR 43644):
> > >
> > > __uint128 f
On Thu, Jul 6, 2023 at 2:04 PM Roger Sayle wrote:
>
>
> Passing 128-bit integer (TImode) parameters on x86_64 can sometimes
> result in surprising code. Consider the example below (from PR 43644):
>
> __uint128 foo(__uint128 x, unsigned long long y) {
> return x+y;
> }
>
> which currently resul
On Thu, Jul 6, 2023 at 8:39 AM Hongyu Wang wrote:
>
> Hi,
>
> This is a follow-up patch for
> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623525.html
> that updates the documentation about x86 inlining rules.
>
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * doc/extend.texi: Move x86 inlining rule
On Thu, Jul 6, 2023 at 3:20 AM liuhongt wrote:
>
> We have ix86_expand_sse_fp_minmax to detect min/max semantics, but
> it requires rtx_equal_p for cmp_op0/cmp_op1 and if_true/if_false, for
> the testcase in the PR, there's an extra move from cmp_op0 to if_true,
> and it failed ix86_expand_sse_fp_m
On Thu, Jul 6, 2023 at 3:20 AM liuhongt wrote:
>
> They should have same cost as vector mode since both generate
> pand/pandn/pxor/por instruction.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386.cc (ix86_rtx_costs): Ad
On Thu, Jul 6, 2023 at 3:14 AM liuhongt wrote:
>
> For testcase
>
> void __cond_swap(double* __x, double* __y) {
> bool __r = (*__x < *__y);
> auto __tmp = __r ? *__x : *__y;
> *__y = __r ? *__y : *__x;
> *__x = __tmp;
> }
>
> GCC-14 with -O2 and -march=x86-64 options generates the followi
Also change some internal variables to bool.
gcc/ChangeLog:
* sched-int.h (struct haifa_sched_info): Change can_schedule_ready_p,
schedule_more_p and contributes_to_priority indirect function
type from int to bool.
(no_real_insns_p): Change return type from int to bool.
(cont
le description to the new subsubsection?
>
> > Looking at the above, perhaps inlining of different arches can also be
> > forced with always_inline? This would allow developers some control of
> > inlining, and would not be surprising.
>
> If so, I'd like to add the a
On Tue, Jul 4, 2023 at 5:12 AM Hongyu Wang wrote:
>
> Hi,
>
> For functions with different target attributes, the current logic refuses to
> inline the callee when any arch or tune is mismatched. Relax the
> condition to allow callee with default arch/tune to be inlined.
>
> Bootstrapped/regtested on x8
Also change internal variable from int to bool.
gcc/ChangeLog:
* tree.h (tree_int_cst_equal): Change return type from int to bool.
(operand_equal_for_phi_arg_p): Ditto.
(tree_map_base_marked_p): Ditto.
* tree.cc (contains_placeholder_p): Update function body
for bool return ty
Also change some internal variables and function argument from int to bool.
gcc/ChangeLog:
* fold-const.h (multiple_of_p): Change return type from int to bool.
* fold-const.cc (split_tree): Change negl_p, neg_litp_p,
neg_conp_p and neg_var_p variables to bool.
(const_binop): Chang
On Fri, Jun 30, 2023 at 9:29 AM Roger Sayle wrote:
>
>
> This patch implements scalar-to-vector (STV) support for DImode and SImode
> rotations by constant bit counts. Scalar rotations are almost always
> optimal on x86, requiring only one or two instructions, but it is also
> possible to impleme
gcc/ChangeLog:
* cselib.h (rtx_equal_for_cselib_1):
Change return type from int to bool.
(references_value_p): Ditto.
(rtx_equal_for_cselib_p): Ditto.
* expr.h (can_store_by_pieces): Ditto.
(try_casesi): Ditto.
(try_tablejump): Ditto.
(safe_from_p): Ditto.
* sbi
Also change some internal variables to bool and change return type of
compute_alignments to void.
gcc/ChangeLog:
* output.h (leaf_function_p): Change return type from int to bool.
(final_forward_branch_p): Ditto.
(only_leaf_regs_used): Ditto.
(maybe_assemble_visibility): Ditto.
ne
specified. We expect "default" callee to have properties that allow
inlining it into all callers, independent of the caller's arch/tune target
attribute.
Uros.
>
> Uros Bizjak wrote on Wed, Jun 28, 2023 at 14:43:
> >
> > On Wed, Jun 28, 2023 at 3:56 AM Hongyu Wang wrote:
> > >