Re: [PATCH V6] VECT: Add decrement IV support in Loop Vectorizer

2023-05-12 Thread Richard Sandiford via Gcc-patches
"juzhe.zhong" writes: > Thanks Richard. > I will do that as you suggested. I have a question for the first patch. How > to > enable decrement IV? Should I add a target hook or something to let target > decide whether enable decrement IV? At the moment, the only other targets that use

Re: [PATCH] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-12 Thread Richard Sandiford via Gcc-patches
"Li, Pan2 via Gcc-patches" writes: > Thanks Richard for comments. In previous, I am not sure it is reasonable to > let everywhere consume the same macro in rtl.h (As the includes you > mentioned). Thus, make a conservative change in PATCH v1. > > I will address the comments and try to align the

Re: [PATCH V6] VECT: Add decrement IV support in Loop Vectorizer

2023-05-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > 1. Fix document description according Jeff && Richard. > 2. Add LOOP_VINFO_USING_SELECT_VL_P for single rgroup. > 3. Add LOOP_VINFO_USING_SLP_ADJUSTED_LEN_P for SLP multiple rgroup. > > Fix bugs for V5 after testing: >

Re: [PATCH] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-12 Thread Richard Sandiford via Gcc-patches
pan2...@intel.com writes: > From: Pan Li > > We are running out of the machine_mode(8 bits) in RISC-V backend. Thus > we would like to extend the machine mode bit size from 8 to 16 bits. > However, it is sensitive to extend the memory size in common structure > like tree or rtx. This patch would

Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-11 Thread Richard Sandiford via Gcc-patches
Tejas Belagod writes: > From: Tejas Belagod > > This PR optimizes an SVE intrinsics sequence where > svlasta (svptrue_pat_b8 (SV_VL1), x) > a scalar is selected based on a constant predicate and a variable vector. > This sequence is optimized to return the corresponding element of a

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-11 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 2 May 2023 at 18:22, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > On Tue, 2 May 2023 at 17:32, Richard Sandiford >> > wrote: >> >> >> >> Prathamesh Kulkarni writes: >> >> > On Tue, 2 May 2023 at 14:56, Richard Sandiford >> >> >

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-05-11 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > diff --git a/gcc/testsuite/gcc.target/aarch64/vec-init-18.c > b/gcc/testsuite/gcc.target/aarch64/vec-init-18.c > new file mode 100644 > index 000..598a51f17c6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/vec-init-18.c > @@ -0,0 +1,20 @@ > +/* {

Re: [libgcc PATCH] Add bit reversal functions __bitrev[qhsd]i2.

2023-05-11 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > This patch proposes adding run-time library support for bit reversal, > by adding a __bitrevsi2 function to libgcc. Thoughts/opinions? > > I'm also tempted to add __popcount[qh]i2 and __parity[qh]i2 to libgcc, > to allow the RTL optimizers to perform narrowing operations,

Re: [PATCH] Add RTX codes for BITREVERSE and COPYSIGN.

2023-05-11 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > An analysis of backend UNSPECs reveals that two of the most common UNSPECs > across target backends are for copysign and bit reversal. This patch > adds RTX codes for these expressions to allow their representation to > be standardized, and them to optimized by the

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-11 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks. I have read rgroup descriptions again. > Still I am not fully understand it clearly, bear with me :) > > I don't known how to differentiate Case 2 and Case 3. > > Case 2 is multiple rgroup for SLP. > Case 3 is multiple rgroup for non-SLP (VEC_PACK_TRUNC) >

[PATCH] aarch64: Remove alignment assertions [PR109661]

2023-05-11 Thread Richard Sandiford via Gcc-patches
The trunk patch for this PR corrected the ABI for enums that have a defined underlying type. We shouldn't change the ABI on the branches though, so this patch just removes the assertions that highlighted the problem. I think the same approach makes sense longer-term: keep the assertions at

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-11 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Oh, I see. But I saw there is a variable using_partial_vectors_p > in the loop data structure. > > Can I add a variable call using_select_vl_p ? Yeah. Please also add a wrapper macro like LOOP_VINFO_USING_PARTIAL_VECTORS_P. (I'm not really a fan of the

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-11 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. Since create_iv has been approved and soon will be commited > after > we bootstrap && regression. > > Now, I plan to send patch for "decrement IV". > > After reading your comments, I have several questions: > > 1. >>if (use_bias_adjusted_len)

Re: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

2023-05-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On 5/11/23 10:30, Richard Sandiford wrote: >> Christophe Lyon writes: >>> On 5/10/23 16:52, Kyrylo Tkachov wrote: > -Original Message- > From: Christophe Lyon > Sent: Wednesday, May 10, 2023 2:31 PM > To: gcc-patches@gcc.gnu.org;

Re: [PATCH 15/20] arm: [MVE intrinsics] add unary_acc shape

2023-05-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On 5/10/23 16:52, Kyrylo Tkachov wrote: >> >> >>> -Original Message- >>> From: Christophe Lyon >>> Sent: Wednesday, May 10, 2023 2:31 PM >>> To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; >>> Richard Earnshaw ; Richard Sandiford >>> >>> Cc: Christophe Lyon

Re: [PATCH V5] VECT: Add tree_code into "creat_iv" and allow it can handle MINUS_EXPR IV.

2023-05-11 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Juzhe-Zhong > > This patch is a separate patch preparing for supporting decrement IV. > > gcc/ChangeLog: > > * cfgloopmanip.cc (create_empty_loop_on_edge): Add PLUS_EXPR. > * gimple-loop-interchange.cc >

Re: [PATCH v5] Var-Tracking: Typedef pointer_mux as decl_or_value

2023-05-11 Thread Richard Sandiford via Gcc-patches
pan2...@intel.com writes: > From: Pan Li > > The decl_or_value is defined as void * before this PATCH. It will take > care of both the tree_node and rtx_def. Unfortunately, given a void > pointer cannot tell the input is tree_node or rtx_def. > > Then we have some implicit structure layout
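The problem described in this preview (a bare void * cannot say whether it holds a tree_node or an rtx_def) is commonly solved with a tagged pointer. The following is a minimal C sketch of that general idea, not GCC's pointer_mux implementation. All names here (decl_node, value_node, decl_or_value_sketch and the helper functions) are hypothetical, and the sketch assumes both pointee types are at least 2-byte aligned so that bit 0 of the address is free to use as a tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two pointee types; these are
   not GCC's tree_node or rtx_def.  */
struct decl_node { int uid; };
struct value_node { int code; };

/* One machine word that holds either pointer.  Bit 0 is the tag:
   0 means decl_node, 1 means value_node.  */
typedef struct { uintptr_t bits; } decl_or_value_sketch;

static decl_or_value_sketch from_decl(struct decl_node *d)
{
    decl_or_value_sketch m = { (uintptr_t)d };
    assert((m.bits & 1) == 0);          /* alignment assumption */
    return m;
}

static decl_or_value_sketch from_value(struct value_node *v)
{
    decl_or_value_sketch m;
    assert(((uintptr_t)v & 1) == 0);    /* alignment assumption */
    m.bits = (uintptr_t)v | 1;          /* set the tag bit */
    return m;
}

static int is_value(decl_or_value_sketch m)
{
    return (int)(m.bits & 1);
}

/* Return the decl pointer, or NULL if a value is stored.  */
static struct decl_node *as_decl(decl_or_value_sketch m)
{
    return is_value(m) ? NULL : (struct decl_node *)m.bits;
}

/* Return the value pointer (tag bit cleared), or NULL if a decl is stored.  */
static struct value_node *as_value(decl_or_value_sketch m)
{
    return is_value(m) ? (struct value_node *)(m.bits & ~(uintptr_t)1) : NULL;
}
```

A caller would build the wrapper with from_decl or from_value and query it with as_decl or as_value; unlike a void *, the wrapper cannot be silently read back as the wrong type.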

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-10 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > I am sorry that I am still confused about that. > > Is this what you want ? > > bool use_minus_p = TREE_CODE (step) == INTEGER_CST && ((TYPE_UNSIGNED > (TREE_TYPE (step)) && tree_int_cst_lt (step1, step)) > || (!TYPE_UNSIGNED (TREE_TYPE (step)) && >

Re: [PATCH v3] Var-Tracking: Typedef pointer_mux as decl_or_value

2023-05-10 Thread Richard Sandiford via Gcc-patches
"Li, Pan2" writes: > Thanks Richard Sandiford. Update PATCH v4 here -> > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618099.html. > >> - if (dv_as_opaque (node->dv) != decl || node->offset != offset) >> + if (node->dv.first_or_null () != decl || node->offset != >> + offset) >

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-10 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Thanks Richard. > I am planning to separate a patch with only creat_iv stuff. > > Are you suggesting that I remove "tree_code incr_op = code;" > Use the argument directly ? > > I saw the codes here: > > /* For easier readability of the created code, produce MINUS_EXPRs >

Re: [vxworks] [testsuite] [aarch64] use builtin in pred-not-gen-4.c

2023-05-10 Thread Richard Sandiford via Gcc-patches
Alexandre Oliva via Gcc-patches writes: > On vxworks, isunordered is defined as a macro that ultimately calls a > _Fpcomp function, that GCC doesn't recognize as a builtin, so it > can't optimize accordingly. > > Use __builtin_isunordered instead to get the desired code for the > test. > >

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-10 Thread Richard Sandiford via Gcc-patches
In addition to Jeff's comments: juzhe.zh...@rivai.ai writes: > [...] > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index cc4a93a8763..99cf0cdbdca 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -4974,6 +4974,40 @@ for (i = 1; i < operand3; i++) >operand0[i] = operand0[i - 1]

Re: [PATCH v3] Var-Tracking: Typedef pointer_mux as decl_or_value

2023-05-10 Thread Richard Sandiford via Gcc-patches
Thanks, mostly looks good to me. Some minor comments below. pan2...@intel.com writes: > From: Pan Li > > The decl_or_value is defined as void * before this PATCH. It will take > care of both the tree_node and rtx_def. Unfortunately, given a void > pointer cannot tell the input is tree_node or

Re: [PATCH] vect: Missed opportunity to use [SU]ABD

2023-05-10 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, May 10, 2023 at 11:49 AM Richard Biener > wrote: >> >> On Wed, May 10, 2023 at 11:01 AM Richard Sandiford >> wrote: >> > >> > Oluwatamilore Adebayo writes: >> > > From 0b5f469171c340ef61a48a31877d495bb77bd35f Mon Sep 17 00:00:00 2001 >> > > From: oluade01 >> >

Re: [PATCH v2] Var-Tracking: Typedef pointer_mux as decl_or_value

2023-05-10 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Wed, May 10, 2023 at 07:57:05PM +0800, pan2...@intel.com wrote: >> --- a/gcc/var-tracking.cc >> +++ b/gcc/var-tracking.cc >> @@ -116,9 +116,14 @@ >> #include "fibonacci_heap.h" >> #include "print-rtl.h" >> #include "function-abi.h" >> +#include "mux-utils.h" >> >>

Re: [PATCH] vect: Missed opportunity to use [SU]ABD

2023-05-10 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From 0b5f469171c340ef61a48a31877d495bb77bd35f Mon Sep 17 00:00:00 2001 > From: oluade01 > Date: Fri, 14 Apr 2023 10:24:43 +0100 > Subject: [PATCH 1/4] Missed opportunity to use [SU]ABD > > This adds a recognition pattern for the non-widening > absolute difference

Re: [PATCH] Var-Tracking: Leverage pointer_mux for decl_or_value

2023-05-10 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 10 May 2023, pan2...@intel.com wrote: > >> From: Pan Li >> >> The decl_or_value is defined as void * before this PATCH. It will take >> care of both the tree_node and rtx_def. Unfortunately, given a void >> pointer cannot tell the input is tree_node or rtx_def.

Re: [PATCH 2/2] aarch64: Improve register allocation for lane instructions

2023-05-10 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, May 10, 2023 at 12:05 AM Richard Sandiford via Gcc-patches > wrote: >> >> Andrew Pinski writes: >> > On Tue, May 9, 2023 at 11:02 AM Richard Sandiford via Gcc-patches >> > wrote: >> >> >>

[PATCH 1/2] aarch64: Fix cut-&-pasto in aarch64-sve2-acle-asm.exp

2023-05-09 Thread Richard Sandiford via Gcc-patches
aarch64-sve2-acle-asm.exp tried to prevent --with-cpu/tune from affecting the results, but it used sve_flags rather than sve2_flags. This was a silent failure when running the full testsuite, but was a fatal error when running the harness individually. Tested on aarch64-linux-gnu, pushed to

Re: [PATCH V4] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-09 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, Richards. I would like to give more information about this patch so that > it will make this patch easier for you to review. > > Currently, I saw we have 3 situations that we need to handle in case of loop > control IV in auto-vectorization: > 1. Single rgroup loop control

Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit

2023-05-09 Thread Richard Sandiford via Gcc-patches
"Li, Pan2" writes: > After the bits patch like below. > > rtx_def code 16 => 8 bits. > rtx_def mode 8 => 16 bits. > tree_base code unchanged. > > The structure layout of both the rtx_def and tree_base will be something > similar as below. As I understand, the lower 8-bits of tree_base will be >

[PATCH 6/6] aarch64: Avoid hard-coding specific register allocations

2023-05-09 Thread Richard Sandiford via Gcc-patches
Some tests hard-coded specific allocations for temporary registers, whereas the RA should be free to pick anything that doesn't force unnecessary moves or spills. gcc/testsuite/ * gcc.target/aarch64/asimd-mul-to-shl-sub.c: Allow any register allocation for temporary results,

[PATCH 5/6] aarch64: Relax FP/vector register matches

2023-05-09 Thread Richard Sandiford via Gcc-patches
There were many tests that used [0-9] to match an FP or vector register, but that should allow any of 0-31 instead. asm-x-constraint-1.c required s0-s7, but that's the range for "y" rather than "x". "x" allows s0-s15. sve/pcs/return_9.c required z2-z7 (the initial set of available
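To illustrate the first point in this preview: a character class such as [0-9] matches exactly one digit, so it silently excludes register numbers 10 to 31. The sketch below uses POSIX extended regular expressions from C purely as an illustration. The GCC testsuite itself uses Tcl regular expressions, and the patterns and register names shown are hypothetical examples rather than text taken from the patch.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT matches the POSIX extended regex PATTERN, else 0.
   A pattern that fails to compile is treated as "no match".  */
static int matches(const char *pattern, const char *text)
{
    regex_t re;
    int ok;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    ok = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

With this helper, matches("^s[0-9]$", "s7") succeeds but matches("^s[0-9]$", "s15") fails, whereas the relaxed form matches("^s[0-9]+$", "s15") accepts any register number.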

[PATCH 2/6] aarch64: Allow moves after tied-register intrinsics

2023-05-09 Thread Richard Sandiford via Gcc-patches
Some ACLE intrinsics map to instructions that tie the output operand to an input operand. If all the operands are allocated to different registers, and if MOVPRFX can't be used, we will need a move either before the instruction or after it. Many tests only matched the "before" case; this patch

[PATCH 4/6] aarch64: Relax predicate register matches

2023-05-09 Thread Richard Sandiford via Gcc-patches
Most governing predicate operands require p0-p7, but some instructions also allow p8-p15. Non-gp uses of predicates often also allow all of p0-p15. This patch fixes up cases where we required p0-p7 unnecessarily. In some cases we match the definition (typically a comparison, PFALSE or PTRUE),

[PATCH 3/6] aarch64: Relax ordering requirements in SVE dup tests

2023-05-09 Thread Richard Sandiford via Gcc-patches
Some of the svdup tests expand to a SEL between two constant vectors. This patch allows the constants to be formed in either order. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/dup_s16.c: When using SEL to select between two constant vectors, allow the constant moves to appear

[PATCH 1/6] aarch64: Fix move-after-intrinsic function-body tests

2023-05-09 Thread Richard Sandiford via Gcc-patches
Some of the SVE ACLE asm tests tried to be agnostic about the instruction order, but only one of the alternatives was exercised in practice. This patch fixes latent typos in the other versions. gcc/testsuite/ * gcc.target/aarch64/sve2/acle/asm/aesd_u8.c: Fix expected register

[PATCH 0/6] aarch64: Avoid hard-coding specific register allocations

2023-05-09 Thread Richard Sandiford via Gcc-patches
I have a patch that seems to improve register allocation for SIMD lane operations, and for similar instructions that require a reduced register range. However, it showed that a lot of asm tests are sensitive to the current register allocation. This patch series tries to correct the affected

[PATCH] ira: Don't create copies for earlyclobbered pairs

2023-05-05 Thread Richard Sandiford via Gcc-patches
This patch follows on from g:9f635bd13fe9e85872e441b6f3618947f989909a ("the previous patch"). To start by quoting that: If an insn requires two operands to be tied, and the input operand dies in the insn, IRA acts as though there were a copy from the input to the output with the same execution

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-05-03 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, 28 Apr 2023, Andre Vieira (lists) wrote: > >> This patch replaces the existing tree_code widen_plus and widen_minus >> patterns with internal_fn versions. >> >> DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides >> convenience wrappers

Re: [PATCH v2] MIPS: add speculation_barrier support

2023-05-03 Thread Richard Sandiford via Gcc-patches
YunQiang Su writes: > speculation_barrier for MIPS needs sync+jr.hb (r2+), > so we implement __speculation_barrier in libgcc, like arm32 does. Looks reasonable, but do you have a source for the fallback pre-r2 handling? (Thanks for adding that btw, since I realise it's not your focus here.)

[PATCH 2/2] aarch64: Fix ABI handling of aligned enums [PR109661]

2023-05-03 Thread Richard Sandiford via Gcc-patches
aarch64_function_arg_alignment has traditionally taken the alignment of a scalar type T from TYPE_ALIGN (TYPE_MAIN_VARIANT (T)). This is supposed to discard any user alignment and give the alignment of the underlying fundamental type. PR109661 shows that this did the wrong thing for enums with a

[PATCH 1/2] aarch64: Rename abi_break parameters [PR109661]

2023-05-03 Thread Richard Sandiford via Gcc-patches
aarch64_function_arg_alignment has two related abi_break parameters: abi_break for a change in GCC 9, and abi_break_packed for a related follow-on change in GCC 13. In a sense, abi_break_packed is a "subfix" of abi_break. PR109661 now requires a third ABI break that is independent of the other

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-02 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 2 May 2023 at 17:32, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > On Tue, 2 May 2023 at 14:56, Richard Sandiford >> > wrote: >> >> > [aarch64] Improve code-gen for vector initialization with single >> >> > constant element. >> >> > >>

Re: [PATCH] target: [PR109657] (a ? -1 : 0) | b could be optimized better for aarch64

2023-05-02 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > There is no canonical form for this case defined. So the aarch64 backend needs > a pattern to match both of these forms. > > The forms are: > (set (reg/i:SI 0 x0) > (if_then_else:SI (eq (reg:CC 66 cc) > (const_int 0 [0])) > (reg:SI

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-02 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 2 May 2023 at 14:56, Richard Sandiford > wrote: >> > [aarch64] Improve code-gen for vector initialization with single constant >> > element. >> > >> > gcc/ChangeLog: >> > * config/aarch64/aarch64.cc (aarch64_expand_vector_init): Tweak >> > condition

Re: [PATCH v5 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.

2023-05-02 Thread Richard Sandiford via Gcc-patches
Michael Collison writes: > While working on autovectorizing for the RISCV port I encountered an issue > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode), > where GET_MODE_NUNITS is equal to one.

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-02 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 25 Apr 2023 at 16:29, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > Hi Richard, >> > While digging thru aarch64_expand_vector_init, I noticed it gives >> > priority to loading a constant first: >> > /* Initialise a vector which is

Re: [PATCH] v2: Add targetm.libm_function_max_error

2023-04-28 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek via Gcc-patches writes: > Hi! > > On Thu, Apr 27, 2023 at 10:34:59AM +, Richard Biener wrote: >> OK. As said the patch itself looks good to me, let's go ahead. We >> have plenty of time to backtrack until GCC 14. > > Thanks. Unfortunately when I started using it, I've

Re: [PATCH] VECT: Add decrement IV iteration loop control by variable amount support

2023-04-26 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. > Would you mind take a look at the loop control part again: > > static gcond * > vect_set_loop_condition_partial_vectors (class loop *loop, > loop_vec_info loop_vinfo, tree niters, > tree final_iv, bool niters_maybe_zero, > gimple_stmt_iterator

Re: [PATCH] VECT: Add decrement IV iteration loop control by variable amount support

2023-04-26 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thank you so much for pointing out this issue. > > After reading your comments carefully, I need to revise > "vect_set_loop_controls_by_while_len" in loop control like this: > > vect_set_loop_controls_by_while_len > ... > tree X = NULL_TREE; > FOR_EACH_VEC_ELT

Re: [PATCH] VECT: Add decrement IV iteration loop control by variable amount support

2023-04-26 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richard so much. > >>> I don't think that's guaranteed by the proposed definition of WHILE_LEN. >>> The first int64_t WHILE_LEN could come up short, and return something >>> less than VF/2. > > I am so sorry that the comments of

Re: [PATCH] VECT: Add decrement IV iteration loop control by variable amount support

2023-04-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 25 Apr 2023, Richard Sandiford wrote: >> juzhe.zh...@rivai.ai writes: >> > diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc >> > index a52277abdbf..54845a62298 100644 >> > --- a/gcc/tree-ssa-loop-manip.cc >> > +++ b/gcc/tree-ssa-loop-manip.cc

Re: [PATCH] VECT: Add decrement IV iteration loop control by variable amount support

2023-04-25 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 6e81dc05e0e..5f44def90d3 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -127,6 +127,7 @@ init_internal_fns () > #define cond_binary_direct { 1, 1, true } > #define cond_ternary_direct {

Re: [aarch64] Code-gen for vector initialization involving constants

2023-04-25 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi Richard, > While digging thru aarch64_expand_vector_init, I noticed it gives > priority to loading a constant first: > /* Initialise a vector which is part-variable. We want to first try > to build those lanes which are constant in the most efficient way we

Re: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2023-04-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, Apr 20, 2023 at 3:24 PM Andre Vieira (lists) via Gcc-patches > wrote: >> >> Rebased all three patches and made some small changes to the second one: >> - removed sub and abd optabs from commutative_optab_p, I suspect this >> was a copy paste mistake, >> - removed

Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions

2023-04-24 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, April 21, 2023 6:19 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> >> Subject: Re: [PATCH] RFC: New compact syntax for insn and insn_split in >> Machine

Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v

2023-04-24 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > gcc/ChangeLog: > * tree-ssa-forwprop.cc (is_combined_permutation_identity): Try to > simplify two successive VEC_PERM_EXPRs with single operand and same > mask, where mask chooses elements in reverse order. > > gcc/testsuite/ChangeLog: > *

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-04-24 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > [aarch64] Recursively initialize even and odd sub-parts and merge with zip1. > > gcc/ChangeLog: > * config/aarch64/aarch64.cc (aarch64_expand_vector_init_fallback): > Rename > aarch64_expand_vector_init to this, and remove interleaving case. >

Re: [PATCH] aarch64: PR target/109406 Add support for SVE2 unpredicated MUL

2023-04-24 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov writes: > Hi all, > > SVE2 supports an unpredicated vector integer MUL form that we can emit from > our SVE expanders > without using up a predicate register. This patch does so. > As the SVE MUL expansion currently is templated away through a code iterator > I did not split it

Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions

2023-04-24 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Tamar Christina writes: >> Hi All, >> >> This patch adds support for a compact syntax for specifying constraints in >> instruction patterns. Credit for the idea goes to Richard Earnshaw. >> >> I am sending up this RFC to get feedback for its inclusion in GCC 14. >>

Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions

2023-04-21 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > This patch adds support for a compact syntax for specifying constraints in > instruction patterns. Credit for the idea goes to Richard Earnshaw. > > I am sending up this RFC to get feedback for its inclusion in GCC 14. > With this new syntax we want a clean

Re: [match.pd] [SVE] Add pattern to transform svrev(svrev(v)) --> v

2023-04-21 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Wed, 19 Apr 2023 at 16:17, Richard Biener > wrote: >> >> On Wed, Apr 19, 2023 at 11:21 AM Prathamesh Kulkarni >> wrote: >> > >> > On Tue, 11 Apr 2023 at 19:36, Prathamesh Kulkarni >> > wrote: >> > > >> > > On Tue, 11 Apr 2023 at 14:17, Richard Biener >> > >

Re: [PATCH v2] Leveraging the use of STP instruction for vec_duplicate

2023-04-21 Thread Richard Sandiford via Gcc-patches
"Victor L. Do Nascimento" writes: > The backend pattern for storing a pair of identical values in 32 and > 64-bit modes with the machine instruction STP was missing, and > multiple instructions were needed to reproduce this behavior as a > result of failed RTL pattern match in the combine pass. >

Re: [aarch64] Use force_reg instead of copy_to_mode_reg

2023-04-21 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi Richard, > Based on your suggestions in the other thread, the patch uses force_reg > to avoid creating pseudo if value is already in a register. > Bootstrap+test passes on aarch64-linux-gnu. > OK to commit ? > > Thanks, > Prathamesh > > [aarch64] Use force_reg

Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

2023-04-21 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > On 20/04/2023 17:13, Richard Sandiford wrote: >> "Andre Vieira (lists)" writes: >>> On 20/04/2023 15:51, Richard Sandiford wrote: "Andre Vieira (lists)" writes: > Hi all, > > This is a series of patches/RFCs to implement support in GCC to be

Re: [PATCH] MAINTAINERS: add Vineet Gupta to write after approval

2023-04-21 Thread Richard Sandiford via Gcc-patches
Palmer Dabbelt writes: > On Thu, 20 Apr 2023 09:55:23 PDT (-0700), Vineet Gupta wrote: >> ChangeLog: >> >> * MAINTAINERS (Write After Approval): Add myself. >> >> (Ref: <680c7bbe-5d6e-07cd-8468-247afc65e...@gmail.com>) >> >> Signed-off-by: Vineet Gupta >> --- >> MAINTAINERS | 1 + >> 1

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-04-21 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi, > I tested the interleave+zip1 for vector init patch and it segfaulted > during bootstrap while trying to build > libgfortran/generated/matmul_i2.c. > Rebuilding with --enable-checking=rtl showed out of bounds access in > aarch64_unzip_vector_init in following

Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

2023-04-20 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > On 20/04/2023 15:51, Richard Sandiford wrote: >> "Andre Vieira (lists)" writes: >>> Hi all, >>> >>> This is a series of patches/RFCs to implement support in GCC to be able >>> to target AArch64's libmvec functions that will be/are being added to glibc. >>> We

Re: [PATCH 1/X] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS

2023-04-20 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi, > > This patch replaces the uses of simd_clone_subparts with > TYPE_VECTOR_SUBPARTS and removes the definition of the first. > > gcc/ChangeLog: > > * omp-sind-clone.cc (simd_clone_subparts): Remove. > (simd_clone_init_simd_arrays): Replace

Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

2023-04-20 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi all, > > This is a series of patches/RFCs to implement support in GCC to be able > to target AArch64's libmvec functions that will be/are being added to glibc. > We have chosen to use the omp pragma '#pragma omp declare variant ...' > with a simd construct as

Re: [PATCH v4 07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.

2023-04-20 Thread Richard Sandiford via Gcc-patches
writes: > Yes, like kito said. > We won't enable VNx1DImode in auto-vectorization so it's meaningless to fix > it here. > We dynamically adjust the minimum vector-length for different '-march' according > to RVV ISA specification. > So we strongly suggest that we should drop this fix. I think the

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 20 Apr 2023, Richard Sandiford wrote: > >> "juzhe.zh...@rivai.ai" writes: >> > OK. Thanks Richard. >> > So let me conclude: >> > 1. Community agree that I should support variable IV in the middle-end. >> > 2. We can keep WHILE_LEN pattern when "not only final

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-20 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > OK. Thanks Richard. > So let me conclude: > 1. Community agree that I should support variable IV in the middle-end. > 2. We can keep WHILE_LEN pattern when "not only final iteration is partial". > And I should describe it more clearly in the doc. > > I should

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-20 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richard reminding me. I originally think community does not allow me > support variable amount IV and let me do this in RISC-V backend. No, I think that part should and needs to be done in the middle-end, since if the initial IVs are incorrect, it's very

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-20 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, Richards. > Since GCC 14 is open and this patch has been bootstrapped && tested on X86. > Is this patch supporting variable IV OK for the trunk ? Doesn't the patch need updating based on the previous discussion? I thought the outcome was that WHILE_LEN isn't a simple MIN

Re: [PATCH] install.texi: Document --enable-decimal-float for AArch64

2023-04-18 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > When I committed the patches to enable support for DFP on AArch64, I > forgot to update the installation documentation. > > This patch adds AArch64 as needed (same as i386/x86_64). > > OK for trunk and gcc-13? OK for both, thanks. Richard > 2023-04-17

Re: [PATCH v3] constraint: fix relaxed memory and repeated constraint handling

2023-04-18 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 4/18/23 07:02, Richard Sandiford via Gcc-patches wrote: >> "Victor L. Do Nascimento" writes: >>> The function `constrain_operands' lacked the logic to consider relaxed >>> memory constraints when "traditional"

Re: [PATCH v3] constraint: fix relaxed memory and repeated constraint handling

2023-04-18 Thread Richard Sandiford via Gcc-patches
"Victor L. Do Nascimento" writes: > The function `constrain_operands' lacked the logic to consider relaxed > memory constraints when "traditional" memory constraints were not > satisfied, creating potential issues as observed during the reload > compilation pass. > > In addition, it was observed

Re: [PATCH] constraint: fix relaxed memory and repeated constraint handling

2023-04-18 Thread Richard Sandiford via Gcc-patches
"Victor L. Do Nascimento" writes: > The function `constrain_operands' lacked the logic to consider relaxed > memory constraints when "traditional" memory constraints were not > satisfied, creating potential issues as observed during the reload > compilation pass. > > In addition, it was observed

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-18 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: >>> But the issue is the same in the reverse with WHILE_LEN, no? >>>WHILE_LEN just computes a scalar value - you seem to suggest >>>there's a hidden side-effect of "coalescing" the result with >>>a hardware vector length register? I don't think that's good design.

[PATCH] aarch64: Don't trust TYPE_ALIGN for pointers [PR108910]

2023-04-13 Thread Richard Sandiford via Gcc-patches
The aarch64 PCS rules ignore user alignment for scalars and vectors and use the "natural" alignment of the type. GCC tried to calculate that natural alignment using: TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) But as discussed in the PR, it's possible that the main variant of a pointer type is an

Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit

2023-04-13 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Yeah, like kito said. > Turns out the tuple type model in ARM SVE is the optimal solution for RVV. > And we like ARM SVE style implementation. > > And now we see swapping rtx_code and mode in rtx_def can make rtx_def overall > not exceed 64 bit. > But it seems that there is still

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-12 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: >>> I think that already works for them (could be misremembering). >>> However, IIUC, they have no special instruction to calculate the >>> length (unlike for RVV), and so it's open-coded using vect_get_len. > > Yeah, the current flow using min, sub, and then min in

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 12 Apr 2023, juzhe.zh...@rivai.ai wrote: > >> >> >> Thanks for the detailed explanation. Just to clarify - with RVV >> >> there's only a single mask register, v0.t, or did you want to >> >> say an instruction can only specify a single mask register? >> >> RVV

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-04-12 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Thu, 6 Apr 2023 at 16:05, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > On Tue, 4 Apr 2023 at 23:35, Richard Sandiford >> > wrote: >> >> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> >> >

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-11 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richards. > Kindly Ping this patch. > This is the most important patch for RVV auto-vectorization support. > Bootstrapped on X86 has passed. Can it wait for GCC 14? It doesn't seem like stage 4 material. Also, pinging after 5 days seems a bit soon. It's

Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit

2023-04-11 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 11 Apr 2023, Richard Sandiford wrote: > >> writes: >> > ARM SVE has:svint8_t, svint8x2_t, svint8x3_t, svint8x4_t >> > As far as I known, they don't have tuple type for partial vector. >> >> Yeah, there are no separate types for partial vectors, but there >> are

Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit

2023-04-11 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw writes: > On 11/04/2023 10:46, Richard Sandiford via Gcc-patches wrote: >> writes: >>> ARM SVE has:svint8_t, svint8x2_t, svint8x3_t, svint8x4_t >>> As far as I known, they don't have tuple type for partial vector. >> >> Yeah, there ar

Re: [PATCH v2][RFC] vect: Verify that GET_MODE_NUNITS is greater than one for vect_grouped_store_supported

2023-04-11 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Mon, Mar 27, 2023 at 6:02 PM Kevin Lee wrote: >> >> This patch is a proper fix to the previous patch >> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614463.html >> vect_grouped_store_supported checks if the count is a power of 2, but >> doesn't

Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 16-bit

2023-04-11 Thread Richard Sandiford via Gcc-patches
writes: > ARM SVE has:svint8_t, svint8x2_t, svint8x3_t, svint8x4_t > As far as I known, they don't have tuple type for partial vector. Yeah, there are no separate types for partial vectors, but there are separate modes. E.g. VNx2QI is a partial vector of QIs, with each QI stored in a 64-bit

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-04-06 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 4 Apr 2023 at 23:35, Richard Sandiford > wrote: >> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> > b/gcc/config/aarch64/aarch64-sve-builtins-base.cc >> > index cd9cace3c9b..3de79060619 100644 >> > ---

Re: [PATCH] dse: Handle SUBREGs of word REGs differently for WORD_REGISTER_OPERATIONS targets [PR109040]

2023-04-06 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 4/5/23 10:48, Jakub Jelinek wrote: >> On Wed, Apr 05, 2023 at 10:17:59AM -0600, Jeff Law wrote: It is true that an instruction like (insn 8 7 9 2 (set (reg:HI 141) (subreg:HI (reg:SI 142) 0)) "aauu.c":6:18 181 {*movhi_internal} (nil))

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-04-04 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Mon, 13 Mar 2023 at 13:03, Richard Biener wrote: >> On GIMPLE it would be >> >> _1 = { a, ... }; // (a) >> _2 = { _1, ... }; // (b) >> >> but I'm not sure if (b), a VL CTOR of fixed len(?) sub-vectors is >> possible? But at least a CTOR of vectors is what we

Re: [PATCHv4] [AARCH64] Fix PR target/103100 -mstrict-align and memset on not aligned buffers

2023-04-04 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > The problem here is that aarch64_expand_setmem does not change the alignment > for strict alignment case. > This is version 4 of the fix, major changes from the last version is fixing > the way store pairs are handled which allows handling of storing 2 SI

Re: [PATCH] aarch64, builtins: Include PR registers in FUNCTION_ARG_REGNO_P etc. [PR109254]

2023-03-31 Thread Richard Sandiford via Gcc-patches
Thanks for the patch and sorry for the slow reply. Jakub Jelinek writes: > Hi! > > The testcase in the PR (which unfortunately because of my lack of experience > with SVE I'm not able to turn into a runtime testcase that verifies it) > is miscompiled on aarch64-linux in the regname pass, because

Re: [PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594]

2023-03-31 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > On Fri, Mar 31, 2023 at 03:06:41PM +0100, Richard Sandiford wrote: >> This is an alternative presentation of the change that we discussed >> a few weeks ago, and that you already tested: >> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613486.html >> >>

Re: [PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594]

2023-03-31 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > On Thu, Mar 09, 2023 at 12:10:51PM +, Richard Sandiford wrote: >> g:c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f uses nonzero_bits >> information to convert sign_extends into zero_extends. >> That change is semantically correct in itself, but for the >> testcase in

Re: [PATCH v2 1/2] combine: Split code out of make_compound_operation_int

2023-03-31 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > Hi! > > On Thu, Mar 09, 2023 at 12:09:59PM +, Richard Sandiford wrote: >> This patch just splits some code out of make_compound_operation_int >> into a new function called make_compound_operation_and. It is a >> prerequisite for the fix for PR106594. >> >> It

Re: [PATCH] vect: Verify that GET_MODE_NUNITS is greater than one.

2023-03-31 Thread Richard Sandiford via Gcc-patches
Michael Collison writes: > While working on autovectorizing for the RISCV port I encountered an issue > where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is > evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode), > where GET_MODE_NUNITS is equal to one.
