[PATCH]AArch64 backport Neoverse and Cortex CPU definitions

2024-11-08 Thread Tamar Christina
Hi All, This is a conservative backport of a few core definitions, backporting only the definitions themselves and mapping each to the closest cost model that exists on the branches. Bootstrapped Regtested on aarch64-none-linux-gnu on branches and no issues. Ok for GCC 13 and 14? Thanks, Tamar gcc/C

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Jeff Law > Sent: Thursday, November 7, 2024 8:08 PM > To: Tamar Christina ; Li, Pan2 ; > Richard Biener > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > rdapp@gmail.com > Subject: Re: [PATCH v2 01/10

RE: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:30 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Tamar Christina > > Subject: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO > > The following introduces LOOP_VIN

[PATCH][ivopts]: perform affine fold to unsigned on non address expressions. [PR114932]

2024-11-07 Thread Tamar Christina
Hi All, When the patch for PR114074 was applied we saw a good boost in exchange2. This boost was partially caused by a simplification of the addressing modes. With the patch applied IV opts saw the following form for the base addressing; Base: (integer(kind=4) *) &block + ((sizetype) ((unsigne

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 1:45 AM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 12:57 PM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE

RE: [PATCH 5/5] Allow multiple vectorized epilogs via --param vect-epilogues-nomask=N

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:32 PM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > ; Richard Sandiford > Subject: [PATCH 5/5] Allow multiple vectorized epilogs via --param > vect-epilog

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-06 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Wednesday, November 6, 2024 1:31 PM > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; Tamar Christina ; > juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; > rdapp@gmail.com > Subject: RE

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-11-05 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Monday, October 14, 2024 4:08 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 2/2]middle-end: use two's complement equality when

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-11-05 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Monday, October 14, 2024 4:08 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 1/2]middle-end: refactor type to be explicit in > oper

RE: [PATCH]middle-end: Handle more gcond lowering [PR117176]

2024-10-21 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, October 21, 2024 9:55 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH]middle-end: Handle more gcond lowering [PR117176] > > On Mon, 21 Oct 2024, Tamar Christ

[PATCH]middle-end: Handle more gcond lowering [PR117176]

2024-10-20 Thread Tamar Christina
Hi All, For boolean mask handling we have to lower BIT_NOT_EXPR for correctness into BIT_XOR_EXPR. Normally this is done through vect_recog_bool_pattern by following the defs of the gimple_assign. In the PR we ICE because early exits have the comparison inside the gcond itself and so vect_recog_
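
A rough illustration of why this lowering matters, written independently of the patch: when a boolean is held as 0 or 1 in a wider integer lane, a bitwise NOT flips every bit and the result still tests as true, whereas XOR with 1 stays within {0, 1}. The helper names are hypothetical and the 8-bit lane width is an assumption chosen for readability.

```c
#include <stdint.h>

/* Hypothetical helpers: a boolean held in an 8-bit lane as 0 or 1. */

/* Bitwise NOT flips all eight bits, so 1 becomes 0xFE (still non-zero). */
uint8_t bool_not_bitwise(uint8_t b)
{
    return (uint8_t)~b;
}

/* XOR with 1 flips only the low bit, so the result stays in {0, 1}. */
uint8_t bool_not_xor(uint8_t b)
{
    return (uint8_t)(b ^ 1u);
}
```

For an input of 1, the first helper returns a non-zero value while the second returns 0, which is the behaviour a mask negation needs.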

RE: [PATCH 2/2] Add a new permute optimization step in SLP

2024-10-18 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, October 18, 2024 11:03 AM > To: Tamar Christina > Cc: Christoph Müllner ; gcc-patches@gcc.gnu.org; > Philipp Tomsich ; Jeff Law ; > Robin Dapp > Subject: RE: [PATCH 2/2] Add a new permute optimization

RE: [PATCH] SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.

2024-10-18 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, October 17, 2024 6:05 PM > To: Jennifer Schmitz > Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; Tamar > Christina > Subject: Re: [PATCH] SVE intrinsics: Add fold_active_lanes_to method to > re

[PATCH]middle-end: Fix GSI for gcond root [PR117140]

2024-10-18 Thread Tamar Christina
Hi All, When finding the gsi to use for code of the root statements we should use the one of the original statement rather than the gcond which may be inside a pattern. Without this the emitted instructions may be discarded later. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

RE: [PATCH 2/2] Add a new permute optimization step in SLP

2024-10-17 Thread Tamar Christina
Hi Christoph, > -Original Message- > From: Christoph Müllner > Sent: Tuesday, October 15, 2024 3:57 PM > To: gcc-patches@gcc.gnu.org; Philipp Tomsich ; Tamar > Christina ; Richard Biener > Cc: Jeff Law ; Robin Dapp ; > Christoph Müllner > Subject: [PATC

RE: [PATCH 1/2] Reduce lane utilization in VEC_PERM_EXPRs for two_operator nodes

2024-10-17 Thread Tamar Christina
Hi Christoph, > -Original Message- > From: Christoph Müllner > Sent: Tuesday, October 15, 2024 3:57 PM > To: gcc-patches@gcc.gnu.org; Philipp Tomsich ; Tamar > Christina ; Richard Biener > Cc: Jeff Law ; Robin Dapp ; > Christoph Müllner > Subject: [PATCH 1/2] Re

RE: [PATCH 4/4]middle-end: create the longest possible zero extend chain after overwidening

2024-10-15 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, October 15, 2024 1:42 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 4/4]middle-end: create the longest possible zero extend > chain > after overwidening >

RE: [PATCH 2/4]middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizes

2024-10-15 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, October 15, 2024 1:22 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 2/4]middle-end: Fix VEC_PERM_EXPR lowering since > relaxation of vector sizes > > On Mon, 1

RE: [PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

2024-10-15 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, October 15, 2024 1:20 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: RE: [PATCH 1/4]middle-end: support multi-step zero-extends using > VEC_PERM_EXPR > > On Tue, 15 Oct

RE: [PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

2024-10-15 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, October 15, 2024 12:13 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 1/4]middle-end: support multi-step zero-extends using > VEC_PERM_EXPR > > On Tue, 15 Oct

Re: [PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

2024-10-15 Thread Tamar Christina
Hi, Thanks for the look, The 10/15/2024 09:54, Richard Biener wrote: > On Mon, 14 Oct 2024, Tamar Christina wrote: > > > Hi All, > > > > This patch series adds support for a target to do a direct conversion for > > zero > > extends using permutes. >

[PATCH]AArch64 re-enable memory access costing after SLP change.

2024-10-15 Thread Tamar Christina
Hi All, While chasing down a costing difference between SLP and non-SLP for memory access costing I noticed that at some point the SLP and non-SLP costing have diverged. It used to be we only supported LOAD_LANES in SLP and so the non-SLP costing was working fine. But with the change to SLP only

RE: [PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

2024-10-14 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, October 14, 2024 7:34 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de > Subject: Re: [PATCH 1/4]middle-end: support multi-step zero-extends using > VEC_PERM_EXPR >

RE: [PATCH 3/4]AArch64: enable zero-extends using TBLs for Adv. SIMD

2024-10-14 Thread Tamar Christina
Hi Kyrill, > -Original Message- > From: Kyrylo Tkachov > Sent: Monday, October 14, 2024 12:15 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org; Richard > Sandiford > Subject: Re: [PATCH 3/4]AArch64: enable zero-extends

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-10-14 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Monday, September 23, 2024 8:41 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 2/2]middle-end: use two's complement equality when > comp

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-10-14 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Monday, September 23, 2024 8:41 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 1/2]middle-end: refactor type to be explicit in > operand_equal_p

[PATCH 4/4]middle-end: create the longest possible zero extend chain after overwidening

2024-10-14 Thread Tamar Christina
Hi All, Consider loops such as: void test9(unsigned char *x, long long *y, int n, unsigned char k) { for(int i = 0; i < n; i++) { y[i] = k + x[i]; } } where today we generate: .L5: ldr q29, [x5], 16 add x4, x4, 128 uaddl v1.8h, v29.8b, v30.8b

[PATCH 3/4]AArch64: enable zero-extends using TBLs for Adv. SIMD

2024-10-14 Thread Tamar Christina
Hi All, In this patch series I'm adding support for zero extending using permutes instead of requiring multi-step decomposition. This codegen has the benefit of needing fewer instructions and having much higher throughput than uxtl. We previously replaced pairs of uxtl/uxtl2s with ZIPs to increa
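
The idea of zero extending with a single permute can be sketched in portable C. This is an illustration only, not the patch's code: the function name, the four-lane width, and the use of an explicit zero byte in the source table are assumptions. On the real hardware the table lookup would be a TBL instruction rather than a loop.

```c
#include <stdint.h>

/* Sketch only: zero-extend four bytes to four 32-bit lanes using one
   byte-level table lookup.  Selector 4 picks a byte that is always zero. */
void zext_u8_to_u32_by_permute(const uint8_t in[4], uint32_t out[4])
{
    /* Source table: the four input bytes followed by one zero byte. */
    const uint8_t src[5] = { in[0], in[1], in[2], in[3], 0 };

    /* One selector per output byte, laid out for little-endian lanes. */
    static const uint8_t sel[16] = {
        0, 4, 4, 4,  1, 4, 4, 4,  2, 4, 4, 4,  3, 4, 4, 4
    };

    uint8_t bytes[16];
    for (int i = 0; i < 16; i++)
        bytes[i] = src[sel[i]];

    /* Reassemble each lane explicitly so the result does not depend on
       the host's endianness. */
    for (int lane = 0; lane < 4; lane++)
        out[lane] = (uint32_t)bytes[4 * lane]
                  | (uint32_t)bytes[4 * lane + 1] << 8
                  | (uint32_t)bytes[4 * lane + 2] << 16
                  | (uint32_t)bytes[4 * lane + 3] << 24;
}
```

The point of the sketch is that the widening from 8 to 32 bits happens in one lookup step, rather than as an 8-to-16 step followed by a 16-to-32 step.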

[PATCH 2/4]middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizes

2024-10-14 Thread Tamar Christina
Hi All, In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector than the size of the input vectors. However various passes and transformations were not updated to account for this. I have patches in this area that I will be upstreaming with individual patches that expose

[PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

2024-10-14 Thread Tamar Christina
Hi All, This patch series adds support for a target to do a direct conversion for zero extends using permutes. To do this it uses a target hook use_permute_for_promotio which must be implemented by targets. This hook is used to indicate: 1. can a target do this for the given modes. 2. is it p

[PATCH 2/3]AArch64: support encoding integer immediates using floating point moves

2024-10-14 Thread Tamar Christina
Hi All, This patch extends our immediate SIMD generation cases to support generating integer immediates using floating point operation if the integer immediate maps to an exact FP value. As an example: uint32x4_t f1() { return vdupq_n_u32(0x3f80); } currently generates: f1: adr
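
To make "maps to an exact FP value" concrete, here is a minimal sketch of a check on a 32-bit lane. It is not the patch's implementation. It assumes the usual expansion of the 8-bit AArch64 floating-point immediate into single precision (sign, one inverted exponent bit, five copies of that bit, six further bits, then 19 zero bits), and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch: does this 32-bit pattern equal a single-precision value that an
   8-bit FP immediate can expand to?  Assumed expanded layout:
   sign : NOT(b) : b b b b b : c d e f g h : 19 zero bits.  */
bool u32_is_fp8_immediate(uint32_t bits)
{
    /* The low 19 mantissa bits must all be zero. */
    if (bits & 0x7FFFFu)
        return false;

    /* Bits 30..25 must be 100000 or 011111. */
    uint32_t top = (bits >> 25) & 0x3Fu;
    return top == 0x20u || top == 0x1Fu;
}
```

Under these assumptions the bit pattern of 1.0f, 0x3F800000, passes the check, while 0x3F800001 and 0 do not.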

[PATCH 3/3]AArch64: use movi d0, #0 to clear SVE registers instead of mov z0.d, #0

2024-10-14 Thread Tamar Christina
Hi All, This patch changes SVE to use Adv. SIMD fmov 0 to clear SVE registers when not in SVE streaming mode. As the Neoverse Software Optimization guides indicate, SVE mov #0 is not a zero cost move. When in streaming mode we continue to use SVE's mov to clear the registers. Tests have already

[PATCH]middle-end: Save VMAT info in stmt_vec_info as well for SLP for costing.

2024-10-14 Thread Tamar Christina
Hi All, While chasing down a costing discrepancy between SLP and non-SLP I noticed that costing for different VMATs was not working. It looks like the vectorizer for non-SLP stores the VMAT type in STMT_VINFO_MEMORY_ACCESS_TYPE on the stmt_info, but for SLP it stores it in SLP_TREE_MEMORY_ACCESS_T

[PATCH]AArch64: rename the SVE2 psel intrinsics to psel_lane [PR116371]

2024-10-14 Thread Tamar Christina
Hi All, The psel intrinsics, similar to pext, should be named psel_lane. This corrects the naming. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR target/116371 * config/aarch64/aarch64-sve-builtins-sve2.cc (c

[PATCH]middle-end: copy STMT_VINFO_STRIDED_P when DR is replaced [PR116956]

2024-10-14 Thread Tamar Christina
Hi All, When move_dr copies a DR from one statement to another, it seems we've forgotten to copy the STMT_VINFO_STRIDED_P flag. This leaves the new DR in a broken state where it has a non constant stride but isn't marked as strided. This causes the ICE in the PR because dataref analysis fails du

[PATCH][simplify-rtx]: Fix incorrect folding of shift and AND [PR117012]

2024-10-14 Thread Tamar Christina
Hi All, The optimization added in r15-1047-g7876cde25cbd2f is using the wrong operation to check for uniform constant vectors. The author intended to check that all the lanes in the vector are the same and so used CONST_VECTOR_DUPLICATE_P. However this only checks that the vector is created from
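
The difference between "all lanes equal" and "built by repeating something" can be sketched in plain C, independent of GCC's RTL representation. Both function names are hypothetical, and reading the duplicate check as "a repeated pattern of lanes" is an assumption, since the sentence above is cut off.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when every lane holds the same value. */
bool lanes_uniform(const int *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (v[i] != v[0])
            return false;
    return true;
}

/* True when the vector is its first `period` lanes repeated end to end. */
bool lanes_repeat_pattern(const int *v, size_t n, size_t period)
{
    if (period == 0 || n % period != 0)
        return false;
    for (size_t i = period; i < n; i++)
        if (v[i] != v[i - period])
            return false;
    return true;
}
```

A vector such as {1, 2, 1, 2} satisfies the second predicate with a period of 2 but not the first, so a fold that is only valid for uniform vectors must not rely on the weaker test.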

RE: [PATCH]middle-end: support SLP early break

2024-10-11 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, October 11, 2024 8:11 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: RE: [PATCH]middle-end: support SLP early break > > On Thu, 10 Oct 2024, Tamar Christi

RE: [PATCH][PR113816] AArch64: Use SIMD+GPR for logical vector reductions

2024-10-11 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, October 11, 2024 7:52 AM > To: Richard Sandiford > Cc: Jennifer Schmitz ; gcc-patches@gcc.gnu.org; Richard > Earnshaw ; Kyrylo Tkachov > ; Tamar Christina > Subject: Re: [PATCH][PR113816] AArch64:

RE: [PATCH][PR113816] AArch64: Use SIMD+GPR for logical vector reductions

2024-10-10 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, October 10, 2024 8:08 PM > To: Jennifer Schmitz > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw ; > Kyrylo Tkachov ; Tamar Christina > ; rguent...@suse.de > Subject: Re: [PATCH][PR113816] AArch64:

RE: [PATCH][PR113816] AArch64: Use SIMD+GPR for logical vector reductions

2024-10-10 Thread Tamar Christina
Hi Jennifer, > -Original Message- > From: Jennifer Schmitz > Sent: Thursday, October 10, 2024 9:27 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Richard Earnshaw > ; Kyrylo Tkachov ; Tamar > Christina > Subject: [PATCH][PR113816] AArch64: Use SIM

RE: [PATCH]middle-end: support SLP early break

2024-10-10 Thread Tamar Christina
> > e.g. if (a != 0) where a is loop invariant. For instance test_memcmp_1_1 > > in /gcc.dg/memcmp-1.c is such loop. Technically we should be able to > > vectorize such loops, but while we can represent externals in the SLP tree, > > we can't start discovery at them, as no stmt_info for them. >

RE: [PATCH] aarch64: Fix folding of degenerate svwhilele case [PR117045]

2024-10-09 Thread Tamar Christina
Hi Richard, > -Original Message- > From: Richard Sandiford > Sent: Wednesday, October 9, 2024 12:58 PM > To: gcc-patches@gcc.gnu.org > Cc: ktkac...@nvidia.com; Richard Earnshaw ; > Tamar Christina > Subject: [PATCH] aarch64: Fix folding of degenerate svwhilele cas

RE: [PATCH]middle-end: support SLP early break

2024-10-09 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, October 9, 2024 9:20 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: RE: [PATCH]middle-end: support SLP early break > > On Tue, 8 Oct 202

RE: [PATCH]middle-end: support SLP early break

2024-10-08 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, October 2, 2024 1:50 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: support SLP early break > > On Tue, 1 Oct 2024, Tamar Christ

RE: [PATCH] middle-end: reorder masking priority of math functions

2024-10-04 Thread Tamar Christina
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Wednesday, October 2, 2024 5:26 PM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; richard.guent...@gmail.com; > Victor Do Nascimento > Subject: [PATCH] middle-end: reorder masking prior

RE: [PATCH] aarch64: Optimise calls to ldexp with SVE FSCALE instruction

2024-10-04 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Thursday, October 3, 2024 4:45 PM > To: Richard Sandiford > Cc: Soumya AR ; Tamar Christina > ; gcc-patches@gcc.gnu.org; Richard Earnshaw > ; Jennifer Schmitz ; > Pengxuan Zheng (QUIC) > Subject: Re: [PATCH]

RE: [PATCH 2/2]AArch64: support encoding integer immediates using floating point moves

2024-10-02 Thread Tamar Christina
Hi, > -Original Message- > From: Richard Sandiford > Sent: Monday, September 30, 2024 6:33 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 2/2]AArch64: support enc

RE: [PATCH]middle-end: support SLP early break

2024-10-02 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, October 2, 2024 1:50 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: support SLP early break > > On Tue, 1 Oct 2024, Tamar Christ

RE: [PATCH] [PR113816] AArch64: Use SVE bit op reduction for vector reductions

2024-10-02 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Wednesday, October 2, 2024 1:09 PM > To: Richard Sandiford > Cc: Tamar Christina ; Jennifer Schmitz > ; gcc-patches@gcc.gnu.org; Kyrylo Tkachov > > Subject: Re: [PATCH] [PR113816] AArch64: Use SVE bit

RE: [PATCH] [PR113816] AArch64: Use SVE bit op reduction for vector reductions

2024-10-01 Thread Tamar Christina
Hi Jennifer, > -Original Message- > From: Richard Sandiford > Sent: Tuesday, October 1, 2024 12:20 PM > To: Jennifer Schmitz > Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov > Subject: Re: [PATCH] [PR113816] AArch64: Use SVE bit op reduction for vector > reductions > > Jennifer Schmitz w

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-10-01 Thread Tamar Christina
Hi Jennifer, > -Original Message- > From: Jennifer Schmitz > Sent: Tuesday, September 24, 2024 9:23 AM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; Richard Sandiford > ; Kyrylo Tkachov > Subject: Re: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_T

RE: [PATCH] aarch64: Optimise calls to ldexp with SVE FSCALE instruction

2024-10-01 Thread Tamar Christina
Hi Soumya, Nice patch! > -Original Message- > From: Kyrylo Tkachov > Sent: Tuesday, October 1, 2024 7:55 AM > To: Soumya AR > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH] aarch64: Optimise calls to ldexp with SVE FSCALE > instruction > > Hi Soumya > > > On 3

[PATCH]middle-end: support SLP early break

2024-09-30 Thread Tamar Christina
Hi all, This patch introduces feature parity for early break in the SLP only vectorizer. The approach taken here is to treat the early exits as root statements for an SLP tree. This means that we don't need any changes to build_slp to support gconds. Codegen for the gcond itself now has to be

RE: [PATCH] middle-end: Fix ifcvt predicate generation for masked function calls

2024-09-30 Thread Tamar Christina
Hi Victor, Thanks! This looks good to me with one minor comment: > -Original Message- > From: Victor Do Nascimento > Sent: Monday, September 30, 2024 2:34 PM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; richard.guent...@gmail.com; > Victor Do Nascimento

RE: [PATCH 2/2]AArch64: support encoding integer immediates using floating point moves

2024-09-30 Thread Tamar Christina
Thanks for the review, Will get started on it but one question... > -Original Message- > From: Richard Sandiford > Sent: Monday, September 30, 2024 6:33 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@

[PATCH 2/2]AArch64: support encoding integer immediates using floating point moves

2024-09-30 Thread Tamar Christina
Hi All, This patch extends our immediate SIMD generation cases to support generating integer immediates using floating point operation if the integer immediate maps to an exact FP value. As an example: uint32x4_t f1() { return vdupq_n_u32(0x3f80); } currently generates: f1: adr

[PATCH 1/2]AArch64: refactor aarch64_float_const_representable_p to take additional mode param

2024-09-30 Thread Tamar Christina
Hi All, This is a refactoring to allow aarch64_float_const_representable_p to take an additional mode parameter which is the mode of the constant being analyzed. This will be required by the next patch in the series. No functional change is expected from this change. Bootstrapped Regtested on a

RE: [PATCH]middle-end: check explicitly for external or constants when checking for loop invariant [PR116817]

2024-09-30 Thread Tamar Christina
> > > Can you explain how you get to see constant/external defs with > a stmt_vec_info? That's somehow a violation of some inherent invariant in the > vectorizer. > > > > I'm not sure I actually get any. It could be the condition is never hit > > with a stmt_vec_info. I had assumed however since the

Re: [PATCH]middle-end: check explicitly for external or constants when checking for loop invariant [PR116817]

2024-09-23 Thread Tamar Christina
From: Richard Biener Sent: Tuesday, September 24, 2024 7:45 AM To: Tamar Christina Cc: gcc-patches@gcc.gnu.org ; nd ; j...@ventanamicro.com Subject: RE: [PATCH]middle-end: check explicitly for external or constants when checking for loop invariant [PR116817] On Mon, 23 Sep 2024, Tamar Christina

RE: [PATCH]middle-end: check explicitly for external or constants when checking for loop invariant [PR116817]

2024-09-23 Thread Tamar Christina
I had made the condition too strict before, here's an updated patch: Hi All, The previous check if a value was external was checking !vect_get_internal_def (vinfo, var) but this of course isn't completely right as they could be reductions etc. This changes the check to just explicitly look at extern

[PATCH]middle-end: check explicitly for external or constants when checking for loop invariant [PR116817]

2024-09-23 Thread Tamar Christina
Hi All, The previous check if a value was external was checking !vect_get_internal_def (vinfo, var) but this of course isn't completely right as they could be reductions etc. This changes the check to just explicitly look at externals and constants. Note that reductions remain unhandled here, but we

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-09-23 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, September 10, 2024 8:57 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 2/2]middle-end: use two's complement equality when

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-09-23 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, September 10, 2024 8:57 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 1/2]middle-end: refactor type to be explicit in

[PATCH]middle-end: Insert invariant instructions before the gsi [PR116812]

2024-09-22 Thread Tamar Christina
Hi All, The new invariant statements should be inserted before the current statement and not after. This goes fine 99% of the time but when the current statement is a gcond the control flow gets corrupted. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, T

RE: [PATCH]AArch64: Take into account when VF is higher than known scalar iters

2024-09-20 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, September 20, 2024 3:48 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Take into account when

RE: [PATCH]AArch64: Take into account when VF is higher than known scalar iters

2024-09-20 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, September 20, 2024 3:02 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Take into account when

RE: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-20 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, September 20, 2024 2:10 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons > > Tamar Christina writes: >

[PATCH][testsuite]: Update commandline for PR116628.c to use neoverse-v2 [PR116628]

2024-09-20 Thread Tamar Christina
Hi All, The testcase for this PR needs Neoverse V2 to be used, since with the other cost models costing does not pick this particular SVE mode. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Committed as obvious. Thanks, Tamar gcc/testsuite/ChangeLog: PR tree-optimiz

[PATCH]AArch64: Take into account when VF is higher than known scalar iters

2024-09-20 Thread Tamar Christina
Hi All, Consider low overhead loops like: void foo (char *restrict a, int *restrict b, int *restrict c, int n) { for (int i = 0; i < 9; i++) { int res = c[i]; int t = b[i]; if (a[i] != 0) res = t; c[i] = res; } } For such loops we use latency only costin
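
The arithmetic behind the subject line can be sketched as follows. This is an illustration rather than the cost model itself: it assumes an unpredicated vector loop with a scalar tail, and the struct and function names are hypothetical.

```c
/* Sketch: split a known scalar trip count into full vector iterations
   and leftover scalar iterations for a given vectorization factor. */
struct loop_split {
    unsigned vector_iters;  /* iterations of the vector loop body */
    unsigned scalar_tail;   /* iterations left for the scalar tail */
};

struct loop_split split_iterations(unsigned niters, unsigned vf)
{
    /* With no usable VF, everything stays scalar. */
    struct loop_split s = { 0, niters };

    if (vf != 0) {
        s.vector_iters = niters / vf;
        s.scalar_tail = niters % vf;
    }
    return s;
}
```

With the 9 iterations of the loop above and an assumed VF of 16, no full vector iteration ever runs, so a cost based only on the vector body would not describe what actually executes.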

RE: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-20 Thread Tamar Christina
> For variable-sized vectors, I suppose the question is whether the > first unequal element is found in the minimum vector length, or whether > it only occurs for larger lengths. In the former case we can fold at > compile time, but in the latter case we can't. > > So we probably do want the loop

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-09-10 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, August 20, 2024 2:06 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 2/2]middle-end: use two's complement equality when comparing > IVs

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-09-10 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, August 20, 2024 2:06 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: [PATCH 1/2]middle-end: refactor type to be explicit in > operand_equal_p >

RE: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, September 9, 2024 9:29 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 4/4]AArch64: Define VECTOR_STOR

RE: [PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, September 6, 2024 2:15 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard Biener > ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: check that the lhs of a COND_EXPR is an > SSA_NAME i

RE: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, September 6, 2024 2:21 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons > > Tamar Christina writes: >

RE: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, September 6, 2024 2:09 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in > vect_recog_bool_patte

[PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Tamar Christina
Hi All, Because the vect_recog_bool_pattern can at the moment still transition out of GIMPLE and back into GENERIC the vect_recog_cond_store_pattern can end up using an expression as a mask rather than an SSA_NAME. This adds an explicit check that we have a mask and not an expression. Bootstrapp

[PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-03 Thread Tamar Christina
Hi All, This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 so we simplify vector comparisons in AArch64. With this enabled res: movi v0.4s, 0 cmeq v0.4s, v0.4s, v0.4s ret is simplified to: res: mvni v0.4s, 0 ret NOTE: I don't really
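
The simplification in the assembly above follows from how a lane-wise equality compare encodes its result. A minimal scalar model of one 32-bit lane is shown below; the function name is hypothetical and this is not code from the patch.

```c
#include <stdint.h>

/* One lane of an Adv. SIMD style equality compare:
   all bits set for true, all bits clear for false. */
uint32_t lane_cmeq_u32(uint32_t a, uint32_t b)
{
    return a == b ? UINT32_MAX : 0u;
}
```

Comparing a zero lane with itself therefore always yields an all-ones lane, which is the constant that a single mvni of 0 produces.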

[PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-03 Thread Tamar Christina
Hi All, This adds vector constant simplification for EQ and NE. This is useful since the vectorizer generates a lot more vector compares now, in particular NE and EQ and so these help us optimize cases where the values were not known at GIMPLE but instead only at RTL. Bootstrapped Regtested on a

[PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-03 Thread Tamar Christina
Hi All, Currently the vectorizer cheats when lowering COND_EXPR during bool recog. In the cases where the conditional is loop invariant or non-boolean it instead converts the operation back into GENERIC and hides much of the operation from the analysis part of the vectorizer. i.e. a ? b : c is

[PATCH 1/4]middle-end: have vect_recog_cond_store_pattern use pattern statement for cond if available

2024-09-03 Thread Tamar Christina
Hi All, When vectorizing a conditional operation we rely on the bool_recog pattern to hit and convert the bool of the operand to a valid mask. However we are currently not using the converted operand as this is in a pattern statement. This change updates it to look at the actual statement to be

[PATCH][docs]: [committed] remove double mention of armv9-a.

2024-09-03 Thread Tamar Christina
Hi All, The list of available architectures for Arm is incorrectly listing armv9-a twice. This removes the duplicate armv9-a enumeration from the part of the list having M-profile targets. Committed under the obvious rule. Thanks, Tamar gcc/ChangeLog: * doc/invoke.texi: Remove duplicate

[PATCH][testsuite]: remove -fwrapv from signbit-5.c

2024-09-03 Thread Tamar Christina
Hi All, The meaning of the testcase was changed by passing it -fwrapv. The reason for the test failures on some platforms was that the test was testing some implementation-defined behavior wrt INT_MIN in generic code. Instead of using -fwrapv this just removes the border case from the test so

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-08-28 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Wednesday, August 28, 2024 8:55 AM > To: Tamar Christina > Cc: Richard Sandiford ; Jennifer Schmitz > ; gcc-patches@gcc.gnu.org; Kyrylo Tkachov > > Subject: Re: [RFC][PATCH] AArch64: Remove > AARCH64_EXTR

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-08-27 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Tuesday, August 27, 2024 11:46 AM > To: Tamar Christina > Cc: Jennifer Schmitz ; gcc-patches@gcc.gnu.org; Kyrylo > Tkachov > Subject: Re: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_CO

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2024-08-23 Thread Tamar Christina
Hi Jennifer, > -Original Message- > From: Jennifer Schmitz > Sent: Friday, August 23, 2024 1:07 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Kyrylo Tkachov > > Subject: [RFC][PATCH] AArch64: Remove > AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS > > This patch removes the AARCH6

RE: [RFC] Support single lane SLP early break

2024-08-22 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, August 21, 2024 12:12 PM > To: Tamar Christina > Cc: GCC Patches > Subject: Re: [RFC] Support single lane SLP early break > > On Tue, 20 Aug 2024, Tamar Christina wrote: > > > Hi, > &g

RE: [PATCH] testsuite: Add -fwrapv to signbit-5.c

2024-08-22 Thread Tamar Christina
> -Original Message- > From: Torbjorn SVENSSON > Sent: Wednesday, August 21, 2024 2:23 PM > To: Tamar Christina ; Richard Biener > > Cc: Jeff Law ; gcc-patches@gcc.gnu.org; Richard > Earnshaw ; quic_apin...@quicinc.com; > yvan.r...@foss.st.com > Subject:

[PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-08-20 Thread Tamar Christina
Hi All, IVOPTS normally uses affine trees to perform comparisons between different IVs, but these seem to have been missing in two key spots, where normal tree equivalences were used instead. In some cases where we have a two's complement equivalence but not a strict signedness equivalence we end up ge
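
To illustrate the distinction between the two kinds of equivalence mentioned above, here is a small C sketch. The function names are hypothetical and this is not how IVOPTS implements the check; it only shows that a signed and an unsigned 32-bit value can share a bit pattern while differing as mathematical values.

```c
#include <stdint.h>

/* Two's complement equivalence: compare the 32-bit patterns.
   Converting int32_t to uint32_t is well defined (modulo 2^32). */
int same_bits(int32_t s, uint32_t u)
{
    return (uint32_t)s == u;
}

/* Strict equivalence: compare the mathematical values by widening
   both operands to a type that can represent either. */
int same_value(int32_t s, uint32_t u)
{
    return (int64_t)s == (int64_t)u;
}
```

For example, `-1` and `UINT32_MAX` satisfy `same_bits` but not `same_value`, while `5` and `5u` satisfy both.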

[PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-08-20 Thread Tamar Christina
Hi All, This is a refactoring with no expected behavioral change. The goal with this is to make the type of the expressions being used explicit. I did not change all the recursive calls to operand_equal_p () to recurse directly to the new function but instead this goes through the top level call

RE: [PATCH V3 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-20 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 1:54 PM > To: Tamar Christina > Cc: Victor Do Nascimento ; gcc- > patc...@gcc.gnu.org; claz...@gmail.com; hongtao@intel.com; > s...@gcc.gnu.org; bernds_...@t-online.de; al...@

RE: [PATCH] testsuite: Add -fwrapv to signbit-5.c

2024-08-20 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 12:33 PM > To: Torbjorn SVENSSON > Cc: Jeff Law ; gcc-patches@gcc.gnu.org; Richard > Earnshaw ; quic_apin...@quicinc.com; > yvan.r...@foss.st.com; Tamar Christina > Subject: Re: [PATCH

[RFC] Support single lane SLP early break

2024-08-20 Thread Tamar Christina
Hi, I've been working on a prototype of moving early break to SLP. As we've discussed on IRC I've decided to first try adding the gconds as roots and start SLP discovery using them as roots. This works great and doesn't require any changes to build_slp, it also has the additional benefit in that

[RFC] early vector boolean lowering

2024-08-20 Thread Tamar Christina
Hi, As you know I've been working on removing the code that demotes GIMPLE COND_EXPR to GENERIC during vect_recog_bool_pattern. To restate why: the issue we currently have is that the mask (boolean argument of a COND_EXPR) is not always available during pattern matching. This is a problem

RE: [PATCH V3 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-20 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 10:37 AM > To: Victor Do Nascimento > Cc: gcc-patches@gcc.gnu.org; Tamar Christina ; > claz...@gmail.com; hongtao@intel.com; s...@gcc.gnu.org; bernds_cb1@t- > online.de; al...@

RE: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for vectorizable_call

2024-08-20 Thread Tamar Christina
Hi Pan, > -Original Message- > From: Li, Pan2 > Sent: Tuesday, August 20, 2024 1:58 AM > To: Tamar Christina ; Jakub Jelinek > > Cc: Richard Biener ; gcc-patches@gcc.gnu.org; > juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; > rdapp

RE: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for vectorizable_call

2024-08-19 Thread Tamar Christina
> -Original Message- > From: Jakub Jelinek > Sent: Monday, August 19, 2024 8:25 PM > To: Tamar Christina > Cc: Li, Pan2 ; Richard Biener ; > gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com; Liu, Hon

RE: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for vectorizable_call

2024-08-19 Thread Tamar Christina
Hi Pan, > > Thanks Jakub for explaining. > > Hi Richard, > > Does it mean we need to do some promotion similar as this patch to make the > vectorizable_call happy > when there is a constant operand? I am not sure if there is a better approach > for > this case. I'll leave it up to Richi, but

RE: [PATCH V3 02/10] autovectorizer: Add basic support for convert optabs

2024-08-15 Thread Tamar Christina
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Thursday, August 15, 2024 9:44 AM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; claz...@gmail.com; > hongtao@intel.com; s...@gcc.gnu.org; bernds_...@t-online.de; > al...@redhat.com;
