Re: [PATCH] New finish_compare_by_pieces target hook (for x86).

2023-06-26 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Sun, Jun 25, 2023 at 7:39 AM Roger Sayle > wrote: >> >> >> On Tue, 13 June 2023 12:02, Richard Biener wrote: >> > On Mon, Jun 12, 2023 at 4:04 PM Roger Sayle >> > wrote: >> > > The following simple test case, from PR 104610, shows that memcmp () >> >

Re: [PATCH] Change fma_reassoc_width tuning for ampere1

2023-06-26 Thread Richard Sandiford via Gcc-patches
Philipp Tomsich writes: > Richard, > > OK for backport to GCC-13? Yeah, OK for GCC 13 too. Thanks, Richard > Thanks, > Philipp. > > On Thu, 22 Jun 2023 at 16:18, Richard Sandiford via Gcc-patches > wrote: >> >> Di Zhao OS via Gcc-patches writes:

Re: [PATCH] tree-optimization/110381 - preserve SLP permutation with in-order reductions

2023-06-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following fixes a bug that manifests itself during fold-left > reduction transform in picking not the last scalar def to replace > and thus double-counting some elements. But the underlying issue > is that we merge a load permutation into the in-order reduction >

[PATCH] vect: Cost intermediate conversions

2023-06-26 Thread Richard Sandiford via Gcc-patches
g:6f19cf7526168f8 extended N-vector to N-vector conversions to handle cases where an intermediate integer extension or truncation is needed. This patch adjusts the cost to account for these intermediate conversions. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/

Re: [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007.

2023-06-26 Thread Richard Sandiford via Gcc-patches
liuhongt writes: > The new assembly looks better than original one, so I adjust those testcases. The new loops are shorter, but they process only half the amount of data per iteration. The problem is that the new vectoriser code generates multiple statements but only costs one. I'll post a fix

Re: [PATCH V6] VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer

2023-06-23 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard and Bernhard from V5 patch. > V6 fixed all issues according to their comments. > > gcc/ChangeLog: > > * internal-fn.cc (expand_partial_store_optab_fn): Adapt for > LEN_MASK_STORE. >

Re: [PATCH V6] VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer

2023-06-23 Thread Richard Sandiford via Gcc-patches
Bernhard Reutner-Fischer writes: > On 23 June 2023 01:51:12 CEST, juzhe.zh...@rivai.ai wrote: >>From: Ju-Zhe Zhong > > I am sorry but I somehow overlooked a trivial spot in V5. > Nit which does not warrant an immediate next version, but please consider it > before pushing iff approved: > >>+

Re: [PATCH V5] VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer

2023-06-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > gcc/ChangeLog: > > * internal-fn.cc (expand_partial_store_optab_fn): Adapt for > LEN_MASK_STORE. > (internal_load_fn_p): Add LEN_MASK_LOAD. > (internal_store_fn_p): Add LEN_MASK_STORE. >

Re: [PATCH] Change fma_reassoc_width tuning for ampere1

2023-06-22 Thread Richard Sandiford via Gcc-patches
Di Zhao OS via Gcc-patches writes: > This patch enables reassociation of floating-point additions on ampere1. > This brings about 1% overall benefit on spec2017 fprate cases. (There > are minor regressions in 510.parest_r and 508.namd_r, analyzed here: >

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jun 21, 2023 at 11:32 AM Richard Sandiford > wrote: >> >> Richard Sandiford writes: >> > Richard Biener via Gcc-patches writes: >> >> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches >> >> wrote: >> >>> >> >>> We already use an intermediate type in

Re: [PATCH] tree-optimization/110243 - kill off IVOPTs split_offset

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The issue in the PR the change is fixing is that we end up with > an expression that overflows but uses signed arithmetic and so > we miscompile it later. IIRC the fixes to split_constant_offset > always were that the sum of the base + offset wasn't equal to > the

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Richard Biener via Gcc-patches writes: >> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches >> wrote: >>> >>> We already use an intermediate type in case WIDEN, but not for NONE, >>> this patch extended that. >>> >>> I didn't do that in pattern recog since we

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches > wrote: >> >> We already use an intermediate type in case WIDEN, but not for NONE, >> this patch extended that. >> >> I didn't do that in pattern recog since we need to know whether the >> stmt

[pushed] aarch64: Fix gcc.target/aarch64/sve/pcs failures

2023-06-20 Thread Richard Sandiford via Gcc-patches
Several gcc.target/aarch64/sve/pcs tests started failing after 6a2e8dcbbd4, because the tests weren't robust against whether an indirect argument register or the stack pointer was used as the base for stores. The patch allows either base register when there is only one indirect argument. It

[pushed] aarch64: Robustify stack tie handling

2023-06-20 Thread Richard Sandiford via Gcc-patches
The SVE handling of stack clash protection copied the stack pointer to X11 before the probe and set up X11 as the CFA for unwind purposes: /* This is done to provide unwinding information for the stack adjustments we're about to do, however to prevent the optimizers from

Re: [PATCH] tree-optimization/110243 - kill off IVOPTs split_offset

2023-06-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Mon, 19 Jun 2023, Richard Sandiford wrote: > >> Jeff Law writes: >> > On 6/16/23 06:34, Richard Biener via Gcc-patches wrote: >> >> IVOPTs has strip_offset which suffers from the same issues regarding >> >> integer overflow that split_constant_offset did but the

Re: [PATCH][gensupport] drop suppport for define_cond_exec from compact syntac

2023-06-20 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > define_cond_exec does not support the special @@ syntax > and so can't support {@. As such just remove support > for it. > > Bootstrapped and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR bootstrap/110324 > *

Re: [PATCH] tree-optimization/110243 - kill off IVOPTs split_offset

2023-06-19 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 6/16/23 06:34, Richard Biener via Gcc-patches wrote: >> IVOPTs has strip_offset which suffers from the same issues regarding >> integer overflow that split_constant_offset did but the latter was >> fixed quite some time ago. The following implements strip_offset >> in terms

[committed] vect: Restore aarch64 bootstrap

2023-06-19 Thread Richard Sandiford via Gcc-patches
Spot-tested on aarch64-linux-gnu, pushed as obvious. Richard gcc/ * tree-vect-loop-manip.cc (vect_set_loop_condition_partial_vectors): Handle null niters_skip. --- gcc/tree-vect-loop-manip.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git

Re: [pushed] c/c++: use positive tone in missing header notes [PR84890]

2023-06-18 Thread Richard Sandiford via Gcc-patches
David Malcolm via Gcc-patches writes: > Quoting "How a computer should talk to people" (as quoted > in "Concepts Error Messages for Humans"): > > "Various negative tones or actions are unfriendly: being manipulative, > not giving a second chance, talking down, using fashionable slang, > blaming.

Re: [PATCH 3/3] AVX512 fully masked vectorization

2023-06-15 Thread Richard Sandiford via Gcc-patches
Andrew Stubbs writes: > One > comment: building a vector constant {0, 1, 2, 3, ..., 63} results in a > very large entry in the constant pool and an unnecessary memory load (it > literally has to use this sequence to generate the addresses to load the > constant!) Generating the sequence via

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-15 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This adds a recognition pattern for the non-widening > absolute difference (ABD). > > gcc/ChangeLog: > > * doc/md.texi (sabd, uabd): Document them. > * internal-fn.def (ABD): Use new optab. > * optabs.def (sabd_optab,

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-15 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: >> + >> + /* Skip any newlines or whitespaces needed. */ >> + while (ISSPACE(*templ)) >> + templ++; >> + continue; >> + } >> + else if (templ[0] == '/' && templ[1] == '*') >> + { >> + templ += 2; >> + /*
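
For readers following the quoted hunk, here is a minimal standalone sketch of the same idea: stepping over whitespace and C-style comments in a template string. The function name `skip_space_and_comments` is hypothetical, standard `isspace` stands in for GCC's `ISSPACE`, and the handling of an unterminated comment (consume the rest of the string) is an assumption, since the quoted diff is cut off before that point.

```c
#include <ctype.h>

/* Hypothetical helper, not the patch's code: return a pointer to the
   first character of TEMPL that is neither whitespace nor part of a
   C-style comment.  An unterminated comment swallows the rest of the
   string (an assumption; the quoted hunk does not show this case).  */
const char *
skip_space_and_comments (const char *templ)
{
  for (;;)
    {
      /* Skip any newlines or other whitespace.  */
      while (isspace ((unsigned char) *templ))
        templ++;

      if (templ[0] == '/' && templ[1] == '*')
        {
          /* Step over the comment opener, then scan for the closer.  */
          templ += 2;
          while (*templ && !(templ[0] == '*' && templ[1] == '/'))
            templ++;
          if (*templ)
            templ += 2;
        }
      else
        return templ;
    }
}
```

Looping back after each comment means that runs such as "/* a */ /* b */ x" are handled without recursion.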

Re: [PATCH 1/3] Inline vect_get_max_nscalars_per_iter

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 14 Jun 2023, Richard Sandiford wrote: > >> Richard Biener via Gcc-patches writes: >> > The function is only meaningful for LOOP_VINFO_MASKS processing so >> > inline it into the single use. >> > >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? >> >

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-14 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > +The syntax rules are as follows: > +@itemize @bullet > +@item > +Templates must start with @samp{@{@@} to use the new syntax. > + > +@item > +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} s/parentheses/square brackets/ > +followed by a

Re: [PATCH 3/3] AVX512 fully masked vectorization

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > This implements fully masked vectorization or a masked epilog for > AVX512 style masks which single themselves out by representing > each lane with a single bit and by using integer modes for the mask > (both are much like GCN). > > AVX512 is also special in

Re: [PATCH 1/3] Inline vect_get_max_nscalars_per_iter

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The function is only meaningful for LOOP_VINFO_MASKS processing so > inline it into the single use. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > * tree-vect-loop.cc (vect_get_max_nscalars_per_iter): Inline > into ... >

Re: [PATCH] [RFC] main loop masked vectorization with --param vect-partial-vector-usage=1

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > Currently vect_determine_partial_vectors_and_peeling will decide > to apply fully masking to the main loop despite > --param vect-partial-vector-usage=1 when the currently analyzed > vector mode results in a vectorization factor that's bigger > than the

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-14 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This adds a recognition pattern for the non-widening > absolute difference (ABD). > > gcc/ChangeLog: > > * doc/md.texi (sabd, uabd): Document them. > * internal-fn.def (ABD): Use new optab. > * optabs.def (sabd_optab,

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 14 Jun 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 14 Jun 2023, Richard Sandiford wrote: >> > >> >> Richard Biener writes: >> >> > AFAIU this special instruction is only supposed to prevent >> >> > code motion (of stack memory

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 14 Jun 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > AFAIU this special instruction is only supposed to prevent >> > code motion (of stack memory accesses?) across this instruction? >> > I'd say a >> > >> > (may_clobber (mem:BLK (reg:DI 1 1)))

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > AFAIU this special instruction is only supposed to prevent > code motion (of stack memory accesses?) across this instruction? > I'd say a > > (may_clobber (mem:BLK (reg:DI 1 1))) > > might be more to the point? I've used "may_clobber" which doesn't > exist since I'm

Re: [PATCH V2] RISC-V: Rework Phase 5 && Phase 6 of VSETVL PASS

2023-06-12 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 6/9/23 04:41, juzhe.zh...@rivai.ai wrote: >> @@ -4342,135 +4510,81 @@ pass_vsetvl::cleanup_insns (void) const >> } >> } >> >> +/* Return true if the SET result is not used by any instructions. */ >> +static bool >> +has_no_uses (basic_block

Re: [PATCH v2] [PR96339] Optimise svlast[ab]

2023-06-12 Thread Richard Sandiford via Gcc-patches
Tejas Belagod writes: > From: Tejas Belagod > > This PR optimizes an SVE intrinsics sequence where > svlasta (svptrue_pat_b8 (SV_VL1), x) > a scalar is selected based on a constant predicate and a variable vector. > This sequence is optimized to return the corresponding element of a

Re: [PATCH] simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

2023-06-09 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov via Gcc-patches writes: > Hi all, > > This patch implements RTL constant-folding for the SS_TRUNCATE and > US_TRUNCATE codes. > The semantics are a clamping operation on the argument with the min and max > of the narrow mode, > followed by a truncation. The signedness of the
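
The "clamp, then truncate" semantics described above can be sketched in plain C for one concrete pair of widths. This is only an illustration, not the simplify-rtx code: the function names are hypothetical, the 32-bit to 8-bit widths are chosen arbitrarily, and it assumes that SS_TRUNCATE treats its operand as signed while US_TRUNCATE treats it as unsigned.

```c
#include <stdint.h>

/* SS_TRUNCATE-style: clamp a signed 32-bit value to the range of the
   narrow signed type, then truncate.  */
int8_t
ss_truncate_s32_to_s8 (int32_t x)
{
  if (x < INT8_MIN)
    x = INT8_MIN;
  if (x > INT8_MAX)
    x = INT8_MAX;
  return (int8_t) x;
}

/* US_TRUNCATE-style: treat the value as unsigned, clamp it to the
   maximum of the narrow unsigned type, then truncate.  */
uint8_t
us_truncate_u32_to_u8 (uint32_t x)
{
  if (x > UINT8_MAX)
    x = UINT8_MAX;
  return (uint8_t) x;
}
```

Constant folding would amount to evaluating the same clamp at compile time when the operand is a known constant.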

Re: [PATCH V6] VECT: Add SELECT_VL support

2023-06-09 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks, Richi. > > Should I wait for Richard ACK gain ? > Since the last email of this patch, he just asked me to adjust comment no > codes change. > I am not sure whether he is ok. Yeah, OK from my POV too, thanks. Richard

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-06-09 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, Jun 9, 2023 at 11:45 AM Andrew Stubbs wrote: >> >> On 09/06/2023 10:02, Richard Sandiford wrote: >> > Andrew Stubbs writes: >> >> On 07/06/2023 20:42, Richard Sandiford wrote: >> >>> I don't know if this helps (probably not), but we have a similar >> >>>

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-06-09 Thread Richard Sandiford via Gcc-patches
Andrew Stubbs writes: > On 07/06/2023 20:42, Richard Sandiford wrote: >> I don't know if this helps (probably not), but we have a similar >> situation on AArch64: a 64-bit mode like V8QI can be doubled to a >> 128-bit vector or to a pair of 64-bit vectors. We used V16QI for >> the former and

Re: [PATCH] Make sure SCALAR_INT_MODE_P before invoke try_const_anchors

2023-06-09 Thread Richard Sandiford via Gcc-patches
guojiufu writes: > Hi, > > On 2023-06-09 16:00, Richard Biener wrote: >> On Fri, 9 Jun 2023, Jiufu Guo wrote: >> >>> Hi, >>> >>> As checking the code, there is a "gcc_assert (SCALAR_INT_MODE_P >>> (mode))" >>> in "try_const_anchors". >>> This assert seems correct because the function

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Sandiford via Gcc-patches
In addition to Andreas's and Richard's comments: Tamar Christina writes: > +@item > +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} > followed by > +a list of @code{match_operand}/@code{match_scratch} comma operand numbers, > then a How about: a comma-separated

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-06-07 Thread Richard Sandiford via Gcc-patches
Andrew Stubbs writes: > On 30/05/2023 07:26, Richard Biener wrote: >> On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs wrote: >>> >>> Hi all, >>> >>> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just >>> do it because the GCC middle-end models DIVMOD's return value as >>>

Re: vect: Don't pass subtype to vect_widened_op_tree where not needed [PR 110142]

2023-06-07 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi, > > This patch fixes an issue introduced by > g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was beeing > passed to vect_widened_op_tree, when no subtype was to be used. This > lead to an errorneous use of IFN_VEC_WIDEN_MINUS. > >

Re: [PATCH 0/3] aarch64: ls64 builtin fixes [PR110100,PR110132]

2023-06-07 Thread Richard Sandiford via Gcc-patches
Alex Coplan writes: > Hi, > > This patch series fixes various defects with the FEAT_LS64 ACLE > implementation in the AArch64 backend. > > The series is organised as follows: > > - Patch 1/3 fixes whitespace errors in the existing code. > - Patch 2/3 fixes PR110100 where we generate wrong code

Re: [PATCH] rtl: AArch64: New RTL for ABD

2023-06-06 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: >> It would be good to mark all of these functions with __attribute__((noipa)), >> since I think interprocedural optimisations might otherwise defeat the >> runtime test in abd_run_1.c (in the sense that we might end up folding >> things at compile time and not
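
As an illustration of the suggestion, a test helper marked `noipa` might look like the sketch below. The function `abs_diff` is a hypothetical stand-in, not one of the functions from the patch's test files; `noipa` is a GCC-specific function attribute that disables interprocedural optimisations such as inlining, cloning and constant propagation for that function.

```c
/* Hypothetical test helper.  Without noipa, a caller that passes
   constants could be folded at compile time, so the run-time test
   would no longer exercise the generated absolute-difference code.  */
__attribute__ ((noipa)) int
abs_diff (int a, int b)
{
  return a > b ? a - b : b - a;
}
```

A run-time test would then call `abs_diff` with known inputs and compare the results against expected values.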

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-06 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This adds a recognition pattern for the non-widening > absolute difference (ABD). > > gcc/ChangeLog: > > * doc/md.texi (sabd, uabd): Document them. > * internal-fn.def (ABD): Use new optab. > * optabs.def (sabd_optab,

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-06 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> >int operand_number; /* Operand index in the big array. */ >> >int output_format; /* INSN_OUTPUT_FORMAT_*. */ >> > + bool compact_syntax_p; >> >struct operand_data operand[MAX_MAX_OPERANDS]; }; >> > >> > @@ -700,12 +702,57 @@

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-06 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: >> diff --git a/gcc/gensupport.h b/gcc/gensupport.h >> index >> a1edfbd71908b6244b40f801c6c01074de56777e..7925e22ed418767576567cad583bddf83c0846b1 >> 100644 >> --- a/gcc/gensupport.h >> +++ b/gcc/gensupport.h >> @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-05 Thread Richard Sandiford via Gcc-patches
Looks good! Just some minor comments: Tamar Christina writes: > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index > 6a435eb44610960513e9739ac9ac1e8a27182c10..1437ab55b260ab5c876e92d59ba39d24bffc6276 > 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -27,6 +27,7 @@ See the next

Re: [PATCH] New wi::bitreverse function.

2023-06-05 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > This patch provides a wide-int implementation of bitreverse, that > implements both of Richard Sandiford's suggestions from the review at > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an > improved API (as a stand-alone function matching the bswap

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. > >>> No, I meant that the comment I quoted seemed to be saying that solution >>> 3 wasn't possible. The comment seemed to say that we would need to do >>> solution 1. > I am so sorry that I didn't write the comments accurately. > Could you help me

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > "juzhe.zh...@rivai.ai" writes: >> Hi, Richard. Thanks for the comments. >> If we use SELECT_VL to refer only to the target-independent ifn, I don't see why this last bit is true. >> Could you give me more details and information about this since I am not >>

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. Thanks for the comments. > >>> If we use SELECT_VL to refer only to the target-independent ifn, I don't >>> see why this last bit is true. > Could you give me more details and information about this since I am not sure > whether I catch up with you.

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > + /* If we're using decrement IV approach in loop control, we can use output > of > + SELECT_VL to adjust IV of loop control and data reference when it > satisfies > + the following checks: > + > + (a) SELECT_VL is supported by the target. > + (b)

Re: [PATCH] VECT: Add SELECT_VL support

2023-06-04 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review. I don't know the IV-related parts well enough to review those properly, but they looked reasonable to me. Hopefully Richi can comment. I'm curious though. For: > + tree step = vect_dr_behavior (vinfo, dr_info)->step; > + > + [...] > + poly_uint64 bytesize =

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-06-02 Thread Richard Sandiford via Gcc-patches
Just some very minor things. "Andre Vieira (lists)" writes: > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index > 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb > 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -90,6 +90,71 @@

Re: [PATCH V3] VECT: Change flow of decrement IV

2023-06-02 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richi. I am gonna merge it after Richard's final approve. Thanks for checking, but no need to wait for a second ack from me! Please go ahead and commit. Richard

Re: [PATCH v2] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a > pattern to match the new GIMPLE form. > > With this patch, gcc.target/aarch64/rev16_2.c passes again. > > 2023-05-31 Christophe Lyon > > PR target/110039 > gcc/ > *

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-05-31 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs wrote: >> >> Hi all, >> >> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just >> do it because the GCC middle-end models DIVMOD's return value as >> "complex int" type, and there are no

Re: [PATCH] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On Wed, 31 May 2023 at 11:49, Richard Sandiford > wrote: > >> Christophe Lyon writes: >> > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a >> > pattern to match the new GIMPLE form. >> > >> > With this patch, gcc.target/aarch64/rev16_2.c passes

Re: [PATCH] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a > pattern to match the new GIMPLE form. > > With this patch, gcc.target/aarch64/rev16_2.c passes again. > > 2023-05-31 Christophe Lyon > > PR target/110039 > gcc/ > *

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Wed, 31 May 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote: >> > >> >> Hi, all. I have posted my several investigations: >> >>

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote: > >> Hi, all. I have posted my several investigations: >> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620101.html >> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620105.html >>

Re: [PATCH] Refactor wi::bswap as a function (instead of a method).

2023-05-30 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > This patch implements Richard Sandiford's suggestion from > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html > that wi::bswap (and a new wi::bitreverse) should be functions, > and ideally only accessors are member functions. This patch > implements the first

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-30 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi Richard, > The s32 case for single constant patch doesn't regress now after the > above commit. > Bootstrapped+tested on aarch64-linux-gnu, and verified that the new > tests pass for aarch64_be-linux-gnu. > Is it OK to commit ? > > Thanks, > Prathamesh > >

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Sandiford via Gcc-patches
"juzhe.zhong" writes: > Maybe we can include rgroup number into select vl pattern?So that, I always > use select vl pattern. In my backend, if it is single rgroup,we gen vsetvl, > otherwise we gen min. That just seems to be a way of hiding an “is the target RVV?” test though. IMO targets

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Before this patch: > foo: > ble a2,zero,.L5 > csrr a3,vlenb > srli a4,a3,2 > .L3: > minu a5,a2,a4 > vsetvli zero,a5,e32,m1,ta,ma > vle32.v v2,0(a1) > vle32.v v1,0(a0) > vsetvli t1,zero,e32,m1,ta,ma > vadd.vv v1,v1,v2 > vsetvli zero,a5,e32,m1,ta,ma > vse32.v

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Follow Richi's suggestion, I change current decrement IV flow from: > > do { >remain -= MIN (vf, remain); > } while (remain != 0); > > into: > > do { >old_remain = remain; >len = MIN (vf, remain); >remain -= vf; > } while
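
The two loop shapes quoted above can be compared with a small scalar model. This is a sketch only: the function names and the `lens` output array are hypothetical, and the exit test of the second loop (`old_remain > vf`) is an assumption, because the preview is cut off before the original condition. Both functions expect `remain > 0`, and `lens` must have room for every iteration.

```c
#include <stddef.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Original flow: the IV is decremented by the clamped length.
   Stores the length handled by each iteration in LENS and returns
   the number of iterations.  */
size_t
lens_old_flow (size_t remain, size_t vf, size_t *lens)
{
  size_t n = 0;
  do
    {
      size_t len = MIN (vf, remain);
      lens[n++] = len;
      remain -= len;
    }
  while (remain != 0);
  return n;
}

/* New flow: the IV is decremented by the constant VF, and the exit
   test uses the value from before the decrement.  */
size_t
lens_new_flow (size_t remain, size_t vf, size_t *lens)
{
  size_t n = 0;
  size_t old_remain;
  do
    {
      old_remain = remain;
      size_t len = MIN (vf, remain);
      lens[n++] = len;
      remain -= vf;   /* May wrap on the final iteration; not read again.  */
    }
  while (old_remain > vf);   /* ASSUMED exit test; elided in the preview.  */
  return n;
}
```

With `remain = 10` and `vf = 4`, both functions record the lengths 4, 4, 2. The difference is that the second form gives the IV a constant step, which is the property the later SCEV discussion in this thread turns on.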

Re: decremnt IV patch create fails on PowerPC

2023-05-30 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> But how easy would it be to extend SCEV analysis, via a pattern match? >> The evolution of the IV phi wrt the inner loop is still a normal SCEV. > > No, the IV isn't a normal SCEV, the final value is different. Which part of the IV though? Won't all executions of the

Re: decremnt IV patch create fails on PowerPC

2023-05-30 Thread Richard Sandiford via Gcc-patches
My understanding was that we went into this knowing that the IVs would defeat SCEV analysis. Apparently that wasn't a problem for RVV, but it's not surprising that it is a problem in general. This isn't just about SELECT_VL though. We use the same type of IV for cases that aren't going to use

Re: [PATCH] stor-layout, aarch64: Express SRA intrinsics with RTL codes

2023-05-25 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov via Gcc-patches writes: > Hi all, > > This patch expresses the intrinsics for the SRA and RSRA instructions with > standard RTL codes rather than relying on UNSPECs. > These instructions perform a vector shift right plus accumulate with an > optional rounding constant addition for

Re: [PATCH V16] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-25 Thread Richard Sandiford via Gcc-patches
This looks good to me. Just a couple of very minor cosmetic things: juzhe.zh...@rivai.ai writes: > @@ -753,17 +846,35 @@ vect_set_loop_condition_partial_vectors (class loop > *loop, > continue; > } > > - /* See whether zero-based IV would ever generate all-false masks

Re: [PATCH V15] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. Thanks for the comments. > >>> if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo) >>> || !iv_rgc >>> || (iv_rgc->max_nscalars_per_iter * iv_rgc->factor >>> != rgc->max_nscalars_per_iter * rgc->factor)) >>> { > >> /* See

Re: [PATCH] RISC-V: In pipeline scheduling, insns should not be fusion in different BB blocks.

2023-05-25 Thread Richard Sandiford via Gcc-patches
Jin Ma writes: > When the last insn1 of BB1 and the first insn2 of BB2 are fused, insn2 will > clear all dependencies in the function chain_to_prev_insn, so insn2 > may move to any BB, and the program calculation result is wrong. > > gcc/ChangeLog: > > * sched-deps.cc

Re: [PATCH] Fix type error of 'switch (SUBREG_BYTE (op)).'

2023-05-25 Thread Richard Sandiford via Gcc-patches
"Jin Ma" writes: >> > On 5/17/23 03:03, Jin Ma wrote: >> >> For example: >> >> (define_insn "mov_lowpart_sidi2" >> >>[(set (match_operand:SI0 "register_operand" "=r") >> >> (subreg:SI (match_operand:DI 1 "register_operand" " r") 0))] >> >>"TARGET_64BIT" >> >>

Re: [PATCH V15] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-25 Thread Richard Sandiford via Gcc-patches
Thanks, this looks functionally correct to me. And I agree it handles the cases that previously needed multiplication. But I think it regresses code quality when no multiplication was needed. We can now generate duplicate IVs. Perhaps ivopts would remove the duplicates, but it might be hard,

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-25 Thread Richard Sandiford via Gcc-patches
LGTM, just a couple of comment tweaks: Prathamesh Kulkarni writes: > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index d6fc94015fa..db7ca4c28c3 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -22332,6 +22332,46 @@

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
I'll look at the samples tomorrow, but just to address one thing: 钟居哲 writes: >>> What gives the best code in these cases? Is emitting a multiplication >>> better? Or is using a new IV better? > Could you give me more detail information about "new refresh IV" approach. > I'd like to try that.

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-24 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Wed, 24 May 2023 at 15:40, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > On Mon, 22 May 2023 at 14:18, Richard Sandiford >> > wrote: >> >> >> >> Prathamesh Kulkarni writes: >> >> > Hi Richard, >> >> > Thanks for the suggestions. Does the

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Oh. I see. Thank you so much for pointing this out. > Could you tell me what I should do in the code? > It seems that I should adjust it in > vect_adjust_loop_lens_control > > multiply by some factor ? Is this correct multiply by max_nscalars_per_iter > ? max_nscalars_per_iter *

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, Richard. I still don't understand it. Sorry about that. > >>> loop_len_48 = MIN_EXPR ; > >> _74 = loop_len_34 * 2 - loop_len_48; > > I have the tests already tested. > We have a MIN_EXPR to calculate the total elements: > loop_len_34 = MIN_EXPR ; > I think "8" is already

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, the .optimized dump is like this: > >[local count: 21045336]: > ivtmp.26_36 = (unsigned long) > ivtmp.27_3 = (unsigned long) > ivtmp.30_6 = (unsigned long) [(void *) + 16B]; > ivtmp.31_10 = (unsigned long) [(void *) + 32B]; > ivtmp.32_14 = (unsigned long)

Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
Thanks for trying it. I'm still surprised that no multiplication is needed though. Does the patch work for: short x[100]; int y[200]; void f() { for (int i = 0, j = 0; i < 100; i += 2, j += 4) { x[i + 0] += 1; x[i + 1] += 2; y[j + 0] += 1; y[j + 1] += 2; y[j + 2] += 3;

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: >>> Both approaches are fine. I'm not against one or the other. > >>> What I didn't understand was why your patch only reuses existing IVs >>> for max_nscalars_per_iter == 1. Was it to avoid having to do a >>> multiplication (well, really a shift left) when moving from one >>>

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: >>> In other words, why is this different from what >>>vect_set_loop_controls_directly would do? > Oh, I see. You are confused that why I do not make multiple-rgroup vec_trunk > handling inside "vect_set_loop_controls_directly". > > Well. Frankly, I just replicate the handling of ARM

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
Sorry, I realised later that I had an implicit assumption here: if there are multiple rgroups, it's better to have a single IV for the smallest rgroup and scale that up to bigger rgroups. E.g. if the loop control IV is taken from an N-control rgroup and has a step S, an N*M-control rgroup would

Re: [PATCH V12] VECT: Add decrement IV iteration loop control by variable amount support

2023-05-24 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review. I needed some time to go through this patch and surrounding code to understand it, and to understand why it wasn't structured the way I was expecting. I've got some specific comments below, and then a general comment about how I think we should structure this.

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-24 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Mon, 22 May 2023 at 14:18, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > Hi Richard, >> > Thanks for the suggestions. Does the attached patch look OK ? >> > Bootstrap+test in progress on aarch64-linux-gnu. >> >> Like I say, please wait for the

Re: [PATCH V3] RISC-V: Add RVV comparison autovectorization

2023-05-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, May 23, 2023 at 5:05 PM wrote: >> >> From: Juzhe-Zhong >> >> This patch enable RVV auto-vectorization including floating-point >> unorder and order comparison. >> >> The testcases are leveraged from Richard. >> So include Richard as co-author. >> >>

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-05-24 Thread Richard Sandiford via Gcc-patches
Thanks for the update. Mostly LGTM, just some minor things left below. Oluwatamilore Adebayo writes: > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc > index > a49b09539776c0056e77f99b10365d0a8747fbc5..3a2248263cf67834a1cb41167a1783a3b6400014 > 100644 > ---

[PATCH] early-remat: Resync with new DF postorders [PR109940]

2023-05-24 Thread Richard Sandiford via Gcc-patches
When I wrote early-remat, the DF_FORWARD block order was a postorder of a reverse/backward walk (i.e. of the inverted cfg), rather than a reverse postorder of a forward walk. A postorder of a backward walk lacked the important property that dominators come before the blocks they dominate; instead
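The ordering property mentioned here can be illustrated on a toy graph. The following is a minimal sketch, not GCC code: the graph, the `Graph` alias, and all function names are hypothetical, chosen only for illustration. It assumes a four-block CFG with a loop (0 -> 1, 1 -> 2, 2 -> 1, 1 -> 3), in which block 1 dominates block 2. A reverse postorder of a forward walk places 1 before 2, whereas a postorder of a walk over the inverted CFG, started from the exit block, places 2 before 1.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Adjacency lists indexed by block number (hypothetical toy CFG,
// not GCC's data structures).
using Graph = std::vector<std::vector<int>>;

// Depth-first walk that appends each block after every block reached from it.
static void walk(const Graph &g, int block, std::vector<bool> &seen,
                 std::vector<int> &order) {
  seen[block] = true;
  for (int next : g[block])
    if (!seen[next])
      walk(g, next, seen, order);
  order.push_back(block);
}

// Postorder of a depth-first walk of g starting at `start`.
std::vector<int> postorder(const Graph &g, int start) {
  std::vector<bool> seen(g.size(), false);
  std::vector<int> order;
  walk(g, start, seen, order);
  return order;
}

// Reverse postorder: the postorder above, reversed.
std::vector<int> reverse_postorder(const Graph &g, int start) {
  std::vector<int> order = postorder(g, start);
  std::reverse(order.begin(), order.end());
  return order;
}

// True if block a appears earlier than block b in `order`.
bool comes_before(const std::vector<int> &order, int a, int b) {
  return std::find(order.begin(), order.end(), a)
         < std::find(order.begin(), order.end(), b);
}

// Toy CFG with a loop: 0 -> 1, 1 -> 2, 2 -> 1, 1 -> 3.
const Graph successors = {{1}, {2, 3}, {1}, {}};
// The same edges inverted, for the backward walk.
const Graph predecessors = {{}, {0, 2}, {1}, {1}};

void check_orders() {
  // Forward walk from the entry block: dominator 1 precedes block 2.
  assert(comes_before(reverse_postorder(successors, 0), 1, 2));
  // Backward walk from the exit block: block 2 precedes its dominator 1.
  assert(!comes_before(postorder(predecessors, 3), 1, 2));
}
```

Calling `check_orders()` exercises both orders on this one graph. The example only shows that the property can fail for a backward postorder when a loop is present; it makes no claim about the specific CFG behind PR109940.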

Re: [PATCH] tree-optimization/109747 - SLP cost of CTORs

2023-05-23 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The x86 backend looks at the SLP node passed to the add_stmt_cost > hook when costing vec_construct, looking for elements that require > a move from a GPR to a vector register and cost that. But since > vect_prologue_cost_for_slp decomposes the cost for an external > SLP

Re: [PATCH] Fix type error of 'switch (SUBREG_BYTE (op)).'

2023-05-23 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 5/17/23 03:03, Jin Ma wrote: >> For example: >> (define_insn "mov_lowpart_sidi2" >> [(set (match_operand:SI 0 "register_operand" "=r") >> (subreg:SI (match_operand:DI 1 "register_operand" " r") 0))] >> "TARGET_64BIT" >>

Re: [PATCH] RISC-V: Refactor the framework of RVV auto-vectorization

2023-05-23 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Yeah. I know. > Like ARM does everywhere: > (define_expand "vcond" > [(set (match_operand:SVE_ALL 0 "register_operand") > (if_then_else:SVE_ALL > (match_operator 3 "comparison_operator" > [(match_operand:SVE_I 4 "register_operand") >

Re: [PATCH 2/2] aarch64: Provide FPR alternatives for some bit insertions [PR109632]

2023-05-23 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, May 23, 2023 at 12:38 PM Richard Sandiford via Gcc-patches > wrote: >> >> At -O2, and so with SLP vectorisation enabled: >> >> struct complx_t { float re, im; }; >> complx_t add(complx_t a, complx_t b) {

[PATCH 1/2] md: Allow to refer to the value of int iterator FOO

2023-05-23 Thread Richard Sandiford via Gcc-patches
In a follow-up patch, I wanted to use an int iterator to iterate over various possible values of a const_int. But one problem with int iterators was that there was no way of referring to the current value of the iterator. This is unlike modes and codes, which provide automatic "mode", "MODE",

[PATCH 2/2] aarch64: Provide FPR alternatives for some bit insertions [PR109632]

2023-05-23 Thread Richard Sandiford via Gcc-patches
At -O2, and so with SLP vectorisation enabled: struct complx_t { float re, im; }; complx_t add(complx_t a, complx_t b) { return {a.re + b.re, a.im + b.im}; } generates: fmov w3, s1 fmov x0, d0 fmov x1, d2 fmov w2, s3 bfi
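For anyone wanting to reproduce the code generation locally, the quoted source can be put in a standalone file. This is a minimal sketch: `complx_t` and `add` are taken from the message, while `check_add` is a hypothetical helper added here only so that the result can be sanity-checked.

```cpp
#include <cassert>

struct complx_t { float re, im; };

// Component-wise addition, as quoted in the message.
complx_t add(complx_t a, complx_t b) {
  return {a.re + b.re, a.im + b.im};
}

// Hypothetical helper (not part of the original message): confirms that the
// function adds the real and imaginary parts independently.
void check_add() {
  complx_t r = add({1.0f, 2.0f}, {3.0f, 4.0f});
  assert(r.re == 4.0f && r.im == 6.0f);
}
```

Compiling this with `g++ -O2 -S` for an aarch64 target should show the sequence emitted for `add`; the exact instructions are likely to depend on the compiler version and on whether the patch is applied.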

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-22 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi Richard, > Thanks for the suggestions. Does the attached patch look OK ? > Bootstrap+test in progress on aarch64-linux-gnu. Like I say, please wait for the tests to complete before sending an RFA. It saves a review cycle if the tests don't in fact pass. > diff

Re: [PATCH V13] VECT: Fix bug of multiple-rgroup for length is counting elements

2023-05-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard that splits the patch of fixing multiple-rgroup > handling of length counting elements. > > This patch is fixing issue of handling multiple-rgroup of length is counting > elements > > Before this patch, multiple

Re: [PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard that splits the patch of fixing multiple-rgroup > handling of length counting elements. > > This patch is fixing issue of handling multiple-rgroup of length is counting > elements > > Before this patch, multiple

Re: [PATCH V11] VECT: Add decrement IV support in Loop Vectorizer

2023-05-19 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. Thanks for the comments. > > Would you mind telling me whether it is possible that we can make decrement > IV support into GCC middle-end ? > > If yes, could you tell what I should do next for the patches since I am > confused that it seems the
