Re: [PATCH] RISC-V: Support trailing vec_init optimization

2023-11-15 Thread Robin Dapp
Hi Juzhe, thanks, LGTM as it is just a refinement of what we already have. Regards Robin

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
As per Jeff's remark I'm going to push the attached. Regards Robin Subject: [PATCH v4] RISC-V: vsetvl: Refine REG_EQUAL equality. This patch enhances the equality check for REG_EQUAL notes in the vsetvl pass by using the == operator instead of rtx_equal_p. With that, in situations like the

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
On 11/13/23 11:36, juzhe.zh...@rivai.ai wrote: > --- /dev/null > +++ > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb_run-2.c > @@ -0,0 +1,19 @@ > +/* { dg-do run { target { riscv_v } } } */ > +/* { dg-additional-options "-march=rv64gcv_zbb --param >

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-13 Thread Robin Dapp
> On 2023/11/13 18:22, juzhe.zh...@rivai.ai wrote: >> If there is a difference between them. I think we should fix riscv-common.cc. >> Since I think "zvfh_zfh" should not be different with "zfh_zvfh" > > It's possible. Let me debug it and see if there's a problem. I don't think it is different.

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
On 11/13/23 10:38, juzhe.zh...@rivai.ai wrote: > For @code{REG_EQUIV}, the register is equivalent to @var{op} throughout > the entire function, and could validly be replaced in all its > occurrences by @var{op}.  (``Validly'' here refers to the data flow of > the program; simple replacement may

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-13 Thread Robin Dapp
> Looks like your configure is --with-march=rv32gcv_zvfh, can you change to > --with-march=rv32gcv_zvfh_zfh? From config.log: $ ../configure --prefix=/home/rdapp/projects/builds/gcc --target=riscv32-unknown-linux-gnu --disable-nls --disable-multilib --disable-bootstrap

Re: [PATCH V2] RISC-V: Optimize combine sequence by merge approach

2023-11-13 Thread Robin Dapp
Hi Juzhe, LGTM apart from: > + int64_t a = -1789089.23423; > + int64_t b = -8916156.45644; What's that? :) Doesn't really matter of course but please change to a proper integer. OK with that changed. Regards Robin
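
As background for the remark above: initializing an int64_t from a floating-point literal is valid C, but the conversion discards the fractional part (it truncates toward zero), so the digits after the decimal point in the quoted test have no effect. A minimal sketch follows; the function names are made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helpers, only to show what the initializers from the
   quoted hunk evaluate to.  Converting a double to an integer type
   discards the fractional part (truncation toward zero).  */
int64_t
quoted_a (void)
{
  int64_t a = -1789089.23423; /* same initializer as in the quoted test */
  return a;
}

int64_t
quoted_b (void)
{
  int64_t b = -8916156.45644; /* same initializer as in the quoted test */
  return b;
}
```

Writing the integer values directly, as requested in the review, gives the same test behavior without the misleading literals.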

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-13 Thread Robin Dapp
Hi Lehua, > Executing on host: > /work/home/lding/open-source/riscv-gnu-toolchain-push/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/build-gcc-newlib-stage2/gcc/xgcc > >

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
On 11/13/23 10:30, juzhe.zh...@rivai.ai wrote: > I just checked definition of REG_EQUAL and REG_EQUIV. > > As you said, REG_EQUIV is more reasonable. Agree with use rtx_equal_p on > REG_EQUIV and skip REG_EQUAL. > Could you check whether it does fix your issues ? Yes it would fix the issues. I

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-13 Thread Robin Dapp
> I'm going to configure with --with-arch=rv32gcv_zfh_zvfh --with-abi=ilp32d > to see if there is any difference. No change for me, how do you invoke the testsuite? I.e. Which target board? Regards Robin

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
On 11/13/23 09:25, juzhe.zh...@rivai.ai wrote: > Also, like kito previous remind me: > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635326.html >   > > I think you should add a dedicated test which with specifying >

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-13 Thread Robin Dapp
> xgcc: fatal error: Cannot find suitable multilib set for > '-march=rv32imafdcv_zicsr_zifencei_zfh_zfhmin_zve32f_zve32x_zve64d_zve64f_zve64x_zvl128b_zvl32b_zvl64b'/'-mabi=ilp32d' > compilation terminated. > compiler exited with status 1 > FAIL: gcc.target/riscv/rvv/autovec/vmv-imm-run.c -O3

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-13 Thread Robin Dapp
> FAIL: gcc.target/riscv/rvv/autovec/slp-mask-run-1.c -O3 -ftree-vectorize > (test for excess errors) > FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c -std=c99 -O3 > -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for > excess errors) > FAIL:

Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
> Does this patch fix exposed bugs in current tests? > Or could you add a test for it? Ah, yes, forgot to mention. This fixes several tests when testing with -march=rv64gcv_zbb. Regards Robin

[PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
Hi, this patch enhances the equality check for REG_EQUAL notes in the vsetvl pass. Currently, we assume that two such notes describe the same value when they have the same rtx representation. This is not true when either of the note's source operands is modified by an insn between the two
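
The problem described above can be illustrated with a toy model. This is not GCC code: `toy_note`, `toy_same_form` and `toy_eval` are hypothetical names, and a "note" here is just an expression of the form register plus constant. The sketch only shows why two notes with identical representation need not describe the same value once the source register is written in between.

```c
#include <stdbool.h>

/* Toy stand-in for a note: "value == regs[reg] + addend".  */
struct toy_note
{
  int reg;
  long addend;
};

/* Structural comparison: same register number and same addend.  */
bool
toy_same_form (struct toy_note a, struct toy_note b)
{
  return a.reg == b.reg && a.addend == b.addend;
}

/* Value the note describes for the current register contents.  */
long
toy_eval (struct toy_note n, const long *regs)
{
  return regs[n.reg] + n.addend;
}

/* Returns true when two structurally identical notes describe
   different values because the register was modified between them.  */
bool
toy_values_differ_after_write (void)
{
  long regs[2] = { 10, 20 };
  struct toy_note first = { 0, 4 };
  struct toy_note second = { 0, 4 };

  long v1 = toy_eval (first, regs);  /* 10 + 4 */
  regs[0] = 100;                     /* write between the two notes */
  long v2 = toy_eval (second, regs); /* 100 + 4 */

  return toy_same_form (first, second) && v1 != v2;
}
```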

[PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-10 Thread Robin Dapp
Hi, this patch fixes several more FAILs that would only show up in 32-bit runs. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vmul-zvfh-run.c: Adjust. * gcc.target/riscv/rvv/autovec/binop/vsub-zvfh-run.c: Ditto. *

Re: [PATCH] RISC-V: Add combine optimization by slideup for vec_init vectorization

2023-11-10 Thread Robin Dapp
Hi Juzhe, LGTM. The test patterns are a bit unwieldy but not a blocker IMHO. Could probably be done shorter using macro magic? Regards Robin

[PATCH] vect: Look through pattern stmt in fold_left_reduction.

2023-11-10 Thread Robin Dapp
Hi, more fallout from the COND_OP change was shown in PR112464. It appears as if we "look through" a statement pattern in vect_finish_replace_stmt but not before when we replace the newly created vector statement's lhs. Then the lhs is the statement pattern's lhs while in

Re: [PATCH] RISC-V: Robustify vec_init pattern[NFC]

2023-11-10 Thread Robin Dapp
Hi Juzhe, yes, that's reasonable. OK. Regards Robin

[PATCH] RISC-V/testsuite: Fix zvfh tests.

2023-11-09 Thread Robin Dapp
Hi, this fixes some zvfh test oversights as well as adds zfh to the target requirements. It's not strictly necessary to have zfh but it greatly simplifies test handling when we can just calculate the reference value instead of working around it. Regards Robin gcc/testsuite/ChangeLog:

[PATCH] vect: Use statement vectype for conditional mask.

2023-11-08 Thread Robin Dapp
Hi, as Tamar reported in PR112406 we still ICE on aarch64 in SPEC2017 when creating COND_OPs in ifcvt. The problem is that we fail to deduce the mask's type from the statement vectype and then end up with a non-matching mask in expand. This patch checks if the current op is equal to the mask

Re: [PATCH] vect/ifcvt: Add vec_cond fallback and check for vector versioning.

2023-11-07 Thread Robin Dapp
> isn't is_cond_op implied by mask != NULL? That said, if we ever end > up here with a non-cond op but a loop mask we effectively want the > same behavior so I think eliding is_cond_op and instead checking > mask != NULL_TREE below is more future proof. > > OK with that change. Thanks, attached

Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread Robin Dapp
> It need command line to enable SIMD auto-vectorization (VLS mode in RVV). > It will enable VLS modes auto-vectorization by default if we didn't add RISCV > into vect_cmdline. > So adding it to disable VLS mode vectorization which will fix the FAILs like > other targets. Ah so it's about SIMD

Re: [PATCH] RISC-V: Add RISC-V into vect_cmdline_needed

2023-11-07 Thread Robin Dapp
Looks OK but I don't really get the test (e.g. gen-vect-26.c). It is only run if target vect_cmdline_needed, otherwise compiled? Why does that have an impact on the scan? Looks weird but well... Regards Robin

Re: [PATCH] testsuite/vect: Make check more accurate.

2023-11-07 Thread Robin Dapp
Sorry, didn't reply-all: > /* { dg-final { scan-tree-dump-times "optimizing condition reduction with > FOLD_EXTRACT_LAST" 4 "vect" { target { { vect_fold_extract_last } && { ! > vect_pack_trunc } } } } } */ > > This check should be removed. Previously, I added it since we didn't enable  >

[PATCH] testsuite/vect: Make check more accurate.

2023-11-07 Thread Robin Dapp
Hi, similar to before this modifies a check so we do only match a vectorization attempt if it succeeded. On riscv we potentially try several modes of which some may fail. I tested on riscv, aarch64 and x86 but on the cfarm machines there is no vect_fold_extract_last. Maybe gcn would work?

[PATCH] testsuite: Change expectation for bb-slp-over-widen-n.c

2023-11-07 Thread Robin Dapp
Hi, this patch makes sure we check for note: Basic block will be vectorized using SLP instead of optimized: basic block which will also match optimized: basic block part of which there are many more in an RVV dump. Tested on x86 and aarch64 as well as RVV. Regards Robin

Re: [PATCH] RISC-V: Fixed failed rvv combine testcases

2023-11-06 Thread Robin Dapp
Thanks, what I was slightly concerned about is that we now have the implicit assumption that the initial value is 0. I mean that's what the vectorizer does for reductions but theoretically, wouldn't we also combine other values into 0 now? Regards Robin

[PATCH] vect/ifcvt: Add vec_cond fallback and check for vector versioning.

2023-11-06 Thread Robin Dapp
Hi, this restricts tree-ifcvt to only create COND_OPs when we versioned the loop for vectorization. Apart from that it re-creates a VEC_COND_EXPR in vect_expand_fold_left if we emitted a COND_OP. I'm still missing the "bail out" part for vect_expand_fold_left, though? Bootstrap, testsuites are

Re: [PATCH] RISC-V: Enhance AVL propagation for complicate reduction auto-vectorization

2023-11-06 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] Fix configure script comments(!?!) (Was: Re: [PATCH] genemit: Split insn-emit.cc into ten files)

2023-11-06 Thread Robin Dapp
> I'm not sure what that means, whether a wrong version of > autoconf/automake was used (though when I accidentally tried that, it > has always complained loudly) or if some environment difference can > cause this. Perhaps I should change the script not to care about > commits though that won't

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Ah, OK. IMO it's better to keep the optab operands the same as the IFN > operands, even if that makes things inconsistent with vcond_mask. > vcond_mask isn't really a good example to follow, since the operand > order is not only inconsistent with the IFN, it's also inconsistent > with the

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Could you explain why a special expansion is needed? (Sorry if you already > have and I missed it, bit overloaded ATM.) What does it do that is > different from what expand_fn_using_insn would do? All it does (in excess) is shuffle the arguments - vcond_mask_len has the mask as third operand

Re: [PATCH V2] RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]

2023-11-02 Thread Robin Dapp
Hi Juzhe, in principle this LGTM. It could use some function comments, though ;) > +imm_avl_p (machine_mode mode) > { >poly_uint64 nuints = GET_MODE_NUNITS (mode); > >return nuints.is_constant () > -/* The vsetivli can only hold register 0~31. */ > -? (IN_RANGE

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Robin Dapp
> Looks reasonable overall. The new match patterns are 1:1 the > same as the COND_ ones. That's a bit awkward, but I don't see > a good way to "macroize" stuff further there. Can you at least > interleave the COND_LEN_* ones with the other ones instead of > putting them all at the end? Yes, no

Re: [PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-31 Thread Robin Dapp
>> +int >> +internal_fn_else_index (internal_fn fn) > > The function needs a comment, maybe: > > /* If FN is an IFN_COND_* or IFN_COND_LEN_* function, return the index of the >argument that is used when the condition is false. Return -1 otherwise. > */ > > OK for the internal-fn* and

Re: [PATCH] RISC-V: Support strided load/store

2023-10-31 Thread Robin Dapp
Hi Juzhe, LGTM once the middle-end parts are in. Just tiny nits. Nothing that would warrant a V2, though. > +;; = > +;; == Stried Load/Store missing a 'd' here. > +(define_predicate "vector_stride_extension_operand" > +

Re: [PATCH] RISC-V: Add vector fmin/fmax expanders.

2023-10-31 Thread Robin Dapp
Thanks, going to commit the attached. Regards Robin This patch adds expanders for fmin and fmax. As per RISC-V V Spec 1.0 vfmin/vfmax are IEEE 754-2019 compliant which differs from IEEE 754-2008 that fmin/fmax require (particularly in the signaling-NaN handling). Therefore the pattern

Re: [PATCH] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-10-31 Thread Robin Dapp
Hi Juzhe, > +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern > +@item @samp{mask_len_strided_load@var{m}@var{n}} > +Load several separate memory locations into a vector of mode m. > +Operand 1 is a scalar base address and operand 2 is mode @var{n} > +specifying each uniform

Re: [PATCH] RISC-V: Add vector fmin/fmax expanders.

2023-10-30 Thread Robin Dapp
> Aren't they actually the IEEE 754-2019 operations (with different > signaling NaN semantics; C functions such as fmaximum in C23), not the > IEEE 754-2008 operations (C functions such as fmax)? V spec 1.0 says "The > vector floating-point vfmin and vfmax instructions have the same behavior

[PATCH] RISC-V: Add vector fmin/fmax expanders.

2023-10-30 Thread Robin Dapp
Hi, this patch adds expanders for fmin and fmax and the associated cond and reduc ones. As per RISC-V V spec 1.0 vfmin/vfmax are IEEE 754-2008 compliant so that should be ok. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (3): fmax/fmin expanders. (cond_):

Re: [PATCH] RISC-V: Fix bugs of handling scalar of SEW64 vx instruction in RV32

2023-10-30 Thread Robin Dapp
Thanks, LGTM. Regards Robin

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-27 Thread Robin Dapp
bin From 248744c328440bff9cc339d2bf622852cbaac343 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 12 Oct 2023 11:23:26 +0200 Subject: [PATCH v3] genemit: Split insn-emit.cc into several partitions. On riscv insn-emit.cc has grown to over 1.2 million lines of code and compiling it takes considerable time. Therefor

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> It seems that you didn't commit it yet. > > A nit comment: > > + int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul; > > I think you could use TARGET_MAX_LMUL No, didn't commit yet, testsuite was still running. OK, added it, will commit later. Regards Robin

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
Attached v3 that I'd commit. Regards Robin From 246b986a8ea2332ced7a094dd68d35d84dcbbc04 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Tue, 24 Oct 2023 10:33:15 +0200 Subject: [PATCH v3] RISC-V: Add rawmemchr expander. This patch adds a vectorized rawmemchr expander. It also mo

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> I notice we have expand_block_move > in riscv-v.cc > > Maybe we should move it into riscv-string.cc ? Yes I will also move that one. Regards Robin

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> Could you put it into riscv-string.cc rather than riscv-v.cc? I would > like to put those builtin function expander together if possible, > riscv-string.cc might little bit confuse, but it's all included in > string.h Ok, sure. Will commit the adjusted patch if no further comments. Regards

Re: [PATCH] RISC-V: Fix wrong tune parameters on int_div

2023-10-27 Thread Robin Dapp
> @@ -346,7 +346,7 @@ static const struct riscv_tune_param rocket_tune_info = { >{COSTS_N_INSNS (4), COSTS_N_INSNS (5)},/* fp_mul */ >{COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */ >{COSTS_N_INSNS (4), COSTS_N_INSNS (4)},/* int_mul */ > - {COSTS_N_INSNS (6),

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> Suggested adapt codes as follows: > > unsigned int element_size = GET_MODE_SIZE (mode).to_constant (); > poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR *TARGET_MAX_LMUL, > element_size); > if (!get_vector_mode(mode, nunits).exists()) >   gcc_unreachable (); Actually I was initially

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
> +(define_expand "vcond_mask_len_" > +  [(match_operand:V_VLS 0 "register_operand") > +    (match_operand: 3 "nonmemory_operand") > +    (match_operand:V_VLS 1 "nonmemory_operand") > +    (match_operand:V_VLS 2 "autovec_else_operand") > +    (match_operand 4 "autovec_length_operand") > +   

[PATCH] RISC-V: Add rawmemchr expander.

2023-10-26 Thread Robin Dapp
Hi, this patch adds a vectorized rawmemchr expander. It's basically strstr but for 8, 16 and 32-bit needles. Apart from adjusting the common-code tests I re-used a similar test that Stefan added to the s390 backend. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (rawmemchr):

[PATCH] RISC-V: Fix cond_sqrt tests.

2023-10-26 Thread Robin Dapp
helpers. Regards Robin From f5c43e700c0343ec32d95a28226837e26c1bfdca Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 26 Oct 2023 20:40:00 +0200 Subject: [PATCH] RISC-V: Fix cond_sqrt tests. As long as we do not have universal Zvfh support in binutils linking against libm does not w

Re: [Ready to commit V3] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Robin Dapp
> rv32gcv: > FAIL: gfortran.dg/intrinsic_pack_6.f90   -O2  execution test > FAIL: gfortran.dg/intrinsic_pack_6.f90   -O3 -g  execution test > FAIL: gfortran.dg/matmul_3.f90   -O2  execution test > FAIL: gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution,  -O2 > FAIL:

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
, aarch64 and power10. Regards Robin From 7acdebb5b13b71331621af08da6649fe08476fe8 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 25 Oct 2023 22:19:43 +0200 Subject: [PATCH v3] internal-fn: Add VCOND_MASK_LEN. In order to prevent simplification of a COND_OP with degenerate mask (all true or

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
> Yeah. I think Robin may need this : > > TREE_CODE (else_val) == SSA_NAME > && SSA_NAME_IS_DEFAULT_DEF (else_val) > && VAR_P (SSA_NAME_VAR (else_val)) > > to differentiate whether the ELSE VALUE is uninitialized SSA or not. I think we are talking about a different simplification now. This

Re: [PATCH V2] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Robin Dapp
> I have sent V3 with adapting testcases (2 additional dump FAILs detected by > both Pan Li and Patrick). > No need to review. > > I will wait for patrick is ok to ignore popcount FAILs for now then commit it. Just to confirm: I can now also reproduce the popcount fail on my machine without

Re: [PATCH] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Robin Dapp
> Increased FAILS are LMUL = M4. I have analyzed the codegen. Looks > reasonable. > > Moreover, When I removed 'popcount_64' and test, all passed no matter > apply this patch or not. > > I think it is because popcount64 is buggy in RV32, this patch trigger > LMUL = 4 bug already existed that

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Robin Dapp
. For now I kept the expander function but used a direct optab. Regards Robin From 4f793b71184b3301087780ed500f798d69328fc9 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 13 Oct 2023 10:20:35 +0200 Subject: [PATCH v2] internal-fn: Add VCOND_MASK_LEN. In order to prevent simp

Re: [RFC] RISC-V: elide sign extend when expanding cmp_and_jump

2023-10-25 Thread Robin Dapp
> Well, it doesn't seem like there's a lot of difference between doing > it in the generic expander bits vs target expander bits -- the former > just calls into the latter for the most part. Thus if the > subreg-promoted state is available in the target expander, I'd expect > it to be available

Re: [PATCH] RISC-V: Fix multiple EXCESS test FAILs in RVV testsuite

2023-10-25 Thread Robin Dapp
> Hmmm. I am not familiar with Binutils... > > I just adapted tests like others in the testsuite make them consistent. > And turns out it can fix the issues.. I see where you're coming from, but can you assemble/link any executable with -march=..._zvfh? Probably not? Doesn't half of GCC's

Re: [PATCH] RISC-V: Fix multiple EXCESS test FAILs in RVV testsuite

2023-10-25 Thread Robin Dapp
> However, when I built the toolchain with -march=rv64gcv_zfh_zvfh. > Then link fail. Hmm, is it about zvfh or why does linking fail? Regards Robin

Re: [PATCH] RISC-V: Change MD attribute avl_type into avl_type_idx[NFC]

2023-10-25 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix multiple EXCESS test FAILs in RVV testsuite

2023-10-25 Thread Robin Dapp
Hi Juzhe, I guess that's OK but what's the problem here? Are the default options wrong so we need to overwrite them instead of adding some? Regards Robin

Re: [RFC] RISC-V: elide sign extend when expanding cmp_and_jump

2023-10-25 Thread Robin Dapp
Hi Vineet, I was thinking of two things while skimming the code: - Couldn't we do this in the expanders directly? Or is the subreg-promoted info gone until we reach that? - Should some common-code part be more suited to handle that? We already elide redundant sign-zero extensions for

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-24 Thread Robin Dapp
ell as internal_fn_else_index. Testsuite on riscv is unchanged, bootstrap and testsuite on power10 done, aarch64 and x86 still running. Regards Robin From e11ac2b5889558c58ce711d8119ebcd78173ac6c Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 13 Sep 2023 22:19:35 +0200 Subject: [PATCH v6] ifcv

[PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-23 Thread Robin Dapp
cond_mask_len that results from a simplification with all-one or all-zero mask we could allow constant immediate vectors and expand them to simple len moves in the backend. Regards Robin From bc72e9b2f3ee46508404ee7723ca78790fa96b6b Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 13 Oct 2

Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2023-10-20 Thread Robin Dapp
> Sorry for the slow review. TBH I was hoping someone else would pick > it up, since (a) I'm not very familiar with this code, and (b) I don't > really agree with the way that the current code works. I'm not sure the > current dependency checking is safe, so I'm nervous about adding even > more

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-19 Thread Robin Dapp
Ugh, I didn't push yet because with a rebased trunk I am seeing different behavior for some riscv testcases. A reduction is not recognized because there is yet another "double use" occurrence in check_reduction_path. I guess it's reasonable to loosen the restriction for conditional operations

Re: [PATCH V2] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-19 Thread Robin Dapp
Hi Juzhe, as discussed off-list this approach generally makes sense to me so the patch LGTM once the vsetvl rework is upstream and settled. Independently, we still need to understand why the more complex broadcast pattern is not hoisted out of the loop. Regards Robin

Re: [PATCH V3 00/11] Refactor and cleanup vsetvl pass

2023-10-19 Thread Robin Dapp
Hi Lehua, thanks for the extensive rework. I'm going to let Juzhe handle the review since it's his pass and he knows it best. Delegated it to him in patchwork. Regards Robin

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> Could you by the way add this mention this PR: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111791 > Add the test of this PR ? Commented in that PR. This patch does not help there. Regards Robin

[PATCH] vect: Allow same precision for bit-precision conversions.

2023-10-18 Thread Robin Dapp
Hi, even though there was no full conclusion yet I took the liberty of just posting this as a patch in case of further discussion. In PR/111794 we miss a vectorization because on riscv type precision and mode precision differ for mask types. We can still vectorize when allowing assignments with

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
I didn't push this yet because it would have introduced an UNRESOLVED that my summary script didn't catch. Normally I go with just contrib/test_summary but that only filters out FAIL and XPASS. I should really be using compare_testsuite_log.py from riscv-gnu-toolchain/scripts. It was caused by

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess will failed SLP on popcount. Added VLS modes and your test in v2. Testsuite looks unchanged on my side (vect, dg, rvv). Regards Robin Subject: [PATCH v2] RISC-V: Add popcount fallback expander. I didn't manage to get back to the generic

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess will failed SLP on > popcount. Hehe, right, I just copied and pasted the expander from my old patch. Will adjust it and add the test. Regards Robin

[PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
Hi, as I didn't manage to get back to the generic vectorizer fallback for popcount in time (still the generic costing problem) I figured I'd rather implement the popcount fallback in the riscv backend. It uses the WWG algorithm from libgcc. rvv.exp is unchanged, vect and dg.exp testsuites are
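
For readers unfamiliar with the algorithm mentioned above, here is a scalar sketch of the bit-counting scheme usually attributed to Wilkes, Wheeler and Gill, which is presumably what "WWG" refers to. It is not the expander from the patch (that one emits vector instructions); the function name `popcount64_wwg` is made up for illustration, and a 64-bit element width is assumed.

```c
#include <stdint.h>

/* Count the set bits of X by summing adjacent bit fields of
   increasing width: 2-bit sums, then 4-bit sums, then 8-bit sums,
   and finally adding all bytes with one multiplication.  */
uint64_t
popcount64_wwg (uint64_t x)
{
  /* Each 2-bit field now holds the number of set bits it had.  */
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  /* Each 4-bit field holds the sum of its two 2-bit fields.  */
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  /* Each byte holds the sum of its two 4-bit fields.  */
  x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
  /* The multiplication accumulates all byte sums in the top byte.  */
  return (x * 0x0101010101010101ULL) >> 56;
}
```

The appeal for a vector fallback is that every step is a plain shift, mask, add or multiply, so no dedicated popcount instruction is needed.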

Re: [PATCH] RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx

2023-10-18 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
Thank you for the explanation. So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along with the respective helper and expand functions, what would be the way forward? Generate an IFN_VCOND_MASK(_LEN) here instead of a VEC_COND_EXPR? How would I make sure all of match.pd's vec_cond

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
>> I don't know much about valueisation either :) But it does feel >> like we're working around the lack of a LEN form of COND_EXPR. >> In other words, it seems odd that we can do: >> >> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >> >> but we can't do: >> >> IFN_COND_LEN (mask, a, b, len,

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-17 Thread Robin Dapp
> Natively, things seem fine, but for cross, I get failures on a few > targets (hppa2.0-unknown-linux-gnu, hppa64-unknown-linux-gnu). > > With ./configure --host=x86_64-pc-linux-gnu > --target=hppa2.0-unknown-linux-gnu --build=x86_64-pc-linux-gnu && make > -j$(nproc), I get a bunch of stuff like:

Re: [PATCH V3] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store

2023-10-16 Thread Robin Dapp
> + if (live_range && flow_bb_inside_loop_p (loop, e->src)) > + { Doesn't this match several cases more than before i.e set the range start to zero fairly often? I mean if it works fine with me and the code is easier to read. Please split off the search for the

Re: [PATCH V2] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store

2023-10-16 Thread Robin Dapp
Hi Juzhe, > +/* Get STORE value. */ > +static tree > +get_store_value (gimple *stmt) > +{ > + if (is_gimple_call (stmt) && gimple_call_internal_p (stmt)) > +{ > + if (gimple_call_internal_fn (stmt) == IFN_MASK_STORE) > + return gimple_call_arg (stmt, 3); > + else > +

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-16 Thread Robin Dapp
not help but also doesn't slow anything down. insn-emit.cc isn't very large to begin with on s390. Regards Robin From 34d05113a4e3c7e83a4731020307e26c1144af69 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 12 Oct 2023 11:23:26 +0200 Subject: [PATCH v2] genemit: Split insn-emit.cc into seve

Re: [PATCH] RISC-V: Use VLS modes if the NITERS is known and smaller than VLS mode elements.

2023-10-16 Thread Robin Dapp
Hi Juzhe, this LGTM. I was first concerned whether we would want to stop e.g. at LMUL = 1 and only continue with a specific flag but actually this should be done via the costs. If an implementation wants to penalize or incentivize some behavior it can always adjust the costs which should be

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-13 Thread Robin Dapp
> Why are the contents of this if statement wrong for COND_LEN? > If the "else" value doesn't matter, then the masked form can use > the "then" value for all elements. I would have expected the same > thing to be true of COND_LEN. Right, that one was overly pessimistic. Removed. > But isn't

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-13 Thread Robin Dapp
> Hmm why? The same callback you use to consume the listed arguments > can be used to consume the list can it not? I may be wrong, but from > what I remember the callback is called when main can't consume an > argv value and it's allowed to eat all remaining input? Ah, I see. If that's

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-13 Thread Robin Dapp
> Testsuite is unchanged on all but x86 where, strangely, I saw several > illegal instructions in the pch tests. Those were not reproducible > in a second manual test suite run. I'm just running another full > bootstrap and testsuite cycle with the latest trunk. Follow-up on the pch tests. The

[PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-12 Thread Robin Dapp
Hi, on riscv insn-emit.cc has grown to over 1.2 million lines of code and compiling it takes considerable time. Therefore, this patch adjusts genemit to create ten files insn-emit-1.cc to insn-emit-10.cc. In order to do so it first counts the number of available patterns, calculates the number of
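
The snippet above is cut off, so the exact partitioning scheme is not visible here. As a rough sketch of one way to spread a known number of patterns over a fixed number of output files, assuming an even split with rounding up (the helper names are hypothetical and not taken from genemit):

```c
/* Number of patterns each output file receives, rounded up so that
   N_FILES files can hold all N_PATTERNS patterns.  Assumes
   N_FILES > 0.  */
unsigned
patterns_per_file (unsigned n_patterns, unsigned n_files)
{
  return (n_patterns + n_files - 1) / n_files;
}

/* 1-based index of the output file that receives the pattern with
   0-based index I, e.g. 1 for insn-emit-1.cc.  Assumes
   PER_FILE > 0.  */
unsigned
file_for_pattern (unsigned i, unsigned per_file)
{
  return i / per_file + 1;
}
```

With 95 patterns and ten files, for example, this assigns ten patterns to each of the first nine files and the remaining five to the tenth.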

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-11 Thread Robin Dapp
8e50859 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 13 Sep 2023 22:19:35 +0200 Subject: [PATCH v4] ifcvt/vect: Emit COND_ADD for conditional scalar reduction. As described in PR111401 we currently emit a COND and a PLUS expression for conditional reductions. This makes it difficu

Re: [PATCH V3] RISC-V: Fix incorrect index(offset) of gather/scatter

2023-10-11 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH V2] RISC-V: Fix incorrect index(offset) of gather/scatter

2023-10-11 Thread Robin Dapp
Hi Juzhe, good that you noticed it now, I should have caught that in the review back then... One thing, though: > + if (inner_offsize < GET_MODE_BITSIZE (GET_MODE (ptr)).to_constant ()) Shouldn't ptr always be Pmode i.e. the bitsize == XLEN? Rest LGTM. Regards Robin

Re: [PATCH] RISC-V: Remove XFAIL of ssa-dom-cse-2.c

2023-10-11 Thread Robin Dapp
LGTM FWIW. Regards Robin

Re: [PATCH] RISC-V: Enable full coverage vect tests

2023-10-11 Thread Robin Dapp
Hi Juzhe, seems OK to me. We don't support most of the patterns directly but as we can and want to vectorize them it makes sense to enable the tests. Regards Robin

Re: [PATCH V2] RISC-V: Support movmisalign of RVV VLA modes

2023-10-09 Thread Robin Dapp
Thanks, for now this LGTM. Regards Robin

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-09 Thread Robin Dapp
> Hmm, the function is called at transform time so this shouldn't help > avoiding the ICE. I expected we refuse to vectorize _any_ reduction > when sign dependent rounding is in effect? OTOH maybe sign-dependent > rounding is OK but only when we use a unconditional fold-left > (so a loop mask

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-09 Thread Robin Dapp
> It'd be good to expand on this comment a bit. What kind of COND are you > anticipating? A COND with the neutral op as the else value, so that the > PLUS_EXPR (or whatever) can remain unconditional? If so, it would be > good to sketch briefly how that happens, and why it's better than using >

Re: [PATCH V2] TEST: Fix vect_cond_arith_* dump checks for RVV

2023-10-09 Thread Robin Dapp
On 10/9/23 09:32, Andreas Schwab wrote: > On Okt 09 2023, juzhe.zh...@rivai.ai wrote: > >> Turns out COND(_LEN)?_ADD can't work. > > It should work though. Tcl regexps are a superset of POSIX EREs. > The problem is that COND(_LEN)?_ADD matches two times against COND_LEN_ADD and a

Re: [PATCH] RISC-V: Support movmisalign of RVV VLA modes

2023-10-09 Thread Robin Dapp
Hi Juzhe, I think an extra param might be too intrusive. I would expect normal hardware implementations to support unaligned accesses (but they might be slow which should be covered by costs) and only rarely have hardware that doesn't support it and raises exceptions. Therefore I would suggest

Re: [PATCH] TEST: Fix dump FAIL of vect-multitypes-16.c for RVV

2023-10-09 Thread Robin Dapp
> Maybe I should pretend RVV support vect_pack/vect_unpack and enable > all the tests in target-supports.exp? The problem is that vect_pack/unpack is an overloaded term at the moment meaning "vector conversion" (promotion/demotion) or so. This test does not require pack/unpack for successful
