Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-08 Thread Robin Dapp
> > +  if (is_gimple_min_invariant (op)) > > +    return true; > > +  if (SSA_NAME_IS_DEFAULT_DEF (op) > > +  || !flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT > (op > > +    return true; > > +  return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1; > > +}

Re: [PATCH] RISC-V: Allow simplification non-vlmax with len = NUNITS reg to reg move

2024-01-05 Thread Robin Dapp
> +/* Return true it is whole register-register move. */ > +bool > +whole_reg_to_reg_move_p (rtx *ops, machine_mode mode) > +{ > + if (register_operand (ops[0], mode) > + && register_operand (ops[3], mode) > + && satisfies_constraint_vu (ops[2]) > + && satisfies_constraint_Wc1

Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-05 Thread Robin Dapp
> 1). We not only have vashl_optab,vashr_optab,vlshr_optab which vectorize > shift with vector shift amount, > that is, vectorization of 'a[i] >> x[i]', the shift amount is loop variant. > 2). But also, we have ashl_optab, ashr_optab, lshr_optab which can vectorize > shift with scalar shift
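
The distinction drawn in the quoted text can be illustrated with two plain C loops. This is only a sketch: the function names are hypothetical and not taken from the patch, and the second loop is assumed to correspond to the scalar shift amount case that the truncated quote goes on to describe.

```c
#include <stddef.h>
#include <stdint.h>

/* Loop-variant amount: every element is shifted by its own x[i].
   All amounts are assumed to be smaller than 32.  */
static void
shift_by_vector (uint32_t *dst, const uint32_t *a, const uint32_t *x,
                 size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] >> x[i];
}

/* Loop-invariant amount: one scalar x (< 32) is shared by all elements.  */
static void
shift_by_scalar (uint32_t *dst, const uint32_t *a, uint32_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] >> x;
}
```

In the second loop the shift amount stays in a single scalar value for the whole loop, which is the situation the patch title calls a loop invariant shift amount.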

[PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

2023-12-18 Thread Robin Dapp
Hi, found in PR112971, this patch adds folding support for bitwise operations of const duplicate zero vectors and stepped vectors. On riscv we have the situation that a folding would perpetually continue without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would not fold to {0, 0, 0,
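
As a rough illustration of the identities involved, the following C sketch applies AND, IOR and XOR element-wise to an all-zero vector and a stepped vector such as {7, 6, 5, ...}. It is not the fold-const implementation; the helper names are hypothetical and fixed-size arrays stand in for GCC's vector constants.

```c
#include <stddef.h>

/* Fill OUT with the stepped series BASE, BASE + STEP, ...  */
static void
make_stepped (int *out, size_t n, int base, int step)
{
  for (size_t i = 0; i < n; i++)
    out[i] = base + (int) i * step;
}

/* 0 & v[i] is always 0, so the result is the zero vector again.  */
static void
zero_and (const int *v, int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = 0 & v[i];
}

/* 0 | v[i] is v[i], so the result is the stepped vector itself.  */
static void
zero_ior (const int *v, int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = 0 | v[i];
}

/* 0 ^ v[i] is v[i] as well.  */
static void
zero_xor (const int *v, int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = 0 ^ v[i];
}
```

Since each result is either the zero vector or the unchanged stepped vector, no element-wise computation is needed to know the outcome at compile time.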

Re: [PATCH V2] RISC-V: Support one more overlap for wv instructions

2023-12-18 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH V2] RISC-V: Enable vect test for RV32

2023-12-18 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix natural regsize for fixed-vlmax of -march=rv64gc_zve32f

2023-12-18 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Enable vect test for RV32

2023-12-18 Thread Robin Dapp
Hi Juzhe, > -} elseif [istarget riscv64-*-*] { > +} elseif [istarget riscv*-*-*] { > if [check_effective_target_riscv_v] { > lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi" > set dg-do-what-default run Yes, that's reasonable. A bit further down we have

Re: [PATCH V2] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
On 12/15/23 13:52, juzhe.zh...@rivai.ai wrote: > Do you mean : > >   /* We need to use precomputed mask for such situation and such mask >      can only be computed in compile-time known size modes.  */ >   bool indices_fit_selector_p >     = GET_MODE_BITSIZE (GET_MODE_INNER (vmode)) > 8 ||

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
> Oh. I think it should be renamed into not_fit. > > Is this following make sense to you ? > >   /* We need to use precomputed mask for such situation and such mask >      can only be computed in compile-time known size modes.  */ >   bool indices_not_fit_selector_p >     = maybe_ge (vec_len, 2

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
On 12/15/23 13:16, juzhe.zh...@rivai.ai wrote: > >>> bool indices_fit_selector = maybe_ge (vec_len, 2 << GET_MODE_BITSIZE >>> (GET_MODE_INNER (vmode))); > No, I think it will make us miss some optimization. > > For example, for poly value [16,16]  maybe_ge ([16,16], 65536) which makes us >

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
Hi Juzhe, in general looks OK. > + /* We need to use precomputed mask for such situation and such mask > + can only be computed in compile-time known size modes. */ > + if (GET_MODE_BITSIZE (GET_MODE_INNER (vmode)) == 8 && maybe_ge (vec_len, > 256) > + && !vec_len.is_constant ()) >

Re: [PATCH] expmed: Get vec_extract element mode from insn_data, [PR112999]

2023-12-14 Thread Robin Dapp
> It looks like: > > FOR_EACH_MODE_FROM (new_mode, new_mode) > if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) > && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) > && targetm.vector_mode_supported_p (new_mode) >

[PATCH] expmed: Get vec_extract element mode from insn_data, [PR112999]

2023-12-14 Thread Robin Dapp
Hi, this is a bit of a follow up of the latest expmed change. In extract_bit_field_1 we try to get a better vector mode before extracting from it. Better refers to the case when the requested target mode does not equal the inner mode of the vector to extract from and we have an equivalent

Re: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread Robin Dapp
Thanks, LGTM but please add a comment like: These test cases used to cause out-of-bounds writes to the stack and therefore showed unreliable behavior. Depending on the execution environment they can either pass or fail. As of now, with the latest QEMU version, they will pass even without the

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> Do you mean add some comments in tests? I meant add it as a run test as well and comment that the test has caused out-of-bounds writes before and passed by the time of adding it (or so) and is kept regardless. Regards Robin

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread Robin Dapp
Thanks. The attached v2 goes with your suggestion and adds a vec_extractbi expander. Apart from that it keeps the MODE_PRECISION changes from before and uses insn_data[icode].operand[0]'s mode. Apart from that no changes on the riscv side. Bootstrapped and regtested on x86 and aarch64. On

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> I don't choose to run since I didn't have issue run on my local > simulator no matter qemu or spike. Yes it was flaky. That's kind of expected with the out-of-bounds writes we did. They can depend on runtime environment and other factors. Of course it's a bit counterintuitive to add a

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
Hi Juzhe, in general looks OK to me. Just a question for understanding: > - if (header_info.valid_p () > - && (anticipated_exp_p (header_info) || block_info.full_available)) Why is full_available true if we cannot use it? > +/* { dg-do compile } */ It would be nice if we could

Re: [PATCH] RISC-V: Apply vla vs. vls mode heuristic vector COST model

2023-12-12 Thread Robin Dapp
Given that it's almost verbatim aarch64's implementation and the general approach appears sensible, LGTM. Regards Robin

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-12 Thread Robin Dapp
> - Change the second mode to vec_extract_optab. This is only a name > lookup, and it seems more natural to continue using the real element mode. Am I understanding correctly that this implies we should provide a vec_extractbi expander? (with the innermode being BImode here). Regards Robin

Re: [PATCH] RISC-V: Refactor Dynamic LMUL codes

2023-12-12 Thread Robin Dapp
Yes, no harm in doing that. LGTM. Regards Robin

Re: Ping: [PATCH] Add a late-combine pass [PR106594]

2023-12-11 Thread Robin Dapp
Hi Richard, I have tested the new pass on riscv64 and while it did exhibit some regressions, none of them are critical. Mostly, test expectations will need to be adjusted - no new execution failures. As mentioned in the initial discussion it does help us get the behavior we want but, as of now,

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-11 Thread Robin Dapp
> Yes, I test the patch with all below configurations and there is no failure > now. That would be great! Thank you! I posted it as a patch now: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640182.html Regards Robin

[PATCH] RISC-V: testsuite: Fix strcmp-run.c test.

2023-12-11 Thread Robin Dapp
Hi, this fixes expectations in the strcmp-run test which would sometimes fail with newlib. The test expects libc strcmp return values and asserts the vectorized result is similar to those. Therefore hard-code the expected results instead of relying on a strcmp call. Pan has already tested in a

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-11 Thread Robin Dapp
Hi Pan, > I reduced the SZ size from 10 to 1, and the below case with SZ = 2 > will fail. The failed location is "foo is 50, foo2 is 12800, i,j is > 1, 0". > > #define SZ 2 > > const char *s[SZ] = {"1", > "12345678901234567889012345678901234567890"}; Thanks. I still cannot reproduce but I

Re: [PATCH] RISC-V: Robostify shuffle index used by vrgather and fix regression

2023-12-11 Thread Robin Dapp
> Robostify the gather index to fix those FAILs. OK. They must have somehow slipped through because I pruned vlmax tests for comparison of different vlens. Regards Robin

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-11 Thread Robin Dapp
What also works is something like: scalar_mode extract_mode = innermode; if (GET_MODE_CLASS (outermode) == MODE_VECTOR_BOOL) extract_mode = smallest_int_mode_for_size (GET_MODE_PRECISION (innermode)); however > So yes, I guess we need to answer

Re: [PATCH 1/3] RISC-V: movmem for RISCV with V extension

2023-12-11 Thread Robin Dapp
Ah, please also ensure to include (and follow) the stringop_strategy checks. (LIBCALL, VECTOR) The naming is a bit unfortunate still but that need not be fixed in this patch. Regards Robin

Re: [PATCH 1/3] RISC-V: movmem for RISCV with V extension

2023-12-11 Thread Robin Dapp
Hi Sergei, thanks for contributing this! Small general remarks/nits upfront: The code looks like it hasn't been run through clang-format or similar. Please make sure that it adheres to the GNU coding conventions. The same applies to comments. Some of them start in lowercase. As you rely on

Re: [PATCH] RTL-SSA: Fix ICE on record_use of RTL_SSA for RISC-V VSETVL PASS

2023-12-11 Thread Robin Dapp
> Yes. This is the last chance to walk around it here but we will end up with > more patterns. > since reduction dest operand always LMUL = 1 mode. > > So, when -march=rv64gcv, the dest mode should be V4SI, if > -march=rv64gcv_zvl256b, the dest mode should be V8SI. > ...etc.  Different

Re: [PATCH] RTL-SSA: Fix ICE on record_use of RTL_SSA for RISC-V VSETVL PASS

2023-12-11 Thread Robin Dapp
> In record_use: > > if (HARD_REGISTER_NUM_P (regno) > && partial_subreg_p (use->mode (), mode)) > > Assertion failed on partial_subreg_p which is: > > inline bool > partial_subreg_p (machine_mode outermode, machine_mode innermode) > { > /* Modes involved in a subreg must be

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-11 Thread Robin Dapp
> FYI. I have the some failures as juzhe mentioned, with the emulator > qemu version qemu-riscv64 version 8.1.93 (v8.2.0-rc3). The entire log > may look like below: > > Executing on host: > /home/box/panli/riscv-gnu-toolchain/build-gcc-newlib-stage2/gcc/xgcc >

Re: [PATCH] RISC-V: Recognize stepped series in expand_vec_perm_const.

2023-12-10 Thread Robin Dapp
On 12/11/23 03:09, juzhe.zh...@rivai.ai wrote: > + if (shuffle_series (d)) > + return true; > > > Could you rename it into shuffle_series_patterns ? > > Just to make naming consistent. Done, going to push with that change in a while. Regards Robin

Re: [PATCH] RISC-V: Support highest overlap for wv instructions

2023-12-10 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-09 Thread Robin Dapp
> rv64gcv With -minline-strcmp I assume? Regards Robin

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-09 Thread Robin Dapp
> FAIL: gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c execution test > FAIL: gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c execution test > FAIL: gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c execution test > FAIL: gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c execution test >

[PATCH] RISC-V: Recognize stepped series in expand_vec_perm_const.

2023-12-09 Thread Robin Dapp
Hi, we currently try to recognize various forms of stepped (const_vector) sequence variants in expand_const_vector. Because of complications with canonicalization and encoding it is easier to identify such patterns in expand_vec_perm_const_1 already where perm.series_p () is available. This

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-08 Thread Robin Dapp
Ah, I forgot to attach the current v2 that also enables strncmp. It was additionally tested with -minline-strncmp on rv64gcv. Regards Robin Subject: [PATCH v2] RISC-V: Add vectorized strcmp and strncmp. This patch adds vectorized strcmp and strncmp implementations and tests. Similar to

Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-08 Thread Robin Dapp
Similar to strlen, this now seems safe to push. Will do so later. I tested on rv64gcv_zvl128b with -minline-strlen and didn't see regressions. Regards Robin

Re: [PATCH] RISC-V: Add vectorized strlen.

2023-12-08 Thread Robin Dapp
After Juzhe's vsetvl fix earlier this week this seems safe to push. Going to do so later. I tested on rv64gcv_zvl128b with -minline-strlen and didn't see regressions apart from zbb-strlen-disabled-2.c which will always fail with -minline-strlen because it expects -mno-inline-strlen. Regards

Re: [PATCH] RISC-V: Support interleave vector with different step sequence for VLA SLP

2023-12-07 Thread Robin Dapp
Sorry for the delay, just a tiny naming/comment nit. Rest LGTM, no need for a v2. > +/* Return true each pattern has different 2 steps. > + TODO: We currently only support NPATTERNS = 2. */ Return true if the permutation consists of two interleaved patterns with a constant step each. >

Re: [PATCH] RISC-V: Fix AVL propagation ICE for vleff/vlsegff

2023-12-07 Thread Robin Dapp
LGTM. Btw your vsetvl patch from yesterday fixes the vectorized strlen/strcmp problems. Those use vleff as first instruction. Regards Robin

Re: [PATCH] RISC-V: Fix VSETVL PASS bug

2023-12-06 Thread Robin Dapp
LGTM. + /* Don't perform earliest fusion on unrelated edge. */ + if (bitmap_count_bits (e) != 1) + continue; This could still use a comment why e is "unrelated" in that case (no v2 needed). Regards Robin

[PATCH] RISC-V: Add vec_init expander for masks [PR112854].

2023-12-05 Thread Robin Dapp
Hi, PR112854 shows a problem on rv32 with zvl1024b. During the course of expand_constructor we try to overlay/subreg a 64-element mask by a scalar (Pmode) register. This works for zvl512b and its maximum of 32 elements but fails for rv32 and 64 elements. To circumvent this, this patch adds a

Re: [PATCH] RISC-V: Block VLSmodes according to TARGET_MAX_LMUL and BITS_PER_RISCV_VECTOR

2023-12-05 Thread Robin Dapp
> I think the VLS modes are excluded exactly as we expected. > For example, when zvl128b, LMUL = 1. > We allow VLS modes <= 128bit, exclude VLS modes > 128bits. > We have the same behavior as ARM SVE. I just found the ordered_p a bit unintuitive here at first sight. But when thinking

Re: [PATCH] RISC-V: Block VLSmodes according to TARGET_MAX_LMUL and BITS_PER_RISCV_VECTOR

2023-12-05 Thread Robin Dapp
Yes, makes sense. Just one clarifying question. > +{ > + if (GET_MODE_CLASS (vls_mode) != MODE_VECTOR_BOOL > + && !ordered_p (TARGET_MAX_LMUL * BITS_PER_RISCV_VECTOR, > + GET_MODE_PRECISION (vls_mode))) > + /* We enable VLS modes which are aligned with

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-05 Thread Robin Dapp
> But how do we know BImode fits in QImode? I was kind of hoping that a "bit" always fits in a "byte"/unit but yeah, I guess we don't always know :/ > I think the issue is more that we try to extract an element from > the mask vector? How is element extraction defined for VLA vectors > anyway?

Re: [PATCH] RISC-V: Add blocker for gather/scatter auto-vectorization

2023-12-04 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Fix overlap group incorrect overlap on v0

2023-12-04 Thread Robin Dapp
I'd suggest the same thing as in the other patch, i.e. not having the large number of identical lines in the iterator. That's just my opinion, though. Rest LGTM. Regards Robin

Re: [PATCH V2] RISC-V: Support highest-number regno overlap for widen ternary

2023-12-04 Thread Robin Dapp
LGTM. Regards Robin

[PATCH] RISC-V: Fix two testscases related to -std changes.

2023-12-04 Thread Robin Dapp
Hi, recent -std changes caused testsuite failures. Fix those by adding -std=gnu99 and -Wno-incompatible-pointer-types. Going to commit as obvious. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr112552.c: Add -Wno-incompatible-pointer-types. *

Re: [PATCH] RISC-V: Support highest-number regno overlap for widen ternary vx instructions

2023-12-04 Thread Robin Dapp
> +(define_mode_attr widen_ternop_dest_constraint [ > + (RVVM8QI "=vd, vr, vd, vr, vd, vr, ?") > + (RVVM4QI "=vd, vr, vd, vr, vd, vr, ?") > + (RVVM2QI "=vd, vr, vd, vr, vd, vr, ?") > + (RVVM1QI "=vd, vr, vd, vr, vd, vr, ?") > + (RVVMF2QI "=vd, vr, vd, vr, vd, vr, ?") > + (RVVMF4QI "=vd, vr,

[PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-04 Thread Robin Dapp
Hi, this changes the vec_extract path of extract_bit_field to use QImode instead of BImode when extracting from mask vectors and changes GET_MODE_BITSIZE to GET_MODE_PRECISION. This fixes an ICE on riscv where we did not find a vec_extract optab and continued with the generic code that requires

Re: [PATCH] RISC-V: Remove earlyclobber from widen reduction

2023-12-04 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Vectorized str(n)cmp and strlen.

2023-12-01 Thread Robin Dapp
Split it into four separate patches now. Regards Robin

[PATCH] RISC-V: Add vectorized strcmp.

2023-12-01 Thread Robin Dapp
Hi, this patch adds a vectorized strcmp implementation and tests. Similar to strlen, expansion is still guarded by -minline-strcmp. I just realized I forgot to make it a series but this one is actually dependent on the NFC patch and the rawmemchr fix before. Regards Robin gcc/ChangeLog:

[PATCH] RISC-V: Add vectorized strlen.

2023-12-01 Thread Robin Dapp
Hi, this patch implements a vectorized strlen by re-using and slightly adjusting the rawmemchr implementation. Rawmemchr returns the address of the needle while strlen returns the difference between needle address and start address. As before, strlen expansion is guarded by -minline-strlen.
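
The relationship between the two functions can be sketched in scalar C. This is only an illustration of the arithmetic, not the vectorized expander; `my_rawmemchr` and `my_strlen` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for rawmemchr: returns the address of the first
   byte equal to NEEDLE and assumes that such a byte exists.  */
static const char *
my_rawmemchr (const char *s, char needle)
{
  while (*s != needle)
    s++;
  return s;
}

/* strlen is the distance between the terminating NUL and the start.  */
static size_t
my_strlen (const char *s)
{
  return (size_t) (my_rawmemchr (s, '\0') - s);
}
```

Searching for the NUL byte and subtracting the start address is all that distinguishes the strlen result from the rawmemchr result here.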

[PATCH] RISC-V: Rename and unify stringop strategy handling [NFC].

2023-12-01 Thread Robin Dapp
Hi, now split into multiple patches. In preparation for the vectorized strlen and strcmp support this NFC patch unifies the stringop strategy handling a bit. The "auto" strategy now is a combination of scalar and vector and an expander should try the strategies in their preferred order. For

[PATCH] RISC-V: Fix rawmemchr implementation.

2023-12-01 Thread Robin Dapp
Hi, this fixes a bug in the rawmemchr implementation by incrementing the source address by vl * element_size instead of just vl. This is normally harmless as we will just scan the same region more than once but, in combination with an older qemu version, would lead to an execution failure in
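
To make the address arithmetic concrete, here is a minimal scalar model in C of a chunked rawmemchr over 16-bit elements. It is a sketch under assumptions, not the RVV implementation: `VL` stands in for the hardware vector length and `rawmemchr_u16` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chunk size standing in for the hardware vector length.  */
#define VL 4

/* Scalar model of a chunked rawmemchr over 16-bit elements.
   Like rawmemchr, it assumes NEEDLE occurs somewhere in the input.  */
static const unsigned char *
rawmemchr_u16 (const unsigned char *src, uint16_t needle)
{
  for (;;)
    {
      /* Examine up to VL elements and stop at the first match.  */
      for (size_t i = 0; i < VL; i++)
        {
          uint16_t elt;
          memcpy (&elt, src + i * sizeof elt, sizeof elt);
          if (elt == needle)
            return src + i * sizeof elt;
        }
      /* Advance by VL elements, i.e. VL * element_size bytes.
         Advancing by only VL bytes would rescan part of the chunk.  */
      src += VL * sizeof (uint16_t);
    }
}
```

With the increment written as plain `VL`, this model still finds the needle but revisits elements it has already checked, which matches the "scan the same region more than once" behavior described above.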

Re: [PATCH] RISC-V: Support highpart overlap for indexed load with SRC EEW < DEST EEW

2023-12-01 Thread Robin Dapp
LGTM Regards Robin

Re: [RISC-V PATCH] Improve style to work around PR 60994 in host compiler.

2023-12-01 Thread Robin Dapp
Yes, OK, thanks for that. CC'ing Juzhe as this is his pass. Regards Robin

[PATCH] RISC-V: Vectorized str(n)cmp and strlen.

2023-11-30 Thread Robin Dapp
Hi, this adds vectorized implementations of strcmp and strncmp as well as strlen. strlen falls back to the previously implemented rawmemchr. Also, it fixes a rawmemchr bug causing a SPEC2017 execution failure: We would only ever increment the source address by 1 regardless of the input type.

Re: [PATCH V2] RISC-V: Remove earlyclobber for wx/wf instructions.

2023-11-30 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Remove earlyclobber for wx/wf instructions.

2023-11-30 Thread Robin Dapp
> While working on overlap for widening instructions, I realize that we set > vwadd.wx/vfwadd.wf as earlyclobber which is incorrect. > > Since according to RVV ISA: > "The destination EEW equals the source EEW." > > For both vwadd.wx/vfwadd.wf source vector and dest vector operand are same >

Re: [PATCH] RISC-V: Support highpart overlap for vext.vf

2023-11-29 Thread Robin Dapp
LGTM (in context of the last message) but please consider adding the comments/naming I suggested. Regards Robin

Re: [PATCH] RISC-V: Support highpart register overlap for vwcvt

2023-11-29 Thread Robin Dapp
>>> I can't really match spec and code.  For the lmul = 2 case sure, >>> but W84 e.g. allows v4 and not v6?  What actually is "highest-numbered >>>part"? > Yes. > > For vwcvt, LMUL 4 -> LMUL 8.  > We allow overlap  vwcvt v0 (occupy v0 - v7), v4 (occupy v4 - v7) > This patch support the overlap

Re: [PATCH] RISC-V: Support highpart register overlap for vwcvt

2023-11-29 Thread Robin Dapp
Looks like this already went in while I was looking at it... In general it looks ok to me but I would have really hoped for some more comments. > +;; These following constraints are used by RVV instructions with dest EEW > > src EEW. > +;; RISC-V 'V' Spec 5.2. Vector Operands: > +;; The

Re: [PATCH] RISC-V: Disallow poly (1,1) VLA SLP interleave vectorization

2023-11-28 Thread Robin Dapp
LGTM. That one is easy to revert (as opposed to changing all modes). Regards Robin

Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores

2023-11-27 Thread Robin Dapp
> The easiest way to avoid running into the alias analysis problem is > to scrap the MEM_EXPR when we expand the internal functions for > partial loads/stores. That avoids the disambiguation we run into > which is realizing that we store to an object of less size as > the size of the mode we

Re: [PATCH] RISC-V: Remove incorrect function gate gather_scatter_valid_offset_mode_p

2023-11-27 Thread Robin Dapp
On 11/25/23 09:24, Juzhe-Zhong wrote: > Come back to review the codes of gather/scatter, notice > gather_scatter_valid_offset_mode_p looks odd. > gather_scatter_valid_offset_mode_p is supposed to block vluxei64/vsuxei64 in > RV32 system. > However, it failed to do that since it is passing

Re: [PATCH] RISC-V: Fix inconsistency among all vectorization hooks

2023-11-24 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Optimize a special case of VLA SLP

2023-11-23 Thread Robin Dapp
LGTM (and harmless enough) but I'd rather wait for a second look or a maintainer's OK as we're past stage 1 and it's not a real bugfix. (On top, it's Thanksgiving so not many people will even notice). On a related note, this should probably be a middle-end optimization but before a variable-index

Re: [PATCH] RISC-V: Disable AVL propagation of vrgather instruction

2023-11-23 Thread Robin Dapp
> Oh. You mean this patch also fixes FLTO failed case ? Yes, it's the same issue. There we have a fixed vl (known via LTO) that is being propagated "into" gathers and we end up missing gather elements. Regards Robin

Re: [PATCH] RISC-V: Disable AVL propagation of vrgather instruction

2023-11-23 Thread Robin Dapp
I was just about to post a similar-ish patch that fixes pr65518.c but you were faster ;) Therefore LGTM. You can add PR/target 112670. Regards Robin

Re: [PATCH] RISC-V: Fix permutation indice mode bug

2023-11-21 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: testsuite: Remove redundant vector_hw and zvfh_hw.

2023-11-21 Thread Robin Dapp
> I don't get it. Why do we need remove them ?  It's just replaced by riscv_zvfh. I should probably edit the patch description and changelog entries to make it clearer. Regards Robin

Re: [PATCH] vect: Allow reduc_index != 1 for COND_OPs.

2023-11-21 Thread Robin Dapp
>> Bootstrapped and regtested on aarch64 and regtested on riscv. x86 is >> still running. Just to confirm: x86 bootstrap and regtest unchanged. Going to commit it soon. Regards Robin

[PATCH] RISC-V: testsuite: Remove redundant vector_hw and zvfh_hw.

2023-11-21 Thread Robin Dapp
Hi, this removes the now-redundant vector_hw and zvfh_hw checks in the testsuite. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/copysign-zvfh-run.c: Remove zvfh_hw. * gcc.target/riscv/rvv/autovec/binop/vadd-zvfh-run.c: Ditto. *

[PATCH] vect: Allow reduc_index != 1 for COND_OPs.

2023-11-21 Thread Robin Dapp
Hi, in PR112406 Tamar found another problem with COND_OP reductions. I wrongly assumed that the reduction variable will always remain in operand 1, just as we create the COND_OP in ifcvt. But of course, addition being commutative, we are free to swap operand 1 and 2 and can end up with e.g.

Re: [PATCH] RISC-V: testsuite: Add rv64 requirement for bug-9 and bug-14.

2023-11-21 Thread Robin Dapp
> /* { dg-do run { target { { {riscv_v} && {rv64} } } } } */ > > Seems you should remove rv64 here ? since I think it is redundant here. Going to commit with that removed. Regards Robin

Re: [PATCH] RISC-V: testsuite: Fix popcount test.

2023-11-21 Thread Robin Dapp
> Mhm, not so obvious after all. We vectorize 250 instances with > rv32gcv, 229 with rv64gcv and 250 with rv64gcv_zbb. Will have > another look tomorrow. The problem is that tree-vect-patterns is more restrictive than necessary and does not vectorize everything it could. Therefore I'm going to

Re: [BUG FIX] RISC-V: Disallow COSNT_VECTOR for DI on RV32

2023-11-21 Thread Robin Dapp
Hi Juzhe, > This bug is exposed when testing on zvl512b RV32 system. > > The rootcause is RA reload DI CONST_VECTOR into vmv.v.x then it ICE. > > So disallow DI CONST_VECTOR on RV32. OK. Regards Robin

Re: [PATCH] RISC-V: testsuite: Fix popcount test.

2023-11-20 Thread Robin Dapp
Mhm, not so obvious after all. We vectorize 250 instances with rv32gcv, 229 with rv64gcv and 250 with rv64gcv_zbb. Will have another look tomorrow. Regards Robin

[PATCH] RISC-V: testsuite: Fix popcount test.

2023-11-20 Thread Robin Dapp
Hi, since Jakub's recent middle-end changes we vectorize more popcount instances. This patch just adjusts the dump check. Going to commit as obvious once I have figured out why there is a dump difference between my local tester and on the server. Regards Robin gcc/testsuite/ChangeLog:

[PATCH] RISC-V: testsuite: Add rv64 requirement for bug-9 and bug-14.

2023-11-20 Thread Robin Dapp
Hi, this adds an effective target requirement to compile the tests. Since we disabled 64-bit indices on rv32 targets those tests should be unsupported on rv32. Regards Robin gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/bug-14.C: Add dg-require-effective-target rv64.

[PATCH] RISC-V: testsuite: Do not set default arch for RVV.

2023-11-20 Thread Robin Dapp
Hi, as per recent discussion and in order to fix inconsistencies between spike and qemu this patch removes gcc_march and gcc_mabi arguments from the default CFLAGS in the testsuite invocation for some sub directories. Juzhe reported that this helps for him. Regards Robin

Re: [BUG FIX] RISC-V: Fix intermediate mode on slide1 instruction for SEW64 on RV32

2023-11-20 Thread Robin Dapp
LGTM. I prefer that over the iterator. Regards Robin

Re: [PATCH] RISC-V Regression: Remove scalable compile option

2023-11-20 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] vect: Use statement vectype for conditional mask.

2023-11-17 Thread Robin Dapp
> No, you shouldn't place _7 != 0 inside the .COND_ADD but instead > have an extra pattern stmt producing that so > > patt_8 = _7 != 0; > patt_9 = .COND_ADD (patt_8, ...); > > that's probably still not enough, but I always quickly forget how > bool patterns work ... basically a comparison like

Re: [PATCH] RISC-V: Disallow 64-bit indexed loads and stores for rv32gcv.

2023-11-17 Thread Robin Dapp
> It must be correct. We already have test (intrinsic codes) for it. Yeah, just noticed that myself. Anyway will do some more tests, maybe my initial VLS analysis was somehow flawed. > Condition should be put into iterators (Add a new iterator for > indexed load store). Ah, that's what you

Re: [PATCH] RISC-V: Disallow 64-bit indexed loads and stores for rv32gcv.

2023-11-17 Thread Robin Dapp
> OK. Make sense. I'm wondering whether the VLA modes in the iterator are correct. Looks dubious to me but unsure, will need to create some tests before continuing. > LGTM as long as you remove all > GET_MODE_BITSIZE (GET_MODE_INNER (mode)) <= GET_MODE_BITSIZE (Pmode) What's the problem with

Re: [PATCH] vect: Use statement vectype for conditional mask.

2023-11-17 Thread Robin Dapp
> Yes, your version is also OK. The attached was bootstrapped and regtested on aarch64, x86 and regtested on riscv. Going to commit it later unless somebody objects. Regards Robin Subject: [PATCH] vect: Pass truth type to vect_get_vec_defs. For conditional operations the mask is loop

Re: [PATCH] RISC-V: testsuite: Fix 32-bit FAILs.

2023-11-17 Thread Robin Dapp
> So, going back to our testcases that reported errors with this, I > don't think we should explicitly specify -march and -mabi when > compiling a runnable program, but use the defaults (--with-arch). > Most of our current runnable testcases adhere to this convention, > except for the ones we

Re: [PATCH] vect: Use statement vectype for conditional mask.

2023-11-17 Thread Robin Dapp
> But note you can explicitly specify a vector type as well, there's an > overload for it, so we can fix the "invariant" case with the following > (OK if you can test this on relevant targets) > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 3f59139cb01..936a3de9534 100644 >

Re: [PATCH] RISC-V: Optimize VLA SLP with duplicate VLA shuffle indice

2023-11-17 Thread Robin Dapp
Hi Juzhe, > csrr a4,vlenb > csrr a5,vlenb Totally unrelated to this patch but this looks odd. I don't remember if we had a patch for this already at some point. In general the idea for the patch is to use the largest vector element mode for the indices and compress several of

Re: [PATCH] vect: Use statement vectype for conditional mask.

2023-11-16 Thread Robin Dapp
> For the fortran testcase we don't even run into this but hit an > internal def and assert on > > gcc_assert (STMT_VINFO_VEC_STMTS (def_stmt_info).length () == ncopies); > > I think this shows missing handling of .COND_* in the bool pattern recognition > as we get the 'bool' condition as

Re: [PATCH] RISC-V: Disallow 64-bit indexed loads and stores for rv32gcv.

2023-11-15 Thread Robin Dapp
On 11/15/23 15:29, 钟居哲 wrote: > Could you show me the example ? > > It's used by handling SEW = 64 on RV32. I don't know why this patch touch > this code. Use gather_load_run-1.c with the 64-bit index patterns disabled on rv32. We insert (mem:DI (reg:SI)) into a vector so use the SEW = 64

Re: [PATCH] RISC-V: Disallow 64-bit indexed loads and stores for rv32gcv.

2023-11-15 Thread Robin Dapp
> Looks wrong. Recover back. When we demote we use two elements where there was one before. Therefore the vector needs to be able to hold twice as many elements. We adjust vl correctly but the mode is not here. Regards Robin
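
The element-count argument can be sketched in C. The sketch assumes that "demote" here means treating each 64-bit element as two 32-bit elements, as the SEW = 64 on RV32 context of this thread suggests; `demote_u64_to_u32` is a hypothetical helper, not code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Split N 64-bit elements into 2 * N 32-bit elements (low half first,
   matching a little-endian layout).  DST must have room for 2 * N
   elements.  Returns the new element count.  */
static size_t
demote_u64_to_u32 (const uint64_t *src, size_t n, uint32_t *dst)
{
  for (size_t i = 0; i < n; i++)
    {
      dst[2 * i] = (uint32_t) src[i];
      dst[2 * i + 1] = (uint32_t) (src[i] >> 32);
    }
  return 2 * n;
}
```

The destination buffer has to be sized for twice the original element count, which mirrors the point that the demoted vector mode must hold twice as many elements.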
