Re: [PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Robin Dapp
> * -mstrict-align: Both scalar and vector misaligned accesses are > unsupported (-mrvv-allow-misalign doesn't matter). I'm not sure if > there's hardware there, but given we have systems that don't support > scalar misaligned accesses it seems reasonable to assume they'll also > not support

[PATCH v2] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Robin Dapp
> We should have something in doc/invoke too, this one is going to be > tricky for users. We'll also have to define how this interacts with > the existing -mstrict-align. Addressed the rest in the attached v2 which also fixes tests. I'm really not sure about -mstrict-align. I would have hoped

[PATCH] RISC-V: Introduce -mrvv-allow-misalign.

2024-05-24 Thread Robin Dapp
Hi, this patch changes the default from always enabling movmisalign to disabling it. It adds an option to override the default and adds generic-ooo to the uarchs that support misaligned vector access. It also adds a check_effective_target_riscv_v_misalign_ok to the testsuite which enables or

Re: [PATCH] RISC-V: Enable vectorization for vect-early-break_124-pr114403.c

2024-05-21 Thread Robin Dapp
The patch is OK from the riscv side. generic-ooo includes fast unaligned access. Regards Robin

Re: [PATCH v6] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-17 Thread Robin Dapp
Hi Pan, all in all LGTM. Just insignificant nits. > +void > +expand_vec_usadd (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode) > +{ > + emit_vec_saddu (op_0, op_1, op_2, BINARY_OP, vec_mode); > +} > + Do we really need this function? Or do you want it to be a dispatcher for later? If

[PATCH] RISC-V: Remove dead perm series code and document.

2024-05-17 Thread Robin Dapp
Hi, with the introduction of shuffle_series_patterns the explicit handler code for a perm series is dead. This patch removes it and also adds a function-level comment to shuffle_series_patterns. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/riscv-v.cc

[PATCH] RISC-V: Add vector popcount, clz, ctz.

2024-05-17 Thread Robin Dapp
Hi, this patch adds the zvbb vcpop, vclz and vctz to the autovec machinery as well as tests for them. It also changes several non-VLS iterators to V_VLS iterators for consistency. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (ctz2): New

[PATCH] RISC-V: Add vandn combine helper.

2024-05-17 Thread Robin Dapp
Hi, this patch adds a combine pattern for vandn as well as tests for it. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec-opt.md (*vandn_): New pattern. * config/riscv/vector.md: Add vandn to mode_idx. gcc/testsuite/ChangeLog: *

[PATCH] RISC-V: Use widening shift for scatter/gather if applicable.

2024-05-17 Thread Robin Dapp
Hi, with the zvbb extension we can emit a widening shift for scatter/gather index preparation in case we need to multiply by 2 and zero extend. The patch also adds vwsll to the mode_idx attribute and removes the mode from shift-count operand of the insn pattern. Regtested on rv64gcv_zvfh_zvbb.
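
A scalar sketch of the index preparation described above may help. It assumes 16-bit indices addressing 2-byte elements; the element widths and function names are illustrative and not taken from the patch. The byte offset is the zero-extended index multiplied by 2, which equals one widening shift left by one, the per-element operation vwsll performs.

```c
#include <stdint.h>

/* Illustrative only: byte offset for a gather/scatter of 2-byte elements.
   Zero-extend the 16-bit index to 32 bits, then multiply by 2.  */
static uint32_t
offset_zext_mul (uint16_t idx)
{
  return (uint32_t) idx * 2u;
}

/* The same value computed as a single widening shift: the result type is
   twice as wide as the source, so no index bits are lost.  */
static uint32_t
offset_widening_shift (uint16_t idx)
{
  return (uint32_t) idx << 1;
}
```

Both helpers return the same value for every 16-bit index, which is why the separate extend and multiply steps can be merged into one instruction.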

[PATCH] RISC-V: Add vwsll combine helpers.

2024-05-17 Thread Robin Dapp
Hi, this patch enables the usage of vwsll in autovec context by adding the necessary combine patterns and tests. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec-opt.md (*vwsll_zext1_): New pattern. (*vwsll_zext2_): Ditto.

[PATCH] RISC-V: Split vwadd.wx and vwsub.wx and add helpers.

2024-05-17 Thread Robin Dapp
Hi, vwadd.wx and vwsub.wx have the same problem vfwadd.wf had. This patch splits the insn pattern in the same way vfwadd.wf was split. It also adds two patterns to recognize extended scalars. In practice those do not provide a lot of improvement over what we already have but in some instances

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-17 Thread Robin Dapp
> OK if that pre-commit CI works out. The CI didn't pick it up, guess it needs to be a bit more explicit. In the meanwhile, however, I managed to catch a short window with > 10G free on gcc185 => Bootstrap and regtest successful on aarch64. Going to push the patch later today. Regards Robin

Re: [PATCH 1/5] RISC-V: Remove float vector eqne pattern

2024-05-16 Thread Robin Dapp
> Can eqne pattern removal patches be committed firstly? Please first make sure you test with corner cases, NaNs in particular. I'm pretty sure we don't have any test cases for those. Regards Robin

Re: [PATCH 1/5] RISC-V: Remove float vector eqne pattern

2024-05-15 Thread Robin Dapp
Hi Demin, are you still going to continue with this? Regards Robin

Re: [PATCH] RISC-V: Do not allow v0 as dest when merging [PR115068].

2024-05-15 Thread Robin Dapp
> I saw vwadd/vwsub.wx have same issue. Could you change them and add test too ? Yes, will do. At first I didn't manage to reproduce it because we seem to be lacking a combine-opt pattern for it. I'm going to post it separately. Regards Robin

Re: [PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-13 Thread Robin Dapp
Hi Pan, thanks for working on this. In general the patch looks reasonable to me but I'd rather have some more comments about the high-level idea. E.g. cbranch is implemented like aarch64 by xor'ing the bitmasks and comparing the result against zero (so we branch based on mask equality). > +;;
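
The high-level idea mentioned above (branching on mask equality by xor'ing the bitmasks and comparing against zero) can be sketched in scalar C. This is a hypothetical model that holds a vector mask as one bit per element in a 64-bit integer; it is not the RTL from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a vector mask stored as one bit per element.
   XOR leaves a set bit wherever the two masks disagree, so comparing the
   result against zero yields the branch condition.  */
static bool
masks_differ (uint64_t mask_a, uint64_t mask_b)
{
  return (mask_a ^ mask_b) != 0;
}
```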

[PATCH] RISC-V: Do not allow v0 as dest when merging [PR115068].

2024-05-13 Thread Robin Dapp
Hi, this patch splits the vfw...wf pattern so we do not emit e.g. vfwadd.wf v0,v8,fa5,v0.t anymore. Regtested on rv64gcv_zvfh. Regards Robin gcc/ChangeLog: PR target/115068 * config/riscv/vector.md: Split vfw.wf pattern. gcc/testsuite/ChangeLog: *

Re: [PATCH v1 3/3] RISC-V: Enable vectorizable early exit test

2024-05-13 Thread Robin Dapp
Hi Pan, > > @@ -4114,6 +4115,7 @@ proc check_effective_target_vect_early_break_hw { } { > || [check_effective_target_arm_v8_neon_hw] > || [check_sse4_hw_available] > || [istarget amdgcn-*-*] > + || [check_effective_target_riscv_v] > }}] > } I believe this should be

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-13 Thread Robin Dapp
> What happens if we simply remove all of the force_reg here? On x86 I bootstrapped and tested the attached without fallout (gcc188, so it's no avx512-native machine and therefore limited coverage). riscv regtest is unchanged. For aarch64 I would have to rely on the pre-commit CI to pick it up (does

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-13 Thread Robin Dapp
> How does this make a difference in the end? I'd expect say forwprop to > fix things? In general we try to only add the masking "boilerplate" of our instructions at split time so fwprop, combine et al. can do their work uninhibited of it (and we don't need numerous (if_then_else ...

[PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-10 Thread Robin Dapp
Hi, this only forces the first comparison operator into a register if it is not already suitable. Bootstrap and regtest is running on x86 and aarch64, successful on p10. Regtested on riscv. gcc/ChangeLog: PR middle-end/113474 * internal-fn.cc (expand_vec_cond_mask_optab_fn):

[PATCH] RISC-V: Add testcase for PR114749.

2024-04-25 Thread Robin Dapp
Hi, this adds a test case for PR114749. Going to commit as obvious unless somebody complains. Regards Robin gcc/testsuite/ChangeLog: PR tree-optimization/114749 * gcc.target/riscv/rvv/autovec/pr114749.c: New test. --- .../gcc.target/riscv/rvv/autovec/pr114749.c | 15

Re: State of risc-v port in the current merge, revert, rinse-repeat commotion

2024-04-24 Thread Robin Dapp
Thanks Vineet! > The dynamic icounts looks sane (vs. Apr 10 snapshot) except for a > regression in x264 which is likely independent of the chaos going on. > > Apr 10 | Apr 23  | >   109f1b28fc94  |  6f0a646dd2fc   | > +-+ > 276,584,692,883

Re: [PATCH v1] Revert "RISC-V: Support highpart register overlap for vwcvt"

2024-04-24 Thread Robin Dapp
> (define_insn "@pred_vwsll_scalar" > - [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, > vd, vr, vd, vr, vd, vr, ?, ?") > + [(set (match_operand:VWEXTI 0 "register_operand" "=vr, vr") Just noticed, not a problem of the revert but wasn't that wrong before without the

Re: [PATCH] RISC-V: Add xfail test case for wv insn highest overlap

2024-04-20 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail tests

2024-04-19 Thread Robin Dapp
Hi Pan, > The RVV register overlap requires both the dest, and src operands. > Thus the register filter in constraint cannot cover the full semantics > of the vector register overlap. I'm not sure I'm following. Did we miss something that should have been covered? Like only an overlap on the

[PATCH] RISC-V: Add VLS to mask vec_extract [PR114668].

2024-04-15 Thread Robin Dapp
Hi, this adds the missing VLS modes to the mask extract expanders. I found a dump scan difficult to create reliably so I just kept the PR's run test case. Regtested on rv64gcv. Regards Robin gcc/ChangeLog: PR target/114668 * config/riscv/autovec.md: Add VLS.

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-03-25 Thread Robin Dapp
> So where do we stand with this? Juzhe asked it to be rebased, but I > don't see a rebased version in my inbox and I don't see anything that > looks like this on the trunk. I missed this one and figured as we're pretty late in the cycle it can wait until GCC 15. Therefore let's call it

Re: [PATCH v2] RISC-V: Introduce option -mrvv-max-lmul for RVV autovec

2024-03-18 Thread Robin Dapp
LGTM as well. Regards Robin

Re: [PATCH] RISC-V: Introduce option -mrvv-autovec-max-lmul for RVV autovec

2024-03-14 Thread Robin Dapp
Should it really be called autovec-max-lmul? We also use TARGET_MAX_LMUL for builtins etc. Or are we just following LLVM's naming here? Isn't -mrvv-max-lmul sufficient? > PR target/112648 This PR is not really resolved or affected by the

Re: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Robin Dapp
Attached v2 combines the checks. Bootstrapped and regtested on x86 and power10, aarch64 still running. Regtested on riscv64. Regards Robin Subject: [PATCH v2] vect: Do not peel epilogue for partial vectors. r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but

Re: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Robin Dapp
> r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early > break but PR114196 shows that we also run into the problem without early > break. Therefore remove early break from the conditions. Forgot: Bootstrapped and regtested on x86, aarch64 and power10. Regtested on riscv64.

[PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-06 Thread Robin Dapp
Hi, r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but PR114196 shows that we also run into the problem without early break. Therefore remove early break from the conditions. gcc/ChangeLog: PR middle-end/114196 * tree-vect-loop-manip.cc

[PATCH] RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200].

2024-03-06 Thread Robin Dapp
Hi, three-operand instructions like vmacc are modeled with an implicit output reload when the output does not match one of the operands. For this we use vmv.v.v which is subject to length masking. In a situation where the current vl is less than the full vlenb and the fma's result value is used
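
The difference between the two move instructions can be modeled in plain C. This is a simplified sketch that assumes 8 elements per register and a tail-undisturbed policy; the constant and function names are illustrative. A vl-dependent copy such as vmv.v.v touches only the first vl elements, while a whole-register move such as vmv1r.v copies everything regardless of vl.

```c
#include <stddef.h>
#include <stdint.h>

#define VLMAX 8 /* assumed number of elements per vector register */

/* Models a vl-dependent move (vmv.v.v) under a tail-undisturbed policy:
   only the first vl elements are copied, the rest of dst keeps its
   previous contents.  */
static void
copy_vl (int32_t *dst, const int32_t *src, size_t vl)
{
  for (size_t i = 0; i < vl && i < VLMAX; i++)
    dst[i] = src[i];
}

/* Models a whole-register move (vmv1r.v): every element is copied,
   independent of the current vl.  */
static void
copy_whole (int32_t *dst, const int32_t *src)
{
  for (size_t i = 0; i < VLMAX; i++)
    dst[i] = src[i];
}
```

With vl smaller than the register length, elements past vl in the destination of copy_vl never receive the source values, which is the situation the patch description refers to.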

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-03-01 Thread Robin Dapp
> +  /* Segment load/store permute cost.  */ > +  const int segment_permute_2; > +  const int segment_permute_4; > +  const int segment_permute_8; > > Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 No idea why I posted that (wrong) version, I used it for some testing locally.

Re: [PATCH 1/5] RISC-V: Remove float vector eqne pattern

2024-03-01 Thread Robin Dapp
> 2. When I write if (a == 2) and if (2 == a), the results are > same > > 3. The vec_duplicate operand is the 5th operand in both cmp and > eqne patterns. I think they are equal. A comparison with a constant is always canonicalized to have the constant second, that's why you won't see

Re: [PATCH 5/5] RISC-V: Support vmsxx.vx for autovec comparison of vec and imm

2024-03-01 Thread Robin Dapp
Hi Han, in addition to what Juzhe mentioned (and that late-combine is going to handle such cases) it should be noted that register pressure should not be the only consideration here. Many uarchs have a higher latency for register-file-crossing moves. At least without spilling the vv variant is

Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-29 Thread Robin Dapp
On 2/29/24 02:38, Li, Pan2 wrote: >> So it's going to check if V2SF can be tied to DI and V4QI with SI. I >> suspect those are going to fail for RISC-V as those aren't tieable. > > Yes, you are right. Different REG_CLASS are not allowed to be tieable in > RISC-V. > > static bool >

Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-29 Thread Robin Dapp
> I think it makes more sense to remove the whole > --param=riscv-autovec-preference since we should use > -fno-tree-vectorize instead of --param=riscv-autovec-preference=none > which is more reasonable compile option for users. > > --param is just a internal testing option that we added before,

Re: [PATCH] RISC-V: Update test expectancies with recent scheduler change

2024-02-28 Thread Robin Dapp
> I suggest specify -fno-schedule-insns to force tests assembler never > change for any scheduling model. We already do that and that's the point - as I mentioned before, no scheduling is worse than default scheduling here (for some definition of worse). The way to reduce the number of vsetvls

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-02-27 Thread Robin Dapp
> This patch looks odd to me. > I don't see memrefs in the trunk code. It's on top of the vle/vse offset handling patch from a while back that I haven't committed yet. > Also, I prefer list all cost in cost tune info for NF = 2 ~ 8 like ARM SVE > does: I don't mind having separate costs for

[PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-02-26 Thread Robin Dapp
Hi, This has been sitting on my local tree - I've been wanting to post it for a while but somehow forgot. This patch makes segment loads and stores more expensive. It adds segment_load and segment_store cost fields to the common vector costs and adds handling to adjust_stmt_cost. In the future

Re: [PATCH] RISC-V: Update test expectancies with recent scheduler change

2024-02-26 Thread Robin Dapp
On 2/24/24 00:10, Edwin Lu wrote: > Given the recent change with adding the scheduler pipeline descriptions, > many scan-dump failures emerged. Relax the expected assembler output > conditions on the affected tests to reduce noise. I'm not entirely sure yet about relaxing the scans like this.

Re: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-23 Thread Robin Dapp
> +/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */ > > I think you should use "scan-assembler-not" Thanks, going to commit with that change. Regards Robin

[PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-22 Thread Robin Dapp
Hi, for a vec_init (_a, _a, _a, _a) with _a of mode DImode we try to construct a "superword" of two "_a"s. This only works for modes < Pmode when we can "shift and or" two halves into one Pmode register. This patch disallows the optimization for inner_mode == Pmode and emits a simple broadcast
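
The "shift and or" construction, and why it cannot apply once the element is as wide as the register, can be sketched as follows. The sketch assumes a 64-bit register width; the function name and interface are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 64 /* assumed register (Pmode) width */

/* Pack two copies of an elt_bits-wide element into one word with
   "shift and or".  Returns false when two elements do not fit into a
   word, e.g. when the element is already as wide as the register.  */
static bool
pack_pair (uint64_t elt, unsigned elt_bits, uint64_t *out)
{
  if (elt_bits * 2 > WORD_BITS)
    return false;
  /* elt_bits is at most 32 here, so both shifts are well defined.  */
  uint64_t mask = (UINT64_C (1) << elt_bits) - 1;
  elt &= mask;
  *out = (elt << elt_bits) | elt;
  return true;
}
```

For 32-bit elements the pair fits into one 64-bit word; for 64-bit elements the helper refuses, which corresponds to the case where a plain broadcast is the fallback.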

Re: [PATCH V4 4/5] RISC-V: Quick and simple fixes to testcases that break due to reordering

2024-02-21 Thread Robin Dapp
> For calling-convention-*.c, LGTM but one nit about change log. Take > **Update** here may make others not easy to learn what you did about > the file. You can say similar to "Rearrange and adjust the > asm-checker times" or likewise. Of course, you can refine the > changelog when commit. >> *

Re: [PATCH V4 1/5] RISC-V: Add non-vector types to dfa pipelines

2024-02-21 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-15 Thread Robin Dapp
> Ah oops I glanced over the /* { dg-do compile } */part. It should be > fine to add '-march=rv64gc' instead then? Hmm it's a bit tricky. So generally -mcpu=sifive-p670 includes rv64 but it does not override a previously specified -march=rv32 (that might have been added by the test harness or

Re: [PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-14 Thread Robin Dapp
On 2/14/24 20:46, Edwin Lu wrote: > The testcase pr113742.c is failing for 32 bit targets due to the following cc1 > error: > cc1: error: ABI requires '-march=rv64' I think we usually just add exactly this to the test options (so it is always run rather than just on a 64-bit target). Regards

[PATCH] RISC-V: Adjust vec unit-stride load/store costs.

2024-02-13 Thread Robin Dapp
Hi, scalar loads provide offset addressing while unit-stride vector instructions cannot. The offset must be loaded into a general-purpose register before it can be used. In order to account for this, this patch adds an address arithmetic heuristic that keeps track of data reference operands.

Re: [PATCH v1] RISC-V: Fix misspelled term args in error_at message

2024-02-12 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-06 Thread Robin Dapp
> The root cause is this following RTL pattern, after fwprop1: > > (insn 82 78 84 9 (set (reg:DI 230) >         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0) >                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended} >      (expr_list:REG_EQUAL (sign_extend:DI

Re: [PATCH] RISC-V: Fix infinite compilation of VSETVL PASS

2024-02-06 Thread Robin Dapp
> Testing is running. Ok for trunk if I passed the testing with no > regression ? OK. Regards Robin

Re: [PATCH] RISC-V: Remove vsetvl_pre bogus instructions in VSETVL PASS

2024-02-01 Thread Robin Dapp
> +static bool > +vsetvl_pre_insn_p (rtx_insn *rinsn) > +{ > + return recog_memoized (rinsn) >= 0 > + && get_attr_type (rinsn) == TYPE_VSETVL_PRE; > +} Indent looks off on my screen. Can you check? Apart from that LGTM (no need for v2 of course). Regards Robin

Re: [PATCH V4 2/4] RISC-V: Add vector related pipelines

2024-01-31 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Robin Dapp
> + NULL, /* vector cost */ > +}; Does the P600 series include a vector unit? From what I found on the web it looks like it. If so I would suggest specifying at least the default (generic) vector cost model here. We fall back to the default one for

[PATCH] match: Fix vcond into conditional op folding [PR113607].

2024-01-31 Thread Robin Dapp
Hi, in PR113607 we see an invalid fold of _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... }); vect_prephitmp_129.51_282 = _429; vect_iftmp.55_287 = VEC_COND_EXPR ; to Applying pattern match.pd:9607, gimple-match-10.cc:3817 gimple_simplified to

Re: [PATCH] RISC-V: Fix VSETLV PASS compile-time issue

2024-01-30 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH V3 2/4] RISC-V: Add vector related pipelines

2024-01-30 Thread Robin Dapp
> I think removing the is_inorder attribute should be ok. I added it > because I wanted to avoid having two matching insn reservations > defined since matching solely on the type attribute should also match > on all subsets as well (i.e. if eventually we add an insn reservation > checking for type

[PATCH] genopinit: Split init_all_optabs [PR113575]

2024-01-26 Thread Robin Dapp
Hi, init_all_optabs initializes > 1 patterns for riscv targets. This leads to pathological situations in dataflow analysis (which can occur with many adjacent stores). To alleviate this, this patch makes genopinit split the init_all_optabs function into several init_optabs_xx functions that

Re: [PATCH V3 4/4] RISC-V: Enable assert for insn_has_dfa_reservation

2024-01-25 Thread Robin Dapp
>/* If we ever encounter an insn without an insn reservation, trip > an assert so we can find and fix this problem. */ > -#if 0 > + if (! insn_has_dfa_reservation_p (insn)) { > +print_rtl(stderr, insn); > +fprintf(stderr, "%d", get_attr_type (insn)); > + } >gcc_assert

Re: [PATCH V3 3/4] RISC-V: Use default cost model for insn scheduling

2024-01-25 Thread Robin Dapp
> Use default cost model scheduling on these test cases. All these tests > introduce scan dump failures with -mtune generic-ooo. Since the vector > cost models are the same across all three tunes, some of the tests > in PR113249 will be fixed with this patch series. This is OK, thanks. > 39

Re: [PATCH V3 2/4] RISC-V: Add vector related pipelines

2024-01-25 Thread Robin Dapp
Thanks, that looks better IMHO. > +;; Copyright (C) 2011-2024 Free Software Foundation, Inc. > +;; Contributed by Andrew Waterman (and...@sifive.com). > +;; Based on MIPS target for GNU compiler. You might want to change that, as well as the date. While at it you can also fix the broken date in

Re: [PATCH V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-25 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH] RISC-V: Fix incorrect LCM delete bug [VSETVL PASS]

2024-01-25 Thread Robin Dapp
The non-test parts are OK IMHO. Regards Robin

[PATCH] testsuite/vect: Add target checks to refined patterns [PR113558]

2024-01-24 Thread Robin Dapp
Hi, on Solaris/SPARC several vector tests appeared to be regressing. They were never vectorized but the checks before r14-3612-ge40edf64995769 would match regardless if a loop was actually vectorized or not. The refined checks only match a successful vectorization attempt but are run

Re: [PATCH] RISC-V: Fix large memory usage of VSETVL PASS [PR113495]

2024-01-23 Thread Robin Dapp
> SPEC 2017 wrf benchmark exposes unreasonable memory usage of VSETVL PASS > that is, VSETVL PASS consumes over 33 GB memory which makes it impossible > to compile SPEC 2017 wrf on a laptop. > > The root cause is wasting-memory variables: LGTM. The new code matches compute_lcm_local_properties

Re: [PATCH] RISC-V: Lower vmv.v.x (avl = 1) into vmv.s.x

2024-01-22 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix regressions due to 86de9b66480b710202a2898cf513db105d8c432f

2024-01-22 Thread Robin Dapp
> No, we didn't undo the optimization. > > We just disallow move pattern for (set (reg) (VL_REGNUM)). Ah, what I referred to was the opposite direction. We allow (subreg:V8QI (reg:DI ...)) which is not touched by this patch. Then it is OK. Regards Robin

Re: [PATCH] RISC-V: Fix regressions due to 86de9b66480b710202a2898cf513db105d8c432f

2024-01-22 Thread Robin Dapp
Hi Juzhe, in principle this seems ok to me but I wonder about: > We shouldn't worry about subreg:...VL_REGNUM since it's impossible > that we can have such situation, I think we allow this in legitimize_move for situations like (subreg:SI (reg:V4QI)). That was not added for correctness but

Re: [PATCH V2] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread Robin Dapp
Ah, interesting that this was it. Thanks for fixing and also thanks to Andrew for suggesting that fix. Regards Robin

Re: [PATCH V2] RISC-V: Add has compatible check for conflict vsetvl fusion

2024-01-17 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Add has compatible check for conflict vsetvl fusion

2024-01-17 Thread Robin Dapp
Hi Juzhe, the change itself is OK but I don't think we should add binary files like this. Even if not ideal, if you want to go forward IMHO let's skip the test for now and add it at a (not much) later time. > diff --git > a/gcc/testsuite/gcc.target/riscv/rvv/fortran/spec2017_cam4/ppgrid.mod >

Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

2024-01-15 Thread Robin Dapp
I gave it another shot now by introducing a separate function as Richard suggested. It's probably not at the location he intended. The way I read the discussion there hasn't been any consensus on how (or rather where) to properly tackle the problem. Any other ideas still? Regards Robin

Re: [PATCH] RISC-V: Adjust loop len by costing 1 when NITER < VF

2024-01-15 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix regression (GCC-14 compare with GCC-13.2) of SHA256 from coremark-pro

2024-01-15 Thread Robin Dapp
OK, thanks. Regards Robin

Re: [PATCH V3] RISC-V: Adjust scalar_to_vec cost

2024-01-12 Thread Robin Dapp
> Tested on both RV32/RV64 no regression, Ok for trunk ? Yes, thanks! Btw out of curiosity, did you see why we actually fail to optimize away the VLA loop? We should open a bug for that I suppose. Regards Robin

Re: [PATCH V2] RISC-V: Adjust scalar_to_vec cost accurately

2024-01-11 Thread Robin Dapp
> 1. This patch sets scalar_to_vec cost as 2 instead of 1 since scalar move > instruction is slightly more costly than normal rvv instructions (e.g. > vadd.vv). We can go with 2 or 3 (if needed) for now but should later really incorporate reg-move costs in this IMHO. Just like e.g. static const

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
> 32872 spends 2 scalar instructions + 1 scalar_to_vec cost: > > li a4,-32768 > addiw a4,a4,104 > vmv.v.x v16,a4 > > It seems reasonable but only can fix test with -march=rv64gcv_zvl256b but > failed on -march=rv64gcv_zvl4096b. The scalar version also needs both instructions: li

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
> I think we shouldn't vectorize it with any vlen, since the non-vectorized > codegen is much better. > And also, I have tested -msve-vector-bits=2048, ARM SVE doesn't vectorize it. > -zvl65536b, RVV Clang also doesn't vectorize it. Of course I agree that optimizing everything to return 0 is

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
On 1/11/24 11:20, juzhe.zh...@rivai.ai wrote: > Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the > loop we have these 2 scalar_to_vec: > > 1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue > >    This scalar_to_vec cost should be 0 or 1 since it only generate

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
>  Yeah... I just noticed. I should set it as 4 to fix it with biggest VLEN > size, > that is, -march=rv64gcv_zvl4096b --param=riscv-autovec-lmul=m8... > > I am confused now how to fix this case. 4 is definitely too high compared to a regular instruction. vmv.vx could even be zero-cost for

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
>> The slidedown/vmv.x.s part is of course vec_extract but we indeed >> don't seem to cost it as vec_to_scalar here. > > It looks like a vectorized live operation as it's not in the loop body > (and thus really irrelevant for costing in practice). This has > > /* ??? Enable for loop

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote: > Oh. I see I think I have done wrong here. > > I should adjust cost for VEC_EXTRACT not VEC_SET. > > But it's odd, I didn't see loop vectorizer is scanning scalar_to_vec > cost in vect.dump. The slidedown/vmv.x.s part is of course vec_extract but

Re: [PATCH v5] RISC-V: Fix register overlap issue for some xtheadvector instructions

2024-01-11 Thread Robin Dapp
LGTM now, thanks. I find it much more readable that way. Regards Robin

Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Robin Dapp
> Since all the pipelines should be tuned to their cost model, they > would be different anyway. If it would be simpler for now, I could > separate the files out. > I think I'm getting a bit confused. Is there a reason why we would > want to exchange scheduler descriptions like the example you >

Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Robin Dapp
Hi Edwin, > This patch copies the vector reservations from generic-ooo.md and > inserts them into generic.md and sifive.md. Creates new vector crypto related > insn reservations. In principle, the changes look good to me but I wonder if we could split off the vector parts from generic-ooo into

Re: [PATCH V2] RISC-V: Switch RVV cost model.

2024-01-10 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

2024-01-10 Thread Robin Dapp
> Current generic cost model makes dynamic-lmul2-7.c generate inferior codegen. > > I found if I tweak the cost a little bit then dynamic-lmul2-7.c codegen can > be recovered. > However, it makes other tests failed > It's complicated story Ok, makes sense. So the plan seems to be:

Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

2024-01-10 Thread Robin Dapp
On 1/10/24 15:40, 钟居哲 wrote: > I need to add these costs for segment load/stores: > > /* Generic costs for VLA vector operations.  */ > static const scalable_vector_cost generic_vla_vector_cost = { >   { >     1,/* int_stmt_cost  */ >     1,/* fp_stmt_cost  */ >     1,/* gather_load_cost  */ >    

Re: [PATCH V2] RISC-V: Minor tweak dynamic cost model

2024-01-10 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

2024-01-10 Thread Robin Dapp
Hi Juzhe, > The reason we want to switch to generic vector cost model is the default > cost model generates inferior codegen for various benchmarks. > > For example, PR113247, we have performance bug that we end up having over 70% > performance drop of SHA256. Currently, no matter how we adapt

Re: [PATCH v5] RISC-V: Fix register overlap issue for some xtheadvector instructions

2024-01-10 Thread Robin Dapp
Hi Joshua, > For th.vmadc/th.vmsbc as well as narrowing arithmetic instructions > and floating-point compare instructions, an illegal instruction > exception will be raised if the destination vector register overlaps > a source vector register group. > > To handle this issue, we use

Re: [PATCH] RISC-V: Fix loop invariant check

2024-01-09 Thread Robin Dapp
OK. I'm still a bit unsure about whether SSA_NAME_IS_DEFAULT_DEF can really occur but if it does it's not wrong to treat it as loop invariant. Regards Robin

Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-08 Thread Robin Dapp
> > +  if (is_gimple_min_invariant (op)) > > +    return true; > > +  if (SSA_NAME_IS_DEFAULT_DEF (op) > > +  || !flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT > (op > > +    return true; > > +  return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1; > > +}

Re: [PATCH] RISC-V: Allow simplification non-vlmax with len = NUNITS reg to reg move

2024-01-05 Thread Robin Dapp
> +/* Return true it is whole register-register move. */ > +bool > +whole_reg_to_reg_move_p (rtx *ops, machine_mode mode) > +{ > + if (register_operand (ops[0], mode) > + && register_operand (ops[3], mode) > + && satisfies_constraint_vu (ops[2]) > + && satisfies_constraint_Wc1

Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-05 Thread Robin Dapp
> 1). We not only have vashl_optab,vashr_optab,vlshr_optab which vectorize > shift with vector shift amount, > that is, vectorization of 'a[i] >> x[i]', the shift amount is loop variant. > 2). But also, we have ashl_optab, ashr_optab, lshr_optab which can vectorize > shift with scalar shift

[PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

2023-12-18 Thread Robin Dapp
Hi, found in PR112971, this patch adds folding support for bitwise operations of const duplicate zero vectors and stepped vectors. On riscv we have the situation that a folding would perpetually continue without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would not fold to {0, 0, 0,
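
The fold in question can be checked element by element. The sketch below models a stepped vector by its base and step and confirms that AND-ing it with a duplicated zero yields zero in every lane; it is an illustration in scalar C with hypothetical names, not the fold-const implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Element i of a stepped vector {base, base + step, base + 2 * step, ...}.  */
static int64_t
stepped_elt (int64_t base, int64_t step, size_t i)
{
  return base + step * (int64_t) i;
}

/* Returns true if {0, 0, 0, ...} & {base, base + step, ...} is zero in each
   of the first n lanes, i.e. the result is again a duplicated zero.  */
static bool
and_with_zero_dup_is_zero (int64_t base, int64_t step, size_t n)
{
  const int64_t dup = 0;
  for (size_t i = 0; i < n; i++)
    if ((dup & stepped_elt (base, step, i)) != 0)
      return false;
  return true;
}
```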

Re: [PATCH V2] RISC-V: Support one more overlap for wv instructions

2023-12-18 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH V2] RISC-V: Enable vect test for RV32

2023-12-18 Thread Robin Dapp
LGTM. Regards Robin
