[PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-10 Thread Robin Dapp
Hi, this only forces the first comparison operator into a register if it is not already suitable. Bootstrap and regtest are running on x86 and aarch64, successful on p10. Regtested on riscv. gcc/ChangeLog: PR middle-end/113474 * internal-fn.cc (expand_vec_cond_mask_optab_fn):

[PATCH] RISC-V: Add testcase for PR114749.

2024-04-25 Thread Robin Dapp
Hi, this adds a test case for PR114749. Going to commit as obvious unless somebody complains. Regards Robin gcc/testsuite/ChangeLog: PR tree-optimization/114749 * gcc.target/riscv/rvv/autovec/pr114749.c: New test. --- .../gcc.target/riscv/rvv/autovec/pr114749.c | 15

Re: State of risc-v port in the current merge, revert, rinse-repeat commotion

2024-04-24 Thread Robin Dapp
Thanks Vineet! > The dynamic icounts looks sane (vs. Apr 10 snapshot) except for a > regression in x264 which is likely independent of the chaos going on. > > Apr 10 | Apr 23  | >   109f1b28fc94  |  6f0a646dd2fc   | > +-+ > 276,584,692,883

Re: [PATCH v1] Revert "RISC-V: Support highpart register overlap for vwcvt"

2024-04-24 Thread Robin Dapp
> (define_insn "@pred_vwsll_scalar" > - [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, > vd, vr, vd, vr, vd, vr, ?, ?") > + [(set (match_operand:VWEXTI 0 "register_operand" "=vr, vr") Just noticed, not a problem of the revert but wasn't that wrong before without the

Re: [PATCH] RISC-V: Add xfail test case for wv insn highest overlap

2024-04-20 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail tests

2024-04-19 Thread Robin Dapp
Hi Pan, > The RVV register overlap requires both the dest, and src operands. > Thus the register filter in constraint cannot cover the full semantics > of the vector register overlap. I'm not sure I'm following. Did we miss something that should have been covered? Like only an overlap on the

[gcc r14-9972] RISC-V: Add VLS to mask vec_extract [PR114668].

2024-04-15 Thread Robin Dapp via Gcc-cvs
https://gcc.gnu.org/g:02cc8f3e68f9af96d484d9946ceaa9e3eed38151 commit r14-9972-g02cc8f3e68f9af96d484d9946ceaa9e3eed38151 Author: Robin Dapp Date: Mon Apr 15 12:44:56 2024 +0200 RISC-V: Add VLS to mask vec_extract [PR114668]. This adds the missing VLS modes to the mask extract

[PATCH] RISC-V: Add VLS to mask vec_extract [PR114668].

2024-04-15 Thread Robin Dapp
Hi, this adds the missing VLS modes to the mask extract expanders. I found a dump scan difficult to create reliably so I just kept the PR's run test case. Regtested on rv64gcv. Regards Robin gcc/ChangeLog: PR target/114668 * config/riscv/autovec.md: Add VLS.

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-03-25 Thread Robin Dapp
> So where do we stand with this? Juzhe asked it to be rebased, but I > don't see a rebased version in my inbox and I don't see anything that > looks like this on the trunk. I missed this one and figured as we're pretty late in the cycle it can wait until GCC 15. Therefore let's call it

Re: [PATCH v2] RISC-V: Introduce option -mrvv-max-lmul for RVV autovec

2024-03-18 Thread Robin Dapp
LGTM as well. Regards Robin

Re: [PATCH] RISC-V: Introduce option -mrvv-autovec-max-lmul for RVV autovec

2024-03-14 Thread Robin Dapp
Should it really be called autovec-max-lmul? We also use TARGET_MAX_LMUL for builtins etc. Or are we just following LLVM's naming here? Isn't -mrvv-max-lmul sufficient? > PR target/112648 This PR is not really resolved or affected by the

[gcc r14-9366] vect: Do not peel epilogue for partial vectors.

2024-03-07 Thread Robin Dapp via Gcc-cvs
https://gcc.gnu.org/g:226043a4d8fb23c7fe7bf16e485b3cfaa094db21 commit r14-9366-g226043a4d8fb23c7fe7bf16e485b3cfaa094db21 Author: Robin Dapp Date: Wed Mar 6 16:54:35 2024 +0100 vect: Do not peel epilogue for partial vectors. r14-7036-gcbf569486b2dec added an epilogue vectorization

Re: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Robin Dapp
Attached v2 combines the checks. Bootstrapped and regtested on x86 and power10, aarch64 still running. Regtested on riscv64. Regards Robin Subject: [PATCH v2] vect: Do not peel epilogue for partial vectors. r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but

Re: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Robin Dapp
> r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early > break but PR114196 shows that we also run into the problem without early > break. Therefore remove early break from the conditions. Forgot: Bootstrapped and regtested on x86, aarch64 and power10. Regtested on riscv64.

[PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-06 Thread Robin Dapp
Hi, r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but PR114196 shows that we also run into the problem without early break. Therefore remove early break from the conditions. gcc/ChangeLog: PR middle-end/114196 * tree-vect-loop-manip.cc

[gcc r14-9345] RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200].

2024-03-06 Thread Robin Dapp via Gcc-cvs
https://gcc.gnu.org/g:59554a50be8ebbd52e8a6348a92110af182e1874 commit r14-9345-g59554a50be8ebbd52e8a6348a92110af182e1874 Author: Robin Dapp Date: Wed Mar 6 12:15:40 2024 +0100 RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200]. Three-operand instructions

[gcc r14-9344] RISC-V: Adjust vec unit-stride load/store costs.

2024-03-06 Thread Robin Dapp via Gcc-cvs
https://gcc.gnu.org/g:9ae83078fe45d093bbaa02b8348f2407fe0c62d6 commit r14-9344-g9ae83078fe45d093bbaa02b8348f2407fe0c62d6 Author: Robin Dapp Date: Mon Jan 15 17:34:58 2024 +0100 RISC-V: Adjust vec unit-stride load/store costs. Scalar loads provide offset addressing while unit

[PATCH] RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200].

2024-03-06 Thread Robin Dapp
Hi, three-operand instructions like vmacc are modeled with an implicit output reload when the output does not match one of the operands. For this we use vmv.v.v which is subject to length masking. In a situation where the current vl is less than the full vlenb and the fma's result value is used
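A minimal sketch of the difference the message describes, assuming a toy register of 8 elements and tail-undisturbed behaviour for the masked move. The helper names `move_masked` and `move_whole` and the constant `VREG_ELEMS` are hypothetical and only model the two instructions; they are not GCC or RVV intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VREG_ELEMS 8  /* hypothetical: elements per vector register */

/* Model of a length-masked move (like vmv.v.v): only the first vl
   elements are written.  The tail of dst keeps its old contents
   (tail-undisturbed is an assumption of this sketch).  */
static void move_masked(int32_t *dst, const int32_t *src, size_t vl)
{
    assert(vl <= VREG_ELEMS);
    for (size_t i = 0; i < vl; i++)
        dst[i] = src[i];
}

/* Model of a whole-register move (like vmv1r.v): vl is ignored and
   every element of the register is copied.  */
static void move_whole(int32_t *dst, const int32_t *src)
{
    for (size_t i = 0; i < VREG_ELEMS; i++)
        dst[i] = src[i];
}
```

With vl = 4, `move_masked` leaves elements 4 to 7 of the destination stale, so a later use of the result under a larger vl would read old data, while `move_whole` copies all eight elements.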

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-03-01 Thread Robin Dapp
> +  /* Segment load/store permute cost.  */ > +  const int segment_permute_2; > +  const int segment_permute_4; > +  const int segment_permute_8; > > Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 No idea why I posted that (wrong) version, I used it for some testing locally.

Re: [PATCH 1/5] RISC-V: Remove float vector eqne pattern

2024-03-01 Thread Robin Dapp
> 2. When I write if (a == 2) and if (2 == a), the results are > same > > 3. The vec_duplicate operand is the 5th operand in both cmp and > eqne patterns. I think they are equal. A comparison with a constant is always canonicalized to have the constant second, that's why you won't see
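The canonicalization rule mentioned in the reply can be illustrated with a toy model. This is only a sketch: the types `operand` and `comparison` and the functions `swap_code` and `canonicalize` are hypothetical names for illustration, not GCC's RTL data structures or API.

```c
#include <assert.h>
#include <stdbool.h>

enum cmp_code { CMP_EQ, CMP_NE, CMP_LT, CMP_GT, CMP_LE, CMP_GE };

/* value holds the constant, or a register id when is_const is false.  */
struct operand { bool is_const; long value; };
struct comparison { enum cmp_code code; struct operand lhs, rhs; };

/* Comparison code to use after the two operands have been exchanged.  */
static enum cmp_code swap_code(enum cmp_code code)
{
    switch (code) {
    case CMP_LT: return CMP_GT;
    case CMP_GT: return CMP_LT;
    case CMP_LE: return CMP_GE;
    case CMP_GE: return CMP_LE;
    default:     return code;  /* EQ and NE are symmetric */
    }
}

/* Put a constant operand second, adjusting the code if needed.  */
static struct comparison canonicalize(struct comparison c)
{
    if (c.lhs.is_const && !c.rhs.is_const) {
        struct operand tmp = c.lhs;
        c.lhs = c.rhs;
        c.rhs = tmp;
        c.code = swap_code(c.code);
    }
    return c;
}
```

Under this model both `a == 2` and `2 == a` end up as the same comparison with the constant in the second position, which is why a pattern written with the constant first would never be seen.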

Re: [PATCH 5/5] RISC-V: Support vmsxx.vx for autovec comparison of vec and imm

2024-03-01 Thread Robin Dapp
Hi Han, in addition to what Juzhe mentioned (and that late-combine is going to handle such cases) it should be noted that register pressure should not be the only consideration here. Many uarchs have a higher latency for register-file-crossing moves. At least without spilling the vv variant is

Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-29 Thread Robin Dapp
On 2/29/24 02:38, Li, Pan2 wrote: >> So it's going to check if V2SF can be tied to DI and V4QI with SI. I >> suspect those are going to fail for RISC-V as those aren't tieable. > > Yes, you are right. Different REG_CLASS are not allowed to be tieable in > RISC-V. > > static bool >

Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-29 Thread Robin Dapp
> I think it makes more sense to remove the whole > --param=riscv-autovec-preference since we should use > -fno-tree-vectorize instead of --param=riscv-autovec-preference=none > which is more reasonable compile option for users. > > --param is just a internal testing option that we added before,

Re: [PATCH] RISC-V: Update test expectancies with recent scheduler change

2024-02-28 Thread Robin Dapp
> I suggest specify -fno-schedule-insns to force tests assembler never > change for any scheduling model. We already do that and that's the point - as I mentioned before, no scheduling is worse than default scheduling here (for some definition of worse). The way to reduce the number of vsetvls

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-02-27 Thread Robin Dapp
> This patch looks odd to me. > I don't see memrefs in the trunk code. It's on top of the vle/vse offset handling patch from a while back that I haven't committed yet. > Also, I prefer list all cost in cost tune info for NF = 2 ~ 8 like ARM SVE > does: I don't mind having separate costs for

[PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-02-26 Thread Robin Dapp
Hi, This has been sitting on my local tree - I've been wanting to post it for a while but somehow forgot. This patch makes segment loads and stores more expensive. It adds segment_load and segment_store cost fields to the common vector costs and adds handling to adjust_stmt_cost. In the future

Re: [PATCH] RISC-V: Update test expectancies with recent scheduler change

2024-02-26 Thread Robin Dapp
On 2/24/24 00:10, Edwin Lu wrote: > Given the recent change with adding the scheduler pipeline descriptions, > many scan-dump failures emerged. Relax the expected assembler output > conditions on the affected tests to reduce noise. I'm not entirely sure yet about relaxing the scans like this.

Re: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-23 Thread Robin Dapp
> +/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */ > > I think you should use "scan-assembler-not" Thanks, going to commit with that change. Regards Robin

[PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].

2024-02-22 Thread Robin Dapp
Hi, for a vec_init (_a, _a, _a, _a) with _a of mode DImode we try to construct a "superword" of two "_a"s. This only works for modes < Pmode when we can "shift and or" two halves into one Pmode register. This patch disallows the optimization for inner_mode == Pmode and emits a simple broadcast
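The "shift and or" construction can be sketched in plain C, assuming a 64-bit word standing in for a Pmode register. The names `pack_superword_u32` and `superword_possible` are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 32-bit elements into one 64-bit "superword": the high half
   is shifted into place and OR-ed with the low half.  */
static uint64_t pack_superword_u32(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Two elements only fit into one word if each element is at most half
   as wide as the word.  An element as wide as the word itself leaves
   no room, which is the case the message says must be excluded.  */
static int superword_possible(unsigned elem_bits, unsigned word_bits)
{
    assert(elem_bits > 0);
    return 2 * elem_bits <= word_bits;
}
```

For a 64-bit element in a 64-bit word `superword_possible` returns 0, matching the situation where the message falls back to a broadcast instead.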

Re: [PATCH V4 4/5] RISC-V: Quick and simple fixes to testcases that break due to reordering

2024-02-21 Thread Robin Dapp
> For calling-convention-*.c, LGTM but one nit about change log. Take > **Update** here may make others not easy to learn what you did about > the file. You can say similar to "Rearrange and adjust the > asm-checker times" or likewise. Of course, you can refine the > changelog when commit. >> *

Re: [PATCH V4 1/5] RISC-V: Add non-vector types to dfa pipelines

2024-02-21 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-15 Thread Robin Dapp
> Ah oops I glanced over the /* { dg-do compile } */part. It should be > fine to add '-march=rv64gc' instead then? Hmm it's a bit tricky. So generally -mcpu=sifive-p670 includes rv64 but it does not override a previously specified -march=rv32 (that might have been added by the test harness or

Re: [PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-14 Thread Robin Dapp
On 2/14/24 20:46, Edwin Lu wrote: > The testcase pr113742.c is failing for 32 bit targets due to the following cc1 > error: > cc1: error: ABI requries '-march=rv64' I think we usually just add exactly this to the test options (so it is always run rather than just on a 64-bit target). Regards

[PATCH] RISC-V: Adjust vec unit-stride load/store costs.

2024-02-13 Thread Robin Dapp
Hi, scalar loads provide offset addressing while unit-stride vector instructions cannot. The offset must be loaded into a general-purpose register before it can be used. In order to account for this, this patch adds an address arithmetic heuristic that keeps track of data reference operands.

Re: [PATCH v1] RISC-V: Fix misspelled term args in error_at message

2024-02-12 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-06 Thread Robin Dapp
> The root cause is this following RTL pattern, after fwprop1: > > (insn 82 78 84 9 (set (reg:DI 230) >         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0) >                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended} >      (expr_list:REG_EQUAL (sign_extend:DI

Re: [PATCH] RISC-V: Fix infinite compilation of VSETVL PASS

2024-02-06 Thread Robin Dapp
> Testing is running. Ok for trunk if I passed the testing with no > regression ? OK. Regards Robin

Re: [PATCH] RISC-V: Remove vsetvl_pre bogus instructions in VSETVL PASS

2024-02-01 Thread Robin Dapp
> +static bool > +vsetvl_pre_insn_p (rtx_insn *rinsn) > +{ > + return recog_memoized (rinsn) >= 0 > + && get_attr_type (rinsn) == TYPE_VSETVL_PRE; > +} Indent looks off on my screen. Can you check? Apart from that LGTM (no need for v2 of course). Regards Robin

Re: [PATCH V4 2/4] RISC-V: Add vector related pipelines

2024-01-31 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Robin Dapp
> + NULL, /* vector cost */ > +}; Does the P600 series include a vector unit? From what I found on the web it looks like it. If so I would suggest specifying at least the default (generic) vector cost model here. We fall back to the default one for

[PATCH] match: Fix vcond into conditional op folding [PR113607].

2024-01-31 Thread Robin Dapp
Hi, in PR113607 we see an invalid fold of _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... }); vect_prephitmp_129.51_282 = _429; vect_iftmp.55_287 = VEC_COND_EXPR ; to Applying pattern match.pd:9607, gimple-match-10.cc:3817 gimple_simplified to

Re: [PATCH] RISC-V: Fix VSETLV PASS compile-time issue

2024-01-30 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH V3 2/4] RISC-V: Add vector related pipelines

2024-01-30 Thread Robin Dapp
> I think removing the is_inorder attribute should be ok. I added it > because I wanted to avoid having two matching insn reservations > defined since matching solely on the type attribute should also match > on all subsets as well (i.e. if eventually we add an insn reservation > checking for type

[PATCH] genopinit: Split init_all_optabs [PR113575]

2024-01-26 Thread Robin Dapp
Hi, init_all_optabs initializes > 1 patterns for riscv targets. This leads to pathological situations in dataflow analysis (which can occur with many adjacent stores). To alleviate this, this patch makes genopinit split the init_all_optabs function into several init_optabs_xx functions that

Re: [PATCH V3 4/4] RISC-V: Enable assert for insn_has_dfa_reservation

2024-01-25 Thread Robin Dapp
>/* If we ever encounter an insn without an insn reservation, trip > an assert so we can find and fix this problem. */ > -#if 0 > + if (! insn_has_dfa_reservation_p (insn)) { > +print_rtl(stderr, insn); > +fprintf(stderr, "%d", get_attr_type (insn)); > + } >gcc_assert

Re: [PATCH V3 3/4] RISC-V: Use default cost model for insn scheduling

2024-01-25 Thread Robin Dapp
> Use default cost model scheduling on these test cases. All these tests > introduce scan dump failures with -mtune generic-ooo. Since the vector > cost models are the same across all three tunes, some of the tests > in PR113249 will be fixed with this patch series. This is OK, thanks. > 39

Re: [PATCH V3 2/4] RISC-V: Add vector related pipelines

2024-01-25 Thread Robin Dapp
Thanks, that looks better IMHO. > +;; Copyright (C) 2011-2024 Free Software Foundation, Inc. > +;; Contributed by Andrew Waterman (and...@sifive.com). > +;; Based on MIPS target for GNU compiler. You might want to change that, as well as the date. While at it you can also fix the broken date in

Re: [PATCH V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-25 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH] RISC-V: Fix incorrect LCM delete bug [VSETVL PASS]

2024-01-25 Thread Robin Dapp
The non-test parts are OK IMHO. Regards Robin

[PATCH] testsuite/vect: Add target checks to refined patterns [PR113558]

2024-01-24 Thread Robin Dapp
Hi, on Solaris/SPARC several vector tests appeared to be regressing. They were never vectorized but the checks before r14-3612-ge40edf64995769 would match regardless if a loop was actually vectorized or not. The refined checks only match a successful vectorization attempt but are run

Re: [PATCH] RISC-V: Fix large memory usage of VSETVL PASS [PR113495]

2024-01-23 Thread Robin Dapp
> SPEC 2017 wrf benchmark expose unreasonable memory usage of VSETVL PASS > that is, VSETVL PASS consume over 33 GB memory which make use impossible > to compile SPEC 2017 wrf in a laptop. > > The root cause is wasting-memory variables: LGTM. The new code matches compute_lcm_local_properties

Re: [PATCH] RISC-V: Lower vmv.v.x (avl = 1) into vmv.s.x

2024-01-22 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix regressions due to 86de9b66480b710202a2898cf513db105d8c432f

2024-01-22 Thread Robin Dapp
> No, we didn't undo the optimization. > > We just disallow move pattern for (set (reg) (VL_REGNUM)). Ah, what I referred to was the opposite direction. We allow (subreg:V8QI (reg:DI ...)) which is not touched by this patch. Then it is OK. Regards Robin

Re: [PATCH] RISC-V: Fix regressions due to 86de9b66480b710202a2898cf513db105d8c432f

2024-01-22 Thread Robin Dapp
Hi Juzhe, in principle this seems ok to me but I wonder about: > We shouldn't worry about subreg:...VL_REGNUM since it's impossible > that we can have such situation, I think we allow this in legitimize_move for situations like (subreg:SI (reg:V4QI)). That was not added for correctness but

Re: [PATCH V2] RISC-V: Fix RVV_VLMAX

2024-01-19 Thread Robin Dapp
Ah, interesting that this was it. Thanks for fixing and also thanks to Andrew for suggesting that fix. Regards Robin

Re: [PATCH V2] RISC-V: Add has compatible check for conflict vsetvl fusion

2024-01-17 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH] RISC-V: Add has compatible check for conflict vsetvl fusion

2024-01-17 Thread Robin Dapp
Hi Juzhe, the change itself is OK but I don't think we should add binary files like this. Even if not ideal, if you want to go forward IMHO let's skip the test for now and add it at a (not much) later time. > diff --git > a/gcc/testsuite/gcc.target/riscv/rvv/fortran/spec2017_cam4/ppgrid.mod >

Re: [PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

2024-01-15 Thread Robin Dapp
I gave it another shot now by introducing a separate function as Richard suggested. It's probably not at the location he intended. The way I read the discussion there hasn't been any consensus on how (or rather where) to properly tackle the problem. Any other ideas still? Regards Robin

Re: [PATCH] RISC-V: Adjust loop len by costing 1 when NITER < VF

2024-01-15 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix regression (GCC-14 compare with GCC-13.2) of SHA256 from coremark-pro

2024-01-15 Thread Robin Dapp
OK, thanks. Regards Robin

Re: [PATCH V3] RISC-V: Adjust scalar_to_vec cost

2024-01-12 Thread Robin Dapp
> Tested on both RV32/RV64 no regression, Ok for trunk ? Yes, thanks! Btw out of curiosity, did you see why we actually fail to optimize away the VLA loop? We should open a bug for that I suppose. Regards Robin

Re: [PATCH V2] RISC-V: Adjust scalar_to_vec cost accurately

2024-01-11 Thread Robin Dapp
> 1. This patch set scalar_to_vec cost as 2 instead 1 since scalar move >instruction is slightly more costly than normal rvv instructions (e.g. > vadd.vv). We can go with 2 or 3 (if needed) for now but should later really incorporate reg-move costs in this IMHO. Just like e.g. static const

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
> 32872 spends 2 scalar instructions + 1 scalar_to_vec cost: > > lia4,-32768 > addiwa4,a4,104 > vmv.v.xv16,a4 > > It seems reasonable but only can fix test with -march=rv64gcv_zvl256b but > failed on -march=rv64gcv_zvl4096b. The scalar version also needs both instructions: li

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
> I think we shouldn't vectorize it with any vlen, since the non-vectorized > codegen is much better. > And also, I have tested -msve-vector-bits=2048, ARM SVE doesn't vectorize it. > -zvl65536b, RVV Clang also doesn't vectorize it. Of course I agree that optimizing everything to return 0 is

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
On 1/11/24 11:20, juzhe.zh...@rivai.ai wrote: > Ok I see your idea and we need to adjust scalar_to_vec accurately. Inside the > loop we have these 2 scalar_to_vec: > > 1. MIN_EXPR 1 times scalar_to_vec costs 1 in prologue > >    This scalar_to_vec cost should be 0 or 1 since it only generate

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
>  Yeah... I just noticed. I should set it as 4 to fix it with biggest VLEN > size, > that is, -march=rv64gcv_zvl4096b --param=riscv-autovec-lmul=m8... > > I am confused now how to fix this case. 4 is definitely too high compared to a regular instruction. vmv.vx could even be zero-cost for

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
>> The slidedown/vmv.x.s part is of course vec_extract but we indeed >> don't seem to cost it as vec_to_scalar here. > > It looks like a vectorized live operation as it's not in the loop body > (and thus really irrelevant for costing in practice). This has > > /* ??? Enable for loop

Re: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3

2024-01-11 Thread Robin Dapp
On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote: > Oh. I see I think I have done wrong here. > > I should adjust cost for VEC_EXTRACT not VEC_SET. > > But it's odd, I didn't see loop vectorizer is scanning scalar_to_vec > cost in vect.dump. The slidedown/vmv.x.s part is of course vec_extract but

Re: [PATCH v5] RISC-V: Fix register overlap issue for some xtheadvector instructions

2024-01-11 Thread Robin Dapp
LGTM now, thanks. I find it much more readable that way. Regards Robin

Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Robin Dapp
> Since all the pipelines should be tuned to their cost model, they > would be different anyway. If it would be simpler for now, I could > separate the files out. > I think I'm getting a bit confused. Is there a reason why we would > want to exchange scheduler descriptions like the example you >

Re: [PATCH V2 2/4][RFC] RISC-V: Add vector related reservations

2024-01-10 Thread Robin Dapp
Hi Edwin, > This patch copies the vector reservations from generic-ooo.md and > inserts them into generic.md and sifive.md. Creates new vector crypto related > insn reservations. In principle, the changes look good to me but I wonder if we could split off the vector parts from generic-ooo into

Re: [PATCH V2] RISC-V: Switch RVV cost model.

2024-01-10 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

2024-01-10 Thread Robin Dapp
> Current generic cost model makes dynamic-lmul2-7.c generate inferior codegen. > > I found if I tweak the cost a little bit then dynamic-lmul2-7.c codegen can > be recovered. > However, it makes other tests failed > It's complicated story Ok, makes sense. So the plan seems to be:

Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

2024-01-10 Thread Robin Dapp
On 1/10/24 15:40, 钟居哲 wrote: > I need to add these costs for segment load/stores: > > /* Generic costs for VLA vector operations.  */ > static const scalable_vector_cost generic_vla_vector_cost = { >   { >     1,/* int_stmt_cost  */ >     1,/* fp_stmt_cost  */ >     1,/* gather_load_cost  */ >    

Re: [PATCH V2] RISC-V: Minor tweak dynamic cost model

2024-01-10 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Switch RVV cost model to generic vector cost model

2024-01-10 Thread Robin Dapp
Hi Juzhe, > The reason we want to switch to generic vector cost model is the default > cost model generates inferior codegen for various benchmarks. > > For example, PR113247, we have performance bug that we end up having over 70% > performance drop of SHA256. Currently, no matter how we adapt

Re: [PATCH v5] RISC-V: Fix register overlap issue for some xtheadvector instructions

2024-01-10 Thread Robin Dapp
Hi Joshua, > For th.vmadc/th.vmsbc as well as narrowing arithmetic instructions > and floating-point compare instructions, an illegal instruction > exception will be raised if the destination vector register overlaps > a source vector register group. > > To handle this issue, we use

Re: [PATCH] RISC-V: Fix loop invariant check

2024-01-09 Thread Robin Dapp
OK. I'm still a bit unsure about whether SSA_NAME_IS_DEFAULT_DEF can really occur but if it does it's not wrong to treat it as loop invariant. Regards Robin

Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-08 Thread Robin Dapp
> > +  if (is_gimple_min_invariant (op)) > > +    return true; > > +  if (SSA_NAME_IS_DEFAULT_DEF (op) > > +  || !flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT > (op > > +    return true; > > +  return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1; > > +}

Re: [PATCH] RISC-V: Allow simplification non-vlmax with len = NUNITS reg to reg move

2024-01-05 Thread Robin Dapp
> +/* Return true it is whole register-register move. */ > +bool > +whole_reg_to_reg_move_p (rtx *ops, machine_mode mode) > +{ > + if (register_operand (ops[0], mode) > + && register_operand (ops[3], mode) > + && satisfies_constraint_vu (ops[2]) > + && satisfies_constraint_Wc1

Re: [PATCH] RISC-V: Teach liveness computation loop invariant shift amount[Dynamic LMUL]

2024-01-05 Thread Robin Dapp
> 1). We not only have vashl_optab,vashr_optab,vlshr_optab which vectorize > shift with vector shift amount, > that is, vectorization of 'a[i] >> x[i]', the shift amount is loop variant. > 2). But also, we have ashl_optab, ashr_optab, lshr_optab which can vectorize > shift with scalar shift

[PATCH] fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

2023-12-18 Thread Robin Dapp
Hi, found in PR112971, this patch adds folding support for bitwise operations of const duplicate zero vectors and stepped vectors. On riscv we have the situation that a folding would perpetually continue without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would not fold to {0, 0, 0,
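A toy model of the folds in question, assuming a stepped vector is encoded as a base and a step (element i is base + i * step) and a duplicated constant has step 0. The struct `stepped_vec` and the function `fold_with_zero` are hypothetical and are not the fold-const.cc implementation; they only show which results such a fold would have to produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy encoding: element i equals base + i * step.
   step == 0 means every element is the same constant.  */
struct stepped_vec { int64_t base; int64_t step; };

enum bit_op { OP_AND, OP_IOR, OP_XOR };

static bool is_dup_zero(struct stepped_vec v)
{
    return v.base == 0 && v.step == 0;
}

/* Fold a bitwise operation when at least one operand is all zeros.
   x & 0 is all zeros, x | 0 and x ^ 0 are x.  Returns false and
   leaves *out untouched when neither operand is all zeros.  */
static bool fold_with_zero(enum bit_op op, struct stepped_vec a,
                           struct stepped_vec b, struct stepped_vec *out)
{
    if (!is_dup_zero(a) && !is_dup_zero(b))
        return false;

    struct stepped_vec other = is_dup_zero(a) ? b : a;
    if (op == OP_AND) {
        out->base = 0;
        out->step = 0;
    } else {
        *out = other;
    }
    return true;
}
```

In this encoding {7, 6, 5, ...} is base 7 with step -1, and AND-ing it with the all-zero vector yields the all-zero vector again.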

Re: [PATCH V2] RISC-V: Support one more overlap for wv instructions

2023-12-18 Thread Robin Dapp
OK. Regards Robin

Re: [PATCH V2] RISC-V: Enable vect test for RV32

2023-12-18 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix natural regsize for fixed-vlmax of -march=rv64gc_zve32f

2023-12-18 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Enable vect test for RV32

2023-12-18 Thread Robin Dapp
Hi Juzhe, > -} elseif [istarget riscv64-*-*] { > +} elseif [istarget riscv*-*-*] { > if [check_effective_target_riscv_v] { > lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi" > set dg-do-what-default run Yes, that's reasonable. A bit further down we have

Re: [PATCH V2] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
On 12/15/23 13:52, juzhe.zh...@rivai.ai wrote: > Do you mean : > >   /* We need to use precomputed mask for such situation and such mask >      can only be computed in compile-time known size modes.  */ >   bool indices_fit_selector_p >     = GET_MODE_BITSIZE (GET_MODE_INNER (vmode)) > 8 ||

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
> Oh. I think it should be renamed into not_fit. > > Is this following make sense to you ? > >   /* We need to use precomputed mask for such situation and such mask >      can only be computed in compile-time known size modes.  */ >   bool indices_not_fit_selector_p >     = maybe_ge (vec_len, 2

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
On 12/15/23 13:16, juzhe.zh...@rivai.ai wrote: > >>> bool indices_fit_selector = maybe_ge (vec_len, 2 << GET_MODE_BITSIZE >>> (GET_MODE_INNER (vmode))); > No, I think it will make us miss some optimization. > > For example, for poly value [16,16]  maybe_ge ([16,16], 65536) which makes us >
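The objection about `maybe_ge` on a poly value can be made concrete with a one-indeterminate toy model. This is a sketch under the assumption that a value written [c0, c1] means c0 + c1 * x for an unknown integer x >= 0; the struct `poly1` and the helpers `maybe_ge_const` and `known_ge_const` are hypothetical and only mirror the meaning of GCC's predicates, not their implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy poly value: c0 + c1 * x with x >= 0 unknown at compile time.  */
struct poly1 { long c0; long c1; };

/* True if the value could be >= k for some x >= 0.  */
static bool maybe_ge_const(struct poly1 p, long k)
{
    return p.c0 >= k || p.c1 > 0;
}

/* True only if the value is >= k for every x >= 0.  */
static bool known_ge_const(struct poly1 p, long k)
{
    return p.c0 >= k && p.c1 >= 0;
}
```

For [16,16] compared against 65536, `maybe_ge_const` is true because x may be arbitrarily large, even though the value is not known to reach that bound. A check built on the "maybe" form therefore triggers for every variable-length mode.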

Re: [PATCH] RISC-V: Fix vmerge optimization bug in vec_perm vectorization

2023-12-15 Thread Robin Dapp
Hi Juzhe, in general looks OK. > + /* We need to use precomputed mask for such situation and such mask > + can only be computed in compile-time known size modes. */ > + if (GET_MODE_BITSIZE (GET_MODE_INNER (vmode)) == 8 && maybe_ge (vec_len, > 256) > + && !vec_len.is_constant ()) >

Re: [PATCH] expmed: Get vec_extract element mode from insn_data, [PR112999]

2023-12-14 Thread Robin Dapp
> It looks like: > > FOR_EACH_MODE_FROM (new_mode, new_mode) > if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) > && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) > && targetm.vector_mode_supported_p (new_mode) >

[PATCH] expmed: Get vec_extract element mode from insn_data, [PR112999]

2023-12-14 Thread Robin Dapp
Hi, this is a bit of a follow up of the latest expmed change. In extract_bit_field_1 we try to get a better vector mode before extracting from it. Better refers to the case when the requested target mode does not equal the inner mode of the vector to extract from and we have an equivalent

Re: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread Robin Dapp
Thanks, LGTM but please add a comment like: These test cases used to cause out-of-bounds writes to the stack and therefore showed unreliable behavior. Depending on the execution environment they can either pass or fail. As of now, with the latest QEMU version, they will pass even without the

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> Do you mean add some comments in tests? I meant add it as a run test as well and comment that the test has caused out-of-bounds writes before and passed by the time of adding it (or so) and is kept regardless. Regards Robin

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread Robin Dapp
Thanks. The attached v2 goes with your suggestion and adds a vec_extractbi expander. Apart from that it keeps the MODE_PRECISION changes from before and uses insn_data[icode].operand[0]'s mode. Apart from that no changes on the riscv side. Bootstrapped and regtested on x86 and aarch64. On

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> I don't choose to run since I didn't have issue run on my local > simulator no matter qemu or spike. Yes it was flaky. That's kind of expected with the out-of-bounds writes we did. They can depend on runtime environment and other factors. Of course it's a bit counterintuitive to add a

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
Hi Juzhe, in general looks OK to me. Just a question for understanding: > - if (header_info.valid_p () > - && (anticipated_exp_p (header_info) || block_info.full_available)) Why is full_available true if we cannot use it? > +/* { dg-do compile } */ It would be nice if we could

Re: [PATCH] RISC-V: Apply vla vs. vls mode heuristic vector COST model

2023-12-12 Thread Robin Dapp
Given that it's almost verbatim aarch64's implementation and the general approach appears sensible, LGTM. Regards Robin

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-12 Thread Robin Dapp
> - Change the second mode to vec_extract_optab. This is only a name > lookup, and it seems more natural to continue using the real element mode. Am I understanding correctly that this implies we should provide a vec_extractbi expander? (with the innermode being BImode here). Regards Robin
