Re: [PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-16 Thread Robin Dapp via Gcc-patches
> But if it's a float16 precision issue then I would have expected both > the computations for the lhs and rhs values to have suffered > similarly. Yeah, right. I didn't look closely enough. The problem is not the reduction but the additional return-value conversion that is omitted when

[PATCH] IFN: Fix vector extraction into promoted subreg.

2023-08-15 Thread Robin Dapp via Gcc-patches
Hi, this patch fixes the case where vec_extract gets passed a promoted subreg (e.g. from a return value). When such a subreg is the destination of a vector extraction we create a separate pseudo register and ensure that the necessary promotion is performed afterwards. Before this patch a

Re: [PATCH] RISC-V: Fix autovec_length_operand predicate[PR110989]

2023-08-15 Thread Robin Dapp via Gcc-patches
> Currently, autovec_length_operand predicate incorrect configuration is > discovered in PR110989 since this following situation: In case you haven't committed it yet: This is OK. Regards Robin

Re: [PATCH] RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks for fixing this. Looks like the same reason we have the separation of zvfh and zvfhmin for vector loads/stores. > +;; Iterator for hardware-supported load/store floating-point modes. > +(define_mode_iterator ANYLSF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX") > +

Re: [PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-17 Thread Robin Dapp via Gcc-patches
> I'm not opposed to merging the test change, but I couldn't figure out > where in C the implicit conversion was coming from: as far as I can > tell the macros don't introduce any (it's "return _float16 * > _float16"), I'd had the patch open since last night but couldn't > figure it out. > > We

[PATCH] RISC-V: Enable pressure-aware scheduling by default.

2023-08-18 Thread Robin Dapp via Gcc-patches
Hi, this patch enables pressure-aware scheduling for riscv. There have been various requests for it so I figured I'd just go ahead and send the patch. There is some slight regression in code quality for a number of vector tests where we spill more due to different instructions order. The ones I

Re: [PATCH] RISC-V: Fix -march error of zhinxmin testcases

2023-08-18 Thread Robin Dapp via Gcc-patches
> This little patch fixes the -march error of a zhinxmin testcase I added earlier > and an old zhinxmin testcase, since these testcases are for zhinxmin extension > and not zfhmin extension. Arg, I should have noticed that ;) OK, of course. Regards Robin

[PATCH] RISC-V: Allow immediates 17-31 for vector shift.

2023-08-18 Thread Robin Dapp via Gcc-patches
Hi, this patch adds a missing constraint check in order to be able to print (and not ICE) vector immediates 17-31 for vector shifts. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (riscv_print_operand): gcc/testsuite/ChangeLog: *

[PATCH] RISC-V/testsuite: Add missing conversion tests.

2023-08-18 Thread Robin Dapp via Gcc-patches
Hi, this patch adds some missing tests for vf[nw]cvt. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-run.c: Add tests. * gcc.target/riscv/rvv/autovec/conversions/vfncvt-ftoi-rv32gcv.c: Ditto. *

Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua, > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c scan-assembler \\tvand > XPASS: gcc.target/riscv/rvv/autovec/partial/slp-1.c

Re: [PATCH] RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvl

2023-08-17 Thread Robin Dapp via Gcc-patches
Hi Lehua, unrelated but I'm seeing a lot of failing gather/scatter tests on master right now. > /* DIRTY -> DIRTY or VALID -> DIRTY. */ > + if (block_info.reaching_out.demand_p (DEMAND_NONZERO_AVL) > + && vlmax_avl_p (prop.get_avl ())) > +

Re: [PATCH V2] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-08-28 Thread Robin Dapp via Gcc-patches
Thanks, just giving my quick thoughts on some of the FAILs: > Test report: > FAIL: gcc.dg/vect/bb-slp-10.c -flto -ffat-lto-objects scan-tree-dump slp2 > "unsupported unaligned access" > FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump slp2 "unsupported unaligned > access" For these we would need

Re: [PATCH] RISC-V: Disable user vsetvl fusion into EMPTY block

2023-08-28 Thread Robin Dapp via Gcc-patches
> || vsetvl_insn_p (expr.get_insn ()->rtl ())) > continue; > new_info = expr.global_merge (expr, eg->src->index); > @@ -3317,6 +3335,25 @@ pass_vsetvl::earliest_fusion (void) > prob = profile_probability::uninitialized (); >

Re: [PATCH] RISC-V: Refactor and clean expand_cond_len_{unop,binop,ternop}

2023-08-28 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks for starting with the refactoring. I have some minor comments. > +/* The value means the number of operands for insn_expander. */ > enum insn_type > { >RVV_MISC_OP = 1, >RVV_UNOP = 2, > - RVV_UNOP_M = RVV_UNOP + 2, > - RVV_UNOP_MU = RVV_UNOP + 2, > - RVV_UNOP_TU =

Re: [PATCH V3] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-08-28 Thread Robin Dapp via Gcc-patches
On 8/28/23 12:16, Juzhe-Zhong wrote: > FAIL: gcc.dg/vect/bb-slp-10.c -flto -ffat-lto-objects scan-tree-dump slp2 > "unsupported unaligned access" > FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump slp2 "unsupported unaligned > access" > XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times

Re: [PATCH V4] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-08-28 Thread Robin Dapp via Gcc-patches
> LGTM from my side, but I would like to wait Robin is ok too In principle I'm OK with it as well, realizing we will still need to fine-tune a lot here anyway. For now, IMHO it's good to have some additional test coverage in the vector space but we should not expect every test to be correct/a

[PATCH] RISC-V: Fix reduc_strict_run-1 test case.

2023-08-15 Thread Robin Dapp via Gcc-patches
Hi, this patch changes the equality check for the reduc_strict_run-1 testcase from == to fabs () < EPS. The FAIL only occurs with _Float16 but I'd argue approximate equality is preferable for all float modes. Regards Robin gcc/testsuite/ChangeLog: *
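
The change from == to fabs () < EPS described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the EPS value and the helper names approx_equal and sum_strict are assumptions chosen for the example.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical tolerance; the value used in the actual test case is
   not shown in the snippet above.  */
#define EPS 1e-2

/* Return 1 if A and B differ by less than EPS, 0 otherwise.  */
int
approx_equal (double a, double b)
{
  return fabs (a - b) < EPS;
}

/* In-order (strict) floating-point sum of the N elements of X.  */
float
sum_strict (const float *x, int n)
{
  float s = 0.0f;
  for (int i = 0; i < n; i++)
    s += x[i];
  return s;
}

/* Compare a computed sum against a reference value approximately
   instead of with ==.  */
void
check_sum (void)
{
  float x[] = { 0.1f, 0.2f, 0.3f };
  assert (approx_equal (sum_strict (x, 3), 0.6));
}
```

An exact comparison can fail whenever the two sides are rounded at different points, which is why a tolerance is the more robust check for any float mode.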

Re: [PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-15 Thread Robin Dapp via Gcc-patches
> Plz put your testcases into: > > # widening operation only test on LMUL < 8 > set AUTOVEC_TEST_OPTS [list \ >   {-ftree-vectorize -O3 --param riscv-autovec-lmul=m1} \ >   {-ftree-vectorize -O3 --param riscv-autovec-lmul=m2} \ >   {-ftree-vectorize -O3 --param riscv-autovec-lmul=m4} \ >  

Re: [PATCH V4] VECT: Support loop len control on EXTRACT_LAST vectorization

2023-08-14 Thread Robin Dapp via Gcc-patches
Hi Kewen, > I did a bootstrapping and regression testing on Power10 (LE) and found a lot > of failures. I think the problem is that just like for vec_set we're expecting the vec_extract expander not to fail. It is probably passed not a const int here anymore and therefore fails to expand?

Re: [PATCH V3] RISC-V: Refactor and clean expand_cond_len_{unop,binop,ternop}

2023-08-29 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, LGTM now. Regards Robin

Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-22 Thread Robin Dapp via Gcc-patches
> What about conditional zero_extension, sign_extension, > float_extension, ...etc? > > We have discussed this, we can have some many conditional situations > that can be supported by either match.pd or rtl backend combine > pass. > > IMHO, it will be too many optabs/internal fns if we support

Re: [PATCH V2] RISC-V: Add conditional autovec convert(INT<->INT) patterns

2023-08-25 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, LGTM. One thing maybe for the next patches: It seems to me that we lump all of the COND_... tests into the cond subdirectory when IMHO they would also fit into the respective directories of their operations (binop, unop etc). Right now we will have a lot of rather unrelated

Re: [PATCH] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

2023-08-24 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > vcpop.m a5,v0 > beq a5,zero,.L3 > addi a5,a5,-1 > vsetvli a4,zero,e32,m1,ta,ma > vcompress.vm v2,v3,v0 > vslidedown.vx v2,v2,a5 > vmv.x.s a0,v2 > .L3: > sext.w a0,a0 Mhm, where is this sext coming from? Thought I had this

Re: [PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-22 Thread Robin Dapp via Gcc-patches
From 1f11a6b89c9b0ad64b480229cd4db06e887a Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 24 Jun 2022 15:17:08 +0200 Subject: [PATCH v2] s390: Recognize reverse/element swap permute patterns. This adds functions to recognize reverse/element swap permute patterns for vler, vster as well as vpdi and rotate. gcc/Change

Re: [PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-31 Thread Robin Dapp via Gcc-patches
Hi, adding -save-temps as well as a '\t' in order for the tests to do what they are supposed to do. Going to push this as obvious in some days. Regards Robin -- gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vperm-rev-z14.c: Add -save-temps. *

Re: [PATCH] expand: Convert cst - x into cst xor x.

2022-09-07 Thread Robin Dapp via Gcc-patches
> The question is really whether xor or sub is "better" statically. I can't > think of any reasons. On s390, why does xor end up "better"? There is an xor with immediate (as opposed to no "subtract from immediate") which saves an instruction, usually. On x86, I think the usual argument for xor

[RFC] postreload cse'ing vector constants

2022-09-07 Thread Robin Dapp via Gcc-patches
Hi, I recently looked into a sequence like vzero %v0 vlr %v2, %v0 vlr %v3, %v0. Ideally we would like to use vzero for all of these sets in order to not create dependencies. For some instances of this problem I found the offending snippet to be the postreload cse pass. If there is a non

Re: [RFC] postreload cse'ing vector constants

2022-09-07 Thread Robin Dapp via Gcc-patches
> Did you do any archeology into this code to see if there was any > history that might shed light on why it doesn't just use the costing > models? This one was buried under some dust :) commit 0254c56158b0533600ba9036258c11d377d46adf Author: John Carr Date: Wed Jun 10 06:00:50 1998 +

[PATCH] testsuite/s390: Add -mzarch to ifcvt test cases.

2022-09-06 Thread Robin Dapp via Gcc-patches
Hi, this adds a missing -mzarch to some ifcvt test cases. Going to commit this as obvious in some days barring objections. Regards Robin gcc/testsuite/ChangeLog: * gcc.target/s390/ifcvt-one-insn-bool.c: Add -mzarch. * gcc.target/s390/ifcvt-one-insn-char.c: Ditto. *

[PATCH] expand: Convert cst - x into cst xor x.

2022-09-06 Thread Robin Dapp via Gcc-patches
Hi, posting this separately from PR91213 now. I wrote an s390 test and most likely it could also be done for x86 which will give it broader coverage. Depending on the backend it might be better to convert cst - x into cst xor x if cst + 1 is a power of two and 0 <= x <= cst. This patch
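
The condition stated above (cst + 1 is a power of two and 0 <= x <= cst) can be checked with a small sketch. The function names is_low_mask and sub_equals_xor are made up for this illustration and are not part of the patch; the code only demonstrates the arithmetic identity, not the expander change.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if CST + 1 is a power of two, i.e. CST consists only of
   consecutive low one bits (0, 1, 3, 7, 15, ...).  */
int
is_low_mask (uint32_t cst)
{
  return (cst & (cst + 1)) == 0;
}

/* Exhaustively compare CST - X with CST ^ X for all 0 <= X <= CST.
   Return 1 if they agree everywhere, 0 otherwise.  */
int
sub_equals_xor (uint32_t cst)
{
  for (uint32_t x = 0;; x++)
    {
      if (cst - x != (cst ^ x))
        return 0;
      if (x == cst)
        break;
    }
  return 1;
}
```

When cst is all ones in its low bits and x fits in those bits, subtracting x never borrows, so each bit of the result is simply the inverted bit of x, which is exactly what xor with cst computes.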

Re: [PATCH] expand: Convert cst - x into cst xor x.

2022-09-06 Thread Robin Dapp via Gcc-patches
> cost might also depend on the context in case flag setting > behavior differs for xor vs sub (on x86 sub looks strictly more > powerful here). The same is probably true when looking for > a combination with another bitwise operation. > > Btw, why not perform the optimization in expand_binop?

Re: [RFC] postreload cse'ing vector constants

2022-09-28 Thread Robin Dapp via Gcc-patches
> I opened: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107061 The online docs for encodekey256 also say XMM4 through XMM6 are reserved for future usages and software should not rely upon them being zeroed. I believe we also zero there. > This sounds like an issue. So with your patch

Re: VN, len_store and endianness

2022-09-27 Thread Robin Dapp via Gcc-patches
> Yes, because the native_interpret always starts at offset zero > (we can't easily feed in a "shifted" RHS). So what I assumed is > that IFN_LEN_STORE always stores elements [0, len + adj]. Hmm, but this assumption is not violated here or am I missing something? It's not like we're storing

Re: VN, len_store and endianness

2022-09-27 Thread Robin Dapp via Gcc-patches
> The error is probably in vn_reference_lookup_3 which assumes that > 'len' applies to the vector elements in element order. See the part > of the code where it checks for internal_store_fn_p. If 'len' is with > respect to the memory and thus endianness has to be taken into > account then for the

Re: [PATCH] expand: Convert cst - x into cst xor x.

2022-10-21 Thread Robin Dapp via Gcc-patches
> Do we have evidence that targets properly cost XOR vs SUB RTXen? > > It might actually be a reload optimization - when the constant is > available in a register use 'sub', when it needs to be reloaded > use 'xor'? > > That said, I wonder if the fallout of changing some SUB to XOR > is bigger

[PATCH] s390: Fix bootstrap error with checking and -m31

2022-10-19 Thread Robin Dapp via Gcc-patches
Hi, since r13-2746 we hit an ICE when bootstrapping with -m31 and --enable-checking=all. ../../../../libgfortran/ieee/ieee_helper.c: In function 'ieee_class_helper_16': ../../../../libgfortran/ieee/ieee_helper.c:77:3: internal compiler error: RTL check: expected code 'reg', have 'subreg' in

Re: [RFC] postreload cse'ing vector constants

2022-09-08 Thread Robin Dapp via Gcc-patches
> Which is this from the mail archives: > > https://gcc.gnu.org/pipermail/gcc-patches/1998-June/000308.html > > I would tend to agree that for equal cost that the constant would be > preferred since that should be better from a scheduling/dependency > standpoint. So it seems to me we can

Basic REG_EQUIV comprehension question

2022-09-15 Thread Robin Dapp via Gcc-patches
Hi, I have been working on making better use of s390's vzero instruction. Currently we rather zero a vector register once and load it into other registers via vlr instead of emitting multiple vzeros. At IRA/reload point we e.g. have (insn 8 5 19 2 (set (reg/v:V2DI 64 [ zero ])

Re: Basic REG_EQUIV comprehension question

2022-09-15 Thread Robin Dapp via Gcc-patches
> Yeah, rtx_costs (or preferably insn_cost, if that works) seem like the > best way of addressing this. If the target says that register moves are > cheaper than constant moves then it's a feature that CSE & co remove > duplicate constants. The REG_EQUIV note is still useful in those cases >

Re: Basic REG_EQUIV comprehension question

2022-09-15 Thread Robin Dapp via Gcc-patches
Small addition to clarify: (insn 8) from the example is of course matched to a vzero. The "problem" begins when (reg 64) is later moved into another register and the (const_vector) has been optimized to a single definition e.g. by CSE, i.e. we have several (insn yy (set (reg:V2DI xx) (reg:V2DI

VN, len_store and endianness

2022-09-26 Thread Robin Dapp via Gcc-patches
Hi, I'm locally testing a branch that enables vll/vstl for partial vector usage i.e. len_load and len_store on s390. I see a FAIL in testsuite/gfortran.dg/power_3.f90. Since r13-1777-gbd9837bc3ca134 we also perform VN for masked/len stores and things go wrong there. The problem seems to be

Re: [RFC] postreload cse'ing vector constants

2022-09-27 Thread Robin Dapp via Gcc-patches
> I did bootstrapping and ran the testsuite on x86(-64), aarch64, Power9 > and s390. Everything looks good except two additional fails on x86 > where code actually looks worse. > > gcc.target/i386/keylocker-encodekey128.c > > 17c17,18 > < movaps %xmm4, k2(%rip) > --- >> pxor

[PATCH] s390: Implement vec_extract via vec_select.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, vec_select can handle dynamic/runtime masks nowadays. Therefore we can get rid of the UNSPEC_VEC_EXTRACT that was preventing further optimizations like combining instructions with vec_extract patterns. Bootstrapped and regtested. No regressions. Is it OK? Regards Robin gcc/ChangeLog:

[PATCH] s390: Implement vec_set with vec_merge and vec_duplicate.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, similar to other backends this patch implements vec_set via vec_merge and vec_duplicate instead of an unspec. This opens up more possibilities to combine instructions. Bootstrapped and regtested. No regressions. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.md:

[PATCH] s390: Recognize reverse/element swap permute patterns.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, this adds functions to recognize reverse/element swap permute patterns for vler, vster as well as vpdi and rotate. Bootstrapped and regtested, no regressions. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.cc (expand_perm_with_vpdi): Recognize swap pattern.

[PATCH] s390: Use vpdi and verllg in vec_reve.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, swapping the two elements of a V2DImode or V2DFmode vector can be done with vpdi instead of using the generic way of loading a permutation mask from the literal pool and vperm. Analogous to the V2DI/V2DF case reversing the elements of a four-element vector can be done by first swapping the

[PATCH] s390: Add z15 to s390_issue_rate.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, this patch tries to be more explicit by mentioning z15 in s390_issue_rate. No changes in testsuite, bootstrap or SPEC obviously. Is it OK? Regards Robin gcc/ChangeLog: * config/s390/s390.cc (s390_issue_rate): Add z15. --- gcc/config/s390/s390.cc | 1 + 1 file changed, 1

[PATCH] s390: Add -munroll-only-small-loops.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, inspired by Power we also introduce -munroll-only-small-loops. This implies activating -funroll-loops and -munroll-only-small-loops at -O2 and above. Bootstrapped and regtested. This introduces one regression in gcc.dg/sms-compare-debug-1.c but currently dumps for sms are broken as well.

[PATCH] s390: Implement vec_revb(vector short)/bswapv8hi with verllh.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi, this patch implements a byte swap for a V8HImode vector via an element rotate by 8 bits. Bootstrapped and regtested, no regressions. Is it OK? Regards Robin gcc/ChangeLog: PR target/100867 * config/s390/vector.md: Add special case for V8HImode. gcc/testsuite/ChangeLog:
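
Why a rotate by 8 bits is a byte swap for 16-bit elements can be seen in a scalar sketch. The helpers rotl16_by_8 and bswap_v8hi are illustrative names only; they model the per-element effect in plain C and are not the vector.md pattern from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits.  With only two bytes per
   element this exchanges the high and the low byte.  */
uint16_t
rotl16_by_8 (uint16_t v)
{
  return (uint16_t) ((v << 8) | (v >> 8));
}

/* Apply the rotate to each of the eight halfword elements, modelling
   what an element-wise rotate does to a V8HImode vector.  */
void
bswap_v8hi (uint16_t v[8])
{
  for (int i = 0; i < 8; i++)
    v[i] = rotl16_by_8 (v[i]);
}
```

For example, rotl16_by_8 (0x1234) yields 0x3412, the same result a two-byte bswap would give.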

optabs: Variable index vec_set

2022-10-31 Thread Robin Dapp via Gcc-patches
Hi, I'm looking into vec_set with variable index on s390. Uros posted a patch [1] that did not make it upstream in Nov 2020. It changed the mode of the index operand to whatever the target supports in can_vec_set_var_idx_p. I missed it back then but we indeed do not make proper use of vec_set

Re: optabs: Variable index vec_set

2022-11-02 Thread Robin Dapp via Gcc-patches
Hi, > With the patch my local changes to make better use of vec_set work > nicely even though I haven't done a full bootstrap yet. Were there > other issues with the patch or can it still be applied? I performed a bootstrap as well as a regtest with -march=z16 on s390. There is no new fallout.

Re: optabs: Variable index vec_set

2022-11-02 Thread Robin Dapp via Gcc-patches
> IIRC, I was trying to "fix" modeless operand by giving it a mode, but > since it made no difference for x86, I later dropped the patch. > However, operand with a known mode is preferred, so if it works for > you, just include my patch in your submission. My patch is somehow > trivial if we want

Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-11 Thread Robin Dapp via Gcc-patches
Hi, > On optimizing for speed, default_noce_conversion_profitable_p() allows > plenty of headroom, so this patch has little impact. > > Also, if the target-specific cost estimate is accurate or allows for > margins, the impact should be similarly small. I believe this part of ifcvt does/did not

Re: [RFC] postreload cse'ing vector constants

2022-11-03 Thread Robin Dapp via Gcc-patches
Should we go ahead with this, i.e. push the change and wait for fallout? I guess we're still early enough in the cycle for that. There are no regressions anymore on s390, Power9, x86 and aarch64 (at least on the farm machines I checked). Regards Robin

[PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-02 Thread Robin Dapp via Gcc-patches
Hi, this patch adds LEN_LOAD/LEN_STORE support for z14 and newer. It defines a bias value of -1 and implements the LEN_LOAD and LEN_STORE optabs. It also includes various vll/vstl testcases adapted from Kewen Lin's patch for Power. Bootstrapped and regtested on z13-z16. Is it OK? Regards

Re: [PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-27 Thread Robin Dapp via Gcc-patches
ng used as part of the length then? Do we need a zero-extend > here? v2 attached with these problems addressed. Testsuite and bootstrap as before. Regards Robin From 27cc2fa49a0f3fbc2c629028b51e862346392636 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Mon, 22 Aug 2022 11:05:39 +0200 Subject: [P

[PATCH] s390: Use arch14 instead of z16 for -march=native.

2023-03-02 Thread Robin Dapp via Gcc-patches
Hi, When compiling on a system where binutils do not yet support the 'z16' name assembling fails with -march=native which we currently interpret as -march=z16 (on a z16 machine). This patch uses -march=arch14 instead. Is it OK? Regards Robin -- gcc/ChangeLog: *

[PATCH] s390: Fix ifcvt test cases

2023-03-02 Thread Robin Dapp via Gcc-patches
Hi, we seem to flip flop between the "high" and "not low" variants of load on condition. Accept both in the affected test cases. Going to commit this as obvious. Regards Robin -- gcc/testsuite/ChangeLog: * gcc.target/s390/ifcvt-two-insns-bool.c: Allow "high" and "not low or

[PATCH] testsuite: Do not expect partial vectorization for s390.

2023-03-02 Thread Robin Dapp via Gcc-patches
Hi, this patch changes SLP test expectations. As we only vectorize when no more than one rgroup is present, no vectorization is performed. I was also considering using a separate target selector (something like vect_partial_vectors_bias_m1) but as the number of testcases is limited that would

Re: [committed] testsuite: Fix up syntax errors in scan-tree-dump-times target selectors

2023-03-06 Thread Robin Dapp via Gcc-patches
Hi, > This broke the tests, I'm seeing syntax errors: > ERROR: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects: error executing dg-final: > syntax error in target selector "target ! vect_partial_vectors || vect32 || > s390_vx" > ERROR: gcc.dg/vect/slp-3.c: error executing dg-final: syntax error

Re: [PATCH 2/3 V2] RISC-V: Enable basic auto-vectorization for RVV

2023-04-20 Thread Robin Dapp via Gcc-patches
> $ riscv64-unknown-linux-gnu-gcc > --param=riscv-autovec-preference=fixed-vlmax > gcc/testsuite/gcc.target/riscv/rvv/base/spill-10.c -O2 -march=rv64gcv > -S > ../riscv-gnu-toolchain-trunk/riscv-gcc/gcc/testsuite/gcc.target/riscv/rvv/base/spill-10.c: > In function 'stach_check_alloca_1': >

Re: [PATCH 2/3 V2] RISC-V: Enable basic auto-vectorization for RVV

2023-04-20 Thread Robin Dapp via Gcc-patches
> Can you give more comments about Robin's opinion that he want to change into > "fixed" vs "varying" or "fixed vector size" vs "dynamic vector size" ? It's not necessary to decide on this now as --params are not supposed to be stable and can be changed quickly. I was just curious if this had

Re: [RFA] [PR target/108248] [RISC-V] Break down some bitmanip insn types

2023-04-21 Thread Robin Dapp via Gcc-patches
> ../../gcc/config/riscv/generic.md:28:1: unknown value `smin' for attribute > `type' > make[3]: *** [Makefile:2528: s-attrtab] Error 1 > From 582c428258ce17ffac8ef1b96b4072f3d510480f Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 21 Apr 2023 09:38:06 +0200 Subject: [PA

Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations

2023-04-26 Thread Robin Dapp via Gcc-patches
Hi Michael, I have the diff below for the binops in my tree locally. Maybe something like this works for you? Untested but compiles and the expander helpers would need to be fortified obviously. Regards Robin -- gcc/ChangeLog: * config/riscv/autovec.md (3): New binops expander.

[PATCH] riscv: Allow vector constants in riscv_const_insns.

2023-04-28 Thread Robin Dapp via Gcc-patches
Hi, I figured I'm going to start sending some patches that build on top of the upcoming RISC-V autovectorization. This one is obviously not supposed to be installed before the basic support lands but it's small enough that it shouldn't hurt to send it now. This patch allows vector constants in

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-12 Thread Robin Dapp via Gcc-patches
>> I think we can CC IBM folks to see whether we can make WHILE_LEN works >> for both IBM and RVV ? > > I've CCed them. Adding WHILE_LEN support to rs6000/s390x would be > mainly the "easy" way to get len-masked (epilog) loop support. I've > figured actually implementing WHILE_ULT for AVX512

Re: [PATCH V2] RISC-V: Add ZVFHMIN block autovec testcase

2023-06-12 Thread Robin Dapp via Gcc-patches
> +/* We can't enable FP16 NEG/PLUS/MINUS/MULT/DIV auto-vectorization when > -march="*zvfhmin*". */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 0 > "vect" } } */ Thanks. OK from my side. Regards Robin

Re: [PATCH] RISC-V: Fix V_WHOLE && V_FRACT iterator requirement

2023-06-12 Thread Robin Dapp via Gcc-patches
> +  (VNx16QI "TARGET_MIN_VLEN <= 128") > +  (VNx32QI "TARGET_MIN_VLEN <= 256") > +  (VNx64QI "TARGET_MIN_VLEN >= 64 && TARGET_MIN_VLEN <= 512") > +  (VNx128QI "TARGET_MIN_VLEN >= 128 && TARGET_MIN_VLEN <= 1024") > > This not correct, we always use VNx16QI as LMUL = m1 for min_vlen >= 128. >

Re: [PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, > Thanks for fixing this. > > This patch let RVV type (both vector and tuple) return in memory by > default when there is no vector ABI support. It makes sense to me. > > CC more RISC-V folks to comments. so this is intended to fix the PR as well as unblock while we continue with the

Re: [PATCH V2] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Robin Dapp via Gcc-patches
> Oh. I see Robin's email is also wrong. CC Robin too for you  It still arrived via the mailing list ;) > Good to see a Fix patch of the ICE before Vector ABI patch. > Let's wait for more comments. LGTM, this way I don't even need to rewrite my tests. Regards Robin

Re: [PATCH] RISC-V: Use merge approach to optimize vector permutation

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi Juzhe, the general method seems sane and useful (it's not very complicated). I was just distracted by > Selector = { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }, the > common expression: > { 0, nunits + 1, 1, nunits + 2, 2, nunits + 3, ... } > > For this selector, we can use

[PATCH] RISC-V: Add autovec FP unary operations.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this patch adds floating-point autovec expanders for vfneg, vfabs as well as vfsqrt and the accompanying tests. vfrsqrt7 will be added at a later time. Similarly to the binop tests, there are flavors for zvfh now. Prerequisites as before. Regards Robin gcc/ChangeLog: *

[PATCH] RISC-V: testsuite: Add vector_hw and zvfh_hw checks.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this introduces new checks for run tests. Currently we have riscv_vector as well as rv32 and rv64 which all check if GCC (with the current configuration) can build (not execute) the respective tests. Many tests specify e.g. a different -march for vector, though. So the check fails even

[PATCH] RISC-V: Add autovec FP binary operations.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this implements the floating-point autovec expanders for binary operations: vfadd, vfsub, vfdiv, vfmul, vfmax, vfmin and adds tests. The existing tests are amended and split up into non-_Float16 and _Float16 flavors as we cannot rely on the zvfh extension being present. As long as we do not

Re: [PATCH v1] RISC-V: Fix one typo in full-vec-movel test

2023-06-13 Thread Robin Dapp via Gcc-patches
> Oh. Sorry. Since I want to commit my patch so I asked Pan to commit > your test as well. I think you can resend a fix of this testcase and > drop this patch. No problem, will fix it another time. Pan can just go ahead with this fix now, no need to wait for a maintainer, it's obvious enough.

Re: [PATCH] RISC-V: Implement vec_set and vec_extract.

2023-06-13 Thread Robin Dapp via Gcc-patches
> I suggest we implement vector calling convention even though it is not > ratified yet. > We can allow calling convention to be enabled only when > --param=riscv-autovec-preference=fixed-vlmax. > We have such issue: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110119 >

Re: [PATCH v1] RISC-V: Fix one typo in full-vec-movel test

2023-06-13 Thread Robin Dapp via Gcc-patches
> This patch would like to fix one typo when checking assembly of > full-vec-movel. OK. (I actually intended to commit this myself adding some more comments to the iterator change as well as fix the tests, but well...) Regards Robin

Re: [PATCH V3] RISC-V: Add more SLP tests

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, works for me as is. I just hope somebody is going to take on the task of making different LMUL SLP variants "scannable" at some point because it would definitely increase our test coverage with these tests. (Or split the tests manually and not iterate over LMUL) Regards Robin

Re: [PATCH] RISC-V: Add more SLP tests

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, as the tests are mostly directly from aarch64's testsuite I would advise comments on where they were taken from as well as a TODO that they should become common tests for a specific target selector (vect_scalable_supported or something). How about some assembly checks for the non-run

Re: [PATCH] RISC-V: Fix bug of VLA SLP auto-vectorization

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, LGTM. You could also add the aarch64 test disclaimer here again, but no need for a V2. Regards Robin

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
> Meh, PoP is now behind a paywall, trying to get through ... I wonder > if there's a nice online html documenting the s390 len_load/store > instructions to better understand the need for the bias. https://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf Look for vector load with length (store). The

Re: [PATCH] RISC-V: Add autovec FP unary operations.

2023-06-15 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I like the iterator solution better, I added it to the binops V2 patch with a comment and will post it in a while. Also realized there is already a testcase and the "enabled" attribute is set properly now but I hadn't rebased to the current master branch in a while... Btw. I'm

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
> the minus in 'operand 2 - operand 3' should be a plus if the > bias is really zero or -1. I suppose Yes, that somehow got lost from when the bias was still +1. Maybe Juzhe can fix this in the course of his patch. > that's quite conservative. I think you can do better when the > loads are

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
On 6/15/23 11:18, Robin Dapp wrote: >> Meh, PoP is now behind a paywall, trying to get through ... I wonder >> if there's a nice online html documenting the s390 len_load/store >> instructions to better understand the need for the bias. This is z16, but obviously no changes f

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
>>> Can you try using the same wording for length and mask operands >>> as for len_load and maskload? Also len_load has the "bias" >>> operand which you omit here - IIRC that was added for s390 which >>> for unknown reason behaves a little different than power. If >>> len support for s390 ever

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
> Thanks. Ok for trunk? OK from my side. As agreed with Jeff, I'm going to get back to this and revisit/change if needed in the future. Regards Robin

Re: [PATCH] RISC-V: Support vfwnmacc/vfwmsac/vfwnmsac combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
To reiterate, this is OK from my side. As discussed in the other thread, Jeff would like to have more info on whether a bridge pattern is needed at all and I agreed to get back to it in a while. Until then, we can merge this. Regards Robin

Re: [PATCH v5] RISC-V: Fix one bug for floating-point static frm

2023-07-06 Thread Robin Dapp via Gcc-patches
Hi Pan, thanks, I think that works for me as I'm expecting these parts to change a bit anyway in the near future. There is no functional change to the last revision that Kito already OK'ed so I think you can go ahead. Regards Robin

Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Robin Dapp via Gcc-patches
Hi Juzhe, when changing the argument order for LEN_LOAD/LEN_STORE, you will also need to adjust rs6000's and s390's expanders. Regards Robin

Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Robin Dapp via Gcc-patches
> Similar to LEN_MASK_LOAD/STORE, their orders are consistent now after > this patch. Ah right, apologies. Regards Robin

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
On 7/3/23 10:45, juzhe.zh...@rivai.ai wrote: > We can apply it but not sure why the patchwork shows it's rejected. I believe it also failed for me locally because the order of patterns in autovec-opt.md was somehow different. The one attached worked for me though after some minor merge

Re: [PATCH 2/2] ifcvt: Allow more operations in multiple set if conversion

2023-07-03 Thread Robin Dapp via Gcc-patches
Hi Manolis, that looks like a nice enhancement of what's already possible. The concern I had some years back already was that this function would eventually grow and cannibalize on some of what the other functions in ifcvt already do :) At some point we really should unify but that's not within

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-03 Thread Robin Dapp via Gcc-patches
> Thanks for fixing it. LGTM. > I think you can merge it when Robin is ok since this is a simple typo > fix. Yes, that's definitely simple enough :) Regards Robin

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
> We failed to merge it since it's been rejected. > https://patchwork.sourceware.org/project/gcc/patch/20230628041512.188243-1-juzhe.zh...@rivai.ai/ > > >   Err, who rejected? Or is this about

[PATCH] gimple-isel: Recognize vec_extract pattern.

2023-07-03 Thread Robin Dapp via Gcc-patches
Hi, In gimple-isel we already deduce a vec_set pattern from an ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a vec_extract. The code is largely similar to the vec_set one including the addition of a can_vec_extract_var_idx_p function in optabs.cc to check if the backend can handle
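As an illustration of the kind of source that can lead to such a pattern, here is a small sketch using the GCC/Clang vector extension. The assumption (not confirmed by the snippet above) is that subscripting a vector with a variable index is what shows up as an ARRAY_REF of a VIEW_CONVERT_EXPR in GIMPLE; the type name v4si and the function extract_lane are made up for the example.

```c
/* Four 32-bit ints in one 16-byte vector (GCC/Clang extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Read one element at a run-time index.  IDX must be in [0, 3];
   this is the variable-index extract a backend may or may not be
   able to handle directly.  */
static int
extract_lane (v4si v, int idx)
{
  return v[idx];
}
```

Calling extract_lane on the vector {10, 20, 30, 40} with index 2 reads the third element.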

Re: [VSETVL PASS] RISC-V: Optimize local AVL propagation

2023-07-03 Thread Robin Dapp via Gcc-patches
LGTM. Regards Robin

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-07-04 Thread Robin Dapp via Gcc-patches
> Kito (or somebody else), would you mind doing a RISC-V bootstrap? It would > take forever on my machine. Thank you. I did a bootstrap myself now and it finally finished. Going to commit the attached tomorrow. Regards Robin Subject: [PATCH] Change MODE_BITSIZE to MODE_PRECISION for

Re: [PATCH] gimple-isel: Recognize vec_extract pattern.

2023-07-04 Thread Robin Dapp via Gcc-patches
Hi Richard, changed the patch according to your comments and I agree that it is more readable that way. I hope using lhs as target for the extract directly is possible the way I did it. Richard's patch for aarch64 is already in, therefore testsuites on aarch64 and i386 are unchanged. Regards

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-03 Thread Robin Dapp via Gcc-patches
> Sorry for inconvenient, still working on fix it. If urgent I can > revert this change to unblock your work ASAP. I'm not blocked by this, thanks, just wanted to document it here. I was testing another patch and needed to dig for a while until I realized the FAILs come from this one. In general

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-03 Thread Robin Dapp via Gcc-patches
Hmm, looks like it wasn't simple enough... I'm seeing execution fails for various floating point test cases. This is due to a mismatch between the FRM_DYN definition (0b111 == 7) and the attribute value (== 5). Therefore we set the rounding mode to 5 instead of 7. Regards Robin
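A small sketch of why writing 5 instead of 7 matters: the encodings below are the rounding-mode values of an instruction's rm field as given in the RISC-V F extension, where 0b101 and 0b110 are reserved and 0b111 selects the dynamic mode. The enum and the helper frm_encoding_valid are hypothetical names for illustration (only FRM_DYN appears in the message above) and do not reflect how GCC's backend spells them.

```c
#include <stdbool.h>

/* Rounding-mode encodings of the rm field (RISC-V F extension).  */
enum frm_mode
{
  FRM_RNE = 0,  /* round to nearest, ties to even */
  FRM_RTZ = 1,  /* round towards zero */
  FRM_RDN = 2,  /* round down */
  FRM_RUP = 3,  /* round up */
  FRM_RMM = 4,  /* round to nearest, ties to max magnitude */
  FRM_DYN = 7   /* 0b111: use the mode held in the frm register */
};

/* Hypothetical check: is VALUE a defined rm encoding?
   5 and 6 are reserved, so an attribute value of 5 does not
   name any rounding mode.  */
static bool
frm_encoding_valid (unsigned value)
{
  return value <= FRM_RMM || value == FRM_DYN;
}
```

Under these encodings frm_encoding_valid (5) is false while frm_encoding_valid (7) is true, which matches the mismatch described above.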
