Re: [PATCH V2] RISC-V: Support CALL conditional autovec patterns

2023-08-03 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I would find it a bit clearer if the prepare_ternay part were a separate patch. As it's mostly mechanical replacements I don't mind too much though, so it's LGTM from my side without that. As to the lmul = 8 ICE, is the problem that the register allocator would actually need 5 "registe

Re: [PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-02 Thread Robin Dapp via Gcc-patches
> 1. How do you model round to +Inf (avg_floor) and round to -Inf (avg_ceil) ? That's just specified by the +1 or the lack of it in the original pattern. Actually the IFN is just a detour because we would create perfect code if not for the fallback. But as there is currently no way to check for

[PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-01 Thread Robin Dapp via Gcc-patches
Hi, this patch adds vector average patterns op[0] = (narrow) ((wide) op[1] + (wide) op[2]) >> 1; op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1) >> 1; If there is no direct support, the vectorizer can synthesize the patterns but, presumably due to lack of narrowing operation support, won't
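The two patterns quoted above differ only in the +1 before the shift. A minimal scalar sketch of that arithmetic, assuming 8-bit elements widened to 16 bits; the names avg_floor, avg_ceil and avg_floor_loop are illustrative and not taken from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Floor average: widen so the sum cannot overflow, shift right by one,
// then narrow back to the element type.
constexpr std::uint8_t avg_floor(std::uint8_t a, std::uint8_t b) {
    const auto wide = static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) + b);
    return static_cast<std::uint8_t>(wide >> 1);
}

// Ceil average: same as above, but add 1 before the shift.
constexpr std::uint8_t avg_ceil(std::uint8_t a, std::uint8_t b) {
    const auto wide = static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) + b + 1);
    return static_cast<std::uint8_t>(wide >> 1);
}

// The loop shape a vectorizer would have to recognize (hypothetical example).
inline void avg_floor_loop(std::uint8_t* dst, const std::uint8_t* a,
                           const std::uint8_t* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = avg_floor(a[i], b[i]);
}

// Small self-check; call it from main() or a test harness.
inline void check_avg() {
    assert(avg_floor(1, 2) == 1);
    assert(avg_ceil(1, 2) == 2);
    assert(avg_floor(255, 255) == 255);  // would wrap without widening
}
```

Without the widening step, 255 + 255 would wrap in 8 bits, which is why the pattern needs a wide intermediate and a narrowing at the end.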

Re: RISCV test infrastructure for d / v / zfh extensions

2023-08-01 Thread Robin Dapp via Gcc-patches
Hi Joern, thanks, I believe this will help with testing. > +proc check_effective_target_riscv_v { } { > +return [check_no_compiler_messages riscv_ext_v assembly { > + #ifndef __riscv_v > + #error "Not __riscv_v" > + #endif > +}] > +} This can be replaced by riscv_vector

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
>>> I'm not against continuing with the more well-known approach for now >>> but we should keep in mind that might still be potential for improvement. > > No. I don't think it's faster. I did a quick check on my x86 laptop and it's roughly 25% faster there. That's consistent with the literature.

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
> +/* FIXME: We don't allow vectorize "__builtin_popcountll" yet since it needs > "vec_pack_trunc" support > + and such pattern may cause inferior codegen. > + We will enable "vec_pack_trunc" when we support reasonable vector > cost model. */ Wait, why do we need vec_pack_trunc f

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +/* Expand Vector POPCOUNT by parallel popcnt: > + > + int parallel_popcnt(uint32_t n) { > + #define POW2(c) (1U << (c)) > + #define MASK(c) (static_cast(-1) / (POW2(POW2(c)) + 1U)) > + #define COUNT(x, c) ((x) & MASK(c)) + (((x)>>(POW2(c))) & MASK(c)) > + n = CO
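For reference, the parallel popcount quoted above can be written out as a scalar sketch. This assumes a 32-bit unsigned element and mirrors the POW2/MASK/COUNT macros from the quoted comment as constexpr functions; the function names are illustrative, and the expander itself is not shown here:

```cpp
#include <cassert>
#include <cstdint>

// MASK(c): all-ones divided by (2^(2^c) + 1), giving
// 0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF for c = 0..4.
constexpr std::uint32_t mask(unsigned c) {
    return 0xFFFFFFFFu / ((1u << (1u << c)) + 1u);
}

// COUNT(x, c): add neighbouring fields of width 2^c bits.
constexpr std::uint32_t count_step(std::uint32_t x, unsigned c) {
    return (x & mask(c)) + ((x >> (1u << c)) & mask(c));
}

// Five steps fold 32 one-bit counters into a single 32-bit count.
constexpr unsigned parallel_popcount(std::uint32_t n) {
    for (unsigned c = 0; c < 5; ++c)
        n = count_step(n, c);
    return n;
}

static_assert(mask(0) == 0x55555555u, "2-bit field mask");
static_assert(mask(4) == 0x0000FFFFu, "32-bit field mask");

// Small self-check; call it from main() or a test harness.
inline void check_popcount() {
    assert(parallel_popcount(0u) == 0);
    assert(parallel_popcount(0xFFFFFFFFu) == 32);
}
```

Each step uses only shifts, ANDs and adds, so it maps onto plain element-wise vector operations.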

Re: [PATCH V2] RISC-V: Enable basic VLS auto-vectorization

2023-07-30 Thread Robin Dapp via Gcc-patches
> +;; - > +;; Duplicate Operations > +;; - > + > +(define_insn_and_split "@vec_duplicate" > + [(set (match_operand:VLS 0 "register_operand") > +(vec_duplicat

Re: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-07-28 Thread Robin Dapp via Gcc-patches
> This is a draft patch. I would like to explain it's hard to make the > simplify generic and ask for some help. > > There're 2 categories we need to optimize. > > - The op in optab such as div / 1. > - The unspec operation such as mulh * 0, (vadc+vmadc) + 0. > > Especially for the unspec operat

[PATCH] gcse: Extract reg pressure handling into separate file.

2023-07-28 Thread Robin Dapp via Gcc-patches
Hi, this patch extracts the hoist-pressure handling from gcse and puts it into a separate file so it can be used by other passes in the future. No functional change and I also abstained from c++ifying the code. The naming with the regpressure_ prefix might be a bit clunky for now and I'm open to a

Re: [PATCH v8] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-28 Thread Robin Dapp via Gcc-patches
Hi Pan, thanks for your patience and your work. Apart from my general doubt whether mode-changing intrinsics are a good idea, I don't have other remarks that need fixing. What I mentioned before: - Handling of asms wouldn't be a huge change. It can be done in a follow-up patch of course but

Re: [PATCH v2] RISC-V: testsuite: Add vector_hw and zvfh_hw checks.

2023-07-27 Thread Robin Dapp via Gcc-patches
> LGTM, I just found this patch still on the list, I mostly tested with > qemu, so I don't think that is a problem before, but I realize it's a > problem when we run on a real board that does not support those > extensions. I think we can skip this one as I needed to introduce vector_hw and zvfh_h

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-27 Thread Robin Dapp via Gcc-patches
> I see, you mean at the beginning of frm_after, we can just return the > incoming mode as is? > > If (CALL_P (insn)) > return mode; // Given we aware the mode is DYN_CALL already. Yes, potentially similar for all the other ifs but I didn't check all of them. > Thank and will cleanup this i

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-27 Thread Robin Dapp via Gcc-patches
>> Why do we appear to return a different mode here? We already request >> FRM_MODE_DYN_CALL in mode_needed. It looks like in the whole function >> we do not change the mode so we could just always return the incoming >> mode? > > Because we need to emit 2 insn when meet a call. One before the c

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> I would like to propose that being focus and moving forward for this > patch itself, the underlying other RVV floating point API support and > the RVV instrinsic API fully tests depend on this. Sorry, I didn't mean to ditch LCM/mode switching. I believe it is doing a pretty good job and we shou

Re: [PATCH] RISC-V: Enable basic VLS modes support

2023-07-26 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just some small remarks, all in all no major concerns. > + vmv%m1r.v\t%0,%1" > + "&& (!register_operand (operands[0], mode) > + || !register_operand (operands[1], mode))" > + [(const_int 0)] > + { > +unsigned size = GET_MODE_BITSIZE (mode).to_constant (); > +if (size

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> CSR write could be expensive, it will flush whole pipeline in some > RISC-V core implementation… Hopefully not flush but just sequentialize but yes, it's usually a performance concern. However if we set the rounding mode to something else for an intrinsic and then call a function we want to re

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> current llvm didn't do any pre optimization. They always > backup+restore for each rounding mode intrinsic I see. There is still the option of lazily restoring the (entry) FRM before a function call but not read the FRM after every call. Do we have any data on how good or bad the mode-switchi
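As a source-level analogy for the backup+restore scheme mentioned above, here is a small sketch using the standard <cfenv> interface. It only illustrates the save/set/restore ordering; it says nothing about the instructions GCC or LLVM actually emit, and the ScopedRounding class is a hypothetical helper:

```cpp
#include <cassert>
#include <cfenv>

// Saves the current rounding mode, installs a new one, and restores the
// saved mode when the scope ends (i.e. before control returns to code
// that expects the original mode).
class ScopedRounding {
public:
    explicit ScopedRounding(int mode) : saved_(std::fegetround()) {
        std::fesetround(mode);
    }
    ~ScopedRounding() { std::fesetround(saved_); }
    ScopedRounding(const ScopedRounding&) = delete;
    ScopedRounding& operator=(const ScopedRounding&) = delete;

private:
    int saved_;
};

// Returns the rounding mode that is active while the guard is alive.
inline int mode_inside_scope(int mode) {
    ScopedRounding guard(mode);
    return std::fegetround();
}

// Small self-check; call it from main() or a test harness.
inline void check_scoped_rounding() {
    const int before = std::fegetround();
    assert(mode_inside_scope(FE_UPWARD) == FE_UPWARD);
    assert(std::fegetround() == before);  // restored on scope exit
}
```

Doing this around every operation costs one read and two writes of the mode each time, which is the overhead a lazier scheme would try to avoid.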

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
So after thinking about it again - I'm still not really sure I like treating every function as essentially an fesetround. There is a reason why fesetround is special. Does LLVM behave the same way? But supposing we really, really want it and assuming there's consensus: + start_sequence (); + e

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-25 Thread Robin Dapp via Gcc-patches
> The call fesetround could be any function in practice, and we never > know if that function might use dynamic rounding mode floating point > operation or not, also we don't know if it will be called fesetround > or not. > > So that's why we want to restore before function call to make sure we >

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-25 Thread Robin Dapp via Gcc-patches
Hi Pan, > Given we have a call, we would like to restore before call and then > backup frm after call. Looks current mode switching cannot emit insn > like that, it can only either emit insn before (mostly) or after > (when NOTE_INSN_BASIC_BLOCK_P). Thus, we try to emit the one after > call when n

Re: [PATCH] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-24 Thread Robin Dapp via Gcc-patches
Hi Jin, this looks reasonable. Would you mind adding (small) test cases still to make sure we don't accidentally reintroduce the problem? Regards Robin

Re: [PATCH v6] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-24 Thread Robin Dapp via Gcc-patches
Hi Pan, > + for (insn = PREV_INSN (cur_insn); insn; insn = PREV_INSN (insn)) > +{ > + if (INSN_P (insn)) > + { > + if (CALL_P (insn)) > + mode = FRM_MODE_DYN; > + break; > + } > + > + if (insn == BB_HEAD (bb)) > + break; > +} > + > + return mode;

Re: [PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-20 Thread Robin Dapp via Gcc-patches
>> cvt_type >> - = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode), >> + = build_nonstandard_integer_type (GET_MODE_BITSIZE >> + (intermediate_mode), >>

Re: [PATCH V2] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> LGTM, but I would like make sure Robin is OK too Yes, LGTM as well. Regards Robin

Re: [PATCH] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> The UNORDERED enum will cause ICE since we have UNORDERED in rtx_code. > > Could you give me another enum name? I would have expected it to work when it's namespaced. Regards Robin
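A minimal sketch of the namespacing remark above. It only illustrates C++ name lookup and makes no claim about the actual failure in the patch; fake_rtx_code stands in for the existing global enum, and the demo namespace is hypothetical:

```cpp
#include <cassert>

// Stand-in for a pre-existing global enumerator named UNORDERED.
enum fake_rtx_code { FAKE_UNKNOWN, UNORDERED };

namespace demo {
// A second UNORDERED is legal here because it lives in a different scope.
enum reduction_type { UNORDERED, FOLD_LEFT, MASK_LEN_FOLD_LEFT };

// Inside the namespace, the unqualified name finds demo::UNORDERED first.
constexpr int inner_value() { return UNORDERED; }
}  // namespace demo

// At global scope, the unqualified name is ::UNORDERED.
constexpr int outer_value() { return UNORDERED; }

static_assert(demo::inner_value() == 0, "demo::UNORDERED is the first enumerator");
static_assert(outer_value() == 1, "::UNORDERED is the second enumerator");

// Small self-check; call it from main() or a test harness.
inline void check_namespaced_enum() {
    assert(static_cast<int>(demo::UNORDERED) != static_cast<int>(::UNORDERED));
}
```

Outside the namespace, callers have to write demo::UNORDERED explicitly, so the two enumerators cannot be mixed up by accident.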

Re: [PATCH] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> +enum reduction_type > +{ > + UNORDERED_REDUDUCTION, > + FOLD_LEFT_REDUDUCTION, > + MASK_LEN_FOLD_LEFT_REDUDUCTION, > +}; There are redundant 'DU's here ;) Wouldn't it be sufficient to have an enum enum reduction_type { UNORDERED, FOLD_LEFT, MASK_LEN_FOLD_LEFT, }; ? Regards Robin

Re: [PATCH] VECT: Support floating-point in-order reduction for length loop control

2023-07-19 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I just noticed that we recently started calling things MASK_LEN (instead of LEN_MASK before) with the reductions. Wouldn't we want to be consistent here? Especially as the length takes precedence. I realize the preparational work like optabs is already upstream but still wanted to brin

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > I think you are rigth, I would like to remove the `-mcmodel=medany` option and > relax assert from `__riscv_save/restore_4` to `__riscv_save/restore_(3|4)` to > let > this testcase not brittle on any -mcmodel. Then I'm also going to add another > testcase (I dont known how to run -ma

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > I think the purpose of this testcase is to check whether the modifications to > the stack frame are as expected, so it is necessary to specify exactly whether > three or four registers are saved. But I think its need to add another > testcase > which use another option -mcmodel=medany

Re: [PATCH V2] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > This patch fix testcase failed when I build RISC-V GCC with -mcmodel=medany > as default. If set to medany, stack_save_restore.c testcase will fail because > of > the reduced use of s3 registers in assembly (thus calling __riscv_save/store_3 > instead of __riscv_save/store_4). Explici

Re: [PATCH] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +;; - > +;; [INT,FP] Initialize from individual elements > +;; - > +;; Includes: > +;; - vslide1up.vx/vfslide1up.vf > +;; ---

[PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-14 Thread Robin Dapp via Gcc-patches
>>> Can you add testcases? Also the current restriction is because >>> the variants you add are not always correct and I don't see any >>> checks that the intermediate type doesn't lose significant bits? I didn't manage to create one for aarch64 nor for x86 because AVX512 has direct conversions e

Re: [PATCH V2] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-14 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, looks good to me now - did before already actually ;). Regards Robin

Re: [PATCH] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
> Is COND _LEN FMA ok for trunk? I can commit it without changing > scatter store testcase fix. > > It makes no sense block cond Len fma support. The middle end support > has already been merged. Then just add a TODO or so that says e.g. "For some reason we exceed the default code model's +-2

Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
From my understanding, we dont have RVV instruction for fmax/fmin? > > Unless I'm misunderstanding, we do. The ISA manual says > > === Vector Floating-Point MIN/MAX Instructions > > The vector floating-point `vfmin` and `vfmax` instructions have the > same behavior as the

Re: [PATCH] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-13 Thread Robin Dapp via Gcc-patches
> Can you add testcases? Also the current restriction is because > the variants you add are not always correct and I don't see any > checks that the intermediate type doesn't lose significant bits? The testcases I wanted to add with a follow-up RISC-V patch but I can also try an aarch64 one. So

Re: [PATCH] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, no complaints from my side apart from one: > +/* { dg-additional-options "-mcmodel=medany" } */ Please add a comment why we need this. Regards Robin

[PATCH] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-13 Thread Robin Dapp via Gcc-patches
Hi, the recent changes that allowed multi-step conversions for "non-packing/unpacking", i.e. modifier == NONE targets included promoting to-float and demoting to-int variants. This patch adds demoting to-float and promoting to-int handling. Bootstrapped and regtested on x86 and aarch64. A quest

Re: [PATCH] Add VXRM enum

2023-07-12 Thread Robin Dapp via Gcc-patches
> +enum __RISCV_VXRM { > + __RISCV_VXRM_RNU = 0, > + __RISCV_VXRM_RNE = 1, > + __RISCV_VXRM_RDN = 2, > + __RISCV_VXRM_ROD = 3, > +}; > + > __extension__ extern __inline unsigned long > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > vread_csr(enum RVV_CSR csr) We have

Re: [PATCH] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +/* Return true if the operation is the floating-point operation need FRM. */ > +static bool > +need_frm_p (rtx_code code, machine_mode mode) > +{ > + if (!FLOAT_MODE_P (mode)) > +return false; > + return code != SMIN && code != SMAX; > +} Return true if the operation requires

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-07-12 Thread Robin Dapp via Gcc-patches
> int32_t x = (int32_t)0x1.0p32; > int32_t y = (int32_t)(int64_t)0x1.0p32; > > sets x to 2147483647 and y to 0. >>> >>> Hmm, good question. GENERIC has a direct truncation to unsigned char >>> for example, the C standard generally says if the integral part cannot >>> be represented

[PATCH v2] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
Attached is v2 that does not switch to uint64_t but stays within 32 bits by shifting the optab by 20 and the mode(s) by 10 bits. Regards Robin Upcoming changes for RISC-V will have us exceed 255 modes or 8 bits. This patch increases the limit to 10 bits and adjusts the hashing function for the g
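The shift scheme described above can be sketched as follows. The 20/10-bit layout is taken from the message; the 16/8-bit "old" layout and the function names pack_old/pack_new are assumptions made for illustration, not the genopinit source:

```cpp
#include <cassert>
#include <cstdint>

// Assumed previous layout: 8 bits per mode, optab above them.
constexpr std::uint32_t pack_old(std::uint32_t optab, std::uint32_t m1, std::uint32_t m2) {
    return (optab << 16) | (m1 << 8) | m2;
}

// Layout from the message: optab shifted by 20, modes by 10 bits each.
// That leaves 12 bits for the optab within a 32-bit key.
constexpr std::uint32_t pack_new(std::uint32_t optab, std::uint32_t m1, std::uint32_t m2) {
    return (optab << 20) | (m1 << 10) | m2;
}

// The largest values that still fit fill the key exactly.
static_assert(pack_new(0xFFFu, 0x3FFu, 0x3FFu) == 0xFFFFFFFFu, "12 + 10 + 10 bits");

// Small self-check; call it from main() or a test harness.
inline void check_packing() {
    // With 8-bit fields, mode number 256 spills into the neighbouring field...
    assert(pack_old(0, 0, 256) == pack_old(0, 1, 0));
    // ...while 10-bit fields keep the two keys distinct.
    assert(pack_new(0, 0, 256) != pack_new(0, 1, 0));
}
```

The collision in the first assertion is the kind of problem that appears once the mode count no longer fits in 8 bits.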

[PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
Ok so the consensus seems to be to rather stay with 32 bits and only change the shift to 10/20? As MACHINE_MODE_BITSIZE is already 16 we would need an additional check independent of that. Wouldn't that also be a bit confusing? Attached is a "v2" with unsigned long long changed to uint64_t and checking

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
> if (NUM_OPTABS > 0x > || MAX_MACHINE_MODE >= ((1 << MACHINE_MODE_BITSIZE) - 1)) > fatal ("genopinit range assumptions invalid"); > > so it would be a case of changing those instead. Thanks, right at the beginning of the file and I didn't see it ;) MACHINE_MODE_BITSIZE is already 1

Re: [PATCH] RISC-V: Optimize permutation codegen with vcompress

2023-07-11 Thread Robin Dapp via Gcc-patches
> MASK4 0, 5, 6, 7 also works definitely Sure :) My remark was that the tests are all(?) evenly split and a bit more variation would have been nice. Not that it doesn't work, I'm OK with it as is. Regards Robin

Re: [PATCH] RISC-V: Optimize permutation codegen with vcompress

2023-07-11 Thread Robin Dapp via Gcc-patches
> The compress optimization pattern has included all variety. > It's not necessary to force split (half/half), we can apply this compress > pattern to any variety of compress pattern. Yes, that's clear. I meant the testcases are mostly designed like MASK4 1, 2, 6, 7 instead of variation like M

Re: [PATCH] RISC-V: Optimize permutation codegen with vcompress

2023-07-11 Thread Robin Dapp via Gcc-patches
Hi Juzhe, looks good from my side, thanks. While going through it I thought of some related cases that we could still handle differently but I didn't bother to formalize them for now. Most likely we already handle them in the shortest way anyway. I'm going to check on that when I find some time

[PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Robin Dapp via Gcc-patches
Hi, upcoming changes for RISC-V will have us exceed 256 modes or 8 bits. The helper functions in gen* rely on the opcode as well as two modes fitting into an unsigned int (a signed int even if we consider the qsort default comparison function). This patch changes the type of the index/hash from u

Re: [PATCH V4] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-07 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, the somewhat unified modulo is IMHO more readable. Could probably still be improved but OK with me for now. > + if (is_dummy_len) > + { > + rtx dummy_len = gen_reg_rtx (Pmode); Can we call this is_vlmax_len/is_vlmax and vlmax_len or so? > + if (inner

Re: [PATCH V3] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-07 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, that's quite a chunk :) and it took me a while to go through it. > @@ -564,7 +565,14 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, > poly_int64 maxval) > static rtx > gen_const_vector_dup (machine_mode mode, poly_int64 val) > { > - rtx c = gen_int_mode (val, GET_

Re: [PATCH v5] RISC-V: Fix one bug for floating-point static frm

2023-07-06 Thread Robin Dapp via Gcc-patches
Hi Pan, thanks, I think that works for me as I'm expecting these parts to change a bit anyway in the near future. There is no functional change to the last revision that Kito already OK'ed so I think you can go ahead. Regards Robin

[PATCH] RISC-V: Change truncate to float_truncate in narrowing

2023-07-05 Thread Robin Dapp via Gcc-patches
Hi, Juzhe noticed that several floating-point conversion tests FAIL on 32 bit. This is due to the autovect FP narrowing patterns using a truncate instead of a float_truncate which results in a combine ICE. It would try to e.g. simplify a unary operation by simplify_const_unary_operation which ob

[PATCH] RISC-V: Support variable index in vec_extract.

2023-07-05 Thread Robin Dapp via Gcc-patches
Hi, this patch adds a gen_lowpart in the vec_extract expander so it properly works with a variable index and adds tests. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md: Add gen_lowpart. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c:

[PATCH] RISC-V: Allow variable index for vec_set.

2023-07-05 Thread Robin Dapp via Gcc-patches
Hi, this patch enables a variable index for vec_set and adjusts/cleans up the tests. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md: Allow register index operand. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Adjust test.

Re: [PATCH] gimple-isel: Recognize vec_extract pattern.

2023-07-05 Thread Robin Dapp via Gcc-patches
>> + _4 = vD.2208; >> + _5 = .VEC_EXTRACT (_4, idx_2(D)); >> + _3 = _5; */ > > I think you are doing > > _3 = .VEC_EXTRACT (_4, idx_2(D)); > > and avoiding the SSA name copy correctly. Can you double-check? > > OK with the comment adjusted. Argh, yes, thanks. Regards

Re: [PATCH v4] RISC-V: Fix one bug for floating-point static frm

2023-07-05 Thread Robin Dapp via Gcc-patches
Hi Pan, yes, the problem is fixed for me. Still some comments ;) Sorry it took a while. > 1. By default, the RVV floating-point will take dyn mode. > 2. DYN is invalid in FRM register for RVV floating-point. > > When mode switching the function entry and exit, it will take DYN as > the frm mod

Re: [PATCH v4] RISC-V: Fix one bug for floating-point static frm

2023-07-05 Thread Robin Dapp via Gcc-patches
> LGTM, thanks :) just a moment please, I still wanted to reply ;) Regards Robin

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-07-04 Thread Robin Dapp via Gcc-patches
> Kito (or somebody else), would you mind doing a RISC-V bootstrap? It would > take forever on my machine. Thank you. I did a bootstrap myself now and it finally finished. Going to commit the attached tomorrow. Regards Robin Subject: [PATCH] Change MODE_BITSIZE to MODE_PRECISION for MODE_VECT

Re: [PATCH] gimple-isel: Recognize vec_extract pattern.

2023-07-04 Thread Robin Dapp via Gcc-patches
Hi Richard, changed the patch according to your comments and I agree that it is more readable that way. I hope using lhs as target for the extract directly is possible the way I did it. Richard's patch for aarch64 is already in, therefore testsuites on aarch64 and i386 are unchanged. Regards Robi

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-04 Thread Robin Dapp via Gcc-patches
> Just revert this patch, it reports some weird illegal instr, I may > need more time for this. The illegal instruction is due to the wrong rounding mode. We set 5 instead of 7 because the two enums don't match. A simple but ugly fix would be two dummy entries so that FRM_MODE_DYN is entry 7 in
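The mismatch and the "two dummy entries" workaround described above can be sketched like this. The frm encodings (0 to 4 for the static modes, 7 for dynamic, 5 and 6 reserved) follow the RISC-V ISA; the enum and enumerator names are hypothetical and not the ones used in the GCC sources:

```cpp
#include <cassert>

// Hardware encodings of the frm field; 0b101 and 0b110 are reserved.
enum frm_encoding { ENC_RNE = 0, ENC_RTZ = 1, ENC_RDN = 2, ENC_RUP = 3, ENC_RMM = 4, ENC_DYN = 7 };

// An enum numbered sequentially: DYN silently becomes entry 5.
enum frm_mode_sequential { SEQ_RNE, SEQ_RTZ, SEQ_RDN, SEQ_RUP, SEQ_RMM, SEQ_DYN };

// The simple but ugly fix: two dummy entries so that DYN is entry 7.
enum frm_mode_padded {
    PAD_RNE, PAD_RTZ, PAD_RDN, PAD_RUP, PAD_RMM,
    PAD_DUMMY_5, PAD_DUMMY_6,
    PAD_DYN
};

static_assert(static_cast<int>(SEQ_DYN) == 5, "sequential numbering picks a reserved encoding");
static_assert(static_cast<int>(PAD_DYN) == static_cast<int>(ENC_DYN), "padded enum matches the hardware");

// Small self-check; call it from main() or a test harness.
inline void check_frm_enums() {
    assert(static_cast<int>(SEQ_DYN) != static_cast<int>(ENC_DYN));
    assert(static_cast<int>(PAD_RMM) == static_cast<int>(ENC_RMM));
}
```

Writing a reserved value such as 5 into frm makes a later floating-point instruction that uses the dynamic rounding mode trap, which would fit the illegal-instruction report.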

Re: [PATCH v1] RISC-V: Use FRM_DYN when add the rounding mode operand

2023-07-04 Thread Robin Dapp via Gcc-patches
Hi Pan, in general this looks good to me. I would have expected the change in the other patch I just looked at though ;) Sure it's intrinsics this time but the same principle. Regards Robin

Re: [PATCH v2] RISC-V: Fix one bug for floating-point static frm

2023-07-04 Thread Robin Dapp via Gcc-patches
Hi Pan, I only just now got back to my mails and I'm a bit confused about the several patches related to rounding mode. > 1. By default, the RVV floating-point will take dyn mode. Here you are referring to 10.1 in the spec I assume. Could we add this as a comment in the code? > 2. DYN is inval

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-03 Thread Robin Dapp via Gcc-patches
> Sorry for inconvenient, still working on fix it. If urgent I can > revert this change to unblock your work ASAP. I'm not blocked by this, thanks, just wanted to document it here. I was testing another patch and needed to dig for a while until I realized the FAILs come from this one. In general

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-03 Thread Robin Dapp via Gcc-patches
Hmm, looks like it wasn't simple enough... I'm seeing execution fails for various floating point test cases. This is due to a mismatch between the FRM_DYN definition (0b111 == 7) and the attribute value (== 5). Therefore we set the rounding mode to 5 instead of 7. Regards Robin

Re: [VSETVL PASS] RISC-V: Optimize local AVL propagation

2023-07-03 Thread Robin Dapp via Gcc-patches
LGTM. Regards Robin

[PATCH] gimple-isel: Recognize vec_extract pattern.

2023-07-03 Thread Robin Dapp via Gcc-patches
Hi, In gimple-isel we already deduce a vec_set pattern from an ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a vec_extract. The code is largely similar to the vec_set one including the addition of a can_vec_extract_var_idx_p function in optabs.cc to check if the backend can handle a

Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Robin Dapp via Gcc-patches
> Similar to LEN_MASK_LOAD/STORE, their orders are consistent now after > this patch. Ah right, apologies. Regards Robin

Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Robin Dapp via Gcc-patches
Hi Juzhe, when changing the argument order for LEN_LOAD/LEN_STORE, you will also need to adjust rs6000's and s390's expanders. Regards Robin

Re: [PATCH 2/2] ifcvt: Allow more operations in multiple set if conversion

2023-07-03 Thread Robin Dapp via Gcc-patches
Hi Manolis, that looks like a nice enhancement of what's already possible. The concern I had some years back already was that this function would eventually grow and cannibalize some of what the other functions in ifcvt already do :) At some point we really should unify but that's not within

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
On 7/3/23 10:45, juzhe.zh...@rivai.ai wrote: > We can apply it but not sure why the patchwork shows it's rejected. I believe it also failed for me locally because the order of patterns in autovec-opt.md was somehow different. The one attached worked for me though after some minor merge adjustment

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
> We failed to merge it since it's been rejected. > https://patchwork.sourceware.org/project/gcc/patch/20230628041512.188243-1-juzhe.zh...@rivai.ai/ Err, who rejected? Or is this about the

Re: [PATCH v1] RISC-V: Fix one typo of FRM dynamic definition

2023-07-03 Thread Robin Dapp via Gcc-patches
> Thanks for fixing it. LGTM. > I think you can merge it when Robin is ok since this is a simple typo > fix. Yes, that's definitely simple enough :) Regards Robin

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
> Thanks. Ok for trunk? OK from my side. As agreed with Jeff, I'm going to get back to this and revisit/change if needed in the future. Regards Robin

Re: [PATCH] RISC-V: Support vfwnmacc/vfwmsac/vfwnmsac combine lowering

2023-07-03 Thread Robin Dapp via Gcc-patches
To reiterate, this is OK from my side. As discussed in the other thread, Jeff would like to have more info on whether a bridge pattern is needed at all and I agreed to get back to it in a while. Until then, we can merge this. Regards Robin

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-07-01 Thread Robin Dapp via Gcc-patches
> There has to be some kind of mismatch between the patch or testcase > or what we're looking at to judge success. Yeah I think the initially posted example was misleading because it contained an already working example. > While I really don't see the need to have the bridge pattern, I'm > still

Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

2023-06-30 Thread Robin Dapp via Gcc-patches
> The explicit conversions I see are because we need the output of the > conversion in multiple vfmul instructions. That won't be helped by > the patch you've proposed. FWIW on my local branch with the patch applied I see that the vfwmuls are being generated (all of the vfmuls are replaced). > It

Re: [PATCH V2] Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} pattern

2023-06-29 Thread Robin Dapp via Gcc-patches
> I am not sure why you mention 'len' in bytes. The 'trick' for the len_load/len_store patterns is to allow a QImode/byte-only length rather than elements. Regards Robin

Re: [PATCH] Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} pattern

2023-06-29 Thread Robin Dapp via Gcc-patches
> I personally prefer **NOT** to include BIAS in the gather/scatter > since I don't known how it will be used. It was not my intention to suggest adding BIAS here. This can be done by the respective targets when/if they support mask_*, not by you. What I meant is that I'm unsure whether to add a

Re: [PATCH] Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} pattern

2023-06-29 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just looking at the documentation changes. > +@cindex @code{len_mask_gather_load@var{m}@var{n}} instruction pattern > +@item @samp{len_mask_gather_load@var{m}@var{n}} > +Like @samp{gather_load@var{m}@var{n}}, but takes an extra len operand > +as operand 5 and an extra mask operand as op

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Robin Dapp via Gcc-patches
>> Hi Robin: >> >>> diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc >>> index 52d7626e92e..14d419c2013 100644 >>> --- a/gcc/lto/lto-lang.cc >>> +++ b/gcc/lto/lto-lang.cc >>> @@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p) >>>else if (GET_MODE_CLASS (mode) == M

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Robin Dapp via Gcc-patches
> This should probably use GET_MODE_PRECISION as well. > > OK if it bootstraps/tests on both aarch64 and riscv. > > Richard. I found several other instances, also in the frontends, that I'm not exactly sure about. I'm currently testing this but aarch64 bootstrap is still going to take a while,

Re: [PATCH] Prevent TYPE_PRECISION on VECTOR_TYPEs

2023-06-29 Thread Robin Dapp via Gcc-patches
>> Since nobody else has provided a patch yet, is the attached OK as long >> as x86 bootstrap and testsuite are clean? > > Yes. Bootstrap and testsuite are good. Going to commit. Thanks. Regards Robin

Re: [PATCH] Prevent TYPE_PRECISION on VECTOR_TYPEs

2023-06-29 Thread Robin Dapp via Gcc-patches
Ah, the one sub-thread continued before you were CC'ed. Sorry about that. Regards Robin

Re: [PATCH] Prevent TYPE_PRECISION on VECTOR_TYPEs

2023-06-29 Thread Robin Dapp via Gcc-patches
> Currently, I have no ideal how to walk around this ICE in RISC-V port. > Do you have any suggestions? I'm already bootstrapping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623184.html I replied to all but it seems you got lost in the thread? Regards Robin

Re: [PATCH] Prevent TYPE_PRECISION on VECTOR_TYPEs

2023-06-29 Thread Robin Dapp via Gcc-patches
> I think GCN people wanted to make this code work for vectors, the > most obvious local fix is to use element_precision (type) above. > > Note usually vector integer divisions are not a thing so this might > explain why you're seeing this only with RVV? Since nobody else has provided a patch yet

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Robin Dapp via Gcc-patches
> Yeah, that part is OK, and was the case I was thinking about when > I said OK yesterday. But now that we allow BITSIZE != PRECISION, > it's possible for BITSIZE - PRECISION to be more than a full byte, > in which case the new loop would not initialise every byte of > the mode. Ah, I see, so whe
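The byte-coverage concern quoted above comes down to simple arithmetic. A sketch, assuming the loop in question walks ceil(precision / 8) bytes while the mode occupies ceil(bitsize / 8) bytes; the helper names and the example sizes are illustrative only:

```cpp
#include <cassert>

// Number of whole bytes needed to hold the given number of bits.
constexpr unsigned bytes_for_bits(unsigned bits) {
    return (bits + 7) / 8;
}

// Bytes of the mode that a precision-bounded loop would never touch.
constexpr unsigned untouched_bytes(unsigned precision, unsigned bitsize) {
    return bytes_for_bits(bitsize) - bytes_for_bits(precision);
}

// Padding within a single byte is harmless for a byte-wise loop...
static_assert(untouched_bytes(2, 8) == 0, "same byte count");
// ...but once the gap crosses a byte boundary, whole bytes are skipped.
static_assert(untouched_bytes(2, 16) == 1, "one byte left uninitialised");

// Small self-check; call it from main() or a test harness.
inline void check_byte_coverage() {
    assert(bytes_for_bits(1) == 1);
    assert(untouched_bytes(4, 32) == 3);
}
```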

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Robin Dapp via Gcc-patches
>>> are we absolutely sure this is the only problem we will have >>> with precision != bitsize and it is confined to the backend? > Yes. With vinfo.vector_mode == VNx4SI mask_type = get_mask_type_for_scalar_type (vinfo, int) mask_type is: vector(4) I.e. the precision is 2. This is definitely

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Robin Dapp via Gcc-patches
I grep'ed a bit and found several more instances of the same pattern (frontend-related mostly but also in native_encode_rtx). Most likely they would all have to be adjusted? > Sorry, only realised later, but: if the precision can cover fewer > bytes t

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-28 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I find the bug description rather confusing. What I can see is that the constant in the literal pool is indeed wrong but how would DSE or so play a role there? Particularly only for the smaller modes? My suspicion would be that the constant in the literal/constant pool is wrong from s

Re: [PATCH] match.pd: Use element_mode instead of TYPE_MODE.

2023-06-27 Thread Robin Dapp via Gcc-patches
> You can put it into the original one. Bootstrap and testsuite run were successful. I'm going to push the attached, thanks. Regards Robin diff --git a/gcc/match.pd b/gcc/match.pd index 33ccda3e7b6..83bcefa914b 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -7454,10 +7454,12 @@ DEFINE_INT_AND_

Re: [PATCH] match.pd: Use element_mode instead of TYPE_MODE.

2023-06-27 Thread Robin Dapp via Gcc-patches
> so I suggest to do a similar VECTOR_MODE_P check and your original test. > So > > && (!VECTOR_MODE_P (TYPE_MODE (newtype)) > || target_supports_op_p (newtype, op, optab_default)) > > OK with that change. Separate patch or into the original one? We needed element_mode because T

Re: [PATCH] match.pd: Use element_mode instead of TYPE_MODE.

2023-06-27 Thread Robin Dapp via Gcc-patches
> Yeah, the optab should already have the fallback of WIDENing here? > So why does that fail? We reach if (CLASS_HAS_WIDER_MODES_P (mclass)) which returns false because mclass == MODE_VECTOR_FLOAT. CLASS_HAS_WIDER_MODES_P only handles non-vector classes? Same for FOR_EACH_WIDER_MODE that follows.

Re: [PATCH] match.pd: Use element_mode instead of TYPE_MODE.

2023-06-27 Thread Robin Dapp via Gcc-patches
> Why does the expander not have a fallback here? If we put up > restrictions like this like we do for vector operations (after > vector lowering!), we need to document this. Your check covers > more than just FP16 types as well which I think is undesirable. I'm not sure I follow. What would we

Re: [PATCH] match.pd: Use element_mode instead of TYPE_MODE.

2023-06-26 Thread Robin Dapp via Gcc-patches
> Can you push the element_mode change separately please? OK. > I'd like to hear more reasoning of why target_supports_op_p is wanted > here. Doesn't target_supports_op_p return false if this is for example > a soft-fp target? So if at all, shouldn't the test only be carried > out if the origin

[PATCH] RISC-V: Add autovect widening/narrowing Integer/FP conversions.

2023-06-26 Thread Robin Dapp via Gcc-patches
Hi, this patch implements widening and narrowing float-to-int and int-to-float autovec conversions and adds tests. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (2): New expander. (2): Ditto. (2): Ditto. (2): Ditto. * config/riscv/vector-it

[PATCH] RISC-V: Add autovec FP widening/narrowing.

2023-06-26 Thread Robin Dapp via Gcc-patches
Hi, this patch adds FP widening and narrowing autovec expanders as well as tests. Conceptually similar to integer extension/truncation, we emulate _Float16 -> double by two vfwcvts and double -> _Float16 by two vfncvts. Optimizations to create widening operations will be added separately. Regar

[PATCH] RISC-V: Add autovec FP int->float conversion.

2023-06-26 Thread Robin Dapp via Gcc-patches
Hi, this patch adds the autovec expander for vfcvt.f.x.v and tests for it. In addition, it modifies the zfhmin-1 test so it doesn't scan for "no vectorization" but rather checks that we do not emit any (RTL) vector operations (other than float/float conversions) with a VNx..HFmode. Regards Robin

[PATCH] match.pd: Use element_mode instead of TYPE_MODE.

2023-06-26 Thread Robin Dapp via Gcc-patches
Hi, this patch changes TYPE_MODE into element_mode in a match.pd simplification. As the simplification can be called with vector types, real_can_shorten_arithmetic would ICE in REAL_MODE_FORMAT, which expects a scalar mode. Therefore, use element_mode instead of TYPE_MODE. Additionally, check if
