Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-10-02 Thread Robin Dapp
> Conceptually the rounding mode is just a property. The call, in > effect, should demand a "normal" rounding mode and set the rounding > mode to unknown if I understand how this is supposed to work. If my > understanding is wrong, then maybe that's where we should start -- > with a good descript

Re: [PATCH] RFC: Add late-combine pass [PR106594]

2023-10-02 Thread Robin Dapp
Hi Richard, cool, thanks. I just gave it a try with my test cases and it does what it is supposed to do, at least if I disable the register pressure check :) A cursory look over the test suite showed no major regressions and just some overly specific tests. My test case only works before split,

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-04 Thread Robin Dapp
Hi Tamar, > I can't approve but hope you don't mind the review, Not at all, greatly appreciated. I incorporated all your remarks apart from this: > Isn't vec_opmask NULL for SLP? You probably need to read it from > vec_defs for the COND_EXPR Above that I gcc_assert (!slp_node) for the IFN_COND

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-04 Thread Robin Dapp
Ping^2. I realize it's not very elegant as of now. If there's a better/shorter way to solve this feel free to suggest :) Regards Robin

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-04 Thread Robin Dapp
> + gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB); I forgot to add the other IFN_CONDs here before sending. So with - gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB); + gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB + || code == IF

Re: [PATCH] RISC-V: Remove @ of vec_series

2023-10-04 Thread Robin Dapp
I'm currently in the process of removing some unused @s. This is OK. Regards Robin

Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

2023-10-04 Thread Robin Dapp
>> I think the "max poly value" is the LMUL 1 mode coeffs[1] >> >> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1]; >> >> So I think bump max_power to exact_log2 (64); is not enough. >> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN. >> >> I suspect the testcase you append in th

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-05 Thread Robin Dapp
Hi Tamar, > So in the > > if (slp_node) > { > > Add something like: > > If (is_cond_op) > { > if (dump_enabled_p ()) > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, >"left fold reduction on SLP not supported.\n"); > return false; >

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-05 Thread Robin Dapp
Ah, sorry, read your remark incorrectly. Will try again. Regards Robin

Re: [RFC] expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466]

2023-10-05 Thread Robin Dapp
> So I think Kenner's code is trying to prevent having a value in a > SUBREG that is inconsistent with the SUBREG_PROMOTED* flag bits. But > I think it's been unnecessary since Matz's rewrite in 2009. I couldn't really tell what the rewrite does entirely so I tried creating a case where we would

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-05 Thread Robin Dapp
Hi Tamar, > The only comment I have is whether you actually need this helper > function? It looks like all the uses of it are in cases you have, or > will call conditional_internal_fn_code directly. removed the cond_fn_p entirely in the attached v3. Bootstrapped and regtested on x86_64, aarch64 a

Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

2023-10-05 Thread Robin Dapp
> Your suggested code seems work fine, let me run more test and send > v2, I guess I just don’t know how to explain why it work in comment > :p If it's too convoluted maybe we should rather not use it :D The idea is for factor % (vlenb / potential_div) == 0 we're actually looking for the larges

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-06 Thread Robin Dapp
> ... here we probably get PLUS_EXPR for MINUS_EXPR above but IIRC > for MINUS_EXPR the !as_initial case should return positive zero. > > Can you double-check? You're referring to the canonicalization from a - CST to a + -CST so that the neutral op would need to change with it? Argh, good point.
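
The point about the neutral element can be checked with a small experiment. The sketch below is not taken from the patch; the helper names are hypothetical. It assumes IEEE 754 doubles in the default round-to-nearest mode and tests which zero leaves both +0.0 and -0.0 unchanged, sign included, under addition and under subtraction.

```cpp
#include <cmath>

// Hypothetical helper: true if x + neutral keeps the sign of x for both zeros.
bool is_neutral_for_plus(double neutral) {
  const double zeros[] = {0.0, -0.0};
  for (double x : zeros) {
    const double r = x + neutral;
    if (std::signbit(r) != std::signbit(x)) return false;
  }
  return true;
}

// Hypothetical helper: true if x - neutral keeps the sign of x for both zeros.
bool is_neutral_for_minus(double neutral) {
  const double zeros[] = {0.0, -0.0};
  for (double x : zeros) {
    const double r = x - neutral;
    if (std::signbit(r) != std::signbit(x)) return false;
  }
  return true;
}
```

Under these assumptions -0.0 is neutral for addition and +0.0 is neutral for subtraction, so rewriting a - CST as a + -CST changes which zero is safe to use. In round-toward-negative mode the sum of opposite-signed zeros is -0.0 instead, which is presumably why sign-dependent rounding comes up in the same discussion.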

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-06 Thread Robin Dapp
> We might need a similar assert > > gcc_assert (HONOR_SIGNED_ZEROS (vectype_out) > && !HONOR_SIGN_DEPENDENT_ROUNDING (vectype_out));? erm, obviously not that exact assert but more something like if (HONOR_SIGNED_ZEROS && !HONOR_SIGN_DEPENDENT_ROUNDING...) { i

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-06 Thread Robin Dapp
> So if you think you got everything correct the patch is OK as-is, > I just wasn't sure - maybe the neutral_element change deserves > a comment as to how MINUS_EXPR is handled. Heh, I never think I got everything correct ;) Added this now: static bool fold_left_reduction_fn (code_helper code,

Re: [PATCH] TEST: Fix dump FAIL of vect-multitypes-16.c for RVV

2023-10-09 Thread Robin Dapp
> Maybe I should pretend RVV support vect_pack/vect_unpack and enable > all the tests in target-supports.exp? The problem is that vect_pack/unpack is an overloaded term at the moment meaning "vector conversion" (promotion/demotion) or so. This test does not require pack/unpack for successful vect

Re: [PATCH] RISC-V: Support movmisalign of RVV VLA modes

2023-10-09 Thread Robin Dapp
Hi Juzhe, I think an extra param might be too intrusive. I would expect normal hardware implementations to support unaligned accesses (but they might be slow which should be covered by costs) and only rarely have hardware that doesn't support it and raises exceptions. Therefore I would suggest t

Re: [PATCH V2] TEST: Fix vect_cond_arith_* dump checks for RVV

2023-10-09 Thread Robin Dapp
On 10/9/23 09:32, Andreas Schwab wrote: > On Okt 09 2023, juzhe.zh...@rivai.ai wrote: > >> Turns out COND(_LEN)?_ADD can't work. > > It should work though. Tcl regexps are a superset of POSIX EREs. > The problem is that COND(_LEN)?_ADD matches two times against COND_LEN_ADD and a scan-tree-dum
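
A rough model of the double count, not the actual testsuite code: it assumes the dump-scan helper counts the elements of the list returned by Tcl's `regexp -all -inline`, which holds one entry for the full match plus one per capture group. The function name is hypothetical and `std::regex` only stands in for the Tcl engine.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical model: sum, over all matches, of the full match plus every
// capture group, mirroring the list length of Tcl's `regexp -all -inline`.
std::size_t inline_all_length(const std::string& pattern,
                              const std::string& text) {
  const std::regex re(pattern);
  std::size_t count = 0;
  for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end;
       ++it) {
    count += it->size();  // 1 for the match + number of capture groups
  }
  return count;
}
```

In this model `COND(_LEN)?_ADD` yields two elements for a single occurrence of `COND_LEN_ADD`, while the non-capturing form `COND(?:_LEN)?_ADD` yields one.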

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-09 Thread Robin Dapp
> It'd be good to expand on this comment a bit. What kind of COND are you > anticipating? A COND with the neutral op as the else value, so that the > PLUS_EXPR (or whatever) can remain unconditional? If so, it would be > good to sketch briefly how that happens, and why it's better than using > t

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-09 Thread Robin Dapp
> Hmm, the function is called at transform time so this shouldn't help > avoiding the ICE. I expected we refuse to vectorize _any_ reduction > when sign dependent rounding is in effect? OTOH maybe sign-dependent > rounding is OK but only when we use a unconditional fold-left > (so a loop mask fro

Re: [PATCH V2] RISC-V: Support movmisalign of RVV VLA modes

2023-10-09 Thread Robin Dapp
Thanks, for now this LGTM. Regards Robin

Re: [PATCH] RISC-V: Enable full coverage vect tests

2023-10-10 Thread Robin Dapp
Hi Juzhe, seems OK to me. We don't support most of the patterns directly but as we can and want to vectorize them it makes sense to enable the tests. Regards Robin

Re: [PATCH] RISC-V: Remove XFAIL of ssa-dom-cse-2.c

2023-10-11 Thread Robin Dapp
LGTM FWIW. Regards Robin

Re: [PATCH V2] RISC-V: Fix incorrect index(offset) of gather/scatter

2023-10-11 Thread Robin Dapp
Hi Juzhe, good that you noticed it now, I should have caught that in the review back then... One thing, though: > + if (inner_offsize < GET_MODE_BITSIZE (GET_MODE (ptr)).to_constant ()) Shouldn't ptr always be Pmode i.e. the bitsize == XLEN? Rest LGTM. Regards Robin

Re: [PATCH V3] RISC-V: Fix incorrect index(offset) of gather/scatter

2023-10-11 Thread Robin Dapp
LGTM, thanks. Regards Robin

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-11 Thread Robin Dapp
2b50b96f889dc0a9c2fc8e50859 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 13 Sep 2023 22:19:35 +0200 Subject: [PATCH v4] ifcvt/vect: Emit COND_ADD for conditional scalar reduction. As described in PR111401 we currently emit a COND and a PLUS expression for conditional reductions. T

[PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-12 Thread Robin Dapp
Hi, on riscv insn-emit.cc has grown to over 1.2 million lines of code and compiling it takes considerable time. Therefore, this patch adjusts genemit to create ten files insn-emit-1.cc to insn-emit-10.cc. In order to do so it first counts the number of available patterns, calculates the number of patt
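
The count-then-split idea can be sketched as follows. This is not genemit's code: the function names are hypothetical, and it assumes patterns are assigned to output files in order, in equally sized chunks obtained by rounding up.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: patterns per output file, rounded up so that `files`
// files are always enough; never returns 0 to keep later divisions safe.
std::size_t patterns_per_file(std::size_t total, std::size_t files) {
  return std::max<std::size_t>(1, (total + files - 1) / files);
}

// Hypothetical helper: zero-based index of the file that receives the
// pattern with zero-based index `pattern`.
std::size_t file_index_for(std::size_t pattern, std::size_t total,
                           std::size_t files) {
  return pattern / patterns_per_file(total, files);
}
```

With 95 patterns and ten files, for example, each file takes up to ten patterns and the last file holds the remaining five.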

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-13 Thread Robin Dapp
> Testsuite is unchanged on all but x86 where, strangely, I saw several > illegal instructions in the pch tests. Those were not reproducible > in a second manual test suite run. I'm just running another full > bootstrap and testsuite cycle with the latest trunk. Follow-up on the pch tests. The

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-13 Thread Robin Dapp
> Hmm why? The same callback you use to consume the listed arguments > can be used to consume the list can it not? I may be wrong, but from > what I remember the callback is called when main can't consume an > argv value and it's allowed to eat all remaining input? Ah, I see. If that's possible

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-13 Thread Robin Dapp
> Why are the contents of this if statement wrong for COND_LEN? > If the "else" value doesn't matter, then the masked form can use > the "then" value for all elements. I would have expected the same > thing to be true of COND_LEN. Right, that one was overly pessimistic. Removed. > But isn't the

Re: [PATCH] RISC-V: Use VLS modes if the NITERS is known and smaller than VLS mode elements.

2023-10-16 Thread Robin Dapp
Hi Juzhe, this LGTM. I was first concerned whether we would want to stop e.g. at LMUL = 1 and only continue with a specific flag but actually this should be done via the costs. If an implementation wants to penalize or incentivize some behavior it can always adjust the costs which should be suff

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-16 Thread Robin Dapp
ioning does not help but also doesn't slow anything down. insn-emit.cc isn't very large to begin with on s390. Regards Robin From 34d05113a4e3c7e83a4731020307e26c1144af69 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 12 Oct 2023 11:23:26 +0200 Subject: [PATCH v2] genemit: Spl

Re: [PATCH V2] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store

2023-10-16 Thread Robin Dapp
Hi Juzhe, > +/* Get STORE value. */ > +static tree > +get_store_value (gimple *stmt) > +{ > + if (is_gimple_call (stmt) && gimple_call_internal_p (stmt)) > +{ > + if (gimple_call_internal_fn (stmt) == IFN_MASK_STORE) > + return gimple_call_arg (stmt, 3); > + else > + gcc_un

Re: [PATCH V3] RISC-V: Fix unexpected big LMUL choosing in dynamic LMUL model for non-adjacent load/store

2023-10-16 Thread Robin Dapp
> + if (live_range && flow_bb_inside_loop_p (loop, e->src)) > + { Doesn't this match several cases more than before, i.e. set the range start to zero fairly often? I mean, if it works, that's fine with me, and the code is easier to read. Please split off the search for the non-contiguo

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-17 Thread Robin Dapp
> Natively, things seem fine, but for cross, I get failures on a few > targets (hppa2.0-unknown-linux-gnu, hppa64-unknown-linux-gnu). > > With ./configure --host=x86_64-pc-linux-gnu > --target=hppa2.0-unknown-linux-gnu --build=x86_64-pc-linux-gnu && make > -j$(nproc), I get a bunch of stuff like:

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
>> I don't know much about valueisation either :) But it does feel >> like we're working around the lack of a LEN form of COND_EXPR. >> In other words, it seems odd that we can do: >> >> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >> >> but we can't do: >> >> IFN_COND_LEN (mask, a, b, len, bia

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Robin Dapp
Thank you for the explanation. So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along with the respective helper and expand functions, what would be the way forward? Generate an IFN_VCOND_MASK(_LEN) here instead of a VEC_COND_EXPR? How would I make sure all of match.pd's vec_cond opt

Re: [PATCH] RISC-V: Optimize consecutive permutation index pattern by vrgather.vi/vx

2023-10-18 Thread Robin Dapp
LGTM. Regards Robin

[PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
Hi, as I didn't manage to get back to the generic vectorizer fallback for popcount in time (still the generic costing problem) I figured I'd rather implement the popcount fallback in the riscv backend. It uses the WWG algorithm from libgcc. rvv.exp is unchanged, vect and dg.exp testsuites are cur
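
For reference, a scalar sketch of the bit-parallel popcount usually attributed to Wilkes, Wheeler and Gill, written for 64-bit values. It only illustrates the arithmetic; it is not the riscv expander, and the function name is hypothetical.

```cpp
#include <cstdint>

// Bit-parallel population count for a 64-bit value.
int popcount64_wwg(std::uint64_t x) {
  // Each 2-bit field now holds the number of set bits it contained.
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  // Sum adjacent 2-bit fields into 4-bit fields.
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  // Sum adjacent 4-bit fields into bytes.
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  // The multiplication accumulates all byte counts in the top byte.
  return static_cast<int>((x * 0x0101010101010101ULL) >> 56);
}
```

Every step is a shift, mask, add or multiply, so the same sequence maps element-wise onto vector instructions.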

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess will failed SLP on > popcount. Hehe, right, I just copied and pasted the expander from my old patch. Will adjust it and add the test. Regards Robin

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> I saw you didn't extend VI -> V_VLSI. I guess will failed SLP on popcount. Added VLS modes and your test in v2. Testsuite looks unchanged on my side (vect, dg, rvv). Regards Robin Subject: [PATCH v2] RISC-V: Add popcount fallback expander. I didn't manage to get back to the generic vectoriz

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
I didn't push this yet because it would have introduced an UNRESOLVED that my summary script didn't catch. Normally I go with just contrib/test_summary but that only filters out FAIL and XPASS. I should really be using compare_testsuite_log.py from riscv-gnu-toolchain/scripts. It was caused by a

[PATCH] vect: Allow same precision for bit-precision conversions.

2023-10-18 Thread Robin Dapp
Hi, even though there was no full conclusion yet I took the liberty of just posting this as a patch in case of further discussion. In PR/111794 we miss a vectorization because on riscv type precision and mode precision differ for mask types. We can still vectorize when allowing assignments with

Re: [PATCH] RISC-V: Add popcount fallback expander.

2023-10-18 Thread Robin Dapp
> Could you by the way add this mention this PR: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111791 > Add the test of this PR ? Commented in that PR. This patch does not help there. Regards Robin

Re: [PATCH V3 00/11] Refactor and cleanup vsetvl pass

2023-10-19 Thread Robin Dapp
Hi Lehua, thanks for the extensive rework. I'm going to let Juzhe handle the review since it's his pass and he knows it best. Delegated it to him in patchwork. Regards Robin

Re: [PATCH V2] RISC-V: Fix failed hoist in LICM of vmv.v.x instruction

2023-10-19 Thread Robin Dapp
Hi Juzhe, as discussed off-list this approach generally makes sense to me so the patch LGTM once the vsetvl rework is upstream and settled. Independently, we still need to understand why the more complex broadcast pattern is not hoisted out of the loop. Regards Robin

[PATCH, S390] Avoid LA with base and index on z13

2018-07-05 Thread Robin Dapp
Hi, this patch avoids emitting LA on z13 and later when the address has both an index and a base since a regular add is faster in that case. Regtested on s390x. Regards Robin -- gcc/ChangeLog: 2018-07-05 Robin Dapp * config/s390/s390.c (preferred_la_operand_p): Do not use

[RFC] fwprop address cost changes

2018-07-11 Thread Robin Dapp
Hi, we recently hit a problem where fwprop would not propagate a memory address into an insn because our backend (s390) tells it that the address_cost ()s for an address with index are higher than for one without. Subsequently, should_replace_address () returns false and no propagation is performe

[PATCH, S390] Increase function alignment to 16 bytes

2018-07-11 Thread Robin Dapp
timize("Os"))) void bar () {}; I did not observe that the default alignment, once set, was reset anywhere. Regards Robin -- gcc/ChangeLog: 2018-07-11 Robin Dapp * config/s390/s390.c (s390_default_align): Set default function alignment. (s390_override_option

Re: [PATCH, S390] Increase function alignment to 16 bytes

2018-07-12 Thread Robin Dapp
ests (which do not all fail without the patch as we can get lucky with the alignment). Regtested on s390x. Regards Robin -- gcc/ChangeLog: 2018-07-12 Robin Dapp * config/s390/s390.c (s390_default_align): Set default function alignment to 16. (s390_over

[PATCH, S390] Avoid LA with base and index on z13

2018-07-16 Thread Robin Dapp
d the comment. Regards Robin -- gcc/ChangeLog: 2018-07-16 Robin Dapp * config/s390/s390.c (preferred_la_operand_p): Do not use LA with index register on z196 or later. diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 23c3f3db621..d8b47c6fe67 100644 --- a/g

[PATCH] RISC-V: Allow all const_vec_duplicates as constants.

2023-05-26 Thread Robin Dapp
Hi, as we can always broadcast an integer constant to a vector register allow them in riscv_const_insns. We need as many instructions as it takes to generate the constant and one vmv.vx. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (riscv_const_insns): Allow const_vec_

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-19 Thread Robin Dapp
Ugh, I didn't push yet because with a rebased trunk I am seeing different behavior for some riscv testcases. A reduction is not recognized because there is yet another "double use" occurrence in check_reduction_path. I guess it's reasonable to loosen the restriction for conditional operations her

Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2023-10-20 Thread Robin Dapp
> Sorry for the slow review. TBH I was hoping someone else would pick > it up, since (a) I'm not very familiar with this code, and (b) I don't > really agree with the way that the current code works. I'm not sure the > current dependency checking is safe, so I'm nervous about adding even > more c

[PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-23 Thread Robin Dapp
a vcond_mask_len that results from a simplification with all-one or all-zero mask we could allow constant immediate vectors and expand them to simple len moves in the backend. Regards Robin From bc72e9b2f3ee46508404ee7723ca78790fa96b6b Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 13

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-24 Thread Robin Dapp
as internal_fn_else_index. Testsuite on riscv is unchanged, bootstrap and testsuite on power10 done, aarch64 and x86 still running. Regards Robin From e11ac2b5889558c58ce711d8119ebcd78173ac6c Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 13 Sep 2023 22:19:35 +0200 Subject: [PATCH v6] ifcv

Re: [RFC] RISC-V: elide sign extend when expanding cmp_and_jump

2023-10-25 Thread Robin Dapp
Hi Vineet, I was thinking of two things while skimming the code: - Couldn't we do this in the expanders directly? Or is the subreg-promoted info gone until we reach that? - Should some common-code part be more suited to handle that? We already elide redundant sign-zero extensions for ot

Re: [PATCH] RISC-V: Fix multiple EXCESS test FAILs in RVV testsuite

2023-10-25 Thread Robin Dapp
Hi Juzhe, I guess that's OK but what's the problem here? Are the default options wrong so we need to overwrite them instead of adding some? Regards Robin

Re: [PATCH] RISC-V: Change MD attribute avl_type into avl_type_idx[NFC]

2023-10-25 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] RISC-V: Fix multiple EXCESS test FAILs in RVV testsuite

2023-10-25 Thread Robin Dapp
> However, when I built the toolchain with -march=rv64gcv_zfh_zvfh. > Then link fail. Hmm, is it about zvfh or why does linking fail? Regards Robin

Re: [PATCH] RISC-V: Fix multiple EXCESS test FAILs in RVV testsuite

2023-10-25 Thread Robin Dapp
> Hmmm. I am not familiar with Binutils... > > I just adapted tests like others in the testsuite make them consistent. > And turns out it can fix the issues.. I see where you're coming from, but can you assemble/link any executable with -march=..._zvfh? Probably not? Doesn't half of GCC's tests

Re: [RFC] RISC-V: elide sign extend when expanding cmp_and_jump

2023-10-25 Thread Robin Dapp
> Well, it doesn't seem like there's a lot of difference between doing > it in the generic expander bits vs target expander bits -- the former > just calls into the latter for the most part. Thus if the > subreg-promoted state is available in the target expander, I'd expect > it to be available

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Robin Dapp
se expand_direct_optab_fn. For now I kept the expander function but used a direct optab. Regards Robin From 4f793b71184b3301087780ed500f798d69328fc9 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 13 Oct 2023 10:20:35 +0200 Subject: [PATCH v2] internal-fn: Add VCOND_MASK_LEN. In ord

Re: [PATCH] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Robin Dapp
> Increased FAILS are LMUL = M4. I have analyzed the codegen. Looks > reasonable. > > Moreover, When I removed 'popcount_64' and test, all passed no matter > apply this patch or not. > > I think it is because popcount64 is buggy in RV32, this patch trigger > LMUL = 4 bug already existed that we

Re: [PATCH V2] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Robin Dapp
> I have sent V3 with adapting testcases (2 additional dump FAILs detected by > both Pan Li and Patrick). > No need to review. > > I will wait for patrick is ok to ignore popcount FAILs for now then commit it. Just to confirm: I can now also reproduce the popcount fail on my machine without you

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
> Yeah. I think Robin may need this : > > TREE_CODE (else_val) == SSA_NAME > && SSA_NAME_IS_DEFAULT_DEF (else_val) > && VAR_P (SSA_NAME_VAR (else_val)) > > to differentiate whether the ELSE VALUE is uninitialized SSA or not. I think we are talking about a different simplification now. This

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
n x86, aarch64 and power10. Regards Robin From 7acdebb5b13b71331621af08da6649fe08476fe8 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Wed, 25 Oct 2023 22:19:43 +0200 Subject: [PATCH v3] internal-fn: Add VCOND_MASK_LEN. In order to prevent simplification of a COND_OP with degenerate mask (all tru

Re: [Ready to commit V3] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Robin Dapp
> rv32gcv: > FAIL: gfortran.dg/intrinsic_pack_6.f90   -O2  execution test > FAIL: gfortran.dg/intrinsic_pack_6.f90   -O3 -g  execution test > FAIL: gfortran.dg/matmul_3.f90   -O2  execution test > FAIL: gfortran.fortran-torture/execute/intrinsic_matmul.f90 execution,  -O2 > FAIL: gfortran.fortran-t

[PATCH] RISC-V: Fix cond_sqrt tests.

2023-10-26 Thread Robin Dapp
helpers. Regards Robin From f5c43e700c0343ec32d95a28226837e26c1bfdca Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 26 Oct 2023 20:40:00 +0200 Subject: [PATCH] RISC-V: Fix cond_sqrt tests. As long as we do not have universal Zvfh support in binutils linking against libm does not w

[PATCH] RISC-V: Add rawmemchr expander.

2023-10-26 Thread Robin Dapp
Hi, this patch adds a vectorized rawmemchr expander. It's basically strstr but for 8, 16 and 32-byte needles. Apart from adjusting the common-code tests I re-used a similar test that Stefan added to the s390 backend. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (rawmemchr):
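
As a reading aid, a scalar model of what a rawmemchr-style search computes. It assumes the needle widths above correspond to 8-, 16- and 32-bit elements and that, as with glibc's `rawmemchr`, the caller guarantees the needle occurs, so there is no length bound. The template name is hypothetical and this is not the expander itself.

```cpp
// Hypothetical scalar model: return a pointer to the first element equal to
// `needle`. The caller must guarantee that the needle is present, because
// the search has no upper bound.
template <typename T>
const T* rawmemchr_model(const T* haystack, T needle) {
  while (*haystack != needle) {
    ++haystack;
  }
  return haystack;
}
```

A vectorized version would compare a whole vector of elements against the needle per iteration and locate the first hit in the resulting mask.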

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
> +(define_expand "vcond_mask_len_" > +  [(match_operand:V_VLS 0 "register_operand") > +    (match_operand: 3 "nonmemory_operand") > +    (match_operand:V_VLS 1 "nonmemory_operand") > +    (match_operand:V_VLS 2 "autovec_else_operand") > +    (match_operand 4 "autovec_length_operand") > +    (match

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> Suggested adapt codes as follows: > > unsigned int element_size = GET_MODE_SIZE (mode).to_constant (); > poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR *TARGET_MAX_LMUL, > element_size); > if (!get_vector_mode(mode, nunits).exists(&vmode)) >   gcc_unreachable (); Actually I was initiall

Re: [PATCH] RISC-V: Fix wrong tune parameters on int_div

2023-10-27 Thread Robin Dapp
> @@ -346,7 +346,7 @@ static const struct riscv_tune_param rocket_tune_info = { >{COSTS_N_INSNS (4), COSTS_N_INSNS (5)},/* fp_mul */ >{COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */ >{COSTS_N_INSNS (4), COSTS_N_INSNS (4)},/* int_mul */ > - {COSTS_N_INSNS (6), COSTS_N_IN

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> Could you put it into riscv-string.cc rather than riscv-v.cc? I would > like to put those builtin function expander together if possible, > riscv-string.cc might little bit confuse, but it's all included in > string.h Ok, sure. Will commit the adjusted patch if no further comments. Regards Ro

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> I notice we have expand_block_move > in riscv-v.cc > > Maybe we should move it into riscv-string.cc ? Yes I will also move that one. Regards Robin

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
Attached v3 that I'd commit. Regards Robin From 246b986a8ea2332ced7a094dd68d35d84dcbbc04 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Tue, 24 Oct 2023 10:33:15 +0200 Subject: [PATCH v3] RISC-V: Add rawmemchr expander. This patch adds a vectorized rawmemchr expander. It also m

Re: [PATCH] RISC-V: Add rawmemchr expander.

2023-10-27 Thread Robin Dapp
> It seems that you didn't commit it yet. > > A nit comment: > > + int lmul = riscv_autovec_lmul == RVV_DYNAMIC ? RVV_M8 : riscv_autovec_lmul; > > I change you could use TARGET_MAX_LMUL No didn't commit yet, testsuite was still running. OK, added it, will commit later. Regards Robin

Re: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-27 Thread Robin Dapp
s Robin From 248744c328440bff9cc339d2bf622852cbaac343 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 12 Oct 2023 11:23:26 +0200 Subject: [PATCH v3] genemit: Split insn-emit.cc into several partitions. On riscv insn-emit.cc has grown to over 1.2 million lines of code and compiling it takes considerable time. The

Re: [PATCH] RISC-V: Fix bugs of handling scalar of SEW64 vx instruction in RV32

2023-10-30 Thread Robin Dapp
Thanks, LGTM. Regards Robin

[PATCH] RISC-V: Add vector fmin/fmax expanders.

2023-10-30 Thread Robin Dapp
Hi, this patch adds expanders for fmin and fmax and the associated cond and reduc ones. As per RISC-V V spec 1.0 vfmin/vfmax are IEEE 754-2008 compliant so that should be ok. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (3): fmax/fmin expanders. (cond_): Ditt

Re: [PATCH] RISC-V: Add vector fmin/fmax expanders.

2023-10-30 Thread Robin Dapp
> Aren't they actually the IEEE 754-2019 operations (with different > signaling NaN semantics; C functions such as fmaximum in C23), not the > IEEE 754-2008 operations (C functions such as fmax)? V spec 1.0 says "The > vector floating-point vfmin and vfmax instructions have the same behavior >

Re: [PATCH] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-10-31 Thread Robin Dapp
Hi Juzhe, > +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern > +@item @samp{mask_len_strided_load@var{m}@var{n}} > +Load several separate memory locations into a vector of mode m. > +Operand 1 is a scalar base address and operand 2 is mode @var{n} > +specifying each uniform

Re: [PATCH] RISC-V: Add vector fmin/fmax expanders.

2023-10-31 Thread Robin Dapp
Thanks, going to commit the attached. Regards Robin This patch adds expanders for fmin and fmax. As per RISC-V V Spec 1.0 vfmin/vfmax are IEEE 754-2019 compliant which differs from IEEE 754-2008 that fmin/fmax require (particularly in the signaling-NaN handling). Therefore the pattern condition

Re: [PATCH] RISC-V: Support strided load/store

2023-10-31 Thread Robin Dapp
Hi Juzhe, LGTM once the middle-end parts are in. Just tiny nits. Nothing that would warrant a V2, though. > +;; = > +;; == Stried Load/Store missing a 'd' here. > +(define_predicate "vector_stride_extension_operand" > +

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-31 Thread Robin Dapp
>> +int >> +internal_fn_else_index (internal_fn fn) > > The function needs a comment, maybe: > > /* If FN is an IFN_COND_* or IFN_COND_LEN_* function, return the index of the >argument that is used when the condition is false. Return -1 otherwise. > */ > > OK for the internal-fn* and tree

Re: [PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Robin Dapp
LGTM. Regards Robin

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Robin Dapp
> Looks reasonable overall. The new match patterns are 1:1 the > same as the COND_ ones. That's a bit awkward, but I don't see > a good way to "macroize" stuff further there. Can you at least > interleave the COND_LEN_* ones with the other ones instead of > putting them all at the end? Yes, no

Re: [PATCH V2] RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]

2023-11-02 Thread Robin Dapp
Hi Juzhe, in principle this LGTM. It could use some function comments, though ;) > +imm_avl_p (machine_mode mode) > { >poly_uint64 nuints = GET_MODE_NUNITS (mode); > >return nuints.is_constant () > -/* The vsetivli can only hold register 0~31. */ > -? (IN_RANGE (nuints.to_cons

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Could you explain why a special expansion is needed? (Sorry if you already > have and I missed it, bit overloaded ATM.) What does it do that is > different from what expand_fn_using_insn would do? All it does (in excess) is shuffle the arguments - vcond_mask_len has the mask as third operand s

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Ah, OK. IMO it's better to keep the optab operands the same as the IFN > operands, even if that makes things inconsistent with vcond_mask. > vcond_mask isn't really a good example to follow, since the operand > order is not only inconsistent with the IFN, it's also inconsistent > with the natura

Re: [PATCH] Fix configure script comments(!?!) (Was: Re: [PATCH] genemit: Split insn-emit.cc into ten files)

2023-11-06 Thread Robin Dapp
> I'm not sure what that means, whether a wrong version of > autoconf/automake was used (though when I accidentally tried that, it > has always complained loudly) or if some environment difference can > cause this. Perhaps I should change the script not to care about > commits though that won't ha

Re: [PATCH 5/5] RISC-V: Support vmsxx.vx for autovec comparison of vec and imm

2024-03-01 Thread Robin Dapp
Hi Han, in addition to what Juzhe mentioned (and that late-combine is going to handle such cases) it should be noted that register pressure should not be the only consideration here. Many uarchs have a higher latency for register-file-crossing moves. At least without spilling the vv variant is p

Re: [PATCH 1/5] RISC-V: Remove float vector eqne pattern

2024-03-01 Thread Robin Dapp
> 2. When I write if (a == 2) and if (2 == a), the results are > same > > 3. The vec_duplicate operand is the 5th operand in both cmp and > eqne patterns. I think they are equal. A comparison with a constant is always canonicalized to have the constant second, that's why you won't see

Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.

2024-03-01 Thread Robin Dapp
> +  /* Segment load/store permute cost.  */ > +  const int segment_permute_2; > +  const int segment_permute_4; > +  const int segment_permute_8; > > Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 No idea why I posted that (wrong) version, I used it for some testing locally. At

[PATCH] RISC-V: Use vmv1r.v instead of vmv.v.v for fma output reloads [PR114200].

2024-03-06 Thread Robin Dapp
Hi, three-operand instructions like vmacc are modeled with an implicit output reload when the output does not match one of the operands. For this we use vmv.v.v which is subject to length masking. In a situation where the current vl is less than the full vlenb and the fma's result value is used
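
A toy model of the difference between the two moves, under stated assumptions: a register holds eight elements, the tail is left undisturbed, and the type and function names are hypothetical. It is meant only to show why a length-masked copy can lose elements beyond the current vl, not to reproduce the backend code.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kVlmax = 8;  // hypothetical element count per register
using VReg = std::array<int, kVlmax>;

// Model of vmv.v.v: only the first `vl` elements are copied; elements past
// vl keep the destination's old contents (tail-undisturbed is assumed here).
VReg vmv_v_v_model(VReg dest, const VReg& src, std::size_t vl) {
  for (std::size_t i = 0; i < vl && i < kVlmax; ++i) {
    dest[i] = src[i];
  }
  return dest;
}

// Model of vmv1r.v: the whole register is copied regardless of vl.
VReg vmv1r_v_model(const VReg& src) { return src; }
```

With vl = 4, the length-masked copy leaves elements 4 to 7 of the destination stale, whereas the whole-register move reproduces the source exactly.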

[PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-06 Thread Robin Dapp
Hi, r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but PR114196 shows that we also run into the problem without early break. Therefore remove early break from the conditions. gcc/ChangeLog: PR middle-end/114196 * tree-vect-loop-manip.cc (vect_can

Re: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Robin Dapp
> r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early > break but PR114196 shows that we also run into the problem without early > break. Therefore remove early break from the conditions. Forgot: Bootstrapped and regtested on x86, aarch64 and power10. Regtested on riscv64.

Re: [PATCH] vect: Do not peel epilogue for partial vectors [PR114196].

2024-03-07 Thread Robin Dapp
Attached v2 combines the checks. Bootstrapped and regtested on x86 and power10, aarch64 still running. Regtested on riscv64. Regards Robin Subject: [PATCH v2] vect: Do not peel epilogue for partial vectors. r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early break but PR11

Re: [PATCH] RISC-V: Introduce option -mrvv-autovec-max-lmul for RVV autovec

2024-03-14 Thread Robin Dapp
Should it really be called autovec-max-lmul? We also use TARGET_MAX_LMUL for builtins etc. Or are we just following LLVM's naming here? Isn't -mrvv-max-lmul sufficient? > PR target/112648 This PR is not really resolved or affected by the pat

Re: [PATCH v2] RISC-V: Introduce option -mrvv-max-lmul for RVV autovec

2024-03-18 Thread Robin Dapp
LGTM as well. Regards Robin
