Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-05 Thread Richard Biener
On Sun, 5 Nov 2023, Richard Sandiford wrote: > Robin Dapp writes: > >> Ah, OK. IMO it's better to keep the optab operands the same as the IFN > >> operands, even if that makes things inconsistent with vcond_mask. > >> vcond_mask isn't really a good example to follow, since the operand > >> order

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-05 Thread Richard Sandiford
Robin Dapp writes: >> Ah, OK. IMO it's better to keep the optab operands the same as the IFN >> operands, even if that makes things inconsistent with vcond_mask. >> vcond_mask isn't really a good example to follow, since the operand >> order is not only inconsistent with the IFN, it's also incons

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Ah, OK. IMO it's better to keep the optab operands the same as the IFN > operands, even if that makes things inconsistent with vcond_mask. > vcond_mask isn't really a good example to follow, since the operand > order is not only inconsistent with the IFN, it's also inconsistent > with the natura

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Richard Sandiford
Robin Dapp writes: >> Could you explain why a special expansion is needed? (Sorry if you already >> have and I missed it, bit overloaded ATM.) What does it do that is >> different from what expand_fn_using_insn would do? > > All it does (in excess) is shuffle the arguments - vcond_mask_len has t

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-03 Thread Robin Dapp
> Could you explain why a special expansion is needed? (Sorry if you already > have and I missed it, bit overloaded ATM.) What does it do that is > different from what expand_fn_using_insn would do? All it does (in excess) is shuffle the arguments - vcond_mask_len has the mask as third operand s

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Richard Sandiford
Robin Dapp writes: >> Looks reasonable overall. The new match patterns are 1:1 the >> same as the COND_ ones. That's a bit awkward, but I don't see >> a good way to "macroize" stuff further there. Can you at least >> interleave the COND_LEN_* ones with the other ones instead of >> putting them

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Robin Dapp
> Looks reasonable overall. The new match patterns are 1:1 the > same as the COND_ ones. That's a bit awkward, but I don't see > a good way to "macroize" stuff further there. Can you at least > interleave the COND_LEN_* ones with the other ones instead of > putting them all at the end? Yes, no

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-11-02 Thread Richard Biener
On Thu, 26 Oct 2023, Robin Dapp wrote: > Ok, next try. Now without dubious pattern and with direct optab > but still dedicated expander function. > > This will cause one riscv regression in cond_widen_reduc-2.c that > we can deal with later. It is just a missed optimization where > we do not co

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
> +(define_expand "vcond_mask_len_" > +  [(match_operand:V_VLS 0 "register_operand") > +    (match_operand: 3 "nonmemory_operand") > +    (match_operand:V_VLS 1 "nonmemory_operand") > +    (match_operand:V_VLS 2 "autovec_else_operand") > +    (match_operand 4 "autovec_length_operand") > +    (match

Re: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread 钟居哲
From: Robin Dapp Date: 2023-10-26 22:02 To: richard.sandiford CC: rdapp.gcc; gcc-patches; rguenther; juzhe.zh...@rivai.ai Subject: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN. Ok, next try. Now without dubious pattern and with direct optab but still dedicated expander function. This will ca

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
Ok, next try. Now without dubious pattern and with direct optab but still dedicated expander function. This will cause one riscv regression in cond_widen_reduc-2.c that we can deal with later. It is just a missed optimization where we do not combine something that we used to because of the now-p

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-26 Thread Robin Dapp
> Yeah. I think Robin may need this : > > TREE_CODE (else_val) == SSA_NAME > && SSA_NAME_IS_DEFAULT_DEF (else_val) > && VAR_P (SSA_NAME_VAR (else_val)) > > to differentiate whether the ELSE VALUE is uninitialized SSA or not. I think we are talking about a different simplification now. This

Re: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread 钟居哲
26 06:32 To: 钟居哲 CC: gcc-patches; rdapp.gcc; rguenther Subject: Re: [PATCH] internal-fn: Add VCOND_MASK_LEN. 钟居哲 writes: >>> Which one is right? > Hi, Richard. Let me explain this situation. > > Both situations are possible. It's depending on the 'ELSE' value whet

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Richard Sandiford
钟居哲 writes: >>> Which one is right? > Hi, Richard. Let me explain this situation. > > Both situations are possible. It's depending on the 'ELSE' value whether it > is uninitialized value. > > For reduction case: > > for (int i = 0; i < n; i++) > result += a[i] > > The trailing elements should be
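
For readers following the reduction case quoted above, here is a minimal scalar sketch of a length-controlled reduction. It is not code from the patch: the function name `sum_with_length`, the fixed lane count `VL`, and the choice to use the previous accumulator as the else value for trailing lanes are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical lane count standing in for the hardware vector length.  */
enum { VL = 4 };

/* Scalar model of a length-controlled sum reduction.  Each outer
   iteration handles up to VL elements; LEN is the number of active
   lanes in that iteration.  */
int
sum_with_length (const int *a, size_t n)
{
  int acc[VL] = { 0 };

  for (size_t i = 0; i < n; i += VL)
    {
      size_t len = n - i < VL ? n - i : VL;

      for (size_t lane = 0; lane < VL; lane++)
        /* Assumption: trailing lanes (lane >= len) take the old
           accumulator as their else value, so they stay unchanged
           and a[] is never read out of bounds.  */
        acc[lane] = lane < len ? acc[lane] + a[i + lane] : acc[lane];
    }

  /* Final horizontal reduction over all lanes.  */
  int result = 0;
  for (size_t lane = 0; lane < VL; lane++)
    result += acc[lane];
  return result;
}
```

In this model the final cross-lane sum is only correct because the trailing lanes carry a defined value; if they were left undefined, the horizontal add would read garbage.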

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Richard Sandiford
Robin Dapp writes: >> At first, this seemed like an odd place to fold away the length. >> AFAIK the length in res_op is inherited directly from the original >> operation, and so it isn't any more redundant after the fold than >> it was before. But I suppose the reason for doing it here is that >>

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-25 Thread Robin Dapp
> At first, this seemed like an odd place to fold away the length. > AFAIK the length in res_op is inherited directly from the original > operation, and so it isn't any more redundant after the fold than > it was before. But I suppose the reason for doing it here is that > we deliberately create I

Re: [PATCH] internal-fn: Add VCOND_MASK_LEN.

2023-10-24 Thread Richard Sandiford
Robin Dapp writes: > The attached patch introduces a VCOND_MASK_LEN, helps for the riscv cases > that were broken before and looks unchanged on x86, aarch64 and power > bootstrap and testsuites. > > I only went with the minimal number of new match.pd patterns and did not > try stripping the length
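
As a rough scalar reference for what a length- and mask-controlled select computes, the sketch below may help. It is an illustration only: the function name `vcond_mask_len_model` and its argument order are hypothetical (the operand order is itself under discussion in this thread), and the bias operand is omitted for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a select controlled by both a mask and a length.
   Lane i takes if_true[i] only when it lies within the active length
   and its mask element is nonzero; otherwise it takes if_false[i].
   Argument order is an assumption, not the optab's operand order.  */
void
vcond_mask_len_model (int32_t *dest, const uint8_t *mask,
                      const int32_t *if_true, const int32_t *if_false,
                      size_t nlanes, size_t len)
{
  for (size_t i = 0; i < nlanes; i++)
    dest[i] = (i < len && mask[i]) ? if_true[i] : if_false[i];
}
```

With four lanes, a mask of {1, 0, 1, 1} and a length of 3, lanes 0 and 2 come from `if_true`, while lane 1 (mask clear) and lane 3 (beyond the length) come from `if_false`.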