>-----Original Message-----
>From: Segher Boessenkool <seg...@kernel.crashing.org>
>Sent: Thursday, June 3, 2021 4:46 AM
>To: Richard Biener <richard.guent...@gmail.com>
>Cc: Liu, Hongtao <hongtao....@intel.com>; GCC Patches <gcc-
>patc...@gcc.gnu.org>
>Subject: Re: [PATCH] Canonicalize (vec_duplicate (not A)) to (not
>(vec_duplicate A)).
>
>Hi!
>
>On Wed, Jun 02, 2021 at 09:07:35AM +0200, Richard Biener wrote:
>> On Wed, Jun 2, 2021 at 7:41 AM liuhongt via Gcc-patches
>> <gcc-patches@gcc.gnu.org> wrote:
>> > For i386, it will enable below opt
>> >
>> > from
>> >         notl    %edi
>> >         vpbroadcastd    %edi, %xmm0
>> >         vpand   %xmm1, %xmm0, %xmm0
>> > to
>> >         vpbroadcastd    %edi, %xmm0
>> >         vpandn   %xmm1, %xmm0, %xmm0
>>
>> There will be cases where (vec_duplicate (not A)) is better than (not
>> (vec_duplicate A)), so I'm not sure it is a good idea to forcefully
>> canonicalize unary operations.
>
>It is two unaries in sequence, where the order does not matter either.
>As in all such cases you either have to handle both cases everywhere, or have
>a canonical order.
>
>> I suppose the
>> simplification happens inside combine
>
>combine uses simplify-rtx for most cases (it is part of combine, but used in
>quite a few other places these days).
>
>> - doesn't combine
>> already have code to try variants of an expression and isn't this a
>> good candidate that can be added there, avoiding the canonicalization?
>
>As I mentioned, this is done in simplify-rtx in cases that do not have a
>canonical representation.  This is critical because it prevents loops.
>
>A very typical example is how UMIN is optimised:
>
>   case UMIN:
>      if (trueop1 == CONST0_RTX (mode) && ! side_effects_p (op0))
>       return op1;
>      if (rtx_equal_p (trueop0, trueop1) && ! side_effects_p (op0))
>       return op0;
>      tem = simplify_associative_operation (code, mode, op0, op1);
>      if (tem)
>       return tem;
>      break;
>
>(the stuff using "tem").
>
>Hongtao, can we do something similar here?  Does that work well?  Please try
>it out :-)

In simplify_rtx no simplification occurs; the only difference is between
(vec_duplicate (not REG)) and (not (vec_duplicate REG)), so here tem will
always be 0.
Basically we don't know it is a simplification until combine successfully
turns the 3 instructions into 2 (not + broadcast + and becomes andnot +
broadcast), but it is pretty awkward to do this in combine.

Considering that andnot exists for many backends, I think a canonicalization
is needed here.
Maybe we can add an insn canonicalization that transforms
(and (vec_duplicate (not A)) B) to (and (not (vec_duplicate A)) B),
instead of transforming (vec_duplicate (not A)) to
(not (vec_duplicate A))?

>
>
>Segher
