"Yangfei (Felix)" <[email protected]> writes:
> Hi,
>
>> -----Original Message-----
>> From: Richard Sandiford [mailto:[email protected]]
>> Sent: Sunday, May 31, 2020 12:01 AM
>> To: Yangfei (Felix) <[email protected]>
>> Cc: [email protected]; Uros Bizjak <[email protected]>; Jakub
>> Jelinek <[email protected]>; Hongtao Liu <[email protected]>; H.J. Lu
>> <[email protected]>
>> Subject: Re: [PATCH PR95254] aarch64: gcc generate inefficient code with
>> fixed sve vector length
>>
>
> Snip...
>
>> >
>> > The v5 patch attached addressed this issue.
>> >
>> > There two added changes compared with the v4 patch:
>> > 1. In candidate_mem_p, mov_optab for innermode should be available.
>> > In this case, mov_optab for SDmode is not there and subreg are added
>> back by emit_move_insn_1. So we won't get the benefit with the patch.
>>
>> I agree we should have this check. I think the rule applies to all of the
>> transforms though, not just the mem one, so we should add the check to the
>> register and constant cases too.
>
> OK. I changed to make this an extra condition for calculating x_inner & y
> _inner.
Sounds good. Maybe at this point the x_inner and y_inner code is
getting complicated enough to put into a lambda too:
x_inner = ... (x);
y_inner = ... (y);
Just a suggestion though.
>> > 2. Instead of using adjust_address, I changed to use adjust_address_nv to
>> avoid the emit of invalid insn 13.
>> > The latter call to validize_mem() in emit_move_insn will take care of
>> > the
>> address for us.
>>
>> The validation performed by validize_mem is the same as that performed by
>> adjust_address, so the only case this should make a difference is for
>> push_operands:
>
> True.
>
>> /* If X or Y are memory references, verify that their addresses are valid
>> for the machine. */
>> if (MEM_P (x)
>> && (! memory_address_addr_space_p (GET_MODE (x), XEXP (x, 0),
>> MEM_ADDR_SPACE (x))
>> && ! push_operand (x, GET_MODE (x))))
>> x = validize_mem (x);
>>
>> if (MEM_P (y)
>> && ! memory_address_addr_space_p (GET_MODE (y), XEXP (y, 0),
>> MEM_ADDR_SPACE (y)))
>> y = validize_mem (y);
>>
>> So I think the fix is to punt on push_operands instead (and continue to use
>> adjust_address rather than adjust_address_nv).
>
> Not sure if I understand it correctly.
> Do you mean excluding push_operand in candidate_mem_p? Like:
>
> 3830 auto candidate_mem_p = [&](machine_mode innermode, rtx mem) {
> 3831 return !targetm.can_change_mode_class (innermode, GET_MODE (mem),
> ALL_REGS)
> 3832 && !push_operand (mem, GET_MODE (mem))
> 3833 /* Not a candiate if innermode requires too much alignment.
> */
> 3834 && (MEM_ALIGN (mem) >= GET_MODE_ALIGNMENT (innermode)
> 3835 || targetm.slow_unaligned_access (GET_MODE (mem),
> 3836 MEM_ALIGN (mem))
> 3837 || !targetm.slow_unaligned_access (innermode, MEM_ALIGN
> (mem)));
> 3838 };
Yeah, looks good.
Formatting nit though: multi-line conditions should be wrapped in (...),
i.e.:
return (...
&& ...
&& ...);
Thanks,
Richard