https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125876

--- Comment #2 from Sarvesh Chandra <Sarvesh.Chandra at amd dot com> ---
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604649.html
In the thread linked above, Uros made the following comment:

>> -(define_expand "avx512f_movddup512<mask_name>"
>> -  [(set (match_operand:V8DF 0 "register_operand")
>> +(define_insn "avx512f_movddup512<mask_name>"
>> +  [(set (match_operand:V8DF 0 "register_operand" "=v")
>>         (vec_select:V8DF
>>           (vec_concat:V16DF
>> -           (match_operand:V8DF 1 "nonimmediate_operand")
>> +           (match_operand:V8DF 1 "memory_operand" "m")
>
>I think you should leave nonimmediate_operand here with "m" predicate.
>Reload is able to move the register to the memory, and it is
>beneficial to allow registers for possible combine opportunities.

Essentially, avx512f_movddup512 and avx512f_unpcklpd512 are duplicate
patterns in the case of even-lane interleaving with matching operands
(both registers). We could keep the constraints on avx512f_movddup512 as
they are now, and let avx512f_unpcklpd512 match the case of matching
register operands. Matching memory operands will fall through to
avx512f_movddup512, and one of the operands will be spilled.
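
To illustrate why the two patterns overlap, here is a minimal scalar sketch
in C of the element selection as I understand it from the instruction
semantics. It is not GCC code: the function names (model_movddup512,
model_unpcklpd512, same_operand_equivalent) are hypothetical, and masking is
ignored.

```c
#include <string.h>

#define NELT 8  /* V8DF: eight doubles in one 512-bit vector */

/* Assumed model of the 512-bit movddup: each even-indexed element is
   duplicated into the odd slot that follows it. */
static void model_movddup512(double dst[NELT], const double src[NELT])
{
    for (int i = 0; i < NELT; i += 2) {
        dst[i] = src[i];
        dst[i + 1] = src[i];
    }
}

/* Assumed model of the 512-bit unpcklpd: within each 128-bit lane, the low
   element of a is followed by the low element of b. */
static void model_unpcklpd512(double dst[NELT], const double a[NELT],
                              const double b[NELT])
{
    for (int i = 0; i < NELT; i += 2) {
        dst[i] = a[i];
        dst[i + 1] = b[i];
    }
}

/* Returns 1 when unpcklpd (x, x) and movddup (x) give the same result. */
static int same_operand_equivalent(const double x[NELT])
{
    double d1[NELT], d2[NELT];

    model_movddup512(d1, x);
    model_unpcklpd512(d2, x, x);
    return memcmp(d1, d2, sizeof d1) == 0;
}
```

With both inputs equal, the unpcklpd model reduces to the movddup model,
which is the overlap described above. The two only differ once the second
operand is a different vector.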
