https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87767

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
What I mean is that we should try to simplify the md file, instead of adding
hundreds of new *_bcst patterns.
We have e.g.
(define_insn "*<plusminus_insn><mode>3"
  [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
        (plusminus:VI_AVX2
          (match_operand:VI_AVX2 1 "vector_operand" "<comm>0,v")
          (match_operand:VI_AVX2 2 "vector_operand" "xBm,vm")))]
  "TARGET_SSE2 && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
  "@
   p<plusminus_mnemonic><ssemodesuffix>\t{%2, %0|%0, %2}
   vp<plusminus_mnemonic><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseiadd")
   (set_attr "prefix_data16" "1,*")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<sseinsnmode>")])

(define_insn "*sub<mode>3_bcst"
  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
        (minus:VI48_AVX512VL
          (match_operand:VI48_AVX512VL 1 "register_operand" "v")
          (vec_duplicate:VI48_AVX512VL
            (match_operand:<ssescalarmode> 2 "memory_operand" "m"))))]
  "TARGET_AVX512F && ix86_binary_operator_ok (MINUS, <MODE>mode, operands)"
  "vpsub<ssemodesuffix>\t{%2<avx512bcst>, %1, %0|%0, %1, %2<avx512bcst>}"
  [(set_attr "type" "sseiadd")
   (set_attr "prefix" "evex")
   (set_attr "mode" "<sseinsnmode>")])

What I meant is we could have just:
(define_insn "*<plusminus_insn><mode>3"
  [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
        (plusminus:VI_AVX2
          (match_operand:VI_AVX2 1 "vector_bcst_operand" "<comm>0,v")
          (match_operand:VI_AVX2 2 "vector_bcst_operand" "xBm,vBb")))]
  "TARGET_SSE2 && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
  "@
   p<plusminus_mnemonic><ssemodesuffix>\t{%2, %0|%0, %2}
   vp<plusminus_mnemonic><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseiadd")
   (set_attr "prefix_data16" "1,*")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<sseinsnmode>")])
where vector_bcst_operand is either vector_operand or, for TARGET_AVX512F,
a VEC_DUPLICATE of the right mode wrapping a MEM whose mode is the element
mode of the VEC_DUPLICATE mode.  Similarly, the Bb constraint is either m or,
for TARGET_AVX512F, again the VEC_DUPLICATE with a MEM inside of it.
ix86_binary_operator_ok would then treat a VEC_DUPLICATE wrapping a MEM the
same as a MEM (in particular, ensure that one doesn't have e.g. one
VEC_DUPLICATE and one MEM operand, or two VEC_DUPLICATE operands), and the
output code would have to emit an operand that is a VEC_DUPLICATE of a MEM
properly.
Or perhaps the constraint there could be just for the broadcast and one could
write vmBb.  Still, I think the predicate needs to be accurate, i.e. for some
instructions we want e.g. vector_operand or TARGET_AVX512F and
bcst_mem_operand, for others vector_operand or TARGET_AVX512VL and
bcst_mem_operand, etc.
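
A rough sketch of how such a predicate and constraint might look, just to
make the idea concrete.  The names bcst_mem_operand, vector_bcst_operand and
Bb are the ones used in this comment; none of this exists in the tree, the
bodies are assumptions, and the sketch only covers the TARGET_AVX512F
flavour.  Here Bb matches only the broadcast form, so the pattern above would
use "vmBb" for that alternative:

;; Sketch only: a broadcast from memory, i.e. (vec_duplicate (mem)), where
;; the MEM has the element mode of the vector mode.
(define_predicate "bcst_mem_operand"
  (and (match_code "vec_duplicate")
       (match_test "TARGET_AVX512F")
       (match_test "memory_operand (XEXP (op, 0),
                                    GET_MODE_INNER (GET_MODE (op)))")))

;; Sketch only: anything vector_operand accepts, or a broadcast from memory.
(define_predicate "vector_bcst_operand"
  (ior (match_operand 0 "vector_operand")
       (match_operand 0 "bcst_mem_operand")))

;; Sketch only: constraint for the broadcast form alone.
(define_constraint "Bb"
  "@internal Broadcast memory operand."
  (match_operand 0 "bcst_mem_operand"))

Patterns that need TARGET_AVX512VL instead would need a different predicate
(or a mode size check inside this one), as said above.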

Anyway, if we go down this route, it might be best to handle just a couple of
patterns first, then ask for review and see what Kirill (or Uros, if he is
interested) thinks about it, and only later convert more.
