https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87767
--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---

What I mean is that we should try to simplify the md file, instead of adding hundreds of new *_bcst patterns. We have e.g.

(define_insn "*<plusminus_insn><mode>3"
  [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
        (plusminus:VI_AVX2
          (match_operand:VI_AVX2 1 "vector_operand" "<comm>0,v")
          (match_operand:VI_AVX2 2 "vector_operand" "xBm,vm")))]
  "TARGET_SSE2 && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
  "@
   p<plusminus_mnemonic><ssemodesuffix>\t{%2, %0|%0, %2}
   vp<plusminus_mnemonic><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseiadd")
   (set_attr "prefix_data16" "1,*")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<sseinsnmode>")])

(define_insn "*sub<mode>3_bcst"
  [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
        (minus:VI48_AVX512VL
          (match_operand:VI48_AVX512VL 1 "register_operand" "v")
          (vec_duplicate:VI48_AVX512VL
            (match_operand:<ssescalarmode> 2 "memory_operand" "m"))))]
  "TARGET_AVX512F && ix86_binary_operator_ok (MINUS, <MODE>mode, operands)"
  "vpsub<ssemodesuffix>\t{%2<avx512bcst>, %1, %0|%0, %1, %2<avx512bcst>}"
  [(set_attr "type" "sseiadd")
   (set_attr "prefix" "evex")
   (set_attr "mode" "<sseinsnmode>")])

What I meant is we could have just:

(define_insn "*<plusminus_insn><mode>3"
  [(set (match_operand:VI_AVX2 0 "register_operand" "=x,v")
        (plusminus:VI_AVX2
          (match_operand:VI_AVX2 1 "vector_bcst_operand" "<comm>0,v")
          (match_operand:VI_AVX2 2 "vector_bcst_operand" "xBm,vBb")))]
  "TARGET_SSE2 && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
  "@
   p<plusminus_mnemonic><ssemodesuffix>\t{%2, %0|%0, %2}
   vp<plusminus_mnemonic><ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseiadd")
   (set_attr "prefix_data16" "1,*")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<sseinsnmode>")])

where vector_bcst_operand is either vector_operand, or for TARGET_AVX512F a VEC_DUPLICATE of the right mode with a MEM inside of it that has the element mode of the VEC_DUPLICATE mode. Similarly, the Bb constraint is either m, or for TARGET_AVX512F again the VEC_DUPLICATE with a MEM inside of it. ix86_binary_operator_ok would then treat a VEC_DUPLICATE wrapping a MEM the same as a MEM (in particular, ensure that one doesn't end up with e.g. one VEC_DUPLICATE and one MEM operand, or two VEC_DUPLICATE operands), and the output code would handle emitting an operand that is a VEC_DUPLICATE of a MEM properly.

Or perhaps the constraint there could be just for the broadcast and one could write vmBb. Still, I think the predicate needs to be accurate, i.e. for some instructions we want e.g. vector_operand or TARGET_AVX512F and bcst_mem_operand, for others vector_operand or TARGET_AVX512VL and bcst_mem_operand etc.

Anyway, if we go down this route, it might be best to handle just a couple of patterns, then ask for review and see what Kirill (or Uros, if he is interested) thinks about it, and only later convert more.
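
For illustration only, the predicate and constraint described above might look roughly like the following untested sketch. The names bcst_mem_operand, vector_bcst_operand and Bb are taken from this comment; the bodies are assumptions about how they could be written, not existing GCC code, and the choice of define_constraint (rather than one of the memory constraint forms) is a simplification that would need review.

```
;; Untested sketch; bodies are assumptions, not existing GCC code.

;; A VEC_DUPLICATE whose operand is a MEM in the element mode of the
;; vector mode.  No TARGET_* test here, so that each user can pick
;; TARGET_AVX512F or TARGET_AVX512VL as appropriate.
(define_predicate "bcst_mem_operand"
  (and (match_code "vec_duplicate")
       (and (match_test "GET_MODE (XEXP (op, 0))
                         == GET_MODE_INNER (GET_MODE (op))")
            (match_test "memory_operand (XEXP (op, 0),
                                         GET_MODE (XEXP (op, 0)))"))))

;; Either an ordinary vector operand, or for TARGET_AVX512F a
;; broadcast from memory.
(define_predicate "vector_bcst_operand"
  (ior (match_operand 0 "vector_operand")
       (and (match_test "TARGET_AVX512F")
            (match_operand 0 "bcst_mem_operand"))))

;; Broadcast-only variant of the constraint, so that an alternative
;; could be written as "vmBb".
(define_constraint "Bb"
  "@internal Broadcast of a scalar memory operand."
  (and (match_test "TARGET_AVX512F")
       (match_operand 0 "bcst_mem_operand")))
```

A variant for instructions that need AVX512VL would repeat vector_bcst_operand with TARGET_AVX512VL in place of TARGET_AVX512F. The matching changes to ix86_binary_operator_ok and to the output templates are not covered by this sketch.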