https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123779
--- Comment #3 from Hongyu Wang <hongyuw at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #2)
> (In reply to Hongyu Wang from comment #1)
> > This is because define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> > has mask_name subst, but it doesn't generate corresponding split for the
> > mask variant.
>
> we should remove <mask_name>
Just removing <mask_name> produces an extra blend:
vmovdqa e(%rip), %xmm0
vmovdqa g(%rip), %xmm1
vpcmpw $1, f(%rip), %xmm0, %k1
vpmovm2w %k1, %xmm0
vpmovzxbw d(%rip), %xmm0
vpblendmw %xmm0, %xmm1, %xmm0{%k1}
I think it is better to use separate define_insn_and_split patterns for the
nonmask/mask variants, which produces
vmovdqa e(%rip), %xmm0
vpcmpw $1, f(%rip), %xmm0, %k1
vpmovm2w %k1, %xmm0
vmovdqa g(%rip), %xmm0
vpmovzxbw d(%rip), %xmm0{%k1}
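
For reference, a scalar C model of why the two sequences should give the same
result. This is only a sketch, not GCC code: the function names
(zext_then_blend, masked_zext, sequences_agree) are made up for illustration,
the mask register is modelled as an 8-bit integer, and the compare that
produces %k1 is left out. It assumes merge masking, i.e. lanes whose mask bit
is clear keep the old destination value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 8

/* First sequence: unmasked zero-extend, then a mask-controlled blend. */
static void zext_then_blend(uint16_t out[LANES], const uint8_t d[LANES],
                            const uint16_t g[LANES], uint8_t k)
{
    uint16_t tmp[LANES];
    for (int i = 0; i < LANES; i++)
        tmp[i] = d[i];                           /* like vpmovzxbw d, tmp */
    for (int i = 0; i < LANES; i++)
        out[i] = ((k >> i) & 1) ? tmp[i] : g[i]; /* like vpblendmw */
}

/* Second sequence: load g into the destination, then a merge-masked
   zero-extend that only overwrites lanes whose mask bit is set. */
static void masked_zext(uint16_t out[LANES], const uint8_t d[LANES],
                        const uint16_t g[LANES], uint8_t k)
{
    for (int i = 0; i < LANES; i++)
        out[i] = g[i];                           /* like vmovdqa g, dst */
    for (int i = 0; i < LANES; i++)
        if ((k >> i) & 1)
            out[i] = d[i];                       /* like vpmovzxbw d, dst{k} */
}

/* Returns 1 if both models agree for every possible 8-bit mask. */
int sequences_agree(void)
{
    const uint8_t d[LANES] = {0, 1, 127, 128, 200, 254, 255, 42};
    const uint16_t g[LANES] = {0x1111, 0x2222, 0x3333, 0x4444,
                               0x5555, 0x6666, 0x7777, 0x8888};
    for (unsigned k = 0; k < 256; k++) {
        uint16_t a[LANES], b[LANES];
        zext_then_blend(a, d, g, (uint8_t)k);
        masked_zext(b, d, g, (uint8_t)k);
        if (memcmp(a, b, sizeof a) != 0)
            return 0;
    }
    return 1;
}
```

Under these assumptions the blend in the first listing does no extra work
beyond what the masked vpmovzxbw already does, which is why folding it into
the mask variant saves one instruction.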