Toma Tabacu <toma.tab...@imgtec.com> writes: > > From: Matthew Fortune > > > + /* The third argument needs to be in SImode in order to succesfully > > > match > > > + the operand from the insn definition. */ > > > > Please refer to operand here not argument as it is the second argument > > to the builtin but third operand of the instruction. Also double ss in > > successfully. > > > > I have rewritten the comment to address these mistakes. > > > > + case CODE_FOR_loongson_pshufh: > > > + case CODE_FOR_loongson_psllh: > > > + case CODE_FOR_loongson_psllw: > > > + case CODE_FOR_loongson_psrah: > > > + case CODE_FOR_loongson_psraw: > > > + case CODE_FOR_loongson_psrlh: > > > + case CODE_FOR_loongson_psrlw: > > > + gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode); > > > + ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode); > > > + ops[2].mode = SImode; > > > + break; > > > + > > > case CODE_FOR_msa_addvi_b: > > > case CODE_FOR_msa_addvi_h: > > > case CODE_FOR_msa_addvi_w: > > > > For the record, given paradoxical subregs are a headache... > > I am OK with this on the basis that the argument to psllh etc is actually > > a uint8_t which means that bits 8 upwards are guaranteed to be zero so > > the subreg can be eliminated without any explicit sign or zero extension > > inserted. This is the same kind of optimisation that combine would > > perform when eliminating zero extension. > > > > Please can you check that a zero extension is inserted for the following > > case with -O2 or above: > > > > int16x4_t testme(int16x4_t s, int amount) > > { > > return psllh_s (s, amount); > > } > > > > If my understanding is correct there should be an ANDI 0xff inserted > > or similar. > > > > The ANDI 0xff is present for -O0, after the first time the value is loaded > from memory, but it is not generated for -O1 and -O2. > I'm not seeing any zero extension happening for -O1 and -O2.
That's not what I hoped but is what I was concerned about as I believe it means we have a change of behaviour. It boils down to simply ignoring the argument type of unsigned char. My guess is that a zero extension is created but then immediately eliminated because of the paradoxical subreg. I think you need to create a temporary and perform the zero extension to ensure we honour the unsigned char operand: rtx new_dst = gen_reg_rtx (SImode); emit_insn (gen_zero_extendqisi2 (new_dst, ops[2].value)); ops[2].value = foo; This should mean that the testcase I sent always has a zero extension but if you change the type of 'amount' to be unsigned char then there should not be a zero extension as the argument will be assumed to be correctly zero extended already and the explicitly introduced zero_extend will be eliminated. Apologies for not proposing this before. Thanks, Matthew > > The only change in the patch below is the fixed comment. > > Regards, > Toma > > gcc/ > > * config/mips/mips.c (mips_expand_builtin_insn): Put the QImode > argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw > builtins into an SImode paradoxical SUBREG. > > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c > index da7fa8f..e5b2d9a 100644 > --- a/gcc/config/mips/mips.c > +++ b/gcc/config/mips/mips.c > @@ -16574,6 +16574,20 @@ mips_expand_builtin_insn (enum insn_code icode, > unsigned int > nops, > > switch (icode) > { > + /* The third operand of these instructions is in SImode, so we need to > + bring the corresponding builtin argument from QImode into SImode. */ > + case CODE_FOR_loongson_pshufh: > + case CODE_FOR_loongson_psllh: > + case CODE_FOR_loongson_psllw: > + case CODE_FOR_loongson_psrah: > + case CODE_FOR_loongson_psraw: > + case CODE_FOR_loongson_psrlh: > + case CODE_FOR_loongson_psrlw: > + gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode); > + ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode); > + ops[2].mode = SImode; > + break; > + > case CODE_FOR_msa_addvi_b: > case CODE_FOR_msa_addvi_h: > case CODE_FOR_msa_addvi_w: