On Sat, Oct 30, 2021 at 02:17:07PM -0700, Richard Henderson wrote:
> On 10/29/21 1:23 PM, matheus.fe...@eldorado.org.br wrote:
> > From: Luis Pires <luis.pi...@eldorado.org.br>
> >
> > Implement the following PowerISA v3.1 instruction:
> > cntlzdm: Count Leading Zeros Doubleword Under Bit Mask
> >
> > Suggested-by: Richard Henderson <richard.hender...@linaro.org>
> > Signed-off-by: Luis Pires <luis.pi...@eldorado.org.br>
> > Signed-off-by: Matheus Ferst <matheus.fe...@eldorado.org.br>
> > ---
> > v2:
> > - Inline implementation of cntlzdm
> > ---
> >  target/ppc/insn32.decode                   |  1 +
> >  target/ppc/translate/fixedpoint-impl.c.inc | 36 ++++++++++++++++++++++
> >  2 files changed, 37 insertions(+)
> >
> > diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> > index 9cb9fc00b8..221cb00dd6 100644
> > --- a/target/ppc/insn32.decode
> > +++ b/target/ppc/insn32.decode
> > @@ -203,6 +203,7 @@ ADDPCIS 010011 ..... ..... .......... 00010 . @DX
> >  ## Fixed-Point Logical Instructions
> >  CFUGED 011111 ..... ..... ..... 0011011100 - @X
> > +CNTLZDM 011111 ..... ..... ..... 0000111011 - @X
> >  ### Float-Point Load Instructions
> > diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
> > index 0d9c6e0996..c9e9ae35df 100644
> > --- a/target/ppc/translate/fixedpoint-impl.c.inc
> > +++ b/target/ppc/translate/fixedpoint-impl.c.inc
> > @@ -413,3 +413,39 @@ static bool trans_CFUGED(DisasContext *ctx, arg_X *a)
> >  #endif
> >      return true;
> >  }
> > +
> > +#if defined(TARGET_PPC64)
> > +static void do_cntlzdm(TCGv_i64 dst, TCGv_i64 src, TCGv_i64 mask)
> > +{
> > +    TCGv_i64 tmp;
> > +    TCGLabel *l1;
> > +
> > +    tmp = tcg_temp_local_new_i64();
> > +    l1 = gen_new_label();
> > +
> > +    tcg_gen_and_i64(tmp, src, mask);
> > +    tcg_gen_clzi_i64(tmp, tmp, 64);
> > +
> > +    tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, l1);
> > +
> > +    tcg_gen_subfi_i64(tmp, 64, tmp);
> > +    tcg_gen_shr_i64(tmp, mask, tmp);
> > +    tcg_gen_ctpop_i64(tmp, tmp);
> > +
> > +    gen_set_label(l1);
> > +
> > +    tcg_gen_mov_i64(dst, tmp);
>
> This works, but a form without brcond would be better (due to how poorly tcg
> handles basic blocks).
>
> How about
>
>     tcg_gen_clzi_i64(tmp, tmp, 0);
>
>     tcg_gen_xori_i64(tmp, tmp, 63);
>     tcg_gen_shr_i64(tmp, mask, tmp);
>     tcg_gen_shri_i64(tmp, tmp, 1);
>
>     tcg_gen_ctpop_i64(dst, tmp);

I've applied this to ppc-for-6.2.  You can make this improvement as a
followup if you want.

>
> The middle 3 operations perform a shift between [1-64], such that we are
> assured of 0 for 64.
>
> Either way,
> Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
>
>
> r~
>

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
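
For anyone following the shift arithmetic in the thread, here is a plain C
model of the two sequences. It is only a sketch and not QEMU code. The names
clz64, popcount64, cntlzdm_ref, cntlzdm_branch, cntlzdm_branchless and
cntlzdm_selfcheck are hypothetical. clz64 assumes that tcg_gen_clzi_i64
yields its third operand when the input is zero, and cntlzdm_ref is an
assumed reading of the instruction: walk from the most significant bit,
count the bits selected by the mask, and stop at the first selected source
bit that is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count leading zeros with an explicit result for a
 * zero input, modelled on tcg_gen_clzi_i64(ret, arg, zero_val). */
static unsigned clz64(uint64_t x, unsigned zero_val)
{
    unsigned n = 0;

    if (x == 0) {
        return zero_val;
    }
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper standing in for tcg_gen_ctpop_i64. */
static unsigned popcount64(uint64_t x)
{
    unsigned n = 0;

    while (x) {
        x &= x - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Assumed reference semantics: bit-by-bit walk from the MSB. */
static unsigned cntlzdm_ref(uint64_t src, uint64_t mask)
{
    unsigned count = 0;

    for (int i = 63; i >= 0; i--) {
        uint64_t bit = UINT64_C(1) << i;

        if (!(mask & bit)) {
            continue;   /* bit not selected by the mask */
        }
        if (src & bit) {
            break;      /* first selected source bit that is set */
        }
        count++;
    }
    return count;
}

/* Model of the sequence in the patch (with the brcond). */
static unsigned cntlzdm_branch(uint64_t src, uint64_t mask)
{
    unsigned n = clz64(src & mask, 64);

    if (n == 0) {
        return 0;       /* branch taken: avoids a shift by 64 */
    }
    return popcount64(mask >> (64 - n));    /* shift count in [0, 63] */
}

/* Model of the suggested sequence (no brcond). */
static unsigned cntlzdm_branchless(uint64_t src, uint64_t mask)
{
    unsigned n = clz64(src & mask, 0);      /* n in [0, 63] */
    uint64_t t = mask >> (n ^ 63);          /* 63 - n, in [0, 63] */

    t >>= 1;                                /* total shift 64 - n, in [1, 64] */
    return popcount64(t);
}

/* A few hand-checked inputs for the three models. */
static void cntlzdm_selfcheck(void)
{
    const uint64_t mask = UINT64_C(0xFF00000000000F00);

    /* Highest selected set bit is bit 8; 8 + 3 mask bits lie above it. */
    assert(cntlzdm_ref(UINT64_C(0x100), mask) == 11);
    assert(cntlzdm_branch(UINT64_C(0x100), mask) == 11);
    assert(cntlzdm_branchless(UINT64_C(0x100), mask) == 11);

    /* MSB selected and set: total shift of 64 leaves nothing to count. */
    assert(cntlzdm_branch(UINT64_C(1) << 63, UINT64_MAX) == 0);
    assert(cntlzdm_branchless(UINT64_C(1) << 63, UINT64_MAX) == 0);

    /* No selected bit is set: the two models differ. */
    assert(cntlzdm_ref(0, UINT64_C(0xFF)) == 8);
    assert(cntlzdm_branch(0, UINT64_C(0xFF)) == 8);
    assert(cntlzdm_branchless(0, UINT64_C(0xFF)) == 0);
}
```

In this model the two sequences agree whenever src & mask is nonzero, and
splitting the shift into (63 - n) followed by 1 is what keeps each individual
shift count below 64. For src & mask == 0 the model of the patch returns
popcount(mask), matching the reference loop, while the model of the suggested
sequence returns 0, so a followup would probably need to treat that input
separately. That is an observation about this model only.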