Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
On 11/29/2016 03:52 PM, Laurent Vivier wrote: > It doesn't work because we work with word/byte where the bit sign has > been extended to the long word. So in the case of 0 shift, with retrieve > C= and not 0. Ah, right. I wonder if it's better to always zero-extend the inputs for left-shifts, as opposed to needing the movcond. r~
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
Le 28/11/2016 à 15:54, Richard Henderson a écrit : > On 11/27/2016 11:35 AM, Laurent Vivier wrote: +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); >> This does not extract correctly the C flag when the opsize is word or byte. >> I think we should use a shift instead: >> >> -tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); >> - >> -/* Note that C=0 if shift count is 0, and we get that for free. */ >> +if (opsize == OS_LONG) { >> +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); >> +/* Note that C=0 if shift count is 0, and we get that for >> free. */ >> +} else { >> +TCGv zero = tcg_const_i32(0); >> +tcg_gen_extrl_i64_i32(QREG_CC_N, t64); >> +tcg_gen_shri_i64(t64, t64, bits); >> +tcg_gen_extrl_i64_i32(QREG_CC_C, t64); >> +tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_C, >> +s32, zero, zero, QREG_CC_C); >> +tcg_temp_free(zero); >> +} >> >> Do you have a better idea? > > if (opsize == OS_LONG) { > tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); > } else { > tcg_gen_extrl_i64_i32(QREG_CC_N, t64); > tcg_gen_shri_i32(QREG_CC_C, QREG_CC_N, bits); > } > > Since we zero-extend the input from bits, it's still true that a zero shift > gets C=0 for free. > It doesn't work because we work with word/byte where the bit sign has been extended to the long word. So in the case of 0 shift, with retrieve C= and not 0. Laurent
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
On 11/27/2016 11:35 AM, Laurent Vivier wrote: >> > +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); > This does not extract correctly the C flag when the opsize is word or byte. > I think we should use a shift instead: > > -tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); > - > -/* Note that C=0 if shift count is 0, and we get that for free. */ > +if (opsize == OS_LONG) { > +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); > +/* Note that C=0 if shift count is 0, and we get that for > free. */ > +} else { > +TCGv zero = tcg_const_i32(0); > +tcg_gen_extrl_i64_i32(QREG_CC_N, t64); > +tcg_gen_shri_i64(t64, t64, bits); > +tcg_gen_extrl_i64_i32(QREG_CC_C, t64); > +tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_C, > +s32, zero, zero, QREG_CC_C); > +tcg_temp_free(zero); > +} > > Do you have a better idea? if (opsize == OS_LONG) { tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); } else { tcg_gen_extrl_i64_i32(QREG_CC_N, t64); tcg_gen_shri_i32(QREG_CC_C, QREG_CC_N, bits); } Since we zero-extend the input from bits, it's still true that a zero shift gets C=0 for free. r~
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
On 11/27/2016 11:30 AM, Laurent Vivier wrote: > There is another bug on this one. > > Le 09/11/2016 à 14:46, Richard Henderson a écrit : >> diff --git a/target-m68k/translate.c b/target-m68k/translate.c >> index 4f224d7..1b3765f 100644 >> --- a/target-m68k/translate.c >> +++ b/target-m68k/translate.c >> +static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize) > ... >> +/* M68000 sets V if the most significant bit is changed at >> + * any time during the shift operation. Do this via creating >> + * an extension of the sign bit, comparing, and discarding >> + * the bits below the sign bit. I.e. >> + * int64_t s = (intN_t)reg; >> + * int64_t t = (int64_t)(intN_t)reg << count; >> + * V = ((s ^ t) & (-1 << (bits - 1))) != 0 >> + */ >> +if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) { >> +/* Sign extend the input to 64 bits; re-do the shift. */ >> +tcg_gen_ext_i32_i64(t64, reg); >> +tcg_gen_shl_i64(s64, t64, s64); >> +/* Clear all bits that are unchanged. */ >> +tcg_gen_xor_i64(t64, t64, s64); >> +/* Ignore the bits below the sign bit. */ >> +tcg_gen_andi_i64(t64, t64, -1ULL << (bits - 1)); >> +/* If any bits remain set, we have overflow. */ >> +tcg_gen_setcondi_i64(TCG_COND_NE, t64, t64, 0); >> +tcg_gen_extrl_i64_i32(QREG_CC_V, t64); >> +tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V); > > if s64 is greater than 32, we lose all the bits needed to compute the V > flag. I think we can just add this to fix the problem: > > if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) { > +TCGv_i64 tt = tcg_const_i64(32); > +/* if shift is greater than 32, use 32 */ > +tcg_gen_movcond_i64(TCG_COND_GT, s64, s64, tt, tt, s64); > +tcg_temp_free_i64(tt); > /* Sign extend the input to 64 bits; re-do the shift. */ > tcg_gen_ext_i32_i64(t64, reg); > tcg_gen_shl_i64(s64, t64, s64); Hmm. I guess the test case is input like 8 << 63, where the sign bit is clear, the input is non-zero, and we shift out all of the set bits. I can't think of anything cleaner than your movcond solution. r~
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
On 11/27/2016 09:53 AM, Laurent Vivier wrote: >> > +TCGv t0 = tcg_temp_new(); >> > +tcg_gen_sari_i32(QREG_CC_V, reg, bits - 1); >> > +tcg_gen_sari_i32(t0, t0, bits - count); > t0 is used unitialized, I think we should have here: > > tcg_gen_sari_i32(t0, reg, bits - count - 1); > > moreover we must use "bits - count - 1" to use also the current most > significant bit to compute V flag. > Yep. r~
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
Le 09/11/2016 à 14:46, Richard Henderson a écrit : > diff --git a/target-m68k/translate.c b/target-m68k/translate.c > index 4f224d7..1b3765f 100644 > --- a/target-m68k/translate.c > +++ b/target-m68k/translate.c > +static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize) > +{ > +int logical = insn & 8; > +int left = insn & 0x100; > +int bits = opsize_bytes(opsize) * 8; > +TCGv reg = gen_extend(DREG(insn, 0), opsize, !logical); > +TCGv s32; > +TCGv_i64 t64, s64; > + > +t64 = tcg_temp_new_i64(); > +s64 = tcg_temp_new_i64(); > +s32 = tcg_temp_new(); > + > +/* Note that m68k truncates the shift count modulo 64, not 32. > + In addition, a 64-bit shift makes it easy to find "the last > + bit shifted out", for the carry flag. */ > +tcg_gen_andi_i32(s32, DREG(insn, 9), 63); > +tcg_gen_extu_i32_i64(s64, s32); > +tcg_gen_extu_i32_i64(t64, reg); > + > +/* Optimistically set V=0. Also used as a zero source below. */ > +tcg_gen_movi_i32(QREG_CC_V, 0); > +if (left) { > +tcg_gen_shl_i64(t64, t64, s64); > + > +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); This does not extract correctly the C flag when the opsize is word or byte. I think we should use a shift instead: -tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); - -/* Note that C=0 if shift count is 0, and we get that for free. */ +if (opsize == OS_LONG) { +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64); +/* Note that C=0 if shift count is 0, and we get that for free. */ +} else { +TCGv zero = tcg_const_i32(0); +tcg_gen_extrl_i64_i32(QREG_CC_N, t64); +tcg_gen_shri_i64(t64, t64, bits); +tcg_gen_extrl_i64_i32(QREG_CC_C, t64); +tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_C, +s32, zero, zero, QREG_CC_C); +tcg_temp_free(zero); +} Do you have a better idea? thanks, Laurent
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
There is another bug on this one. Le 09/11/2016 à 14:46, Richard Henderson a écrit : > diff --git a/target-m68k/translate.c b/target-m68k/translate.c > index 4f224d7..1b3765f 100644 > --- a/target-m68k/translate.c > +++ b/target-m68k/translate.c > +static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize) ... > +/* M68000 sets V if the most significant bit is changed at > + * any time during the shift operation. Do this via creating > + * an extension of the sign bit, comparing, and discarding > + * the bits below the sign bit. I.e. > + * int64_t s = (intN_t)reg; > + * int64_t t = (int64_t)(intN_t)reg << count; > + * V = ((s ^ t) & (-1 << (bits - 1))) != 0 > + */ > +if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) { > +/* Sign extend the input to 64 bits; re-do the shift. */ > +tcg_gen_ext_i32_i64(t64, reg); > +tcg_gen_shl_i64(s64, t64, s64); > +/* Clear all bits that are unchanged. */ > +tcg_gen_xor_i64(t64, t64, s64); > +/* Ignore the bits below the sign bit. */ > +tcg_gen_andi_i64(t64, t64, -1ULL << (bits - 1)); > +/* If any bits remain set, we have overflow. */ > +tcg_gen_setcondi_i64(TCG_COND_NE, t64, t64, 0); > +tcg_gen_extrl_i64_i32(QREG_CC_V, t64); > +tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V); if s64 is greater than 32, we lose all the bits needed to compute the V flag. I think we can just add this to fix the problem: if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) { +TCGv_i64 tt = tcg_const_i64(32); +/* if shift is greater than 32, use 32 */ +tcg_gen_movcond_i64(TCG_COND_GT, s64, s64, tt, tt, s64); +tcg_temp_free_i64(tt); /* Sign extend the input to 64 bits; re-do the shift. */ tcg_gen_ext_i32_i64(t64, reg); tcg_gen_shl_i64(s64, t64, s64); Is it correct? Laurent
Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts
Le 09/11/2016 à 14:46, Richard Henderson a écrit : > diff --git a/target-m68k/translate.c b/target-m68k/translate.c > index 4f224d7..1b3765f 100644 > --- a/target-m68k/translate.c > +++ b/target-m68k/translate.c > @@ -2883,48 +2883,205 @@ DISAS_INSN(addx_mem) > gen_store(s, opsize, addr_dest, QREG_CC_N); > } > > -/* TODO: This could be implemented without helper functions. */ > -DISAS_INSN(shift_im) > +static inline void shift_im(DisasContext *s, uint16_t insn, int opsize) > { > -TCGv reg; > -int tmp; > -TCGv shift; > +int count = (insn >> 9) & 7; > +int logical = insn & 8; > +int left = insn & 0x100; > +int bits = opsize_bytes(opsize) * 8; > +TCGv reg = gen_extend(DREG(insn, 0), opsize, !logical); > + > +if (count == 0) { > +count = 8; > +} > + > +tcg_gen_movi_i32(QREG_CC_V, 0); > +if (left) { > +tcg_gen_shri_i32(QREG_CC_C, reg, bits - count); > +tcg_gen_shli_i32(QREG_CC_N, reg, count); > + > +/* Note that ColdFire always clears V (done above), > + while M68000 sets if the most significant bit is changed at > + any time during the shift operation */ > +if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) { > +/* if shift count >= bits, V is (reg != 0) */ > +if (count >= bits) { > +tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, reg, QREG_CC_V); > +} else { > +TCGv t0 = tcg_temp_new(); > +tcg_gen_sari_i32(QREG_CC_V, reg, bits - 1); > +tcg_gen_sari_i32(t0, t0, bits - count); t0 is used unitialized, I think we should have here: tcg_gen_sari_i32(t0, reg, bits - count - 1); moreover we must use "bits - count - 1" to use also the current most significant bit to compute V flag. > +tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, t0); > +tcg_temp_free(t0); > +} > +tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V); > +} > +} else { > +tcg_gen_shri_i32(QREG_CC_C, reg, count - 1); > +if (logical) { > +tcg_gen_shri_i32(QREG_CC_N, reg, count); > +} else { > +tcg_gen_sari_i32(QREG_CC_N, reg, count); > +} > +} Thanks, Laurent