Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-30 Thread Richard Henderson
On 11/29/2016 03:52 PM, Laurent Vivier wrote:
> It doesn't work because we work with word/byte where the bit sign has
> been extended to the long word. So in the case of 0 shift, with retrieve
> C= and not 0.

Ah, right.  I wonder if it's better to always zero-extend the inputs for
left-shifts, as opposed to needing the movcond.


r~




Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-29 Thread Laurent Vivier
Le 28/11/2016 à 15:54, Richard Henderson a écrit :
> On 11/27/2016 11:35 AM, Laurent Vivier wrote:
 +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
>> This does not extract correctly the C flag when the opsize is word or byte.
>> I think we should use a shift instead:
>>
>> -tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
>> -
>> -/* Note that C=0 if shift count is 0, and we get that for free.  */
>> +if (opsize == OS_LONG) {
>> +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
>> +/* Note that C=0 if shift count is 0, and we get that for
>> free.  */
>> +} else {
>> +TCGv zero = tcg_const_i32(0);
>> +tcg_gen_extrl_i64_i32(QREG_CC_N, t64);
>> +tcg_gen_shri_i64(t64, t64, bits);
>> +tcg_gen_extrl_i64_i32(QREG_CC_C, t64);
>> +tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_C,
>> +s32, zero, zero, QREG_CC_C);
>> +tcg_temp_free(zero);
>> +}
>>
>> Do you have a better idea?
> 
>   if (opsize == OS_LONG) {
> tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
>   } else {
> tcg_gen_extrl_i64_i32(QREG_CC_N, t64);
> tcg_gen_shri_i32(QREG_CC_C, QREG_CC_N, bits);
>   }
> 
> Since we zero-extend the input from bits, it's still true that a zero shift
> gets C=0 for free.
> 

It doesn't work because we work with word/byte where the bit sign has
been extended to the long word. So in the case of 0 shift, with retrieve
C= and not 0.

Laurent




Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-28 Thread Richard Henderson
On 11/27/2016 11:35 AM, Laurent Vivier wrote:
>> > +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
> This does not extract correctly the C flag when the opsize is word or byte.
> I think we should use a shift instead:
> 
> -tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
> -
> -/* Note that C=0 if shift count is 0, and we get that for free.  */
> +if (opsize == OS_LONG) {
> +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
> +/* Note that C=0 if shift count is 0, and we get that for
> free.  */
> +} else {
> +TCGv zero = tcg_const_i32(0);
> +tcg_gen_extrl_i64_i32(QREG_CC_N, t64);
> +tcg_gen_shri_i64(t64, t64, bits);
> +tcg_gen_extrl_i64_i32(QREG_CC_C, t64);
> +tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_C,
> +s32, zero, zero, QREG_CC_C);
> +tcg_temp_free(zero);
> +}
> 
> Do you have a better idea?

  if (opsize == OS_LONG) {
tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
  } else {
tcg_gen_extrl_i64_i32(QREG_CC_N, t64);
tcg_gen_shri_i32(QREG_CC_C, QREG_CC_N, bits);
  }

Since we zero-extend the input from bits, it's still true that a zero shift
gets C=0 for free.


r~



Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-28 Thread Richard Henderson
On 11/27/2016 11:30 AM, Laurent Vivier wrote:
> There is another bug on this one.
> 
> Le 09/11/2016 à 14:46, Richard Henderson a écrit :
>> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
>> index 4f224d7..1b3765f 100644
>> --- a/target-m68k/translate.c
>> +++ b/target-m68k/translate.c
>> +static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize)
> ...
>> +/* M68000 sets V if the most significant bit is changed at
>> + * any time during the shift operation.  Do this via creating
>> + * an extension of the sign bit, comparing, and discarding
>> + * the bits below the sign bit.  I.e.
>> + * int64_t s = (intN_t)reg;
>> + * int64_t t = (int64_t)(intN_t)reg << count;
>> + * V = ((s ^ t) & (-1 << (bits - 1))) != 0
>> + */
>> +if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) {
>> +/* Sign extend the input to 64 bits; re-do the shift.  */
>> +tcg_gen_ext_i32_i64(t64, reg);
>> +tcg_gen_shl_i64(s64, t64, s64);
>> +/* Clear all bits that are unchanged.  */
>> +tcg_gen_xor_i64(t64, t64, s64);
>> +/* Ignore the bits below the sign bit.  */
>> +tcg_gen_andi_i64(t64, t64, -1ULL << (bits - 1));
>> +/* If any bits remain set, we have overflow.  */
>> +tcg_gen_setcondi_i64(TCG_COND_NE, t64, t64, 0);
>> +tcg_gen_extrl_i64_i32(QREG_CC_V, t64);
>> +tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);
> 
> if s64 is greater than 32, we lose all the bits needed to compute the V
> flag. I think we can just add this to fix the problem:
> 
>  if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) {
> +TCGv_i64 tt = tcg_const_i64(32);
> +/* if shift is greater than 32, use 32 */
> +tcg_gen_movcond_i64(TCG_COND_GT, s64, s64, tt, tt, s64);
> +tcg_temp_free_i64(tt);
>  /* Sign extend the input to 64 bits; re-do the shift.  */
>  tcg_gen_ext_i32_i64(t64, reg);
>  tcg_gen_shl_i64(s64, t64, s64);

Hmm.  I guess the test case is input like 8 << 63, where the sign bit is clear,
the input is non-zero, and we shift out all of the set bits.

I can't think of anything cleaner than your movcond solution.


r~



Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-28 Thread Richard Henderson
On 11/27/2016 09:53 AM, Laurent Vivier wrote:
>> > +TCGv t0 = tcg_temp_new();
>> > +tcg_gen_sari_i32(QREG_CC_V, reg, bits - 1);
>> > +tcg_gen_sari_i32(t0, t0, bits - count);
> t0 is used unitialized, I think we should have here:
> 
> tcg_gen_sari_i32(t0, reg, bits - count - 1);
> 
> moreover we must use "bits - count - 1" to use also the current most
> significant bit to compute V flag.
> 

Yep.


r~



Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-27 Thread Laurent Vivier
Le 09/11/2016 à 14:46, Richard Henderson a écrit :
> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
> index 4f224d7..1b3765f 100644
> --- a/target-m68k/translate.c
> +++ b/target-m68k/translate.c

> +static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize)
> +{
> +int logical = insn & 8;
> +int left = insn & 0x100;
> +int bits = opsize_bytes(opsize) * 8;
> +TCGv reg = gen_extend(DREG(insn, 0), opsize, !logical);
> +TCGv s32;
> +TCGv_i64 t64, s64;
> +
> +t64 = tcg_temp_new_i64();
> +s64 = tcg_temp_new_i64();
> +s32 = tcg_temp_new();
> +
> +/* Note that m68k truncates the shift count modulo 64, not 32.
> +   In addition, a 64-bit shift makes it easy to find "the last
> +   bit shifted out", for the carry flag.  */
> +tcg_gen_andi_i32(s32, DREG(insn, 9), 63);
> +tcg_gen_extu_i32_i64(s64, s32);
> +tcg_gen_extu_i32_i64(t64, reg);
> +
> +/* Optimistically set V=0.  Also used as a zero source below.  */
> +tcg_gen_movi_i32(QREG_CC_V, 0);
> +if (left) {
> +tcg_gen_shl_i64(t64, t64, s64);
> +
> +tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);

This does not extract correctly the C flag when the opsize is word or byte.
I think we should use a shift instead:

-tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
-
-/* Note that C=0 if shift count is 0, and we get that for free.  */
+if (opsize == OS_LONG) {
+tcg_gen_extr_i64_i32(QREG_CC_N, QREG_CC_C, t64);
+/* Note that C=0 if shift count is 0, and we get that for
free.  */
+} else {
+TCGv zero = tcg_const_i32(0);
+tcg_gen_extrl_i64_i32(QREG_CC_N, t64);
+tcg_gen_shri_i64(t64, t64, bits);
+tcg_gen_extrl_i64_i32(QREG_CC_C, t64);
+tcg_gen_movcond_i32(TCG_COND_EQ, QREG_CC_C,
+s32, zero, zero, QREG_CC_C);
+tcg_temp_free(zero);
+}

Do you have a better idea?

thanks,
Laurent



Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-27 Thread Laurent Vivier
There is another bug on this one.

Le 09/11/2016 à 14:46, Richard Henderson a écrit :
> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
> index 4f224d7..1b3765f 100644
> --- a/target-m68k/translate.c
> +++ b/target-m68k/translate.c
> +static inline void shift_reg(DisasContext *s, uint16_t insn, int opsize)
...
> +/* M68000 sets V if the most significant bit is changed at
> + * any time during the shift operation.  Do this via creating
> + * an extension of the sign bit, comparing, and discarding
> + * the bits below the sign bit.  I.e.
> + * int64_t s = (intN_t)reg;
> + * int64_t t = (int64_t)(intN_t)reg << count;
> + * V = ((s ^ t) & (-1 << (bits - 1))) != 0
> + */
> +if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) {
> +/* Sign extend the input to 64 bits; re-do the shift.  */
> +tcg_gen_ext_i32_i64(t64, reg);
> +tcg_gen_shl_i64(s64, t64, s64);
> +/* Clear all bits that are unchanged.  */
> +tcg_gen_xor_i64(t64, t64, s64);
> +/* Ignore the bits below the sign bit.  */
> +tcg_gen_andi_i64(t64, t64, -1ULL << (bits - 1));
> +/* If any bits remain set, we have overflow.  */
> +tcg_gen_setcondi_i64(TCG_COND_NE, t64, t64, 0);
> +tcg_gen_extrl_i64_i32(QREG_CC_V, t64);
> +tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);

if s64 is greater than 32, we lose all the bits needed to compute the V
flag. I think we can just add this to fix the problem:

 if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) {
+TCGv_i64 tt = tcg_const_i64(32);
+/* if shift is greater than 32, use 32 */
+tcg_gen_movcond_i64(TCG_COND_GT, s64, s64, tt, tt, s64);
+tcg_temp_free_i64(tt);
 /* Sign extend the input to 64 bits; re-do the shift.  */
 tcg_gen_ext_i32_i64(t64, reg);
 tcg_gen_shl_i64(s64, t64, s64);

Is it correct?

Laurent



Re: [Qemu-devel] [PATCH v2 3/5] target-m68k: Inline shifts

2016-11-27 Thread Laurent Vivier
Le 09/11/2016 à 14:46, Richard Henderson a écrit :
> diff --git a/target-m68k/translate.c b/target-m68k/translate.c
> index 4f224d7..1b3765f 100644
> --- a/target-m68k/translate.c
> +++ b/target-m68k/translate.c
> @@ -2883,48 +2883,205 @@ DISAS_INSN(addx_mem)
>  gen_store(s, opsize, addr_dest, QREG_CC_N);
>  }
>  
> -/* TODO: This could be implemented without helper functions.  */
> -DISAS_INSN(shift_im)
> +static inline void shift_im(DisasContext *s, uint16_t insn, int opsize)
>  {
> -TCGv reg;
> -int tmp;
> -TCGv shift;
> +int count = (insn >> 9) & 7;
> +int logical = insn & 8;
> +int left = insn & 0x100;
> +int bits = opsize_bytes(opsize) * 8;
> +TCGv reg = gen_extend(DREG(insn, 0), opsize, !logical);
> +
> +if (count == 0) {
> +count = 8;
> +}
> +
> +tcg_gen_movi_i32(QREG_CC_V, 0);
> +if (left) {
> +tcg_gen_shri_i32(QREG_CC_C, reg, bits - count);
> +tcg_gen_shli_i32(QREG_CC_N, reg, count);
> +
> +/* Note that ColdFire always clears V (done above),
> +   while M68000 sets if the most significant bit is changed at
> +   any time during the shift operation */
> +if (!logical && m68k_feature(s->env, M68K_FEATURE_M68000)) {
> +/* if shift count >= bits, V is (reg != 0) */
> +if (count >= bits) {
> +tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, reg, QREG_CC_V);
> +} else {
> +TCGv t0 = tcg_temp_new();
> +tcg_gen_sari_i32(QREG_CC_V, reg, bits - 1);
> +tcg_gen_sari_i32(t0, t0, bits - count);

t0 is used unitialized, I think we should have here:

tcg_gen_sari_i32(t0, reg, bits - count - 1);

moreover we must use "bits - count - 1" to use also the current most
significant bit to compute V flag.

> +tcg_gen_setcond_i32(TCG_COND_NE, QREG_CC_V, QREG_CC_V, t0);
> +tcg_temp_free(t0);
> +}
> +tcg_gen_neg_i32(QREG_CC_V, QREG_CC_V);
> +}
> +} else {
> +tcg_gen_shri_i32(QREG_CC_C, reg, count - 1);
> +if (logical) {
> +tcg_gen_shri_i32(QREG_CC_N, reg, count);
> +} else {
> +tcg_gen_sari_i32(QREG_CC_N, reg, count);
> +}
> +}

Thanks,
Laurent