On 07/07/2014 11:13 AM, Bastian Koppelmann wrote:
> +target_ulong helper_shac(CPUTRICOREState *env, target_ulong r1,
> +    target_ulong r2)
Align the r2 argument properly.

> +    shift_count = 0 - const6;
> +    cond = r1 & 0x80000000;
> +    if (cond != 0) {
> +        msk = (((1 << shift_count) - 1) << (32 - shift_count));
> +    } else {
> +        msk = 0;
> +    }
> +    ret = msk | (r1 >> shift_count);

Surely using a normal signed right shift is easier here.

> +static int sign_extend(uint32_t val, uint32_t width)
> +{
> +    int sval;
> +    /* LSL.  */
> +    val <<= 31 - width;
> +    sval = val;
> +    /* ASR.  */
> +    sval >>= 31 - width;
> +    return sval;
> +}

This is sextract32.

> +#define OP_COND(insn)                                                    \
> +static inline void gen_cond_##insn(int cond, TCGv r1, TCGv r2, TCGv r3,  \
> +                                   TCGv r4)                              \
> +{                                                                        \
> +    int label = gen_new_label();                                         \
> +    int label2 = gen_new_label();                                        \
> +                                                                         \
> +    tcg_gen_brcondi_tl(cond, r4, 0, label);                              \
> +    tcg_gen_mov_tl(r3, r1);                                              \
> +    tcg_gen_br(label2);                                                  \
> +    gen_set_label(label);                                                \
> +    tcg_gen_##insn ## _tl(r3, r1, r2);                                   \
> +    gen_set_label(label2);                                               \
> +}                                                                        \

Use tcg_gen_movcond_tl with the "insn" computation into a temporary.

> +static inline void gen_cond_mov(int cond, TCGv r1, TCGv r2, TCGv r3,
> +                                TCGv r4)
> +{
> +    int label = gen_new_label();
> +    int label2 = gen_new_label();
> +
> +    tcg_gen_brcondi_tl(cond, r4, 0, label);
> +    tcg_gen_mov_tl(r3, r1);
> +    tcg_gen_br(label2);
> +    gen_set_label(label);
> +    tcg_gen_mov_tl(r3, r2);
> +    gen_set_label(label2);
> +}

Use movcond.

> +static void gen_sh(TCGv ret, TCGv r1, TCGv r2)
> +{
> +    int label, label2;
> +    label = gen_new_label();
> +    label2 = gen_new_label();
> +    tcg_gen_brcondi_tl(TCG_COND_GE, r2, 0, label);
> +    /* r1 >> (-r2) */
> +    tcg_gen_shr_tl(ret, r1, r2);
> +    tcg_gen_brcond_tl(TCG_COND_EQ, r2, r2, label2);
> +    gen_set_label(label);
> +    /* r1 << r2 */
> +    tcg_gen_shl_tl(ret, r1, r2);
> +    gen_set_label(label2);

The r2 == r2 brcond is just silly.  There's br (no cond) for that.

The branches are going to break most of the code that uses this, since
TCG temporaries not allocated with _local are discarded.
And TCG temporaries allocated *with* _local are more expensive, and thus
best avoided when possible.  You'll be best off performing both shifts
unconditionally and selecting the correct result with movcond.

I do not see code to handle extracting the relevant bits from 0:5, nor
perform the required negation, nor handle the special case of -32
(resulting in zero), nor are you examining the correct bit for the sign.

Note that the shift-by-32-equals-zero case can be handled easily, since
the 2's complement negation is 1's complement + 1.  Thus the whole
operation can be implemented like

  count = r2 & 31;
  lret = r1 << count;

  count = count ^ 31;     /* one's complement of the field */
  rret = r1 >> count;     /* shr by count ... */
  rret = rret >> 1;       /* ... + 1 */

  neg = r2 & 32;
  ret = neg ? rret : lret;

> +static void gen_shi(TCGv ret, TCGv r1, int32_t r2)
> +{
> +    TCGv temp = tcg_const_i32(r2);
> +    gen_sh(ret, r1, temp);
> +    tcg_temp_free(temp);
> +}

You'd do well to examine the contents of the immediate here, performing
the checks at compile time that you'd perform at runtime in the fully
variable function above.  While the TCG optimizer might be able to clean
up the code, shifts are common enough that it's worth the effort to do
the right thing in the first place.

> +    case OPC1_16_SRC_ADD:
> +        const16 = sign_extend(MASK_OP_SRC_CONST4(ctx->opcode), 3);
> +        r1 = MASK_OP_SRC_S1D(ctx->opcode);

Surely this function can be improved by hoisting this computation.  It's
identical for every entry, since they're all the same insn format.  Note
that you can afford to compute both const16 and r2 at the top, even
though only one or the other will be used.


r~