On 5/12/15 00:55, Richard Henderson wrote:
>> +static void gen_v1cmpeqi(struct DisasContext *dc,
>> +                         uint8_t rdst, uint8_t rsrc, uint8_t imm8)
>> +{
>> +    int count;
>> +    TCGv vdst = dest_gr(dc, rdst);
>> +    TCGv tmp = tcg_temp_new_i64();
>> +
>> +    qemu_log_mask(CPU_LOG_TB_IN_ASM, "v1cmpeqi r%d, r%d, %d\n",
>> +                  rdst, rsrc, imm8);
>> +
>> +    tcg_gen_movi_i64(vdst, 0);
>> +
>> +    for (count = 0; count < 8; count++) {
>> +        tcg_gen_shri_i64(tmp, load_gr(dc, rsrc), (8 - count - 1) * 8);
>> +        tcg_gen_andi_i64(tmp, tmp, 0xff);
>> +        tcg_gen_setcondi_i64(TCG_COND_EQ, tmp, tmp, imm8);
>> +        tcg_gen_or_i64(vdst, vdst, tmp);
>> +        tcg_gen_shli_i64(vdst, vdst, 8);
> For all of these vector instructions, I would encourage you to use helpers
> to extract and insert values.  Extraction has little choice but to use a
> shift and a mask, as you use here.  But insertion can use
> tcg_gen_deposit_i64.  I think that is a lot easier to reason with than your
> construction here which sequentially shifts vdst.
> 
> E.g.
> 
> static inline void
> extract_v1(TCGv out, TCGv in, unsigned byte)
> {
>   tcg_gen_shri_i64(out, in, byte * 8);
>   tcg_gen_ext8u_i64(out, out);
> }
> 
> static inline void
> insert_v1(TCGv out, TCGv in, unsigned byte)
> {
>   tcg_gen_deposit_i64(out, out, in, byte * 8, 8);
> }
> 
> This loop then becomes
> 
>       TCGv vsrc = load_gr(dc, src);
>       for (count = 0; count < 8; ++count) {
>           extract_v1(tmp, vsrc, count);
>           tcg_gen_setcondi_i64(TCG_COND_EQ, tmp, tmp, imm8);
>           insert_v1(vdst, tmp, count);
>       }
> 

It also needs "tcg_gen_movi_i64(vdst, 0);" before the loop, or it will
trigger the assertion `ts->val_type == TEMP_VAL_REG' in debug mode, since
insert_v1 (tcg_gen_deposit_i64) reads vdst before anything has been stored
to it.
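
For reference, a minimal sketch of how the whole function might then look
with that initialization added, assuming your extract_v1/insert_v1 helpers
quoted above and the dest_gr/load_gr helpers already in the patch:

static void gen_v1cmpeqi(struct DisasContext *dc,
                         uint8_t rdst, uint8_t rsrc, uint8_t imm8)
{
    int count;
    TCGv vdst = dest_gr(dc, rdst);
    TCGv vsrc = load_gr(dc, rsrc);
    TCGv tmp = tcg_temp_new_i64();

    qemu_log_mask(CPU_LOG_TB_IN_ASM, "v1cmpeqi r%d, r%d, %d\n",
                  rdst, rsrc, imm8);

    /* Must come first: insert_v1 (tcg_gen_deposit_i64) reads vdst. */
    tcg_gen_movi_i64(vdst, 0);

    for (count = 0; count < 8; count++) {
        extract_v1(tmp, vsrc, count);                       /* byte of rsrc */
        tcg_gen_setcondi_i64(TCG_COND_EQ, tmp, tmp, imm8);  /* 0 or 1 */
        insert_v1(vdst, tmp, count);                        /* same byte of rdst */
    }

    tcg_temp_free_i64(tmp);
}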

And I shall try to send the patch within one day (sorry for being a little
late).


Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
