[Qemu-devel] Re: [PATCH] target-ppc: ext32u instead of andi with constant

2011-03-22 Thread Alexander Graf

On 22.03.2011, at 07:41, Aurelien Jarno wrote:

> Cc: Alexander Graf 
> Signed-off-by: Aurelien Jarno 
> ---
> target-ppc/translate.c |    8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 3d265e3..49eab28 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -6975,7 +6975,7 @@ static inline void gen_evmergelo(DisasContext *ctx)
> #if defined(TARGET_PPC64)
> TCGv t0 = tcg_temp_new();
> TCGv t1 = tcg_temp_new();
> -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0xLL);
> +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx->opcode)]);
> tcg_gen_shli_tl(t1, cpu_gpr[rA(ctx->opcode)], 32);
> tcg_gen_or_tl(cpu_gpr[rD(ctx->opcode)], t0, t1);
> tcg_temp_free(t0);
> @@ -6994,7 +6994,7 @@ static inline void gen_evmergehilo(DisasContext *ctx)
> #if defined(TARGET_PPC64)
> TCGv t0 = tcg_temp_new();
> TCGv t1 = tcg_temp_new();
> -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0xLL);
> +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx->opcode)]);
> tcg_gen_andi_tl(t1, cpu_gpr[rA(ctx->opcode)], 0x000ULL);

Wouldn't deposit make sense here? But that can be a later optimization.

> tcg_gen_or_tl(cpu_gpr[rD(ctx->opcode)], t0, t1);
> tcg_temp_free(t0);
> @@ -7083,14 +7083,14 @@ static inline void gen_evsel(DisasContext *ctx)
> tcg_gen_andi_i32(t0, cpu_crf[ctx->opcode & 0x07], 1 << 2);
> tcg_gen_brcondi_i32(TCG_COND_EQ, t0, 0, l3);
> #if defined(TARGET_PPC64)
> -tcg_gen_andi_tl(t2, cpu_gpr[rA(ctx->opcode)], 0xULL);
> +tcg_gen_ext32u_tl(t2, cpu_gpr[rA(ctx->opcode)]);
> #else
> tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
> #endif
> tcg_gen_br(l4);
> gen_set_label(l3);
> #if defined(TARGET_PPC64)
> -tcg_gen_andi_tl(t2, cpu_gpr[rB(ctx->opcode)], 0xULL);
> +tcg_gen_ext32u_tl(t2, cpu_gpr[rB(ctx->opcode)]);
> #else
> tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rB(ctx->opcode)]);
> #endif

Acked-by: Alexander Graf 


Alex
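
For readers following the patch, here is a minimal sketch in plain C (not TCG code) of why the two operations are interchangeable on a 64-bit target: ANDing with a constant that keeps only the low 32 bits gives the same result as zero-extending the low 32 bits. The helper names and_low32, ext32u, evmergelo_model and check_equivalence are hypothetical and exist only for this illustration. They model the values computed, not the code that TCG emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the old code: AND with a low-32-bit mask. */
uint64_t and_low32(uint64_t x)
{
    return x & UINT64_C(0xFFFFFFFF);
}

/* Hypothetical model of ext32u: zero-extend the low 32 bits to 64 bits. */
uint64_t ext32u(uint64_t x)
{
    return (uint64_t)(uint32_t)x;
}

/* Model of the evmergelo data flow in the quoted hunk: rD = t0 | t1. */
uint64_t evmergelo_model(uint64_t ra, uint64_t rb)
{
    uint64_t t0 = ext32u(rb); /* low word of rB stays in the low half */
    uint64_t t1 = ra << 32;   /* low word of rA moves to the high half */
    return t0 | t1;
}

/* Spot-check that both formulations agree on a few sample values. */
void check_equivalence(void)
{
    const uint64_t samples[] = {
        UINT64_C(0),
        UINT64_C(0xFFFFFFFFFFFFFFFF),
        UINT64_C(0x123456789ABCDEF0),
        UINT64_C(0x8000000080000000),
    };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        assert(and_low32(samples[i]) == ext32u(samples[i]));
    }
}
```

Since the results are identical, the patch does not change behaviour. The benefit is presumably that a backend can pick a dedicated zero-extension instruction instead of materialising a 64-bit constant.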




Re: [Qemu-devel] Re: [PATCH] target-ppc: ext32u instead of andi with constant

2011-03-22 Thread Aurelien Jarno
On Tue, Mar 22, 2011 at 08:36:26AM +0100, Alexander Graf wrote:
> 
> On 22.03.2011, at 07:41, Aurelien Jarno wrote:
> 
> > Cc: Alexander Graf 
> > Signed-off-by: Aurelien Jarno 
> > ---
> > target-ppc/translate.c |    8 ++++----
> > 1 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> > index 3d265e3..49eab28 100644
> > --- a/target-ppc/translate.c
> > +++ b/target-ppc/translate.c
> > @@ -6975,7 +6975,7 @@ static inline void gen_evmergelo(DisasContext *ctx)
> > #if defined(TARGET_PPC64)
> > TCGv t0 = tcg_temp_new();
> > TCGv t1 = tcg_temp_new();
> > -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0xLL);
> > +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx->opcode)]);
> > tcg_gen_shli_tl(t1, cpu_gpr[rA(ctx->opcode)], 32);
> > tcg_gen_or_tl(cpu_gpr[rD(ctx->opcode)], t0, t1);
> > tcg_temp_free(t0);
> > @@ -6994,7 +6994,7 @@ static inline void gen_evmergehilo(DisasContext *ctx)
> > #if defined(TARGET_PPC64)
> > TCGv t0 = tcg_temp_new();
> > TCGv t1 = tcg_temp_new();
> > -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0xLL);
> > +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx->opcode)]);
> > tcg_gen_andi_tl(t1, cpu_gpr[rA(ctx->opcode)], 0x000ULL);
> 
> Wouldn't deposit make sense here? But that can be a later optimization.

Indeed, it makes sense here. The thing is that I don't really know where
deposit stands: we have merged it, but we are not seeing the expected
performance gain (i.e. no improvement).

> > tcg_gen_or_tl(cpu_gpr[rD(ctx->opcode)], t0, t1);
> > tcg_temp_free(t0);
> > @@ -7083,14 +7083,14 @@ static inline void gen_evsel(DisasContext *ctx)
> > tcg_gen_andi_i32(t0, cpu_crf[ctx->opcode & 0x07], 1 << 2);
> > tcg_gen_brcondi_i32(TCG_COND_EQ, t0, 0, l3);
> > #if defined(TARGET_PPC64)
> > -tcg_gen_andi_tl(t2, cpu_gpr[rA(ctx->opcode)], 0xULL);
> > +tcg_gen_ext32u_tl(t2, cpu_gpr[rA(ctx->opcode)]);
> > #else
> > tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]);
> > #endif
> > tcg_gen_br(l4);
> > gen_set_label(l3);
> > #if defined(TARGET_PPC64)
> > -tcg_gen_andi_tl(t2, cpu_gpr[rB(ctx->opcode)], 0xULL);
> > +tcg_gen_ext32u_tl(t2, cpu_gpr[rB(ctx->opcode)]);
> > #else
> > tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rB(ctx->opcode)]);
> > #endif
> 
> Acked-by: Alexander Graf 
> 
> 
> Alex
> 
> 
> 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] Re: [PATCH] target-ppc: ext32u instead of andi with constant

2011-03-22 Thread Alexander Graf

On 22.03.2011, at 10:09, Aurelien Jarno wrote:

> On Tue, Mar 22, 2011 at 08:36:26AM +0100, Alexander Graf wrote:
>> 
>> On 22.03.2011, at 07:41, Aurelien Jarno wrote:
>> 
>>> Cc: Alexander Graf 
>>> Signed-off-by: Aurelien Jarno 
>>> ---
>>> target-ppc/translate.c |    8 ++++----
>>> 1 files changed, 4 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
>>> index 3d265e3..49eab28 100644
>>> --- a/target-ppc/translate.c
>>> +++ b/target-ppc/translate.c
>>> @@ -6975,7 +6975,7 @@ static inline void gen_evmergelo(DisasContext *ctx)
>>> #if defined(TARGET_PPC64)
>>>TCGv t0 = tcg_temp_new();
>>>TCGv t1 = tcg_temp_new();
>>> -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0xLL);
>>> +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx->opcode)]);
>>>tcg_gen_shli_tl(t1, cpu_gpr[rA(ctx->opcode)], 32);
>>>tcg_gen_or_tl(cpu_gpr[rD(ctx->opcode)], t0, t1);
>>>tcg_temp_free(t0);
>>> @@ -6994,7 +6994,7 @@ static inline void gen_evmergehilo(DisasContext *ctx)
>>> #if defined(TARGET_PPC64)
>>>TCGv t0 = tcg_temp_new();
>>>TCGv t1 = tcg_temp_new();
>>> -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx->opcode)], 0xLL);
>>> +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx->opcode)]);
>>>tcg_gen_andi_tl(t1, cpu_gpr[rA(ctx->opcode)], 0x000ULL);
>> 
>> Wouldn't deposit make sense here? But that can be a later optimization.
> 
> Indeed, it makes sense here. The thing is that I don't really know where
> deposit stands: we have merged it, but we are not seeing the expected
> performance gain (i.e. no improvement).

Yup, because on x86 we have to special-case a lot, and the current x86 deposit
implementation doesn't handle many of those cases :).

On ppc, for example, the world looks different. And even without a speedup, I
like the cleanup it provides.

Btw, I do use deposit in the s390 target implementation, which is finally able
to run Debian guests as well. All I need to do to finish it up is to split it
into smaller, readable patches. So please decide on deposit soon, otherwise
we'll have a user of it merged :).


Alex
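
To make the deposit suggestion concrete, here is a minimal sketch in plain C of what a deposit operation computes and how the two merge instructions from the patch could be expressed with it. This is only a model of the semantics, not TCG or QEMU code. The names deposit64_model, evmergehilo_maskor, evmergehilo_deposit and evmergelo_deposit are hypothetical, and the argument order (base, field, position, length) is an assumption chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of deposit: return 'base' with the bits
 * [pos, pos + len) replaced by the low 'len' bits of 'field'.
 * Requires 0 < len <= 64 and pos + len <= 64.
 */
uint64_t deposit64_model(uint64_t base, uint64_t field,
                         unsigned pos, unsigned len)
{
    assert(len > 0 && len <= 64 && pos <= 64 - len);
    /* Avoid shifting by 64, which is undefined in C. */
    uint64_t mask = (len == 64) ? ~UINT64_C(0)
                                : ((UINT64_C(1) << len) - 1);
    return (base & ~(mask << pos)) | ((field & mask) << pos);
}

/* evmergehilo as in the patch: two masks and an OR. */
uint64_t evmergehilo_maskor(uint64_t ra, uint64_t rb)
{
    uint64_t t0 = (uint64_t)(uint32_t)rb;            /* low word of rB */
    uint64_t t1 = ra & UINT64_C(0xFFFFFFFF00000000); /* high word of rA */
    return t0 | t1;
}

/* The same result with one deposit: low word of rB into rA. */
uint64_t evmergehilo_deposit(uint64_t ra, uint64_t rb)
{
    return deposit64_model(ra, rb, 0, 32);
}

/* evmergelo with one deposit: low word of rA into the high half of rB. */
uint64_t evmergelo_deposit(uint64_t ra, uint64_t rb)
{
    return deposit64_model(rb, ra, 32, 32);
}
```

The generic mask/shift/OR body of deposit64_model is roughly what a backend has to fall back to when it has no matching native instruction. That would be consistent with the lack of speedup mentioned above for x86, while a host with a native bit-field insert could do the whole merge in one instruction.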
