Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-18 Thread Aurelien Jarno
On Wed, Nov 18, 2009 at 01:01:13AM +0100, Alexander Graf wrote:

 On 18.11.2009, at 00:40, Aurelien Jarno wrote:

 On Wed, Nov 11, 2009 at 12:56:47AM +, Paul Brook wrote:
 On Thursday 22 October 2009, Aurelien Jarno wrote:
 On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
 sync allows concurrent accesses to locations in memory through  
 different
 TCG variables. This comes in handy when you are emulating CPU  
 registers
 that can be used as either 32 or 64 bit, as TCG doesn't know  
 anything
 about aliases. See the s390x target for an example.

 Fixed sync_i64 build failure on 32-bit targets.

 Looking more in details to the use case of this patch, I think it  
 can be
 useful in QEMU. However I don't feel very comfortable in merging it
 without having the opinion of more persons. Paul, Malc Blue Swirl or
 others, any opinion?

 I don't think this is the right solution.

 IIUC the basic problem is that we have a register file where  
 adjacent pairs of
 32-bit registers are also accessed as a 64-bit value.  This is  
 something many
 other targets need to do (at least ARM, PPC, MIPS and SPARC).

 While sync appears attractive as a quick hack to achieve this, I  
 think it is
 liable to be abused, and cause us serious pain long-term. If you  
 need an easy
 solution then use ld/st (as with ARM VFP registers). If you want a  
 good
 solution then fix whichever bit of TCG makes accessing a pair of  
 registers
 horribly slow. We already have some support for this  
 (concat_i32_i64).


 What is probably needed here are merge_low and merge_high ops, merging 
 a
 32-bit value into the low or high part of a 64-bit value, leaving the
 other part unchanged. Not sure we can really optimize that on x86/ 
 x86_64
 compared to standard TCG code though.

 Maybe I got the whole point wrong but isn't this about having 2 virtual 
 register sets for the same target registers? So you'd have:

  TCGvar my_reg_32;
  TCGvar my_reg_64;

 and whenever you work with either of them you want to have the correct  
 value present in both (cut off for 32bit, extended for 64 bit).

 So what's really necessary is some internal coupling and dirty log of  
 variables within TCG so it knows that an access to the 64 bit of a  
 coupled variable means it needs to sync the 32 bit value over and vice  
 versa. Then magically everything would just work as expected.


That's an option, basically keeping the list (or only one ?) of aliased 
TCG variables for each of them, and freeing the others before using one.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-18 Thread Alexander Graf
Aurelien Jarno wrote:
 On Wed, Nov 18, 2009 at 01:01:13AM +0100, Alexander Graf wrote:
   
 On 18.11.2009, at 00:40, Aurelien Jarno wrote:

 
 On Wed, Nov 11, 2009 at 12:56:47AM +, Paul Brook wrote:
   
 On Thursday 22 October 2009, Aurelien Jarno wrote:
 
 On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
   
 sync allows concurrent accesses to locations in memory through  
 different
 TCG variables. This comes in handy when you are emulating CPU  
 registers
 that can be used as either 32 or 64 bit, as TCG doesn't know  
 anything
 about aliases. See the s390x target for an example.

 Fixed sync_i64 build failure on 32-bit targets.
 
 Looking more in details to the use case of this patch, I think it  
 can be
 useful in QEMU. However I don't feel very comfortable in merging it
 without having the opinion of more persons. Paul, Malc Blue Swirl or
 others, any opinion?
   
 I don't think this is the right solution.

 IIUC the basic problem is that we have a register file where  
 adjacent pairs of
 32-bit registers are also accessed as a 64-bit value.  This is  
 something many
 other targets need to do (at least ARM, PPC, MIPS and SPARC).

 While sync appears attractive as a quick hack to achieve this, I  
 think it is
 liable to be abused, and cause us serious pain long-term. If you  
 need an easy
 solution then use ld/st (as with ARM VFP registers). If you want a  
 good
 solution then fix whichever bit of TCG makes accessing a pair of  
 registers
 horribly slow. We already have some support for this  
 (concat_i32_i64).

 
 What is probably needed here are merge_low and merge_high ops, merging 
 a
 32-bit value into the low or high part of a 64-bit value, leaving the
 other part unchanged. Not sure we can really optimize that on x86/ 
 x86_64
 compared to standard TCG code though.
   
 Maybe I got the whole point wrong but isn't this about having 2 virtual 
 register sets for the same target registers? So you'd have:

  TCGvar my_reg_32;
  TCGvar my_reg_64;

 and whenever you work with either of them you want to have the correct  
 value present in both (cut off for 32bit, extended for 64 bit).

 So what's really necessary is some internal coupling and dirty log of  
 variables within TCG so it knows that an access to the 64 bit of a  
 coupled variable means it needs to sync the 32 bit value over and vice  
 versa. Then magically everything would just work as expected.

 

 That's an option, basically keeping the list (or only one ?) of aliased 
 TCG variables for each of them, and freeing the others before using one.
   

Yeah, only one. I don't think this should be getting overengineered (and
thus slow) :-).
Doesn't really sound hard, does it? Any TCG magicians around? :)

Alex




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-18 Thread Paul Brook
  That's an option, basically keeping the list (or only one ?) of aliased
  TCG variables for each of them, and freeing the others before using one.
 
 Yeah, only one. I don't think this should be getting overengineered (and
 thus slow) :-).
 Doesn't really sound hard, does it? Any TCG magicians around? :)

Maybe. This is the sort of thing that tends to get harder the more you think 
about it.

Paul




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-17 Thread Aurelien Jarno
On Wed, Nov 11, 2009 at 12:56:47AM +, Paul Brook wrote:
 On Thursday 22 October 2009, Aurelien Jarno wrote:
  On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
   sync allows concurrent accesses to locations in memory through different
   TCG variables. This comes in handy when you are emulating CPU registers
   that can be used as either 32 or 64 bit, as TCG doesn't know anything
   about aliases. See the s390x target for an example.
  
   Fixed sync_i64 build failure on 32-bit targets.
  
  Looking more in details to the use case of this patch, I think it can be
  useful in QEMU. However I don't feel very comfortable in merging it
  without having the opinion of more persons. Paul, Malc Blue Swirl or
  others, any opinion?
 
 I don't think this is the right solution.
 
 IIUC the basic problem is that we have a register file where adjacent pairs 
 of 
 32-bit registers are also accessed as a 64-bit value.  This is something many 
 other targets need to do (at least ARM, PPC, MIPS and SPARC).
 
 While sync appears attractive as a quick hack to achieve this, I think it is 
 liable to be abused, and cause us serious pain long-term. If you need an easy 
 solution then use ld/st (as with ARM VFP registers). If you want a good 
 solution then fix whichever bit of TCG makes accessing a pair of registers 
 horribly slow. We already have some support for this (concat_i32_i64).
 

What is probably needed here are merge_low and merge_high ops, merging a
32-bit value into the low or high part of a 64-bit value, leaving the
other part unchanged. Not sure we can really optimize that on x86/x86_64
compared to standard TCG code though.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-17 Thread Alexander Graf


On 18.11.2009, at 00:40, Aurelien Jarno wrote:


On Wed, Nov 11, 2009 at 12:56:47AM +, Paul Brook wrote:

On Thursday 22 October 2009, Aurelien Jarno wrote:

On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
sync allows concurrent accesses to locations in memory through  
different
TCG variables. This comes in handy when you are emulating CPU  
registers
that can be used as either 32 or 64 bit, as TCG doesn't know  
anything

about aliases. See the s390x target for an example.

Fixed sync_i64 build failure on 32-bit targets.


Looking more in details to the use case of this patch, I think it  
can be

useful in QEMU. However I don't feel very comfortable in merging it
without having the opinion of more persons. Paul, Malc Blue Swirl or
others, any opinion?


I don't think this is the right solution.

IIUC the basic problem is that we have a register file where  
adjacent pairs of
32-bit registers are also accessed as a 64-bit value.  This is  
something many

other targets need to do (at least ARM, PPC, MIPS and SPARC).

While sync appears attractive as a quick hack to achieve this, I  
think it is
liable to be abused, and cause us serious pain long-term. If you  
need an easy
solution then use ld/st (as with ARM VFP registers). If you want a  
good
solution then fix whichever bit of TCG makes accessing a pair of  
registers
horribly slow. We already have some support for this  
(concat_i32_i64).




What is probably needed here are merge_low and merge_high ops,  
merging a

32-bit value into the low or high part of a 64-bit value, leaving the
other part unchanged. Not sure we can really optimize that on x86/ 
x86_64

compared to standard TCG code though.


Maybe I got the whole point wrong but isn't this about having 2  
virtual register sets for the same target registers? So you'd have:


 TCGvar my_reg_32;
 TCGvar my_reg_64;

and whenever you work with either of them you want to have the correct  
value present in both (cut off for 32bit, extended for 64 bit).


So what's really necessary is some internal coupling and dirty log of  
variables within TCG so it knows that an access to the 64 bit of a  
coupled variable means it needs to sync the 32 bit value over and vice  
versa. Then magically everything would just work as expected.


Or am I totally wrong here?

Alex




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-16 Thread Alexander Graf
Paul Brook wrote:
 On Thursday 22 October 2009, Aurelien Jarno wrote:
   
 On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
 
 sync allows concurrent accesses to locations in memory through different
 TCG variables. This comes in handy when you are emulating CPU registers
 that can be used as either 32 or 64 bit, as TCG doesn't know anything
 about aliases. See the s390x target for an example.

 Fixed sync_i64 build failure on 32-bit targets.
   
 Looking more in details to the use case of this patch, I think it can be
 useful in QEMU. However I don't feel very comfortable in merging it
 without having the opinion of more persons. Paul, Malc Blue Swirl or
 others, any opinion?
 

 I don't think this is the right solution.

 IIUC the basic problem is that we have a register file where adjacent pairs 
 of 
 32-bit registers are also accessed as a 64-bit value.  This is something many 
 other targets need to do (at least ARM, PPC, MIPS and SPARC).

 While sync appears attractive as a quick hack to achieve this, I think it is 
 liable to be abused, and cause us serious pain long-term. If you need an easy 
 solution then use ld/st (as with ARM VFP registers). If you want a good 
 solution then fix whichever bit of TCG makes accessing a pair of registers 
 horribly slow. We already have some support for this (concat_i32_i64).
   

Could we please include it nevertheless? I don't want to see S390 TCG
and KVM targets not included because of this sync operation.

If you like, add some big fat warning around it and maybe break
compilation if anyone but the s390 target uses that sync op, but let's
not keep a whole target from inclusion because of a single feature that
_might_ one day affect others.


Alex




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-16 Thread Paul Brook
  While sync appears attractive as a quick hack to achieve this, I think it
  is liable to be abused, and cause us serious pain long-term. If you need
  an easy solution then use ld/st (as with ARM VFP registers). If you want
  a good solution then fix whichever bit of TCG makes accessing a pair of
  registers horribly slow. We already have some support for this
  (concat_i32_i64).
 
 Could we please include it nevertheless? I don't want to see S390 TCG
 and KVM targets not included because of this sync operation.
 
 If you like, add some big fat warning around it and maybe break
 compilation if anyone but the s390 target uses that sync op, but let's
 not keep a whole target from inclusion because of a single feature that
 _might_ one day affect others.

I'd rather not include sync, and instead use the explicit ld/st code you 
already wrote.

Paul




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-16 Thread Alexander Graf
Paul Brook wrote:
 While sync appears attractive as a quick hack to achieve this, I think it
 is liable to be abused, and cause us serious pain long-term. If you need
 an easy solution then use ld/st (as with ARM VFP registers). If you want
 a good solution then fix whichever bit of TCG makes accessing a pair of
 registers horribly slow. We already have some support for this
 (concat_i32_i64).
   
 Could we please include it nevertheless? I don't want to see S390 TCG
 and KVM targets not included because of this sync operation.

 If you like, add some big fat warning around it and maybe break
 compilation if anyone but the s390 target uses that sync op, but let's
 not keep a whole target from inclusion because of a single feature that
 _might_ one day affect others.
 

 I'd rather not include sync, and instead use the explicit ld/st code you 
 already wrote.
   

As long as the KVM code comes in I'm good. I don't really care that much
about the TCG parts and only need them to have the build system
surroundings set up.

So yes, I don't care about sync. As long as we finally get any solution
here I'm good, but patches lying around for weeks surely doesn't help qemu.


Alex




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-11-10 Thread Paul Brook
On Thursday 22 October 2009, Aurelien Jarno wrote:
 On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
  sync allows concurrent accesses to locations in memory through different
  TCG variables. This comes in handy when you are emulating CPU registers
  that can be used as either 32 or 64 bit, as TCG doesn't know anything
  about aliases. See the s390x target for an example.
 
  Fixed sync_i64 build failure on 32-bit targets.
 
 Looking more in details to the use case of this patch, I think it can be
 useful in QEMU. However I don't feel very comfortable in merging it
 without having the opinion of more persons. Paul, Malc Blue Swirl or
 others, any opinion?

I don't think this is the right solution.

IIUC the basic problem is that we have a register file where adjacent pairs of 
32-bit registers are also accessed as a 64-bit value.  This is something many 
other targets need to do (at least ARM, PPC, MIPS and SPARC).

While sync appears attractive as a quick hack to achieve this, I think it is 
liable to be abused, and cause us serious pain long-term. If you need an easy 
solution then use ld/st (as with ARM VFP registers). If you want a good 
solution then fix whichever bit of TCG makes accessing a pair of registers 
horribly slow. We already have some support for this (concat_i32_i64).

Paul




[Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-10-22 Thread Aurelien Jarno
On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
 sync allows concurrent accesses to locations in memory through different TCG
 variables. This comes in handy when you are emulating CPU registers that can
 be used as either 32 or 64 bit, as TCG doesn't know anything about aliases.
 See the s390x target for an example.
 
 Fixed sync_i64 build failure on 32-bit targets.

Looking more in details to the use case of this patch, I think it can be
useful in QEMU. However I don't feel very comfortable in merging it
without having the opinion of more persons. Paul, Malc Blue Swirl or
others, any opinion?

Anyway this patch should come with (simple) documentation to drop into
tcg/README.

Otherwise, please find my comments inline.

 Signed-off-by: Ulrich Hecht u...@suse.de
 ---
  tcg/tcg-op.h  |   12 
  tcg/tcg-opc.h |2 ++
  tcg/tcg.c |6 ++
  3 files changed, 20 insertions(+), 0 deletions(-)
 
 diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
 index faf2e8b..c1b4710 100644
 --- a/tcg/tcg-op.h
 +++ b/tcg/tcg-op.h
 @@ -316,6 +316,18 @@ static inline void tcg_gen_br(int label)
  tcg_gen_op1i(INDEX_op_br, label);
  }
  
 +static inline void tcg_gen_sync_i32(TCGv_i32 arg)
 +{
 +tcg_gen_op1_i32(INDEX_op_sync_i32, arg);
 +}
 +
 +#if TCG_TARGET_REG_BITS == 64
 +static inline void tcg_gen_sync_i64(TCGv_i64 arg)
 +{
 +tcg_gen_op1_i64(INDEX_op_sync_i64, arg);
 +}
 +#endif
 +
  static inline void tcg_gen_mov_i32(TCGv_i32 ret, TCGv_i32 arg)
  {
  if (!TCGV_EQUAL_I32(ret, arg))

You may also want to add a sync_tl as a #define at the end of the file.

 diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
 index b7f3fd7..5dcdeba 100644
 --- a/tcg/tcg-opc.h
 +++ b/tcg/tcg-opc.h
 @@ -40,6 +40,7 @@ DEF2(call, 0, 1, 2, TCG_OPF_SIDE_EFFECTS) /* variable 
 number of parameters */
  DEF2(jmp, 0, 1, 0, TCG_OPF_BB_END | TCG_OPF_SIDE_EFFECTS)
  DEF2(br, 0, 0, 1, TCG_OPF_BB_END | TCG_OPF_SIDE_EFFECTS)
  
 +DEF2(sync_i32, 0, 1, 0, 0)
  DEF2(mov_i32, 1, 1, 0, 0)
  DEF2(movi_i32, 1, 0, 1, 0)
  /* load/store */
 @@ -109,6 +110,7 @@ DEF2(neg_i32, 1, 1, 0, 0)
  #endif
  
  #if TCG_TARGET_REG_BITS == 64
 +DEF2(sync_i64, 0, 1, 0, 0)
  DEF2(mov_i64, 1, 1, 0, 0)
  DEF2(movi_i64, 1, 0, 1, 0)
  /* load/store */
 diff --git a/tcg/tcg.c b/tcg/tcg.c
 index 3c0e296..8eb60f8 100644
 --- a/tcg/tcg.c
 +++ b/tcg/tcg.c
 @@ -1930,6 +1930,12 @@ static inline int tcg_gen_code_common(TCGContext *s, 
 uint8_t *gen_code_buf,
  //dump_regs(s);
  #endif
  switch(opc) {
 +case INDEX_op_sync_i32:
 +#if TCG_TARGET_REG_BITS == 64
 +case INDEX_op_sync_i64:
 +#endif
 +temp_save(s, args[0], s-reserved_regs);
 +break;

This won't work for fixed register. You should probably add something
like:
if (s-temps[args[0]].fixed_reg) {
tcg_abort();
}

  case INDEX_op_mov_i32:
  #if TCG_TARGET_REG_BITS == 64
  case INDEX_op_mov_i64:
 -- 
 1.6.2.1
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net




Re: [Qemu-devel] Re: [PATCH 01/12] TCG sync op

2009-10-22 Thread malc
On Thu, 22 Oct 2009, Aurelien Jarno wrote:

 On Wed, Oct 21, 2009 at 03:52:22PM +0200, Ulrich Hecht wrote:
  sync allows concurrent accesses to locations in memory through different TCG
  variables. This comes in handy when you are emulating CPU registers that can
  be used as either 32 or 64 bit, as TCG doesn't know anything about aliases.
  See the s390x target for an example.
  
  Fixed sync_i64 build failure on 32-bit targets.
 
 Looking more in details to the use case of this patch, I think it can be
 useful in QEMU. However I don't feel very comfortable in merging it
 without having the opinion of more persons. Paul, Malc Blue Swirl or
 others, any opinion?

I don't really understand the problem/solution so no - no opinion.

[..snip..]

-- 
mailto:av1...@comtv.ru