Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-06 Thread Paolo Bonzini

diff --git a/exec.c b/exec.c
index 5f2f87e..f281ba4 100644
--- a/exec.c
+++ b/exec.c
@@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, 
target_phys_addr_t len,
  }

  /* warning: addr must be aligned */
-uint32_t ldl_phys(target_phys_addr_t addr)
+static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
+ enum device_endian endian)


You probably need the __always_inline__ attribute to really convince GCC 
to inline this.


Paolo



Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-06 Thread Alexander Graf




On 06.07.2011, at 12:24, Paolo Bonzini pbonz...@redhat.com wrote:

 diff --git a/exec.c b/exec.c
 index 5f2f87e..f281ba4 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, 
 target_phys_addr_t len,
  }
 
  /* warning: addr must be aligned */
 -uint32_t ldl_phys(target_phys_addr_t addr)
 +static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
 + enum device_endian endian)
 
 You probably need the __always_inline__ attribute to really convince GCC to 
 inline this.

There's a define in osdep.h that converts inline into always_inline :)

Alex

 



Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-06 Thread Hannes Reinecke

On 07/06/2011 01:34 PM, Alexander Graf wrote:





On 06.07.2011, at 12:24, Paolo Bonzinipbonz...@redhat.com  wrote:


diff --git a/exec.c b/exec.c
index 5f2f87e..f281ba4 100644
--- a/exec.c
+++ b/exec.c
@@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, 
target_phys_addr_t len,
  }

  /* warning: addr must be aligned */
-uint32_t ldl_phys(target_phys_addr_t addr)
+static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
+ enum device_endian endian)


You probably need the __always_inline__ attribute to really convince GCC to 
inline this.


There's a define in osdep.h that converts inline into always_inline :)


Btw, while you're at it:

uint32_t ldub_phys(target_phys_addr_t addr);
uint32_t lduw_phys(target_phys_addr_t addr);

Hmm? ldub is supposed to read an 'unsigned byte' (uint8_t),
and lduw is supposed to read an 'unsigned word' (uint16_t).

Why does it return an uint32_t?

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-06 Thread Alexander Graf

Am 06.07.2011 um 15:03 schrieb Hannes Reinecke h...@suse.de:

 On 07/06/2011 01:34 PM, Alexander Graf wrote:
 
 
 
 
 On 06.07.2011, at 12:24, Paolo Bonzinipbonz...@redhat.com  wrote:
 
 diff --git a/exec.c b/exec.c
 index 5f2f87e..f281ba4 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, 
 target_phys_addr_t len,
  }
 
  /* warning: addr must be aligned */
 -uint32_t ldl_phys(target_phys_addr_t addr)
 +static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
 + enum device_endian endian)
 
 You probably need the __always_inline__ attribute to really convince GCC to 
 inline this.
 
 There's a define in osdep.h that converts inline into always_inline :)
 
 Btw, while you're at it:
 
 uint32_t ldub_phys(target_phys_addr_t addr);
 uint32_t lduw_phys(target_phys_addr_t addr);
 
 Hmm? ldub is supposed to read an 'unsigned byte' (uint8_t),
 and lduw is supposed to read an 'unsigned word' (uint16_t).
 
 Why does it return an uint32_t?

Because it ends up being a 32-bit register for the return value / parameter 
anyways :).

But this is a different thing. I'd prefer to  focus on the endian problem for 
now.


Alex




Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-06 Thread Paolo Bonzini

On 07/06/2011 03:03 PM, Hannes Reinecke wrote:


uint32_t ldub_phys(target_phys_addr_t addr);
uint32_t lduw_phys(target_phys_addr_t addr);

Hmm? ldub is supposed to read an 'unsigned byte' (uint8_t),
and lduw is supposed to read an 'unsigned word' (uint16_t).

Why does it return an uint32_t?


I don't know if this is the reason, but uint{8,16}_t are promoted to a 
signed int.  So when you do


  (uint64_t) (ldub_phys (addr)  24)

you'd get a sign extension in bits 32-63.  Admittedly a bit contrived, 
but it can happen and QEMU is full of such bugs:


case 4:
lba = (uint64_t) buf[9] | ((uint64_t) buf[8]  8) |
  ((uint64_t) buf[7]  16) | ((uint64_t) buf[6]  24) |
  ((uint64_t) buf[5]  32) | ((uint64_t) buf[4]  40) |
  ((uint64_t) buf[3]  48) | ((uint64_t) buf[2]  56);
break;

(found by Coverity).

This was the reason for my series at 
http://permalink.gmane.org/gmane.comp.emulators.qemu/105336 (which you 
reminded me to ping---thanks!)


Paolo



[Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Alexander Graf
Device code some times needs to access physical memory and does that
through the ld./st._phys functions. However, these are the exact same
functions that the CPU uses to access memory, which means they will
be endianness swapped depending on the target CPU.

However, devices don't know about the CPU's endianness, but instead
access memory directly using their own interface to the memory bus,
so they need some way to read data with their native endianness.

This patch adds _le and _be functions to ld./st._phys.

Signed-off-by: Alexander Graf ag...@suse.de
---
 cpu-common.h |   12 +++
 exec.c   |  102 ++
 2 files changed, 114 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index b027e43..c6a2b5f 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);
 
 uint32_t ldub_phys(target_phys_addr_t addr);
 uint32_t lduw_phys(target_phys_addr_t addr);
+uint32_t lduw_le_phys(target_phys_addr_t addr);
+uint32_t lduw_be_phys(target_phys_addr_t addr);
 uint32_t ldl_phys(target_phys_addr_t addr);
+uint32_t ldl_le_phys(target_phys_addr_t addr);
+uint32_t ldl_be_phys(target_phys_addr_t addr);
 uint64_t ldq_phys(target_phys_addr_t addr);
+uint64_t ldq_le_phys(target_phys_addr_t addr);
+uint64_t ldq_be_phys(target_phys_addr_t addr);
 void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
 void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
 void stb_phys(target_phys_addr_t addr, uint32_t val);
 void stw_phys(target_phys_addr_t addr, uint32_t val);
+void stw_le_phys(target_phys_addr_t addr, uint32_t val);
+void stw_be_phys(target_phys_addr_t addr, uint32_t val);
 void stl_phys(target_phys_addr_t addr, uint32_t val);
+void stl_le_phys(target_phys_addr_t addr, uint32_t val);
+void stl_be_phys(target_phys_addr_t addr, uint32_t val);
 void stq_phys(target_phys_addr_t addr, uint64_t val);
+void stq_le_phys(target_phys_addr_t addr, uint64_t val);
+void stq_be_phys(target_phys_addr_t addr, uint64_t val);
 
 void cpu_physical_memory_write_rom(target_phys_addr_t addr,
const uint8_t *buf, int len);
diff --git a/exec.c b/exec.c
index 4c45299..5f2f87e 100644
--- a/exec.c
+++ b/exec.c
@@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
 return val;
 }
 
+uint32_t ldl_le_phys(target_phys_addr_t addr)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return bswap32(ldl_phys(addr));
+#else
+return ldl_phys(addr);
+#endif
+}
+
+uint32_t ldl_be_phys(target_phys_addr_t addr)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return ldl_phys(addr);
+#else
+return bswap32(ldl_phys(addr));
+#endif
+}
+
 /* warning: addr must be aligned */
 uint64_t ldq_phys(target_phys_addr_t addr)
 {
@@ -4196,6 +4214,24 @@ uint64_t ldq_phys(target_phys_addr_t addr)
 return val;
 }
 
+uint64_t ldq_le_phys(target_phys_addr_t addr)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return bswap64(ldq_phys(addr));
+#else
+return ldq_phys(addr);
+#endif
+}
+
+uint64_t ldq_be_phys(target_phys_addr_t addr)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return ldq_phys(addr);
+#else
+return bswap64(ldq_phys(addr));
+#endif
+}
+
 /* XXX: optimize */
 uint32_t ldub_phys(target_phys_addr_t addr)
 {
@@ -4236,6 +4272,24 @@ uint32_t lduw_phys(target_phys_addr_t addr)
 return val;
 }
 
+uint32_t lduw_le_phys(target_phys_addr_t addr)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return bswap16(lduw_phys(addr));
+#else
+return lduw_phys(addr);
+#endif
+}
+
+uint32_t lduw_be_phys(target_phys_addr_t addr)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return lduw_phys(addr);
+#else
+return bswap16(lduw_phys(addr));
+#endif
+}
+
 /* warning: addr must be aligned. The ram page is not masked as dirty
and the code inside is not invalidated. It is useful if the dirty
bits are used to track modified PTEs */
@@ -4343,6 +4397,24 @@ void stl_phys(target_phys_addr_t addr, uint32_t val)
 }
 }
 
+void stl_le_phys(target_phys_addr_t addr, uint32_t val)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return stl_phys(addr, bswap32(val));
+#else
+return stl_phys(addr, val);
+#endif
+}
+
+void stl_be_phys(target_phys_addr_t addr, uint32_t val)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return stl_phys(addr, val);
+#else
+return stl_phys(addr, bswap32(val));
+#endif
+}
+
 /* XXX: optimize */
 void stb_phys(target_phys_addr_t addr, uint32_t val)
 {
@@ -4386,6 +4458,24 @@ void stw_phys(target_phys_addr_t addr, uint32_t val)
 }
 }
 
+void stw_le_phys(target_phys_addr_t addr, uint32_t val)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return stw_phys(addr, bswap16(val));
+#else
+return stw_phys(addr, val);
+#endif
+}
+
+void stw_be_phys(target_phys_addr_t addr, uint32_t val)
+{
+#if defined(TARGET_WORDS_BIGENDIAN)
+return stw_phys(addr, val);
+#else
+return stw_phys(addr, bswap16(val));
+#endif
+}
+
 /* XXX: optimize */
 void 

Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Blue Swirl
On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf ag...@suse.de wrote:
 Device code some times needs to access physical memory and does that
 through the ld./st._phys functions. However, these are the exact same
 functions that the CPU uses to access memory, which means they will
 be endianness swapped depending on the target CPU.

 However, devices don't know about the CPU's endianness, but instead
 access memory directly using their own interface to the memory bus,
 so they need some way to read data with their native endianness.

 This patch adds _le and _be functions to ld./st._phys.

 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  cpu-common.h |   12 +++
  exec.c       |  102 
 ++
  2 files changed, 114 insertions(+), 0 deletions(-)

 diff --git a/cpu-common.h b/cpu-common.h
 index b027e43..c6a2b5f 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);

  uint32_t ldub_phys(target_phys_addr_t addr);
  uint32_t lduw_phys(target_phys_addr_t addr);
 +uint32_t lduw_le_phys(target_phys_addr_t addr);
 +uint32_t lduw_be_phys(target_phys_addr_t addr);
  uint32_t ldl_phys(target_phys_addr_t addr);
 +uint32_t ldl_le_phys(target_phys_addr_t addr);
 +uint32_t ldl_be_phys(target_phys_addr_t addr);
  uint64_t ldq_phys(target_phys_addr_t addr);
 +uint64_t ldq_le_phys(target_phys_addr_t addr);
 +uint64_t ldq_be_phys(target_phys_addr_t addr);
  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
  void stb_phys(target_phys_addr_t addr, uint32_t val);
  void stw_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
  void stl_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
  void stq_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_be_phys(target_phys_addr_t addr, uint64_t val);

  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len);
 diff --git a/exec.c b/exec.c
 index 4c45299..5f2f87e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
     return val;
  }

 +uint32_t ldl_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return bswap32(ldl_phys(addr));

This would introduce a second bswap in some cases. Please make instead
two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
ldl_phys could be #defined to the suitable function.

BTW, these functions would need a serious cleanup, there's a lot code
duplication and comments about XXX: optimize.

 +#else
 +    return ldl_phys(addr);
 +#endif
 +}
 +
 +uint32_t ldl_be_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return ldl_phys(addr);
 +#else
 +    return bswap32(ldl_phys(addr));
 +#endif
 +}
 +
  /* warning: addr must be aligned */
  uint64_t ldq_phys(target_phys_addr_t addr)
  {
 @@ -4196,6 +4214,24 @@ uint64_t ldq_phys(target_phys_addr_t addr)
     return val;
  }

 +uint64_t ldq_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return bswap64(ldq_phys(addr));
 +#else
 +    return ldq_phys(addr);
 +#endif
 +}
 +
 +uint64_t ldq_be_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return ldq_phys(addr);
 +#else
 +    return bswap64(ldq_phys(addr));
 +#endif
 +}
 +
  /* XXX: optimize */
  uint32_t ldub_phys(target_phys_addr_t addr)
  {
 @@ -4236,6 +4272,24 @@ uint32_t lduw_phys(target_phys_addr_t addr)
     return val;
  }

 +uint32_t lduw_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return bswap16(lduw_phys(addr));
 +#else
 +    return lduw_phys(addr);
 +#endif
 +}
 +
 +uint32_t lduw_be_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return lduw_phys(addr);
 +#else
 +    return bswap16(lduw_phys(addr));
 +#endif
 +}
 +
  /* warning: addr must be aligned. The ram page is not masked as dirty
    and the code inside is not invalidated. It is useful if the dirty
    bits are used to track modified PTEs */
 @@ -4343,6 +4397,24 @@ void stl_phys(target_phys_addr_t addr, uint32_t val)
     }
  }

 +void stl_le_phys(target_phys_addr_t addr, uint32_t val)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return stl_phys(addr, bswap32(val));
 +#else
 +    return stl_phys(addr, val);
 +#endif
 +}
 +
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return stl_phys(addr, val);
 +#else
 +    return stl_phys(addr, bswap32(val));
 +#endif
 +}
 +
  /* XXX: optimize */
  void stb_phys(target_phys_addr_t addr, uint32_t val)
  {
 @@ 

Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Alexander Graf

On 06.07.2011, at 00:05, Blue Swirl wrote:

 On Wed, Jul 6, 2011 at 12:55 AM, Alexander Graf ag...@suse.de wrote:
 
 On 05.07.2011, at 23:48, Blue Swirl wrote:
 
 On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf ag...@suse.de wrote:
 Device code some times needs to access physical memory and does that
 through the ld./st._phys functions. However, these are the exact same
 functions that the CPU uses to access memory, which means they will
 be endianness swapped depending on the target CPU.
 
 However, devices don't know about the CPU's endianness, but instead
 access memory directly using their own interface to the memory bus,
 so they need some way to read data with their native endianness.
 
 This patch adds _le and _be functions to ld./st._phys.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  cpu-common.h |   12 +++
  exec.c   |  102 
 ++
  2 files changed, 114 insertions(+), 0 deletions(-)
 
 diff --git a/cpu-common.h b/cpu-common.h
 index b027e43..c6a2b5f 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);
 
  uint32_t ldub_phys(target_phys_addr_t addr);
  uint32_t lduw_phys(target_phys_addr_t addr);
 +uint32_t lduw_le_phys(target_phys_addr_t addr);
 +uint32_t lduw_be_phys(target_phys_addr_t addr);
  uint32_t ldl_phys(target_phys_addr_t addr);
 +uint32_t ldl_le_phys(target_phys_addr_t addr);
 +uint32_t ldl_be_phys(target_phys_addr_t addr);
  uint64_t ldq_phys(target_phys_addr_t addr);
 +uint64_t ldq_le_phys(target_phys_addr_t addr);
 +uint64_t ldq_be_phys(target_phys_addr_t addr);
  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
  void stb_phys(target_phys_addr_t addr, uint32_t val);
  void stw_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
  void stl_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
  void stq_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_be_phys(target_phys_addr_t addr, uint64_t val);
 
  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
const uint8_t *buf, int len);
 diff --git a/exec.c b/exec.c
 index 4c45299..5f2f87e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
 return val;
  }
 
 +uint32_t ldl_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +return bswap32(ldl_phys(addr));
 
 This would introduce a second bswap in some cases. Please make instead
 two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
 ldl_phys could be #defined to the suitable function.
 
 Yeah, I was thinking to do that one at first and then realized how MMIO gets 
 tricky, since we also need to potentially swap it again, as by now it 
 returns values in target CPU endianness. But if you prefer, I can dig into 
 that.
 
 Maybe the swapendian thing could be adjusted so that it's possible to
 access the device in either LE or BE way directly? For example, there
 could be two io_mem_read/write tables, the current one for CPU and
 another one for device MMIO with known endianness. Or even three
 tables: CPU, BE, LE.

If you take a look at the patches following this one, you can easily see how 
few devices actually use these functions. I don't think the additional memory 
usage would count up for a couple of byte swaps here really.

We could however try to be clever and directly check if the device mmio is a 
swapendian function and just bypass it. I'm just not sure it's worth the 
additional overhead for the non-swapped case (which should be the normal one).


Alex




Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Blue Swirl
On Wed, Jul 6, 2011 at 1:13 AM, Alexander Graf ag...@suse.de wrote:

 On 06.07.2011, at 00:05, Blue Swirl wrote:

 On Wed, Jul 6, 2011 at 12:55 AM, Alexander Graf ag...@suse.de wrote:

 On 05.07.2011, at 23:48, Blue Swirl wrote:

 On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf ag...@suse.de wrote:
 Device code some times needs to access physical memory and does that
 through the ld./st._phys functions. However, these are the exact same
 functions that the CPU uses to access memory, which means they will
 be endianness swapped depending on the target CPU.

 However, devices don't know about the CPU's endianness, but instead
 access memory directly using their own interface to the memory bus,
 so they need some way to read data with their native endianness.

 This patch adds _le and _be functions to ld./st._phys.

 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  cpu-common.h |   12 +++
  exec.c       |  102 
 ++
  2 files changed, 114 insertions(+), 0 deletions(-)

 diff --git a/cpu-common.h b/cpu-common.h
 index b027e43..c6a2b5f 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);

  uint32_t ldub_phys(target_phys_addr_t addr);
  uint32_t lduw_phys(target_phys_addr_t addr);
 +uint32_t lduw_le_phys(target_phys_addr_t addr);
 +uint32_t lduw_be_phys(target_phys_addr_t addr);
  uint32_t ldl_phys(target_phys_addr_t addr);
 +uint32_t ldl_le_phys(target_phys_addr_t addr);
 +uint32_t ldl_be_phys(target_phys_addr_t addr);
  uint64_t ldq_phys(target_phys_addr_t addr);
 +uint64_t ldq_le_phys(target_phys_addr_t addr);
 +uint64_t ldq_be_phys(target_phys_addr_t addr);
  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
  void stb_phys(target_phys_addr_t addr, uint32_t val);
  void stw_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
  void stl_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
  void stq_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_be_phys(target_phys_addr_t addr, uint64_t val);

  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len);
 diff --git a/exec.c b/exec.c
 index 4c45299..5f2f87e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
     return val;
  }

 +uint32_t ldl_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return bswap32(ldl_phys(addr));

 This would introduce a second bswap in some cases. Please make instead
 two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
 ldl_phys could be #defined to the suitable function.

 Yeah, I was thinking to do that one at first and then realized how MMIO 
 gets tricky, since we also need to potentially swap it again, as by now it 
 returns values in target CPU endianness. But if you prefer, I can dig into 
 that.

 Maybe the swapendian thing could be adjusted so that it's possible to
 access the device in either LE or BE way directly? For example, there
 could be two io_mem_read/write tables, the current one for CPU and
 another one for device MMIO with known endianness. Or even three
 tables: CPU, BE, LE.

 If you take a look at the patches following this one, you can easily see how 
 few devices actually use these functions. I don't think the additional memory 
 usage would count up for a couple of byte swaps here really.

 We could however try to be clever and directly check if the device mmio is a 
 swapendian function and just bypass it. I'm just not sure it's worth the 
 additional overhead for the non-swapped case (which should be the normal one).

Neither seems to be very attractive option. It may be enough to
optimize just RAM accesses and forget about extra bswap for MMIO for
now.



Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Alexander Graf

On 05.07.2011, at 23:48, Blue Swirl wrote:

 On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf ag...@suse.de wrote:
 Device code some times needs to access physical memory and does that
 through the ld./st._phys functions. However, these are the exact same
 functions that the CPU uses to access memory, which means they will
 be endianness swapped depending on the target CPU.
 
 However, devices don't know about the CPU's endianness, but instead
 access memory directly using their own interface to the memory bus,
 so they need some way to read data with their native endianness.
 
 This patch adds _le and _be functions to ld./st._phys.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  cpu-common.h |   12 +++
  exec.c   |  102 
 ++
  2 files changed, 114 insertions(+), 0 deletions(-)
 
 diff --git a/cpu-common.h b/cpu-common.h
 index b027e43..c6a2b5f 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);
 
  uint32_t ldub_phys(target_phys_addr_t addr);
  uint32_t lduw_phys(target_phys_addr_t addr);
 +uint32_t lduw_le_phys(target_phys_addr_t addr);
 +uint32_t lduw_be_phys(target_phys_addr_t addr);
  uint32_t ldl_phys(target_phys_addr_t addr);
 +uint32_t ldl_le_phys(target_phys_addr_t addr);
 +uint32_t ldl_be_phys(target_phys_addr_t addr);
  uint64_t ldq_phys(target_phys_addr_t addr);
 +uint64_t ldq_le_phys(target_phys_addr_t addr);
 +uint64_t ldq_be_phys(target_phys_addr_t addr);
  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
  void stb_phys(target_phys_addr_t addr, uint32_t val);
  void stw_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
  void stl_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
  void stq_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_be_phys(target_phys_addr_t addr, uint64_t val);
 
  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
const uint8_t *buf, int len);
 diff --git a/exec.c b/exec.c
 index 4c45299..5f2f87e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
 return val;
  }
 
 +uint32_t ldl_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +return bswap32(ldl_phys(addr));
 
 This would introduce a second bswap in some cases. Please make instead
 two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
 ldl_phys could be #defined to the suitable function.

Yeah, I was thinking to do that one at first and then realized how MMIO gets 
tricky, since we also need to potentially swap it again, as by now it returns 
values in target CPU endianness. But if you prefer, I can dig into that.

 
 BTW, these functions would need a serious cleanup, there's a lot code
 duplication and comments about XXX: optimize.

Yes :). One thing at a time :).


Alex




Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Blue Swirl
On Wed, Jul 6, 2011 at 12:55 AM, Alexander Graf ag...@suse.de wrote:

 On 05.07.2011, at 23:48, Blue Swirl wrote:

 On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf ag...@suse.de wrote:
 Device code some times needs to access physical memory and does that
 through the ld./st._phys functions. However, these are the exact same
 functions that the CPU uses to access memory, which means they will
 be endianness swapped depending on the target CPU.

 However, devices don't know about the CPU's endianness, but instead
 access memory directly using their own interface to the memory bus,
 so they need some way to read data with their native endianness.

 This patch adds _le and _be functions to ld./st._phys.

 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  cpu-common.h |   12 +++
  exec.c       |  102 
 ++
  2 files changed, 114 insertions(+), 0 deletions(-)

 diff --git a/cpu-common.h b/cpu-common.h
 index b027e43..c6a2b5f 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);

  uint32_t ldub_phys(target_phys_addr_t addr);
  uint32_t lduw_phys(target_phys_addr_t addr);
 +uint32_t lduw_le_phys(target_phys_addr_t addr);
 +uint32_t lduw_be_phys(target_phys_addr_t addr);
  uint32_t ldl_phys(target_phys_addr_t addr);
 +uint32_t ldl_le_phys(target_phys_addr_t addr);
 +uint32_t ldl_be_phys(target_phys_addr_t addr);
  uint64_t ldq_phys(target_phys_addr_t addr);
 +uint64_t ldq_le_phys(target_phys_addr_t addr);
 +uint64_t ldq_be_phys(target_phys_addr_t addr);
  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
  void stb_phys(target_phys_addr_t addr, uint32_t val);
  void stw_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
  void stl_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
  void stq_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_be_phys(target_phys_addr_t addr, uint64_t val);

  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len);
 diff --git a/exec.c b/exec.c
 index 4c45299..5f2f87e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
     return val;
  }

 +uint32_t ldl_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +    return bswap32(ldl_phys(addr));

 This would introduce a second bswap in some cases. Please make instead
 two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
 ldl_phys could be #defined to the suitable function.

 Yeah, I was thinking to do that one at first and then realized how MMIO gets 
 tricky, since we also need to potentially swap it again, as by now it returns 
 values in target CPU endianness. But if you prefer, I can dig into that.

Maybe the swapendian thing could be adjusted so that it's possible to
access the device in either LE or BE way directly? For example, there
could be two io_mem_read/write tables, the current one for CPU and
another one for device MMIO with known endianness. Or even three
tables: CPU, BE, LE.


 BTW, these functions would need a serious cleanup, there's a lot code
 duplication and comments about XXX: optimize.

 Yes :). One thing at a time :).


 Alex





Re: [Qemu-devel] [PATCH 1/9] exec: add endian specific phys ld/st functions

2011-07-05 Thread Alexander Graf

On 06.07.2011, at 00:22, Blue Swirl wrote:

 On Wed, Jul 6, 2011 at 1:13 AM, Alexander Graf ag...@suse.de wrote:
 
 On 06.07.2011, at 00:05, Blue Swirl wrote:
 
 On Wed, Jul 6, 2011 at 12:55 AM, Alexander Graf ag...@suse.de wrote:
 
 On 05.07.2011, at 23:48, Blue Swirl wrote:
 
 On Tue, Jul 5, 2011 at 7:28 PM, Alexander Graf ag...@suse.de wrote:
 Device code some times needs to access physical memory and does that
 through the ld./st._phys functions. However, these are the exact same
 functions that the CPU uses to access memory, which means they will
 be endianness swapped depending on the target CPU.
 
 However, devices don't know about the CPU's endianness, but instead
 access memory directly using their own interface to the memory bus,
 so they need some way to read data with their native endianness.
 
 This patch adds _le and _be functions to ld./st._phys.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  cpu-common.h |   12 +++
  exec.c   |  102 
 ++
  2 files changed, 114 insertions(+), 0 deletions(-)
 
 diff --git a/cpu-common.h b/cpu-common.h
 index b027e43..c6a2b5f 100644
 --- a/cpu-common.h
 +++ b/cpu-common.h
 @@ -135,14 +135,26 @@ void qemu_flush_coalesced_mmio_buffer(void);
 
  uint32_t ldub_phys(target_phys_addr_t addr);
  uint32_t lduw_phys(target_phys_addr_t addr);
 +uint32_t lduw_le_phys(target_phys_addr_t addr);
 +uint32_t lduw_be_phys(target_phys_addr_t addr);
  uint32_t ldl_phys(target_phys_addr_t addr);
 +uint32_t ldl_le_phys(target_phys_addr_t addr);
 +uint32_t ldl_be_phys(target_phys_addr_t addr);
  uint64_t ldq_phys(target_phys_addr_t addr);
 +uint64_t ldq_le_phys(target_phys_addr_t addr);
 +uint64_t ldq_be_phys(target_phys_addr_t addr);
  void stl_phys_notdirty(target_phys_addr_t addr, uint32_t val);
  void stq_phys_notdirty(target_phys_addr_t addr, uint64_t val);
  void stb_phys(target_phys_addr_t addr, uint32_t val);
  void stw_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stw_be_phys(target_phys_addr_t addr, uint32_t val);
  void stl_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_le_phys(target_phys_addr_t addr, uint32_t val);
 +void stl_be_phys(target_phys_addr_t addr, uint32_t val);
  void stq_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_le_phys(target_phys_addr_t addr, uint64_t val);
 +void stq_be_phys(target_phys_addr_t addr, uint64_t val);
 
  void cpu_physical_memory_write_rom(target_phys_addr_t addr,
const uint8_t *buf, int len);
 diff --git a/exec.c b/exec.c
 index 4c45299..5f2f87e 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -4158,6 +4158,24 @@ uint32_t ldl_phys(target_phys_addr_t addr)
 return val;
  }
 
 +uint32_t ldl_le_phys(target_phys_addr_t addr)
 +{
 +#if defined(TARGET_WORDS_BIGENDIAN)
 +return bswap32(ldl_phys(addr));
 
 This would introduce a second bswap in some cases. Please make instead
 two versions of ldl_phys which use ldl_le/be_p instead of ldl_p. Then
 ldl_phys could be #defined to the suitable function.
 
 Yeah, I was thinking to do that one at first and then realized how MMIO 
 gets tricky, since we also need to potentially swap it again, as by now it 
 returns values in target CPU endianness. But if you prefer, I can dig into 
 that.
 
 Maybe the swapendian thing could be adjusted so that it's possible to
 access the device in either LE or BE way directly? For example, there
 could be two io_mem_read/write tables, the current one for CPU and
 another one for device MMIO with known endianness. Or even three
 tables: CPU, BE, LE.
 
 If you take a look at the patches following this one, you can easily see how 
 few devices actually use these functions. I don't think the additional 
 memory usage would count up for a couple of byte swaps here really.
 
 We could however try to be clever and directly check if the device mmio is a 
 swapendian function and just bypass it. I'm just not sure it's worth the 
 additional overhead for the non-swapped case (which should be the normal 
 one).
 
 Neither seems to be very attractive option. It may be enough to
 optimize just RAM accesses and forget about extra bswap for MMIO for
 now.

How about something like this on top? Plus the q, uw and st version of course. 
The inline is there on purpose, moving the be/le specific code (which is rarely 
used) out of the way from the often(?) used native ones. Eventually, TCG 
generated code comes into these, no?


diff --git a/exec.c b/exec.c
index 5f2f87e..f281ba4 100644
--- a/exec.c
+++ b/exec.c
@@ -4127,7 +4127,8 @@ void cpu_physical_memory_unmap(void *buffer, 
target_phys_addr_t len,
 }
 
 /* warning: addr must be aligned */
-uint32_t ldl_phys(target_phys_addr_t addr)
+static inline uint32_t ldl_phys_internal(target_phys_addr_t addr,
+ enum device_endian endian)
 {
 int io_index;
 uint8_t *ptr;
@@ -4149,31 +4150,47 @@ uint32_t