date:20160117

[Qemu-devel] Regarding Intel IGD passthru support for QEMU/KVM

2016-01-17 Thread Raghavan Santhanam

Hi,

Based on the Intel IGD passthru support that has been added to the Qemu/Xen
code base, is there any way to reuse the same logic to achieve a successful
passthru of an Intel IGD with Qemu/KVM on a Linux host (Ubuntu x86_64), or will
that require more work in addition to what the Xen code base already has for
IGD passthru?

Best,
Raghavan

Re: [Qemu-devel] [PATCH] Propagate OEM ID info into other tables when using SLIC

2016-01-17 Thread Xiao Guangrong


Hi,

Is this what you wanted?
https://www.mail-archive.com/qemu-devel@nongnu.org/msg345911.html


On 01/16/2016 04:19 AM, Steven Newbury wrote:

In order to support Windows 7 "Activation", the OEM ID info must match
in SLIC and RSDT, and for UEFI, FACP.  The OEM ID from the SLIC is only
applied when oemtableid is not specified explicitly.

This was originally based on the patch from Michael Tokarev but has
been significantly re-worked, and re-based.
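
For readers skimming the thread, the rule described above can be sketched in isolation. This is only an illustration under stated assumptions, not the patch itself: `demo_acpi_header`, `demo_slic_info` and `demo_fill_oem` are hypothetical names, the header is reduced to the three fields that matter, and the "BOCHS "/"BXPC" defaults stand in for the ACPI_BUILD_APPNAME6/ACPI_BUILD_APPNAME4 macros.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified ACPI table header: only the OEM fields matter here. */
struct demo_acpi_header {
    char signature[4];
    char oem_id[6];
    char oem_table_id[8];
};

/* Hypothetical holder for the OEM fields captured from a user-supplied SLIC. */
struct demo_slic_info {
    bool has_slic;
    char oem_id[6];
    char oem_table_id[8];
};

/*
 * Fill the OEM fields of a generated table header.
 * With a SLIC present, RSDT and FACP take both OEM fields verbatim;
 * every other table takes the OEM ID but keeps its own signature in the
 * last four bytes of the table ID. Without a SLIC, defaults are used.
 */
static void demo_fill_oem(struct demo_acpi_header *h, const char *sig,
                          const struct demo_slic_info *slic)
{
    memcpy(h->signature, sig, 4);
    if (slic->has_slic) {
        memcpy(h->oem_id, slic->oem_id, 6);
        memcpy(h->oem_table_id, slic->oem_table_id, 8);
        if (memcmp(sig, "RSDT", 4) != 0 && memcmp(sig, "FACP", 4) != 0) {
            memcpy(h->oem_table_id + 4, sig, 4);
        }
    } else {
        memcpy(h->oem_id, "BOCHS ", 6);      /* assumed default OEM ID */
        memcpy(h->oem_table_id, "BXPC", 4);  /* assumed default prefix */
        memcpy(h->oem_table_id + 4, sig, 4);
    }
}
```

The sketch leaves out the explicit oemtableid override, which in the patch takes precedence over the SLIC values.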

Signed-off-by: Steven Newbury 

---
  hw/acpi/aml-build.c | 19 ---
  hw/acpi/core.c  | 11 +++
  hw/i386/acpi-build.c|  9 -
  include/hw/acpi/acpi_slic.h | 11 +++
  qemu-options.hx |  2 ++
  5 files changed, 48 insertions(+), 4 deletions(-)
  create mode 100644 include/hw/acpi/acpi_slic.h

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 78e1290..bc16dc2 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -29,6 +29,9 @@
  #include "qemu/bswap.h"
  #include "qemu/bitops.h"
  #include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/acpi_slic.h"
+
+extern const struct slic_info oem_data;

  static GArray *build_alloc_array(void)
  {
@@ -1435,13 +1438,23 @@ build_header(GArray *linker, GArray *table_data,
  memcpy(&h->signature, sig, 4);
  h->length = cpu_to_le32(len);
  h->revision = rev;
-memcpy(h->oem_id, ACPI_BUILD_APPNAME6, 6);

  if (oem_table_id) {
  strncpy((char *)h->oem_table_id, oem_table_id, sizeof(h->oem_table_id));
  } else {
-memcpy(h->oem_table_id, ACPI_BUILD_APPNAME4, 4);
-memcpy(h->oem_table_id + 4, sig, 4);
+/* When including the system SLIC Win7 requires all OEM info to match
+(including sig) in SLIC, RSDT and FACP.  Other tables could match,
+but is unnecessary to pass "Activation". Overridden by above. */
+if (oem_data.has_slic) {
+memcpy(h->oem_id, &oem_data.oem_id, 6);
+memcpy(h->oem_table_id, &oem_data.oem_table_id, 8);
+if (memcmp(sig, "RSDT", 4) != 0 && memcmp(sig, "FACP", 4) != 0)
+memcpy(h->oem_table_id + 4, sig, 4);
+} else {
+memcpy(h->oem_id, ACPI_BUILD_APPNAME6, 6);
+memcpy(h->oem_table_id, ACPI_BUILD_APPNAME4, 4);
+memcpy(h->oem_table_id + 4, sig, 4);
+}
  }

  h->oem_revision = cpu_to_le32(1);
diff --git a/hw/acpi/core.c b/hw/acpi/core.c
index 21e113d..cbef437 100644
--- a/hw/acpi/core.c
+++ b/hw/acpi/core.c
@@ -22,6 +22,7 @@
  #include "hw/hw.h"
  #include "hw/i386/pc.h"
  #include "hw/acpi/acpi.h"
+#include "hw/acpi/acpi_slic.h"
  #include "hw/nvram/fw_cfg.h"
  #include "qemu/config-file.h"
  #include "qapi/opts-visitor.h"
@@ -29,6 +30,9 @@
  #include "qapi-visit.h"
  #include "qapi-event.h"

+struct slic_info oem_data;
+
+
  struct acpi_table_header {
  uint16_t _length; /* our length, not actual part of the hdr */
/* allows easier parsing for fw_cfg clients */
@@ -227,6 +231,13 @@ static void acpi_table_install(const char unsigned *blob, size_t bloblen,
  /* recalculate checksum */
  ext_hdr->checksum = acpi_checksum((const char unsigned *)ext_hdr +
ACPI_TABLE_PFX_SIZE,
acpi_payload_size);
+
+/* Copy OEM fields from SLIC for use in all relevant tables
+   (oem_id[6] + tableid[4] + tableid(sig)[4] = 14 bytes) */
+if ((!oem_data.has_slic) && (memcmp(ext_hdr->sig, "SLIC", 4) == 0)) {
+   memcpy(&oem_data.oem_id, ext_hdr->oem_id, 14);
+   oem_data.has_slic = 1;
+}
  }

  void acpi_table_add(const QemuOpts *opts, Error **errp)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 78758e2..fbe4f3a 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -55,6 +55,7 @@
  #include "hw/timer/hpet.h"

  #include "hw/acpi/aml-build.h"
+#include "hw/acpi/acpi_slic.h"

  #include "qapi/qmp/qint.h"
  #include "qom/qom-qobject.h"
@@ -122,6 +123,8 @@ typedef struct AcpiBuildPciBusHotplugState {
  bool pcihp_bridge_en;
  } AcpiBuildPciBusHotplugState;

+extern const struct slic_info oem_data;
+
  static
  int acpi_add_cpu_info(Object *o, void *opaque)
  {
@@ -2542,7 +2545,11 @@ build_rsdp(GArray *rsdp_table, GArray *linker, unsigned rsdt)
   true /* fseg memory */);

  memcpy(&rsdp->signature, "RSD PTR ", 8);
-memcpy(rsdp->oem_id, ACPI_BUILD_APPNAME6, 6);
+if (oem_data.has_slic) {
+memcpy(rsdp->oem_id, &oem_data.oem_id, 6);
+} else {
+memcpy(rsdp->oem_id, ACPI_BUILD_APPNAME6, 6);
+}
  rsdp->rsdt_physical_address = cpu_to_le32(rsdt);
  /* Address to be filled by Guest linker */
  bios_linker_loader_add_pointer(linker, ACPI_BUILD_RSDP_FILE,
diff --git a/include/hw/acpi/acpi_slic.h b/include/hw/acpi/acpi_slic.h
new file mode 100644
index 000..bc04a71
--- /dev/null
+++ b/include/hw/acpi/acpi_slic.h
@@ -0,0 +1,11 @@
+#ifndef QEMU_HW_ACPI_SLIC_H

Re: [Qemu-devel] [PATCH] cadence_gem: fix buffer overflow

2016-01-17 Thread Peter Crosthwaite

On Sun, Jan 17, 2016 at 10:50 PM, Jason Wang  wrote:
>
>
> On 01/14/2016 05:43 PM, Michael S. Tsirkin wrote:
>> gem_receive copies a packet received from network into an rxbuf[2048]
>> array on stack, with size limited by descriptor length set by guest.  If
>> guest is malicious and specifies a descriptor length that is too large,
>> and should packet size exceed array size, this results in a buffer
>> overflow.
>>
>> Reported-by: 刘令 
>> Signed-off-by: Michael S. Tsirkin 
>> ---
>>  hw/net/cadence_gem.c | 8 
>>  1 file changed, 8 insertions(+)
>
> Applied to my -net with a tweak to the commit log (changing receive to transmit
> as noticed).
>

As this is actually an unimplemented feature you should change the
message to a LOG_UNIMP rather than a debug printf.

Regards,
Peter

> Thanks
>
>>
>> diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
>> index 3639fc1..15a0786 100644
>> --- a/hw/net/cadence_gem.c
>> +++ b/hw/net/cadence_gem.c
>> @@ -862,6 +862,14 @@ static void gem_transmit(CadenceGEMState *s)
>>  break;
>>  }
>>
>> +if (tx_desc_get_length(desc) > sizeof(tx_packet) - (p - tx_packet)) {
>> +DB_PRINT("TX descriptor @ 0x%x too large: size 0x%x space 0x%x\n",
>> + (unsigned)packet_desc_addr,
>> + (unsigned)tx_desc_get_length(desc),
>> + sizeof(tx_packet) - (p - tx_packet));
>> +break;
>> +}
>> +
>>  /* Gather this fragment of the packet from "dma memory" to our contig.
>>   * buffer.
>>   */
>
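
The check added by the quoted patch follows a common pattern for guest-controlled lengths. A minimal standalone sketch, assuming a hypothetical helper name `demo_fragment_fits` and plain integer arguments rather than the device state used in cadence_gem.c:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when a fragment of frag_len bytes still fits in a buffer
 * of buf_size bytes that already holds `used` bytes.
 * The comparison subtracts on the right-hand side so that a huge
 * frag_len supplied by the guest cannot wrap the arithmetic around.
 */
static bool demo_fragment_fits(size_t buf_size, size_t used, uint32_t frag_len)
{
    if (used > buf_size) {
        return false; /* inconsistent state: treat as "does not fit" */
    }
    return frag_len <= buf_size - used;
}
```

Writing the test as `used + frag_len <= buf_size` instead could overflow when frag_len is close to the maximum of its type, which is why the subtraction form is used here.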

[Qemu-devel] [PATCH v1 01/17] linux-user: arm: fix coding style for some linux-user signal functions

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

Reviewed-by: Peter Maydell 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Peter Crosthwaite 
---

 linux-user/signal.c | 110 ++--
 1 file changed, 56 insertions(+), 54 deletions(-)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 919aa83..2cc6d73 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -1541,82 +1541,84 @@ static void
 setup_sigcontext(struct target_sigcontext *sc, /*struct _fpstate *fpstate,*/
  CPUARMState *env, abi_ulong mask)
 {
-   __put_user(env->regs[0], &sc->arm_r0);
-   __put_user(env->regs[1], &sc->arm_r1);
-   __put_user(env->regs[2], &sc->arm_r2);
-   __put_user(env->regs[3], &sc->arm_r3);
-   __put_user(env->regs[4], &sc->arm_r4);
-   __put_user(env->regs[5], &sc->arm_r5);
-   __put_user(env->regs[6], &sc->arm_r6);
-   __put_user(env->regs[7], &sc->arm_r7);
-   __put_user(env->regs[8], &sc->arm_r8);
-   __put_user(env->regs[9], &sc->arm_r9);
-   __put_user(env->regs[10], &sc->arm_r10);
-   __put_user(env->regs[11], &sc->arm_fp);
-   __put_user(env->regs[12], &sc->arm_ip);
-   __put_user(env->regs[13], &sc->arm_sp);
-   __put_user(env->regs[14], &sc->arm_lr);
-   __put_user(env->regs[15], &sc->arm_pc);
+__put_user(env->regs[0], &sc->arm_r0);
+__put_user(env->regs[1], &sc->arm_r1);
+__put_user(env->regs[2], &sc->arm_r2);
+__put_user(env->regs[3], &sc->arm_r3);
+__put_user(env->regs[4], &sc->arm_r4);
+__put_user(env->regs[5], &sc->arm_r5);
+__put_user(env->regs[6], &sc->arm_r6);
+__put_user(env->regs[7], &sc->arm_r7);
+__put_user(env->regs[8], &sc->arm_r8);
+__put_user(env->regs[9], &sc->arm_r9);
+__put_user(env->regs[10], &sc->arm_r10);
+__put_user(env->regs[11], &sc->arm_fp);
+__put_user(env->regs[12], &sc->arm_ip);
+__put_user(env->regs[13], &sc->arm_sp);
+__put_user(env->regs[14], &sc->arm_lr);
+__put_user(env->regs[15], &sc->arm_pc);
 #ifdef TARGET_CONFIG_CPU_32
-   __put_user(cpsr_read(env), &sc->arm_cpsr);
+__put_user(cpsr_read(env), &sc->arm_cpsr);
 #endif
 
-   __put_user(/* current->thread.trap_no */ 0, &sc->trap_no);
-   __put_user(/* current->thread.error_code */ 0, &sc->error_code);
-   __put_user(/* current->thread.address */ 0, &sc->fault_address);
-   __put_user(mask, &sc->oldmask);
+__put_user(/* current->thread.trap_no */ 0, &sc->trap_no);
+__put_user(/* current->thread.error_code */ 0, &sc->error_code);
+__put_user(/* current->thread.address */ 0, &sc->fault_address);
+__put_user(mask, &sc->oldmask);
 }
 
 static inline abi_ulong
 get_sigframe(struct target_sigaction *ka, CPUARMState *regs, int framesize)
 {
-   unsigned long sp = regs->regs[13];
+unsigned long sp = regs->regs[13];
 
-   /*
-* This is the X/Open sanctioned signal stack switching.
-*/
-   if ((ka->sa_flags & TARGET_SA_ONSTACK) && !sas_ss_flags(sp))
-sp = target_sigaltstack_used.ss_sp + target_sigaltstack_used.ss_size;
-   /*
-* ATPCS B01 mandates 8-byte alignment
-*/
-   return (sp - framesize) & ~7;
+/*
+ * This is the X/Open sanctioned signal stack switching.
+ */
+if ((ka->sa_flags & TARGET_SA_ONSTACK) && !sas_ss_flags(sp)) {
+sp = target_sigaltstack_used.ss_sp + target_sigaltstack_used.ss_size;
+}
+/*
+ * ATPCS B01 mandates 8-byte alignment
+ */
+return (sp - framesize) & ~7;
 }
 
 static void
 setup_return(CPUARMState *env, struct target_sigaction *ka,
 abi_ulong *rc, abi_ulong frame_addr, int usig, abi_ulong rc_addr)
 {
-   abi_ulong handler = ka->_sa_handler;
-   abi_ulong retcode;
-   int thumb = handler & 1;
-   uint32_t cpsr = cpsr_read(env);
+abi_ulong handler = ka->_sa_handler;
+abi_ulong retcode;
+int thumb = handler & 1;
+uint32_t cpsr = cpsr_read(env);
 
-   cpsr &= ~CPSR_IT;
-   if (thumb) {
-   cpsr |= CPSR_T;
-   } else {
-   cpsr &= ~CPSR_T;
-   }
+cpsr &= ~CPSR_IT;
+if (thumb) {
+cpsr |= CPSR_T;
+} else {
+cpsr &= ~CPSR_T;
+}
 
-   if (ka->sa_flags & TARGET_SA_RESTORER) {
-   retcode = ka->sa_restorer;
-   } else {
-   unsigned int idx = thumb;
+if (ka->sa_flags & TARGET_SA_RESTORER) {
+retcode = ka->sa_restorer;
+} else {
+unsigned int idx = thumb;
 
-   if (ka->sa_flags & TARGET_SA_SIGINFO)
-   idx += 2;
+if (ka->sa_flags & TARGET_SA_SIGINFO) {
+idx += 2;
+}
 
 __put_user(retcodes[idx], rc);
 
-   retcode = rc_addr + thumb;
-   }
+retcode = rc_addr + thumb;
+}
 
-   env->regs[0] = usig;
-   env->regs[13] = frame_addr;
-   env->regs[14] = retcode;
-   env->regs[15] = handler & (thumb ? ~1 : ~3);
-   cpsr_write(env, cpsr, 0x);
+env->regs[0] = usig;
+env->regs[13] =

[Qemu-devel] [PATCH v1 02/17] linux-user: arm: set CPSR.E/SCTLR.E0E correctly for BE mode

2016-01-17 Thread Peter Crosthwaite

From: Peter Crosthwaite 

If doing big-endian linux-user mode, set both the CPSR.E and SCTLR.E0E
bits. This sets big-endian mode for data accesses in AA32 and AA64
resp.

Signed-off-by: Peter Crosthwaite 
---

 linux-user/main.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/linux-user/main.c b/linux-user/main.c
index ee12035..4f8ea9c 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -4454,6 +4454,10 @@ int main(int argc, char **argv, char **envp)
 for(i = 0; i < 16; i++) {
 env->regs[i] = regs->uregs[i];
 }
+#ifdef TARGET_WORDS_BIGENDIAN
+env->uncached_cpsr |= CPSR_E;
+env->cp15.sctlr_el[1] |= SCTLR_E0E;
+#endif
 /* Enable BE8.  */
 if (EF_ARM_EABI_VERSION(info->elf_flags) >= EF_ARM_EABI_VER4
 && (info->elf_flags & EF_ARM_BE8)) {
-- 
1.9.1

[Qemu-devel] [PATCH v1 00/17] ARM big-endian and setend support

2016-01-17 Thread Peter Crosthwaite

Hi All,

This patch series adds system-mode big-endian support for ARM. It also
implements the setend instruction, and loading of BE binaries even in
LE emulation mode.

Based on Paolo's original work. I have moved all the BE32 related work
to the back of the series. Multiple parties are interested in the BE8
work just on its own, so that could potentially be merged w/o BE32.
PMM requested BE32 be at least thought out architecturally, so this
series sees BE32 functionality through.

I have tested all of LE, BE8 and BE32 in both linux-user mode (for
regressions) and system mode (BE8 and BE32 are new here).
My test application is here, the README gives some example command
lines you can run:

https://github.com/pcrost/arm-be-test

Regards,
Peter


Paolo Bonzini (8):
  linux-user: arm: fix coding style for some linux-user signal functions
  linux-user: arm: handle CPSR.E correctly in strex emulation
  target-arm: pass DisasContext to gen_aa32_ld*/st*
  target-arm: introduce disas flag for endianness
  target-arm: implement setend
  linux-user: arm: pass env to get_user_code_*
  target-arm: implement SCTLR.B, drop bswap_code
  target-arm: implement BE32 mode in system emulation

Peter Crosthwaite (9):
  linux-user: arm: set CPSR.E/SCTLR.E0E correctly for BE mode
  target-arm: implement SCTLR.EE
  target-arm: a64: Add endianness support
  target-arm: cpu: Move cpu_is_big_endian to header
  target-arm: introduce tbflag for endianness
  arm: linux-user: don't set CPSR.E in BE32 mode
  loader: add API to load elf header
  loader: Add data swap option to load-elf
  arm: boot: Support big-endian elfs

 hw/alpha/dp264.c   |   4 +-
 hw/arm/armv7m.c|   2 +-
 hw/arm/boot.c  |  96 --
 hw/core/loader.c   |  57 +-
 hw/cris/boot.c |   2 +-
 hw/i386/multiboot.c|   3 +-
 hw/lm32/lm32_boards.c  |   4 +-
 hw/lm32/milkymist.c|   2 +-
 hw/m68k/an5206.c   |   2 +-
 hw/m68k/dummy_m68k.c   |   2 +-
 hw/m68k/mcf5208.c  |   2 +-
 hw/microblaze/boot.c   |   4 +-
 hw/mips/mips_fulong2e.c|   2 +-
 hw/mips/mips_malta.c   |   2 +-
 hw/mips/mips_mipssim.c |   2 +-
 hw/mips/mips_r4k.c |   2 +-
 hw/moxie/moxiesim.c|   3 +-
 hw/openrisc/openrisc_sim.c |   3 +-
 hw/pci-host/prep.c |   2 +-
 hw/ppc/e500.c  |   2 +-
 hw/ppc/mac_newworld.c  |   5 +-
 hw/ppc/mac_oldworld.c  |   5 +-
 hw/ppc/ppc440_bamboo.c |   3 +-
 hw/ppc/spapr.c |   6 +-
 hw/ppc/virtex_ml507.c  |   3 +-
 hw/s390x/ipl.c |   4 +-
 hw/sparc/leon3.c   |   2 +-
 hw/sparc/sun4m.c   |   4 +-
 hw/sparc64/sun4u.c |   4 +-
 hw/tricore/tricore_testboard.c |   2 +-
 hw/xtensa/sim.c|   4 +-
 hw/xtensa/xtfpga.c |   2 +-
 include/hw/arm/arm.h   |   9 +
 include/hw/elf_ops.h   |  22 ++-
 include/hw/loader.h|   3 +-
 linux-user/main.c  |  77 ++--
 linux-user/signal.c| 110 +--
 target-arm/arm_ldst.h  |   8 +-
 target-arm/cpu.c   |  21 +--
 target-arm/cpu.h   | 103 ++-
 target-arm/helper.c|  50 +++--
 target-arm/helper.h|   1 +
 target-arm/op_helper.c |   5 +
 target-arm/translate-a64.c |  56 +++---
 target-arm/translate.c | 407 -
 target-arm/translate.h |   3 +-
 46 files changed, 752 insertions(+), 365 deletions(-)

-- 
1.9.1

[Qemu-devel] [PATCH v1 04/17] target-arm: implement SCTLR.EE

2016-01-17 Thread Peter Crosthwaite

From: Peter Crosthwaite 

Implement SCTLR.EE bit which controls data endianness for exceptions
and page table translations. SCTLR.EE is mirrored to the CPSR.E bit
on exception entry.
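
As a rough illustration of the mirroring described above (a sketch, not the code from the patch): the helper name `demo_cpsr_on_exception` and the DEMO_ constants are hypothetical, with the bit positions taken from the ARM architecture (CPSR.E is bit 9, SCTLR.EE is bit 25).

```c
#include <stdint.h>

#define DEMO_CPSR_E   (1u << 9)   /* CPSR.E: data endianness of the current mode */
#define DEMO_SCTLR_EE (1u << 25)  /* SCTLR.EE: endianness to use on exception entry */

/*
 * Compute the CPSR value after exception entry as far as the E bit is
 * concerned: the old E bit is discarded and replaced by the state of
 * SCTLR.EE. All other CPSR bits are passed through unchanged.
 */
static uint32_t demo_cpsr_on_exception(uint32_t cpsr, uint32_t sctlr)
{
    cpsr &= ~DEMO_CPSR_E;
    if (sctlr & DEMO_SCTLR_EE) {
        cpsr |= DEMO_CPSR_E;
    }
    return cpsr;
}
```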

Signed-off-by: Peter Crosthwaite 
---

 target-arm/helper.c | 42 --
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 59d5a41..afac1b2 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5889,7 +5889,10 @@ void arm_cpu_do_interrupt(CPUState *cs)
 /* Clear IT bits.  */
 env->condexec_bits = 0;
 /* Switch to the new mode, and to the correct instruction set.  */
-env->uncached_cpsr = (env->uncached_cpsr & ~CPSR_M) | new_mode;
+env->uncached_cpsr = (env->uncached_cpsr & ~(CPSR_M)) | new_mode;
+/* Set new mode endianess */
+env->uncached_cpsr = (env->uncached_cpsr & ~(CPSR_E)) |
+(env->cp15.sctlr_el[arm_current_el(env)] & SCTLR_EE ? CPSR_E : 0);
 env->daif |= mask;
 /* this is a lie, as the was no c1_sys on V4T/V5, but who cares
  * and we should just guard the thumb mode on V4 */
@@ -5958,6 +5961,12 @@ static inline bool regime_translation_disabled(CPUARMState *env,
 return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
 }
 
+static inline bool regime_translation_big_endian(CPUARMState *env,
+ ARMMMUIdx mmu_idx)
+{
+return (regime_sctlr(env, mmu_idx) & SCTLR_EE) != 0;
+}
+
 /* Return the TCR controlling this translation regime */
 static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
@@ -6263,7 +6272,7 @@ static hwaddr S1_ptw_translate(CPUARMState *env, ARMMMUIdx mmu_idx,
  */
 static uint32_t arm_ldl_ptw(CPUState *cs, hwaddr addr, bool is_secure,
 ARMMMUIdx mmu_idx, uint32_t *fsr,
-ARMMMUFaultInfo *fi)
+ARMMMUFaultInfo *fi, bool be)
 {
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
@@ -6274,12 +6283,16 @@ static uint32_t arm_ldl_ptw(CPUState *cs, hwaddr addr, bool is_secure,
 if (fi->s1ptw) {
 return 0;
 }
-return address_space_ldl(cs->as, addr, attrs, NULL);
+if (be) {
+return address_space_ldl_be(cs->as, addr, attrs, NULL);
+} else {
+return address_space_ldl_le(cs->as, addr, attrs, NULL);
+}
 }
 
 static uint64_t arm_ldq_ptw(CPUState *cs, hwaddr addr, bool is_secure,
 ARMMMUIdx mmu_idx, uint32_t *fsr,
-ARMMMUFaultInfo *fi)
+ARMMMUFaultInfo *fi, bool be)
 {
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
@@ -6290,7 +6303,11 @@ static uint64_t arm_ldq_ptw(CPUState *cs, hwaddr addr, bool is_secure,
 if (fi->s1ptw) {
 return 0;
 }
-return address_space_ldq(cs->as, addr, attrs, NULL);
+if (be) {
+return address_space_ldq_be(cs->as, addr, attrs, NULL);
+} else {
+return address_space_ldq_le(cs->as, addr, attrs, NULL);
+}
 }
 
 static bool get_phys_addr_v5(CPUARMState *env, uint32_t address,
@@ -6318,7 +6335,8 @@ static bool get_phys_addr_v5(CPUARMState *env, uint32_t address,
 goto do_fault;
 }
 desc = arm_ldl_ptw(cs, table, regime_is_secure(env, mmu_idx),
-   mmu_idx, fsr, fi);
+   mmu_idx, fsr, fi,
+   regime_translation_big_endian(env, mmu_idx));
 type = (desc & 3);
 domain = (desc >> 5) & 0x0f;
 if (regime_el(env, mmu_idx) == 1) {
@@ -6355,7 +6373,8 @@ static bool get_phys_addr_v5(CPUARMState *env, uint32_t address,
 table = (desc & 0xf000) | ((address >> 8) & 0xffc);
 }
 desc = arm_ldl_ptw(cs, table, regime_is_secure(env, mmu_idx),
-   mmu_idx, fsr, fi);
+   mmu_idx, fsr, fi,
+   regime_translation_big_endian(env, mmu_idx));
 switch (desc & 3) {
 case 0: /* Page translation fault.  */
 code = 7;
@@ -6437,7 +6456,8 @@ static bool get_phys_addr_v6(CPUARMState *env, uint32_t address,
 goto do_fault;
 }
 desc = arm_ldl_ptw(cs, table, regime_is_secure(env, mmu_idx),
-   mmu_idx, fsr, fi);
+   mmu_idx, fsr, fi,
+   regime_translation_big_endian(env, mmu_idx));
 type = (desc & 3);
 if (type == 0 || (type == 3 && !arm_feature(env, ARM_FEATURE_PXN))) {
 /* Section translation fault, or attempt to use the encoding
@@ -6489,7 +6509,8 @@ static bool get_phys_addr_v6(CPUARMState *env, uint32_t address,
 /* Lookup l2 entry.  */
 table = (desc & 0xfc00) | ((address >> 10) & 0x3fc);
 desc = arm_ldl_ptw(cs, table, regime_is_secure(env, mmu_idx),
-   mmu_idx, fsr, fi);
+   mmu_idx,

[Qemu-devel] [PATCH v1 16/17] loader: Add data swap option to load-elf

2016-01-17 Thread Peter Crosthwaite

Some CPUs are of an opposite data-endianness to other components in the
system. Sometimes elfs have the data sections laid out with this CPU
data-endianness, accounting for the byte swaps (relative to other system
components) that will occur when loaded via the CPU.

The leading example is ARM's BE32 mode, which is basically LE with
address manipulation on half-word and byte accesses to access the
hw/byte reversed address. This means that word data is invariant
across LE and BE32. This also means that instructions are still LE.
The expectation is that the elf will be loaded via the CPU in this
endianness scheme, which means the data in the elf is reversed at
compile time.

As QEMU loads via the system memory directly, rather than the CPU, we
need a mechanism to reverse elf data endianness to implement this
possibility.
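
To make the BE32 description above concrete, here is a small sketch under stated assumptions. It is not the loader code from this patch: the function names are hypothetical, the address manipulation is modelled in the usual way as an XOR of the low address bits (3 for bytes, 2 for half-words), and the swap reverses bytes within each complete 32-bit word of a data buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Address actually touched in LE memory by a BE32 byte access at addr. */
static uint32_t demo_be32_byte_addr(uint32_t addr)
{
    return addr ^ 3;
}

/* Address actually touched in LE memory by a BE32 half-word access at addr. */
static uint32_t demo_be32_half_addr(uint32_t addr)
{
    return addr ^ 2;
}

/*
 * Reverse the bytes within each complete 4-byte word of a buffer, in place.
 * Trailing bytes that do not form a full word are left untouched.
 * Applying this to ELF data that was laid out for loading through a BE32
 * CPU yields the byte order that must be written directly to memory.
 */
static void demo_swap_words(uint8_t *data, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i += 4) {
        uint8_t b0 = data[i];
        uint8_t b1 = data[i + 1];
        data[i] = data[i + 3];
        data[i + 1] = data[i + 2];
        data[i + 2] = b1;
        data[i + 3] = b0;
    }
}
```

Word accesses need no address change at all, which is the sense in which word data is invariant between LE and BE32.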

Signed-off-by: Peter Crosthwaite 
---

 hw/alpha/dp264.c   |  4 ++--
 hw/arm/armv7m.c|  2 +-
 hw/arm/boot.c  |  2 +-
 hw/core/loader.c   |  9 ++---
 hw/cris/boot.c |  2 +-
 hw/i386/multiboot.c|  3 ++-
 hw/lm32/lm32_boards.c  |  4 ++--
 hw/lm32/milkymist.c|  2 +-
 hw/m68k/an5206.c   |  2 +-
 hw/m68k/dummy_m68k.c   |  2 +-
 hw/m68k/mcf5208.c  |  2 +-
 hw/microblaze/boot.c   |  4 ++--
 hw/mips/mips_fulong2e.c|  2 +-
 hw/mips/mips_malta.c   |  2 +-
 hw/mips/mips_mipssim.c |  2 +-
 hw/mips/mips_r4k.c |  2 +-
 hw/moxie/moxiesim.c|  3 ++-
 hw/openrisc/openrisc_sim.c |  3 ++-
 hw/pci-host/prep.c |  2 +-
 hw/ppc/e500.c  |  2 +-
 hw/ppc/mac_newworld.c  |  5 +++--
 hw/ppc/mac_oldworld.c  |  5 +++--
 hw/ppc/ppc440_bamboo.c |  3 ++-
 hw/ppc/spapr.c |  6 --
 hw/ppc/virtex_ml507.c  |  3 ++-
 hw/s390x/ipl.c |  4 ++--
 hw/sparc/leon3.c   |  2 +-
 hw/sparc/sun4m.c   |  4 ++--
 hw/sparc64/sun4u.c |  4 ++--
 hw/tricore/tricore_testboard.c |  2 +-
 hw/xtensa/sim.c|  4 ++--
 hw/xtensa/xtfpga.c |  2 +-
 include/hw/elf_ops.h   | 22 +-
 include/hw/loader.h|  2 +-
 34 files changed, 78 insertions(+), 46 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index 27bdaa1..df071fa 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -109,7 +109,7 @@ static void clipper_init(MachineState *machine)
 }
 size = load_elf(palcode_filename, cpu_alpha_superpage_to_phys,
 NULL, _entry, _low, _high,
-0, EM_ALPHA, 0);
+0, EM_ALPHA, 0, 0);
 if (size < 0) {
 hw_error("could not load palcode '%s'\n", palcode_filename);
 exit(1);
@@ -129,7 +129,7 @@ static void clipper_init(MachineState *machine)
 
 size = load_elf(kernel_filename, cpu_alpha_superpage_to_phys,
 NULL, _entry, _low, _high,
-0, EM_ALPHA, 0);
+0, EM_ALPHA, 0, 0);
 if (size < 0) {
 hw_error("could not load kernel '%s'\n", kernel_filename);
 exit(1);
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index a80d2ad..d721e5b 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -210,7 +210,7 @@ DeviceState *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
 
 if (kernel_filename) {
 image_size = load_elf(kernel_filename, NULL, NULL, , ,
-  NULL, big_endian, EM_ARM, 1);
+  NULL, big_endian, EM_ARM, 1, 0);
 if (image_size < 0) {
 image_size = load_image_targphys(kernel_filename, 0, mem_size);
 lowaddr = 0;
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 75f69bf..0de4269 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -700,7 +700,7 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
 /* Assume that raw images are linux kernels, and ELF images are not.  */
 kernel_size = load_elf(info->kernel_filename, NULL, NULL, _entry,
_low_addr, _high_addr, big_endian,
-   elf_machine, 1);
+   elf_machine, 1, 0);
 if (kernel_size > 0 && have_dtb(info)) {
 /* If there is still some room left at the base of RAM, try and put
  * the DTB there like we do for images loaded with -bios or -pflash.
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 28da8e2..bee6cf5 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -382,7 +382,8 @@ fail:
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
- uint64_t *highaddr, int big_endian, int

[Qemu-devel] [PATCH v1 13/17] arm: linux-user: don't set CPSR.E in BE32 mode

2016-01-17 Thread Peter Crosthwaite

Don't set CPSR.E for BE32 linux-user mode. As linux-user mode models
BE32, using normal BE (and system mode will not), a special case is
needed for user-mode where if sctlr.b is set, the CPU identifies as BE.

Signed-off-by: Peter Crosthwaite 
---

 linux-user/main.c |  2 --
 target-arm/cpu.h  | 12 +++-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index d481458..60375fb 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -4496,8 +4496,6 @@ int main(int argc, char **argv, char **envp)
 env->uncached_cpsr |= CPSR_E;
 } else {
 env->cp15.sctlr_el[1] |= SCTLR_B;
-/* We model BE32 as regular BE, so set CPSR_E */
-env->uncached_cpsr |= CPSR_E;
 }
 #endif
 }
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 3edd56b..96b1e99 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1812,7 +1812,17 @@ static bool arm_cpu_is_big_endian(CPUARMState *env)
 
 /* In 32bit endianness is determined by looking at CPSR's E bit */
 if (!is_a64(env)) {
-return (env->uncached_cpsr & CPSR_E) ? 1 : 0;
+return
+#ifdef CONFIG_USER_ONLY
+/* In user mode, BE32 data accesses are just modelled as
+ * regular BE access. In system mode, BE32 is modelled as
+ * little endian, with the appropriate address translations on
+ * non-word accesses. So sctlr.b only affects overall
+ * endianness in user mode
+ */
+arm_sctlr_b(env) ||
+#endif
+((env->uncached_cpsr & CPSR_E) ? 1 : 0);
 }
 
 cur_el = arm_current_el(env);
-- 
1.9.1

[Qemu-devel] [PATCHv3 2/9] pseries: Cleanup error handling of spapr_cpu_init()

2016-01-17 Thread David Gibson

Currently spapr_cpu_init() is hardcoded to handle any errors as fatal.
That works for now, since it's only called from initial setup where an
error here means we really can't proceed.

However, we'll want to handle this more flexibly for cpu hotplug in future
so generalize this using the error reporting infrastructure.  While we're
at it make a small cleanup in a related part of ppc_spapr_init() to use
error_report() instead of an old-style explicit fprintf().
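
The calling convention this patch moves to can be sketched with a toy error type. Everything here is an assumption made for illustration: `DemoError`, `demo_error_set`, `demo_set_compat` and `demo_cpu_init` are hypothetical stand-ins, not QEMU's Error API, and the failure condition is invented.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy error object standing in for a real error type. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Store a newly allocated error in *errp; a NULL errp means "caller ignores errors". */
static void demo_error_set(DemoError **errp, const char *msg)
{
    if (errp == NULL) {
        return;
    }
    DemoError *err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    strncpy(err->msg, msg, sizeof(err->msg) - 1);
    *errp = err;
}

/* Callee that may fail (the failure condition is made up for the sketch). */
static void demo_set_compat(int level, DemoError **errp)
{
    if (level < 0) {
        demo_error_set(errp, "invalid compatibility level");
    }
}

/*
 * Caller that no longer exits on failure: it collects the error in a
 * local variable, hands it to its own caller and returns early.
 */
static void demo_cpu_init(int level, DemoError **errp)
{
    DemoError *local_err = NULL;

    demo_set_compat(level, &local_err);
    if (local_err) {
        if (errp) {
            *errp = local_err;   /* propagate ownership upwards */
        } else {
            free(local_err);     /* nobody wants it: drop it */
        }
        return;
    }
    /* ... remaining setup would go here ... */
}
```

The point of the local variable is that the function can tell whether the callee failed even when its own caller passed NULL.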

Signed-off-by: David Gibson 
Reviewed-by: Bharata B Rao 
---
 hw/ppc/spapr.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fa7a7f4..b7fd09a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1617,7 +1617,8 @@ static void spapr_boot_set(void *opaque, const char *boot_device,
 machine->boot_order = g_strdup(boot_device);
 }
 
-static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu)
+static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu,
+   Error **errp)
 {
 CPUPPCState *env = &cpu->env;
 
@@ -1635,7 +1636,13 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu)
 }
 
 if (cpu->max_compat) {
-ppc_set_compat(cpu, cpu->max_compat, &error_fatal);
+Error *local_err = NULL;
+
+ppc_set_compat(cpu, cpu->max_compat, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
 }
 
 xics_cpu_setup(spapr->icp, cpu);
@@ -1804,10 +1811,10 @@ static void ppc_spapr_init(MachineState *machine)
 for (i = 0; i < smp_cpus; i++) {
 cpu = cpu_ppc_init(machine->cpu_model);
 if (cpu == NULL) {
-fprintf(stderr, "Unable to find PowerPC CPU definition\n");
+error_report("Unable to find PowerPC CPU definition");
 exit(1);
 }
-spapr_cpu_init(spapr, cpu);
+spapr_cpu_init(spapr, cpu, &error_fatal);
 }
 
 if (kvm_enabled()) {
-- 
2.5.0

[Qemu-devel] [PATCHv3 4/9] pseries: Clean up error handling in spapr_validate_node_memory()

2016-01-17 Thread David Gibson

Use error_setg() and return an error, rather than using an explicit exit().

Also improve messages, and be more explicit about which constraint failed.
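
A standalone sketch of the validation described above, reporting exactly which constraint failed. The names are hypothetical and the error is returned through a caller-supplied message buffer instead of QEMU's error objects; the 256 MiB block size is taken from the comment quoted in the diff below.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_BLOCK_SIZE (256ULL * 1024 * 1024)  /* 256 MiB memory block */
#define DEMO_BLOCK_MIB  ((unsigned long long)(DEMO_BLOCK_SIZE >> 20))

/*
 * Check that the RAM size, the maximum RAM size and every NUMA node size
 * are multiples of the block size. On failure, write a message naming the
 * offending value into msg and return false; the caller decides what to do.
 */
static bool demo_validate_memory(uint64_t ram_size, uint64_t maxram_size,
                                 const uint64_t *node_mem, int nb_nodes,
                                 char *msg, size_t msg_len)
{
    if (ram_size % DEMO_BLOCK_SIZE) {
        snprintf(msg, msg_len, "Memory size 0x%" PRIx64
                 " is not aligned to %llu MiB", ram_size, DEMO_BLOCK_MIB);
        return false;
    }
    if (maxram_size % DEMO_BLOCK_SIZE) {
        snprintf(msg, msg_len, "Maximum memory size 0x%" PRIx64
                 " is not aligned to %llu MiB", maxram_size, DEMO_BLOCK_MIB);
        return false;
    }
    for (int i = 0; i < nb_nodes; i++) {
        if (node_mem[i] % DEMO_BLOCK_SIZE) {
            snprintf(msg, msg_len, "Node %d memory size 0x%" PRIx64
                     " is not aligned to %llu MiB", i, node_mem[i],
                     DEMO_BLOCK_MIB);
            return false;
        }
    }
    return true;
}
```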

Signed-off-by: David Gibson 
Reviewed-by: Bharata B Rao 
---
 hw/ppc/spapr.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d28e349..87097bc 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1699,27 +1699,34 @@ static void spapr_create_lmb_dr_connectors(sPAPRMachineState *spapr)
  * to SPAPR_MEMORY_BLOCK_SIZE(256MB), then refuse to start the guest
  * since we can't support such unaligned sizes with DRCONF_MEMORY.
  */
-static void spapr_validate_node_memory(MachineState *machine)
+static void spapr_validate_node_memory(MachineState *machine, Error **errp)
 {
 int i;
 
-if (machine->maxram_size % SPAPR_MEMORY_BLOCK_SIZE ||
-machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
-error_report("Can't support memory configuration where RAM size "
- "0x" RAM_ADDR_FMT " or maxmem size "
- "0x" RAM_ADDR_FMT " isn't aligned to %llu MB",
- machine->ram_size, machine->maxram_size,
- SPAPR_MEMORY_BLOCK_SIZE/M_BYTE);
-exit(EXIT_FAILURE);
+if (machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
+error_setg(errp, "Memory size 0x" RAM_ADDR_FMT
+   " is not aligned to %llu MiB",
+   machine->ram_size,
+   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
+return;
+}
+
+if (machine->maxram_size % SPAPR_MEMORY_BLOCK_SIZE) {
+error_setg(errp, "Maximum memory size 0x" RAM_ADDR_FMT
+   " is not aligned to %llu MiB",
+   machine->maxram_size,
+   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
+return;
 }
 
 for (i = 0; i < nb_numa_nodes; i++) {
 if (numa_info[i].node_mem % SPAPR_MEMORY_BLOCK_SIZE) {
-error_report("Can't support memory configuration where memory size"
- " %" PRIx64 " of node %d isn't aligned to %llu MB",
- numa_info[i].node_mem, i,
- SPAPR_MEMORY_BLOCK_SIZE/M_BYTE);
-exit(EXIT_FAILURE);
+error_setg(errp,
+   "Node %d memory size 0x" RAM_ADDR_FMT
+   " is not aligned to %llu MiB",
+   i, numa_info[i].node_mem,
+   SPAPR_MEMORY_BLOCK_SIZE / M_BYTE);
+return;
 }
 }
 }
@@ -1809,7 +1816,7 @@ static void ppc_spapr_init(MachineState *machine)
   XICS_IRQS);
 
 if (smc->dr_lmb_enabled) {
-spapr_validate_node_memory(machine);
+spapr_validate_node_memory(machine, &error_fatal);
 }
 
 /* init CPUs */
-- 
2.5.0

[Qemu-devel] [PATCHv3 5/9] pseries: Cleanup error handling in spapr_vga_init()

2016-01-17 Thread David Gibson

Use error_setg() to return an error rather than an explicit exit().
Previously it was an exit(0) instead of a non-zero exit code, which was
simply a bug.  Also improve the error message.

While we're at it change the type of spapr_vga_init() to bool since that's
how we're using it anyway.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 87097bc..bb5eaa5 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1246,7 +1246,7 @@ static void spapr_rtc_create(sPAPRMachineState *spapr)
 }
 
 /* Returns whether we want to use VGA or not */
-static int spapr_vga_init(PCIBus *pci_bus)
+static bool spapr_vga_init(PCIBus *pci_bus, Error **errp)
 {
 switch (vga_interface_type) {
 case VGA_NONE:
@@ -1257,9 +1257,9 @@ static int spapr_vga_init(PCIBus *pci_bus)
 case VGA_VIRTIO:
 return pci_vga_init(pci_bus) != NULL;
 default:
-fprintf(stderr, "This vga model is not supported,"
-"currently it only supports -vga std\n");
-exit(0);
+error_setg(errp,
+   "Unsupported VGA mode, only -vga std or -vga virtio is supported");
+return false;
 }
 }
 
@@ -1934,7 +1934,7 @@ static void ppc_spapr_init(MachineState *machine)
 }
 
 /* Graphics */
-if (spapr_vga_init(phb->bus)) {
+if (spapr_vga_init(phb->bus, &error_fatal)) {
 spapr->has_graphics = true;
 machine->usb |= defaults_enabled() && !machine->usb_disabled;
 }
-- 
2.5.0

[Qemu-devel] [PATCHv3 1/9] ppc: Cleanup error handling in ppc_set_compat()

2016-01-17 Thread David Gibson

Current ppc_set_compat() returns -1 for errors, and also (unconditionally)
reports an error message.  The caller in h_client_architecture_support()
may then report it again using an outdated fprintf().

Clean this up by using the modern error reporting mechanisms.  Also add
strerror(errno) to the error message.

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
---
 hw/ppc/spapr.c  |  4 +---
 hw/ppc/spapr_hcall.c| 10 +-
 target-ppc/cpu.h|  2 +-
 target-ppc/translate_init.c | 13 +++--
 4 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 50e5a26..fa7a7f4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1635,9 +1635,7 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu)
 }
 
 if (cpu->max_compat) {
-if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
-exit(1);
-}
+ppc_set_compat(cpu, cpu->max_compat, &error_fatal);
 }
 
 xics_cpu_setup(spapr->icp, cpu);
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index cebceea..8b0fcb3 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -837,7 +837,7 @@ static target_ulong cas_get_option_vector(int vector, target_ulong table)
 typedef struct {
 PowerPCCPU *cpu;
 uint32_t cpu_version;
-int ret;
+Error *err;
 } SetCompatState;
 
 static void do_set_compat(void *arg)
@@ -845,7 +845,7 @@ static void do_set_compat(void *arg)
 SetCompatState *s = arg;
 
 cpu_synchronize_state(CPU(s->cpu));
-s->ret = ppc_set_compat(s->cpu, s->cpu_version);
+ppc_set_compat(s->cpu, s->cpu_version, &s->err);
 }
 
 #define get_compat_level(cpuver) ( \
@@ -929,13 +929,13 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
 SetCompatState s = {
 .cpu = POWERPC_CPU(cs),
 .cpu_version = cpu_version,
-.ret = 0
+.err = NULL,
 };
 
 run_on_cpu(cs, do_set_compat, &s);
 
-if (s.ret < 0) {
-fprintf(stderr, "Unable to set compatibility mode\n");
+if (s.err) {
+error_report_err(s.err);
 return H_HARDWARE;
 }
 }
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 9706000..b3b89e6 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1210,7 +1210,7 @@ void ppc_store_msr (CPUPPCState *env, target_ulong value);
 
 void ppc_cpu_list (FILE *f, fprintf_function cpu_fprintf);
 int ppc_get_compat_smt_threads(PowerPCCPU *cpu);
-int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version);
+void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version, Error **errp);
 
 /* Time-base and decrementer management */
 #ifndef NO_CPU_IO_DEFS
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 4ab2d92..678957a 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -9186,7 +9186,7 @@ int ppc_get_compat_smt_threads(PowerPCCPU *cpu)
 return ret;
 }
 
-int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
+void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version, Error **errp)
 {
 int ret = 0;
 CPUPPCState *env = &cpu->env;
@@ -9208,12 +9208,13 @@ int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
 break;
 }
 
-if (kvm_enabled() && kvmppc_set_compat(cpu, cpu->cpu_version) < 0) {
-error_report("Unable to set compatibility mode in KVM");
-ret = -1;
+if (kvm_enabled()) {
+ret = kvmppc_set_compat(cpu, cpu->cpu_version);
+if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Unable to set CPU compatibility mode in KVM");
+}
 }
-
-return ret;
 }
 
 static gint ppc_cpu_compare_class_pvr(gconstpointer a, gconstpointer b)
-- 
2.5.0
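
The calling convention this patch moves ppc_set_compat() to can be sketched
outside QEMU as follows. This is a minimal mock under stated assumptions:
DemoError, demo_error_set_errno(), demo_set_compat() and demo_caller() are
hypothetical names invented for illustration and are not QEMU's Error API;
backend_ret stands in for the value kvmppc_set_compat() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object; not the real type. */
typedef struct DemoError {
    char msg[256];
} DemoError;

/*
 * Plays the role that error_setg_errno() plays in the patch: store
 * "<text>: <strerror(os_errno)>" in *errp when the caller passed one.
 */
static void demo_error_set_errno(DemoError **errp, int os_errno,
                                 const char *text)
{
    DemoError *err;

    if (!errp) {
        return;                 /* caller does not want error details */
    }
    assert(*errp == NULL);      /* an earlier error must not be overwritten */

    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s: %s", text, strerror(os_errno));
    *errp = err;
}

/*
 * Callee in the new style: no return value, failure is reported only
 * through errp.  backend_ret is a negative errno value on failure.
 */
static void demo_set_compat(int backend_ret, DemoError **errp)
{
    if (backend_ret < 0) {
        demo_error_set_errno(errp, -backend_ret,
                             "Unable to set CPU compatibility mode in KVM");
    }
}

/* The caller decides the policy: here, report once and return a status. */
static int demo_caller(int backend_ret)
{
    DemoError *err = NULL;

    demo_set_compat(backend_ret, &err);
    if (err) {
        fprintf(stderr, "%s\n", err->msg);
        free(err);
        return -1;
    }
    return 0;
}
```

The point of the shape is that the message is produced exactly once, by
whichever caller owns the policy, which is what removes the double report
described in the commit message.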

[Qemu-devel] [RFC 0/3] Draft implementation of HPT resizing (qemu side)

2016-01-17 Thread David Gibson

Here is a draft qemu implementation of my proposed PAPR extension for
allowing runtime resizing of a KVM/ppc64 guest's hash page table.
That in turn will allow for more flexible memory hotplug.

This should work with the guest kernel side patches I also posted
recently [1].

Still required to make this into a full implementation:
  * Guest needs to auto-resize HPT on memory hotplug events

  * qemu needs to allocate HPT size based on current rather than
maximum memory if the guest is HPT resize aware

  * KVM host side implementation

  * PAPR standardization


[1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/90392

David Gibson (3):
  pseries: Stub hypercalls for HPT resizing
  pseries: Implement HPT resizing
  pseries: Advertise HPT resize capability

 hw/ppc/spapr.c  |   5 +-
 hw/ppc/spapr_hcall.c| 331 
 include/hw/ppc/spapr.h  |   9 +-
 target-ppc/mmu-hash64.h |   4 +
 trace-events|   2 +
 5 files changed, 348 insertions(+), 3 deletions(-)

-- 
2.5.0
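
The difference between sizing the HPT from current and from maximum memory
can be made concrete with a small sketch. The 1/128 ratio used here is an
assumption read off the "~2GiB HPT for a 256GiB guest" figure quoted in the
2/3 patch of this series, and the 18..46 bounds are the ones that patch checks
in h_resize_hpt_prepare(); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HPT_SHIFT_MIN 18   /* lower bound checked in the 2/3 patch */
#define DEMO_HPT_SHIFT_MAX 46   /* upper bound checked in the 2/3 patch */

/* HPT size in bytes for a given shift, i.e. 1ULL << shift. */
static uint64_t demo_hpt_size(int shift)
{
    assert(shift >= 0 && shift < 64);
    return 1ULL << shift;
}

/*
 * Smallest in-range shift whose HPT is at least ram_bytes / 128.
 * The 1/128 ratio is an assumption for this sketch, not a rule
 * stated by the series.
 */
static int demo_hpt_shift_for_ram(uint64_t ram_bytes)
{
    int shift = DEMO_HPT_SHIFT_MIN;

    /* (1 << (shift + 7)) is 128 times the HPT size for this shift. */
    while (shift < DEMO_HPT_SHIFT_MAX &&
           (1ULL << (shift + 7)) < ram_bytes) {
        shift++;
    }
    return shift;
}
```

Under these assumptions a guest started with 4GiB but a 256GiB maxmem would
get shift 25 (32MiB) when sized from current memory and shift 31 (2GiB) when
sized from maximum memory, which is the saving the second bullet is after.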

[Qemu-devel] [RFC 2/3] pseries: Implement HPT resizing

2016-01-17 Thread David Gibson

This patch implements hypercalls allowing a PAPR guest to resize its own
hash page table.  This will eventually allow for more flexible memory
hotplug.

The implementation is partially asynchronous, handled in a special thread
running the hpt_prepare_thread() function.  The state of a pending resize
is stored in SPAPR_MACHINE->pending_hpt.

The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or,
if one is already in progress, monitor it for completion.  If there is an
existing HPT resize in progress that doesn't match the size specified in
the call, it will cancel it, replacing it with a new one matching the
given size.

The H_RESIZE_HPT_COMMIT hypercall completes the transition to a resized HPT,
and can only be called successfully once H_RESIZE_HPT_PREPARE has successfully
completed initialization of a new HPT.  The guest must ensure that there
are no concurrent accesses to the existing HPT while this is called (this
effectively means stop_machine() for Linux guests).

For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each
HPTE into the new HPT.  This can have quite high latency, but it seems to
be of the order of typical migration downtime latencies for HPTs of size
up to ~2GiB (which would be used in a 256GiB guest).

In future we probably want to move more of the rehashing to the "prepare"
phase, by having H_ENTER and other hcalls update both current and
pending HPTs.  That's a project for another day, but should be possible
without any changes to the guest interface.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c  |   2 -
 hw/ppc/spapr_hcall.c| 308 +++-
 include/hw/ppc/spapr.h  |   5 +
 target-ppc/mmu-hash64.h |   4 +
 4 files changed, 314 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 50e5a26..e26baca 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -90,8 +90,6 @@
 
 #define PHANDLE_XICP0x
 
-#define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
-
 static XICSState *try_create_xics(const char *type, int nr_servers,
   int nr_irqs, Error **errp)
 {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 01c034c..1d5efef 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1,4 +1,5 @@
 #include "sysemu/sysemu.h"
+#include "qemu/error-report.h"
 #include "cpu.h"
 #include "helper_regs.h"
 #include "hw/ppc/spapr.h"
@@ -316,16 +317,278 @@ static target_ulong h_read(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return H_SUCCESS;
 }
 
+struct sPAPRPendingHPT {
+/* These fields are read-only after initialization */
+int shift;
+QemuThread thread;
+
+/* These fields are protected by the BQL */
+bool complete;
+
+/* These fields are private to the preparation thread if
+ * !complete, otherwise protected by the BQL */
+int ret;
+void *hpt;
+};
+
+static void free_pending_hpt(sPAPRPendingHPT *pending)
+{
+if (pending->hpt) {
+qemu_vfree(pending->hpt);
+}
+
+g_free(pending);
+}
+
+static void *hpt_prepare_thread(void *opaque)
+{
+sPAPRPendingHPT *pending = opaque;
+size_t size = 1ULL << pending->shift;
+
+pending->hpt = qemu_memalign(size, size);
+if (pending->hpt) {
+memset(pending->hpt, 0, size);
+pending->ret = H_SUCCESS;
+} else {
+pending->ret = H_NO_MEM;
+}
+
+qemu_mutex_lock_iothread();
+
+if (SPAPR_MACHINE(qdev_get_machine())->pending_hpt != pending) {
+/* We've been cancelled, clean ourselves up */
+free_pending_hpt(pending);
+goto out;
+}
+
+pending->complete = true;
+
+out:
+qemu_mutex_unlock_iothread();
+return NULL;
+}
+
+/* Must be called with BQL held */
+static void cancel_hpt_prepare(sPAPRMachineState *spapr)
+{
+sPAPRPendingHPT *pending = spapr->pending_hpt;
+
+/* Let the thread know it's cancelled */
+spapr->pending_hpt = NULL;
+
+if (!pending) {
+/* Nothing to do */
+return;
+}
+
+if (!pending->complete) {
+/* thread will clean itself up */
+return;
+}
+
+free_pending_hpt(pending);
+}
+
 static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
  sPAPRMachineState *spapr,
  target_ulong opcode,
  target_ulong *args)
 {
 target_ulong flags = args[0];
-target_ulong shift = args[1];
+int shift = args[1];
+sPAPRPendingHPT *pending = spapr->pending_hpt;
 
 trace_spapr_h_resize_hpt_prepare(flags, shift);
-return H_HARDWARE;
+
+if (flags != 0) {
+return H_PARAMETER;
+}
+
+if (shift && ((shift < 18) || (shift > 46))) {
+return H_PARAMETER;
+}
+
+if (pending) {
+/* something already in progress */
+if (pending->shift == shift) {
+/* and it's suitable */
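
(The archived copy of this patch is cut off at this point.)

The ownership rule shared by hpt_prepare_thread() and cancel_hpt_prepare()
above can be sketched without threads. This is a single-threaded simulation
under stated assumptions: DemoPending and DemoMachine are simplified,
hypothetical analogues of sPAPRPendingHPT and the machine state, the two
counters exist only to make the example observable, and every function here
is assumed to run where the patch holds the BQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified analogue of sPAPRPendingHPT. */
typedef struct DemoPending {
    bool complete;
    void *hpt;
} DemoPending;

/* Hypothetical analogue of the machine state holding pending_hpt. */
typedef struct DemoMachine {
    DemoPending *pending;
    int frees_by_worker;        /* instrumentation for the example only */
    int frees_by_canceller;     /* instrumentation for the example only */
} DemoMachine;

static void demo_free_pending(DemoPending *p)
{
    free(p->hpt);               /* free(NULL) is fine if allocation failed */
    free(p);
}

/* Start a preparation; in the patch this is where the thread is created. */
static DemoPending *demo_start(DemoMachine *m, size_t size)
{
    DemoPending *p = calloc(1, sizeof(*p));

    if (!p) {
        abort();
    }
    p->hpt = calloc(1, size);   /* the patch allocates this in the thread */
    m->pending = p;
    return p;
}

/* Final step of the worker, after the new table has been prepared. */
static void demo_worker_finish(DemoMachine *m, DemoPending *p)
{
    if (m->pending != p) {
        /* Cancelled while working: nobody else references p any more. */
        demo_free_pending(p);
        m->frees_by_worker++;
        return;
    }
    p->complete = true;
}

/* Cancel whatever is pending, if anything. */
static void demo_cancel(DemoMachine *m)
{
    DemoPending *p = m->pending;

    m->pending = NULL;          /* this is how the worker learns about it */
    if (!p || !p->complete) {
        return;                 /* nothing to do, or the worker cleans up */
    }
    demo_free_pending(p);
    m->frees_by_canceller++;
}
```

Whichever side observes the other's state last does the free: cancelling an
unfinished preparation leaves cleanup to the worker, while cancelling a
completed one frees it directly, so each pending table is released exactly
once.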

[Qemu-devel] [RFC 3/3] pseries: Advertise HPT resize capability

2016-01-17 Thread David Gibson

This adds a new string to the hypertas property in the device tree,
advertising to the guest the availability of the HPT resizing hypercalls.
This is a tentative suggested value, and would need to be standardized by
PAPR before being merged.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e26baca..1147382 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -334,6 +334,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 add_str(hypertas, "hcall-splpar");
 add_str(hypertas, "hcall-bulk");
 add_str(hypertas, "hcall-set-mode");
+if (!kvm_enabled()) { /* Not implemented in KVM yet */
+add_str(hypertas, "hcall-hpt-resize");
+}
 add_str(qemu_hypertas, "hcall-memop1");
 
 fdt = g_malloc0(FDT_MAX_SIZE);
-- 
2.5.0

[Qemu-devel] [RFC 1/3] pseries: Stub hypercalls for HPT resizing

2016-01-17 Thread David Gibson

This introduces stub implementations of the H_RESIZE_HPT_PREPARE and
H_RESIZE_HPT_COMMIT hypercalls which we hope to add in a PAPR extension to
allow run time resizing of a guest's hash page table.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_hcall.c   | 29 +
 include/hw/ppc/spapr.h |  4 +++-
 trace-events   |  2 ++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 1a1bea8..01c034c 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -316,6 +316,30 @@ static target_ulong h_read(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return H_SUCCESS;
 }
 
+static target_ulong h_resize_hpt_prepare(PowerPCCPU *cpu,
+ sPAPRMachineState *spapr,
+ target_ulong opcode,
+ target_ulong *args)
+{
+target_ulong flags = args[0];
+target_ulong shift = args[1];
+
+trace_spapr_h_resize_hpt_prepare(flags, shift);
+return H_HARDWARE;
+}
+
+static target_ulong h_resize_hpt_commit(PowerPCCPU *cpu,
+sPAPRMachineState *spapr,
+target_ulong opcode,
+target_ulong *args)
+{
+target_ulong flags = args[0];
+target_ulong shift = args[1];
+
+trace_spapr_h_resize_hpt_commit(flags, shift);
+return H_HARDWARE;
+}
+
 static target_ulong h_set_dabr(PowerPCCPU *cpu, sPAPRMachineState *spapr,
target_ulong opcode, target_ulong *args)
 {
@@ -974,6 +998,11 @@ static void hypercall_register_types(void)
 /* hcall-bulk */
 spapr_register_hypercall(H_BULK_REMOVE, h_bulk_remove);
 
+/* hcall-hpt-resize */
+spapr_register_hypercall(KVMPPC_H_RESIZE_HPT_PREPARE,
+ h_resize_hpt_prepare);
+spapr_register_hypercall(KVMPPC_H_RESIZE_HPT_COMMIT, h_resize_hpt_commit);
+
 /* hcall-dabr */
 spapr_register_hypercall(H_SET_DABR, h_set_dabr);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 53af76a..028afc9 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -352,7 +352,9 @@ struct sPAPRMachineState {
 #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
 /* Client Architecture support */
 #define KVMPPC_H_CAS            (KVMPPC_HCALL_BASE + 0x2)
-#define KVMPPC_HCALL_MAX        KVMPPC_H_CAS
+#define KVMPPC_H_RESIZE_HPT_PREPARE (KVMPPC_HCALL_BASE + 0x3)
+#define KVMPPC_H_RESIZE_HPT_COMMIT  (KVMPPC_HCALL_BASE + 0x4)
+#define KVMPPC_HCALL_MAX            KVMPPC_H_RESIZE_HPT_COMMIT
 
 typedef struct sPAPRDeviceTreeUpdateHeader {
 uint32_t version_id;
diff --git a/trace-events b/trace-events
index 934a7b6..f0d6e49 100644
--- a/trace-events
+++ b/trace-events
@@ -1403,6 +1403,8 @@ spapr_cas_continue(unsigned long n) "Copy changes to the guest: %ld bytes"
 # hw/ppc/spapr_hcall.c
 spapr_cas_pvr_try(uint32_t pvr) "%x"
 spapr_cas_pvr(uint32_t cur_pvr, bool cpu_match, uint32_t new_pvr, uint64_t pcr) "current=%x, cpu_match=%u, new=%x, compat flags=%"PRIx64
+spapr_h_resize_hpt_prepare(uint64_t flags, uint64_t shift) "flags=0x%"PRIx64", shift=%"PRIu64
+spapr_h_resize_hpt_commit(uint64_t flags, uint64_t shift) "flags=0x%"PRIx64", shift=%"PRIu64
 
 # hw/ppc/spapr_iommu.c
 spapr_iommu_put(uint64_t liobn, uint64_t ioba, uint64_t tce, uint64_t ret) "liobn=%"PRIx64" ioba=0x%"PRIx64" tce=0x%"PRIx64" ret=%"PRId64
-- 
2.5.0

Re: [Qemu-devel] [RFC 0/3] Draft implementation of HPT resizing (qemu side)

2016-01-17 Thread David Gibson

On Mon, Jan 18, 2016 at 04:44:38PM +1100, David Gibson wrote:
> Here is a draft qemu implementation of my proposed PAPR extension for
> allowing runtime resizing of a KVM/ppc64 guest's hash page table.
> That in turn will allow for more flexible memory hotplug.
> 
> This should work with the guest kernel side patches I also posted
> recently [1].
> 
> Still required to make this into a full implementation:
>   * Guest needs to auto-resize HPT on memory hotplug events
> 
>   * qemu needs to allocate HPT size based on current rather than
> maximum memory if the guest is HPT resize aware
> 
>   * KVM host side implementation
> 
>   * PAPR standardization
> 
> 
> [1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/90392

Sorry, forgot to mention that this series applies on top of my page
size handling cleanup series posted recently.

> 
> David Gibson (3):
>   pseries: Stub hypercalls for HPT resizing
>   pseries: Implement HPT resizing
>   pseries: Advertise HPT resize capability
> 
>  hw/ppc/spapr.c  |   5 +-
>  hw/ppc/spapr_hcall.c| 331 
> 
>  include/hw/ppc/spapr.h  |   9 +-
>  target-ppc/mmu-hash64.h |   4 +
>  trace-events|   2 +
>  5 files changed, 348 insertions(+), 3 deletions(-)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 03/10] pseries: Clean up hash page table allocation error handling

2016-01-17 Thread Alexey Kardashevskiy


On 01/18/2016 04:35 PM, David Gibson wrote:
> On Mon, Jan 18, 2016 at 04:17:08PM +1100, Alexey Kardashevskiy wrote:
>> On 01/18/2016 03:42 PM, David Gibson wrote:
>>> On Mon, Jan 18, 2016 at 01:44:00PM +1100, Alexey Kardashevskiy wrote:
>>>> On 01/15/2016 11:00 PM, David Gibson wrote:
>>>>> The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
>>>>> all errors with error_setg(&error_abort, ...).
>>>>>
>>>>> But really, the callers are really better placed to decide on the error
>>>>> handling.  So, instead make the functions use the error propagation
>>>>> infrastructure.
>>>>>
>>>>> In the callers we change to &error_fatal instead of &error_abort, since
>>>>> this can be triggered by a bad configuration or kernel error rather than
>>>>> indicating a programming error in qemu.
>>>>>
>>>>> While we're at it improve the messages themselves a bit, and clean up the
>>>>> indentation a little.
>>>>>
>>>>> Signed-off-by: David Gibson
>>>>> ---
>>>>>   hw/ppc/spapr.c | 24 
>>>>>   1 file changed, 16 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index b7fd09a..d28e349 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
>>>>>   #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
>>>>>   #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))
>>>>>
>>>>> -static void spapr_alloc_htab(sPAPRMachineState *spapr)
>>>>> +static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
>>>>>   {
>>>>>   long shift;
>>>>>   int index;
>>>>> @@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>>>>>    * For HV KVM, host kernel will return -ENOMEM when requested
>>>>>    * HTAB size can't be allocated.
>>>>>    */
>>>>> -error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
>>>>> +error_setg_errno(errp, -shift,
>>>>> + "Error allocating KVM hash page table, try smaller maxmem");
>>>>>   } else if (shift > 0) {
>>>>>   /*
>>>>>    * Kernel handles htab, we don't need to allocate one
>>>>> @@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>>>>>    * but we don't allow booting of such guests.
>>>>>    */
>>>>>   if (shift != spapr->htab_shift) {
>>>>> -error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
>>>>> +error_setg(errp,
>>>>> +"Small allocation for KVM hash page table (%ld < %"
>>>>> +PRIu32 "), try smaller maxmem",
>>>>
>>>> Even though it is not in the CODING_STYLE, I have not seen anyone objecting
>>>> the very good kernel's "never break user-visible strings" rule or rejecting
>>>> patches with user-visible strings failing to fit 80 chars limit.
>>>
>>> I'm not.  Or rather, the string is already broken by the PRIu32, so
>>> the newline doesn't make it any less greppable.
>>
>> "KVM hash page table.*smaller maxmem" stopped working. Not a big deal but I
>> do not see any win in breaking strings anyway.
>
> The problem is that the current pre-commit hooks don't agree with
> you.  They seem to allow long unbroken strings, but if there's a break
> like the PRIu32, they won't permit the commit.

checkpatch.pl reports it as "WARNING: line over 80 characters", not an ERROR,
so I'd say the hook has a problem.




--
Alexey

[Qemu-devel] [PATCHv3 0/9] Cleanups to error reporting on ppc and spapr

2016-01-17 Thread David Gibson

Another spin of my patches to clean up a bunch of error reporting in
the pseries machine type and target-ppc code, to better use the error
API.

Once reviewed, I hope to merge this into ppc-for-2.6 shortly.

Changes in v3:
 * Adjusted a commit message for accuracy (suggested by Markus)
 * Dropped a patch which relied on a wrong guess about the behaviour
   of foreach_dynamic_sysbus_device().
Changes in v2:
 * Assorted minor tweaks based on review

David Gibson (9):
  ppc: Cleanup error handling in ppc_set_compat()
  pseries: Cleanup error handling of spapr_cpu_init()
  pseries: Clean up hash page table allocation error handling
  pseries: Clean up error handling in spapr_validate_node_memory()
  pseries: Cleanup error handling in spapr_vga_init()
  pseries: Clean up error handling in spapr_rtas_register()
  pseries: Clean up error handling in xics_system_init()
  pseries: Clean up error reporting in ppc_spapr_init()
  pseries: Clean up error reporting in htab migration functions

 hw/ppc/spapr.c  | 134 ++--
 hw/ppc/spapr_hcall.c|  10 ++--
 hw/ppc/spapr_rtas.c |  12 +---
 target-ppc/cpu.h|   2 +-
 target-ppc/translate_init.c |  13 +++--
 5 files changed, 94 insertions(+), 77 deletions(-)

-- 
2.5.0

[Qemu-devel] [PATCHv3 3/9] pseries: Clean up hash page table allocation error handling

2016-01-17 Thread David Gibson

The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
all errors with error_setg(&error_abort, ...).

But the callers are better placed to decide on the error handling.  So,
instead make the functions use the error propagation infrastructure.

In the callers we change to &error_fatal instead of &error_abort, since
this can be triggered by a bad configuration or kernel error rather than
indicating a programming error in qemu.

While we're at it improve the messages themselves a bit, and clean up the
indentation a little.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b7fd09a..d28e349 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
 #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
 #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))
 
-static void spapr_alloc_htab(sPAPRMachineState *spapr)
+static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
 {
 long shift;
 int index;
@@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
  * For HV KVM, host kernel will return -ENOMEM when requested
  * HTAB size can't be allocated.
  */
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_setg_errno(errp, -shift,
+ "Error allocating KVM hash page table, try smaller maxmem");
 } else if (shift > 0) {
 /*
  * Kernel handles htab, we don't need to allocate one
@@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
  * but we don't allow booting of such guests.
  */
 if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_setg(errp,
+"Small allocation for KVM hash page table (%ld < %"
+PRIu32 "), try smaller maxmem",
+shift, spapr->htab_shift);
 }
 
 spapr->htab_shift = shift;
@@ -1064,17 +1068,21 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
  * If host kernel has allocated HTAB, KVM_PPC_ALLOCATE_HTAB ioctl is
  * used to clear HTAB. Otherwise QEMU-allocated HTAB is cleared manually.
  */
-static void spapr_reset_htab(sPAPRMachineState *spapr)
+static void spapr_reset_htab(sPAPRMachineState *spapr, Error **errp)
 {
 long shift;
 int index;
 
 shift = kvmppc_reset_htab(spapr->htab_shift);
 if (shift < 0) {
-error_setg(&error_abort, "Failed to reset HTAB");
+error_setg_errno(errp, -shift,
+   "Error resetting KVM hash page table, try smaller maxmem");
 } else if (shift > 0) {
 if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Requested HTAB allocation failed during reset");
+error_setg(errp,
+"Reduced size on reset of KVM hash page table (%ld < %"
+PRIu32 "), try smaller maxmem",
+shift, spapr->htab_shift);
 }
 
 /* Tell readers to update their file descriptor */
@@ -1145,7 +1153,7 @@ static void ppc_spapr_reset(void)
 foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
 
 /* Reset the hash table & recalc the RMA */
-spapr_reset_htab(spapr);
+spapr_reset_htab(spapr, &error_fatal);
 
 qemu_devices_reset();
 
@@ -1792,7 +1800,7 @@ static void ppc_spapr_init(MachineState *machine)
 }
 spapr->htab_shift++;
 }
-spapr_alloc_htab(spapr);
+spapr_alloc_htab(spapr, &error_fatal);
 
 /* Set up Interrupt Controller before we create the VCPUs */
 spapr->icp = xics_system_init(machine,
-- 
2.5.0
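
The three outcomes that spapr_alloc_htab() distinguishes after this patch can
be sketched as a small classifier. This is an illustration under stated
assumptions: DemoHtabResult and demo_check_htab() are hypothetical names,
shift stands in for the value returned by the KVM allocation call, and the
sketch returns a result code instead of filling in an Error object. The
message strings are the ones from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

typedef enum DemoHtabResult {
    DEMO_HTAB_ERROR,        /* negative errno from the kernel */
    DEMO_HTAB_KERNEL,       /* kernel allocated the requested size */
    DEMO_HTAB_TOO_SMALL,    /* kernel allocated, but a different size */
    DEMO_HTAB_QEMU          /* zero: QEMU must allocate the table itself */
} DemoHtabResult;

/* Classify the allocation result and, on failure, format a message. */
static DemoHtabResult demo_check_htab(long shift, uint32_t requested,
                                      char *msg, size_t msg_len)
{
    assert(msg && msg_len > 0);
    msg[0] = '\0';

    if (shift < 0) {
        /* Same "<text>: <strerror>" shape that error_setg_errno() gives. */
        snprintf(msg, msg_len,
                 "Error allocating KVM hash page table, try smaller maxmem: %s",
                 strerror((int)-shift));
        return DEMO_HTAB_ERROR;
    }
    if (shift > 0) {
        if (shift != (long)requested) {
            snprintf(msg, msg_len,
                     "Small allocation for KVM hash page table (%ld < %"
                     PRIu32 "), try smaller maxmem", shift, requested);
            return DEMO_HTAB_TOO_SMALL;
        }
        return DEMO_HTAB_KERNEL;
    }
    return DEMO_HTAB_QEMU;
}
```

Note that the format string in the size-mismatch case has to be split around
PRIu32 whatever the line length, which is the greppability point raised in
review of the earlier version of this patch.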

[Qemu-devel] [PATCHv3 7/9] pseries: Clean up error handling in xics_system_init()

2016-01-17 Thread David Gibson

Use the error handling infrastructure to pass an error out from
try_create_xics() instead of assuming &error_abort - the caller is in a
better position to decide on error handling policy.

Also change the error handling from &error_abort to &error_fatal, since
this occurs during the initial machine construction and could be triggered
by bad configuration rather than a program error.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index bb5eaa5..148ca5a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -111,7 +111,7 @@ static XICSState *try_create_xics(const char *type, int nr_servers,
 }
 
 static XICSState *xics_system_init(MachineState *machine,
-   int nr_servers, int nr_irqs)
+   int nr_servers, int nr_irqs, Error **errp)
 {
 XICSState *icp = NULL;
 
@@ -130,7 +130,7 @@ static XICSState *xics_system_init(MachineState *machine,
 }
 
 if (!icp) {
-icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs, &error_abort);
+icp = try_create_xics(TYPE_XICS, nr_servers, nr_irqs, errp);
 }
 
 return icp;
@@ -1813,7 +1813,7 @@ static void ppc_spapr_init(MachineState *machine)
 spapr->icp = xics_system_init(machine,
   DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(),
smp_threads),
-  XICS_IRQS);
+  XICS_IRQS, &error_fatal);
 
 if (smc->dr_lmb_enabled) {
 spapr_validate_node_memory(machine, &error_fatal);
-- 
2.5.0

[Qemu-devel] [PATCHv3 6/9] pseries: Clean up error handling in spapr_rtas_register()

2016-01-17 Thread David Gibson

The errors detected in this function necessarily indicate bugs in the rest
of the qemu code, rather than an external or configuration problem.

So, a simple assert() is more appropriate than any more complex error
reporting.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_rtas.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 34b12a3..0be52ae 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -648,17 +648,11 @@ target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 
 void spapr_rtas_register(int token, const char *name, spapr_rtas_fn fn)
 {
-if (!((token >= RTAS_TOKEN_BASE) && (token < RTAS_TOKEN_MAX))) {
-fprintf(stderr, "RTAS invalid token 0x%x\n", token);
-exit(1);
-}
+assert((token >= RTAS_TOKEN_BASE) && (token < RTAS_TOKEN_MAX));
 
 token -= RTAS_TOKEN_BASE;
-if (rtas_table[token].name) {
-fprintf(stderr, "RTAS call \"%s\" is registered already as 0x%x\n",
-rtas_table[token].name, token);
-exit(1);
-}
+
+assert(!rtas_table[token].name);
 
 rtas_table[token].name = name;
 rtas_table[token].fn = fn;
-- 
2.5.0
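
The reasoning in this commit message, that a bad registration is a bug in the
program rather than bad input, can be illustrated with a small registration
table. Everything here is a hypothetical sketch: the token range, demo_table
and the demo_* functions are invented for the example, and the lookup helper
is an addition meant to contrast with registration, not something the patch
contains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token range; the real values live in QEMU's headers. */
#define DEMO_TOKEN_BASE 0x2000
#define DEMO_TOKEN_MAX  (DEMO_TOKEN_BASE + 8)

typedef int (*demo_rtas_fn)(int arg);

static struct {
    const char *name;
    demo_rtas_fn fn;
} demo_table[DEMO_TOKEN_MAX - DEMO_TOKEN_BASE];

/*
 * Registration happens from code compiled into the program, so a bad
 * token or a duplicate is a programming error: assert, do not report.
 */
static void demo_rtas_register(int token, const char *name, demo_rtas_fn fn)
{
    assert(token >= DEMO_TOKEN_BASE && token < DEMO_TOKEN_MAX);

    token -= DEMO_TOKEN_BASE;

    assert(!demo_table[token].name);

    demo_table[token].name = name;
    demo_table[token].fn = fn;
}

/*
 * Lookup, by contrast, would be driven by a value the guest supplies,
 * so an unknown token is handled at run time instead of asserting.
 */
static demo_rtas_fn demo_rtas_lookup(int token)
{
    if (token < DEMO_TOKEN_BASE || token >= DEMO_TOKEN_MAX) {
        return NULL;
    }
    return demo_table[token - DEMO_TOKEN_BASE].fn;
}

/* Trivial handler used to exercise the table. */
static int demo_double(int arg)
{
    return 2 * arg;
}
```

The split mirrors the policy used across this series: assert() for conditions
only a code change can cause, and proper error reporting for anything that
configuration or the guest can trigger.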

Re: [Qemu-devel] [PATCH RFC 0/4] ARM SMMUv3 Emulation

2016-01-17 Thread Prem (Premachandra) Mallappa

> Edgar has done all of the SMMU work for Xilinx, he knows it the best.
> I'll let him comment on it.
> 
> For anyone interested you can see our implementation at:
> https://github.com/Xilinx/qemu/blob/master/hw/misc/arm-smmu.c. It does
> use the register API that we have been trying to upstream.
> 
Hi,
I took a quick look at the code. The Xilinx model implements the MMU-500, which
is an SMMUv2 device.
I am not very familiar with v2; however, the architecture and internal workings
differ between v2 and v3.
Hence there are two different drivers in Linux for v2 and v3, and the code I
posted is for SMMUv3.

Cheers,
/Prem

Re: [Qemu-devel] [PATCH] net: cadence_gem: check packet size in gem_recieve

2016-01-17 Thread Jason Wang



On 01/18/2016 01:34 PM, P J P wrote:
> +-- On Mon, 18 Jan 2016, Jason Wang wrote --+
> | > +if (size > sizeof(rxbuf) - sizeof(crc_val)) {
> | > +size = sizeof(rxbuf) - sizeof(crc_val);
> | > +}
> | > +bytes_to_copy = size;
> | > +
> | 
> | We probably need more check, is there any guarantee that size <= 2048?
> | If not, need fix.
>
>   Sorry? The above check would fix that, no?

You're right. Applied to my -net tree (after removing the unnecessary
whitespace change).

Thanks

> --
>  - P J P
> 47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

[Qemu-devel] [PATCH v1 08/17] target-arm: cpu: Move cpu_is_big_endian to header

2016-01-17 Thread Peter Crosthwaite

From: Peter Crosthwaite 

There is a CPU data endianness test that is used to drive the
virtio_big_endian test.

Move this up to the header so it can be more generally used for endian
tests. The KVM-specific cpu_synchronize_state() call is left behind in the
virtio-specific function.

Signed-off-by: Peter Crosthwaite 
---

 target-arm/cpu.c | 19 +++
 target-arm/cpu.h | 19 +++
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 35a1f12..d3b73bf 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -368,26 +368,13 @@ static void arm_cpu_kvm_set_irq(void *opaque, int irq, int level)
 #endif
 }
 
-static bool arm_cpu_is_big_endian(CPUState *cs)
+static bool arm_cpu_virtio_is_big_endian(CPUState *cs)
 {
 ARMCPU *cpu = ARM_CPU(cs);
 CPUARMState *env = &cpu->env;
-int cur_el;
 
 cpu_synchronize_state(cs);
-
-/* In 32bit guest endianness is determined by looking at CPSR's E bit */
-if (!is_a64(env)) {
-return (env->uncached_cpsr & CPSR_E) ? 1 : 0;
-}
-
-cur_el = arm_current_el(env);
-
-if (cur_el == 0) {
-return (env->cp15.sctlr_el[1] & SCTLR_E0E) != 0;
-}
-
-return (env->cp15.sctlr_el[cur_el] & SCTLR_EE) != 0;
+return arm_cpu_is_big_endian(env);
 }
 
 #endif
@@ -1420,7 +1407,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void *data)
 cc->do_unaligned_access = arm_cpu_do_unaligned_access;
 cc->get_phys_page_debug = arm_cpu_get_phys_page_debug;
 cc->vmsd = &vmstate_arm_cpu;
-cc->virtio_is_big_endian = arm_cpu_is_big_endian;
+cc->virtio_is_big_endian = arm_cpu_virtio_is_big_endian;
 #endif
 cc->gdb_num_core_regs = 26;
 cc->gdb_core_xml_file = "arm-core.xml";
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index f83070a..54675c7 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1795,6 +1795,25 @@ static inline bool arm_singlestep_active(CPUARMState *env)
 && arm_generate_debug_exceptions(env);
 }
 
+/* Return true if the processor is in big-endian mode. */
+static bool arm_cpu_is_big_endian(CPUARMState *env)
+{
+int cur_el;
+
+/* In 32bit endianness is determined by looking at CPSR's E bit */
+if (!is_a64(env)) {
+return (env->uncached_cpsr & CPSR_E) ? 1 : 0;
+}
+
+cur_el = arm_current_el(env);
+
+if (cur_el == 0) {
+return (env->cp15.sctlr_el[1] & SCTLR_E0E) != 0;
+}
+
+return (env->cp15.sctlr_el[cur_el] & SCTLR_EE) != 0;
+}
+
 #include "exec/cpu-all.h"
 
 /* Bit usage in the TB flags field: bit 31 indicates whether we are
-- 
1.9.1
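
The decision that the moved arm_cpu_is_big_endian() makes can be exercised on
its own with a mock CPU state. This is a sketch under stated assumptions:
DemoArmState is a hypothetical structure, not CPUARMState, and the DEMO_* bit
masks are local definitions using the architectural bit positions (CPSR.E at
bit 9, SCTLR E0E at bit 24, SCTLR EE at bit 25) rather than QEMU's macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local definitions for this sketch; not QEMU's CPSR_E / SCTLR_* macros. */
#define DEMO_CPSR_E     (1u << 9)
#define DEMO_SCTLR_E0E  (1u << 24)
#define DEMO_SCTLR_EE   (1u << 25)

/* Hypothetical, minimal stand-in for the parts of CPUARMState used here. */
typedef struct DemoArmState {
    bool aarch64;           /* true when executing in AArch64 state */
    int current_el;         /* current exception level, 0..3 */
    uint32_t cpsr;
    uint64_t sctlr_el[4];
} DemoArmState;

/* Return true if data accesses are currently big-endian. */
static bool demo_is_big_endian(const DemoArmState *env)
{
    assert(env->current_el >= 0 && env->current_el <= 3);

    /* In 32-bit state, endianness is CPSR's E bit. */
    if (!env->aarch64) {
        return (env->cpsr & DEMO_CPSR_E) != 0;
    }

    /* EL0 has no SCTLR of its own: EL1's E0E bit controls it. */
    if (env->current_el == 0) {
        return (env->sctlr_el[1] & DEMO_SCTLR_E0E) != 0;
    }

    /* Otherwise the EE bit of the current level's SCTLR applies. */
    return (env->sctlr_el[env->current_el] & DEMO_SCTLR_EE) != 0;
}
```

The EL0 case is the one that is easy to get wrong: setting only EE in
SCTLR_EL1 makes EL1 big-endian but leaves EL0 little-endian until E0E is set
as well.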

Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/4] target-ppc: use cpu_write_xer() helper in cpu_post_load

2016-01-17 Thread David Gibson

On Fri, Jan 08, 2016 at 01:25:32PM +1100, Alexey Kardashevskiy wrote:
> On 01/07/2016 05:22 AM, Mark Cave-Ayland wrote:
> >Otherwise some internal xer variables fail to get set post-migration.
> >
> >Signed-off-by: Mark Cave-Ayland 
> >---
> >  target-ppc/machine.c |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/target-ppc/machine.c b/target-ppc/machine.c
> >index 98fc63a..322ce84 100644
> >--- a/target-ppc/machine.c
> >+++ b/target-ppc/machine.c
> >@@ -168,7 +168,7 @@ static int cpu_post_load(void *opaque, int version_id)
> >  env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
> >  env->lr = env->spr[SPR_LR];
> >  env->ctr = env->spr[SPR_CTR];
> >-env->xer = env->spr[SPR_XER];
> >+cpu_write_xer(env, env->spr[SPR_XER]);
> >  #if defined(TARGET_PPC64)
> >  env->cfar = env->spr[SPR_CFAR];
> >  #endif
> >
> 
> Reviewed-by: Alexey Kardashevskiy 

I've merged just this patch into ppc-for-2.6.  The others in the
series have some problems which have been pointed out elsewhere.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 03/10] pseries: Clean up hash page table allocation error handling

2016-01-17 Thread David Gibson

On Mon, Jan 18, 2016 at 04:17:08PM +1100, Alexey Kardashevskiy wrote:
> On 01/18/2016 03:42 PM, David Gibson wrote:
> >On Mon, Jan 18, 2016 at 01:44:00PM +1100, Alexey Kardashevskiy wrote:
> >>On 01/15/2016 11:00 PM, David Gibson wrote:
> >>>The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
> >>>all errors with error_setg(&error_abort, ...).
> >>>
> >>>But really, the callers are really better placed to decide on the error
> >>>handling.  So, instead make the functions use the error propagation
> >>>infrastructure.
> >>>
> >>>In the callers we change to &error_fatal instead of &error_abort, since
> >>>this can be triggered by a bad configuration or kernel error rather than
> >>>indicating a programming error in qemu.
> >>>
> >>>While we're at it improve the messages themselves a bit, and clean up the
> >>>indentation a little.
> >>>
> >>>Signed-off-by: David Gibson 
> >>>---
> >>>  hw/ppc/spapr.c | 24 
> >>>  1 file changed, 16 insertions(+), 8 deletions(-)
> >>>
> >>>diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>>index b7fd09a..d28e349 100644
> >>>--- a/hw/ppc/spapr.c
> >>>+++ b/hw/ppc/spapr.c
> >>>@@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
> >>>  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= 
> >>> tswap64(~HPTE64_V_HPTE_DIRTY))
> >>>  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= 
> >>> tswap64(HPTE64_V_HPTE_DIRTY))
> >>>
> >>>-static void spapr_alloc_htab(sPAPRMachineState *spapr)
> >>>+static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
> >>>  {
> >>>  long shift;
> >>>  int index;
> >>>@@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState 
> >>>*spapr)
> >>>   * For HV KVM, host kernel will return -ENOMEM when requested
> >>>   * HTAB size can't be allocated.
> >>>   */
> >>>-error_setg(&error_abort, "Failed to allocate HTAB of requested 
> >>>size, try with smaller maxmem");
> >>>+error_setg_errno(errp, -shift,
> >>>+ "Error allocating KVM hash page table, try 
> >>>smaller maxmem");
> >>>  } else if (shift > 0) {
> >>>  /*
> >>>   * Kernel handles htab, we don't need to allocate one
> >>>@@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
> >>>   * but we don't allow booting of such guests.
> >>>   */
> >>>  if (shift != spapr->htab_shift) {
> >>>-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
> >>>+error_setg(errp,
> >>>+"Small allocation for KVM hash page table (%ld < %"
> >>>+PRIu32 "), try smaller maxmem",
> >>
> >>
> >>
> >>Even though it is not in the CODING_STYLE, I have not seen anyone objecting to
> >>the kernel's very good "never break user-visible strings" rule, or rejecting
> >>patches whose user-visible strings fail to fit the 80-character limit.
> >
> >I'm not.  Or rather, the string is already broken by the PRIu32, so
> >the newline doesn't make it any less greppable.
> 
> 
> A grep for "KVM hash page table.*smaller maxmem" stopped working. Not a big
> deal, but I do not see any win in breaking strings anyway.

The problem is that the current pre-commit hooks don't agree with
you.  They seem to allow long unbroken strings, but if there's a break
like the PRIu32, they won't permit the commit.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
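
The error-propagation approach described in the quoted commit message can be
illustrated with a minimal, self-contained sketch. It does not use QEMU's real
Error API: DemoError, demo_error_set() and demo_alloc_htab() are hypothetical
stand-ins, and the messages are simplified. The only point is that the callee
records the failure through errp and leaves the severity decision to the caller.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Record a formatted message in *errp if the caller asked for one. */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (errp == NULL) {
        return;                 /* caller chose to ignore errors */
    }
    assert(*errp == NULL);      /* an earlier error must not be overwritten */
    *errp = calloc(1, sizeof(**errp));
    if (*errp == NULL) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

/* Callee: reports failure through errp instead of aborting by itself. */
static void demo_alloc_htab(long got_shift, long want_shift, DemoError **errp)
{
    if (got_shift < 0) {
        /* A negative value is treated as a negated errno. */
        demo_error_set(errp, "Error allocating hash page table (errno %ld)",
                       -got_shift);
    } else if (got_shift > 0 && got_shift != want_shift) {
        demo_error_set(errp, "Small allocation for hash page table (%ld < %ld)",
                       got_shift, want_shift);
    }
}
```

A caller that considers the failure fatal would check the returned error,
print its message and exit; a caller that can recover would free it and carry
on.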



Re: [Qemu-devel] [PATCH] net: cadence_gem: check packet size in gem_recieve

2016-01-17 Thread P J P

+-- On Mon, 18 Jan 2016, Jason Wang wrote --+
| > +if (size > sizeof(rxbuf) - sizeof(crc_val)) {
| > +size = sizeof(rxbuf) - sizeof(crc_val);
| > +}
| > +bytes_to_copy = size;
| > +
| 
| We probably need more checks: is there any guarantee that size <= 2048?
| If not, this needs a fix.

  Sorry? The above check would fix that, no?

--
 - P J P
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F
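
The clamp quoted above can be sketched in isolation as follows. This is an
illustration under assumptions: DEMO_RXBUF_SIZE mirrors the 2048-byte figure
mentioned in the question, the CRC is assumed to be a 32-bit value, and
demo_clamp_rx_size() is a hypothetical helper, not code from cadence_gem.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RXBUF_SIZE 2048    /* assumed receive buffer size */

/* Limit a packet size so that packet plus trailing CRC fit in the buffer. */
static size_t demo_clamp_rx_size(size_t size)
{
    /* Room left for payload once the (assumed 32-bit) CRC is reserved. */
    const size_t room = DEMO_RXBUF_SIZE - sizeof(uint32_t);

    assert(room < DEMO_RXBUF_SIZE);
    return size > room ? room : size;
}
```

With these assumptions, any requested size is reduced to at most 2044 bytes,
so a later copy of that many bytes stays inside the buffer.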

[Qemu-devel] [PATCH] linux-user: add option to intercept execve() syscalls

2016-01-17 Thread Petros Angelatos

From: Petros Angelatos 

In order for one to use QEMU user mode emulation under a chroot, it is
required to use binfmt_misc. This can be avoided by QEMU never doing a
raw execve() to the host system.

Introduce a new option, -execve=path, that sets the absolute path to the
QEMU interpreter and enables execve() interception. When a guest process
tries to call execve(), qemu_execve() is called instead.

qemu_execve() will prepend the interpreter set with -execve, similar to
what binfmt_misc would do, and then pass the modified execve() to the
host.

It is necessary to parse hashbang scripts in that function otherwise
the kernel will try to run the interpreter of a script without QEMU and
get an invalid exec format error.
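
A rough standalone sketch of the "#!" parsing step follows. It is modelled on
the description above rather than copied from the patch; demo_parse_shebang()
is a hypothetical name, and the caller is assumed to have already read the
start of the file into a NUL-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a "#!" line held in buf (NUL-terminated, modified in place) into
 * the interpreter name and an optional single argument.
 * Returns 0 on success, -1 if buf does not hold a usable "#!" line.
 */
static int demo_parse_shebang(char *buf, char **i_name, char **i_arg)
{
    char *cp;

    assert(buf != NULL && i_name != NULL && i_arg != NULL);
    *i_name = NULL;
    *i_arg = NULL;
    if (buf[0] != '#' || buf[1] != '!') {
        return -1;
    }
    /* Only the first line matters. */
    cp = strchr(buf, '\n');
    if (cp != NULL) {
        *cp = '\0';
    }
    /* Trim trailing blanks. */
    cp = buf + strlen(buf);
    while (cp > buf + 2 && (cp[-1] == ' ' || cp[-1] == '\t')) {
        *--cp = '\0';
    }
    /* Skip blanks between "#!" and the interpreter name. */
    cp = buf + 2;
    while (*cp == ' ' || *cp == '\t') {
        cp++;
    }
    if (*cp == '\0') {
        return -1;              /* no interpreter name found */
    }
    *i_name = cp;
    /* Terminate the name and locate the optional argument. */
    while (*cp != '\0' && *cp != ' ' && *cp != '\t') {
        cp++;
    }
    while (*cp == ' ' || *cp == '\t') {
        *cp++ = '\0';
    }
    if (*cp != '\0') {
        *i_arg = cp;
    }
    return 0;
}
```

The resulting name and argument would then be placed after the QEMU
interpreter path in the new argument vector, which is why the number of extra
slots depends on whether an argument is present.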

Signed-off-by: Petros Angelatos 
---
 linux-user/main.c|   8 
 linux-user/qemu.h|   1 +
 linux-user/syscall.c | 111 ++-
 3 files changed, 119 insertions(+), 1 deletion(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index ee12035..5951279 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -79,6 +79,7 @@ static void usage(int exitcode);
 
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
 const char *qemu_uname_release;
+const char *qemu_execve_path;
 
 /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so
we allocate a bigger stack. Need a better solution, for example
@@ -3828,6 +3829,11 @@ static void handle_arg_guest_base(const char *arg)
 have_guest_base = 1;
 }
 
+static void handle_arg_execve(const char *arg)
+{
+qemu_execve_path = strdup(arg);
+}
+
 static void handle_arg_reserved_va(const char *arg)
 {
 char *p;
@@ -3913,6 +3919,8 @@ static const struct qemu_argument arg_table[] = {
  "uname",  "set qemu uname release string to 'uname'"},
 {"B",  "QEMU_GUEST_BASE",  true,  handle_arg_guest_base,
  "address","set guest_base address to 'address'"},
+{"execve", "QEMU_EXECVE",  true,   handle_arg_execve,
+ "path",   "use interpreter at 'path' when a process calls execve()"},
 {"R",  "QEMU_RESERVED_VA", true,  handle_arg_reserved_va,
  "size",   "reserve 'size' bytes for guest virtual address space"},
 {"d",  "QEMU_LOG", true,  handle_arg_log,
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index bd90cc3..0d9b058 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -140,6 +140,7 @@ void init_task_state(TaskState *ts);
 void task_settid(TaskState *);
 void stop_all_tasks(void);
 extern const char *qemu_uname_release;
+extern const char *qemu_execve_path;
 extern unsigned long mmap_min_addr;
 
 /* ??? See if we can avoid exposing so much of the loader internals.  */
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 0cbace4..d0b5442 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -5854,6 +5854,109 @@ static target_timer_t get_timer_id(abi_long arg)
 return timerid;
 }
 
+#define BINPRM_BUF_SIZE 128
+
+/* qemu_execve() Must return target values and target errnos. */
+static abi_long qemu_execve(char *filename, char *argv[],
+  char *envp[])
+{
+char *i_arg = NULL, *i_name = NULL;
+char **new_argp;
+int argc, fd, ret, i, offset = 3;
+char *cp;
+char buf[BINPRM_BUF_SIZE];
+
+for (argc = 0; argv[argc] != NULL; argc++) {
+/* nothing */ ;
+}
+
+fd = open(filename, O_RDONLY);
+if (fd == -1) {
+return -ENOENT;
+}
+
+ret = read(fd, buf, BINPRM_BUF_SIZE);
+if (ret == -1) {
+close(fd);
+return -ENOENT;
+}
+
+close(fd);
+
+/* adapted from the kernel
+ * https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/binfmt_script.c
+ */
+if ((buf[0] == '#') && (buf[1] == '!')) {
+/*
+ * This section does the #! interpretation.
+ * Sorta complicated, but hopefully it will work.  -TYT
+ */
+
+buf[BINPRM_BUF_SIZE - 1] = '\0';
+cp = strchr(buf, '\n');
+if (cp == NULL) {
+cp = buf+BINPRM_BUF_SIZE-1;
+}
+*cp = '\0';
+while (cp > buf) {
+cp--;
+if ((*cp == ' ') || (*cp == '\t')) {
+*cp = '\0';
+} else {
+break;
+}
+}
+for (cp = buf+2; (*cp == ' ') || (*cp == '\t'); cp++) {
+/* nothing */ ;
+}
+if (*cp == '\0') {
+return -ENOEXEC; /* No interpreter name found */
+}
+i_name = cp;
+i_arg = NULL;
+for ( ; *cp && (*cp != ' ') && (*cp != '\t'); cp++) {
+/* nothing */ ;
+}
+while ((*cp == ' ') || (*cp == '\t')) {
+*cp++ = '\0';
+}
+if (*cp) {
+i_arg = cp;
+}
+
+if (i_arg) {
+offset = 5;
+} else {
+offset = 4;
+}
+}
+
+

Re: [Qemu-devel] [PATCH] cadence_gem: fix buffer overflow

2016-01-17 Thread Jason Wang



On 01/14/2016 05:43 PM, Michael S. Tsirkin wrote:
> gem_receive copies a packet received from network into an rxbuf[2048]
> array on stack, with size limited by descriptor length set by guest.  If
> guest is malicious and specifies a descriptor length that is too large,
> and should packet size exceed array size, this results in a buffer
> overflow.
>
> Reported-by: 刘令 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/net/cadence_gem.c | 8 
>  1 file changed, 8 insertions(+)

Applied to my -net tree, with a tweak to the commit log (changing receive to
transmit, as noticed).

Thanks

>
> diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
> index 3639fc1..15a0786 100644
> --- a/hw/net/cadence_gem.c
> +++ b/hw/net/cadence_gem.c
> @@ -862,6 +862,14 @@ static void gem_transmit(CadenceGEMState *s)
>  break;
>  }
>  
> +if (tx_desc_get_length(desc) > sizeof(tx_packet) - (p - tx_packet)) {
> +DB_PRINT("TX descriptor @ 0x%x too large: size 0x%x space 0x%x\n",
> + (unsigned)packet_desc_addr,
> + (unsigned)tx_desc_get_length(desc),
> + sizeof(tx_packet) - (p - tx_packet));
> +break;
> +}
> +
>  /* Gather this fragment of the packet from "dma memory" to our contig.
>   * buffer.
>   */
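
The bounds check added by the quoted hunk can be sketched on its own. This is
a simplified illustration, not the driver code: demo_fragment_fits() is a
hypothetical helper, and the byte count already gathered is passed in as
`used` instead of being derived from a pointer difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if a fragment of frag_len bytes still fits in a buffer of
 * buf_size bytes of which `used` bytes are already filled.
 */
static bool demo_fragment_fits(size_t buf_size, size_t used, size_t frag_len)
{
    assert(used <= buf_size);   /* keeps the subtraction from wrapping */
    return frag_len <= buf_size - used;
}
```

Comparing against the remaining space, rather than adding the two lengths
first, avoids an overflow when the guest supplies a very large descriptor
length.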

Re: [Qemu-devel] [RFC PATCH v2 01/10] Init colo-proxy object based on netfilter

2016-01-17 Thread Zhang Chen




On 01/16/2016 02:21 AM, Dr. David Alan Gilbert wrote:

* Zhang Chen (zhangchen.f...@cn.fujitsu.com) wrote:

From: zhangchen 

add colo-proxy to vl.c and qemu-options.hx
add trace-colo-proxy relation

Signed-off-by: zhangchen 
Signed-off-by: zhanghailiang 
---
  qemu-options.hx | 6 ++
  trace-events| 8 
  vl.c| 3 ++-
  3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 0eea4ee..6daa3f0 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3670,6 +3670,12 @@ queue @var{all|rx|tx} is an option that can be applied to any netfilter.
  @option{tx}: the filter is attached to the transmit queue of the netdev,
   where it will receive packets sent by the netdev.
  
+@item -object colo-proxy,id=@var{id},netdev=@var{netdevid},addr=@var{host:port},mode=@var{primary|secondary}[,queue=@var{all}]

+
+Colo-proxy on netdev @var{netdevid},set colo mode @var{primary|secondary}
+connect other colo through addr@var{host:port},and colo needs queue all
+packet arriving in queue=@var{all}
+
  @item -object filter-dump,id=@var{id},netdev=@var{dev},file=@var{filename}][,maxlen=@var{len}]
  
  Dump the network traffic on netdev @var{dev} to the file specified by

diff --git a/trace-events b/trace-events
index 5f95b3c..a957fb3 100644
--- a/trace-events
+++ b/trace-events
@@ -1586,6 +1586,14 @@ colo_failover_set_state(int new_state) "new state %d"
  colo_start_block_replication(void) "Block replication is started"
  colo_stop_block_replication(const char *reason) "Block replication is stopped(reason: '%s')"
  
+# net/colo-proxy.c

+colo_proxy(const char *sta) ": %s"

You use the 'colo_proxy' trace in a lot of different places;  it would
be better to use individual trace entries, so for example you could
just trace miscompares.

Dave


I will fix it in the next version.

Thanks
zhangchen




+colo_proxy_with_ret(const char *sta, ssize_t ret) ": %s ret = %zd"
+colo_proxy_packet_src(const char *src) ":ipsrc = %s"
+colo_proxy_packet_dst(const char *dst) ":ipdst = %s"
+colo_proxy_packet_size(int size) ": %d"
+colo_proxy_queue_size(int size) ": %d"
+
  # kvm-all.c
  kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
  kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
diff --git a/vl.c b/vl.c
index 8dc34ce..dcfb3a9 100644
--- a/vl.c
+++ b/vl.c
@@ -2838,7 +2838,8 @@ static bool object_create_initial(const char *type)
   * they depend on netdevs already existing
   */
  if (g_str_equal(type, "filter-buffer") ||
-g_str_equal(type, "filter-dump")) {
+g_str_equal(type, "filter-dump") ||
+g_str_equal(type, "colo-proxy")) {
  return false;
  }
  
--

1.9.1




--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


.



--
Thanks
zhangchen

[Qemu-devel] [PATCH v1 07/17] target-arm: a64: Add endianness support

2016-01-17 Thread Peter Crosthwaite

From: Peter Crosthwaite 

Set the dc->mo_endianness flag for AA64 and use it in all ldst ops.

Signed-off-by: Peter Crosthwaite 
---

 target-arm/translate-a64.c | 49 --
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index d826b92..59026b6 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -726,7 +726,7 @@ static void do_gpr_st_memidx(DisasContext *s, TCGv_i64 source,
  TCGv_i64 tcg_addr, int size, int memidx)
 {
 g_assert(size <= 3);
-tcg_gen_qemu_st_i64(source, tcg_addr, memidx, MO_TE + size);
+tcg_gen_qemu_st_i64(source, tcg_addr, memidx, s->mo_endianness + size);
 }
 
 static void do_gpr_st(DisasContext *s, TCGv_i64 source,
@@ -741,7 +741,7 @@ static void do_gpr_st(DisasContext *s, TCGv_i64 source,
 static void do_gpr_ld_memidx(DisasContext *s, TCGv_i64 dest, TCGv_i64 tcg_addr,
  int size, bool is_signed, bool extend, int memidx)
 {
-TCGMemOp memop = MO_TE + size;
+TCGMemOp memop = s->mo_endianness + size;
 
 g_assert(size <= 3);
 
@@ -773,13 +773,18 @@ static void do_fp_st(DisasContext *s, int srcidx, TCGv_i64 tcg_addr, int size)
 TCGv_i64 tmp = tcg_temp_new_i64();
 tcg_gen_ld_i64(tmp, cpu_env, fp_reg_offset(s, srcidx, MO_64));
 if (size < 4) {
-tcg_gen_qemu_st_i64(tmp, tcg_addr, get_mem_index(s), MO_TE + size);
+tcg_gen_qemu_st_i64(tmp, tcg_addr, get_mem_index(s),
+s->mo_endianness + size);
 } else {
+bool be = s->mo_endianness == MO_BE;
 TCGv_i64 tcg_hiaddr = tcg_temp_new_i64();
-tcg_gen_qemu_st_i64(tmp, tcg_addr, get_mem_index(s), MO_TEQ);
+
+tcg_gen_addi_i64(tcg_hiaddr, tcg_addr, 8);
+tcg_gen_qemu_st_i64(tmp, be ? tcg_hiaddr : tcg_addr, get_mem_index(s),
+s->mo_endianness | MO_Q);
 tcg_gen_ld_i64(tmp, cpu_env, fp_reg_hi_offset(s, srcidx));
-tcg_gen_addi_i64(tcg_hiaddr, tcg_addr, 8);
-tcg_gen_qemu_st_i64(tmp, tcg_hiaddr, get_mem_index(s), MO_TEQ);
+tcg_gen_qemu_st_i64(tmp, be ? tcg_addr : tcg_hiaddr, get_mem_index(s),
+s->mo_endianness | MO_Q);
 tcg_temp_free_i64(tcg_hiaddr);
 }
 
@@ -796,17 +801,21 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
 TCGv_i64 tmphi;
 
 if (size < 4) {
-TCGMemOp memop = MO_TE + size;
+TCGMemOp memop = s->mo_endianness + size;
 tmphi = tcg_const_i64(0);
 tcg_gen_qemu_ld_i64(tmplo, tcg_addr, get_mem_index(s), memop);
 } else {
+bool be = s->mo_endianness == MO_BE;
 TCGv_i64 tcg_hiaddr;
+
 tmphi = tcg_temp_new_i64();
 tcg_hiaddr = tcg_temp_new_i64();
 
-tcg_gen_qemu_ld_i64(tmplo, tcg_addr, get_mem_index(s), MO_TEQ);
 tcg_gen_addi_i64(tcg_hiaddr, tcg_addr, 8);
-tcg_gen_qemu_ld_i64(tmphi, tcg_hiaddr, get_mem_index(s), MO_TEQ);
+tcg_gen_qemu_ld_i64(tmplo, be ? tcg_hiaddr : tcg_addr, get_mem_index(s),
+s->mo_endianness | MO_Q);
+tcg_gen_qemu_ld_i64(tmphi, be ? tcg_addr : tcg_hiaddr, get_mem_index(s),
+s->mo_endianness | MO_Q);
 tcg_temp_free_i64(tcg_hiaddr);
 }
 
@@ -945,7 +954,7 @@ static void clear_vec_high(DisasContext *s, int rd)
 static void do_vec_st(DisasContext *s, int srcidx, int element,
   TCGv_i64 tcg_addr, int size)
 {
-TCGMemOp memop = MO_TE + size;
+TCGMemOp memop = s->mo_endianness + size;
 TCGv_i64 tcg_tmp = tcg_temp_new_i64();
 
 read_vec_element(s, tcg_tmp, srcidx, element, size);
@@ -958,7 +967,7 @@ static void do_vec_st(DisasContext *s, int srcidx, int element,
 static void do_vec_ld(DisasContext *s, int destidx, int element,
   TCGv_i64 tcg_addr, int size)
 {
-TCGMemOp memop = MO_TE + size;
+TCGMemOp memop = s->mo_endianness + size;
 TCGv_i64 tcg_tmp = tcg_temp_new_i64();
 
 tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
@@ -1703,7 +1712,7 @@ static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
TCGv_i64 addr, int size, bool is_pair)
 {
 TCGv_i64 tmp = tcg_temp_new_i64();
-TCGMemOp memop = MO_TE + size;
+TCGMemOp memop = s->mo_endianness + size;
 
 g_assert(size <= 3);
 tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), memop);
@@ -1765,7 +1774,7 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
 tcg_gen_brcond_i64(TCG_COND_NE, addr, cpu_exclusive_addr, fail_label);
 
 tmp = tcg_temp_new_i64();
-tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), MO_TE + size);
+tcg_gen_qemu_ld_i64(tmp, addr, get_mem_index(s), s->mo_endianness + size);

[Qemu-devel] [PATCH v1 06/17] target-arm: introduce disas flag for endianness

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

Introduce a disas flag for setting the CPU data endianness. This allows
control of the endianness from the CPU state rather than hard-coding it
to TARGET_WORDS_BIGENDIAN.

Signed-off-by: Paolo Bonzini 
[ PC changes:
  * Split off as new patch from original:
"target-arm: introduce tbflag for CPSR.E"
  * Wrote commit message from scratch
]
Signed-off-by: Peter Crosthwaite 
---

 target-arm/translate-a64.c |  1 +
 target-arm/translate.c | 39 ---
 target-arm/translate.h |  1 +
 3 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 14e8131..d826b92 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11033,6 +11033,7 @@ void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
!arm_el_is_aa64(env, 3);
 dc->thumb = 0;
 dc->bswap_code = 0;
+dc->mo_endianness = MO_TE;
 dc->condexec_mask = 0;
 dc->condexec_cond = 0;
 dc->mmu_idx = ARM_TBFLAG_MMUIDX(tb->flags);
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 55ecca5..e1679d3 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -927,26 +927,30 @@ static inline void store_reg_from_load(DisasContext *s, int reg, TCGv_i32 var)
 static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 addr, int index)   \
 {\
-tcg_gen_qemu_ld_i32(val, addr, index, (OPC));\
+TCGMemOp opc = (OPC) | s->mo_endianness; \
+tcg_gen_qemu_ld_i32(val, addr, index, opc);  \
 }
 
 #define DO_GEN_ST(SUFF, OPC) \
 static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 addr, int index)   \
 {\
-tcg_gen_qemu_st_i32(val, addr, index, (OPC));\
+TCGMemOp opc = (OPC) | s->mo_endianness; \
+tcg_gen_qemu_st_i32(val, addr, index, opc);  \
 }
 
 static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
  TCGv_i32 addr, int index)
 {
-tcg_gen_qemu_ld_i64(val, addr, index, MO_TEQ);
+TCGMemOp opc = MO_Q | s->mo_endianness;
+tcg_gen_qemu_ld_i64(val, addr, index, opc);
 }
 
 static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
  TCGv_i32 addr, int index)
 {
-tcg_gen_qemu_st_i64(val, addr, index, MO_TEQ);
+TCGMemOp opc = MO_Q | s->mo_endianness;
+tcg_gen_qemu_st_i64(val, addr, index, opc);
 }
 
 #else
@@ -955,9 +959,10 @@ static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
 static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 addr, int index)   \
 {\
+TCGMemOp opc = (OPC) | s->mo_endianness; \
 TCGv addr64 = tcg_temp_new();\
 tcg_gen_extu_i32_i64(addr64, addr);  \
-tcg_gen_qemu_ld_i32(val, addr64, index, OPC);\
+tcg_gen_qemu_ld_i32(val, addr64, index, opc);\
 tcg_temp_free(addr64);   \
 }
 
@@ -965,27 +970,30 @@ static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
 static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 addr, int index)   \
 {\
+TCGMemOp opc = (OPC) | s->mo_endianness; \
 TCGv addr64 = tcg_temp_new();\
 tcg_gen_extu_i32_i64(addr64, addr);  \
-tcg_gen_qemu_st_i32(val, addr64, index, OPC);\
+tcg_gen_qemu_st_i32(val, addr64, index, opc);\
 tcg_temp_free(addr64);   \
 }
 
 static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
  TCGv_i32 addr, int index)
 {
+TCGMemOp opc = MO_Q | s->mo_endianness;
 TCGv addr64 = tcg_temp_new();
 tcg_gen_extu_i32_i64(addr64, addr);
-tcg_gen_qemu_ld_i64(val, addr64, index, MO_TEQ);
+tcg_gen_qemu_ld_i64(val, addr64, index, opc);
 tcg_temp_free(addr64);
 }
 
 static inline void gen_aa32_st64(DisasContext *s, TCGv_i64

[Qemu-devel] [PATCH v1 10/17] target-arm: implement setend

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

Since this is not a high-performance path, just use a helper to
flip the E bit and force a lookup in the hash table since the
flags have changed.

Signed-off-by: Paolo Bonzini 
Signed-off-by: Peter Crosthwaite 
---

 target-arm/helper.h|  1 +
 target-arm/op_helper.c |  5 +
 target-arm/translate.c | 16 
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/target-arm/helper.h b/target-arm/helper.h
index c2a85c7..2315a9c 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -48,6 +48,7 @@ DEF_HELPER_FLAGS_3(sel_flags, TCG_CALL_NO_RWG_SE,
i32, i32, i32, i32)
 DEF_HELPER_2(exception_internal, void, env, i32)
 DEF_HELPER_4(exception_with_syndrome, void, env, i32, i32, i32)
+DEF_HELPER_1(setend, void, env)
 DEF_HELPER_1(wfi, void, env)
 DEF_HELPER_1(wfe, void, env)
 DEF_HELPER_1(yield, void, env)
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index e42d287..2a4bc67 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -295,6 +295,11 @@ uint32_t HELPER(usat16)(CPUARMState *env, uint32_t x, uint32_t shift)
 return res;
 }
 
+void HELPER(setend)(CPUARMState *env)
+{
+env->uncached_cpsr ^= CPSR_E;
+}
+
 /* Function checks whether WFx (WFI/WFE) instructions are set up to be trapped.
  * The function returns the target EL (1-3) if the instruction is to be trapped;
  * otherwise it returns 0 indicating it is not trapped.
diff --git a/target-arm/translate.c b/target-arm/translate.c
index cb925ef..192a5d6 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -7726,10 +7726,10 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 if ((insn & 0x0dff) == 0x0101) {
 ARCH(6);
 /* setend */
-if (((insn >> 9) & 1) != s->bswap_code) {
-/* Dynamic endianness switching not implemented. */
-qemu_log_mask(LOG_UNIMP, "arm: unimplemented setend\n");
-goto illegal_op;
+if (((insn >> 9) & 1) != !!(s->mo_endianness == MO_BE)) {
+gen_helper_setend(cpu_env);
+gen_set_pc_im(s, s->pc);
+s->is_jmp = DISAS_JUMP;
 }
 return;
 } else if ((insn & 0x0f00) == 0x057ff000) {
@@ -11064,10 +11064,10 @@ static void disas_thumb_insn(CPUARMState *env, DisasContext *s)
 case 2:
 /* setend */
 ARCH(6);
-if (((insn >> 3) & 1) != s->bswap_code) {
-/* Dynamic endianness switching not implemented. */
-qemu_log_mask(LOG_UNIMP, "arm: unimplemented setend\n");
-goto illegal_op;
+if (((insn >> 3) & 1) != !!(s->mo_endianness == MO_BE)) {
+gen_helper_setend(cpu_env);
+gen_set_pc_im(s, s->pc);
+s->is_jmp = DISAS_JUMP;
 }
 break;
 case 3:
-- 
1.9.1

Re: [Qemu-devel] [PATCH 2/2] migration/virtio: Remove simple .get/.put use

2016-01-17 Thread Cornelia Huck

On Fri, 15 Jan 2016 12:01:44 +
"Dr. David Alan Gilbert"  wrote:

> I misunderstood the vmstate macro definition when
> I reworked the virtio .get/.put - but I can't
> get it to break for me, which suggests I'm perhaps
> not managing to get that structure into being
> sent in my tests.

The first of the structures should be sent whenever virtio-1 is enabled
on the host. I think virtio-pci (unlike virtio-ccw) still defaults to
off?

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter

2016-01-17 Thread Zhang Chen




On 01/06/2016 01:16 PM, Jason Wang wrote:


On 01/04/2016 07:17 PM, Zhang Chen wrote:


On 01/04/2016 05:46 PM, Jason Wang wrote:

On 01/04/2016 04:16 PM, Zhang Chen wrote:

On 01/04/2016 01:37 PM, Jason Wang wrote:

On 12/31/2015 04:40 PM, Zhang Chen wrote:

On 12/31/2015 10:36 AM, Jason Wang wrote:

On 12/22/2015 06:42 PM, Zhang Chen wrote:

From: zhangchen 

Hi,all

This patch adds a colo-proxy object. COLO-Proxy is a part of COLO; it is
based on the qemu netfilter and is a plugin for it. Its function is to keep
the Secondary VM connected normally to the Primary VM and to compare the
packets sent by the PVM with those sent by the SVM. If the packets differ, it
notifies COLO to do a checkpoint and sends all the primary packets that have
been queued.
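
The compare-then-checkpoint decision described above might look roughly like
the sketch below. It is an assumption-laden illustration: the function name
demo_needs_checkpoint() is hypothetical, and a plain byte-for-byte comparison
is used, whereas the series itself compares at the IP level (and, per the
discussion further down, plans per-protocol comparison later).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the packet produced by the primary VM differs from the
 * one produced by the secondary VM, i.e. a checkpoint should be requested.
 */
static bool demo_needs_checkpoint(const void *pvm_pkt, size_t pvm_len,
                                  const void *svm_pkt, size_t svm_len)
{
    assert(pvm_pkt != NULL && svm_pkt != NULL);
    if (pvm_len != svm_len) {
        return true;
    }
    return memcmp(pvm_pkt, svm_pkt, pvm_len) != 0;
}
```

When the function returns false, the queued primary packet can be released
without synchronising the two VMs.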

Thanks for the work. I don't object to this method, but I'm still not
convinced that qemu is the best place for this.

As has been raised in past discussion, it's almost impossible to
cooperate with vhost backends. If we want this to be used in a production
environment, we need to think of a solution for vhost. There's no such
worry if we decouple this from qemu.


You can also get the series from:

https://github.com/zhangckid/qemu/tree/colo-v2.2-periodic-mode-with-colo-proxyV2




Usage:

primary:
-netdev tap,id=bn0 -device e1000,netdev=bn0
-object
colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=host:port

secondary:
-netdev tap,id=bn0 -device e1000,netdev=bn0
-object
colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=host:port

I had a quick glance at how secondary mode works. What it does is just
forward packets between a nic and a socket; the qemu socket backend does
exactly the same job. You could even use a socket in the primary node and let
the packet compare module talk to both the primary and the secondary node.

If we use the qemu socket backend, the same netdev will be used by both the
qemu socket and the qemu netfilter. This goes against the qemu net design.
And then, when colo does failover, the secondary does not have a backend to
use. That's the real problem.

Then, maybe it's time to implement changing the netdev of a nic. The
point here is that what secondary mode did is in fact a netdev backend
instead of a filter ...

Currently, you are right. In the colo-proxy V2 code, I just compare IP
packets to decide whether to do a checkpoint.
But in colo-proxy V3 I will compare tcp, icmp and udp packets to decide it,
because that can reduce the frequency of checkpoints and improve
performance. To keep tcp connections working, the colo secondary needs to
record the primary guest's init seq and adjust the secondary guest's ack. If
colo does failover, the secondary also needs to do this for old tcp
connections. The qemu socket can't do this job.

So a question here: is it a must to do things (e.g TCP analysis stuffs)
at secondary? Looks like we could do this at primary node. And I saw
you're doing packet comparing in primary node, any advantages of doing
this in primary instead of secondary?

We think we must do this in the secondary, because if colo does failover,
the secondary must continue doing the TCP analysis for the existing tcp
connections (if not, those tcp connections will be dropped at that time). At
that point the primary is already down or disconnected from the secondary, so
we can't make the primary do this TCP analysis; that cannot ensure the FT
function.

Thanks
zhangchen

Makes sense.

Thanks


Hi~, Jason.
No news for a week.
Can you give me some comments on the code?
Let's make colo-proxy work well.

Thanks
zhangchen


Another problem is failover: if we use the qemu socket as the backend in the
secondary, then when colo does failover I don't know how to turn the
secondary into a normal qemu. If you know, please tell me.

Current qemu can't do this, but I mean we could implement something like
nic_change_backend, which can change a nic's peer(s). With this, in the
secondary, we can replace the socket backend with whatever you want (e.g.
tap or other).

Thanks


Thanks for your review
zhangchen



--
Thanks
zhangchen

[Qemu-devel] [PATCH v1 09/17] target-arm: introduce tbflag for endianness

2016-01-17 Thread Peter Crosthwaite

From: Peter Crosthwaite 

Introduce a tbflag for endianness, set based upon the CPU's current
endianness. This in turn propagates through to the disas endianness
flag.
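
A small standalone sketch of the pack/unpack pattern used by the new flag
follows. The shift value 20 is taken from the diff below; the DEMO_ names and
demo_get_tb_flags() are hypothetical and stand in for the ARM_TBFLAG_ macros
and cpu_get_tb_cpu_state().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TBFLAG_MOE_SHIFT 20
#define DEMO_TBFLAG_MOE_MASK  (1u << DEMO_TBFLAG_MOE_SHIFT)
#define DEMO_TBFLAG_MOE(F) \
    (((F) & DEMO_TBFLAG_MOE_MASK) >> DEMO_TBFLAG_MOE_SHIFT)

/* Fold the CPU's current data endianness into a set of TB flags. */
static uint32_t demo_get_tb_flags(uint32_t other_flags, bool big_endian)
{
    /* The endianness bit must not collide with any other flag. */
    assert((other_flags & DEMO_TBFLAG_MOE_MASK) == 0);
    uint32_t flags = other_flags;

    if (big_endian) {
        flags |= DEMO_TBFLAG_MOE_MASK;
    }
    return flags;
}
```

Because the bit is part of the flags used to look up translated code, blocks
translated for one endianness are not reused after the endianness changes.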

Signed-off-by: Peter Crosthwaite 
---

 target-arm/cpu.h   | 7 +++
 target-arm/translate-a64.c | 2 +-
 target-arm/translate.c | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 54675c7..74048d1 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1857,6 +1857,8 @@ static bool arm_cpu_is_big_endian(CPUARMState *env)
  */
 #define ARM_TBFLAG_NS_SHIFT 19
 #define ARM_TBFLAG_NS_MASK  (1 << ARM_TBFLAG_NS_SHIFT)
+#define ARM_TBFLAG_MOE_SHIFT20
+#define ARM_TBFLAG_MOE_MASK (1 << ARM_TBFLAG_MOE_SHIFT)
 
 /* Bit usage when in AArch64 state: currently we have no A64 specific bits */
 
@@ -1887,6 +1889,8 @@ static bool arm_cpu_is_big_endian(CPUARMState *env)
 (((F) & ARM_TBFLAG_XSCALE_CPAR_MASK) >> ARM_TBFLAG_XSCALE_CPAR_SHIFT)
 #define ARM_TBFLAG_NS(F) \
 (((F) & ARM_TBFLAG_NS_MASK) >> ARM_TBFLAG_NS_SHIFT)
+#define ARM_TBFLAG_MOE(F) \
+(((F) & ARM_TBFLAG_MOE_MASK) >> ARM_TBFLAG_MOE_SHIFT)
 
 /* Return the exception level to which FP-disabled exceptions should
  * be taken, or 0 if FP is enabled.
@@ -2018,6 +2022,9 @@ static inline void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
 }
 }
 }
+if (arm_cpu_is_big_endian(env)) {
+*flags |= ARM_TBFLAG_MOE_MASK;
+}
 *flags |= fp_exception_el(env) << ARM_TBFLAG_FPEXC_EL_SHIFT;
 
 *cs_base = 0;
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 59026b6..db68662 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11044,7 +11044,7 @@ void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
!arm_el_is_aa64(env, 3);
 dc->thumb = 0;
 dc->bswap_code = 0;
-dc->mo_endianness = MO_TE;
+dc->mo_endianness = ARM_TBFLAG_MOE(tb->flags) ? MO_BE : MO_LE;
 dc->condexec_mask = 0;
 dc->condexec_cond = 0;
 dc->mmu_idx = ARM_TBFLAG_MMUIDX(tb->flags);
diff --git a/target-arm/translate.c b/target-arm/translate.c
index e1679d3..cb925ef 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11274,7 +11274,7 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
!arm_el_is_aa64(env, 3);
 dc->thumb = ARM_TBFLAG_THUMB(tb->flags);
 dc->bswap_code = ARM_TBFLAG_BSWAP_CODE(tb->flags);
-dc->mo_endianness = MO_TE;
+dc->mo_endianness = ARM_TBFLAG_MOE(tb->flags) ? MO_BE : MO_LE;
 dc->condexec_mask = (ARM_TBFLAG_CONDEXEC(tb->flags) & 0xf) << 1;
 dc->condexec_cond = ARM_TBFLAG_CONDEXEC(tb->flags) >> 4;
 dc->mmu_idx = ARM_TBFLAG_MMUIDX(tb->flags);
-- 
1.9.1

[Qemu-devel] [PATCH v1 15/17] loader: add API to load elf header

2016-01-17 Thread Peter Crosthwaite

Add an API to load an ELF header from a file. It populates a
buffer with the header contents, as well as a boolean for whether the
ELF is 64-bit or not. Both arguments are optional.
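
The classification step can be sketched without any file I/O. The following
is an illustration, not the patch's code: demo_elf_class() is a hypothetical
helper that inspects an already-read identification block, and the DEMO_
constants restate the ELF specification's values (magic 0x7f 'E' 'L' 'F',
class byte at index 4, class value 2 for 64-bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_EI_CLASS   4       /* index of the class byte in e_ident */
#define DEMO_EI_NIDENT  16      /* size of e_ident */
#define DEMO_ELFCLASS64 2       /* class value for 64-bit objects */

/*
 * Check the ELF magic in ident and report the object class.
 * Returns 0 on success and sets *is64 if is64 is non-NULL; returns -1 if
 * the buffer is too short or the magic is wrong.
 */
static int demo_elf_class(const unsigned char *ident, size_t len, bool *is64)
{
    assert(ident != NULL);
    if (len < (size_t)DEMO_EI_NIDENT) {
        return -1;
    }
    if (ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F') {
        return -1;
    }
    if (is64 != NULL) {
        *is64 = ident[DEMO_EI_CLASS] == DEMO_ELFCLASS64;
    }
    return 0;
}
```

Once the class is known, the caller can pick the 32-bit or 64-bit header size
and read that many bytes from the start of the file.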

Signed-off-by: Peter Crosthwaite 
---

 hw/core/loader.c| 48 
 include/hw/loader.h |  1 +
 2 files changed, 49 insertions(+)

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 6b69852..28da8e2 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -331,6 +331,54 @@ const char *load_elf_strerror(int error)
 }
 }
 
+void load_elf_hdr(const char *filename, void *hdr, bool *is64, Error **errp)
+{
+int fd;
+uint8_t e_ident[EI_NIDENT];
+size_t hdr_size, off = 0;
+bool is64l;
+
+fd = open(filename, O_RDONLY | O_BINARY);
+if (fd < 0) {
+error_setg_errno(errp, errno, "Fail to open file");
+return;
+}
+if (read(fd, e_ident, sizeof(e_ident)) != sizeof(e_ident)) {
+error_setg_errno(errp, errno, "Fail to read file");
+goto fail;
+}
+if (e_ident[0] != ELFMAG0 ||
+e_ident[1] != ELFMAG1 ||
+e_ident[2] != ELFMAG2 ||
+e_ident[3] != ELFMAG3) {
+error_setg(errp, "Bad ELF magic");
+goto fail;
+}
+
+is64l = e_ident[EI_CLASS] == ELFCLASS64;
+hdr_size = is64l ? sizeof(Elf64_Ehdr) : sizeof(Elf32_Ehdr);
+if (is64) {
+*is64 = is64l;
+}
+
+lseek(fd, 0, SEEK_SET);
+while (hdr && off < hdr_size) {
+size_t br = read(fd, hdr + off, hdr_size - off);
+switch (br) {
+case 0:
+error_setg(errp, "File too short");
+goto fail;
+case -1:
+error_setg_errno(errp, errno, "Failed to read file");
+goto fail;
+}
+off += br;
+}
+
+fail:
+close(fd);
+}
+
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
diff --git a/include/hw/loader.h b/include/hw/loader.h
index f7b43ab..33067f8 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -36,6 +36,7 @@ int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
  uint64_t *highaddr, int big_endian, int elf_machine,
  int clear_lsb);
+void load_elf_hdr(const char *filename, void *hdr, bool *is64, Error **errp);
 int load_aout(const char *filename, hwaddr addr, int max_sz,
   int bswap_needed, hwaddr target_page_size);
 int load_uimage(const char *filename, hwaddr *ep,
-- 
1.9.1

[Qemu-devel] [PATCH v1 14/17] target-arm: implement BE32 mode in system emulation

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

System emulation only has a little-endian target; BE32 mode
is implemented by adjusting the low bits of the address
for every byte and halfword load and store.  64-bit accesses
flip the low and high words.
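
The address adjustment for sub-word accesses can be sketched as below. This
is an assumption-based illustration: the XOR values (3 for bytes, 2 for
halfwords, 0 for words) are the conventional word-invariant BE32 mapping and
are not visible in the excerpt of the diff that follows, and
demo_be32_addr() is a hypothetical helper. The 64-bit word swap is not
covered here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a guest address to the address actually accessed in little-endian
 * memory when BE32 (word-invariant big endian) is in effect.
 * size is the access size in bytes: 1, 2 or 4.
 */
static uint32_t demo_be32_addr(uint32_t addr, unsigned size)
{
    assert(size == 1 || size == 2 || size == 4);
    /* 1 -> XOR 3, 2 -> XOR 2, 4 -> XOR 0 (unchanged). */
    return addr ^ (4u - size);
}
```

Under this mapping a byte access to offset 0 of a word lands on offset 3, a
halfword access to offset 0 lands on offset 2, and aligned word accesses are
left alone.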

Signed-off-by: Paolo Bonzini 
[PC changes:
  * rebased against master (Jan 2016)
]
Signed-off-by: Peter Crosthwaite 
---

 target-arm/cpu.h   |  5 ++-
 target-arm/translate.c | 86 +-
 2 files changed, 73 insertions(+), 18 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 96b1e99..5814019 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1925,9 +1925,8 @@ static inline bool bswap_code(bool sctlr_b)
 #endif
 sctlr_b;
 #else
-/* We do not implement BE32 mode for system-mode emulation, but
- * anyway it would always do little-endian accesses with
- * TARGET_WORDS_BIGENDIAN = 0.
+/* All code access in ARM is little endian, and there are no loaders
+ * doing swaps that need to be reversed
  */
 return 0;
 #endif
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 44c3ac9..2d80bb2 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -914,6 +914,12 @@ static inline void store_reg_from_load(DisasContext *s, int reg, TCGv_i32 var)
 }
 }
 
+#ifdef CONFIG_USER_ONLY
+#define IS_USER_ONLY 1
+#else
+#define IS_USER_ONLY 0
+#endif
+
 /* Abstractions of "generate code to do a guest load/store for
  * AArch32", where a vaddr is always 32 bits (and is zero
  * extended if we're a 64 bit core) and  data is also
@@ -923,19 +929,35 @@ static inline void store_reg_from_load(DisasContext *s, int reg, TCGv_i32 var)
  */
 #if TARGET_LONG_BITS == 32
 
-#define DO_GEN_LD(SUFF, OPC) \
+#define DO_GEN_LD(SUFF, OPC, BE32_XOR)   \
 static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 addr, int index)   \
 {\
 TCGMemOp opc = (OPC) | s->mo_endianness; \
+/* Not needed for user-mode BE32, where we use MO_BE instead.  */\
+if (!IS_USER_ONLY && s->sctlr_b && BE32_XOR) {   \
+TCGv addr_be = tcg_temp_new();   \
+tcg_gen_xori_i32(addr_be, addr, BE32_XOR);   \
+tcg_gen_qemu_ld_i32(val, addr_be, index, opc);   \
+tcg_temp_free(addr_be);  \
+return;  \
+}\
 tcg_gen_qemu_ld_i32(val, addr, index, opc);  \
 }
 
-#define DO_GEN_ST(SUFF, OPC) \
+#define DO_GEN_ST(SUFF, OPC, BE32_XOR)   \
 static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
  TCGv_i32 addr, int index)   \
 {\
 TCGMemOp opc = (OPC) | s->mo_endianness; \
+/* Not needed for user-mode BE32, where we use MO_BE instead.  */\
+if (!IS_USER_ONLY && s->sctlr_b && BE32_XOR) {   \
+TCGv addr_be = tcg_temp_new();   \
+tcg_gen_xori_i32(addr_be, addr, BE32_XOR);   \
+tcg_gen_qemu_st_i32(val, addr_be, index, opc);   \
+tcg_temp_free(addr_be);  \
+return;  \
+}\
 tcg_gen_qemu_st_i32(val, addr, index, opc);  \
 }
 
@@ -944,35 +966,55 @@ static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
 {
 TCGMemOp opc = MO_Q | s->mo_endianness;
 tcg_gen_qemu_ld_i64(val, addr, index, opc);
+/* Not needed for user-mode BE32, where we use MO_BE instead.  */
+if (!IS_USER_ONLY && s->sctlr_b) {
+tcg_gen_rotri_i64(val, val, 32);
+}
 }
 
 static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
  TCGv_i32 addr, int index)
 {
 TCGMemOp opc = MO_Q | s->mo_endianness;
+/* Not needed for user-mode BE32, where we use MO_BE instead.  */
+if (!IS_USER_ONLY && s->sctlr_b) {
+TCGv_i64 tmp = tcg_temp_new_i64();
+tcg_gen_rotri_i64(tmp, val, 32);
+tcg_gen_qemu_st_i64(tmp, addr, index, opc);
+tcg_temp_free_i64(tmp);
+return;
+}
 tcg_gen_qemu_st_i64(val, addr, index, opc);
 }
 
 #else
 
-#define

[Qemu-devel] [PATCHv3 9/9] pseries: Clean up error reporting in htab migration functions

2016-01-17 Thread David Gibson

The functions for migrating the hash page table on pseries machine type
(htab_save_setup() and htab_load()) can report some errors with an
explicit fprintf() before returning an appropriate error code.  Change these
to use error_report() instead.
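
A rough sketch of why the converted calls drop both the "qemu:" prefix
and the trailing "\n": a reporting helper of this kind adds them itself.
The function format_report() below is a hypothetical stand-in that
formats into a buffer so the result can be inspected; it is not QEMU's
error_report() implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for an error_report()-style helper: writes
 * "<prog>: <message>\n" into buf and returns the length written, or -1
 * if the buffer is too small. */
static int format_report(char *buf, size_t len, const char *prog,
                         const char *fmt, ...)
{
    va_list ap;
    int n, m;

    assert(buf != NULL && prog != NULL && fmt != NULL);
    n = snprintf(buf, len, "%s: ", prog);
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    va_start(ap, fmt);
    m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);
    /* need room for the message, the newline and the terminator */
    if (m < 0 || (size_t)(n + m) + 1 >= len) {
        return -1;
    }
    buf[n + m] = '\n';
    buf[n + m + 1] = '\0';
    return n + m + 1;
}
```

With such a helper the caller passes only the bare message, for instance
format_report(buf, sizeof(buf), "qemu", "htab_load() bad version").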

Signed-off-by: David Gibson 
Reviewed-by: Thomas Huth 
---
 hw/ppc/spapr.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 58f26cd..ba0bfdf 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1317,8 +1317,9 @@ static int htab_save_setup(QEMUFile *f, void *opaque)
 spapr->htab_fd = kvmppc_get_htab_fd(false);
 spapr->htab_fd_stale = false;
 if (spapr->htab_fd < 0) {
-fprintf(stderr, "Unable to open fd for reading hash table from KVM: %s\n",
-strerror(errno));
+error_report(
+"Unable to open fd for reading hash table from KVM: %s",
+strerror(errno));
 return -1;
 }
 }
@@ -1534,7 +1535,7 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 int fd = -1;
 
 if (version_id < 1 || version_id > 1) {
-fprintf(stderr, "htab_load() bad version\n");
+error_report("htab_load() bad version");
 return -EINVAL;
 }
 
@@ -1555,8 +1556,8 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 
 fd = kvmppc_get_htab_fd(true);
 if (fd < 0) {
-fprintf(stderr, "Unable to open fd to restore KVM hash table: %s\n",
-strerror(errno));
+error_report("Unable to open fd to restore KVM hash table: %s",
+ strerror(errno));
 }
 }
 
@@ -1576,9 +1577,9 @@ static int htab_load(QEMUFile *f, void *opaque, int version_id)
 if ((index + n_valid + n_invalid) >
 (HTAB_SIZE(spapr) / HASH_PTE_SIZE_64)) {
 /* Bad index in stream */
-fprintf(stderr, "htab_load() bad index %d (%hd+%hd entries) "
-"in htab stream (htab_shift=%d)\n", index, n_valid, n_invalid,
-spapr->htab_shift);
+error_report(
+"htab_load() bad index %d (%hd+%hd entries) in htab stream (htab_shift=%d)",
+index, n_valid, n_invalid, spapr->htab_shift);
 return -EINVAL;
 }
 
-- 
2.5.0

[Qemu-devel] [PATCHv3 8/9] pseries: Clean up error reporting in ppc_spapr_init()

2016-01-17 Thread David Gibson

This function includes a number of explicit fprintf()s for errors.
Change these to use error_report() instead.

Also replace the single exit(EXIT_FAILURE) with an explicit exit(1), since
the latter is the more usual idiom in qemu by a large margin.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 148ca5a..58f26cd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1789,8 +1789,8 @@ static void ppc_spapr_init(MachineState *machine)
 }
 
 if (spapr->rma_size > node0_size) {
-fprintf(stderr, "Error: Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")\n",
-spapr->rma_size);
+error_report("Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")",
+ spapr->rma_size);
 exit(1);
 }
 
@@ -1856,10 +1856,10 @@ static void ppc_spapr_init(MachineState *machine)
 ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
 
 if (machine->ram_slots > SPAPR_MAX_RAM_SLOTS) {
-error_report("Specified number of memory slots %" PRIu64
- " exceeds max supported %d",
- machine->ram_slots, SPAPR_MAX_RAM_SLOTS);
-exit(EXIT_FAILURE);
+error_report("Specified number of memory slots %"
+ PRIu64" exceeds max supported %d",
+machine->ram_slots, SPAPR_MAX_RAM_SLOTS);
+exit(1);
 }
 
 spapr->hotplug_memory.base = ROUND_UP(machine->ram_size,
@@ -1955,8 +1955,9 @@ static void ppc_spapr_init(MachineState *machine)
 }
 
 if (spapr->rma_size < (MIN_RMA_SLOF << 20)) {
-fprintf(stderr, "qemu: pSeries SLOF firmware requires >= "
-"%ldM guest RMA (Real Mode Area memory)\n", MIN_RMA_SLOF);
+error_report(
+"pSeries SLOF firmware requires >= %ldM guest RMA (Real Mode Area memory)",
+MIN_RMA_SLOF);
 exit(1);
 }
 
@@ -1972,8 +1973,8 @@ static void ppc_spapr_init(MachineState *machine)
 kernel_le = kernel_size > 0;
 }
 if (kernel_size < 0) {
-fprintf(stderr, "qemu: error loading %s: %s\n",
-kernel_filename, load_elf_strerror(kernel_size));
+error_report("error loading %s: %s",
+ kernel_filename, load_elf_strerror(kernel_size));
 exit(1);
 }
 
@@ -1986,8 +1987,8 @@ static void ppc_spapr_init(MachineState *machine)
 initrd_size = load_image_targphys(initrd_filename, initrd_base,
   load_limit - initrd_base);
 if (initrd_size < 0) {
-fprintf(stderr, "qemu: could not load initial ram disk '%s'\n",
-initrd_filename);
+error_report("could not load initial ram disk '%s'",
+ initrd_filename);
 exit(1);
 }
 } else {
-- 
2.5.0

Re: [Qemu-devel] [PATCH 4/4] target-ppc: ensure we include the decrementer value during migration

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 05:46:10PM +0000, Mark Cave-Ayland wrote:
> On 12/01/16 02:44, David Gibson wrote:
> 
> >>> In other words, isn't this just skipping the decrementer interrupts at
> >>> the qemu level rather than the guest level?
> >>>
> >>> It seems that instead we should be reconstructing the decrementer on
> >>> the destination based on an offset from the timebase.
> >>
> >> Well I haven't really looked at how time warping works during
> >> migration for QEMU, however this seems to be the method used by
> >> hw/ppc/ppc.c's timebase_post_load() function but my understanding is
> >> that this isn't currently available for the g3beige/mac99 machines?
> > 
> > Ah.. yes, it looks like the timebase migration stuff is only hooked in
> > on the pseries machine type.  As far as I can tell it should be
> > trivial to add it to other machines though - it doesn't appear to rely
> > on anything outside the common ppc timebase stuff.
> > 
> >> Should the patch in fact do this but also add decrementer support? And
> >> if it did, would this have a negative effect on pseries?
> > 
> > Yes, I think that's the right approach.  Note that rather than
> > duplicating the logic to adjust the decrementer over migration, it
> > should be possible to encode the decrementer as a diff from the
> > timebase across the migration.
> > 
> > In fact.. I'm not sure it ever makes sense to store the decrementer
> > value as a direct value, since it's constantly changing - probably
> > makes more sense to derive it from the timebase whenever it is needed.
> > 
> > As far as I know that should be fine for pseries.  I think the current
> > behaviour is probably technically wrong for pseries as well, but the
> > timing code of our Linux guests is robust enough to handle a small
> > displacement to the time of the next decrementer interrupt.
> 
> I've had a bit of an experiment trying to implement something suitable,
> but I'm not 100% certain I've got this right.
> 
> From the code my understanding is that the timebase is effectively
> free-running and so if a migration takes 5s then you use tb_offset to
> calculate the difference between the timebase before migration, and
> subsequently apply the offset for all future reads of the timebase for
> the lifetime of the CPU (i.e. the migrated guest is effectively living
> at a point in the past where the timebase is consistent).

Um.. no.  At least in the usual configuration, the timebase represents
real, wall-clock time, so we expect it to jump forward across the
migration downtime.  This is important because the guest will use the
timebase to calculate real time differences.

However, the absolute value of the timebase may be different on the
*host* between source and destination for migration.  So what we need
to do is before migration we work out the delta between host and guest
notions of wall clock time (as defined by the guest timebase), and
transfer that in the migration stream.

On the destination we initialize the guest timebase so that the guest
maintains the same realtime offset from the host.  This means that as
long as source and destination system time is synchronized, guest
real-time tracking will continue correctly across the migration.

We do also make sure that the guest timebase never goes backwards, but
that would only happen if the source and destination host times were
badly out of sync.
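
A minimal sketch of the scheme described above, assuming for brevity
that host time has already been converted to timebase ticks and that
both hosts tick at the same frequency.  The function names are
hypothetical and are not the ones used in hw/ppc/ppc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Source side: the offset between the guest timebase and the host
 * clock, both in timebase ticks.  This value goes into the migration
 * stream. */
static inline int64_t tb_offset_on_source(uint64_t guest_tb,
                                          uint64_t host_ticks)
{
    return (int64_t)(guest_tb - host_ticks);
}

/* Destination side: rebuild the guest timebase from the destination
 * host clock plus the transferred offset, and never let it run
 * backwards relative to the last value the guest saw. */
static inline uint64_t tb_on_destination(int64_t offset,
                                         uint64_t host_ticks,
                                         uint64_t last_guest_tb)
{
    uint64_t tb = host_ticks + (uint64_t)offset;

    return tb < last_guest_tb ? last_guest_tb : tb;
}
```

With synchronized hosts, a guest timebase of 5000 at host tick 1000
gives an offset of 4000; if the destination resumes at host tick 1500,
the guest sees 5500, so the downtime shows up as elapsed time.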

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 03/10] pseries: Clean up hash page table allocation error handling

2016-01-17 Thread David Gibson

On Mon, Jan 18, 2016 at 01:44:00PM +1100, Alexey Kardashevskiy wrote:
> On 01/15/2016 11:00 PM, David Gibson wrote:
> >The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
> >all errors with error_setg(&error_abort, ...).
> >
> >But really, the callers are really better placed to decide on the error
> >handling.  So, instead make the functions use the error propagation
> >infrastructure.
> >
> >In the callers we change to &error_fatal instead of &error_abort, since
> >this can be triggered by a bad configuration or kernel error rather than
> >indicating a programming error in qemu.
> >
> >While we're at it improve the messages themselves a bit, and clean up the
> >indentation a little.
> >
> >Signed-off-by: David Gibson 
> >---
> >  hw/ppc/spapr.c | 24 ++++++++++++++++--------
> >  1 file changed, 16 insertions(+), 8 deletions(-)
> >
> >diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >index b7fd09a..d28e349 100644
> >--- a/hw/ppc/spapr.c
> >+++ b/hw/ppc/spapr.c
> >@@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
> >  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
> >  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))
> >
> >-static void spapr_alloc_htab(sPAPRMachineState *spapr)
> >+static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
> >  {
> >  long shift;
> >  int index;
> >@@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
> >   * For HV KVM, host kernel will return -ENOMEM when requested
> >   * HTAB size can't be allocated.
> >   */
> >-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
> >+error_setg_errno(errp, -shift,
> >+ "Error allocating KVM hash page table, try smaller maxmem");
> >  } else if (shift > 0) {
> >  /*
> >   * Kernel handles htab, we don't need to allocate one
> >@@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
> >   * but we don't allow booting of such guests.
> >   */
> >  if (shift != spapr->htab_shift) {
> >-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
> >+error_setg(errp,
> >+"Small allocation for KVM hash page table (%ld < %"
> >+PRIu32 "), try smaller maxmem",
> 
> 
> 
> Even though it is not in the CODING_STYLE, I have not seen anyone objecting to
> the very good kernel's "never break user-visible strings" rule or rejecting
> patches with user-visible strings failing to fit 80 chars limit.

I'm not.  Or rather, the string is already broken by the PRIu32, so
the newline doesn't make it any less greppable.
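
To illustrate the point about PRIu32: adjacent string literals,
including the one the macro expands to, are concatenated by the compiler
into a single format string, so the source text is already split at the
macro whether or not a newline follows it.  The helper names below are
hypothetical; only the message text comes from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Formats the message from the patch.  The three literals around
 * PRIu32 become one format string at compile time. */
static int format_htab_msg(char *buf, size_t len, long shift,
                           uint32_t wanted)
{
    assert(buf != NULL);
    return snprintf(buf, len,
                    "Small allocation for KVM hash page table (%ld < %"
                    PRIu32 "), try smaller maxmem",
                    shift, wanted);
}

/* Returns nonzero if needle occurs in the formatted message, i.e. what
 * a search over the program's output (not its source) would find. */
static int message_contains(const char *msg, const char *needle)
{
    return strstr(msg, needle) != NULL;
}
```

The runtime message is contiguous either way; what differs is only
whether a pattern spanning the macro can match a single source line.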

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 03/10] pseries: Clean up hash page table allocation error handling

2016-01-17 Thread Alexey Kardashevskiy


On 01/18/2016 03:42 PM, David Gibson wrote:

On Mon, Jan 18, 2016 at 01:44:00PM +1100, Alexey Kardashevskiy wrote:

On 01/15/2016 11:00 PM, David Gibson wrote:

The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
all errors with error_setg(&error_abort, ...).

But really, the callers are really better placed to decide on the error
handling.  So, instead make the functions use the error propagation
infrastructure.

In the callers we change to &error_fatal instead of &error_abort, since
this can be triggered by a bad configuration or kernel error rather than
indicating a programming error in qemu.

While we're at it improve the messages themselves a bit, and clean up the
indentation a little.
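
A much-reduced sketch of the error propagation idea described above:
the callee reports failure through an Error ** argument and the caller
chooses the policy.  The Error type and the functions here are
hypothetical stand-ins, not QEMU's error API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, minimal stand-in for an error object. */
typedef struct Error {
    char msg[128];
} Error;

/* Store an error for the caller, if the caller asked for one. */
static void set_error(Error **errp, const char *msg)
{
    Error *e;

    if (!errp) {
        return;                 /* caller does not care about details */
    }
    e = malloc(sizeof(*e));
    if (!e) {
        abort();
    }
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    *errp = e;
}

/* Callee: reports failure instead of deciding how to handle it. */
static void alloc_table(long shift, Error **errp)
{
    if (shift < 0) {
        set_error(errp, "Error allocating hash page table");
    }
}

/* Caller: decides the policy.  Here a failure is logged and turned
 * into a status code; another caller could exit or abort instead. */
static int setup_machine(long shift)
{
    Error *err = NULL;

    alloc_table(shift, &err);
    if (err) {
        fprintf(stderr, "%s\n", err->msg);
        free(err);
        return -1;
    }
    return 0;
}
```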

Signed-off-by: David Gibson 
---
  hw/ppc/spapr.c | 24 ++++++++++++++++--------
  1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b7fd09a..d28e349 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))

-static void spapr_alloc_htab(sPAPRMachineState *spapr)
+static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
  {
  long shift;
  int index;
@@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
   * For HV KVM, host kernel will return -ENOMEM when requested
   * HTAB size can't be allocated.
   */
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_setg_errno(errp, -shift,
+ "Error allocating KVM hash page table, try smaller maxmem");
  } else if (shift > 0) {
  /*
   * Kernel handles htab, we don't need to allocate one
@@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
   * but we don't allow booting of such guests.
   */
  if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_setg(errp,
+"Small allocation for KVM hash page table (%ld < %"
+PRIu32 "), try smaller maxmem",




Even though it is not in the CODING_STYLE, I have not seen anyone objecting to
the very good kernel's "never break user-visible strings" rule or rejecting
patches with user-visible strings failing to fit 80 chars limit.


I'm not.  Or rather, the string is already broken by the PRIu32, so
the newline doesn't make it any less greppable.



"KVM hash page table.*smaller maxmem" stopped working. Not a big deal but I 
do not see any win in breaking strings anyway.


btw the chunk above (and other patches in the patchset) uses incorrect indent.



--
Alexey

[Qemu-devel] [PATCH v1 05/17] target-arm: pass DisasContext to gen_aa32_ld/st

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

We'll need the DisasContext in the next patch to retrieve the
desired endianness, so pass it as a whole to gen_aa32_ld*/st*.

Unfortunately we cannot let those functions call get_mem_index,
because of user-mode load/store instructions.

Signed-off-by: Paolo Bonzini 
[ PC changes:
 * Fix long lines
]
Signed-off-by: Peter Crosthwaite 
---

 target-arm/translate.c | 270 ++---
 1 file changed, 142 insertions(+), 128 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index d485e7d..55ecca5 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -924,23 +924,27 @@ static inline void store_reg_from_load(DisasContext *s, int reg, TCGv_i32 var)
 #if TARGET_LONG_BITS == 32
 
 #define DO_GEN_LD(SUFF, OPC) \
-static inline void gen_aa32_ld##SUFF(TCGv_i32 val, TCGv_i32 addr, int index) \
+static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
+ TCGv_i32 addr, int index)   \
 {\
 tcg_gen_qemu_ld_i32(val, addr, index, (OPC));\
 }
 
 #define DO_GEN_ST(SUFF, OPC) \
-static inline void gen_aa32_st##SUFF(TCGv_i32 val, TCGv_i32 addr, int index) \
+static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
+ TCGv_i32 addr, int index)   \
 {\
 tcg_gen_qemu_st_i32(val, addr, index, (OPC));\
 }
 
-static inline void gen_aa32_ld64(TCGv_i64 val, TCGv_i32 addr, int index)
+static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
+ TCGv_i32 addr, int index)
 {
 tcg_gen_qemu_ld_i64(val, addr, index, MO_TEQ);
 }
 
-static inline void gen_aa32_st64(TCGv_i64 val, TCGv_i32 addr, int index)
+static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
+ TCGv_i32 addr, int index)
 {
 tcg_gen_qemu_st_i64(val, addr, index, MO_TEQ);
 }
@@ -948,7 +952,8 @@ static inline void gen_aa32_st64(TCGv_i64 val, TCGv_i32 addr, int index)
 #else
 
 #define DO_GEN_LD(SUFF, OPC) \
-static inline void gen_aa32_ld##SUFF(TCGv_i32 val, TCGv_i32 addr, int index) \
+static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
+ TCGv_i32 addr, int index)   \
 {\
 TCGv addr64 = tcg_temp_new();\
 tcg_gen_extu_i32_i64(addr64, addr);  \
@@ -957,7 +962,8 @@ static inline void gen_aa32_ld##SUFF(TCGv_i32 val, TCGv_i32 addr, int index) \
 }
 
 #define DO_GEN_ST(SUFF, OPC) \
-static inline void gen_aa32_st##SUFF(TCGv_i32 val, TCGv_i32 addr, int index) \
+static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
+ TCGv_i32 addr, int index)   \
 {\
 TCGv addr64 = tcg_temp_new();\
 tcg_gen_extu_i32_i64(addr64, addr);  \
@@ -965,7 +971,8 @@ static inline void gen_aa32_st##SUFF(TCGv_i32 val, TCGv_i32 addr, int index) \
 tcg_temp_free(addr64);   \
 }
 
-static inline void gen_aa32_ld64(TCGv_i64 val, TCGv_i32 addr, int index)
+static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
+ TCGv_i32 addr, int index)
 {
 TCGv addr64 = tcg_temp_new();
 tcg_gen_extu_i32_i64(addr64, addr);
@@ -973,7 +980,8 @@ static inline void gen_aa32_ld64(TCGv_i64 val, TCGv_i32 addr, int index)
 tcg_temp_free(addr64);
 }
 
-static inline void gen_aa32_st64(TCGv_i64 val, TCGv_i32 addr, int index)
+static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
+ TCGv_i32 addr, int index)
 {
 TCGv addr64 = tcg_temp_new();
 tcg_gen_extu_i32_i64(addr64, addr);
@@ -1288,18 +1296,18 @@ VFP_GEN_FIX(ulto, )
 static inline void gen_vfp_ld(DisasContext *s, int dp, TCGv_i32 addr)
 {
 if (dp) {
-gen_aa32_ld64(cpu_F0d, addr, get_mem_index(s));
+gen_aa32_ld64(s, cpu_F0d, addr, get_mem_index(s));
 } else {
-gen_aa32_ld32u(cpu_F0s, addr, get_mem_index(s));
+gen_aa32_ld32u(s, cpu_F0s, addr, get_mem_index(s));
 }
 }
 
 static inline void gen_vfp_st(DisasContext *s, int dp, TCGv_i32 addr)
 {
 if (dp) {
-gen_aa32_st64(cpu_F0d, addr, get_mem_index(s));
+

[Qemu-devel] [PATCH v1 11/17] linux-user: arm: pass env to get_user_code_*

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

This matches the idiom used by get_user_data_* later in the series,
and will help when bswap_code will be replaced by SCTLR.B.

Reviewed-by: Peter Maydell 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Peter Crosthwaite 
---

 linux-user/main.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 8348ddc..2157774 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -439,17 +439,17 @@ void cpu_loop(CPUX86State *env)
 
 #ifdef TARGET_ARM
 
-#define get_user_code_u32(x, gaddr, doswap) \
+#define get_user_code_u32(x, gaddr, env)\
 ({ abi_long __r = get_user_u32((x), (gaddr));   \
-if (!__r && (doswap)) { \
+if (!__r && (env)->bswap_code) {\
 (x) = bswap32(x);   \
 }   \
 __r;\
 })
 
-#define get_user_code_u16(x, gaddr, doswap) \
+#define get_user_code_u16(x, gaddr, env)\
 ({ abi_long __r = get_user_u16((x), (gaddr));   \
-if (!__r && (doswap)) { \
+if (!__r && (env)->bswap_code) {\
 (x) = bswap16(x);   \
 }   \
 __r;\
@@ -732,7 +732,7 @@ void cpu_loop(CPUARMState *env)
 /* we handle the FPU emulation here, as Linux */
 /* we get the opcode */
 /* FIXME - what to do if get_user() fails? */
-get_user_code_u32(opcode, env->regs[15], env->bswap_code);
+get_user_code_u32(opcode, env->regs[15], env);
 
 rc = EmulateAll(opcode, &ts->fpa, env);
 if (rc == 0) { /* illegal instruction */
@@ -802,25 +802,23 @@ void cpu_loop(CPUARMState *env)
 if (trapnr == EXCP_BKPT) {
 if (env->thumb) {
 /* FIXME - what to do if get_user() fails? */
-get_user_code_u16(insn, env->regs[15], env->bswap_code);
+get_user_code_u16(insn, env->regs[15], env);
 n = insn & 0xff;
 env->regs[15] += 2;
 } else {
 /* FIXME - what to do if get_user() fails? */
-get_user_code_u32(insn, env->regs[15], env->bswap_code);
+get_user_code_u32(insn, env->regs[15], env);
 n = (insn & 0xf) | ((insn >> 4) & 0xff0);
 env->regs[15] += 4;
 }
 } else {
 if (env->thumb) {
 /* FIXME - what to do if get_user() fails? */
-get_user_code_u16(insn, env->regs[15] - 2,
-  env->bswap_code);
+get_user_code_u16(insn, env->regs[15] - 2, env);
 n = insn & 0xff;
 } else {
 /* FIXME - what to do if get_user() fails? */
-get_user_code_u32(insn, env->regs[15] - 4,
-  env->bswap_code);
+get_user_code_u32(insn, env->regs[15] - 4, env);
 n = insn & 0xff;
 }
 }
-- 
1.9.1

[Qemu-devel] [PATCH v1 17/17] arm: boot: Support big-endian elfs

2016-01-17 Thread Peter Crosthwaite

Support ARM big-endian ELF files in system-mode emulation. When loading
an elf, determine the endianness mode expected by the elf, and set the
relevant CPU state accordingly.

With this, big-endian modes are now fully supported via system-mode LE,
so there is no need to restrict the elf loading to the TARGET
endianness, and the ifdeffery on TARGET_WORDS_BIGENDIAN goes away.

Signed-off-by: Peter Crosthwaite 
---

 hw/arm/boot.c| 96 ++--
 include/hw/arm/arm.h |  9 +
 2 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 0de4269..053c9e8 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -465,9 +465,34 @@ static void do_cpu_reset(void *opaque)
 cpu_reset(cs);
 if (info) {
 if (!info->is_linux) {
+int i;
 /* Jump to the entry point.  */
 uint64_t entry = info->entry;
 
+switch (info->endianness) {
+case ARM_ENDIANNESS_LE:
+env->cp15.sctlr_el[1] &= ~SCTLR_E0E;
+for (i = 1; i < 4; ++i) {
+env->cp15.sctlr_el[i] &= ~SCTLR_EE;
+}
+env->uncached_cpsr &= ~CPSR_E;
+break;
+case ARM_ENDIANNESS_BE8:
+env->cp15.sctlr_el[1] |= SCTLR_E0E;
+for (i = 1; i < 4; ++i) {
+env->cp15.sctlr_el[i] |= SCTLR_EE;
+}
+env->uncached_cpsr |= CPSR_E;
+break;
+case ARM_ENDIANNESS_BE32:
+env->cp15.sctlr_el[1] |= SCTLR_B;
+break;
+case ARM_ENDIANNESS_UNKNOWN:
+break; /* Board's decision */
+default:
+g_assert_not_reached();
+}
+
 if (!env->aarch64) {
 env->thumb = info->entry & 1;
 entry &= 0xfffffffe;
@@ -589,16 +614,23 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
 int kernel_size;
 int initrd_size;
 int is_linux = 0;
+
 uint64_t elf_entry, elf_low_addr, elf_high_addr;
 int elf_machine;
+bool elf_is64;
+union {
+Elf32_Ehdr h32;
+Elf64_Ehdr h64;
+} elf_header;
+
 hwaddr entry, kernel_load_offset;
-int big_endian;
 static const ARMInsnFixup *primary_loader;
 ArmLoadKernelNotifier *n = DO_UPCAST(ArmLoadKernelNotifier,
  notifier, notifier);
 ARMCPU *cpu = n->cpu;
 struct arm_boot_info *info =
 container_of(n, struct arm_boot_info, load_kernel_notifier);
+Error *err = NULL;
 
 /* The board code is not supposed to set secure_board_setup unless
  * running its code in secure mode is actually possible, and KVM
@@ -678,12 +710,6 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
 if (info->nb_cpus == 0)
 info->nb_cpus = 1;
 
-#ifdef TARGET_WORDS_BIGENDIAN
-big_endian = 1;
-#else
-big_endian = 0;
-#endif
-
 /* We want to put the initrd far enough into RAM that when the
  * kernel is uncompressed it will not clobber the initrd. However
  * on boards without much RAM we must ensure that we still leave
@@ -698,16 +724,52 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
 MIN(info->ram_size / 2, 128 * 1024 * 1024);
 
 /* Assume that raw images are linux kernels, and ELF images are not.  */
-kernel_size = load_elf(info->kernel_filename, NULL, NULL, &elf_entry,
-   &elf_low_addr, &elf_high_addr, big_endian,
-   elf_machine, 1, 0);
-if (kernel_size > 0 && have_dtb(info)) {
-/* If there is still some room left at the base of RAM, try and put
- * the DTB there like we do for images loaded with -bios or -pflash.
- */
-if (elf_low_addr > info->loader_start
-|| elf_high_addr < info->loader_start) {
-/* Pass elf_low_addr as address limit to load_dtb if it may be
+
+load_elf_hdr(info->kernel_filename, &elf_header, &elf_is64, &err);
+
+if (!err) {
+int data_swab = 0;
+bool big_endian;
+
+if (elf_is64) {
+big_endian = elf_header.h64.e_ident[EI_DATA] == ELFDATA2MSB;
+info->endianness = big_endian ? ARM_ENDIANNESS_BE8
+  : ARM_ENDIANNESS_LE;
+} else {
+big_endian = elf_header.h32.e_ident[EI_DATA] == ELFDATA2MSB;
+if (big_endian) {
+if (bswap32(elf_header.h32.e_flags) & EF_ARM_BE8) {
+info->endianness = ARM_ENDIANNESS_BE8;
+} else {
+info->endianness = ARM_ENDIANNESS_BE32;
+/* In BE32, the CPU has a different view of the per-byte
+ * address map than the rest of the system. BE32 elfs are
+ * organised such that they can

[Qemu-devel] [PATCH v1 03/17] linux-user: arm: handle CPSR.E correctly in strex emulation

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

Now that CPSR.E is set correctly, prepare for when setend will be able
to change it; bswap data in and out of strex manually by comparing
SCTLR.B, CPSR.E and TARGET_WORDS_BIGENDIAN (we do not have the luxury
of using TCGMemOps).
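
A small sketch of the swap decision, assuming only the two inputs that
remain once BE32 support is removed: the build-time target endianness
and CPSR.E.  The function names are hypothetical; the patch's own helper
is arm_cpu_bswap_data().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Data must be swapped when the endianness get_user/put_user assume
 * (the build-time target endianness) differs from CPSR.E. */
static inline bool data_needs_swap(bool target_big_endian, bool cpsr_e)
{
    return target_big_endian != cpsr_e;
}

/* Combine the two 32-bit words of a doubleword exclusive access.
 * first was loaded from addr, second from addr + 4.  When the data is
 * swapped, the words also trade places. */
static inline uint64_t combine_halves(uint32_t first, uint32_t second,
                                      bool swapped)
{
    return swapped ? ((uint64_t)first << 32) | second
                   : ((uint64_t)second << 32) | first;
}
```

The first function reduces the four-row table in the cpu.h comment to an
exclusive-or of its two inputs.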

Reviewed-by: Peter Maydell 
Signed-off-by: Paolo Bonzini 
[ PC changes:
  * Remove BE32 support
]
Signed-off-by: Peter Crosthwaite 
---

 linux-user/main.c | 50 +++++++++++++++++++++++++++++++++++++++++++-------
 target-arm/cpu.h  | 21 +++++++++++++++++++++
 2 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 4f8ea9c..8348ddc 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -455,6 +455,38 @@ void cpu_loop(CPUX86State *env)
 __r;\
 })
 
+#define get_user_data_u32(x, gaddr, env)\
+({ abi_long __r = get_user_u32((x), (gaddr));   \
+if (!__r && arm_cpu_bswap_data(env)) {  \
+(x) = bswap32(x);   \
+}   \
+__r;\
+})
+
+#define get_user_data_u16(x, gaddr, env)\
+({ abi_long __r = get_user_u16((x), (gaddr));   \
+if (!__r && arm_cpu_bswap_data(env)) {  \
+(x) = bswap16(x);   \
+}   \
+__r;\
+})
+
+#define put_user_data_u32(x, gaddr, env)\
+({ typeof(x) __x = (x); \
+if (arm_cpu_bswap_data(env)) {  \
+__x = bswap32(__x); \
+}   \
+put_user_u32(__x, (gaddr)); \
+})
+
+#define put_user_data_u16(x, gaddr, env)\
+({ typeof(x) __x = (x); \
+if (arm_cpu_bswap_data(env)) {  \
+__x = bswap16(__x); \
+}   \
+put_user_u16(__x, (gaddr)); \
+})
+
 #ifdef TARGET_ABI32
 /* Commpage handling -- there is no commpage for AArch64 */
 
@@ -614,11 +646,11 @@ static int do_strex(CPUARMState *env)
 segv = get_user_u8(val, addr);
 break;
 case 1:
-segv = get_user_u16(val, addr);
+segv = get_user_data_u16(val, addr, env);
 break;
 case 2:
 case 3:
-segv = get_user_u32(val, addr);
+segv = get_user_data_u32(val, addr, env);
 break;
 default:
 abort();
@@ -629,12 +661,16 @@ static int do_strex(CPUARMState *env)
 }
 if (size == 3) {
 uint32_t valhi;
-segv = get_user_u32(valhi, addr + 4);
+segv = get_user_data_u32(valhi, addr + 4, env);
 if (segv) {
 env->exception.vaddress = addr + 4;
 goto done;
 }
-val = deposit64(val, 32, 32, valhi);
+if (arm_cpu_bswap_data(env)) {
+val = deposit64((uint64_t)valhi, 32, 32, val);
+} else {
+val = deposit64(val, 32, 32, valhi);
+}
 }
 if (val != env->exclusive_val) {
 goto fail;
@@ -646,11 +682,11 @@ static int do_strex(CPUARMState *env)
 segv = put_user_u8(val, addr);
 break;
 case 1:
-segv = put_user_u16(val, addr);
+segv = put_user_data_u16(val, addr, env);
 break;
 case 2:
 case 3:
-segv = put_user_u32(val, addr);
+segv = put_user_data_u32(val, addr, env);
 break;
 }
 if (segv) {
@@ -659,7 +695,7 @@ static int do_strex(CPUARMState *env)
 }
 if (size == 3) {
 val = env->regs[(env->exclusive_info >> 12) & 0xf];
-segv = put_user_u32(val, addr + 4);
+segv = put_user_data_u32(val, addr + 4, env);
 if (segv) {
 env->exception.vaddress = addr + 4;
 goto done;
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 815fef8..f83070a 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1934,6 +1934,27 @@ static inline int fp_exception_el(CPUARMState *env)
 return 0;
 }
 
+#ifdef CONFIG_USER_ONLY
+/* get_user and put_user respectively return and expect data according
+ * to TARGET_WORDS_BIGENDIAN, but ldrex/strex emulation needs to take
+ * into account CPSR.E.
+ *
+ *TARGET_WORDS_BIGENDIAN  CPSR.Eneed swap?
+ *   LE/LE no   0  no
+ *   LE/BE no   1  yes
+ *   BE8/LEyes  0  yes
+ *   BE8/BEyes  1  no
+ */
+static inline bool arm_cpu_bswap_data(CPUARMState *env)
+{
+return
+#ifdef

Re: [Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-01-17 Thread Alex Williamson

Hi Jike,

On Mon, 2016-01-18 at 10:39 +0800, Jike Song wrote:
> Hi Alex, let's continue with a new thread :)
> 
> Basically we agree with you: exposing vGPU via VFIO can make
> QEMU share as much code as possible with pcidev(PF or VF) assignment.
> And yes, different vGPU vendors can share quite a lot of the
> QEMU part, which will do good for upper layers such as libvirt.
> 
> 
> To achieve this, there is quite a lot to do, I'll summarize
> it below. I dived into VFIO for a while but still may have
> things misunderstood, so please correct me :)
> 
> 
> 
> First, let me illustrate my understanding of current VFIO
> framework used to pass through a pcidev to guest:
> 
> 
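>                     +-----------------------------+
>                     |          vfio qemu          |
>                     +-+-------------------+-----+-+
>                       |DMA                ^     |CFG
> QEMU                  |map             IRQ|     |
> ----------------------|-------------------|-----|----------
> KERNEL    +-----------|-------------------|-----|-------+
>           | VFIO      |                   |     |       |
>           |           v                   |     v       |
>           |  +-------------------+  +-----+-----------+ |
> IOMMU     |  | vfio iommu driver |  | vfio bus driver | |
> API  <-------+                   |  |                 | |
> Layer     |  | e.g. type1        |  | e.g. vfio_pci   | |
>           |  +-------------------+  +-----------------+ |
>           +---------------------------------------------+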
> 
> 
> Here when a particular pcidev is passed-through to a KVM guest,
> it is attached to vfio_pci driver in host, and guest memory
> is mapped into IOMMU via the type1 iommu driver.
> 
> 
> Then, the draft infrastructure of future VFIO-based vgpu:
> 
> 
> 
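>                +----------------------------------+
>                |            vfio qemu             |
>                +-----+-------------------+--+-----+
>                      |DMA                ^  |CFG
> QEMU                 |map             IRQ|  |
> ---------------------|-------------------|--|---------------
> KERNEL               |                   |  |
>          +-----------|-------------------|--|----------+
>          |VFIO       |                   |  |          |
>          |           v                   |  v          |
>          | +--------------------+  +-----+-----------+ |
> DMA      | | vfio iommu driver  |  | vfio bus driver | |
> API <------+                    |  |                 | |
> Layer    | |  e.g. vfio_type2   |  |  e.g. vfio_vgpu | |
>          | +--------------------+  +-----------------+ |
>          |           |  ^                |  ^          |
>          +-----------|--|----------------|--|----------+
>                      |  |                |  |
>                      |  |                v  |
>          +-----------|--|--------+   +---------------------+
>          | +---------v---------+ |   |                     |
>          | |                   | |   |                     |
>          | |       KVMGT       | |   |                     |
>          | |                   | |   |   host gfx driver   |
>          | +-------------------+ |   |                     |
>          |                       |   |                     |
>          |    KVM hypervisor     |   |                     |
>          +-----------------------+   +---------------------+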
> 
> NOTE    vfio_type2 and vfio_vgpu are only *logically* parts
> of VFIO, they may be implemented in KVM hypervisor
> or host gfx driver.
> 
> 
> 
> Here we need to implement a new vfio IOMMU driver instead of type1,
> let's call it vfio_type2 temporarily. The main difference from pcidev
> assignment is, vGPU doesn't have its own DMA requester id, so it has
> to share mappings with host and other vGPUs.
> 
> - type1 iommu driver maps gpa to hpa for passing through;
>   whereas type2 maps iova to hpa;
> 
> - hardware iommu is always needed by type1, whereas for
>   type2, hardware iommu is optional;
> 
> - type1 will invoke the low-level IOMMU API (iommu_map et al) to
>   set up the IOMMU page table directly, whereas type2 doesn't (it
>   only needs to invoke a higher-level DMA API like dma_map_page);

Yes, the current type1 implementation is not compatible with vgpu since
there are not separate requester IDs on the bus and you probably don't
want or need to pin all of guest memory like we do for direct
assignment.  However, let's separate the type1 user API from the
current implementation.  It's quite easy within the vfio code to
consider "type1" to be an API specification that may have multiple
implementations.  A minor code change would allow us to continue
looking for compatible iommu backends if the group we're trying to
attach is rejected.
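
The "keep looking for a compatible backend" idea can be sketched as a simple loop. This is only an illustration under stated assumptions: `struct iommu_backend`, `pick_backend()` and the two callbacks are hypothetical names and do not correspond to the kernel's vfio structures.

```c
#include <stddef.h>

/* Hypothetical backend descriptor: names and fields are illustrative
 * only and do not match the kernel's vfio structures. */
struct iommu_backend {
    const char *name;
    /* Returns 0 if the backend accepts the group, non-zero to reject. */
    int (*attach_group)(int group_id);
};

/* Illustrative callbacks: pretend groups below 100 are physical devices
 * with their own requester ID and everything else is a vGPU. */
static int attach_physical(int group_id) { return group_id < 100 ? 0 : -1; }
static int attach_mediated(int group_id) { return group_id >= 100 ? 0 : -1; }

/* Try each backend that implements the same user API in turn and
 * return the first one that accepts the group, or NULL if all reject. */
static const struct iommu_backend *
pick_backend(const struct iommu_backend *backends, size_t count, int group_id)
{
    for (size_t i = 0; i < count; i++) {
        if (backends[i].attach_group(group_id) == 0) {
            return &backends[i];
        }
        /* Rejected: keep looking instead of failing the whole request. */
    }
    return NULL;
}
```

With both callbacks registered in one array, a vGPU group falls through the first backend and lands on the second, while userspace sees the same API either way.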

[Qemu-devel] [PATCH] misc: zynq-xadc: Fix off-by-one

2016-01-17 Thread Peter Crosthwaite

This bounds check was off-by-one. Fix.

Reported-by: Paolo Bonzini 
Signed-off-by: Peter Crosthwaite 
---
 hw/misc/zynq-xadc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/misc/zynq-xadc.c b/hw/misc/zynq-xadc.c
index 1a32595..d160ff2 100644
--- a/hw/misc/zynq-xadc.c
+++ b/hw/misc/zynq-xadc.c
@@ -220,7 +220,7 @@ static void zynq_xadc_write(void *opaque, hwaddr offset, uint64_t val,
 break;
 }
 
-if (xadc_reg > ZYNQ_XADC_NUM_ADC_REGS && xadc_cmd != CMD_NOP) {
+if (xadc_reg >= ZYNQ_XADC_NUM_ADC_REGS && xadc_cmd != CMD_NOP) {
 qemu_log_mask(LOG_GUEST_ERROR, "read/write op to invalid xadc "
   "reg 0x%x\n", xadc_reg);
 break;
-- 
1.9.1
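
For readers skimming the archive, the off-by-one can be shown in isolation. This is a minimal sketch; `NUM_REGS` and `reg_index_invalid()` are hypothetical stand-ins for `ZYNQ_XADC_NUM_ADC_REGS` and the check in `zynq_xadc_write()`:

```c
#include <stdbool.h>

#define NUM_REGS 8  /* hypothetical register count, for illustration */

/* An index into an array of NUM_REGS entries is valid for 0..NUM_REGS-1,
 * so the rejection test must use >=; with > the value NUM_REGS itself
 * slips through and indexes one past the end. */
static bool reg_index_invalid(unsigned int reg)
{
    return reg >= NUM_REGS;
}
```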

[Qemu-devel] [PATCH v1 12/17] target-arm: implement SCTLR.B, drop bswap_code

2016-01-17 Thread Peter Crosthwaite

From: Paolo Bonzini 

bswap_code is a CPU property of sorts ("is the iside endianness the
opposite way round to TARGET_WORDS_BIGENDIAN?") but it is not the
actual CPU state involved here which is SCTLR.B (set for BE32
binaries, clear for BE8).

Replace bswap_code with SCTLR.B, and pass that to arm_ld*_code.
The next patches will make data fetches honor both SCTLR.B and
CPSR.E appropriately.

Signed-off-by: Paolo Bonzini 
[PC changes:
 * rebased on master (Jan 2016)
 * Dropped comment about CPSR.E being unimplemented
 * s/TARGET_USER_ONLY/CONFIG_USER_ONLY
 * Use bswap_code() for disas_set_info() instead of raw sctlr_b
]
Signed-off-by: Peter Crosthwaite 
---

 linux-user/main.c  | 13 -
 target-arm/arm_ldst.h  |  8 
 target-arm/cpu.c   |  2 +-
 target-arm/cpu.h   | 47 ++
 target-arm/helper.c|  8 
 target-arm/translate-a64.c |  6 +++---
 target-arm/translate.c | 12 ++--
 target-arm/translate.h |  2 +-
 8 files changed, 66 insertions(+), 32 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 2157774..d481458 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -441,7 +441,7 @@ void cpu_loop(CPUX86State *env)
 
 #define get_user_code_u32(x, gaddr, env)\
 ({ abi_long __r = get_user_u32((x), (gaddr));   \
-if (!__r && (env)->bswap_code) {\
+if (!__r && bswap_code(arm_sctlr_b(env))) { \
 (x) = bswap32(x);   \
 }   \
 __r;\
@@ -449,7 +449,7 @@ void cpu_loop(CPUX86State *env)
 
 #define get_user_code_u16(x, gaddr, env)\
 ({ abi_long __r = get_user_u16((x), (gaddr));   \
-if (!__r && (env)->bswap_code) {\
+if (!__r && bswap_code(arm_sctlr_b(env))) { \
 (x) = bswap16(x);   \
 }   \
 __r;\
@@ -4489,14 +4489,17 @@ int main(int argc, char **argv, char **envp)
 env->regs[i] = regs->uregs[i];
 }
 #ifdef TARGET_WORDS_BIGENDIAN
-env->uncached_cpsr |= CPSR_E;
 env->cp15.sctlr_el[1] |= SCTLR_E0E;
-#endif
 /* Enable BE8.  */
 if (EF_ARM_EABI_VERSION(info->elf_flags) >= EF_ARM_EABI_VER4
 && (info->elf_flags & EF_ARM_BE8)) {
-env->bswap_code = 1;
+env->uncached_cpsr |= CPSR_E;
+} else {
+env->cp15.sctlr_el[1] |= SCTLR_B;
+/* We model BE32 as regular BE, so set CPSR_E */
+env->uncached_cpsr |= CPSR_E;
 }
+#endif
 }
 #elif defined(TARGET_UNICORE32)
 {
diff --git a/target-arm/arm_ldst.h b/target-arm/arm_ldst.h
index b1ece01..35c2c43 100644
--- a/target-arm/arm_ldst.h
+++ b/target-arm/arm_ldst.h
@@ -25,10 +25,10 @@
 
 /* Load an instruction and return it in the standard little-endian order */
 static inline uint32_t arm_ldl_code(CPUARMState *env, target_ulong addr,
-bool do_swap)
+bool sctlr_b)
 {
 uint32_t insn = cpu_ldl_code(env, addr);
-if (do_swap) {
+if (bswap_code(sctlr_b)) {
 return bswap32(insn);
 }
 return insn;
@@ -36,10 +36,10 @@ static inline uint32_t arm_ldl_code(CPUARMState *env, target_ulong addr,
 
 /* Ditto, for a halfword (Thumb) instruction */
 static inline uint16_t arm_lduw_code(CPUARMState *env, target_ulong addr,
- bool do_swap)
+ bool sctlr_b)
 {
 uint16_t insn = cpu_lduw_code(env, addr);
-if (do_swap) {
+if (bswap_code(sctlr_b)) {
 return bswap16(insn);
 }
 return insn;
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index d3b73bf..cec5147 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -413,7 +413,7 @@ static void arm_disas_set_info(CPUState *cpu, disassemble_info *info)
 } else {
 info->print_insn = print_insn_arm;
 }
-if (env->bswap_code) {
+if (bswap_code(arm_sctlr_b(env))) {
 #ifdef TARGET_WORDS_BIGENDIAN
 info->endian = BFD_ENDIAN_LITTLE;
 #else
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 74048d1..3edd56b 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -478,9 +478,6 @@ typedef struct CPUARMState {
 uint32_t cregs[16];
 } iwmmxt;
 
-/* For mixed endian mode.  */
-bool bswap_code;
-
 #if defined(CONFIG_USER_ONLY)
 /* For usermode syscall translation.  */
 int eabi;
@@ -1795,6 +1792,19 @@ static inline bool arm_singlestep_active(CPUARMState *env)
 && arm_generate_debug_exceptions(env);
 }
 
+static inline bool arm_sctlr_b(CPUARMState *env)
+{
+return
+
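
The pattern used by `arm_ldl_code()` in the hunks above (fetch a word, then byte-swap it if a flag says so) can be sketched on its own. `swap32()` and `fetch_insn()` are hypothetical names. How the patch derives the flag from SCTLR.B is not visible in this truncated message, so the flag is simply passed in:

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable 32-bit byte reversal (a stand-in for QEMU's bswap32()). */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical fetch helper: 'raw' is the word as loaded in the
 * target's default byte order; 'swap' says whether the instruction
 * stream uses the opposite order. */
static uint32_t fetch_insn(uint32_t raw, bool swap)
{
    return swap ? swap32(raw) : raw;
}
```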

[Qemu-devel] Using directory as initrd

2016-01-17 Thread Kasper Dupont

I would like to use a directory as initrd file without
having to write it to an initrd file each time I have
changed anything in that directory.

I have written code to pipe an initrd directly from cpio
to qemu. Do you have any feedback on the attached patch?

-- 
Kasper Dupont -- Rigtige mænd skriver deres egne backupprogrammer
#define _(_)"d.%.4s%."_"2s" /* This is my email address */
char*_="@2kaspner"_()"%03"_("4s%.")"t\n";printf(_+11,_+6,_,12,_+2,_+7,_+6);
diff -up qemu-2.0.0+dfsg/hw/core/loader.c.orig qemu-2.0.0+dfsg/hw/core/loader.c
--- qemu-2.0.0+dfsg/hw/core/loader.c.orig   2014-04-17 15:30:59.0 +0200
+++ qemu-2.0.0+dfsg/hw/core/loader.c   2016-01-17 19:38:01.505138214 +0100
@@ -137,6 +137,70 @@ void pstrcpy_targphys(const char *name,
 }
 }
 
+/* return the size or -1 if error */
+int load_initrd(const char *filename, uint8_t **addr)
+{
+pid_t child = 0;
+int fd, size = 0, buffer_size = 4096;
+fd = open(filename, O_RDONLY | O_BINARY);
+if (fd < 0)
+return -1;
+
+*addr = malloc(buffer_size);
+
+while(*addr) {
+int bytes_read = read(fd, (*addr) + size, buffer_size - size);
+
+if (bytes_read == 0) {
+close(fd);
+if (child) {
+waitpid(child, NULL, 0);
+}
+return size;
+}
+
+if (bytes_read > 0) {
+size += bytes_read;
+} else {
+int pipefd[2];
+
+if (errno != EISDIR)
+return -1;
+
+assert(!child);
+
+if (pipe(pipefd))
+return -1;
+
+child = fork();
+if (child == -1)
+return -1;
+
+if (!child) {
+if(fchdir(fd))
+error(1, errno, "fchdir failed");
+if(dup2(pipefd[1], 1) == -1)
+error(1, errno, "dup2 failed");
+execlp("/bin/sh", "/bin/sh", "-c",
+   "find . | cpio --quiet -R 0:0 -o -H newc",
+   NULL);
+error(1, errno, "/bin/sh");
+}
+
+close(fd);
+close(pipefd[1]);
+fd = pipefd[0];
+}
+
+if (size == buffer_size) {
+buffer_size <<= 1;
+*addr = realloc(*addr, buffer_size);
+}
+}
+
+return -1;
+}
+
 /* A.OUT loader */
 
 struct exec
diff -up qemu-2.0.0+dfsg/hw/i386/pc.c.orig qemu-2.0.0+dfsg/hw/i386/pc.c
--- qemu-2.0.0+dfsg/hw/i386/pc.c.orig   2014-04-17 15:30:59.0 +0200
+++ qemu-2.0.0+dfsg/hw/i386/pc.c   2016-01-17 19:16:04.704258817 +0100
@@ -829,12 +829,13 @@ static void load_linux(FWCfgState *fw_cf
 
 /* load initrd */
 if (initrd_filename) {
+uint8_t *initrd_load_buffer;
 if (protocol < 0x200) {
 fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
 exit(1);
 }
 
-initrd_size = get_image_size(initrd_filename);
+initrd_size = load_initrd(initrd_filename, &initrd_load_buffer);
 if (initrd_size < 0) {
 fprintf(stderr, "qemu: error reading initrd %s: %s\n",
 initrd_filename, strerror(errno));
@@ -844,7 +845,8 @@ static void load_linux(FWCfgState *fw_cf
 initrd_addr = (initrd_max-initrd_size) & ~4095;
 
 initrd_data = g_malloc(initrd_size);
-load_image(initrd_filename, initrd_data);
+memcpy(initrd_data, initrd_load_buffer, initrd_size);
+free(initrd_load_buffer);
 
 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
 fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
diff -up qemu-2.0.0+dfsg/include/hw/loader.h.orig 
qemu-2.0.0+dfsg/include/hw/loader.h
--- qemu-2.0.0+dfsg/include/hw/loader.h.orig   2014-04-17 15:30:59.0 +0200
+++ qemu-2.0.0+dfsg/include/hw/loader.h 2016-01-17 19:15:30.688236909 +0100
@@ -13,6 +13,7 @@
  */
 int get_image_size(const char *filename);
 int load_image(const char *filename, uint8_t *addr); /* deprecated */
+int load_initrd(const char *filename, uint8_t **addr);
 int load_image_targphys(const char *filename, hwaddr,
 uint64_t max_sz);
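
The core of `load_initrd()` is a read-until-EOF loop over a file descriptor that may be a regular file or the read end of a pipe. Below is a hedged sketch of that loop alone, not the patch's code: `read_all()` is a hypothetical name, and unlike the patch it keeps the old pointer when `realloc()` fails and retries on `EINTR`.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read everything from fd into a heap buffer that grows by doubling.
 * On success returns the byte count and stores the buffer in *out
 * (caller frees); on failure returns -1 and leaves *out set to NULL. */
static ssize_t read_all(int fd, unsigned char **out)
{
    size_t cap = 4096, len = 0;
    unsigned char *buf = malloc(cap);

    *out = NULL;
    if (!buf) {
        return -1;
    }
    for (;;) {
        ssize_t n = read(fd, buf + len, cap - len);
        if (n == 0) {
            break;                      /* end of file or pipe */
        }
        if (n < 0) {
            if (errno == EINTR) {
                continue;               /* interrupted, retry */
            }
            free(buf);
            return -1;
        }
        len += (size_t)n;
        if (len == cap) {
            /* Keep the old pointer until realloc succeeds so it can
             * still be freed on failure. */
            unsigned char *bigger = realloc(buf, cap * 2);
            if (!bigger) {
                free(buf);
                return -1;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    *out = buf;
    return (ssize_t)len;
}
```

Because the loop only depends on `read()` returning 0 at end of input, the same function works whether the descriptor refers to an initrd file or to the output of a `cpio` child process.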

[Qemu-devel] CMSG_SPACE() causing compile time error on Mac OS X

2016-01-17 Thread Programmingkid

I was wondering if you had problems compiling QEMU on Mac OS X recently. On my 
system, the channel-socket.c file causes this error:

io/channel-socket.c: In function 'qio_channel_socket_writev':
io/channel-socket.c:497:18: error: variable-sized object may not be initialized
 char control[CMSG_SPACE(sizeof(int) * SOCKET_MAX_FDS)] = { 0 };
  
As a test I made this simple program:

#include <stdio.h>
#include <sys/socket.h>

int main (int argc, char * const argv[]) {
    printf("GCC version = %d.%d.%d\n", __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
    char control[CMSG_SPACE(sizeof(int) * 16)] = { 0 };
    control[0] = 'a';  // just to eliminate a warning
    return 0;
}

When compiling under Xcode, the program does compile and run. It prints "GCC 
version = 4.2.1".

When I try to compile it under gcc 4.2.1 using just the terminal, I see this 
error message:
main.cpp: In function ‘int main(int, char* const*)’:
main.cpp:6: error: size of array ‘control’ is not an integral 
constant-expression

Why there is a difference between XCode and the terminal I have no idea. 

When compiling it under gcc 4.9.2, it compiles and runs without problem. 

I have set the configure option for cc to gcc-4.9 with this "-cc=gcc-4.9". So 
the error message makes me believe that the wrong compiler is being used. 

This is the full configure command options I used:
./configure --cxx=gcc-4.9 --cc=gcc-4.9 --objcc=gcc-4.9 --disable-gtk 
--disable-sdl --target-list=ppc-softmmu,i386-softmmu

Any insight as to what could be wrong?

[Qemu-devel] ping: [PATCH v12] block/raw-posix.c: Make physical devices usable in QEMU under Mac OS X host

2016-01-17 Thread Programmingkid

https://patchwork.ozlabs.org/patch/555945/

> Mac OS X can be picky when it comes to allowing the user
> to use physical devices in QEMU. Most mounted volumes
> appear to be off limits to QEMU. If an issue is detected,
> a message is displayed showing the user how to unmount a
> volume.
> 
> Signed-off-by: John Arbuckle 
> 
> ---
> Removed mediaType parameter from FindEjectableOpticalMedia().
> Added goto statements to hdev_open.
> Replaced snprintf() with g_strdup() in FindEjectableOpticalMedia().
> Added return statement to hdev_open for Linux compatibility.
> 
>  block/raw-posix.c |  163 
> -
>  1 files changed, 124 insertions(+), 39 deletions(-)
> 
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index d9162fd..82e8e62 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  //#include 
> +#include 
>  #include 
>  #endif
>  
> @@ -1975,33 +1976,46 @@ BlockDriver bdrv_file = {
>  /* host device */
>  
>  #if defined(__APPLE__) && defined(__MACH__)
> -static kern_return_t FindEjectableCDMedia( io_iterator_t *mediaIterator );
>  static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
>  CFIndex maxPathSize, int flags);
> -kern_return_t FindEjectableCDMedia( io_iterator_t *mediaIterator )
> +static char *FindEjectableOpticalMedia(io_iterator_t *mediaIterator)
>  {
> -kern_return_t   kernResult;
> +kern_return_t kernResult = KERN_FAILURE;
>  mach_port_t masterPort;
>  CFMutableDictionaryRef  classesToMatch;
> +const char *matching_array[] = {kIODVDMediaClass, kIOCDMediaClass};
> +char *mediaType = NULL;
>  
>  kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
>  if ( KERN_SUCCESS != kernResult ) {
>  printf( "IOMasterPort returned %d\n", kernResult );
>  }
>  
> -classesToMatch = IOServiceMatching( kIOCDMediaClass );
> -if ( classesToMatch == NULL ) {
> -printf( "IOServiceMatching returned a NULL dictionary.\n" );
> -} else {
> -CFDictionarySetValue( classesToMatch, CFSTR( kIOMediaEjectableKey ), 
> kCFBooleanTrue );
> -}
> -kernResult = IOServiceGetMatchingServices( masterPort, classesToMatch, 
> mediaIterator );
> -if ( KERN_SUCCESS != kernResult )
> -{
> -printf( "IOServiceGetMatchingServices returned %d\n", kernResult );
> -}
> +int index;
> +for (index = 0; index < ARRAY_SIZE(matching_array); index++) {
> +classesToMatch = IOServiceMatching(matching_array[index]);
> +if (classesToMatch == NULL) {
> +error_report("IOServiceMatching returned NULL for %s",
> + matching_array[index]);
> +continue;
> +}
> +CFDictionarySetValue(classesToMatch, CFSTR(kIOMediaEjectableKey),
> + kCFBooleanTrue);
> +kernResult = IOServiceGetMatchingServices(masterPort, classesToMatch,
> +  mediaIterator);
> +if (kernResult != KERN_SUCCESS) {
> +error_report("Note: IOServiceGetMatchingServices returned %d",
> + kernResult);
> +}
>  
> -return kernResult;
> +/* If a match was found, leave the loop */
> +if (*mediaIterator != 0) {
> +DPRINTF("Matching using %s\n", matching_array[index]);
> +mediaType = g_strdup(matching_array[index]);
> +break;
> +}
> +}
> +return mediaType;
>  }
>  
>  kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
> @@ -2033,7 +2047,35 @@ kern_return_t GetBSDPath(io_iterator_t mediaIterator, 
> char *bsdPath,
>  return kernResult;
>  }
>  
> -#endif
> +/* Sets up a real cdrom for use in QEMU */
> +static bool setup_cdrom(char *bsd_path, Error **errp)
> +{
> +int index, num_of_test_partitions = 2, fd;
> +char test_partition[MAXPATHLEN];
> +bool partition_found = false;
> +
> +/* look for a working partition */
> +for (index = 0; index < num_of_test_partitions; index++) {
> +snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
> + index);
> +fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
> +if (fd >= 0) {
> +partition_found = true;
> +qemu_close(fd);
> +break;
> +}
> +}
> +
> +/* if a working partition on the device was not found */
> +if (partition_found == false) {
> +error_setg(errp, "Failed to find a working partition on disc");
> +} else {
> +DPRINTF("Using %s as optical disc\n", test_partition);
> +pstrcpy(bsd_path, MAXPATHLEN, test_partition);
> +}
> +return partition_found;
> +}
> +#endif /* defined(__APPLE__) && defined(__MACH__) */
>  
>  static int hdev_probe_device(const char *filename)
>  {
> @@ -2115,6 +2157,16 @@

Re: [Qemu-devel] CMSG_SPACE() causing compile time error on Mac OS X

2016-01-17 Thread Programmingkid


On Jan 17, 2016, at 6:22 PM, Paolo Bonzini wrote:

> 
> 
> On 17/01/2016 23:23, Programmingkid wrote:
>> When compiling under Xcode, the program does compile and run. It prints "GCC 
>> version = 4.2.1".
>> 
>> When I try to compile it under gcc 4.2.1 using just the terminal, I see this 
>> error message:
>> main.cpp: In function ‘int main(int, char* const*)’:
>> main.cpp:6: error: size of array ‘control’ is not an integral 
>> constant-expression
>> 
>> Why there is a difference between XCode and the terminal I have no idea. 
>> 
>> When compiling it under gcc 4.9.2, it compiles and runs without problem. 
>> 
>> I have set the configure option for cc to gcc-4.9 with this "-cc=gcc-4.9". 
>> So the error message makes me believe that the wrong compiler is being used. 
>> 
>> This is the full configure command options I used:
>> ./configure --cxx=gcc-4.9 --cc=gcc-4.9 --objcc=gcc-4.9 --disable-gtk 
>> --disable-sdl --target-list=ppc-softmmu,i386-softmmu
>> 
>> Any insight as to what could be wrong? 
> 
> What's the definition of the CMSG_SPACE macro under OS X?
> 
> Paolo

#define CMSG_SPACE(l)   (__DARWIN_ALIGN32(sizeof(struct cmsghdr)) + 
__DARWIN_ALIGN32(l))

Hope this helps.
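
If a compiler does not treat that macro expansion as an integer constant expression, the array becomes a variable-length array, and C does not allow an initializer on one. One way to sidestep the error, offered here only as a sketch and not as the fix adopted in QEMU, is to drop the initializer and clear the buffer with `memset()`. `EXAMPLE_MAX_FDS` and `control_buffer_size()` are hypothetical names:

```c
#include <string.h>
#include <sys/socket.h>

#define EXAMPLE_MAX_FDS 16  /* mirrors the 16 used in the test program */

/* Declare the control buffer without an initializer and clear it with
 * memset(); this is valid whether or not the compiler treats
 * CMSG_SPACE() as an integer constant expression. Returns the buffer
 * size so callers can check it. */
static size_t control_buffer_size(void)
{
    char control[CMSG_SPACE(sizeof(int) * EXAMPLE_MAX_FDS)];

    memset(control, 0, sizeof(control));
    return sizeof(control);
}
```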

Re: [Qemu-devel] [PATCH 00/10] Cleanups to error reporting on ppc and spapr (v2)

2016-01-17 Thread Alexey Kardashevskiy


On 01/16/2016 02:47 AM, Markus Armbruster wrote:

David Gibson  writes:


Here's a new spin of my patches to clean up a bunch of error reporting
in the pseries machine type and target-ppc code, to better use the
error API.

Once reviewed, I hope to merge this into ppc-for-2.6 shortly.


There's an error_setg(&error_abort, ...) left in spapr_drc.c.  Should
that be converted to a straight abort()?



I was under the impression that all abort()/exit() instances are meant to be
converted to error_setg(&error_abort) eventually (as are all fprintf(error) to
perror() and tracepoints, etc.). Is there any howto on what to use when? :)



--
Alexey

Re: [Qemu-devel] CMSG_SPACE() causing compile time error on Mac OS X

2016-01-17 Thread Paolo Bonzini



On 17/01/2016 23:23, Programmingkid wrote:
> When compiling under Xcode, the program does compile and run. It prints "GCC 
> version = 4.2.1".
> 
> When I try to compile it under gcc 4.2.1 using just the terminal, I see this 
> error message:
> main.cpp: In function ‘int main(int, char* const*)’:
> main.cpp:6: error: size of array ‘control’ is not an integral 
> constant-expression
> 
> Why there is a difference between XCode and the terminal I have no idea. 
> 
> When compiling it under gcc 4.9.2, it compiles and runs without problem. 
> 
> I have set the configure option for cc to gcc-4.9 with this "-cc=gcc-4.9". So 
> the error message makes me believe that the wrong compiler is being used. 
> 
> This is the full configure command options I used:
> ./configure --cxx=gcc-4.9 --cc=gcc-4.9 --objcc=gcc-4.9 --disable-gtk 
> --disable-sdl --target-list=ppc-softmmu,i386-softmmu
> 
> Any insight as to what could be wrong? 

What's the definition of the CMSG_SPACE macro under OS X?

Paolo

Re: [Qemu-devel] [PATCH 0/3] Reduce abuse of rtas_st / rtas_ld

2016-01-17 Thread Alexey Kardashevskiy


On 01/16/2016 12:14 PM, David Gibson wrote:

The rtas_ld() and rtas_st() helpers were designed for loading RTAS
arguments and storing RTAS returns which are in a simple, common array
format.

However, a number of RTAS routines - and even non-RTAS routines - have
started using these for accessing other memory buffers, where the
normal qemu memory access routines would be more appropriate.

This series removes some of these abuses of the RTAS accessors.


imho simply renaming rtas_st to stl_be_phys_real (and so on for the other
rtas_xx) would make more sense, as rtas_st does not have to do anything
with RTAS itself; it is all about realmode guest memory access, RTAS just
happened to be the first client of it.


btw in st_cc_buf(), what does "cc" stand for?



--
Alexey

Re: [Qemu-devel] [PATCH v4 1/2] blockdev: Error out on negative throttling option values

2016-01-17 Thread Fam Zheng

On Fri, 01/15 15:28, Kevin Wolf wrote:
> Am 15.01.2016 um 03:09 hat Fam Zheng geschrieben:
> > The implicit casting from unsigned int to double changes negative values
> > into large positive numbers and accepts them.  We should instead print
> > an error.
> > 
> > Check the number range so this case is caught and reported.
> > 
> > Signed-off-by: Fam Zheng 
> > Reviewed-by: Max Reitz 
> > ---
> >  blockdev.c  |  3 ++-
> >  include/qemu/throttle.h |  2 ++
> >  util/throttle.c | 16 ++--
> >  3 files changed, 10 insertions(+), 11 deletions(-)
> > 
> > diff --git a/blockdev.c b/blockdev.c
> > index 2df0c6d..b925e5d 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -348,7 +348,8 @@ static bool check_throttle_config(ThrottleConfig *cfg, 
> > Error **errp)
> >  }
> >  
> >  if (!throttle_is_valid(cfg)) {
> > -error_setg(errp, "bps/iops/maxs values must be 0 or greater");
> > +error_setg(errp, "bps/iops/max values must be within [0, %" PRId64
> > + ")", (int64_t)THROTTLE_VALUE_MAX);
> 
> I think that should be "]". If you agree, I'll fix it up while applying.

Yes, that's right. Thanks.

Fam

> 
> >  return false;
> >  }
> 
> Kevin
>
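
The problem the patch describes, and the inclusive range settled on in the review, can be sketched as follows. The bound `EXAMPLE_VALUE_MAX` and both function names are assumptions made for illustration; the actual value of `THROTTLE_VALUE_MAX` is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound for illustration; the patch uses
 * THROTTLE_VALUE_MAX, whose value is not shown in this thread. */
#define EXAMPLE_VALUE_MAX 1000000000000000.0

/* What happens without a check: a negative input stored in an unsigned
 * 64-bit variable wraps around, and the conversion to double then
 * yields a very large positive number. */
static double wrapped_to_double(int64_t user_value)
{
    uint64_t as_unsigned = (uint64_t)user_value;
    return (double)as_unsigned;
}

/* Inclusive range check, matching the "[0, MAX]" wording agreed on
 * in the review. */
static bool throttle_value_ok(double v)
{
    return v >= 0 && v <= EXAMPLE_VALUE_MAX;
}
```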

Re: [Qemu-devel] [PATCH 0/3] Reduce abuse of rtas_st / rtas_ld

2016-01-17 Thread David Gibson

On Mon, Jan 18, 2016 at 10:51:51AM +1100, Alexey Kardashevskiy wrote:
> On 01/16/2016 12:14 PM, David Gibson wrote:
> >The rtas_ld() and rtas_st() helpers were designed for loading RTAS
> >arguments and storing RTAS returns which are in a simple, common array
> >format.
> >
> >However, a number of RTAS routines - and even non-RTAS routines - have
> >started using these for accessing other memory buffers, where the
> >normal qemu memory access routines would be more appropriate.
> >
> >This series removes some of these abuses of the RTAS accessors.
> 
> imho simply renaming rtas_st to stl_be_phys_real (and so on for the other
> rtas_xx) would make more sense, as rtas_st does not have to do anything with
> RTAS itself; it is all about realmode guest memory access, RTAS just
> happened to be the first client of it.

Well, no.  They were designed, specifically, to be a concise way to
load RTAS arguments, and store RTAS returns.  Nothing more.

To match other ldl* routines they would need to change to not take the
'n' argument, which would make using them more awkward for the exact
use case they're intended for.

I did consider adding ldXX_real() routines for all the memory access
cases these have been abused for, but they're few enough that it seems
simpler just to open code them in terms of the base memory access
routines.
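
To illustrate why an index argument suits array-style arguments, here is a small sketch. It assumes 32-bit big-endian cells (as the proposed name stl_be_phys_real suggests) and reads from a local byte buffer rather than guest memory; `arg_load_be32()` is a hypothetical name, not the real rtas_ld():

```c
#include <stdint.h>

/* Hypothetical stand-in for the accessor idea: read the n-th 32-bit
 * big-endian word from an argument array held in a byte buffer. The
 * real rtas_ld() reads guest memory; this sketch only shows why the
 * index parameter 'n' is convenient for array-style arguments. */
static uint32_t arg_load_be32(const uint8_t *args, unsigned int n)
{
    const uint8_t *p = args + 4u * n;
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```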

> btw in st_cc_buf(), what does "cc" stand for?

"configure connector" - it's a helper just for that routine.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v9 0/3] qapi: child add/delete support

2016-01-17 Thread Wen Congyang

Ping...

On 12/25/2015 05:22 PM, Changlong Xie wrote:
> If quorum's child is broken, we can use mirror job to replace it.
> But sometimes, the user only needs to remove the broken child, and
> add it later when the problem is fixed.
> 
> ChangLog:
> v9:
> 1. Rebase to the newest code
> 2. Remove redundant code in quorum_add_child() and quorum_del_child()
> 3. Fix typos in qmp-commands.hx
> v8:
> 1. Rebase to the newest code
> 2. Address the comments from Eric Blake
> v7:
> 1. Remove the qmp command x-blockdev-change's parameter operation according
>to Kevin's comments.
> 2. Remove the hmp command.
> v6:
> 1. Use a single qmp command x-blockdev-change to replace x-blockdev-child-add
>and x-blockdev-child-delete
> v5:
> 1. Address Eric Blake's comments
> v4:
> 1. drop nbd driver's implementation. We can use human-monitor-command
>to do it.
> 2. Rename the command name.
> v3:
> 1. Don't open BDS in bdrv_add_child(). Use the existing BDS which is
>created by the QMP command blockdev-add.
> 2. The driver NBD can support filename, path, host:port now.
> v2:
> 1. Use bdrv_get_device_or_node_name() instead of new function
>bdrv_get_id_or_node_name()
> 2. Update the error message
> 3. Update the documents in block-core.json
> 
> Wen Congyang (3):
>   Add new block driver interface to add/delete a BDS's child
>   quorum: implement bdrv_add_child() and bdrv_del_child()
>   qmp: add monitor command to add/remove a child
> 
>  block.c   |  58 --
>  block/quorum.c| 122 
> +-
>  blockdev.c|  54 
>  include/block/block.h |   9 
>  include/block/block_int.h |   5 ++
>  qapi/block-core.json  |  23 +
>  qmp-commands.hx   |  47 ++
>  7 files changed, 312 insertions(+), 6 deletions(-)
>

Re: [Qemu-devel] [PATCHv3 0/4] Start allowing ISA to be configured out

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 11:21:20PM +1100, David Gibson wrote:
> Finally got around to respinning this series I last sent out ~6 months
> ago.
> 
> At the moment isa-bus.c is compiled unconditionally for all targets.
> However, some targets have never used legacy ISA devices.  Many more
> targets have at least some machine types without ISA.
> 
> These patches allow ISA bus to be disabled in the configuration, thus
> allowing cut down configurations for targets and machine types that
> don't have ISA.
> 
> Actually turning off ISA will require more than this for most targets
> - there are a number of non-obvious dependencies on the ISA code.
> b19c1c0 "isa: remove isa_mem_base variable" already got rid of an
> important one (VGA depended on ISA).  Patches 2/4 and 4/4 in this
> series remove some more.  There are a number more though, for example
> CONFIG_IDE_CORE depends on ISA and the HMP "info irq" command depends
> on I8259 code.
> 
> But these patches should allow easier experimentation so we can
> chip away at those dependencies on legacy code in the future.

I'm not really sure how to go about moving this forward.

Michael, should this go through your tree, or should I send a pull
request direct to Peter?  If the latter, whose acks will I need?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCHv3 2/4] Split serial-isa into its own config option

2016-01-17 Thread David Gibson

On Sat, Jan 16, 2016 at 01:37:57PM +0100, Thomas Huth wrote:
> On 15.01.2016 13:21, David Gibson wrote:
> > At present, the core device model code for 8250-like serial ports
> > (serial.c) and the code for serial ports attached to ISA-style legacy IO
> > (serial-isa.c) are both controlled by the CONFIG_SERIAL variable.
> > 
> > There are lots and lots of embedded platforms that have 8250-like serial
> > ports but have never had anything resembling ISA legacy IO.  Therefore,
> > split serial-isa into its own CONFIG_SERIAL_ISA option so it can be
> > disabled for platforms where it's not appropriate.
> > 
> > For now, I enabled CONFIG_SERIAL_ISA in every default-config where
> > CONFIG_SERIAL is enabled, excepting microblaze, moxie, or32, and
> > xtensa.  As best as I can tell, those platforms never used legacy ISA,
> > and also don't include PCI support (which would allow connection of a
> > PCI->ISA bridge and/or a southbridge including legacy ISA serial
> > ports).
> > 
> > Signed-off-by: David Gibson 
> > ---
> >  default-configs/alpha-softmmu.mak| 1 +
> >  default-configs/arm-softmmu.mak  | 1 +
> >  default-configs/i386-softmmu.mak | 1 +
> >  default-configs/mips-softmmu.mak | 1 +
> >  default-configs/mips64-softmmu.mak   | 1 +
> >  default-configs/mips64el-softmmu.mak | 1 +
> >  default-configs/mipsel-softmmu.mak   | 1 +
> >  default-configs/ppc-softmmu.mak  | 1 +
> >  default-configs/ppc64-softmmu.mak| 1 +
> >  default-configs/ppcemb-softmmu.mak   | 1 +
> >  default-configs/sh4-softmmu.mak  | 1 +
> >  default-configs/sh4eb-softmmu.mak| 1 +
> >  default-configs/sparc64-softmmu.mak  | 1 +
> >  default-configs/x86_64-softmmu.mak   | 1 +
> >  hw/char/Makefile.objs| 3 ++-
> >  15 files changed, 16 insertions(+), 1 deletion(-)
> ...
> > diff --git a/default-configs/ppc-softmmu.mak 
> > b/default-configs/ppc-softmmu.mak
> > index d4d0f9b..13eb94f 100644
> > --- a/default-configs/ppc-softmmu.mak
> > +++ b/default-configs/ppc-softmmu.mak
> > @@ -45,5 +45,6 @@ CONFIG_PLATFORM_BUS=y
> >  CONFIG_ETSEC=y
> >  CONFIG_LIBDECNUMBER=y
> >  # For PReP
> > +CONFIG_SERIAL_ISA=y
> >  CONFIG_MC146818RTC=y
> >  CONFIG_ISA_TESTDEV=y
> > diff --git a/default-configs/ppc64-softmmu.mak 
> > b/default-configs/ppc64-softmmu.mak
> > index 70a89d1..3e243fd 100644
> > --- a/default-configs/ppc64-softmmu.mak
> > +++ b/default-configs/ppc64-softmmu.mak
> > @@ -50,6 +50,7 @@ CONFIG_LIBDECNUMBER=y
> >  CONFIG_XICS=$(CONFIG_PSERIES)
> >  CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
> >  # For PReP
> > +CONFIG_SERIAL_ISA=y
> >  CONFIG_MC146818RTC=y
> >  CONFIG_ISA_TESTDEV=y
> >  CONFIG_MEM_HOTPLUG=y
> 
> A little bit off-topic ... but maybe we should simply "include
> ppc-softmmu.mak" in ppc64-softmmu.mak since the ppc64-softmmu is
> supposed to offer all the 32 bit platforms, too? Then changes like this
> would only affect one file instead of two.

Um.. perhaps, but not really within the scope of this series.

> 
> ...
> > diff --git a/hw/char/Makefile.objs b/hw/char/Makefile.objs
> > index 5931cc8..be42d2f 100644
> > --- a/hw/char/Makefile.objs
> > +++ b/hw/char/Makefile.objs
> > @@ -2,7 +2,8 @@ common-obj-$(CONFIG_IPACK) += ipoctal232.o
> >  common-obj-$(CONFIG_ESCC) += escc.o
> >  common-obj-$(CONFIG_PARALLEL) += parallel.o
> >  common-obj-$(CONFIG_PL011) += pl011.o
> > -common-obj-$(CONFIG_SERIAL) += serial.o serial-isa.o
> > +common-obj-$(CONFIG_SERIAL) += serial.o
> > +common-obj-$(CONFIG_SERIAL_ISA) += serial-isa.o
> >  common-obj-$(CONFIG_SERIAL_PCI) += serial-pci.o
> >  common-obj-$(CONFIG_VIRTIO) += virtio-console.o
> >  common-obj-$(CONFIG_XILINX) += xilinx_uartlite.o
> 
> Patch looks fine to me.
> 
> Reviewed-by: Thomas Huth 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 5/7] target-ppc: gdbstub: fix altivec registers for little-endian guests

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 04:00:38PM +0100, Greg Kurz wrote:
> Altivec registers are 128-bit wide. They are stored in memory as two
> 64-bit values that must be byteswapped when the guest is little-endian.
> Let's reuse the ppc_maybe_bswap_register() helper for this.
> 
> We also need to fix the ordering of the 64-bit elements according to
> the target endianness, for both system and user mode.
> 
> Signed-off-by: Greg Kurz 

What bothers me about this is that avr_need_swap() now depends on both
host and guest endianness.  However the VSCR and VRSAVE swap - like
the swaps for GPRs and FPRs - uses ppc_maybe_bswap_register() which
depends only on guest endianness.

Why does altivec depend on the host endianness?
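
For comparison, a serialization that depends only on guest endianness can be written with shifts, so the host byte order never enters the picture. This is a sketch under stated assumptions, not what the patch does: `store64()` and `store128()` are hypothetical helpers, and the register is passed in as explicit high and low halves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store a 64-bit value byte by byte in the requested order. Using
 * shifts keeps the result independent of the host's endianness. */
static void store64(uint8_t *dst, uint64_t v, bool little_endian)
{
    for (int i = 0; i < 8; i++) {
        int shift = little_endian ? 8 * i : 8 * (7 - i);
        dst[i] = (uint8_t)(v >> shift);
    }
}

/* Hypothetical helper: serialize a 128-bit register given as high and
 * low 64-bit halves. A big-endian guest sees the high half first; a
 * little-endian guest sees the low half first, each half in guest
 * byte order. */
static void store128(uint8_t *dst, uint64_t hi, uint64_t lo,
                     bool guest_little_endian)
{
    store64(dst, guest_little_endian ? lo : hi, guest_little_endian);
    store64(dst + 8, guest_little_endian ? hi : lo, guest_little_endian);
}
```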

> ---
>  target-ppc/translate_init.c |   12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 18e9e561561f..80d53e4dcf5a 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -8754,9 +8754,9 @@ static void dump_ppc_insns (CPUPPCState *env)
>  static bool avr_need_swap(CPUPPCState *env)
>  {
>  #ifdef HOST_WORDS_BIGENDIAN
> -return false;
> +return msr_le;
>  #else
> -return true;
> +return !msr_le;
>  #endif
>  }
>  
> @@ -8800,14 +8800,18 @@ static int gdb_get_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
>  stq_p(mem_buf, env->avr[n].u64[1]);
>  stq_p(mem_buf+8, env->avr[n].u64[0]);
>  }
> +ppc_maybe_bswap_register(env, mem_buf, 8);
> +ppc_maybe_bswap_register(env, mem_buf + 8, 8);
>  return 16;
>  }
>  if (n == 32) {
>  stl_p(mem_buf, env->vscr);
> +ppc_maybe_bswap_register(env, mem_buf, 4);
>  return 4;
>  }
>  if (n == 33) {
>  stl_p(mem_buf, (uint32_t)env->spr[SPR_VRSAVE]);
> +ppc_maybe_bswap_register(env, mem_buf, 4);
>  return 4;
>  }
>  return 0;
> @@ -8816,6 +8820,8 @@ static int gdb_get_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
>  static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
>  {
>  if (n < 32) {
> +ppc_maybe_bswap_register(env, mem_buf, 8);
> +ppc_maybe_bswap_register(env, mem_buf + 8, 8);
>  if (!avr_need_swap(env)) {
>  env->avr[n].u64[0] = ldq_p(mem_buf);
>  env->avr[n].u64[1] = ldq_p(mem_buf+8);
> @@ -8826,10 +8832,12 @@ static int gdb_set_avr_reg(CPUPPCState *env, uint8_t *mem_buf, int n)
>  return 16;
>  }
>  if (n == 32) {
> +ppc_maybe_bswap_register(env, mem_buf, 4);
>  env->vscr = ldl_p(mem_buf);
>  return 4;
>  }
>  if (n == 33) {
> +ppc_maybe_bswap_register(env, mem_buf, 4);
>  env->spr[SPR_VRSAVE] = (target_ulong)ldl_p(mem_buf);
>  return 4;
>  }
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH 1/7] target-ppc: kvm: fix floating point registers sync on little-endian hosts

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 04:00:12PM +0100, Greg Kurz wrote:
> On VSX capable CPUs, the 32 FP registers are mapped to the high-bits
> of the 32 first VSX registers. So if you have:
> 
> VSR31 = (uint128) 0x0102030405060708090a0b0c0d0e0f00
> 
> then
> 
> FPR31 = (uint64) 0x0102030405060708
> 
> The kernel stores the VSX registers in the fp_state struct following the
> host endian element ordering.
> 
> On big-endian:
> 
> fp_state.fpr[31][0] = 0x0102030405060708
> fp_state.fpr[31][1] = 0x090a0b0c0d0e0f00
> 
> On little-endian:
> 
> fp_state.fpr[31][0] = 0x090a0b0c0d0e0f00
> fp_state.fpr[31][1] = 0x0102030405060708
> 
> The KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls preserve this ordering, but
> QEMU considers it as big-endian and always copies element [0] to the
> fpr[] array and element [1] to the vsr[] array. This does not work with
> little-endian hosts, and you will get:
> 
> (qemu) p $f31
> 0x90a0b0c0d0e0f00
> 
> instead of:
> 
> (qemu) p $f31
> 0x102030405060708
> 
> This patch fixes the element ordering for little-endian hosts.
> 
> Signed-off-by: Greg Kurz 

If I'm understanding correctly, the only reason this bug didn't affect
things other than the gdbstub is because the get and put routines had
mirrored bugs.  So although qemu ended up with definitely wrong
information in its internal state, it reshuffled it to be right on
setting it back into KVM.

Is that correct?
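
To make the mirrored-bug reasoning concrete, here is a minimal standalone sketch (not QEMU code). The vsr_pair type and every function name below are hypothetical; they only model the two 64-bit elements exchanged per register, using the example value from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two 64-bit elements exchanged per VSX register. */
typedef struct {
    uint64_t elem[2];
} vsr_pair;

/* Lay out a 128-bit VSR (hi:lo) in host element order, as the kernel does. */
static vsr_pair kernel_layout(uint64_t hi, uint64_t lo, bool host_big_endian)
{
    vsr_pair p;
    p.elem[0] = host_big_endian ? hi : lo;
    p.elem[1] = host_big_endian ? lo : hi;
    return p;
}

/* Fixed "get": the FPR is the high half, wherever the host placed it. */
static uint64_t get_fpr_fixed(vsr_pair p, bool host_big_endian)
{
    return host_big_endian ? p.elem[0] : p.elem[1];
}

/* Pre-patch "get": always treats element [0] as the FPR. */
static uint64_t get_fpr_buggy(vsr_pair p)
{
    return p.elem[0];
}

/* Pre-patch "put": always writes the FPR back to element [0]. */
static vsr_pair put_buggy(uint64_t fpr, uint64_t other)
{
    vsr_pair p = { { fpr, other } };
    return p;
}
```

In this model, on a little-endian host get_fpr_buggy() picks up the low half of the example register, yet passing both halves back through put_buggy() reproduces the kernel's layout unchanged, which would be consistent with only the gdbstub noticing the problem.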

> ---
>  target-ppc/kvm.c |   12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 9940a9046220..45249990bda1 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -650,8 +650,13 @@ static int kvm_put_fp(CPUState *cs)
>  for (i = 0; i < 32; i++) {
>  uint64_t vsr[2];
>  
> +#ifdef HOST_WORDS_BIGENDIAN
>  vsr[0] = float64_val(env->fpr[i]);
>  vsr[1] = env->vsr[i];
> +#else
> +vsr[0] = env->vsr[i];
> +vsr[1] = float64_val(env->fpr[i]);
> +#endif
>  reg.addr = (uintptr_t) &vsr;
>  reg.id = vsx ? KVM_REG_PPC_VSR(i) : KVM_REG_PPC_FPR(i);
>  
> @@ -721,10 +726,17 @@ static int kvm_get_fp(CPUState *cs)
>  vsx ? "VSR" : "FPR", i, strerror(errno));
>  return ret;
>  } else {
> +#ifdef HOST_WORDS_BIGENDIAN
>  env->fpr[i] = vsr[0];
>  if (vsx) {
>  env->vsr[i] = vsr[1];
>  }
> +#else
> +env->fpr[i] = vsr[1];
> +if (vsx) {
> +env->vsr[i] = vsr[0];
> +}
> +#endif
>  }
>  }
>  }
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-01-17 Thread Jike Song

Hi Alex, let's continue with a new thread :)

Basically we agree with you: exposing vGPU via VFIO can make
QEMU share as much code as possible with pcidev(PF or VF) assignment.
And yes, different vGPU vendors can share quite a lot of the
QEMU part, which will do good for upper layers such as libvirt.


To achieve this, there is quite a lot to do; I'll summarize
it below. I dived into VFIO for a while but may still have
misunderstood things, so please correct me :)



First, let me illustrate my understanding of current VFIO
framework used to pass through a pcidev to guest:


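                    +-------------------------+
                    |        vfio qemu        |
                    +-+------------------+--+-+
                      |DMA               ^  |CFG
QEMU                  |map            IRQ|  |
----------------------|------------------|--|-------------
KERNEL    +-----------|------------------|--|----------+
          | VFIO      |                  |  |          |
          |           v                  |  v          |
          |  +-------------------+ +-----+-----------+ |
IOMMU     |  | vfio iommu driver | | vfio bus driver | |
API  <-------+                   | |                 | |
Layer     |  | e.g. type1        | | e.g. vfio_pci   | |
          |  +-------------------+ +-----------------+ |
          +--------------------------------------------+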


Here when a particular pcidev is passed-through to a KVM guest,
it is attached to vfio_pci driver in host, and guest memory
is mapped into IOMMU via the type1 iommu driver.


Then, the draft infrastructure of future VFIO-based vgpu:



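                  +---------------------------+
                  |         vfio qemu         |
                  +-+--------------------+--+-+
                    |DMA                 ^  |CFG
QEMU                |map              IRQ|  |
--------------------|--------------------|--|-------------
KERNEL              |                    |  |
         +----------|--------------------|--|----------+
         |VFIO      |                    |  |          |
         |          v                    |  v          |
         | +--------------------+  +-----+-----------+ |
DMA      | | vfio iommu driver  |  | vfio bus driver | |
API <------+                    |  |                 | |
Layer    | |  e.g. vfio_type2   |  |  e.g. vfio_vgpu | |
         | +--------------------+  +-----------------+ |
         |      |       ^               |      ^       |
         +------|-------|---------------|------|-------+
                |       |               |      |
                |       |               v      |
         +------|-------|------+   +-----------+-------+
         |  +---v-------+---+  |   |                   |
         |  |               |  |   |                   |
         |  |     KVMGT     |  |   |                   |
         |  |               |  |   |  host gfx driver  |
         |  +---------------+  |   |                   |
         |                     |   |                   |
         |    KVM hypervisor   |   |                   |
         +---------------------+   +-------------------+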

NOTE: vfio_type2 and vfio_vgpu are only *logically* parts
      of VFIO; they may be implemented in the KVM hypervisor
      or the host gfx driver.



Here we need to implement a new vfio IOMMU driver instead of type1,
let's call it vfio_type2 temporarily. The main difference from pcidev
assignment is, vGPU doesn't have its own DMA requester id, so it has
to share mappings with host and other vGPUs.

- type1 iommu driver maps gpa to hpa for passing through;
  whereas type2 maps iova to hpa;

- hardware iommu is always needed by type1, whereas for
  type2, hardware iommu is optional;

- type1 will invoke the low-level IOMMU API (iommu_map et al) to
  set up the IOMMU page table directly, whereas type2 doesn't (it only
  needs to invoke the higher-level DMA API, like dma_map_page);


We also need to implement a new 'bus' driver instead of vfio_pci,
let's call it vfio_vgpu temporarily:

- vfio_pci is a real pci driver, it has a probe method called
  during dev attaching; whereas vfio_vgpu is a pseudo
  driver, it won't attach any device - the GPU is always owned by
  the host gfx driver. It has to do 'probing' elsewhere, but
  still in the host gfx driver attached to the device;

- pcidev(PF or VF) attached to vfio_pci has a natural path
  in sysfs; whereas vgpu is purely a software concept:
  vfio_vgpu needs to create/destroy vgpu instances,
  maintain their paths in sysfs (e.g. "/sys/class/vgpu/intel/vgpu0")
  etc. There should be something added in a higher layer
  to do this

Re: [Qemu-devel] [PATCH 03/10] pseries: Clean up hash page table allocation error handling

2016-01-17 Thread Alexey Kardashevskiy


On 01/15/2016 11:00 PM, David Gibson wrote:

The spapr_alloc_htab() and spapr_reset_htab() functions currently handle
all errors with error_setg(&error_abort, ...).

But really, the callers are really better placed to decide on the error
handling.  So, instead make the functions use the error propagation
infrastructure.

In the callers we change to &error_fatal instead of &error_abort, since
this can be triggered by a bad configuration or kernel error rather than
indicating a programming error in qemu.

While we're at it improve the messages themselves a bit, and clean up the
indentation a little.

Signed-off-by: David Gibson 
---
  hw/ppc/spapr.c | 24 
  1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b7fd09a..d28e349 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1016,7 +1016,7 @@ static void emulate_spapr_hypercall(PowerPCCPU *cpu)
  #define CLEAN_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) &= tswap64(~HPTE64_V_HPTE_DIRTY))
  #define DIRTY_HPTE(_hpte)  ((*(uint64_t *)(_hpte)) |= tswap64(HPTE64_V_HPTE_DIRTY))

-static void spapr_alloc_htab(sPAPRMachineState *spapr)
+static void spapr_alloc_htab(sPAPRMachineState *spapr, Error **errp)
  {
  long shift;
  int index;
@@ -1031,7 +1031,8 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
   * For HV KVM, host kernel will return -ENOMEM when requested
   * HTAB size can't be allocated.
   */
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_setg_errno(errp, -shift,
+ "Error allocating KVM hash page table, try smaller maxmem");
  } else if (shift > 0) {
  /*
   * Kernel handles htab, we don't need to allocate one
@@ -1040,7 +1041,10 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
   * but we don't allow booting of such guests.
   */
  if (shift != spapr->htab_shift) {
-error_setg(&error_abort, "Failed to allocate HTAB of requested size, try with smaller maxmem");
+error_setg(errp,
+"Small allocation for KVM hash page table (%ld < %"
+PRIu32 "), try smaller maxmem",




Even though it is not in CODING_STYLE, I have not seen anyone objecting to
the kernel's very good "never break user-visible strings" rule, or rejecting
patches whose user-visible strings fail to fit the 80-character limit.





--
Alexey

Re: [Qemu-devel] [PATCH 06/10] pseries: Improve error handling in find_unknown_sysbus_device()

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 04:40:24PM +0100, Markus Armbruster wrote:
> David Gibson  writes:
> 
> > Use error_setg() to return an error instead of using an explicit exit().
> >
> > Signed-off-by: David Gibson 
> > ---
> >  hw/ppc/spapr.c | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index bb5eaa5..ddca6e6 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1106,6 +1106,7 @@ static void spapr_reset_htab(sPAPRMachineState *spapr, Error **errp)
> >  
> >  static int find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
> >  {
> > +Error **errp = opaque;
> >  bool matched = false;
> >  
> >  if (object_dynamic_cast(OBJECT(sbdev), TYPE_SPAPR_PCI_HOST_BRIDGE)) {
> > @@ -1113,9 +1114,10 @@ static int find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
> >  }
> >  
> >  if (!matched) {
> > -error_report("Device %s is not supported by this machine yet.",
> > - qdev_fw_name(DEVICE(sbdev)));
> > -exit(1);
> > +error_setg(errp,
> > +   "Device %s is not supported by this machine yet",
> > +   qdev_fw_name(DEVICE(sbdev)));
> > +return 1; /* Don't continue scanning devices */
> 
> Re the comment: really?
> 
> find_unknown_sysbus_device() gets passed to
> foreach_dynamic_sysbus_device(), which passes it on to
> find_sysbus_device().
> 
> find_sysbus_device() calls it directly for non-containers, ignoring the
> function value.
> 
> For containers, it iterates over the container's contents with
> object_child_foreach().  That function indeed stops when the callback
> returns non-zero.  However, the callback is find_sysbus_device(), not
> find_unknown_sysbus_device().
> 
> Am I confused?

No, I am.

I can't see a reasonable way to change this without assuming the error
is fatal, so I think I'll just drop this patch from the series.

> 
> >  }
> >  
> >  return 0;
> > @@ -1150,7 +1152,7 @@ static void ppc_spapr_reset(void)
> >  uint32_t rtas_limit;
> >  
> >  /* Check for unknown sysbus devices */
> > -foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
> > +foreach_dynamic_sysbus_device(find_unknown_sysbus_device, &error_fatal);
> >  
> >  /* Reset the hash table & recalc the RMA */
> >  spapr_reset_htab(spapr, &error_fatal);
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH] net: cadence_gem: check packet size in gem_recieve

2016-01-17 Thread Jason Wang



On 01/15/2016 03:00 PM, P J P wrote:
> From: Prasad J Pandit 
>
> While receiving packets in 'gem_receive' routine, if Frame Check
> Sequence(FCS) is enabled, it copies the packet into a local
> buffer without checking its size. Add check to validate packet
> length against the buffer size to avoid buffer overflow.
>
> Reported-by: Ling Liu 
> Signed-off-by: Prasad J Pandit 
> ---
>  hw/net/cadence_gem.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
> index 3639fc1..c3a273b 100644
> --- a/hw/net/cadence_gem.c
> +++ b/hw/net/cadence_gem.c
> @@ -677,10 +677,14 @@ static ssize_t gem_receive(NetClientState *nc, const uint8_t *buf, size_t size)
>  } else {
>  unsigned crc_val;
>  
> +if (size > sizeof(rxbuf) - sizeof(crc_val)) {
> +size = sizeof(rxbuf) - sizeof(crc_val);
> +}
> +bytes_to_copy = size;
> +
>  /* The application wants the FCS field, which QEMU does not provide.
>   * We must try and calculate one.
>   */
> -

Unnecessary whitespace change.

>  memcpy(rxbuf, buf, size);

We probably need more checks: is there any guarantee that size <= 2048?
If not, that needs fixing too.

Thanks

>  memset(rxbuf + size, 0, sizeof(rxbuf) - size);
>  rxbuf_ptr = rxbuf;
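
The clamp proposed in the patch can be sketched in isolation as follows. This is only an illustration under stated assumptions: RXBUF_SIZE mirrors the 2048-byte rxbuf mentioned in this thread, the FCS is taken to be four bytes, and copy_with_fcs_room() is a hypothetical helper name, not a function in cadence_gem.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RXBUF_SIZE 2048 /* assumed to match the rxbuf[2048] array */

/*
 * Copy a received frame into rxbuf, leaving room for a 4-byte FCS.
 * Oversized frames are truncated, as in the patch under review.
 * Returns the number of payload bytes actually copied.
 */
static size_t copy_with_fcs_room(uint8_t rxbuf[RXBUF_SIZE],
                                 const uint8_t *buf, size_t size)
{
    const size_t fcs_len = sizeof(uint32_t);

    if (size > RXBUF_SIZE - fcs_len) {
        size = RXBUF_SIZE - fcs_len;
    }
    memcpy(rxbuf, buf, size);
    /* Zero the tail so the FCS slot and padding start out clean. */
    memset(rxbuf + size, 0, RXBUF_SIZE - size);
    return size;
}
```

Once size is clamped this way, both the memcpy() and the memset() stay inside the buffer, which is the guarantee being asked about above.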

Re: [Qemu-devel] [Qemu-arm] [PATCH] cadence_gem: fix buffer overflow

2016-01-17 Thread Jason Wang



On 01/15/2016 02:19 PM, Peter Crosthwaite wrote:
> On Thu, Jan 14, 2016 at 2:03 AM, Peter Maydell wrote:
>> On 14 January 2016 at 09:43, Michael S. Tsirkin  wrote:
>>> gem_receive copies a packet received from network into an rxbuf[2048]
>>> array on stack, with size limited by descriptor length set by guest.  If
>>> guest is malicious and specifies a descriptor length that is too large,
>>> and should packet size exceed array size, this results in a buffer
>>> overflow.
>>>
>>> Reported-by: 刘令 
>>> Signed-off-by: Michael S. Tsirkin 
>>> ---
>>>  hw/net/cadence_gem.c | 8 
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
>>> index 3639fc1..15a0786 100644
>>> --- a/hw/net/cadence_gem.c
>>> +++ b/hw/net/cadence_gem.c
>>> @@ -862,6 +862,14 @@ static void gem_transmit(CadenceGEMState *s)
>>>  break;
>>>  }
>>>
>>> +if (tx_desc_get_length(desc) > sizeof(tx_packet) - (p - tx_packet)) {
>>> +DB_PRINT("TX descriptor @ 0x%x too large: size 0x%x space 0x%x\n",
>>> + (unsigned)packet_desc_addr,
>>> + (unsigned)tx_desc_get_length(desc),
>>> + sizeof(tx_packet) - (p - tx_packet));
>>> +break;
>>> +}
>> Is this what the real hardware does in this situation?
>> Should we log this as a guest error?
>>
> I'm not sure it is a guest error. I think its just a shortcut in the
> original implementation. I guess QEMU needs the whole packet before
> handing off to the net layer and the assumption is that the packet is
> always within 2048. I think the hardware is just going to put the data
> on the wire as it goes.

If we are not sure this is what real hardware does, dropping looks
safer than sending truncated packets on the wire.

>  The easiest solution is to realloc the buffer
> as it goes with the increasing sizes.

This could allow a possible DoS from the guest (see
cde31a0e3dc0e4ac83e454d6096350cec584adf1).

> Otherwise you could refactor the
> code to be two pass over the descriptor ring section (containing the
> packet). If we want to fix the buffer overflow more urgently, the
> correct error would be an assert().
>
> Regards,
> Peter

Let's avoid putting a guest-triggerable assert() here. The patch looks
good for fixing the issue. Refactoring could be done on top.

Thanks

>
>>> +
>>>  /* Gather this fragment of the packet from "dma memory" to our contig.
>>>   * buffer.
>>>   */
>>> --
>>> MST
>>>
>> thanks
>> -- PMM
>>
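
As a rough illustration of the "check remaining space, then drop" approach discussed above, here is a standalone sketch. The fragment type and gather_packet() are hypothetical stand-ins for the descriptor ring walk, and TX_PACKET_SIZE assumes the 2048-byte contiguous buffer mentioned in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_PACKET_SIZE 2048 /* assumed size of the contiguous buffer */

/* Hypothetical stand-in for one TX descriptor's payload. */
typedef struct {
    const uint8_t *data;
    size_t len;
} fragment;

/*
 * Gather fragments into one contiguous packet.
 * Returns the packet length, or 0 if a fragment would not fit,
 * in which case the whole packet is dropped rather than truncated.
 */
static size_t gather_packet(uint8_t out[TX_PACKET_SIZE],
                            const fragment *frags, size_t nfrags)
{
    uint8_t *p = out;
    size_t i;

    for (i = 0; i < nfrags; i++) {
        size_t space = TX_PACKET_SIZE - (size_t)(p - out);

        if (frags[i].len > space) {
            return 0; /* too large: drop instead of overflowing */
        }
        memcpy(p, frags[i].data, frags[i].len);
        p += frags[i].len;
    }
    return (size_t)(p - out);
}
```

The comparison is written against the remaining space rather than the total buffer size, so a packet built from several individually small fragments is caught as well.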

Re: [Qemu-devel] [PATCH 00/10] Cleanups to error reporting on ppc and spapr (v2)

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 04:47:53PM +0100, Markus Armbruster wrote:
> David Gibson  writes:
> 
> > Here's a new spin of my patches to clean up a bunch of error reporting
> > in the pseries machine type and target-ppc code, to better use the
> > error API.
> >
> > Once reviewed, I hope to merge this into ppc-for-2.6 shortly.
> 
> There's an error_setg(&error_abort, ...) left in spapr_drc.c.  Should
> that be converted to a straight abort()?

Maybe.  I basically ignored for now all functions which work with the
device tree.  I have some other substantial rework I hope to get
around to there, which may make local changes like this error rework
moot.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



[Qemu-devel] [PATCH 25/51] paaudio: fix playback glitches

2016-01-17 Thread Volker Rümelin

Hi,

a better way to fix the playback glitches is to use a bigger playback
buffer on pulseaudio server side. I suggest you replace your patch with
a patch like this one:

diff --git a/audio/paaudio.c b/audio/paaudio.c
index fea6071..8bd5b91 100644
--- a/audio/paaudio.c
+++ b/audio/paaudio.c
@@ -554,7 +554,7 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
  * qemu audio tick runs at 100 Hz (by default), so processing
  * data chunks worth 10 ms of sound should be a good fit.
  */
-ba.tlength = pa_usec_to_bytes (10 * 1000, &ss);
+ba.tlength = pa_usec_to_bytes (50 * 1000, &ss);
 ba.minreq = pa_usec_to_bytes (5 * 1000, &ss);
 ba.maxlength = -1;
 ba.prebuf = -1;

I tested your patch and while it really improves audio playback, I
still notice audio drop-outs. With my suggestion I experience no
playback glitches.
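
To give a feel for the buffer sizes involved, here is a rough standalone model of the time-to-bytes conversion. The helper usec_to_bytes() is hypothetical and only approximates what pa_usec_to_bytes() computes; the 44.1 kHz, stereo, 16-bit format used in the note below is an assumption for illustration, not something taken from the patch.

```c
#include <stdint.h>

/*
 * Approximate model: whole frames in the interval, times frame size.
 * This is an illustrative helper, not the PulseAudio implementation.
 */
static uint64_t usec_to_bytes(uint64_t usec, uint32_t rate,
                              uint32_t channels, uint32_t bytes_per_sample)
{
    uint64_t frames = usec * rate / 1000000u;

    return frames * channels * bytes_per_sample;
}
```

Under those assumptions, usec_to_bytes(10 * 1000, 44100, 2, 2) is 1764 bytes and usec_to_bytes(50 * 1000, 44100, 2, 2) is 8820 bytes, so the suggested change asks the server for roughly five times as much buffered audio.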

Regards,
Volker

Re: [Qemu-devel] [PATCH v5 2/5] Add Error **errp for xen_host_pci_device_get()

2016-01-17 Thread Cao jin




On 01/16/2016 12:41 AM, Eric Blake wrote:

On 01/14/2016 08:11 PM, Cao jin wrote:


   buf[rc] = 0;
-rc = qemu_strtoul(buf, &endptr, base, &value);
-if (!rc) {
-*pvalue = value;
+rc = qemu_strtoul(buf, &endptr, base, (unsigned long *)pvalue);


Ouch. Casting unsigned int * to unsigned long * and then dereferencing
it is bogus (you end up having qemu_strtoul() write beyond bounds on
platforms where long is larger than int).


Yes, I considered this issue a little. The current situation is that
the value it wants to get won't exceed 4 bytes (vendor/device ID, etc.), so
I guess that even on x86_64 (where the length of int != long) it won't break
things. So, compared with the following, which style do you prefer?


Maybe:

rc = qemu_strtoul(buf, &endptr, base, &value);
if (!rc) {
 assert(value <= UINT_MAX);
 *pvalue = value;
} else {
 report error ...
}

And maybe some of it should even be done as part of the conversion to
qemu_strtoul() in 1/5.



Thanks for the example, will give v6 soon.
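
For reference, the "parse into an unsigned long, range-check, then narrow" pattern suggested above can be sketched with plain strtoul(). This is a standalone illustration, not the qemu_strtoul() helper itself; parse_uint() is a hypothetical name, and it returns an error on overflow instead of asserting.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse an unsigned int without ever writing through a mismatched
 * pointer type. Returns 0 on success or a negative errno value.
 * Trailing characters after the number are ignored in this sketch.
 */
static int parse_uint(const char *buf, int base, unsigned int *pvalue)
{
    char *endptr;
    unsigned long value;

    errno = 0;
    value = strtoul(buf, &endptr, base);
    if (endptr == buf) {
        return -EINVAL; /* no digits consumed */
    }
    if (errno) {
        return -errno;  /* overflowed unsigned long */
    }
    if (value > UINT_MAX) {
        return -ERANGE; /* fits in long but not in int */
    }
    *pvalue = (unsigned int)value;
    return 0;
}
```

Because the wide temporary is the only object strtoul() writes to, the destination is never written beyond its bounds, whatever the relative sizes of int and long on the host.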
--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [PATCH v5 0/5] Xen PCI passthru: Convert to realize()

2016-01-17 Thread Cao jin




On 01/15/2016 10:16 PM, Stefano Stabellini wrote:

On Thu, 14 Jan 2016, Eric Blake wrote:

On 01/14/2016 09:50 AM, Stefano Stabellini wrote:

Eric,

I'll wait for your reviewed-by on the whole series before committing.


Found a bug in 2/5, up to you if you want to fix that or wait for a v6.


If Cao is happy to update and resend, I'll wait for v6.



Fine by me, I will update it :) But I guess that bug may not be a show-stopper :)

--
Yours Sincerely,

Cao jin

Re: [Qemu-devel] [PATCH 01/10] ppc: Cleanup error handling in ppc_set_compat()

2016-01-17 Thread David Gibson

On Fri, Jan 15, 2016 at 04:19:18PM +0100, Markus Armbruster wrote:
> David Gibson  writes:
> 
> > Current ppc_set_compat() returns -1 for errors, and also (unconditionally)
> > reports an error message.  The caller in h_client_architecture_support()
> > may then report it again using an outdated fprintf().
> >
> > Clean this up by using the modern error reporting mechanisms.
> >
> > Signed-off-by: David Gibson 
> > Reviewed-by: Thomas Huth 
> > ---
> >  hw/ppc/spapr.c  |  4 +---
> >  hw/ppc/spapr_hcall.c| 10 +-
> >  target-ppc/cpu.h|  2 +-
> >  target-ppc/translate_init.c | 13 +++--
> >  4 files changed, 14 insertions(+), 15 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 50e5a26..fa7a7f4 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1635,9 +1635,7 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu)
> >  }
> >  
> >  if (cpu->max_compat) {
> > -if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> > -exit(1);
> > -}
> > +ppc_set_compat(cpu, cpu->max_compat, &error_fatal);
> >  }
> >  
> >  xics_cpu_setup(spapr->icp, cpu);
> > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > index cebceea..8b0fcb3 100644
> > --- a/hw/ppc/spapr_hcall.c
> > +++ b/hw/ppc/spapr_hcall.c
> > @@ -837,7 +837,7 @@ static target_ulong cas_get_option_vector(int vector, target_ulong table)
> >  typedef struct {
> >  PowerPCCPU *cpu;
> >  uint32_t cpu_version;
> > -int ret;
> > +Error *err;
> >  } SetCompatState;
> >  
> >  static void do_set_compat(void *arg)
> > @@ -845,7 +845,7 @@ static void do_set_compat(void *arg)
> >  SetCompatState *s = arg;
> >  
> >  cpu_synchronize_state(CPU(s->cpu));
> > -s->ret = ppc_set_compat(s->cpu, s->cpu_version);
> > +ppc_set_compat(s->cpu, s->cpu_version, &s->err);
> >  }
> >  
> >  #define get_compat_level(cpuver) ( \
> > @@ -929,13 +929,13 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
> >  SetCompatState s = {
> >  .cpu = POWERPC_CPU(cs),
> >  .cpu_version = cpu_version,
> > -.ret = 0
> > +.err = NULL,
> >  };
> >  
> >  run_on_cpu(cs, do_set_compat, &s);
> >  
> > -if (s.ret < 0) {
> > -fprintf(stderr, "Unable to set compatibility mode\n");
> > +if (s.err) {
> > +error_report_err(s.err);
> >  return H_HARDWARE;
> >  }
> >  }
> > diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
> > index 9706000..b3b89e6 100644
> > --- a/target-ppc/cpu.h
> > +++ b/target-ppc/cpu.h
> > @@ -1210,7 +1210,7 @@ void ppc_store_msr (CPUPPCState *env, target_ulong value);
> >  
> >  void ppc_cpu_list (FILE *f, fprintf_function cpu_fprintf);
> >  int ppc_get_compat_smt_threads(PowerPCCPU *cpu);
> > -int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version);
> > +void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version, Error **errp);
> >  
> >  /* Time-base and decrementer management */
> >  #ifndef NO_CPU_IO_DEFS
> > diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> > index e88dc7f..d0feed0 100644
> > --- a/target-ppc/translate_init.c
> > +++ b/target-ppc/translate_init.c
> > @@ -9186,7 +9186,7 @@ int ppc_get_compat_smt_threads(PowerPCCPU *cpu)
> >  return ret;
> >  }
> >  
> > -int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
> > +void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version, Error **errp)
> >  {
> >  int ret = 0;
> >  CPUPPCState *env = &cpu->env;
> > @@ -9208,12 +9208,13 @@ int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
> >  break;
> >  }
> >  
> > -if (kvm_enabled() && kvmppc_set_compat(cpu, cpu->cpu_version) < 0) {
> > -error_report("Unable to set compatibility mode in KVM");
> > -ret = -1;
> > +if (kvm_enabled()) {
> > +ret = kvmppc_set_compat(cpu, cpu->cpu_version);
> > +if (ret < 0) {
> > +error_setg_errno(errp, -ret,
> > + "Unable to set CPU compatibility mode in KVM");
> > +}
> >  }
> > -
> > -return ret;
> >  }
> >  
> >  static gint ppc_cpu_compare_class_pvr(gconstpointer a, gconstpointer b)
> 
> The error message now includes strerror() of the ioctl's errno.  I suggest
> mentioning that in the commit message.

Done.
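
As a small illustration of that behaviour change, here is a self-contained sketch of an "errno-aware" error setter. It does not use QEMU's Error API: the Error struct, set_error_errno(), fake_kvm_set_compat() and set_compat() are all hypothetical stand-ins that only mimic the "message: strerror(errno)" shape.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error type. */
typedef struct {
    char msg[256];
} Error;

/* Record "what: strerror(os_errno)", mimicking an errno-aware setter. */
static void set_error_errno(Error *err, int os_errno, const char *what)
{
    if (!err) {
        return; /* caller is not interested in the details */
    }
    snprintf(err->msg, sizeof(err->msg), "%s: %s", what, strerror(os_errno));
}

/* Stand-in for a KVM wrapper that returns 0 or a negative errno. */
static int fake_kvm_set_compat(int should_fail)
{
    return should_fail ? -EINVAL : 0;
}

/* The caller no longer returns a status; it only fills in the error. */
static void set_compat(int should_fail, Error *err)
{
    int ret = fake_kvm_set_compat(should_fail);

    if (ret < 0) {
        set_error_errno(err, -ret,
                        "Unable to set CPU compatibility mode in KVM");
    }
}
```

With this shape the reporting site decides what to do with the message, and the text carries the underlying errno description that the old fixed string did not.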

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [Qemu-devel] [PATCH v16 04/14] vfio: make the 4 bytes aligned for capability size

2016-01-17 Thread Marcel Apfelbaum


On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

This function searches for capabilities from the end, so the last
size should be 0x100 - pos, not 0xff - pos.


Indeed, "next" should be the first address of the next capability.


Reviewed-by: Marcel Apfelbaum 
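
A standalone sketch of the size computation may help show why the starting value matters. It works on a plain byte array rather than a PCIDevice; CONFIG_SPACE_SIZE and CAP_LIST_PTR are local names chosen here to mirror PCI_CONFIG_SPACE_SIZE (0x100) and PCI_CAPABILITY_LIST (0x34), and std_cap_max_size() is a hypothetical copy for illustration only.

```c
#include <stdint.h>

#define CONFIG_SPACE_SIZE 0x100 /* mirrors PCI_CONFIG_SPACE_SIZE */
#define CAP_LIST_PTR      0x34  /* mirrors PCI_CAPABILITY_LIST */

/*
 * Maximum room a capability at 'pos' can occupy: the distance to the
 * closest capability that starts after it, or to the end of config space.
 * 'next' must be wider than 8 bits so that it can hold 0x100.
 */
static uint8_t std_cap_max_size(const uint8_t *config, uint8_t pos)
{
    uint16_t next = CONFIG_SPACE_SIZE;
    uint8_t tmp;

    for (tmp = config[CAP_LIST_PTR]; tmp; tmp = config[tmp + 1]) {
        if (tmp > pos && tmp < next) {
            next = tmp;
        }
    }
    return (uint8_t)(next - pos);
}
```

For a last capability at 0xf8 this yields 8 bytes, whereas starting from 0xff would report 7 and cut off its final byte.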



Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index a63cf85..288f2c7 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1469,7 +1469,8 @@ static void vfio_unmap_bars(VFIOPCIDevice *vdev)
   */
  static uint8_t vfio_std_cap_max_size(PCIDevice *pdev, uint8_t pos)
  {
-uint8_t tmp, next = 0xff;
+uint8_t tmp;
+uint16_t next = PCI_CONFIG_SPACE_SIZE;

  for (tmp = pdev->config[PCI_CAPABILITY_LIST]; tmp;
   tmp = pdev->config[tmp + 1]) {

Re: [Qemu-devel] [PATCH v16 01/14] vfio: extract vfio_get_hot_reset_info as a single function

2016-01-17 Thread Marcel Apfelbaum


On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

This function is used to get the devices affected by a bus reset,
so extract it here; it can then be used for AER soon.

Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 66 +++
  1 file changed, 48 insertions(+), 18 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 1fb868c..efcd3cd 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1654,6 +1654,51 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos)
  }
  }

+/*
+ * return negative with errno, return 0 on success.
+ * if success, the point of ret_info fill with the affected device reset info.
+ *
+ */
+static int vfio_get_hot_reset_info(VFIOPCIDevice *vdev,
+   struct vfio_pci_hot_reset_info **ret_info)
+{
+struct vfio_pci_hot_reset_info *info;
+int ret, count;
+
+*ret_info = NULL;
+
+info = g_malloc0(sizeof(*info));
+info->argsz = sizeof(*info);
+
+ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
+if (ret && errno != ENOSPC) {
+ret = -errno;
+goto error;
+}
+
+count = info->count;
+
+info = g_realloc(info, sizeof(*info) +
+ (count * sizeof(struct vfio_pci_dependent_device)));
+info->argsz = sizeof(*info) +
+  (count * sizeof(struct vfio_pci_dependent_device));
+
+ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
+if (ret) {
+ret = -errno;
+error_report("vfio: hot reset info failed: %m");
+goto error;
+}
+
+*ret_info = info;
+info = NULL;
+
+return 0;
+error:
+g_free(info);
+return ret;
+}
+
  static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
  {
  PCIDevice *pdev = &vdev->pdev;
@@ -1793,7 +1838,7 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *host1,
  static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  {
  VFIOGroup *group;
-struct vfio_pci_hot_reset_info *info;
+struct vfio_pci_hot_reset_info *info = NULL;
  struct vfio_pci_dependent_device *devices;
  struct vfio_pci_hot_reset *reset;
  int32_t *fds;
@@ -1805,12 +1850,8 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  vfio_pci_pre_reset(vdev);
  vdev->vbasedev.needs_reset = false;

-info = g_malloc0(sizeof(*info));
-info->argsz = sizeof(*info);
-
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
-if (ret && errno != ENOSPC) {
-ret = -errno;
+ret = vfio_get_hot_reset_info(vdev, &info);
+if (ret) {
  if (!vdev->has_pm_reset) {
  error_report("vfio: Cannot reset device %04x:%02x:%02x.%x, "
   "no available reset mechanism.", vdev->host.domain,



Hi,

I don't know how important this is; however, if the second call to ioctl
fails (in the new vfio_get_hot_reset_info function) we will get both error
messages, the last one ("no available reset mechanism") being
unnecessary/(wrong?).

You may want to move the error message to the new function (again, not sure
if it is worth doing).
Thanks,
Marcel


@@ -1819,18 +1860,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  goto out_single;
  }

-count = info->count;
-info = g_realloc(info, sizeof(*info) + (count * sizeof(*devices)));
-info->argsz = sizeof(*info) + (count * sizeof(*devices));
  devices = &info->devices[0];
-
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info);
-if (ret) {
-ret = -errno;
-error_report("vfio: hot reset info failed: %m");
-goto out_single;
-}
-
  trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name);

  /* Verify that we have all the groups required */
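
The two-call pattern in vfio_get_hot_reset_info (ask once with a minimal buffer, learn the count, then ask again with a buffer that fits) can be sketched without VFIO as follows. Everything here is a hypothetical stand-in: reset_info models only the argsz/count header plus a flexible array, and fake_query() plays the role of the ioctl, failing with ENOSPC while still reporting the count.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified model of a variable-length ioctl argument. */
typedef struct {
    uint32_t argsz;
    uint32_t count;
    uint32_t devices[];
} reset_info;

/* Stand-in for the ioctl: three devices, ENOSPC if the buffer is small. */
static int fake_query(reset_info *info)
{
    static const uint32_t known[] = { 10, 11, 12 };
    const uint32_t n = 3;

    info->count = n;
    if (info->argsz < sizeof(*info) + n * sizeof(uint32_t)) {
        errno = ENOSPC;
        return -1;
    }
    memcpy(info->devices, known, sizeof(known));
    return 0;
}

/* Returns 0 and a heap buffer in *ret_info, or a negative errno. */
static int get_reset_info(reset_info **ret_info)
{
    reset_info *info, *bigger;
    size_t size;
    int ret;

    *ret_info = NULL;
    info = calloc(1, sizeof(*info));
    if (!info) {
        return -ENOMEM;
    }
    info->argsz = sizeof(*info);

    /* First pass: ENOSPC is expected and only tells us the count. */
    if (fake_query(info) && errno != ENOSPC) {
        ret = -errno;
        free(info);
        return ret;
    }

    size = sizeof(*info) + info->count * sizeof(uint32_t);
    bigger = realloc(info, size);
    if (!bigger) {
        free(info);
        return -ENOMEM;
    }
    info = bigger;
    info->argsz = (uint32_t)size;

    /* Second pass: any failure now is a real error. */
    if (fake_query(info)) {
        ret = -errno;
        free(info);
        return ret;
    }
    *ret_info = info;
    return 0;
}
```

Only the second failure path is a genuine error in this model, which is the case the review comment above is concerned with: the caller cannot tell from the return value alone which of the two calls failed.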

Re: [Qemu-devel] [PATCH v16 02/14] vfio: squeeze out vfio_pci_do_hot_reset for support bus reset

2016-01-17 Thread Marcel Apfelbaum


On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 

Factor out vfio_pci_do_hot_reset to do a host bus reset during AER recovery.

Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 75 +++
  1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index efcd3cd..a63cf85 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1699,6 +1699,48 @@ error:
  return ret;
  }

+static int vfio_pci_do_hot_reset(VFIOPCIDevice *vdev,
+ struct vfio_pci_hot_reset_info *info)
+{
+VFIOGroup *group;
+struct vfio_pci_hot_reset *reset;
+int32_t *fds;
+int ret, i, count;
+struct vfio_pci_dependent_device *devices;
+
+/* Determine how many group fds need to be passed */
+count = 0;
+devices = &info->devices[0];
+QLIST_FOREACH(group, &vfio_group_list, next) {
+for (i = 0; i < info->count; i++) {
+if (group->groupid == devices[i].group_id) {
+count++;
+break;
+}
+}
+}
+
+reset = g_malloc0(sizeof(*reset) + (count * sizeof(*fds)));
+reset->argsz = sizeof(*reset) + (count * sizeof(*fds));
+fds = &reset->group_fds[0];
+
+/* Fill in group fds */
+QLIST_FOREACH(group, &vfio_group_list, next) {
+for (i = 0; i < info->count; i++) {
+if (group->groupid == devices[i].group_id) {
+fds[reset->count++] = group->fd;
+break;
+}
+}
+}
+
+/* Bus reset! */
+ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset);
+g_free(reset);
+
+return ret;
+}
+
  static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
  {
  PCIDevice *pdev = &vdev->pdev;
@@ -1840,9 +1882,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  VFIOGroup *group;
  struct vfio_pci_hot_reset_info *info = NULL;
  struct vfio_pci_dependent_device *devices;
-struct vfio_pci_hot_reset *reset;
-int32_t *fds;
-int ret, i, count;
+int ret, i;
  bool multi = false;

  trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi");
@@ -1921,34 +1961,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
  goto out_single;
  }

-/* Determine how many group fds need to be passed */
-count = 0;
-QLIST_FOREACH(group, &vfio_group_list, next) {
-for (i = 0; i < info->count; i++) {
-if (group->groupid == devices[i].group_id) {
-count++;
-break;
-}
-}
-}
-
-reset = g_malloc0(sizeof(*reset) + (count * sizeof(*fds)));
-reset->argsz = sizeof(*reset) + (count * sizeof(*fds));
-fds = &reset->group_fds[0];
-
-/* Fill in group fds */
-QLIST_FOREACH(group, &vfio_group_list, next) {
-for (i = 0; i < info->count; i++) {
-if (group->groupid == devices[i].group_id) {
-fds[reset->count++] = group->fd;
-break;
-}
-}
-}
-
-/* Bus reset! */
-ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset);
-g_free(reset);
+ret = vfio_pci_do_hot_reset(vdev, info);

  trace_vfio_pci_hot_reset_result(vdev->vbasedev.name,
  ret ? "%m" : "Success");




The commit message may be improved, other than that it looks OK to me.

Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel

Re: [Qemu-devel] [PATCH v16 05/14] vfio: add pcie extanded capability support

2016-01-17 Thread Marcel Apfelbaum


On 01/12/2016 04:43 AM, Cao jin wrote:

From: Chen Fan 



Hi,

I noticed a typo in the subject: extanded -> extended


For vfio pcie device, we could expose the extended capability on
PCIE bus. in order to avoid config space broken, we introduce
a copy config for parsing extended caps. and rebuild the pcie
extended config space.


Maybe we can reword this. Could someone with better English skills
advise? :)



Signed-off-by: Chen Fan 
---
  hw/vfio/pci.c | 70 ++-
  1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 288f2c7..64b0867 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1482,6 +1482,21 @@ static uint8_t vfio_std_cap_max_size(PCIDevice *pdev, 
uint8_t pos)
  return next - pos;
  }

+
+static uint16_t vfio_ext_cap_max_size(const uint8_t *config, uint16_t pos)
+{
+uint16_t tmp, next = PCIE_CONFIG_SPACE_SIZE;
+
+for (tmp = PCI_CONFIG_SPACE_SIZE; tmp;
+tmp = PCI_EXT_CAP_NEXT(pci_get_long(config + tmp))) {
+if (tmp > pos && tmp < next) {
+next = tmp;
+}
+}
+
+return next - pos;
+}


Can't we reuse vfio_std_cap_max_size here? If only the config size differs,
we can pass it as a parameter.
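
To make the comparison concrete, here is a minimal standalone sketch of the extended-capability walk with the upper bound passed in as a parameter. The names (ext_cap_max_size, ext_cap_next, get_le32, the two size macros) are hypothetical and not QEMU API. It assumes the usual PCIe layout: a 32-bit header per extended capability, with the next offset in bits 31:20. The standard list uses 8-bit next pointers instead, so sharing one helper may need more than a size parameter.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch (not QEMU macros). */
#define STD_CONFIG_SIZE 256   /* extended config space starts here */
#define EXT_CONFIG_SIZE 4096  /* end of PCIe config space */

/* Read a little-endian 32-bit value from a config space copy. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Next-capability offset: bits 31:20 of the header, dword aligned. */
static uint16_t ext_cap_next(uint32_t header)
{
    return (uint16_t)((header >> 20) & 0xffc);
}

/*
 * Distance from pos to the nearest capability that starts after it,
 * or to limit if there is none. Assumes a well-formed list without
 * loops; a malformed list would make this walk spin forever.
 */
static uint16_t ext_cap_max_size(const uint8_t *config, uint16_t pos,
                                 uint16_t limit)
{
    uint16_t tmp, next = limit;

    for (tmp = STD_CONFIG_SIZE; tmp;
         tmp = ext_cap_next(get_le32(config + tmp))) {
        if (tmp > pos && tmp < next) {
            next = tmp;
        }
    }

    return next - pos;
}
```

With two capabilities at 0x100 and 0x140, the first gets 0x40 bytes and the second gets everything up to the limit.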


+
  static void vfio_set_word_bits(uint8_t *buf, uint16_t val, uint16_t mask)
  {
  pci_set_word(buf, (pci_get_word(buf) & ~mask) | val);
@@ -1817,16 +1832,69 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, 
uint8_t pos)
  return 0;
  }

+static int vfio_add_ext_cap(VFIOPCIDevice *vdev)
+{
+PCIDevice *pdev = &vdev->pdev;
+uint32_t header;
+uint16_t cap_id, next, size;
+uint8_t cap_ver;
+uint8_t *config;
+
+/*
+ * In order to avoid breaking config space, create a copy to
+ * use for parsing extended capabilities.


It would be nice to know *how* we break the config space, or *what* will
break it; I confess that I didn't see it :(.


+ */
+config = g_memdup(pdev->config, vdev->config_size);
+
+for (next = PCI_CONFIG_SPACE_SIZE; next;
+ next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
+header = pci_get_long(config + next);
+cap_id = PCI_EXT_CAP_ID(header);
+cap_ver = PCI_EXT_CAP_VER(header);
+
+/*
+ * If it becomes important to configure extended capabilities to their
+ * actual size, use this as the default when it's something we don't
+ * recognize. Since QEMU doesn't actually handle many of the config
+ * accesses, exact size doesn't seem worthwhile.
+ */
+size = vfio_ext_cap_max_size(config, next);
+
+pcie_add_capability(pdev, cap_id, cap_ver, next, size);
+pci_set_long(dev->config + next, PCI_EXT_CAP(cap_id, cap_ver, 0));
+
+/* Use emulated next pointer to allow dropping extended caps */
+pci_long_test_and_set_mask(vdev->emulated_config_bits + next,
+   PCI_EXT_CAP_NEXT_MASK);
+}
+
+g_free(config);
+return 0;
+}
+
  static int vfio_add_capabilities(VFIOPCIDevice *vdev)
  {
  PCIDevice *pdev = &vdev->pdev;
+int ret;

  if (!(pdev->config[PCI_STATUS] & PCI_STATUS_CAP_LIST) ||
  !pdev->config[PCI_CAPABILITY_LIST]) {
  return 0; /* Nothing to add */
  }

-return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
+ret = vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
+if (ret) {
+return ret;
+}
+
+/* on PCI bus, it doesn't make sense to expose extended capabilities. */
+if (!pci_is_express(pdev) ||
+!pci_bus_is_express(pdev->bus) ||
+!pci_get_long(pdev->config + PCI_CONFIG_SPACE_SIZE)) {


I am curious about the last check, "!pci_get_long(pdev->config + 
PCI_CONFIG_SPACE_SIZE)",
can you please explain?

Thank you,
Marcel


+return 0;
+}
+
+return vfio_add_ext_cap(vdev);
  }

  static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)

Re: [Qemu-devel] [PATCH v4 2/2] change type of pci_bridge_initfn() to void

2016-01-17 Thread Marcel Apfelbaum


On 01/15/2016 04:23 AM, Cao jin wrote:

Since it can't fail. Also modify the callers.

Signed-off-by: Cao jin 
Reviewed-by: Markus Armbruster 
---
  hw/pci-bridge/i82801b11.c  | 5 +
  hw/pci-bridge/ioh3420.c| 6 +-
  hw/pci-bridge/pci_bridge_dev.c | 8 +++-
  hw/pci-bridge/xio3130_downstream.c | 6 +-
  hw/pci-bridge/xio3130_upstream.c   | 6 +-
  hw/pci-host/apb.c  | 7 +--
  hw/pci/pci_bridge.c| 3 +--
  include/hw/pci/pci_bridge.h| 2 +-
  8 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/hw/pci-bridge/i82801b11.c b/hw/pci-bridge/i82801b11.c
index 7e79bc0..b21bc2c 100644
--- a/hw/pci-bridge/i82801b11.c
+++ b/hw/pci-bridge/i82801b11.c
@@ -61,10 +61,7 @@ static int i82801b11_bridge_initfn(PCIDevice *d)
  {
  int rc;

-rc = pci_bridge_initfn(d, TYPE_PCI_BUS);
-if (rc < 0) {
-return rc;
-}
+pci_bridge_initfn(d, TYPE_PCI_BUS);

  rc = pci_bridge_ssvid_init(d, I82801ba_SSVID_OFFSET,
 I82801ba_SSVID_SVID, I82801ba_SSVID_SSID);
diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index cce2fdd..eead195 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -97,11 +97,7 @@ static int ioh3420_initfn(PCIDevice *d)
  PCIESlot *s = PCIE_SLOT(d);
  int rc;

-rc = pci_bridge_initfn(d, TYPE_PCIE_BUS);
-if (rc < 0) {
-return rc;
-}
-
+pci_bridge_initfn(d, TYPE_PCIE_BUS);
  pcie_port_init_reg(d);

  rc = pci_bridge_ssvid_init(d, IOH_EP_SSVID_OFFSET,
diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index 26aded9..bc3e1b7 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -52,10 +52,8 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
  PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev);
  int err;

-err = pci_bridge_initfn(dev, TYPE_PCI_BUS);
-if (err) {
-goto bridge_error;
-}
+pci_bridge_initfn(dev, TYPE_PCI_BUS);
+
  if (bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_SHPC_REQ)) {
  dev->config[PCI_INTERRUPT_PIN] = 0x1;
  memory_region_init(&bridge_dev->bar, OBJECT(dev), "shpc-bar",
@@ -94,7 +92,7 @@ slotid_error:
  }
  shpc_error:
  pci_bridge_exitfn(dev);
-bridge_error:
+
  return err;
  }

diff --git a/hw/pci-bridge/xio3130_downstream.c 
b/hw/pci-bridge/xio3130_downstream.c
index b3a6479..b4dd25f 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -60,11 +60,7 @@ static int xio3130_downstream_initfn(PCIDevice *d)
  PCIESlot *s = PCIE_SLOT(d);
  int rc;

-rc = pci_bridge_initfn(d, TYPE_PCIE_BUS);
-if (rc < 0) {
-return rc;
-}
-
+pci_bridge_initfn(d, TYPE_PCIE_BUS);
  pcie_port_init_reg(d);

  rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index eada582..434c8fd 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -56,11 +56,7 @@ static int xio3130_upstream_initfn(PCIDevice *d)
  PCIEPort *p = PCIE_PORT(d);
  int rc;

-rc = pci_bridge_initfn(d, TYPE_PCIE_BUS);
-if (rc < 0) {
-return rc;
-}
-
+pci_bridge_initfn(d, TYPE_PCIE_BUS);
  pcie_port_init_reg(d);

  rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR,
diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c
index 599768e..0a53137 100644
--- a/hw/pci-host/apb.c
+++ b/hw/pci-host/apb.c
@@ -634,12 +634,7 @@ static void pci_apb_set_irq(void *opaque, int irq_num, int 
level)

  static int apb_pci_bridge_initfn(PCIDevice *dev)
  {
-int rc;
-
-rc = pci_bridge_initfn(dev, TYPE_PCI_BUS);
-if (rc < 0) {
-return rc;
-}
+pci_bridge_initfn(dev, TYPE_PCI_BUS);

  /*
   * command register:
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 40c97b1..5c30795 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -332,7 +332,7 @@ void pci_bridge_reset(DeviceState *qdev)
  }

  /* default qdev initialization function for PCI-to-PCI bridge */
-int pci_bridge_initfn(PCIDevice *dev, const char *typename)
+void pci_bridge_initfn(PCIDevice *dev, const char *typename)
  {
  PCIBus *parent = dev->bus;
  PCIBridge *br = PCI_BRIDGE(dev);
@@ -378,7 +378,6 @@ int pci_bridge_initfn(PCIDevice *dev, const char *typename)
  br->windows = pci_bridge_region_init(br);
  QLIST_INIT(&sec_bus->child);
  QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
-return 0;
  }

  /* default qdev clean up function for PCI-to-PCI bridge */
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index 93b621c..ed4aff6 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -48,7 +48,7 @@ void pci_bridge_disable_base_limit(PCIDevice *dev);
  void pci_bridge_reset_reg(PCIDevice *dev);
  void

Re: [Qemu-devel] [V3 2/4] hw/core: Add AMD IO MMU to machine properties

2016-01-17 Thread Marcel Apfelbaum


On 01/14/2016 10:04 AM, David Kiarie wrote:

Add IO MMU as a string to machine properties which
is used to control whether, and which type of, IO MMU
to emulate

Signed-off-by: David Kiarie 
---
  hw/core/machine.c   | 17 +
  include/hw/boards.h |  3 ++-
  qemu-options.hx |  6 +++---
  util/qemu-config.c  |  4 ++--
  4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index c46ddc7..cb309aa 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -283,18 +283,19 @@ static void machine_set_firmware(Object *obj, const char 
*value, Error **errp)
  ms->firmware = g_strdup(value);
  }

-static bool machine_get_iommu(Object *obj, Error **errp)
+static char *machine_get_iommu(Object *obj, Error **errp)
  {
  MachineState *ms = MACHINE(obj);

-return ms->iommu;
+return g_strdup(ms->iommu);
  }

-static void machine_set_iommu(Object *obj, bool value, Error **errp)
+static void machine_set_iommu(Object *obj, const char *value, Error **errp)
  {
  MachineState *ms = MACHINE(obj);

-ms->iommu = value;
+g_free(ms->iommu);
+ms->iommu = g_strdup(value);


Hi,

The patch is looking good, the only thing I am missing
is dealing with incorrect input. We should accept only
"intel" or "amd".

Thanks,
Marcel
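
A minimal sketch of the input check suggested above, assuming the only accepted values are "intel" and "amd". The macro and function names are hypothetical; in the real setter the failure would presumably be reported through errp rather than a bool. Defining the two literals once also lets the host bridge code compare against the same names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: define the accepted values in one place. */
#define IOMMU_TYPE_INTEL "intel"
#define IOMMU_TYPE_AMD   "amd"

/* Return true only for the two IOMMU types this sketch accepts. */
static bool iommu_value_is_valid(const char *value)
{
    return value != NULL &&
           (strcmp(value, IOMMU_TYPE_INTEL) == 0 ||
            strcmp(value, IOMMU_TYPE_AMD) == 0);
}
```

A setter built on this would reject the old "on"/"off" values, so any compatibility handling for them would need to be decided separately.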


  }

  static void machine_set_suppress_vmdesc(Object *obj, bool value, Error **errp)
@@ -454,11 +455,10 @@ static void machine_initfn(Object *obj)
  object_property_set_description(obj, "firmware",
  "Firmware image",
  NULL);
-object_property_add_bool(obj, "iommu",
- machine_get_iommu,
- machine_set_iommu, NULL);
+object_property_add_str(obj, "iommu",
+machine_get_iommu, machine_set_iommu, NULL);
  object_property_set_description(obj, "iommu",
-"Set on/off to enable/disable Intel IOMMU 
(VT-d)",
+"IOMMU list",
  NULL);
  object_property_add_bool(obj, "suppress-vmdesc",
   machine_get_suppress_vmdesc,
@@ -484,6 +484,7 @@ static void machine_finalize(Object *obj)
  g_free(ms->dumpdtb);
  g_free(ms->dt_compatible);
  g_free(ms->firmware);
+g_free(ms->iommu);
  }

  bool machine_usb(MachineState *machine)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 0f30959..b119245 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -36,6 +36,7 @@ bool machine_usb(MachineState *machine);
  bool machine_kernel_irqchip_allowed(MachineState *machine);
  bool machine_kernel_irqchip_required(MachineState *machine);
  bool machine_kernel_irqchip_split(MachineState *machine);
+bool machine_amd_iommu(MachineState *machine);
  int machine_kvm_shadow_mem(MachineState *machine);
  int machine_phandle_start(MachineState *machine);
  bool machine_dump_guest_core(MachineState *machine);
@@ -126,7 +127,7 @@ struct MachineState {
  bool usb_disabled;
  bool igd_gfx_passthru;
  char *firmware;
-bool iommu;
+char *iommu;
  bool suppress_vmdesc;

  ram_addr_t ram_size;
diff --git a/qemu-options.hx b/qemu-options.hx
index 215d00d..ac327c8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -38,7 +38,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
  "kvm_shadow_mem=size of KVM shadow MMU\n"
  "dump-guest-core=on|off include guest memory in a core dump 
(default=on)\n"
  "mem-merge=on|off controls memory merge support (default: 
on)\n"
-"iommu=on|off controls emulated Intel IOMMU (VT-d) support 
(default=off)\n"
+"iommu=amd|intel enables and selects the emulated IO MMU 
(default: off)\n"
  "igd-passthru=on|off controls IGD GFX passthrough support 
(default=off)\n"
  "aes-key-wrap=on|off controls support for AES key wrapping 
(default=on)\n"
  "dea-key-wrap=on|off controls support for DEA key wrapping 
(default=on)\n"
@@ -72,8 +72,8 @@ Include guest memory in a core dump. The default is on.
  Enables or disables memory merge support. This feature, when supported by
  the host, de-duplicates identical memory pages among VMs instances
  (enabled by default).
-@item iommu=on|off
-Enables or disables emulated Intel IOMMU (VT-d) support. The default is off.
+@item iommu=intel|amd
+Enables and selects the emulated IO MMU. The default is off.
  @item aes-key-wrap=on|off
  Enables or disables AES key wrapping support on s390-ccw hosts. This feature
  controls whether AES wrapping keys will be created to allow
diff --git a/util/qemu-config.c b/util/qemu-config.c
index 687fd34..f79b98c 100644
--- a/util/qemu-config.c
+++ b/util/qemu-config.c
@@ -213,8 +213,8 @@ static QemuOptsList machine_opts = {

Re: [Qemu-devel] [V3 4/4] hw/pci-host: Emulate AMD IO MMU

2016-01-17 Thread Marcel Apfelbaum


On 01/14/2016 10:04 AM, David Kiarie wrote:

Support AMD IO MMU emulation in q35 and piix chipsets

Signed-off-by: David Kiarie 
---
  hw/pci-host/piix.c | 11 +++
  hw/pci-host/q35.c  | 16 ++--
  2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 924f0fa..19e2930 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -35,6 +35,7 @@
  #include "hw/i386/ioapic.h"
  #include "qapi/visitor.h"
  #include "qemu/error-report.h"
+#include "hw/i386/amd_iommu.h"

  /*
   * I440FX chipset data sheet.
@@ -297,6 +298,16 @@ static void i440fx_pcihost_realize(DeviceState *dev, Error 
**errp)

  sysbus_add_io(sbd, 0xcfc, &s->data_mem);
  sysbus_init_ioports(sbd, 0xcfc, 4);
+
+/* AMD IOMMU (AMD-Vi) */
+if (g_strcmp0(object_property_get_str(qdev_get_machine(), "iommu", NULL),
+  "amd") == 0) {


You can use the Machine wrapper and it will look slightly better (at least you 
get rid of the literal):
MACHINE(qdev_get_machine())->iommu  <=> object_property_get_str(qdev_get_machine(), 
"iommu", NULL)


By the way, does i440fx host work with AMD iommu?




+AMDIOMMUState *iommu_state;
+PCIDevice *iommu;
+iommu = pci_create_simple(s->bus, 0x20, TYPE_AMD_IOMMU_DEVICE);
+iommu_state = AMD_IOMMU_DEVICE(iommu);
+pci_setup_iommu(s->bus, bridge_host_amd_iommu, iommu_state);
+}
  }

  static void i440fx_realize(PCIDevice *dev, Error **errp)
diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
index 1fb4707..dd4c822 100644
--- a/hw/pci-host/q35.c
+++ b/hw/pci-host/q35.c
@@ -30,6 +30,7 @@
  #include "hw/hw.h"
  #include "hw/pci-host/q35.h"
  #include "qapi/visitor.h"
+#include "hw/i386/amd_iommu.h"

  /
   * Q35 host
@@ -505,10 +506,21 @@ static void mch_realize(PCIDevice *d, Error **errp)
   mch->pci_address_space, &mch->pam_regions[i+1],
   PAM_EXPAN_BASE + i * PAM_EXPAN_SIZE, PAM_EXPAN_SIZE);
  }
-/* Intel IOMMU (VT-d) */
-if (object_property_get_bool(qdev_get_machine(), "iommu", NULL)) {
+
+char *iommu = object_property_get_str(qdev_get_machine(), "iommu", NULL);
+
+if (g_strcmp0(iommu, "intel") == 0) {
+/* Intel IOMMU (VT-d) */
  mch_init_dmar(mch);
+} else if (g_strcmp0(iommu, "amd") == 0) {


Last thing, maybe you can define the "intel" and "amd" literals in one place,
then use them as you see fit.


Thanks,
Marcel



+AMDIOMMUState *iommu_state;
+PCIDevice *iommu;
+PCIBus *bus = PCI_BUS(qdev_get_parent_bus(DEVICE(mch)));
+iommu = pci_create_simple(bus, 0x20, TYPE_AMD_IOMMU_DEVICE);
+iommu_state = AMD_IOMMU_DEVICE(iommu);
+pci_setup_iommu(bus, bridge_host_amd_iommu, iommu_state);
  }
+g_free(iommu);
  }

  uint64_t mch_mcfg_base(void)

Re: [Qemu-devel] [PATCH] vhost-user: Slave crashes as Master unmaps vrings during guest reboot

2016-01-17 Thread Michael S. Tsirkin

On Fri, Jan 15, 2016 at 12:12:43PM -0800, Shesha Sreenivasamurthy wrote:
> Problem:
> 
> If a guest has vhost-user enabled, then on reboot vhost_virtqueue_stop
> is invoked. This unmaps vring memory mappings. However, it will not give
> any indication to the underlying DPDK slave application about it.
> Therefore, a pollmode DPDK driver tries to read the ring to check for
> packets and segfaults.

The spec currently says:
Client must start ring upon receiving a kick (that is, detecting that file
descriptor is readable) on the descriptor specified by
VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
VHOST_USER_GET_VRING_BASE.

Why isn't this sufficient?
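
For illustration, the rule quoted from the spec can be modelled on the slave side as a two-event state machine. This is only a sketch with hypothetical names (ring_state, ring_handle_event, ring_may_poll); it is not DPDK or QEMU code, and it assumes the poll loop checks the flag before touching ring memory.

```c
#include <stdbool.h>

/* Hypothetical event names for the two triggers named in the spec text. */
enum ring_event {
    RING_EVENT_KICK,            /* kick fd became readable */
    RING_EVENT_GET_VRING_BASE   /* VHOST_USER_GET_VRING_BASE received */
};

struct ring_state {
    bool started;
};

/* Start the ring on a kick, stop it on GET_VRING_BASE. */
static void ring_handle_event(struct ring_state *r, enum ring_event ev)
{
    switch (ev) {
    case RING_EVENT_KICK:
        r->started = true;
        break;
    case RING_EVENT_GET_VRING_BASE:
        r->started = false;
        break;
    }
}

/* A poll-mode loop would consult this before reading the ring. */
static bool ring_may_poll(const struct ring_state *r)
{
    return r->started;
}
```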

> Solution:
> --
> VHOST_USER_RESET_OWNER API is issued by QEMU so that DPDK slave
> application is informed that mappings will be soon gone so that
> it can take necessary steps.
> 
> Shesha Sreenivasamurthy (1):
>   vhost-user: Slave crashes as Master unmaps vrings during guest reboot
> 
>  hw/virtio/vhost.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> -- 
> 1.9.5 (Apple Git-50.3)

Re: [Qemu-devel] [PATCH] vhost-user: Slave crashes as Master unmaps vrings during guest reboot

2016-01-17 Thread Michael S. Tsirkin

On Fri, Jan 15, 2016 at 12:12:44PM -0800, Shesha Sreenivasamurthy wrote:
> Send VHOST_USER_RESET_OWNER when the device is stopped.
> 
> Signed-off-by: Shesha Sreenivasamurthy 

That's a bad commit log.  A good one should describe why changes are
made, not what they are (that can be seen from the change diff).
The cover letter is no good for that, it
should just give some introduction in case of a large patchset.

I commented on the cover letter for now since that is where
you put the motivation.

> ---
>  hw/virtio/vhost.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index de29968..808184f 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -1256,6 +1256,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, 
> VirtIODevice *vdev)
>   hdev->vq_index + i);
>  }
>  

If rings keep going at this point then the index
retrieved above will be wrong, and we will not
be able to re-start the device.

> +if (hdev->vhost_ops->vhost_reset_device(hdev) < 0) {
> +fprintf(stderr, "vhost reset device %s failed\n", vdev->name);
> +fflush(stderr);
> +}
> +

Looks more or less like a revert of
commit 12b8cbac3c8243b3dd485aaebb82547aefa06adb
Author: Yuanhan Liu 
Date:   Fri Nov 13 15:24:10 2015 +0800

vhost: don't send RESET_OWNER at stop

If you still think it's a good idea, pls
copy people signed on that patch.

>  vhost_log_put(hdev, true);
>  hdev->started = false;
>  hdev->log = NULL;
> -- 
> 1.9.5 (Apple Git-50.3)

Re: [Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-17 Thread Marcel Apfelbaum


On 01/05/2016 07:29 PM, Cédric Le Goater wrote:

Signed-off-by: Cédric Le Goater 
---
  hw/ipmi/ipmi_bmc_sim.c | 55 ++
  1 file changed, 55 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 60586a67104e..c3a06d0ac7e4 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include "sysemu/sysemu.h"
  #include "qemu/timer.h"
  #include "hw/ipmi/ipmi.h"
  #include "qemu/error-report.h"
@@ -54,6 +55,9 @@
  #define IPMI_CMD_GET_DEVICE_ID 0x01
  #define IPMI_CMD_COLD_RESET 0x02
  #define IPMI_CMD_WARM_RESET 0x03
+#define IPMI_CMD_SET_POWER_STATE  0x06
+#define IPMI_CMD_GET_POWER_STATE  0x07
+#define IPMI_CMD_GET_DEVICE_GUID  0x08
  #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
  #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
  #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
@@ -215,6 +219,9 @@ struct IPMIBmcSim {

  uint8_t restart_cause;

+uint8_t power_state[2];
+uint8_t uuid[16];
+
  IPMISel sel;
  IPMISdr sdr;
  IPMIFru fru;
@@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
  k->reset(s, false);
  }
  }
+static void set_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_CHECK_CMD_LEN(4);
+ibs->power_state[0] = cmd[2];
+ibs->power_state[1] = cmd[3];
+ out:
+return;



Hi,

I am sorry for my late comment, but I find the use of the "out" label
here a little strange.
I understand this is because of its usage in the IPMI_* macros, but
I looked into every usage (I hope I didn't miss anything) and the code
simply returns.
Also, the correlation between those macros is a little odd.
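
To show the coupling being discussed, here is a standalone sketch of a length-check macro that jumps to a label the caller must provide. The macro, constant, and function names are hypothetical and simplified from the pattern in the patch. 0xc7 is the IPMI completion code for "request data length invalid".

```c
#include <stdint.h>

/* IPMI completion code: request data length invalid. */
#define CC_REQ_LEN_INVALID 0xc7

/*
 * Hypothetical macro for this sketch. It silently depends on the
 * caller having cmd_len, rsp, and an "out" label in scope.
 */
#define CHECK_CMD_LEN(l)                     \
    do {                                     \
        if (cmd_len < (l)) {                 \
            rsp[2] = CC_REQ_LEN_INVALID;     \
            goto out;                        \
        }                                    \
    } while (0)

static void set_power_state_sketch(uint8_t state[2],
                                   const uint8_t *cmd, unsigned int cmd_len,
                                   uint8_t *rsp)
{
    CHECK_CMD_LEN(4);
    state[0] = cmd[2];
    state[1] = cmd[3];
 out:
    return;
}
```

Since the label only ever precedes a bare return, a macro that returned directly would behave the same in this sketch; whether that is preferable for the real handlers is a style decision.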

Thanks,
Marcel



+}
+
+static void get_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->power_state[0]);
+IPMI_ADD_RSP_DATA(ibs->power_state[1]);
+ out:
+return;
+}
+
+static void get_device_guid(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+unsigned int i;
+
+for (i = 0; i < 16; i++) {
+IPMI_ADD_RSP_DATA(ibs->uuid[i]);
+}
+ out:
+return;
+}

  static void set_bmc_global_enables(IPMIBmcSim *ibs,
 uint8_t *cmd, unsigned int cmd_len,
@@ -1781,6 +1824,9 @@ static const IPMICmdHandler 
app_cmds[IPMI_NETFN_APP_MAXCMD] = {
  [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
  [IPMI_CMD_COLD_RESET] = cold_reset,
  [IPMI_CMD_WARM_RESET] = warm_reset,
+[IPMI_CMD_SET_POWER_STATE] = set_power_state,
+[IPMI_CMD_GET_POWER_STATE] = get_power_state,
+[IPMI_CMD_GET_DEVICE_GUID] = get_device_guid,
  [IPMI_CMD_SET_BMC_GLOBAL_ENABLES] = set_bmc_global_enables,
  [IPMI_CMD_GET_BMC_GLOBAL_ENABLES] = get_bmc_global_enables,
  [IPMI_CMD_CLR_MSG_FLAGS] = clr_msg_flags,
@@ -1907,6 +1953,15 @@ static void ipmi_sim_init(Object *obj)
  i += len;
  }

+ibs->power_state[0] = 0;
+ibs->power_state[1] = 0;
+
+if (qemu_uuid_set) {
+memcpy(&ibs->uuid, qemu_uuid, 16);
+} else {
+memset(&ibs->uuid, 0, 16);
+}
+
  ipmi_init_sensors_from_sdrs(ibs);
  register_cmds(ibs);

Re: [Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-17 Thread Marcel Apfelbaum


On 01/17/2016 02:04 PM, Marcel Apfelbaum wrote:

On 01/05/2016 07:29 PM, Cédric Le Goater wrote:

Signed-off-by: Cédric Le Goater 
---
  hw/ipmi/ipmi_bmc_sim.c | 55 ++
  1 file changed, 55 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 60586a67104e..c3a06d0ac7e4 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -25,6 +25,7 @@
  #include 
  #include 
  #include 
+#include "sysemu/sysemu.h"
  #include "qemu/timer.h"
  #include "hw/ipmi/ipmi.h"
  #include "qemu/error-report.h"
@@ -54,6 +55,9 @@
  #define IPMI_CMD_GET_DEVICE_ID 0x01
  #define IPMI_CMD_COLD_RESET 0x02
  #define IPMI_CMD_WARM_RESET 0x03
+#define IPMI_CMD_SET_POWER_STATE  0x06
+#define IPMI_CMD_GET_POWER_STATE  0x07
+#define IPMI_CMD_GET_DEVICE_GUID  0x08
  #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
  #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
  #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
@@ -215,6 +219,9 @@ struct IPMIBmcSim {

  uint8_t restart_cause;

+uint8_t power_state[2];
+uint8_t uuid[16];
+
  IPMISel sel;
  IPMISdr sdr;
  IPMIFru fru;
@@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
  k->reset(s, false);
  }
  }
+static void set_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_CHECK_CMD_LEN(4);
+ibs->power_state[0] = cmd[2];
+ibs->power_state[1] = cmd[3];
+ out:
+return;



Hi,

I am sorry for my late comment, but I find the use of the "out" label
here a little strange.
I understand this is because of its usage in the IPMI_* macros, but
I looked into every usage (I hope I didn't miss anything) and the code
simply returns.
Also, the correlation between those macros is a little odd.


I meant the correlation between the macros and the "out" label.

Thanks,
Marcel



Thanks,
Marcel



+}
+
+static void get_power_state(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+IPMI_ADD_RSP_DATA(ibs->power_state[0]);
+IPMI_ADD_RSP_DATA(ibs->power_state[1]);
+ out:
+return;
+}
+
+static void get_device_guid(IPMIBmcSim *ibs,
+  uint8_t *cmd, unsigned int cmd_len,
+  uint8_t *rsp, unsigned int *rsp_len,
+  unsigned int max_rsp_len)
+{
+unsigned int i;
+
+for (i = 0; i < 16; i++) {
+IPMI_ADD_RSP_DATA(ibs->uuid[i]);
+}
+ out:
+return;
+}

  static void set_bmc_global_enables(IPMIBmcSim *ibs,
 uint8_t *cmd, unsigned int cmd_len,
@@ -1781,6 +1824,9 @@ static const IPMICmdHandler 
app_cmds[IPMI_NETFN_APP_MAXCMD] = {
  [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
  [IPMI_CMD_COLD_RESET] = cold_reset,
  [IPMI_CMD_WARM_RESET] = warm_reset,
+[IPMI_CMD_SET_POWER_STATE] = set_power_state,
+[IPMI_CMD_GET_POWER_STATE] = get_power_state,
+[IPMI_CMD_GET_DEVICE_GUID] = get_device_guid,
  [IPMI_CMD_SET_BMC_GLOBAL_ENABLES] = set_bmc_global_enables,
  [IPMI_CMD_GET_BMC_GLOBAL_ENABLES] = get_bmc_global_enables,
  [IPMI_CMD_CLR_MSG_FLAGS] = clr_msg_flags,
@@ -1907,6 +1953,15 @@ static void ipmi_sim_init(Object *obj)
  i += len;
  }

+ibs->power_state[0] = 0;
+ibs->power_state[1] = 0;
+
+if (qemu_uuid_set) {
+memcpy(&ibs->uuid, qemu_uuid, 16);
+} else {
+memset(&ibs->uuid, 0, 16);
+}
+
  ipmi_init_sensors_from_sdrs(ibs);
  register_cmds(ibs);

[Qemu-devel] [PATCH v6 1/6] Change xen_host_pci_sysfs_path() to return void

2016-01-17 Thread Cao jin

Also assert on snprintf() errors, because the user can do nothing if
snprintf() fails.
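
As a standalone illustration of the check (the function name build_sysfs_path is hypothetical): snprintf() returns the length it wanted to write, so a result at or above the buffer size means truncation and a negative result means an output error. Note that assert() compiles away under NDEBUG, so this only guards builds with assertions enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build a sysfs path for a PCI device attribute into buf. */
static void build_sysfs_path(char *buf, size_t size,
                             unsigned int domain, unsigned int bus,
                             unsigned int dev, unsigned int func,
                             const char *name)
{
    int rc = snprintf(buf, size,
                      "/sys/bus/pci/devices/%04x:%02x:%02x.%u/%s",
                      domain, bus, dev, func, name);

    /* rc < 0: output error; rc >= size: the path was truncated. */
    assert(rc >= 0 && (size_t)rc < size);
}
```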

Signed-off-by: Cao jin 
---
 hw/xen/xen-host-pci-device.c | 35 +++
 1 file changed, 11 insertions(+), 24 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 7d8a023..9c342e7 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -31,19 +31,14 @@
 #define IORESOURCE_PREFETCH 0x1000  /* No side effects */
 #define IORESOURCE_MEM_64   0x0010
 
-static int xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
-   const char *name, char *buf, ssize_t size)
+static void xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
+const char *name, char *buf, ssize_t size)
 {
 int rc;
 
 rc = snprintf(buf, size, "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
   d->domain, d->bus, d->dev, d->func, name);
-
-if (rc >= size || rc < 0) {
-/* The output is truncated, or some other error was encountered */
-return -ENODEV;
-}
-return 0;
+assert(rc >= 0 && rc < size);
 }
 
 
@@ -58,10 +53,8 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 char *endptr, *s;
 uint8_t type;
 
-rc = xen_host_pci_sysfs_path(d, "resource", path, sizeof (path));
-if (rc) {
-return rc;
-}
+xen_host_pci_sysfs_path(d, "resource", path, sizeof(path));
+
 fd = open(path, O_RDONLY);
 if (fd == -1) {
 XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
@@ -150,10 +143,8 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 unsigned long value;
 char *endptr;
 
-rc = xen_host_pci_sysfs_path(d, name, path, sizeof (path));
-if (rc) {
-return rc;
-}
+xen_host_pci_sysfs_path(d, name, path, sizeof(path));
+
 fd = open(path, O_RDONLY);
 if (fd == -1) {
 XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
@@ -200,21 +191,17 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice 
*d)
 char path[PATH_MAX];
 struct stat buf;
 
-if (xen_host_pci_sysfs_path(d, "physfn", path, sizeof (path))) {
-return false;
-}
+xen_host_pci_sysfs_path(d, "physfn", path, sizeof(path));
+
 return !stat(path, &buf);
 }
 
 static int xen_host_pci_config_open(XenHostPCIDevice *d)
 {
 char path[PATH_MAX];
-int rc;
 
-rc = xen_host_pci_sysfs_path(d, "config", path, sizeof (path));
-if (rc) {
-return rc;
-}
+xen_host_pci_sysfs_path(d, "config", path, sizeof(path));
+
 d->config_fd = open(path, O_RDWR);
 if (d->config_fd < 0) {
 return -errno;
-- 
2.1.0

[Qemu-devel] [PATCH v6 0/6] Xen PCI passthru: Convert to realize()

2016-01-17 Thread Cao jin

v6 changelog:
1. split modification of xen_host_pci_sysfs_path() into a separate new patch
   as 1/6 shows.
2. 'bug' fix of qemu_strtoul(), in patch 2/6 & 3/6
3. Grammar fix in patch 4/6
4. 'msg' --> 'message' in commit message.

Cao jin (6):
  Change xen_host_pci_sysfs_path() to return void
  Xen: use qemu_strtoul instead of strtol
  Add Error **errp for xen_host_pci_device_get()
  Add Error **errp for xen_pt_setup_vga()
  Add Error **errp for xen_pt_config_init()
  Xen PCI passthru: convert to realize()

 hw/xen/xen-host-pci-device.c | 149 +--
 hw/xen/xen-host-pci-device.h |   5 +-
 hw/xen/xen_pt.c  |  77 --
 hw/xen/xen_pt.h  |   5 +-
 hw/xen/xen_pt_config_init.c  |  51 ---
 hw/xen/xen_pt_graphics.c |  11 ++--
 6 files changed, 155 insertions(+), 143 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH v6 4/6] Add Error **errp for xen_pt_setup_vga()

2016-01-17 Thread Cao jin

To catch the error message. Also modify the caller

Signed-off-by: Cao jin 
Reviewed-by: Eric Blake 
---
 hw/xen/xen_pt.c  |  7 +--
 hw/xen/xen_pt.h  |  3 ++-
 hw/xen/xen_pt_graphics.c | 11 ++-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 53b5bca..07bfcec 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -808,8 +808,11 @@ static int xen_pt_initfn(PCIDevice *d)
 return -1;
 }
 
-if (xen_pt_setup_vga(s, &s->real_device) < 0) {
-XEN_PT_ERR(d, "Setup VGA BIOS of passthrough GFX failed!\n");
+xen_pt_setup_vga(s, &s->real_device, &err);
+if (err) {
+error_append_hint(&err, "Setup VGA BIOS of passthrough"
+" GFX failed");
+error_report_err(err);
 xen_host_pci_device_put(&s->real_device);
 return -1;
 }
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 3749711..26f74f8 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -330,5 +330,6 @@ static inline bool is_igd_vga_passthrough(XenHostPCIDevice 
*dev)
 }
 int xen_pt_register_vga_regions(XenHostPCIDevice *dev);
 int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev);
-int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev);
+void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
+ Error **errp);
 #endif /* !XEN_PT_H */
diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
index df6069b..e7a7c7e 100644
--- a/hw/xen/xen_pt_graphics.c
+++ b/hw/xen/xen_pt_graphics.c
@@ -161,7 +161,8 @@ struct pci_data {
 uint16_t reserved;
 } __attribute__((packed));
 
-int xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev)
+void xen_pt_setup_vga(XenPCIPassthroughState *s, XenHostPCIDevice *dev,
+ Error **errp)
 {
 unsigned char *bios = NULL;
 struct rom_header *rom;
@@ -172,13 +173,14 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, 
XenHostPCIDevice *dev)
 struct pci_data *pd = NULL;
 
 if (!is_igd_vga_passthrough(dev)) {
-return -1;
+error_setg(errp, "Need to enable igd-passthrough");
+return;
 }
 
 bios = get_vgabios(s, &bios_size, dev);
 if (!bios) {
-XEN_PT_ERR(&s->dev, "VGA: Can't getting VBIOS!\n");
-return -1;
+error_setg(errp, "VGA: Can't get VBIOS");
+return;
 }
 
 /* Currently we fixed this address as a primary. */
@@ -203,7 +205,6 @@ int xen_pt_setup_vga(XenPCIPassthroughState *s, 
XenHostPCIDevice *dev)
 
 /* Currently we fixed this address as a primary for legacy BIOS. */
 cpu_physical_memory_rw(0xc, bios, bios_size, 1);
-return 0;
 }
 
 uint32_t igd_read_opregion(XenPCIPassthroughState *s)
-- 
2.1.0

[Qemu-devel] [PATCH v6 2/6] Xen: use qemu_strtoul instead of strtol

2016-01-17 Thread Cao jin

No need to roll our own (with slightly incorrect handling of errno),
when we can use the common version.

Change signed parsing to unsigned, because the values it reads come from
PCI config space and are non-negative.
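
For reference, a checked unsigned parse in plain C could look roughly like the sketch below. This is not the implementation of qemu_strtoul(); parse_ulong is a hypothetical helper showing the errno handling that the hand-rolled strtol() code did not get right: errno must be cleared before the call and only then compared against ERANGE.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse an unsigned long from s. Returns 0 on success, -EINVAL if no
 * digits were consumed or s is NULL, -ERANGE on overflow. If endptr is
 * non-NULL it receives the position where parsing stopped.
 */
static int parse_ulong(const char *s, const char **endptr, int base,
                       unsigned long *result)
{
    char *end;

    if (!s) {
        if (endptr) {
            *endptr = NULL;
        }
        return -EINVAL;
    }

    errno = 0;  /* strtoul() only sets errno, it never clears it */
    *result = strtoul(s, &end, base);
    if (endptr) {
        *endptr = end;
    }
    if (end == s) {
        return -EINVAL;
    }
    if (errno == ERANGE) {
        return -ERANGE;
    }
    return 0;
}
```

A caller reading a sysfs value would typically also check that the character at the end position is the trailing newline.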

Signed-off-by: Cao jin 
---
 hw/xen/xen-host-pci-device.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 9c342e7..83da9c4 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -141,7 +141,7 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 char buf[XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE];
 int fd, rc;
 unsigned long value;
-char *endptr;
+const char *endptr;
 
 xen_host_pci_sysfs_path(d, name, path, sizeof(path));
 
@@ -158,13 +158,9 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, 
const char *name,
 }
 } while (rc < 0);
 buf[rc] = 0;
-value = strtol(buf, &endptr, base);
-if (endptr == buf || *endptr != '\n') {
-rc = -1;
-} else if ((value == LONG_MIN || value == LONG_MAX) && errno == ERANGE) {
-rc = -errno;
-} else {
-rc = 0;
+rc = qemu_strtoul(buf, , base, );
+if (!rc) {
+assert(value <= UINT_MAX);
 *pvalue = value;
 }
 out:
-- 
2.1.0
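
The commit message of patch 2/6 mentions errno handling around string-to-integer parsing. As an illustration only, here is a minimal sketch of the usual idiom with plain strtoul(): clear errno before the call, check that digits were consumed, then check for overflow. The name sketch_parse_ulong and its "0 or negative errno" return convention are assumptions made for this example; this is not the implementation of qemu_strtoul().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper for illustration; not QEMU code.
 * Parses an unsigned long from buf in the given base.
 * Accepts an optional trailing newline, as sysfs files typically have one.
 * Returns 0 on success, or a negative errno value on failure.
 */
int sketch_parse_ulong(const char *buf, int base, unsigned long *result)
{
    char *end;
    unsigned long value;

    errno = 0;                      /* strtoul() only sets errno on failure */
    value = strtoul(buf, &end, base);
    if (end == buf) {
        return -EINVAL;             /* no digits were consumed */
    }
    if (errno == ERANGE) {
        return -ERANGE;             /* value does not fit in unsigned long */
    }
    if (*end != '\0' && *end != '\n') {
        return -EINVAL;             /* unexpected trailing characters */
    }
    *result = value;
    return 0;
}
```

With this sketch, sketch_parse_ulong("0x8086\n", 16, &v) succeeds and stores 0x8086, while an empty string or "12abc" in base 10 yields -EINVAL. Clearing errno first matters because a stale ERANGE left over from an earlier call would otherwise be misread as an overflow.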

[Qemu-devel] [PATCH v6 3/6] Add Error **errp for xen_host_pci_device_get()

2016-01-17 Thread Cao jin

Add an Error **errp parameter so that the error message can be passed up
to the caller. Also update the caller accordingly.

Signed-off-by: Cao jin 
---
 hw/xen/xen-host-pci-device.c | 102 ---
 hw/xen/xen-host-pci-device.h |   5 ++-
 hw/xen/xen_pt.c  |  13 +++---
 3 files changed, 68 insertions(+), 52 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 83da9c4..3827ca7 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -44,7 +44,7 @@ static void xen_host_pci_sysfs_path(const XenHostPCIDevice *d,
 
 /* This size should be enough to read the first 7 lines of a resource file */
 #define XEN_HOST_PCI_RESOURCE_BUFFER_SIZE 400
-static int xen_host_pci_get_resource(XenHostPCIDevice *d)
+static void xen_host_pci_get_resource(XenHostPCIDevice *d, Error **errp)
 {
 int i, rc, fd;
 char path[PATH_MAX];
@@ -57,19 +57,18 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 
 fd = open(path, O_RDONLY);
 if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_file_open(errp, errno, path);
+return;
 }
 
 do {
-rc = read(fd, &buf, sizeof (buf) - 1);
+rc = read(fd, &buf, sizeof(buf) - 1);
 if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
 goto out;
 }
 } while (rc < 0);
 buf[rc] = 0;
-rc = 0;
 
 s = buf;
 for (i = 0; i < PCI_NUM_REGIONS; i++) {
@@ -122,20 +121,19 @@ static int xen_host_pci_get_resource(XenHostPCIDevice *d)
 d->rom.bus_flags = flags & IORESOURCE_BITS;
 }
 }
+
 if (i != PCI_NUM_REGIONS) {
-/* Invalid format or input to short */
-rc = -ENODEV;
+error_setg(errp, "Invalid format or input too short: %s", buf);
 }
 
 out:
 close(fd);
-return rc;
 }
 
 /* This size should be enough to read a long from a file */
 #define XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE 22
-static int xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
-  unsigned int *pvalue, int base)
+static void xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
+   unsigned int *pvalue, int base, Error 
**errp)
 {
 char path[PATH_MAX];
 char buf[XEN_HOST_PCI_GET_VALUE_BUFFER_SIZE];
@@ -147,39 +145,45 @@ static int xen_host_pci_get_value(XenHostPCIDevice *d, const char *name,
 
 fd = open(path, O_RDONLY);
 if (fd == -1) {
-XEN_HOST_PCI_LOG("Error: Can't open %s: %s\n", path, strerror(errno));
-return -errno;
+error_setg_file_open(errp, errno, path);
+return;
 }
+
 do {
-rc = read(fd, &buf, sizeof (buf) - 1);
+rc = read(fd, &buf, sizeof(buf) - 1);
 if (rc < 0 && errno != EINTR) {
-rc = -errno;
+error_setg_errno(errp, errno, "read err");
 goto out;
 }
 } while (rc < 0);
+
 buf[rc] = 0;
 rc = qemu_strtoul(buf, &endptr, base, &value);
 if (!rc) {
 assert(value <= UINT_MAX);
 *pvalue = value;
+} else {
+error_setg_errno(errp, -rc, "failed to parse value '%s'", buf);
 }
+
 out:
 close(fd);
-return rc;
 }
 
-static inline int xen_host_pci_get_hex_value(XenHostPCIDevice *d,
- const char *name,
- unsigned int *pvalue)
+static inline void xen_host_pci_get_hex_value(XenHostPCIDevice *d,
+  const char *name,
+  unsigned int *pvalue,
+  Error **errp)
 {
-return xen_host_pci_get_value(d, name, pvalue, 16);
+xen_host_pci_get_value(d, name, pvalue, 16, errp);
 }
 
-static inline int xen_host_pci_get_dec_value(XenHostPCIDevice *d,
- const char *name,
- unsigned int *pvalue)
+static inline void xen_host_pci_get_dec_value(XenHostPCIDevice *d,
+  const char *name,
+  unsigned int *pvalue,
+  Error **errp)
 {
-return xen_host_pci_get_value(d, name, pvalue, 10);
+xen_host_pci_get_value(d, name, pvalue, 10, errp);
 }
 
 static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
@@ -192,17 +196,16 @@ static bool xen_host_pci_dev_is_virtfn(XenHostPCIDevice *d)
 return !stat(path, );
 }
 
-static int xen_host_pci_config_open(XenHostPCIDevice *d)
+static void xen_host_pci_config_open(XenHostPCIDevice *d, Error **errp)
 {
 char path[PATH_MAX];
 
 xen_host_pci_sysfs_path(d, "config", path, sizeof(path));
 
 d->config_fd = open(path, O_RDWR);
-if (d->config_fd < 0) {
-return -errno;
+if

[Qemu-devel] [PATCH v6 6/6] Xen PCI passthru: convert to realize()

2016-01-17 Thread Cao jin

Signed-off-by: Cao jin 
Reviewed-by: Eric Blake 
---
 hw/xen/xen_pt.c | 53 -
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 9eef3df..d33221b 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -760,10 +760,10 @@ static void xen_pt_destroy(PCIDevice *d) {
 }
 /* init */
 
-static int xen_pt_initfn(PCIDevice *d)
+static void xen_pt_realize(PCIDevice *d, Error **errp)
 {
 XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
-int rc = 0;
+int i, rc = 0;
 uint8_t machine_irq = 0, scratch;
 uint16_t cmd = 0;
 int pirq = XEN_PT_UNASSIGNED_PIRQ;
@@ -781,8 +781,8 @@ static int xen_pt_initfn(PCIDevice *d)
 );
 if (err) {
 error_append_hint(&err, "Failed to \"open\" the real pci device");
-error_report_err(err);
-return -1;
+error_propagate(errp, err);
+return;
 }
 
 s->is_virtfn = s->real_device.is_virtfn;
@@ -802,19 +802,19 @@ static int xen_pt_initfn(PCIDevice *d)
 if ((s->real_device.domain == 0) && (s->real_device.bus == 0) &&
 (s->real_device.dev == 2) && (s->real_device.func == 0)) {
 if (!is_igd_vga_passthrough(&s->real_device)) {
-XEN_PT_ERR(d, "Need to enable igd-passthru if you're trying"
-   " to passthrough IGD GFX.\n");
+error_setg(errp, "Need to enable igd-passthru if you're trying"
+" to passthrough IGD GFX");
 xen_host_pci_device_put(&s->real_device);
-return -1;
+return;
 }
 
 xen_pt_setup_vga(s, &s->real_device, &err);
 if (err) {
 error_append_hint(&err, "Setup VGA BIOS of passthrough"
 " GFX failed");
-error_report_err(err);
+error_propagate(errp, err);
 xen_host_pci_device_put(&s->real_device);
-return -1;
+return;
 }
 
 /* Register ISA bridge for passthrough GFX. */
@@ -836,20 +836,19 @@ static int xen_pt_initfn(PCIDevice *d)
 /* Bind interrupt */
 rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN, &scratch);
 if (rc) {
-XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
+error_setg_errno(errp, errno, "Failed to read PCI_INTERRUPT_PIN");
 goto err_out;
 }
 if (!scratch) {
-XEN_PT_LOG(d, "no pin interrupt\n");
+error_setg(errp, "no pin interrupt");
 goto out;
 }
 
 machine_irq = s->real_device.irq;
 rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq);
-
 if (rc < 0) {
-XEN_PT_ERR(d, "Mapping machine irq %u to pirq %i failed, (err: %d)\n",
-   machine_irq, pirq, errno);
+error_setg_errno(errp, errno, "Mapping machine irq %u to"
+ " pirq %i failed", machine_irq, pirq);
 
 /* Disable PCI intx assertion (turn on bit10 of devctl) */
 cmd |= PCI_COMMAND_INTX_DISABLE;
@@ -870,8 +869,8 @@ static int xen_pt_initfn(PCIDevice *d)
PCI_SLOT(d->devfn),
e_intx);
 if (rc < 0) {
-XEN_PT_ERR(d, "Binding of interrupt %i failed! (err: %d)\n",
-   e_intx, errno);
+error_setg_errno(errp, errno, "Binding of interrupt %u failed",
+ e_intx);
 
 /* Disable PCI intx assertion (turn on bit10 of devctl) */
 cmd |= PCI_COMMAND_INTX_DISABLE;
@@ -879,8 +878,8 @@ static int xen_pt_initfn(PCIDevice *d)
 
 if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
 if (xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq)) {
-XEN_PT_ERR(d, "Unmapping of machine interrupt %i failed!"
-   " (err: %d)\n", machine_irq, errno);
+error_setg_errno(errp, errno, "Unmapping of machine"
+" interrupt %u failed", machine_irq);
 }
 }
 s->machine_irq = 0;
@@ -893,14 +892,14 @@ out:
 
 rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
 if (rc) {
-XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
+error_setg_errno(errp, errno, "Failed to read PCI_COMMAND");
 goto err_out;
 } else {
 val |= cmd;
 rc = xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val);
 if (rc) {
-XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: %d)\n",
-   val, rc);
+error_setg_errno(errp, errno, "Failed to write PCI_COMMAND"
+ " val = 0x%x", val);
 goto err_out;
 }
 }
@@ -910,15 +909,19 @@ out:
 memory_listener_register(&s->io_listener, &address_space_io);

[Qemu-devel] [PATCH v6 5/6] Add Error **errp for xen_pt_config_init()

2016-01-17 Thread Cao jin

Add an Error **errp parameter so that the error message can be passed up
to the caller. Also update the caller accordingly.

Signed-off-by: Cao jin 
Reviewed-by: Eric Blake 
---
 hw/xen/xen_pt.c |  8 ---
 hw/xen/xen_pt.h |  2 +-
 hw/xen/xen_pt_config_init.c | 51 -
 3 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 07bfcec..9eef3df 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -825,9 +825,11 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_pt_register_regions(s, &cmd);
 
 /* reinitialize each config register to be emulated */
-rc = xen_pt_config_init(s);
-if (rc) {
-XEN_PT_ERR(d, "PCI Config space initialisation failed.\n");
+xen_pt_config_init(s, &err);
+if (err) {
+error_append_hint(&err, "PCI Config space initialisation failed");
+error_report_err(err);
+rc = -1;
 goto err_out;
 }
 
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 26f74f8..c2f8e1f 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -230,7 +230,7 @@ struct XenPCIPassthroughState {
 bool listener_set;
 };
 
-int xen_pt_config_init(XenPCIPassthroughState *s);
+void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp);
 void xen_pt_config_delete(XenPCIPassthroughState *s);
 XenPTRegGroup *xen_pt_find_reg_grp(XenPCIPassthroughState *s, uint32_t address);
 XenPTReg *xen_pt_find_reg(XenPTRegGroup *reg_grp, uint32_t address);
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 185a698..81c6721 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1887,8 +1887,9 @@ static uint8_t find_cap_offset(XenPCIPassthroughState *s, uint8_t cap)
 return 0;
 }
 
-static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
-  XenPTRegGroup *reg_grp, XenPTRegInfo *reg)
+static void xen_pt_config_reg_init(XenPCIPassthroughState *s,
+   XenPTRegGroup *reg_grp, XenPTRegInfo *reg,
+   Error **errp)
 {
 XenPTReg *reg_entry;
 uint32_t data = 0;
@@ -1907,12 +1908,13 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
reg_grp->base_offset + reg->offset, &data);
 if (rc < 0) {
 g_free(reg_entry);
-return rc;
+error_setg(errp, "Init emulate register fail");
+return;
 }
 if (data == XEN_PT_INVALID_REG) {
 /* free unused BAR register entry */
 g_free(reg_entry);
-return 0;
+return;
 }
 /* Sync up the data to dev.config */
 offset = reg_grp->base_offset + reg->offset;
@@ -1930,7 +1932,8 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
 if (rc) {
 /* Serious issues when we cannot read the host values! */
 g_free(reg_entry);
-return rc;
+error_setg(errp, "Cannot read host values");
+return;
 }
 /* Set bits in emu_mask are the ones we emulate. The dev.config shall
  * contain the emulated view of the guest - therefore we flip the mask
@@ -1955,10 +1958,10 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
 val = data;
 
 if (val & ~size_mask) {
-XEN_PT_ERR(&s->dev,"Offset 0x%04x:0x%04x expands past register size(%d)!\n",
-   offset, val, reg->size);
+error_setg(errp, "Offset 0x%04x:0x%04x expands past"
+" register size (%d)", offset, val, reg->size);
 g_free(reg_entry);
-return -ENXIO;
+return;
 }
 /* This could be just pci_set_long as we don't modify the bits
  * past reg->size, but in case this routine is run in parallel or the
@@ -1978,13 +1981,12 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState *s,
 }
 /* list add register entry */
 QLIST_INSERT_HEAD(&reg_grp->reg_tbl_list, reg_entry, entries);
-
-return 0;
 }
 
-int xen_pt_config_init(XenPCIPassthroughState *s)
+void xen_pt_config_init(XenPCIPassthroughState *s, Error **errp)
 {
 int i, rc;
+Error *err = NULL;
 
 QLIST_INIT(&s->reg_grps);
 
@@ -2027,11 +2029,12 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
   reg_grp_offset,
   _grp_entry->size);
 if (rc < 0) {
-XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld, type=0x%x, rc:%d\n",
-   i, ARRAY_SIZE(xen_pt_emu_reg_grps),
+error_setg(&err, "Failed to initialize %d/%zu, type = 0x%x,"
+   " rc: %d", i, ARRAY_SIZE(xen_pt_emu_reg_grps),
xen_pt_emu_reg_grps[i].grp_type, rc);
+error_propagate(errp, err);
 xen_pt_config_delete(s);
-

Re: [Qemu-devel] [PATCH 5/8] ipmi: add ACPI power and GUID commands

2016-01-17 Thread Michael S. Tsirkin

On Sun, Jan 17, 2016 at 02:04:32PM +0200, Marcel Apfelbaum wrote:
> On 01/05/2016 07:29 PM, Cédric Le Goater wrote:
> >Signed-off-by: Cédric Le Goater 
> >---
> >  hw/ipmi/ipmi_bmc_sim.c | 55 
> > ++
> >  1 file changed, 55 insertions(+)
> >
> >diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
> >index 60586a67104e..c3a06d0ac7e4 100644
> >--- a/hw/ipmi/ipmi_bmc_sim.c
> >+++ b/hw/ipmi/ipmi_bmc_sim.c
> >@@ -25,6 +25,7 @@
> >  #include 
> >  #include 
> >  #include 
> >+#include "sysemu/sysemu.h"
> >  #include "qemu/timer.h"
> >  #include "hw/ipmi/ipmi.h"
> >  #include "qemu/error-report.h"
> >@@ -54,6 +55,9 @@
> >  #define IPMI_CMD_GET_DEVICE_ID0x01
> >  #define IPMI_CMD_COLD_RESET   0x02
> >  #define IPMI_CMD_WARM_RESET   0x03
> >+#define IPMI_CMD_SET_POWER_STATE  0x06
> >+#define IPMI_CMD_GET_POWER_STATE  0x07
> >+#define IPMI_CMD_GET_DEVICE_GUID  0x08
> >  #define IPMI_CMD_RESET_WATCHDOG_TIMER 0x22
> >  #define IPMI_CMD_SET_WATCHDOG_TIMER   0x24
> >  #define IPMI_CMD_GET_WATCHDOG_TIMER   0x25
> >@@ -215,6 +219,9 @@ struct IPMIBmcSim {
> >
> >  uint8_t restart_cause;
> >
> >+uint8_t power_state[2];
> >+uint8_t uuid[16];
> >+
> >  IPMISel sel;
> >  IPMISdr sdr;
> >  IPMIFru fru;
> >@@ -842,6 +849,42 @@ static void warm_reset(IPMIBmcSim *ibs,
> >  k->reset(s, false);
> >  }
> >  }
> >+static void set_power_state(IPMIBmcSim *ibs,
> >+  uint8_t *cmd, unsigned int cmd_len,
> >+  uint8_t *rsp, unsigned int *rsp_len,
> >+  unsigned int max_rsp_len)
> >+{
> >+IPMI_CHECK_CMD_LEN(4);
> >+ibs->power_state[0] = cmd[2];
> >+ibs->power_state[1] = cmd[3];
> >+ out:
> >+return;
> 
> 
> Hi,
> 
> I am sorry for my late comment, but I find the use of the "out" label
> here a little strange.
> I understand this is because of its usage in the IPMI_* macros, but
> I looked into every usage (I hope I didn't miss anything) and the code
> simply returns.
> Also, the correlation between those macros is a little odd.
> 
> Thanks,
> Marcel


Yes - these macros with goto out are confusing.

Please rewrite them to return bool, and put
goto out in the caller.
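
One possible shape for that rewrite, as a rough sketch: the checks become small functions that return bool, and each command handler decides for itself whether to jump to its "out" label. All sketch_* names and signatures below are hypothetical, and whatever else the existing IPMI_* macros may do besides the check itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical replacement for a length-check macro: report, do not jump. */
bool sketch_check_cmd_len(unsigned int cmd_len, unsigned int min_len)
{
    return cmd_len >= min_len;
}

/* Hypothetical replacement for an append macro: false if the buffer is full. */
bool sketch_add_rsp_data(uint8_t *rsp, unsigned int *rsp_len,
                         unsigned int max_rsp_len, uint8_t byte)
{
    if (*rsp_len >= max_rsp_len) {
        return false;
    }
    rsp[(*rsp_len)++] = byte;
    return true;
}

/* Handler in the style of set_power_state(): the goto is visible here. */
int sketch_set_power_state(uint8_t power_state[2],
                           const uint8_t *cmd, unsigned int cmd_len)
{
    int ret = -1;

    if (!sketch_check_cmd_len(cmd_len, 4)) {
        goto out;
    }
    power_state[0] = cmd[2];
    power_state[1] = cmd[3];
    ret = 0;
 out:
    return ret;
}

/* Handler in the style of get_device_guid(). */
int sketch_get_device_guid(const uint8_t uuid[16], uint8_t *rsp,
                           unsigned int *rsp_len, unsigned int max_rsp_len)
{
    int ret = -1;
    unsigned int i;

    for (i = 0; i < 16; i++) {
        if (!sketch_add_rsp_data(rsp, rsp_len, max_rsp_len, uuid[i])) {
            goto out;
        }
    }
    ret = 0;
 out:
    return ret;
}
```

The int return values exist only so the sketch can be exercised in isolation; the handlers in the patch return void. With the jump written out in the handler, a reader can see where "out" is reached without expanding any macro.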



> 
> >+}
> >+
> >+static void get_power_state(IPMIBmcSim *ibs,
> >+  uint8_t *cmd, unsigned int cmd_len,
> >+  uint8_t *rsp, unsigned int *rsp_len,
> >+  unsigned int max_rsp_len)
> >+{
> >+IPMI_ADD_RSP_DATA(ibs->power_state[0]);
> >+IPMI_ADD_RSP_DATA(ibs->power_state[1]);
> >+ out:
> >+return;
> >+}
> >+
> >+static void get_device_guid(IPMIBmcSim *ibs,
> >+  uint8_t *cmd, unsigned int cmd_len,
> >+  uint8_t *rsp, unsigned int *rsp_len,
> >+  unsigned int max_rsp_len)
> >+{
> >+unsigned int i;
> >+
> >+for (i = 0; i < 16; i++) {
> >+IPMI_ADD_RSP_DATA(ibs->uuid[i]);
> >+}
> >+ out:
> >+return;
> >+}
> >
> >  static void set_bmc_global_enables(IPMIBmcSim *ibs,
> > uint8_t *cmd, unsigned int cmd_len,
> >@@ -1781,6 +1824,9 @@ static const IPMICmdHandler 
> >app_cmds[IPMI_NETFN_APP_MAXCMD] = {
> >  [IPMI_CMD_GET_DEVICE_ID] = get_device_id,
> >  [IPMI_CMD_COLD_RESET] = cold_reset,
> >  [IPMI_CMD_WARM_RESET] = warm_reset,
> >+[IPMI_CMD_SET_POWER_STATE] = set_power_state,
> >+[IPMI_CMD_GET_POWER_STATE] = get_power_state,
> >+[IPMI_CMD_GET_DEVICE_GUID] = get_device_guid,
> >  [IPMI_CMD_SET_BMC_GLOBAL_ENABLES] = set_bmc_global_enables,
> >  [IPMI_CMD_GET_BMC_GLOBAL_ENABLES] = get_bmc_global_enables,
> >  [IPMI_CMD_CLR_MSG_FLAGS] = clr_msg_flags,
> >@@ -1907,6 +1953,15 @@ static void ipmi_sim_init(Object *obj)
> >  i += len;
> >  }
> >
> >+ibs->power_state[0] = 0;
> >+ibs->power_state[1] = 0;
> >+
> >+if (qemu_uuid_set) {
> >+memcpy(&ibs->uuid, qemu_uuid, 16);
> >+} else {
> >+memset(&ibs->uuid, 0, 16);
> >+}
> >+
> >  ipmi_init_sensors_from_sdrs(ibs);
> >  register_cmds(ibs);
> >
> >

Re: [Qemu-devel] [PATCH RESEND] softfloat: fix return type of roundAndPackFloat16

2016-01-17 Thread Aurelien Jarno

On 2016-01-15 14:21, Peter Maydell wrote:
> On 13 January 2016 at 16:03, Aurelien Jarno  wrote:
> > The roundAndPackFloat16 function should return a float16 value, not a
> > float32 one. Fix that.
> >
> > Cc: Peter Maydell 
> > Signed-off-by: Aurelien Jarno 
> > ---
> >  fpu/softfloat.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Peter, given you are working on softfloat patches, you might want to get
> > this one merged at the same time.
> >
> > diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> > index f1170fe..acc9099 100644
> > --- a/fpu/softfloat.c
> > +++ b/fpu/softfloat.c
> > @@ -3368,7 +3368,7 @@ static float16 packFloat16(flag zSign, int_fast16_t 
> > zExp, uint16_t zSig)
> >  | Binary Floating-Point Arithmetic.
> >  
> > **/
> >
> > -static float32 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
> > +static float16 roundAndPackFloat16(flag zSign, int_fast16_t zExp,
> > uint32_t zSig, flag ieee,
> > float_status *status)
> >  {
> 
> Reviewed-by: Peter Maydell 
> 
> (a harmless error in the current code but we might as well get it right).

It's harmless in the default build, but it fails to build when softfloat
type checking is activated. Unfortunately more code with the wrong type
has been added recently.
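
To illustrate why the mismatch only shows up with type checking enabled: when the float types are plain integer typedefs, the compiler converts between them silently, but when each one is wrapped in its own struct, returning the wrong one is a compile error. The following is a sketch of the struct-wrapping idea with hypothetical sketch_* names; it is not the softfloat source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct-wrapped types: distinct, so they cannot be mixed up. */
typedef struct { uint16_t v; } sketch_float16;
typedef struct { uint32_t v; } sketch_float32;

/* Wrap raw bits in the half-precision type. */
sketch_float16 sketch_make_float16(uint16_t bits)
{
    sketch_float16 f = { bits };
    return f;
}

/*
 * Pack sign (1 bit), exponent (5 bits) and fraction (10 bits) into a
 * half-precision value. If the return type here were sketch_float32,
 * the return statement below would fail to compile, because the two
 * struct types are incompatible. With plain uint16_t/uint32_t typedefs
 * the same mistake would be accepted without a diagnostic.
 */
sketch_float16 sketch_pack_float16(int sign, int exp, uint16_t sig)
{
    uint32_t bits = ((uint32_t)sign << 15) + ((uint32_t)exp << 10) + sig;

    return sketch_make_float16((uint16_t)bits);
}
```

For example, sketch_pack_float16(0, 15, 0) yields the bit pattern 0x3C00, which is 1.0 in IEEE half precision.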

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net
