date:20140905

Re: [Qemu-devel] [PATCH 02/17] ppc: avoid excessive TLB flushing

2014-09-05 Thread David Gibson

On Thu, Aug 28, 2014 at 09:35:27PM +0200, Paolo Bonzini wrote:
 Il 28/08/2014 19:30, Peter Maydell ha scritto:
  On 28 August 2014 18:14, Paolo Bonzini pbonz...@redhat.com
 wrote:
[snip]
  Does PPC hardware do lots of TLB flushes on user-kernel
  transitions, or does it have some sort of info in the TLB
  entry about whether it should match or not?
 
 The IR and DR bits simply disable paging for respectively instructions
 and data.  I suppose real hardware simply does not use the TLB when
 paging is disabled.

That's right for the most part, although IR and DR transitions are
still pretty horribly slow, due to the various synchronizations that
need to happen.

There are some other complications though.  At least POWER7 and POWER8
supports a virtual real mode area (VRMA, which is where a guest OS
sees itself as having translation off, but in fact translations
(managed by the hypervisor) are still in use.  In the (hashed) page
table the VRMA co-exists with the guest's normal translations.  I'm
not up to date with the TLB architecture and how it's impacted.  Note
that POWER chips (since POWER4?) have both the ERAT (which is what
most cpus would think of as fast but smallish TLB) and the TLB which
is a sort of L1.5 cache, which needs to be combined with the SLB to
form full translations.  I strongly suspect the ERAT does need to be
flushed on real-mode/virtual-mode transitions, but I'm less sure about
the SLB and TLB.

As yet another case, BookE (embedded) powerpc CPUs have IS/DS bits
instead of IR/DR, reflecting that instead of a true translation off
mode, they have two different address spaces.  In this case,
however, the address space is tagged into the TLB, so the whole thing
doesn't need to be flushed on address space transitions.

 IIRC each user-kernel transition disables paging, and then the kernel
 can re-enable it (optionally only on data).  So the transition is
 user-kernel unpaged-kernel paged, and the kernel unpaged-kernel paged
 part is what triggers the TLB flush.  (Something like this---Alex
 explained it to me a year ago when I asked why tlb_flush was always the
 top function in the profile of qemu-system-ppc*).
 
 Paolo
 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpwrXDe3zS5U.pgp
Description: PGP signature

Re: [Qemu-devel] [libvirt] NBD TLS support in QEMU

2014-09-05 Thread Michal Privoznik


On 03.09.2014 18:44, Stefan Hajnoczi wrote:

Hi,
QEMU offers both NBD client and server functionality.  The NBD protocol
runs unencrypted, which is a problem when the client and server
communicate over an untrusted network.


This is not problem for NBD only, but for the rest of data that qemu 
sends over  network, i.e. migration stream, VNC/SPICE, ...




The particular use case that prompted this mail is storage migration in
OpenStack.  The goal is to encrypt the NBD connection between source and
destination hosts during storage migration.

I think we can integrate TLS into the NBD protocol as an optional flag.
A quick web search does not reveal existing open source SSL/TLS NBD
implementations.  I do see a VMware NBDSSL protocol but there is no
specification so I guess it is proprietary.


In case of libvirt, we have so called tunnelled migration (both spelled 
 misspelled :P) in which libvirt opens a local ports on both src  dst 
side and then sets up secured forwarding pipe to the other side. Or a 
insecured one if user wishes so. The only problem is that when I adapted 
libvirt for NBD, I intentionally forbade NBD in tunnelled migration as 
it requires one more data stream for which libvirt migration protocol is 
not ready yet. Having saidy that, I feel that libvirt is the show 
stopper here, not QEMU.


I'm not saying that I'm against this. I've heard rumors that not 
everybody out there uses libvirt and thus they might appreciate this 
ability.




The NBD protocol starts with a negotiation phase.  This would be the
appropriate place to indicate that TLS will be used.  After client and
server complete TLS setup the connection can continue as normal.


Yep, that's how most of the secured protocols run. Somebody mentions 
STARTTLS for which I vote as well.




Besides QEMU, the userspace NBD tools (http://nbd.sf.net/) can also be
extended to support TLS.  In this case the kernel needs a localhost
socket and userspace handles TLS.


Michal

[Qemu-devel] [PATCH] target-ppc: Fix kvmppc_set_compat to use negotiated cpu-version

2014-09-05 Thread Alexey Kardashevskiy

By mistake, QEMU uses the maximum compatibility level from the command
line instead of the value negotiated in client-architecture-support call.

This replaces @max_compat with @cpu_version. This only affects guests
which do not support the host CPU.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 target-ppc/translate_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 48177ed..188b67e 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -9137,7 +9137,7 @@ int ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
 break;
 }
 
-if (kvm_enabled()  kvmppc_set_compat(cpu, cpu-max_compat)  0) {
+if (kvm_enabled()  kvmppc_set_compat(cpu, cpu-cpu_version)  0) {
 error_report(Unable to set compatibility mode in KVM);
 ret = -1;
 }
-- 
2.0.0

Re: [Qemu-devel] [Qemu-ppc] [PATCH 02/17] ppc: avoid excessive TLB flushing

2014-09-05 Thread Alexander Graf



On 28.08.14 19:14, Paolo Bonzini wrote:
 PowerPC TCG flushes the TLB on every IR/DR change, which basically
 means on every user-kernel context switch.  Use the 6-element
 TLB array as a cache, where each MMU index is mapped to a different
 state of the IR/DR/PR/HV bits.
 
 This brings the number of TLB flushes down from ~90 to ~5
 for starting up the Debian installer, which is in line with x86
 and gives a ~10% performance improvement.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  cputlb.c| 19 +
  hw/ppc/spapr_hcall.c|  6 +-
  include/exec/exec-all.h |  5 +
  target-ppc/cpu.h|  4 +++-
  target-ppc/excp_helper.c|  6 +-
  target-ppc/helper_regs.h| 52 
 +++--
  target-ppc/translate_init.c |  5 +
  7 files changed, 74 insertions(+), 23 deletions(-)
 
 diff --git a/cputlb.c b/cputlb.c
 index afd3705..17e1b03 100644
 --- a/cputlb.c
 +++ b/cputlb.c
 @@ -67,6 +67,25 @@ void tlb_flush(CPUState *cpu, int flush_global)
  tlb_flush_count++;
  }
  
 +void tlb_flush_idx(CPUState *cpu, int mmu_idx)
 +{
 +CPUArchState *env = cpu-env_ptr;
 +
 +#if defined(DEBUG_TLB)
 +printf(tlb_flush_idx %d:\n, mmu_idx);
 +#endif
 +/* must reset current TB so that interrupts cannot modify the
 +   links while we are modifying them */
 +cpu-current_tb = NULL;
 +
 +memset(env-tlb_table[mmu_idx], -1, sizeof(env-tlb_table[mmu_idx]));
 +memset(cpu-tb_jmp_cache, 0, sizeof(cpu-tb_jmp_cache));
 +
 +env-tlb_flush_addr = -1;
 +env-tlb_flush_mask = 0;
 +tlb_flush_count++;
 +}
 +
  static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
  {
  if (addr == (tlb_entry-addr_read 
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 467858c..b95961c 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -556,13 +556,17 @@ static target_ulong h_cede(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
  {
  CPUPPCState *env = cpu-env;
  CPUState *cs = CPU(cpu);
 +bool flush;
  
  env-msr |= (1ULL  MSR_EE);
 -hreg_compute_hflags(env);
 +flush = hreg_compute_hflags(env);
  if (!cpu_has_work(cs)) {
  cs-halted = 1;
  cs-exception_index = EXCP_HLT;
  cs-exit_request = 1;
 +} else if (flush) {
 +cs-interrupt_request |= CPU_INTERRUPT_EXITTB;
 +cs-exit_request = 1;

Can this ever happen?

  }
  return H_SUCCESS;
  }
 diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
 index 5e5d86e..629a550 100644
 --- a/include/exec/exec-all.h
 +++ b/include/exec/exec-all.h
 @@ -100,6 +100,7 @@ void tcg_cpu_address_space_init(CPUState *cpu, 
 AddressSpace *as);
  /* cputlb.c */
  void tlb_flush_page(CPUState *cpu, target_ulong addr);
  void tlb_flush(CPUState *cpu, int flush_global);
 +void tlb_flush_idx(CPUState *cpu, int mmu_idx);
  void tlb_set_page(CPUState *cpu, target_ulong vaddr,
hwaddr paddr, int prot,
int mmu_idx, target_ulong size);
 @@ -112,6 +113,10 @@ static inline void tlb_flush_page(CPUState *cpu, 
 target_ulong addr)
  static inline void tlb_flush(CPUState *cpu, int flush_global)
  {
  }
 +
 +static inline void tlb_flush_idx(CPUState *cpu, int mmu_idx)
 +{
 +}
  #endif
  
  #define CODE_GEN_ALIGN   16 /* must be = of the size of a icache 
 line */
 diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
 index b64c652..c1cb27f 100644
 --- a/target-ppc/cpu.h
 +++ b/target-ppc/cpu.h
 @@ -922,7 +922,7 @@ struct ppc_segment_page_sizes {
  
  
 /*/
  /* The whole PowerPC CPU context */
 -#define NB_MMU_MODES 3
 +#define NB_MMU_MODES 6
  
  #define PPC_CPU_OPCODES_LEN 0x40
  
 @@ -1085,6 +1085,8 @@ struct CPUPPCState {
  target_ulong hflags;  /* hflags is a MSR  HFLAGS_MASK */
  target_ulong hflags_nmsr; /* specific hflags, not coming from MSR */
  int mmu_idx; /* precomputed MMU index to speed up mem accesses */
 +uint32_t mmu_msr[NB_MMU_MODES];  /* ir/dr/hv/pr values for TLBs */
 +int mmu_fifo;  /* for replacement in mmu_msr */
  
  /* Power management */
  int (*check_pow)(CPUPPCState *env);
 diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c
 index be71590..bf25d44 100644
 --- a/target-ppc/excp_helper.c
 +++ b/target-ppc/excp_helper.c
 @@ -623,9 +623,6 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
 excp_model, int excp)
  
  if (env-spr[SPR_LPCR]  LPCR_AIL) {
  new_msr |= (1  MSR_IR) | (1  MSR_DR);
 -} else if (msr  ((1  MSR_IR) | (1  MSR_DR))) {
 -/* If we disactivated any translation, flush TLBs */
 -tlb_flush(cs, 1);
  }
  
  #ifdef TARGET_PPC64
 @@ -678,8 +675,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
 excp_model, int excp)
  if ((env-mmu_model == POWERPC_MMU_BOOKE) ||
  (env-mmu_model ==

Re: [Qemu-devel] [Qemu-ppc] [PATCH 03/17] ppc: fix monitor access to CR

2014-09-05 Thread Alexander Graf



On 03.09.14 20:21, Tom Musta wrote:
 On 8/28/2014 12:14 PM, Paolo Bonzini wrote:
 This was off-by-one.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  monitor.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/monitor.c b/monitor.c
 index 34cee74..ec73dd4 100644
 --- a/monitor.c
 +++ b/monitor.c
 @@ -2968,7 +2968,7 @@ static target_long monitor_get_ccr (const struct 
 MonitorDef *md, int val)
  
  u = 0;
  for (i = 0; i  8; i++)
 -u |= env-crf[i]  (32 - (4 * i));
 +u |= env-crf[i]  (32 - (4 * (i + 1)));
  
  return u;
  }

 
 Reviewed-by: Tom Musta tommu...@gmail.com
 

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH 06/17] ppc: use CRF_* in int_helper.c

2014-09-05 Thread Alexander Graf



On 03.09.14 20:28, Tom Musta wrote:
 On 8/28/2014 12:15 PM, Paolo Bonzini wrote:
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  target-ppc/int_helper.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

 diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
 index f6e8846..9c1c5cd 100644
 --- a/target-ppc/int_helper.c
 +++ b/target-ppc/int_helper.c
 @@ -2303,25 +2303,25 @@ uint32_t helper_bcdadd(ppc_avr_t *r,  ppc_avr_t *a, 
 ppc_avr_t *b, uint32_t ps)
  if (sgna == sgnb) {
  result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgna, ps);
  zero = bcd_add_mag(result, a, b, invalid, overflow);
 -cr = (sgna  0) ? 4 : 8;
 +cr = (sgna  0) ? 1  CRF_GT : 1  CRF_LT;
  } else if (bcd_cmp_mag(a, b)  0) {
  result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgna, ps);
  zero = bcd_sub_mag(result, a, b, invalid, overflow);
 -cr = (sgna  0) ? 4 : 8;
 +cr = (sgna  0) ? 1  CRF_GT : 1  CRF_LT;
  } else {
  result.u8[BCD_DIG_BYTE(0)] = bcd_preferred_sgn(sgnb, ps);
  zero = bcd_sub_mag(result, b, a, invalid, overflow);
 -cr = (sgnb  0) ? 4 : 8;
 +cr = (sgnb  0) ? 1  CRF_GT : 1  CRF_LT;
  }
  }
  
  if (unlikely(invalid)) {
  result.u64[HI_IDX] = result.u64[LO_IDX] = -1;
 -cr = 1;
 +cr = 1  CRF_SO;
  } else if (overflow) {
 -cr |= 1;
 +cr |= 1  CRF_SO;
  } else if (zero) {
 -cr = 2;
 +cr = 1  CRF_EQ;
  }
  
  *r = result;

 
 Reviewed-by: Tom Musta tommu...@gmail.com
 Tested-by: Tom Musta tommu...@gmail.com

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [PATCH 2/3] s390x/css: support format-0 ccws

2014-09-05 Thread Christian Borntraeger

On 05/09/14 00:29, Alexander Graf wrote:
 
 
 On 04.09.14 17:32, Jens Freimann wrote:
 From: Cornelia Huck cornelia.h...@de.ibm.com

 Add support for format-0 ccws in channel programs. As a format-1 ccw
 contains the same information as format-0 ccws, only supporting larger
 addresses, simply convert every ccw to format-1 as we walk the chain.

 Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
 Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
 ---
  hw/s390x/css.c| 30 +-
  hw/s390x/css.h|  1 +
  target-s390x/ioinst.h | 10 ++
  3 files changed, 32 insertions(+), 9 deletions(-)

 diff --git a/hw/s390x/css.c b/hw/s390x/css.c
 index 49c2aaf..34637cb 100644
 --- a/hw/s390x/css.c
 +++ b/hw/s390x/css.c
 @@ -243,17 +243,25 @@ static void copy_sense_id_to_guest(SenseId *dest, 
 SenseId *src)
  }
  }
  
 -static CCW1 copy_ccw_from_guest(hwaddr addr)
 +static CCW1 copy_ccw_from_guest(hwaddr addr, bool fmt1)
  {
 -CCW1 tmp;
 +CCW0 tmp0;
 +CCW1 tmp1;
  CCW1 ret;
  
 -cpu_physical_memory_read(addr, tmp, sizeof(tmp));
 -ret.cmd_code = tmp.cmd_code;
 -ret.flags = tmp.flags;
 -ret.count = be16_to_cpu(tmp.count);
 -ret.cda = be32_to_cpu(tmp.cda);
 -
 +if (fmt1) {
 +cpu_physical_memory_read(addr, tmp1, sizeof(tmp1));
 +ret.cmd_code = tmp1.cmd_code;
 +ret.flags = tmp1.flags;
 +ret.count = be16_to_cpu(tmp1.count);
 +ret.cda = be32_to_cpu(tmp1.cda);
 +} else {
 +cpu_physical_memory_read(addr, tmp0, sizeof(tmp0));
 +ret.cmd_code = tmp0.cmd_code;
 +ret.flags = tmp0.flags;
 +ret.count = be16_to_cpu(tmp0.count);
 +ret.cda = be16_to_cpu(tmp0.cda1) | (tmp0.cda0  16);
 +}
  return ret;
  }
  
 @@ -268,7 +276,8 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr 
 ccw_addr)
  return -EIO;
  }
  
 -ccw = copy_ccw_from_guest(ccw_addr);
 +/* Translate everything to format-1 ccws - the information is the same. 
 */
 +ccw = copy_ccw_from_guest(ccw_addr, sch-ccw_fmt_1);
  
  /* Check for invalid command codes. */
  if ((ccw.cmd_code  0x0f) == 0) {
 @@ -386,6 +395,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB 
 *orb)
  s-ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND);
  return;
  }
 +sch-ccw_fmt_1 = !!(orb-ctrl0  ORB_CTRL0_MASK_FMT);
  } else {
  s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND);
  }
 @@ -1347,6 +1357,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f)
  qemu_put_byte(f, s-id.ciw[i].command);
  qemu_put_be16(f, s-id.ciw[i].count);
  }
 +qemu_put_byte(f, s-ccw_fmt_1);
 
 This changes the migration stream format. Please increase the version
 number for the device that gets migrated, so that we have the chance to
 catch it.
 
 Though - does migration work at all yet? :)

Not yet.
Myself and Jens are currently testing Davids latest patches regarding CPU 
states. (just chasing the final (tm) bug). 
After that I can push the initial cpu migration patch set.
So maybe we can leave this as is? Whatever you suggest.


 
 
 Alex
 
  return;
  }
  
 @@ -1402,6 +1413,7 @@ int subch_device_load(SubchDev *s, QEMUFile *f)
  s-id.ciw[i].command = qemu_get_byte(f);
  s-id.ciw[i].count = qemu_get_be16(f);
  }
 +s-ccw_fmt_1 = qemu_get_byte(f);
  return 0;
  }
  
 diff --git a/hw/s390x/css.h b/hw/s390x/css.h
 index c864ea7..384a455 100644
 --- a/hw/s390x/css.h
 +++ b/hw/s390x/css.h
 @@ -76,6 +76,7 @@ struct SubchDev {
  hwaddr channel_prog;
  CCW1 last_cmd;
  bool last_cmd_valid;
 +bool ccw_fmt_1;
  bool thinint_active;
  /* transport-provided data: */
  int (*ccw_cb) (SubchDev *, CCW1);
 diff --git a/target-s390x/ioinst.h b/target-s390x/ioinst.h
 index 5bbc67d..29f6423 100644
 --- a/target-s390x/ioinst.h
 +++ b/target-s390x/ioinst.h
 @@ -156,6 +156,16 @@ typedef struct ORB {
  #define ORB_CTRL1_MASK_ORBX 0x01
  #define ORB_CTRL1_MASK_INVALID 0x3e
  
 +/* channel command word (type 0) */
 +typedef struct CCW0 {
 +uint8_t cmd_code;
 +uint8_t cda0;
 +uint16_t cda1;
 +uint8_t flags;
 +uint8_t reserved;
 +uint16_t count;
 +} QEMU_PACKED CCW0;
 +
  /* channel command word (type 1) */
  typedef struct CCW1 {
  uint8_t cmd_code;

Re: [Qemu-devel] [PATCH 2/3] s390x/css: support format-0 ccws

2014-09-05 Thread Alexander Graf



On 05.09.14 09:23, Christian Borntraeger wrote:
 On 05/09/14 00:29, Alexander Graf wrote:


 On 04.09.14 17:32, Jens Freimann wrote:
 From: Cornelia Huck cornelia.h...@de.ibm.com

 Add support for format-0 ccws in channel programs. As a format-1 ccw
 contains the same information as format-0 ccws, only supporting larger
 addresses, simply convert every ccw to format-1 as we walk the chain.

 Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
 Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
 ---
  hw/s390x/css.c| 30 +-
  hw/s390x/css.h|  1 +
  target-s390x/ioinst.h | 10 ++
  3 files changed, 32 insertions(+), 9 deletions(-)

 diff --git a/hw/s390x/css.c b/hw/s390x/css.c
 index 49c2aaf..34637cb 100644
 --- a/hw/s390x/css.c
 +++ b/hw/s390x/css.c
 @@ -243,17 +243,25 @@ static void copy_sense_id_to_guest(SenseId *dest, 
 SenseId *src)
  }
  }
  
 -static CCW1 copy_ccw_from_guest(hwaddr addr)
 +static CCW1 copy_ccw_from_guest(hwaddr addr, bool fmt1)
  {
 -CCW1 tmp;
 +CCW0 tmp0;
 +CCW1 tmp1;
  CCW1 ret;
  
 -cpu_physical_memory_read(addr, tmp, sizeof(tmp));
 -ret.cmd_code = tmp.cmd_code;
 -ret.flags = tmp.flags;
 -ret.count = be16_to_cpu(tmp.count);
 -ret.cda = be32_to_cpu(tmp.cda);
 -
 +if (fmt1) {
 +cpu_physical_memory_read(addr, tmp1, sizeof(tmp1));
 +ret.cmd_code = tmp1.cmd_code;
 +ret.flags = tmp1.flags;
 +ret.count = be16_to_cpu(tmp1.count);
 +ret.cda = be32_to_cpu(tmp1.cda);
 +} else {
 +cpu_physical_memory_read(addr, tmp0, sizeof(tmp0));
 +ret.cmd_code = tmp0.cmd_code;
 +ret.flags = tmp0.flags;
 +ret.count = be16_to_cpu(tmp0.count);
 +ret.cda = be16_to_cpu(tmp0.cda1) | (tmp0.cda0  16);
 +}
  return ret;
  }
  
 @@ -268,7 +276,8 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr 
 ccw_addr)
  return -EIO;
  }
  
 -ccw = copy_ccw_from_guest(ccw_addr);
 +/* Translate everything to format-1 ccws - the information is the 
 same. */
 +ccw = copy_ccw_from_guest(ccw_addr, sch-ccw_fmt_1);
  
  /* Check for invalid command codes. */
  if ((ccw.cmd_code  0x0f) == 0) {
 @@ -386,6 +395,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB 
 *orb)
  s-ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND);
  return;
  }
 +sch-ccw_fmt_1 = !!(orb-ctrl0  ORB_CTRL0_MASK_FMT);
  } else {
  s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND);
  }
 @@ -1347,6 +1357,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f)
  qemu_put_byte(f, s-id.ciw[i].command);
  qemu_put_be16(f, s-id.ciw[i].count);
  }
 +qemu_put_byte(f, s-ccw_fmt_1);

 This changes the migration stream format. Please increase the version
 number for the device that gets migrated, so that we have the chance to
 catch it.

 Though - does migration work at all yet? :)
 
 Not yet.
 Myself and Jens are currently testing Davids latest patches regarding CPU 
 states. (just chasing the final (tm) bug). 
 After that I can push the initial cpu migration patch set.
 So maybe we can leave this as is? Whatever you suggest.

Well, if migration doesn't work at all yet I think it's ok to consider
all of the state in flux.

However, please make sure that once you have migration work for the
first time, that any change like this has to also increase the version
number of the device migration stream.


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH 07/17] ppc: fix result of DLMZB when no zero bytes are found

2014-09-05 Thread Alexander Graf



On 03.09.14 20:28, Tom Musta wrote:
 On 8/28/2014 12:15 PM, Paolo Bonzini wrote:
 It must return 8 and place 8 in XER, but the current code uses
 i directly which is 9 at this point of the code.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  target-ppc/int_helper.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
 index 9c1c5cd..7955bf7 100644
 --- a/target-ppc/int_helper.c
 +++ b/target-ppc/int_helper.c
 @@ -2573,6 +2573,7 @@ target_ulong helper_dlmzb(CPUPPCState *env, 
 target_ulong high,
  }
  i++;
  }
 +i = 8;
  if (update_Rc) {
  env-crf[0] = 0x2;
  }

 
 Reviewed-by: Tom Musta tommu...@gmail.com

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH 11/17] ppc: rename gen_set_cr6_from_fpscr

2014-09-05 Thread Alexander Graf



On 03.09.14 21:41, Tom Musta wrote:
 On 8/28/2014 12:15 PM, Paolo Bonzini wrote:
 It sets CR1, not CR6 (and the spec agrees).

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  target-ppc/translate.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

 diff --git a/target-ppc/translate.c b/target-ppc/translate.c
 index 8def0ae..67f13f7 100644
 --- a/target-ppc/translate.c
 +++ b/target-ppc/translate.c
 @@ -8179,7 +8179,7 @@ static inline TCGv_ptr gen_fprp_ptr(int reg)
  }
  
  #if defined(TARGET_PPC64)
 -static void gen_set_cr6_from_fpscr(DisasContext *ctx)
 +static void gen_set_cr1_from_fpscr(DisasContext *ctx)
  {
  TCGv_i32 tmp = tcg_temp_new_i32();
  tcg_gen_trunc_tl_i32(tmp, cpu_fpscr);
 @@ -8187,7 +8187,7 @@ static void gen_set_cr6_from_fpscr(DisasContext *ctx)
  tcg_temp_free_i32(tmp);
  }
  #else
 -static void gen_set_cr6_from_fpscr(DisasContext *ctx)
 +static void gen_set_cr1_from_fpscr(DisasContext *ctx)
  {
  gen_op_mtcr(4, cpu_fpscr, 28);
  }
 @@ -8207,7 +8207,7 @@ static void gen_##name(DisasContext *ctx)\
  rb = gen_fprp_ptr(rB(ctx-opcode));  \
  gen_helper_##name(cpu_env, rd, ra, rb);  \
  if (unlikely(Rc(ctx-opcode) != 0)) {\
 -gen_set_cr6_from_fpscr(ctx); \
 +gen_set_cr1_from_fpscr(ctx); \
  }\
  tcg_temp_free_ptr(rd);   \
  tcg_temp_free_ptr(ra);   \
 @@ -8265,7 +8265,7 @@ static void gen_##name(DisasContext *ctx) \
  u32_2 = tcg_const_i32(u32f2(ctx-opcode));\
  gen_helper_##name(cpu_env, rt, rb, u32_1, u32_2); \
  if (unlikely(Rc(ctx-opcode) != 0)) { \
 -gen_set_cr6_from_fpscr(ctx);  \
 +gen_set_cr1_from_fpscr(ctx);  \
  } \
  tcg_temp_free_ptr(rt);\
  tcg_temp_free_ptr(rb);\
 @@ -8289,7 +8289,7 @@ static void gen_##name(DisasContext *ctx)\
  i32 = tcg_const_i32(i32fld(ctx-opcode));\
  gen_helper_##name(cpu_env, rt, ra, rb, i32); \
  if (unlikely(Rc(ctx-opcode) != 0)) {\
 -gen_set_cr6_from_fpscr(ctx); \
 +gen_set_cr1_from_fpscr(ctx); \
  }\
  tcg_temp_free_ptr(rt);   \
  tcg_temp_free_ptr(rb);   \
 @@ -8310,7 +8310,7 @@ static void gen_##name(DisasContext *ctx)\
  rb = gen_fprp_ptr(rB(ctx-opcode));  \
  gen_helper_##name(cpu_env, rt, rb);  \
  if (unlikely(Rc(ctx-opcode) != 0)) {\
 -gen_set_cr6_from_fpscr(ctx); \
 +gen_set_cr1_from_fpscr(ctx); \
  }\
  tcg_temp_free_ptr(rt);   \
  tcg_temp_free_ptr(rb);   \
 @@ -8331,7 +8331,7 @@ static void gen_##name(DisasContext *ctx)  \
  i32 = tcg_const_i32(i32fld(ctx-opcode));  \
  gen_helper_##name(cpu_env, rt, rs, i32);   \
  if (unlikely(Rc(ctx-opcode) != 0)) {  \
 -gen_set_cr6_from_fpscr(ctx);   \
 +gen_set_cr1_from_fpscr(ctx);   \
  }  \
  tcg_temp_free_ptr(rt); \
  tcg_temp_free_ptr(rs); \

 
 Reviewed-by: Tom Musta tommu...@gmail.com
 Tested-by: Tom Musta tommu...@gmail.com

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [PATCH 1/5] s390x/gdb: don't touch the cc if tcg is not enabled

2014-09-05 Thread Christian Borntraeger

On 03/09/14 11:27, Alexander Graf wrote:
 
 
 On 02.09.14 09:07, Christian Borntraeger wrote:
 On 02/09/14 00:39, Alexander Graf wrote:


 On 29.08.14 15:52, Jens Freimann wrote:
 From: David Hildenbrand d...@linux.vnet.ibm.com

 When reading/writing the psw mask, the condition code may only be touched 
 if
 running on tcg.

 Why? Shouldn't we be able to set CC from gdb as well?


 You can. What this patch does (and the patch description is a bit vague 
 here) is to not modify the PSW it gets from KVM when passing it to gdb:
 The qemu core gets the PSW from KVM. Without this patch, we use cc_op to 
 calculate the current CC of the PSW (No idea of TCG, I guess its evaluated 
 lazy - at least this is how valgrind works). This is wrong for the KVM case, 
 as cc_op does not contain any useful data for the KVM case. With this patch 
 we simply pass the psw from KVM to gdb and back.

 The symptom was that the cc was always shown as zero.
 
 Ah, I see. I agree with the patch, but the patch description does not
 actually describe what the patch does. Please rework it.

The patch was already pulled...Well the description is misleaded but with some 
interpretion still correct ;-).
But your point is well taken. We will try to make sure that all patch 
description are descriptive and not ambiguous. Your question is a good 
indication that this was not the case in this patch.

Christian

 
 I also wouldn't mind if instead of hard coding this logic in the
 gdbstub, we'd extract it as helper functions to read and write the
 PSW.MASK in cpu.h.

Re: [Qemu-devel] [Qemu-ppc] [PATCH 13/17] ppc: compute mask from BI using right shift

2014-09-05 Thread Alexander Graf



On 03.09.14 22:59, Tom Musta wrote:
 On 8/28/2014 12:15 PM, Paolo Bonzini wrote:
 This will match the code we use in fpu_helper.c when we flip
 CRF_* bit-endianness.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  target-ppc/translate.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

 diff --git a/target-ppc/translate.c b/target-ppc/translate.c
 index 48c7b66..4ce7af4 100644
 --- a/target-ppc/translate.c
 +++ b/target-ppc/translate.c
 @@ -794,7 +794,7 @@ static void gen_isel(DisasContext *ctx)
  TCGv_i32 t0;
  TCGv t1, true_op, zero;
  
 -mask = 1  (3 - (bi  0x03));
 +mask = 0x08  (bi  0x03);
  t0 = tcg_temp_new_i32();
  tcg_gen_andi_i32(t0, cpu_crf[bi  2], mask);
  t1 = tcg_temp_new();
 @@ -3870,7 +3870,7 @@ static inline void gen_bcond(DisasContext *ctx, int 
 type)
  if ((bo  0x10) == 0) {
  /* Test CR */
  uint32_t bi = BI(ctx-opcode);
 -uint32_t mask = 1  (3 - (bi  0x03));
 +uint32_t mask = 0x08  (bi  0x03);
  TCGv_i32 temp = tcg_temp_new_i32();
  
  if (bo  0x8) {
 @@ -3952,7 +3952,7 @@ static void glue(gen_, name)(DisasContext *ctx)
  else
   \
  tcg_gen_mov_i32(t1, cpu_crf[crbB(ctx-opcode)  2]);   
   \
  tcg_op(t0, t0, t1); 
   \
 -bitmask = 1  (3 - (crbD(ctx-opcode)  0x03));
   \
 +bitmask = 0x08  (crbD(ctx-opcode)  0x03);   
   \
  tcg_gen_andi_i32(t0, t0, bitmask);  
   \
  tcg_gen_andi_i32(t1, cpu_crf[crbD(ctx-opcode)  2], ~bitmask);
   \
  tcg_gen_or_i32(cpu_crf[crbD(ctx-opcode)  2], t0, t1);
   \

 
 Reviewed-by: Tom Musta tommu...@gmail.com
 Tested-by: Tom Musta tommu...@gmail.com

Thanks, applied to ppc-next.


Alex

[Qemu-devel] [PATCH 3/3] s390x/css: catch ccw sequence errors

2014-09-05 Thread Jens Freimann

From: Cornelia Huck cornelia.h...@de.ibm.com

We must not allow chains of more than 255 ccws without data transfer.

Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
---
 hw/s390x/css.c | 10 ++
 hw/s390x/css.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 34637cb..b67c039 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -294,6 +294,13 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr 
ccw_addr)
 
 check_len = !((ccw.flags  CCW_FLAG_SLI)  !(ccw.flags  CCW_FLAG_DC));
 
+if (!ccw.cda) {
+if (sch-ccw_no_data_cnt == 255) {
+return -EINVAL;
+}
+sch-ccw_no_data_cnt++;
+}
+
 /* Look at the command. */
 switch (ccw.cmd_code) {
 case CCW_CMD_NOOP:
@@ -396,6 +403,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB *orb)
 return;
 }
 sch-ccw_fmt_1 = !!(orb-ctrl0  ORB_CTRL0_MASK_FMT);
+sch-ccw_no_data_cnt = 0;
 } else {
 s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND);
 }
@@ -1358,6 +1366,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f)
 qemu_put_be16(f, s-id.ciw[i].count);
 }
 qemu_put_byte(f, s-ccw_fmt_1);
+qemu_put_byte(f, s-ccw_no_data_cnt);
 return;
 }
 
@@ -1414,6 +1423,7 @@ int subch_device_load(SubchDev *s, QEMUFile *f)
 s-id.ciw[i].count = qemu_get_be16(f);
 }
 s-ccw_fmt_1 = qemu_get_byte(f);
+s-ccw_no_data_cnt = qemu_get_byte(f);
 return 0;
 }
 
diff --git a/hw/s390x/css.h b/hw/s390x/css.h
index 384a455..33104ac 100644
--- a/hw/s390x/css.h
+++ b/hw/s390x/css.h
@@ -78,6 +78,7 @@ struct SubchDev {
 bool last_cmd_valid;
 bool ccw_fmt_1;
 bool thinint_active;
+uint8_t ccw_no_data_cnt;
 /* transport-provided data: */
 int (*ccw_cb) (SubchDev *, CCW1);
 SenseId id;
-- 
1.8.5.5

[Qemu-devel] [PATCH 0/3 RESEND] s390x: css patches and small sclp cleanup

2014-09-05 Thread Jens Freimann

Cornelia, Christian, Alex,

here are two css patches and a small sclp cleanup. 

Patch 1 remove duplicate defines in SCLP code 
Patch 2 adds support for format-0 ccws
Patch 3 a css bugfix adding a limit of 255 to ccws chains without data transfer 

regards
Jens


Cornelia Huck (2):
  s390x/css: support format-0 ccws
  s390x/css: catch ccw sequence errors

Jens Freimann (1):
  s390x: remove duplicate defines in SCLP code

 hw/s390x/css.c  | 40 +++-
 hw/s390x/css.h  |  2 ++
 include/hw/s390x/sclp.h |  2 --
 target-s390x/ioinst.h   | 10 ++
 4 files changed, 43 insertions(+), 11 deletions(-)

-- 
1.8.5.5

[Qemu-devel] [PATCH 2/3] s390x/css: support format-0 ccws

2014-09-05 Thread Jens Freimann

From: Cornelia Huck cornelia.h...@de.ibm.com

Add support for format-0 ccws in channel programs. As a format-1 ccw
contains the same information as format-0 ccws, only supporting larger
addresses, simply convert every ccw to format-1 as we walk the chain.

Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
---
 hw/s390x/css.c| 30 +-
 hw/s390x/css.h|  1 +
 target-s390x/ioinst.h | 10 ++
 3 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 49c2aaf..34637cb 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -243,17 +243,25 @@ static void copy_sense_id_to_guest(SenseId *dest, SenseId 
*src)
 }
 }
 
-static CCW1 copy_ccw_from_guest(hwaddr addr)
+static CCW1 copy_ccw_from_guest(hwaddr addr, bool fmt1)
 {
-CCW1 tmp;
+CCW0 tmp0;
+CCW1 tmp1;
 CCW1 ret;
 
-cpu_physical_memory_read(addr, tmp, sizeof(tmp));
-ret.cmd_code = tmp.cmd_code;
-ret.flags = tmp.flags;
-ret.count = be16_to_cpu(tmp.count);
-ret.cda = be32_to_cpu(tmp.cda);
-
+if (fmt1) {
+cpu_physical_memory_read(addr, tmp1, sizeof(tmp1));
+ret.cmd_code = tmp1.cmd_code;
+ret.flags = tmp1.flags;
+ret.count = be16_to_cpu(tmp1.count);
+ret.cda = be32_to_cpu(tmp1.cda);
+} else {
+cpu_physical_memory_read(addr, tmp0, sizeof(tmp0));
+ret.cmd_code = tmp0.cmd_code;
+ret.flags = tmp0.flags;
+ret.count = be16_to_cpu(tmp0.count);
+ret.cda = be16_to_cpu(tmp0.cda1) | (tmp0.cda0  16);
+}
 return ret;
 }
 
@@ -268,7 +276,8 @@ static int css_interpret_ccw(SubchDev *sch, hwaddr ccw_addr)
 return -EIO;
 }
 
-ccw = copy_ccw_from_guest(ccw_addr);
+/* Translate everything to format-1 ccws - the information is the same. */
+ccw = copy_ccw_from_guest(ccw_addr, sch-ccw_fmt_1);
 
 /* Check for invalid command codes. */
 if ((ccw.cmd_code  0x0f) == 0) {
@@ -386,6 +395,7 @@ static void sch_handle_start_func(SubchDev *sch, ORB *orb)
 s-ctrl |= (SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND);
 return;
 }
+sch-ccw_fmt_1 = !!(orb-ctrl0  ORB_CTRL0_MASK_FMT);
 } else {
 s-ctrl = ~(SCSW_ACTL_SUSP | SCSW_ACTL_RESUME_PEND);
 }
@@ -1347,6 +1357,7 @@ void subch_device_save(SubchDev *s, QEMUFile *f)
 qemu_put_byte(f, s-id.ciw[i].command);
 qemu_put_be16(f, s-id.ciw[i].count);
 }
+qemu_put_byte(f, s-ccw_fmt_1);
 return;
 }
 
@@ -1402,6 +1413,7 @@ int subch_device_load(SubchDev *s, QEMUFile *f)
 s-id.ciw[i].command = qemu_get_byte(f);
 s-id.ciw[i].count = qemu_get_be16(f);
 }
+s-ccw_fmt_1 = qemu_get_byte(f);
 return 0;
 }
 
diff --git a/hw/s390x/css.h b/hw/s390x/css.h
index c864ea7..384a455 100644
--- a/hw/s390x/css.h
+++ b/hw/s390x/css.h
@@ -76,6 +76,7 @@ struct SubchDev {
 hwaddr channel_prog;
 CCW1 last_cmd;
 bool last_cmd_valid;
+bool ccw_fmt_1;
 bool thinint_active;
 /* transport-provided data: */
 int (*ccw_cb) (SubchDev *, CCW1);
diff --git a/target-s390x/ioinst.h b/target-s390x/ioinst.h
index 5bbc67d..29f6423 100644
--- a/target-s390x/ioinst.h
+++ b/target-s390x/ioinst.h
@@ -156,6 +156,16 @@ typedef struct ORB {
 #define ORB_CTRL1_MASK_ORBX 0x01
 #define ORB_CTRL1_MASK_INVALID 0x3e
 
+/* channel command word (type 0) */
+typedef struct CCW0 {
+uint8_t cmd_code;
+uint8_t cda0;
+uint16_t cda1;
+uint8_t flags;
+uint8_t reserved;
+uint16_t count;
+} QEMU_PACKED CCW0;
+
 /* channel command word (type 1) */
 typedef struct CCW1 {
 uint8_t cmd_code;
-- 
1.8.5.5

[Qemu-devel] [PATCH 1/3] s390x: remove duplicate defines in SCLP code

2014-09-05 Thread Jens Freimann

Let's get rid of these duplicate defines.

Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
---
 include/hw/s390x/sclp.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/hw/s390x/sclp.h b/include/hw/s390x/sclp.h
index 5c43574..ec07a11 100644
--- a/include/hw/s390x/sclp.h
+++ b/include/hw/s390x/sclp.h
@@ -28,8 +28,6 @@
 #define SCLP_UNASSIGN_STORAGE   0x000C0001
 #define SCLP_CMD_READ_EVENT_DATA0x00770005
 #define SCLP_CMD_WRITE_EVENT_DATA   0x00760005
-#define SCLP_CMD_READ_EVENT_DATA0x00770005
-#define SCLP_CMD_WRITE_EVENT_DATA   0x00760005
 #define SCLP_CMD_WRITE_EVENT_MASK   0x00780005
 
 /* SCLP Memory hotplug codes */
-- 
1.8.5.5

Re: [Qemu-devel] [PATCH] target-ppc: Fix kvmppc_set_compat to use negotiated cpu-version

2014-09-05 Thread Alexander Graf



On 05.09.14 09:04, Alexey Kardashevskiy wrote:
 By mistake, QEMU uses the maximum compatibility level from the command
 line instead of the value negotiated in client-architecture-support call.
 
 This replaces @max_compat with @cpu_version. This only affects guests
 which do not support the host CPU.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru

Thanks, applied to ppc-next.


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/4] target-ppc: Extend rtas-blob

2014-09-05 Thread Alexander Graf



On 04.09.14 13:13, Aravinda Prasad wrote:
 Extend rtas-blob to accommodate error log. Error log
 structure is saved in rtas space upon a machine check
 exception.
 
 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
  hw/ppc/spapr.c |4 
  1 file changed, 4 insertions(+)
 
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 4b20e36..4a7c0ae 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -1409,6 +1409,10 @@ static void ppc_spapr_init(MachineState *machine)
  
  filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, spapr-rtas.bin);
  spapr-rtas_size = get_image_size(filename);
 +
 +/* Resize blob to accommodate error log at a page aligned address */
 +spapr-rtas_size = TARGET_PAGE_ALIGN(spapr-rtas_size) + 
 TARGET_PAGE_SIZE;

I think it's clearer if you do

#define RTAS_ERROR_LOG_SIZE 4096

and add this to the aligned size instead.


Alex

Re: [Qemu-devel] [RFC][patch 0/6] pci pass-through support for qemu/KVM on s390

2014-09-05 Thread Frank Blaschka

On Thu, Sep 04, 2014 at 07:16:24AM -0600, Alex Williamson wrote:
 On Thu, 2014-09-04 at 12:52 +0200, frank.blasc...@de.ibm.com wrote:
  This set of patches implements pci pass-through support for qemu/KVM on 
  s390.
  PCI support on s390 is very different from other platforms.
  Major differences are:
  
  1) all PCI operations are driven by special s390 instructions
 
 Generating config cycles is always arch specific.
 
  2) all s390 PCI instructions are privileged
 
 While the operations to generate config cycles on x86 are not
 privileged, they must be arbitrated between accesses, so in a sense
 they're privileged.
 
  3) PCI config and memory spaces can not be mmap'ed
 
 VFIO has mapping flags that allow any region to specify mmap support.


Hi Alex,

thx for your reply.

Let me elaborate a little bit ore on 1 - 3. Config and memory space can not
be accessed via memory operations. You have to use special s390 instructions.
This instructions can not be executed in user space. So there is no other
way than executing this instructions in kernel. Yes vfio does support a
slow path via ioctrl we could use, but this seems suboptimal from performance
point of view.
 
  4) no classic interrupts (INTX, MSI). The pci hw understands the concept
 of requesting MSIX irqs but irqs are delivered as s390 adapter irqs.
 
 VFIO delivers interrupts as eventfds regardless of the underlying
 platform mechanism.
 

yes that's right, but then we have to do platform specific stuff to present
the irq to the guest. I do not say this is impossible but we have add s390
specific code to vfio. 

  5) For DMA access there is always an IOMMU required.
 
 x86 requires the same.
 
   s390 pci implementation
 does not support a complete memory to iommu mapping, dma mappings are
 created on request.
 
 Sounds like POWER.

Don't know the details from power, maybe it is similar but not the same.
We might be able to extend vfio to have a new interface allowing
us to do DMA mappings on request.

 
  6) The OS does not get any informations about the physical layout
 of the PCI bus.
 
 If that means that every device is isolated (seems unlikely for
 multifunction devices) then that makes IOMMU group support really easy.


OK
 
  7) To take advantage of system z specific virtualization features
 we need to access the SIE control block residing in the kernel KVM
 
 The KVM-VFIO device allows interaction between VFIO devices and KVM.
 
  8) To enable system z specific virtualization features we have to manipulate
 the zpci device in kernel.
 
 VFIO supports different device backends, currently pci_dev and working
 towards platform devices.  zpci might just be an extension to standard
 pci.
 

7 - 8 At least this is not as straightforward as the pure kernel approach, but
I have to dig into that in more detail if we could only agree on a vfio 
solution.

  For this reasons I decided to implement a kernel based approach similar
  to x86 device assignment. There is a new qemu device (s390-pci) 
  representing a
  pass through device on the host. Here is a sample qemu device configuration:
  
  -device s390-pci,host=:00:00.0
  
  The device executes the KVM_ASSIGN_PCI_DEVICE ioctl to create a proxy 
  instance
  in the kernel KVM and connect this instance to the host pci device.
  
  kernel patches apply to linux-kvm
  
  s390: cio: chsc function to register GIB
  s390: pci: export pci functions for pass-through usage
  KVM: s390: Add GISA support
  KVM: s390: Add PCI pass-through support
  
  qemu patches apply to qemu-master
  
  s390: Add PCI bus support
  s390: Add PCI pass-through device support
  
  Feedback and discussion is highly welcome ...
 
 KVM-based device assignment needs to go away.  It's a horrible model for
 devices, it offers very little protection to the kernel, assumes every
 device is fully isolated and visible to the IOMMU, relies on smattering
 of sysfs files to operate, etc.  x86, POWER, and ARM are all moving to
 VFIO-based device assignment.  Why is s390 special enough to repeat all
 the mistakes that x86 did?  Thanks,
 

Is this your personal opinion or was this a strategic decision of the
QEMU/KVM community? Can anybody give us direction about this?

Actually I can understand your point. In the last weeks I did some development
and testing regarding the use of vfio too. But the in kernel solutions seems to
offer the best performance and most straighforward implementation for our
platform.

Greetings,

Frank

 Alex
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf

2014-09-05 Thread Laszlo Ersek

On 09/04/14 09:04, Gerd Hoffmann wrote:
 Related spice-only bug.  We have a fixed 16 MB buffer here, being
 presented to the spice-server as qxl video memory in case spice is
 used with a non-qxl card.  It's also used with qxl in vga mode.
 
 When using display resolutions requiring more than 16 MB of memory we
 are going to overflow that buffer.  In theory the guest can write,
 indirectly via spice-server.  The spice-server clears the memory after
 setting a new video mode though, triggering a segfault in the overflow
 case, so qemu crashes before the guest has a chance to do something
 evil.
 
 Fix that by switching to dynamic allocation for the buffer.
 
 CVE-2014-3615
 
 Cc: qemu-sta...@nongnu.org
 Cc: secal...@redhat.com
 Cc: Laszlo Ersek ler...@redhat.com
 Signed-off-by: Gerd Hoffmann kra...@redhat.com
 ---
  ui/spice-display.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)
 
 diff --git a/ui/spice-display.c b/ui/spice-display.c
 index 66e2578..57a8fd0 100644
 --- a/ui/spice-display.c
 +++ b/ui/spice-display.c
 @@ -334,6 +334,7 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay 
 *ssd)
  void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd)
  {
  QXLDevSurfaceCreate surface;
 +uint64_t surface_size;
  
  memset(surface, 0, sizeof(surface));
  
 @@ -347,9 +348,18 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay 
 *ssd)
  surface.mouse_mode = true;
  surface.flags  = 0;
  surface.type   = 0;
 -surface.mem= (uintptr_t)ssd-buf;
  surface.group_id   = MEMSLOT_GROUP_HOST;
  
 +surface_size = surface.width * surface.height * 4;

(1) surface.width and surface.height are uint32_t fields
[spice-server/server/spice.h]:

struct QXLDevSurfaceCreate {
uint32_t width;
uint32_t height;

uint32_t equals unsigned int in our case, hence the multiplication
will be carried out in unsigned int -- 32-bits. Given that the
dimensions here are under guest control, I suggest to write it as

surface_size = (uint64_t)surface.width * surface.height * 4;

(2) However, I have a concern even that way.

The above multiplication (due to the *4) can overflow in uint64_t as well.

I can see that surface.width and surface.height come from
pixman_image_get_width() and pixman_image_get_height(), which seem to
return int values. Hence,

(uint64_t)0x7FFF_ * 0x7FFF_ * 4 == 0x_FFFC__0004

which is OK. But, what if pixman returns a negative value? Can we create
an image that big?

From qemu_spice_create_one_update():

int bw, bh;

bw   = rect-right - rect-left;
bh   = rect-bottom - rect-top;

dest = pixman_image_create_bits(PIXMAN_x8r8g8b8, bw, bh,
(void *)update-bitmap, bw * 4);

where

typedef struct SPICE_ATTR_PACKED QXLRect {
int32_t top;
int32_t left;
int32_t bottom;
int32_t right;
} QXLRect;



I can't track this back far enough. I'd feel safer if you checked that
the multiplication can't overflow even in uint64_t.

(3) In addition:

 +if (ssd-bufsize  surface_size) {
 +ssd-bufsize = surface_size;

struct SimpleSpiceDisplay {
[...]
int bufsize;

Meaning, even if the multiplication fits okay in an uint64_t, the above
assignment can overflow ssd-bufsize, because that's just an int.

 +fprintf(stderr, %s: % PRId64  (%dx%d)\n, __func__,
 +surface_size, surface.width, surface.height);

(4) surface_size is uint64_t, it should be printed with PRIu64, not
PRId64. Similarly, surface.width and surface.height are uint32_t, they
should be printed with either %u or PRIu32, not %d.

 +g_free(ssd-buf);
 +ssd-buf = g_malloc(ssd-bufsize);

Then, g_malloc() takes a gsize, which means unsigned long:

https://developer.gnome.org/glib/2.28/glib-Memory-Allocation.html#g-malloc
https://developer.gnome.org/glib/2.28/glib-Basic-Types.html#gsize

which has the range of uint32_t in an ILP32 (i686) build. Therefore even
changing the type of ssd-bufsize to uint64_t wouldn't help.

(5) Instead, you really need to make sure that the very first
multiplication fits in a signed int:

int width, height;

width = surface_width(ssd-ds);
height = surface_height(ssd-ds);

if (width = 0 || height = 0 ||
width  INT_MAX / 4 ||
height  INT_MAX / (width * 4)) {
/* won't ever fit */
abort();
}

if (ssd-bufsize  width * height * 4) {
ssd-bufsize = width * height * 4;
/* do the rest of the realloc */
}

and do everything else after, even the population of surface's fields.

 +}
 +surface.mem = (uintptr_t)ssd-buf;
 +
  qemu_spice_create_primary_surface(ssd, 0, surface, QXL_SYNC);
  }
  
 @@ -369,8 +379,6 @@ void qemu_spice_display_init_common(SimpleSpiceDisplay 
 *ssd)
  if (ssd-num_surfaces == 0) {
  ssd-num_surfaces = 1024;
  }
 -ssd-bufsize = (16 * 1024 * 1024);
 -ssd-buf = g_malloc(ssd-bufsize);
  }

I'm

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 1/4] target-ppc: Extend rtas-blob

2014-09-05 Thread Aravinda Prasad



On Friday 05 September 2014 01:12 PM, Alexander Graf wrote:
 
 
 On 04.09.14 13:13, Aravinda Prasad wrote:
 Extend rtas-blob to accommodate error log. Error log
 structure is saved in rtas space upon a machine check
 exception.

 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
  hw/ppc/spapr.c |4 
  1 file changed, 4 insertions(+)

 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 4b20e36..4a7c0ae 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -1409,6 +1409,10 @@ static void ppc_spapr_init(MachineState *machine)
  
  filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, spapr-rtas.bin);
  spapr-rtas_size = get_image_size(filename);
 +
 +/* Resize blob to accommodate error log at a page aligned address */
 +spapr-rtas_size = TARGET_PAGE_ALIGN(spapr-rtas_size) + 
 TARGET_PAGE_SIZE;
 
 I think it's clearer if you do
 
 #define RTAS_ERROR_LOG_SIZE 4096
 
 and add this to the aligned size instead.

sure

 
 
 Alex
 

-- 
Regards,
Aravinda

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target-ppc: Build error log

2014-09-05 Thread Alexander Graf



On 04.09.14 13:13, Aravinda Prasad wrote:
 Whenever there is a physical memory error due to bit
 flips, which cannot be corrected by hardware, the error
 is passed on to the kernel. If the memory address in
 error belongs to guest address space then guest kernel
 is responsible to take action. Hence the error is passed
 on to guest via KVM by invoking 0x200 NMI vector.
 
 However, guest OS, as per PAPR, expects an error log
 upon such error. This patch registers a new hcall
 which is issued from 0x200 interrupt vector and builds
 the error log, copies the error log to rtas space and
 passes the address of the error log to guest
 
 Enhancement to KVM to perform above functionality is
 already in upstream kernel.
 
 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
  hw/ppc/spapr_hcall.c   |  154 
 
  include/hw/ppc/spapr.h |4 +
  2 files changed, 157 insertions(+), 1 deletion(-)
 
 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 01650ba..c3aa448 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -14,6 +14,88 @@ struct SPRSyncState {
  target_ulong mask;
  };
  
 +/* Offset from rtas-base where error log is placed */
 +#define RTAS_ERROR_OFFSET   (TARGET_PAGE_SIZE)

You can't assume this. Please compute the value at the same place you
compute the rtas-size.

 +
 +#define RTAS_ELOG_SEVERITY_SHIFT 0x5
 +#define RTAS_ELOG_DISPOSITION_SHIFT  0x3
 +#define RTAS_ELOG_INITIATOR_SHIFT0x4
 +
 +/*
 + * Only required RTAS event severity, disposition, initiator
 + * target and type are copied from arch/powerpc/include/asm/rtas.h
 + */
 +
 +/* RTAS event severity */
 +#define RTAS_SEVERITY_ERROR_SYNC0x3
 +
 +/* RTAS event disposition */
 +#define RTAS_DISP_NOT_RECOVERED 0x2
 +
 +/* RTAS event initiator */
 +#define RTAS_INITIATOR_MEMORY   0x4
 +
 +/* RTAS event target */
 +#define RTAS_TARGET_MEMORY  0x4
 +
 +/* RTAS event type */
 +#define RTAS_TYPE_ECC_UNCORR0x09
 +
 +/*
 + * Currently KVM only passes on the uncorrected machine
 + * check memory error to guest. Other machine check errors
 + * such as SLB multi-hit and TLB multi-hit are recovered
 + * in KVM and are not passed on to guest.
 + *
 + * DSISR Bit for uncorrected machine check error. Based
 + * on arch/powerpc/include/asm/mce.h

Please don't include Linux code. This file is GPLv2+ licensed and I
don't want to taint it as GPLv2 only just for the sake of mce.

 + */
 +#define PPC_BIT(bit)(0x8000ULL  bit)
 +#define P7_DSISR_MC_UE  (PPC_BIT(48))  /* P8 too */
 +
 +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
 +struct rtas_error_log {
 +/* Byte 0 */
 +uint8_t byte0;  /* Architectural version */
 +
 +/* Byte 1 */
 +uint8_t byte1;
 +/* 
 + * XXX  3: Severity level of error
 + *XX2: Degree of recovery
 + *  X   1: Extended log present?
 + *   XX 2: Reserved
 + */
 +
 +/* Byte 2 */
 +uint8_t byte2;
 +/* 
 + *  4: Initiator of event
 + *  4: Target of failed operation
 + */
 +uint8_t byte3;  /* General event or error*/
 +};
 +
 +/*
 + * Data format in RTAS-Blob
 + *
 + * This structure contains error information related to Machine
 + * Check exception. This is filled up and copied to rtas-blob
 + * upon machine check exception.
 + */
 +struct rtas_mc_log {
 +target_ulong srr0;
 +target_ulong srr1;
 +/*
 + * Beginning of error log address. This is properly
 + * populated and passed on to OS registered machine
 + * check notification routine upon machine check
 + * exception
 + */
 +target_ulong r3;
 +struct rtas_error_log err_log;
 +};
 +
  static void do_spr_sync(void *arg)
  {
  struct SPRSyncState *s = arg;
 @@ -586,6 +668,77 @@ static target_ulong h_rtas_update(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
  return 0;
  }
  
 +static target_ulong h_report_mc_err(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 + target_ulong opcode, target_ulong *args)
 +{
 +struct rtas_mc_log mc_log;
 +CPUPPCState *env = cpu-env;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +
 +/*
 + * We save the original r3 register in SPRG2 in 0x200 vector,
 + * which is patched during call to ibm.nmi-register. Original
 + * r3 is required to be included in error log
 + */
 +mc_log.r3 = env-spr[SPR_SPRG2];

Don't you have to call cpu_synchronize_registers() before you access
SPRG2? Otherwise the value may be stale.

 +
 +/*
 + * SRR0 and SRR1, containing nip and msr at the time of exception,
 + * are clobbered when we return from this hcall. Hence they
 + * need to be properly saved and restored. We save srr0
 + * and srr1 in rtas blob and restore it in 0x200 vector
 + * before branching to OS

Re: [Qemu-devel] [PATCH V3] vhost_net: start/stop guest notifiers properly

2014-09-05 Thread Zhangjie (HZ)



On 2014/9/1 16:18, Michael S. Tsirkin wrote:
 On Fri, Aug 29, 2014 at 06:40:24PM +0800, Zhangjie (HZ) wrote:


 On 2014/8/27 20:59, Michael S. Tsirkin wrote:
 On Thu, Aug 21, 2014 at 03:42:53PM +0800, Zhangjie (HZ) wrote:
 On 2014/8/21 14:53, Jason Wang wrote:
 On 08/21/2014 02:28 PM, Zhangjie (HZ) wrote:

 After migration, vhost is not disabled, virtual nic became unreachable 
 because vhost is not awakened.
 By the logical of EVENT_IDX, virtio-net will not kick vhost again if the 
 used idx is not updated.
 So, if one interrupts is lost during migration, virtio_net will not kick 
 vhost again.
 Then, no skb from virtio-net can be sent to tap.

 Yes and I mean to test vhost=off to see if it was the issue of vhost.
 That sounds reasonable, but the test case is to test vhost.

 Jason's patch reduced the probability of occurrence, from about 1/20 to 
 1/80. It is really effective. I think the patch should be acked.
 May be we can try to solve the problem from another perspective. Do you 
 have some methods to sense the migration?
 We can make up a signal from virtio-net after the migration.

 You can make a patch like this to debug. If problem disappears, it means
 interrupt was really lost anyway.

 Anyway, I will try to reproduce it by myself.

 The test environment is really terrible, I build a environment myself, 
 but it problem did not occur.
 The environment I use now is from a colleague Responsible for test work.
 Two hosts, every host has about 20 vms, they send packages(ipv4 and 
 ipv6) between each other.
 The VM to be migrated also sens packages itself, and there is a ping(-i 
 0.001) from another host to it.
 The physical nic is 1GE, connected through a internal nework.

 Just want to confirm. For the problem did not occur, you mean with my
 patch on top?
 .

 I mean, with your patch, I have to test 80 times before it occurs, the 
 probability is reduced.

 Could you please try to apply the patch
 [PATCH V4] net: Forbid dealing with packets when VM is not running
 on top and see if this helps?

 Thanks!

 -- 
 Best Wishes!
 Zhang Jie
 .

 Thanks! I will have a test.
 
 Great, once you have the result of the two patches applied
 together, please let us know on the list.
 
 
 -- 
 Best Wishes!
 Zhang Jie
 .
 

I'm sorry for not giving test results in time, test resource is busy these days.
Perhaps I can give the result next week.
-- 
Best Wishes!
Zhang Jie

Re: [Qemu-devel] [PATCH v2 3/4] target-ppc: Build error log

2014-09-05 Thread Aravinda Prasad



On Friday 05 September 2014 07:44 AM, Alexey Kardashevskiy wrote:
 On 09/04/2014 09:13 PM, Aravinda Prasad wrote:
 Whenever there is a physical memory error due to bit
 flips, which cannot be corrected by hardware, the error
 is passed on to the kernel. If the memory address in
 error belongs to guest address space then guest kernel
 is responsible to take action. Hence the error is passed
 on to guest via KVM by invoking 0x200 NMI vector.

 However, guest OS, as per PAPR, expects an error log
 upon such error. This patch registers a new hcall
 which is issued from 0x200 interrupt vector and builds
 the error log, copies the error log to rtas space and
 passes the address of the error log to guest

 Enhancement to KVM to perform above functionality is
 already in upstream kernel.

 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
  hw/ppc/spapr_hcall.c   |  154 
 
  include/hw/ppc/spapr.h |4 +
  2 files changed, 157 insertions(+), 1 deletion(-)

 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 01650ba..c3aa448 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -14,6 +14,88 @@ struct SPRSyncState {
  target_ulong mask;
  };
  
 +/* Offset from rtas-base where error log is placed */
 +#define RTAS_ERROR_OFFSET   (TARGET_PAGE_SIZE)
 +
 +#define RTAS_ELOG_SEVERITY_SHIFT 0x5
 +#define RTAS_ELOG_DISPOSITION_SHIFT  0x3
 +#define RTAS_ELOG_INITIATOR_SHIFT0x4
 +
 +/*
 + * Only required RTAS event severity, disposition, initiator
 + * target and type are copied from arch/powerpc/include/asm/rtas.h
 + */
 +
 +/* RTAS event severity */
 +#define RTAS_SEVERITY_ERROR_SYNC0x3
 +
 +/* RTAS event disposition */
 +#define RTAS_DISP_NOT_RECOVERED 0x2
 +
 +/* RTAS event initiator */
 +#define RTAS_INITIATOR_MEMORY   0x4
 +
 +/* RTAS event target */
 +#define RTAS_TARGET_MEMORY  0x4
 +
 +/* RTAS event type */
 +#define RTAS_TYPE_ECC_UNCORR0x09
 +
 +/*
 + * Currently KVM only passes on the uncorrected machine
 + * check memory error to guest. Other machine check errors
 + * such as SLB multi-hit and TLB multi-hit are recovered
 + * in KVM and are not passed on to guest.
 + *
 + * DSISR Bit for uncorrected machine check error. Based
 + * on arch/powerpc/include/asm/mce.h
 + */
 +#define PPC_BIT(bit)(0x8000ULL  bit)
 +#define P7_DSISR_MC_UE  (PPC_BIT(48))  /* P8 too */
 +
 +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
 +struct rtas_error_log {
 +/* Byte 0 */
 +uint8_t byte0;  /* Architectural version */
 +
 +/* Byte 1 */
 +uint8_t byte1;
 +/* 
 + * XXX  3: Severity level of error
 + *XX2: Degree of recovery
 + *  X   1: Extended log present?
 + *   XX 2: Reserved
 + */
 +
 +/* Byte 2 */
 +uint8_t byte2;
 +/* 
 + *  4: Initiator of event
 + *  4: Target of failed operation
 + */
 +uint8_t byte3;  /* General event or error*/
 +};
 
 
 Any particular reason not to copy rtas_error_log as is?

No specific reason. I overlooked it when I moved my code form early
prototype version. Will include rest all members.

Regards,
Aravinda


 
 
 
 

-- 
Regards,
Aravinda

Re: [Qemu-devel] [libvirt] NBD TLS support in QEMU

2014-09-05 Thread Daniel P. Berrange

On Fri, Sep 05, 2014 at 08:23:17AM +0200, Michal Privoznik wrote:
 On 03.09.2014 18:44, Stefan Hajnoczi wrote:
 Hi,
 QEMU offers both NBD client and server functionality.  The NBD protocol
 runs unencrypted, which is a problem when the client and server
 communicate over an untrusted network.
 
 This is not problem for NBD only, but for the rest of data that qemu sends
 over  network, i.e. migration stream, VNC/SPICE, ...

We already have TLS support for VNC and SPICE.  We do need it for NBD,
migration and the chardev TCP backend.

I'd suggest that it is likely possible to add support for NBD, migration
and chardev all at the same time by doing a general purpose TLS socket
wrapper in QEMU that all those areas can use.

 The particular use case that prompted this mail is storage migration in
 OpenStack.  The goal is to encrypt the NBD connection between source and
 destination hosts during storage migration.
 
 I think we can integrate TLS into the NBD protocol as an optional flag.
 A quick web search does not reveal existing open source SSL/TLS NBD
 implementations.  I do see a VMware NBDSSL protocol but there is no
 specification so I guess it is proprietary.
 
 In case of libvirt, we have so called tunnelled migration (both spelled 
 misspelled :P) in which libvirt opens a local ports on both src  dst side
 and then sets up secured forwarding pipe to the other side. Or a insecured
 one if user wishes so. The only problem is that when I adapted libvirt for
 NBD, I intentionally forbade NBD in tunnelled migration as it requires one
 more data stream for which libvirt migration protocol is not ready yet.
 Having saidy that, I feel that libvirt is the show stopper here, not QEMU.

While tunnelled migration via libvirt is doable, it is very much
sub-optimal as it involves a great many data copies which is bad
for performance of migration. This is the main reason that having
direct TLS support in the QEMU migration and NBD data channels is
desired.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] NBD TLS support in QEMU

2014-09-05 Thread Daniel P. Berrange

On Fri, Sep 05, 2014 at 12:02:18AM +0200, Wouter Verhelst wrote:
 [Cc: to nbd-general list added]
 
 On Wed, Sep 03, 2014 at 05:44:17PM +0100, Stefan Hajnoczi wrote:
  Hi,
  QEMU offers both NBD client and server functionality.  The NBD protocol
  runs unencrypted, which is a problem when the client and server
  communicate over an untrusted network.
  
  The particular use case that prompted this mail is storage migration in
  OpenStack.  The goal is to encrypt the NBD connection between source and
  destination hosts during storage migration.
 
 I've never given encrypted NBD high priority, since I don't think
 encryption without authentication serves much purpose -- and I haven't
 gotten around to adding authentication yet (for which I have plans; but
 other things have priority).

While have an authentication layer like SASL wired into the NBD protocol
would be nice, it shouldn't be considered a blocker / pre-requisite. It
is pretty straightforward for the server to require x509 certificates
from the client and validate those as a means of authentication. We've
used that as an authentication mechanism in libvirt and VNC with success,
though we did later add SASL integration as an option too.

  I think we can integrate TLS into the NBD protocol as an optional flag.
  A quick web search does not reveal existing open source SSL/TLS NBD
  implementations.  I do see a VMware NBDSSL protocol but there is no
  specification so I guess it is proprietary.
  
  The NBD protocol starts with a negotiation phase.  This would be the
  appropriate place to indicate that TLS will be used.  After client and
  server complete TLS setup the connection can continue as normal.
  
  Besides QEMU, the userspace NBD tools (http://nbd.sf.net/) can also be
  extended to support TLS.  In this case the kernel needs a localhost
  socket and userspace handles TLS.
 
 That introduces a possibility for a deadlock, since now your network
 socket isn't on the PF_MEMALLOC-protected socket anymore, which will
 cause the kernel to throw away packets which are needed for your nbd
 connection, in hopes of clearing some memory.
 
 I suppose you could theoretically do the encryption in kernel space.
 Not convinced that trying TLS in kernel space is a good idea, though.
 
 I have heard of people using stunnel or the likes to pipe the NBD
 protocol over a secure channel, with various levels of success.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [RFC][patch 0/6] pci pass-through support for qemu/KVM on s390

2014-09-05 Thread Alexander Graf



On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote:
 This set of patches implements pci pass-through support for qemu/KVM on s390.
 PCI support on s390 is very different from other platforms.
 Major differences are:
 
 1) all PCI operations are driven by special s390 instructions
 2) all s390 PCI instructions are privileged
 3) PCI config and memory spaces can not be mmap'ed

That's ok, vfio abstracts config space anyway.

 4) no classic interrupts (INTX, MSI). The pci hw understands the concept
of requesting MSIX irqs but irqs are delivered as s390 adapter irqs.

This is in line with other implementations. Interrupts go from

  device - PHB - PIC - CPU

(some times you can have another converter device in between)

In your case, the PHB converts INTX and MSI interrupts to Adapter
interrupts to go to the floating interrupt controller. Same thing as
everyone else really.

 5) For DMA access there is always an IOMMU required. s390 pci implementation
does not support a complete memory to iommu mapping, dma mappings are
created on request.

Sounds great :). So I suppose we should implement a guest facing IOMMU?

 6) The OS does not get any informations about the physical layout
of the PCI bus.

So how does it know whether different devices are behind the same IOMMU
context? Or can we assume that every device has its own context?

 7) To take advantage of system z specific virtualization features
we need to access the SIE control block residing in the kernel KVM

Pleas elaborate.

 8) To enable system z specific virtualization features we have to manipulate
the zpci device in kernel.

Why?

 
 For this reasons I decided to implement a kernel based approach similar
 to x86 device assignment. There is a new qemu device (s390-pci) representing a

I fail to see the rationale and I definitely don't want to see anything
even remotely similar to the legacy x86 device assignment on s390 ;).

Can't we just enhance VFIO?

Also, I think we'll get the cleanest model if we start off with an
implementation that allows us to add emulated PCI devices to an s390x
machine and only then follow on with physical ones.


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target-ppc: Build error log

2014-09-05 Thread Aravinda Prasad



On Friday 05 September 2014 01:34 PM, Alexander Graf wrote:
 
 
 On 04.09.14 13:13, Aravinda Prasad wrote:
 Whenever there is a physical memory error due to bit
 flips, which cannot be corrected by hardware, the error
 is passed on to the kernel. If the memory address in
 error belongs to guest address space then guest kernel
 is responsible to take action. Hence the error is passed
 on to guest via KVM by invoking 0x200 NMI vector.

 However, guest OS, as per PAPR, expects an error log
 upon such error. This patch registers a new hcall
 which is issued from 0x200 interrupt vector and builds
 the error log, copies the error log to rtas space and
 passes the address of the error log to guest

 Enhancement to KVM to perform above functionality is
 already in upstream kernel.

 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
  hw/ppc/spapr_hcall.c   |  154 
 
  include/hw/ppc/spapr.h |4 +
  2 files changed, 157 insertions(+), 1 deletion(-)

 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 01650ba..c3aa448 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -14,6 +14,88 @@ struct SPRSyncState {
  target_ulong mask;
  };
  
 +/* Offset from rtas-base where error log is placed */
 +#define RTAS_ERROR_OFFSET   (TARGET_PAGE_SIZE)
 
 You can't assume this. Please compute the value at the same place you
 compute the rtas-size.

sure


 
 +
 +#define RTAS_ELOG_SEVERITY_SHIFT 0x5
 +#define RTAS_ELOG_DISPOSITION_SHIFT  0x3
 +#define RTAS_ELOG_INITIATOR_SHIFT0x4
 +
 +/*
 + * Only required RTAS event severity, disposition, initiator
 + * target and type are copied from arch/powerpc/include/asm/rtas.h
 + */
 +
 +/* RTAS event severity */
 +#define RTAS_SEVERITY_ERROR_SYNC0x3
 +
 +/* RTAS event disposition */
 +#define RTAS_DISP_NOT_RECOVERED 0x2
 +
 +/* RTAS event initiator */
 +#define RTAS_INITIATOR_MEMORY   0x4
 +
 +/* RTAS event target */
 +#define RTAS_TARGET_MEMORY  0x4
 +
 +/* RTAS event type */
 +#define RTAS_TYPE_ECC_UNCORR0x09
 +
 +/*
 + * Currently KVM only passes on the uncorrected machine
 + * check memory error to guest. Other machine check errors
 + * such as SLB multi-hit and TLB multi-hit are recovered
 + * in KVM and are not passed on to guest.
 + *
 + * DSISR Bit for uncorrected machine check error. Based
 + * on arch/powerpc/include/asm/mce.h
 
 Please don't include Linux code. This file is GPLv2+ licensed and I
 don't want to taint it as GPLv2 only just for the sake of mce.

Hmm.. ok.  In that case I should not copy rtas_error_log structure below
from kernel source as well.

 
 + */
 +#define PPC_BIT(bit)(0x8000ULL  bit)
 +#define P7_DSISR_MC_UE  (PPC_BIT(48))  /* P8 too */
 +
 +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
 +struct rtas_error_log {
 +/* Byte 0 */
 +uint8_t byte0;  /* Architectural version */
 +
 +/* Byte 1 */
 +uint8_t byte1;
 +/* 
 + * XXX  3: Severity level of error
 + *XX2: Degree of recovery
 + *  X   1: Extended log present?
 + *   XX 2: Reserved
 + */
 +
 +/* Byte 2 */
 +uint8_t byte2;
 +/* 
 + *  4: Initiator of event
 + *  4: Target of failed operation
 + */
 +uint8_t byte3;  /* General event or error*/
 +};
 +
 +/*
 + * Data format in RTAS-Blob
 + *
 + * This structure contains error information related to Machine
 + * Check exception. This is filled up and copied to rtas-blob
 + * upon machine check exception.
 + */
 +struct rtas_mc_log {
 +target_ulong srr0;
 +target_ulong srr1;
 +/*
 + * Beginning of error log address. This is properly
 + * populated and passed on to OS registered machine
 + * check notification routine upon machine check
 + * exception
 + */
 +target_ulong r3;
 +struct rtas_error_log err_log;
 +};
 +
  static void do_spr_sync(void *arg)
  {
  struct SPRSyncState *s = arg;
 @@ -586,6 +668,77 @@ static target_ulong h_rtas_update(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
  return 0;
  }
  
 +static target_ulong h_report_mc_err(PowerPCCPU *cpu, sPAPREnvironment 
 *spapr,
 + target_ulong opcode, target_ulong *args)
 +{
 +struct rtas_mc_log mc_log;
 +CPUPPCState *env = cpu-env;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +
 +/*
 + * We save the original r3 register in SPRG2 in 0x200 vector,
 + * which is patched during call to ibm.nmi-register. Original
 + * r3 is required to be included in error log
 + */
 +mc_log.r3 = env-spr[SPR_SPRG2];
 
 Don't you have to call cpu_synchronize_registers() before you access
 SPRG2? Otherwise the value may be stale.

Yes true. Will include.

 
 +
 +/*
 + * SRR0 and SRR1, containing nip and msr at the time of exception,
 + *

Re: [Qemu-devel] [RFC][patch 3/6] KVM: s390: Add GISA support

2014-09-05 Thread Alexander Graf



On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote:
 From: Frank Blaschka frank.blasc...@de.ibm.com
 
 This patch adds GISA (Guest Interrupt State Area) support
 to s390 kvm. GISA can be used for exitless interrupts. The
 patch provides a set of functions for GISA related operations
 like accessing GISA fields or registering ISCs for alert.
 Exploiters of GISA will follow with additional patches.
 
 Signed-off-by: Frank Blaschka frank.blasc...@de.ibm.com

That's a nice feature. However, please make sure that you maintain the
abstraction levels.

What should happen is that you request an irqfd from FLIC. Then you
associate that irqfd with the PCI device.

Thanks to that association, both parties can now talk to each other and
negotiate their GISA number space and make sure things are connected.

However, it should always be possible to do things without this direct
IRQ injection.

So you should be able to receive an irqfd event when an IRQ happened, so
that VFIO user space applications can also handle interrupts for example.

And the same applies for interrupt injection. We also need to be able to
inject an adapter interrupt from QEMU for emulated devices ;).


Alex

Re: [Qemu-devel] [RFC][patch 0/6] pci pass-through support for qemu/KVM on s390

2014-09-05 Thread Alexander Graf



On 05.09.14 09:46, Frank Blaschka wrote:
 On Thu, Sep 04, 2014 at 07:16:24AM -0600, Alex Williamson wrote:
 On Thu, 2014-09-04 at 12:52 +0200, frank.blasc...@de.ibm.com wrote:
 This set of patches implements pci pass-through support for qemu/KVM on 
 s390.
 PCI support on s390 is very different from other platforms.
 Major differences are:

 1) all PCI operations are driven by special s390 instructions

 Generating config cycles is always arch specific.

 2) all s390 PCI instructions are privileged

 While the operations to generate config cycles on x86 are not
 privileged, they must be arbitrated between accesses, so in a sense
 they're privileged.

 3) PCI config and memory spaces can not be mmap'ed

 VFIO has mapping flags that allow any region to specify mmap support.

 
 Hi Alex,
 
 thx for your reply.
 
 Let me elaborate a little bit ore on 1 - 3. Config and memory space can not
 be accessed via memory operations. You have to use special s390 instructions.
 This instructions can not be executed in user space. So there is no other
 way than executing this instructions in kernel. Yes vfio does support a
 slow path via ioctrl we could use, but this seems suboptimal from performance
 point of view.

Ah, I missed the memory spaces part ;). I agree that it's suboptimal
to call into the kernel for every PCI access, but I still think that
VFIO provides the correct abstraction layer for us to use. If nothing
else, it would at least give us identical configuration to x86 and nice
debugability en par with the other platforms.

  
 4) no classic interrupts (INTX, MSI). The pci hw understands the concept
of requesting MSIX irqs but irqs are delivered as s390 adapter irqs.

 VFIO delivers interrupts as eventfds regardless of the underlying
 platform mechanism.

 
 yes that's right, but then we have to do platform specific stuff to present
 the irq to the guest. I do not say this is impossible but we have add s390
 specific code to vfio. 

Not at all - interrupt delivery is completely transparent to VFIO.

 
 5) For DMA access there is always an IOMMU required.

 x86 requires the same.

  s390 pci implementation
does not support a complete memory to iommu mapping, dma mappings are
created on request.

 Sounds like POWER.
 
 Don't know the details from power, maybe it is similar but not the same.
 We might be able to extend vfio to have a new interface allowing
 us to do DMA mappings on request.

We already have that.

 

 6) The OS does not get any informations about the physical layout
of the PCI bus.

 If that means that every device is isolated (seems unlikely for
 multifunction devices) then that makes IOMMU group support really easy.

 
 OK
  
 7) To take advantage of system z specific virtualization features
we need to access the SIE control block residing in the kernel KVM

 The KVM-VFIO device allows interaction between VFIO devices and KVM.

 8) To enable system z specific virtualization features we have to manipulate
the zpci device in kernel.

 VFIO supports different device backends, currently pci_dev and working
 towards platform devices.  zpci might just be an extension to standard
 pci.

 
 7 - 8 At least this is not as straightforward as the pure kernel approach, but
 I have to dig into that in more detail if we could only agree on a vfio 
 solution.

Please do so, yes :).

 
 For this reasons I decided to implement a kernel based approach similar
 to x86 device assignment. There is a new qemu device (s390-pci) 
 representing a
 pass through device on the host. Here is a sample qemu device configuration:

 -device s390-pci,host=:00:00.0

 The device executes the KVM_ASSIGN_PCI_DEVICE ioctl to create a proxy 
 instance
 in the kernel KVM and connect this instance to the host pci device.

 kernel patches apply to linux-kvm

 s390: cio: chsc function to register GIB
 s390: pci: export pci functions for pass-through usage
 KVM: s390: Add GISA support
 KVM: s390: Add PCI pass-through support

 qemu patches apply to qemu-master

 s390: Add PCI bus support
 s390: Add PCI pass-through device support

 Feedback and discussion is highly welcome ...

 KVM-based device assignment needs to go away.  It's a horrible model for
 devices, it offers very little protection to the kernel, assumes every
 device is fully isolated and visible to the IOMMU, relies on smattering
 of sysfs files to operate, etc.  x86, POWER, and ARM are all moving to
 VFIO-based device assignment.  Why is s390 special enough to repeat all
 the mistakes that x86 did?  Thanks,

 
 Is this your personal opinion or was this a strategic decision of the
 QEMU/KVM community? Can anybody give us direction about this?
 
 Actually I can understand your point. In the last weeks I did some development
 and testing regarding the use of vfio too. But the in kernel solutions seems 
 to
 offer the best performance and most straighforward implementation for our
 platform.

I don't see why there should be any difference in

Re: [Qemu-devel] [RFC][patch 4/6] KVM: s390: Add PCI pass-through support

2014-09-05 Thread Alexander Graf



On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote:
 From: Frank Blaschka frank.blasc...@de.ibm.com
 
 This patch implemets PCI pass-through kernel support for s390.
 Design approach is very similar to the x86 device assignment.
 User space executes the KVM_ASSIGN_PCI_DEVICE ioctl to create
 a proxy instance in the kernel KVM and connect this instance to the
 host pci device. s390 pci instructions are intercepted in kernel and
 operations are passed directly to the assigned pci device.
 To take advantage of all system z specific virtualization features
 we need to access the SIE control block residing in KVM. Also we have to
 enable z pci devices with special configuration information coming
 form the SIE block as well.
 
 Signed-off-by: Frank Blaschka frank.blasc...@de.ibm.com
 ---
  arch/s390/include/asm/kvm_host.h |1 
  arch/s390/kvm/Makefile   |2 
  arch/s390/kvm/intercept.c|1 
  arch/s390/kvm/kvm-s390.c |   33 
  arch/s390/kvm/kvm-s390.h |   17 
  arch/s390/kvm/pci.c  | 2130 
 +++
  arch/s390/kvm/priv.c |   21 
  7 files changed, 2202 insertions(+), 3 deletions(-)


I would love to review this patch, but in its current form it's
impossible to do. I can't possibly keep  2000 lines of code in my head.


Alex

[Qemu-devel] [PATCH v7 00/28] modify boot order of guest, and take effect after rebooting

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Sometimes, we want to modify boot order of a guest, but no need to
shutdown it. We can call dynamic changing bootindex of a guest, which
can be assured taking effect just after the guest rebooting.

For example, in P2V scene, we boot a guest and then attach a
new system disk, for copying some thing. We want to assign the
new disk as the booting disk, which means its bootindex=1.

Different nics can be assigen different bootindex dynamically
also make sense.

This patch series do belows works:
 1. add an fw_cfg_machine_reset() assure re-read global fw_boot_order list
   during vm rebooting.
 2. switch the property from qdev to qom, then use the set
callback to also update the fw_cfg file.

 Note:
 - Do not support change pci option rom's bootindex.
 - Do not handle those devices which don't have use the bootindex property.

changes since v6:
 - move all bootindex/boot-device code to a new file, named bootdevice.c.
 - introduce a getter/setter wrapper for all device.
 - call add_boot_device_path in setter bootindx callback function.
 - call del_boot_device_path in finalize bootindex qom callback function.
 - other bugfixes.

 Thanks for Eduardo's good suggestion! And other guys, thanks too!

changes since v5:
 rework by Gerd and Markus's suggestion(Thanks a lot):
 - Set/update bootindex on reset instead of realize/init.
 - Switch the property from qdev to qom, then use the set
   callback to also update the fw_cfg file.
 - using qom-set instead of 'set-bootindex' qmp interface,
   remove it.

 This is a huge change relative to the previous version. 

Changes since v4:
 - using error_setg() instead of qerror_report() in patch 1/8.
 - call del_boot_device_path() from device_finalize() instead
  of placing it into each individual device in patch 4/8.

Changes since v3:
 - rework del_* and modify_* function, because of virtio devices' specialation.
   For example, virtio-net's id is NULL, and its parent virtio-net-pci's id was 
assigned.
   Though the global fw_boot_order stored the virtio-net device.
 - call dell_boot_device_path in each individual device avoiding waste resouce.
 - introduce qmp query-bootindex command
 - introcude hmp info bootindex command
 - Fixes by Eric's reviewing comments, thanks.

Changes since v2:
 *address Gerd's reviewing suggestion:
 - use the old entry's suffix, if the caller do not pass it in.
 - call del_boot_device_path() from device_finalize() instead
   of placing it into each individual device.

Changes since v1:
 *rework by Gerd's suggestion:
 - split modify and del fw_boot_order for single function.
 - change modify bootindex's realization which simply lookup
   the device and modify the bootindex. if the new bootindex
   has already used by another device just throw an error.
 - change to del_boot_device_path(DeviceState *dev) and simply delete all
   entries belonging to the device.

For Convenience of testing, my test case based on Andreas's patch series:

 [PATCH qom-next 0/4] qom: HMP commands to replace info qtree
 http://thread.gmane.org/gmane.comp.emulators.qemu/271513
However, there is no direct relation with this bootindex patch series.

This series based on my patch posted yestoday:
 [PATCH] virtio-pci: fix virtio-net child refcount in transports
http://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00805.html

./qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 -name redhat6.2 -drive \
file=/home/win7_32_2U,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,\
unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive \
file=/home/rhel-server-7.0-x86_64-dvd.iso,if=none,id=drive-ide0-0-1 \
-device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=4 \
-vnc 0.0.0.0:10 -netdev type=user,id=net0 -device 
virtio-net-pci,netdev=net0,bootindex=3,id=nic1 \
-drive 
file=/mnt/sdb/gonglei/image/virtio-win-1.5.3.vfd,if=none,id=drive-fdc0-0-0,format=raw
 \
-device isa-fdc,driveA=drive-fdc0-0-0,bootindexA=5,id=floppy1 -qmp 
unix:/tmp/qmp,server,nowait \
-monitor stdio -netdev type=user,id=net1 -device 
e1000,netdev=net1,bootindex=2,id=nic \
-boot menu=on -device virtio-scsi-pci,id=scsi0 -drive 
file=/home/suse11_sp3_32,if=none,\
id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
-device 
scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=8
QEMU 2.1.50 monitor - type 'help' for more information
(qemu) qom-get nic1 bootindex
3 (0x3)
(qemu) qom-set nic1 bootindex 3
The bootindex 3 has already been used
(qemu) qom-set nic1 bootindex 0
(qemu) qom-set floppy1 bootindexA 3
(qemu) system_reset
(qemu) qom-get nic1 bootindex
0 (0x0)
(qemu) qom-get scsi0-0-0-0 bootindex
8 (0x8)
(qemu) qom-set scsi0-0-0-0 bootindex 0
The bootindex 0 has already been used
(qemu) qom-set nic1 bootindex -1
(qemu) qom-set scsi0-0-0-0 bootindex 0
(qemu) qom-get scsi0-0-0-0 bootindex
0 (0x0)
(qemu) 

Gonglei (28):
  bootdevice: move bootdevice related code to new file bootdevice.c
  bootindex: add check

[Qemu-devel] [PATCH v7 14/28] spapr_lian: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/spapr_llan.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c
index 2d47df6..38bbb01 100644
--- a/hw/net/spapr_llan.c
+++ b/hw/net/spapr_llan.c
@@ -219,6 +219,29 @@ static int spapr_vlan_init(VIOsPAPRDevice *sdev)
 return 0;
 }
 
+static void spapr_vlan_get_bootindex(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj);
+
+get_bootindex(dev-nicconf.bootindex, v, name, errp);
+}
+
+static void spapr_vlan_set_bootindex(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj);
+
+set_bootindex(dev-nicconf.bootindex, v, name, errp);
+}
+
+static void spapr_vlan_instance_init(Object *obj)
+{
+object_property_add(obj, bootindex, int,
+spapr_vlan_get_bootindex,
+spapr_vlan_set_bootindex, NULL, NULL, NULL);
+}
+
 void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd)
 {
 DeviceState *dev;
@@ -546,6 +569,7 @@ static const TypeInfo spapr_vlan_info = {
 .parent= TYPE_VIO_SPAPR_DEVICE,
 .instance_size = sizeof(VIOsPAPRVLANDevice),
 .class_init= spapr_vlan_class_init,
+.instance_init = spapr_vlan_instance_init,
 };
 
 static void spapr_vlan_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 11/28] ne2000: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/ne2000.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c
index a62d12d..62b86af 100644
--- a/hw/net/ne2000.c
+++ b/hw/net/ne2000.c
@@ -752,6 +752,17 @@ static void pci_ne2000_exit(PCIDevice *pci_dev)
 qemu_free_irq(s-irq);
 }
 
+static void ne2000_instance_init(Object *obj)
+{
+PCIDevice *pci_dev = PCI_DEVICE(obj);
+PCINE2000State *d = DO_UPCAST(PCINE2000State, dev, pci_dev);
+NE2000State *s = d-ne2000;
+
+device_add_bootindex_property(obj, s-c.bootindex,
+  bootindex, /ethernet-phy@0,
+  pci_dev-qdev, NULL);
+}
+
 static Property ne2000_properties[] = {
 DEFINE_NIC_PROPERTIES(PCINE2000State, ne2000.c),
 DEFINE_PROP_END_OF_LIST(),
@@ -778,6 +789,7 @@ static const TypeInfo ne2000_info = {
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(PCINE2000State),
 .class_init= ne2000_class_init,
+.instance_init = ne2000_instance_init,
 };
 
 static void ne2000_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 16/28] usb-net: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/usb/dev-network.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/usb/dev-network.c b/hw/usb/dev-network.c
index 518d536..0dd4df7 100644
--- a/hw/usb/dev-network.c
+++ b/hw/usb/dev-network.c
@@ -1376,6 +1376,16 @@ static int usb_net_initfn(USBDevice *dev)
 return 0;
 }
 
+static void usb_net_instance_init(Object *obj)
+{
+USBDevice *dev = USB_DEVICE(obj);
+USBNetState *s = DO_UPCAST(USBNetState, dev, dev);
+
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  dev-qdev, NULL);
+}
+
 static USBDevice *usb_net_init(USBBus *bus, const char *cmdline)
 {
 Error *local_err = NULL;
@@ -1439,6 +1449,7 @@ static const TypeInfo net_info = {
 .parent= TYPE_USB_DEVICE,
 .instance_size = sizeof(USBNetState),
 .class_init= usb_net_class_initfn,
+.instance_init = usb_net_instance_init,
 };
 
 static void usb_net_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 19/28] pci-assign: remove bootindex property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/i386/kvm/pci-assign.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 17c7d6dc..56f387b 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1850,13 +1850,23 @@ static void assigned_exitfn(struct PCIDevice *pci_dev)
 free_assigned_device(dev);
 }
 
+static void assigned_dev_instance_init(Object *obj)
+{
+PCIDevice *pci_dev = PCI_DEVICE(obj);
+AssignedDevice *d = DO_UPCAST(AssignedDevice, dev, PCI_DEVICE(obj));
+
+device_add_bootindex_property(obj, d-bootindex,
+  bootindex, NULL,
+  pci_dev-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
+}
+
 static Property assigned_dev_properties[] = {
 DEFINE_PROP_PCI_HOST_DEVADDR(host, AssignedDevice, host),
 DEFINE_PROP_BIT(prefer_msi, AssignedDevice, features,
 ASSIGNED_DEVICE_PREFER_MSI_BIT, false),
 DEFINE_PROP_BIT(share_intx, AssignedDevice, features,
 ASSIGNED_DEVICE_SHARE_INTX_BIT, true),
-DEFINE_PROP_INT32(bootindex, AssignedDevice, bootindex, -1),
 DEFINE_PROP_STRING(configfd, AssignedDevice, configfd_name),
 DEFINE_PROP_END_OF_LIST(),
 };
@@ -1882,6 +1892,7 @@ static const TypeInfo assign_info = {
 .parent = TYPE_PCI_DEVICE,
 .instance_size  = sizeof(AssignedDevice),
 .class_init = assign_class_init,
+.instance_init  = assigned_dev_instance_init,
 };
 
 static void assign_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 13/28] rtl8139: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/rtl8139.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 6e59f38..138a03a 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3543,6 +3543,15 @@ static int pci_rtl8139_init(PCIDevice *dev)
 return 0;
 }
 
+static void rtl8139_instance_init(Object *obj)
+{
+RTL8139State *s = RTL8139(obj);
+
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(obj), NULL);
+}
+
 static Property rtl8139_properties[] = {
 DEFINE_NIC_PROPERTIES(RTL8139State, conf),
 DEFINE_PROP_END_OF_LIST(),
@@ -3571,6 +3580,7 @@ static const TypeInfo rtl8139_info = {
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(RTL8139State),
 .class_init= rtl8139_class_init,
+.instance_init = rtl8139_instance_init,
 };
 
 static void rtl8139_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 07/28] bootindex: add a setter/getter functions wrapper for bootindex property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

when we remove bootindex form qdev.property to qom.property,
we can use those functions set/get bootindex property for all
correlative devices.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 bootdevice.c| 70 +
 include/sysemu/sysemu.h |  3 +++
 2 files changed, 73 insertions(+)

diff --git a/bootdevice.c b/bootdevice.c
index 484d0c9..c669293 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -23,6 +23,7 @@
  */
 
 #include sysemu/sysemu.h
+#include qapi/visitor.h
 
 typedef struct FWBootEntry FWBootEntry;
 
@@ -207,3 +208,72 @@ char *get_boot_devices_list(size_t *size, bool 
ignore_suffixes)
 }
 return list;
 }
+
+typedef struct {
+int32_t *bootindex;
+const char *suffix;
+DeviceState *dev;
+} BootIndexProperty;
+
+static void device_get_bootindex(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+BootIndexProperty *prop = opaque;
+visit_type_int32(v, prop-bootindex, name, errp);
+}
+
+static void device_set_bootindex(Object *obj, Visitor *v, void *opaque,
+ const char *name, Error **errp)
+{
+BootIndexProperty *prop = opaque;
+int32_t boot_index;
+Error *local_err = NULL;
+
+visit_type_int32(v, boot_index, name, local_err);
+if (local_err) {
+goto out;
+}
+/* check whether bootindex is present in fw_boot_order list  */
+check_boot_index(boot_index, local_err);
+if (local_err) {
+goto out;
+}
+/* change bootindex to a new one */
+*prop-bootindex = boot_index;
+
+out:
+if (local_err) {
+error_propagate(errp, local_err);
+}
+}
+
+static void property_release_bootindex(Object *obj, const char *name,
+   void *opaque)
+
+{
+BootIndexProperty *prop = opaque;
+g_free(prop);
+}
+
+void device_add_bootindex_property(Object *obj, int32_t *bootindex,
+   const char *name, const char *suffix,
+   DeviceState *dev, Error **errp)
+{
+Error *local_err = NULL;
+BootIndexProperty *prop = g_malloc0(sizeof(*prop));
+
+prop-bootindex = bootindex;
+prop-suffix = suffix;
+prop-dev = dev;
+
+object_property_add(obj, name, int,
+device_get_bootindex,
+device_set_bootindex,
+property_release_bootindex,
+prop, local_err);
+
+if (local_err) {
+error_propagate(errp, local_err);
+g_free(prop);
+}
+}
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 10cd573..a636d2e 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -215,6 +215,9 @@ char *get_boot_devices_list(size_t *size, bool 
ignore_suffixes);
 DeviceState *get_boot_device(uint32_t position);
 void check_boot_index(int32_t bootindex, Error **errp);
 void del_boot_device_path(DeviceState *dev);
+void device_add_bootindex_property(Object *obj, int32_t *bootindex,
+   const char *name, const char *suffix,
+   DeviceState *dev, Error **errp);
 
 QemuOpts *qemu_get_machine_opts(void);
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 03/28] bootindex: add del_boot_device_path function

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Introduce del_boot_device_path() to clean up fw_cfg content when
hot-unplugging a device that refers to a bootindex.

Signed-off-by: Gonglei arei.gong...@huawei.com
Signed-off-by: Chenliang chenlian...@huawei.com
---
 bootdevice.c| 21 +
 include/sysemu/sysemu.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/bootdevice.c b/bootdevice.c
index f5399df..89aca7f 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -51,6 +51,27 @@ void check_boot_index(int32_t bootindex, Error **errp)
 }
 }
 
+void del_boot_device_path(DeviceState *dev)
+{
+FWBootEntry *i;
+
+assert(dev != NULL);
+
+/* remove all entries of the assigned device */
+QTAILQ_FOREACH(i, fw_boot_order, link) {
+if (i-dev == NULL) {
+continue;
+}
+if (i-dev == dev) {
+QTAILQ_REMOVE(fw_boot_order, i, link);
+g_free(i-suffix);
+g_free(i);
+
+break;
+}
+}
+}
+
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix)
 {
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 72463de..10cd573 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -214,6 +214,7 @@ char *get_boot_devices_list(size_t *size, bool 
ignore_suffixes);
 
 DeviceState *get_boot_device(uint32_t position);
 void check_boot_index(int32_t bootindex, Error **errp);
+void del_boot_device_path(DeviceState *dev);
 
 QemuOpts *qemu_get_machine_opts(void);
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 21/28] redirect: remove bootindex property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/usb/redirect.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index 44522d9..352ba80 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -2468,7 +2468,6 @@ static Property usbredir_properties[] = {
 DEFINE_PROP_CHR(chardev, USBRedirDevice, cs),
 DEFINE_PROP_UINT8(debug, USBRedirDevice, debug, usbredirparser_warning),
 DEFINE_PROP_STRING(filter, USBRedirDevice, filter_str),
-DEFINE_PROP_INT32(bootindex, USBRedirDevice, bootindex, -1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -2493,11 +2492,23 @@ static void usbredir_class_initfn(ObjectClass *klass, 
void *data)
 set_bit(DEVICE_CATEGORY_MISC, dc-categories);
 }
 
+static void usbredir_instance_init(Object *obj)
+{
+USBDevice *udev = USB_DEVICE(obj);
+USBRedirDevice *dev = DO_UPCAST(USBRedirDevice, dev, udev);
+
+device_add_bootindex_property(obj, dev-bootindex,
+  bootindex, NULL,
+  udev-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
+}
+
 static const TypeInfo usbredir_dev_info = {
 .name  = usb-redir,
 .parent= TYPE_USB_DEVICE,
 .instance_size = sizeof(USBRedirDevice),
 .class_init= usbredir_class_initfn,
+.instance_init = usbredir_instance_init,
 };
 
 static void usbredir_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 23/28] scsi: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/scsi/scsi-disk.c| 14 ++
 hw/scsi/scsi-generic.c |  1 +
 include/hw/scsi/scsi.h |  1 +
 3 files changed, 16 insertions(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index e34a544..205cdb0 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2331,6 +2331,16 @@ static void scsi_disk_realize(SCSIDevice *dev, Error 
**errp)
 }
 }
 
+void scsi_dev_instance_init(Object *obj)
+{
+DeviceState *dev = DEVICE(obj);
+SCSIDevice *s = DO_UPCAST(SCSIDevice, qdev, dev);
+
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, NULL,
+  s-qdev, NULL);
+}
+
 static const SCSIReqOps scsi_disk_emulate_reqops = {
 .size = sizeof(SCSIDiskReq),
 .free_req = scsi_free_request,
@@ -2644,6 +2654,7 @@ static const TypeInfo scsi_hd_info = {
 .parent= TYPE_SCSI_DEVICE,
 .instance_size = sizeof(SCSIDiskState),
 .class_init= scsi_hd_class_initfn,
+.instance_init = scsi_dev_instance_init,
 };
 
 static Property scsi_cd_properties[] = {
@@ -2675,6 +2686,7 @@ static const TypeInfo scsi_cd_info = {
 .parent= TYPE_SCSI_DEVICE,
 .instance_size = sizeof(SCSIDiskState),
 .class_init= scsi_cd_class_initfn,
+.instance_init = scsi_dev_instance_init,
 };
 
 #ifdef __linux__
@@ -2705,6 +2717,7 @@ static const TypeInfo scsi_block_info = {
 .parent= TYPE_SCSI_DEVICE,
 .instance_size = sizeof(SCSIDiskState),
 .class_init= scsi_block_class_initfn,
+.instance_init = scsi_dev_instance_init,
 };
 #endif
 
@@ -2743,6 +2756,7 @@ static const TypeInfo scsi_disk_info = {
 .parent= TYPE_SCSI_DEVICE,
 .instance_size = sizeof(SCSIDiskState),
 .class_init= scsi_disk_class_initfn,
+.instance_init = scsi_dev_instance_init,
 };
 
 static void scsi_disk_register_types(void)
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 20587b4..f6100a9 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -513,6 +513,7 @@ static const TypeInfo scsi_generic_info = {
 .parent= TYPE_SCSI_DEVICE,
 .instance_size = sizeof(SCSIDevice),
 .class_init= scsi_generic_class_initfn,
+.instance_init = scsi_dev_instance_init,
 };
 
 static void scsi_generic_register_types(void)
diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 2e3a8f9..99c1c57 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -270,4 +270,5 @@ SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int 
target, int lun);
 /* scsi-generic.c. */
 extern const SCSIReqOps scsi_generic_req_ops;
 
+void scsi_dev_instance_init(Object *obj);
 #endif
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 17/28] net: remove bootindex property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Meanwhile set the initial value of bootindex to -1.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/e1000.c | 1 +
 hw/net/eepro100.c  | 1 +
 hw/net/lance.c | 1 +
 hw/net/ne2000.c| 1 +
 hw/net/pcnet-pci.c | 1 +
 hw/net/rtl8139.c   | 1 +
 hw/net/spapr_llan.c| 1 +
 hw/net/virtio-net.c| 1 +
 hw/net/vmxnet3.c   | 1 +
 hw/usb/dev-network.c   | 1 +
 hw/virtio/virtio-pci.c | 1 +
 include/net/net.h  | 4 +---
 12 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 0edbfa6..2e5dc41 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1627,6 +1627,7 @@ static void e1000_instance_init(Object *obj)
 device_add_bootindex_property(obj, n-conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(n), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static const TypeInfo e1000_base_info = {
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index fb9c944..cc79f09 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -1912,6 +1912,7 @@ static void eepro100_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(s), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static E100PCIDeviceInfo e100_devices[] = {
diff --git a/hw/net/lance.c b/hw/net/lance.c
index a1c49f1..4972b01 100644
--- a/hw/net/lance.c
+++ b/hw/net/lance.c
@@ -152,6 +152,7 @@ static void lance_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(obj), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static Property lance_properties[] = {
diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c
index 62b86af..6d31555 100644
--- a/hw/net/ne2000.c
+++ b/hw/net/ne2000.c
@@ -761,6 +761,7 @@ static void ne2000_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-c.bootindex,
   bootindex, /ethernet-phy@0,
   pci_dev-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static Property ne2000_properties[] = {
diff --git a/hw/net/pcnet-pci.c b/hw/net/pcnet-pci.c
index fb5f5d6..55f906d 100644
--- a/hw/net/pcnet-pci.c
+++ b/hw/net/pcnet-pci.c
@@ -353,6 +353,7 @@ static void pcnet_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(obj), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static Property pcnet_properties[] = {
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 138a03a..d921ebd 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3550,6 +3550,7 @@ static void rtl8139_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(obj), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static Property rtl8139_properties[] = {
diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c
index c3eb99a..c8b0978 100644
--- a/hw/net/spapr_llan.c
+++ b/hw/net/spapr_llan.c
@@ -226,6 +226,7 @@ static void spapr_vlan_instance_init(Object *obj)
 device_add_bootindex_property(obj, dev-nicconf.bootindex,
   bootindex, ,
   DEVICE(dev), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index f5f6bf7..3e6d3fa 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1719,6 +1719,7 @@ static void virtio_net_instance_init(Object *obj)
 device_add_bootindex_property(obj, n-nic_conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(n), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static Property virtio_net_properties[] = {
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 88e9d9c..56b22f3 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2183,6 +2183,7 @@ static void vmxnet3_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf.bootindex,
   bootindex, /ethernet-phy@0,
   DEVICE(obj), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static void vmxnet3_pci_uninit(PCIDevice *pci_dev)
diff

[Qemu-devel] [PATCH v7 15/28] vmxnet3: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/spapr_llan.c | 22 --
 hw/net/vmxnet3.c|  8 
 2 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c
index 38bbb01..c3eb99a 100644
--- a/hw/net/spapr_llan.c
+++ b/hw/net/spapr_llan.c
@@ -219,27 +219,13 @@ static int spapr_vlan_init(VIOsPAPRDevice *sdev)
 return 0;
 }
 
-static void spapr_vlan_get_bootindex(Object *obj, Visitor *v, void *opaque,
- const char *name, Error **errp)
-{
-VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj);
-
-get_bootindex(dev-nicconf.bootindex, v, name, errp);
-}
-
-static void spapr_vlan_set_bootindex(Object *obj, Visitor *v, void *opaque,
- const char *name, Error **errp)
+static void spapr_vlan_instance_init(Object *obj)
 {
 VIOsPAPRVLANDevice *dev = VIO_SPAPR_VLAN_DEVICE(obj);
 
-set_bootindex(dev-nicconf.bootindex, v, name, errp);
-}
-
-static void spapr_vlan_instance_init(Object *obj)
-{
-object_property_add(obj, bootindex, int,
-spapr_vlan_get_bootindex,
-spapr_vlan_set_bootindex, NULL, NULL, NULL);
+device_add_bootindex_property(obj, dev-nicconf.bootindex,
+  bootindex, ,
+  DEVICE(dev), NULL);
 }
 
 void spapr_vlan_create(VIOsPAPRBus *bus, NICInfo *nd)
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index f246fa1..88e9d9c 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -2177,6 +2177,13 @@ static int vmxnet3_pci_init(PCIDevice *pci_dev)
 return 0;
 }
 
+static void vmxnet3_instance_init(Object *obj)
+{
+VMXNET3State *s = VMXNET3(obj);
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(obj), NULL);
+}
 
 static void vmxnet3_pci_uninit(PCIDevice *pci_dev)
 {
@@ -2524,6 +2531,7 @@ static const TypeInfo vmxnet3_info = {
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(VMXNET3State),
 .class_init= vmxnet3_class_init,
+.instance_init = vmxnet3_instance_init,
 };
 
 static void vmxnet3_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 08/28] virtio-net: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/virtio-net.c|  3 +++
 hw/virtio/virtio-pci.c | 11 ---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 826a2a5..f5f6bf7 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1716,6 +1716,9 @@ static void virtio_net_instance_init(Object *obj)
  * Can be overriden with virtio_net_set_config_size.
  */
 n-config_size = sizeof(struct virtio_net_config);
+device_add_bootindex_property(obj, n-nic_conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(n), NULL);
 }
 
 static Property virtio_net_properties[] = {
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 78dcd68..0779d28 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1454,9 +1454,14 @@ static void virtio_net_pci_class_init(ObjectClass 
*klass, void *data)
 static void virtio_net_pci_instance_init(Object *obj)
 {
 VirtIONetPCI *dev = VIRTIO_NET_PCI(obj);
-object_initialize(dev-vdev, sizeof(dev-vdev), TYPE_VIRTIO_NET);
-object_property_add_child(obj, virtio-backend, OBJECT(dev-vdev), NULL);
-object_unref(OBJECT(dev-vdev));
+VirtIONet *n = dev-vdev;
+
+object_initialize(n, sizeof(dev-vdev), TYPE_VIRTIO_NET);
+object_property_add_child(obj, virtio-backend, OBJECT(n), NULL);
+object_unref(OBJECT(n));
+device_add_bootindex_property(obj, n-nic_conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(n), NULL);
 }
 
 static const TypeInfo virtio_net_pci_info = {
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 26/28] block: remove bootindex property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Meanwhile set the initial value of bootindex to -1.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/block/virtio-blk.c| 1 +
 hw/ide/qdev.c| 1 +
 hw/scsi/scsi-disk.c  | 2 +-
 hw/scsi/scsi-generic.c   | 1 -
 include/hw/block/block.h | 1 -
 5 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 9d04590..4d05114 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -808,6 +808,7 @@ static void virtio_blk_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf-bootindex,
   bootindex, /disk@0,0,
   DEVICE(obj), NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static Property virtio_blk_properties[] = {
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 9e2ed40..8382f24 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -200,6 +200,7 @@ static void ide_dev_instance_init(Object *obj)
   bootindex,
   d-unit ? /disk@1 : /disk@0,
   d-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static int ide_hd_initfn(IDEDevice *dev)
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 205cdb0..c37a8be 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2339,6 +2339,7 @@ void scsi_dev_instance_init(Object *obj)
 device_add_bootindex_property(obj, s-conf.bootindex,
   bootindex, NULL,
   s-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
 }
 
 static const SCSIReqOps scsi_disk_emulate_reqops = {
@@ -2692,7 +2693,6 @@ static const TypeInfo scsi_cd_info = {
 #ifdef __linux__
 static Property scsi_block_properties[] = {
 DEFINE_PROP_DRIVE(drive, SCSIDiskState, qdev.conf.bs),
-DEFINE_PROP_INT32(bootindex, SCSIDiskState, qdev.conf.bootindex, -1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index f6100a9..a542d1a 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -482,7 +482,6 @@ static SCSIRequest *scsi_new_request(SCSIDevice *d, 
uint32_t tag, uint32_t lun,
 
 static Property scsi_generic_properties[] = {
 DEFINE_PROP_DRIVE(drive, SCSIDevice, conf.bs),
-DEFINE_PROP_INT32(bootindex, SCSIDevice, conf.bootindex, -1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index 3a01488..867a226 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -49,7 +49,6 @@ static inline unsigned int get_physical_block_exp(BlockConf 
*conf)
   _conf.physical_block_size, 512),  \
 DEFINE_PROP_UINT16(min_io_size, _state, _conf.min_io_size, 0),  \
 DEFINE_PROP_UINT32(opt_io_size, _state, _conf.opt_io_size, 0),\
-DEFINE_PROP_INT32(bootindex, _state, _conf.bootindex, -1),\
 DEFINE_PROP_UINT32(discard_granularity, _state, \
_conf.discard_granularity, -1)
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/usb/host-libusb.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index c189147..a2fa70e 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -980,6 +980,17 @@ static int usb_host_initfn(USBDevice *udev)
 return 0;
 }
 
+static void usb_host_instance_init(Object *obj)
+{
+USBDevice *udev = USB_DEVICE(obj);
+USBHostDevice *s = USB_HOST_DEVICE(udev);
+
+device_add_bootindex_property(obj, s-bootindex,
+  bootindex, NULL,
+  udev-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
+}
+
 static void usb_host_handle_destroy(USBDevice *udev)
 {
 USBHostDevice *s = USB_HOST_DEVICE(udev);
@@ -1464,7 +1475,6 @@ static Property usb_host_dev_properties[] = {
 DEFINE_PROP_UINT32(productid, USBHostDevice, match.product_id, 0),
 DEFINE_PROP_UINT32(isobufs,  USBHostDevice, iso_urb_count,4),
 DEFINE_PROP_UINT32(isobsize, USBHostDevice, iso_urb_frames,   32),
-DEFINE_PROP_INT32(bootindex, USBHostDevice, bootindex,-1),
 DEFINE_PROP_UINT32(loglevel,  USBHostDevice, loglevel,
LIBUSB_LOG_LEVEL_WARNING),
 DEFINE_PROP_BIT(pipeline,USBHostDevice, options,
@@ -1497,6 +1507,7 @@ static TypeInfo usb_host_dev_info = {
 .parent= TYPE_USB_DEVICE,
 .instance_size = sizeof(USBHostDevice),
 .class_init= usb_host_class_initfn,
+.instance_init = usb_host_instance_init,
 };
 
 static void usb_host_register_types(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 20/28] vfio: remove bootindex property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindex form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/misc/vfio.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 40dcaa6..cdbaedf 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4365,13 +4365,23 @@ post_reset:
 vfio_pci_post_reset(vdev);
 }
 
+static void vfio_instance_init(Object *obj)
+{
+PCIDevice *pci_dev = PCI_DEVICE(obj);
+VFIODevice *vdev = DO_UPCAST(VFIODevice, pdev, PCI_DEVICE(obj));
+
+device_add_bootindex_property(obj, vdev-bootindex,
+  bootindex, NULL,
+  pci_dev-qdev, NULL);
+object_property_set_int(obj, -1, bootindex, NULL);
+}
+
 static Property vfio_pci_dev_properties[] = {
 DEFINE_PROP_PCI_HOST_DEVADDR(host, VFIODevice, host),
 DEFINE_PROP_UINT32(x-intx-mmap-timeout-ms, VFIODevice,
intx.mmap_timeout, 1100),
 DEFINE_PROP_BIT(x-vga, VFIODevice, features,
 VFIO_FEATURE_ENABLE_VGA_BIT, false),
-DEFINE_PROP_INT32(bootindex, VFIODevice, bootindex, -1),
 /*
  * TODO - support passed fds... is this necessary?
  * DEFINE_PROP_STRING(vfiofd, VFIODevice, vfiofd_name),
@@ -4407,6 +4417,7 @@ static const TypeInfo vfio_pci_dev_info = {
 .parent = TYPE_PCI_DEVICE,
 .instance_size = sizeof(VFIODevice),
 .class_init = vfio_pci_dev_class_init,
+.instance_init = vfio_instance_init,
 };
 
 static void register_vfio_pci_dev_type(void)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 24/28] ide: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/ide/qdev.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index efab95b..9e2ed40 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -191,6 +191,17 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind 
kind)
 return 0;
 }
 
+static void ide_dev_instance_init(Object *obj)
+{
+DeviceState *dev = DEVICE(obj);
+IDEDevice *d = DO_UPCAST(IDEDevice, qdev, dev);
+
+device_add_bootindex_property(obj, d-conf.bootindex,
+  bootindex,
+  d-unit ? /disk@1 : /disk@0,
+  d-qdev, NULL);
+}
+
 static int ide_hd_initfn(IDEDevice *dev)
 {
 return ide_dev_initfn(dev, IDE_HD);
@@ -238,6 +249,7 @@ static const TypeInfo ide_hd_info = {
 .parent= TYPE_IDE_DEVICE,
 .instance_size = sizeof(IDEDrive),
 .class_init= ide_hd_class_init,
+.instance_init = ide_dev_instance_init,
 };
 
 static Property ide_cd_properties[] = {
@@ -260,6 +272,7 @@ static const TypeInfo ide_cd_info = {
 .parent= TYPE_IDE_DEVICE,
 .instance_size = sizeof(IDEDrive),
 .class_init= ide_cd_class_init,
+.instance_init = ide_dev_instance_init,
 };
 
 static Property ide_drive_properties[] = {
@@ -282,6 +295,7 @@ static const TypeInfo ide_drive_info = {
 .parent= TYPE_IDE_DEVICE,
 .instance_size = sizeof(IDEDrive),
 .class_init= ide_drive_class_init,
+.instance_init = ide_dev_instance_init,
 };
 
 static void ide_device_class_init(ObjectClass *klass, void *data)
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 22/28] isa-fdc: remove bootindexA/B property from qdev to qom

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Remove bootindexA/B form qdev property to qom, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/block/fdc.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 490d127..51b1f08 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2217,8 +2217,6 @@ static Property isa_fdc_properties[] = {
 DEFINE_PROP_UINT32(dma, FDCtrlISABus, dma, 2),
 DEFINE_PROP_DRIVE(driveA, FDCtrlISABus, state.drives[0].bs),
 DEFINE_PROP_DRIVE(driveB, FDCtrlISABus, state.drives[1].bs),
-DEFINE_PROP_INT32(bootindexA, FDCtrlISABus, bootindexA, -1),
-DEFINE_PROP_INT32(bootindexB, FDCtrlISABus, bootindexB, -1),
 DEFINE_PROP_BIT(check_media_rate, FDCtrlISABus, state.check_media_rate,
 0, true),
 DEFINE_PROP_END_OF_LIST(),
@@ -2236,11 +2234,26 @@ static void isabus_fdc_class_init(ObjectClass *klass, 
void *data)
 set_bit(DEVICE_CATEGORY_STORAGE, dc-categories);
 }
 
+static void isabus_fdc_instance_init(Object *obj)
+{
+FDCtrlISABus *isa = ISA_FDC(obj);
+
+device_add_bootindex_property(obj, isa-bootindexA,
+  bootindexA, /floppy@0,
+  DEVICE(obj), NULL);
+device_add_bootindex_property(obj, isa-bootindexB,
+  bootindexB, /floppy@1,
+  DEVICE(obj), NULL);
+object_property_set_int(obj, -1, bootindexA, NULL);
+object_property_set_int(obj, -1, bootindexB, NULL);
+}
+
 static const TypeInfo isa_fdc_info = {
 .name  = TYPE_ISA_FDC,
 .parent= TYPE_ISA_DEVICE,
 .instance_size = sizeof(FDCtrlISABus),
 .class_init= isabus_fdc_class_init,
+.instance_init = isabus_fdc_instance_init,
 };
 
 static const VMStateDescription vmstate_sysbus_fdc ={
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 12/28] pcnet: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/lance.c | 12 
 hw/net/pcnet-pci.c | 12 
 hw/net/pcnet.h |  1 -
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/hw/net/lance.c b/hw/net/lance.c
index 7811a9e..a1c49f1 100644
--- a/hw/net/lance.c
+++ b/hw/net/lance.c
@@ -42,6 +42,7 @@
 #include hw/sparc/sun4m.h
 #include pcnet.h
 #include trace.h
+#include sysemu/sysemu.h
 
 #define TYPE_LANCE lance
 #define SYSBUS_PCNET(obj) \
@@ -143,6 +144,16 @@ static void lance_reset(DeviceState *dev)
 pcnet_h_reset(d-state);
 }
 
+static void lance_instance_init(Object *obj)
+{
+SysBusPCNetState *d = SYSBUS_PCNET(obj);
+PCNetState *s = d-state;
+
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(obj), NULL);
+}
+
 static Property lance_properties[] = {
 DEFINE_PROP_PTR(dma, SysBusPCNetState, state.dma_opaque),
 DEFINE_NIC_PROPERTIES(SysBusPCNetState, state.conf),
@@ -169,6 +180,7 @@ static const TypeInfo lance_info = {
 .parent= TYPE_SYS_BUS_DEVICE,
 .instance_size = sizeof(SysBusPCNetState),
 .class_init= lance_class_init,
+.instance_init = lance_instance_init,
 };
 
 static void lance_register_types(void)
diff --git a/hw/net/pcnet-pci.c b/hw/net/pcnet-pci.c
index 50ffe91..fb5f5d6 100644
--- a/hw/net/pcnet-pci.c
+++ b/hw/net/pcnet-pci.c
@@ -32,6 +32,7 @@
 #include hw/loader.h
 #include qemu/timer.h
 #include sysemu/dma.h
+#include sysemu/sysemu.h
 
 #include pcnet.h
 
@@ -344,6 +345,16 @@ static void pci_reset(DeviceState *dev)
 pcnet_h_reset(d-state);
 }
 
+static void pcnet_instance_init(Object *obj)
+{
+PCIPCNetState *d = PCI_PCNET(obj);
+PCNetState *s = d-state;
+
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(obj), NULL);
+}
+
 static Property pcnet_properties[] = {
 DEFINE_NIC_PROPERTIES(PCIPCNetState, state.conf),
 DEFINE_PROP_END_OF_LIST(),
@@ -372,6 +383,7 @@ static const TypeInfo pcnet_info = {
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(PCIPCNetState),
 .class_init= pcnet_class_init,
+.instance_init = pcnet_instance_init,
 };
 
 static void pci_pcnet_register_types(void)
diff --git a/hw/net/pcnet.h b/hw/net/pcnet.h
index 9dee6f3..f8e8a6f 100644
--- a/hw/net/pcnet.h
+++ b/hw/net/pcnet.h
@@ -66,5 +66,4 @@ void pcnet_set_link_status(NetClientState *nc);
 void pcnet_common_cleanup(PCNetState *d);
 int pcnet_common_init(DeviceState *dev, PCNetState *s, NetClientInfo *info);
 extern const VMStateDescription vmstate_pcnet;
-
 #endif
-- 
1.7.12.4

Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target-ppc: Build error log

2014-09-05 Thread Alexander Graf



On 05.09.14 10:28, Aravinda Prasad wrote:
 
 
 On Friday 05 September 2014 01:34 PM, Alexander Graf wrote:


 On 04.09.14 13:13, Aravinda Prasad wrote:
 Whenever there is a physical memory error due to bit
 flips, which cannot be corrected by hardware, the error
 is passed on to the kernel. If the memory address in
 error belongs to guest address space then guest kernel
 is responsible to take action. Hence the error is passed
 on to guest via KVM by invoking 0x200 NMI vector.

 However, guest OS, as per PAPR, expects an error log
 upon such error. This patch registers a new hcall
 which is issued from 0x200 interrupt vector and builds
 the error log, copies the error log to rtas space and
 passes the address of the error log to guest

 Enhancement to KVM to perform above functionality is
 already in upstream kernel.

 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
  hw/ppc/spapr_hcall.c   |  154 
 
  include/hw/ppc/spapr.h |4 +
  2 files changed, 157 insertions(+), 1 deletion(-)

 diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
 index 01650ba..c3aa448 100644
 --- a/hw/ppc/spapr_hcall.c
 +++ b/hw/ppc/spapr_hcall.c
 @@ -14,6 +14,88 @@ struct SPRSyncState {
  target_ulong mask;
  };
  
 +/* Offset from rtas-base where error log is placed */
 +#define RTAS_ERROR_OFFSET   (TARGET_PAGE_SIZE)

 You can't assume this. Please compute the value at the same place you
 compute the rtas-size.
 
 sure
 
 

 +
 +#define RTAS_ELOG_SEVERITY_SHIFT 0x5
 +#define RTAS_ELOG_DISPOSITION_SHIFT  0x3
 +#define RTAS_ELOG_INITIATOR_SHIFT0x4
 +
 +/*
 + * Only required RTAS event severity, disposition, initiator
 + * target and type are copied from arch/powerpc/include/asm/rtas.h
 + */
 +
 +/* RTAS event severity */
 +#define RTAS_SEVERITY_ERROR_SYNC0x3
 +
 +/* RTAS event disposition */
 +#define RTAS_DISP_NOT_RECOVERED 0x2
 +
 +/* RTAS event initiator */
 +#define RTAS_INITIATOR_MEMORY   0x4
 +
 +/* RTAS event target */
 +#define RTAS_TARGET_MEMORY  0x4
 +
 +/* RTAS event type */
 +#define RTAS_TYPE_ECC_UNCORR0x09
 +
 +/*
 + * Currently KVM only passes on the uncorrected machine
 + * check memory error to guest. Other machine check errors
 + * such as SLB multi-hit and TLB multi-hit are recovered
 + * in KVM and are not passed on to guest.
 + *
 + * DSISR Bit for uncorrected machine check error. Based
 + * on arch/powerpc/include/asm/mce.h

 Please don't include Linux code. This file is GPLv2+ licensed and I
 don't want to taint it as GPLv2 only just for the sake of mce.
 
 Hmm.. ok.  In that case I should not copy rtas_error_log structure below
 from kernel source as well.
 

 + */
 +#define PPC_BIT(bit)(0x8000ULL  bit)
 +#define P7_DSISR_MC_UE  (PPC_BIT(48))  /* P8 too */
 +
 +/* Adopted from kernel source arch/powerpc/include/asm/rtas.h */
 +struct rtas_error_log {
 +/* Byte 0 */
 +uint8_t byte0;  /* Architectural version */
 +
 +/* Byte 1 */
 +uint8_t byte1;
 +/* 
 + * XXX  3: Severity level of error
 + *XX2: Degree of recovery
 + *  X   1: Extended log present?
 + *   XX 2: Reserved
 + */
 +
 +/* Byte 2 */
 +uint8_t byte2;
 +/* 
 + *  4: Initiator of event
 + *  4: Target of failed operation
 + */
 +uint8_t byte3;  /* General event or error*/
 +};
 +
 +/*
 + * Data format in RTAS-Blob
 + *
 + * This structure contains error information related to Machine
 + * Check exception. This is filled up and copied to rtas-blob
 + * upon machine check exception.
 + */
 +struct rtas_mc_log {
 +target_ulong srr0;
 +target_ulong srr1;
 +/*
 + * Beginning of error log address. This is properly
 + * populated and passed on to OS registered machine
 + * check notification routine upon machine check
 + * exception
 + */
 +target_ulong r3;
 +struct rtas_error_log err_log;
 +};
 +
  static void do_spr_sync(void *arg)
  {
  struct SPRSyncState *s = arg;
 @@ -586,6 +668,77 @@ static target_ulong h_rtas_update(PowerPCCPU *cpu, 
 sPAPREnvironment *spapr,
  return 0;
  }
  
 +static target_ulong h_report_mc_err(PowerPCCPU *cpu, sPAPREnvironment 
 *spapr,
 + target_ulong opcode, target_ulong *args)
 +{
 +struct rtas_mc_log mc_log;
 +CPUPPCState *env = cpu-env;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +
 +/*
 + * We save the original r3 register in SPRG2 in 0x200 vector,
 + * which is patched during call to ibm.nmi-register. Original
 + * r3 is required to be included in error log
 + */
 +mc_log.r3 = env-spr[SPR_SPRG2];

 Don't you have to call cpu_synchronize_registers() before you access
 SPRG2? Otherwise the value may be stale.
 
 Yes true. Will include.
 

 +
 +/*
 + * SRR0 and SRR1,

[Qemu-devel] [PATCH v7 09/28] e1000: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/e1000.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 272df00..0edbfa6 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1621,10 +1621,19 @@ static void e1000_class_init(ObjectClass *klass, void 
*data)
 dc-props = e1000_properties;
 }
 
+static void e1000_instance_init(Object *obj)
+{
+E1000State *n = E1000(obj);
+device_add_bootindex_property(obj, n-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(n), NULL);
+}
+
 static const TypeInfo e1000_base_info = {
 .name  = TYPE_E1000_BASE,
 .parent= TYPE_PCI_DEVICE,
 .instance_size = sizeof(E1000State),
+.instance_init = e1000_instance_init,
 .class_size= sizeof(E1000BaseClass),
 .abstract  = true,
 };
@@ -1668,6 +1677,7 @@ static void e1000_register_types(void)
 type_info.parent = TYPE_E1000_BASE;
 type_info.class_data = (void *)info;
 type_info.class_init = e1000_class_init;
+type_info.instance_init = e1000_instance_init;
 
 type_register(type_info);
 }
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 05/28] bootindex: rework add_boot_device_path function

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add the function of updating bootindex about fw_boot_order list
in add_boot_device_path(). We should delete the old one if a
device has existed in global fw_boot_order list.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 bootdevice.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/bootdevice.c b/bootdevice.c
index 89aca7f..6f430ec 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -72,6 +72,34 @@ void del_boot_device_path(DeviceState *dev)
 }
 }
 
+static void del_original_boot_device(DeviceState *dev, const char *suffix)
+{
+FWBootEntry *i;
+
+if (dev == NULL) {
+return;
+}
+
+QTAILQ_FOREACH(i, fw_boot_order, link) {
+if (suffix) {
+if (i-dev == dev  !strcmp(i-suffix, suffix)) {
+QTAILQ_REMOVE(fw_boot_order, i, link);
+g_free(i-suffix);
+g_free(i);
+
+break;
+}
+} else { /* host-usb and scsi devices do not have a suffix */
+if (i-dev == dev) {
+QTAILQ_REMOVE(fw_boot_order, i, link);
+g_free(i);
+
+break;
+}
+}
+}
+}
+
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix)
 {
@@ -83,6 +111,8 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 
 assert(dev != NULL || suffix != NULL);
 
+del_original_boot_device(dev, suffix);
+
 node = g_malloc0(sizeof(FWBootEntry));
 node-bootindex = bootindex;
 node-suffix = g_strdup(suffix);
-- 
1.7.12.4

Re: [Qemu-devel] [Qemu-ppc] [PATCH 4/5] target-ppc: Handle ibm, nmi-register RTAS call

2014-09-05 Thread Alexander Graf



On 04.09.14 15:49, Aravinda Prasad wrote:
 
 
 On Thursday 04 September 2014 06:39 PM, Alexander Graf wrote:


 Am 04.09.2014 um 10:25 schrieb Aravinda Prasad 
 aravi...@linux.vnet.ibm.com:



 On Friday 29 August 2014 03:46 AM, Alexander Graf wrote:


 On 28.08.14 19:42, Aravinda Prasad wrote:


 On Thursday 28 August 2014 02:07 PM, Alexander Graf wrote:


 On 28.08.14 08:38, Aravinda Prasad wrote:


 On Wednesday 27 August 2014 04:07 PM, Alexander Graf wrote:


 On 25.08.14 15:45, Aravinda Prasad wrote:
 This patch adds FWNMI support in qemu for powerKVM
 guests by handling the ibm,nmi-register rtas call.
 Whenever OS issues ibm,nmi-register RTAS call, the
 machine check notification address is saved and the
 machine check interrupt vector 0x200 is patched to
 issue a private hcall.

 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
 hw/ppc/spapr_rtas.c|   91 
 
 include/hw/ppc/spapr.h |8 
 2 files changed, 98 insertions(+), 1 deletion(-)

 diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
 index 02ddbf9..1135d2b 100644
 --- a/hw/ppc/spapr_rtas.c
 +++ b/hw/ppc/spapr_rtas.c
 @@ -277,6 +277,91 @@ static void 
 rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
 rtas_st(rets, 0, ret);
 }

 +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
 +  sPAPREnvironment *spapr,
 +  uint32_t token, uint32_t nargs,
 +  target_ulong args,
 +  uint32_t nret, target_ulong rets)
 +{
 +int i;
 +uint32_t branch_inst = 0x4802;
 +target_ulong guest_machine_check_addr;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +/*
 + * Trampoline saves r3 in sprg2 and issues private hcall
 + * to request qemu to build error log. QEMU builds the
 + * error log, copies to rtas-blob and returns the address.
 + * The initial 16 bytes in rtas-blob consists of saved srr0
 + * and srr1 which we restore and pass on the actual error
 + * log address to OS handled mcachine check notification
 + * routine
 + */
 +uint32_t trampoline[] = {
 +0x7c7243a6,/* mtspr   SPRN_SPRG2,r3 */
 +0x3860,/* li  r3,0   */
 +/* 0xf004 is the KVMPPC_H_REPORT_ERR private HCALL */
 +0x6063f004,/* ori r3,r3,f004  */
 +/* Issue H_CALL */
 +0x4422,/*  sc  1 */

 So up to here we're saving r3 in SPRG2 (how do we know that we can
 clobber it?) and call our special hypercall.

 But what does all the cruft below here do?

 The saved r3 in SPRG2 is consumed in KVMPPC_H_REPORT_ERR hcall, hence we
 can clobber SPRG2 after hcall returns. I have included a comment in
 patch 3/5 while building error log. I think better I add one here as 
 well.


 +0x7c9243a6,/* mtspr r4 sprg2 */

 Apart from th fact that your order is wrong, this destroys the value of
 r3 that we saved above again.

 SPRG2 is saved inside hcall and hence we don't need SPRG2 further after
 KVMPPC_H_REPORT_ERR hcall returns.


 +0xe883,/* ld r4, 0(r3) */
 +0x7c9a03a6,/* mtspr r4, srr0 */
 +0xe8830008,/* ld r4, 8(r3) */
 +0x7c9b03a6,/* mtspr r4, srr1 */

 Can't we just set srr0 and srr1 directly?

 I checked for instructions in ISA which set srr0/1 directly given an
 address, but could not find any such instructions.

 I mean from QEMU :).

 srr0 and srr1, which are properly set when 0x200 is invoked, are
 clobbered when we return from KVMPPC_H_REPORT_ERR hcall. I think they
 are modified before issuing rfid (I can see them getting clobbered from
 QEMU monitor). However when we jump to OS registered machine check
 routine srr0 and srr1 should reflect the value they had when 0x200 was
 invoked.

 Hence srr0 and srr1 are saved in hcall and restored when we return from
 hcall. Also we don't have enough scratch registers available to save
 these before invoking hcall from 0x200.

 Or am I missing other ways to do this from QEMU?

 If you just do

 cpu_synchronize_state() and then change env-spr[SPRN_SRR0/1] inside
 your hypercall handler that should also change the value when you return
 from the hcall.

 I tried cpu_synchronize_state(), however, srr0 and srr1 are still clobbered.

 Just before I issue hcall from 0x200 I see the following values from
 QEMU monitor:

 SRR0 d0f40264  SRR1 80209033

 Inside hcall, I call cpu_synchronize_state(). As soon as I return from
 hcall I see:

 SRR0 0214  SRR1 80001001

 I see SRR0 is now set to nip in 0x200 and SRR1 to msr value. I think it
 is reset during returning from hcall before executing rfid.

Ah, because the hypercall is an interrupt itself. Heh ;). True.


Alex

[Qemu-devel] [PATCH v7 28/28] bootindex: delete bootindex when device is removed

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Device should be removed from global boot list when
it is hot-unplugged.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 bootdevice.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/bootdevice.c b/bootdevice.c
index e95d085..1255c06 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -254,6 +254,8 @@ static void property_release_bootindex(Object *obj, const 
char *name,
 
 {
 BootIndexProperty *prop = opaque;
+
+del_boot_device_path(prop-dev);
 g_free(prop);
 }
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 25/28] virtio-blk: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/block/virtio-blk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index a7f2827..9d04590 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -805,6 +805,9 @@ static void virtio_blk_instance_init(Object *obj)
  (Object **)s-blk.iothread,
  qdev_prop_allow_set_link_before_realize,
  OBJ_PROP_LINK_UNREF_ON_RELEASE, NULL);
+device_add_bootindex_property(obj, s-conf-bootindex,
+  bootindex, /disk@0,0,
+  DEVICE(obj), NULL);
 }
 
 static Property virtio_blk_properties[] = {
-- 
1.7.12.4

Re: [Qemu-devel] NBD TLS support in QEMU

2014-09-05 Thread Hani Benhabiles

On Wed, Sep 03, 2014 at 05:44:17PM +0100, Stefan Hajnoczi wrote:
 Hi,
 QEMU offers both NBD client and server functionality.  The NBD protocol
 runs unencrypted, which is a problem when the client and server
 communicate over an untrusted network.
 
 The particular use case that prompted this mail is storage migration in
 OpenStack.  The goal is to encrypt the NBD connection between source and
 destination hosts during storage migration.
 
 I think we can integrate TLS into the NBD protocol as an optional flag.
 A quick web search does not reveal existing open source SSL/TLS NBD
 implementations.  I do see a VMware NBDSSL protocol but there is no
 specification so I guess it is proprietary.

Not sold on this because:
- It requires (unnecessary) changes to the NBD specification.
- It would still be vulnerable to a protocol downgrade attack: ie. An attacker
  could strip the TLS flag from the server's response resulting in either:
  * Connection fallback to cleartext, if both ends don't force TLS.
  * Loss of backward compatibility, if one of the ends forces TLS, making the
reason for which such a flag is added moot IIUC.

IMO, it is more fool proof add some --use-tls flag to the client and server
implementations to wrap the data in TLS, just like HTTPS does for instance.

Also, so mean of verification is required (otherwise, back to point 0 being
vulnerable to sslstrip style attacks) either that the server's cert is signed
with a certain (self-generated) CA certificate or that it matches a certain
fingerprint. Doing it similarly on the server-side would allow hitting a 2nd
bird (authentication.)


 The NBD protocol starts with a negotiation phase.  This would be the
 appropriate place to indicate that TLS will be used.  After client and
 server complete TLS setup the connection can continue as normal.
 
 Besides QEMU, the userspace NBD tools (http://nbd.sf.net/) can also be
 extended to support TLS.  In this case the kernel needs a localhost
 socket and userspace handles TLS.
 
 Thoughts?
 
 Stefan

[Qemu-devel] [PATCH v7 02/28] bootindex: add check bootindex function

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Determine whether a given bootindex exists or not.
If exists, we report an error.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 bootdevice.c| 15 +++
 include/sysemu/sysemu.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/bootdevice.c b/bootdevice.c
index d5b8789..f5399df 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -36,6 +36,21 @@ struct FWBootEntry {
 static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
 QTAILQ_HEAD_INITIALIZER(fw_boot_order);
 
+void check_boot_index(int32_t bootindex, Error **errp)
+{
+FWBootEntry *i;
+
+if (bootindex = 0) {
+QTAILQ_FOREACH(i, fw_boot_order, link) {
+if (i-bootindex == bootindex) {
+error_setg(errp, The bootindex %d has already been used,
+   bootindex);
+return;
+}
+}
+}
+}
+
 void add_boot_device_path(int32_t bootindex, DeviceState *dev,
   const char *suffix)
 {
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 8de5100..72463de 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -213,6 +213,7 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 char *get_boot_devices_list(size_t *size, bool ignore_suffixes);
 
 DeviceState *get_boot_device(uint32_t position);
+void check_boot_index(int32_t bootindex, Error **errp);
 
 QemuOpts *qemu_get_machine_opts(void);
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 04/28] fw_cfg: add fw_cfg_machine_reset function

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

We must assure that the changed bootindex can take effect
when guest is rebooted. So we introduce fw_cfg_machine_reset(),
which change the fw_cfg file's bootindex data using the new
global fw_boot_order list.

Signed-off-by: Chenliang chenlian...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/nvram/fw_cfg.c | 55 ---
 include/hw/nvram/fw_cfg.h |  2 ++
 2 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index b71d251..e7ed27e 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -402,6 +402,26 @@ static void fw_cfg_add_bytes_read_callback(FWCfgState *s, 
uint16_t key,
 s-entries[arch][key].callback_opaque = callback_opaque;
 }
 
+static void *fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key,
+  void *data, size_t len)
+{
+void *ptr;
+int arch = !!(key  FW_CFG_ARCH_LOCAL);
+
+key = FW_CFG_ENTRY_MASK;
+
+assert(key  FW_CFG_MAX_ENTRY  len  UINT32_MAX);
+
+/* return the old data to the function caller, avoid memory leak */
+ptr = s-entries[arch][key].data;
+s-entries[arch][key].data = data;
+s-entries[arch][key].len = len;
+s-entries[arch][key].callback_opaque = NULL;
+s-entries[arch][key].callback = NULL;
+
+return ptr;
+}
+
 void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len)
 {
 fw_cfg_add_bytes_read_callback(s, key, NULL, NULL, data, len);
@@ -499,13 +519,42 @@ void fw_cfg_add_file(FWCfgState *s,  const char *filename,
 fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
 }
 
-static void fw_cfg_machine_ready(struct Notifier *n, void *data)
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
+void *data, size_t len)
+{
+int i, index;
+
+assert(s-files);
+
+index = be32_to_cpu(s-files-count);
+assert(index  FW_CFG_FILE_SLOTS);
+
+for (i = 0; i  index; i++) {
+if (strcmp(filename, s-files-f[i].name) == 0) {
+return fw_cfg_modify_bytes_read(s, FW_CFG_FILE_FIRST + i,
+ data, len);
+}
+}
+/* add new one */
+fw_cfg_add_file_callback(s, filename, NULL, NULL, data, len);
+return NULL;
+}
+
+static void fw_cfg_machine_reset(void *opaque)
 {
+void *ptr;
 size_t len;
-FWCfgState *s = container_of(n, FWCfgState, machine_ready);
+FWCfgState *s = opaque;
 char *bootindex = get_boot_devices_list(len, false);
 
-fw_cfg_add_file(s, bootorder, (uint8_t*)bootindex, len);
+ptr = fw_cfg_modify_file(s, bootorder, (uint8_t *)bootindex, len);
+g_free(ptr);
+}
+
+static void fw_cfg_machine_ready(struct Notifier *n, void *data)
+{
+FWCfgState *s = container_of(n, FWCfgState, machine_ready);
+qemu_register_reset(fw_cfg_machine_reset, s);
 }
 
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 72b1549..56e1ed7 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -76,6 +76,8 @@ void fw_cfg_add_file(FWCfgState *s, const char *filename, 
void *data,
 void fw_cfg_add_file_callback(FWCfgState *s, const char *filename,
   FWCfgReadCallback callback, void 
*callback_opaque,
   void *data, size_t len);
+void *fw_cfg_modify_file(FWCfgState *s, const char *filename, void *data,
+ size_t len);
 FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
 hwaddr crl_addr, hwaddr data_addr);
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 06/28] bootindex: support to set a existent device's bootindex to -1

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

When set a device's bootindex to -1, we remove it from global
fw_boot_order list.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 bootdevice.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/bootdevice.c b/bootdevice.c
index 6f430ec..484d0c9 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -106,6 +106,7 @@ void add_boot_device_path(int32_t bootindex, DeviceState 
*dev,
 FWBootEntry *node, *i;
 
 if (bootindex  0) {
+del_original_boot_device(dev, suffix);
 return;
 }
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 10/28] eepro100: add bootindex to qom property

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Add a qom property with the same name 'bootindex',
when we remove it form qdev property, things will
continue to work just fine, and we can use qom features
which are not supported by qdev property.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 hw/net/eepro100.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index 3cd826a..fb9c944 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -1906,6 +1906,14 @@ static int e100_nic_init(PCIDevice *pci_dev)
 return 0;
 }
 
+static void eepro100_instance_init(Object *obj)
+{
+EEPRO100State *s = DO_UPCAST(EEPRO100State, dev, PCI_DEVICE(obj));
+device_add_bootindex_property(obj, s-conf.bootindex,
+  bootindex, /ethernet-phy@0,
+  DEVICE(s), NULL);
+}
+
 static E100PCIDeviceInfo e100_devices[] = {
 {
 .name = i82550,
@@ -2104,7 +2112,8 @@ static void eepro100_register_types(void)
 type_info.parent = TYPE_PCI_DEVICE;
 type_info.class_init = eepro100_class_init;
 type_info.instance_size = sizeof(EEPRO100State);
-
+type_info.instance_init = eepro100_instance_init;
+
 type_register(type_info);
 }
 }
-- 
1.7.12.4

[Qemu-devel] [PATCH v7 01/28] bootdevice: move bootdevice related code to new file bootdevice.c

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 Makefile.target |   2 +-
 bootdevice.c| 142 
 include/sysemu/sysemu.h |   1 +
 vl.c| 118 +---
 4 files changed, 145 insertions(+), 118 deletions(-)
 create mode 100644 bootdevice.c

diff --git a/Makefile.target b/Makefile.target
index 1e8d7ab..e9ff1ee 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -127,7 +127,7 @@ endif #CONFIG_BSD_USER
 # System emulator target
 ifdef CONFIG_SOFTMMU
 obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
-obj-y += qtest.o
+obj-y += qtest.o bootdevice.o
 obj-y += hw/
 obj-$(CONFIG_FDT) += device_tree.o
 obj-$(CONFIG_KVM) += kvm-all.o
diff --git a/bootdevice.c b/bootdevice.c
new file mode 100644
index 000..d5b8789
--- /dev/null
+++ b/bootdevice.c
@@ -0,0 +1,142 @@
+/*
+ * QEMU Boot Device Implement
+ *
+ * Copyright (c) 2014 HUAWEI TECHNOLOGIES CO.,LTD.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include sysemu/sysemu.h
+
+typedef struct FWBootEntry FWBootEntry;
+
+struct FWBootEntry {
+QTAILQ_ENTRY(FWBootEntry) link;
+int32_t bootindex;
+DeviceState *dev;
+char *suffix;
+};
+
+static QTAILQ_HEAD(, FWBootEntry) fw_boot_order =
+QTAILQ_HEAD_INITIALIZER(fw_boot_order);
+
+void add_boot_device_path(int32_t bootindex, DeviceState *dev,
+  const char *suffix)
+{
+FWBootEntry *node, *i;
+
+if (bootindex  0) {
+return;
+}
+
+assert(dev != NULL || suffix != NULL);
+
+node = g_malloc0(sizeof(FWBootEntry));
+node-bootindex = bootindex;
+node-suffix = g_strdup(suffix);
+node-dev = dev;
+
+QTAILQ_FOREACH(i, fw_boot_order, link) {
+if (i-bootindex == bootindex) {
+fprintf(stderr, Two devices with same boot index %d\n, 
bootindex);
+exit(1);
+} else if (i-bootindex  bootindex) {
+continue;
+}
+QTAILQ_INSERT_BEFORE(i, node, link);
+return;
+}
+QTAILQ_INSERT_TAIL(fw_boot_order, node, link);
+}
+
+DeviceState *get_boot_device(uint32_t position)
+{
+uint32_t counter = 0;
+FWBootEntry *i = NULL;
+DeviceState *res = NULL;
+
+if (!QTAILQ_EMPTY(fw_boot_order)) {
+QTAILQ_FOREACH(i, fw_boot_order, link) {
+if (counter == position) {
+res = i-dev;
+break;
+}
+counter++;
+}
+}
+return res;
+}
+
+/*
+ * This function returns null terminated string that consist of new line
+ * separated device paths.
+ *
+ * memory pointed by size is assigned total length of the array in bytes
+ *
+ */
+char *get_boot_devices_list(size_t *size, bool ignore_suffixes)
+{
+FWBootEntry *i;
+size_t total = 0;
+char *list = NULL;
+
+QTAILQ_FOREACH(i, fw_boot_order, link) {
+char *devpath = NULL, *bootpath;
+size_t len;
+
+if (i-dev) {
+devpath = qdev_get_fw_dev_path(i-dev);
+assert(devpath);
+}
+
+if (i-suffix  !ignore_suffixes  devpath) {
+size_t bootpathlen = strlen(devpath) + strlen(i-suffix) + 1;
+
+bootpath = g_malloc(bootpathlen);
+snprintf(bootpath, bootpathlen, %s%s, devpath, i-suffix);
+g_free(devpath);
+} else if (devpath) {
+bootpath = devpath;
+} else if (!ignore_suffixes) {
+assert(i-suffix);
+bootpath = g_strdup(i-suffix);
+} else {
+bootpath = g_strdup();
+}
+
+if (total) {
+list[total-1] = '\n';
+}
+len = strlen(bootpath) + 1;
+list = g_realloc(list, total + len);
+memcpy(list[total], bootpath, len);
+total += len;
+g_free(bootpath);
+}
+
+*size = total;
+
+if (boot_strict

Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf

2014-09-05 Thread Gerd Hoffmann

  Hi,

 I can't track this back far enough. I'd feel safer if you checked that
 the multiplication can't overflow even in uint64_t.

Effectively it comes from the emulated graphics hardware (anything in
hw/display/*).  The gfx emulation must make sure that the framebuffer
fits into the video memory, which in turn pretty much implies that we
can't overflow uint64_t.  I think even uint32_t can't overflow with the
gfx hardware we are emulating today.

 (5) Instead, you really need to make sure that the very first
 multiplication fits in a signed int:

Makes sense.  I think it is easier to just multiply in 64bit, then check
the result is small enougth (new patch attached).

   /* display listener callbacks */
  @@ -495,7 +503,7 @@ static void interface_get_init_info(QXLInstance *sin, 
  QXLDevInitInfo *info)
   info-num_memslots = NUM_MEMSLOTS;
   info-num_memslots_groups = NUM_MEMSLOTS_GROUPS;
   info-internal_groupslot_id = 0;
  -info-qxl_ram_size = ssd-bufsize;
  +info-qxl_ram_size = 16 * 1024 * 1024;
   info-n_surfaces = ssd-num_surfaces;
   }

spice-server doesn't do anything with it, other than passing to
spice-client.  Not fully sure what spice-client uses this for, maybe as
some kind of hint for resource management.  Maybe not at all.

It surely doesn't matter at all for ssd-buf size.

cheers,
  Gerd

[Qemu-devel] [PATCH v7 27/28] bootindex: move calling add_boot_device_patch to bootindex setter function

2014-09-05 Thread arei.gonglei

From: Gonglei arei.gong...@huawei.com

On this way, we can assure the new bootindex take effect
during vm rebooting.

Signed-off-by: Gonglei arei.gong...@huawei.com
---
 bootdevice.c | 2 ++
 hw/block/fdc.c   | 3 ---
 hw/block/virtio-blk.c| 2 --
 hw/i386/kvm/pci-assign.c | 2 --
 hw/ide/qdev.c| 3 ---
 hw/misc/vfio.c   | 1 -
 hw/net/e1000.c   | 2 --
 hw/net/eepro100.c| 2 --
 hw/net/ne2000.c  | 2 --
 hw/net/pcnet.c   | 2 --
 hw/net/rtl8139.c | 2 --
 hw/net/spapr_llan.c  | 2 --
 hw/net/virtio-net.c  | 2 --
 hw/net/vmxnet3.c | 2 --
 hw/scsi/scsi-disk.c  | 1 -
 hw/scsi/scsi-generic.c   | 3 ---
 hw/usb/dev-network.c | 1 -
 hw/usb/host-libusb.c | 1 -
 hw/usb/redirect.c| 1 -
 19 files changed, 2 insertions(+), 34 deletions(-)

diff --git a/bootdevice.c b/bootdevice.c
index c669293..e95d085 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -241,6 +241,8 @@ static void device_set_bootindex(Object *obj, Visitor *v, 
void *opaque,
 /* change bootindex to a new one */
 *prop-bootindex = boot_index;
 
+add_boot_device_path(*prop-bootindex, prop-dev, prop-suffix);
+
 out:
 if (local_err) {
 error_propagate(errp, local_err);
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 51b1f08..970b7c3 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2142,9 +2142,6 @@ static void isabus_fdc_realize(DeviceState *dev, Error 
**errp)
 error_propagate(errp, err);
 return;
 }
-
-add_boot_device_path(isa-bootindexA, dev, /floppy@0);
-add_boot_device_path(isa-bootindexB, dev, /floppy@1);
 }
 
 static void sysbus_fdc_initfn(Object *obj)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 4d05114..787955b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -777,8 +777,6 @@ static void virtio_blk_device_realize(DeviceState *dev, 
Error **errp)
 bdrv_set_guest_block_size(s-bs, s-conf-logical_block_size);
 
 bdrv_iostatus_enable(s-bs);
-
-add_boot_device_path(s-conf-bootindex, dev, /disk@0,0);
 }
 
 static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index 56f387b..4159978 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -1825,8 +1825,6 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
 
 assigned_dev_load_option_rom(dev);
 
-add_boot_device_path(dev-bootindex, pci_dev-qdev, NULL);
-
 return 0;
 
 assigned_out:
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 8382f24..daf6992 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -185,9 +185,6 @@ static int ide_dev_initfn(IDEDevice *dev, IDEDriveKind kind)
 dev-serial = g_strdup(s-drive_serial_str);
 }
 
-add_boot_device_path(dev-conf.bootindex, dev-qdev,
- dev-unit ? /disk@1 : /disk@0);
-
 return 0;
 }
 
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index cdbaedf..d073304 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -4296,7 +4296,6 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
-add_boot_device_path(vdev-bootindex, pdev-qdev, NULL);
 vfio_register_err_notifier(vdev);
 
 return 0;
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 2e5dc41..34dde84 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1569,8 +1569,6 @@ static int pci_e1000_init(PCIDevice *pci_dev)
 
 qemu_format_nic_info_str(qemu_get_queue(d-nic), macaddr);
 
-add_boot_device_path(d-conf.bootindex, dev, /ethernet-phy@0);
-
 d-autoneg_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, e1000_autoneg_timer, 
d);
 d-mit_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, e1000_mit_timer, d);
 
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index cc79f09..9acc8a3 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -1901,8 +1901,6 @@ static int e100_nic_init(PCIDevice *pci_dev)
 s-vmstate-name = qemu_get_queue(s-nic)-model;
 vmstate_register(pci_dev-qdev, -1, s-vmstate, s);
 
-add_boot_device_path(s-conf.bootindex, pci_dev-qdev, /ethernet-phy@0);
-
 return 0;
 }
 
diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c
index 6d31555..b6d8cb1 100644
--- a/hw/net/ne2000.c
+++ b/hw/net/ne2000.c
@@ -738,8 +738,6 @@ static int pci_ne2000_init(PCIDevice *pci_dev)
   object_get_typename(OBJECT(pci_dev)), 
pci_dev-qdev.id, s);
 qemu_format_nic_info_str(qemu_get_queue(s-nic), s-c.macaddr.a);
 
-add_boot_device_path(s-c.bootindex, pci_dev-qdev, /ethernet-phy@0);
-
 return 0;
 }
 
diff --git a/hw/net/pcnet.c b/hw/net/pcnet.c
index 5299d52..d344c15 100644
--- a/hw/net/pcnet.c
+++ b/hw/net/pcnet.c
@@ -1735,8 +1735,6 @@ int pcnet_common_init(DeviceState *dev, PCNetState *s, 
NetClientInfo *info)
 s-nic = qemu_new_nic(info, s-conf, object_get_typename(OBJECT(dev)), 
dev-id, s);
 qemu_format_nic_info_str(qemu_get_queue(s-nic), s-conf.macaddr.a);
 
-add_boot_device_path(s-conf.bootindex,

Re: [Qemu-devel] [PATCH] translate-all.c: fix debug memory maps printing

2014-09-05 Thread Mikhail Ilin



I've also found that this issue in walker API leads to creating
a misformed core file. elf_core_dump() uses walk_memory_regions()
to build memory mapping for a core file.

As a result the core file has a very small size and doesn't contain
page snapshots of mapped libraries.

I've compiled a simple helloworld prog (which dereference a NULL
pointer at the end to get a core file) and run the prog twice
without a workaround patch and with it.

$ gcc -m32 -g -Wall -o /tmp/prog /tmp/prog.c
$ ulimic -c unlimited

I use i386 but the same works for ARM and believe for other 32bits
arches.

$ qemu-i386 /tmp/prog
Hello world!
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

Here is core files sizes without a workaround patch and with it.

$ ls -l
892  qemu_prog_20140905-121147_21636.core
8454144  qemu_prog_20140905-120737_21355.core

Also I've investigate them with gdb.

No workaround:

$ gdb /tmp/prog qemu_prog_20140905-121147_21636.core
Cannot access memory at address 0x407fffe4
#-1 0x08048406 in main () at /tmp/prog.c:8

Here is the fail because there is no mapping that contains address
0x407fffe4.

(gdb) info files
0x00048000 - 0x00048000 is load1
0x00049000 - 0x00049000 is load2
0x0004 - 0x0004 is load3
0x00041000 - 0x00041000 is load4
0x00041800 - 0x00041800 is load5
0x00061800 - 0x00061800 is load6
0x00062800 - 0x00062800 is load7
0x00045800 - 0x00045800 is load8
0x001ee800 - 0x001ee800 is load9
0x001ef800 - 0x001ef800 is load10
0x001f1800 - 0x001f1800 is load11


With workaround:

0x08048000 - 0x08048000 is load1
0x08049000 - 0x0804a000 is load2
0x4000 - 0x4000 is load3
0x40001000 - 0x40801000 is load4
0x40801000 - 0x40801000 is load5
0x40821000 - 0x40822000 is load6
0x40822000 - 0x40827000 is load7
0x40845000 - 0x40845000 is load8
0x409ee000 - 0x409ee000 is load9
0x409ef000 - 0x409f1000 is load10
0x409f1000 - 0x409f7000 is load11

address 0x407fffe4 belongs to load4.

Also this core includes other library mappings that are not
presented in the core file without a workaround:

0x40845174 - 0x40845198 is /lib/i386-linux-gnu/libc.so.6
0x40845198 - 0x408451b8 is /lib/i386-linux-gnu/libc.so.6
0x408451b8 - 0x40848ec8 is /lib/i386-linux-gnu/libc.so.6
0x40848ec8 - 0x40852438 is /lib/i386-linux-gnu/libc.so.6
0x40852438 - 0x4085815e is /lib/i386-linux-gnu/libc.so.6
0x4085815e - 0x4085940c is /lib/i386-linux-gnu/libc.so.6
0x4085940c - 0x40859898 is /lib/i386-linux-gnu/libc.so.6
0x40859898 - 0x408598d8 is /lib/i386-linux-gnu/libc.so.6
0x408598d8 - 0x4085c2e8 is /lib/i386-linux-gnu/libc.so.6
0x4085c2e8 - 0x4085c348 is /lib/i386-linux-gnu/libc.so.6
0x4085c350 - 0x4085c420 is /lib/i386-linux-gnu/libc.so.6
0x4085c420 - 0x4098e44e is /lib/i386-linux-gnu/libc.so.6
0x4098e450 - 0x4098f3db is /lib/i386-linux-gnu/libc.so.6
0x4098f3e0 - 0x4098f5de is /lib/i386-linux-gnu/libc.so.6

 This patch fails to compile for MIPS N32 (a 32-bit ABI with a 64-bit
 virtual address space).

I also wonder we have separate linux-user emulators for i386 (32 bit
ABI + 32 bit address space) and amd64 binaries (64 bit ABI + 64 bit
address space). And we can not run 32 bits apps under qemu-x86_64 but
MIPS N32 looks in some other way and it supports 32 bit ABI apps with
64 bit address space. I really not sure but is it a right design or
not?

 Riku, do you have any ideas (or free cycles to do the change)?
Please, give some clues.

Thanks.

On 25.08.2014 16:05, Paolo Bonzini wrote:

Il 25/08/2014 13:45, Paolo Bonzini ha scritto:

Il 11/08/2014 12:28, Mikhail Ilyin ha scritto:

Fix memory maps textualizing function. The output was not correct because of
wrong base address calculation. The initial address has to be shifted also
for TARGET_PAGE_BITS.

Signed-off-by: Mikhail Ilyin m.i...@samsung.com
---
  translate-all.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index 8f7e11b..cb7a33d 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1728,9 +1728,8 @@ int walk_memory_regions(void *priv, 
walk_memory_regions_fn fn)
  data.prot = 0;

  for (i = 0; i  V_L1_SIZE; i++) {
-int rc = walk_memory_regions_1(data, (abi_ulong)i  V_L1_SHIFT,
+int rc = walk_memory_regions_1(data, (abi_ulong)i  (V_L1_SHIFT + 
TARGET_PAGE_BITS),
 V_L1_SHIFT / V_L2_BITS - 1, l1_map + 
i);
-
  if (rc != 0) {
  return rc;
  }



Thanks, this is simple enough that I've queued it.


Ouch, I spoke too soon.

This patch fails to compile for MIPS N32 (a 32-bit ABI with a 64-bit
virtual address space).  I'm not sure if there is a simple fix.
walk_memory_regions and

Re: [Qemu-devel] [Qemu-ppc] [PATCH 4/5] target-ppc: Handle ibm, nmi-register RTAS call

2014-09-05 Thread Aravinda Prasad



On Friday 05 September 2014 02:16 PM, Alexander Graf wrote:
 
 
 On 04.09.14 15:49, Aravinda Prasad wrote:


 On Thursday 04 September 2014 06:39 PM, Alexander Graf wrote:


 Am 04.09.2014 um 10:25 schrieb Aravinda Prasad 
 aravi...@linux.vnet.ibm.com:



 On Friday 29 August 2014 03:46 AM, Alexander Graf wrote:


 On 28.08.14 19:42, Aravinda Prasad wrote:


 On Thursday 28 August 2014 02:07 PM, Alexander Graf wrote:


 On 28.08.14 08:38, Aravinda Prasad wrote:


 On Wednesday 27 August 2014 04:07 PM, Alexander Graf wrote:


 On 25.08.14 15:45, Aravinda Prasad wrote:
 This patch adds FWNMI support in qemu for powerKVM
 guests by handling the ibm,nmi-register rtas call.
 Whenever OS issues ibm,nmi-register RTAS call, the
 machine check notification address is saved and the
 machine check interrupt vector 0x200 is patched to
 issue a private hcall.

 Signed-off-by: Aravinda Prasad aravi...@linux.vnet.ibm.com
 ---
 hw/ppc/spapr_rtas.c|   91 
 
 include/hw/ppc/spapr.h |8 
 2 files changed, 98 insertions(+), 1 deletion(-)

 diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
 index 02ddbf9..1135d2b 100644
 --- a/hw/ppc/spapr_rtas.c
 +++ b/hw/ppc/spapr_rtas.c
 @@ -277,6 +277,91 @@ static void 
 rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
 rtas_st(rets, 0, ret);
 }

 +static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
 +  sPAPREnvironment *spapr,
 +  uint32_t token, uint32_t nargs,
 +  target_ulong args,
 +  uint32_t nret, target_ulong rets)
 +{
 +int i;
 +uint32_t branch_inst = 0x4802;
 +target_ulong guest_machine_check_addr;
 +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
 +/*
 + * Trampoline saves r3 in sprg2 and issues private hcall
 + * to request qemu to build error log. QEMU builds the
 + * error log, copies to rtas-blob and returns the address.
 + * The initial 16 bytes in rtas-blob consists of saved srr0
 + * and srr1 which we restore and pass on the actual error
 + * log address to OS handled mcachine check notification
 + * routine
 + */
 +uint32_t trampoline[] = {
 +0x7c7243a6,/* mtspr   SPRN_SPRG2,r3 */
 +0x3860,/* li  r3,0   */
 +/* 0xf004 is the KVMPPC_H_REPORT_ERR private HCALL */
 +0x6063f004,/* ori r3,r3,f004  */
 +/* Issue H_CALL */
 +0x4422,/*  sc  1 */

 So up to here we're saving r3 in SPRG2 (how do we know that we can
 clobber it?) and call our special hypercall.

 But what does all the cruft below here do?

 The saved r3 in SPRG2 is consumed in KVMPPC_H_REPORT_ERR hcall, hence 
 we
 can clobber SPRG2 after hcall returns. I have included a comment in
 patch 3/5 while building error log. I think better I add one here as 
 well.


 +0x7c9243a6,/* mtspr r4 sprg2 */

 Apart from th fact that your order is wrong, this destroys the value 
 of
 r3 that we saved above again.

 SPRG2 is saved inside hcall and hence we don't need SPRG2 further after
 KVMPPC_H_REPORT_ERR hcall returns.


 +0xe883,/* ld r4, 0(r3) */
 +0x7c9a03a6,/* mtspr r4, srr0 */
 +0xe8830008,/* ld r4, 8(r3) */
 +0x7c9b03a6,/* mtspr r4, srr1 */

 Can't we just set srr0 and srr1 directly?

 I checked for instructions in ISA which set srr0/1 directly given an
 address, but could not find any such instructions.

 I mean from QEMU :).

 srr0 and srr1, which are properly set when 0x200 is invoked, are
 clobbered when we return from KVMPPC_H_REPORT_ERR hcall. I think they
 are modified before issuing rfid (I can see them getting clobbered from
 QEMU monitor). However when we jump to OS registered machine check
 routine srr0 and srr1 should reflect the value they had when 0x200 was
 invoked.

 Hence srr0 and srr1 are saved in hcall and restored when we return from
 hcall. Also we don't have enough scratch registers available to save
 these before invoking hcall from 0x200.

 Or am I missing other ways to do this from QEMU?

 If you just do

 cpu_synchronize_state() and then change env-spr[SPRN_SRR0/1] inside
 your hypercall handler that should also change the value when you return
 from the hcall.

 I tried cpu_synchronize_state(), however, srr0 and srr1 are still 
 clobbered.

 Just before I issue hcall from 0x200 I see the following values from
 QEMU monitor:

 SRR0 d0f40264  SRR1 80209033

 Inside hcall, I call cpu_synchronize_state(). As soon as I return from
 hcall I see:

 SRR0 0214  SRR1 80001001

 I see SRR0 is now set to nip in 0x200 and SRR1 to msr value. I think it
 is reset during returning from hcall before executing rfid.
 
 Ah, because the hypercall is an interrupt itself. Heh ;). True.

Yes. However, as you mentioned, we still need cpu_synchronize_state() to
avoid stale value

Re: [Qemu-devel] [PATCH v5 1/3] contrib: add ivshmem client and server

2014-09-05 Thread Claudio Fontana

Just to point out that for the client there is also a DEBUG_LOG to uppercase, 
just like already pointed out for the server.

 diff --git a/contrib/ivshmem-client/ivshmem-client.c 
 b/contrib/ivshmem-client/ivshmem-client.c
 new file mode 100644
 index 000..ad210c8
 --- /dev/null
 +++ b/contrib/ivshmem-client/ivshmem-client.c
 @@ -0,0 +1,405 @@
 +/*
 + * Copyright 6WIND S.A., 2014
 + *
 + * This work is licensed under the terms of the GNU GPL, version 2 or
 + * (at your option) any later version.  See the COPYING file in the
 + * top-level directory.
 + */
 +
 +#include sys/types.h
 +#include sys/socket.h
 +#include sys/un.h
 +
 +#include qemu-common.h
 +#include qemu/queue.h
 +
 +#include ivshmem-client.h
 +
 +/* log a message on stdout if verbose=1 */
 +#define debug_log(client, fmt, ...) do { \
 +if ((client)-verbose) { \
 +printf(fmt, ## __VA_ARGS__); \
 +}\
 +} while (0)
 +

..here (DEBUG_LOG?)

Thanks to all who are working on this.

Claudio

Re: [Qemu-devel] [PATCH v3] ide: Add resize callback to ide/core

2014-09-05 Thread Markus Armbruster

John Snow js...@redhat.com writes:

 On 09/04/2014 12:13 PM, Stefan Hajnoczi wrote:
 This patch seems to break tests/bios-tables-test.c:
 ERROR:tests/bios-tables-test.c:744:test_acpi_one: assertion failed
 (signature == SIGNATURE): (0x == 0xdead)
 GTester: last random seed: R02S3d881198f35228a485b4c3d116dff3b1

 I have run it many times to make sure the failure is deterministic and
 I used git-bisect(1) to find this commit as the cause.

 Not sure why but it seems to break the test.  Please take a look.

 Dropped from the block branch.

 Stefan


 I've fixed it in a v4, but before I submit and while I'm at it, you
 didn't like the comments I left in the identify functions because they
 were prone to bitrot. would you prefer I excised them entirely?

 I found them helpful because it makes sense to read these functions
 alongside the identify data specs, and excising any references to
 those word indices in their natural order obfuscates the code
 needlessly.

 My inclination is to leave them in.

I'd replace them by a pointer to the factored-out code.

 Meanwhile, the bios-tables-test problem is such:
 We serve the identify request out of the io_buffer, not the
 identify_data cache, thus for 2nd and subsequent requests for
 identify_data, we get correct size information, but for the 1st
 request to ide_identify, we get zeroes.

I found that part confusing and stared at it until I believed it works.
Looks like I believed too quickly.

 I corrected this by making
 ide_identify and ide_atapi_identify mimic the flow of
 ide_cfata_identify, which is more clear about the nature of the two
 buffers.

Yup, that one's much better.

 Why this causes a failure here and for only the Q35 machine type I am
 not certain, but this is the causative bug.

My best guess is different firmware behavior.

Re: [Qemu-devel] [PATCH 1/3] trace: Only link generated-tracers.o with simple backend

2014-09-05 Thread Stefan Hajnoczi

On Wed, Sep 03, 2014 at 11:44:54AM +0800, Fam Zheng wrote:
 In any other cases the object file is effectively empty, which is
 disliked by ranlib and nm on Mac OS X.
 
 Reported-by: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  trace/Makefile.objs | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

If another trace backend uses generate_c() there will be a link error so
they'll figure out they need to add it in the makefile.

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com


pgplLAiuKpydd.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] qemu-char: fix terminal crash when using -monitor stdio -nographic

2014-09-05 Thread Markus Armbruster

Li Liu john.li...@huawei.com writes:

 Ping, any more comments? Thanks.

I'd like to hear Gerd's opinion (cc'ed).

 On 2014/8/27 15:40, Li Liu wrote:
 
 
 On 2014/8/27 14:44, Markus Armbruster wrote:
 john.liuli john.li...@huawei.com writes:

 From: Li Liu john.li...@huawei.com

 Eeay to reproduce, just try qemu -monitor stdio -nographic
 and type quit, then the terminal will be crashed.

 There are two pathes try to call tcgetattr of stdio in vl.c:

 1) Monitor_parse(optarg, readline);
.
qemu_opts_foreach(qemu_find_opts(chardev),
  chardev_init_func, NULL, 1) != 0)

 2) if (default_serial)
add_device_config(DEV_SERIAL, stdio);

if (foreach_device_config(DEV_SERIAL, serial_parse)  0)

 Both of them will trigger qemu_chr_open_stdio which will disable
 ECHO attributes. First one has updated the attributes of stdio
 by calling qemu_chr_fe_set_echo(chr, false). And the tty
 attributes has been saved in oldtty. Then the second path will
 redo such actions, and the oldtty is overlapped. So till quit,
 term_exit can't recove the correct attributes.

 Signed-off-by: Li Liu john.li...@huawei.com

 Yes, failure to restore tty settings is a bug.

 But is having multiple character devices use the same terminal valid?
 
 I'm not sure. But I have found such comments in vl.c
 According to documentation and historically, -nographic redirects
 serial port, parallel port and monitor to stdio
 
 Best regards
 Li.
 
 If no, can we catch and reject the attempt?

 [...]

Re: [Qemu-devel] [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom

2014-09-05 Thread Gerd Hoffmann

 +static void usb_host_instance_init(Object *obj)
 +{
 +USBDevice *udev = USB_DEVICE(obj);
 +USBHostDevice *s = USB_HOST_DEVICE(udev);
 +
 +device_add_bootindex_property(obj, s-bootindex,
 +  bootindex, NULL,
 +  udev-qdev, NULL);
 +object_property_set_int(obj, -1, bootindex, NULL);

I think the object_property_set_int() call can be moved to
device_add_bootindex_property, so there is no need to duplicate the call
to init bootindex with -1 in all devices.

cheers,
  Gerd

Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf

2014-09-05 Thread Laszlo Ersek

On 09/05/14 10:58, Gerd Hoffmann wrote:
   Hi,
 
 I can't track this back far enough. I'd feel safer if you checked that
 the multiplication can't overflow even in uint64_t.
 
 Effectively it comes from the emulated graphics hardware (anything in
 hw/display/*).  The gfx emulation must make sure that the framebuffer
 fits into the video memory, which in turn pretty much implies that we
 can't overflow uint64_t.  I think even uint32_t can't overflow with the
 gfx hardware we are emulating today.
 
 (5) Instead, you really need to make sure that the very first
 multiplication fits in a signed int:
 
 Makes sense.  I think it is easier to just multiply in 64bit, then check
 the result is small enougth (new patch attached).

Okay, if you can guarantee that the product fits in uint64_t, then such
a check would suffice.

New patch has not been attached though :)

 
  /* display listener callbacks */
 @@ -495,7 +503,7 @@ static void interface_get_init_info(QXLInstance *sin, 
 QXLDevInitInfo *info)
  info-num_memslots = NUM_MEMSLOTS;
  info-num_memslots_groups = NUM_MEMSLOTS_GROUPS;
  info-internal_groupslot_id = 0;
 -info-qxl_ram_size = ssd-bufsize;
 +info-qxl_ram_size = 16 * 1024 * 1024;
  info-n_surfaces = ssd-num_surfaces;
  }
 
 spice-server doesn't do anything with it, other than passing to
 spice-client.  Not fully sure what spice-client uses this for, maybe as
 some kind of hint for resource management.  Maybe not at all.
 
 It surely doesn't matter at all for ssd-buf size.

Okay, I'll trust you on this.

Thanks
Laszlo

Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC: kvm: Fix incorrect remapping of in-kernel MPIC

2014-09-05 Thread Alexander Graf



On 03.09.14 20:36, Bogdan Purcareata wrote:
 On target-ppc, the kvm-openpic memory region is part of the E500-CCSR memory
 region. On the kernel side, the MPIC is mapped at the same offset as the
 kvm-openpic within the address space.
 
 When adding the PCI BAR0 memory region, an alias is created to point to the
 E500-CCSR memory region. This results in firing the kvm_openpic_region_add 
 once
 more, since kvm-openpic is part of the latter. Only this time, the offset is
 wrong - it's part of the PCI memory region. This leads to the in-kernel MPIC 
 to
 be remapped at a wrong address, and thus all traps to the kvm-openpic
 address to be emulated in userspace.
 
 The fix consists in an additional filter in kvm_openpic_region_{add,del} to
 consider only addresses matching the start of the kvm-openpic memory region.

If this is true, wouldn't vhost and vfio be broken too?


Alex

Re: [Qemu-devel] [Qemu-ppc] [PATCH 0/2] PPC: kvm: Fix incorrect remapping of in-kernel MPIC

2014-09-05 Thread Alexander Graf



On 03.09.14 20:36, Bogdan Purcareata wrote:
 On target-ppc, the kvm-openpic memory region is part of the E500-CCSR memory
 region. On the kernel side, the MPIC is mapped at the same offset as the
 kvm-openpic within the address space.
 
 When adding the PCI BAR0 memory region, an alias is created to point to the
 E500-CCSR memory region. This results in firing the kvm_openpic_region_add 
 once
 more, since kvm-openpic is part of the latter. Only this time, the offset is
 wrong - it's part of the PCI memory region. This leads to the in-kernel MPIC 
 to
 be remapped at a wrong address, and thus all traps to the kvm-openpic
 address to be emulated in userspace.
 
 The fix consists in an additional filter in kvm_openpic_region_{add,del} to
 consider only addresses matching the start of the kvm-openpic memory region.

If this is true, wouldn't vfio and host be broken too?


Alex

Re: [Qemu-devel] [PATCH] virtio-pci: fix virtio-net child refcount in transports

2014-09-05 Thread Gonglei (Arei)

Hi,

CC'ing Stefan and qemu-stable@ for more attention. :)


Best regards,
-Gonglei


 -Original Message-
 From: Gonglei (Arei)
 Sent: Thursday, September 04, 2014 7:42 PM
 To: qemu-devel@nongnu.org
 Cc: m...@redhat.com; Huangweidong (C); Gonglei (Arei)
 Subject: [PATCH] virtio-pci: fix virtio-net child refcount in transports
 
 From: Gonglei arei.gong...@huawei.com
 
 object_initialize() leaves the object with a refcount of 1.
 object_property_add_child() adds its own reference which is dropped
 again when the property is deleted.
 
 The upshot of this is that we always have a refcount = 1.  Upon hot
 unplug the virtio-net child is not finalized!
 
 Drop our reference after the child property has been added to the
 parent.
 
 Signed-off-by: Gonglei arei.gong...@huawei.com
 ---
 Stefan had post virtio-blk in commit c5d49db4, but virtio-net has
 the same problem. Maybe the other virtio devices have too.
 ---
  hw/virtio/virtio-pci.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
 index ddb5da1..78dcd68 100644
 --- a/hw/virtio/virtio-pci.c
 +++ b/hw/virtio/virtio-pci.c
 @@ -1456,6 +1456,7 @@ static void virtio_net_pci_instance_init(Object *obj)
  VirtIONetPCI *dev = VIRTIO_NET_PCI(obj);
  object_initialize(dev-vdev, sizeof(dev-vdev), TYPE_VIRTIO_NET);
  object_property_add_child(obj, virtio-backend, OBJECT(dev-vdev),
 NULL);
 +object_unref(OBJECT(dev-vdev));
  }
 
  static const TypeInfo virtio_net_pci_info = {
 --
 1.7.12.4

Re: [Qemu-devel] [PATCH] translate-all.c: fix debug memory maps printing

2014-09-05 Thread Peter Maydell

On 5 September 2014 09:59, Mikhail Ilin m.i...@samsung.com wrote:
 I also wonder we have separate linux-user emulators for i386 (32 bit
 ABI + 32 bit address space) and amd64 binaries (64 bit ABI + 64 bit
 address space). And we can not run 32 bits apps under qemu-x86_64 but
 MIPS N32 looks in some other way and it supports 32 bit ABI apps with
 64 bit address space. I really not sure but is it a right design or
 not?

The design here is that we have one QEMU executable for each
Linux ABI we support. On MIPS there are a number of different
ABIs which may be in use even with the same CPU type, which
is why there are separate emulator binaries. On x86, if we ever
supported the x86_64 x32 ABI, that would be an extra emulator
binary in addition to the current x86_64 and i386 ones.

This should all not matter much for core code, which simply
has to use the correct types for things. Quoting from HACKING:

vaddr is the best type to use to hold a CPU virtual address in
target-independent code. It is guaranteed to be large enough to hold a
virtual address for any target, and it does not change size from target
to target. It is always unsigned.
target_ulong is a type the size of a virtual address on the CPU; this means
it may be 32 or 64 bits depending on which target is being built. It should
therefore be used only in target-specific code, and in some
performance-critical built-per-target core code such as the TLB code.
There is also a signed version, target_long.
abi_ulong is for the *-user targets, and represents a type the size of
'void *' in that target's ABI. (This may not be the same as the size of a
full CPU virtual address in the case of target ABIs which use 32 bit pointers
on 64 bit CPUs, like sparc32plus.) Definitions of structures that must match
the target's ABI must use this type for anything that on the target is defined
to be an 'unsigned long' or a pointer type.
There is also a signed version, abi_long.

The problem you're running into here is with the ABIs where
abi_ulong and target_ulong are different sizes.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v13 2/6] block: don't convert file size to sector size

2014-09-05 Thread Hu Tao

On Thu, Sep 04, 2014 at 11:57:58AM +0200, Kevin Wolf wrote:
 Am 29.08.2014 um 10:33 hat Hu Tao geschrieben:
  and avoid converting it back later.
  
  Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 
  diff --git a/block/raw-posix.c b/block/raw-posix.c
  index 9c22e3f..abe0759 100644
  --- a/block/raw-posix.c
  +++ b/block/raw-posix.c
  @@ -1369,8 +1369,8 @@ static int raw_create(const char *filename, QemuOpts 
  *opts, Error **errp)
   strstart(filename, file:, filename);
   
   /* Read out options */
  -total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 
  0),
  -  BDRV_SECTOR_SIZE);
  +total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
  +  BDRV_SECTOR_SIZE);
   nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
   
   fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
  @@ -1394,7 +1394,7 @@ static int raw_create(const char *filename, QemuOpts 
  *opts, Error **errp)
   #endif
   }
   
  -if (ftruncate(fd, total_size * BDRV_SECTOR_SIZE) != 0) {
  +if (ftruncate(fd, total_size) != 0) {
   result = -errno;
   error_setg_errno(errp, -result, Could not resize file);
   }
 
 You forgot changing hdev_create() in raw-posix. Doesn't make the patch
 less correct, but you may want to add it for v14.

Thanks, changed it too.

Regards,
Hu

 
 Kevin

Re: [Qemu-devel] [PATCH v7 18/28] host-libusb: remove bootindex property from qdev to qom

2014-09-05 Thread Gonglei (Arei)

Hi,

 From: Gerd Hoffmann [mailto:kra...@redhat.com]
 Sent: Friday, September 05, 2014 5:06 PM
 Subject: Re: [PATCH v7 18/28] host-libusb: remove bootindex property from
 qdev to qom
 
  +static void usb_host_instance_init(Object *obj)
  +{
  +USBDevice *udev = USB_DEVICE(obj);
  +USBHostDevice *s = USB_HOST_DEVICE(udev);
  +
  +device_add_bootindex_property(obj, s-bootindex,
  +  bootindex, NULL,
  +  udev-qdev, NULL);
  +object_property_set_int(obj, -1, bootindex, NULL);
 
 I think the object_property_set_int() call can be moved to
 device_add_bootindex_property, so there is no need to duplicate the call
 to init bootindex with -1 in all devices.
 
Good idea. Thanks! Will fix this.

Best regards,
-Gonglei

Re: [Qemu-devel] [PATCH] cow: make padding in the header explicit

2014-09-05 Thread Kevin Wolf

Am 04.09.2014 um 17:43 hat Stefan Hajnoczi geschrieben:
 On Thu, Sep 04, 2014 at 04:10:14PM +0200, Kevin Wolf wrote:
  Am 04.09.2014 um 15:51 hat Stefan Hajnoczi geschrieben:
   On Thu, Sep 04, 2014 at 06:07:32AM -0600, Eric Blake wrote:
On 09/04/2014 02:58 AM, Stefan Hajnoczi wrote:
 On-disk structures should be marked packed so the compiler does not
 insert padding for field alignment.  Padding should be explicit so
 on-disk layout is obvious and we don't rely on the 
 architecture-specific
 ABI for alignment rules.
 
 The pahole(1) diff shows that the padding is now explicit and offsets
 are unchanged:
 
   char   backing_file[1024];   /* 8  1024 
 */
   /* --- cacheline 16 boundary (1024 bytes) was 8 bytes ago --- */
   int32_tmtime;/*  1032 4 
 */
 -
 - /* XXX 4 bytes hole, try to pack */
 -
 + uint32_t   padding;  /*  1036 4 
 */
   uint64_t   size; /*  1040 8 
 */

Was a 32-bit build also inserting this padding, or do we have historical
differences where 32-bit and 64-bit cow files are actually different,
and we may need to be prepared to parse files from both sources?
   
   Good point.  Let's not merge this patch since it breaks 32-bit hosts.
   
   The fact that no one hit problems when exchanging files between 32-bit
   and 64-bit machines shows that the cow format is rarely used.
   
   At this point we have 2 different formats: one without padding
   (i386-style) and one with padding (x86_64-style).  The chance of more
   variants is small but who knows, maybe some other host architecture ABI
   has yet another alignment rule for uint64_t.
   
   I'd like to git rm block/cow.c but I suppose the backwards-compatible
   thing to do is to introduce subformats to support both variants.
   Opinions?
  
  Can we safely detect which of the subformats we have? But I'm not sure
  if it's even worth fixing.
 
 I think it would default to the subformat depending on the host
 architecture but allow overriding with -o subformat=i386|x86_64.

Hm, okay. If we can't do it automatically, that's an option, too.

 I'm also not sure if it's worth fixing.  The cow file format is so
 rarely used I wonder if we'd be better off without it.

It has never been used as a native qemu image format. As I understand
it, its use case is compatibility with existing images, and one of the
great things about qemu-img is that it can open more or less any random
image that you get from somewhere. The exotic formats are part of this.

If we wanted to simplify our code, we could probably make it read-only
(at the cost of losing qemu-iotests support), but then it's so simple
that leaving r/w support around won't hurt us.


Hm, actually, looking at the kernel (arch/um/drivers/cow_user.c), it
doesn't look as if we were compatible at all for v2, at least in the
current kernel version, we use a different backing file name length...
And they have the struct packed today, so we should probably follow
their example.

On the other hand, out of three versions, we only support v2, and we
don't use the same format as the kernel. Yeah, I guess we might just
drop the support then, it's not helpful for compatibility. The only
other way would be to update it to handle all three versions the same as
the kernel code.

Kevin


pgp57GQ6wLdw8.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] qemu-char: fix terminal crash when using -monitor stdio -nographic

2014-09-05 Thread Gerd Hoffmann

On Fr, 2014-09-05 at 11:04 +0200, Markus Armbruster wrote:
 Li Liu john.li...@huawei.com writes:
 
  Ping, any more comments? Thanks.
 
 I'd like to hear Gerd's opinion (cc'ed).
 
  But is having multiple character devices use the same terminal valid?

No (guess we should catch that case in stdio init).

Beside the tty initialization and cleanup you also have the problem that
both users are racing for input.  Well, maybe not in the qemu case as it
is the same process and it very well might be that it polls the two
chardevs in a well defined order, so one of them gets all input and the
other gets nothing.  With two processes reading from the terminal (try
'cat | less') it is actually random though.

  I'm not sure. But I have found such comments in vl.c
  According to documentation and historically, -nographic redirects
  serial port, parallel port and monitor to stdio

In that case mux chardev is used (that is the piece which handles the
input switching between serial and monitor via 'Ctrl-A c').  There is
one stdio instance, and one mux instance, the mux is chained to stdio,
and mux allows multiple backends to connect.

You can construct it on the command line this way:

qemu -nographic -nodefaults \
   -chardev stdio,mux=on,id=terminal \
   -serial chardev:terminal \
   -monitor chardev:terminal

[ serial is default, so no output here, unless you boot a guest
  with serial console configured ]

[ Hit 'Ctrl-A h' now ]

C-a hprint this help
C-a xexit emulator
C-a ssave disk data back to file (if -snapshot)
C-a ttoggle console timestamps
C-a bsend break (magic sysrq)
C-a cswitch between console and monitor
C-a C-a  sends C-a

[ Hit 'Ctrl-A c' now ]

QEMU 2.1.50 monitor - type 'help' for more information
(qemu) info chardev
terminal: filename=mux
terminal-base: filename=stdio
(qemu) 

HTH,
  Gerd

Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf

2014-09-05 Thread Gerd Hoffmann

On Fr, 2014-09-05 at 11:06 +0200, Laszlo Ersek wrote:
  Makes sense.  I think it is easier to just multiply in 64bit, then
 check
  the result is small enougth (new patch attached).
 
 Okay, if you can guarantee that the product fits in uint64_t, then
 such
 a check would suffice.
 
 New patch has not been attached though :)

Oops.  Here we go.

cheers,
  Gerd
From 33c5c3d1736fd577fc1279a1f3c50d2414e98fe3 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann kra...@redhat.com
Date: Wed, 3 Sep 2014 15:50:08 +0200
Subject: [PATCH] spice: make sure we don't overflow ssd-buf

Related spice-only bug.  We have a fixed 16 MB buffer here, being
presented to the spice-server as qxl video memory in case spice is
used with a non-qxl card.  It's also used with qxl in vga mode.

When using display resolutions requiring more than 16 MB of memory we
are going to overflow that buffer.  In theory the guest can write,
indirectly via spice-server.  The spice-server clears the memory after
setting a new video mode though, triggering a segfault in the overflow
case, so qemu crashes before the guest has a chance to do something
evil.

Fix that by switching to dynamic allocation for the buffer.

CVE-2014-3615

Cc: qemu-sta...@nongnu.org
Cc: secal...@redhat.com
Cc: Laszlo Ersek ler...@redhat.com
Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 ui/spice-display.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/ui/spice-display.c b/ui/spice-display.c
index 66e2578..def7b52 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -334,11 +334,23 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay *ssd)
 void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd)
 {
 QXLDevSurfaceCreate surface;
+uint64_t surface_size;
 
 memset(surface, 0, sizeof(surface));
 
-dprint(1, %s/%d: %dx%d\n, __func__, ssd-qxl.id,
-   surface_width(ssd-ds), surface_height(ssd-ds));
+surface_size = (uint64_t) surface_width(ssd-ds) *
+surface_height(ssd-ds) * 4;
+assert(surface_size  0);
+assert(surface_size  INT_MAX);
+if (ssd-bufsize  surface_size) {
+ssd-bufsize = surface_size;
+g_free(ssd-buf);
+ssd-buf = g_malloc(ssd-bufsize);
+}
+
+dprint(1, %s/%d: %ux%u (size % PRIu64 /%d)\n, __func__, ssd-qxl.id,
+   surface_width(ssd-ds), surface_height(ssd-ds),
+   surface_size, ssd-bufsize);
 
 surface.format = SPICE_SURFACE_FMT_32_xRGB;
 surface.width  = surface_width(ssd-ds);
@@ -369,8 +381,6 @@ void qemu_spice_display_init_common(SimpleSpiceDisplay *ssd)
 if (ssd-num_surfaces == 0) {
 ssd-num_surfaces = 1024;
 }
-ssd-bufsize = (16 * 1024 * 1024);
-ssd-buf = g_malloc(ssd-bufsize);
 }
 
 /* display listener callbacks */
@@ -495,7 +505,7 @@ static void interface_get_init_info(QXLInstance *sin, QXLDevInitInfo *info)
 info-num_memslots = NUM_MEMSLOTS;
 info-num_memslots_groups = NUM_MEMSLOTS_GROUPS;
 info-internal_groupslot_id = 0;
-info-qxl_ram_size = ssd-bufsize;
+info-qxl_ram_size = 16 * 1024 * 1024;
 info-n_surfaces = ssd-num_surfaces;
 }
 
-- 
1.8.3.1

Re: [Qemu-devel] [PATCH v4 0/2] add resize callback to ide/core

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 11:42:15PM -0400, John Snow wrote:
 This patch series fixes incorrect IDENTIFY data returned
 for an IDE drive after a block_resize event by adding
 a resize callback for IDE devices.
 
 Inconsistencies between identify routines are also
 removed so that they read easier.
 
 V4:
  - Added patch that makes the buffer and cache fill order for
identify routines more consistent.
  - Fixed a bug where the very first call to IDENTIFY does not
return any size information.
 
 V3:
  - Factored out the size update into new functions.
  - Fixed the size update for CFATA.
  - Added assertion to clarify that ide_resize_cb is non-atapi.
 
 V2:
  - Do not attempt to update geometry values, to avoid clobbering
user-specified values, if they exist.
  - Do not regenerate the entire IDENTIFY buffer to avoid losing
any settings that occurred during normal operation.
 
 John Snow (2):
   IDE: Fill the IDENTIFY request consistently
   ide: Add resize callback to ide/core
 
  hw/ide/core.c | 97 
 +--
  1 file changed, 74 insertions(+), 23 deletions(-)
 
 -- 
 1.9.3
 
 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


pgpoJQFXydmIB.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 0/2] vmdk: fix leaks in vmdk_parse_extents()

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 09:04:41PM +0100, Stefan Hajnoczi wrote:
 See patches for the specific leaks.
 
 Stefan Hajnoczi (2):
   vmdk: fix vmdk_parse_extents() extent_file leaks
   vmdk: fix buf leak in vmdk_parse_extents()
 
  block/vmdk.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)
 
 -- 
 1.9.3
 
 

Applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


pgpXUoZi35KrL.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 1/4] block: Correct bs-growable

2014-09-05 Thread Kevin Wolf

Am 04.09.2014 um 22:01 hat Max Reitz geschrieben:
 On 20.08.2014 13:40, Kevin Wolf wrote:
 Am 12.07.2014 um 00:23 hat Max Reitz geschrieben:
 Currently, the field growable in a BDS is set iff the BDS is opened in
 protocol mode (with O_BDRV_PROTOCOL). However, not every protocol block
 driver allows growing: NBD, for instance, does not. On the other hand,
 a non-protocol block driver may allow growing: The raw driver does.
 
 Fix this by correcting the growable field in the driver-specific open
 function for the BDS, if necessary.
 
 Signed-off-by: Max Reitz mre...@redhat.com
 I'm not sure I agree with bs-growable = true for raw. It's certainly
 true that the backend can technically provide the functionality that
 writes beyond EOF grow the file. That's not the point of bs-growable,
 though.
 
 The point of it was to _forbid_ it to grow even when it's technically
 possible (non-file protocols weren't really a thing back then, apart
 from vvfat, so the assumption was that it's always technically
 possible). growable was introduced with bdrv_check_request(), which is
 supposed to reject guest requests after the end of the virtual disk (and
 this fixed a CVE, see commit 71d0770c). You're now disabling this check
 for raw.
 
 I think we need to make sure that bs-growable is only set if it is
 opened for an image that has drv-requires_growing_file set and
 therefore not directly used by a guest.
 
 Well, except that with node-name a guest will be able to use any image
 in the chain... Might this mean that it's really a BlockBackend
 property?
 
 Okay, the more I sit at this problem, the harder it seems to get.
 The only thing I currently know for sure is that I disagree with
 Anthony's opinion in 71d0770c (this patch makes the BlockDriver API
 guarantee that all requests are within 0..bdrv_getlength() which to
 me seems like a Good Thing).
 
 The initial point was to range check guest requests. In 2009, it may
 have been enough to statically check the BDS type (protocol or
 format) to know whether the guest directly accesses it (format) or
 not (protocol). However, this is no longer sufficient. Now, as far
 as I know, guests can access basically any BDS (as you yourself
 say). Therefore, it seems to me like it's impossible to determine
 whether the device should be marked growable or not when opening it.
 Instead, I think we have to check for each single request whether it
 comes from the guest or from within the block layer and do range
 checking only for the former case; but this should not be the task
 of the block layer core, but of the block devices instead.
 Theoretically, guests may write beyond the image end and grow it
 that way all they want, but in practice this should be limited by
 the block devices which have a fixed length.

What about block jobs, built-in NBD server, etc.?

Also, we have many device models that I don't trust a bit and having one
central check is much easier to get right than n duplicates in each
device emulation.

 Under this impression, I wanted to simply return to growable = false
 for raw. However, this breaks test 071 which attaches blkdebug to a
 raw BDS after qemu has been started. blkdebug detects raw is not
 growable, therefore reports not to be growable as well, and because
 qcow2 is on top of all that, the warning introduced by this series
 is emitted. Okay, so we will need growable = true for raw in some
 cases.

I don't want to make the shortcut more tempting, but if we come to the
conclusion that a fixed growable=false for raw is the right thing to do,
then we simply need to fix the test case.

 It's not trivial to determine whether the superordinate BDS of a
 certain BDS has BlockDriver.requires_growing_file set or not. We
 could introduce a new flag for bdrv_open(), but I'd rather avoid it.
 In fact, I tried something like this, but I quickly got into
 problems because e.g. blkdebug does not have
 requires_growing_file_set, but decides whether its own BDS are
 growable based on whether the underlying file is growable or not. So
 if a blkdebug BDS is growable, the underlying file actually needs to
 be growable as well. Therefore, we'd need a more sophisticated
 requires_growing_file_set and maybe even propagation of growable
 requirement through the BDS layers which quickly turns ugly when one
 has to consider that a BDS may be used by multiple users.

Ugh. You're right, it's not a static per-driver property, but really a
per-BDS one. That's somewhat unfortunate.

 Also, it's actually irrelevant whether requires_growing_file is set
 or not. growable's current sole purpose apparently is enabling range
 checks for guest-accessible images. If the BDS is only a subordinate
 to another BDS, it doesn't matter whether that other BDS needs
 growable set or not.

Hm... So in blockdev talk, what you're saying is that range checks are a
BlockBackend thing, and if we're on BDS level, we don't care? Perhaps
that's the right approach and gets us rid of bs-growable immediately.

Re: [Qemu-devel] I/O parallelism on QCOW2

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 12:32:12PM -0400, Xingbo Wu wrote:
   After running a 16-thread sync-random-write test against qcow2, It is
 observed that QCOW2 seems to be serializing all its metadata-related writes.
  If qcow2 is designed to do this,* then what is the concern?* What would go
 wrong if this ordering is relaxed?

How do you know that serializing part of the write request is a
significant bottleneck?

Please post your benchmark results with raw, qed, and qcow2 handling 1-,
8-, and 16-threads of I/O (or whatever similar benchmarks you have run).

The bottleneck may actually be something else, so please share your
benchmark configuration and results.

 By providing less features, raw-file and QED scales well on parallel I/O
 workload. I believe qcow2 does this with clear reasons. Thanks!

QED serializes allocating writes, see qed_aio_write_alloc().

In qcow2 the BdrvQcowState-lock is held across metadata updates.  The
important pieces here are:
 * qcow2_co_writev() only releases the lock around data writes
   (including COW).
 * qcow2_co_flush_to_os() holds the lock around metadata updates

Stefan


pgpAi29D4DiLt.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] Fix improper usage of cpu_to_be32 in vpc

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 10:43:58PM +0800, Gordon Gong wrote:
 From fd3f0fd9c53d7782d4d835597c8a07b897bec3d0 Mon Sep 17 00:00:00 2001
 
 From: Xiaodong Gong gordongong0...@gmail.com
 
 Date: Sat, 30 Aug 2014 03:17:03 +0800
 
 Subject: Fix improper usage of cpu_to_be32 in vpc
 
 
 
 cpu_to_be32() is wrong since vhd_type is an enum constant
 
 (just a regular CPU-endian integer).
 
 
 
 Signed-off-by: Xiaodong Gong gordongong0...@gmail.com
 
 ---
 
 block/vpc.c | 8 
 
 1 file changed, 4 insertions(+), 4 deletions(-)

This patch is malformed and does not apply with git-am(1).  Please use
git-send-email(1) to send properly formatted patches.

Copy-pasting patches into GMail's web interface does not work.

http://qemu-project.org/Contribute/SubmitAPatch

Stefan


pgpLKrQQsS2QW.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v2] Support vhd type VHD_DIFFERENCING

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 10:49:43PM +0800, Gordon Gong wrote:
 [Qemu-devel][RFC PATCH v2] Support vhd type VHD_DIFFERENCING
 
 
 
 From 5387a2a7b6ad052659a08a1fc7e89595708396d1 Mon Sep 17 00:00:00 2001
 
 From: Xiaodong Gong gordongong0...@gmail.com
 
 Date: Thu, 4 Sep 2014 01:14:59 +0800
 
 Subject: [PATCH 2/2] Support vhd type VHD_DIFFERENCING
 
 
 
 Now qemu only supports vhd type VHD_FIXED and VHD_DYNAMIC,
 
 so qemu can't read snapshot volume of vhd, and can't support
 
 other storage features of vhd file.
 
 
 
 This patch add read parent information in function vpc_open,
 
 read bitmap in vpc_read, and change bitmap in vpc_write.
 
 
 
 Signed-off-by: Xiaodong Gong gordongong0...@gmail.com
 
 ---
 
 block/vpc.c | 329
 +++-
 
 1 file changed, 261 insertions(+), 68 deletions(-)

This patch is malformed.  Please use git-send-email(1):
http://qemu-project.org/Contribute/SubmitAPatch

Stefan


pgpiiF3wOEU6z.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] gtk.c: Fix memory leak in gd_set_keycode_type()

2014-09-05 Thread Gerd Hoffmann

On Di, 2014-09-02 at 14:33 +0800, Chen Fan wrote:
  this memory leak is introduced by the original
  commit 3158a3482b0093e41f2b2596fba50774ea31ae08

added to gtk queue.

thanks,
  Gerd

Re: [Qemu-devel] [PULL for-2.1 0/7] QOM devices patch queue 2014-09-04

2014-09-05 Thread Peter Maydell

On 4 September 2014 18:21, Andreas Färber afaer...@suse.de wrote:
 Hello Peter,

 This is my QOM (devices) patch queue. Please pull.

 Regards,
 Andreas

 Cc: Peter Maydell peter.mayd...@linaro.org

 Cc: Michael S. Tsirkin m...@redhat.com

 The following changes since commit 01eb313907dda97313b8fea62e5632fca64f069c:

   Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-2014-09-03' 
 into staging (2014-09-04 13:33:53 +0100)

 are available in the git repository at:


   git://github.com/afaerber/qemu-cpu.git tags/qom-devices-for-peter

 for you to fetch changes up to 1d45a705fc007a13f20d18473290082eae6d1725:

   qdev: Add cleanup logic in device_set_realized() to avoid resource leak 
 (2014-09-04 19:15:54 +0200)

 
 QOM infrastructure fixes and device conversions

 * Cleanups for recursive device unrealization

 
 Alexey Kardashevskiy (1):
   qom: Make object_child_foreach() safe for objects removal

 Andreas Färber (1):
   machine: Clean up -machine handling

 Gonglei (3):
   qdev: Use error_abort instead of using local_err
   qdev: Use NULL instead of local_err for qbus_child unrealize
   qdev: Add cleanup logic in device_set_realized() to avoid resource leak

 Peter Crosthwaite (2):
   qom: Add automatic arrayification to object_property_add()
   memory: Remove object_property_add_child_array()

Applied, thanks.

-- PMM

Re: [Qemu-devel] [CVE-2014-3615 PATCH v2 3/3] spice: make sure we don't overflow ssd-buf

2014-09-05 Thread Laszlo Ersek

On 09/05/14 11:33, Gerd Hoffmann wrote:
 On Fr, 2014-09-05 at 11:06 +0200, Laszlo Ersek wrote:
   Makes sense.  I think it is easier to just multiply in 64bit, then
  check
   the result is small enougth (new patch attached).
  
  Okay, if you can guarantee that the product fits in uint64_t, then
  such
  a check would suffice.
  
  New patch has not been attached though :)
 Oops.  Here we go.
 
 cheers,
   Gerd
 
 
 0001-spice-make-sure-we-don-t-overflow-ssd-buf.patch
 
 
 From 33c5c3d1736fd577fc1279a1f3c50d2414e98fe3 Mon Sep 17 00:00:00 2001
 From: Gerd Hoffmann kra...@redhat.com
 Date: Wed, 3 Sep 2014 15:50:08 +0200
 Subject: [PATCH] spice: make sure we don't overflow ssd-buf
 
 Related spice-only bug.  We have a fixed 16 MB buffer here, being
 presented to the spice-server as qxl video memory in case spice is
 used with a non-qxl card.  It's also used with qxl in vga mode.
 
 When using display resolutions requiring more than 16 MB of memory we
 are going to overflow that buffer.  In theory the guest can write,
 indirectly via spice-server.  The spice-server clears the memory after
 setting a new video mode though, triggering a segfault in the overflow
 case, so qemu crashes before the guest has a chance to do something
 evil.
 
 Fix that by switching to dynamic allocation for the buffer.
 
 CVE-2014-3615
 
 Cc: qemu-sta...@nongnu.org
 Cc: secal...@redhat.com
 Cc: Laszlo Ersek ler...@redhat.com
 Signed-off-by: Gerd Hoffmann kra...@redhat.com
 ---
  ui/spice-display.c | 20 +++-
  1 file changed, 15 insertions(+), 5 deletions(-)
 
 diff --git a/ui/spice-display.c b/ui/spice-display.c
 index 66e2578..def7b52 100644
 --- a/ui/spice-display.c
 +++ b/ui/spice-display.c
 @@ -334,11 +334,23 @@ void qemu_spice_create_host_memslot(SimpleSpiceDisplay 
 *ssd)
  void qemu_spice_create_host_primary(SimpleSpiceDisplay *ssd)
  {
  QXLDevSurfaceCreate surface;
 +uint64_t surface_size;
  
  memset(surface, 0, sizeof(surface));
  
 -dprint(1, %s/%d: %dx%d\n, __func__, ssd-qxl.id,
 -   surface_width(ssd-ds), surface_height(ssd-ds));
 +surface_size = (uint64_t) surface_width(ssd-ds) *
 +surface_height(ssd-ds) * 4;
 +assert(surface_size  0);
 +assert(surface_size  INT_MAX);
 +if (ssd-bufsize  surface_size) {
 +ssd-bufsize = surface_size;
 +g_free(ssd-buf);
 +ssd-buf = g_malloc(ssd-bufsize);
 +}
 +
 +dprint(1, %s/%d: %ux%u (size % PRIu64 /%d)\n, __func__, ssd-qxl.id,
 +   surface_width(ssd-ds), surface_height(ssd-ds),
 +   surface_size, ssd-bufsize);
  
  surface.format = SPICE_SURFACE_FMT_32_xRGB;
  surface.width  = surface_width(ssd-ds);
 @@ -369,8 +381,6 @@ void qemu_spice_display_init_common(SimpleSpiceDisplay 
 *ssd)
  if (ssd-num_surfaces == 0) {
  ssd-num_surfaces = 1024;
  }
 -ssd-bufsize = (16 * 1024 * 1024);
 -ssd-buf = g_malloc(ssd-bufsize);
  }
  
  /* display listener callbacks */
 @@ -495,7 +505,7 @@ static void interface_get_init_info(QXLInstance *sin, 
 QXLDevInitInfo *info)
  info-num_memslots = NUM_MEMSLOTS;
  info-num_memslots_groups = NUM_MEMSLOTS_GROUPS;
  info-internal_groupslot_id = 0;
 -info-qxl_ram_size = ssd-bufsize;
 +info-qxl_ram_size = 16 * 1024 * 1024;
  info-n_surfaces = ssd-num_surfaces;
  }
  
 -- 1.8.3.1
 

Reviewed-by: Laszlo Ersek ler...@redhat.com

Re: [Qemu-devel] [PATCH v2] qcow2: add update refcount table realization for update_refcount

2014-09-05 Thread Kevin Wolf

Am 01.09.2014 um 12:52 hat Jun Li geschrieben:
 When every item of refcount block is NULL, free refcount block and reset the
 corresponding item of refcount table with NULL.
 
 Signed-off-by: Jun Li address@hidden

The commit message should also describe why this is a relevant
improvement for some use case. My gut feeling is that it complicates the
code for a very minimal gain.

Kevin

Re: [Qemu-devel] [PATCH v5 3/3] ivshmem: add check on protocol version in QEMU

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 02:51:01PM +0200, David Marchand wrote:
 diff --git a/contrib/ivshmem-client/ivshmem-client.c 
 b/contrib/ivshmem-client/ivshmem-client.c
 index ad210c8..0c4e016 100644
 --- a/contrib/ivshmem-client/ivshmem-client.c
 +++ b/contrib/ivshmem-client/ivshmem-client.c
 @@ -184,10 +184,18 @@ ivshmem_client_connect(IvshmemClient *client)
  goto err_close;
  }
  
 -/* first, we expect our index + a fd == -1 */
 +/* first, we expect a protocol version */
 +if (read_one_msg(client, tmp, fd)  0 ||
 +(tmp != IVSHMEM_PROTOCOL_VERSION) || fd != -1) {
 +debug_log(client, cannot read from server\n);
 +goto err_close;
 +}
 +debug_log(client, our_id=%ld\n, client-local.id);

This debug_log() is probably not intentional.  local.id will always be
-1 here so the output is not useful.

 +static void ivshmem_check_version(void *opaque, const uint8_t * buf, int 
 flags)
 +{
 +IVShmemState *s = opaque;
 +PCIDevice *dev = PCI_DEVICE(s);
 +int tmp;
 +long version;
 +
 +memcpy(version, buf, sizeof(long));
 +tmp = qemu_chr_fe_get_msgfd(s-server_chr);
 +if (tmp != -1 || version != IVSHMEM_PROTOCOL_VERSION) {
 +fprintf(stderr, incompatible version, you are connecting to a 
 ivhsmem-
 +server using a different protocol please check your 
 setup\n);
 +qemu_chr_delete(s-server_chr);
 +s-server_chr = NULL;
 +return;
 +}
 +
 +IVSHMEM_DPRINTF(version check ok, finish init and switch to real 
 chardev 
 +handler\n);
 +
 +pci_register_bar(dev, 2, s-ivshmem_attr, s-bar);

Not sure if it is okay to delay PCI initialization to a fd hander
callback.

If the version message is too slow the guest could see the PCI adapter
without the BAR!

Did you move this code in order to prevent the guest from accessing the
device before it has connected to the server?  Perhaps the device needs
a state field that tracks whether or not it is ready for operation.  Any
access before RUNNING state is reached will be ignored (?).


pgpFAOfA0J1aK.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] virtio-pci: fix virtio-net child refcount in transports

2014-09-05 Thread Michael S. Tsirkin

On Thu, Sep 04, 2014 at 07:41:32PM +0800, arei.gong...@huawei.com wrote:
 From: Gonglei arei.gong...@huawei.com
 
 object_initialize() leaves the object with a refcount of 1.
 object_property_add_child() adds its own reference which is dropped
 again when the property is deleted.
 
 The upshot of this is that we always have a refcount = 1.  Upon hot
 unplug the virtio-net child is not finalized!
 
 Drop our reference after the child property has been added to the
 parent.
 
 Signed-off-by: Gonglei arei.gong...@huawei.com

Aren't other virtio devices affected? what about virtio-scsi?

 ---
 Stefan had post virtio-blk in commit c5d49db4, but virtio-net has 
 the same problem. Maybe the other virtio devices have too.
 ---
  hw/virtio/virtio-pci.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
 index ddb5da1..78dcd68 100644
 --- a/hw/virtio/virtio-pci.c
 +++ b/hw/virtio/virtio-pci.c
 @@ -1456,6 +1456,7 @@ static void virtio_net_pci_instance_init(Object *obj)
  VirtIONetPCI *dev = VIRTIO_NET_PCI(obj);
  object_initialize(dev-vdev, sizeof(dev-vdev), TYPE_VIRTIO_NET);
  object_property_add_child(obj, virtio-backend, OBJECT(dev-vdev), 
 NULL);
 +object_unref(OBJECT(dev-vdev));
  }
  
  static const TypeInfo virtio_net_pci_info = {
 -- 
 1.7.12.4

Re: [Qemu-devel] [PATCH v5 2/3] docs: update ivshmem device spec

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 02:51:00PM +0200, David Marchand wrote:
 Add some notes on the parts needed to use ivshmem devices: more specifically,
 explain the purpose of an ivshmem server and the basic concept to use the
 ivshmem devices in guests.
 Move some parts of the documentation and re-organise it.
 
 Signed-off-by: David Marchand david.march...@6wind.com
 Reviewed-by: Claudio Fontana claudio.font...@huawei.com
 ---
  docs/specs/ivshmem_device_spec.txt |  124 
 +++-
  1 file changed, 93 insertions(+), 31 deletions(-)

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com


pgpRRPmN581Xw.pgp
Description: PGP signature

Re: [Qemu-devel] [Qemu-ppc] [PULL 00/52] ppc patch queue 2014-09-04

2014-09-05 Thread Peter Maydell

On 4 September 2014 23:17, Alexander Graf ag...@suse.de wrote:
 Peter, please pull the same tag name again - I updated it with the now
 working state.

Doesn't build on Windows:

hw/ppc/spapr.o: In function `spapr_populate_memory':
/home/petmay01/linaro/qemu-for-merges/hw/ppc/spapr.c:708: undefined
reference to `_ffsl'
/home/petmay01/linaro/qemu-for-merges/hw/ppc/spapr.c:708: undefined
reference to `_ffsl'
/home/petmay01/linaro/qemu-for-merges/hw/ppc/spapr.c:709: undefined
reference to `_ffsl'

Don't try to use ffs() or ffsl() -- use the ctz32(),
ctz64() or ctzl() functions in host-utils.h (whichever
is appropriate for the size of the type; in this
case ctz64 I think). Watch out that in the common
case ctzl(x) == ffsl(x) - 1, and check the handling
of the edge case of zero input is what you want.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v5 1/3] contrib: add ivshmem client and server

2014-09-05 Thread Stefan Hajnoczi

On Thu, Sep 04, 2014 at 02:50:59PM +0200, David Marchand wrote:
 When using ivshmem devices, notifications between guests can be sent as
 interrupts using a ivshmem-server (typical use described in documentation).
 The client is provided as a debug tool.
 
 Signed-off-by: Olivier Matz olivier.m...@6wind.com
 Signed-off-by: David Marchand david.march...@6wind.com
 ---
  Makefile|8 +
  configure   |3 +
  contrib/ivshmem-client/ivshmem-client.c |  405 
 +++
  contrib/ivshmem-client/ivshmem-client.h |  239 ++
  contrib/ivshmem-client/main.c   |  237 ++
  contrib/ivshmem-server/ivshmem-server.c |  395 ++
  contrib/ivshmem-server/ivshmem-server.h |  186 ++
  contrib/ivshmem-server/main.c   |  244 +++
  qemu-doc.texi   |   10 +-
  9 files changed, 1724 insertions(+), 3 deletions(-)
  create mode 100644 contrib/ivshmem-client/ivshmem-client.c
  create mode 100644 contrib/ivshmem-client/ivshmem-client.h
  create mode 100644 contrib/ivshmem-client/main.c
  create mode 100644 contrib/ivshmem-server/ivshmem-server.c
  create mode 100644 contrib/ivshmem-server/ivshmem-server.h
  create mode 100644 contrib/ivshmem-server/main.c

Modulo Michael's comments:

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com


pgpLEOnkg7Lw3.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC][patch 3/6] KVM: s390: Add GISA support

2014-09-05 Thread Frank Blaschka

On Fri, Sep 05, 2014 at 10:29:26AM +0200, Alexander Graf wrote:
 
 
 On 04.09.14 12:52, frank.blasc...@de.ibm.com wrote:
  From: Frank Blaschka frank.blasc...@de.ibm.com
  
  This patch adds GISA (Guest Interrupt State Area) support
  to s390 kvm. GISA can be used for exitless interrupts. The
  patch provides a set of functions for GISA related operations
  like accessing GISA fields or registering ISCs for alert.
  Exploiters of GISA will follow with additional patches.
  
  Signed-off-by: Frank Blaschka frank.blasc...@de.ibm.com
 
 That's a nice feature. However, please make sure that you maintain the
 abstraction levels.
 
 What should happen is that you request an irqfd from FLIC. Then you
 associate that irqfd with the PCI device.
 
 Thanks to that association, both parties can now talk to each other and
 negotiate their GISA number space and make sure things are connected.
 
 However, it should always be possible to do things without this direct
 IRQ injection.
 
 So you should be able to receive an irqfd event when an IRQ happened, so
 that VFIO user space applications can also handle interrupts for example.
 
 And the same applies for interrupt injection. We also need to be able to
 inject an adapter interrupt from QEMU for emulated devices ;).


OK, assuming we are doing the vfio solution expoiting GISA would be a
second step. Will take your feedback into account. THX!
 
 Alex
 --
 To unsubscribe from this list: send the line unsubscribe kvm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [PATCH v4 04/20] block: Convert bdrv_em_aiocb_info.cancel to .cancel_async

2014-09-05 Thread Fam Zheng

On Thu, 09/04 17:21, Benoît Canet wrote:
 The Wednesday 03 Sep 2014 à 19:23:39 (+0800), Fam Zheng wrote :
  All the difference is that the old .cancel doesn't call cb, but
  .cancel_async does.
  
  Signed-off-by: Fam Zheng f...@redhat.com
  ---
   block.c | 5 -
   1 file changed, 4 insertions(+), 1 deletion(-)
  
  diff --git a/block.c b/block.c
  index 4aa1bd7..b7253af 100644
  --- a/block.c
  +++ b/block.c
  @@ -4679,6 +4679,9 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB 
  *blockacb)
   {
   BlockDriverAIOCBSync *acb =
   container_of(blockacb, BlockDriverAIOCBSync, common);
  +
  +acb-ret = -ECANCELED;
  +acb-common.cb(acb-common.opaque, acb-ret);
   qemu_bh_delete(acb-bh);
   acb-bh = NULL;
   qemu_aio_release(acb);
  @@ -4686,7 +4689,7 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB 
  *blockacb)
   
   static const AIOCBInfo bdrv_em_aiocb_info = {
   .aiocb_size = sizeof(BlockDriverAIOCBSync),
  -.cancel = bdrv_aio_cancel_em,
  +.cancel_async   = bdrv_aio_cancel_em,
   };
 
 Note from an AIO noob. Could we explain somewhere in the block layer what the 
 _em suffix means ?

I didn't write these functions but I think it means emulation.

Fam

Re: [Qemu-devel] [PATCH] cow: make padding in the header explicit

2014-09-05 Thread Markus Armbruster

Stefan Hajnoczi stefa...@gmail.com writes:

 On Thu, Sep 04, 2014 at 04:10:14PM +0200, Kevin Wolf wrote:
 Am 04.09.2014 um 15:51 hat Stefan Hajnoczi geschrieben:
  On Thu, Sep 04, 2014 at 06:07:32AM -0600, Eric Blake wrote:
   On 09/04/2014 02:58 AM, Stefan Hajnoczi wrote:
On-disk structures should be marked packed so the compiler does not
insert padding for field alignment.  Padding should be explicit so
on-disk layout is obvious and we don't rely on the 
architecture-specific
ABI for alignment rules.

The pahole(1) diff shows that the padding is now explicit and offsets
are unchanged:

   char backing_file[1024]; /* 8 1024 */
   /* --- cacheline 16 boundary (1024 bytes) was 8 bytes ago --- */
   int32_t mtime; /* 1032 4 */
-
-  /* XXX 4 bytes hole, try to pack */
-
+ uint32_t padding; /* 1036 4 */
   uint64_t size; /* 1040 8 */
   
   Was a 32-bit build also inserting this padding, or do we have historical
   differences where 32-bit and 64-bit cow files are actually different,
   and we may need to be prepared to parse files from both sources?
  
  Good point.  Let's not merge this patch since it breaks 32-bit hosts.
  
  The fact that no one hit problems when exchanging files between 32-bit
  and 64-bit machines shows that the cow format is rarely used.
  
  At this point we have 2 different formats: one without padding
  (i386-style) and one with padding (x86_64-style).  The chance of more
  variants is small but who knows, maybe some other host architecture ABI
  has yet another alignment rule for uint64_t.
  
  I'd like to git rm block/cow.c but I suppose the backwards-compatible
  thing to do is to introduce subformats to support both variants.
  Opinions?
 
 Can we safely detect which of the subformats we have? But I'm not sure
 if it's even worth fixing.

 I think it would default to the subformat depending on the host
 architecture but allow overriding with -o subformat=i386|x86_64.

Admirable dedication to bug-compatibility, but...

 I'm also not sure if it's worth fixing.  The cow file format is so
 rarely used I wonder if we'd be better off without it.

... good grief, nuke it already :)

Re: [Qemu-devel] [PATCH v4 04/20] block: Convert bdrv_em_aiocb_info.cancel to .cancel_async

2014-09-05 Thread Benoît Canet

The Friday 05 Sep 2014 à 18:55:51 (+0800), Fam Zheng wrote :
 On Thu, 09/04 17:21, Benoît Canet wrote:
  The Wednesday 03 Sep 2014 à 19:23:39 (+0800), Fam Zheng wrote :
   All the difference is that the old .cancel doesn't call cb, but
   .cancel_async does.
   
   Signed-off-by: Fam Zheng f...@redhat.com
   ---
block.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
   
   diff --git a/block.c b/block.c
   index 4aa1bd7..b7253af 100644
   --- a/block.c
   +++ b/block.c
   @@ -4679,6 +4679,9 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB 
   *blockacb)
{
BlockDriverAIOCBSync *acb =
container_of(blockacb, BlockDriverAIOCBSync, common);
   +
   +acb-ret = -ECANCELED;
   +acb-common.cb(acb-common.opaque, acb-ret);
qemu_bh_delete(acb-bh);
acb-bh = NULL;
qemu_aio_release(acb);
   @@ -4686,7 +4689,7 @@ static void bdrv_aio_cancel_em(BlockDriverAIOCB 
   *blockacb)

static const AIOCBInfo bdrv_em_aiocb_info = {
.aiocb_size = sizeof(BlockDriverAIOCBSync),
   -.cancel = bdrv_aio_cancel_em,
   +.cancel_async   = bdrv_aio_cancel_em,
};
  
  Note from an AIO noob. Could we explain somewhere in the block layer what 
  the _em suffix means ?
 
 I didn't write these functions but I think it means emulation.

Thanks

Best regards

Benoît

 
 Fam

1 2 3 >

1 - 100 of 284 matches

Mail list logo