Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Amit Shah
On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
 There is a need to add some more fields to VirtIODevice that should be
 migrated (broken status, endianness). The problem is that we do not
 want to break compatibility while adding a new feature... This issue has
 been addressed in the generic VMState code with the use of optional
 subsections. As a *temporary* alternative to port the whole virtio
 migration code to VMState, this patch mimics a similar subsectionning
 ability for virtio.
 
 Since each virtio device is streamed in its own section, the idea is to
 stream subsections between the end of the device section and the start
 of the next sections. This allows an older QEMU to complain and exit
 when fed with subsections:
 
 Unknown savevm section type 5
 Error -22 while loading VM state

Please make this configurable -- either via configure or device
properties.  That avoids having to break existing configurations that
work without this patch.

 All users of virtio_load()/virtio_save() need to be patched because the
 subsections are streamed AFTER the device itself.

Since all have the same fixup, I'm wondering if a new section can be
added to the virtio-bus itself, which gets propagated to all devices
upon load in the dest.

Amit



Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
 On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
  There is a need to add some more fields to VirtIODevice that should be
  migrated (broken status, endianness). The problem is that we do not
  want to break compatibility while adding a new feature... This issue has
  been addressed in the generic VMState code with the use of optional
  subsections. As a *temporary* alternative to port the whole virtio
  migration code to VMState, this patch mimics a similar subsectionning
  ability for virtio.
  
  Since each virtio device is streamed in its own section, the idea is to
  stream subsections between the end of the device section and the start
  of the next sections. This allows an older QEMU to complain and exit
  when fed with subsections:
  
  Unknown savevm section type 5
  Error -22 while loading VM state
 
 Please make this configurable -- either via configure or device
 properties.  That avoids having to break existing configurations that
 work without this patch.
 
  All users of virtio_load()/virtio_save() need to be patched because the
  subsections are streamed AFTER the device itself.
 
 Since all have the same fixup, I'm wondering if a new section can be
 added to the virtio-bus itself, which gets propagated to all devices
 upon load in the dest.
 
   Amit

This calls for a way for devices to inherit properties from the bus,
which doesn't exist ATM.
Fine but let's not hold up this patchset because of this.

-- 
MST



[Qemu-devel] pidfile option in config file

2014-05-15 Thread William Dauchy
Hello,

I'm using the `-pidfile` option in my qemu command line. Since I'm
also using the `-readconfig` option I thought it was a good thing to
include the pidfile option in my config file. Unfortunately I did not
found any possibility to add such option in my config file.
Is it something I could expect?

Regards,
-- 
William



Re: [Qemu-devel] [PATCH] SMI handler should set the CPL to zero and save and restore it on rsm.

2014-05-15 Thread Paolo Bonzini

Il 15/05/2014 03:32, Kevin O'Connor ha scritto:

On Wed, May 14, 2014 at 08:20:59PM -0400, Kevin O'Connor wrote:

On Wed, May 14, 2014 at 10:05:47AM +0200, Paolo Bonzini wrote:

CPL isn't even altered when CS is reloaded, because you cannot jump out
of ring-0 except with an inter-privilege IRET, and that reloads SS too.

An IRET or task switch is also the only way to set EFLAGS.VM, and it will
hardcode SS.DPL=3, again matching CPL=3.

Finally, to get out of real mode you need to have CPL=0, and whatever got
you at CPL has also loaded SS with a ring-0 stack.  This means that SS.DPL=0
right after clearing CR0.PE.

Using SS.DPL as the CPL really sounds like the right approach.  I
tried it on my KVM testcase, and it works well.  For QEMU, even the
special case of SYSRET will be handled fine because QEMU does set
SS.DPL = 3:

cpu_x86_load_seg_cache(env, R_SS, selector + 8,
   0, 0x,
   DESC_G_MASK | DESC_B_MASK | DESC_P_MASK |
   DESC_S_MASK | (3  DESC_DPL_SHIFT) |
   DESC_W_MASK | DESC_A_MASK);

SS.DPL=CPL=3, SS.RPL=selector  3 is a mix of Intel behavior (which is
SS.DPL=SS.RPL=CPL=3) and AMD behavior (because they set CPL=3 but
SS.DPL=SS.RPL=selector  3).  We may want to match Intel behavior,
but that's a different change.

Can you check if this patch works for you, and if so reply with
Tested-by/Reviewed-by?


Your patch causes Freedos to crash when emm386 is loaded, so I think
it broke VM86 mode.  Below are some logs I took from qemu at the point
of the crash.


FYI, with the patch below my quick test cases all look okay.

--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -974,7 +974,6 @@ static inline void cpu_x86_load_seg_cache(CPUX86State *env,
 /* update the hidden flags */
 {
 if (seg_reg == R_CS) {
-int cpl = selector  3;
 #ifdef TARGET_X86_64
 if ((env-hflags  HF_LMA_MASK)  (flags  DESC_L_MASK)) {
 /* long mode */
@@ -984,16 +983,17 @@ static inline void cpu_x86_load_seg_cache(CPUX86State 
*env,
 #endif
 {
 /* legacy / compatibility case */
-if (!(env-cr[0]  CR0_PE_MASK)) {
-cpl = 0;
-} else if (env-eflags  VM_MASK) {
-cpl = 3;
-}
 new_hflags = (env-segs[R_CS].flags  DESC_B_MASK)
  (DESC_B_SHIFT - HF_CS32_SHIFT);
 env-hflags = (env-hflags  ~(HF_CS32_MASK | HF_CS64_MASK)) |
 new_hflags;
 }
+}
+if (seg_reg == R_SS) {
+int cpl = (flags  DESC_DPL_SHIFT)  3;
+if (env-eflags  VM_MASK) {
+cpl = 3;
+}
 #if HF_CPL_MASK != 3
 #error HF_CPL_MASK is hardcoded
 #endif



Looks like a bug entering VM86 mode.  I'll take a look, thanks for 
confirming what works and what doesn't!


Paolo



Re: [Qemu-devel] [PATCH v3] serial-pci: Set prog interface field of pci config to 16550 compatible

2014-05-15 Thread Michael S. Tsirkin
On Thu, Feb 27, 2014 at 02:05:05AM +0100, BALATON Zoltan wrote:
 Signed-off-by: BALATON Zoltan bala...@eik.bme.hu
 ---


This looks sane, a minor comment below (hopefully last).
Thanks!

  v2: resubmission after pc-2.1 is added with the multiport case
  v3: added compatibility check to avoid changing earlier than pc-2.1
 
  hw/char/serial-pci.c | 11 +++
  include/hw/i386/pc.h | 15 +++
  2 files changed, 26 insertions(+)
 
 diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
 index 991c99f..ae57098 100644
 --- a/hw/char/serial-pci.c
 +++ b/hw/char/serial-pci.c
 @@ -34,6 +34,7 @@
  typedef struct PCISerialState {
  PCIDevice dev;
  SerialState state;
 +uint8_t compat;
  } PCISerialState;
  
  typedef struct PCIMultiSerialState {
 @@ -44,6 +45,7 @@ typedef struct PCIMultiSerialState {
  SerialState  state[PCI_SERIAL_MAX_PORTS];
  uint32_t level[PCI_SERIAL_MAX_PORTS];
  qemu_irq *irqs;
 +uint8_t  compat;
  } PCIMultiSerialState;
  
  static int serial_pci_init(PCIDevice *dev)

This name isn't very informative.
I think this should be a prog_if property
defaulting to 0x2 and over-written for old machine types,
Devices don't care why their prog_if is changed,
it's the machine type that cares about compatibility.

See
commit aa93200b88fb1071eaf21bf766711762ed4630e2
Author: Gabriel L. Somlo gso...@gmail.com
Date:   Mon May 5 10:52:51 2014 -0400

apic: use emulated lapic version 0x14 on pc machines = 2.1
as an example.


Alternatively, create a bit property legacy_prog_if
and change field name to compat_flags, have property
set flags here.

See
commit 2af234e61d59f39ae16ba882271e7c4fef2c41c1
Author: Michael S. Tsirkin m...@redhat.com
Date:   Thu Feb 14 19:11:27 2013 +0200

e1000: unbreak the guest network migration to 1.3
for an example.

I think the 1st option is better but second one would be
more or less ok.

 @@ -60,6 +62,9 @@ static int serial_pci_init(PCIDevice *dev)
  return -1;
  }
  
 +if (!pci-compat) {
 +pci-dev.config[PCI_CLASS_PROG] = 0x02; /* 16550 compatible */
 +}
  pci-dev.config[PCI_INTERRUPT_PIN] = 0x01;
  s-irq = pci_allocate_irq(pci-dev);
  
 @@ -101,6 +106,9 @@ static int multi_serial_pci_init(PCIDevice *dev)
  assert(pci-ports  0);
  assert(pci-ports = PCI_SERIAL_MAX_PORTS);
  
 +if (!pci-compat) {
 +pci-dev.config[PCI_CLASS_PROG] = 0x02; /* 16550 compatible */
 +}
  pci-dev.config[PCI_INTERRUPT_PIN] = 0x01;
  memory_region_init(pci-iobar, OBJECT(pci), multiserial, 8 * 
 pci-ports);
  pci_register_bar(pci-dev, 0, PCI_BASE_ADDRESS_SPACE_IO, pci-iobar);
 @@ -177,12 +185,14 @@ static const VMStateDescription 
 vmstate_pci_multi_serial = {
  
  static Property serial_pci_properties[] = {
  DEFINE_PROP_CHR(chardev,  PCISerialState, state.chr),
 +DEFINE_PROP_UINT8(compat,  PCISerialState, compat, 0),
  DEFINE_PROP_END_OF_LIST(),
  };
  
  static Property multi_2x_serial_pci_properties[] = {
  DEFINE_PROP_CHR(chardev1,  PCIMultiSerialState, state[0].chr),
  DEFINE_PROP_CHR(chardev2,  PCIMultiSerialState, state[1].chr),
 +DEFINE_PROP_UINT8(compat,  PCIMultiSerialState, compat, 0),
  DEFINE_PROP_END_OF_LIST(),
  };
  
 @@ -191,6 +201,7 @@ static Property multi_4x_serial_pci_properties[] = {
  DEFINE_PROP_CHR(chardev2,  PCIMultiSerialState, state[1].chr),
  DEFINE_PROP_CHR(chardev3,  PCIMultiSerialState, state[2].chr),
  DEFINE_PROP_CHR(chardev4,  PCIMultiSerialState, state[3].chr),
 +DEFINE_PROP_UINT8(compat,  PCIMultiSerialState, compat, 0),
  DEFINE_PROP_END_OF_LIST(),
  };
  
 diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
 index 32a7687..8fb8046 100644
 --- a/include/hw/i386/pc.h
 +++ b/include/hw/i386/pc.h
 @@ -271,6 +271,21 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t 
 *);
  .driver   = apic,\
  .property = version,\
  .value= stringify(0x11),\
 +},\
 +{\
 +.driver   = pci-serial,\
 +.property = compat,\
 +.value= stringify(1),\
 +},\
 +{\
 +.driver   = pci-serial-2x,\
 +.property = compat,\
 +.value= stringify(1),\
 +},\
 +{\
 +.driver   = pci-serial-4x,\
 +.property = compat,\
 +.value= stringify(1),\
  }
  
  #define PC_COMPAT_1_7 \
 -- 
 1.8.1.5



Re: [Qemu-devel] [PATCH v3] serial-pci: Set prog interface field of pci config to 16550 compatible

2014-05-15 Thread Gerd Hoffmann
On Do, 2014-02-27 at 02:05 +0100, BALATON Zoltan wrote:
   ^^
Looks like your clock is _way_ off.

 +if (!pci-compat) {
 +pci-dev.config[PCI_CLASS_PROG] = 0x02; /* 16550 compatible */
 +}

  static Property serial_pci_properties[] = {
  DEFINE_PROP_CHR(chardev,  PCISerialState, state.chr),
 +DEFINE_PROP_UINT8(compat,  PCISerialState, compat, 0),
  DEFINE_PROP_END_OF_LIST(),
  };

 +{\
 +.driver   = pci-serial,\
 +.property = compat,\
 +.value= stringify(1),\
 +},\

Reviewed-by: Gerd Hoffmann kra...@redhat.com

mst, do you take that through the pci tree?

cheers,
  Gerd





Re: [Qemu-devel] [PATCH v3] serial-pci: Set prog interface field of pci config to 16550 compatible

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 08:42:17AM +0200, Gerd Hoffmann wrote:
 On Do, 2014-02-27 at 02:05 +0100, BALATON Zoltan wrote:
^^
 Looks like your clock is _way_ off.
 
  +if (!pci-compat) {
  +pci-dev.config[PCI_CLASS_PROG] = 0x02; /* 16550 compatible */
  +}
 
   static Property serial_pci_properties[] = {
   DEFINE_PROP_CHR(chardev,  PCISerialState, state.chr),
  +DEFINE_PROP_UINT8(compat,  PCISerialState, compat, 0),
   DEFINE_PROP_END_OF_LIST(),
   };
 
  +{\
  +.driver   = pci-serial,\
  +.property = compat,\
  +.value= stringify(1),\
  +},\
 
 Reviewed-by: Gerd Hoffmann kra...@redhat.com
 
 mst, do you take that through the pci tree?
 
 cheers,
   Gerd
 

Yes but I'd like the property renamed. Agree?




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Amit Shah
On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
 On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
   There is a need to add some more fields to VirtIODevice that should be
   migrated (broken status, endianness). The problem is that we do not
   want to break compatibility while adding a new feature... This issue has
   been addressed in the generic VMState code with the use of optional
   subsections. As a *temporary* alternative to port the whole virtio
   migration code to VMState, this patch mimics a similar subsectionning
   ability for virtio.

BTW Greg, do you plan on working on vmstate for virtio?

   Since each virtio device is streamed in its own section, the idea is to
   stream subsections between the end of the device section and the start
   of the next sections. This allows an older QEMU to complain and exit
   when fed with subsections:
   
   Unknown savevm section type 5
   Error -22 while loading VM state
  
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
  
   All users of virtio_load()/virtio_save() need to be patched because the
   subsections are streamed AFTER the device itself.
  
  Since all have the same fixup, I'm wondering if a new section can be
  added to the virtio-bus itself, which gets propagated to all devices
  upon load in the dest.
 
 This calls for a way for devices to inherit properties from the bus,
 which doesn't exist ATM.
 Fine but let's not hold up this patchset because of this.

No, only suggestion is to add a migration section in the bus, and then
it's easier to do this in the post-migrate functions for each device
-- so only one new section gets introduced instead of all devices
being modified to send a new subsection.

Amit



Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration

2014-05-15 Thread Gonglei (Arei)
 -Original Message-
 From: Michael R. Hines [mailto:mrhi...@linux.vnet.ibm.com]
 Sent: Thursday, May 15, 2014 8:44 AM
 To: Gonglei (Arei); qemu-devel@nongnu.org
 Cc: Huangweidong (C); quint...@redhat.com; dgilb...@redhat.com;
 owass...@redhat.com; mrhi...@us.ibm.com; pbonz...@redhat.com;
 Moyuxiang
 Subject: Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration
 
 On 05/09/2014 12:25 PM, Gonglei (Arei) wrote:
  Hi,
 
  -Original Message-
  From: Michael R. Hines [mailto:mrhi...@linux.vnet.ibm.com]
  Sent: Tuesday, April 01, 2014 8:42 AM
  To: Gonglei (Arei); qemu-devel@nongnu.org
  Cc: Huangweidong (C); quint...@redhat.com; dgilb...@redhat.com;
  owass...@redhat.com; mrhi...@us.ibm.com; Moyuxiang;
  pbonz...@redhat.com
  Subject: Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration
 
  On 03/29/2014 03:39 PM, arei.gong...@huawei.com wrote:
  From: Mo Yuxiang moyuxi...@huawei.com
 
  If the networking break or there's something wrong with rdma
  device(ib0 with no IP) during rdma migration, the main_loop of
  qemu will be blocked in rdma_destroy_id. I add rdma_ack_cm_event
  to fix this bug.
 
  Signed-off-by: Mo Yuxiang moyuxi...@huawei.com
  Signed-off-by: Gonglei arei.gong...@huawei.com
  ---
 migration-rdma.c | 1 +
 1 file changed, 1 insertion(+)
 
  diff --git a/migration-rdma.c b/migration-rdma.c
  index eeb4302..f60749b 100644
  --- a/migration-rdma.c
  +++ b/migration-rdma.c
  @@ -949,6 +949,7 @@ route:
 ERROR(errp, result not equal to event_addr_resolved %s,
 rdma_event_str(cm_event-event));
 perror(rdma_resolve_addr);
  +rdma_ack_cm_event(cm_event);
 ret = -EINVAL;
 goto err_resolve_get_addr;
 }
  Reviewed-by: Michael R. Hines mrhi...@us.ibm.com
 
  Good catch. =) That's an obvious bug. It looks like I need
  to do a much better job of kill -9 inside the regression
  testing scripts - probably i should try killing the migration
  prematurely at different periods just to be sure there are
  no more places where the connection state is not getting
  cleaned up..
 
  - Michael
 
  Michael, do you have a plan to pull this patch to master? Thanks.
 
  Best regards,
  -Gonglei
 
 
 Sorry for the late reply, but I'm not the maintainer for migration,
 that's Juan
 (I can only signoff on patches like everyone else =).
 
 I also have outstanding RDMA patches myself that have not yet been pulled.
 
 Would you mind pinging Juan for both of us?
 
Thanks.
The patch is Cc'ing Juan, maybe he is very busy. 
I have post v2 even, but I have not gotten any reply. 
I have no idea how to do next.


Best regards,
-Gonglei




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Greg Kurz
On Thu, 15 May 2014 11:34:25 +0530
Amit Shah amit.s...@redhat.com wrote:

 On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
  There is a need to add some more fields to VirtIODevice that should be
  migrated (broken status, endianness). The problem is that we do not
  want to break compatibility while adding a new feature... This issue has
  been addressed in the generic VMState code with the use of optional
  subsections. As a *temporary* alternative to port the whole virtio
  migration code to VMState, this patch mimics a similar subsectionning
  ability for virtio.
  
  Since each virtio device is streamed in its own section, the idea is to
  stream subsections between the end of the device section and the start
  of the next sections. This allows an older QEMU to complain and exit
  when fed with subsections:
  
  Unknown savevm section type 5
  Error -22 while loading VM state
 
 Please make this configurable -- either via configure or device
 properties.  That avoids having to break existing configurations that
 work without this patch.
 

Hmmm... you mean we support migration from a newer QEMU to an older one ?

  All users of virtio_load()/virtio_save() need to be patched because the
  subsections are streamed AFTER the device itself.
 
 Since all have the same fixup, I'm wondering if a new section can be
 added to the virtio-bus itself, which gets propagated to all devices
 upon load in the dest.
 

That would be nice if possible. I will have a closer look.

   Amit
 

Thanks.

-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

Anarchy is about taking complete responsibility for yourself.
Alan Moore.




Re: [Qemu-devel] usb: usb tablet freeze when save/restore guest os

2014-05-15 Thread Gerd Hoffmann
  Hi,

 Well then, may I post a formal patch for this issue, Gerd? Thanks.

I'd like to know what the root cause for the lost interrupt is.

Not implementing PIRQ enable could be it, especially as the guest os
seems to use it (otherwise your patch would have no effect).

The check for the PIRQ enable bit should be in uhci_update_irq though,
and you should check the single bit only, not the whole legacy support
register.

cheers,
  Gerd






Re: [Qemu-devel] [PATCH] iotests: Use configured python

2014-05-15 Thread Markus Armbruster
Max Reitz mre...@redhat.com writes:

 On 14.05.2014 14:33, Markus Armbruster wrote:
 Max Reitz mre...@redhat.com writes:

 Currently, QEMU's iotests rely on /usr/bin/env to start the correct
 Python (that is, at least Python 2.4, but not 3). On systems where
 Python 3 is the default, the user has no clean way of making the iotests
 use the correct binary.

 This commit makes the iotests use the Python selected by configure.

 Signed-off-by: Max Reitz mre...@redhat.com
 I'm afraid this broke qemu-iotests in a separate build tree:

  ./check: line 38: ./common.env: No such file or directory

 As I already replied to Peter, I see two (or maybe three) ways to fix this:

 The first is, we use the correct path to common.env. This would
 however result in modification of the source tree although this is
 probably not what the user intends with an out-of-tree build. On the
 other hand, this would just work.

Writing to the source tree works only when the write is exactly the same
for every build tree.  Example: autoconf writing configure.

Modifying files under git-control is right out.

 The second is, we do not create common.env for out-of-tree builds and
 add a default common.env to the repository (PYTHON = python should
 probably suffice). This would not introduce any regressions, however,
 the iotests would remain problematic on systems with Python 3 being
 the default when using out-of-tree builds.

A difference between an in-tree and an out-of-tree build is a bug.

If plain python is good enough for out-of-tree builds, it should be
good enough for in-tree builds.

As I guess that out-of-tree
 builds are actually recommended,

Correct.

  this would mean that the better
 solution might be to revert my patch and instead change every python
 occurrence in the iotests' Shebangs to python2, as kind of a third
 way to go. However, for this I'm not sure whether all systems which
 are supposed to be supported by qemu actually have a python2
 executable/symlink. I guess so, but then again...

I don't know.  Try and find out :)

 So, either we fix it but try to write to the source tree without the
 user intending to modify it; or we fix it for in-tree builds only; or
 we drop the magic and just use python2 in the iotests' Shebangs.

The problem is including generated bits, namely results of configure,
into source files.

The Autoconf way is to substitute placeholders in FOO.in producing FOO.

When you want to limit .in contents as much as possible, you factor out
the stuff that needs substitutions into some SUB.in, which you then
include into FOO.  Requires a suitable include mechanism.  In shell,
builtin source.

But then you need to find SUB from FOO.  I've seen two ways used:

1. Assume SUB is in the current directory.  Link FOO into the build tree
to make it so.

2. Require FOO to be run in a way that lets it find its source
directory.  If whatever runs FOO puts the full path into argv[0], you
can use that.  Else, require a suitable argument or environment
variable.



Re: [Qemu-devel] [PATCH v3] serial-pci: Set prog interface field of pci config to 16550 compatible

2014-05-15 Thread Gerd Hoffmann
static Property serial_pci_properties[] = {
DEFINE_PROP_CHR(chardev,  PCISerialState, state.chr),
   +DEFINE_PROP_UINT8(compat,  PCISerialState, compat, 0),
DEFINE_PROP_END_OF_LIST(),
};
  
  mst, do you take that through the pci tree?
  
  cheers,
Gerd
  
 
 Yes but I'd like the property renamed. Agree?

Yes, something more descriptive is reasonable.

cheers,
  Gerd





Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Amit Shah
On (Thu) 15 May 2014 [08:49:48], Greg Kurz wrote:
 On Thu, 15 May 2014 11:34:25 +0530
 Amit Shah amit.s...@redhat.com wrote:
 
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
   There is a need to add some more fields to VirtIODevice that should be
   migrated (broken status, endianness). The problem is that we do not
   want to break compatibility while adding a new feature... This issue has
   been addressed in the generic VMState code with the use of optional
   subsections. As a *temporary* alternative to port the whole virtio
   migration code to VMState, this patch mimics a similar subsectionning
   ability for virtio.
   
   Since each virtio device is streamed in its own section, the idea is to
   stream subsections between the end of the device section and the start
   of the next sections. This allows an older QEMU to complain and exit
   when fed with subsections:
   
   Unknown savevm section type 5
   Error -22 while loading VM state
  
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
  
 
 Hmmm... you mean we support migration from a newer QEMU to an older one ?

Well, not really support, since many things don't work anyway, but
let's make that easier on downstreams that may want to.

   All users of virtio_load()/virtio_save() need to be patched because the
   subsections are streamed AFTER the device itself.
  
  Since all have the same fixup, I'm wondering if a new section can be
  added to the virtio-bus itself, which gets propagated to all devices
  upon load in the dest.
  
 
 That would be nice if possible. I will have a closer look.

Thanks!

Amit



[Qemu-devel] [PATCH v4 1/2] qemu-iotests: Add data pattern in version3 VMDK sample image in 059

2014-05-15 Thread Fam Zheng
It's possible that we diverge from the specification with our
implementation.  Having a reference image in the test cases may detect
such problems when we introduce a bug that can read what it creates, but
can't handle a real VMDK.

Signed-off-by: Fam Zheng f...@redhat.com
---
 tests/qemu-iotests/059 |   4 +
 tests/qemu-iotests/059.out | 202 -
 .../sample_images/iotest-version3.vmdk.bz2 | Bin 414 - 4764 bytes
 3 files changed, 205 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/059 b/tests/qemu-iotests/059
index 26a2fd3..3c053c2 100755
--- a/tests/qemu-iotests/059
+++ b/tests/qemu-iotests/059
@@ -114,6 +114,10 @@ echo
 echo === Testing version 3 ===
 _use_sample_img iotest-version3.vmdk.bz2
 _img_info
+for i in {0..99}; do
+$QEMU_IO -r -c read -P $(( i % 10 + 0x30 )) $(( i * 64 * 1024 * 10 + i * 
512 )) 512 $TEST_IMG \
+| _filter_qemu_io
+done
 
 echo
 echo === Testing 4TB monolithicFlat creation and IO ===
diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index eba0ded..0dadba6 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -2056,8 +2056,208 @@ wrote 512/512 bytes at offset 10240
 === Testing version 3 ===
 image: TEST_DIR/iotest-version3.IMGFMT
 file format: IMGFMT
-virtual size: 1.0G (1073741824 bytes)
+virtual size: 16G (17179869184 bytes)
 cluster_size: 65536
+read 512/512 bytes at offset 0
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 655872
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 1311744
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 1967616
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 2623488
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 3279360
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 3935232
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 4591104
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 5246976
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 5902848
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 6558720
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 7214592
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 7870464
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 8526336
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 9182208
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 9838080
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 10493952
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 11149824
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 11805696
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 12461568
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 13117440
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 13773312
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 14429184
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 15085056
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 15740928
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 16396800
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 17052672
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 17708544
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 18364416
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 19020288
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 19676160
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 20332032
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 20987904
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 21643776
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 22299648
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 bytes at offset 22955520
+512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+read 512/512 

[Qemu-devel] [PATCH v4 0/2] vmdk: Optimize cluster allocation

2014-05-15 Thread Fam Zheng
Patch 01 is new. Patch 02 is V4 with Kevin's comments addressed.

Fam Zheng (2):
  qemu-iotests: Add data pattern in version3 VMDK sample image in 059
  vmdk: Optimize cluster allocation

 block/vmdk.c   | 220 +
 tests/qemu-iotests/059 |   4 +
 tests/qemu-iotests/059.out | 202 ++-
 .../sample_images/iotest-version3.vmdk.bz2 | Bin 414 - 4764 bytes
 4 files changed, 343 insertions(+), 83 deletions(-)

-- 
1.9.2




[Qemu-devel] [PATCH v4 2/2] vmdk: Optimize cluster allocation

2014-05-15 Thread Fam Zheng
This drops the unnecessary bdrv_truncate() from, and also improves,
cluster allocation code path.

Before, when we need a new cluster, get_cluster_offset truncates the
image to bdrv_getlength() + cluster_size, and returns the offset of
added area, i.e. the image length before truncating.

This is not efficient, so it's now rewritten as:

  - Save the extent file length when opening.

  - When allocating cluster, use the saved length as cluster offset.

  - Don't truncate image, because we'll anyway write data there: just
write any data at the EOF position, in descending priority:

* New user data (cluster allocation happens in a write request).

* Filling data in the beginning and/or ending of the new cluster, if
  not covered by user data: either backing file content (COW), or
  zero for standalone images.

One major benifit of this change is, on host mounted NFS images, even
over a fast network, ftruncate is slow (see the example below). This
change significantly speeds up cluster allocation. Comparing by
converting a cirros image (296M) to VMDK on an NFS mount point, over
1Gbe LAN:

$ time qemu-img convert cirros-0.3.1.img /mnt/a.raw -O vmdk

Before:
real0m21.796s
user0m0.130s
sys 0m0.483s

After:
real0m2.017s
user0m0.047s
sys 0m0.190s

We also get rid of unchecked bdrv_getlength() and bdrv_truncate(), and
get a little more documentation in function comments.

Tested that this passes qemu-iotests for all VMDK subformats.

Signed-off-by: Fam Zheng f...@redhat.com

---
V4: Fix L2 table cache and image, and improve according to any other
comments. (Kevin)

V3: A new implementation following Kevin's suggestion.
---
 block/vmdk.c | 220 +--
 1 file changed, 138 insertions(+), 82 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 480ea37..796bc01 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -106,6 +106,7 @@ typedef struct VmdkExtent {
 uint32_t l2_cache_counts[L2_CACHE_SIZE];
 
 int64_t cluster_sectors;
+int64_t next_cluster_sector;
 char *type;
 } VmdkExtent;
 
@@ -124,7 +125,6 @@ typedef struct BDRVVmdkState {
 } BDRVVmdkState;
 
 typedef struct VmdkMetaData {
-uint32_t offset;
 unsigned int l1_index;
 unsigned int l2_index;
 unsigned int l2_offset;
@@ -397,6 +397,7 @@ static int vmdk_add_extent(BlockDriverState *bs,
 {
 VmdkExtent *extent;
 BDRVVmdkState *s = bs-opaque;
+int64_t ret;
 
 if (cluster_sectors  0x20) {
 /* 0x20 * 512Bytes = 1GB for one cluster is unrealistic */
@@ -428,6 +429,12 @@ static int vmdk_add_extent(BlockDriverState *bs,
 extent-l2_size = l2_size;
 extent-cluster_sectors = flat ? sectors : cluster_sectors;
 
+ret = bdrv_getlength(extent-file);
+if (ret  0) {
+return ret;
+}
+extent-next_cluster_sector = DIV_ROUND_UP(ret, BDRV_SECTOR_SIZE);
+
 if (s-num_extents  1) {
 extent-end_sector = (*(extent - 1)).end_sector + extent-sectors;
 } else {
@@ -954,57 +961,96 @@ static int vmdk_refresh_limits(BlockDriverState *bs)
 return 0;
 }
 
+/**
+ * get_whole_cluster
+ *
+ * Copy backing file's cluster that covers @sector_num, otherwise write zero,
+ * to the cluster at @cluster_sector_num.
+ *
+ * If @skip_start  @skip_end, the relative range [@skip_start, @skip_end) is
+ * not copied or written, and leave it for call to write user data in the
+ * request.
+ */
 static int get_whole_cluster(BlockDriverState *bs,
-VmdkExtent *extent,
-uint64_t cluster_offset,
-uint64_t offset,
-bool allocate)
+ VmdkExtent *extent,
+ uint64_t cluster_sector_num,
+ uint64_t sector_num,
+ uint64_t skip_start, uint64_t skip_end)
 {
 int ret = VMDK_OK;
-uint8_t *whole_grain = NULL;
+int64_t cluster_bytes;
+uint8_t *whole_grain;
+
+/* For COW, align request sector_num to cluster start */
+sector_num = QEMU_ALIGN_DOWN(sector_num, extent-cluster_sectors);
+cluster_bytes = extent-cluster_sectors  BDRV_SECTOR_BITS;
+whole_grain = qemu_blockalign(bs, cluster_bytes);
+
+if (!bs-backing_hd) {
+memset(whole_grain, 0,  skip_start  BDRV_SECTOR_BITS);
+memset(whole_grain + (skip_end  BDRV_SECTOR_BITS), 0,
+   cluster_bytes - (skip_end  BDRV_SECTOR_BITS));
+}
 
+assert(skip_end = sector_num + extent-cluster_sectors);
 /* we will be here if it's first write on non-exist grain(cluster).
  * try to read from parent image, if exist */
-if (bs-backing_hd) {
-whole_grain =
-qemu_blockalign(bs, extent-cluster_sectors  BDRV_SECTOR_BITS);
-if (!vmdk_is_cid_valid(bs)) {
-ret = VMDK_ERROR;
-goto exit;
-}
+if (bs-backing_hd  

Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Greg Kurz
On Thu, 15 May 2014 12:16:35 +0530
Amit Shah amit.s...@redhat.com wrote:
 On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
  On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
   On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio.
 
 BTW Greg, do you plan on working on vmstate for virtio?
 

Yes.

Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:

Unknown savevm section type 5
Error -22 while loading VM state
   
   Please make this configurable -- either via configure or device
   properties.  That avoids having to break existing configurations that
   work without this patch.
   
All users of virtio_load()/virtio_save() need to be patched because the
subsections are streamed AFTER the device itself.
   
   Since all have the same fixup, I'm wondering if a new section can be
   added to the virtio-bus itself, which gets propagated to all devices
   upon load in the dest.
  
  This calls for a way for devices to inherit properties from the bus,
  which doesn't exist ATM.
  Fine but let's not hold up this patchset because of this.
 
 No, only suggestion is to add a migration section in the bus, and then
 it's easier to do this in the post-migrate functions for each device
 -- so only one new section gets introduced instead of all devices
 being modified to send a new subsection.
 

The main problem I see is that virtio sucks: as you see in patch 8, we have
to be careful not to call vring or virtqueue stuff before the device knows
its endianness or it breaks... I need to study how the virtio-bus gets
migrated to ensure the endian section is streamed before the devices.

   Amit
 

Thanks.

-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

Anarchy is about taking complete responsibility for yourself.
Alan Moore.




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 08:49:48AM +0200, Greg Kurz wrote:
 On Thu, 15 May 2014 11:34:25 +0530
 Amit Shah amit.s...@redhat.com wrote:
 
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
   There is a need to add some more fields to VirtIODevice that should be
   migrated (broken status, endianness). The problem is that we do not
   want to break compatibility while adding a new feature... This issue has
   been addressed in the generic VMState code with the use of optional
   subsections. As a *temporary* alternative to port the whole virtio
   migration code to VMState, this patch mimics a similar subsectionning
   ability for virtio.
   
   Since each virtio device is streamed in its own section, the idea is to
   stream subsections between the end of the device section and the start
   of the next sections. This allows an older QEMU to complain and exit
   when fed with subsections:
   
   Unknown savevm section type 5
   Error -22 while loading VM state
  
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
  
 
 Hmmm... you mean we support migration from a newer QEMU to an older one ?

In theory yes, that's why we have the -M switch.
This area isn't well tested unfortunately, but there are
downstreams that test and productize it.


   All users of virtio_load()/virtio_save() need to be patched because the
   subsections are streamed AFTER the device itself.
  
  Since all have the same fixup, I'm wondering if a new section can be
  added to the virtio-bus itself, which gets propagated to all devices
  upon load in the dest.
  
 
 That would be nice if possible. I will have a closer look.
 
  Amit
  
 
 Thanks.
 
 -- 
 Gregory Kurz kurzg...@fr.ibm.com
  gk...@linux.vnet.ibm.com
 Software Engineer @ IBM/Meiosys  http://www.ibm.com
 Tel +33 (0)562 165 496
 
 Anarchy is about taking complete responsibility for yourself.
 Alan Moore.



Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 12:16:35PM +0530, Amit Shah wrote:
 On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
  On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
   On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio.
 
 BTW Greg, do you plan on working on vmstate for virtio?
 
Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:

Unknown savevm section type 5
Error -22 while loading VM state
   
   Please make this configurable -- either via configure or device
   properties.  That avoids having to break existing configurations that
   work without this patch.
   
All users of virtio_load()/virtio_save() need to be patched because the
subsections are streamed AFTER the device itself.
   
   Since all have the same fixup, I'm wondering if a new section can be
   added to the virtio-bus itself, which gets propagated to all devices
   upon load in the dest.
  
  This calls for a way for devices to inherit properties from the bus,
  which doesn't exist ATM.
  Fine but let's not hold up this patchset because of this.
 
 No, only suggestion is to add a migration section in the bus, and then
 it's easier to do this in the post-migrate functions for each device
 -- so only one new section gets introduced instead of all devices
 being modified to send a new subsection.
 
   Amit

I don't mind but the gain isn't very big here.

-- 
MST



Re: [Qemu-devel] [PATCH 0/4] Support more than 255 cpus

2014-05-15 Thread Li, ZhenHua

Kernel's kvm support is not here.
x2APIC is needed, I will try to do that later.

On 05/13/2014 06:53 PM, Jan Kiszka wrote:

On 2014-05-13 09:09, Li, Zhen-Hua wrote:

From: Li, ZhenHua zhen-h...@hp.com

These series patches are trying to make Qemu support more than 255 CPUs.
The max cpu number changed to 4096.

  Support more than 255 cpus: ACPI and APIC defines
  Support more than 255 cpus: max_cpus to 4096
  Support more than 255 cpus: max cpumask bit to 4096
  Support more than 255 cpus: runtime chec

  include/hw/acpi/cpu_hotplug_defs.h | 4 ++--
  include/hw/i386/apic_internal.h| 2 +-
  include/hw/i386/pc.h | 2 +-
  include/sysemu/sysemu.h | 2 +-
  hw/i386/acpi-build.c | 8 

Don't we need x2APIC support to provide 255 CPUs? Where so you enforce
this, i.e. keep the restriction to 255 CPUs when we are not in KVM mode
with in-kernel APIC (the emulate APIC lacks x2APIC mode, unfortunately)?
But, wait, KVM only supports up to 255 VCPUs. So what are you targeting at?

Jan






Re: [Qemu-devel] [PATCH 4/4] Support more than 255 cpus: runtime check

2014-05-15 Thread Li, ZhenHua

Maybe it should be 4 bytes for 4096 (0x1000).

On 05/13/2014 04:19 PM, Max Filippov wrote:

On Tue, May 13, 2014 at 11:09 AM, Li, Zhen-Hua zhen-h...@hp.com wrote:

From: Li, ZhenHua zhen-h...@hp.com

There is some runtime check for max cpu count. Make them support 4096 cpus.

Signed-off-by: Li, ZhenHua zhen-h...@hp.com
---
  hw/i386/acpi-build.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index c98df88..5c3bf10 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c

[...]


@@ -680,7 +680,7 @@ build_append_notify_method(GArray *device, const char *name,
  for (i = 0; i  count; i++) {
  GArray *target = build_alloc_array();
  build_append_nameseg(target, format, i);
-assert(i  256); /* Fits in 1 byte */
+assert(i  4096); /* Fits in 1 byte */

The comment is no longer true.
Also the function build_append_notify_method is called with format argument
set to CP%0.02X, looks like this should be changed to CP%0.03X.






[Qemu-devel] [PATCH] virtio-net: announce self by guest

2014-05-15 Thread Jason Wang
It's hard to track all mac addresses and their configurations (e.g
vlan or ipv6) in qemu. Without those informations, it's impossible to
build proper garp packet after migration. The only possible solution
to this is let guest (who knew all configurations) to do this.

So, this patch introduces a new readonly config status bit of virtio-net,
VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
presence of its link through config update interrupt.When guest has
done the announcement, it should ack the notification through
VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
Linux guest).

During load, a counter of announcing rounds were set so that the after
the vm is running it can trigger rounds of config interrupts to notify
the guest to build and send the correct garps.

Tested with ping to guest with vlan during migration. Without the
patch, lots of the packets were lost after migration. With the patch,
could not notice packet loss after migration.

Reference:
RFC v2: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
RFC v1: https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
V7: https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html

Changes from RFC v2:
- use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
- compat self announce for 2.0 machine type

Changes from RFC v1:
- clean VIRTIO_NET_S_ANNOUNCE bit during reset
- free announce timer during clean
- make announce work for non-vhost case

Changes from V7:
- Instead of introducing a global method for each kind of nic, this
  version limits the changes to virtio-net itself.

Cc: Liuyongan liuyon...@huawei.com
Cc: Amos Kong ak...@redhat.com
Signed-off-by: Jason Wang jasow...@redhat.com
---
 hw/net/virtio-net.c|   48 
 include/hw/i386/pc.h   |5 
 include/hw/virtio/virtio-net.h |   16 +
 3 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 940a7cf..98d59e9 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -25,6 +25,7 @@
 #include monitor/monitor.h
 
 #define VIRTIO_NET_VM_VERSION11
+#define VIRTIO_NET_ANNOUNCE_ROUNDS3
 
 #define MAC_TABLE_ENTRIES64
 #define MAX_VLAN(1  12)   /* Per 802.1Q definition */
@@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet *n, uint8_t status)
 (n-status  VIRTIO_NET_S_LINK_UP)  vdev-vm_running;
 }
 
+static void virtio_net_announce_timer(void *opaque)
+{
+VirtIONet *n = opaque;
+VirtIODevice *vdev = VIRTIO_DEVICE(n);
+
+if (n-announce 
+virtio_net_started(n, vdev-status) 
+vdev-guest_features  (0x1  VIRTIO_NET_F_GUEST_ANNOUNCE) 
+vdev-guest_features  (0x1  VIRTIO_NET_F_CTRL_VQ)) {
+
+n-announce--;
+n-status |= VIRTIO_NET_S_ANNOUNCE;
+virtio_notify_config(vdev);
+} else {
+timer_del(n-announce_timer);
+n-announce = 0;
+}
+}
+
 static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(n);
@@ -147,6 +167,8 @@ static void virtio_net_set_status(struct VirtIODevice 
*vdev, uint8_t status)
 
 virtio_net_vhost_status(n, status);
 
+virtio_net_announce_timer(n);
+
 for (i = 0; i  n-max_queues; i++) {
 q = n-vqs[i];
 
@@ -322,6 +344,9 @@ static void virtio_net_reset(VirtIODevice *vdev)
 n-nobcast = 0;
 /* multiqueue is disabled by default */
 n-curr_queues = 1;
+timer_del(n-announce_timer);
+n-announce = 0;
+n-status = ~VIRTIO_NET_S_ANNOUNCE;
 
 /* Flush any MAC and VLAN filter table state */
 n-mac_table.in_use = 0;
@@ -731,6 +756,22 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, 
uint8_t cmd,
 return VIRTIO_NET_OK;
 }
 
+static int virtio_net_handle_announce(VirtIONet *n, uint8_t cmd,
+  struct iovec *iov, unsigned int iov_cnt)
+{
+if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
+n-status = ~VIRTIO_NET_S_ANNOUNCE;
+if (n-announce) {
+timer_mod(n-announce_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 50 +
+  (VIRTIO_NET_ANNOUNCE_ROUNDS - n-announce - 1) * 100);
+}
+return VIRTIO_NET_OK;
+} else {
+return VIRTIO_NET_ERR;
+}
+}
+
 static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
 struct iovec *iov, unsigned int iov_cnt)
 {
@@ -794,6 +835,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
 } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
 status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, iov_cnt);
+} else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
+status = virtio_net_handle_announce(n, 

Re: [Qemu-devel] [PATCH v3 21/25] vmdk: implement .bdrv_detach/attach_aio_context()

2014-05-15 Thread Stefan Hajnoczi
On Thu, May 15, 2014 at 09:34:20AM +0800, Fam Zheng wrote:
 On Thu, 05/08 16:34, Stefan Hajnoczi wrote:
  Implement .bdrv_detach/attach_aio_context() interfaces to propagate
  detach/attach to BDRVVmdkState-extents[].file.  The block layer takes
  care of -file and -backing_hd but doesn't know about our extents
  BlockDriverStates, which is also part of the graph.
  
  Reviewed-by: Fam Zheng f...@redhat.com
  Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
  ---
 
 Noticed this doesn't apply any more due to recent VMDK fixes, because the
 context is different. Trivial to fix, though.

Yep, me too.  The resolution is trivial though.

Stefan



Re: [Qemu-devel] [PATCH 1/3] throttle: add throttle_detach/attach_aio_context()

2014-05-15 Thread Stefan Hajnoczi
On Wed, May 14, 2014 at 05:05:58PM +0200, Benoît Canet wrote:
 The Wednesday 14 May 2014 à 16:22:45 (+0200), Stefan Hajnoczi wrote :
  Block I/O throttling uses timers and currently always adds them to the
  main loop.  Throttling will break if bdrv_set_aio_context() is used to
  move a BlockDriverState to a different AioContext.
  
  This patch adds throttle_detach/attach_aio_context() interfaces so the
 s/so the/to the/ ? 

Thanks, let me know how you feel about my replies below.  If there is no
other need to respin then let's let this commit description typo slide.

  diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
  index ab29b0b..b890613 100644
  --- a/include/qemu/throttle.h
  +++ b/include/qemu/throttle.h
  @@ -67,6 +67,11 @@ typedef struct ThrottleState {
 
 Can we make sure this header knows about AioContext in case
 there is another throttle user not block related ?

It already does because qemu-common.h includes qemu-typedefs.h.  We get
an opaque AioContext.

The point of the typedefs.h is to avoid pulling in all the headers when
callers just pass around pointers and don't need the actual definition.

In this case, I think the split makes sense because callers might pass
AioContext without actually using the AioContext API themselves.

  diff --git a/tests/test-throttle.c b/tests/test-throttle.c
  index 1d4ffd3..5fa5000 100644
  --- a/tests/test-throttle.c
  +++ b/tests/test-throttle.c
  @@ -12,8 +12,10 @@
   
   #include glib.h
   #include math.h
 
  +#include block/aio.h
 
 And remove this one eventually.

This include pulls in the AioContext API because the test case itself
manipulates the AioContext.



Re: [Qemu-devel] [PATCH 0/3] throttle: use AioContext for dataplane support

2014-05-15 Thread Stefan Hajnoczi
On Wed, May 14, 2014 at 07:40:18PM +0200, Benoît Canet wrote:
 The Wednesday 14 May 2014 à 16:22:44 (+0200), Stefan Hajnoczi wrote :
  This series applies on top of my dataplane: use QEMU block layer series.
  
  Now that the dataplane code path is using the QEMU block layer we should 
  make
  I/O throttling limits safe to use.  When the block_set_io_throttle monitor
  command is executed, the BlockDriverState's AioContext must be acquired in
  order to prevent race conditions with the IOThread that is processing 
  requests
  from the guest.
  
  The new block layer AioContext detach/attach mechanism needs to be extended 
  to
  move the throttling timer to a new AioContext.  This makes throttling work
  across bdrv_set_aio_context() calls.
  
  The result of this series is that I/O throttling works with dataplane and
  limits may be changed at runtime using the monitor.
  
  Stefan Hajnoczi (3):
throttle: add throttle_detach/attach_aio_context()
throttle: add detach/attach test case
blockdev: acquire AioContext in block_set_io_throttle
  
   block.c |  7 +++
   blockdev.c  |  6 ++
   include/qemu/throttle.h | 10 ++
   tests/test-throttle.c   | 49 
  -
   util/throttle.c | 27 +++
   5 files changed, 90 insertions(+), 9 deletions(-)
  
  -- 
  1.9.0
  
 
 I find One thing chocking is this series.
 I carefully decloupled the throttling code from the block layer so anyone 
 could reuse it.
 I am under the impression that this series couples it back.

The coupling is just due to the current header file layout.  It is not a
real coupling to the block layer.

Throttling has a dependency on timers.  Timers are part of an event loop
(either AioContext or main loop).

The AioContext prototypes happen to be in block/aio.h but they are not a
dependency on block.h or BlockDriverState.  This means we could extract
them and move them to qemu/aiocontext.h in a separate series.

Hope this explains the block/aio.h header and shows it's not true
coupling.

Stefan



Re: [Qemu-devel] [PATCH] iotests: Use configured python

2014-05-15 Thread Markus Armbruster
Markus Armbruster arm...@redhat.com writes:

[...]
 The problem is including generated bits, namely results of configure,
 into source files.

 The Autoconf way is to substitute placeholders in FOO.in producing FOO.

 When you want to limit .in contents as much as possible, you factor out
 the stuff that needs substitutions into some SUB.in, which you then
 include into FOO.  Requires a suitable include mechanism.  In shell,
 builtin source.

 But then you need to find SUB from FOO.  I've seen two ways used:

 1. Assume SUB is in the current directory.  Link FOO into the build tree
 to make it so.

 2. Require FOO to be run in a way that lets it find its source
 directory.  If whatever runs FOO puts the full path into argv[0], you
 can use that.  Else, require a suitable argument or environment
 variable.

Uh, I left out some obvious details here.  Revert the role of FOO and
SUB.  Generate FOO from FOO.in into the build tree, include the real
meat from SUB.



Re: [Qemu-devel] pidfile option in config file

2014-05-15 Thread Markus Armbruster
William Dauchy wdau...@gmail.com writes:

 Hello,

 I'm using the `-pidfile` option in my qemu command line. Since I'm
 also using the `-readconfig` option I thought it was a good thing to
 include the pidfile option in my config file. Unfortunately I did not
 found any possibility to add such option in my config file.
 Is it something I could expect?

User interface bug: not all options are coded in a way that makes them
work with -readconfig.  This includes -pidfile.  Sorry!



Re: [Qemu-devel] [PATCH] configure: Ensure tests/qemu-iotests exists before writing common.env

2014-05-15 Thread Peter Maydell
On 14 May 2014 23:47, Max Reitz mre...@redhat.com wrote:
 On 14.05.2014 16:26, Peter Maydell wrote:

 Before we write common.env to the tests/qemu-iotests directory, ensure
 that it exists. This fixes out-of-tree builds from clean.

 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 If somebody would like to review this I'll apply it to master as
 a buildfix...

   configure | 1 +
   1 file changed, 1 insertion(+)

 diff --git a/configure b/configure
 index 6adfa72..c4e43ed 100755
 --- a/configure
 +++ b/configure
 @@ -4770,6 +4770,7 @@ fi
 iotests_common_env=tests/qemu-iotests/common.env
   +mkdir -p $(dirname $iotests_common_env)
   echo # Automatically generated by configure - do not modify 
 $iotests_common_env
   echo  $iotests_common_env
   echo PYTHON='$python'  $iotests_common_env


 If we do this, we'd have a commen.env which we cannot use, as all the tests
 are still in the original source tree.

OK, this is busted then. We need to revert your original patch;
when we've figured out how to do it properly we can apply
a corrected version.

 I'd rather use $source_path/tests/qemu_iotests/common.env instead, or, if
 that is unacceptable as it modifies the original source tree (which is
 probably not desired when doing an out-of-tree build), write common.env only
 if this is an in-tree build.

You're correct, we must not write to the original source tree.
We mustn't behave differently on in-tree and out-of-tree builds
either.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 6/7] arm/virt: enable PSCI emulation support for system emulation

2014-05-15 Thread Peter Maydell
On 14 May 2014 23:58, Rob Herring rob.herr...@linaro.org wrote:
 On Wed, May 14, 2014 at 4:25 PM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 An old host kernel, or an old guest kernel? The former is fine,
 because the KVM CPU init code will just ask for the KVM
 capability and fill in the ARMCPU field appropriately.
 For the latter, how are you supposed to determine what the
 guest kernel can support?

 Guest kernels and this was exactly my point that you can't determine
 it. The virt dtb is for the guest kernel and must be either 0.1 PSCI
 only or both 0.1 and 0.2. I think I misread what you meant. Reading
 the other thread, as long as you just mean changing the if statement
 like this, then we are in agreement:

 if (psci version is 0.1) {
 qemu_fdt_setprop_string(fdt, /psci, compatible, arm,psci);
 } else {
 const char compat[] = arm,psci-0.2\0arm,psci;
 qemu_fdt_setprop(fdt, /psci, compatible, compat, sizeof(compat));
 }

Yes, that's all I meant; sorry for the confusion.

I hadn't noticed that we announce and support both the 0.1
and 0.2 interfaces in the else {} branch, which is probably
why my phrasing was confusing.

thanks
-- PMM



Re: [Qemu-devel] pidfile option in config file

2014-05-15 Thread William Dauchy
On Thu, May 15, 2014 at 10:15 AM, Markus Armbruster arm...@redhat.com wrote:
 User interface bug: not all options are coded in a way that makes them
 work with -readconfig.  This includes -pidfile.  Sorry!

Thanks for the answer. Maybe I can try to send a patch one day or another.
I wanted to make sure there was no reason for this current behavior.

Regards,
-- 
William



Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest

2014-05-15 Thread Michael S. Tsirkin
Thanks, looks good.
Some minor comments below,

On Thu, May 15, 2014 at 03:16:47PM +0800, Jason Wang wrote:
 It's hard to track all mac addresses and their configurations (e.g
 vlan or ipv6) in qemu. Without those informations, it's impossible to

s/those informations/this information/

 build proper garp packet after migration. The only possible solution
 to this is let guest (who knew all configurations) to do this.

s/knew/knows/

 
 So, this patch introduces a new readonly config status bit of virtio-net,
 VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
 presence of its link through config update interrupt.When guest has
 done the announcement, it should ack the notification through
 VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
 feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
 Linux guest).
 
 During load, a counter of announcing rounds were set so that the after

s/were/is/
s/the after/after/

 the vm is running it can trigger rounds of config interrupts to notify
 the guest to build and send the correct garps.
 
 Tested with ping to guest with vlan during migration. Without the
 patch, lots of the packets were lost after migration. With the patch,
 could not notice packet loss after migration.

below changelog should go after ---, until the ack list.

 
 Reference:
 RFC v2: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
 RFC v1: https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
 V7: https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html
 
 Changes from RFC v2:
 - use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
 - compat self announce for 2.0 machine type
 
 Changes from RFC v1:
 - clean VIRTIO_NET_S_ANNOUNCE bit during reset
 - free announce timer during clean
 - make announce work for non-vhost case
 
 Changes from V7:
 - Instead of introducing a global method for each kind of nic, this
   version limits the changes to virtio-net itself.
 
 Cc: Liuyongan liuyon...@huawei.com
 Cc: Amos Kong ak...@redhat.com
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
  hw/net/virtio-net.c|   48 
 
  include/hw/i386/pc.h   |5 
  include/hw/virtio/virtio-net.h |   16 +
  3 files changed, 69 insertions(+), 0 deletions(-)
 
 diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
 index 940a7cf..98d59e9 100644
 --- a/hw/net/virtio-net.c
 +++ b/hw/net/virtio-net.c
 @@ -25,6 +25,7 @@
  #include monitor/monitor.h
  
  #define VIRTIO_NET_VM_VERSION11
 +#define VIRTIO_NET_ANNOUNCE_ROUNDS3
  
  #define MAC_TABLE_ENTRIES64
  #define MAX_VLAN(1  12)   /* Per 802.1Q definition */

I would make it  5 to be consistent with SELF_ANNOUNCE_ROUNDS
in savevm.c

in fact, let's move SELF_ANNOUNCE_ROUNDS to include/migration/vmstate.h
and reuse it.

 @@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet *n, uint8_t 
 status)
  (n-status  VIRTIO_NET_S_LINK_UP)  vdev-vm_running;
  }
  
 +static void virtio_net_announce_timer(void *opaque)
 +{
 +VirtIONet *n = opaque;
 +VirtIODevice *vdev = VIRTIO_DEVICE(n);
 +
 +if (n-announce 

I would make it  0 here, just in case it becomes negative as a result
of some bug.

 +virtio_net_started(n, vdev-status) 
 +vdev-guest_features  (0x1  VIRTIO_NET_F_GUEST_ANNOUNCE) 
 +vdev-guest_features  (0x1  VIRTIO_NET_F_CTRL_VQ)) {
 +
 +n-announce--;
 +n-status |= VIRTIO_NET_S_ANNOUNCE;
 +virtio_notify_config(vdev);
 +} else {
 +timer_del(n-announce_timer);

why do this here?

 +n-announce = 0;

why is this here?

 +}
 +}
 +
  static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
  {
  VirtIODevice *vdev = VIRTIO_DEVICE(n);
 @@ -147,6 +167,8 @@ static void virtio_net_set_status(struct VirtIODevice 
 *vdev, uint8_t status)
  
  virtio_net_vhost_status(n, status);
  
 +virtio_net_announce_timer(n);
 +

why do this here? why not right after we set announce counter?

  for (i = 0; i  n-max_queues; i++) {
  q = n-vqs[i];
  
 @@ -322,6 +344,9 @@ static void virtio_net_reset(VirtIODevice *vdev)
  n-nobcast = 0;
  /* multiqueue is disabled by default */
  n-curr_queues = 1;
 +timer_del(n-announce_timer);
 +n-announce = 0;
 +n-status = ~VIRTIO_NET_S_ANNOUNCE;
  
  /* Flush any MAC and VLAN filter table state */
  n-mac_table.in_use = 0;
 @@ -731,6 +756,22 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, 
 uint8_t cmd,
  return VIRTIO_NET_OK;
  }
  
 +static int virtio_net_handle_announce(VirtIONet *n, uint8_t cmd,
 +  struct iovec *iov, unsigned int 
 iov_cnt)
 +{
 +if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK) {
 +n-status = ~VIRTIO_NET_S_ANNOUNCE;
 +if (n-announce) {

I would make it  0 here, just in case it becomes negative as a result
of some bug.

 +

Re: [Qemu-devel] [PATCH 0/4] Support more than 255 cpus

2014-05-15 Thread Jan Kiszka
On 2014-05-15 09:16, Li, ZhenHua wrote:
 Kernel's kvm support is not here.
 x2APIC is needed, I will try to do that later.

x2APIC emulation can wait if KVM support is there. But we need at least
one of them before starting to think about raising the limit.

Jan

PS: Please don't top-post.

 
 On 05/13/2014 06:53 PM, Jan Kiszka wrote:
 On 2014-05-13 09:09, Li, Zhen-Hua wrote:
 From: Li, ZhenHua zhen-h...@hp.com

 These series patches are trying to make Qemu support more than 255 CPUs.
 The max cpu number changed to 4096.

   Support more than 255 cpus: ACPI and APIC defines
   Support more than 255 cpus: max_cpus to 4096
   Support more than 255 cpus: max cpumask bit to 4096
   Support more than 255 cpus: runtime chec

   include/hw/acpi/cpu_hotplug_defs.h | 4 ++--
   include/hw/i386/apic_internal.h| 2 +-
   include/hw/i386/pc.h | 2 +-
   include/sysemu/sysemu.h | 2 +-
   hw/i386/acpi-build.c | 8 
 Don't we need x2APIC support to provide 255 CPUs? Where so you enforce
 this, i.e. keep the restriction to 255 CPUs when we are not in KVM mode
 with in-kernel APIC (the emulate APIC lacks x2APIC mode, unfortunately)?
 But, wait, KVM only supports up to 255 VCPUs. So what are you
 targeting at?

 Jan

 

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH v2 14/23] target-arm: add banked coprocessor register type and macros

2014-05-15 Thread Aggeler Fabian
On 14 May 2014, at 18:42, Greg Bellows 
greg.bell...@linaro.orgmailto:greg.bell...@linaro.org wrote:

On 13 May 2014 11:15, Fabian Aggeler 
aggel...@ethz.chmailto:aggel...@ethz.ch wrote:
Banked CP registers can be defined with a A32_BANKED_REG macro which defines
a non-secure instance of the register followed by an adjacent secure instance.
Using a union makes the code backwards-compatible since the non-secure
instance can normally be accessed by its existing name.

This comment indicates that the 0th entry or the default name is the non-secure 
bank, which differs from the code below.

The code does indeed use the 0th entry as non-secure and 1st entry as secure. 
Using the union the default name points to the
0th entry. Am I missing something here?



When translating a banked CP register access instruction in monitor mode,
the SCR.NS bit determines which instance is going to be accessed.

If EL3 is operating in Aarch64 state coprocessor registers are not
banked anymore but in some cases have its own _EL3 instance.

Signed-off-by: Sergey Fedorov 
s.fedo...@samsung.commailto:s.fedo...@samsung.com
Signed-off-by: Fabian Aggeler aggel...@ethz.chmailto:aggel...@ethz.ch
---
 target-arm/cpu.h   | 121 +
 target-arm/helper.c|  64 --
 target-arm/translate.c |  19 +---
 3 files changed, 184 insertions(+), 20 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index a970d55..9e325ac 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -80,6 +80,16 @@
 #define offsetofhigh32(S, M) (offsetof(S, M) + sizeof(uint32_t))
 #endif

+/* Define a banked coprocessor register state field. Use %name as the
+ * register field name for the secure instance. The non-secure instance
+ * has a _nonsecure suffix.

Where is the _nonsecure suffix?

The above comment appears to be incorrect as the code assumes that the 0th 
entry as the non-secure bank.

That comment is wrong and still reflects an earlier implementation where I 
didn’t yet use a union to be able to retain
the existing name. I will adjust it. Thanks for catching this.


+ */
+#define A32_BANKED_REG(type, name) \
+union { \
+type name; \
+type name##_banked[2]; \
+}
+
 /* Meanings of the ARMCPU object's two inbound GPIO lines */
 #define ARM_CPU_IRQ 0
 #define ARM_CPU_FIQ 1
@@ -89,6 +99,7 @@ typedef void ARMWriteCPFunc(void *opaque, int cp_info,
 typedef uint32_t ARMReadCPFunc(void *opaque, int cp_info,
int dstreg, int operand);

+
 struct arm_boot_info;

 #define NB_MMU_MODES 5
@@ -673,6 +684,78 @@ static inline bool arm_el_is_aa64(CPUARMState *env, int el)
 return arm_feature(env, ARM_FEATURE_AARCH64);
 }

+/* When EL3 is operating in Aarch32 state, the NS-bit determines
+ * whether the secure instance of a cp-register should be used. */
+#define USE_SECURE_REG(env) ( \
+arm_feature(env, ARM_FEATURE_SECURITY_EXTENSIONS)  \
+!arm_el_is_aa64(env, 3)  \
+!((env)-cp15.c1_scr  1/*NS*/))
+
+#define NONSECURE_BANK 0
+#define SECURE_BANK 1
+
+#define A32_BANKED_REG_GET(env, regname) \
+((USE_SECURE_REG(env)) ? \
+(env)-cp15.regname##_banked[SECURE_BANK] : \
+(env)-cp15.regname)
+
+#define A32_MAPPED_EL3_REG_GET(env, regname) \
+(((arm_el_is_aa64(env, 3)  arm_current_pl(env) == 3) || \
+  (USE_SECURE_REG(env))) ? \
+(env)-cp15.regname##_el3 : \
+(env)-cp15.regname##_el1)
+
+#define A32_BANKED_REG_SET(env, regname, val) \
+do { \
+if (USE_SECURE_REG(env)) { \
+(env)-cp15.regname##_banked[SECURE_BANK] = (val); \
+} else { \
+(env)-cp15.regname = (val); \
+} \
+} while (0)
+
+#define A32_MAPPED_EL3_REG_SET(env, regname, val) \
+do { \
+if ((arm_el_is_aa64(env, 3)  arm_current_pl(env) == 3) || \
+(USE_SECURE_REG(env))) { \
+(env)-cp15.regname##_el3 = (val); \
+} else { \
+(env)-cp15.regname##_el1 = (val); \
+} \
+} while (0)
+
+
+#define A32_BANKED_CURRENT_REG_GET(env, regname) \
+((!arm_el_is_aa64(env, 3)  arm_is_secure(env)) ? \
+(env)-cp15.regname##_banked[SECURE_BANK] : \
+(env)-cp15.regname)
+
+#define A32_MAPPED_EL3_CURRENT_REG_GET(env, regname) \
+(((arm_el_is_aa64(env, 3)  arm_current_pl(env) == 3) || \
+  (!arm_el_is_aa64(env, 3)  arm_is_secure(env))) ? \
+(env)-cp15.regname##_el3 : \
+(env)-cp15.regname##_el1)
+
+#define A32_BANKED_CURRENT_REG_SET(env, regname, val) \
+do { \
+if (!arm_el_is_aa64(env, 3)  arm_is_secure(env)) { \
+(env)-cp15.regname##_banked[SECURE_BANK] = (val); \
+} else { \
+(env)-cp15.regname = (val); \
+} \
+} while (0)
+
+#define 

Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread Mark Cave-Ayland

On 15/05/14 06:41, sonia verma wrote:


Hi

I'm getting below error when trying to boot the KVM with ethernet
bridging,kvm support and universel TUN enabled by the following command..

/usr/bin/qemu-system-ppc64 -m 512 -nographic -hda
/var/volatile/debian_lenny_
powerpc_standard.qcow2


FWIW I think this image will boot with normal qemu-system-ppc too.



cannot manage 'OHCI USB controller' PCI device type 'usb':
   106b 3f (c 3 10)

  =
  OpenBIOS 1.0 [Aug 28 2012 05:40]


Wow, that's fairly old...


  Configuration device id QEMU version 1 machine id 3
  CPUs: 1
  Memory: 512M
  UUID: ----
  CPU type PowerPC,970FX
Welcome to OpenBIOS v1.0 built on Aug 28 2012 05:40
Second-stage QUIK loader
Welcome to quik. mate is good.
Debian GNU/Linux PowerPCchosen/bootargs =
boot: `
Enter the kernel image name as [device:][partno]/path, where partno is a
number from 0 to 16.  Instead of /path you can type [mm-nn] to specify a
range of disk blocks (512B)
boot: Linux
initrd imagename = /initrd.img, mem_size: 4406840
initrd_start:
Starting at 51, , 1024
OF stdout device is: /pci@f000/mac-io@e/escc@13000/ch-b@13000
command line: console=ttyS0,9600 console=tty0
memory layout at init:
   alloc_bottom : 00c01000
   alloc_top: 2000
   alloc_top_hi : 2000
   rmo_top  : 2000
   ram_top  : 2000
Looking for displays
found display   : /pci@f000/QEMU,VGA@c, opening ... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x00c02000 - 0x00c024f0
Device tree struct  0x00c03000 - 0x00c05000
Calling quiesce ...
returning from prom_init

The prompt is stuck at init and is not able to proceed even after wating
for 15-20 minutes.

Please help regarding this.


In my testing here, I tend to find that some versions of Linux don't 
particularly like running in -nographic mode and hangs at the point 
above. Some kernels even get upset running with a 32/15-bit display and 
must be forced to 8-bit instead :/


Can you try removing -nographic and instead try one or both of the 
following:


-g 800x600x32
-g 800x600x8

And if those don't work, try the same image again but with 
qemu-system-ppc rather than qemu-system-ppc64.



HTH,

Mark.




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Andreas Färber
Am 15.05.2014 09:04, schrieb Greg Kurz:
 On Thu, 15 May 2014 12:16:35 +0530
 Amit Shah amit.s...@redhat.com wrote:
 On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
 On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
 On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
 Since each virtio device is streamed in its own section, the idea is to
 stream subsections between the end of the device section and the start
 of the next sections. This allows an older QEMU to complain and exit
 when fed with subsections:

 Unknown savevm section type 5
 Error -22 while loading VM state

 Please make this configurable -- either via configure or device
 properties.  That avoids having to break existing configurations that
 work without this patch.

Since backwards migration is not supported upstream, wouldn't it be
easiest to just add support for the subsection marker and skipping to
the end of section in that downstream?

 All users of virtio_load()/virtio_save() need to be patched because the
 subsections are streamed AFTER the device itself.

IMO this is calling for inversion of control - i.e. let virtio devices
call generic load/save functions that then dispatch to device-specific
code and let us add common stuff in a central place without forgetting
to add calls in some new device.

 Since all have the same fixup, I'm wondering if a new section can be
 added to the virtio-bus itself, which gets propagated to all devices
 upon load in the dest.

 This calls for a way for devices to inherit properties from the bus,
 which doesn't exist ATM.
 Fine but let's not hold up this patchset because of this.

 No, only suggestion is to add a migration section in the bus, and then
 it's easier to do this in the post-migrate functions for each device
 -- so only one new section gets introduced instead of all devices
 being modified to send a new subsection.

 
 The main problem I see is that virtio sucks: as you see in patch 8, we have
 to be careful not to call vring or virtqueue stuff before the device knows
 its endianness or it breaks... I need to study how the virtio-bus gets
 migrated to ensure the endian section is streamed before the devices.

There is no ordering guarantee. The state needs to be migrated in the
device or bus where it sits, if post-load processing is required; i.e.,
if it's in VirtIODevice then something like this series, if it were on
VirtioBus exclusively (device asking bus for its endianness each time
and does not do post-load stuff) then endianness could be migrated as a
new bus section. Not sure if that would help the broken state though?

Would touch on Stefan's alias properties for anything but virtio-mmio.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest

2014-05-15 Thread Jason Wang
On 05/15/2014 04:28 PM, Michael S. Tsirkin wrote:
 Thanks, looks good.
 Some minor comments below,

 On Thu, May 15, 2014 at 03:16:47PM +0800, Jason Wang wrote:
 It's hard to track all mac addresses and their configurations (e.g
 vlan or ipv6) in qemu. Without those informations, it's impossible to
 s/those informations/this information/

 build proper garp packet after migration. The only possible solution
 to this is let guest (who knew all configurations) to do this.
 s/knew/knows/

 So, this patch introduces a new readonly config status bit of virtio-net,
 VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
 presence of its link through config update interrupt.When guest has
 done the announcement, it should ack the notification through
 VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
 feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
 Linux guest).

 During load, a counter of announcing rounds were set so that the after
 s/were/is/
 s/the after/after/

Will correct those typos.
 the vm is running it can trigger rounds of config interrupts to notify
 the guest to build and send the correct garps.

 Tested with ping to guest with vlan during migration. Without the
 patch, lots of the packets were lost after migration. With the patch,
 could not notice packet loss after migration.
 below changelog should go after ---, until the ack list.


Ok.
 Reference:
 RFC v2: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
 RFC v1: https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
 V7: https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html

 Changes from RFC v2:
 - use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
 - compat self announce for 2.0 machine type

 Changes from RFC v1:
 - clean VIRTIO_NET_S_ANNOUNCE bit during reset
 - free announce timer during clean
 - make announce work for non-vhost case

 Changes from V7:
 - Instead of introducing a global method for each kind of nic, this
   version limits the changes to virtio-net itself.

 Cc: Liuyongan liuyon...@huawei.com
 Cc: Amos Kong ak...@redhat.com
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
  hw/net/virtio-net.c|   48 
 
  include/hw/i386/pc.h   |5 
  include/hw/virtio/virtio-net.h |   16 +
  3 files changed, 69 insertions(+), 0 deletions(-)

 diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
 index 940a7cf..98d59e9 100644
 --- a/hw/net/virtio-net.c
 +++ b/hw/net/virtio-net.c
 @@ -25,6 +25,7 @@
  #include monitor/monitor.h
  
  #define VIRTIO_NET_VM_VERSION11
 +#define VIRTIO_NET_ANNOUNCE_ROUNDS3
  
  #define MAC_TABLE_ENTRIES64
  #define MAX_VLAN(1  12)   /* Per 802.1Q definition */
 I would make it  5 to be consistent with SELF_ANNOUNCE_ROUNDS
 in savevm.c

 in fact, let's move SELF_ANNOUNCE_ROUNDS to include/migration/vmstate.h
 and reuse it.

Ok.
 @@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet *n, uint8_t 
 status)
  (n-status  VIRTIO_NET_S_LINK_UP)  vdev-vm_running;
  }
  
 +static void virtio_net_announce_timer(void *opaque)
 +{
 +VirtIONet *n = opaque;
 +VirtIODevice *vdev = VIRTIO_DEVICE(n);
 +
 +if (n-announce 
 I would make it  0 here, just in case it becomes negative as a result
 of some bug.

Sure.
 +virtio_net_started(n, vdev-status) 
 +vdev-guest_features  (0x1  VIRTIO_NET_F_GUEST_ANNOUNCE) 
 +vdev-guest_features  (0x1  VIRTIO_NET_F_CTRL_VQ)) {
 +
 +n-announce--;
 +n-status |= VIRTIO_NET_S_ANNOUNCE;
 +virtio_notify_config(vdev);
 +} else {
 +timer_del(n-announce_timer);
 why do this here?

 +n-announce = 0;
 why is this here?


If guest shuts down the device, there's no need to do the announcing.
 +}
 +}
 +
  static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
  {
  VirtIODevice *vdev = VIRTIO_DEVICE(n);
 @@ -147,6 +167,8 @@ static void virtio_net_set_status(struct VirtIODevice 
 *vdev, uint8_t status)
  
  virtio_net_vhost_status(n, status);
  
 +virtio_net_announce_timer(n);
 +
 why do this here? why not right after we set announce counter?

The reasons are:

- The counters were set in load, but the device is not running so we
could not inject the interrupt at that time.
- We can stop the progress when guest is shutting down the device.

  for (i = 0; i  n-max_queues; i++) {
  q = n-vqs[i];
  
 @@ -322,6 +344,9 @@ static void virtio_net_reset(VirtIODevice *vdev)
  n-nobcast = 0;
  /* multiqueue is disabled by default */
  n-curr_queues = 1;
 +timer_del(n-announce_timer);
 +n-announce = 0;
 +n-status = ~VIRTIO_NET_S_ANNOUNCE;
  
  /* Flush any MAC and VLAN filter table state */
  n-mac_table.in_use = 0;
 @@ -731,6 +756,22 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, 
 uint8_t cmd,
  return VIRTIO_NET_OK;
  }
  
 +static int 

Re: [Qemu-devel] [PATCH v1 00/22] target-arm: Preparations for A64 EL2 and 3

2014-05-15 Thread Aggeler Fabian
Hi Greg

Thanks for your comments. I still have to work through them. I am using 
OpenVirtualization in secure world, which then switches to a Linux kernel in 
non-secure world to test the patches. What about you?

Best,
Fabian

On 14 May 2014, at 15:55, Greg Bellows 
greg.bell...@linaro.orgmailto:greg.bell...@linaro.org wrote:

Hi Fabian,

I too had been updating the core TZ patches provided by Samsung.  From looking 
at your changes I see a lot of similarities in our code with the exception 
being the mechanism for banked register support.  The difference being that 
your approach is a bit more explicit in the declaration of the banked 
registers.  Whereas my approach was to update the banked registers once all the 
other registers were registered.  Both approaches I believe work.

I spoke with Peter M. and he and I are okay with your approach.  I will be 
looking closer at your patches today and making comments.

One thing that held me up from committing sooner was testing my changes.  Do 
you have a good approach for testing the changes?

Regards,

Greg


On 14 May 2014 03:58, Aggeler Fabian 
aggel...@student.ethz.chmailto:aggel...@student.ethz.ch wrote:
I see. What is Greg Bellows working on exactly? Also peripherals like TZASC, 
TZPC,...? My plan is to focus on them now if no one else is working on them. 
What do you suggest to minimize overlap?

Thanks,
Fabian

From: Peter Maydell [peter.mayd...@linaro.orgmailto:peter.mayd...@linaro.org]
Sent: Monday, May 12, 2014 10:39 PM
To: Aggeler  Fabian
Cc: Edgar E. Iglesias; Rob Herring; Peter Crosthwaite; QEMU Developers; 
Alexander Graf; John Williams; Alex Bennée; Greg Bellows
Subject: Re: [Qemu-devel] [PATCH v1 00/22] target-arm: Preparations for A64 EL2 
and 3

On 12 May 2014 20:13, Aggeler  Fabian 
aggel...@student.ethz.chmailto:aggel...@student.ethz.ch wrote:
 I’ve been reworking the Samsung patches as part of my Master thesis and I 
 wanted to
 send them some time this week. I am currently rebasing them when I noticed 
 Edgar’s
 patches. Is there some branch with the patches so I could rebase on them?

Hmm, that makes about three lots of people trying to do similar things
at this point. We should try to coordinate so we don't duplicate work.

thanks
-- PMM





Re: [Qemu-devel] [PATCH v2 02/23] target-arm: move SCR into Security Extensions register list

2014-05-15 Thread Aggeler Fabian

On 14 May 2014, at 16:19, Greg Bellows 
greg.bell...@linaro.orgmailto:greg.bell...@linaro.org wrote:




On 13 May 2014 11:15, Fabian Aggeler 
aggel...@ethz.chmailto:aggel...@ethz.ch wrote:
From: Sergey Fedorov s.fedo...@samsung.commailto:s.fedo...@samsung.com

Define a new ARM CP register info list for the Security Extension feature.
Register that list only for ARM cores with Security Extension support.
Moving SCR into Security Extension register group.

Signed-off-by: Sergey Fedorov 
s.fedo...@samsung.commailto:s.fedo...@samsung.com
Signed-off-by: Fabian Aggeler aggel...@ethz.chmailto:aggel...@ethz.ch
---
 target-arm/helper.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 3be917c..7898f40 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -768,9 +768,6 @@ static const ARMCPRegInfo v7_cp_reginfo[] = {
   .access = PL1_RW, .writefn = vbar_write,
   .fieldoffset = offsetof(CPUARMState, cp15.c12_vbar),
   .resetvalue = 0 },
-{ .name = SCR, .cp = 15, .crn = 1, .crm = 1, .opc1 = 0, .opc2 = 0,
-  .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.c1_scr),
-  .resetvalue = 0, },
 { .name = CCSIDR, .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .crn = 0, .crm = 0, .opc1 = 1, .opc2 = 0,
   .access = PL1_R, .readfn = ccsidr_read, .type = ARM_CP_NO_MIGRATE },
@@ -2087,6 +2084,15 @@ static void sctlr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 tlb_flush(CPU(cpu), 1);
 }

+static const ARMCPRegInfo tz_cp_reginfo[] = {

Sticking with the feature name switch from TRUSTZONE to SECURITY, for 
consistency we should call this security_cp_reginfo.

Makes sense. I will change it.


+#ifndef CONFIG_USER_ONLY
+{ .name = SCR, .cp = 15, .crn = 1, .crm = 1, .opc1 = 0, .opc2 = 0,
+  .access = PL3_RW, .fieldoffset = offsetof(CPUARMState, cp15.c1_scr),
+  .resetvalue = 0, },
+#endif
+REGINFO_SENTINEL
+};
+
 static CPAccessResult ctr_el0_access(CPUARMState *env, const ARMCPRegInfo *ri)
 {
 /* Only accessible in EL0 if SCTLR.UCT is set (and only in AArch64,
@@ -2364,6 +2370,9 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 if (arm_feature(env, ARM_FEATURE_LPAE)) {
 define_arm_cp_regs(cpu, lpae_cp_reginfo);
 }
+if (arm_feature(env, ARM_FEATURE_SECURITY_EXTENSIONS)) {
+define_arm_cp_regs(cpu, tz_cp_reginfo);
+}
 /* Slightly awkwardly, the OMAP and StrongARM cores need all of
  * cp15 crn=0 to be writes-ignored, whereas for other cores they should
  * be read-only (ie write causes UNDEF exception).
--
1.8.3.2







Re: [Qemu-devel] [Qemu-ppc] macio ide question/bug report

2014-05-15 Thread Mark Cave-Ayland

On 15/05/14 00:21, BALATON Zoltan wrote:


Which part is it that's still confusing you? Putting breakpoints on
pmac_ide_transfer() and pmac_ide_atapi_transfer_cb() will show you the
iterations on each DMA request (be sure to compare against a known
good example to understand how it should work first). If you can give
more detail as to which bits are confusing, I will try my best to
explain them.


Looking at the backtrace:

#0  ide_atapi_cmd_error (s=0x563cb238, sense_key=5, asc=33)
 at hw/ide/atapi.c:141
#1  0x556cecf5 in ide_atapi_io_error (s=0x563cb238, ret=-5)
 at hw/ide/atapi.c:160
#2  0x556d9d01 in pmac_ide_atapi_transfer_cb
(opaque=0x563ccc68,
 ret=-5) at hw/ide/macio.c:64
#3  0x556780d2 in dma_complete (dbs=0x563ab840, ret=-5)
 at dma-helpers.c:121
#4  0x556781db in dma_bdrv_cb (opaque=0x563ab840, ret=-5)
 at dma-helpers.c:149
#5  0x55614dd1 in bdrv_co_em_bh (opaque=0x563b5000) at
block.c:4602
#6  0x555f8170 in aio_bh_poll (ctx=0x5637fc00) at async.c:81
#7  0x555f7dc9 in aio_poll (ctx=0x5637fc00, blocking=false)
 at aio-posix.c:188
#8  0x555f8587 in aio_ctx_dispatch (source=0x5637fc00,
callback=
 0x0, user_data=0x0) at async.c:205
#9  0x778ca6d5 in g_main_context_dispatch ()
from /lib64/libglib-2.0.so.0
#10 0x557a0f42 in glib_pollfds_poll () at main-loop.c:190
#11 0x557a1042 in os_host_main_loop_wait (timeout=0) at
main-loop.c:235
#12 0x557a1115 in main_loop_wait (nonblocking=1) at main-loop.c:484
#13 0x55844190 in main_loop () at vl.c:2075
#14 0x5584bc23 in main (argc=30, argv=0x7fffdc88, envp=
 0x7fffdd80) at vl.c:4556

shows that pmac_ide_atapi_transfer_cb is called with ret=-5 which is why
it fails, so putting a breakpoint there is too late. What I don't
understand is where this -5 value comes from. I don't have a known good
example because Darwin reads the TOC differently (probably before
enabling DMA, I did not trace it more than the logs I've included
earlier though) and MorphOS always fails.


Have you noticed that the dma_bdrv_*() functions use a function pointer 
to pmac_ide_atapi_transfer_cb() as their callback function? The 
dma_bdrv_*() functions operate in a separate thread and then invoke the 
callback function when finished.


The breakpoint you are hitting above is the invocation of 
pmac_ide_atapi_transfer_cb() as a result of the callback *after* the 
command has already failed, and not the invocation of 
pmac_ide_atapi_transfer_cb() which calls dma_bdrv_*() to setup the 
asynchronous transfer. Hence the DMA has already failed and the -5 value 
is being returned from the IDE code. I can only guess the reason this 
works with Darwin is because it is a non-DMA request.


If you place a breakpoint on pmac_ide_transfer() then its invocation of 
pmac_ide_atapi_transfer_cb() should be the first iteration which sets up 
the DMA request, and the second invocation should be the (failing) 
callback from the dma_bdrv_*() functions. But as I mentioned before, I 
think the problem is that these functions shouldn't be called in the 
first place when requesting a TOC.



HTH,

Mark.




Re: [Qemu-devel] [PATCH v1 00/22] target-arm: Preparations for A64 EL2 and 3

2014-05-15 Thread Sergey Fedorov
On 15.05.2014 13:28, Aggeler Fabian wrote:
 Hi Greg

 Thanks for your comments. I still have to work through them. I am using 
 OpenVirtualization in secure world, which then switches to a Linux kernel in 
 non-secure world to test the patches. What about you?

 Best,
 Fabian

Hi, Fabian, are there some secure OS with secure user-space tasks which
can be used for testing whether world switching is performed correctly?

Regards, Sergey.


 On 14 May 2014, at 15:55, Greg Bellows 
 greg.bell...@linaro.orgmailto:greg.bell...@linaro.org wrote:

 Hi Fabian,

 I too had been updating the core TZ patches provided by Samsung.  From 
 looking at your changes I see a lot of similarities in our code with the 
 exception being the mechanism for banked register support.  The difference 
 being that your approach is a bit more explicit in the declaration of the 
 banked registers.  Whereas my approach was to update the banked registers 
 once all the other registers were registered.  Both approaches I believe work.

 I spoke with Peter M. and he and I are okay with your approach.  I will be 
 looking closer at your patches today and making comments.

 One thing that held me up from committing sooner was testing my changes.  Do 
 you have a good approach for testing the changes?

 Regards,

 Greg


 On 14 May 2014 03:58, Aggeler Fabian 
 aggel...@student.ethz.chmailto:aggel...@student.ethz.ch wrote:
 I see. What is Greg Bellows working on exactly? Also peripherals like TZASC, 
 TZPC,...? My plan is to focus on them now if no one else is working on them. 
 What do you suggest to minimize overlap?

 Thanks,
 Fabian
 
 From: Peter Maydell 
 [peter.mayd...@linaro.orgmailto:peter.mayd...@linaro.org]
 Sent: Monday, May 12, 2014 10:39 PM
 To: Aggeler  Fabian
 Cc: Edgar E. Iglesias; Rob Herring; Peter Crosthwaite; QEMU Developers; 
 Alexander Graf; John Williams; Alex Bennée; Greg Bellows
 Subject: Re: [Qemu-devel] [PATCH v1 00/22] target-arm: Preparations for A64 
 EL2 and 3

 On 12 May 2014 20:13, Aggeler  Fabian 
 aggel...@student.ethz.chmailto:aggel...@student.ethz.ch wrote:
 I’ve been reworking the Samsung patches as part of my Master thesis and I 
 wanted to
 send them some time this week. I am currently rebasing them when I noticed 
 Edgar’s
 patches. Is there some branch with the patches so I could rebase on them?
 Hmm, that makes about three lots of people trying to do similar things
 at this point. We should try to coordinate so we don't duplicate work.

 thanks
 -- PMM







Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread sonia verma
Hi Mark

Thanks for the reply.
I'll test and let you know the result soon.


On Thu, May 15, 2014 at 2:36 PM, Mark Cave-Ayland 
mark.cave-ayl...@ilande.co.uk wrote:

 On 15/05/14 06:41, sonia verma wrote:

  Hi

 I'm getting below error when trying to boot the KVM with ethernet
 bridging,kvm support and universel TUN enabled by the following command..

 /usr/bin/qemu-system-ppc64 -m 512 -nographic -hda
 /var/volatile/debian_lenny_
 powerpc_standard.qcow2


 FWIW I think this image will boot with normal qemu-system-ppc too.



 cannot manage 'OHCI USB controller' PCI device type 'usb':
106b 3f (c 3 10)

   =
   OpenBIOS 1.0 [Aug 28 2012 05:40]


 Wow, that's fairly old...


Configuration device id QEMU version 1 machine id 3
   CPUs: 1
   Memory: 512M
   UUID: ----
   CPU type PowerPC,970FX
 Welcome to OpenBIOS v1.0 built on Aug 28 2012 05:40
 Second-stage QUIK loader
 Welcome to quik. mate is good.
 Debian GNU/Linux PowerPCchosen/bootargs =
 boot: `
 Enter the kernel image name as [device:][partno]/path, where partno is a
 number from 0 to 16.  Instead of /path you can type [mm-nn] to specify a
 range of disk blocks (512B)
 boot: Linux
 initrd imagename = /initrd.img, mem_size: 4406840
 initrd_start:
 Starting at 51, , 1024
 OF stdout device is: /pci@f000/mac-io@e/escc@13000/ch-b@13000
 command line: console=ttyS0,9600 console=tty0
 memory layout at init:
alloc_bottom : 00c01000
alloc_top: 2000
alloc_top_hi : 2000
rmo_top  : 2000
ram_top  : 2000
 Looking for displays
 found display   : /pci@f000/QEMU,VGA@c, opening ... done
 copying OF device tree ...
 Building dt strings...
 Building dt structure...
 Device tree strings 0x00c02000 - 0x00c024f0
 Device tree struct  0x00c03000 - 0x00c05000
 Calling quiesce ...
 returning from prom_init

 The prompt is stuck at init and is not able to proceed even after wating
 for 15-20 minutes.

 Please help regarding this.


 In my testing here, I tend to find that some versions of Linux don't
 particularly like running in -nographic mode and hangs at the point above.
 Some kernels even get upset running with a 32/15-bit display and must be
 forced to 8-bit instead :/

 Can you try removing -nographic and instead try one or both of the
 following:

 -g 800x600x32
 -g 800x600x8

 And if those don't work, try the same image again but with qemu-system-ppc
 rather than qemu-system-ppc64.


 HTH,

 Mark.




Re: [Qemu-devel] [PATCH] virtio-net: announce self by guest

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 05:22:28PM +0800, Jason Wang wrote:
 On 05/15/2014 04:28 PM, Michael S. Tsirkin wrote:
  Thanks, looks good.
  Some minor comments below,
 
  On Thu, May 15, 2014 at 03:16:47PM +0800, Jason Wang wrote:
  It's hard to track all mac addresses and their configurations (e.g
  vlan or ipv6) in qemu. Without those informations, it's impossible to
  s/those informations/this information/
 
  build proper garp packet after migration. The only possible solution
  to this is let guest (who knew all configurations) to do this.
  s/knew/knows/
 
  So, this patch introduces a new readonly config status bit of virtio-net,
  VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
  presence of its link through config update interrupt.When guest has
  done the announcement, it should ack the notification through
  VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
  feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
  Linux guest).
 
  During load, a counter of announcing rounds were set so that the after
  s/were/is/
  s/the after/after/
 
 Will correct those typos.
  the vm is running it can trigger rounds of config interrupts to notify
  the guest to build and send the correct garps.
 
  Tested with ping to guest with vlan during migration. Without the
  patch, lots of the packets were lost after migration. With the patch,
  could not notice packet loss after migration.
  below changelog should go after ---, until the ack list.
 
 
 Ok.
  Reference:
  RFC v2: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01750.html
  RFC v1: https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html
  V7: https://lists.gnu.org/archive/html/qemu-devel/2013-03/msg01127.html
 
  Changes from RFC v2:
  - use QEMU_CLOCK_VIRTUAL instead of QEMU_CLOCK_REALTIME
  - compat self announce for 2.0 machine type
 
  Changes from RFC v1:
  - clean VIRTIO_NET_S_ANNOUNCE bit during reset
  - free announce timer during clean
  - make announce work for non-vhost case
 
  Changes from V7:
  - Instead of introducing a global method for each kind of nic, this
version limits the changes to virtio-net itself.
 
  Cc: Liuyongan liuyon...@huawei.com
  Cc: Amos Kong ak...@redhat.com
  Signed-off-by: Jason Wang jasow...@redhat.com
  ---
   hw/net/virtio-net.c|   48 
  
   include/hw/i386/pc.h   |5 
   include/hw/virtio/virtio-net.h |   16 +
   3 files changed, 69 insertions(+), 0 deletions(-)
 
  diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
  index 940a7cf..98d59e9 100644
  --- a/hw/net/virtio-net.c
  +++ b/hw/net/virtio-net.c
  @@ -25,6 +25,7 @@
   #include monitor/monitor.h
   
   #define VIRTIO_NET_VM_VERSION11
  +#define VIRTIO_NET_ANNOUNCE_ROUNDS3
   
   #define MAC_TABLE_ENTRIES64
   #define MAX_VLAN(1  12)   /* Per 802.1Q definition */
  I would make it  5 to be consistent with SELF_ANNOUNCE_ROUNDS
  in savevm.c
 
  in fact, let's move SELF_ANNOUNCE_ROUNDS to include/migration/vmstate.h
  and reuse it.
 
 Ok.
  @@ -99,6 +100,25 @@ static bool virtio_net_started(VirtIONet *n, uint8_t 
  status)
   (n-status  VIRTIO_NET_S_LINK_UP)  vdev-vm_running;
   }
   
  +static void virtio_net_announce_timer(void *opaque)
  +{
  +VirtIONet *n = opaque;
  +VirtIODevice *vdev = VIRTIO_DEVICE(n);
  +
  +if (n-announce 
  I would make it  0 here, just in case it becomes negative as a result
  of some bug.
 
 Sure.
  +virtio_net_started(n, vdev-status) 
  +vdev-guest_features  (0x1  VIRTIO_NET_F_GUEST_ANNOUNCE) 
  +vdev-guest_features  (0x1  VIRTIO_NET_F_CTRL_VQ)) {
  +
  +n-announce--;
  +n-status |= VIRTIO_NET_S_ANNOUNCE;
  +virtio_notify_config(vdev);
  +} else {
  +timer_del(n-announce_timer);
  why do this here?
 
  +n-announce = 0;
  why is this here?
 
 
 If guest shuts down the device, there's no need to do the announcing.

It's still weird.
Either announce is 0 and then timer was not running
or it's  0 and then this won't trigger.

  +}
  +}
  +
   static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
   {
   VirtIODevice *vdev = VIRTIO_DEVICE(n);
  @@ -147,6 +167,8 @@ static void virtio_net_set_status(struct VirtIODevice 
  *vdev, uint8_t status)
   
   virtio_net_vhost_status(n, status);
   
  +virtio_net_announce_timer(n);
  +
  why do this here? why not right after we set announce counter?
 
 The reasons are:
 
 - The counters were set in load, but the device is not running so we
 could not inject the interrupt at that time.

I see. This makes sense - but this isn't intuitive.
Why don't we simply start timer with current time?
Need to make sure it runs fine if time passes, but
I think it's fine.

 - We can stop the progress when guest is shutting down the device.

On shut down guest will reset device stopping timer - this seems enough.

 
   

Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
 Am 15.05.2014 09:04, schrieb Greg Kurz:
  On Thu, 15 May 2014 12:16:35 +0530
  Amit Shah amit.s...@redhat.com wrote:
  On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
  On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
  Since each virtio device is streamed in its own section, the idea is to
  stream subsections between the end of the device section and the start
  of the next sections. This allows an older QEMU to complain and exit
  when fed with subsections:
 
  Unknown savevm section type 5
  Error -22 while loading VM state
 
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
 
 Since backwards migration is not supported upstream, wouldn't it be
 easiest to just add support for the subsection marker and skipping to
 the end of section in that downstream?

Backwards and forwards migration need to be supported,
customers told us repeatedly. So some downstreams support this
and not supporting it upstream just means downstreams need
to do their own thing.

As importantly, ping-pong migration is the only
reliable way to stress migration.

So if we want to test cross-version we need it to work
both way.

Finally, the real issue and difficulty with cross-version migration is
making VM behave in a backwards compatible way.  Serializing in a
compatible way is a trivial problem, or would be if the code wasn't a
mess :) Once you do the hard part, breaking migration because of the
trivial serialization issue is just silly.  And special-casing forward
migration does not make code simpler, it really only leads to
proliferation of hacks and lack of symmetry.

So yes it's a useful feature, and no not supporting it does
not help anyway.


  All users of virtio_load()/virtio_save() need to be patched because the
  subsections are streamed AFTER the device itself.
 
 IMO this is calling for inversion of control - i.e. let virtio devices
 call generic load/save functions that then dispatch to device-specific
 code and let us add common stuff in a central place without forgetting
 to add calls in some new device.
 
  Since all have the same fixup, I'm wondering if a new section can be
  added to the virtio-bus itself, which gets propagated to all devices
  upon load in the dest.
 
  This calls for a way for devices to inherit properties from the bus,
  which doesn't exist ATM.
  Fine but let's not hold up this patchset because of this.
 
  No, only suggestion is to add a migration section in the bus, and then
  it's easier to do this in the post-migrate functions for each device
  -- so only one new section gets introduced instead of all devices
  being modified to send a new subsection.
 
  
  The main problem I see is that virtio sucks: as you see in patch 8, we have
  to be careful not to call vring or virtqueue stuff before the device knows
  its endianness or it breaks... I need to study how the virtio-bus gets
  migrated to ensure the endian section is streamed before the devices.
 
 There is no ordering guarantee. The state needs to be migrated in the
 device or bus where it sits, if post-load processing is required; i.e.,
 if it's in VirtIODevice then something like this series, if it were on
 VirtioBus exclusively (device asking bus for its endianness each time
 and does not do post-load stuff) then endianness could be migrated as a
 new bus section. Not sure if that would help the broken state though?
 
 Would touch on Stefan's alias properties for anything but virtio-mmio.
 
 Regards,
 Andreas
 
 -- 
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCHv4] block: optimize zero writes with bdrv_write_zeroes

2014-05-15 Thread Kevin Wolf
Am 15.05.2014 um 07:16 hat Peter Lieven geschrieben:
 Am 14.05.2014 13:41, schrieb Kevin Wolf:
  Am 08.05.2014 um 18:22 hat Peter Lieven geschrieben:
  this patch tries to optimize zero write requests
  by automatically using bdrv_write_zeroes if it is
  supported by the format.
 
  This significantly speeds up file system initialization and
  should speed zero write test used to test backend storage
  performance.
 
  I ran the following 2 tests on my internal SSD with a
  50G QCOW2 container and on an attached iSCSI storage.
 
  a) mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdX
 
  QCOW2 [off] [on] [unmap]
  -
  runtime:   14secs1.1secs  1.1secs
  filesize:  937M  18M  18M
 
  iSCSI [off] [on] [unmap]
  
  runtime:   9.3s  0.9s 0.9s
 
  b) dd if=/dev/zero of=/dev/vdX bs=1M oflag=direct
 
  QCOW2 [off] [on] [unmap]
  -
  runtime:   246secs   18secs   18secs
  filesize:  51G   192K 192K
  throughput:203M/s2.3G/s   2.3G/s
 
  iSCSI*[off] [on] [unmap]
  
  runtime:   8mins 45secs   33secs
  throughput:106M/s1.2G/s   1.6G/s
  allocated: 100%  100% 0%
 
  * The storage was connected via an 1Gbit interface.
It seems to internally handle writing zeroes
via WRITESAME16 very fast.
 
  Signed-off-by: Peter Lieven p...@kamp.de
  ---
  v3-v4: - use QAPI generated enum and lookup table [Kevin]
  - added more details about the options in the comments
of the qapi-schema [Eric]
  - changed the type of detect_zeroes from str to
BlockdevDetectZeroesOptions. I left the name
as is because it is consistent with e.g.
BlockdevDiscardOptions or BlockdevAioOptions [Eric]
  - changed the parse function in blockdev_init to
be generic usable for other enum parameters
  
  v2-v3: - moved parameter parsing to blockdev_init
  - added per device detect_zeroes status to 
hmp (info block -v) and qmp (query-block) [Eric]
  - added support to enable detect-zeroes also
for hot added devices [Eric]
  - added missing entry to qemu_common_drive_opts
  - fixed description of qemu_iovec_is_zero [Fam]
 
  v1-v2: - added tests to commit message (Markus)
  RFCv2-v1: - fixed paramter parsing strncmp - strcmp (Eric)
 - fixed typo (choosen-chosen) (Eric)
 - added second example to commit msg
 
  RFCv1-RFCv2: - add detect-zeroes=off|on|unmap knob to drive cmdline 
  parameter
- call zero detection only for format (bs-file != NULL)
 
   block.c   |   11 ++
   block/qapi.c  |1 +
   blockdev.c|   34 +
   hmp.c |5 +
   include/block/block_int.h |1 +
   include/qemu-common.h |1 +
   qapi-schema.json  |   52 
  -
   qemu-options.hx   |6 ++
   qmp-commands.hx   |3 +++
   util/iov.c|   21 ++
   10 files changed, 120 insertions(+), 15 deletions(-)
 
  diff --git a/block.c b/block.c
  index b749d31..aea4c77 100644
  --- a/block.c
  +++ b/block.c
  @@ -3244,6 +3244,17 @@ static int coroutine_fn 
  bdrv_aligned_pwritev(BlockDriverState *bs,
   
   ret = notifier_with_return_list_notify(bs-before_write_notifiers, 
  req);
   
  +if (!ret  bs-detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF 
  +!(flags  BDRV_REQ_ZERO_WRITE)  bs-file 
  +drv-bdrv_co_write_zeroes  qemu_iovec_is_zero(qiov)) {
  Pretty long condition. :-)
 
  Looks like most is obviously needed, but I wonder what the bs-file part
  is good for? That looks rather arbitrary.
 
 What I wanted to achieve is that this condition is only true if we handle
 the format (e.g. raw, qcow2, vmdk etc.). If e.g. qcow2 then sends a
 zero write this should always end on the disk and should not be optimizable.

But why? This means setting an arbitrary policy for no good reason. You
already have an option, and it already defaults to off, so unless
someone specifically enables it for bs-file, we don't do the
optimisation. But if someone wants to have it on bs-file, what reason
is there to ignore that request?

  +flags |= BDRV_REQ_ZERO_WRITE;
  +/* if the device was not opened with discard=on the below flag
  + * is immediately cleared again in bdrv_co_do_write_zeroes */
  Is it? I only see it being cleared in bdrv_co_write_zeroes(), but that's
  not a function that seems to be called from here.
 
 You are right. Question, do we want to support detect_zeroes = unmap
 if discard = ignore? If yes, I have to update the docs. Otherwise
 I have to check for BDRV_O_DISCARD before setting BDRV_REQ_MAY_UNMAP.

I think it would be reasonable enough to just error out when you try to

[Qemu-devel] [Bug 1303926] Re: qemu-system-x86_64 crashed with SIGABRT

2014-05-15 Thread f3a97
Hi Serge,


I think I have already reported the required information a number of times with 
the Ubuntu built-in bug reporting facility (apport?), which asked me to report 
the crash information to developers.

Are you able to find it out or do I need to manually open a new bug?


Thanks you.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1303926

Title:
  qemu-system-x86_64 crashed with SIGABRT

Status in QEMU:
  New
Status in “qemu” package in Ubuntu:
  Fix Released

Bug description:
  I've been getting this periodically since my upgrade to qemu 2.0 in
  trusty this morning.

  ProblemType: Crash
  DistroRelease: Ubuntu 14.04
  Package: qemu-system-x86 2.0.0~rc1+dfsg-0ubuntu1
  ProcVersionSignature: Ubuntu 3.13.0-23.45-generic 3.13.8
  Uname: Linux 3.13.0-23-generic x86_64
  ApportVersion: 2.14.1-0ubuntu1
  Architecture: amd64
  Date: Mon Apr  7 13:31:53 2014
  ExecutablePath: /usr/bin/qemu-system-x86_64
  InstallationDate: Installed on 2013-11-26 (131 days ago)
  InstallationMedia: Ubuntu 13.10 Saucy Salamander - Release amd64 
(20131016.1)
  ProcEnviron: PATH=(custom, no user)
  Registers: No symbol table is loaded.  Use the file command.
  Signal: 6
  SourcePackage: qemu
  StacktraceTop:
   
  Title: qemu-system-x86_64 crashed with SIGABRT
  UpgradeStatus: Upgraded to trusty on 2014-01-17 (79 days ago)
  UserGroups:

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1303926/+subscriptions



Re: [Qemu-devel] [PATCH v2 01/16] x86: Convert conditional compilation of debug printfs to regular ifs

2014-05-15 Thread Marc Marí
El Tue, 13 May 2014 08:38:26 -0600
Eric Blake ebl...@redhat.com escribió:
 Although __VA_ARGS__ is required by C99, the use of ##__VA_ARGS__ is a
 gcc extension; are you sure that all other supported compilers handle
 it?  (I guess that's just clang)
 
 If you want something portable to C99, just use one fewer macro
 argument, so that you are guaranteed that __VA_ARGS__ will be
 non-empty (that is, subsume fmt into the ...):
 
 #define DEBUG(...)\
 QEMU_DPRINTF(DEVICE_ASSIGNMENT_DEBUG_ENABLED, \
  pci_assign, __VA_ARGS__)
 

I found some problems to convert from ## __VA_ARGS__ to __VA_ARGS__ in
some parts, and as I asked in IRC, it was pointed that our HACKING
document recommends the fmt, ## __VA_ARGS__ approach, and it obviously
works on all the compilers we care about. So I will leave this as it
is now.

Marc



Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Andreas Färber
Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
 On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
 Am 15.05.2014 09:04, schrieb Greg Kurz:
 On Thu, 15 May 2014 12:16:35 +0530
 Amit Shah amit.s...@redhat.com wrote:
 On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
 On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
 On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
 Since each virtio device is streamed in its own section, the idea is to
 stream subsections between the end of the device section and the start
 of the next sections. This allows an older QEMU to complain and exit
 when fed with subsections:

 Unknown savevm section type 5
 Error -22 while loading VM state

 Please make this configurable -- either via configure or device
 properties.  That avoids having to break existing configurations that
 work without this patch.

 Since backwards migration is not supported upstream, wouldn't it be
 easiest to just add support for the subsection marker and skipping to
 the end of section in that downstream?
 
 Backwards and forwards migration need to be supported,
 customers told us repeatedly. So some downstreams support this
 and not supporting it upstream just means downstreams need
 to do their own thing.
 
 As importantly, ping-pong migration is the only
 reliable way to stress migration.
 
 So if we want to test cross-version we need it to work
 both way.
 
 Finally, the real issue and difficulty with cross-version migration is
 making VM behave in a backwards compatible way.  Serializing in a
 compatible way is a trivial problem, or would be if the code wasn't a
 mess :) Once you do the hard part, breaking migration because of the
 trivial serialization issue is just silly.  And special-casing forward
 migration does not make code simpler, it really only leads to
 proliferation of hacks and lack of symmetry.
 
 So yes it's a useful feature, and no not supporting it does
 not help anyway.

It seems you misunderstand. I was not saying it's not useful.

My point is that VMStateSubsections added in newer versions (and thus
not existing in older versions) need to be handled for any
VMState-converted devices. So why can't we make that work for virtio too?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



[Qemu-devel] [PATCH v2 3/8] xics: Disable flags reset on xics reset

2014-05-15 Thread Alexey Kardashevskiy
Since islsi[] array has been merged into the ICSState struct,
we must not reset flags as they tell if the interrupt is in use.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c | 7 +++
 hw/intc/xics_kvm.c | 7 +++
 2 files changed, 14 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 7065978..83a809e 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -521,11 +521,18 @@ static void ics_reset(DeviceState *dev)
 {
 ICSState *ics = ICS(dev);
 int i;
+uint8_t flags[ics-nr_irqs];
+
+for (i = 0; i  ics-nr_irqs; i++) {
+flags[i] = ics-irqs[i].flags;
+}
 
 memset(ics-irqs, 0, sizeof(ICSIRQState) * ics-nr_irqs);
+
 for (i = 0; i  ics-nr_irqs; i++) {
 ics-irqs[i].priority = 0xff;
 ics-irqs[i].saved_priority = 0xff;
+ics-irqs[i].flags = flags[i];
 }
 }
 
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 8719a88..6fa2412 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -271,11 +271,18 @@ static void ics_kvm_reset(DeviceState *dev)
 {
 ICSState *ics = ICS(dev);
 int i;
+uint8_t flags[ics-nr_irqs];
+
+for (i = 0; i  ics-nr_irqs; i++) {
+flags[i] = ics-irqs[i].flags;
+}
 
 memset(ics-irqs, 0, sizeof(ICSIRQState) * ics-nr_irqs);
+
 for (i = 0; i  ics-nr_irqs; i++) {
 ics-irqs[i].priority = 0xff;
 ics-irqs[i].saved_priority = 0xff;
+ics-irqs[i].flags = flags[i];
 }
 
 ics_set_kvm_state(ics, 1);
-- 
1.9.rc0




Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread sonia verma
Hi Mark

I tried booting KVM using qemy-system-ppc with your suggesstion but ended
up stucking at below logs..

/usr/bin/qemu-system-ppc -m 512 -nographic -hda
kvm/debian_lenny_powerpc_standard.qcow2
qemu-system-ppc: pci_add_option_rom: failed to find romfile
efi-ne2k_pci.rom
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle
 set_property: NULL phandle

 =
 OpenBIOS 1.1 [May 26 2013 13:52]
 Configuration device id QEMU version 1 machine id 2
 CPUs: 1
 Memory: 512M
 UUID: ----
 CPU type PowerPC,750


Actually I'm trying to boot KVM from the powerpc board having ubuntu 13.10
on it,so i need to provide the -nographic option.

Please help regarding this.





On Thu, May 15, 2014 at 3:14 PM, sonia verma soniaverma9...@gmail.comwrote:

 Hi Mark

 Thanks for the reply.
 I'll test and let you know the result soon.


 On Thu, May 15, 2014 at 2:36 PM, Mark Cave-Ayland 
 mark.cave-ayl...@ilande.co.uk wrote:

 On 15/05/14 06:41, sonia verma wrote:

  Hi

 I'm getting below error when trying to boot the KVM with ethernet
 bridging,kvm support and universel TUN enabled by the following command..

 /usr/bin/qemu-system-ppc64 -m 512 -nographic -hda
 /var/volatile/debian_lenny_
 powerpc_standard.qcow2


 FWIW I think this image will boot with normal qemu-system-ppc too.



 cannot manage 'OHCI USB controller' PCI device type 'usb':
106b 3f (c 3 10)

   =
   OpenBIOS 1.0 [Aug 28 2012 05:40]


 Wow, that's fairly old...


Configuration device id QEMU version 1 machine id 3
   CPUs: 1
   Memory: 512M
   UUID: ----
   CPU type PowerPC,970FX
 Welcome to OpenBIOS v1.0 built on Aug 28 2012 05:40
 Second-stage QUIK loader
 Welcome to quik. mate is good.
 Debian GNU/Linux PowerPCchosen/bootargs =
 boot: `
 Enter the kernel image name as [device:][partno]/path, where partno is a
 number from 0 to 16.  Instead of /path you can type [mm-nn] to specify a
 range of disk blocks (512B)
 boot: Linux
 initrd imagename = /initrd.img, mem_size: 4406840
 initrd_start:
 Starting at 51, , 1024
 OF stdout device is: /pci@f000/mac-io@e/escc@13000/ch-b@13000
 command line: console=ttyS0,9600 console=tty0
 memory layout at init:
alloc_bottom : 00c01000
alloc_top: 2000
alloc_top_hi : 2000
rmo_top  : 2000
ram_top  : 2000
 Looking for displays
 found display   : /pci@f000/QEMU,VGA@c, opening ... done
 copying OF device tree ...
 Building dt strings...
 Building dt structure...
 Device tree strings 0x00c02000 - 0x00c024f0
 Device tree struct  0x00c03000 - 0x00c05000
 Calling quiesce ...
 returning from prom_init

 The prompt is stuck at init and is not able to proceed even after wating
 for 15-20 minutes.

 Please help regarding this.


 In my testing here, I tend to find that some versions of Linux don't
 particularly like running in -nographic mode and hangs at the point above.
 Some kernels even get upset running with a 32/15-bit display and must be
 forced to 8-bit instead :/

 Can you try removing -nographic and instead try one or both of the
 following:

 -g 800x600x32
 -g 800x600x8

 And if those don't work, try the same image again but with
 qemu-system-ppc rather than qemu-system-ppc64.


 HTH,

 Mark.





[Qemu-devel] [PATCH v2 1/8] xics: Add flags for interrupts

2014-05-15 Thread Alexey Kardashevskiy
The existing interrupt allocation scheme in SPAPR assumes that
interrupts are allocated at the start time, continously and the config
will not change. However, there are cases when this is not going to work
such as:

1. migration - we will have to have an ability to choose interrupt
numbers for devices in the command line and this will create gaps in
interrupt space.

2. PCI hotplug - interrupts from unplugged device need to be returned
back to interrupt pool, otherwise we will quickly run out of interrupts.

This replaces a separate lslsi[] array with a byte in the ICSIRQState
struct and defines LSI and MSI flags. Neither of these flags set
signals that the descriptor is not allocated and not in use.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c| 23 ---
 hw/intc/xics_kvm.c|  5 ++---
 include/hw/ppc/xics.h |  6 +-
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 64aabe7..220ca0e 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -438,7 +438,7 @@ static void ics_set_irq(void *opaque, int srcno, int val)
 {
 ICSState *ics = (ICSState *)opaque;
 
-if (ics-islsi[srcno]) {
+if (ics-irqs[srcno].flags  XICS_FLAGS_IRQ_LSI) {
 set_irq_lsi(ics, srcno, val);
 } else {
 set_irq_msi(ics, srcno, val);
@@ -475,7 +475,7 @@ static void ics_write_xive(ICSState *ics, int nr, int 
server,
 
 trace_xics_ics_write_xive(nr, srcno, server, priority);
 
-if (ics-islsi[srcno]) {
+if (ics-irqs[srcno].flags  XICS_FLAGS_IRQ_LSI) {
 write_xive_lsi(ics, srcno);
 } else {
 write_xive_msi(ics, srcno);
@@ -497,7 +497,7 @@ static void ics_resend(ICSState *ics)
 
 for (i = 0; i  ics-nr_irqs; i++) {
 /* FIXME: filter by server#? */
-if (ics-islsi[i]) {
+if (ics-irqs[i].flags  XICS_FLAGS_IRQ_LSI) {
 resend_lsi(ics, i);
 } else {
 resend_msi(ics, i);
@@ -512,7 +512,7 @@ static void ics_eoi(ICSState *ics, int nr)
 
 trace_xics_ics_eoi(nr);
 
-if (ics-islsi[srcno]) {
+if (ics-irqs[srcno].flags  XICS_FLAGS_IRQ_LSI) {
 irq-status = ~XICS_STATUS_SENT;
 }
 }
@@ -609,7 +609,6 @@ static void ics_realize(DeviceState *dev, Error **errp)
 return;
 }
 ics-irqs = g_malloc0(ics-nr_irqs * sizeof(ICSIRQState));
-ics-islsi = g_malloc0(ics-nr_irqs * sizeof(bool));
 ics-qirqs = qemu_allocate_irqs(ics_set_irq, ics, ics-nr_irqs);
 }
 
@@ -646,11 +645,21 @@ qemu_irq xics_get_qirq(XICSState *icp, int irq)
 return icp-ics-qirqs[irq - icp-ics-offset];
 }
 
+static void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
+{
+assert(!(ics-irqs[srcno].flags  XICS_FLAGS_IRQ_MASK));
+
+ics-irqs[srcno].flags |=
+lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
+}
+
 void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
 {
-assert(ics_valid_irq(icp-ics, irq));
+ICSState *ics = icp-ics;
 
-icp-ics-islsi[irq - icp-ics-offset] = lsi;
+assert(ics_valid_irq(ics, irq));
+
+ics_set_irq_type(ics, irq - ics-offset, lsi);
 }
 
 /*
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index 09476ae..8719a88 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -224,7 +224,7 @@ static int ics_set_kvm_state(ICSState *ics, int version_id)
 state |= KVM_XICS_MASKED;
 }
 
-if (ics-islsi[i]) {
+if (ics-irqs[i].flags  XICS_FLAGS_IRQ_LSI) {
 state |= KVM_XICS_LEVEL_SENSITIVE;
 if (irq-status  XICS_STATUS_ASSERTED) {
 state |= KVM_XICS_PENDING;
@@ -253,7 +253,7 @@ static void ics_kvm_set_irq(void *opaque, int srcno, int 
val)
 int rc;
 
 args.irq = srcno + ics-offset;
-if (!ics-islsi[srcno]) {
+if (ics-irqs[srcno].flags  XICS_FLAGS_IRQ_MSI) {
 if (!val) {
 return;
 }
@@ -290,7 +290,6 @@ static void ics_kvm_realize(DeviceState *dev, Error **errp)
 return;
 }
 ics-irqs = g_malloc0(ics-nr_irqs * sizeof(ICSIRQState));
-ics-islsi = g_malloc0(ics-nr_irqs * sizeof(bool));
 ics-qirqs = qemu_allocate_irqs(ics_kvm_set_irq, ics, ics-nr_irqs);
 }
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 0d7673d..fa8e9c2 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -136,7 +136,6 @@ struct ICSState {
 uint32_t nr_irqs;
 uint32_t offset;
 qemu_irq *qirqs;
-bool *islsi;
 ICSIRQState *irqs;
 XICSState *icp;
 };
@@ -150,6 +149,11 @@ struct ICSIRQState {
 #define XICS_STATUS_REJECTED   0x4
 #define XICS_STATUS_MASKED_PENDING 0x8
 uint8_t status;
+/* (flags  XICS_FLAGS_IRQ_MASK) == 0 means the interrupt is not allocated */
+#define XICS_FLAGS_IRQ_LSI 0x1
+#define XICS_FLAGS_IRQ_MSI 0x2
+#define XICS_FLAGS_IRQ_MASK0x3
+uint8_t flags;
 };
 
 qemu_irq xics_get_qirq(XICSState *icp, int irq);
-- 
1.9.rc0




[Qemu-devel] [PATCH v2 5/8] xics: Remove obsolete xics_set_irq_type()

2014-05-15 Thread Alexey Kardashevskiy
This removes xics_set_irq_type() as it is not used anymore.

This is done by a separate patch to make the previous patch
look nicer.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c| 11 ---
 include/hw/ppc/xics.h |  1 -
 2 files changed, 12 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index fdcbb3a..1a8cfd7 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -678,17 +678,6 @@ static void ics_set_irq_type(ICSState *ics, int srcno, 
bool lsi)
 lsi ? XICS_FLAGS_IRQ_LSI : XICS_FLAGS_IRQ_MSI;
 }
 
-void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
-{
-int src = xics_find_source(icp, irq);
-ICSState *ics;
-
-assert(src = 0);
-
-ics = icp-ics[src];
-ics_set_irq_type(ics, irq - ics-offset, lsi);
-}
-
 #define ICS_IRQ_FREE(ics, srcno)   \
 (!((ics)-irqs[(srcno)].flags  (XICS_FLAGS_IRQ_MASK)))
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 8e13488..0d8af1b 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -157,7 +157,6 @@ struct ICSIRQState {
 };
 
 qemu_irq xics_get_qirq(XICSState *icp, int irq);
-void xics_set_irq_type(XICSState *icp, int irq, bool lsi);
 int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi);
 int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align);
 
-- 
1.9.rc0




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
 Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
  On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
  Am 15.05.2014 09:04, schrieb Greg Kurz:
  On Thu, 15 May 2014 12:16:35 +0530
  Amit Shah amit.s...@redhat.com wrote:
  On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
  On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
  Since each virtio device is streamed in its own section, the idea is 
  to
  stream subsections between the end of the device section and the start
  of the next sections. This allows an older QEMU to complain and exit
  when fed with subsections:
 
  Unknown savevm section type 5
  Error -22 while loading VM state
 
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
 
  Since backwards migration is not supported upstream, wouldn't it be
  easiest to just add support for the subsection marker and skipping to
  the end of section in that downstream?
  
  Backwards and forwards migration need to be supported,
  customers told us repeatedly. So some downstreams support this
  and not supporting it upstream just means downstreams need
  to do their own thing.
  
  As importantly, ping-pong migration is the only
  reliable way to stress migration.
  
  So if we want to test cross-version we need it to work
  both way.
  
  Finally, the real issue and difficulty with cross-version migration is
  making VM behave in a backwards compatible way.  Serializing in a
  compatible way is a trivial problem, or would be if the code wasn't a
  mess :) Once you do the hard part, breaking migration because of the
  trivial serialization issue is just silly.  And special-casing forward
  migration does not make code simpler, it really only leads to
  proliferation of hacks and lack of symmetry.
  
  So yes it's a useful feature, and no not supporting it does
  not help anyway.
 
 It seems you misunderstand. I was not saying it's not useful.
 
 My point is that VMStateSubsections added in newer versions (and thus
 not existing in older versions) need to be handled for any
 VMState-converted devices. So why can't we make that work for virtio too?
 
 Andreas

Sure.
AFAIK the way to this works is by adding an field_exists callback,
and not sending the section when we are in a compat mode.

 -- 
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



[Qemu-devel] [PATCH v2 8/8] spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB

2014-05-15 Thread Alexey Kardashevskiy
Currently SPAPR PHB keeps track of all allocated MSI/MISX interrupt as
XICS used to be unable to reuse interrupts which becomes a problem for
dynamic MSI reconfiguration which is happening on guest driver reload or
PCI hot (un)plug. Another problem is that PHB has a limit of devices
supporting MSI/MSIX (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason
for that.

This makes use of new XICS ability to reuse interrupts.

This removes cached MSI configuration from SPAPR PHB so the first IRQ number
of a device is stored in MSI/MSIX config space so there is no need to store
this anywhere else. From now on, SPAPR PHB only keeps flags telling what type
of interrupt for which device it has configured in order to return error if
(for example) MSIX was enabled and the guest is trying to disable MSI which
it has not enabled.

This removes a limit for the maximum number of MSIX-enabled devices per PHB,
now XICS and PCI bus capacity are the only limitation.

This changes migration stream as it fixes vmstate_spapr_pci_msi::name which was
wrong since the beginning.

This fixed traces to be more informative.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---

In reality either MSIX or MSI is enabled, never both. So I could remove msi/msix
bitmaps from this patch, would it make sense?
---
 hw/ppc/spapr_pci.c  | 179 +++-
 include/hw/pci-host/spapr.h |  11 +--
 trace-events|   5 +-
 3 files changed, 99 insertions(+), 96 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index e574014..49e0382 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -220,36 +220,12 @@ static void rtas_write_pci_config(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 }
 
 /*
- * Find an entry with config_addr or returns the empty one if not found AND
- * alloc_new is set.
- * At the moment the msi_table entries are never released so there is
- * no point to look till the end of the list if we need to find the free entry.
- */
-static int spapr_msicfg_find(sPAPRPHBState *phb, uint32_t config_addr,
- bool alloc_new)
-{
-int i;
-
-for (i = 0; i  SPAPR_MSIX_MAX_DEVS; ++i) {
-if (!phb-msi_table[i].nvec) {
-break;
-}
-if (phb-msi_table[i].config_addr == config_addr) {
-return i;
-}
-}
-if ((i  SPAPR_MSIX_MAX_DEVS)  alloc_new) {
-trace_spapr_pci_msi(Allocating new MSI config, i, config_addr);
-return i;
-}
-
-return -1;
-}
-
-/*
  * Set MSI/MSIX message data.
  * This is required for msi_notify()/msix_notify() which
  * will write at the addresses via spapr_msi_write().
+ *
+ * If hwaddr == 0, all entries will have .data == first_irq i.e.
+ * table will be reset.
  */
 static void spapr_msi_setmsg(PCIDevice *pdev, hwaddr addr, bool msix,
  unsigned first_irq, unsigned req_num)
@@ -263,12 +239,51 @@ static void spapr_msi_setmsg(PCIDevice *pdev, hwaddr 
addr, bool msix,
 return;
 }
 
-for (i = 0; i  req_num; ++i, ++msg.data) {
+for (i = 0; i  req_num; ++i) {
 msix_set_message(pdev, i, msg);
 trace_spapr_pci_msi_setup(pdev-name, i, msg.address);
+if (addr) {
+++msg.data;
+}
 }
 }
 
+static unsigned spapr_msi_get(sPAPRPHBState *phb, PCIDevice *pdev,
+  unsigned *num, bool *msix)
+{
+MSIMessage msg;
+unsigned irq = 0;
+uint8_t offs = (pci_bus_num(pdev-bus)  SPAPR_PCI_BUS_SHIFT) |
+PCI_SLOT(pdev-devfn);
+
+if ((phb-msi[offs]  (1  PCI_FUNC(pdev-devfn))) 
+(phb-msix[offs]  (1  PCI_FUNC(pdev-devfn {
+error_report(Both MSI and MSIX configured! MSIX will be used.);
+}
+
+if (phb-msix[offs]  (1  PCI_FUNC(pdev-devfn))) {
+*num = pdev-msix_entries_nr;
+if (*num) {
+msg = msix_get_message(pdev, 0);
+irq = msg.data;
+if (msix) {
+*msix = true;
+}
+}
+} else if (phb-msi[offs]  (1  PCI_FUNC(pdev-devfn))) {
+*num = msi_nr_vectors_allocated(pdev);
+if (*num) {
+msg = msi_get_message(pdev, 0);
+irq = msg.data;
+if (msix) {
+*msix = false;
+}
+}
+}
+
+return irq;
+}
+
 static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 uint32_t token, uint32_t nargs,
 target_ulong args, uint32_t nret,
@@ -280,9 +295,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 unsigned int req_num = rtas_ld(args, 4); /* 0 == remove all */
 unsigned int seq_num = rtas_ld(args, 5);
 unsigned int ret_intr_type;
-int ndev, irq, max_irqs = 0;
+unsigned int irq, max_irqs = 0, num = 0, offs;
 sPAPRPHBState *phb = NULL;
 PCIDevice *pdev = NULL;
+bool msix = false;
 
 switch (func) {
 case 

Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Greg Kurz
On Thu, 15 May 2014 11:20:18 +0200
Andreas Färber afaer...@suse.de wrote:
 Am 15.05.2014 09:04, schrieb Greg Kurz:
  On Thu, 15 May 2014 12:16:35 +0530
  Amit Shah amit.s...@redhat.com wrote:
  On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
  On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
  Since each virtio device is streamed in its own section, the idea is to
  stream subsections between the end of the device section and the start
  of the next sections. This allows an older QEMU to complain and exit
  when fed with subsections:
 
  Unknown savevm section type 5
  Error -22 while loading VM state
 
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
 
 Since backwards migration is not supported upstream, wouldn't it be
 easiest to just add support for the subsection marker and skipping to
 the end of section in that downstream?
 

Not sure I understand well... Do you suggest to stream the markers first,
then the device, then the subsections ? And then there would be a way
we can have the subsections restored before the device ?

  All users of virtio_load()/virtio_save() need to be patched because the
  subsections are streamed AFTER the device itself.
 
 IMO this is calling for inversion of control - i.e. let virtio devices
 call generic load/save functions that then dispatch to device-specific
 code and let us add common stuff in a central place without forgetting
 to add calls in some new device.
 

That makes a lot of sense.

  Since all have the same fixup, I'm wondering if a new section can be
  added to the virtio-bus itself, which gets propagated to all devices
  upon load in the dest.
 
  This calls for a way for devices to inherit properties from the bus,
  which doesn't exist ATM.
  Fine but let's not hold up this patchset because of this.
 
  No, only suggestion is to add a migration section in the bus, and then
  it's easier to do this in the post-migrate functions for each device
  -- so only one new section gets introduced instead of all devices
  being modified to send a new subsection.
 
  
  The main problem I see is that virtio sucks: as you see in patch 8, we have
  to be careful not to call vring or virtqueue stuff before the device knows
  its endianness or it breaks... I need to study how the virtio-bus gets
  migrated to ensure the endian section is streamed before the devices.
 
 There is no ordering guarantee. The state needs to be migrated in the
 device or bus where it sits, if post-load processing is required; i.e.,
 if it's in VirtIODevice then something like this series, if it were on
 VirtioBus exclusively (device asking bus for its endianness each time
 and does not do post-load stuff) then endianness could be migrated as a
 new bus section. Not sure if that would help the broken state though?
 

IIRW the broken state was proposed as a per-device property...

Fam,

Do you have plans about the broken property ? Is it still needed ?

 Would touch on Stefan's alias properties for anything but virtio-mmio.
 

OMG... maybe I should hold on then.

 Regards,
 Andreas
 

Thanks !

-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

Anarchy is about taking complete responsibility for yourself.
Alan Moore.




Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration

2014-05-15 Thread Juan Quintela
Michael R. Hines mrhi...@linux.vnet.ibm.com wrote:
 On 05/09/2014 12:25 PM, Gonglei (Arei) wrote:
 Hi,

 -Original Message-
 From: Michael R. Hines [mailto:mrhi...@linux.vnet.ibm.com]
 Sent: Tuesday, April 01, 2014 8:42 AM
 To: Gonglei (Arei); qemu-devel@nongnu.org
 Cc: Huangweidong (C); quint...@redhat.com; dgilb...@redhat.com;
 owass...@redhat.com; mrhi...@us.ibm.com; Moyuxiang;
 pbonz...@redhat.com
 Subject: Re: [Qemu-devel] [PATCH] rdma: Fix block during rdma migration

 On 03/29/2014 03:39 PM, arei.gong...@huawei.com wrote:
 From: Mo Yuxiang moyuxi...@huawei.com

 If the networking break or there's something wrong with rdma
 device(ib0 with no IP) during rdma migration, the main_loop of
 qemu will be blocked in rdma_destroy_id. I add rdma_ack_cm_event
 to fix this bug.

 Signed-off-by: Mo Yuxiang moyuxi...@huawei.com
 Signed-off-by: Gonglei arei.gong...@huawei.com
 ---
migration-rdma.c | 1 +
1 file changed, 1 insertion(+)

 diff --git a/migration-rdma.c b/migration-rdma.c
 index eeb4302..f60749b 100644
 --- a/migration-rdma.c
 +++ b/migration-rdma.c
 @@ -949,6 +949,7 @@ route:
ERROR(errp, result not equal to event_addr_resolved %s,
rdma_event_str(cm_event-event));
perror(rdma_resolve_addr);
 +rdma_ack_cm_event(cm_event);
ret = -EINVAL;
goto err_resolve_get_addr;
}
 Reviewed-by: Michael R. Hines mrhi...@us.ibm.com

 Good catch. =) That's an obvious bug. It looks like I need
 to do a much better job of kill -9 inside the regression
 testing scripts - probably i should try killing the migration
 prematurely at different periods just to be sure there are
 no more places where the connection state is not getting
 cleaned up..

 - Michael

 Michael, do you have a plan to pull this patch to master? Thanks.

 Best regards,
 -Gonglei


 Sorry for the late reply, but I'm not the maintainer for migration,
 that's Juan
 (I can only signoff on patches like everyone else =).

 I also have outstanding RDMA patches myself that have not yet been pulled.

 Would you mind pinging Juan for both of us?

Pointer, please?

I was waiting for Michael Reviewed-by from Michael.


Later, Juan.


 - Michael

 - Michael



Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Andreas Färber
Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
 On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
 Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
 On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
 Am 15.05.2014 09:04, schrieb Greg Kurz:
 On Thu, 15 May 2014 12:16:35 +0530
 Amit Shah amit.s...@redhat.com wrote:
 On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
 On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
 On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
 Since each virtio device is streamed in its own section, the idea is 
 to
 stream subsections between the end of the device section and the start
 of the next sections. This allows an older QEMU to complain and exit
 when fed with subsections:

 Unknown savevm section type 5
 Error -22 while loading VM state

 Please make this configurable -- either via configure or device
 properties.  That avoids having to break existing configurations that
 work without this patch.

 Since backwards migration is not supported upstream, wouldn't it be
 easiest to just add support for the subsection marker and skipping to
 the end of section in that downstream?

 Backwards and forwards migration need to be supported,
 customers told us repeatedly. So some downstreams support this
 and not supporting it upstream just means downstreams need
 to do their own thing.

 As importantly, ping-pong migration is the only
 reliable way to stress migration.

 So if we want to test cross-version we need it to work
 both way.

 Finally, the real issue and difficulty with cross-version migration is
 making VM behave in a backwards compatible way.  Serializing in a
 compatible way is a trivial problem, or would be if the code wasn't a
 mess :) Once you do the hard part, breaking migration because of the
 trivial serialization issue is just silly.  And special-casing forward
 migration does not make code simpler, it really only leads to
 proliferation of hacks and lack of symmetry.

 So yes it's a useful feature, and no not supporting it does
 not help anyway.

 It seems you misunderstand. I was not saying it's not useful.

 My point is that VMStateSubsections added in newer versions (and thus
 not existing in older versions) need to be handled for any
 VMState-converted devices. So why can't we make that work for virtio too?
 
 Sure.
 AFAIK the way to this works is by adding an field_exists callback,
 and not sending the section when we are in a compat mode.

OK, but upstream always sends the latest version today. So isn't that
just two ifs that you would need to add in save and load functions added
here for the downstream? x86_64 is unaffected from ppc's endianness
changes and you still do ppc64 BE, so there's no real functional problem
for RHEL, is there?

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



[Qemu-devel] [PATCH v2 2/8] xics: Add xics_find_source()

2014-05-15 Thread Alexey Kardashevskiy
PAPR allows having multiple interrupt sources such as PHB.

This adds a source lookup function and makes use of it.

Since at the moment QEMU only supports a single source,
no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 220ca0e..7065978 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -635,14 +635,32 @@ static const TypeInfo ics_info = {
 /*
  * Exported functions
  */
+static int xics_find_source(XICSState *icp, int irq)
+{
+int sources = 1;
+int src;
+
+/* FIXME: implement multiple sources */
+for (src = 0; src  sources; ++src) {
+ICSState *ics = icp-ics[src];
+if (ics_valid_irq(ics, irq)) {
+return src;
+}
+}
+
+return -1;
+}
 
 qemu_irq xics_get_qirq(XICSState *icp, int irq)
 {
-if (!ics_valid_irq(icp-ics, irq)) {
-return NULL;
+int src = xics_find_source(icp, irq);
+
+if (src = 0) {
+ICSState *ics = icp-ics[src];
+return ics-qirqs[irq - ics-offset];
 }
 
-return icp-ics-qirqs[irq - icp-ics-offset];
+return NULL;
 }
 
 static void ics_set_irq_type(ICSState *ics, int srcno, bool lsi)
@@ -655,10 +673,12 @@ static void ics_set_irq_type(ICSState *ics, int srcno, 
bool lsi)
 
 void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
 {
-ICSState *ics = icp-ics;
+int src = xics_find_source(icp, irq);
+ICSState *ics;
 
-assert(ics_valid_irq(ics, irq));
+assert(src = 0);
 
+ics = icp-ics[src];
 ics_set_irq_type(ics, irq - ics-offset, lsi);
 }
 
-- 
1.9.rc0




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 12:08:26PM +0200, Greg Kurz wrote:
 On Thu, 15 May 2014 11:20:18 +0200
 Andreas Färber afaer...@suse.de wrote:
  Am 15.05.2014 09:04, schrieb Greg Kurz:
   On Thu, 15 May 2014 12:16:35 +0530
   Amit Shah amit.s...@redhat.com wrote:
   On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
   On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
   On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
   Since each virtio device is streamed in its own section, the idea is 
   to
   stream subsections between the end of the device section and the start
   of the next sections. This allows an older QEMU to complain and exit
   when fed with subsections:
  
   Unknown savevm section type 5
   Error -22 while loading VM state
  
   Please make this configurable -- either via configure or device
   properties.  That avoids having to break existing configurations that
   work without this patch.
  
  Since backwards migration is not supported upstream, wouldn't it be
  easiest to just add support for the subsection marker and skipping to
  the end of section in that downstream?
  
 
 Not sure I understand well... Do you suggest to stream the markers first,
 then the device, then the subsections ? And then there would be a way
 we can have the subsections restored before the device ?
 
   All users of virtio_load()/virtio_save() need to be patched because 
   the
   subsections are streamed AFTER the device itself.
  
  IMO this is calling for inversion of control - i.e. let virtio devices
  call generic load/save functions that then dispatch to device-specific
  code and let us add common stuff in a central place without forgetting
  to add calls in some new device.
  
 
 That makes a lot of sense.
 
   Since all have the same fixup, I'm wondering if a new section can be
   added to the virtio-bus itself, which gets propagated to all devices
   upon load in the dest.
  
   This calls for a way for devices to inherit properties from the bus,
   which doesn't exist ATM.
   Fine but let's not hold up this patchset because of this.
  
   No, only suggestion is to add a migration section in the bus, and then
   it's easier to do this in the post-migrate functions for each device
   -- so only one new section gets introduced instead of all devices
   being modified to send a new subsection.
  
   
   The main problem I see is that virtio sucks: as you see in patch 8, we 
   have
   to be careful not to call vring or virtqueue stuff before the device knows
   its endianness or it breaks... I need to study how the virtio-bus gets
   migrated to ensure the endian section is streamed before the devices.
  
  There is no ordering guarantee. The state needs to be migrated in the
  device or bus where it sits, if post-load processing is required; i.e.,
  if it's in VirtIODevice then something like this series, if it were on
  VirtioBus exclusively (device asking bus for its endianness each time
  and does not do post-load stuff) then endianness could be migrated as a
  new bus section. Not sure if that would help the broken state though?
  
 
 IIRW the broken state was proposed as a per-device property...
 
 Fam,
 
 Do you have plans about the broken property ? Is it still needed ?
 
  Would touch on Stefan's alias properties for anything but virtio-mmio.
  
 
 OMG... maybe I should hold on then.

No need to wait imho.
Can this be made even simpler - call this stuff
from virtio_save/virtio_load?

Why not?


  Regards,
  Andreas
  
 
 Thanks !
 
 -- 
 Gregory Kurz kurzg...@fr.ibm.com
  gk...@linux.vnet.ibm.com
 Software Engineer @ IBM/Meiosys  http://www.ibm.com
 Tel +33 (0)562 165 496
 
 Anarchy is about taking complete responsibility for yourself.
 Alan Moore.



Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Greg Kurz
On Thu, 15 May 2014 12:08:26 +0200
Greg Kurz gk...@linux.vnet.ibm.com wrote:
 On Thu, 15 May 2014 11:20:18 +0200
 Andreas Färber afaer...@suse.de wrote:
  Am 15.05.2014 09:04, schrieb Greg Kurz:
   On Thu, 15 May 2014 12:16:35 +0530
   Amit Shah amit.s...@redhat.com wrote:
   On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
   On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
   On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
   Since each virtio device is streamed in its own section, the idea is 
   to
   stream subsections between the end of the device section and the start
   of the next sections. This allows an older QEMU to complain and exit
   when fed with subsections:
  
   Unknown savevm section type 5
   Error -22 while loading VM state
  
   Please make this configurable -- either via configure or device
   properties.  That avoids having to break existing configurations that
   work without this patch.
  
  Since backwards migration is not supported upstream, wouldn't it be
  easiest to just add support for the subsection marker and skipping to
  the end of section in that downstream?
  
 
 Not sure I understand well... Do you suggest to stream the markers first,
 then the device, then the subsections ? And then there would be a way
 we can have the subsections restored before the device ?
 

Heh just got the clarification thanks to Michael's mail... Sorry for the
confusion and noise. :P


   All users of virtio_load()/virtio_save() need to be patched because 
   the
   subsections are streamed AFTER the device itself.
  
  IMO this is calling for inversion of control - i.e. let virtio devices
  call generic load/save functions that then dispatch to device-specific
  code and let us add common stuff in a central place without forgetting
  to add calls in some new device.
  
 
 That makes a lot of sense.
 
   Since all have the same fixup, I'm wondering if a new section can be
   added to the virtio-bus itself, which gets propagated to all devices
   upon load in the dest.
  
   This calls for a way for devices to inherit properties from the bus,
   which doesn't exist ATM.
   Fine but let's not hold up this patchset because of this.
  
   No, only suggestion is to add a migration section in the bus, and then
   it's easier to do this in the post-migrate functions for each device
   -- so only one new section gets introduced instead of all devices
   being modified to send a new subsection.
  
   
   The main problem I see is that virtio sucks: as you see in patch 8, we 
   have
   to be careful not to call vring or virtqueue stuff before the device knows
   its endianness or it breaks... I need to study how the virtio-bus gets
   migrated to ensure the endian section is streamed before the devices.
  
  There is no ordering guarantee. The state needs to be migrated in the
  device or bus where it sits, if post-load processing is required; i.e.,
  if it's in VirtIODevice then something like this series, if it were on
  VirtioBus exclusively (device asking bus for its endianness each time
  and does not do post-load stuff) then endianness could be migrated as a
  new bus section. Not sure if that would help the broken state though?
  
 
 IIRW the broken state was proposed as a per-device property...
 
 Fam,
 
 Do you have plans about the broken property ? Is it still needed ?
 
  Would touch on Stefan's alias properties for anything but virtio-mmio.
  
 
 OMG... maybe I should hold on then.
 
  Regards,
  Andreas
  
 
 Thanks !
 



-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

Anarchy is about taking complete responsibility for yourself.
Alan Moore.




Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Michael S. Tsirkin
On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas Färber wrote:
 Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
  On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
  Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
  On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
  Am 15.05.2014 09:04, schrieb Greg Kurz:
  On Thu, 15 May 2014 12:16:35 +0530
  Amit Shah amit.s...@redhat.com wrote:
  On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
  On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
  On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
  Since each virtio device is streamed in its own section, the idea 
  is to
  stream subsections between the end of the device section and the 
  start
  of the next sections. This allows an older QEMU to complain and exit
  when fed with subsections:
 
  Unknown savevm section type 5
  Error -22 while loading VM state
 
  Please make this configurable -- either via configure or device
  properties.  That avoids having to break existing configurations that
  work without this patch.
 
  Since backwards migration is not supported upstream, wouldn't it be
  easiest to just add support for the subsection marker and skipping to
  the end of section in that downstream?
 
  Backwards and forwards migration need to be supported,
  customers told us repeatedly. So some downstreams support this
  and not supporting it upstream just means downstreams need
  to do their own thing.
 
  As importantly, ping-pong migration is the only
  reliable way to stress migration.
 
  So if we want to test cross-version we need it to work
  both way.
 
  Finally, the real issue and difficulty with cross-version migration is
  making VM behave in a backwards compatible way.  Serializing in a
  compatible way is a trivial problem, or would be if the code wasn't a
  mess :) Once you do the hard part, breaking migration because of the
  trivial serialization issue is just silly.  And special-casing forward
  migration does not make code simpler, it really only leads to
  proliferation of hacks and lack of symmetry.
 
  So yes it's a useful feature, and no not supporting it does
  not help anyway.
 
  It seems you misunderstand. I was not saying it's not useful.
 
  My point is that VMStateSubsections added in newer versions (and thus
  not existing in older versions) need to be handled for any
  VMState-converted devices. So why can't we make that work for virtio too?
  
  Sure.
  AFAIK the way to this works is by adding an field_exists callback,
  and not sending the section when we are in a compat mode.
 
 OK, but upstream always sends the latest version today.
 So isn't that
 just two ifs that you would need to add in save and load functions added
 here for the downstream? x86_64 is unaffected from ppc's endianness
 changes and you still do ppc64 BE, so there's no real functional problem
 for RHEL, is there?
 
 Andreas

I'm sorry I don't understand what you are saying here.
Simply put, if you specify a compatible -M then your
device should behave, and migrate, exactly like and old
qemu did.

With *any* change, there's always the temptation to say
oh but old behaviour was too buggy we should just ignore it.
Seems to make sense in each case but does not scale,
you end up with breakages in all version without exception.
So this should be a very high bar to overcome.



 -- 
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread Mark Cave-Ayland

On 15/05/14 11:01, sonia verma wrote:


Hi Mark

I tried booting KVM using qemy-system-ppc with your suggesstion but
ended up stucking at below logs..

/usr/bin/qemu-system-ppc -m 512 -nographic -hda
kvm/debian_lenny_powerpc_standard.qcow2
qemu-system-ppc: pci_add_option_rom: failed to find romfile
efi-ne2k_pci.rom
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle
  set_property: NULL phandle


This looks broken :/  I've seen similar errors to this caused by faulty 
gcc compilers around the 4.7 era breaking OpenBIOS binaries, but I 
always test the images supplied with stock QEMU to make sure they work 
for me here before sending a pull request.


Perhaps it is something within the KVM code? Does it work if you use 
standard QEMU?



ATB,

Mark.

P.S. You should also CC qemu-ppc as that's where the PPC users tend to 
hang out...





[Qemu-devel] [PATCH v2 4/8] spapr: Move interrupt allocator to xics

2014-05-15 Thread Alexey Kardashevskiy
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.

This moves an allocator from SPAPR to XICS.

This switches IRQ users to use new API.

This uses LSI/MSI flags to know if interrupt is allocated.

The interrupt release function will be posted as a separate patch.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c | 88 ++
 hw/ppc/spapr.c | 67 --
 hw/ppc/spapr_events.c  |  2 +-
 hw/ppc/spapr_pci.c |  6 ++--
 hw/ppc/spapr_vio.c |  2 +-
 include/hw/ppc/spapr.h | 10 --
 include/hw/ppc/xics.h  |  2 ++
 trace-events   |  4 +++
 8 files changed, 99 insertions(+), 82 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 83a809e..fdcbb3a 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -689,6 +689,94 @@ void xics_set_irq_type(XICSState *icp, int irq, bool lsi)
 ics_set_irq_type(ics, irq - ics-offset, lsi);
 }
 
+#define ICS_IRQ_FREE(ics, srcno)   \
+(!((ics)-irqs[(srcno)].flags  (XICS_FLAGS_IRQ_MASK)))
+
+static int ics_find_free_block(ICSState *ics, int num, int alignnum)
+{
+int first, i;
+
+for (first = 0; first  ics-nr_irqs; first += alignnum) {
+if (num  (ics-nr_irqs - first)) {
+return -1;
+}
+for (i = first; i  first + num; ++i) {
+if (!ICS_IRQ_FREE(ics, i)) {
+break;
+}
+}
+if (i == (first + num)) {
+return first;
+}
+}
+
+return -1;
+}
+
+int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi)
+{
+ICSState *ics = icp-ics[src];
+int irq;
+
+if (irq_hint) {
+assert(src == xics_find_source(icp, irq_hint));
+if (!ICS_IRQ_FREE(ics, irq_hint - ics-offset)) {
+trace_xics_alloc_failed_hint(src, irq_hint);
+return -1;
+}
+irq = irq_hint;
+} else {
+irq = ics_find_free_block(ics, 1, 1);
+if (irq  0) {
+trace_xics_alloc_failed_no_left(src);
+return -1;
+}
+irq += ics-offset;
+}
+
+ics_set_irq_type(ics, irq - ics-offset, lsi);
+trace_xics_alloc(src, irq);
+
+return irq;
+}
+
+/*
+ * Allocate block of consequtive IRQs, returns a number of the first.
+ * If align==true, aligns the first IRQ number to num.
+ */
+int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align)
+{
+int i, first = -1;
+ICSState *ics = icp-ics[src];
+
+assert(src == 0);
+/*
+ * MSIMesage::data is used for storing VIRQ so
+ * it has to be aligned to num to support multiple
+ * MSI vectors. MSI-X is not affected by this.
+ * The hint is used for the first IRQ, the rest should
+ * be allocated continuously.
+ */
+if (align) {
+assert((num == 1) || (num == 2) || (num == 4) ||
+   (num == 8) || (num == 16) || (num == 32));
+first = ics_find_free_block(ics, num, num);
+} else {
+first = ics_find_free_block(ics, num, 1);
+}
+
+if (first = 0) {
+for (i = first; i  first + num; ++i) {
+ics_set_irq_type(ics, i, lsi);
+}
+}
+first += ics-offset;
+
+trace_xics_alloc_block(src, first, num, lsi, align);
+
+return first;
+}
+
 /*
  * Guest interfaces
  */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b898a39..fce7e1c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -88,73 +88,6 @@
 
 sPAPREnvironment *spapr;
 
-int spapr_allocate_irq(int hint, bool lsi)
-{
-int irq;
-
-if (hint) {
-irq = hint;
-if (hint = spapr-next_irq) {
-spapr-next_irq = hint + 1;
-}
-/* FIXME: we should probably check for collisions somehow */
-} else {
-irq = spapr-next_irq++;
-}
-
-/* Configure irq type */
-if (!xics_get_qirq(spapr-icp, irq)) {
-return 0;
-}
-
-xics_set_irq_type(spapr-icp, irq, lsi);
-
-return irq;
-}
-
-/*
- * Allocate block of consequtive IRQs, returns a number of the first.
- * If msi==true, aligns the first IRQ number to num.
- */
-int spapr_allocate_irq_block(int num, bool lsi, bool msi)
-{
-int first = -1;
-int i, hint = 0;
-
-/*
- * MSIMesage::data is used for storing VIRQ so
- * it has to be aligned to num to support multiple
- * MSI vectors. MSI-X is not affected by this.
- * The hint is used for the first IRQ, the rest should
- * be allocated continuously.
- */
-if (msi) {
-assert((num == 1) || (num == 2) || (num == 4) ||
-   (num == 8) || (num == 16) || (num == 32));
-hint = (spapr-next_irq + num - 1)  ~(num - 1);
-}
-
-for (i = 0; i  num; ++i) {
-int irq;
-
-irq = 

Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Greg Kurz
On Thu, 15 May 2014 13:12:12 +0300
Michael S. Tsirkin m...@redhat.com wrote:

 On Thu, May 15, 2014 at 12:08:26PM +0200, Greg Kurz wrote:
  On Thu, 15 May 2014 11:20:18 +0200
  Andreas Färber afaer...@suse.de wrote:
   Am 15.05.2014 09:04, schrieb Greg Kurz:
On Thu, 15 May 2014 12:16:35 +0530
Amit Shah amit.s...@redhat.com wrote:
On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
Since each virtio device is streamed in its own section, the idea 
is to
stream subsections between the end of the device section and the 
start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:
   
Unknown savevm section type 5
Error -22 while loading VM state
   
Please make this configurable -- either via configure or device
properties.  That avoids having to break existing configurations that
work without this patch.
   
   Since backwards migration is not supported upstream, wouldn't it be
   easiest to just add support for the subsection marker and skipping to
   the end of section in that downstream?
   
  
  Not sure I understand well... Do you suggest to stream the markers first,
  then the device, then the subsections ? And then there would be a way
  we can have the subsections restored before the device ?
  
All users of virtio_load()/virtio_save() need to be patched because 
the
subsections are streamed AFTER the device itself.
   
   IMO this is calling for inversion of control - i.e. let virtio devices
   call generic load/save functions that then dispatch to device-specific
   code and let us add common stuff in a central place without forgetting
   to add calls in some new device.
   
  
  That makes a lot of sense.
  
Since all have the same fixup, I'm wondering if a new section can be
added to the virtio-bus itself, which gets propagated to all devices
upon load in the dest.
   
This calls for a way for devices to inherit properties from the bus,
which doesn't exist ATM.
Fine but let's not hold up this patchset because of this.
   
No, only suggestion is to add a migration section in the bus, and then
it's easier to do this in the post-migrate functions for each device
-- so only one new section gets introduced instead of all devices
being modified to send a new subsection.
   

The main problem I see is that virtio sucks: as you see in patch 8, we 
have
to be careful not to call vring or virtqueue stuff before the device 
knows
its endianness or it breaks... I need to study how the virtio-bus gets
migrated to ensure the endian section is streamed before the devices.
   
   There is no ordering guarantee. The state needs to be migrated in the
   device or bus where it sits, if post-load processing is required; i.e.,
   if it's in VirtIODevice then something like this series, if it were on
   VirtioBus exclusively (device asking bus for its endianness each time
   and does not do post-load stuff) then endianness could be migrated as a
   new bus section. Not sure if that would help the broken state though?
   
  
  IIRW the broken state was proposed as a per-device property...
  
  Fam,
  
  Do you have plans about the broken property ? Is it still needed ?
  
   Would touch on Stefan's alias properties for anything but virtio-mmio.
   
  
  OMG... maybe I should hold on then.
 
 No need to wait imho.
 Can this be made even simpler - call this stuff
 from virtio_save/virtio_load?
 

Andreas already suggested this inversion of control.

 Why not?
 

No reason indeed. I'll rewrite the code that way ! :)

 
   Regards,
   Andreas
   
  
  Thanks !
  
  -- 
  Gregory Kurz kurzg...@fr.ibm.com
   gk...@linux.vnet.ibm.com
  Software Engineer @ IBM/Meiosys  http://www.ibm.com
  Tel +33 (0)562 165 496
  
  Anarchy is about taking complete responsibility for yourself.
  Alan Moore.
 

Thnaks !

-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

Anarchy is about taking complete responsibility for yourself.
Alan Moore.




[Qemu-devel] [PATCH v2 6/8] spapr: Remove @next_irq

2014-05-15 Thread Alexey Kardashevskiy
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c | 3 +--
 include/hw/ppc/spapr.h | 1 -
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fce7e1c..d95e9b4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -767,7 +767,7 @@ static const VMStateDescription vmstate_spapr = {
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
 .fields  = (VMStateField []) {
-VMSTATE_UINT32(next_irq, sPAPREnvironment),
+VMSTATE_UNUSED(4), /* used to be @next_irq */
 
 /* RTC offset */
 VMSTATE_UINT64(rtc_offset, sPAPREnvironment),
@@ -1171,7 +1171,6 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 /* Set up Interrupt Controller before we create the VCPUs */
 spapr-icp = xics_system_init(smp_cpus * kvmppc_smt_threads() / 
smp_threads,
   XICS_IRQS);
-spapr-next_irq = XICS_IRQ_BASE;
 
 /* init CPUs */
 if (cpu_model == NULL) {
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index feb241a..f8d7326 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -27,7 +27,6 @@ typedef struct sPAPREnvironment {
 long rtas_size;
 void *fdt_skel;
 target_ulong entry_point;
-uint32_t next_irq;
 uint64_t rtc_offset;
 bool has_graphics;
 
-- 
1.9.rc0




[Qemu-devel] [PATCH v2 7/8] xics: Implement xics_ics_free()

2014-05-15 Thread Alexey Kardashevskiy
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intc/xics.c| 25 +
 include/hw/ppc/xics.h |  1 +
 trace-events  |  2 ++
 3 files changed, 28 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 1a8cfd7..f6a2066 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -766,6 +766,31 @@ int xics_alloc_block(XICSState *icp, int src, int num, 
bool lsi, bool align)
 return first;
 }
 
+static void ics_free(ICSState *ics, int srcno, int num)
+{
+int i;
+
+for (i = srcno; i  srcno + num; ++i) {
+if (ICS_IRQ_FREE(ics, i)) {
+trace_xics_ics_free_warn(ics - ics-icp-ics, i + ics-offset);
+}
+memset(ics-irqs[i], 0, sizeof(ICSIRQState));
+}
+}
+
+void xics_free(XICSState *icp, int src, int irq, int num)
+{
+ICSState *ics = icp-ics[src];
+
+/* FIXME: implement multiple sources */
+assert(src == 0);
+
+trace_xics_ics_free(ics - icp-ics, irq, num);
+if (src = 0) {
+ics_free(ics, irq - ics-offset, num);
+}
+}
+
 /*
  * Guest interfaces
  */
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 0d8af1b..267d045 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -159,6 +159,7 @@ struct ICSIRQState {
 qemu_irq xics_get_qirq(XICSState *icp, int irq);
 int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi);
 int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align);
+void xics_free(XICSState *icp, int src, int irq, int num);
 
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu);
 
diff --git a/trace-events b/trace-events
index d8e83cc..0a66c71 100644
--- a/trace-events
+++ b/trace-events
@@ -1182,6 +1182,8 @@ xics_alloc(int src, int irq) source#%d, irq %d
 xics_alloc_failed_hint(int src, int irq) source#%d, irq %d is already in use
 xics_alloc_failed_no_left(int src) source#%d, no irq left
 xics_alloc_block(int src, int first, int num, bool lsi, int align) source#%d, 
first irq %d, %d irqs, lsi=%d, alignnum %d
+xics_ics_free(int src, int irq, int num) Source#%d, first irq %d, %d irqs
+xics_ics_free_warn(int src, int irq) Source#%d, irq %d is already free
 
 # hw/ppc/spapr_iommu.c
 spapr_iommu_put(uint64_t liobn, uint64_t ioba, uint64_t tce, uint64_t ret) 
liobn=%PRIx64 ioba=0x%PRIx64 tce=0x%PRIx64 ret=%PRId64
-- 
1.9.rc0




[Qemu-devel] [PATCH v2 0/8] Move interrupts from spapr to xics

2014-05-15 Thread Alexey Kardashevskiy
This moves interrupts allocation business from SPAPR to XICS
and makes use of it.

Changes:
v2:
* s/server/source/
* fixed typos, code style, added an assert
* added patch for spapr_pci for better IRQ reuse for MSI/MSIX

There is just one source at the moment. We might create one
per PHB and one per VIO device or VIO bus but I do not see any
immediate profit from it. Makes any sense?

Please comment. Thanks!



Alexey Kardashevskiy (8):
  xics: Add flags for interrupts
  xics: Add xics_find_source()
  xics: Disable flags reset on xics reset
  spapr: Move interrupt allocator to xics
  xics: Remove obsolete xics_set_irq_type()
  spapr: Remove @next_irq
  xics: Implement xics_ics_free()
  spapr_pci: Use XICS interrupt allocator and do not cache interrupts in
PHB

 hw/intc/xics.c  | 160 ---
 hw/intc/xics_kvm.c  |  12 ++-
 hw/ppc/spapr.c  |  70 +
 hw/ppc/spapr_events.c   |   2 +-
 hw/ppc/spapr_pci.c  | 181 +++-
 hw/ppc/spapr_vio.c  |   2 +-
 include/hw/pci-host/spapr.h |  11 +--
 include/hw/ppc/spapr.h  |  11 ---
 include/hw/ppc/xics.h   |  10 ++-
 trace-events|  11 ++-
 10 files changed, 275 insertions(+), 195 deletions(-)

-- 
1.9.rc0




Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread sonia verma
Hi Mark

The gcc version I'm using is 4.8.1 .
It is not working with the standard Qemu.


On Thu, May 15, 2014 at 3:46 PM, Mark Cave-Ayland 
mark.cave-ayl...@ilande.co.uk wrote:

 On 15/05/14 11:01, sonia verma wrote:

  Hi Mark

 I tried booting KVM using qemy-system-ppc with your suggesstion but
 ended up stucking at below logs..

 /usr/bin/qemu-system-ppc -m 512 -nographic -hda
 kvm/debian_lenny_powerpc_standard.qcow2
 qemu-system-ppc: pci_add_option_rom: failed to find romfile
 efi-ne2k_pci.rom
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle
   set_property: NULL phandle


 This looks broken :/  I've seen similar errors to this caused by faulty
 gcc compilers around the 4.7 era breaking OpenBIOS binaries, but I always
 test the images supplied with stock QEMU to make sure they work for me here
 before sending a pull request.

 Perhaps it is something within the KVM code? Does it work if you use
 standard QEMU?


 ATB,

 Mark.

 P.S. You should also CC qemu-ppc as that's where the PPC users tend to
 hang out...




Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread Mark Cave-Ayland

On 15/05/14 11:50, sonia verma wrote:


Hi Mark

The gcc version I'm using is 4.8.1 .
It is not working with the standard Qemu.


Unfortunately if it doesn't work with standard QEMU then it sounds as if 
there is something wrong with either your OpenBIOS binary or build 
environment.


The debian_lenny_powerpc_standard.qcow2 image boots fine for me (albeit 
not with -nographic) using a fresh build from git. I would suggest 
trying the same, using the pre-built OpenBIOS images and see if that works.


It may also be worth raising a bug report with your distro if they are 
shipping corrupt binaries.



Kind regards,

Mark.




Re: [Qemu-devel] Qemu stucking

2014-05-15 Thread sonia verma
Hi Mark

Thanks for the information.It will help me alot.
I'll let you know if any further issues.


On Thu, May 15, 2014 at 4:32 PM, Mark Cave-Ayland 
mark.cave-ayl...@ilande.co.uk wrote:

 On 15/05/14 11:50, sonia verma wrote:

  Hi Mark

 The gcc version I'm using is 4.8.1 .
 It is not working with the standard Qemu.


 Unfortunately if it doesn't work with standard QEMU then it sounds as if
 there is something wrong with either your OpenBIOS binary or build
 environment.

 The debian_lenny_powerpc_standard.qcow2 image boots fine for me (albeit
 not with -nographic) using a fresh build from git. I would suggest trying
 the same, using the pre-built OpenBIOS images and see if that works.

 It may also be worth raising a bug report with your distro if they are
 shipping corrupt binaries.


 Kind regards,

 Mark.




[Qemu-devel] [PATCH v2 1/2] block: Move declaration of bdrv_get_aio_context to block.h

2014-05-15 Thread Fam Zheng
block_int.h is for block layer and block drivers, other code shouldn't
include it. But similar to bdrv_set_aio_context, bdrv_get_aio_context
should also be accessible from outside of block layer.

Move it.

Signed-off-by: Fam Zheng f...@redhat.com
---
 include/block/block.h | 7 +++
 include/block/block_int.h | 7 ---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index d223e33..6676d1f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -534,6 +534,13 @@ int bdrv_debug_resume(BlockDriverState *bs, const char 
*tag);
 bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag);
 
 /**
+ * bdrv_get_aio_context:
+ *
+ * Returns: the currently bound #AioContext
+ */
+AioContext *bdrv_get_aio_context(BlockDriverState *bs);
+
+/**
  * bdrv_set_aio_context:
  *
  * Changes the #AioContext used for fd handlers, timers, and BHs by this
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 47c79b3..a9301d5 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -397,13 +397,6 @@ void bdrv_add_before_write_notifier(BlockDriverState *bs,
 NotifierWithReturn *notifier);
 
 /**
- * bdrv_get_aio_context:
- *
- * Returns: the currently bound #AioContext
- */
-AioContext *bdrv_get_aio_context(BlockDriverState *bs);
-
-/**
  * bdrv_detach_aio_context:
  *
  * May be called from .bdrv_detach_aio_context() to detach children from the
-- 
1.9.2




[Qemu-devel] [PATCH v2 0/2] dataplane: Enable config-wce

2014-05-15 Thread Fam Zheng
This applies on top of Stefan's dataplane series.

v2:

Patch 1 moves the declaration of bdrv_get_aio_context to block.h.
Patch 2 is unchanged except for the dropped #include.


Fam Zheng (2):
  block: Move declaration of bdrv_get_aio_context to block.h
  virtio-blk: Allow config-wce in dataplane

 hw/block/dataplane/virtio-blk.c | 6 --
 hw/block/virtio-blk.c   | 8 +++-
 include/block/block.h   | 7 +++
 include/block/block_int.h   | 7 ---
 4 files changed, 14 insertions(+), 14 deletions(-)

-- 
1.9.2




[Qemu-devel] [PATCH v2 2/2] virtio-blk: Allow config-wce in dataplane

2014-05-15 Thread Fam Zheng
Dataplane now uses block layer. Protect bdrv_set_enable_write_cache with
aio_context_acquire and aio_context_release, so we can enable config-wce
to allow guest to modify the write cache online.

Signed-off-by: Fam Zheng f...@redhat.com
---
 hw/block/dataplane/virtio-blk.c | 6 --
 hw/block/virtio-blk.c   | 8 +++-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 79fb612..46a6824 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -332,12 +332,6 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, 
VirtIOBlkConf *blk,
 return;
 }
 
-if (blk-config_wce) {
-error_setg(errp, device is incompatible with x-data-plane, 
- use config-wce=off);
-return;
-}
-
 /* If dataplane is (re-)enabled while the guest is running there could be
  * block jobs that can conflict.
  */
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8a568e5..5e9433d 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -523,7 +523,10 @@ static void virtio_blk_set_config(VirtIODevice *vdev, 
const uint8_t *config)
 struct virtio_blk_config blkcfg;
 
 memcpy(blkcfg, config, sizeof(blkcfg));
+
+aio_context_acquire(bdrv_get_aio_context(s-bs));
 bdrv_set_enable_write_cache(s-bs, blkcfg.wce != 0);
+aio_context_release(bdrv_get_aio_context(s-bs));
 }
 
 static uint32_t virtio_blk_get_features(VirtIODevice *vdev, uint32_t features)
@@ -582,7 +585,10 @@ static void virtio_blk_set_status(VirtIODevice *vdev, 
uint8_t status)
  * s-bs would erroneously be placed in writethrough mode.
  */
 if (!(features  (1  VIRTIO_BLK_F_CONFIG_WCE))) {
-bdrv_set_enable_write_cache(s-bs, !!(features  (1  
VIRTIO_BLK_F_WCE)));
+aio_context_acquire(bdrv_get_aio_context(s-bs));
+bdrv_set_enable_write_cache(s-bs,
+!!(features  (1  VIRTIO_BLK_F_WCE)));
+aio_context_release(bdrv_get_aio_context(s-bs));
 }
 }
 
-- 
1.9.2




Re: [Qemu-devel] usb: usb tablet freeze when save/restore guest os

2014-05-15 Thread Gonglei (Arei)
 -Original Message-
 From: Gerd Hoffmann [mailto:kra...@redhat.com]
 Sent: Thursday, May 15, 2014 2:50 PM
 To: Gonglei (Arei)
 Cc: Paolo Bonzini; qemu-devel@nongnu.org; Huangweidong (C); Michael S.
 Tsirkin
 Subject: Re: usb: usb tablet freeze when save/restore guest os
 
   Hi,
 
  Well then, may I post a formal patch for this issue, Gerd? Thanks.
 
 I'd like to know what the root cause for the lost interrupt is.
 
Hi, Gerd. I must clarify the scene of this issue.

1)The problem occurred on Xen platform. 
the process of hibernate vm on Xen:
 suspend vm (pause all vcpus)
call xc_save to save memory
stop qemu (vm_stop)
save qemu 
destroy vm

2)The process of hibernate vm on KVM:
 vm_stop
do_vm_stop
pause_all_vcpus
runstate_set(state); //change the runstate to paused

3)the usb tablet backtrace:
vnc_client_read
protocol_client_msg
pointer_event
qemu_input_event_sync
in qemu_input_event_sync(), there is a check vm runstate, as below:

if (!runstate_is_running()  !runstate_check(RUN_STATE_SUSPENDED)) {
return;
}
On KVM platform, the usb tablet event will be return at there.

As for Xen, It's too later. After suspend vm, the qemu process can response the 
event
of usb tablet event. Because guest os's vcpus are paused, guest os cannot 
response 
interrupt injected by qemu. Then the interrupt will be lost.

 Not implementing PIRQ enable could be it, especially as the guest os
 seems to use it (otherwise your patch would have no effect).
 
 The check for the PIRQ enable bit should be in uhci_update_irq though,
 and you should check the single bit only, not the whole legacy support
 register.
 
Agreed. Thanks.

 cheers,
   Gerd
 
 

Best regards,
-Gonglei


[Qemu-devel] [PATCH v2] spapr: Add ibm, chip-id property in device tree

2014-05-15 Thread Alexey Kardashevskiy
This adds a ibm,chip-id property for CPU nodes which should be the same
for all cores in the same CPU socket. The recent guest kernels use this
information to associate threads with sockets.

Refer to the kernel commit 256f2d4b463d3030ebc8d2b54f427543814a2bdc
for more details.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---

Changes:
v2:
* always put ibm,chip to the device tree
* removed from migration as it is the user's responsibility
to run QEMU on both sides with the same CPU configuration


---
 hw/ppc/spapr.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b898a39..166c1c6 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -313,6 +313,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
 int i, smt = kvmppc_smt_threads();
 unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
+QemuOpts *opts = qemu_opts_find(qemu_find_opts(smp-opts), NULL);
+unsigned sockets = opts ? qemu_opt_get_number(opts, sockets, 0) : 0;
+uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
 
 fdt = g_malloc0(FDT_MAX_SIZE);
 _FDT((fdt_create(fdt, FDT_MAX_SIZE)));
@@ -470,6 +473,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
page_sizes_prop, page_sizes_prop_size)));
 }
 
+_FDT((fdt_property_cell(fdt, ibm,chip-id,
+cs-cpu_index / cpus_per_socket)));
+
 _FDT((fdt_end_node(fdt)));
 }
 
-- 
1.9.rc0




Re: [Qemu-devel] [PATCH 1/3] throttle: add throttle_detach/attach_aio_context()

2014-05-15 Thread Benoît Canet
The Wednesday 14 May 2014 à 16:22:45 (+0200), Stefan Hajnoczi wrote :
 Block I/O throttling uses timers and currently always adds them to the
 main loop.  Throttling will break if bdrv_set_aio_context() is used to
 move a BlockDriverState to a different AioContext.
 
 This patch adds throttle_detach/attach_aio_context() interfaces so the
 throttling timers and uses them to move timers to the new AioContext.
 Note that bdrv_set_aio_context() already drains all requests so we're
 sure no throttled requests are pending.
 
 The test cases need to be updated since the throttle_init() interface
 has changed.
 
 Cc: Benoît Canet benoit.ca...@irqsave.net
 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
 ---
  block.c |  7 +++
  include/qemu/throttle.h | 10 ++
  tests/test-throttle.c   | 25 -
  util/throttle.c | 27 +++
  4 files changed, 60 insertions(+), 9 deletions(-)
 
 diff --git a/block.c b/block.c
 index 189fc0d..b30bcd5 100644
 --- a/block.c
 +++ b/block.c
 @@ -179,6 +179,7 @@ void bdrv_io_limits_enable(BlockDriverState *bs)
  {
  assert(!bs-io_limits_enabled);
  throttle_init(bs-throttle_state,
 +  bdrv_get_aio_context(bs),
QEMU_CLOCK_VIRTUAL,
bdrv_throttle_read_timer_cb,
bdrv_throttle_write_timer_cb,
 @@ -5532,6 +5533,9 @@ void bdrv_detach_aio_context(BlockDriverState *bs)
  return;
  }
  
 +if (bs-io_limits_enabled) {
 +throttle_detach_aio_context(bs-throttle_state);
 +}
  if (bs-drv-bdrv_detach_aio_context) {
  bs-drv-bdrv_detach_aio_context(bs);
  }
 @@ -5563,6 +5567,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
  if (bs-drv-bdrv_attach_aio_context) {
  bs-drv-bdrv_attach_aio_context(bs, new_context);
  }
 +if (bs-io_limits_enabled) {
 +throttle_attach_aio_context(bs-throttle_state, new_context);
 +}
  }
  
  void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 diff --git a/include/qemu/throttle.h b/include/qemu/throttle.h
 index ab29b0b..b890613 100644
 --- a/include/qemu/throttle.h
 +++ b/include/qemu/throttle.h
 @@ -67,6 +67,11 @@ typedef struct ThrottleState {
  int64_t previous_leak;/* timestamp of the last leak done */
  QEMUTimer * timers[2];/* timers used to do the throttling */
  QEMUClockType clock_type; /* the clock used */
 +
 +/* Callbacks */
 +QEMUTimerCB *read_timer_cb;
 +QEMUTimerCB *write_timer_cb;
 +void *timer_opaque;
  } ThrottleState;
  
  /* operations on single leaky buckets */
 @@ -82,6 +87,7 @@ bool throttle_compute_timer(ThrottleState *ts,
  
  /* init/destroy cycle */
  void throttle_init(ThrottleState *ts,
 +   AioContext *aio_context,
 QEMUClockType clock_type,
 void (read_timer)(void *),
 void (write_timer)(void *),
 @@ -89,6 +95,10 @@ void throttle_init(ThrottleState *ts,
  
  void throttle_destroy(ThrottleState *ts);
  
 +void throttle_detach_aio_context(ThrottleState *ts);
 +
 +void throttle_attach_aio_context(ThrottleState *ts, AioContext *new_context);
 +
  bool throttle_have_timer(ThrottleState *ts);
  
  /* configuration */
 diff --git a/tests/test-throttle.c b/tests/test-throttle.c
 index 1d4ffd3..5fa5000 100644
 --- a/tests/test-throttle.c
 +++ b/tests/test-throttle.c
 @@ -12,8 +12,10 @@
  
  #include glib.h
  #include math.h
 +#include block/aio.h
  #include qemu/throttle.h
  
 +AioContext *ctx;
  LeakyBucketbkt;
  ThrottleConfig cfg;
  ThrottleState  ts;
 @@ -104,7 +106,8 @@ static void test_init(void)
  memset(ts, 1, sizeof(ts));
  
  /* init the structure */
 -throttle_init(ts, QEMU_CLOCK_VIRTUAL, read_timer_cb, write_timer_cb, 
 ts);
 +throttle_init(ts, ctx, QEMU_CLOCK_VIRTUAL,
 +  read_timer_cb, write_timer_cb, ts);
  
  /* check initialized fields */
  g_assert(ts.clock_type == QEMU_CLOCK_VIRTUAL);
 @@ -126,7 +129,8 @@ static void test_init(void)
  static void test_destroy(void)
  {
  int i;
 -throttle_init(ts, QEMU_CLOCK_VIRTUAL, read_timer_cb, write_timer_cb, 
 ts);
 +throttle_init(ts, ctx, QEMU_CLOCK_VIRTUAL,
 +  read_timer_cb, write_timer_cb, ts);
  throttle_destroy(ts);
  for (i = 0; i  2; i++) {
  g_assert(!ts.timers[i]);
 @@ -165,7 +169,8 @@ static void test_config_functions(void)
  
  orig_cfg.op_size = 1;
  
 -throttle_init(ts, QEMU_CLOCK_VIRTUAL, read_timer_cb, write_timer_cb, 
 ts);
 +throttle_init(ts, ctx, QEMU_CLOCK_VIRTUAL,
 +  read_timer_cb, write_timer_cb, ts);
  /* structure reset by throttle_init previous_leak should be null */
  g_assert(!ts.previous_leak);
  throttle_config(ts, orig_cfg);
 @@ -324,7 +329,8 @@ static void test_have_timer(void)
  g_assert(!throttle_have_timer(ts));
  
  /* init the structure */
 -

[Qemu-devel] [PATCH 1/9] kvm: add set_one_reg/get_one_reg helpers

2014-05-15 Thread Alexey Kardashevskiy
This adds QEMU wrappers for KVM_SET_ONE_REG/KVM_GET_ONE_REG ioctls.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
Changes:
v3:
* changed comments

v2:
* added Doc Comments
* removed error_print
---
 include/sysemu/kvm.h | 21 +
 kvm-all.c| 18 ++
 2 files changed, 39 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 63e241a..b1566a4 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -383,4 +383,25 @@ void kvm_init_irq_routing(KVMState *s);
  *   0: irq chip was created
  */
 int kvm_arch_irqchip_create(KVMState *s);
+
+/**
+ * kvm_set_one_reg - set a register value in KVM via KVM_SET_ONE_REG ioctl
+ * @id: The register ID
+ * @addr: The pointer to a value must point to a variable of the correct
+ * type/size for the register being accessed.
+ *
+ * Returns: 0 on success, or a negative errno on failure.
+ */
+int kvm_set_one_reg(CPUState *cs, uint64_t id, void *addr);
+
+/**
+ * kvm_get_one_reg - get a register value from KVM via KVM_GET_ONE_REG ioctl
+ * @id: The register ID
+ * @addr: The pointer to a value must point to a variable of the correct
+ * type/size for the register being accessed.
+ *
+ * Returns: 0 on success, or a negative errno on failure.
+ */
+int kvm_get_one_reg(CPUState *cs, uint64_t id, void *addr);
+
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index 5cb7f26..ebb1afa 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -2114,3 +2114,21 @@ int kvm_create_device(KVMState *s, uint64_t type, bool 
test)
 
 return test ? 0 : create_dev.fd;
 }
+
+int kvm_set_one_reg(CPUState *cs, uint64_t id, void *addr)
+{
+struct kvm_one_reg reg = {
+.id = id,
+.addr = (uintptr_t)addr,
+};
+return kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, reg);
+}
+
+int kvm_get_one_reg(CPUState *cs, uint64_t id, void *addr)
+{
+struct kvm_one_reg reg = {
+.id = id,
+.addr = (uintptr_t)addr,
+};
+return kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, reg);
+}
-- 
1.9.rc0




[Qemu-devel] [PATCH 2/9] target-ppc: Add compat CPU option

2014-05-15 Thread Alexey Kardashevskiy
PowerISA defines a compatibility mode for server POWERPC CPUs which
is supported by the PCR special register. To support this feature,
SPAPR defines a set of virtual PVRs, once per PowerISA spec version.

This introduces a compat CPU option which defines maximal compatibility
mode enabled. The supported modes are power6/power7/power8.

This does not change the existing behaviour, new property will be used
by next patches.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 target-ppc/cpu-models.h | 11 +++
 target-ppc/cpu-qom.h|  2 ++
 target-ppc/translate_init.c | 75 +
 3 files changed, 88 insertions(+)

diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
index 9a003b4..33d2c51 100644
--- a/target-ppc/cpu-models.h
+++ b/target-ppc/cpu-models.h
@@ -595,6 +595,17 @@ enum {
 CPU_POWERPC_PA6T   = 0x0090,
 };
 
+/* Logical PVR definitions for sPAPR */
+enum {
+/* Logical CPUs */
+CPU_POWERPC_LOGICAL_2_04   = 0x0F01,
+CPU_POWERPC_LOGICAL_2_05   = 0x0F02,
+CPU_POWERPC_LOGICAL_2_06   = 0x0F03,
+CPU_POWERPC_LOGICAL_2_06_PLUS  = 0x0F13,
+CPU_POWERPC_LOGICAL_2_07   = 0x0F04,
+CPU_POWERPC_LOGICAL_2_08   = 0x0F05,
+};
+
 /* System version register (used on MPC 8xxx)*/
 enum {
 POWERPC_SVR_NONE   = 0x,
diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index 47dc8e6..2a8515b 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -82,6 +82,7 @@ typedef struct PowerPCCPUClass {
  * PowerPCCPU:
  * @env: #CPUPPCState
  * @cpu_dt_id: CPU index used in the device tree. KVM uses this index too
+ * @max_compat: Maximal supported logical PVR from the command line
  *
  * A PowerPC CPU.
  */
@@ -92,6 +93,7 @@ struct PowerPCCPU {
 
 CPUPPCState env;
 int cpu_dt_id;
+uint32_t max_compat;
 };
 
 static inline PowerPCCPU *ppc_env_get_cpu(CPUPPCState *env)
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 1d64ec9..6f61b34 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -28,6 +28,8 @@
 #include mmu-hash32.h
 #include mmu-hash64.h
 #include qemu/error-report.h
+#include qapi/visitor.h
+#include hw/qdev-properties.h
 
 //#define PPC_DUMP_CPU
 //#define PPC_DEBUG_SPR
@@ -7654,6 +7656,76 @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data)
 pcc-l1_icache_size = 0x1;
 }
 
+static void powerpc_get_compat(Object *obj, Visitor *v,
+   void *opaque, const char *name, Error **errp)
+{
+char *value = (char *);
+Property *prop = opaque;
+uint32_t *max_compat = qdev_get_prop_ptr(DEVICE(obj), prop);
+
+switch (*max_compat) {
+case CPU_POWERPC_LOGICAL_2_05:
+value = (char *)power6;
+break;
+case CPU_POWERPC_LOGICAL_2_06:
+value = (char *)power7;
+break;
+case CPU_POWERPC_LOGICAL_2_07:
+value = (char *)power8;
+break;
+case 0:
+break;
+default:
+error_setg(errp, Internal error: compat is set to %x,
+   max_compat ? *max_compat : -1);
+break;
+}
+
+visit_type_str(v, value, name, errp);
+}
+
+static void powerpc_set_compat(Object *obj, Visitor *v,
+   void *opaque, const char *name, Error **errp)
+{
+Error *error = NULL;
+char *value = NULL;
+Property *prop = opaque;
+uint32_t *max_compat = qdev_get_prop_ptr(DEVICE(obj), prop);
+
+visit_type_str(v, value, name, error);
+if (error) {
+error_propagate(errp, error);
+return;
+}
+
+if (strcmp(value, power6) == 0) {
+*max_compat = CPU_POWERPC_LOGICAL_2_05;
+} else if (strcmp(value, power7) == 0) {
+*max_compat = CPU_POWERPC_LOGICAL_2_06;
+} else if (strcmp(value, power8) == 0) {
+*max_compat = CPU_POWERPC_LOGICAL_2_07;
+} else {
+error_setg(errp, Invalid compatibility mode \%s\, value);
+}
+
+g_free(value);
+}
+
+static PropertyInfo powerpc_compat_propinfo = {
+.name = str,
+.legacy_name = powerpc-server-compat,
+.get = powerpc_get_compat,
+.set = powerpc_set_compat,
+};
+
+#define DEFINE_PROP_POWERPC_COMPAT(_n, _s, _f) \
+DEFINE_PROP(_n, _s, _f, powerpc_compat_propinfo, uint32_t)
+
+static Property powerpc_servercpu_properties[] = {
+DEFINE_PROP_POWERPC_COMPAT(compat, PowerPCCPU, max_compat),
+DEFINE_PROP_END_OF_LIST(),
+};
+
 static void init_proc_POWER7 (CPUPPCState *env)
 {
 gen_spr_ne_601(env);
@@ -7740,6 +7812,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 
 dc-fw_name = PowerPC,POWER7;
 dc-desc = POWER7;
+dc-props = powerpc_servercpu_properties;
 pcc-pvr = CPU_POWERPC_POWER7_BASE;
 pcc-pvr_mask = CPU_POWERPC_POWER7_MASK;
 pcc-init_proc = init_proc_POWER7;
@@ -7798,6 +7871,7 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
 
 

[Qemu-devel] [PATCH 4/9] target-ppc: Implement compat CPU option

2014-05-15 Thread Alexey Kardashevskiy
This adds basic support for the compat CPU option. By specifying
the compat property, the user can manually switch guest CPU mode from
raw to architected.

Since the actual compatibility mode is not implemented yet, this does
not change the existing behavior.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c  | 15 ++-
 target-ppc/cpu-models.h |  6 ++
 target-ppc/cpu-qom.h|  2 ++
 target-ppc/cpu.h|  3 +++
 target-ppc/translate_init.c | 35 +++
 5 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0f8bd95..61d0675 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -212,7 +212,8 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 
 CPU_FOREACH(cpu) {
 DeviceClass *dc = DEVICE_GET_CLASS(cpu);
-int index = ppc_get_vcpu_dt_id(POWERPC_CPU(cpu));
+PowerPCCPU *pcpu = POWERPC_CPU(cpu);
+int index = ppc_get_vcpu_dt_id(pcpu);
 uint32_t associativity[] = {cpu_to_be32(0x5),
 cpu_to_be32(0x0),
 cpu_to_be32(0x0),
@@ -249,6 +250,14 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 return ret;
 }
 
+if (pcpu-cpu_version) {
+ret = fdt_setprop(fdt, offset, cpu-version,
+  pcpu-cpu_version, sizeof(pcpu-cpu_version));
+if (ret  0) {
+return ret;
+}
+}
+
 /* Build interrupt servers and gservers properties */
 for (i = 0; i  smp_threads; i++) {
 servers_prop[i] = cpu_to_be32(index + i);
@@ -1278,6 +1287,10 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 kvmppc_set_papr(cpu);
 }
 
+if (cpu-max_compat) {
+ppc_set_compat(cpu, cpu-max_compat);
+}
+
 xics_cpu_setup(spapr-icp, cpu);
 
 qemu_register_reset(spapr_cpu_reset, cpu);
diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
index 33d2c51..45c0028 100644
--- a/target-ppc/cpu-models.h
+++ b/target-ppc/cpu-models.h
@@ -753,4 +753,10 @@ enum {
 POWERPC_SVR_8641D  = 0x80900121,
 };
 
+/* Processor Compatibility mask (PCR) */
+enum {
+POWERPC_ISA_COMPAT_2_05 = 0x02,
+POWERPC_ISA_COMPAT_2_06 = 0x04,
+};
+
 #endif
diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index 2a8515b..dfd1419 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -83,6 +83,7 @@ typedef struct PowerPCCPUClass {
  * @env: #CPUPPCState
  * @cpu_dt_id: CPU index used in the device tree. KVM uses this index too
  * @max_compat: Maximal supported logical PVR from the command line
+ * @cpu_version: Current logical PVR, zero if in raw mode
  *
  * A PowerPC CPU.
  */
@@ -94,6 +95,7 @@ struct PowerPCCPU {
 CPUPPCState env;
 int cpu_dt_id;
 uint32_t max_compat;
+uint32_t cpu_version;
 };
 
 static inline PowerPCCPU *ppc_env_get_cpu(CPUPPCState *env)
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index d498340..f61675a 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1123,6 +1123,8 @@ void ppc_store_msr (CPUPPCState *env, target_ulong value);
 
 void ppc_cpu_list (FILE *f, fprintf_function cpu_fprintf);
 
+void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version);
+
 /* Time-base and decrementer management */
 #ifndef NO_CPU_IO_DEFS
 uint64_t cpu_ppc_load_tbl (CPUPPCState *env);
@@ -1338,6 +1340,7 @@ static inline int cpu_mmu_index (CPUPPCState *env)
 #define SPR_LPCR  (0x13E)
 #define SPR_BOOKE_DVC2(0x13F)
 #define SPR_BOOKE_TSR (0x150)
+#define SPR_PCR   (0x152)
 #define SPR_BOOKE_TCR (0x154)
 #define SPR_BOOKE_TLB0PS  (0x158)
 #define SPR_BOOKE_TLB1PS  (0x159)
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 6f61b34..c4bd5de 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7803,6 +7803,15 @@ static void init_proc_POWER7 (CPUPPCState *env)
 /* Can't find information on what this should be on reset.  This
  * value is the one used by 74xx processors. */
 vscr_init(env, 0x0001);
+
+/*
+ * Register PCR to report POWERPC_EXCP_PRIV_REG instead of
+ * POWERPC_EXCP_INVAL_SPR.
+ */
+spr_register(env, SPR_PCR, PCR,
+ SPR_NOACCESS, SPR_NOACCESS,
+ SPR_NOACCESS, SPR_NOACCESS,
+ 0x);
 }
 
 POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
@@ -8880,6 +8889,32 @@ static void ppc_cpu_unrealizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+void ppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
+{
+CPUPPCState *env = cpu-env;
+
+cpu-cpu_version = cpu_version;
+
+/*
+ * Calculate PCR value from virtual PVR.
+ * TODO: support actual compatibility in TCG.
+ */
+switch (cpu_version) {
+case 

[Qemu-devel] [PATCH 5/9] target-ppc: Define Processor Compatibility Masks

2014-05-15 Thread Alexey Kardashevskiy
This introduces PCR mask for supported compatibility modes.
This will be used later by the ibm,client-architecture-support call.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 target-ppc/cpu-qom.h| 1 +
 target-ppc/translate_init.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h
index dfd1419..093f09a 100644
--- a/target-ppc/cpu-qom.h
+++ b/target-ppc/cpu-qom.h
@@ -57,6 +57,7 @@ typedef struct PowerPCCPUClass {
 
 uint32_t pvr;
 uint32_t pvr_mask;
+uint64_t pcr_mask;
 uint32_t svr;
 uint64_t insns_flags;
 uint64_t insns_flags2;
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index c4bd5de..1b72485 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -7824,6 +7824,7 @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data)
 dc-props = powerpc_servercpu_properties;
 pcc-pvr = CPU_POWERPC_POWER7_BASE;
 pcc-pvr_mask = CPU_POWERPC_POWER7_MASK;
+pcc-pcr_mask = POWERPC_ISA_COMPAT_2_05 | POWERPC_ISA_COMPAT_2_06;
 pcc-init_proc = init_proc_POWER7;
 pcc-check_pow = check_pow_nocheck;
 pcc-insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
@@ -7883,6 +7884,7 @@ POWERPC_FAMILY(POWER7P)(ObjectClass *oc, void *data)
 dc-props = powerpc_servercpu_properties;
 pcc-pvr = CPU_POWERPC_POWER7P_BASE;
 pcc-pvr_mask = CPU_POWERPC_POWER7P_MASK;
+pcc-pcr_mask = POWERPC_ISA_COMPAT_2_05 | POWERPC_ISA_COMPAT_2_06;
 pcc-init_proc = init_proc_POWER7;
 pcc-check_pow = check_pow_nocheck;
 pcc-insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
@@ -7954,6 +7956,7 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 dc-props = powerpc_servercpu_properties;
 pcc-pvr = CPU_POWERPC_POWER8_BASE;
 pcc-pvr_mask = CPU_POWERPC_POWER8_MASK;
+pcc-pcr_mask = POWERPC_ISA_COMPAT_2_05 | POWERPC_ISA_COMPAT_2_06;
 pcc-init_proc = init_proc_POWER8;
 pcc-check_pow = check_pow_nocheck;
 pcc-insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
-- 
1.9.rc0




[Qemu-devel] [PATCH 6/9] spapr: Add ibm, client-architecture-support call

2014-05-15 Thread Alexey Kardashevskiy
The PAPR+ specification defines a ibm,client-architecture-support (CAS)
RTAS call which purpose is to provide a negotiation mechanism for
the guest and the hypervisor to work out the best compatibility parameters.
During the negotiation process, the guest provides an array of various
options and capabilities which it supports, the hypervisor adjusts
the device tree and (optionally) reboots the guest.

At the moment the Linux guest calls CAS method at early boot so SLOF
gets called. SLOF allocates a memory buffer for the device tree changes
and calls a custom KVMPPC_H_CAS hypercall. QEMU parses the options,
composes a diff for the device tree, copies it to the buffer provided
by SLOF and returns to SLOF. SLOF updates the device tree and returns
control to the guest kernel. Only then the Linux guest parses the device
tree so it is possible to avoid unnecessary reboot in most cases.

The device tree diff is a header with the update format version
(defined as 0 in this patch) followed by a device tree with the properties
which require update.

If QEMU detects during the option list parsing that it has to reboot
the guest, it silently does so as the guest expects reboot to happen as
pHyp firmware does it almost all the time.

This defines custom KVMPPC_H_CAS hypercall. The current SLOF already
has support for it.

This implements stub which returns empty tree to the guest.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c | 26 ++
 hw/ppc/spapr_hcall.c   | 21 +
 include/hw/ppc/spapr.h |  9 -
 trace-events   |  4 
 4 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 61d0675..cf53a7a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -53,6 +53,7 @@
 #include hw/usb.h
 #include qemu/config-file.h
 #include qemu/error-report.h
+#include trace.h
 
 #include libfdt.h
 
@@ -562,6 +563,31 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 return fdt;
 }
 
+int spapr_h_cas_compose_response(target_ulong addr, target_ulong size)
+{
+void *fdt;
+sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
+
+size -= sizeof(hdr);
+
+fdt = g_malloc0(size);
+_FDT((fdt_create(fdt, size)));
+
+_FDT((fdt_finish(fdt)));
+
+if (fdt_totalsize(fdt) + sizeof(hdr)  size) {
+trace_spapr_cas_failed(size);
+return -1;
+}
+
+cpu_physical_memory_write(addr, hdr, sizeof(hdr));
+cpu_physical_memory_write(addr + sizeof(hdr), fdt, fdt_totalsize(fdt));
+trace_spapr_cas_continue(fdt_totalsize(fdt) + sizeof(hdr));
+g_free(fdt);
+
+return 0;
+}
+
 static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
 {
 uint32_t associativity[] = {cpu_to_be32(0x4), cpu_to_be32(0x0),
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 0bae053..2f6aa5c 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -752,6 +752,24 @@ out:
 return ret;
 }
 
+static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
+  sPAPREnvironment *spapr,
+  target_ulong opcode,
+  target_ulong *args)
+{
+target_ulong list = args[0];
+
+if (!list) {
+return H_SUCCESS;
+}
+
+if (spapr_h_cas_compose_response(args[1], args[2])) {
+qemu_system_reset_request();
+}
+
+return H_SUCCESS;
+}
+
 static spapr_hcall_fn papr_hypercall_table[(MAX_HCALL_OPCODE / 4) + 1];
 static spapr_hcall_fn kvmppc_hypercall_table[KVMPPC_HCALL_MAX - 
KVMPPC_HCALL_BASE + 1];
 
@@ -831,6 +849,9 @@ static void hypercall_register_types(void)
 spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);
 
 spapr_register_hypercall(H_SET_MODE, h_set_mode);
+
+/* ibm,client-architecture-support support */
+spapr_register_hypercall(KVMPPC_H_CAS, h_client_architecture_support);
 }
 
 type_init(hypercall_register_types)
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5fdac1e..dd6d97a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -302,10 +302,16 @@ typedef struct sPAPREnvironment {
 #define KVMPPC_HCALL_BASE   0xf000
 #define KVMPPC_H_RTAS   (KVMPPC_HCALL_BASE + 0x0)
 #define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
-#define KVMPPC_HCALL_MAXKVMPPC_H_LOGICAL_MEMOP
+/* Client Architecture support */
+#define KVMPPC_H_CAS(KVMPPC_HCALL_BASE + 0x2)
+#define KVMPPC_HCALL_MAXKVMPPC_H_CAS
 
 extern sPAPREnvironment *spapr;
 
+typedef struct sPAPRDeviceTreeUpdateHeader {
+uint32_t version_id;
+} sPAPRDeviceTreeUpdateHeader;
+
 /*#define DEBUG_SPAPR_HCALLS*/
 
 #ifdef DEBUG_SPAPR_HCALLS
@@ -401,6 +407,7 @@ struct sPAPRTCETable {
 
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
+int spapr_h_cas_compose_response(target_ulong addr, 

[Qemu-devel] [PATCH 0/9] spapr: Enable ibm, client-architecture-support

2014-05-15 Thread Alexey Kardashevskiy
This enables a ibm,client-architecture-support RTAS call.

This allows older distros (such as SLES11 or RHEL6) to work on modern
POWERPC hardware (such as POWER8) in architected mode.

The previous try was RFC, so this is v1.

The very first patch here is for the reference, it is already  on its way to
upstream.

Please comment. Thanks!


Alexey Kardashevskiy (9):
  kvm: add set_one_reg/get_one_reg helpers
  target-ppc: Add compat CPU option
  spapr: Move server# property out of skeleton fdt
  target-ppc: Implement compat CPU option
  target-ppc: Define Processor Compatibility Masks
  spapr: Add ibm,client-architecture-support call
  spapr: Limit threads per core according to current compatibility mode
  spapr: Implement processor compatibility in
ibm,client-architecture-support
  KVM: PPC: Enable compatibility mode

 hw/ppc/spapr.c  | 149 +++-
 hw/ppc/spapr_hcall.c| 108 
 include/hw/ppc/spapr.h  |   9 ++-
 include/sysemu/kvm.h|  21 +++
 kvm-all.c   |  18 ++
 target-ppc/cpu-models.h |  17 +
 target-ppc/cpu-qom.h|   5 ++
 target-ppc/cpu.h|   3 +
 target-ppc/kvm.c|   5 ++
 target-ppc/kvm_ppc.h|   6 ++
 target-ppc/translate_init.c | 113 +
 trace-events|   9 +++
 12 files changed, 446 insertions(+), 17 deletions(-)

-- 
1.9.rc0




[Qemu-devel] [PATCH 9/9] KVM: PPC: Enable compatibility mode

2014-05-15 Thread Alexey Kardashevskiy
The host kernel implements a KVM_REG_PPC_ARCH_COMPAT register which
this uses to enable a compatibility mode if any chosen.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c   | 6 ++
 hw/ppc/spapr_hcall.c | 4 
 target-ppc/kvm.c | 5 +
 target-ppc/kvm_ppc.h | 6 ++
 4 files changed, 21 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a0882a1..f89be10 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1376,6 +1376,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 /* Tell KVM that we're in PAPR mode */
 if (kvm_enabled()) {
 kvmppc_set_papr(cpu);
+
+if (cpu-max_compat 
+kvmppc_set_compat(cpu, cpu-max_compat)  0) {
+fprintf(stderr, Unable to set compatibility mode\n);
+exit(1);
+}
 }
 
 if (cpu-max_compat) {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index cb815c3..2ab21d3 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -834,6 +834,10 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu_,
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 
+if (kvmppc_set_compat(cpu, cpu_version)  0) {
+fprintf(stderr, Unable to set compatibility mode\n);
+return H_HARDWARE;
+}
 ppc_set_compat(cpu, cpu_version);
 }
 }
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index ff319fc..f8e8453 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1513,6 +1513,11 @@ void kvmppc_set_papr(PowerPCCPU *cpu)
 cap_papr = 1;
 }
 
+int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
+{
+return kvm_set_one_reg(CPU(cpu), KVM_REG_PPC_ARCH_COMPAT, cpu_version);
+}
+
 void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
 {
 CPUState *cs = CPU(cpu);
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index ff077ec..716c33d 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -23,6 +23,7 @@ int kvmppc_get_hasidle(CPUPPCState *env);
 int kvmppc_get_hypercall(CPUPPCState *env, uint8_t *buf, int buf_len);
 int kvmppc_set_interrupt(PowerPCCPU *cpu, int irq, int level);
 void kvmppc_set_papr(PowerPCCPU *cpu);
+int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version);
 void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
 int kvmppc_smt_threads(void);
 int kvmppc_clear_tsr_bits(PowerPCCPU *cpu, uint32_t tsr_bits);
@@ -95,6 +96,11 @@ static inline void kvmppc_set_papr(PowerPCCPU *cpu)
 {
 }
 
+static inline int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t cpu_version)
+{
+return 0;
+}
+
 static inline void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
 {
 }
-- 
1.9.rc0




[Qemu-devel] [PATCH 3/9] spapr: Move server# property out of skeleton fdt

2014-05-15 Thread Alexey Kardashevskiy
The upcoming support of the ibm,client-architecture-support
reconfiguration method will be able to reduce the number of threads per
core so the server# and gserver# device tree properties are not parts
of the FDT skeleton anymore.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 166c1c6..0f8bd95 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -219,6 +219,9 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 cpu_to_be32(0x0),
 cpu_to_be32(cpu-numa_node),
 cpu_to_be32(index)};
+uint32_t servers_prop[smp_threads];
+uint32_t gservers_prop[smp_threads * 2];
+int i;
 
 if ((index % smt) != 0) {
 continue;
@@ -245,6 +248,24 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 if (ret  0) {
 return ret;
 }
+
+/* Build interrupt servers and gservers properties */
+for (i = 0; i  smp_threads; i++) {
+servers_prop[i] = cpu_to_be32(index + i);
+/* Hack, direct the group queues back to cpu 0 */
+gservers_prop[i*2] = cpu_to_be32(index + i);
+gservers_prop[i*2 + 1] = 0;
+}
+ret = fdt_setprop(fdt, offset, ibm,ppc-interrupt-server#s,
+   servers_prop, sizeof(servers_prop));
+if (ret  0) {
+return ret;
+}
+ret = fdt_setprop(fdt, offset, ibm,ppc-interrupt-gserver#s,
+  gservers_prop, sizeof(gservers_prop));
+if (ret  0) {
+return ret;
+}
 }
 return ret;
 }
@@ -311,7 +332,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 char qemu_hypertas_prop[] = hcall-memop1;
 uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
 uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(smp_cpus)};
-int i, smt = kvmppc_smt_threads();
+int smt = kvmppc_smt_threads();
 unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
 QemuOpts *opts = qemu_opts_find(qemu_find_opts(smp-opts), NULL);
 unsigned sockets = opts ? qemu_opt_get_number(opts, sockets, 0) : 0;
@@ -378,8 +399,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 DeviceClass *dc = DEVICE_GET_CLASS(cs);
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
 int index = ppc_get_vcpu_dt_id(cpu);
-uint32_t servers_prop[smp_threads];
-uint32_t gservers_prop[smp_threads * 2];
 char *nodename;
 uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
0x, 0x};
@@ -428,18 +447,6 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 _FDT((fdt_property_string(fdt, status, okay)));
 _FDT((fdt_property(fdt, 64-bit, NULL, 0)));
 
-/* Build interrupt servers and gservers properties */
-for (i = 0; i  smp_threads; i++) {
-servers_prop[i] = cpu_to_be32(index + i);
-/* Hack, direct the group queues back to cpu 0 */
-gservers_prop[i*2] = cpu_to_be32(index + i);
-gservers_prop[i*2 + 1] = 0;
-}
-_FDT((fdt_property(fdt, ibm,ppc-interrupt-server#s,
-   servers_prop, sizeof(servers_prop;
-_FDT((fdt_property(fdt, ibm,ppc-interrupt-gserver#s,
-   gservers_prop, sizeof(gservers_prop;
-
 if (env-spr_cb[SPR_PURR].oea_read) {
 _FDT((fdt_property(fdt, ibm,purr, NULL, 0)));
 }
-- 
1.9.rc0




[Qemu-devel] [PATCH 8/9] spapr: Implement processor compatibility in ibm, client-architecture-support

2014-05-15 Thread Alexey Kardashevskiy
Modern Linux kernels support last POWERPC CPUs so when a kernel boots,
in most cases it can find a matching cpu_spec in the kernel's cpu_specs
list. However if the kernel is quite old, it may be missing a definition
of the actual CPU. To provide an ability for old kernels to work on modern
hardware, a Processor Compatibility Mode has been introduced
by the PowerISA specification.

From the hardware prospective, it is supported by the Processor
Compatibility Register (PCR) which is defined in PowerISA. The register
enables one of the compatibility modes (2.05/2.06/2.07).
Since PCR is a hypervisor privileged register and cannot be
accessed from the guest, the mode selection is done via
ibm,client-architecture-support (CAS) RTAS call using which the guest
specifies what raw and architected CPU versions it supports.
QEMU works out the best match, changes a cpu-version property of
every CPU and notifies the guest about the change by setting these
properties in the buffer passed as a response on a custom H_CAS hypercall.

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c   | 40 +
 hw/ppc/spapr_hcall.c | 83 
 trace-events |  5 
 3 files changed, 128 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a2c9106..a0882a1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -35,6 +35,7 @@
 #include kvm_ppc.h
 #include mmu-hash64.h
 #include cpu-models.h
+#include qom/cpu.h
 
 #include hw/boards.h
 #include hw/ppc/ppc.h
@@ -592,11 +593,50 @@ int spapr_h_cas_compose_response(target_ulong addr, 
target_ulong size)
 {
 void *fdt;
 sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
+CPUState *cs;
+int smt = kvmppc_smt_threads();
 
 size -= sizeof(hdr);
 
 fdt = g_malloc0(size);
 _FDT((fdt_create(fdt, size)));
+_FDT((fdt_begin_node(fdt, cpus)));
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+DeviceClass *dc = DEVICE_GET_CLASS(cpu);
+int smpt = spapr_get_compat_smp_threads(cpu);
+int i, index = ppc_get_vcpu_dt_id(cpu);
+uint32_t servers_prop[smpt];
+uint32_t gservers_prop[smpt * 2];
+char tmp[32];
+
+if ((index % smt) != 0) {
+continue;
+}
+
+snprintf(tmp, 32, %s@%x, dc-fw_name, index);
+trace_spapr_cas_add_subnode(tmp);
+
+_FDT((fdt_begin_node(fdt, tmp)));
+_FDT((fdt_property_cell(fdt, cpu-version, cpu-cpu_version)));
+
+/* Build interrupt servers and gservers properties */
+for (i = 0; i  smpt; i++) {
+servers_prop[i] = cpu_to_be32(index + i);
+/* Hack, direct the group queues back to cpu 0 */
+gservers_prop[i*2] = cpu_to_be32(index + i);
+gservers_prop[i*2 + 1] = 0;
+}
+_FDT((fdt_property(fdt, ibm,ppc-interrupt-server#s,
+   servers_prop, sizeof(servers_prop;
+_FDT((fdt_property(fdt, ibm,ppc-interrupt-gserver#s,
+   gservers_prop, sizeof(gservers_prop;
+
+_FDT((fdt_end_node(fdt)));
+}
+
+_FDT((fdt_end_node(fdt)));
 
 _FDT((fdt_finish(fdt)));
 
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 2f6aa5c..cb815c3 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -3,6 +3,9 @@
 #include helper_regs.h
 #include hw/ppc/spapr.h
 #include mmu-hash64.h
+#include cpu-models.h
+#include trace.h
+#include kvm_ppc.h
 
 struct SPRSyncState {
 CPUState *cs;
@@ -752,12 +755,92 @@ out:
 return ret;
 }
 
+#define get_compat_level(cpuver) ( \
+((cpuver) == CPU_POWERPC_LOGICAL_2_05) ? 2050 : \
+((cpuver) == CPU_POWERPC_LOGICAL_2_06) ? 2060 : \
+((cpuver) == CPU_POWERPC_LOGICAL_2_06_PLUS) ? 2061 : \
+((cpuver) == CPU_POWERPC_LOGICAL_2_07) ? 2070 : 0)
+
 static target_ulong h_client_architecture_support(PowerPCCPU *cpu_,
   sPAPREnvironment *spapr,
   target_ulong opcode,
   target_ulong *args)
 {
 target_ulong list = args[0];
+PowerPCCPUClass *pcc_ = POWERPC_CPU_GET_CLASS(cpu_);
+CPUState *cs;
+CPUState *cs_ = CPU(cpu_);
+bool cpu_match = false;
+unsigned old_cpu_version = cpu_-cpu_version;
+unsigned compat_lvl = 0, cpu_version = 0;
+unsigned max_lvl = get_compat_level(cpu_-max_compat);
+
+/* Parse PVR list */
+for ( ; ; ) {
+uint32_t pvr, pvr_mask;
+
+pvr_mask = ldl_phys(cs_-as, list);
+list += 4;
+pvr = ldl_phys(cs_-as, list);
+list += 4;
+
+trace_spapr_cas_pvr_try(pvr);
+if (!max_lvl 
+((cpu_-env.spr[SPR_PVR]  pvr_mask) == (pvr  pvr_mask))) {
+cpu_match = true;
+cpu_version = 0;
+} else if (pvr == cpu_-cpu_version) {
+cpu_match = true;
+

[Qemu-devel] [PATCH 7/9] spapr: Limit threads per core according to current compatibility mode

2014-05-15 Thread Alexey Kardashevskiy
This puts a limit to the number of threads per core based on the current
compatibility mode. Although PowerISA specs do not specify the maximum
threads per core number, the linux guest still expects that
PowerISA2.05-compatible CPU supports only 2 threads per core as this
is what POWER6 (2.05 compliant CPU) implements, same is true for
POWER7 (2.06, 4 threads) and POWER8 (2.07, 8 threads).

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/ppc/spapr.c | 31 ---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index cf53a7a..a2c9106 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -34,6 +34,7 @@
 #include sysemu/kvm.h
 #include kvm_ppc.h
 #include mmu-hash64.h
+#include cpu-models.h
 
 #include hw/boards.h
 #include hw/ppc/ppc.h
@@ -203,6 +204,29 @@ static XICSState *xics_system_init(int nr_servers, int 
nr_irqs)
 return icp;
 }
 
+static int spapr_get_compat_smp_threads(PowerPCCPU *cpu)
+{
+int ret = -1;
+
+switch (cpu-cpu_version) {
+case CPU_POWERPC_LOGICAL_2_05:
+ret = 2;
+break;
+case CPU_POWERPC_LOGICAL_2_06:
+ret = 4;
+break;
+case CPU_POWERPC_LOGICAL_2_07:
+ret = 8;
+break;
+default:
+ret = smp_threads;
+break;
+}
+ret = MIN(ret, smp_threads);
+
+return ret;
+}
+
 static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 {
 int ret = 0, offset;
@@ -221,8 +245,9 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 cpu_to_be32(0x0),
 cpu_to_be32(cpu-numa_node),
 cpu_to_be32(index)};
-uint32_t servers_prop[smp_threads];
-uint32_t gservers_prop[smp_threads * 2];
+int smpt = spapr_get_compat_smp_threads(pcpu);
+uint32_t servers_prop[smpt];
+uint32_t gservers_prop[smpt * 2];
 int i;
 
 if ((index % smt) != 0) {
@@ -260,7 +285,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment 
*spapr)
 }
 
 /* Build interrupt servers and gservers properties */
-for (i = 0; i  smp_threads; i++) {
+for (i = 0; i  smpt; i++) {
 servers_prop[i] = cpu_to_be32(index + i);
 /* Hack, direct the group queues back to cpu 0 */
 gservers_prop[i*2] = cpu_to_be32(index + i);
-- 
1.9.rc0




Re: [Qemu-devel] [PATCH] migration: static variables will not be reset at second migration

2014-05-15 Thread Gonglei (Arei)
Hi, Juan

Ping...

Maybe you forgot to pull this patch, right? Thanks.



Best regards,
-Gonglei


 -Original Message-
 From: Juan Quintela [mailto:quint...@redhat.com]
 Sent: Friday, March 21, 2014 9:26 PM
 To: Gonglei (Arei)
 Cc: qemu-devel@nongnu.org; owass...@redhat.com; pbonz...@redhat.com;
 ebl...@redhat.com; dgilb...@redhat.com; chenliang (T)
 Subject: Re: [PATCH] migration: static variables will not be reset at second
 migration
 
 arei.gong...@huawei.com wrote:
  From: ChenLiang chenlian...@huawei.com
 
  The static variables in migration_bitmap_sync will not be reset in
  the case of a second attempted migration.
 
  Signed-off-by: ChenLiang chenlian...@huawei.com
  Signed-off-by: Gonglei arei.gong...@huawei.com
 
 Good catch.  Applied..
 
  ---
   arch_init.c | 15 ---
   1 file changed, 12 insertions(+), 3 deletions(-)
 
  diff --git a/arch_init.c b/arch_init.c
  index 60c975d..10516cb 100644
  --- a/arch_init.c
  +++ b/arch_init.c
  @@ -468,15 +468,23 @@ static void
 migration_bitmap_sync_range(ram_addr_t start, ram_addr_t length)
 
 
   /* Needs iothread lock! */
  +/* Fix me: there are too many global variables used in migration process. 
  */
  +static int64_t start_time;
  +static int64_t bytes_xfer_prev;
  +static int64_t num_dirty_pages_period;
  +
  +static void migration_bitmap_sync_init(void)
  +{
  +start_time = 0;
  +bytes_xfer_prev = 0;
  +num_dirty_pages_period = 0;
  +}
 
   static void migration_bitmap_sync(void)
   {
   RAMBlock *block;
   uint64_t num_dirty_pages_init = migration_dirty_pages;
   MigrationState *s = migrate_get_current();
  -static int64_t start_time;
  -static int64_t bytes_xfer_prev;
  -static int64_t num_dirty_pages_period;
   int64_t end_time;
   int64_t bytes_xfer_now;
 
  @@ -733,6 +741,7 @@ static int ram_save_setup(QEMUFile *f, void
 *opaque)
   migration_dirty_pages = ram_pages;
   mig_throttle_on = false;
   dirty_rate_high_cnt = 0;
  +migration_bitmap_sync_init();
 
   if (migrate_use_xbzrle()) {
   qemu_mutex_lock_iothread();



Re: [Qemu-devel] [PATCH 3/4] curl: Add sslverify option

2014-05-15 Thread Kevin Wolf
Am 15.05.2014 um 01:28 hat Matthew Booth geschrieben:
 This allows qemu to use images over https with a self-signed certificate. It
 defaults to verifying the certificate.
 
 Signed-off-by: Matthew Booth mbo...@redhat.com
 ---
  block/curl.c | 11 +++
  1 file changed, 11 insertions(+)
 
 diff --git a/block/curl.c b/block/curl.c
 index 1b9f2f2..43d6646 100644
 --- a/block/curl.c
 +++ b/block/curl.c
 @@ -23,6 +23,7 @@
   */
  #include qemu-common.h
  #include block/block_int.h
 +#include qapi/qmp/qbool.h
  #include curl/curl.h
  
  // #define DEBUG
 @@ -69,6 +70,7 @@ static CURLMcode __curl_multi_socket_action(CURLM 
 *multi_handle,
  
  #define CURL_BLOCK_OPT_URL   url
  #define CURL_BLOCK_OPT_READAHEAD readahead
 +#define CURL_BLOCK_OPT_SSLVERIFY sslverify
  
  struct BDRVCURLState;
  
 @@ -106,6 +108,7 @@ typedef struct BDRVCURLState {
  CURLState states[CURL_NUM_STATES];
  char *url;
  size_t readahead_size;
 +bool sslverify;
  bool accept_range;
  } BDRVCURLState;
  
 @@ -372,6 +375,7 @@ static CURLState *curl_init_state(BDRVCURLState *s)
  return NULL;
  }
  curl_easy_setopt(state-curl, CURLOPT_URL, s-url);
 +curl_easy_setopt(state-curl, CURLOPT_SSL_VERIFYPEER, s-sslverify);

block/curl.c: In function 'curl_init_state':
block/curl.c:378:1043: error: call to '_curl_easy_setopt_err_long' declared 
with attribute warning: curl_easy_setopt expects a long argument for this 
option [-Werror]
 curl_easy_setopt(state-curl, CURLOPT_SSL_VERIFYPEER, s-sslverify);

Squashing in this hunk:

--- a/block/curl.c
+++ b/block/curl.c
@@ -375,7 +375,8 @@ static CURLState *curl_init_state(BDRVCURLState *s)
 return NULL;
 }
 curl_easy_setopt(state-curl, CURLOPT_URL, s-url);
-curl_easy_setopt(state-curl, CURLOPT_SSL_VERIFYPEER, s-sslverify);
+curl_easy_setopt(state-curl, CURLOPT_SSL_VERIFYPEER,
+ (long) s-sslverify);
 curl_easy_setopt(state-curl, CURLOPT_TIMEOUT, 5);
 curl_easy_setopt(state-curl, CURLOPT_WRITEFUNCTION,
  (void *)curl_read_cb);

Kevin



[Qemu-devel] [PATCH] Revert iotests: Use configured python

2014-05-15 Thread Peter Maydell
This reverts commit f915db07ef9c368ea6db6430256de064fdd1525f.

This commit is broken because it does not account for the
build tree and the source tree being different, and can cause
build failures for out-of-tree builds. Revert it until we can
identify a better solution to the problem.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
I have an increasingly large queue of pull requests which I'm
currently not processing pending a fix to this build breakage.
Reverting the commit seems the simplest solution to unblocking
this... Barring objections I'm going to commit this to master
this afternoon.

 configure|  6 --
 tests/qemu-iotests/031   |  9 -
 tests/qemu-iotests/036   |  7 +++
 tests/qemu-iotests/039   | 19 +--
 tests/qemu-iotests/054   |  3 +--
 tests/qemu-iotests/060   | 21 ++---
 tests/qemu-iotests/061   | 25 -
 tests/qemu-iotests/065   |  2 +-
 tests/qemu-iotests/083   |  3 +--
 tests/qemu-iotests/check | 18 ++
 10 files changed, 43 insertions(+), 70 deletions(-)

diff --git a/configure b/configure
index 6adfa72..e565e59 100755
--- a/configure
+++ b/configure
@@ -4768,12 +4768,6 @@ if test $gcov = yes ; then
   echo GCOV=$gcov_tool  $config_host_mak
 fi
 
-iotests_common_env=tests/qemu-iotests/common.env
-
-echo # Automatically generated by configure - do not modify  
$iotests_common_env
-echo  $iotests_common_env
-echo PYTHON='$python'  $iotests_common_env
-
 # use included Linux headers
 if test $linux = yes ; then
   mkdir -p linux-headers
diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
index 5aefb88..1d920ea 100755
--- a/tests/qemu-iotests/031
+++ b/tests/qemu-iotests/031
@@ -35,7 +35,6 @@ _cleanup()
 trap _cleanup; exit \$status 0 1 2 3 15
 
 # get standard environment, filters and checks
-. ./common.env
 . ./common.rc
 . ./common.filter
 . ./common.pattern
@@ -57,22 +56,22 @@ for IMGOPTS in compat=0.10 compat=1.1; do
 echo === Create image with unknown header extension ===
 echo
 _make_test_img 64M
-$PYTHON qcow2.py $TEST_IMG add-header-ext 0x12345678 This is a test 
header extension
-$PYTHON qcow2.py $TEST_IMG dump-header
+./qcow2.py $TEST_IMG add-header-ext 0x12345678 This is a test header 
extension
+./qcow2.py $TEST_IMG dump-header
 _check_test_img
 
 echo
 echo === Rewrite header with no backing file ===
 echo
 $QEMU_IMG rebase -u -b  $TEST_IMG
-$PYTHON qcow2.py $TEST_IMG dump-header
+./qcow2.py $TEST_IMG dump-header
 _check_test_img
 
 echo
 echo === Add a backing file and format ===
 echo
 $QEMU_IMG rebase -u -b /some/backing/file/path -F host_device $TEST_IMG
-$PYTHON qcow2.py $TEST_IMG dump-header
+./qcow2.py $TEST_IMG dump-header
 done
 
 # success, all done
diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
index 29c35d1..03b6aa9 100755
--- a/tests/qemu-iotests/036
+++ b/tests/qemu-iotests/036
@@ -38,7 +38,6 @@ _cleanup()
 trap _cleanup; exit \$status 0 1 2 3 15
 
 # get standard environment, filters and checks
-. ./common.env
 . ./common.rc
 . ./common.filter
 . ./common.pattern
@@ -54,15 +53,15 @@ IMGOPTS=compat=1.1
 echo === Create image with unknown autoclear feature bit ===
 echo
 _make_test_img 64M
-$PYTHON qcow2.py $TEST_IMG set-feature-bit autoclear 63
-$PYTHON qcow2.py $TEST_IMG dump-header
+./qcow2.py $TEST_IMG set-feature-bit autoclear 63
+./qcow2.py $TEST_IMG dump-header
 
 echo
 echo === Repair image ===
 echo
 _check_test_img -r all
 
-$PYTHON qcow2.py $TEST_IMG dump-header
+./qcow2.py $TEST_IMG dump-header
 
 # success, all done
 echo *** done
diff --git a/tests/qemu-iotests/039 b/tests/qemu-iotests/039
index b7b7030..b9cbe99 100755
--- a/tests/qemu-iotests/039
+++ b/tests/qemu-iotests/039
@@ -38,7 +38,6 @@ _cleanup()
 trap _cleanup; exit \$status 0 1 2 3 15
 
 # get standard environment, filters and checks
-. ./common.env
 . ./common.rc
 . ./common.filter
 
@@ -59,7 +58,7 @@ _make_test_img $size
 $QEMU_IO -c write -P 0x5a 0 512 $TEST_IMG | _filter_qemu_io
 
 # The dirty bit must not be set
-$PYTHON qcow2.py $TEST_IMG dump-header | grep incompatible_features
+./qcow2.py $TEST_IMG dump-header | grep incompatible_features
 _check_test_img
 
 echo
@@ -74,7 +73,7 @@ $QEMU_IO -c write -P 0x5a 0 512 -c abort $TEST_IMG | 
_filter_qemu_io
 ulimit -c $old_ulimit
 
 # The dirty bit must be set
-$PYTHON qcow2.py $TEST_IMG dump-header | grep incompatible_features
+./qcow2.py $TEST_IMG dump-header | grep incompatible_features
 _check_test_img
 
 echo
@@ -83,7 +82,7 @@ echo == Read-only access must still work ==
 $QEMU_IO -r -c read -P 0x5a 0 512 $TEST_IMG | _filter_qemu_io
 
 # The dirty bit must be set
-$PYTHON qcow2.py $TEST_IMG dump-header | grep incompatible_features
+./qcow2.py $TEST_IMG dump-header | grep incompatible_features
 
 echo
 echo == Repairing the image file must succeed ==
@@ -91,7 +90,7 @@ echo == Repairing the image file must succeed ==
 

Re: [Qemu-devel] [PATCH 1/4] curl: Fix build when curl_multi_socket_action isn't available

2014-05-15 Thread Kevin Wolf
Am 15.05.2014 um 01:28 hat Matthew Booth geschrieben:
 Signed-off-by: Matthew Booth mbo...@redhat.com
 ---
  block/curl.c | 15 +++
  1 file changed, 15 insertions(+)

Thanks, applied all to the block branch (with patch 3 fixed as commented
there).

Please don't forget --cover-letter and --subject-prefix=PATCH v2 next
time, that makes it easier to reply to the whole series and to keep
track of the current version.

Kevin



Re: [Qemu-devel] [PATCH 3/5] block: make 'top' argument to block-commit optional

2014-05-15 Thread Benoît Canet
The Wednesday 14 May 2014 à 23:20:17 (-0400), Jeff Cody wrote :
 Now that active layer block-commit is supported, the 'top' argument
 no longer needs to be mandatory.
 
 Change it optional, with the default being the active layer in the

Do you mean Change it to optional or Make it optional ? 

 device chain.
 
 Signed-off-by: Jeff Cody jc...@redhat.com
 ---
  blockdev.c |  3 ++-
  qapi-schema.json   |  7 ---
  qmp-commands.hx|  5 +++--
  tests/qemu-iotests/040 | 28 ++--
  4 files changed, 27 insertions(+), 16 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index 500707e..02c6525 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -1868,7 +1868,8 @@ void qmp_block_stream(const char *device, bool has_base,
  }
  
  void qmp_block_commit(const char *device,
 -  bool has_base, const char *base, const char *top,
 +  bool has_base, const char *base,
 +  bool has_top, const char *top,
bool has_speed, int64_t speed,
Error **errp)
  {
 diff --git a/qapi-schema.json b/qapi-schema.json
 index 36cb964..06a9b5d 100644
 --- a/qapi-schema.json
 +++ b/qapi-schema.json
 @@ -2099,8 +2099,9 @@
  # @base:   #optional The file name of the backing image to write data into.
  #If not specified, this is the deepest backing image
  #
 -# @top:  The file name of the backing image within the image 
 chain,
 -#which contains the topmost data to be committed down.
 +# @top:#optional The file name of the backing image within the image 
 chain,
 +#which contains the topmost data to be committed down. If
 +#not specified, this is the active layer.
  #
  #If top == base, that is an error.
  #If top == active, the job will not be completed by 
 itself,
 @@ -2128,7 +2129,7 @@
  #
  ##
  { 'command': 'block-commit',
 -  'data': { 'device': 'str', '*base': 'str', 'top': 'str',
 +  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
  '*speed': 'int' } }
  
  ##
 diff --git a/qmp-commands.hx b/qmp-commands.hx
 index cae890e..1aa3c65 100644
 --- a/qmp-commands.hx
 +++ b/qmp-commands.hx
 @@ -985,7 +985,7 @@ EQMP
  
  {
  .name   = block-commit,
 -.args_type  = device:B,base:s?,top:s,speed:o?,
 +.args_type  = device:B,base:s?,top:s?,speed:o?,
  .mhandler.cmd_new = qmp_marshal_input_block_commit,
  },
  
 @@ -1003,7 +1003,8 @@ Arguments:
If not specified, this is the deepest backing image
(json-string, optional)
  - top:  The file name of the backing image within the image chain,
 -  which contains the topmost data to be committed down.
 +  which contains the topmost data to be committed down. If
 +  not specified, this is the active layer. (json-string, optional)
  
If top == base, that is an error.
If top == active, the job will not be completed by itself,
 diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
 index 734b6a6..803b0c7 100755
 --- a/tests/qemu-iotests/040
 +++ b/tests/qemu-iotests/040
 @@ -35,11 +35,7 @@ test_img = os.path.join(iotests.test_dir, 'test.img')
  class ImageCommitTestCase(iotests.QMPTestCase):
  '''Abstract base class for image commit test cases'''
  
 -def run_commit_test(self, top, base):
 -self.assert_no_active_block_jobs()
 -result = self.vm.qmp('block-commit', device='drive0', top=top, 
 base=base)
 -self.assert_qmp(result, 'return', {})
 -
 +def wait_for_complete(self):
  completed = False
  while not completed:
  for event in self.vm.get_qmp_events(wait=True):
 @@ -58,6 +54,18 @@ class ImageCommitTestCase(iotests.QMPTestCase):
  self.assert_no_active_block_jobs()
  self.vm.shutdown()
  
 +def run_commit_test(self, top, base):
 +self.assert_no_active_block_jobs()
 +result = self.vm.qmp('block-commit', device='drive0', top=top, 
 base=base)
 +self.assert_qmp(result, 'return', {})
 +self.wait_for_complete()
 +
 +def run_default_commit_test(self):
 +self.assert_no_active_block_jobs()
 +result = self.vm.qmp('block-commit', device='drive0')
 +self.assert_qmp(result, 'return', {})
 +self.wait_for_complete()
 +
  class TestSingleDrive(ImageCommitTestCase):
  image_len = 1 * 1024 * 1024
  test_len = 1 * 1024 * 256
 @@ -109,17 +117,17 @@ class TestSingleDrive(ImageCommitTestCase):
  self.assertEqual(-1, qemu_io('-c', 'read -P 0xab 0 524288', 
 backing_img).find(verification failed))
  self.assertEqual(-1, qemu_io('-c', 'read -P 0xef 524288 524288', 
 backing_img).find(verification failed))
  
 +def test_top_is_default_active(self):
 +self.run_default_commit_test()
 +self.assertEqual(-1, 

Re: [Qemu-devel] [PATCH 2/5] block: add helper function to determine if a BDS is in a chain

2014-05-15 Thread Benoît Canet
The Wednesday 14 May 2014 à 23:20:16 (-0400), Jeff Cody wrote :
 This is a small helper function, to determine if 'base' is in the
 chain of BlockDriverState 'top'.  It returns true if it is in the chain,
 and false otherwise.
 
 If either argument is NULL, it will also return false.
 
 Signed-off-by: Jeff Cody jc...@redhat.com
 ---
  block.c   | 9 +
  include/block/block.h | 1 +
  2 files changed, 10 insertions(+)
 
 diff --git a/block.c b/block.c
 index 81945d3..c4f77c2 100644
 --- a/block.c
 +++ b/block.c
 @@ -3734,6 +3734,15 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
  return NULL;
  }
  
 +bool bdrv_is_in_chain(BlockDriverState *top, BlockDriverState *base)
 +{
 +while (top  top != base) {
 +top = top-backing_hd;
 +}
 +
 +return top != NULL;
 +}
 +
  BlockDriverState *bdrv_next(BlockDriverState *bs)
  {
  if (!bs) {
 diff --git a/include/block/block.h b/include/block/block.h
 index 1b119aa..283a6f3 100644
 --- a/include/block/block.h
 +++ b/include/block/block.h
 @@ -381,6 +381,7 @@ BlockDeviceInfoList *bdrv_named_nodes_list(void);
  BlockDriverState *bdrv_lookup_bs(const char *device,
   const char *node_name,
   Error **errp);
 +bool bdrv_is_in_chain(BlockDriverState *top, BlockDriverState *base);
  BlockDriverState *bdrv_next(BlockDriverState *bs);
  void bdrv_iterate(void (*it)(void *opaque, BlockDriverState *bs),
void *opaque);
 -- 
 1.8.3.1
 
Reviewed-by: Benoit Canet ben...@irqsave.net



Re: [Qemu-devel] [PATCH 3/5] block: make 'top' argument to block-commit optional

2014-05-15 Thread Jeff Cody
On Thu, May 15, 2014 at 01:47:55PM +0200, Benoît Canet wrote:
 The Wednesday 14 May 2014 à 23:20:17 (-0400), Jeff Cody wrote :
  Now that active layer block-commit is supported, the 'top' argument
  no longer needs to be mandatory.
  
  Change it optional, with the default being the active layer in the
 
 Do you mean Change it to optional or Make it optional ? 


Thanks - forgot the to.

  device chain.
  
  Signed-off-by: Jeff Cody jc...@redhat.com
  ---
   blockdev.c |  3 ++-
   qapi-schema.json   |  7 ---
   qmp-commands.hx|  5 +++--
   tests/qemu-iotests/040 | 28 ++--
   4 files changed, 27 insertions(+), 16 deletions(-)
  
  diff --git a/blockdev.c b/blockdev.c
  index 500707e..02c6525 100644
  --- a/blockdev.c
  +++ b/blockdev.c
  @@ -1868,7 +1868,8 @@ void qmp_block_stream(const char *device, bool 
  has_base,
   }
   
   void qmp_block_commit(const char *device,
  -  bool has_base, const char *base, const char *top,
  +  bool has_base, const char *base,
  +  bool has_top, const char *top,
 bool has_speed, int64_t speed,
 Error **errp)
   {
  diff --git a/qapi-schema.json b/qapi-schema.json
  index 36cb964..06a9b5d 100644
  --- a/qapi-schema.json
  +++ b/qapi-schema.json
  @@ -2099,8 +2099,9 @@
   # @base:   #optional The file name of the backing image to write data into.
   #If not specified, this is the deepest backing image
   #
  -# @top:  The file name of the backing image within the image 
  chain,
  -#which contains the topmost data to be committed down.
  +# @top:#optional The file name of the backing image within the image 
  chain,
  +#which contains the topmost data to be committed down. 
  If
  +#not specified, this is the active layer.
   #
   #If top == base, that is an error.
   #If top == active, the job will not be completed by 
  itself,
  @@ -2128,7 +2129,7 @@
   #
   ##
   { 'command': 'block-commit',
  -  'data': { 'device': 'str', '*base': 'str', 'top': 'str',
  +  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
   '*speed': 'int' } }
   
   ##
  diff --git a/qmp-commands.hx b/qmp-commands.hx
  index cae890e..1aa3c65 100644
  --- a/qmp-commands.hx
  +++ b/qmp-commands.hx
  @@ -985,7 +985,7 @@ EQMP
   
   {
   .name   = block-commit,
  -.args_type  = device:B,base:s?,top:s,speed:o?,
  +.args_type  = device:B,base:s?,top:s?,speed:o?,
   .mhandler.cmd_new = qmp_marshal_input_block_commit,
   },
   
  @@ -1003,7 +1003,8 @@ Arguments:
 If not specified, this is the deepest backing image
 (json-string, optional)
   - top:  The file name of the backing image within the image chain,
  -  which contains the topmost data to be committed down.
  +  which contains the topmost data to be committed down. If
  +  not specified, this is the active layer. (json-string, optional)
   
 If top == base, that is an error.
 If top == active, the job will not be completed by itself,
  diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
  index 734b6a6..803b0c7 100755
  --- a/tests/qemu-iotests/040
  +++ b/tests/qemu-iotests/040
  @@ -35,11 +35,7 @@ test_img = os.path.join(iotests.test_dir, 'test.img')
   class ImageCommitTestCase(iotests.QMPTestCase):
   '''Abstract base class for image commit test cases'''
   
  -def run_commit_test(self, top, base):
  -self.assert_no_active_block_jobs()
  -result = self.vm.qmp('block-commit', device='drive0', top=top, 
  base=base)
  -self.assert_qmp(result, 'return', {})
  -
  +def wait_for_complete(self):
   completed = False
   while not completed:
   for event in self.vm.get_qmp_events(wait=True):
  @@ -58,6 +54,18 @@ class ImageCommitTestCase(iotests.QMPTestCase):
   self.assert_no_active_block_jobs()
   self.vm.shutdown()
   
  +def run_commit_test(self, top, base):
  +self.assert_no_active_block_jobs()
  +result = self.vm.qmp('block-commit', device='drive0', top=top, 
  base=base)
  +self.assert_qmp(result, 'return', {})
  +self.wait_for_complete()
  +
  +def run_default_commit_test(self):
  +self.assert_no_active_block_jobs()
  +result = self.vm.qmp('block-commit', device='drive0')
  +self.assert_qmp(result, 'return', {})
  +self.wait_for_complete()
  +
   class TestSingleDrive(ImageCommitTestCase):
   image_len = 1 * 1024 * 1024
   test_len = 1 * 1024 * 256
  @@ -109,17 +117,17 @@ class TestSingleDrive(ImageCommitTestCase):
   self.assertEqual(-1, qemu_io('-c', 'read -P 0xab 0 524288', 
  backing_img).find(verification failed))
   

[Qemu-devel] [PATCH v4] serial-pci: Set prog interface field of pci config to 16550 compatible

2014-05-15 Thread BALATON Zoltan
Signed-off-by: BALATON Zoltan bala...@eik.bme.hu
---

 v2: resubmission after pc-2.1 is added with the multiport case
 v3: added compatibility check to avoid changing earlier than pc-2.1
 v4: renamed compat property to prog_if

 hw/char/serial-pci.c |  7 +++
 include/hw/i386/pc.h | 15 +++
 2 files changed, 22 insertions(+)

diff --git a/hw/char/serial-pci.c b/hw/char/serial-pci.c
index 991c99f..a9c 100644
--- a/hw/char/serial-pci.c
+++ b/hw/char/serial-pci.c
@@ -34,6 +34,7 @@
 typedef struct PCISerialState {
 PCIDevice dev;
 SerialState state;
+uint8_t prog_if;
 } PCISerialState;
 
 typedef struct PCIMultiSerialState {
@@ -44,6 +45,7 @@ typedef struct PCIMultiSerialState {
 SerialState  state[PCI_SERIAL_MAX_PORTS];
 uint32_t level[PCI_SERIAL_MAX_PORTS];
 qemu_irq *irqs;
+uint8_t  prog_if;
 } PCIMultiSerialState;
 
 static int serial_pci_init(PCIDevice *dev)
@@ -60,6 +62,7 @@ static int serial_pci_init(PCIDevice *dev)
 return -1;
 }
 
+pci-dev.config[PCI_CLASS_PROG] = pci-prog_if;
 pci-dev.config[PCI_INTERRUPT_PIN] = 0x01;
 s-irq = pci_allocate_irq(pci-dev);
 
@@ -101,6 +104,7 @@ static int multi_serial_pci_init(PCIDevice *dev)
 assert(pci-ports  0);
 assert(pci-ports = PCI_SERIAL_MAX_PORTS);
 
+pci-dev.config[PCI_CLASS_PROG] = pci-prog_if;
 pci-dev.config[PCI_INTERRUPT_PIN] = 0x01;
 memory_region_init(pci-iobar, OBJECT(pci), multiserial, 8 * 
pci-ports);
 pci_register_bar(pci-dev, 0, PCI_BASE_ADDRESS_SPACE_IO, pci-iobar);
@@ -177,12 +181,14 @@ static const VMStateDescription vmstate_pci_multi_serial 
= {
 
 static Property serial_pci_properties[] = {
 DEFINE_PROP_CHR(chardev,  PCISerialState, state.chr),
+DEFINE_PROP_UINT8(prog_if,  PCISerialState, prog_if, 0x02),
 DEFINE_PROP_END_OF_LIST(),
 };
 
 static Property multi_2x_serial_pci_properties[] = {
 DEFINE_PROP_CHR(chardev1,  PCIMultiSerialState, state[0].chr),
 DEFINE_PROP_CHR(chardev2,  PCIMultiSerialState, state[1].chr),
+DEFINE_PROP_UINT8(prog_if,  PCIMultiSerialState, prog_if, 0x02),
 DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -191,6 +197,7 @@ static Property multi_4x_serial_pci_properties[] = {
 DEFINE_PROP_CHR(chardev2,  PCIMultiSerialState, state[1].chr),
 DEFINE_PROP_CHR(chardev3,  PCIMultiSerialState, state[2].chr),
 DEFINE_PROP_CHR(chardev4,  PCIMultiSerialState, state[3].chr),
+DEFINE_PROP_UINT8(prog_if,  PCIMultiSerialState, prog_if, 0x02),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 32a7687..552fbd8 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -271,6 +271,21 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 .driver   = apic,\
 .property = version,\
 .value= stringify(0x11),\
+},\
+{\
+.driver   = pci-serial,\
+.property = prog_if,\
+.value= stringify(0),\
+},\
+{\
+.driver   = pci-serial-2x,\
+.property = prof_if,\
+.value= stringify(0),\
+},\
+{\
+.driver   = pci-serial-4x,\
+.property = prog_if,\
+.value= stringify(0),\
 }
 
 #define PC_COMPAT_1_7 \
-- 
1.8.1.5




[Qemu-devel] Differential VHD

2014-05-15 Thread Ankur Srivastava
Hi,

What all steps to be taken for differential vhd image? i created one
differential image using vhd-util

Thanks


Re: [Qemu-devel] [PATCH] Revert iotests: Use configured python

2014-05-15 Thread Kevin Wolf
Am 15.05.2014 um 13:34 hat Peter Maydell geschrieben:
 This reverts commit f915db07ef9c368ea6db6430256de064fdd1525f.
 
 This commit is broken because it does not account for the
 build tree and the source tree being different, and can cause
 build failures for out-of-tree builds. Revert it until we can
 identify a better solution to the problem.
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 ---
 I have an increasingly large queue of pull requests which I'm
 currently not processing pending a fix to this build breakage.
 Reverting the commit seems the simplest solution to unblocking
 this... Barring objections I'm going to commit this to master
 this afternoon.

Acked-by: Kevin Wolf kw...@redhat.com



Re: [Qemu-devel] [PATCH 1/5] block: Auto-generate node_names for each BDS entry

2014-05-15 Thread Benoît Canet
The Wednesday 14 May 2014 à 23:20:15 (-0400), Jeff Cody wrote :
 Currently, node_name is only filled in when done so explicitly by the
 user.  If no node_name is specified, then the node name field is not
 populated.
 
 If node_names are automatically generated when not specified, that means
 that all block job operations can be done by reference to the unique
 node_name field.  This eliminates ambiguity in filename pathing
 (relative filenames, or file descriptors, symlinks, mounts, etc..) that
 qemu currently needs to deal with.
 
 If a node name is specified, then it will not be automatically
 generated for that BDS entry.
 
 If it is automatically generated, it will be prefaced with __qemu##,
 followed by 8 characters of a unique number, followed by 8 random
 ASCII characters in the range of 'A-Z'.  Some sample generated node-name
 strings:
 __qemu##IAIYNXXR
 __qemu##0002METXTRBQ
 __qemu##0001FMBORDWG
 
 The prefix is to aid in identifying it as a qemu-generated name, the
 numeric portion is to guarantee uniqueness in a given qemu session, and
 the random characters are to further avoid any accidental collisions
 with user-specified node-names.
 
 Signed-off-by: Jeff Cody jc...@redhat.com
 ---
  block.c | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)
 
 diff --git a/block.c b/block.c
 index c90c71a..81945d3 100644
 --- a/block.c
 +++ b/block.c
 @@ -838,12 +838,26 @@ static int bdrv_open_flags(BlockDriverState *bs, int 
 flags)
  return open_flags;
  }
  
 +#define GEN_NODE_NAME_PREFIX__qemu##
 +#define GEN_NODE_NAME_MAX_LEN   (sizeof(GEN_NODE_NAME_PREFIX) + 8 + 8)
  static void bdrv_assign_node_name(BlockDriverState *bs,
const char *node_name,
Error **errp)
  {
 +char gen_node_name[GEN_NODE_NAME_MAX_LEN];

The room for the '\0' string termination seems to be missing:

char gen_node_name[GEN_NODE_NAME_MAX_LEN + 1];

 +static uint32_t counter; /* simple counter to guarantee uniqueness */
 +
 +/* if node_name is NULL, auto-generate a node name */
  if (!node_name) {
 -return;
 +int len;
 +snprintf(gen_node_name, GEN_NODE_NAME_MAX_LEN,
 + %s%08x, GEN_NODE_NAME_PREFIX, counter++);
 +len = strlen(gen_node_name);
 +while (len  GEN_NODE_NAME_MAX_LEN - 1) {
 +gen_node_name[len++] = g_random_int_range('A', 'Z');
 +}

Is this code generating only 7 random chars instead of 8 ?

 +gen_node_name[GEN_NODE_NAME_MAX_LEN - 1] = '\0';

Could be:
gen_node_name[GEN_NODE_NAME_MAX_LEN] = '\0';
if the array is properly declared.

 +node_name = gen_node_name;
  }
  
  /* empty string node name is invalid */
 -- 
 1.8.3.1
 



Re: [Qemu-devel] [PATCH RFC 1/8] virtio: add subsections to the migration stream

2014-05-15 Thread Andreas Färber
Am 15.05.2014 12:16, schrieb Michael S. Tsirkin:
 On Thu, May 15, 2014 at 12:11:12PM +0200, Andreas Färber wrote:
 Am 15.05.2014 12:03, schrieb Michael S. Tsirkin:
 On Thu, May 15, 2014 at 11:58:25AM +0200, Andreas Färber wrote:
 Am 15.05.2014 11:52, schrieb Michael S. Tsirkin:
 On Thu, May 15, 2014 at 11:20:18AM +0200, Andreas Färber wrote:
 Am 15.05.2014 09:04, schrieb Greg Kurz:
 On Thu, 15 May 2014 12:16:35 +0530
 Amit Shah amit.s...@redhat.com wrote:
 On (Thu) 15 May 2014 [09:23:51], Michael S. Tsirkin wrote:
 On Thu, May 15, 2014 at 11:34:25AM +0530, Amit Shah wrote:
 On (Wed) 14 May 2014 [17:41:38], Greg Kurz wrote:
 Since each virtio device is streamed in its own section, the idea 
 is to
 stream subsections between the end of the device section and the 
 start
 of the next sections. This allows an older QEMU to complain and exit
 when fed with subsections:

 Unknown savevm section type 5
 Error -22 while loading VM state

 Please make this configurable -- either via configure or device
 properties.  That avoids having to break existing configurations that
 work without this patch.

 Since backwards migration is not supported upstream, wouldn't it be
 easiest to just add support for the subsection marker and skipping to
 the end of section in that downstream?

 Backwards and forwards migration need to be supported,
 customers told us repeatedly. So some downstreams support this
 and not supporting it upstream just means downstreams need
 to do their own thing.

 As importantly, ping-pong migration is the only
 reliable way to stress migration.

 So if we want to test cross-version we need it to work
 both way.

 Finally, the real issue and difficulty with cross-version migration is
 making VM behave in a backwards compatible way.  Serializing in a
 compatible way is a trivial problem, or would be if the code wasn't a
 mess :) Once you do the hard part, breaking migration because of the
 trivial serialization issue is just silly.  And special-casing forward
 migration does not make code simpler, it really only leads to
 proliferation of hacks and lack of symmetry.

 So yes it's a useful feature, and no not supporting it does
 not help anyway.

 It seems you misunderstand. I was not saying it's not useful.

 My point is that VMStateSubsections added in newer versions (and thus
 not existing in older versions) need to be handled for any
 VMState-converted devices. So why can't we make that work for virtio too?

 Sure.
 AFAIK the way to this works is by adding an field_exists callback,
 and not sending the section when we are in a compat mode.

 OK, but upstream always sends the latest version today.
 So isn't that
 just two ifs that you would need to add in save and load functions added
 here for the downstream? x86_64 is unaffected from ppc's endianness
 changes and you still do ppc64 BE, so there's no real functional problem
 for RHEL, is there?
 
 I'm sorry I don't understand what you are saying here.
 Simply put, if you specify a compatible -M then your
 device should behave, and migrate, exactly like and old
 qemu did.

Whatever the version_id fields are for, upstream QEMU *always* saves the
newest version_id format that it knows. There is no mechanism that I'm
aware of in upstream QEMU for migrating device fields dependent on -M.
So a device in QEMU only migrates exactly like an old QEMU did if
neither fields nor subsections were added.

Subsections are usually migrated based on whether the state differs from
the default state (didn't check whether the final patch does this
correctly here, but doesn't matter for 1/8 concept). So for x86 the
subsection should *never* get written and thus not be a problem for you.
For ppc64 it should not get written either, unless you care about
migration from SLES12 or upstream ppc64le to RHEL ppc64.
As I understood the IRC discussion Alex and me had with Greg about this,
this is copying the exact code VMState uses to write its subsections, so
there would be no forward migration problems at all once we convert to
VMState and VMStateSubsections. The only question remaining is how does
your downstream react when it encounters a subsection marker for
subsections it doesn't know - which is not something upstream can solve
for you unless you contribute backwards migration support upstream.

Don't forget that Greg is introducing subsection support *because of*
your backwards migration wish, so telling him not to add subsections
because of backwards migration is really weird! Either subsections work
with your downstream or they don't; if they don't, then you have much
larger problems than can be solved by blocking this one section for ppc.
Therefore I was saying if your downstream forgot to handle the case of a
non-VMState device having subsections, then it may be easier to fix
exactly that than make Greg jump through more hoops here for a
theoretical problem you are unlikely to encounter on your downstream.

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 

Re: [Qemu-devel] [PATCH 1/5] block: Auto-generate node_names for each BDS entry

2014-05-15 Thread Jeff Cody
On Thu, May 15, 2014 at 01:58:59PM +0200, Benoît Canet wrote:
 The Wednesday 14 May 2014 à 23:20:15 (-0400), Jeff Cody wrote :
  Currently, node_name is only filled in when done so explicitly by the
  user.  If no node_name is specified, then the node name field is not
  populated.
  
  If node_names are automatically generated when not specified, that means
  that all block job operations can be done by reference to the unique
  node_name field.  This eliminates ambiguity in filename pathing
  (relative filenames, or file descriptors, symlinks, mounts, etc..) that
  qemu currently needs to deal with.
  
  If a node name is specified, then it will not be automatically
  generated for that BDS entry.
  
  If it is automatically generated, it will be prefaced with __qemu##,
  followed by 8 characters of a unique number, followed by 8 random
  ASCII characters in the range of 'A-Z'.  Some sample generated node-name
  strings:
  __qemu##IAIYNXXR
  __qemu##0002METXTRBQ
  __qemu##0001FMBORDWG
  
  The prefix is to aid in identifying it as a qemu-generated name, the
  numeric portion is to guarantee uniqueness in a given qemu session, and
  the random characters are to further avoid any accidental collisions
  with user-specified node-names.
  
  Signed-off-by: Jeff Cody jc...@redhat.com
  ---
   block.c | 16 +++-
   1 file changed, 15 insertions(+), 1 deletion(-)
  
  diff --git a/block.c b/block.c
  index c90c71a..81945d3 100644
  --- a/block.c
  +++ b/block.c
  @@ -838,12 +838,26 @@ static int bdrv_open_flags(BlockDriverState *bs, int 
  flags)
   return open_flags;
   }
   
  +#define GEN_NODE_NAME_PREFIX__qemu##
  +#define GEN_NODE_NAME_MAX_LEN   (sizeof(GEN_NODE_NAME_PREFIX) + 8 + 8)
   static void bdrv_assign_node_name(BlockDriverState *bs,
 const char *node_name,
 Error **errp)
   {
  +char gen_node_name[GEN_NODE_NAME_MAX_LEN];
 
 The room for the '\0' string termination seems to be missing:
 
 char gen_node_name[GEN_NODE_NAME_MAX_LEN + 1];


The array includes room for it, note the use of 'sizeof()':
#define GEN_NODE_NAME_MAX_LEN   (sizeof(GEN_NODE_NAME_PREFIX) + 8 + 8)

sizeof() includes the '\0' in the length, while strlen() does not;
e.g.:
sizeof(four) = 5
strlen(four) = 4

  +static uint32_t counter; /* simple counter to guarantee uniqueness */
  +
  +/* if node_name is NULL, auto-generate a node name */
   if (!node_name) {
  -return;
  +int len;
  +snprintf(gen_node_name, GEN_NODE_NAME_MAX_LEN,
  + %s%08x, GEN_NODE_NAME_PREFIX, counter++);
  +len = strlen(gen_node_name);
  +while (len  GEN_NODE_NAME_MAX_LEN - 1) {
  +gen_node_name[len++] = g_random_int_range('A', 'Z');
  +}
 
 Is this code generating only 7 random chars instead of 8 ?
 

It generates 8 random characters (the sample node-name strings in the
commit message were pulled straight from the QMP command
'query-named-block-nodes')

  +gen_node_name[GEN_NODE_NAME_MAX_LEN - 1] = '\0';
 
 Could be:
 gen_node_name[GEN_NODE_NAME_MAX_LEN] = '\0';
 if the array is properly declared.


That would go over the array bounds by 1.

  +node_name = gen_node_name;
   }
   
   /* empty string node name is invalid */
  -- 
  1.8.3.1
  



Re: [Qemu-devel] [PATCH v3 4/4] sclp-s390: Add memory hotplug SCLPs

2014-05-15 Thread Christian Borntraeger
On 13/05/14 20:02, Matthew Rosato wrote:
 On 05/12/2014 03:35 AM, Christian Borntraeger wrote:
 On 07/05/14 20:05, Matthew Rosato wrote:
 Add memory information to read SCP info and add handlers for
 Read Storage Element Information, Attach Storage Element,
 Assign Storage and Unassign Storage.

 Signed-off-by: Matthew Rosato mjros...@linux.vnet.ibm.com
 ---
  hw/s390x/sclp.c|  245 
 ++--
  target-s390x/cpu.h |   15 
  target-s390x/kvm.c |5 ++
  3 files changed, 258 insertions(+), 7 deletions(-)

 diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
 index 338dbdf..b9425ca 100644
 --- a/hw/s390x/sclp.c
 +++ b/hw/s390x/sclp.c
 @@ -16,7 +16,8 @@
  #include sysemu/kvm.h
  #include exec/memory.h
  #include sysemu/sysemu.h
 -
 +#include exec/address-spaces.h
 +#include qemu/config-file.h
  #include hw/s390x/sclp.h
  #include hw/s390x/event-facility.h

 @@ -33,10 +34,18 @@ static inline SCLPEventFacility 
 *get_event_facility(void)
  static void read_SCP_info(SCCB *sccb)
  {
  ReadInfo *read_info = (ReadInfo *) sccb;
 +sclpMemoryHotplugDev *mhd = get_sclp_memory_hotplug_dev();


 This only works for the ccw machine. The legacy machine does not yet have 
 this device and qemu complains about the system bus not being hotplug 
 capable.
 Cant you just initialize the device in the ccw machine and check here only 
 for the existence?
 
 Agh, forgot about the legacy machine.  Sure, I can split out creation
 from the get_sclp_memory_hotplug_dev() code, and then rework the code here.
 
 But then I got to thinking -- should I not be setting
 SCLP_FC_ASSIGN_ATTACH_READ_STOR below when we are running in the legacy
 machine?  Because without the sclp_memory_hotplug_dev, we really aren't
 supporting these new functions (assign_storage, unassign_storage,
 assign_storage_element, read_storage_element*_info).
 Then, I can probably add some assert(mhd) for the remaining
 get_sclp_memory_hotplug_dev() calls, since we shouldn't get getting
 there for the legacy machine.
 
 What do you think?  Alternatively, I'd have to add code to fudge returns
 for each of these new fuctions.

Given our test coverage for the legacy machine, I agree to not add memory 
hotplug
to this machine. Please make sure to enhance read_SCP_info to be able to handle 
the
non-existence of the hotplug device.
 

  CPUState *cpu;
 -int shift = 0;
  int cpu_count = 0;
  int i = 0;
 +int rnsize, rnmax;
 +QemuOpts *opts = qemu_opts_find(qemu_find_opts(memory), NULL);
 +int slots = qemu_opt_get_number(opts, slots, 0);
 +int max_avail_slots = s390_get_memslot_count(kvm_state);
 +
 +if (slots  max_avail_slots) {
 +slots = max_avail_slots;
 +}

  CPU_FOREACH(cpu) {
  cpu_count++;
 @@ -52,16 +61,222 @@ static void read_SCP_info(SCCB *sccb)
  read_info-entries[i].type = 0;
  }

 -read_info-facilities = cpu_to_be64(SCLP_HAS_CPU_INFO);
 +/*
 + * The storage increment size is a multiple of 1M and is a power of 2.
 + * The number of storage increments must be MAX_STORAGE_INCREMENTS or 
 fewer.
 + */
 +while ((ram_size  mhd-increment_size)  MAX_STORAGE_INCREMENTS) {
 +mhd-increment_size++;
 +}
 +while ((mhd-standby_mem_size  mhd-increment_size) 
 +   MAX_STORAGE_INCREMENTS) {
 +mhd-increment_size++;
 +}
 +
 +mhd-standby_subregion_size = MEM_SECTION_SIZE;
 +/* Deduct the memory slot already used for core */
 +if (slots  0) {
 +while ((mhd-standby_subregion_size * (slots - 1)
 + mhd-standby_mem_size)) {
 +mhd-standby_subregion_size = mhd-standby_subregion_size  1;
 +}
 +}
 +/*
 + * Initialize mapping of guest standby memory sections indicating which
 + * are and are not online. Assume all standby memory begins offline.
 + */
 +if (mhd-standby_state_map == 0) {
 +if (mhd-standby_mem_size % mhd-standby_subregion_size) {
 +mhd-standby_state_map = g_malloc0((mhd-standby_mem_size /
 +   mhd-standby_subregion_size + 
 1) *
 +  (mhd-standby_subregion_size /
 +   MEM_SECTION_SIZE));
 +} else {
 +mhd-standby_state_map = g_malloc0(mhd-standby_mem_size /
 +   MEM_SECTION_SIZE);
 +}
 +}
 +
 +mhd-padded_ram_size = ram_size + mhd-pad_size;
 +mhd-rzm = 1  mhd-increment_size;
 +rnsize = 1  (mhd-increment_size - 20);
 +if (rnsize = 128) {
 +read_info-rnsize = rnsize;
 +} else {
 +read_info-rnsize = 0;
 +read_info-rnsize2 = cpu_to_be32(rnsize);
 +}

 -while ((ram_size  (20 + shift))  65535) {
 -shift++;
 +rnmax = ((ram_size + mhd-standby_mem_size + mhd-pad_size)
 +  mhd-increment_size);
 +if (rnmax  0x1) {
 +read_info-rnmax 

  1   2   3   4   >