Re: [Qemu-devel] Minutes of QEMU Summit 2015 (2015-08-18, Seattle)

2015-11-08 Thread Stefan Weil
Am 06.11.2015 um 17:22 schrieb Peter Maydell:
> On 6 September 2015 at 07:05, Stefan Weil  wrote:
>> Part of that infrastructure is the FSF savannah server.
>> The FSF infrastructure is still used for our mailing lists.
>>
>> http://savannah.nongnu.org/projects/qemu/ needs
>> someone who updates the summary. There should be
>> a link to the official website at least, maybe also a
>> link to the QEMU article on Wikipedia which is more
>> up to date.
>>
>> http://git.savannah.gnu.org/cgit/qemu.git was not updated
>> for 5 years. Should it be removed? I'd prefer to
>> have a QEMU git mirror there with automated update.
> 
> I finally got round to this -- I have removed the
> savannah repo[*] and download area, and updated the summary
> for the project to include a link to the project's
> official website.
> 
> [*] I opted to remove the repo rather than setting it up
> as a mirror, because the Savannah documentation says they
> only like to host mirrors for projects which are official
> GNU ones, which we are not.
> 
> thanks
> -- PMM


Thank you for this update. I think it will help at least some people.

Stefan




Re: [Qemu-devel] [PATCH] exec: silence hugetlbfs warning under qtest

2015-11-08 Thread Michael S. Tsirkin
On Tue, Oct 27, 2015 at 05:29:43PM +0100, marcandre.lur...@redhat.com wrote:
> From: Marc-André Lureau 
> 
> vhost-user-test prints a warning. A test should not need to run on
> hugetlbfs, let's silence the warning under qtest. Unfortunately, the
> condition can't check on qtest_enabled() or qtest_driver() since they
> are initialized later.
> 
> Signed-off-by: Marc-André Lureau 

Any idea what's the best way to address this?
Is poking at environment like this appropriate?
Maybe a command line flag to silence the warning?

> ---
>  exec.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/exec.c b/exec.c
> index 8af2570..d9c231d 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1194,8 +1194,9 @@ static long gethugepagesize(const char *path, Error 
> **errp)
>  return 0;
>  }
>  
> -if (fs.f_type != HUGETLBFS_MAGIC)
> +if (fs.f_type != HUGETLBFS_MAGIC && !getenv("QTEST_QEMU_BINARY")) {
>  fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
> +}
>  
>  return fs.f_bsize;
>  }
> -- 
> 2.4.3



[Qemu-devel] [Bug 1512134] Re: Multiboot v1 memory map offset wrong

2015-11-08 Thread Tristan Parisot
Tested today with GRUB 2.0. Indeed, mmap_entry points to the start of
the structure, without a fancy offset. I guess we can close this ticket

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1512134

Title:
  Multiboot v1 memory map offset wrong

Status in QEMU:
  New

Bug description:
  I'm developping a multiboot kernel for multiboot v1
  My multiboot header contains the V1 magic (0x1BADB002) and the flags 
0x00010243  (with enabled memory detection, and boot loader name)

  
  When booted in multiboot,
  Qemu gives me two pointers:
  unsigned long mmap_length;
  unsigned long mmap_addr;

  mmap_addr shall points to this structure:
  struct multiboot_mmap_entry
  {
  multiboot_uint32_t size;
  multiboot_uint64_t addr;
  multiboot_uint64_t len;
  multiboot_uint32_t type;
  } 

  
  According to the multiboot v1 specification, mmap_addr should not point  to 
the start of this structure, but instead, should point to the "addr "field.

  Work-arround:
  Detect if qemu is used using bootloader_name field.
  If it is, do NOT apply a -4 offset to mmap_addr

  http://git.savannah.gnu.org/cgit/grub.git/tree/doc/multiboot.texi?h=multiboot

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1512134/+subscriptions



Re: [Qemu-devel] [PATCH V3] hw/virtio: Add PCIe capability to virtio devices

2015-11-08 Thread Michael S. Tsirkin
On Thu, Oct 29, 2015 at 01:56:28PM +0200, Marcel Apfelbaum wrote:
> The virtio devices are converted to PCI-Express
> if they are plugged into a PCI-Express bus and
> the 'modern' protocol is enabled.
> 
> Devices plugged directly into the Root Complex as
> Integrated Endpoints remain PCI.
> 
> Signed-off-by: Marcel Apfelbaum 


Looks ok. Two comments:

> ---
> v2 -> v3:
>  - Addressed Michael S. Tsirkin's comments:
>- enable pcie only for 2.5+ machines.
> 
> v1 -> v2:
>  - Addressed Michael S. Tsirkin's comments:
>- Added the minimum required capabilities for PCIe devices
>- Integrated Endpoints remain PCI
> 
>  - Use pcie_endpoint_cap_init instead of manually creating the pcie 
> capability.
> 
>  - Regarding Gerd Hoffman's comments:
>- Creating virtio-pcie devices:
>  For the moment I prefer to not duplicate the virtio definitions,
>  at least until we don't have a consensus (Personally I don't like it)
>- Removing the IO bar:
>  This would be my next patch on the "virtio to express" series, I plan
>  to remove it only for "modern" devices.
> 
> Thanks,
> Marcel
> 
>  hw/virtio/virtio-pci.c | 22 ++
>  hw/virtio/virtio-pci.h |  2 ++
>  include/hw/compat.h| 46 +-
>  3 files changed, 69 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index f55dd2b..a288d8b 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -1592,6 +1592,26 @@ static void virtio_pci_realize(PCIDevice *pci_dev, 
> Error **errp)
>  
>  address_space_init(>modern_as, >modern_cfg, 
> "virtio-pci-cfg-as");
>  
> +if (!(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_PCIE)
> +&& !(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_MODERN)
> +&& pci_bus_is_express(pci_dev->bus)
> +&& !pci_bus_is_root(pci_dev->bus)) {
> +int pos;
> +
> +pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
> +pos = pcie_endpoint_cap_init(pci_dev, 0);
> +assert(pos > 0);
> +
> +pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);
> +assert(pos > 0);
> +
> +/*
> + * Indicates that this function complies with revision 1.2 of the
> + * PCI Power Management Interface Specification.
> + */
> +pci_set_word(pci_dev->config + pos + PCI_PM_PMC, 0x3);
> +}
> +
>  virtio_pci_bus_new(>bus, sizeof(proxy->bus), proxy);
>  if (k->realize) {
>  k->realize(proxy, errp);
> @@ -1622,6 +1642,8 @@ static Property virtio_pci_properties[] = {
>  VIRTIO_PCI_FLAG_DISABLE_LEGACY_BIT, false),
>  DEFINE_PROP_BIT("disable-modern", VirtIOPCIProxy, flags,
>  VIRTIO_PCI_FLAG_DISABLE_MODERN_BIT, true),
> +DEFINE_PROP_BIT("disable-pcie", VirtIOPCIProxy, flags,
> +VIRTIO_PCI_FLAG_DISABLE_PCIE_BIT, false),

It's preferable to call this one x-disable-pcie - this
is the convention to mark it as "not for user".

>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> index 801c23a..1a487fc 100644
> --- a/hw/virtio/virtio-pci.h
> +++ b/hw/virtio/virtio-pci.h
> @@ -72,8 +72,10 @@ typedef struct VirtioBusClass VirtioPCIBusClass;
>  /* virtio version flags */
>  #define VIRTIO_PCI_FLAG_DISABLE_LEGACY_BIT 2
>  #define VIRTIO_PCI_FLAG_DISABLE_MODERN_BIT 3
> +#define VIRTIO_PCI_FLAG_DISABLE_PCIE_BIT 4
>  #define VIRTIO_PCI_FLAG_DISABLE_LEGACY (1 << 
> VIRTIO_PCI_FLAG_DISABLE_LEGACY_BIT)
>  #define VIRTIO_PCI_FLAG_DISABLE_MODERN (1 << 
> VIRTIO_PCI_FLAG_DISABLE_MODERN_BIT)
> +#define VIRTIO_PCI_FLAG_DISABLE_PCIE (1 << VIRTIO_PCI_FLAG_DISABLE_PCIE_BIT)
>  
>  typedef struct {
>  MSIMessage msg;
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 095de5d..0a08531 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -2,7 +2,51 @@
>  #define HW_COMPAT_H
>  
>  #define HW_COMPAT_2_4 \
> -/* empty */
> +{\
> +.driver   = "virtio-blk-pci",\
> +.property = "disable-pcie",\
> +.value= "on",\
> +},{\
> +.driver   = "virtio-scsi-pci",\
> +.property = "disable-pcie",\
> +.value= "on",\
> +},{\
> +.driver   = "virtio-net-pci",\
> +.property = "disable-pcie",\
> +.value= "on",\
> +},{\
> +.driver   = "virtio-input-host-pci",\
> +.property = "disable-pcie",\
> +.value= "on",\
> +},{\
> +.driver   = "virtio-keyboard-pci",\
> +.property = "disable-pcie",\
> +.value= "on",\
> +},{\
> +.driver   = "virtio-mouse-pci",\
> +.property = "disable-pcie",\
> +.value= "on",\
> +},{\
> +.driver   = "virtio-serial-pci",\
> +.property = "disable-pcie",\

Re: [Qemu-devel] [PATCH for-2.5 v3 1/3] arm: boot: Add secure_board_setup flag

2015-11-08 Thread Peter Crosthwaite
On Fri, Nov 6, 2015 at 6:45 AM, Peter Maydell  wrote:
>
> On 3 November 2015 at 04:30, Peter Crosthwaite
>  wrote:
> > Add a flag that when set, will cause the primary CPU to start in secure
> > mode, even if the overall boot is non-secure. This is useful for when
> > there is a board-setup blob that needs to run from secure mode, but
> > device and secondary CPU init should still be done as-normal for a non-
> > secure boot.
> >
> > Signed-off-by: Peter Crosthwaite 
> > ---
> > changed since v2:
> > Assert if running KVM and board_setup_secure is set
> >
> >  hw/arm/boot.c| 8 +++-
> >  include/hw/arm/arm.h | 6 ++
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> > index b0879a5..f671454 100644
> > --- a/hw/arm/boot.c
> > +++ b/hw/arm/boot.c
> > @@ -495,7 +495,8 @@ static void do_cpu_reset(void *opaque)
> >  }
> >
> >  /* Set to non-secure if not a secure boot */
> > -if (!info->secure_boot) {
> > +if (!info->secure_boot &&
> > +(cs != first_cpu || !info->secure_board_setup)) {
> >  /* Linux expects non-secure state */
> >  env->cp15.scr_el3 |= SCR_NS;
> >  }
> > @@ -598,6 +599,11 @@ static void arm_load_kernel_notify(Notifier *notifier, 
> > void *data)
> >  struct arm_boot_info *info =
> >  container_of(n, struct arm_boot_info, load_kernel_notifier);
> >
> > +/* It is the boards job to make sure secure_board_setup is actually
> > + * possible
> > + */
>
> I think we could improve this comment a bit:
> /* The board code is not supposed to set secure_board_setup unless
>  * running its code in secure mode is actually possible, and KVM
>  * doesn't support secure.
>  */
>

Taken verbatim. Thanks.

> > +assert(!info->secure_board_setup || tcg_enabled());
>
> This assertion causes us to fail "make check", because there's
> a test target which starts every board with the "qtest" accelerator,
> which is not TCG. You need !kvm_enabled() instead (and
> to include sysemu/kvm.h).
>

Fixed. Changed to De Morgan && equivalent with the two !s for easier reading:

assert(!(info->secure_board_setup && kvm_enabled()));

Regards,
Peter

> > +
> >  /* Load the kernel.  */
> >  if (!info->kernel_filename || info->firmware_loaded) {
> >
> > diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
> > index 9217b70..60dc919 100644
> > --- a/include/hw/arm/arm.h
> > +++ b/include/hw/arm/arm.h
> > @@ -97,6 +97,12 @@ struct arm_boot_info {
> >  hwaddr board_setup_addr;
> >  void (*write_board_setup)(ARMCPU *cpu,
> >const struct arm_boot_info *info);
> > +
> > +/* If set, the board specific loader/setup blob will be run from secure
> > + * mode, regardless of secure_boot. The blob becomes responsible for
> > + * changing to non-secure state if implementing a non-secure boot
> > + */
> > +bool secure_board_setup;
> >  };
>
> thanks
> -- PMM



Re: [Qemu-devel] [PATCH V3] hw/virtio: Add PCIe capability to virtio devices

2015-11-08 Thread Marcel Apfelbaum

On 11/08/2015 07:10 PM, Michael S. Tsirkin wrote:

On Thu, Oct 29, 2015 at 01:56:28PM +0200, Marcel Apfelbaum wrote:

The virtio devices are converted to PCI-Express
if they are plugged into a PCI-Express bus and
the 'modern' protocol is enabled.

Devices plugged directly into the Root Complex as
Integrated Endpoints remain PCI.

Signed-off-by: Marcel Apfelbaum 



Looks ok. Two comments:


---
v2 -> v3:
  - Addressed Michael S. Tsirkin's comments:
- enable pcie only for 2.5+ machines.

v1 -> v2:
  - Addressed Michael S. Tsirkin's comments:
- Added the minimum required capabilities for PCIe devices
- Integrated Endpoints remain PCI

  - Use pcie_endpoint_cap_init instead of manually creating the pcie capability.

  - Regarding Gerd Hoffman's comments:
- Creating virtio-pcie devices:
  For the moment I prefer to not duplicate the virtio definitions,
  at least until we don't have a consensus (Personally I don't like it)
- Removing the IO bar:
  This would be my next patch on the "virtio to express" series, I plan
  to remove it only for "modern" devices.

Thanks,
Marcel

  hw/virtio/virtio-pci.c | 22 ++
  hw/virtio/virtio-pci.h |  2 ++
  include/hw/compat.h| 46 +-
  3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index f55dd2b..a288d8b 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1592,6 +1592,26 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error 
**errp)

  address_space_init(>modern_as, >modern_cfg, 
"virtio-pci-cfg-as");

+if (!(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_PCIE)
+&& !(proxy->flags & VIRTIO_PCI_FLAG_DISABLE_MODERN)
+&& pci_bus_is_express(pci_dev->bus)
+&& !pci_bus_is_root(pci_dev->bus)) {
+int pos;
+
+pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
+pos = pcie_endpoint_cap_init(pci_dev, 0);
+assert(pos > 0);
+
+pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);
+assert(pos > 0);
+
+/*
+ * Indicates that this function complies with revision 1.2 of the
+ * PCI Power Management Interface Specification.
+ */
+pci_set_word(pci_dev->config + pos + PCI_PM_PMC, 0x3);
+}
+
  virtio_pci_bus_new(>bus, sizeof(proxy->bus), proxy);
  if (k->realize) {
  k->realize(proxy, errp);
@@ -1622,6 +1642,8 @@ static Property virtio_pci_properties[] = {
  VIRTIO_PCI_FLAG_DISABLE_LEGACY_BIT, false),
  DEFINE_PROP_BIT("disable-modern", VirtIOPCIProxy, flags,
  VIRTIO_PCI_FLAG_DISABLE_MODERN_BIT, true),
+DEFINE_PROP_BIT("disable-pcie", VirtIOPCIProxy, flags,
+VIRTIO_PCI_FLAG_DISABLE_PCIE_BIT, false),


It's preferable to call this one x-disable-pcie - this
is the convention to mark it as "not for user".


Sure, I'll change it.




  DEFINE_PROP_END_OF_LIST(),
  };

diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 801c23a..1a487fc 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -72,8 +72,10 @@ typedef struct VirtioBusClass VirtioPCIBusClass;
  /* virtio version flags */
  #define VIRTIO_PCI_FLAG_DISABLE_LEGACY_BIT 2
  #define VIRTIO_PCI_FLAG_DISABLE_MODERN_BIT 3
+#define VIRTIO_PCI_FLAG_DISABLE_PCIE_BIT 4
  #define VIRTIO_PCI_FLAG_DISABLE_LEGACY (1 << 
VIRTIO_PCI_FLAG_DISABLE_LEGACY_BIT)
  #define VIRTIO_PCI_FLAG_DISABLE_MODERN (1 << 
VIRTIO_PCI_FLAG_DISABLE_MODERN_BIT)
+#define VIRTIO_PCI_FLAG_DISABLE_PCIE (1 << VIRTIO_PCI_FLAG_DISABLE_PCIE_BIT)

  typedef struct {
  MSIMessage msg;
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 095de5d..0a08531 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -2,7 +2,51 @@
  #define HW_COMPAT_H

  #define HW_COMPAT_2_4 \
-/* empty */
+{\
+.driver   = "virtio-blk-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-scsi-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-net-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-input-host-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-keyboard-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-mouse-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-serial-pci",\
+.property = "disable-pcie",\
+.value= "on",\
+},{\
+.driver   = "virtio-tablet-pci",\
+.property = "disable-pcie",\
+

Re: [Qemu-devel] [PATCH for-2.5 v3 3/3] arm: highbank: Implement PSCI and dummy monitor

2015-11-08 Thread Peter Crosthwaite
On Fri, Nov 6, 2015 at 6:47 AM, Peter Maydell  wrote:
> On 3 November 2015 at 04:30, Peter Crosthwaite
>  wrote:
>> Firstly, enable monitor mode and PSCI, both are which are features of
>
> "both of which"
>

Fixed.

>> this board.
>>
>> In addition to PSCI, this board also uses SMC for cache maintainence
>> ops. This means we need a secure monitor to catch these and nop them.
>> Use the ARM boot board-setup feature to implement this. All traps to
>> monitor mode implement the nop.
>>

Clarified this a little better as since v3 the non-smc traps are
spinning rather than nopping.

>> As a KVM CPU cannot run in secure mode, do not do the board-setup if
>> not running TCG. Report a warning explaining the limitation is this
>> case.
>
> "in this case".

Fixed.

>
>> @@ -371,6 +410,16 @@ static void calxeda_init(MachineState *machine, enum 
>> cxmachines machine_id)
>>  highbank_binfo.loader_start = 0;
>>  highbank_binfo.write_secondary_boot = hb_write_secondary;
>>  highbank_binfo.secondary_cpu_reset_hook = hb_reset_secondary;
>> +if (tcg_enabled()) {
>
> This also needs to be !kvm_enabled(), so 'make check' works.
>

Fixed.

>> +highbank_binfo.board_setup_addr = BOARD_SETUP_ADDR;
>> +highbank_binfo.write_board_setup = hb_write_board_setup;
>> +highbank_binfo.secure_board_setup = true;
>> +} else {
>> +error_report("WARNING: TCG unavailable - "
>> + "unable to load built-in Monitor support.\n"
>> + "Some guests (such as Linux) may not boot\n");
>
> You can't have newlines in an error_report() string. I suggest
>
> error_report("WARNING: cannot load built-in Monitor support if KVM "
>  "is enabled. Some guests (such as Linux) may not boot.");
>

Taken verbatim, although I line wrapped earlier due to new guideline
to stay a little back from 80 chars if possible.

> Otherwise
> Reviewed-by: Peter Maydell 
>

Thanks.

Regards,
Peter

> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v6 25/33] nvdimm acpi: build ACPI nvdimm devices

2015-11-08 Thread Michael S. Tsirkin
On Fri, Oct 30, 2015 at 01:56:19PM +0800, Xiao Guangrong wrote:
> NVDIMM devices is defined in ACPI 6.0 9.20 NVDIMM Devices
> 
> There is a root device under \_SB and specified NVDIMM devices are under the
> root device. Each NVDIMM device has _ADR which returns its handle used to
> associate MEMDEV structure in NFIT
> 
> We reserve handle 0 for root device. In this patch, we save handle, handle,
> arg1 and arg2 to dsm memory. Arg3 is conditionally saved in later patch
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 184 
> +++
>  1 file changed, 184 insertions(+)
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index dd84e5f..53ed675 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -368,6 +368,15 @@ static void nvdimm_build_nfit(GSList *device_list, 
> GArray *table_offsets,
>  g_array_free(structures, true);
>  }
>  
> +struct NvdimmDsmIn {
> +uint32_t handle;
> +uint32_t revision;
> +uint32_t function;
> +   /* the remaining size in the page is used by arg3. */
> +uint8_t arg3[0];
> +} QEMU_PACKED;
> +typedef struct NvdimmDsmIn NvdimmDsmIn;
> +
>  static uint64_t
>  nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>  {
> @@ -377,6 +386,7 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>  static void
>  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
>  {
> +fprintf(stderr, "BUG: we never write DSM notification IO Port.\n");
>  }
>  
>  static const MemoryRegionOps nvdimm_dsm_ops = {
> @@ -402,6 +412,179 @@ void nvdimm_init_acpi_state(MemoryRegion *memory, 
> MemoryRegion *io,
>  memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, >io_mr);
>  }
>  
> +#define BUILD_STA_METHOD(_dev_, _method_)  \
> +do {   \
> +_method_ = aml_method("_STA", 0);  \
> +aml_append(_method_, aml_return(aml_int(0x0f)));   \
> +aml_append(_dev_, _method_);   \
> +} while (0)
> +
> +#define BUILD_DSM_METHOD(_dev_, _method_, _handle_, _uuid_)\
> +do {   \
> +Aml *ifctx, *uuid; \
> +_method_ = aml_method("_DSM", 4);  \
> +/* check UUID if it is we expect, return the errorcode if not.*/   \

check that UUID is what we expect?

> +uuid = aml_touuid(_uuid_); \
> +ifctx = aml_if(aml_lnot(aml_equal(aml_arg(0), uuid))); \
> +aml_append(ifctx, aml_return(aml_int(1 /* Not Supported */))); \
> +aml_append(method, ifctx); \
> +aml_append(method, aml_return(aml_call4("NCAL", aml_int(_handle_), \
> +   aml_arg(1), aml_arg(2), aml_arg(3;  \

So name NCAL here matches the name below.
Pls define a macro for it so we aren't limited
by silly 4-letter limitations of AML.
Same applies to all other names you use here and elsewhere.

> +aml_append(_dev_, _method_);   \
> +} while (0)
> +
> +#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_) \
> +aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
> +
> +#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_) \
> +BUILD_FIELD_UNIT_SIZE(_field_, sizeof(typeof_field(_s_, _f_)), _name_)
> +

why are these macros? Make them functions pls.

> +static void build_nvdimm_devices(GSList *device_list, Aml *root_dev)
> +{
> +for (; device_list; device_list = device_list->next) {
> +NVDIMMDevice *nvdimm = device_list->data;
> +int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +   NULL);
> +uint32_t handle = nvdimm_slot_to_handle(slot);
> +Aml *dev, *method;
> +
> +dev = aml_device("NV%02X", slot);
> +aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
> +
> +BUILD_STA_METHOD(dev, method);
> +
> +/*
> + * Chapter 4: _DSM Interface for NVDIMM Device (non-root) - Example
> + * in DSM Spec Rev1.
> + */
> +BUILD_DSM_METHOD(dev, method,
> + handle /* NVDIMM Device Handle */,
> + "4309AC30-0D11-11E4-9191-0800200C9A66"
> + /* UUID for NVDIMM Devices. */);
> +
> +aml_append(root_dev, dev);
> +}
> +}
> +
> +static void nvdimm_build_acpi_devices(GSList *device_list, Aml *sb_scope)
> +{
> +Aml *dev, *method, *field;
> +uint64_t page_size = TARGET_PAGE_SIZE;
> +
> +dev = aml_device("NVDR");
> +  

Re: [Qemu-devel] [PATCH v6 32/33] nvdimm acpi: support _FIT method

2015-11-08 Thread Michael S. Tsirkin
On Fri, Oct 30, 2015 at 01:56:26PM +0800, Xiao Guangrong wrote:
> FIT buffer is not completely mapped into guest address space, so a new
> function, Read FIT, function index 0x, is reserved by QEMU to
> read the piece of FIT buffer. The buffer is concatenated before _FIT
> return
> 
> Refer to docs/specs/acpi-nvdimm.txt for detailed design
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 168 
> +--
>  1 file changed, 164 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index f8d7d19..3f35220 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -384,6 +384,18 @@ static void nvdimm_build_nfit(GSList *device_list, 
> GArray *table_offsets,
>  g_array_free(structures, true);
>  }
>  
> +/*
> + * define UUID for NVDIMM Root Device according to Chapter 3 DSM Interface
> + * for NVDIMM Root Device - Example in DSM Spec Rev1.
> + */
> +#define NVDIMM_DSM_ROOT_UUID "2F10E7A4-9E91-11E4-89D3-123B93F75CBA"
> +
> +/*
> + * Read FIT Function, which is a QEMU internal use only function, more detail
> + * refer to docs/specs/acpi_nvdimm.txt
> + */
> +#define NVDIMM_DSM_FUNC_READ_FIT 0x
> +
>  /* define NVDIMM DSM return status codes according to DSM Spec Rev1. */
>  enum {
>  /* Common return status codes. */
> @@ -420,6 +432,11 @@ struct NvdimmFuncInSetLabelData {
>  } QEMU_PACKED;
>  typedef struct NvdimmFuncInSetLabelData NvdimmFuncInSetLabelData;
>  
> +struct NvdimmFuncInReadFit {
> +uint32_t offset; /* fit offset */
> +} QEMU_PACKED;
> +typedef struct NvdimmFuncInReadFit NvdimmFuncInReadFit;
> +
>  struct NvdimmDsmIn {
>  uint32_t handle;
>  uint32_t revision;
> @@ -429,6 +446,7 @@ struct NvdimmDsmIn {
>  uint8_t arg3[0];
>  NvdimmFuncInSetLabelData func_set_label_data;
>  NvdimmFuncInGetLabelData func_get_label_data;
> +NvdimmFuncInReadFit func_read_fit;
>  };
>  } QEMU_PACKED;
>  typedef struct NvdimmDsmIn NvdimmDsmIn;
> @@ -450,13 +468,71 @@ struct NvdimmFuncOutGetLabelData {
>  } QEMU_PACKED;
>  typedef struct NvdimmFuncOutGetLabelData NvdimmFuncOutGetLabelData;
>  
> +struct NvdimmFuncOutReadFit {
> +uint32_t status;/* return status code. */
> +uint32_t length;/* the length of fit data we read. */
> +uint8_t fit_data[0]; /* fit data. */
> +} QEMU_PACKED;
> +typedef struct NvdimmFuncOutReadFit NvdimmFuncOutReadFit;
> +
>  static void nvdimm_dsm_write_status(GArray *out, uint32_t status)
>  {
>  status = cpu_to_le32(status);
>  build_append_int_noprefix(out, status, sizeof(status));
>  }
>  
> -static void nvdimm_dsm_root(NvdimmDsmIn *in, GArray *out)
> +/* Build fit memory which is presented to guest via _FIT method. */
> +static void nvdimm_build_fit(AcpiNVDIMMState *state)
> +{
> +if (!state->fit) {
> +GSList *device_list = nvdimm_get_plugged_device_list();
> +
> +nvdimm_debug("Rebuild FIT...\n");
> +state->fit = nvdimm_build_device_structure(device_list);
> +g_slist_free(device_list);
> +}
> +}
> +
> +/* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
> +static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state,
> + NvdimmDsmIn *in, GArray *out)
> +{
> +NvdimmFuncInReadFit *read_fit = >func_read_fit;
> +NvdimmFuncOutReadFit fit_out;
> +uint32_t read_length = TARGET_PAGE_SIZE - sizeof(NvdimmFuncOutReadFit);
> +uint32_t status = NVDIMM_DSM_ROOT_DEV_STATUS_INVALID_PARAS;
> +
> +nvdimm_build_fit(state);
> +
> +le32_to_cpus(_fit->offset);
> +
> +nvdimm_debug("Read FIT offset %#x.\n", read_fit->offset);
> +
> +if (read_fit->offset > state->fit->len) {
> +nvdimm_debug("offset %#x is beyond fit size (%#x).\n",
> + read_fit->offset, state->fit->len);
> +goto exit;
> +}
> +
> +read_length = MIN(read_length, state->fit->len - read_fit->offset);
> +nvdimm_debug("read length %#x.\n", read_length);
> +
> +fit_out.status = cpu_to_le32(NVDIMM_DSM_STATUS_SUCCESS);
> +fit_out.length = cpu_to_le32(read_length);

Is array always empty at this point?
If yes, better assert this here to make sure guest can not
use unlimited memory.

> +g_array_append_vals(out, _out, sizeof(fit_out));
> +
> +if (read_length) {
> +g_array_append_vals(out, state->fit->data + read_fit->offset,
> +read_length);
> +}
> +return;
> +
> +exit:
> +nvdimm_dsm_write_status(out, status);
> +}
> +
> +static void nvdimm_dsm_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
> +GArray *out)
>  {
>  uint32_t status = NVDIMM_DSM_STATUS_NOT_SUPPORTED;
>  
> @@ -475,6 +551,10 @@ static void nvdimm_dsm_root(NvdimmDsmIn *in, GArray *out)
>  return;
>  }
>  
> +if (in->function == NVDIMM_DSM_FUNC_READ_FIT /* FIT Read */) {
> +return 

[Qemu-devel] [Bug 893208] Re: qemu on ARM hosts can't boot i386 image

2015-11-08 Thread Marina Kovalevna
Thanks for all the tips guys, I finally got it to work on my Rpi2.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/893208

Title:
  qemu on ARM hosts can't boot i386 image

Status in QEMU:
  New
Status in Linaro QEMU:
  New

Bug description:
  If you apply some workarounds for bug 870990, bug 883133 and bug
  883136 QEMU still cannot boot the i386
  debian_squeeze_i386_standard.qcow2 image from
  http://people.debian.org/~aurel32/qemu/i386/ -- grub starts to boot
  but something causes the system to reset just before display of the
  blue-background grub menu, so we go round in a loop forever. This
  image boots OK on i386 hosted qemu so this indicates some kind of ARM-
  host specific bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/893208/+subscriptions



[Qemu-devel] [PATCH 08/18] armv7m: fix RETTOBASE

2015-11-08 Thread Michael Davidsaver
The polarity is reversed, and it should include
internal exceptions.

Should be set when # of active exceptions <= 1.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 30e349e..3b10dee 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -432,16 +432,20 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 val = cpu->env.v7m.exception;
 /* VECTPENDING */
 val |= (cpu->env.v7m.pending << 12)&0x1ff;
-/* ISRPENDING and RETTOBASE */
+/* ISRPENDING - Set it any externel IRQ pending (vector>=16) */
 for (irq = 16; irq < s->num_irq; irq++) {
 if (s->vectors[irq].pending) {
 val |= (1 << 22);
 break;
 }
+}
+/* RETTOBASE - Set if no (other) handler is active */
+for (irq = 1; irq < s->num_irq; irq++) {
 if (irq != cpu->env.v7m.exception && s->vectors[irq].active) {
-val |= (1 << 11);
+val |= (1 << 11); /* some other handler is active */
 }
 }
+val ^= (1<<11); /* invert */
 /* PENDSTSET */
 if (s->vectors[ARMV7M_EXCP_SYSTICK].pending) {
 val |= (1 << 26);
@@ -454,6 +458,7 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 if (s->vectors[ARMV7M_EXCP_NMI].pending) {
 val |= (1 << 31);
 }
+/* ISRPREEMPT not implemented */
 return val;
 case 0xd08: /* Vector Table Offset.  */
 return cpu->env.v7m.vecbase;
@@ -588,10 +593,14 @@ static void nvic_writel(nvic_state *s, uint32_t offset, 
uint32_t value)
 qemu_irq_pulse(s->sysresetreq);
 }
 if (value & 2) {
-qemu_log_mask(LOG_UNIMP, "VECTCLRACTIVE unimplemented\n");
+qemu_log_mask(LOG_GUEST_ERROR,
+  "Setting VECTCLRACTIVE when not in DEBUG mode "
+  "is UNPREDICTABLE\n");
 }
 if (value & 1) {
-qemu_log_mask(LOG_UNIMP, "AIRCR system reset unimplemented\n");
+qemu_log_mask(LOG_GUEST_ERROR,
+  "Setting VECTRESET when not in DEBUG mode "
+  "is UNPREDICTABLE\n");
 }
 if (value & 0x700) {
 unsigned i;
-- 
2.1.4




[Qemu-devel] [PATCH 15/18] arm: gic: Remove references to NVIC

2015-11-08 Thread Michael Davidsaver
armv7m_nvic.c no longer relies on the GIC.
Remove REV_NVIC and conditionals which use it.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/arm_gic.c| 14 +++---
 hw/intc/arm_gic_common.c | 23 ---
 hw/intc/gic_internal.h   |  7 ++-
 3 files changed, 17 insertions(+), 27 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 8bad132..d543a93 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -184,7 +184,7 @@ static void gic_set_irq(void *opaque, int irq, int level)
 return;
 }
 
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 gic_set_irq_11mpcore(s, irq, level, cm, target);
 } else {
 gic_set_irq_generic(s, irq, level, cm, target);
@@ -335,7 +335,7 @@ uint32_t gic_acknowledge_irq(GICState *s, int cpu, 
MemTxAttrs attrs)
 return 1023;
 }
 
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 /* Clear pending flags for both level and edge triggered interrupts.
  * Level triggered IRQs will be reasserted once they become inactive.
  */
@@ -514,7 +514,7 @@ void gic_complete_irq(GICState *s, int cpu, int irq, 
MemTxAttrs attrs)
 return; /* No active IRQ.  */
 }
 
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 /* Mark level triggered interrupts as pending if they are still
raised.  */
 if (!GIC_TEST_EDGE_TRIGGER(irq) && GIC_TEST_ENABLED(irq, cm)
@@ -672,7 +672,7 @@ static uint32_t gic_dist_readb(void *opaque, hwaddr offset, 
MemTxAttrs attrs)
 } else if (offset < 0xf10) {
 goto bad_reg;
 } else if (offset < 0xf30) {
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 goto bad_reg;
 }
 
@@ -883,7 +883,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 if (irq < GIC_NR_SGIS)
 value |= 0xaa;
 for (i = 0; i < 4; i++) {
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 if (value & (1 << (i * 2))) {
 GIC_SET_MODEL(irq + i);
 } else {
@@ -901,7 +901,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 goto bad_reg;
 } else if (offset < 0xf20) {
 /* GICD_CPENDSGIRn */
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 goto bad_reg;
 }
 irq = (offset - 0xf10);
@@ -912,7 +912,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
 }
 } else if (offset < 0xf30) {
 /* GICD_SPENDSGIRn */
-if (s->revision == REV_11MPCORE || s->revision == REV_NVIC) {
+if (s->revision == REV_11MPCORE) {
 goto bad_reg;
 }
 irq = (offset - 0xf20);
diff --git a/hw/intc/arm_gic_common.c b/hw/intc/arm_gic_common.c
index 9c82b97..4987047 100644
--- a/hw/intc/arm_gic_common.c
+++ b/hw/intc/arm_gic_common.c
@@ -97,9 +97,7 @@ void gic_init_irqs_and_mmio(GICState *s, qemu_irq_handler 
handler,
  *  [N+32..N+63] PPIs for CPU 1
  *   ...
  */
-if (s->revision != REV_NVIC) {
-i += (GIC_INTERNAL * s->num_cpu);
-}
+i += (GIC_INTERNAL * s->num_cpu);
 qdev_init_gpio_in(DEVICE(s), handler, i);
 
 for (i = 0; i < s->num_cpu; i++) {
@@ -113,16 +111,12 @@ void gic_init_irqs_and_mmio(GICState *s, qemu_irq_handler 
handler,
 memory_region_init_io(>iomem, OBJECT(s), ops, s, "gic_dist", 0x1000);
 sysbus_init_mmio(sbd, >iomem);
 
-if (s->revision != REV_NVIC) {
-/* This is the main CPU interface "for this core". It is always
- * present because it is required by both software emulation and KVM.
- * NVIC is not handled here because its CPU interface is different,
- * neither it can use KVM.
- */
-memory_region_init_io(>cpuiomem[0], OBJECT(s), ops ? [1] : NULL,
-  s, "gic_cpu", s->revision == 2 ? 0x1000 : 0x100);
-sysbus_init_mmio(sbd, >cpuiomem[0]);
-}
+/* This is the main CPU interface "for this core". It is always
+ * present because it is required by both software emulation and KVM.
+ */
+memory_region_init_io(>cpuiomem[0], OBJECT(s), ops ? [1] : NULL,
+  s, "gic_cpu", s->revision == 2 ? 0x1000 : 0x100);
+sysbus_init_mmio(sbd, >cpuiomem[0]);
 }
 
 static void arm_gic_common_realize(DeviceState *dev, Error **errp)
@@ -154,7 +148,7 @@ static void arm_gic_common_realize(DeviceState *dev, Error 
**errp)
 }
 
 if (s->security_extn &&
-(s->revision == REV_11MPCORE || s->revision == REV_NVIC)) {
+(s->revision == REV_11MPCORE)) {
 error_setg(errp, 

[Qemu-devel] [PATCH 00/18] Fix exception handling and msr/mrs access

2015-11-08 Thread Michael Davidsaver
This series grew from a previous incorrect patch attempting to fix some 
incorrect behavior.  After spending some time going through the arch. ref. 
manual for v7-M I think I understand better how this should work and have made 
a number of changes which actually improve the situation.

These changes have not yet been cross checked against real hardware, and I 
therefore don't consider them mergeable.  It's gotten big enough though that 
I'd like to get some feedback.

I think the changes in this series effect only ARMv7-M specific code with the 
exception of removing references to NVIC from the GIC code.

* Add unprivileged access case for MRS/MSR instructions
* Priority based exception masking with PRIMASK, FAULTMASK, and BASEPRI.
* Auto-clear FAULTMASK on exception return (except NMI)
* Validation and consistency checking on exception return
* Exception priorities using PRIGROUP
* Exception escalation to HardFault when priority permits
* Escalation to unrecoverable exception otherwise (though the action is not 
correct, see below)
* Correct calculation of the RETTOBASE field of ICSR
* Remove the need for the armv7m.hack MemoryRegion to catch exception returns
* Fill in previously unimplemented HFSR, CFSR, and CCR registers

This series removes the dependence of the NVIC code on the GIC.  The GIC 
doesn't have the concept of PRIGROUP to change the size of the group priority 
field.  Also, there are a lot of cases in this code which I don't understand 
and worry about breaking.  Now that I have things working (I think), I could 
look at recombining them if this is desired.

Some additional state is also added to v7m in struct CPUARMState so that all 
the information needed
in arm_v7m_cpu_exec_interrupt() is found in one place.  I started by having 
this state split between CPU and struct nvic_state, but found this confusing.  
Some guidance would be helpful.

I add a pointer to ARMCPU* in struct nvic_state which is populated in 
armv7m_nvic_realize().  I think this is reasonable given the tight coupling 
between NVIC and CPU, but it does look ugly.

At the moment I've left the action of an unrecoverable exception to call 
cpu_abort().  I'm not sure of the value of implementing the actual defined 
behavior in the context of QEMU.

I've tried to add VMState as appropriate, but have not tested it.

I looked briefly at qtest, but can't quite see how to use it given the need to 
execute code to test most of the exception behavior.  Is something like this 
feasible at present?

Regards,
Michael


Michael Davidsaver (18):
  armv7m: MRS/MSR handle unprivileged access
  armv7m: Undo armv7m.hack
  armv7m: Complain about incorrect exception table entries.
  armv7m: Explicit error for bad vector table
  armv7m: expand NVIC state
  armv7m: new NVIC utility functions
  armv7m: Update NVIC registers
  armv7m: fix RETTOBASE
  armv7m: NVIC update vmstate
  armv7m: NVIC initialization
  armv7m: fix I and F flag handling
  armv7m: simpler/faster exception start
  armv7m: implement CFSR and HFSR
  armv7m: auto-clear FAULTMASK
  arm: gic: Remove references to NVIC
  armv7m: check exception return consistency
  armv7m: implement CCR
  armv7m: prevent unprivileged write to STIR

 hw/arm/armv7m.c  |   8 -
 hw/intc/arm_gic.c|  14 +-
 hw/intc/arm_gic_common.c |  23 +-
 hw/intc/armv7m_nvic.c| 777 ---
 hw/intc/gic_internal.h   |   7 +-
 target-arm/cpu.c |  44 +--
 target-arm/cpu.h |  35 ++-
 target-arm/helper.c  | 222 ++
 target-arm/machine.c |   7 +-
 9 files changed, 843 insertions(+), 294 deletions(-)

-- 
2.1.4




[Qemu-devel] [PATCH 01/18] armv7m: MRS/MSR handle unprivileged access

2015-11-08 Thread Michael Davidsaver
The MRS and MSR instruction handling isn't checking
the current permission level.

Signed-off-by: Michael Davidsaver 
---
 target-arm/helper.c | 79 +
 1 file changed, 37 insertions(+), 42 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 4ecae61..4408100 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -7365,23 +7365,32 @@ uint32_t HELPER(get_r13_banked)(CPUARMState *env, 
uint32_t mode)
 
 uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
 {
-ARMCPU *cpu = arm_env_get_cpu(env);
+uint32_t mask;
+unsigned el = arm_current_el(env);
+
+/* First handle registers which unprivileged can read */
+
+switch (reg) {
+case 0 ... 7: /* xPSR sub-fields */
+mask = 0;
+if ((reg&1) && el) {
+mask |= 0x01ff; /* IPSR (unpriv. reads as zero) */
+}
+if (!(reg&4)) {
+mask |= 0xf800; /* APSR */
+}
+/* EPSR reads as zero */
+return xpsr_read(env) & mask;
+break;
+case 20: /* CONTROL */
+return env->v7m.control;
+}
+
+if (el == 0) {
+return 0; /* unprivileged reads others as zero */
+}
 
 switch (reg) {
-case 0: /* APSR */
-return xpsr_read(env) & 0xf800;
-case 1: /* IAPSR */
-return xpsr_read(env) & 0xf80001ff;
-case 2: /* EAPSR */
-return xpsr_read(env) & 0xff00fc00;
-case 3: /* xPSR */
-return xpsr_read(env) & 0xff00fdff;
-case 5: /* IPSR */
-return xpsr_read(env) & 0x01ff;
-case 6: /* EPSR */
-return xpsr_read(env) & 0x0700fc00;
-case 7: /* IEPSR */
-return xpsr_read(env) & 0x0700edff;
 case 8: /* MSP */
 return env->v7m.current_sp ? env->v7m.other_sp : env->regs[13];
 case 9: /* PSP */
@@ -7393,40 +7402,26 @@ uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
 return env->v7m.basepri;
 case 19: /* FAULTMASK */
 return (env->daif & PSTATE_F) != 0;
-case 20: /* CONTROL */
-return env->v7m.control;
 default:
-/* ??? For debugging only.  */
-cpu_abort(CPU(cpu), "Unimplemented system register read (%d)\n", reg);
+qemu_log_mask(LOG_GUEST_ERROR, "Attempt to read unknown special"
+   " register %d\n", reg);
 return 0;
 }
 }
 
 void HELPER(v7m_msr)(CPUARMState *env, uint32_t reg, uint32_t val)
 {
-ARMCPU *cpu = arm_env_get_cpu(env);
+if (arm_current_el(env) == 0 && reg > 7) {
+/* only xPSR sub-fields may be written by unprivileged */
+return;
+}
 
 switch (reg) {
-case 0: /* APSR */
-xpsr_write(env, val, 0xf800);
-break;
-case 1: /* IAPSR */
-xpsr_write(env, val, 0xf800);
-break;
-case 2: /* EAPSR */
-xpsr_write(env, val, 0xfe00fc00);
-break;
-case 3: /* xPSR */
-xpsr_write(env, val, 0xfe00fc00);
-break;
-case 5: /* IPSR */
-/* IPSR bits are readonly.  */
-break;
-case 6: /* EPSR */
-xpsr_write(env, val, 0x0600fc00);
-break;
-case 7: /* IEPSR */
-xpsr_write(env, val, 0x0600fc00);
+case 0 ... 7: /* xPSR sub-fields */
+/* only APSR is actually writable */
+if (reg&4) {
+xpsr_write(env, val, 0xf800); /* APSR */
+}
 break;
 case 8: /* MSP */
 if (env->v7m.current_sp)
@@ -7467,8 +7462,8 @@ void HELPER(v7m_msr)(CPUARMState *env, uint32_t reg, 
uint32_t val)
 switch_v7m_sp(env, (val & 2) != 0);
 break;
 default:
-/* ??? For debugging only.  */
-cpu_abort(CPU(cpu), "Unimplemented system register write (%d)\n", reg);
+qemu_log_mask(LOG_GUEST_ERROR, "Attempt to write unknown special"
+   " register %d\n", reg);
 return;
 }
 }
-- 
2.1.4




Re: [Qemu-devel] [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image

2015-11-08 Thread Fam Zheng
On Fri, 11/06 19:36, Max Reitz wrote:
> > +next_sector = sector_num;
> > +next_chunk = sector_num / sectors_per_chunk;
> 
> @next_sector and @next_chunk set here...
> 
> >  hbitmap_next_sector = s->sector_num;
> > -sector_num = s->sector_num;
> > -sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
> > -end = s->bdev_length / BDRV_SECTOR_SIZE;
> >  
> >  /* Extend the QEMUIOVector to include all adjacent blocks that will
> >   * be copied in this operation.
> > @@ -198,14 +191,6 @@ static uint64_t coroutine_fn 
> > mirror_iteration(MirrorBlockJob *s)
> >  next_sector = sector_num;
> >  next_chunk = sector_num / sectors_per_chunk;
> 
> ...and here already. So the above seems superfluous, considering that
> they are not read in between.
> 
> (If you keep hbitmap_next_sector = s->sector_num; above the sector_num =
> ... block, that may reduce conflicts further)

Indeed, thanks for noticing this.

> > +case MIRROR_METHOD_DISCARD:
> > +return mirror_do_zero_or_discard(s, sector_num,
> > + contiguous_sectors,
> > + false);
> 
> s/false/true/?

Yes, thanks.

Fam




[Qemu-devel] [PATCH v3 7/9] block: Drop BlockDriver.bdrv_ioctl

2015-11-08 Thread Fam Zheng
Now the callback is not used any more, drop the field along with all
implementations in block drivers, which are iscsi and raw.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/iscsi.c | 33 -
 block/raw-posix.c |  8 
 block/raw_bsd.c   |  6 --
 include/block/block_int.h |  1 -
 4 files changed, 48 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index b3fa0a0..002d29b 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -845,38 +845,6 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 return >common;
 }
 
-static void ioctl_cb(void *opaque, int status)
-{
-int *p_status = opaque;
-*p_status = status;
-}
-
-static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-IscsiLun *iscsilun = bs->opaque;
-int status;
-
-switch (req) {
-case SG_GET_VERSION_NUM:
-*(int *)buf = 3;
-break;
-case SG_GET_SCSI_ID:
-((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
-break;
-case SG_IO:
-status = -EINPROGRESS;
-iscsi_aio_ioctl(bs, req, buf, ioctl_cb, );
-
-while (status == -EINPROGRESS) {
-aio_poll(iscsilun->aio_context, true);
-}
-
-return 0;
-default:
-return -1;
-}
-return 0;
-}
 #endif
 
 static int64_t
@@ -1807,7 +1775,6 @@ static BlockDriver bdrv_iscsi = {
 .bdrv_co_flush_to_disk = iscsi_co_flush,
 
 #ifdef __linux__
-.bdrv_ioctl   = iscsi_ioctl,
 .bdrv_aio_ioctl   = iscsi_aio_ioctl,
 #endif
 
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 918c756..aec9ec6 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -2175,12 +2175,6 @@ static int hdev_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 #if defined(__linux__)
-static int hdev_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-BDRVRawState *s = bs->opaque;
-
-return ioctl(s->fd, req, buf);
-}
 
 static BlockAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
 unsigned long int req, void *buf,
@@ -2338,7 +2332,6 @@ static BlockDriver bdrv_host_device = {
 
 /* generic scsi device */
 #ifdef __linux__
-.bdrv_ioctl = hdev_ioctl,
 .bdrv_aio_ioctl = hdev_aio_ioctl,
 #endif
 };
@@ -2471,7 +2464,6 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_lock_medium   = cdrom_lock_medium,
 
 /* generic scsi device */
-.bdrv_ioctl = hdev_ioctl,
 .bdrv_aio_ioctl = hdev_aio_ioctl,
 };
 #endif /* __linux__ */
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 0aded31..915d6fd 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -169,11 +169,6 @@ static void raw_lock_medium(BlockDriverState *bs, bool 
locked)
 bdrv_lock_medium(bs->file->bs, locked);
 }
 
-static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-return bdrv_ioctl(bs->file->bs, req, buf);
-}
-
 static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
  unsigned long int req, void *buf,
  BlockCompletionFunc *cb,
@@ -262,7 +257,6 @@ BlockDriver bdrv_raw = {
 .bdrv_media_changed   = _media_changed,
 .bdrv_eject   = _eject,
 .bdrv_lock_medium = _lock_medium,
-.bdrv_ioctl   = _ioctl,
 .bdrv_aio_ioctl   = _aio_ioctl,
 .create_opts  = _create_opts,
 .bdrv_has_zero_init   = _has_zero_init
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7db9900..550ce18 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -227,7 +227,6 @@ struct BlockDriver {
 void (*bdrv_lock_medium)(BlockDriverState *bs, bool locked);
 
 /* to control generic scsi devices */
-int (*bdrv_ioctl)(BlockDriverState *bs, unsigned long int req, void *buf);
 BlockAIOCB *(*bdrv_aio_ioctl)(BlockDriverState *bs,
 unsigned long int req, void *buf,
 BlockCompletionFunc *cb, void *opaque);
-- 
2.4.3




[Qemu-devel] [PATCH v3 8/9] block: Introduce BlockDriver.bdrv_drain callback

2015-11-08 Thread Fam Zheng
Drivers can have internal request sources that generate IO, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in block layer, we need to have a way to disable such
event sources.

Block drivers must implement the "bdrv_drain" callback if it has any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule QEMUBH in an
asynchronous callback.

Update the comments of bdrv_drain and bdrv_drained_begin accordingly.

Like bdrv_requests_pending(), we should consider all the children of bs.
Before, the while loop just works, as bdrv_requests_pending() already
tracks its children; now we mustn't miss the callback, so recurse down
explicitly.

Signed-off-by: Fam Zheng 
---
 block/io.c| 13 -
 include/block/block_int.h |  6 ++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 4ecb171..136849c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -238,7 +238,8 @@ bool bdrv_requests_pending(BlockDriverState *bs)
 }
 
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend block driver's internal I/O until next request arrives.
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
@@ -249,8 +250,18 @@ bool bdrv_requests_pending(BlockDriverState *bs)
  */
 void bdrv_drain(BlockDriverState *bs)
 {
+BdrvChild *child;
 bool busy = true;
 
+if (bs->drv && bs->drv->bdrv_drain) {
+bs->drv->bdrv_drain(bs);
+}
+QLIST_FOREACH(child, >children, next) {
+BlockDriverState *cbs = child->bs;
+if (cbs->drv && cbs->drv->bdrv_drain) {
+cbs->drv->bdrv_drain(bs);
+}
+}
 while (busy) {
 /* Keep iterating */
  bdrv_flush_io_queue(bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 550ce18..4a9f8ff 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -295,6 +295,12 @@ struct BlockDriver {
  */
 int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+/**
+ * Drain and stop any internal sources of requests in the driver, and
+ * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+ */
+void (*bdrv_drain)(BlockDriverState *bs);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.4.3




[Qemu-devel] [PATCH v3 6/9] block: Emulate bdrv_ioctl with bdrv_aio_ioctl and track both

2015-11-08 Thread Fam Zheng
Currently all drivers that support .bdrv_aio_ioctl also implement
.bdrv_ioctl redundantly.  To track ioctl requests in block layer it is
easier if we unify the two paths, because we'll need to run it in a
coroutine, as required by tracked_request_begin. While we're at it, use
.bdrv_aio_ioctl plus aio_poll() to emulate bdrv_ioctl().

Signed-off-by: Fam Zheng 
---
 block/io.c | 101 +++--
 1 file changed, 92 insertions(+), 9 deletions(-)

diff --git a/block/io.c b/block/io.c
index 324ae5a..4ecb171 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2528,26 +2528,109 @@ int bdrv_discard(BlockDriverState *bs, int64_t 
sector_num, int nb_sectors)
 return rwco.ret;
 }
 
+typedef struct {
+CoroutineIOCompletion *co;
+QEMUBH *bh;
+} BdrvIoctlCompletionData;
+
+static void bdrv_ioctl_bh_cb(void *opaque)
+{
+BdrvIoctlCompletionData *data = opaque;
+
+bdrv_co_io_em_complete(data->co, -ENOTSUP);
+qemu_bh_delete(data->bh);
+}
+
+static int bdrv_co_do_ioctl(BlockDriverState *bs, int req, void *buf)
+{
+BlockDriver *drv = bs->drv;
+BdrvTrackedRequest tracked_req;
+CoroutineIOCompletion co = {
+.coroutine = qemu_coroutine_self(),
+};
+BlockAIOCB *acb;
+
+tracked_request_begin(_req, bs, 0, 0, BDRV_TRACKED_IOCTL);
+if (!drv || !drv->bdrv_aio_ioctl) {
+co.ret = -ENOTSUP;
+goto out;
+}
+
+acb = drv->bdrv_aio_ioctl(bs, req, buf, bdrv_co_io_em_complete, );
+if (!acb) {
+BdrvIoctlCompletionData *data = g_new(BdrvIoctlCompletionData, 1);
+data->bh = aio_bh_new(bdrv_get_aio_context(bs),
+bdrv_ioctl_bh_cb, data);
+data->co = 
+qemu_bh_schedule(data->bh);
+}
+qemu_coroutine_yield();
+out:
+tracked_request_end(_req);
+return co.ret;
+}
+
+typedef struct {
+BlockDriverState *bs;
+int req;
+void *buf;
+int ret;
+} BdrvIoctlCoData;
+
+static void coroutine_fn bdrv_co_ioctl_entry(void *opaque)
+{
+BdrvIoctlCoData *data = opaque;
+data->ret = bdrv_co_do_ioctl(data->bs, data->req, data->buf);
+}
+
 /* needed for generic scsi interface */
-
 int bdrv_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
 {
-BlockDriver *drv = bs->drv;
+BdrvIoctlCoData data = {
+.bs = bs,
+.req = req,
+.buf = buf,
+.ret = -EINPROGRESS,
+};
 
-if (drv && drv->bdrv_ioctl)
-return drv->bdrv_ioctl(bs, req, buf);
-return -ENOTSUP;
+if (qemu_in_coroutine()) {
+/* Fast-path if already in coroutine context */
+bdrv_co_ioctl_entry();
+} else {
+Coroutine *co = qemu_coroutine_create(bdrv_co_ioctl_entry);
+qemu_coroutine_enter(co, );
+}
+while (data.ret == -EINPROGRESS) {
+aio_poll(bdrv_get_aio_context(bs), true);
+}
+return data.ret;
+}
+
+static void coroutine_fn bdrv_co_aio_ioctl_entry(void *opaque)
+{
+BlockAIOCBCoroutine *acb = opaque;
+acb->req.error = bdrv_co_do_ioctl(acb->common.bs,
+  acb->req.req, acb->req.buf);
+bdrv_co_complete(acb);
 }
 
 BlockAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
 unsigned long int req, void *buf,
 BlockCompletionFunc *cb, void *opaque)
 {
-BlockDriver *drv = bs->drv;
+BlockAIOCBCoroutine *acb = qemu_aio_get(_em_co_aiocb_info,
+bs, cb, opaque);
+Coroutine *co;
 
-if (drv && drv->bdrv_aio_ioctl)
-return drv->bdrv_aio_ioctl(bs, req, buf, cb, opaque);
-return NULL;
+acb->need_bh = true;
+acb->req.error = -EINPROGRESS;
+acb->req.req = req;
+acb->req.buf = buf;
+co = qemu_coroutine_create(bdrv_co_aio_ioctl_entry);
+qemu_coroutine_enter(co, acb);
+
+bdrv_co_maybe_schedule_bh(acb);
+return >common;
 }
 
 void *qemu_blockalign(BlockDriverState *bs, size_t size)
-- 
2.4.3




[Qemu-devel] [PATCH v3 5/9] block: Add ioctl parameter fields to BlockRequest

2015-11-08 Thread Fam Zheng
The two fields that will be used by ioctl handling code later are added
as union, because it's used exclusively by ioctl code which dosn't need
the four fields in the other struct of the union.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 610db92..c8b40b7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -335,10 +335,18 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb);
 
 typedef struct BlockRequest {
 /* Fields to be filled by multiwrite caller */
-int64_t sector;
-int nb_sectors;
-int flags;
-QEMUIOVector *qiov;
+union {
+struct {
+int64_t sector;
+int nb_sectors;
+int flags;
+QEMUIOVector *qiov;
+};
+struct {
+int req;
+void *buf;
+};
+};
 BlockCompletionFunc *cb;
 void *opaque;
 
-- 
2.4.3




Re: [Qemu-devel] [PULL 12/22] vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE

2015-11-08 Thread Yuanhan Liu
On Fri, Nov 06, 2015 at 10:01:58AM +, Peter Maydell wrote:
> On 6 November 2015 at 01:34, Yuanhan Liu  wrote:
> > On Thu, Nov 05, 2015 at 11:42:15AM +, Peter Maydell wrote:
> >> On 3 October 2015 at 17:33, Michael S. Tsirkin  wrote:
> >> > On Fri, Oct 02, 2015 at 06:18:51PM +0200, Paolo Bonzini wrote:
> >> >>
> >> >>
> >> >> On 24/09/2015 15:20, Michael S. Tsirkin wrote:
> >> >> > From: Yuanhan Liu 
> >> >> >
> >> >> > Quote from Michael:
> >> >> >
> >> >> > We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.
> >> >>
> >> >> Where is the corresponding Linux patch for this?
> >> >>
> >> >> I would like to fetch the updated headers for KVM, and this is breaking
> >> >> it.  In fact, a patch that just renames the #define (without providing
> >> >> the old name for backwards compatibility) would be NACKed in upstream 
> >> >> Linux.
> >> >>
> >> >> Paolo
> >> >
> >> > Right. And it turns out this whole approach is wrong.  I intend to
> >> > revert this patch, and also drop the patch sending VHOST_RESET_OWNER on
> >> > device stop.
> >>
> >> This revert doesn't seem to have happened, I think, which means
> >> that this is one of the things which prevents a clean header-update
> >> against kvm/next. Could we get this fixed for rc0, please?
> >
> > My bad. I will fix it next week. What's the deadline for rc0 then?
> 
> rc0 is 12th November (http://wiki.qemu.org/Planning/2.5). You need
> to also allow time for the patch to be reviewed and possibly taken
> via somebody's tree.

Got it and thanks for the remind.

--yliu




Re: [Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop is hanging forever

2015-11-08 Thread Alexandre DERUMIER
Also, 

this occur only with rbd_cache=false or qemu drive cache=none.


If I use rbd_cache=true or qemu drive cache=writeback, I don't have this bug.


- Mail original -
De: "aderumier" 
À: "ceph-devel" , "qemu-devel" 

Envoyé: Lundi 9 Novembre 2015 04:23:10
Objet: Re: qemu : rbd block driver internal snapshot and vm_stop is hanging 
forever

Some other infos: 

I can reproduce it too with manual snapshot with rbd command 


#rbd --image myrbdvolume snap create --snap snap1 

qemu monitor: 

#stop 


This is with ceph hammer 0.94.5. 


in qemu vm_stop, the only thing related to block driver are 

bdrv_drain_all(); 
ret = bdrv_flush_all(); 


- Mail original - 
De: "aderumier"  
À: "ceph-devel" , "qemu-devel" 
 
Envoyé: Lundi 9 Novembre 2015 04:10:45 
Objet: qemu : rbd block driver internal snapshot and vm_stop is hanging forever 

Hi, 

with qemu (2.4.1), if I do an internal snapshot of an rbd device, 
then I pause the vm with vm_stop, 

the qemu process is hanging forever 


monitor commands to reproduce: 


# snapshot_blkdev_internal drive-virtio0 yoursnapname 
# stop 




I don't see this with qcow2 or sheepdog block driver for example. 


Regards, 

Alexandre 
-- 
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
the body of a message to majord...@vger.kernel.org 
More majordomo info at http://vger.kernel.org/majordomo-info.html 

-- 
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
the body of a message to majord...@vger.kernel.org 
More majordomo info at http://vger.kernel.org/majordomo-info.html 




Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)

2015-11-08 Thread Alexey Kardashevskiy

On 11/05/2015 10:06 AM, Sukadev Bhattiprolu wrote:

Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm()
call in qemu. This call returns the processor module (socket), chip and core
information as specified in section 7.3.16.18 of PAPR v2.7.

We walk the /proc/device-tree to determine the number of chips, cores and
modules in the _host_ system and return this info to the guest application
that makes the rtas_get_sysparm() call.

We currently hard code the number of module_types to 1, but we should determine
that dynamically somehow later.

Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk.


"iy" is missing :)




Signed-off-by: Sukadev Bhattiprolu 
---
Changelog[v2]:
 [Alexey Kardashevsk] Use existing interfaces like ldl_be_p(),
 stw_be_phys(), g_hash_table_new_full(), error_report() rather
 than re-inventing; fix indentation, function prottypes etc;
 Drop the fts() interface (which doesn't seem to be available
 on mingw32/mingw64) and use opendir() to walk specific
 directories in the directory tree.
---
  hw/ppc/Makefile.objs|   1 +
  hw/ppc/spapr_rtas.c |  35 +++
  hw/ppc/spapr_rtas_modinfo.c | 230 
  hw/ppc/spapr_rtas_modinfo.h |  12 +++
  include/hw/ppc/spapr.h  |   1 +
  5 files changed, 279 insertions(+)
  create mode 100644 hw/ppc/spapr_rtas_modinfo.c
  create mode 100644 hw/ppc/spapr_rtas_modinfo.h

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index c1ffc77..57c6b02 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o
  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
+obj-$(CONFIG_PSERIES) += spapr_rtas_modinfo.o
  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
  obj-y += spapr_pci_vfio.o
  endif
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 34b12a3..41fd8a6 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -34,6 +34,8 @@
  #include "hw/ppc/spapr.h"
  #include "hw/ppc/spapr_vio.h"
  #include "qapi-event.h"
+
+#include "spapr_rtas_modinfo.h"
  #include "hw/boards.h"

  #include 
@@ -240,6 +242,39 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
  target_ulong ret = RTAS_OUT_SUCCESS;

  switch (parameter) {
+case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: {
+int i;
+int offset = 0;
+int size;


Nit - could be one line.



+struct rtas_module_info modinfo;
+
+if (rtas_get_module_info()) {
+break;



@ret will be still 0 in this case, set @ret to RTAS_OUT_HW_ERROR here.

Also, rtas_ibm_set_system_parameter() must return RTAS_OUT_NOT_AUTHORIZED 
on RTAS_SYSPARM_PROCESSOR_MODULE_INFO, as PAPR says.




+}
+
+size = sizeof(modinfo);
+size += (modinfo.module_types - 1) * sizeof(struct rtas_socket_info);
+
+stw_be_phys(_space_memory, buffer+offset, size);
+offset += 2;
+
+stw_be_phys(_space_memory, buffer+offset, 
modinfo.module_types);
+offset += 2;
+
+for (i = 0; i < modinfo.module_types; i++) {
+stw_be_phys(_space_memory, buffer+offset,
+modinfo.si[i].sockets);



checkpatch.pl does not warn on this but new lines start under opening brace 
in the previous line.


In terms on vim, it would be:
set expandtab
set tabstop=4
set shiftwidth=4
set cino=:0,(0




+offset += 2;
+stw_be_phys(_space_memory, buffer+offset,
+modinfo.si[i].chips);
+offset += 2;
+stw_be_phys(_space_memory, buffer+offset,
+modinfo.si[i].cores_per_chip);
+offset += 2;
+}
+break;
+}
+
  case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
  char *param_val = g_strdup_printf("MaxEntCap=%d,"
"DesMem=%llu,"
diff --git a/hw/ppc/spapr_rtas_modinfo.c b/hw/ppc/spapr_rtas_modinfo.c
new file mode 100644
index 000..068fc2c
--- /dev/null
+++ b/hw/ppc/spapr_rtas_modinfo.c
@@ -0,0 +1,230 @@
+/*
+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System Emulator
+ *
+ * Hypercall based emulated RTAS



This is a description of hw/ppc/spapr_rtas.c, not of the new file.

Not sure the new code deserves a separate file, I'd either:
- add it into hw/ppc/spapr_rtas.c OR
- move rtas_ibm_get_system_parameter() + rtas_ibm_set_system_parameter() to 
a separate file (let's call it  hw/ppc/spapr_rtas_syspar.c) and add this 
new parameter there as there will be new parameters in the future anyway.


But I'll leave to the maintainer (David, hi :) ).




+ *
+ * Copyright (c) 2015 Sukadev Bhattiprolu, IBM 

[Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop is hanging forever

2015-11-08 Thread Alexandre DERUMIER
Hi,

with qemu (2.4.1), if I do an internal snapshot of an rbd device,
then I pause the vm with vm_stop,

the qemu process is hanging forever


monitor commands to reproduce:


# snapshot_blkdev_internal drive-virtio0 yoursnapname
# stop




I don't see this with qcow2 or sheepdog block driver for example.


Regards,

Alexandre



[Qemu-devel] [PATCH v2] virtio-blk: trivial code optimization

2015-11-08 Thread arei.gonglei
From: Gonglei 

1. avoid possible superflous checking
2. make code more robustness

Signed-off-by: Gonglei 
---
v2: address Paolo's comments, thanks.
---
 hw/block/virtio-blk.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 093e475..21f8d72 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -404,24 +404,15 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
MultiReqBuffer *mrb)
 for (i = 0; i < mrb->num_reqs; i++) {
 VirtIOBlockReq *req = mrb->reqs[i];
 if (num_reqs > 0) {
-bool merge = true;
-
-/* merge would exceed maximum number of IOVs */
-if (niov + req->qiov.niov > IOV_MAX) {
-merge = false;
-}
-
-/* merge would exceed maximum transfer length of backend device */
-if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > max_xfer_len) 
{
-merge = false;
-}
-
-/* requests are not sequential */
-if (sector_num + nb_sectors != req->sector_num) {
-merge = false;
-}
-
-if (!merge) {
+/*
+ * NOTE: We cannot merge the requests in below situations:
+ * 1. requests are not sequential
+ * 2. merge would exceed maximum number of IOVs
+ * 3. merge would exceed maximum transfer length of backend device
+ */
+if (sector_num + nb_sectors != req->sector_num ||
+niov > IOV_MAX - req->qiov.niov ||
+req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > max_xfer_len) 
{
 submit_requests(blk, mrb, start, num_reqs, niov);
 num_reqs = 0;
 }
-- 
1.7.12.4





[Qemu-devel] [PATCH 04/18] armv7m: Explicit error for bad vector table

2015-11-08 Thread Michael Davidsaver
Give an explicit error and abort when a load
from VECBASE fails.  Otherwise would likely
jump to 0, which for v7-m holds the reset stack
pointer address.

Signed-off-by: Michael Davidsaver 
---
 target-arm/helper.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 4178400..1d7ac43 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5496,7 +5496,17 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 /* Clear IT bits */
 env->condexec_bits = 0;
 env->regs[14] = lr;
-addr = ldl_phys(cs->as, env->v7m.vecbase + env->v7m.exception * 4);
+{
+MemTxResult result;
+addr = address_space_ldl(cs->as,
+ env->v7m.vecbase + env->v7m.exception * 4,
+ MEMTXATTRS_UNSPECIFIED, );
+if (result != MEMTX_OK) {
+cpu_abort(cs, "Failed to read from exception vector table "
+  "entry %08x\n",
+  env->v7m.vecbase + env->v7m.exception * 4);
+}
+}
 env->regs[15] = addr & 0xfffe;
 env->thumb = addr & 1;
 if (!env->thumb) {
-- 
2.1.4




[Qemu-devel] [PATCH 03/18] armv7m: Complain about incorrect exception table entries.

2015-11-08 Thread Michael Davidsaver
For -M  These should always be thumb mode.
Log a message if this is seen.

Signed-off-by: Michael Davidsaver 
---
 target-arm/helper.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 4408100..4178400 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5396,7 +5396,8 @@ static void do_v7m_exception_exit(CPUARMState *env)
 qemu_log_mask(LOG_GUEST_ERROR,
   "M profile return from interrupt with misaligned "
   "PC is UNPREDICTABLE\n");
-/* Actual hardware seems to ignore the lsbit, and there are several
+/* The ARM calls for UsageFault when the T bit isn't set, but
+ * actual hardware seems to ignore the lsbit, and there are several
  * RTOSes out there which incorrectly assume the r15 in the stack
  * frame should be a Thumb-style "lsbit indicates ARM/Thumb" value.
  */
@@ -5498,6 +5499,12 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 addr = ldl_phys(cs->as, env->v7m.vecbase + env->v7m.exception * 4);
 env->regs[15] = addr & 0xfffe;
 env->thumb = addr & 1;
+if (!env->thumb) {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "M profile interrupt handler %d with incorrect "
+  "instruction mode in PC is UNPREDICTABLE\n",
+  env->v7m.exception);
+}
 }
 
 /* Function used to synchronize QEMU's AArch64 register set with AArch32
-- 
2.1.4




[Qemu-devel] [PATCH 02/18] armv7m: Undo armv7m.hack

2015-11-08 Thread Michael Davidsaver
Add CPU unassigned access handler in place of special
MemoryRegion to catch exception returns.

Signed-off-by: Michael Davidsaver 
---
 hw/arm/armv7m.c  |  8 
 target-arm/cpu.c | 18 ++
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index a80d2ad..68146de 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -176,7 +176,6 @@ DeviceState *armv7m_init(MemoryRegion *system_memory, int 
mem_size, int num_irq,
 uint64_t entry;
 uint64_t lowaddr;
 int big_endian;
-MemoryRegion *hack = g_new(MemoryRegion, 1);
 
 if (cpu_model == NULL) {
cpu_model = "cortex-m3";
@@ -221,13 +220,6 @@ DeviceState *armv7m_init(MemoryRegion *system_memory, int 
mem_size, int num_irq,
 }
 }
 
-/* Hack to map an additional page of ram at the top of the address
-   space.  This stops qemu complaining about executing code outside RAM
-   when returning from an exception.  */
-memory_region_init_ram(hack, NULL, "armv7m.hack", 0x1000, _fatal);
-vmstate_register_ram_global(hack);
-memory_region_add_subregion(system_memory, 0xf000, hack);
-
 qemu_register_reset(armv7m_reset, cpu);
 return nvic;
 }
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 30739fc..be026bc 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -280,6 +280,23 @@ bool arm_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
 }
 
 #if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
+static void arm_v7m_unassigned_access(CPUState *cpu, hwaddr addr,
+  bool is_write, bool is_exec, int opaque,
+  unsigned size)
+{
+ARMCPU *arm = ARM_CPU(cpu);
+CPUARMState *env = >env;
+
+if (env->v7m.exception != 0 && addr >= 0xfff0 && !is_write) {
+cpu->exception_index = EXCP_EXCEPTION_EXIT;
+cpu_loop_exit(cpu);
+} else {
+/* TODO, signal some *Fault? */
+cpu_abort(cpu, "Trying to access outside RAM or ROM at 0x"
+  TARGET_FMT_plx "\n", addr);
+}
+}
+
 static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
 {
 CPUClass *cc = CPU_GET_CLASS(cs);
@@ -909,6 +926,7 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
 cc->do_interrupt = arm_v7m_cpu_do_interrupt;
 #endif
 
+cc->do_unassigned_access = arm_v7m_unassigned_access;
 cc->cpu_exec_interrupt = arm_v7m_cpu_exec_interrupt;
 }
 
-- 
2.1.4




[Qemu-devel] [PATCH 05/18] armv7m: expand NVIC state

2015-11-08 Thread Michael Davidsaver
Expand the NVIC to fully support -M priorities and masking.
Doesn't use GIC code.

Move some state to ARMCPU to allow calculation of exception masking.

Add storage for PRIGROUP to configure group/sub-group split.
Track group and sub-group in separate fields for quick comparison.
Mix in vector # with sub-group as per tie breaking rules.

NVIC now derives directly from SysBusDevice, and
struct NVICClass is eliminated.

Also add DPRINTF() macro.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 74 ++-
 target-arm/cpu.h  |  3 +++
 2 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 6fc167e..487a09a 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -13,43 +13,67 @@
 #include "hw/sysbus.h"
 #include "qemu/timer.h"
 #include "hw/arm/arm.h"
+#include "target-arm/cpu.h"
 #include "exec/address-spaces.h"
-#include "gic_internal.h"
 
-typedef struct {
-GICState gic;
+/*#define DEBUG_NVIC 0
+ */
+#ifdef DEBUG_NVIC
+#define DPRINTF(LVL, fmt, ...) \
+do { if ((LVL) <= DEBUG_NVIC) { \
+fprintf(stderr, "armv7m_nvic: " fmt , ## __VA_ARGS__); \
+} } while (0)
+#else
+#define DPRINTF(LVL, fmt, ...) do {} while (0)
+#endif
+
+/* the number of IRQ lines in addition to the 16 internal
+ * exception vectors.
+ */
+#define NVIC_MAX_IRQ 496
+
+#define NVIC_MAX_VECTORS 512
+
+struct vec_info {
+uint16_t prio_sub; /* sub-group priority*512 + exception# */
+int8_t prio_group; /* group priority [-2, 0x7f] */
+uint8_t raw_prio; /* value writen by guest */
+uint8_t enabled;
+uint8_t pending;
+uint8_t active;
+uint8_t level;
+/* exceptions <=15 never set level */
+};
+typedef struct vec_info vec_info;
+
+struct nvic_state {
+/*< private >*/
+SysBusDevice parent_obj;
+/*< public >*/
+
+ARMCPU *cpu; /* NVIC is so closely tied to the CPU, just keep a ref */
+
+vec_info vectors[NVIC_MAX_VECTORS];
+
+uint8_t prigroup;
+
 struct {
 uint32_t control;
 uint32_t reload;
 int64_t tick;
 QEMUTimer *timer;
 } systick;
-MemoryRegion sysregmem;
-MemoryRegion gic_iomem_alias;
-MemoryRegion container;
+
+MemoryRegion iomem; /* system control space and NVIC */
+
 uint32_t num_irq;
+qemu_irq excpout;
 qemu_irq sysresetreq;
-} nvic_state;
+};
+typedef struct nvic_state nvic_state;
 
 #define TYPE_NVIC "armv7m_nvic"
-/**
- * NVICClass:
- * @parent_reset: the parent class' reset handler.
- *
- * A model of the v7M NVIC and System Controller
- */
-typedef struct NVICClass {
-/*< private >*/
-ARMGICClass parent_class;
-/*< public >*/
-DeviceRealize parent_realize;
-void (*parent_reset)(DeviceState *dev);
-} NVICClass;
-
-#define NVIC_CLASS(klass) \
-OBJECT_CLASS_CHECK(NVICClass, (klass), TYPE_NVIC)
-#define NVIC_GET_CLASS(obj) \
-OBJECT_GET_CLASS(NVICClass, (obj), TYPE_NVIC)
+
 #define NVIC(obj) \
 OBJECT_CHECK(nvic_state, (obj), TYPE_NVIC)
 
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 815fef8..c193fbb 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -398,6 +398,9 @@ typedef struct CPUARMState {
 uint32_t control;
 int current_sp;
 int exception;
+int exception_prio;
+unsigned pending;
+int pending_prio;
 } v7m;
 
 /* Information associated with an exception about to be taken:
-- 
2.1.4




[Qemu-devel] [PATCH 18/18] armv7m: prevent unprivileged write to STIR

2015-11-08 Thread Michael Davidsaver
Prevent unprivileged from writing to the
Software Triggered Interrupt register

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index ca8c93c..b744cd5 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -654,7 +654,9 @@ static void nvic_writel(nvic_state *s, uint32_t offset, 
uint32_t value)
   "NVIC: fault status registers unimplemented\n");
 break;
 case 0xf00: /* Software Triggered Interrupt Register */
-if ((value & 0x1ff) < NVIC_MAX_IRQ) {
+/* STIR write allowed if privlaged or USERSETMPEND set */
+if ((arm_current_el(>env) || (cpu->env.v7m.ccr&2))
+&& ((value & 0x1ff) < NVIC_MAX_IRQ)) {
 armv7m_nvic_set_pending(s, (value&0x1ff)+16);
 }
 break;
-- 
2.1.4




[Qemu-devel] [PATCH 07/18] armv7m: Update NVIC registers

2015-11-08 Thread Michael Davidsaver
Replace use of GIC state/functions with new NVIC.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 233 --
 1 file changed, 168 insertions(+), 65 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index ebb4d4e..30e349e 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -394,13 +394,13 @@ void set_irq_level(void *opaque, int n, int level)
 
 static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 {
-ARMCPU *cpu;
+ARMCPU *cpu = s->cpu;
 uint32_t val;
 int irq;
 
 switch (offset) {
 case 4: /* Interrupt Control Type.  */
-return (s->num_irq / 32) - 1;
+return ((s->num_irq - 16) / 32) - 1;
 case 0x10: /* SysTick Control and Status.  */
 val = s->systick.control;
 s->systick.control &= ~SYSTICK_COUNTFLAG;
@@ -426,45 +426,39 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 case 0x1c: /* SysTick Calibration Value.  */
 return 1;
 case 0xd00: /* CPUID Base.  */
-cpu = ARM_CPU(current_cpu);
 return cpu->midr;
 case 0xd04: /* Interrupt Control State.  */
 /* VECTACTIVE */
-cpu = ARM_CPU(current_cpu);
 val = cpu->env.v7m.exception;
-if (val == 1023) {
-val = 0;
-} else if (val >= 32) {
-val -= 16;
-}
 /* VECTPENDING */
-if (s->gic.current_pending[0] != 1023)
-val |= (s->gic.current_pending[0] << 12);
+val |= (cpu->env.v7m.pending << 12)&0x1ff;
 /* ISRPENDING and RETTOBASE */
-for (irq = 32; irq < s->num_irq; irq++) {
-if (s->gic.irq_state[irq].pending) {
+for (irq = 16; irq < s->num_irq; irq++) {
+if (s->vectors[irq].pending) {
 val |= (1 << 22);
 break;
 }
-if (irq != cpu->env.v7m.exception && s->gic.irq_state[irq].active) 
{
+if (irq != cpu->env.v7m.exception && s->vectors[irq].active) {
 val |= (1 << 11);
 }
 }
 /* PENDSTSET */
-if (s->gic.irq_state[ARMV7M_EXCP_SYSTICK].pending)
+if (s->vectors[ARMV7M_EXCP_SYSTICK].pending) {
 val |= (1 << 26);
+}
 /* PENDSVSET */
-if (s->gic.irq_state[ARMV7M_EXCP_PENDSV].pending)
+if (s->vectors[ARMV7M_EXCP_PENDSV].pending) {
 val |= (1 << 28);
+}
 /* NMIPENDSET */
-if (s->gic.irq_state[ARMV7M_EXCP_NMI].pending)
+if (s->vectors[ARMV7M_EXCP_NMI].pending) {
 val |= (1 << 31);
+}
 return val;
 case 0xd08: /* Vector Table Offset.  */
-cpu = ARM_CPU(current_cpu);
 return cpu->env.v7m.vecbase;
 case 0xd0c: /* Application Interrupt/Reset Control.  */
-return 0xfa05;
+return 0xfa05 | (s->prigroup<<8);
 case 0xd10: /* System Control.  */
 /* TODO: Implement SLEEPONEXIT.  */
 return 0;
@@ -473,20 +467,20 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 return 0;
 case 0xd24: /* System Handler Status.  */
 val = 0;
-if (s->gic.irq_state[ARMV7M_EXCP_MEM].active) val |= (1 << 0);
-if (s->gic.irq_state[ARMV7M_EXCP_BUS].active) val |= (1 << 1);
-if (s->gic.irq_state[ARMV7M_EXCP_USAGE].active) val |= (1 << 3);
-if (s->gic.irq_state[ARMV7M_EXCP_SVC].active) val |= (1 << 7);
-if (s->gic.irq_state[ARMV7M_EXCP_DEBUG].active) val |= (1 << 8);
-if (s->gic.irq_state[ARMV7M_EXCP_PENDSV].active) val |= (1 << 10);
-if (s->gic.irq_state[ARMV7M_EXCP_SYSTICK].active) val |= (1 << 11);
-if (s->gic.irq_state[ARMV7M_EXCP_USAGE].pending) val |= (1 << 12);
-if (s->gic.irq_state[ARMV7M_EXCP_MEM].pending) val |= (1 << 13);
-if (s->gic.irq_state[ARMV7M_EXCP_BUS].pending) val |= (1 << 14);
-if (s->gic.irq_state[ARMV7M_EXCP_SVC].pending) val |= (1 << 15);
-if (s->gic.irq_state[ARMV7M_EXCP_MEM].enabled) val |= (1 << 16);
-if (s->gic.irq_state[ARMV7M_EXCP_BUS].enabled) val |= (1 << 17);
-if (s->gic.irq_state[ARMV7M_EXCP_USAGE].enabled) val |= (1 << 18);
+if (s->vectors[ARMV7M_EXCP_MEM].active) val |= (1 << 0);
+if (s->vectors[ARMV7M_EXCP_BUS].active) val |= (1 << 1);
+if (s->vectors[ARMV7M_EXCP_USAGE].active) val |= (1 << 3);
+if (s->vectors[ARMV7M_EXCP_SVC].active) val |= (1 << 7);
+if (s->vectors[ARMV7M_EXCP_DEBUG].active) val |= (1 << 8);
+if (s->vectors[ARMV7M_EXCP_PENDSV].active) val |= (1 << 10);
+if (s->vectors[ARMV7M_EXCP_SYSTICK].active) val |= (1 << 11);
+if (s->vectors[ARMV7M_EXCP_USAGE].pending) val |= (1 << 12);
+if (s->vectors[ARMV7M_EXCP_MEM].pending) val |= (1 << 13);
+if (s->vectors[ARMV7M_EXCP_BUS].pending) val |= (1 << 14);
+if (s->vectors[ARMV7M_EXCP_SVC].pending) val |= (1 

[Qemu-devel] [PATCH 11/18] armv7m: fix I and F flag handling

2015-11-08 Thread Michael Davidsaver
Despite having the same notation, these bits
have completely different meaning than -AR.

Add armv7m_excp_unmasked()
to calculate the currently runable exception priority
taking into account masks and active handlers.
Use this in conjunction with the pending exception
priority to determine if the pending exception
can interrupt execution.

Signed-off-by: Michael Davidsaver 
---
 target-arm/cpu.c | 26 +++---
 target-arm/cpu.h | 27 ++-
 2 files changed, 33 insertions(+), 20 deletions(-)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index be026bc..5d03117 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -173,6 +173,8 @@ static void arm_cpu_reset(CPUState *s)
 uint32_t initial_pc; /* Loaded from 0x4 */
 uint8_t *rom;
 
+env->v7m.exception_prio = env->v7m.pending_prio = 0x100;
+
 env->daif &= ~PSTATE_I;
 rom = rom_ptr(0);
 if (rom) {
@@ -301,29 +303,15 @@ static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
 {
 CPUClass *cc = CPU_GET_CLASS(cs);
 ARMCPU *cpu = ARM_CPU(cs);
-CPUARMState *env = >env;
 bool ret = false;
 
-
-if (interrupt_request & CPU_INTERRUPT_FIQ
-&& !(env->daif & PSTATE_F)) {
-cs->exception_index = EXCP_FIQ;
-cc->do_interrupt(cs);
-ret = true;
-}
-/* ARMv7-M interrupt return works by loading a magic value
- * into the PC.  On real hardware the load causes the
- * return to occur.  The qemu implementation performs the
- * jump normally, then does the exception return when the
- * CPU tries to execute code at the magic address.
- * This will cause the magic PC value to be pushed to
- * the stack if an interrupt occurred at the wrong time.
- * We avoid this by disabling interrupts when
- * pc contains a magic address.
+/* ARMv7-M interrupt masking works differently than -A or -R.
+ * There is no FIQ/IRQ distinction.
+ * Instead of masking interrupt sources, the I and F bits
+ * (along with basepri) mask certain exception priority levels.
  */
 if (interrupt_request & CPU_INTERRUPT_HARD
-&& !(env->daif & PSTATE_I)
-&& (env->regs[15] < 0xfff0)) {
+&& (armv7m_excp_unmasked(cpu) >= cpu->env.v7m.pending_prio)) {
 cs->exception_index = EXCP_IRQ;
 cc->do_interrupt(cs);
 ret = true;
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index c193fbb..29d89ce 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1033,9 +1033,34 @@ void arm_cpu_list(FILE *f, fprintf_function cpu_fprintf);
 uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t excp_idx,
  uint32_t cur_el, bool secure);
 
+/* @returns highest (numerically lowest) unmasked exception priority
+ */
+static inline
+int armv7m_excp_unmasked(ARMCPU *cpu)
+{
+CPUARMState *env = >env;
+int runnable;
+
+/* find highest (numerically lowest) priority which could
+ * run based on masks
+ */
+if (env->daif_F) { /* FAULTMASK */
+runnable = -2;
+} else if (env->daif_I) { /* PRIMASK */
+runnable = -1;
+} else if (env->v7m.basepri > 0) {
+/* BASEPRI==1 -> runnable==-1 (same as PRIMASK==1) */
+runnable = env->v7m.basepri-2;
+} else {
+runnable = 0x100; /* lower than any possible priority */
+}
+/* consider priority of active handler */
+return MIN(runnable, env->v7m.exception_prio-1);
+}
+
 /* Interface between CPU and Interrupt controller.  */
 void armv7m_nvic_set_pending(void *opaque, int irq);
-int armv7m_nvic_acknowledge_irq(void *opaque);
+void armv7m_nvic_acknowledge_irq(void *opaque);
 void armv7m_nvic_complete_irq(void *opaque, int irq);
 
 /* Interface for defining coprocessor registers.
-- 
2.1.4




[Qemu-devel] [PATCH 06/18] armv7m: new NVIC utility functions

2015-11-08 Thread Michael Davidsaver
Internal functions for operations previously done
by GIC internals.

nvic_irq_update() recalculates highest pending/active
exceptions.

armv7m_nvic_set_pending() include exception escalation
logic.

armv7m_nvic_acknowledge_irq() and nvic_irq_update()
update ARMCPU fields.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 250 +++---
 1 file changed, 235 insertions(+), 15 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 487a09a..ebb4d4e 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -140,36 +140,256 @@ static void systick_reset(nvic_state *s)
 timer_del(s->systick.timer);
 }
 
-/* The external routines use the hardware vector numbering, ie. the first
-   IRQ is #16.  The internal GIC routines use #32 as the first IRQ.  */
+/* caller must call nvic_irq_update() after this */
+static
+void set_prio(nvic_state *s, unsigned irq, uint8_t prio)
+{
+unsigned submask = (1<<(s->prigroup+1))-1;
+
+assert(irq > 3); /* only use for configurable prios */
+assert(irq < NVIC_MAX_VECTORS);
+
+s->vectors[irq].raw_prio = prio;
+s->vectors[irq].prio_group = (prio>>(s->prigroup+1));
+s->vectors[irq].prio_sub = irq + (prio)*NVIC_MAX_VECTORS;
+
+DPRINTF(0, "Set %u priority grp %d sub %u\n", irq,
+s->vectors[irq].prio_group, s->vectors[irq].prio_sub);
+}
+
+/* recompute highest pending */
+static
+void nvic_irq_update(nvic_state *s, int update_active)
+{
+unsigned i;
+int lvl;
+CPUARMState *env = >cpu->env;
+int16_t act_group = 0x100, pend_group = 0x100;
+uint16_t act_sub = 0, pend_sub = 0;
+uint16_t act_irq = 0, pend_irq = 0;
+
+/* find highest priority */
+for (i = 1; i < s->num_irq; i++) {
+vec_info *I = >vectors[i];
+
+DPRINTF(2, " VECT %d %d:%u\n", i, I->prio_group, I->prio_sub);
+
+if (I->active && ((I->prio_group < act_group)
+|| (I->prio_group == act_group && I->prio_sub < act_sub)))
+{
+act_group = I->prio_group;
+act_sub = I->prio_sub;
+act_irq = i;
+}
+
+if (I->enabled && I->pending && ((I->prio_group < pend_group)
+|| (I->prio_group == pend_group && I->prio_sub < pend_sub)))
+{
+pend_group = I->prio_group;
+pend_sub = I->prio_sub;
+pend_irq = i;
+}
+}
+
+env->v7m.pending = pend_irq;
+env->v7m.pending_prio = pend_group;
+
+if (update_active) {
+env->v7m.exception = act_irq;
+env->v7m.exception_prio = act_group;
+}
+
+/* Raise NVIC output even if pend_group is masked.
+ * This is necessary as we get no notification
+ * when PRIMASK et al. are changed.
+ * As long as our output is high cpu_exec() will call
+ * into arm_v7m_cpu_exec_interrupt() frequently, which
+ * then tests to see if the pending exception
+ * is permitted.
+ */
+lvl = pend_irq > 0;
+DPRINTF(1, "highest pending %d %d:%u\n", pend_irq, pend_group, pend_sub);
+DPRINTF(1, "highest active  %d %d:%u\n", act_irq, act_group, act_sub);
+
+DPRINTF(0, "IRQ %c highest act %d pend %d\n",
+lvl ? 'X' : '_', act_irq, pend_irq);
+
+qemu_set_irq(s->excpout, lvl);
+}
+
+static
+void armv7m_nvic_clear_pending(void *opaque, int irq)
+{
+nvic_state *s = (nvic_state *)opaque;
+vec_info *I;
+
+assert(irq >= 0);
+assert(irq < NVIC_MAX_VECTORS);
+
+I = >vectors[irq];
+if (I->pending) {
+I->pending = 0;
+nvic_irq_update(s, 0);
+}
+}
+
 void armv7m_nvic_set_pending(void *opaque, int irq)
 {
 nvic_state *s = (nvic_state *)opaque;
-if (irq >= 16)
-irq += 16;
-gic_set_pending_private(>gic, 0, irq);
+CPUARMState *env = >cpu->env;
+vec_info *I;
+int active = s->cpu->env.v7m.exception;
+
+assert(irq > 0);
+assert(irq < NVIC_MAX_VECTORS);
+
+I = >vectors[irq];
+
+if (irq < ARMV7M_EXCP_SYSTICK && irq != ARMV7M_EXCP_DEBUG) {
+int runnable = armv7m_excp_unmasked(s->cpu);
+/* test for exception escalation for vectors other than:
+ * Debug (12), SysTick (15), and all external IRQs (>=16)
+ */
+unsigned escalate = 0;
+if (I->active) {
+/* trying to pend an active fault (possibly nested).
+ * eg. UsageFault in UsageFault handler
+ */
+escalate = 1;
+DPRINTF(0, " Escalate, active\n");
+} else if (!I->enabled) {
+/* trying to pend a disabled fault
+ * eg. UsageFault while USGFAULTENA in SHCSR is clear.
+ */
+escalate = 1;
+DPRINTF(0, " Escalate, not enabled\n");
+} else if (I->prio_group > runnable) {
+/* trying to pend a fault which is not immediately
+ * runnable due to masking by PRIMASK, FAULTMASK, BASEPRI,
+ * or the 

[Qemu-devel] [PATCH 09/18] armv7m: NVIC update vmstate

2015-11-08 Thread Michael Davidsaver
Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 64 +--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 3b10dee..c860b36 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -810,15 +810,75 @@ static const MemoryRegionOps nvic_sysreg_ops = {
 .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
+static
+int nvic_post_load(void *opaque, int version_id)
+{
+nvic_state *s = opaque;
+unsigned i;
+
+/* evil hack to get ARMCPU* ahead of time */
+assert(cpus.tqh_first);
+assert(!CPU_NEXT(cpus.tqh_first));
+s->cpu = ARM_CPU(cpus.tqh_first);
+assert(s->cpu);
+
+/* recalculate priorities */
+for (i = 4; i < s->num_irq; i++) {
+set_prio(s, i, s->vectors[i].raw_prio);
+}
+
+nvic_irq_update(s, highest_runnable_prio(s->cpu));
+
+return 0;
+}
+
+static
+int vec_info_get(QEMUFile *f, void *pv, size_t size)
+{
+vec_info *I = pv;
+I->prio_sub = qemu_get_be16(f);
+I->prio_group = qemu_get_byte(f);
+I->raw_prio = qemu_get_ubyte(f);
+I->enabled = qemu_get_ubyte(f);
+I->pending = qemu_get_ubyte(f);
+I->active = qemu_get_ubyte(f);
+I->level = qemu_get_ubyte(f);
+return 0;
+}
+
+static
+void vec_info_put(QEMUFile *f, void *pv, size_t size)
+{
+vec_info *I = pv;
+qemu_put_be16(f, I->prio_sub);
+qemu_put_byte(f, I->prio_group);
+qemu_put_ubyte(f, I->raw_prio);
+qemu_put_ubyte(f, I->enabled);
+qemu_put_ubyte(f, I->pending);
+qemu_put_ubyte(f, I->active);
+qemu_put_ubyte(f, I->level);
+}
+
+static const VMStateInfo vmstate_info = {
+.name = "armv7m_nvic_info",
+.get = vec_info_get,
+.put = vec_info_put,
+};
+
 static const VMStateDescription vmstate_nvic = {
 .name = "armv7m_nvic",
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
+.post_load = _post_load,
 .fields = (VMStateField[]) {
+VMSTATE_ARRAY(vectors, nvic_state, NVIC_MAX_VECTORS, 0,
+  vmstate_info, vec_info),
+VMSTATE_UINT8(prigroup, nvic_state),
 VMSTATE_UINT32(systick.control, nvic_state),
 VMSTATE_UINT32(systick.reload, nvic_state),
 VMSTATE_INT64(systick.tick, nvic_state),
 VMSTATE_TIMER_PTR(systick.timer, nvic_state),
+VMSTATE_UINT32(num_irq, nvic_state),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
2.1.4




[Qemu-devel] [PATCH 12/18] armv7m: simpler/faster exception start

2015-11-08 Thread Michael Davidsaver
No need to bounce through EXCP_IRQ handling
for non-IRQ exceptions.  just update CPU
state directly.

Signed-off-by: Michael Davidsaver 
---
 target-arm/helper.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 1d7ac43..2541890 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5433,23 +5433,21 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 
 /* For exceptions we just mark as pending on the NVIC, and let that
handle it.  */
-/* TODO: Need to escalate if the current priority is higher than the
-   one we're raising.  */
 switch (cs->exception_index) {
 case EXCP_UDEF:
 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE);
-return;
+break;
 case EXCP_SWI:
 /* The PC already points to the next instruction.  */
 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_SVC);
-return;
+break;
 case EXCP_PREFETCH_ABORT:
 case EXCP_DATA_ABORT:
 /* TODO: if we implemented the MPU registers, this is where we
  * should set the MMFAR, etc from exception.fsr and exception.vaddress.
  */
 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_MEM);
-return;
+break;
 case EXCP_BKPT:
 if (semihosting_enabled()) {
 int nr;
@@ -5466,7 +5464,6 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_DEBUG);
 return;
 case EXCP_IRQ:
-env->v7m.exception = armv7m_nvic_acknowledge_irq(env->nvic);
 break;
 case EXCP_EXCEPTION_EXIT:
 do_v7m_exception_exit(env);
@@ -5476,6 +5473,10 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 return; /* Never happens.  Keep compiler happy.  */
 }
 
+armv7m_nvic_acknowledge_irq(env->nvic);
+
+qemu_log_mask(CPU_LOG_INT, "... as %d\n", env->v7m.exception);
+
 /* Align stack pointer.  */
 /* ??? Should only do this if Configuration Control Register
STACKALIGN bit is set.  */
-- 
2.1.4




[Qemu-devel] [PATCH v2] mirror: Improve zero-write and discard with fragmented image

2015-11-08 Thread Fam Zheng
The "pnum < nb_sectors" condition in deciding whether to actually copy
data is unnecessarily strict, and the qiov initialization is
unnecessarily too, for both bdrv_aio_write_zeroes and bdrv_aio_discard
branches.

Reorganize mirror_iteration flow so that we:

1) Find the contiguous zero/discarded sectors with
bdrv_get_block_status_above() before deciding what to do. We query
s->buf_size sized blocks at a time.

2) If the sectors in question are zeroed/discarded and aligned to
target cluster, issue zero write or discard accordingly. It's done
in mirror_do_zero_or_discard, where we don't add buffer to qiov.

3) Otherwise, do the same loop as before in mirror_do_read.

Signed-off-by: Fam Zheng 

---

v2: Address Max's comments:
Don't do superfluous assignment for next_sector and next_chunk;
Move assignment to hbitmap_next_sector above sector_num to reduce
conflict.
---
 block/mirror.c | 154 +
 1 file changed, 122 insertions(+), 32 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index b1252a1..7c8d090 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -157,23 +157,13 @@ static void mirror_read_complete(void *opaque, int ret)
 mirror_write_complete, op);
 }
 
-static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
+static uint64_t mirror_do_read(MirrorBlockJob *s)
 {
 BlockDriverState *source = s->common.bs;
-int nb_sectors, sectors_per_chunk, nb_chunks;
-int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
+int sectors_per_chunk, nb_sectors, nb_chunks;
+int64_t end, next_chunk, next_sector, hbitmap_next_sector, sector_num;
 uint64_t delay_ns = 0;
 MirrorOp *op;
-int pnum;
-int64_t ret;
-
-s->sector_num = hbitmap_iter_next(>hbi);
-if (s->sector_num < 0) {
-bdrv_dirty_iter_init(s->dirty_bitmap, >hbi);
-s->sector_num = hbitmap_iter_next(>hbi);
-trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
-assert(s->sector_num >= 0);
-}
 
 hbitmap_next_sector = s->sector_num;
 sector_num = s->sector_num;
@@ -198,14 +188,6 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 next_sector = sector_num;
 next_chunk = sector_num / sectors_per_chunk;
 
-/* Wait for I/O to this cluster (from a previous iteration) to be done.  */
-while (test_bit(next_chunk, s->in_flight_bitmap)) {
-trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
-s->waiting_for_io = true;
-qemu_coroutine_yield();
-s->waiting_for_io = false;
-}
-
 do {
 int added_sectors, added_chunks;
 
@@ -301,24 +283,132 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 s->sectors_in_flight += nb_sectors;
 trace_mirror_one_iteration(s, sector_num, nb_sectors);
 
-ret = bdrv_get_block_status_above(source, NULL, sector_num,
-  nb_sectors, );
-if (ret < 0 || pnum < nb_sectors ||
-(ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
-bdrv_aio_readv(source, sector_num, >qiov, nb_sectors,
-   mirror_read_complete, op);
-} else if (ret & BDRV_BLOCK_ZERO) {
+bdrv_aio_readv(source, sector_num, >qiov, nb_sectors,
+   mirror_read_complete, op);
+return delay_ns;
+}
+
+static uint64_t mirror_do_zero_or_discard(MirrorBlockJob *s,
+  int64_t sector_num,
+  int nb_sectors,
+  bool is_discard)
+{
+int sectors_per_chunk, nb_chunks;
+int64_t next_chunk, next_sector, hbitmap_next_sector;
+uint64_t delay_ns = 0;
+MirrorOp *op;
+
+sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
+assert(nb_sectors >= sectors_per_chunk);
+next_chunk = sector_num / sectors_per_chunk;
+nb_chunks = DIV_ROUND_UP(nb_sectors, sectors_per_chunk);
+bitmap_set(s->in_flight_bitmap, next_chunk, nb_chunks);
+delay_ns = ratelimit_calculate_delay(>limit, nb_sectors);
+
+/* Allocate a MirrorOp that is used as an AIO callback. The qiov is zeroed
+ * out so the freeing in iteration is nop. */
+op = g_new0(MirrorOp, 1);
+op->s = s;
+op->sector_num = sector_num;
+op->nb_sectors = nb_sectors;
+
+/* Advance the HBitmapIter in parallel, so that we do not examine
+ * the same sector twice.
+ */
+hbitmap_next_sector = sector_num;
+next_sector = sector_num + nb_sectors;
+while (next_sector > hbitmap_next_sector) {
+hbitmap_next_sector = hbitmap_iter_next(>hbi);
+if (hbitmap_next_sector < 0) {
+break;
+}
+}
+
+bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num, nb_sectors);
+s->in_flight++;
+s->sectors_in_flight += nb_sectors;
+if (is_discard) {
+

Re: [Qemu-devel] assert during internal snapshot

2015-11-08 Thread Li, Liang Z
> -Original Message-
> From: Denis V. Lunev [mailto:d...@openvz.org]
> Sent: Saturday, November 07, 2015 11:20 PM
> To: Li, Liang Z; Paolo Bonzini; Juan Quintela; Amit Shah
> Cc: QEMU
> Subject: assert during internal snapshot
> 
> Hello, All!
> 
> This commit
> 
> commit 94f5a43704129ca4995aa3385303c5ae225bde42
> Author: Liang Li 
> Date:   Mon Nov 2 15:37:00 2015 +0800
> 
>  migration: defer migration_end & blk_mig_cleanup
> 
>  Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
>  lazy collapsing of small sptes into large sptes mechanism, now
>  migration_end() is a time consuming operation because it calls
>  memroy_global_dirty_log_stop(), which will trigger the dropping of small
>  sptes operation and takes about dozens of milliseconds, so call
>  migration_end() before all the vmsate data has already been transferred
>  to the destination will prolong VM downtime. This operation should be
>  deferred after all the data has been transferred to the destination.
> 
>  blk_mig_cleanup() can be deferred too.
> 
>  For a VM with 8G RAM, this patch can reduce the VM downtime about
> 30 ms.
> 
>  Signed-off-by: Liang Li 
>  Reviewed-by: Paolo Bonzini 
>  Reviewed-by: Juan Quintela al3
>  Reviewed-by: Amit Shah al3
>  Signed-off-by: Juan Quintela al3
> 
> introduces the following regression
> 
> (gdb) bt
> #0  0x7fd5d314a267 in __GI_raise (sig=sig@entry=6)
>  at ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x7fd5d314beca in __GI_abort () at abort.c:89
> #2  0x7fd5d314303d in __assert_fail_base (
>  fmt=0x7fd5d32a5028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
>  assertion=assertion@entry=0x557288ed5b69 "i != mr->ioeventfd_nb",
>  file=file@entry=0x557288ed5a36 "/home/den/src/qemu/memory.c",
>  line=line@entry=1731,
>  function=function@entry=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:92
> #3  0x7fd5d31430f2 in __GI___assert_fail (
>  assertion=0x557288ed5b69 "i != mr->ioeventfd_nb",
>  file=0x557288ed5a36 "/home/den/src/qemu/memory.c", line=1731,
>  function=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:101
> #4  0x557288b108fa in memory_region_del_eventfd
> (mr=0x55728ad83700,
>  addr=16, size=2, match_data=true, data=0, e=0x55728b21ff40)
>  at /home/den/src/qemu/memory.c:1731
> #5  0x557288d9fc18 in virtio_pci_set_host_notifier_internal (
>  proxy=0x55728ad82e80, n=0, assign=false, set_handler=false)
>  at hw/virtio/virtio-pci.c:178
> #6  0x557288da19a9 in virtio_pci_set_host_notifier (d=0x55728ad82e80,
> n=0,
>  assign=false) at hw/virtio/virtio-pci.c:984
> #7  0x557288b523df in virtio_scsi_dataplane_start (s=0x55728ad8afa0)
>  at /home/den/src/qemu/hw/scsi/virtio-scsi-dataplane.c:268
> #8  0x557288b50210 in virtio_scsi_handle_cmd (vdev=0x55728ad8afa0,
>  vq=0x55728b21ffc0) at /home/den/src/qemu/hw/scsi/virtio-scsi.c:574
> #9  0x557288b65cb7 in virtio_queue_notify_vq (vq=0x55728b21ffc0)
>  at /home/den/src/qemu/hw/virtio/virtio.c:966
> #10 0x557288b67bbf in virtio_queue_host_notifier_read
> (n=0x55728b220010)
>  at /home/den/src/qemu/hw/virtio/virtio.c:1643
> #11 0x557288e12a2b in aio_dispatch (ctx=0x55728acaeab0) at
> aio-posix.c:160
> #12 0x557288e03194 in aio_ctx_dispatch (source=0x55728acaeab0,
>  callback=0x0, user_data=0x0) at async.c:226
> #13 0x7fd5d409fff7 in g_main_context_dispatch ()
> from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> ---Type  to continue, or q  to quit---
> #14 0x557288e1110d in glib_pollfds_poll () at main-loop.c:211
> #15 0x557288e111e8 in os_host_main_loop_wait (timeout=0) at
> main-loop.c:256
> #16 0x557288e11295 in main_loop_wait (nonblocking=0) at main-
> loop.c:504
> #17 0x557288c1c31c in main_loop () at vl.c:1890
> #18 0x557288c23dec in main (argc=105, argv=0x7ffca9a6fa08,
>  envp=0x7ffca9a6fd58) at vl.c:4644
> (gdb)
> 
> during 'virsh create-snapshot' operation over alive VM.
> It happens 100% of time when VM is run using the following command line:
> 
>   7498 ?tl 0:37 qemu-system-x86_64 -enable-kvm -name rhel7
> -S -machine pc-i440fx-2.2,accel=kvm,usb=off -cpu SandyBridge -m 1024 -
> realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object
> iothread,id=iothread1 -uuid 456af4d3-5d67-41c6-a229-c55ded6098e9
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-
> shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot
> strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7
> 

Re: [Qemu-devel] [PATCH v3 2/3] target-i386: calculate vcpu's TSC rate to be migrated

2015-11-08 Thread haozhong . zhang
On 11/06/15 13:12, Eduardo Habkost wrote:
> On Fri, Nov 06, 2015 at 10:32:24AM +0800, haozhong.zh...@intel.com wrote:
> > On 11/05/15 14:05, Eduardo Habkost wrote:
> > > On Thu, Nov 05, 2015 at 09:30:51AM +0800, Haozhong Zhang wrote:
> > > > On 11/04/15 19:42, Eduardo Habkost wrote:
> [...]
> > > > > > +env->tsc_khz_saved = r;
> > > > > > +}
> > > > > 
> > > > > Why do you need a separate tsc_khz_saved field, and don't simply use
> > > > > tsc_khz? It would have the additional feature of letting QMP clients
> > > > > query the current TSC rate by asking for the tsc-freq property on CPU
> > > > > objects.
> > > > >
> > > > 
> > > > It's to avoid overriding env->tsc_khz on the destination in the
> > > > migration. I can change this line to
> > > >  env->tsc_khz = env->tsc_khz_saved = r;
> > > 
> > > You are already avoiding overriding env->tsc_khz, because you use
> > > KVM_GET_TSC_KHZ only if tsc_khz is not set yet. I still don't see why
> > > you need a tsc_khz_saved field that requires duplicating the SET_TSC_KHZ
> > > code, if you could just do this:
> > > 
> > > if (!env->tsc_khz) {
> > > env->tsc_khz = kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ);
> > > }
> > >
> > 
> > Consider an example that we migrate a VM from machine A to machine B
> > and then to machine C, and QEMU on machine B is launched with the cpu
> > option 'tsc-freq' (i.e. env->tsc_khz on B is non-zero at the
> > beginning):
> >  1) In the migration from B to C, the user-specified TSC frequency by
> > 'tsc-freq' on B is expected to be migrated to C. That is, the
> > value of env->tsc_khz on B is migrated.
> >  2) If TSC frequency is migrated through env->tsc_khz, then
> > env->tsc_khz on B will be overrode in the migration from A to B
> > before kvm_arch_setup_tsc_khz(). If the guest TSC frequency is
> > different than the user-specified TSC frequency on B, the
> > expectation in 1) will not be satisfied anymore.
> 
> Setting tsc-freq on B when tsc-freq was not used on A is invalid usage.
> This is not different from changing the CPU model and adding or removing
> CPU flags when migrating, which is also incorrect. The command-line
> parameters defining the VM must be the same when you migrate.
>

Good to know it's an invalid usage. Then the question is what QEMU is
expected to do for this invalid usage?

 1) Abort the migration? But I find that the current QEMU does not
abort the migration between different CPU models (e.g. Nehalem and
Haswell).

 2) Or do not abort the migration and ignore tsc-freq option? If so,
tsc_khz_saved will be not needed.

Haozhong

> -- 
> Eduardo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Qemu-devel] [PATCH 16/18] armv7m: check exception return consistency

2015-11-08 Thread Michael Davidsaver
Detect use of reserved exception return codes
and return to thread mode from nested
exception handler.

Also check consistency between NVIC and CPU
wrt. the active exception.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c |  7 +++-
 target-arm/cpu.h  |  2 +-
 target-arm/helper.c   | 97 ++-
 3 files changed, 96 insertions(+), 10 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 734f6f8..a75dd3c 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -342,7 +342,7 @@ void armv7m_nvic_acknowledge_irq(void *opaque)
 assert(env->v7m.exception > 0); /* spurious exception? */
 }
 
-void armv7m_nvic_complete_irq(void *opaque, int irq)
+bool armv7m_nvic_complete_irq(void *opaque, int irq)
 {
 nvic_state *s = (nvic_state *)opaque;
 vec_info *I;
@@ -352,12 +352,17 @@ void armv7m_nvic_complete_irq(void *opaque, int irq)
 
 I = >vectors[irq];
 
+if (!I->active) {
+return true;
+}
+
 I->active = 0;
 I->pending = I->level;
 assert(!I->level || irq >= 16);
 
 nvic_irq_update(s, 1);
 DPRINTF(0, "EOI %d\n", irq);
+return false;
 }
 
 /* Only called for external interrupt (vector>=16) */
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index e98bca0..72b0b32 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1063,7 +1063,7 @@ int armv7m_excp_unmasked(ARMCPU *cpu)
 /* Interface between CPU and Interrupt controller.  */
 void armv7m_nvic_set_pending(void *opaque, int irq);
 void armv7m_nvic_acknowledge_irq(void *opaque);
-void armv7m_nvic_complete_irq(void *opaque, int irq);
+bool armv7m_nvic_complete_irq(void *opaque, int irq);
 
 /* Interface for defining coprocessor registers.
  * Registers are defined in tables of arm_cp_reginfo structs
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 83af528..3993f77 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5375,18 +5375,65 @@ static void switch_v7m_sp(CPUARMState *env, int process)
 
 static void do_v7m_exception_exit(CPUARMState *env)
 {
+unsigned ufault = 0;
 uint32_t type;
 uint32_t xpsr;
+uint32_t nvic_active;
+
+if (env->v7m.exception == 0) {
+hw_error("Return from exception w/o active exception.  Logic error.");
+}
 
-type = env->regs[15];
 if (env->v7m.exception != ARMV7M_EXCP_NMI) {
 /* Auto-clear FAULTMASK on return from other than NMI */
 env->daif &= ~PSTATE_F;
 }
-if (env->v7m.exception != 0) {
-armv7m_nvic_complete_irq(env->nvic, env->v7m.exception);
+
+if (armv7m_nvic_complete_irq(env->nvic, env->v7m.exception)) {
+qemu_log_mask(LOG_GUEST_ERROR, "Requesting return from exception "
+  "from inactive exception %d\n",
+  env->v7m.exception);
+ufault = 1;
+}
+/* env->v7m.exception and friends updated by armv7m_nvic_complete_irq() */
+
+type = env->regs[15]&0xf;
+/* QEMU seems to clear the LSB at some point. */
+type |= 1;
+
+switch (type) {
+case 0x1: /* Return to Handler mode */
+if (env->v7m.exception == 0) {
+qemu_log_mask(LOG_GUEST_ERROR, "Requesting return from exception "
+  "to Handler mode not allowed at base level of "
+  "activation");
+ufault = 1;
+}
+break;
+case 0x9: /* Return to Thread mode w/ Main stack */
+case 0xd: /* Return to Thread mode w/ Process stack */
+if (env->v7m.exception != 0) {
+/* Attempt to return to Thread mode
+ * from nested handler while NONBASETHRDENA not set.
+ */
+qemu_log_mask(LOG_GUEST_ERROR, "Nested exception return to %d w/"
+  " Thread mode while NONBASETHRDENA not set\n",
+  env->v7m.exception);
+ufault = 1;
+}
+break;
+default:
+qemu_log_mask(LOG_GUEST_ERROR, "Exception return w/ reserved code"
+   " %02x\n", (unsigned)type);
+ufault = 1;
 }
 
+/* TODO? if ufault==1 ARM calls for entering exception handler
+ * directly w/o popping stack.
+ * We pop anyway since the active UsageFault will push on entry
+ * which should happen before execution resumes?
+ */
+
 /* Switch to the target stack.  */
 switch_v7m_sp(env, (type & 4) != 0);
 /* Pop registers.  */
@@ -5409,15 +5456,49 @@ static void do_v7m_exception_exit(CPUARMState *env)
 env->regs[15] &= ~1U;
 }
 xpsr = v7m_pop(env);
+nvic_active = env->v7m.exception;
 xpsr_write(env, xpsr, 0xfdff);
+
 /* Undo stack alignment.  */
 if (xpsr & 0x200)
 env->regs[13] |= 4;
-/* ??? The exception return type specifies Thread/Handler mode.  However
-   this is also implied by the xPSR value. Not sure what to do
-   if there is a mismatch.  */

[Qemu-devel] [PATCH v3 9/9] qed: Implement .bdrv_drain

2015-11-08 Thread Fam Zheng
The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.

We cannot reuse qed_need_check_timer_cb because here it doesn't satisfy
the assertion.  Do the "plug" and "flush" calls manually.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/qed.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index 5ea05d4..9b88895 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -375,6 +375,18 @@ static void bdrv_qed_attach_aio_context(BlockDriverState 
*bs,
 }
 }
 
+static void bdrv_qed_drain(BlockDriverState *bs)
+{
+BDRVQEDState *s = bs->opaque;
+
+/* Cancel timer and start doing I/O that were meant to happen as if it
+ * fired, that way we get bdrv_drain() taking care of the ongoing requests
+ * correctly. */
+qed_cancel_need_check_timer(s);
+qed_plug_allocating_write_reqs(s);
+bdrv_aio_flush(s->bs, qed_clear_need_check, s);
+}
+
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
 {
@@ -1676,6 +1688,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_check   = bdrv_qed_check,
 .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
 .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
+.bdrv_drain   = bdrv_qed_drain,
 };
 
 static void bdrv_qed_init(void)
-- 
2.4.3




[Qemu-devel] [PATCH 14/18] armv7m: auto-clear FAULTMASK

2015-11-08 Thread Michael Davidsaver
on return from all exceptions other than NMI

Signed-off-by: Michael Davidsaver 
---
 target-arm/helper.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 5be09b8..83af528 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5379,8 +5379,13 @@ static void do_v7m_exception_exit(CPUARMState *env)
 uint32_t xpsr;
 
 type = env->regs[15];
-if (env->v7m.exception != 0)
+if (env->v7m.exception != ARMV7M_EXCP_NMI) {
+/* Auto-clear FAULTMASK on return from other than NMI */
+env->daif &= ~PSTATE_F;
+}
+if (env->v7m.exception != 0) {
 armv7m_nvic_complete_irq(env->nvic, env->v7m.exception);
+}
 
 /* Switch to the target stack.  */
 switch_v7m_sp(env, (type & 4) != 0);
-- 
2.1.4




[Qemu-devel] [PATCH 10/18] armv7m: NVIC initialization

2015-11-08 Thread Michael Davidsaver
Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 107 --
 1 file changed, 51 insertions(+), 56 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index c860b36..8eaf677 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -827,7 +827,7 @@ int nvic_post_load(void *opaque, int version_id)
 set_prio(s, i, s->vectors[i].raw_prio);
 }
 
-nvic_irq_update(s, highest_runnable_prio(s->cpu));
+nvic_irq_update(s, 0);
 
 return 0;
 }
@@ -883,66 +883,66 @@ static const VMStateDescription vmstate_nvic = {
 }
 };
 
+static Property props_nvic[] = {
+DEFINE_PROP_UINT32("num-irq", nvic_state, num_irq, 64),
+DEFINE_PROP_END_OF_LIST()
+};
+
 static void armv7m_nvic_reset(DeviceState *dev)
 {
 nvic_state *s = NVIC(dev);
-NVICClass *nc = NVIC_GET_CLASS(s);
-nc->parent_reset(dev);
-/* Common GIC reset resets to disabled; the NVIC doesn't have
- * per-CPU interfaces so mark our non-existent CPU interface
- * as enabled by default, and with a priority mask which allows
- * all interrupts through.
- */
-s->gic.cpu_ctlr[0] = GICC_CTLR_EN_GRP0;
-s->gic.priority_mask[0] = 0x100;
-/* The NVIC as a whole is always enabled. */
-s->gic.ctlr = 1;
+
+s->vectors[ARMV7M_EXCP_RESET].enabled = 1;
+s->vectors[ARMV7M_EXCP_NMI].enabled = 1;
+s->vectors[ARMV7M_EXCP_HARD].enabled = 1;
+s->vectors[ARMV7M_EXCP_SVC].enabled = 1;
+s->vectors[ARMV7M_EXCP_DEBUG].enabled = 1;
+s->vectors[ARMV7M_EXCP_PENDSV].enabled = 1;
+
+s->vectors[ARMV7M_EXCP_RESET].prio_group = -3;
+s->vectors[ARMV7M_EXCP_NMI].prio_group = -2;
+s->vectors[ARMV7M_EXCP_HARD].prio_group = -1;
+
 systick_reset(s);
 }
 
 static void armv7m_nvic_realize(DeviceState *dev, Error **errp)
 {
 nvic_state *s = NVIC(dev);
-NVICClass *nc = NVIC_GET_CLASS(s);
-Error *local_err = NULL;
-
-/* The NVIC always has only one CPU */
-s->gic.num_cpu = 1;
-/* Tell the common code we're an NVIC */
-s->gic.revision = 0x;
-s->num_irq = s->gic.num_irq;
-nc->parent_realize(dev, _err);
-if (local_err) {
-error_propagate(errp, local_err);
+
+/* evil hack to get ARMCPU* ahead of time */
+assert(cpus.tqh_first);
+assert(!CPU_NEXT(cpus.tqh_first));
+s->cpu = ARM_CPU(cpus.tqh_first);
+assert(s->cpu);
+
+if (s->num_irq > NVIC_MAX_IRQ) {
+error_setg(errp, TYPE_NVIC " num_irq too large");
 return;
 }
-gic_init_irqs_and_distributor(>gic);
-/* The NVIC and system controller register area looks like this:
- *  0..0xff : system control registers, including systick
- *  0x100..0xcff : GIC-like registers
- *  0xd00..0xfff : system control registers
- * We use overlaying to put the GIC like registers
- * over the top of the system control register region.
- */
-memory_region_init(>container, OBJECT(s), "nvic", 0x1000);
-/* The system register region goes at the bottom of the priority
- * stack as it covers the whole page.
+
+qdev_init_gpio_in(dev, set_irq_level, s->num_irq);
+
+s->num_irq += 16; /* include space for internal exception vectors */
+
+/* The NVIC and system controller register area starts at 0xe000e000
+ * and looks like this:
+ *  0x004 - ICTR
+ *  0x010 - 0x1c - systick
+ *  0x100..0x7ec - NVIC
+ *  0x7f0..0xcff - Reserved
+ *  0xd00..0xd3c - SCS registers
+ *  0xd40..0xeff - Reserved or Not implemented
+ *  0xf00 - STIR
  */
-memory_region_init_io(>sysregmem, OBJECT(s), _sysreg_ops, s,
+
+memory_region_init_io(>iomem, OBJECT(s), _sysreg_ops, s,
   "nvic_sysregs", 0x1000);
-memory_region_add_subregion(>container, 0, >sysregmem);
-/* Alias the GIC region so we can get only the section of it
- * we need, and layer it on top of the system register region.
- */
-memory_region_init_alias(>gic_iomem_alias, OBJECT(s),
- "nvic-gic", >gic.iomem,
- 0x100, 0xc00);
-memory_region_add_subregion_overlap(>container, 0x100,
->gic_iomem_alias, 1);
+
 /* Map the whole thing into system memory at the location required
  * by the v7M architecture.
  */
-memory_region_add_subregion(get_system_memory(), 0xe000e000, 
>container);
+memory_region_add_subregion(get_system_memory(), 0xe000e000, >iomem);
 s->systick.timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, systick_timer_tick, s);
 }
 
@@ -954,36 +954,31 @@ static void armv7m_nvic_instance_init(Object *obj)
  * any user-specified property setting, so just modify the
  * value in the GICState struct.
  */
-GICState *s = ARM_GIC_COMMON(obj);
 DeviceState *dev = DEVICE(obj);
 nvic_state *nvic = NVIC(obj);
-/* The ARM v7m may have anything from 0 to 496 external 

[Qemu-devel] [PATCH 17/18] armv7m: implement CCR

2015-11-08 Thread Michael Davidsaver
Implement Configuration and Control register.
Handle STACKALIGN and USERSETMPEND bits.

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 15 +++
 target-arm/cpu.h  |  1 +
 target-arm/helper.c   |  8 +++-
 target-arm/machine.c  |  1 +
 4 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index a75dd3c..ca8c93c 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -474,8 +474,7 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 /* TODO: Implement SLEEPONEXIT.  */
 return 0;
 case 0xd14: /* Configuration Control.  */
-/* TODO: Implement Configuration Control bits.  */
-return 0;
+return cpu->env.v7m.ccr;
 case 0xd24: /* System Handler Status.  */
 val = 0;
 if (s->vectors[ARMV7M_EXCP_MEM].active) val |= (1 << 0);
@@ -619,9 +618,17 @@ static void nvic_writel(nvic_state *s, uint32_t offset, 
uint32_t value)
 }
 break;
 case 0xd10: /* System Control.  */
-case 0xd14: /* Configuration Control.  */
 /* TODO: Implement control registers.  */
-qemu_log_mask(LOG_UNIMP, "NVIC: SCR and CCR unimplemented\n");
+qemu_log_mask(LOG_UNIMP, "NVIC: SCR unimplemented\n");
+break;
+case 0xd14: /* Configuration Control.  */
+value &= 0x31b;
+if (value&0x118) {
+qemu_log_mask(LOG_UNIMP, "CCR unimplemented bits"
+ " BFHFNMIGN, DIV_0_TRP, UNALIGN_TRP");
+value &= ~0x118;
+}
+cpu->env.v7m.ccr = value;
 break;
 case 0xd24: /* System Handler Control.  */
 /* TODO: Real hardware allows you to set/clear the active bits
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 72b0b32..90ccdcd 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -396,6 +396,7 @@ typedef struct CPUARMState {
 uint32_t vecbase;
 uint32_t basepri;
 uint32_t control;
+uint32_t ccr; /* Configuration and Control */
 uint32_t cfsr; /* Configurable Fault Status */
 uint32_t hfsr; /* HardFault Status */
 int current_sp;
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 3993f77..402bfc5 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5412,7 +5412,7 @@ static void do_v7m_exception_exit(CPUARMState *env)
 break;
 case 0x9: /* Return to Thread mode w/ Main stack */
 case 0xd: /* Return to Thread mode w/ Process stack */
-if (env->v7m.exception != 0) {
+if ((env->v7m.exception != 0) && !(env->v7m.ccr&1)) {
 /* Attempt to return to Thread mode
  * from nested handler while NONBASETHRDENA not set.
  */
@@ -5564,10 +5564,8 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 
 qemu_log_mask(CPU_LOG_INT, "... as %d\n", env->v7m.exception);
 
-/* Align stack pointer.  */
-/* ??? Should only do this if Configuration Control Register
-   STACKALIGN bit is set.  */
-if (env->regs[13] & 4) {
+/* Align stack pointer (STACKALIGN)  */
+if (env->v7m.ccr&(1<<9)) {
 env->regs[13] -= 4;
 xpsr |= 0x200;
 }
diff --git a/target-arm/machine.c b/target-arm/machine.c
index d7c2034..e8b710d 100644
--- a/target-arm/machine.c
+++ b/target-arm/machine.c
@@ -100,6 +100,7 @@ static const VMStateDescription vmstate_m = {
 VMSTATE_UINT32(env.v7m.vecbase, ARMCPU),
 VMSTATE_UINT32(env.v7m.basepri, ARMCPU),
 VMSTATE_UINT32(env.v7m.control, ARMCPU),
+VMSTATE_UINT32(env.v7m.ccr, ARMCPU),
 VMSTATE_UINT32(env.v7m.cfsr, ARMCPU),
 VMSTATE_UINT32(env.v7m.hfsr, ARMCPU),
 VMSTATE_INT32(env.v7m.current_sp, ARMCPU),
-- 
2.1.4




[Qemu-devel] [PATCH 13/18] armv7m: implement CFSR and HFSR

2015-11-08 Thread Michael Davidsaver
Add the Configurable and Hard Fault Status registers.
Note undefined instructions and escalations

Signed-off-by: Michael Davidsaver 
---
 hw/intc/armv7m_nvic.c | 10 +++---
 target-arm/cpu.h  |  2 ++
 target-arm/helper.c   |  1 +
 target-arm/machine.c  |  6 --
 4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 8eaf677..734f6f8 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -287,6 +287,7 @@ void armv7m_nvic_set_pending(void *opaque, int irq)
 }
 I = >vectors[irq];
 
+s->cpu->env.v7m.hfsr |= 1<<30; /* FORCED */
 DPRINTF(0, "Escalate %d to %d\n", oldirq, irq);
 }
 }
@@ -488,10 +489,9 @@ static uint32_t nvic_readl(nvic_state *s, uint32_t offset)
 if (s->vectors[ARMV7M_EXCP_USAGE].enabled) val |= (1 << 18);
 return val;
 case 0xd28: /* Configurable Fault Status.  */
-/* TODO: Implement Fault Status.  */
-qemu_log_mask(LOG_UNIMP, "Configurable Fault Status unimplemented\n");
-return 0;
+return cpu->env.v7m.cfsr;
 case 0xd2c: /* Hard Fault Status.  */
+return cpu->env.v7m.hfsr;
 case 0xd30: /* Debug Fault Status.  */
 case 0xd34: /* Mem Manage Address.  */
 case 0xd38: /* Bus Fault Address.  */
@@ -629,7 +629,11 @@ static void nvic_writel(nvic_state *s, uint32_t offset, 
uint32_t value)
  */
 break;
 case 0xd28: /* Configurable Fault Status.  */
+cpu->env.v7m.cfsr &= ~value; /* W1C */
+break;
 case 0xd2c: /* Hard Fault Status.  */
+cpu->env.v7m.hfsr &= ~value; /* W1C */
+break;
 case 0xd30: /* Debug Fault Status.  */
 case 0xd34: /* Mem Manage Address.  */
 case 0xd38: /* Bus Fault Address.  */
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 29d89ce..e98bca0 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -396,6 +396,8 @@ typedef struct CPUARMState {
 uint32_t vecbase;
 uint32_t basepri;
 uint32_t control;
+uint32_t cfsr; /* Configurable Fault Status */
+uint32_t hfsr; /* HardFault Status */
 int current_sp;
 int exception;
 int exception_prio;
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 2541890..5be09b8 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5436,6 +5436,7 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
 switch (cs->exception_index) {
 case EXCP_UDEF:
 armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_USAGE);
+env->v7m.cfsr |= 1<<16; /* UNDEFINSTR */
 break;
 case EXCP_SWI:
 /* The PC already points to the next instruction.  */
diff --git a/target-arm/machine.c b/target-arm/machine.c
index 36a0d15..d7c2034 100644
--- a/target-arm/machine.c
+++ b/target-arm/machine.c
@@ -92,14 +92,16 @@ static bool m_needed(void *opaque)
 
 static const VMStateDescription vmstate_m = {
 .name = "cpu/m",
-.version_id = 1,
-.minimum_version_id = 1,
+.version_id = 2,
+.minimum_version_id = 2,
 .needed = m_needed,
 .fields = (VMStateField[]) {
 VMSTATE_UINT32(env.v7m.other_sp, ARMCPU),
 VMSTATE_UINT32(env.v7m.vecbase, ARMCPU),
 VMSTATE_UINT32(env.v7m.basepri, ARMCPU),
 VMSTATE_UINT32(env.v7m.control, ARMCPU),
+VMSTATE_UINT32(env.v7m.cfsr, ARMCPU),
+VMSTATE_UINT32(env.v7m.hfsr, ARMCPU),
 VMSTATE_INT32(env.v7m.current_sp, ARMCPU),
 VMSTATE_INT32(env.v7m.exception, ARMCPU),
 VMSTATE_END_OF_LIST()
-- 
2.1.4




Re: [Qemu-devel] [PULL 0/7] Block patches

2015-11-08 Thread Fam Zheng
On Fri, 11/06 18:07, Peter Maydell wrote:
> On 6 November 2015 at 17:52, Stefan Hajnoczi  wrote:
> > The following changes since commit 4b59f39bc9a03afcc74b2fa28da7c3189fca507c:
> >
> >   Merge remote-tracking branch 
> > 'remotes/mjt/tags/pull-trivial-patches-2015-11-06' into staging (2015-11-06 
> > 12:50:24 +)
> >
> > are available in the git repository at:
> >
> >   git://github.com/stefanha/qemu.git tags/block-pull-request
> >
> > for you to fetch changes up to 6f707181b1bd6ccf2d2fd9397039c7ef6fa4a9fd:
> >
> >   blockdev: acquire AioContext in hmp_commit() (2015-11-06 15:41:00 +)
> >
> > 
> 
> Build failure on OSX :-(
> 
> /Users/pm215/src/qemu-for-merges/aio-posix.c  CCblock/qcow.o
> :442:37: error: no member named 'epollfd' in 'struct AioContext'
> epoll_handler.pfd.fd = ctx->epollfd;
>~~~  ^
> 

:(

I think it is harmless to always include this member in AioContext. Stefan,
could you squash this into patch 5?  Thanks!

Fam

---

diff --git a/include/block/aio.h b/include/block/aio.h
index 91737d5..735f1f8 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -124,11 +124,9 @@ struct AioContext {
 QEMUTimerListGroup tlg;
 
 int external_disable_cnt;
-#ifdef CONFIG_EPOLL
 int epollfd;
 bool epoll_enabled;
 bool epoll_available;
-#endif
 };
   
 /**   



Re: [Qemu-devel] assert during internal snapshot

2015-11-08 Thread Li, Liang Z
> Hello, All!
> 
> This commit
> 
> commit 94f5a43704129ca4995aa3385303c5ae225bde42
> Author: Liang Li 
> Date:   Mon Nov 2 15:37:00 2015 +0800
> 
>  migration: defer migration_end & blk_mig_cleanup
> 
>  Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
>  lazy collapsing of small sptes into large sptes mechanism, now
>  migration_end() is a time consuming operation because it calls
>  memroy_global_dirty_log_stop(), which will trigger the dropping of small
>  sptes operation and takes about dozens of milliseconds, so call
>  migration_end() before all the vmsate data has already been transferred
>  to the destination will prolong VM downtime. This operation should be
>  deferred after all the data has been transferred to the destination.
> 
>  blk_mig_cleanup() can be deferred too.
> 
>  For a VM with 8G RAM, this patch can reduce the VM downtime about
> 30 ms.
> 
>  Signed-off-by: Liang Li 
>  Reviewed-by: Paolo Bonzini 
>  Reviewed-by: Juan Quintela al3
>  Reviewed-by: Amit Shah al3
>  Signed-off-by: Juan Quintela al3
> 
> introduces the following regression
> 
> (gdb) bt
> #0  0x7fd5d314a267 in __GI_raise (sig=sig@entry=6)
>  at ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x7fd5d314beca in __GI_abort () at abort.c:89
> #2  0x7fd5d314303d in __assert_fail_base (
>  fmt=0x7fd5d32a5028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
>  assertion=assertion@entry=0x557288ed5b69 "i != mr->ioeventfd_nb",
>  file=file@entry=0x557288ed5a36 "/home/den/src/qemu/memory.c",
>  line=line@entry=1731,
>  function=function@entry=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:92
> #3  0x7fd5d31430f2 in __GI___assert_fail (
>  assertion=0x557288ed5b69 "i != mr->ioeventfd_nb",
>  file=0x557288ed5a36 "/home/den/src/qemu/memory.c", line=1731,
>  function=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:101
> #4  0x557288b108fa in memory_region_del_eventfd
> (mr=0x55728ad83700,
>  addr=16, size=2, match_data=true, data=0, e=0x55728b21ff40)
>  at /home/den/src/qemu/memory.c:1731
> #5  0x557288d9fc18 in virtio_pci_set_host_notifier_internal (
>  proxy=0x55728ad82e80, n=0, assign=false, set_handler=false)
>  at hw/virtio/virtio-pci.c:178
> #6  0x557288da19a9 in virtio_pci_set_host_notifier (d=0x55728ad82e80,
> n=0,
>  assign=false) at hw/virtio/virtio-pci.c:984
> #7  0x557288b523df in virtio_scsi_dataplane_start (s=0x55728ad8afa0)
>  at /home/den/src/qemu/hw/scsi/virtio-scsi-dataplane.c:268
> #8  0x557288b50210 in virtio_scsi_handle_cmd (vdev=0x55728ad8afa0,
>  vq=0x55728b21ffc0) at /home/den/src/qemu/hw/scsi/virtio-scsi.c:574
> #9  0x557288b65cb7 in virtio_queue_notify_vq (vq=0x55728b21ffc0)
>  at /home/den/src/qemu/hw/virtio/virtio.c:966
> #10 0x557288b67bbf in virtio_queue_host_notifier_read
> (n=0x55728b220010)
>  at /home/den/src/qemu/hw/virtio/virtio.c:1643
> #11 0x557288e12a2b in aio_dispatch (ctx=0x55728acaeab0) at
> aio-posix.c:160
> #12 0x557288e03194 in aio_ctx_dispatch (source=0x55728acaeab0,
>  callback=0x0, user_data=0x0) at async.c:226
> #13 0x7fd5d409fff7 in g_main_context_dispatch ()
> from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> ---Type  to continue, or q  to quit---
> #14 0x557288e1110d in glib_pollfds_poll () at main-loop.c:211
> #15 0x557288e111e8 in os_host_main_loop_wait (timeout=0) at
> main-loop.c:256
> #16 0x557288e11295 in main_loop_wait (nonblocking=0) at main-
> loop.c:504
> #17 0x557288c1c31c in main_loop () at vl.c:1890
> #18 0x557288c23dec in main (argc=105, argv=0x7ffca9a6fa08,
>  envp=0x7ffca9a6fd58) at vl.c:4644
> (gdb)
> 
> during 'virsh create-snapshot' operation over alive VM.
> It happens 100% of time when VM is run using the following command line:
> 
>   7498 ?tl 0:37 qemu-system-x86_64 -enable-kvm -name rhel7
> -S -machine pc-i440fx-2.2,accel=kvm,usb=off -cpu SandyBridge -m 1024 -
> realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object
> iothread,id=iothread1 -uuid 456af4d3-5d67-41c6-a229-c55ded6098e9
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-
> shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot
> strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7
> -device
> ich9-usb-
> uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6
> -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1
> -device
> 

[Qemu-devel] [PATCH v3 4/9] iscsi: Emulate commands in iscsi_aio_ioctl as iscsi_ioctl

2015-11-08 Thread Fam Zheng
iscsi_ioctl emulates SG_GET_VERSION_NUM and SG_GET_SCSI_ID. Now that
bdrv_ioctl() will be emulated with .bdrv_aio_ioctl, replicate the logic
into iscsi_aio_ioctl to make them consistent.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/iscsi.c | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 9a628b7..b3fa0a0 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -96,6 +96,7 @@ typedef struct IscsiAIOCB {
 int status;
 int64_t sector_num;
 int nb_sectors;
+int ret;
 #ifdef __linux__
 sg_io_hdr_t *ioh;
 #endif
@@ -726,6 +727,38 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
 iscsi_schedule_bh(acb);
 }
 
+static void iscsi_ioctl_bh_completion(void *opaque)
+{
+IscsiAIOCB *acb = opaque;
+
+qemu_bh_delete(acb->bh);
+acb->common.cb(acb->common.opaque, acb->ret);
+qemu_aio_unref(acb);
+}
+
+static void iscsi_ioctl_handle_emulated(IscsiAIOCB *acb, int req, void *buf)
+{
+BlockDriverState *bs = acb->common.bs;
+IscsiLun *iscsilun = bs->opaque;
+int ret = 0;
+
+switch (req) {
+case SG_GET_VERSION_NUM:
+*(int *)buf = 3;
+break;
+case SG_GET_SCSI_ID:
+((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
+break;
+default:
+ret = -EINVAL;
+}
+assert(!acb->bh);
+acb->bh = aio_bh_new(bdrv_get_aio_context(bs),
+ iscsi_ioctl_bh_completion, acb);
+acb->ret = ret;
+qemu_bh_schedule(acb->bh);
+}
+
 static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 unsigned long int req, void *buf,
 BlockCompletionFunc *cb, void *opaque)
@@ -735,8 +768,6 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 struct iscsi_data data;
 IscsiAIOCB *acb;
 
-assert(req == SG_IO);
-
 acb = qemu_aio_get(_aiocb_info, bs, cb, opaque);
 
 acb->iscsilun = iscsilun;
@@ -745,6 +776,11 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 acb->buf = NULL;
 acb->ioh = buf;
 
+if (req != SG_IO) {
+iscsi_ioctl_handle_emulated(acb, req, buf);
+return >common;
+}
+
 acb->task = malloc(sizeof(struct scsi_task));
 if (acb->task == NULL) {
 error_report("iSCSI: Failed to allocate task for scsi command. %s",
-- 
2.4.3




[Qemu-devel] [PATCH v3 3/9] block: Track discard requests

2015-11-08 Thread Fam Zheng
Both bdrv_discard and bdrv_aio_discard will call into bdrv_co_discard,
so add tracked_request_begin/end calls around the loop.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/io.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index a9a49e4..324ae5a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2424,6 +2424,7 @@ static void coroutine_fn bdrv_discard_co_entry(void 
*opaque)
 int coroutine_fn bdrv_co_discard(BlockDriverState *bs, int64_t sector_num,
  int nb_sectors)
 {
+BdrvTrackedRequest req;
 int max_discard, ret;
 
 if (!bs->drv) {
@@ -2446,6 +2447,8 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 return 0;
 }
 
+tracked_request_begin(, bs, sector_num, nb_sectors,
+  BDRV_TRACKED_DISCARD);
 bdrv_set_dirty(bs, sector_num, nb_sectors);
 
 max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
@@ -2479,20 +2482,24 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 acb = bs->drv->bdrv_aio_discard(bs, sector_num, nb_sectors,
 bdrv_co_io_em_complete, );
 if (acb == NULL) {
-return -EIO;
+ret = -EIO;
+goto out;
 } else {
 qemu_coroutine_yield();
 ret = co.ret;
 }
 }
 if (ret && ret != -ENOTSUP) {
-return ret;
+goto out;
 }
 
 sector_num += num;
 nb_sectors -= num;
 }
-return 0;
+ret = 0;
+out:
+tracked_request_end();
+return ret;
 }
 
 int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
-- 
2.4.3




[Qemu-devel] [PATCH v3 2/9] block: Track flush requests

2015-11-08 Thread Fam Zheng
Both bdrv_flush and bdrv_aio_flush eventually call bdrv_co_flush, add
tracked_request_begin and tracked_request_end pair in that function so
that all flush requests are now tracked.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/io.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 793809a..a9a49e4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2318,18 +2318,20 @@ static void coroutine_fn bdrv_flush_co_entry(void 
*opaque)
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
 int ret;
+BdrvTrackedRequest req;
 
 if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
 bdrv_is_sg(bs)) {
 return 0;
 }
 
+tracked_request_begin(, bs, 0, 0, BDRV_TRACKED_FLUSH);
 /* Write back cached data to the OS even with cache=unsafe */
 BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
 if (bs->drv->bdrv_co_flush_to_os) {
 ret = bs->drv->bdrv_co_flush_to_os(bs);
 if (ret < 0) {
-return ret;
+goto out;
 }
 }
 
@@ -2369,14 +2371,17 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 ret = 0;
 }
 if (ret < 0) {
-return ret;
+goto out;
 }
 
 /* Now flush the underlying protocol.  It will also have BDRV_O_NO_FLUSH
  * in the case of cache=unsafe, so there are no useless flushes.
  */
 flush_parent:
-return bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+out:
+tracked_request_end();
+return ret;
 }
 
 int bdrv_flush(BlockDriverState *bs)
-- 
2.4.3




Re: [Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop is hanging forever

2015-11-08 Thread Alexandre DERUMIER
Some other infos:

I can reproduce it too with manual snapshot with rbd command


#rbd --image myrbdvolume snap create --snap snap1

qemu monitor:

#stop


This is with ceph hammer 0.94.5.


in qemu vm_stop, the only thing related to block driver are

bdrv_drain_all();
ret = bdrv_flush_all();


- Mail original -
De: "aderumier" 
À: "ceph-devel" , "qemu-devel" 

Envoyé: Lundi 9 Novembre 2015 04:10:45
Objet: qemu : rbd block driver internal snapshot and vm_stop is hanging forever

Hi, 

with qemu (2.4.1), if I do an internal snapshot of an rbd device, 
then I pause the vm with vm_stop, 

the qemu process is hanging forever 


monitor commands to reproduce: 


# snapshot_blkdev_internal drive-virtio0 yoursnapname 
# stop 




I don't see this with qcow2 or sheepdog block driver for example. 


Regards, 

Alexandre 
-- 
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
the body of a message to majord...@vger.kernel.org 
More majordomo info at http://vger.kernel.org/majordomo-info.html 




Re: [Qemu-devel] Qemu: Guest Linux hangs on Mac OS X 10.11

2015-11-08 Thread Peter Maydell
On 18 October 2015 at 21:37, Peter Maydell  wrote:
> Sometimes it does manage to unwedge itself. Paolo, do you have
> any suggestions for how to debug this kind of issue?

So the good news is that on mainline this doesn't happen any more.
The bad news is that something weird is going on such that git
bisect doesn't give helpful answers. Specifically if I start by
compiling older versions and work forwards, then
 0fd7e09 kvmclock: add a new function to update env->tsc.
shows the bug, and
 6388acc Revert "Introduce cpu_clean_all_dirty"
does not. (And I've got to that commit both via a git-bisect
and by a second round of manually trying to identify the commit,
so it's consistent about where it changes behaviour.)
However that makes no sense because that revert commit
is just removing unused code. And then if I go backwards again
to 0fd7e09 the bug doesn't repro there.

thanks
-- PMM



Re: [Qemu-devel] [PATCH] virtio-blk: trivial code optimization

2015-11-08 Thread Gonglei
On 2015/11/6 20:54, Paolo Bonzini wrote:
> 
> 
> On 06/11/2015 11:35, Stefan Hajnoczi wrote:
  if (niov + req->qiov.niov > IOV_MAX) {
  merge = false;
 +goto unmerge;
  }
  
  /* merge would exceed maximum transfer length of backend 
 device */
  if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
 max_xfer_len) {
  merge = false;
 +goto unmerge;
  }
  
  /* requests are not sequential */
  if (sector_num + nb_sectors != req->sector_num) {
  merge = false;
  }
 -
 +unmerge:
>> C has a way of expressing this without gotos.  Please use else if:
>>
>>   if (a) {
>>   ...
>>   } else if (b) {
>>   ...
>>   } else if (c) {
>>   ...
>>   }
> 
> Another way is
> 
> if (niov + req->qiov.niov > IOV_MAX ||
> req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > max_xfer_len ||
> sector_num + nb_sectors != req->sector_num) {
> submit_requests(...)
> ...
> }
> 
> While at it, we could reorder the conditions so that the most common
> ("requests are not sequential") comes first.
> 
> I'm not sure about handling of overflow.  It's probably better to
> write conditions as "new > max - old" (e.g. "niov > IOV_MAX -
> req->qiov.niov") rather than "old + new > max".  The former is always
> safe, because we know that old <= max and there can be no integer
> overflow.
> 
Nice points.  Thanks, both of you.

Regards,
-Gonglei




[Qemu-devel] [PATCH v3 1/9] block: Add more types for tracked request

2015-11-08 Thread Fam Zheng
We'll track more request types besides read and write, change the
boolean field to an enum.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/io.c|  9 +
 include/block/block_int.h | 10 +-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 8dcad3b..793809a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -348,13 +348,14 @@ static void tracked_request_end(BdrvTrackedRequest *req)
 static void tracked_request_begin(BdrvTrackedRequest *req,
   BlockDriverState *bs,
   int64_t offset,
-  unsigned int bytes, bool is_write)
+  unsigned int bytes,
+  enum BdrvTrackedRequestType type)
 {
 *req = (BdrvTrackedRequest){
 .bs = bs,
 .offset = offset,
 .bytes  = bytes,
-.is_write   = is_write,
+.type   = type,
 .co = qemu_coroutine_self(),
 .serialising= false,
 .overlap_offset = offset,
@@ -971,7 +972,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState 
*bs,
 bytes = ROUND_UP(bytes, align);
 }
 
-tracked_request_begin(, bs, offset, bytes, false);
+tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_READ);
 ret = bdrv_aligned_preadv(bs, , offset, bytes, align,
   use_local_qiov ? _qiov : qiov,
   flags);
@@ -1292,7 +1293,7 @@ static int coroutine_fn 
bdrv_co_do_pwritev(BlockDriverState *bs,
  * Pad qiov with the read parts and be sure to have a tracked request not
  * only for bdrv_aligned_pwritev, but also for the reads of the RMW cycle.
  */
-tracked_request_begin(, bs, offset, bytes, true);
+tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_WRITE);
 
 if (!qiov) {
 ret = bdrv_co_do_zero_pwritev(bs, offset, bytes, flags, );
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 3ceeb5a..7db9900 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -60,11 +60,19 @@
 
 #define BLOCK_PROBE_BUF_SIZE512
 
+enum BdrvTrackedRequestType {
+BDRV_TRACKED_READ,
+BDRV_TRACKED_WRITE,
+BDRV_TRACKED_FLUSH,
+BDRV_TRACKED_IOCTL,
+BDRV_TRACKED_DISCARD,
+};
+
 typedef struct BdrvTrackedRequest {
 BlockDriverState *bs;
 int64_t offset;
 unsigned int bytes;
-bool is_write;
+enum BdrvTrackedRequestType type;
 
 bool serialising;
 int64_t overlap_offset;
-- 
2.4.3




[Qemu-devel] [PATCH v3 0/9] block: Fixes for bdrv_drain

2015-11-08 Thread Fam Zheng
v3: Don't reuse coroutine in bdrv_aio_ioctl. [Stefan]
Recursely call .bdrv_drain callback only. [Stefan, Paolo]
Added Kevin's reviewed-by in other patches.

v2: Add Kevin's reviewed-by in patches 1, 2, 5-7, 9.
Address Kevin's reviewing comments which are:
- Explicit "ret = 0" before out label in patch 3.
- Add missing qemu_aio_unref() in patch 4.
- Recurse into all children in bdrv_drain in patch 8.

Previously bdrv_drain and bdrv_drain_all don't handle ioctl, flush and discard
requests (which are fundamentally the same as read and write requests that
change disk state).  Forgetting such requests leaves us in risk of violating
the invariant that bdrv_drain() callers rely on - all asynchronous requests
must have completed after bdrv_drain returns.

Enrich the tracked request types, and add tracked_request_begin/end pairs to
all three code paths. As a prerequisite, ioctl code is moved into coroutine
too.

The last two patches take care of QED's "need check" timer, so that after
bdrv_drain returns, the driver is in a consistent state.

Fam


Fam Zheng (9):
  block: Add more types for tracked request
  block: Track flush requests
  block: Track discard requests
  iscsi: Emulate commands in iscsi_aio_ioctl as iscsi_ioctl
  block: Add ioctl parameter fields to BlockRequest
  block: Emulate bdrv_ioctl with bdrv_aio_ioctl and track both
  block: Drop BlockDriver.bdrv_ioctl
  block: Introduce BlockDriver.bdrv_drain callback
  qed: Implement .bdrv_drain

 block/io.c| 147 +++---
 block/iscsi.c |  73 ---
 block/qed.c   |  13 
 block/raw-posix.c |   8 ---
 block/raw_bsd.c   |   6 --
 include/block/block.h |  16 +++--
 include/block/block_int.h |  17 +-
 7 files changed, 205 insertions(+), 75 deletions(-)

-- 
2.4.3




Re: [Qemu-devel] assert during internal snapshot

2015-11-08 Thread Li, Liang Z
>  migration: defer migration_end & blk_mig_cleanup
> 
>  Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
>  lazy collapsing of small sptes into large sptes mechanism, now
>  migration_end() is a time consuming operation because it calls
>  memroy_global_dirty_log_stop(), which will trigger the dropping of small
>  sptes operation and takes about dozens of milliseconds, so call
>  migration_end() before all the vmsate data has already been transferred
>  to the destination will prolong VM downtime. This operation should be
>  deferred after all the data has been transferred to the destination.
> 
>  blk_mig_cleanup() can be deferred too.
> 
>  For a VM with 8G RAM, this patch can reduce the VM downtime about
> 30 ms.
> 
>  Signed-off-by: Liang Li 
>  Reviewed-by: Paolo Bonzini 
>  Reviewed-by: Juan Quintela al3
>  Reviewed-by: Amit Shah al3
>  Signed-off-by: Juan Quintela al3
> 
> introduces the following regression
> 
> (gdb) bt
> #0  0x7fd5d314a267 in __GI_raise (sig=sig@entry=6)
>  at ../sysdeps/unix/sysv/linux/raise.c:55
> #1  0x7fd5d314beca in __GI_abort () at abort.c:89
> #2  0x7fd5d314303d in __assert_fail_base (
>  fmt=0x7fd5d32a5028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
>  assertion=assertion@entry=0x557288ed5b69 "i != mr->ioeventfd_nb",
>  file=file@entry=0x557288ed5a36 "/home/den/src/qemu/memory.c",
>  line=line@entry=1731,
>  function=function@entry=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:92
> #3  0x7fd5d31430f2 in __GI___assert_fail (
>  assertion=0x557288ed5b69 "i != mr->ioeventfd_nb",
>  file=0x557288ed5a36 "/home/den/src/qemu/memory.c", line=1731,
>  function=0x557288ed5fb0 <__PRETTY_FUNCTION__.32545>
> "memory_region_del_eventfd") at assert.c:101
> #4  0x557288b108fa in memory_region_del_eventfd
> (mr=0x55728ad83700,
>  addr=16, size=2, match_data=true, data=0, e=0x55728b21ff40)
>  at /home/den/src/qemu/memory.c:1731
> #5  0x557288d9fc18 in virtio_pci_set_host_notifier_internal (
>  proxy=0x55728ad82e80, n=0, assign=false, set_handler=false)
>  at hw/virtio/virtio-pci.c:178
> #6  0x557288da19a9 in virtio_pci_set_host_notifier (d=0x55728ad82e80,
> n=0,
>  assign=false) at hw/virtio/virtio-pci.c:984
> #7  0x557288b523df in virtio_scsi_dataplane_start (s=0x55728ad8afa0)
>  at /home/den/src/qemu/hw/scsi/virtio-scsi-dataplane.c:268
> #8  0x557288b50210 in virtio_scsi_handle_cmd (vdev=0x55728ad8afa0,
>  vq=0x55728b21ffc0) at /home/den/src/qemu/hw/scsi/virtio-scsi.c:574
> #9  0x557288b65cb7 in virtio_queue_notify_vq (vq=0x55728b21ffc0)
>  at /home/den/src/qemu/hw/virtio/virtio.c:966
> #10 0x557288b67bbf in virtio_queue_host_notifier_read
> (n=0x55728b220010)
>  at /home/den/src/qemu/hw/virtio/virtio.c:1643
> #11 0x557288e12a2b in aio_dispatch (ctx=0x55728acaeab0) at
> aio-posix.c:160
> #12 0x557288e03194 in aio_ctx_dispatch (source=0x55728acaeab0,
>  callback=0x0, user_data=0x0) at async.c:226
> #13 0x7fd5d409fff7 in g_main_context_dispatch ()
> from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> ---Type  to continue, or q  to quit---
> #14 0x557288e1110d in glib_pollfds_poll () at main-loop.c:211
> #15 0x557288e111e8 in os_host_main_loop_wait (timeout=0) at
> main-loop.c:256
> #16 0x557288e11295 in main_loop_wait (nonblocking=0) at main-
> loop.c:504
> #17 0x557288c1c31c in main_loop () at vl.c:1890
> #18 0x557288c23dec in main (argc=105, argv=0x7ffca9a6fa08,
>  envp=0x7ffca9a6fd58) at vl.c:4644
> (gdb)
> 
> during 'virsh create-snapshot' operation over alive VM.
> It happens 100% of time when VM is run using the following command line:
> 
>   7498 ?tl 0:37 qemu-system-x86_64 -enable-kvm -name rhel7
> -S -machine pc-i440fx-2.2,accel=kvm,usb=off -cpu SandyBridge -m 1024 -
> realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object
> iothread,id=iothread1 -uuid 456af4d3-5d67-41c6-a229-c55ded6098e9
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel7.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-
> shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot
> strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7
> -device
> ich9-usb-
> uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6
> -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1
> -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2
> -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive
> 

Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation

2015-11-08 Thread Bharata B Rao
On Fri, Nov 06, 2015 at 03:48:11PM +, Dr. David Alan Gilbert wrote:
> * Bharata B Rao (bhar...@linux.vnet.ibm.com) wrote:
> 
> > > Where we have iterable, but non-postcopiable devices (e.g. htab
> > > or block migration), complete them before forming the 'package'
> > > but with the CPUs stopped.  This stops them filling up the package.
> > 
> > That helps and the migration suceeds now when I switch to postcopy
> > immediately after starting the migration.
> 
> Excellent.
> 
> > However after postcopy migration, when I attempt to start an incoming
> > instance again to migrate the guest back, I see this failure:
> > 
> > qemu-system-ppc64: cannot set up guest memory 'ppc_spapr.ram': Cannot 
> > allocate memory
> > 
> > The same doesn't happen with normal migration.
> 
> Huh that's fun; that's the original source guest that's running out of RAM?
> It's original QEMU should be gone by that point.

Yes, the original source QEMU is gone, but there is not enough memory left
in the host to start another incoming QEMU instance because...

At the beginning
-
$ grep -i mem /proc/meminfo 
MemTotal:   132816832 kB
MemFree:128781632 kB
MemAvailable:   131668224 kB

After starting the guest (-m 64G,slots=32,maxmem=128G)

$ grep -i mem /proc/meminfo 
MemTotal:   132816832 kB
MemFree:124866880 kB
MemAvailable:   127753728 kB

After starting the destination instance (incoming)
-
$ grep -i mem /proc/meminfo 
MemTotal:   132816832 kB
MemFree:122514880 kB
MemAvailable:   125401920 kB

After postcopy migration completes
--
$ grep -i mem /proc/meminfo 
MemTotal:   132816832 kB
MemFree:55150592 kB
MemAvailable:   58037888 kB

After terminating the source instance
-
$ grep -i mem /proc/meminfo 
MemTotal:   132816832 kB
MemFree:59432448 kB
MemAvailable:   62319872 kB

So as you can see, postcopy migration will result in guest claiming
its entire RAM memory from host. This doesn't happen during normal migration.

The above results are with the additional patch you sent in this thread
and I was switching to postcopy migration immediately after starting the
migration. Without this patch, when I delay the switching to postcopy
migration a bit, I see the free memory as below when postcopy migration
completes.

$ grep -i mem /proc/meminfo 
MemTotal:   132816832 kB
MemFree:85018176 kB
MemAvailable:   87937024 kB

Regards,
Bharata.




Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation

2015-11-08 Thread David Gibson
On Fri, Nov 06, 2015 at 12:22:23PM +, Dr. David Alan Gilbert wrote:
> * Bharata B Rao (bharata@gmail.com) wrote:
> > On Fri, Nov 6, 2015 at 2:39 PM, Dr. David Alan Gilbert
> >  wrote:
> > > * Bharata B Rao (bhar...@linux.vnet.ibm.com) wrote:
> > >> On Thu, Nov 05, 2015 at 06:10:27PM +, Dr. David Alan Gilbert (git) 
> > >> wrote:
> > >> > From: "Dr. David Alan Gilbert" 
> > >> >
> > >> >   This is the 9th cut of my version of postcopy.
> > >> >
> > >> > The userfaultfd linux kernel code is now in the upstream kernel
> > >> > tree, and so 4.3 can be used without modification.
> > >> >
> > >> > This qemu series can be found at:
> > >> > https://github.com/orbitfp7/qemu.git
> > >> > on the wp3-postcopy-v9 tag
> > >> >
> > >> > Testing status:
> > >> >   * Tested heavily on x86
> > >> >   * Smoke tested on aarch64 (so it does work on different page sizes)
> > >>
> > >> Tested minimally on ppc64 with back and forth postcopy migration of
> > >> unloaded pseries guest within the localhost - works as expected.
> > >>
> > >> However I am seeing a failure in one case. I am not sure if this is
> > >> a user error or a real issue in postcopy migration. If I switch to 
> > >> postcopy
> > >> migration immediately after starting the migration, I see the migration
> > >> failing with error:
> > >>
> > >> qemu-system-ppc64: qemu_savevm_send_packaged: Unreasonably large 
> > >> packaged state: 25905005
> > >
> > > I put an arbitrary limit of 16MB (see MAX_VM_CMD_PACKAGED_SIZE in 
> > > include/sysemu/sysemu.h)
> > > on the size of the data accepted into the packaged blob.  How big is the 
> > > htab data likely to be?
> > 
> > HTAB size is a variable and depends on maxmem size. It will be 1/128
> > th of maxmem. So for a 32G guest, HTAB will be 256M in size.
> 
> OK, that does get a bit big.
> Two possible fixes;
>  1 - postcopy htab (I don't know htab to know how hard that is)

It's.. awkward.  We'd need a way to set up the mappings on the
destination so that faults on bits of the hash table not yet up to
date get flagged and handed to qemu, rather than causing a fatal fault
in the guest.  I suspect that will need host kernel changes, although
maybe there's a way of setting up the htab on destination so that
unmapping things look like MMIO (which already goes to qemu).

>  2 - do one pass of iterable/non-postcopiable devices before we start the 
> package;
>  I'm just writing a patch to try that; I'll send it to you to let
>  you try once I get it to not-break normal migration.

Hm.  So, depends a bit on what you mean by "one pass".  If we've had
one complete pass through the hash table, I'd expect that to be enough
to get the package down to a reasonable size.  But one pass through
the full hash table can be multiple calls to the htab iterator.

Which makes me think it's a bit odd that we're not already getting
most of the htab data across during the precopy phase.  Don't we
already delay entering the postcopy phase until precopy is "complete"
in the sense that the remaining non-postcopiable data is below the
downtime limit?  I would have thought that would also ensure we'd only
have a reasonable number of remaining htab updates for the package.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH for-2.5 1/1] spapr: Handle failure of KVM_PPC_ALLOCATE_HTAB ioctl

2015-11-08 Thread David Gibson
On Tue, Nov 03, 2015 at 03:38:19PM +0530, Bharata B Rao wrote:
> KVM_PPC_ALLOCATE_HTAB ioctl can return -ENOMEM for KVM guests and QEMU
> never handled this correctly. But this didn't cause any problems till
> now as KVM_PPC_ALLOCATE_HTAB ioctl returned with smaller than requested
> HTAB when enough contiguous memory wasn't available in the host.
> After the proposed kernel change: https://patchwork.ozlabs.org/patch/530501/,
> KVM_PPC_ALLOCATE_HTAB ioctl will not fallback to lower sized HTAB
> allocation and will fail if requested HTAB size can't be met.
> 
> Check for such failures in QEMU and abort appropriately. This will
> prevent guest kernel from hanging/freezing during early boot by doing
> graceful exit when host is unable to allocate requested HTAB.
> 
> Signed-off-by: Bharata B Rao 

I'm going to apply this, since it fixes a real problem.

I'm not entirely happy with the way it's done though - I'd prefer to
see a separate case for (shift < 0) giving an unconditional error.
Handling both the HV success case and the failure case in that first
branch is unnecessarily subtle and confusing, IMO.


> ---
>  hw/ppc/spapr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e1202ce..ec6e141 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1022,7 +1022,7 @@ static void spapr_alloc_htab(sPAPRMachineState *spapr)
>  
>  shift = kvmppc_reset_htab(spapr->htab_shift);
>  
> -if (shift > 0) {
> +if (shift != 0) {
>  /* Kernel handles htab, we don't need to allocate one */
>  if (shift != spapr->htab_shift) {
>  error_setg(_abort, "Failed to allocate HTAB of requested 
> size, try with smaller maxmem");
> @@ -1055,7 +1055,7 @@ static void spapr_reset_htab(sPAPRMachineState *spapr)
>  int index;
>  
>  shift = kvmppc_reset_htab(spapr->htab_shift);
> -if (shift > 0) {
> +if (shift != 0) {
>  if (shift != spapr->htab_shift) {
>  error_setg(_abort, "Requested HTAB allocation failed 
> during reset");
>  }

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v7 12/35] util: let qemu_fd_getlength support block device

2015-11-08 Thread Xiao Guangrong



On 11/06/2015 11:54 PM, Eduardo Habkost wrote:

On Mon, Nov 02, 2015 at 05:13:14PM +0800, Xiao Guangrong wrote:

lseek can not work for all block devices as the man page says:
| Some devices are incapable of seeking and POSIX does not specify
| which devices must support lseek().

This patch tries to add the support on Linux by using BLKGETSIZE64
ioctl

Signed-off-by: Xiao Guangrong 


On which cases is this patch necessary? Do you know any examples of
Linux block devices that won't work with lseek(SEEK_END)?


To be honest, i have not checked all block device, this patch was made
based on the man page. However, i do not mind drop this patch (and maybe
other patches) to make this pachset smaller. BLKGETSIZE64 can be added
in the future if we meet such device.





Re: [Qemu-devel] [PATCH v6 25/33] nvdimm acpi: build ACPI nvdimm devices

2015-11-08 Thread Xiao Guangrong



On 11/09/2015 01:38 AM, Michael S. Tsirkin wrote:

On Fri, Oct 30, 2015 at 01:56:19PM +0800, Xiao Guangrong wrote:

NVDIMM devices is defined in ACPI 6.0 9.20 NVDIMM Devices

There is a root device under \_SB and specified NVDIMM devices are under the
root device. Each NVDIMM device has _ADR which returns its handle used to
associate MEMDEV structure in NFIT

We reserve handle 0 for root device. In this patch, we save handle, handle,
arg1 and arg2 to dsm memory. Arg3 is conditionally saved in later patch

Signed-off-by: Xiao Guangrong 
---
  hw/acpi/nvdimm.c | 184 +++
  1 file changed, 184 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index dd84e5f..53ed675 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -368,6 +368,15 @@ static void nvdimm_build_nfit(GSList *device_list, GArray 
*table_offsets,
  g_array_free(structures, true);
  }

+struct NvdimmDsmIn {
+uint32_t handle;
+uint32_t revision;
+uint32_t function;
+   /* the remaining size in the page is used by arg3. */
+uint8_t arg3[0];
+} QEMU_PACKED;
+typedef struct NvdimmDsmIn NvdimmDsmIn;
+
  static uint64_t
  nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
  {
@@ -377,6 +386,7 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
  static void
  nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
  {
+fprintf(stderr, "BUG: we never write DSM notification IO Port.\n");
  }

  static const MemoryRegionOps nvdimm_dsm_ops = {
@@ -402,6 +412,179 @@ void nvdimm_init_acpi_state(MemoryRegion *memory, 
MemoryRegion *io,
  memory_region_add_subregion(io, NVDIMM_ACPI_IO_BASE, >io_mr);
  }

+#define BUILD_STA_METHOD(_dev_, _method_)  \
+do {   \
+_method_ = aml_method("_STA", 0);  \
+aml_append(_method_, aml_return(aml_int(0x0f)));   \
+aml_append(_dev_, _method_);   \
+} while (0)
+
+#define BUILD_DSM_METHOD(_dev_, _method_, _handle_, _uuid_)\
+do {   \
+Aml *ifctx, *uuid; \
+_method_ = aml_method("_DSM", 4);  \
+/* check UUID if it is we expect, return the errorcode if not.*/   \


check that UUID is what we expect?


Yes, it is, better English indeed.




+uuid = aml_touuid(_uuid_); \
+ifctx = aml_if(aml_lnot(aml_equal(aml_arg(0), uuid))); \
+aml_append(ifctx, aml_return(aml_int(1 /* Not Supported */))); \
+aml_append(method, ifctx); \
+aml_append(method, aml_return(aml_call4("NCAL", aml_int(_handle_), \
+   aml_arg(1), aml_arg(2), aml_arg(3;  \


So name NCAL here matches the name below.
Pls define a macro for it so we aren't limited
by silly 4-letter limitations of AML.
Same applies to all other names you use here and elsewhere.


Okay, it's good to me.




+aml_append(_dev_, _method_);   \
+} while (0)
+
+#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_) \
+aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
+
+#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_) \
+BUILD_FIELD_UNIT_SIZE(_field_, sizeof(typeof_field(_s_, _f_)), _name_)
+


why are these macros? Make them functions pls.


Since Igor thought it hidden the details and it was not good to the readers.
I will just drop this macros and inline it in the place where it is used.




Re: [Qemu-devel] [PATCH 1/1] migration: qemu_savevm_state_cleanup becomes mandatory operation

2015-11-08 Thread Denis V. Lunev

On 11/09/2015 08:10 AM, Li, Liang Z wrote:

since commit
 commit 94f5a43704129ca4995aa3385303c5ae225bde42
 Author: Liang Li 
 Date:   Mon Nov 2 15:37:00 2015 +0800

 migration: defer migration_end & blk_mig_cleanup

when actual .cleanup callbacks calling was removed from complete operations.

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Juan Quintela 
CC: Amit Shah 
---
  migration/savevm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c index e05158d..9f2230f
100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -942,8 +942,8 @@ static int qemu_savevm_state(QEMUFile *f, Error
**errp)
  qemu_savevm_state_complete(f);
  ret = qemu_file_get_error(f);
  }
+qemu_savevm_state_cleanup();
  if (ret != 0) {
-qemu_savevm_state_cleanup();
  error_setg_errno(errp, -ret, "Error while writing VM state");
  }
  return ret;
--
2.5.0



Yes, you are right. Thanks a lot.

BTW, can this patch fix the regression you reported?

Reviewed-by: Liang Li 


yes



Re: [Qemu-devel] [PATCH 1/1] migration: qemu_savevm_state_cleanup becomes mandatory operation

2015-11-08 Thread Li, Liang Z
> On 11/09/2015 08:10 AM, Li, Liang Z wrote:
> >> since commit
> >>  commit 94f5a43704129ca4995aa3385303c5ae225bde42
> >>  Author: Liang Li 
> >>  Date:   Mon Nov 2 15:37:00 2015 +0800
> >>
> >>  migration: defer migration_end & blk_mig_cleanup
> >>
> >> when actual .cleanup callbacks calling was removed from complete
> operations.
> >>
> >> Signed-off-by: Denis V. Lunev 
> >> CC: Paolo Bonzini 
> >> CC: Juan Quintela 
> >> CC: Amit Shah 
> >> ---
> >>   migration/savevm.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/migration/savevm.c b/migration/savevm.c index
> e05158d..9f2230f
> >> 100644
> >> --- a/migration/savevm.c
> >> +++ b/migration/savevm.c
> >> @@ -942,8 +942,8 @@ static int qemu_savevm_state(QEMUFile *f, Error
> >> **errp)
> >>   qemu_savevm_state_complete(f);
> >>   ret = qemu_file_get_error(f);
> >>   }
> >> +qemu_savevm_state_cleanup();
> >>   if (ret != 0) {
> >> -qemu_savevm_state_cleanup();
> >>   error_setg_errno(errp, -ret, "Error while writing VM state");
> >>   }
> >>   return ret;
> >> --
> >> 2.5.0
> >>
> >
> > Yes, you are right. Thanks a lot.
> >
> > BTW, can this patch fix the regression you reported?
> >
> > Reviewed-by: Liang Li 
> >
> yes

Great.  You'd better change the commit message to make it more clear.

Liang 



Re: [Qemu-devel] [PATCH v2] virtio-blk: trivial code optimization

2015-11-08 Thread Fam Zheng
On Mon, 11/09 11:19, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> 1. avoid possible superflous checking
> 2. make code more robustness
> 
> Signed-off-by: Gonglei 
> ---
> v2: address Paolo's comments, thanks.
> ---
>  hw/block/virtio-blk.c | 27 +--
>  1 file changed, 9 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 093e475..21f8d72 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -404,24 +404,15 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
> MultiReqBuffer *mrb)
>  for (i = 0; i < mrb->num_reqs; i++) {
>  VirtIOBlockReq *req = mrb->reqs[i];
>  if (num_reqs > 0) {
> -bool merge = true;
> -
> -/* merge would exceed maximum number of IOVs */
> -if (niov + req->qiov.niov > IOV_MAX) {
> -merge = false;
> -}
> -
> -/* merge would exceed maximum transfer length of backend device 
> */
> -if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
> max_xfer_len) {
> -merge = false;
> -}
> -
> -/* requests are not sequential */
> -if (sector_num + nb_sectors != req->sector_num) {
> -merge = false;
> -}
> -
> -if (!merge) {
> +/*
> + * NOTE: We cannot merge the requests in below situations:
> + * 1. requests are not sequential
> + * 2. merge would exceed maximum number of IOVs
> + * 3. merge would exceed maximum transfer length of backend 
> device
> + */
> +if (sector_num + nb_sectors != req->sector_num ||
> +niov > IOV_MAX - req->qiov.niov ||
> +req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
> max_xfer_len) {
>  submit_requests(blk, mrb, start, num_reqs, niov);
>  num_reqs = 0;
>  }
> -- 
> 1.7.12.4
> 
> 
> 

Reviewed-by: Fam Zheng 



[Qemu-devel] [PATCH qemu] spapr: Add /system-id

2015-11-08 Thread Alexey Kardashevskiy
Section B.6.2.1 Root Node Properties of PAPR specification defines
a set of properties which shall be present in the device tree root,
one of these properties is "system-id" which "should be unique across
all systems and all manufacturers". Since UUID is meant to be unique,
it makes sense to use it as "system-id".

This adds "system-id" property to the device tree root when not empty.

Signed-off-by: Alexey Kardashevskiy 
---

This might be expected by AIX so here is the patch.
I am really not sure if it makes sense to initialize property when
UUID is all zeroes as the requirement is "unique" and zero-uuid is not.

---
 hw/ppc/spapr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index de77528..e8b407d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -374,6 +374,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
   qemu_uuid[14], qemu_uuid[15]);
 
 _FDT((fdt_property_string(fdt, "vm,uuid", buf)));
+if (qemu_uuid_set) {
+_FDT((fdt_property_string(fdt, "system-id", buf)));
+}
 g_free(buf);
 
 if (qemu_get_vm_name()) {
-- 
2.5.0.rc3




[Qemu-devel] [PATCH for 2.5 v2 1/1] migration: qemu_savevm_state_cleanup becomes mandatory operation

2015-11-08 Thread Denis V. Lunev
since commit
commit 94f5a43704129ca4995aa3385303c5ae225bde42
Author: Liang Li 
Date:   Mon Nov 2 15:37:00 2015 +0800

migration: defer migration_end & blk_mig_cleanup

when actual .cleanup callbacks calling was removed from complete operations.

The patch fixes regression introduced by the commit above results in
100% reliable assert for virtio-scsi VM with iothreads enabled during
'virsh create-snapshot' operation:
assert(i != mr->ioeventfd_nb);
memory_region_del_eventfd
virtio_pci_set_host_notifier_internal
virtio_pci_set_host_notifier
virtio_scsi_dataplane_start
virtio_scsi_handle_cmd
virtio_queue_notify_vq
virtio_queue_host_notifier_read
aio_dispatch

Signed-off-by: Denis V. Lunev 
Reviewed-by: Liang Li 
CC: Paolo Bonzini 
CC: Juan Quintela 
CC: Amit Shah 
---
 migration/savevm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index e05158d..9f2230f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -942,8 +942,8 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 qemu_savevm_state_complete(f);
 ret = qemu_file_get_error(f);
 }
+qemu_savevm_state_cleanup();
 if (ret != 0) {
-qemu_savevm_state_cleanup();
 error_setg_errno(errp, -ret, "Error while writing VM state");
 }
 return ret;
-- 
2.1.4




Re: [Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop is hanging forever

2015-11-08 Thread Denis V. Lunev

On 11/09/2015 10:19 AM, Denis V. Lunev wrote:

On 11/09/2015 06:10 AM, Alexandre DERUMIER wrote:

Hi,

with qemu (2.4.1), if I do an internal snapshot of an rbd device,
then I pause the vm with vm_stop,

the qemu process is hanging forever


monitor commands to reproduce:


# snapshot_blkdev_internal drive-virtio0 yoursnapname
# stop




I don't see this with qcow2 or sheepdog block driver for example.


Regards,

Alexandre


this could look like the problem I have recenty trying to
fix with dataplane enabled. Patch series is named as

[PATCH for 2.5 v6 0/10] dataplane snapshot fixes

Den


anyway, even if above will not help, can you collect gdb
traces from all threads in QEMU process. May be I'll be
able to give a hit.

Den



Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol

2015-11-08 Thread zhanghailiang

On 2015/11/9 14:51, zhanghailiang wrote:

On 2015/11/7 2:26, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

We need communications protocol of user-defined to control the checkpoint
process.

The new checkpoint request is started by Primary VM, and the interactive process
like below:
Checkpoint synchronizing points,

Primary Secondary
'checkpoint-request'   @ ->
Suspend (In hybrid mode)
'checkpoint-reply' <-- @
Suspend state


Why is this initial pair necessary?  Can't you just start with the vmstate-send
and save the extra request pair/round trip?  On the 2nd checkpoint we know
the SVM already received the previous checkpoint because we got it's first
vmstate-load.





Er, i have made some mistake, before 'checkpoint-request' command, we have a 
'checkpoint-ready'
communication, which is sent by SVM to PVM, to tell PVM that SVM is ready for 
checkpoint.
We do the initial work before send 'checkpoint-ready' in Secondary.

So, yes, you are right, this 'checkpoint-reply' is unnecessary for simple 
checkpoint mode.
I will remove it or maybe just add more comment about this, and add 
'checkpoint-ready' in the comment.


Yes, we can certainly drop this handshaking in simple checkpoint mode.
But we still need to do some initial work (preparing work) in simple checkpoint 
mode.
And i'm not sure if this initial work is time-wasting or not. We choose to do 
this preparing work
before send the 'checkpoint-reply' to reducing VM's STOP time as possible as we 
can.


I guess in full-COLO (rather than simple checkpoint) you can get the secondary 
to
do some of it's stopping/cleanup after it sends the checkpoint-reply


Actually, we do it before it sends 'checkpoint-reply' :)


but before vmstate-send, so you can hide some of the time.




(Perhaps add a comment to explain)



OK, I will add more comment about this.


'vmstate-send' @ ->
Send state  Receive state
'vmstate-received' <-- @
Release packets Load state
'vmstate-load' <-- @
Resume  Resume (In hybrid mode)

Start Comparing (In hybrid mode)
NOTE:
  1) '@' who sends the message
  2) Every sync-point is synchronized by two sides with only
 one handshake(single direction) for low-latency.
 If more strict synchronization is required, a opposite direction
 sync-point should be added.
  3) Since sync-points are single direction, the remote side may
 go forward a lot when this side just receives the sync-point.
  4) For now, we only support 'periodic' checkpoint, for which
the Secondary VM is not running, later we will support 'hybrid' mode.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Gonglei 
Cc: Eric Blake 
---
v10:
- Rename enum COLOCmd to COLOCommand (Eric's suggestion).
- Remove unused 'ram-steal'
---
  migration/colo.c | 192 ++-
  qapi-schema.json |  27 
  trace-events |   2 +
  3 files changed, 219 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4fdf3a9..2510762 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,10 +10,12 @@
   * later.  See the COPYING file in the top-level directory.
   */

+#include 
  #include "sysemu/sysemu.h"
  #include "migration/colo.h"
  #include "trace.h"
  #include "qemu/error-report.h"
+#include "qemu/sockets.h"

  bool colo_supported(void)
  {
@@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void)
  return mis && (mis->state == MIGRATION_STATUS_COLO);
  }

+/* colo checkpoint control helper */
+static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
+{
+int ret = 0;
+
+qemu_put_be32(f, cmd);
+qemu_put_be64(f, value);
+qemu_fflush(f);
+
+ret = qemu_file_get_error(f);
+trace_colo_ctl_put(COLOCommand_lookup[cmd], value);
+
+return ret;
+}
+
+static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd)
+{
+int ret = 0;
+
+*cmd = qemu_get_be32(f);
+ret = qemu_file_get_error(f);
+if (ret < 0) {
+return ret;
+}
+if (*cmd >= COLO_COMMAND_MAX) {
+error_report("Invalid colo command, get cmd:%d", *cmd);
+return -EINVAL;
+}
+trace_colo_ctl_get(COLOCommand_lookup[*cmd]);
+
+return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint32_t require)
+{
+int ret;
+uint32_t cmd;
+uint64_t value;
+
+ret = colo_ctl_get_cmd(f, );
+if (ret < 0) {
+return ret;
+}
+if (cmd != 

Re: [Qemu-devel] [PATCH 1/1] migration: qemu_savevm_state_cleanup becomes mandatory operation

2015-11-08 Thread Amit Shah
CC'ing Liang Li, author of the patch.

On (Sat) 07 Nov 2015 [18:40:12], Denis V. Lunev wrote:
> since commit
> commit 94f5a43704129ca4995aa3385303c5ae225bde42
> Author: Liang Li 
> Date:   Mon Nov 2 15:37:00 2015 +0800
> 
> migration: defer migration_end & blk_mig_cleanup
> 
> when actual .cleanup callbacks calling was removed from complete operations.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Paolo Bonzini 
> CC: Juan Quintela 
> CC: Amit Shah 
> ---
>  migration/savevm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index e05158d..9f2230f 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -942,8 +942,8 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
>  qemu_savevm_state_complete(f);
>  ret = qemu_file_get_error(f);
>  }
> +qemu_savevm_state_cleanup();
>  if (ret != 0) {
> -qemu_savevm_state_cleanup();
>  error_setg_errno(errp, -ret, "Error while writing VM state");
>  }
>  return ret;
> -- 
> 2.5.0
> 

Amit



Re: [Qemu-devel] [PATCH v7 11/35] util: introduce qemu_file_getlength()

2015-11-08 Thread Xiao Guangrong



On 11/06/2015 11:50 PM, Eduardo Habkost wrote:

As this patch affects raw_getlength(), CCing the raw block driver
maintainer and the qemu-block mailing list.


Eduardo, thanks for your reminder. I will keep CCing Kevin and qemu-block mail
list for future version.



On Mon, Nov 02, 2015 at 05:13:13PM +0800, Xiao Guangrong wrote:

It is used to get the size of the specified file, also qemu_fd_getlength()
is introduced to unify the code with raw_getlength() in block/raw-posix.c

Signed-off-by: Xiao Guangrong 
---
  block/raw-posix.c|  7 +--
  include/qemu/osdep.h |  2 ++
  util/osdep.c | 31 +++


I know I was the one who suggested osdep.c, but maybe oslib-posix.c is a
more appropriate place for the new function?



Since the function we introduced here can work on both windows and posix, so
i thing osdep.c is the right place. Otherwise we should implement it for 
multiple
platforms.



Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)

2015-11-08 Thread David Gibson
On Mon, Nov 09, 2015 at 12:57:48PM +1100, Alexey Kardashevskiy wrote:
> On 11/05/2015 10:06 AM, Sukadev Bhattiprolu wrote:
> >Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm()
> >call in qemu. This call returns the processor module (socket), chip and core
> >information as specified in section 7.3.16.18 of PAPR v2.7.
> >
> >We walk the /proc/device-tree to determine the number of chips, cores and
> >modules in the _host_ system and return this info to the guest application
> >that makes the rtas_get_sysparm() call.
> >
> >We currently hard code the number of module_types to 1, but we should 
> >determine
> >that dynamically somehow later.
> >
> >Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk.
> 
> "iy" is missing :)
> 
> 
> >
> >Signed-off-by: Sukadev Bhattiprolu 
> >---
> >Changelog[v2]:
> > [Alexey Kardashevsk] Use existing interfaces like ldl_be_p(),
> > stw_be_phys(), g_hash_table_new_full(), error_report() 
> > rather
> > than re-inventing; fix indentation, function prottypes etc;
> > Drop the fts() interface (which doesn't seem to be available
> > on mingw32/mingw64) and use opendir() to walk specific
> > directories in the directory tree.
> >---
> >  hw/ppc/Makefile.objs|   1 +
> >  hw/ppc/spapr_rtas.c |  35 +++
> >  hw/ppc/spapr_rtas_modinfo.c | 230 
> > 
> >  hw/ppc/spapr_rtas_modinfo.h |  12 +++
> >  include/hw/ppc/spapr.h  |   1 +
> >  5 files changed, 279 insertions(+)
> >  create mode 100644 hw/ppc/spapr_rtas_modinfo.c
> >  create mode 100644 hw/ppc/spapr_rtas_modinfo.h
> >
> >diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> >index c1ffc77..57c6b02 100644
> >--- a/hw/ppc/Makefile.objs
> >+++ b/hw/ppc/Makefile.objs
> >@@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o
> >  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
> >  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> >  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> >+obj-$(CONFIG_PSERIES) += spapr_rtas_modinfo.o
> >  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
> >  obj-y += spapr_pci_vfio.o
> >  endif
> >diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> >index 34b12a3..41fd8a6 100644
> >--- a/hw/ppc/spapr_rtas.c
> >+++ b/hw/ppc/spapr_rtas.c
> >@@ -34,6 +34,8 @@
> >  #include "hw/ppc/spapr.h"
> >  #include "hw/ppc/spapr_vio.h"
> >  #include "qapi-event.h"
> >+
> >+#include "spapr_rtas_modinfo.h"
> >  #include "hw/boards.h"
> >
> >  #include 
> >@@ -240,6 +242,39 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU 
> >*cpu,
> >  target_ulong ret = RTAS_OUT_SUCCESS;
> >
> >  switch (parameter) {
> >+case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: {
> >+int i;
> >+int offset = 0;
> >+int size;
> 
> Nit - could be one line.
> 
> 
> >+struct rtas_module_info modinfo;
> >+
> >+if (rtas_get_module_info()) {
> >+break;
> 
> 
> @ret will be still 0 in this case, set @ret to RTAS_OUT_HW_ERROR here.
> 
> Also, rtas_ibm_set_system_parameter() must return RTAS_OUT_NOT_AUTHORIZED on
> RTAS_SYSPARM_PROCESSOR_MODULE_INFO, as PAPR says.
> 
> 
> >+}
> >+
> >+size = sizeof(modinfo);
> >+size += (modinfo.module_types - 1) * sizeof(struct 
> >rtas_socket_info);
> >+
> >+stw_be_phys(_space_memory, buffer+offset, size);
> >+offset += 2;
> >+
> >+stw_be_phys(_space_memory, buffer+offset, 
> >modinfo.module_types);
> >+offset += 2;
> >+
> >+for (i = 0; i < modinfo.module_types; i++) {
> >+stw_be_phys(_space_memory, buffer+offset,
> >+modinfo.si[i].sockets);
> 
> 
> checkpatch.pl does not warn on this but new lines start under opening brace
> in the previous line.
> 
> In terms on vim, it would be:
> set expandtab
> set tabstop=4
> set shiftwidth=4
> set cino=:0,(0
> 
> 
> 
> >+offset += 2;
> >+stw_be_phys(_space_memory, buffer+offset,
> >+modinfo.si[i].chips);
> >+offset += 2;
> >+stw_be_phys(_space_memory, buffer+offset,
> >+modinfo.si[i].cores_per_chip);
> >+offset += 2;
> >+}
> >+break;
> >+}
> >+
> >  case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
> >  char *param_val = g_strdup_printf("MaxEntCap=%d,"
> >"DesMem=%llu,"
> >diff --git a/hw/ppc/spapr_rtas_modinfo.c b/hw/ppc/spapr_rtas_modinfo.c
> >new file mode 100644
> >index 000..068fc2c
> >--- /dev/null
> >+++ b/hw/ppc/spapr_rtas_modinfo.c
> >@@ -0,0 +1,230 @@
> >+/*
> >+ * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System 
> >Emulator
> >+ *
> >+ * Hypercall based emulated RTAS
> 
> 
> This is a description of 

Re: [Qemu-devel] [PATCH v2 1/1] target-ppc: Implement rtas_get_sysparm(PROCESSOR_MODULE_INFO)

2015-11-08 Thread David Gibson
On Wed, Nov 04, 2015 at 03:06:05PM -0800, Sukadev Bhattiprolu wrote:
> Implement RTAS_SYSPARM_PROCESSOR_MODULE_INFO parameter to rtas_get_sysparm()
> call in qemu. This call returns the processor module (socket), chip and core
> information as specified in section 7.3.16.18 of PAPR v2.7.

PAPR v2.7 isn't available publically.  For upstream patches, please
reference LoPAPR instead (where it's section 7.3.16.17 AFAICT).

> We walk the /proc/device-tree to determine the number of chips, cores and
> modules in the _host_ system and return this info to the guest application
> that makes the rtas_get_sysparm() call.
> 
> We currently hard code the number of module_types to 1, but we should 
> determine
> that dynamically somehow later.
> 
> Thanks to input from Nishanth Aravamudan and Alexey Kardashevsk.
> 
> Signed-off-by: Sukadev Bhattiprolu 

This isn't ready to go yet - you need to put some more consideration
into the uncommon cases: PR KVM, TCG and non-Power hosts.

> ---
> Changelog[v2]:
> [Alexey Kardashevsk] Use existing interfaces like ldl_be_p(),
> stw_be_phys(), g_hash_table_new_full(), error_report() rather
> than re-inventing; fix indentation, function prottypes etc;
> Drop the fts() interface (which doesn't seem to be available
> on mingw32/mingw64) and use opendir() to walk specific
> directories in the directory tree.
> ---
>  hw/ppc/Makefile.objs|   1 +
>  hw/ppc/spapr_rtas.c |  35 +++
>  hw/ppc/spapr_rtas_modinfo.c | 230 
> 
>  hw/ppc/spapr_rtas_modinfo.h |  12 +++
>  include/hw/ppc/spapr.h  |   1 +
>  5 files changed, 279 insertions(+)
>  create mode 100644 hw/ppc/spapr_rtas_modinfo.c
>  create mode 100644 hw/ppc/spapr_rtas_modinfo.h
> 
> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> index c1ffc77..57c6b02 100644
> --- a/hw/ppc/Makefile.objs
> +++ b/hw/ppc/Makefile.objs
> @@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o
>  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
>  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> +obj-$(CONFIG_PSERIES) += spapr_rtas_modinfo.o
>  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
>  obj-y += spapr_pci_vfio.o
>  endif
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 34b12a3..41fd8a6 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -34,6 +34,8 @@
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
>  #include "qapi-event.h"
> +
> +#include "spapr_rtas_modinfo.h"
>  #include "hw/boards.h"
>  
>  #include 
> @@ -240,6 +242,39 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU 
> *cpu,
>  target_ulong ret = RTAS_OUT_SUCCESS;
>  
>  switch (parameter) {
> +case RTAS_SYSPARM_PROCESSOR_MODULE_INFO: {
> +int i;
> +int offset = 0;
> +int size;
> +struct rtas_module_info modinfo;
> +
> +if (rtas_get_module_info()) {
> +break;
> +}

So, you handle the variable size of this structure before sending it
to the guest, but you don't handle it in allocation of the structure
right here.  You'll get away with that because for now you only ever
have one entry in the sockets array, but it's a bit icky.

> +
> +size = sizeof(modinfo);
> +size += (modinfo.module_types - 1) * sizeof(struct rtas_socket_info);

More seriously, this calculation will break horribly if you change the
size of the array in struct rtas_module_info.

> +stw_be_phys(_space_memory, buffer+offset, size);
> +offset += 2;
> +
> +stw_be_phys(_space_memory, buffer+offset, 
> modinfo.module_types);
> +offset += 2;
> +
> +for (i = 0; i < modinfo.module_types; i++) {
> +stw_be_phys(_space_memory, buffer+offset,
> +modinfo.si[i].sockets);
> +offset += 2;
> +stw_be_phys(_space_memory, buffer+offset,
> +modinfo.si[i].chips);
> +offset += 2;
> +stw_be_phys(_space_memory, buffer+offset,
> +modinfo.si[i].cores_per_chip);
> +offset += 2;
> +}
> +break;
> +}
> +
>  case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
>  char *param_val = g_strdup_printf("MaxEntCap=%d,"
>"DesMem=%llu,"
> diff --git a/hw/ppc/spapr_rtas_modinfo.c b/hw/ppc/spapr_rtas_modinfo.c
> new file mode 100644
> index 000..068fc2c
> --- /dev/null
> +++ b/hw/ppc/spapr_rtas_modinfo.c
> @@ -0,0 +1,230 @@
> +/*
> + * QEMU PowerPC pSeries Logical Partition (aka sPAPR) hardware System 
> Emulator
> + *
> + * Hypercall based emulated RTAS
> + *
> + * Copyright (c) 2015 Sukadev Bhattiprolu, IBM Corporation.
> + *
> + * Permission is hereby 

Re: [Qemu-devel] [PATCH COLO-Frame v10 06/38] migration: Integrate COLO checkpoint process into loadvm

2015-11-08 Thread zhanghailiang

On 2015/11/7 1:29, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.
We add three new members to struct MigrationIncomingState, 
'have_colo_incoming_thread'
and 'colo_incoming_thread' record the colo related threads for secondary VM,
'migration_incoming_co' records the original migration incoming coroutine.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
---
v10: fix a bug about fd leak which is found by Dave.





diff --git a/migration/migration.c b/migration/migration.c
index cf83531..7d8cd38 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -288,6 +288,27 @@ static void process_incoming_migration_co(void *opaque)
MIGRATION_STATUS_ACTIVE);
  ret = qemu_loadvm_state(f);

+if (!ret) {
+/* Make sure all file formats flush their mutable metadata */
+bdrv_invalidate_cache_all(_err);
+if (local_err) {
+error_report_err(local_err);
+migrate_decompress_threads_join();
+exit(EXIT_FAILURE);
+}
+}


Are you moving this code? Because I think the bdrv_invalidate_cache_all is a 
few lines
below here - just


+/* we get colo info, and know if we are in colo mode */
+if (!ret && migration_incoming_enable_colo()) {
+mis->migration_incoming_co = qemu_coroutine_self();
+qemu_thread_create(>colo_incoming_thread, "colo incoming",
+ colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE);
+mis->have_colo_incoming_thread = true;
+qemu_coroutine_yield();
+
+/* Wait checkpoint incoming thread exit before free resource */
+qemu_thread_join(>colo_incoming_thread);
+}
+
  qemu_fclose(f);
  free_xbzrle_decoded_buf();
  migration_incoming_state_destroy();



 here in my current head world; so shouldn't you be deleting
the bdrv_invalidate_cache_all here?



Good catch! I deleted it in patch 38, which should be moved into
this patch. I will fix it in next version.

Thanks,
zhanghailiang


(Otherwise OK)

Dave


diff --git a/stubs/migration-colo.c b/stubs/migration-colo.c
index acddca6..c12516e 100644
--- a/stubs/migration-colo.c
+++ b/stubs/migration-colo.c
@@ -22,6 +22,16 @@ bool migration_in_colo_state(void)
  return false;
  }

+bool migration_incoming_in_colo_state(void)
+{
+return false;
+}
+
  void migrate_start_colo_process(MigrationState *s)
  {
  }
+
+void *colo_process_incoming_thread(void *opaque)
+{
+return NULL;
+}
--
1.8.3.1



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

.







Re: [Qemu-devel] [PATCH 1/1] migration: qemu_savevm_state_cleanup becomes mandatory operation

2015-11-08 Thread Denis V. Lunev

On 11/09/2015 09:16 AM, Li, Liang Z wrote:

On 11/09/2015 08:10 AM, Li, Liang Z wrote:

since commit
  commit 94f5a43704129ca4995aa3385303c5ae225bde42
  Author: Liang Li 
  Date:   Mon Nov 2 15:37:00 2015 +0800

  migration: defer migration_end & blk_mig_cleanup

when actual .cleanup callbacks calling was removed from complete

operations.

Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Juan Quintela 
CC: Amit Shah 
---
   migration/savevm.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c index

e05158d..9f2230f

100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -942,8 +942,8 @@ static int qemu_savevm_state(QEMUFile *f, Error
**errp)
   qemu_savevm_state_complete(f);
   ret = qemu_file_get_error(f);
   }
+qemu_savevm_state_cleanup();
   if (ret != 0) {
-qemu_savevm_state_cleanup();
   error_setg_errno(errp, -ret, "Error while writing VM state");
   }
   return ret;
--
2.5.0


Yes, you are right. Thanks a lot.

BTW, can this patch fix the regression you reported?

Reviewed-by: Liang Li 


yes

Great.  You'd better change the commit message to make it more clear.

Liang

argh.. you are right...

This problem has appeared in the end of big rework
of another problem with snapshots and dataplane.
Sorry that this is not clear that regression is fixed.
I'll resend the patch with better commit message

Den



Re: [Qemu-devel] [PATCH COLO-Frame v10 09/38] COLO: Implement colo checkpoint protocol

2015-11-08 Thread zhanghailiang

On 2015/11/7 2:26, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

We need communications protocol of user-defined to control the checkpoint
process.

The new checkpoint request is started by Primary VM, and the interactive process
like below:
Checkpoint synchronizing points,

Primary Secondary
'checkpoint-request'   @ ->
Suspend (In hybrid mode)
'checkpoint-reply' <-- @
Suspend state


Why is this initial pair necessary?  Can't you just start with the vmstate-send
and save the extra request pair/round trip?  On the 2nd checkpoint we know
the SVM already received the previous checkpoint because we got it's first
vmstate-load.



Yes, we can certainly drop this handshaking in simple checkpoint mode.
But we still need to do some initial work (preparing work) in simple checkpoint 
mode.
And i'm not sure if this initial work is time-wasting or not. We choose to do 
this preparing work
before send the 'checkpoint-reply' to reducing VM's STOP time as possible as we 
can.


I guess in full-COLO (rather than simple checkpoint) you can get the secondary 
to
do some of it's stopping/cleanup after it sends the checkpoint-reply


Actually, we do it before it sends 'checkpoint-reply' :)


but before vmstate-send, so you can hide some of the time.




(Perhaps add a comment to explain)



OK, I will add more comment about this.


'vmstate-send' @ ->
Send state  Receive state
'vmstate-received' <-- @
Release packets Load state
'vmstate-load' <-- @
Resume  Resume (In hybrid mode)

Start Comparing (In hybrid mode)
NOTE:
  1) '@' who sends the message
  2) Every sync-point is synchronized by two sides with only
 one handshake(single direction) for low-latency.
 If more strict synchronization is required, a opposite direction
 sync-point should be added.
  3) Since sync-points are single direction, the remote side may
 go forward a lot when this side just receives the sync-point.
  4) For now, we only support 'periodic' checkpoint, for which
the Secondary VM is not running, later we will support 'hybrid' mode.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
Signed-off-by: Gonglei 
Cc: Eric Blake 
---
v10:
- Rename enum COLOCmd to COLOCommand (Eric's suggestion).
- Remove unused 'ram-steal'
---
  migration/colo.c | 192 ++-
  qapi-schema.json |  27 
  trace-events |   2 +
  3 files changed, 219 insertions(+), 2 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4fdf3a9..2510762 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,10 +10,12 @@
   * later.  See the COPYING file in the top-level directory.
   */

+#include 
  #include "sysemu/sysemu.h"
  #include "migration/colo.h"
  #include "trace.h"
  #include "qemu/error-report.h"
+#include "qemu/sockets.h"

  bool colo_supported(void)
  {
@@ -34,6 +36,103 @@ bool migration_incoming_in_colo_state(void)
  return mis && (mis->state == MIGRATION_STATUS_COLO);
  }

+/* colo checkpoint control helper */
+static int colo_ctl_put(QEMUFile *f, uint32_t cmd, uint64_t value)
+{
+int ret = 0;
+
+qemu_put_be32(f, cmd);
+qemu_put_be64(f, value);
+qemu_fflush(f);
+
+ret = qemu_file_get_error(f);
+trace_colo_ctl_put(COLOCommand_lookup[cmd], value);
+
+return ret;
+}
+
+static int colo_ctl_get_cmd(QEMUFile *f, uint32_t *cmd)
+{
+int ret = 0;
+
+*cmd = qemu_get_be32(f);
+ret = qemu_file_get_error(f);
+if (ret < 0) {
+return ret;
+}
+if (*cmd >= COLO_COMMAND_MAX) {
+error_report("Invalid colo command, get cmd:%d", *cmd);
+return -EINVAL;
+}
+trace_colo_ctl_get(COLOCommand_lookup[*cmd]);
+
+return 0;
+}
+
+static int colo_ctl_get(QEMUFile *f, uint32_t require)
+{
+int ret;
+uint32_t cmd;
+uint64_t value;
+
+ret = colo_ctl_get_cmd(f, );
+if (ret < 0) {
+return ret;
+}
+if (cmd != require) {
+error_report("Unexpect colo command, expect:%d, but get cmd:%d",
+ require, cmd);
+return -EINVAL;
+}
+
+value = qemu_get_be64(f);
+ret = qemu_file_get_error(f);
+if (ret < 0) {
+return ret;
+}
+
+return value;
+}


Should the return type be uint64_t since you're returning value?
But then you're also using it to return an error code; so perhaps
it might be better to have a uint64_t *valueparameter to
return the 

Re: [Qemu-devel] [PATCH 3/3] block/gluster: add support for multiple gluster servers

2015-11-08 Thread Peter Krempa
On Thu, Nov 05, 2015 at 07:45:50 -0500, Prasanna Kumar Kalever wrote:
> On Thursday, November 5, 2015 6:07:06 PM, Prasanna Kumar Kalever wrote:
> > This patch adds a way to specify multiple volfile servers to the gluster
> > block backend of QEMU with tcp|rdma transport types and their port numbers.
> > 
> > Problem:
> > 
> > Currently VM Image on gluster volume is specified like this:
> 

[...]

> > @@ -345,7 +676,7 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict
> > *options,
> >  
> >  out:
> >  qemu_opts_del(opts);
> > -qemu_gluster_gconf_free(gconf);
> > +qapi_free_BlockdevOptionsGluster(gconf);
> 
> Can some one help me please ?
> This leads to crash in the second iteration i.e. while freeing 
> "gconf->servers->next->value"

So, prior to this you allocate a array of the data structures as:

+gsconf = g_new0(GlusterServer, num_servers);
+
+ptr = qemu_opt_get(opts, GLUSTER_OPT_VOLUME);
+if (!ptr) {
+error_setg(_err, "Error: qemu_gluster: please provide 'volume' "
+   "option");
+goto out;
+}

Then you use the following code to fill the linked list:

+  if (gconf->servers == NULL) {
+gconf->servers = g_new0(GlusterServerList, 1);
+gconf->servers->value = [i];

So here you set the value. For a i of 0 the '[i]' expression will
be a pointer with equal address to 'gsconf'. For explanation:

'gsconf[i]' can be written as '*(gsconf + i)', so
'[i]' becomes basically '&(*(gsconf + i))'

This can be also simplified to:
'gsconf + i'. For a i of 0 this becomes the same pointer as 'gsconf'

And once you use that with free(), the whole gsconf array will be freed.
All the other pointers that you've filled to the linked list become
invalid, since they were pointing into the same array that was
completely freed in the first iteration.

Peter


signature.asc
Description: Digital signature


[Qemu-devel] [PATCH V5] block/nfs: add support for setting debug level

2015-11-08 Thread Peter Lieven
recent libnfs versions support logging debug messages. Add
support for it in qemu through an URL parameter.

Example:
 qemu -cdrom nfs://127.0.0.1/iso/my.iso?debug=2

Signed-off-by: Peter Lieven 
---
v4->v5: add a comment in the code why we limit the debug level [Stefan]
v3->v4: revert to the initial version, but limit max debug level
v2->v3: use a per-drive option instead of a global one. [Stefan]
v1->v2: reworked patch to accept the debug level as a cmdline
parameter instead of an URI parameter [Stefan]

 block/nfs.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/block/nfs.c b/block/nfs.c
index fd79f89..ab1e267 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -36,6 +36,7 @@
 #include 
 
 #define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
+#define QEMU_NFS_MAX_DEBUG_LEVEL 2
 
 typedef struct NFSClient {
 struct nfs_context *context;
@@ -334,6 +335,17 @@ static int64_t nfs_client_open(NFSClient *client, const 
char *filename,
 }
 nfs_set_readahead(client->context, val);
 #endif
+#ifdef LIBNFS_FEATURE_DEBUG
+} else if (!strcmp(qp->p[i].name, "debug")) {
+/* limit the maximum debug level to avoid potential flooding
+ * of our log files. */
+if (val > QEMU_NFS_MAX_DEBUG_LEVEL) {
+error_report("NFS Warning: Limiting NFS debug level"
+ " to %d", QEMU_NFS_MAX_DEBUG_LEVEL);
+val = QEMU_NFS_MAX_DEBUG_LEVEL;
+}
+nfs_set_debug(client->context, val);
+#endif
 } else {
 error_setg(errp, "Unknown NFS parameter name: %s",
qp->p[i].name);
-- 
1.9.1




[Qemu-devel] [PATCH for-2.5 v4 3/3] arm: highbank: Implement PSCI and dummy monitor

2015-11-08 Thread Peter Crosthwaite
Firstly, enable monitor mode and PSCI, both of which are features of
this board.

In addition to PSCI, this board also uses SMC for cache maintenance
ops. This means we need a secure monitor to catch these and nop them.
Use the ARM boot board-setup feature to implement this. The SMC trap
implements the needed nop while all other traps will pen the CPU.

As a KVM CPU cannot run in secure mode, do not do the board-setup if
not running TCG. Report a warning explaining the limitation in this
case.

Reviewed-by: Peter Maydell 
Signed-off-by: Peter Crosthwaite 
---
Changed since v3:
Commit message typos.
Changed tcg_enabled() check to kvm_enabled()
Reworded error_report() message.
clarified non-smc trap behaviour in commit message
Changed since v2:
Change to spinlock/movs trap table and drop fallthrough (PMM review)
Do not do board-setup if KVM is enabled. Issue a warning.
Changed since v1:
fallthrough all of trap table to nop implementation
use movw for table address
leave loader at 0
Move MVBAR (and blob to non-zero)
Split nop implementation from MVBAR setup
set secure boot for board
implement NS switch in blob
Changed since RFC:
Use bootloader callback to load blob.
Change "firmware" to "board-setup" for consistency.
Tweak commit message.

 hw/arm/highbank.c | 70 +++
 1 file changed, 60 insertions(+), 10 deletions(-)

diff --git a/hw/arm/highbank.c b/hw/arm/highbank.c
index f2e248b..85ae69e 100644
--- a/hw/arm/highbank.c
+++ b/hw/arm/highbank.c
@@ -22,6 +22,7 @@
 #include "hw/devices.h"
 #include "hw/loader.h"
 #include "net/net.h"
+#include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/boards.h"
 #include "sysemu/block-backend.h"
@@ -32,10 +33,52 @@
 #define SMP_BOOT_REG0x40
 #define MPCORE_PERIPHBASE   0xfff1
 
+#define MVBAR_ADDR  0x200
+
 #define NIRQ_GIC160
 
 /* Board init.  */
 
+/* MVBAR_ADDR is limited by precision of movw */
+
+QEMU_BUILD_BUG_ON(MVBAR_ADDR >= (1 << 16));
+
+#define ARMV7_IMM16(x) (extract32((x),  0, 12) | \
+extract32((x), 12,  4) << 16)
+
+static void hb_write_board_setup(ARMCPU *cpu,
+ const struct arm_boot_info *info)
+{
+int n;
+uint32_t board_setup_blob[] = {
+/* MVBAR_ADDR */
+/* Default unimplemented and unused vectors to spin. Makes it
+ * easier to debug (as opposed to the CPU running away).
+ */
+0xeafe, /* notused1: b notused */
+0xeafe, /* notused2: b notused */
+0xe1b0f00e, /* smc: movs pc, lr - exception return */
+0xeafe, /* prefetch_abort: b prefetch_abort */
+0xeafe, /* data_abort: b data_abort */
+0xeafe, /* notused3: b notused3 */
+0xeafe, /* irq: b irq */
+0xeafe, /* fiq: b fiq */
+#define BOARD_SETUP_ADDR (MVBAR_ADDR + 8 * sizeof(uint32_t))
+0xe300 + ARMV7_IMM16(MVBAR_ADDR), /* movw r0, MVBAR_ADDR */
+0xee0c0f30, /* mcr p15, 0, r0, c12, c0, 1 - set MVBAR */
+0xee110f11, /* mrc p15, 0, r0, c1 , c1, 0 - get SCR */
+0xe3810001, /* orr r0, #1 - set NS */
+0xee010f11, /* mcr p15, 0, r0, c1 , c1, 0 - set SCR */
+0xe1600070, /* smc - go to monitor mode to flush NS change */
+0xe12fff1e, /* bx lr - return to caller */
+};
+for (n = 0; n < ARRAY_SIZE(board_setup_blob); n++) {
+board_setup_blob[n] = tswap32(board_setup_blob[n]);
+}
+rom_add_blob_fixed("board-setup", board_setup_blob,
+   sizeof(board_setup_blob), MVBAR_ADDR);
+}
+
 static void hb_write_secondary(ARMCPU *cpu, const struct arm_boot_info *info)
 {
 int n;
@@ -241,16 +284,13 @@ static void calxeda_init(MachineState *machine, enum 
cxmachines machine_id)
 cpuobj = object_new(object_class_get_name(oc));
 cpu = ARM_CPU(cpuobj);
 
-/* By default A9 and A15 CPUs have EL3 enabled.  This board does not
- * currently support EL3 so the CPU EL3 property is disabled before
- * realization.
- */
-if (object_property_find(cpuobj, "has_el3", NULL)) {
-object_property_set_bool(cpuobj, false, "has_el3", );
-if (err) {
-error_report_err(err);
-exit(1);
-}
+object_property_set_int(cpuobj, QEMU_PSCI_CONDUIT_SMC,
+"psci-conduit", _abort);
+
+if (n) {
+/* Secondary CPUs start in PSCI powered-down state */
+object_property_set_bool(cpuobj, true,
+ "start-powered-off", _abort);
 }
 
 if (object_property_find(cpuobj, "reset-cbar", NULL)) {
@@ -371,6 +411,16 @@ static void calxeda_init(MachineState *machine, enum 
cxmachines machine_id)
 highbank_binfo.loader_start = 0;
 highbank_binfo.write_secondary_boot = hb_write_secondary;

[Qemu-devel] [PATCH for-2.5 v4 1/3] arm: boot: Add secure_board_setup flag

2015-11-08 Thread Peter Crosthwaite
Add a flag that when set, will cause the primary CPU to start in secure
mode, even if the overall boot is non-secure. This is useful for when
there is a board-setup blob that needs to run from secure mode, but
device and secondary CPU init should still be done as-normal for a non-
secure boot.

Signed-off-by: Peter Crosthwaite 
---
changed since v3:
Clarified comment for assertion.
Changed tcg_enabled() check to kvm_enabled().
changed since v2:
Assert if running KVM and board_setup_secure is set

 hw/arm/boot.c| 10 +-
 include/hw/arm/arm.h |  6 ++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index b0879a5..75f69bf 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -11,6 +11,7 @@
 #include "hw/hw.h"
 #include "hw/arm/arm.h"
 #include "hw/arm/linux-boot-if.h"
+#include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
@@ -495,7 +496,8 @@ static void do_cpu_reset(void *opaque)
 }
 
 /* Set to non-secure if not a secure boot */
-if (!info->secure_boot) {
+if (!info->secure_boot &&
+(cs != first_cpu || !info->secure_board_setup)) {
 /* Linux expects non-secure state */
 env->cp15.scr_el3 |= SCR_NS;
 }
@@ -598,6 +600,12 @@ static void arm_load_kernel_notify(Notifier *notifier, 
void *data)
 struct arm_boot_info *info =
 container_of(n, struct arm_boot_info, load_kernel_notifier);
 
+/* The board code is not supposed to set secure_board_setup unless
+ * running its code in secure mode is actually possible, and KVM
+ * doesn't support secure.
+ */
+assert(!(info->secure_board_setup && kvm_enabled()));
+
 /* Load the kernel.  */
 if (!info->kernel_filename || info->firmware_loaded) {
 
diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h
index 67ba7db..c26b0e3 100644
--- a/include/hw/arm/arm.h
+++ b/include/hw/arm/arm.h
@@ -97,6 +97,12 @@ struct arm_boot_info {
 hwaddr board_setup_addr;
 void (*write_board_setup)(ARMCPU *cpu,
   const struct arm_boot_info *info);
+
+/* If set, the board specific loader/setup blob will be run from secure
+ * mode, regardless of secure_boot. The blob becomes responsible for
+ * changing to non-secure state if implementing a non-secure boot
+ */
+bool secure_board_setup;
 };
 
 /**
-- 
1.9.1




[Qemu-devel] [PATCH for-2.5 v4 0/3] ARM: Highbank: Highbank: Add monitor support

2015-11-08 Thread Peter Crosthwaite
Hi,

This adds dummy monitor support to the Highbank board. It is needed by
the Highbank kernel which expects a monitor to be present.

A feature is added to arm/boot's board_setup feature, that allows the
board_setup entry point to be entered in secure mode (which is needed
to configure a monitor).

This feature does not play nice with -cpu override, but cpu override
is not valid for well-defined ARM SoCs. So defeature -cpu for Highbank.

Regards,
Peter

See indiv. patches for detailed change logs.

Changed since v3:
Addressed PMM v2 review
Changed tcg_enabled() checks to kvm_enabled()
Changed since v2:
Defeature -cpu for Highbank (new patch)
Rework board_setup blob implementation (PMM review)
Conditionalise feature on TCG
Dropped initial board_setup and Zynq patches (Merged)
Changed since v1:
Addressed PMM review.
Added secure_board_setup flag (P4)
Added Zynq patch first, then Highbank


Peter Crosthwaite (3):
  arm: boot: Add secure_board_setup flag
  arm: highbank: Defeature CPU override
  arm: highbank: Implement PSCI and dummy monitor

 hw/arm/boot.c| 10 +-
 hw/arm/highbank.c| 91 ++--
 include/hw/arm/arm.h |  6 
 3 files changed, 82 insertions(+), 25 deletions(-)

-- 
1.9.1




[Qemu-devel] [PATCH for-2.5 v4 2/3] arm: highbank: Defeature CPU override

2015-11-08 Thread Peter Crosthwaite
This board should not support CPU model override. This allows for
easier patching of the board with being able to rely on the CPU
type being correct.

Reviewed-by: Peter Maydell 
Signed-off-by: Peter Crosthwaite 
---

 hw/arm/highbank.c | 21 +++--
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/hw/arm/highbank.c b/hw/arm/highbank.c
index be04b27..f2e248b 100644
--- a/hw/arm/highbank.c
+++ b/hw/arm/highbank.c
@@ -223,15 +223,13 @@ static void calxeda_init(MachineState *machine, enum 
cxmachines machine_id)
 MemoryRegion *sysmem;
 char *sysboot_filename;
 
-if (!cpu_model) {
-switch (machine_id) {
-case CALXEDA_HIGHBANK:
-cpu_model = "cortex-a9";
-break;
-case CALXEDA_MIDWAY:
-cpu_model = "cortex-a15";
-break;
-}
+switch (machine_id) {
+case CALXEDA_HIGHBANK:
+cpu_model = "cortex-a9";
+break;
+case CALXEDA_MIDWAY:
+cpu_model = "cortex-a15";
+break;
 }
 
 for (n = 0; n < smp_cpus; n++) {
@@ -240,11 +238,6 @@ static void calxeda_init(MachineState *machine, enum 
cxmachines machine_id)
 ARMCPU *cpu;
 Error *err = NULL;
 
-if (!oc) {
-error_report("Unable to find CPU definition");
-exit(1);
-}
-
 cpuobj = object_new(object_class_get_name(oc));
 cpu = ARM_CPU(cpuobj);
 
-- 
1.9.1




Re: [Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop is hanging forever

2015-11-08 Thread Denis V. Lunev

On 11/09/2015 06:10 AM, Alexandre DERUMIER wrote:

Hi,

with qemu (2.4.1), if I do an internal snapshot of an rbd device,
then I pause the vm with vm_stop,

the qemu process is hanging forever


monitor commands to reproduce:


# snapshot_blkdev_internal drive-virtio0 yoursnapname
# stop




I don't see this with qcow2 or sheepdog block driver for example.


Regards,

Alexandre


this could look like the problem I have recenty trying to
fix with dataplane enabled. Patch series is named as

[PATCH for 2.5 v6 0/10] dataplane snapshot fixes

Den



Re: [Qemu-devel] [PATCH v7 07/35] util: introduce qemu_file_get_page_size()

2015-11-08 Thread Xiao Guangrong



On 11/06/2015 11:36 PM, Eduardo Habkost wrote:

On Mon, Nov 02, 2015 at 05:13:09PM +0800, Xiao Guangrong wrote:

There are three places use the some logic to get the page size on
the file path or file fd

Windows did not support file hugepage, so it will return normal page
for this case. And this interface has not been used on windows so far

This patch introduces qemu_file_get_page_size() to unify the code

Signed-off-by: Xiao Guangrong 

[...]

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 914cef5..51437ff 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -340,7 +340,7 @@ static void sigbus_handler(int signal)
  siglongjmp(sigjump, 1);
  }

-static size_t fd_getpagesize(int fd)
+static size_t fd_getpagesize(int fd, Error **errp)
  {
  #ifdef CONFIG_LINUX
  struct statfs fs;
@@ -351,7 +351,12 @@ static size_t fd_getpagesize(int fd)
  ret = fstatfs(fd, );
  } while (ret != 0 && errno == EINTR);

-if (ret == 0 && fs.f_type == HUGETLBFS_MAGIC) {
+if (ret) {
+error_setg_errno(errp, errno, "fstatfs is failed");
+return 0;
+}
+
+if (fs.f_type == HUGETLBFS_MAGIC) {
  return fs.f_bsize;
  }


You are changing os_mem_prealloc() behavior when fstatfs() fails, here.
Have you ensured there are no cases where fstatfs() fails but this code
is still expected to work?


stat() is supported for all kinds of files, so failed stat() is caused by
file is not exist or kernel internal error (e,g memory is not enough) or
security check is not passed. Whichever we should not do any operation on
the file if stat() failed. The origin code did not check it but it is worth
being fixed i think.



The change looks safe: gethugepagesize() seems to be always called in
the path that would make fd_getpagesize() be called from
os_mem_prealloc(), so allocation would abort much earlier if statfs()
failed. But I haven't confirmed that yet, and I wanted to be sure.



Yes, I am entirely agree with you. :)




Re: [Qemu-devel] [PATCH v6 32/33] nvdimm acpi: support _FIT method

2015-11-08 Thread Xiao Guangrong



On 11/09/2015 01:50 AM, Michael S. Tsirkin wrote:

On Fri, Oct 30, 2015 at 01:56:26PM +0800, Xiao Guangrong wrote:

FIT buffer is not completely mapped into guest address space, so a new
function, Read FIT, function index 0x, is reserved by QEMU to
read the piece of FIT buffer. The buffer is concatenated before _FIT
return

Refer to docs/specs/acpi-nvdimm.txt for detailed design

Signed-off-by: Xiao Guangrong 
---
  hw/acpi/nvdimm.c | 168 +--
  1 file changed, 164 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index f8d7d19..3f35220 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -384,6 +384,18 @@ static void nvdimm_build_nfit(GSList *device_list, GArray 
*table_offsets,
  g_array_free(structures, true);
  }

+/*
+ * define UUID for NVDIMM Root Device according to Chapter 3 DSM Interface
+ * for NVDIMM Root Device - Example in DSM Spec Rev1.
+ */
+#define NVDIMM_DSM_ROOT_UUID "2F10E7A4-9E91-11E4-89D3-123B93F75CBA"
+
+/*
+ * Read FIT Function, which is a QEMU internal use only function, more detail
+ * refer to docs/specs/acpi_nvdimm.txt
+ */
+#define NVDIMM_DSM_FUNC_READ_FIT 0x
+
  /* define NVDIMM DSM return status codes according to DSM Spec Rev1. */
  enum {
  /* Common return status codes. */
@@ -420,6 +432,11 @@ struct NvdimmFuncInSetLabelData {
  } QEMU_PACKED;
  typedef struct NvdimmFuncInSetLabelData NvdimmFuncInSetLabelData;

+struct NvdimmFuncInReadFit {
+uint32_t offset; /* fit offset */
+} QEMU_PACKED;
+typedef struct NvdimmFuncInReadFit NvdimmFuncInReadFit;
+
  struct NvdimmDsmIn {
  uint32_t handle;
  uint32_t revision;
@@ -429,6 +446,7 @@ struct NvdimmDsmIn {
  uint8_t arg3[0];
  NvdimmFuncInSetLabelData func_set_label_data;
  NvdimmFuncInGetLabelData func_get_label_data;
+NvdimmFuncInReadFit func_read_fit;
  };
  } QEMU_PACKED;
  typedef struct NvdimmDsmIn NvdimmDsmIn;
@@ -450,13 +468,71 @@ struct NvdimmFuncOutGetLabelData {
  } QEMU_PACKED;
  typedef struct NvdimmFuncOutGetLabelData NvdimmFuncOutGetLabelData;

+struct NvdimmFuncOutReadFit {
+uint32_t status;/* return status code. */
+uint32_t length;/* the length of fit data we read. */
+uint8_t fit_data[0]; /* fit data. */
+} QEMU_PACKED;
+typedef struct NvdimmFuncOutReadFit NvdimmFuncOutReadFit;
+
  static void nvdimm_dsm_write_status(GArray *out, uint32_t status)
  {
  status = cpu_to_le32(status);
  build_append_int_noprefix(out, status, sizeof(status));
  }

-static void nvdimm_dsm_root(NvdimmDsmIn *in, GArray *out)
+/* Build fit memory which is presented to guest via _FIT method. */
+static void nvdimm_build_fit(AcpiNVDIMMState *state)
+{
+if (!state->fit) {
+GSList *device_list = nvdimm_get_plugged_device_list();
+
+nvdimm_debug("Rebuild FIT...\n");
+state->fit = nvdimm_build_device_structure(device_list);
+g_slist_free(device_list);
+}
+}
+
+/* Read FIT data, defined in docs/specs/acpi_nvdimm.txt. */
+static void nvdimm_dsm_func_read_fit(AcpiNVDIMMState *state,
+ NvdimmDsmIn *in, GArray *out)
+{
+NvdimmFuncInReadFit *read_fit = >func_read_fit;
+NvdimmFuncOutReadFit fit_out;
+uint32_t read_length = TARGET_PAGE_SIZE - sizeof(NvdimmFuncOutReadFit);
+uint32_t status = NVDIMM_DSM_ROOT_DEV_STATUS_INVALID_PARAS;
+
+nvdimm_build_fit(state);
+
+le32_to_cpus(_fit->offset);
+
+nvdimm_debug("Read FIT offset %#x.\n", read_fit->offset);
+
+if (read_fit->offset > state->fit->len) {
+nvdimm_debug("offset %#x is beyond fit size (%#x).\n",
+ read_fit->offset, state->fit->len);
+goto exit;
+}
+
+read_length = MIN(read_length, state->fit->len - read_fit->offset);
+nvdimm_debug("read length %#x.\n", read_length);
+
+fit_out.status = cpu_to_le32(NVDIMM_DSM_STATUS_SUCCESS);
+fit_out.length = cpu_to_le32(read_length);


Is array always empty at this point?
If yes, better assert this here to make sure guest can not
use unlimited memory.


It's unnecessary. At the end of dsm handler, we have asserted it that the
memory size can not beyond the size of dsm memory region:

static uint64_t
nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
{
   ..
exit:
assert(out->len <= memory_region_size(dsm_ram_mr));

/* Write output result to dsm memory. */
memcpy(dsm_ram_addr, out->data, out->len);
memory_region_set_dirty(dsm_ram_mr, 0, out->len);

buf_size = cpu_to_le32(out->len);
..
}




+g_array_append_vals(out, _out, sizeof(fit_out));
+
+if (read_length) {
+g_array_append_vals(out, state->fit->data + read_fit->offset,
+read_length);
+}
+return;
+
+exit:
+nvdimm_dsm_write_status(out, status);
+}
+
+static void nvdimm_dsm_root(AcpiNVDIMMState *state, NvdimmDsmIn *in,
+   

[Qemu-devel] [PATCH for-2.5 v4 1/1] hw/misc: Add support for ADC controller in Xilinx Zynq 7000

2015-11-08 Thread Peter Crosthwaite
From: Guenter Roeck 

Add support for the Xilinx XADC core used in Zynq 7000.

References:
- Zynq-7000 All Programmable SoC Technical Reference Manual
- 7 Series FPGAs and Zynq-7000 All Programmable SoC XADC
  Dual 12-Bit 1 MSPS Analog-to-Digital Converter

Tested with Linux using QEMU machine xilinx-zynq-a9 with devicetree
files zynq-zc702.dtb and zynq-zc706.dtb, and kernel configuration
multi_v7_defconfig.

Reviewed-by: Alistair Francis 
Signed-off-by: Guenter Roeck 
[ PC changes:
  * Changed macro names to match TRM where possible
  * Made programmers model macro scheme consistent
  * Dropped XADC_ZYNQ_ prefix on local macros
  * Fix ALM field width
  * Update threshold-comparison interrupts in _update_ints()
  * factored out DFIFO pushes into helper. Renamed to "push/pop"
  * Changed xadc_reg to 10 bits and added OOB check.
  * Reduced scope of MCTL reset to just stop channel coms.
  * Added dummy read data to write commands
  * Changed _ to - seperators in string names and filenames
  * Dropped  in header comment
  * Catchall'ed _update_ints() in _write handler.
  * Minor whitespace changes.
  * Use ZYNQ_XADC_FIFO_DEPTH instead of ARRAY_SIZE()
]
Signed-off-by: Peter Crosthwaite 
---
v4:
Addressed Alistair review
Minor whitespace changes
Use ZYNQ_XADC_FIFO_DEPTH instead of ARRAY_SIZE()
v3:
See [PC changes] in commit message
v2:
Use extract32()
Merge zynq_xadc_reset() and _zynq_xadc_reset() into one function
Use "xlnx,zynq_xadc"
Move device model to include/hw/misc/zynq_xadc.h
irq -> qemu_irq
xadc_dfifo_depth -> xadc_dfifo_entries
Dropped unnecessary comments
Merged zynq_xadc_realize() into zynq_xadc_init()

 hw/arm/xilinx_zynq.c|   6 +
 hw/misc/Makefile.objs   |   1 +
 hw/misc/zynq-xadc.c | 302 
 include/hw/misc/zynq-xadc.h |  46 +++
 4 files changed, 355 insertions(+)
 create mode 100644 hw/misc/zynq-xadc.c
 create mode 100644 include/hw/misc/zynq-xadc.h

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 82a9db8..1c1a445 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -24,6 +24,7 @@
 #include "hw/block/flash.h"
 #include "sysemu/block-backend.h"
 #include "hw/loader.h"
+#include "hw/misc/zynq-xadc.h"
 #include "hw/ssi.h"
 #include "qemu/error-report.h"
 
@@ -264,6 +265,11 @@ static void zynq_init(MachineState *machine)
 sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xE0101000);
 sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[79-IRQ_OFFSET]);
 
+dev = qdev_create(NULL, TYPE_ZYNQ_XADC);
+qdev_init_nofail(dev);
+sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, 0xF8007100);
+sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, pic[39-IRQ_OFFSET]);
+
 dev = qdev_create(NULL, "pl330");
 qdev_prop_set_uint8(dev, "num_chnls",  8);
 qdev_prop_set_uint8(dev, "num_periph_req",  4);
diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
index 4aa76ff..aeb6b7d 100644
--- a/hw/misc/Makefile.objs
+++ b/hw/misc/Makefile.objs
@@ -36,6 +36,7 @@ obj-$(CONFIG_OMAP) += omap_sdrc.o
 obj-$(CONFIG_OMAP) += omap_tap.o
 obj-$(CONFIG_SLAVIO) += slavio_misc.o
 obj-$(CONFIG_ZYNQ) += zynq_slcr.o
+obj-$(CONFIG_ZYNQ) += zynq-xadc.o
 obj-$(CONFIG_STM32F2XX_SYSCFG) += stm32f2xx_syscfg.o
 
 obj-$(CONFIG_PVPANIC) += pvpanic.o
diff --git a/hw/misc/zynq-xadc.c b/hw/misc/zynq-xadc.c
new file mode 100644
index 000..ae3df5e
--- /dev/null
+++ b/hw/misc/zynq-xadc.c
@@ -0,0 +1,302 @@
+/*
+ * ADC registers for Xilinx Zynq Platform
+ *
+ * Copyright (c) 2015 Guenter Roeck
+ * Based on hw/misc/zynq_slcr.c, written by Michal Simek
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ */
+
+#include "hw/hw.h"
+#include "hw/misc/zynq-xadc.h"
+#include "qemu/timer.h"
+#include "sysemu/sysemu.h"
+
+enum {
+CFG= 0x000 / 4,
+INT_STS,
+INT_MASK,
+MSTS,
+CMDFIFO,
+RDFIFO,
+MCTL,
+};
+
+#define CFG_ENABLE  BIT(31)
+#define CFG_CFIFOTH_SHIFT   20
+#define CFG_CFIFOTH_LENGTH  4
+#define CFG_DFIFOTH_SHIFT   16
+#define CFG_DFIFOTH_LENGTH  4
+#define CFG_WEDGE   BIT(13)
+#define CFG_REDGE   BIT(12)
+#define CFG_TCKRATE_SHIFT   8
+#define CFG_TCKRATE_LENGTH  2
+
+#define CFG_TCKRATE_DIV(x)  (0x1 << (x - 1))
+
+#define CFG_IGAP_SHIFT  0
+#define CFG_IGAP_LENGTH 5
+
+#define INT_CFIFO_LTH   BIT(9)
+#define INT_DFIFO_GTH   BIT(8)
+#define INT_OT  BIT(7)
+#define INT_ALM_SHIFT   0

Re: [Qemu-devel] [PULL 0/7] Block patches

2015-11-08 Thread Markus Armbruster
Fam Zheng  writes:

> On Fri, 11/06 18:07, Peter Maydell wrote:
>> On 6 November 2015 at 17:52, Stefan Hajnoczi  wrote:
>> > The following changes since commit 
>> > 4b59f39bc9a03afcc74b2fa28da7c3189fca507c:
>> >
>> >   Merge remote-tracking branch
>> > 'remotes/mjt/tags/pull-trivial-patches-2015-11-06' into staging
>> > (2015-11-06 12:50:24 +)
>> >
>> > are available in the git repository at:
>> >
>> >   git://github.com/stefanha/qemu.git tags/block-pull-request
>> >
>> > for you to fetch changes up to 6f707181b1bd6ccf2d2fd9397039c7ef6fa4a9fd:
>> >
>> >   blockdev: acquire AioContext in hmp_commit() (2015-11-06 15:41:00 +)
>> >
>> > 
>> 
>> Build failure on OSX :-(
>> 
>> /Users/pm215/src/qemu-for-merges/aio-posix.c  CCblock/qcow.o
>> :442:37: error: no member named 'epollfd' in 'struct AioContext'
>> epoll_handler.pfd.fd = ctx->epollfd;
>>~~~  ^
>> 
>
> :(
>
> I think it is harmless to always include this member in AioContext. Stefan,
> could you squash this into patch 5?  Thanks!
>
> Fam
>
> ---
>
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 91737d5..735f1f8 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -124,11 +124,9 @@ struct AioContext {
>  QEMUTimerListGroup tlg;
>  
>  int external_disable_cnt;
> -#ifdef CONFIG_EPOLL
>  int epollfd;
>  bool epoll_enabled;
>  bool epoll_available;
> -#endif
>  };

Replace by the ifdeffery by a comment pointing to CONFIG_EPOLL, perhaps?



Re: [Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop is hanging forever

2015-11-08 Thread Alexandre DERUMIER
Something is really wrong,

because guest is also freezing, with a simple snapshot, with cache=none / 
rbd_cache=false

qemu monitor : snapshot_blkdev_internal drive-virtio0 snap1

or 

rbd command : rbd --image myrbdvolume snap create --snap snap1 


Then the guest can't read/write to disk anymore.


I have tested with last ceph internalis, same problem



- Mail original -
De: "aderumier" 
À: "ceph-devel" , "qemu-devel" 

Envoyé: Lundi 9 Novembre 2015 04:54:23
Objet: Re: [Qemu-devel] qemu : rbd block driver internal snapshot and vm_stop 
is hanging forever

Also, 

this occur only with rbd_cache=false or qemu drive cache=none. 


If I use rbd_cache=true or qemu drive cache=writeback, I don't have this bug. 


- Mail original - 
De: "aderumier"  
À: "ceph-devel" , "qemu-devel" 
 
Envoyé: Lundi 9 Novembre 2015 04:23:10 
Objet: Re: qemu : rbd block driver internal snapshot and vm_stop is hanging 
forever 

Some other infos: 

I can reproduce it too with manual snapshot with rbd command 


#rbd --image myrbdvolume snap create --snap snap1 

qemu monitor: 

#stop 


This is with ceph hammer 0.94.5. 


in qemu vm_stop, the only thing related to block driver are 

bdrv_drain_all(); 
ret = bdrv_flush_all(); 


- Mail original - 
De: "aderumier"  
À: "ceph-devel" , "qemu-devel" 
 
Envoyé: Lundi 9 Novembre 2015 04:10:45 
Objet: qemu : rbd block driver internal snapshot and vm_stop is hanging forever 

Hi, 

with qemu (2.4.1), if I do an internal snapshot of an rbd device, 
then I pause the vm with vm_stop, 

the qemu process is hanging forever 


monitor commands to reproduce: 


# snapshot_blkdev_internal drive-virtio0 yoursnapname 
# stop 




I don't see this with qcow2 or sheepdog block driver for example. 


Regards, 

Alexandre 
-- 
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
the body of a message to majord...@vger.kernel.org 
More majordomo info at http://vger.kernel.org/majordomo-info.html 

-- 
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
the body of a message to majord...@vger.kernel.org 
More majordomo info at http://vger.kernel.org/majordomo-info.html 



Re: [Qemu-devel] [PATCH 1/1] migration: qemu_savevm_state_cleanup becomes mandatory operation

2015-11-08 Thread Li, Liang Z
> since commit
> commit 94f5a43704129ca4995aa3385303c5ae225bde42
> Author: Liang Li 
> Date:   Mon Nov 2 15:37:00 2015 +0800
> 
> migration: defer migration_end & blk_mig_cleanup
> 
> when actual .cleanup callbacks calling was removed from complete operations.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Paolo Bonzini 
> CC: Juan Quintela 
> CC: Amit Shah 
> ---
>  migration/savevm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c index e05158d..9f2230f
> 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -942,8 +942,8 @@ static int qemu_savevm_state(QEMUFile *f, Error
> **errp)
>  qemu_savevm_state_complete(f);
>  ret = qemu_file_get_error(f);
>  }
> +qemu_savevm_state_cleanup();
>  if (ret != 0) {
> -qemu_savevm_state_cleanup();
>  error_setg_errno(errp, -ret, "Error while writing VM state");
>  }
>  return ret;
> --
> 2.5.0
> 


Yes, you are right. Thanks a lot.

BTW, can this patch fix the regression you reported?

Reviewed-by: Liang Li 




Re: [Qemu-devel] [PATCH] hw/arm/virt: error_report cleanups

2015-11-08 Thread Markus Armbruster
Peter Maydell  writes:

> On 7 November 2015 at 15:25, Andrew Jones  wrote:
>> Signed-off-by: Andrew Jones 
>> ---
>>  hw/arm/virt.c | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 77d9267599b7e..9c6792cea16f6 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -941,8 +941,8 @@ static void machvirt_init(MachineState *machine)
>>  if (!gic_version) {
>>  gic_version = kvm_arm_vgic_probe();
>>  if (!gic_version) {
>> -error_report("Unable to determine GIC version supported by 
>> host\n"
>> - "Probably KVM acceleration is not supported\n");
>> +error_report("Unable to determine GIC version supported by 
>> host");
>> +error_printf("KVM acceleration is probably not supported\n");
>>  exit(1);
>>  }
>>  }
>> @@ -990,7 +990,7 @@ static void machvirt_init(MachineState *machine)
>>  char *cpuopts = g_strdup(cpustr[1]);
>>
>>  if (!oc) {
>> -fprintf(stderr, "Unable to find CPU definition\n");
>> +error_report("Unable to find CPU definition");
>>  exit(1);
>>  }
>>  cpuobj = object_new(object_class_get_name(oc));
>> @@ -1126,8 +1126,8 @@ static void virt_set_gic_version(Object *obj, const 
>> char *value, Error **errp)
>>  } else if (!strcmp(value, "host")) {
>>  vms->gic_version = 0; /* Will probe later */
>>  } else {
>> -error_report("Invalid gic-version option value\n"
>> - "Allowed values are: 3, 2, host\n");
>> +error_report("Invalid gic-version option value");
>> +error_printf("Allowed gic-version values are: 3, 2, host\n");
>>  exit(1);
>>  }
>
> Would it be better just to have a single error_report
> for these without the newlines, eg
>
>   error_report("Unable to determine GIC version supported by host. "
>"KVM acceleration is probably not supported.");
>
> ?

For consistency, error messages should be a phrase, not a full sentence,
let alone a paraphraph.

You can of course turn any parapgraph into a phrase by stringing
together its parts with semicolons, but that's cheating :)

Reviewed-by: Markus Armbruster 



Re: [Qemu-devel] [PATCH] exec: silence hugetlbfs warning under qtest

2015-11-08 Thread Markus Armbruster
"Michael S. Tsirkin"  writes:

> On Tue, Oct 27, 2015 at 05:29:43PM +0100, marcandre.lur...@redhat.com wrote:
>> From: Marc-André Lureau 
>> 
>> vhost-user-test prints a warning. A test should not need to run on
>> hugetlbfs, let's silence the warning under qtest. Unfortunately, the
>> condition can't check on qtest_enabled() or qtest_driver() since they
>> are initialized later.
>> 
>> Signed-off-by: Marc-André Lureau 
>
> Any idea what's the best way to address this?

Can we reorder things so that qtest_enabled() can be used?

> Is poking at environment like this appropriate?

I'm afraid it isn't.  QTEST_QEMU_BINARY can be in the environment even
when you're not running under qtest.

> Maybe a command line flag to silence the warning?

There's -qtest and -qtest-log.  I guess keying on those would be less
unclean than QTEST_QEMU_BINARY.

>> ---
>>  exec.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/exec.c b/exec.c
>> index 8af2570..d9c231d 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -1194,8 +1194,9 @@ static long gethugepagesize(const char *path, Error 
>> **errp)
>>  return 0;
>>  }
>>  
>> -if (fs.f_type != HUGETLBFS_MAGIC)
>> +if (fs.f_type != HUGETLBFS_MAGIC && !getenv("QTEST_QEMU_BINARY")) {
>>  fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
>> +}
>>  
>>  return fs.f_bsize;
>>  }
>> -- 
>> 2.4.3