date:20150128

Re: [Qemu-devel] [PATCH] block-migration: fix pending() return value

2015-01-28 Thread Markus Armbruster

Vladimir Sementsov-Ogievskiy  writes:

> Hi there.. Can someone review my bug fix? I'll surely fix typos in
> description after it.

The file you patch has since moved.  Suggest a quick rebase.  Make sure
to cc: your v2 to migration maintainers.

$ scripts/get_maintainer.pl -f migration/block.c 
Kevin Wolf  (supporter:Block)
Stefan Hajnoczi  (supporter:Block)
Juan Quintela  (maintainer:Migration)
Amit Shah  (maintainer:Migration)

Re: [Qemu-devel] [PATCH 2/2] bootdevice: update boot_order in MachineState

2015-01-28 Thread Markus Armbruster

Dinar Valeev  writes:

> On 01/28/2015 02:48 AM, Gonglei wrote:
>> On 2015/1/27 18:49, Dinar Valeev wrote:
>> 
>>> On 01/27/2015 10:18 AM, Gonglei wrote:
 On 2015/1/27 16:57, Dinar Valeev wrote:

> On 01/27/2015 03:51 AM, Gonglei wrote:
>> On 2015/1/27 7:52, dval...@suse.de wrote:
>>
>>> From: Dinar Valeev 
>>>
>>> on sPAPR we need to update boot_order in MachineState in case it
>>> got changed on reset.
>>>
>>> Signed-off-by: Dinar Valeev 
>>> ---
>>>  bootdevice.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/bootdevice.c b/bootdevice.c
>>> index 5914417..4f11a06 100644
>>> --- a/bootdevice.c
>>> +++ b/bootdevice.c
>>> @@ -26,6 +26,7 @@
>>>  #include "qapi/visitor.h"
>>>  #include "qemu/error-report.h"
>>>  #include "hw/hw.h"
>>> +#include "hw/boards.h"
>>>  
>>>  typedef struct FWBootEntry FWBootEntry;
>>>  
>>> @@ -50,6 +51,8 @@ void qemu_register_boot_set(QEMUBootSetHandler *func, 
>>> void *opaque)
>>>  void qemu_boot_set(const char *boot_order, Error **errp)
>>>  {
>>>  Error *local_err = NULL;
>>> +MachineState *machine = MACHINE(qdev_get_machine());
>>> +machine->boot_order = boot_order;
>>>  
>>>  if (!boot_set_handler) {
>>>  error_setg(errp, "no function defined to set boot device list 
>>> for"
>>
>> Have you registered boot set handler on ppc/sPAPR platform by calling
>> qemu_register_boot_set()? Otherwise qemu_boot_set function
>>  will return error.
> No, I set boot_order on each machine reset. My tests are showing
> it works without an error.

 That's interesting. Does this function be called?
>>> Yes, then simply returns.
 Would you debug it by setting a breakpoint ?
>>> I added a trace event.
>>>  if (!boot_set_handler) {
>>> +trace_qemu_boot_set(boot_order);
>>>  error_setg(errp, "no function defined to set boot device list for"
>>>   " this architecture");
>>>  return;
>>>
>>> And I see this now in qemu's monitor. Still I don't see error message.
>> 
>> That's because NULL is passed to this function in restore_boot_order()
>> the error is ignored (commit f183993). I have seen the previous conversation
>> about your patch serials. And I think this is the reason which
>> you moved machine->boot_order = boot_order before
>> checking boot_set_handler variable based on Alexander's
>> suggestion, right? But I think this is not a good idea.
> Yes
>
> Any proposal how this can be done differently?
>
> It seems I'm almost alone who wants -boot once=X option to get fixed
> for sPAPR. We use it in test automation, and we need to be sure that
> we boot from hard disk once installation is done.

The crash is a bug, and bugs need fixing.

You first patch looked okay to me.  I suggested dropping the null check
in spapr_create_fdt_skel(), because I feel it's papering over the bug
you fix.

This isn't a review of your second attempt, it's encouragement.  I
haven't looked at this version, nor have I thought through Alex's idea
to patch machine->boot_order in qemu_boot_set().

Re: [Qemu-devel] [PATCH v2 01/47] acpi: introduce AML composer aml_append()

2015-01-28 Thread Shannon Zhao

On 2015/1/28 18:00, Igor Mammedov wrote:
> On Wed, 28 Jan 2015 09:56:26 +0200
> "Michael S. Tsirkin"  wrote:
> 
>>> I've tried redo series with passing alloc list as first argument,
>>> looks ugly as hell
>>
>> I tried too. Not too bad at all. See below:
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index f66da5d..820504a 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -491,14 +491,14 @@ static void acpi_set_pci_info(void)
>>  }
>>  }
>>  
>> -static void build_append_pcihp_notify_entry(AcpiAml *method, int slot)
>> +static void build_append_pcihp_notify_entry(AmlPool *p, AcpiAml *method, 
>> int slot)
>>  {
>> -AcpiAml if_ctx;
>> +AcpiAml *if_ctx;
>>  int32_t devfn = PCI_DEVFN(slot, 0);
>>  
>> -if_ctx = acpi_if(acpi_and(acpi_arg0(), acpi_int(0x1U << slot)));
>> -aml_append(&if_ctx, acpi_notify(acpi_name("S%.02X", devfn), 
>> acpi_arg1()));
>> -aml_append(method, if_ctx);
>> +if_ctx = acpi_if(p, acpi_and(p, acpi_arg0(), acpi_int(p, 0x1U << 
>> slot)));
>> +aml_append(p, if_ctx, acpi_notify(p, acpi_name(p, "S%.02X", devfn), 
>> acpi_arg1(p)));
>> +aml_append(p, method, if_ctx);
>>  }
>>  
>>  static void build_append_pci_bus_devices(AcpiAml *parent_scope, PCIBus *bus,
>>
>> What exactly is the problem?  A tiny bit more verbose but the lifetime
>> of all objects is now explicit.
> every usage of aml_foo()/build_append_pcihp_notify_entry() tags along
> extra pointer which is not really necessary for user to know. If possible
> user shouldn't care about it and concentrate on composing AML instead.
> 
> Whole point of passing AmlPool and record every allocation is to avoid
> mistakes like:
> 
> acpi_if(acpi_and(acpi_arg0(), acpi_int(0x1U << slot)));
> 
> and forgetting to assign object returned by call anywhere,
> it's basically the same as calling malloc() without
> using result anywhere, however neither libc nor glib
> force user to pass allocator (in our case garbage collector)
> in every call that allocates memory. Let's just follow common
> convention here (#4) where an object is allocated by API call
> (i.e like object_new(FOO), gtk_widget_foo()).
> 
> Hence is suggesting at least to hide AmlPool internally in API
> without exposing it to user. We can provide for user
> init/free API to manage internal AmlPool manually, allowing
> him to select when to do initialization and cleanup.
> 
> Claudio, Marcel, Shannon,
> As the first API users, could you give your feedback on the topic.
> 

In my opinion, it's good to make users focused on ACPI table construction 
through
auto memory management. And it makes the code clear.

PS:
We're talking about use-after-free, like below example. But this example really 
exist?
During generating ACPI tables for virt machine, I don't encounter this case.

For example:
aml_append(&a, b);
aml_append(&a, b);

Thanks,
Shannon

[Qemu-devel] [PATCH v2 5/5] vhost-scsi: set the bootable value of channel/target/lun

2015-01-28 Thread arei.gonglei

From: Gonglei 

At present, the target is valued boot_tpgt, In addition,
channel and lun both are 0 for bootable vhost-scsi device.

Signed-off-by: Gonglei 
Signed-off-by: Bo Su 
---
 hw/scsi/vhost-scsi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index dc9076e..e30ff84 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -251,6 +251,12 @@ static void vhost_scsi_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
+/* At present, channel and lun both are 0 for bootable vhost-scsi disk */
+s->channel = 0;
+s->lun = 0;
+/* Note: we can also get the minimum tpgt from kernel */
+s->target = vs->conf.boot_tpgt;
+
 error_setg(&s->migration_blocker,
 "vhost-scsi does not support migration");
 migrate_add_blocker(s->migration_blocker);
-- 
1.7.12.4

[Qemu-devel] [PATCH v2 0/5] vhost-scsi: support to assign boot order

2015-01-28 Thread arei.gonglei

From: Gonglei 

Qemu haven't provide a bootindex property for vhost-scsi device.
So, we can not assign the boot order for it at present. But
Some clients/users have requirements for that in some scenarios.
This patch achieve the aim in Qemu side.

Because Qemu only accept an wwpn argument for vhost-scsi, we
cannot assign a tpgt. That's say tpg is transparent for Qemu, Qemu
doesn't know which tpg can boot, but vhost-scsi driver module
doesn't know too for one assigned wwpn.

At present, we assume that the first tpg can boot only, and add
a boot_tpgt property that defaults to 0. Of course, people can
pass a valid value by qemu command line.

v2 -> v1: (Thanks to Paolo's suggestion)
 - change calling  qdev_get_own_fw_dev_path_from_handler in
   get_boot_devices_list, and convert non-NULL suffixes to
   implementations of FWPathProvider in Patch 1. (Paolo)
 - add a boot_tpgt property for vhost-scsi in Patch 4. (Paolo)
 - remove the ioctl calling in Patch 4, because the kernel
   patch hasn't been accepted.

kernel patch:
[PATCH] vhost-scsi: introduce an ioctl to get the minimum tpgt
http://news.gmane.org/gmane.comp.emulators.kvm.devel

Gonglei (5):
  qdev: support to get a device firmware path directly
  vhost-scsi: add bootindex property
  vhost-scsi: realize the TYPE_FW_PATH_PROVIDER interface
  vhost-scsi: add a property for booting
  vhost-scsi: set the bootable value of channel/target/lun

 bootdevice.c| 31 +--
 hw/core/qdev.c  |  7 +++
 hw/scsi/vhost-scsi.c| 35 +++
 hw/virtio/virtio-pci.c  |  2 ++
 include/hw/qdev-core.h  |  1 +
 include/hw/virtio/vhost-scsi.h  |  5 +
 include/hw/virtio/virtio-scsi.h |  1 +
 7 files changed, 68 insertions(+), 14 deletions(-)

-- 
1.7.12.4

[Qemu-devel] [PATCH v2 4/5] vhost-scsi: add a property for booting

2015-01-28 Thread arei.gonglei

From: Gonglei 

Because Qemu only accept an wwpn argument for vhost-scsi, we
cannot assign a tpgt. That's say tpg is transparent for Qemu, Qemu
doesn't know which tpg can boot, but vhost-scsi driver module
doesn't know too for one assigned wwpn.

At present, we assume that the first tpg can boot only, and add
a boot_tpgt property that defaults to 0. Of course, people can
pass a valid value by qemu command line.

Suggested-by: Paolo Bonzini 
Signed-off-by: Gonglei 
---
 include/hw/virtio/vhost-scsi.h  | 1 +
 include/hw/virtio/virtio-scsi.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/hw/virtio/vhost-scsi.h b/include/hw/virtio/vhost-scsi.h
index c0056c2..dea0075 100644
--- a/include/hw/virtio/vhost-scsi.h
+++ b/include/hw/virtio/vhost-scsi.h
@@ -69,6 +69,7 @@ typedef struct VHostSCSI {
 #define DEFINE_VHOST_SCSI_PROPERTIES(_state, _conf_field) \
 DEFINE_PROP_STRING("vhostfd", _state, _conf_field.vhostfd), \
 DEFINE_PROP_STRING("wwpn", _state, _conf_field.wwpn), \
+DEFINE_PROP_UINT32("boot_tpgt", _state, _conf_field.boot_tpgt, 0), \
 DEFINE_PROP_UINT32("num_queues", _state, _conf_field.num_queues, 1), \
 DEFINE_PROP_UINT32("max_sectors", _state, _conf_field.max_sectors, 
0x), \
 DEFINE_PROP_UINT32("cmd_per_lun", _state, _conf_field.cmd_per_lun, 128)
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index bf17cc9..c122e7a 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -153,6 +153,7 @@ struct VirtIOSCSIConf {
 uint32_t cmd_per_lun;
 char *vhostfd;
 char *wwpn;
+uint32_t boot_tpgt;
 IOThread *iothread;
 };
 
-- 
1.7.12.4

[Qemu-devel] [PATCH v2 2/5] vhost-scsi: add bootindex property

2015-01-28 Thread arei.gonglei

From: Gonglei 

Signed-off-by: Gonglei 
---
 hw/scsi/vhost-scsi.c   | 9 +
 hw/virtio/virtio-pci.c | 2 ++
 include/hw/virtio/vhost-scsi.h | 1 +
 3 files changed, 12 insertions(+)

diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index dcb2bc5..9c4f613 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -290,11 +290,20 @@ static void vhost_scsi_class_init(ObjectClass *klass, 
void *data)
 vdc->set_status = vhost_scsi_set_status;
 }
 
+static void vhost_scsi_instance_init(Object *obj)
+{
+VHostSCSI *dev = VHOST_SCSI(obj);
+
+device_add_bootindex_property(obj, &dev->bootindex, "bootindex", NULL,
+  DEVICE(dev), NULL);
+}
+
 static const TypeInfo vhost_scsi_info = {
 .name = TYPE_VHOST_SCSI,
 .parent = TYPE_VIRTIO_SCSI_COMMON,
 .instance_size = sizeof(VHostSCSI),
 .class_init = vhost_scsi_class_init,
+.instance_init = vhost_scsi_instance_init,
 };
 
 static void virtio_register_types(void)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index dde1d73..604cb5b 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1238,6 +1238,8 @@ static void vhost_scsi_pci_instance_init(Object *obj)
 
 virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
 TYPE_VHOST_SCSI);
+object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
+  "bootindex", &error_abort);
 }
 
 static const TypeInfo vhost_scsi_pci_info = {
diff --git a/include/hw/virtio/vhost-scsi.h b/include/hw/virtio/vhost-scsi.h
index 85cc031..ed50289 100644
--- a/include/hw/virtio/vhost-scsi.h
+++ b/include/hw/virtio/vhost-scsi.h
@@ -60,6 +60,7 @@ typedef struct VHostSCSI {
 Error *migration_blocker;
 
 struct vhost_dev dev;
+int32_t bootindex;
 } VHostSCSI;
 
 #define DEFINE_VHOST_SCSI_PROPERTIES(_state, _conf_field) \
-- 
1.7.12.4

[Qemu-devel] [PATCH v2 1/5] qdev: support to get a device firmware path directly

2015-01-28 Thread arei.gonglei

From: Gonglei 

commit 6b1566c (qdev: Introduce FWPathProvider interface) did a
good job for supproting to get firmware path on some different
architectures.

Moreover further more, we can use the interface to get firmware
path name for a device which isn't attached a specific bus,
such as virtio-bus, scsi-bus etc.

When the device (such as vhost-scsi) realize the TYPE_FW_PATH_PROVIDER
interface, we should introduce a new function to get the correct firmware
path name for it.

Signed-off-by: Gonglei 
Signed-off-by: Paolo Bonzini 
---
 bootdevice.c   | 31 +--
 hw/core/qdev.c |  7 +++
 include/hw/qdev-core.h |  1 +
 3 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/bootdevice.c b/bootdevice.c
index 5914417..c3a010c 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -210,7 +210,9 @@ char *get_boot_devices_list(size_t *size, bool 
ignore_suffixes)
 char *list = NULL;
 
 QTAILQ_FOREACH(i, &fw_boot_order, link) {
-char *devpath = NULL, *bootpath;
+char *devpath = NULL,  *suffix = NULL;
+char *bootpath;
+char *d;
 size_t len;
 
 if (i->dev) {
@@ -218,21 +220,22 @@ char *get_boot_devices_list(size_t *size, bool 
ignore_suffixes)
 assert(devpath);
 }
 
-if (i->suffix && !ignore_suffixes && devpath) {
-size_t bootpathlen = strlen(devpath) + strlen(i->suffix) + 1;
-
-bootpath = g_malloc(bootpathlen);
-snprintf(bootpath, bootpathlen, "%s%s", devpath, i->suffix);
-g_free(devpath);
-} else if (devpath) {
-bootpath = devpath;
-} else if (!ignore_suffixes) {
-assert(i->suffix);
-bootpath = g_strdup(i->suffix);
-} else {
-bootpath = g_strdup("");
+if (!ignore_suffixes) {
+d = qdev_get_own_fw_dev_path_from_handler(i->dev->parent_bus, 
i->dev);
+if (d) {
+assert(!i->suffix);
+suffix = d;
+} else {
+suffix = g_strdup(i->suffix);
+}
 }
 
+bootpath = g_strdup_printf("%s%s",
+   devpath ? devpath : "",
+   suffix ? suffix : "");
+g_free(devpath);
+g_free(suffix);
+
 if (total) {
 list[total-1] = '\n';
 }
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 2eacac0..44c6b93 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -818,6 +818,13 @@ static char *qdev_get_fw_dev_path_from_handler(BusState 
*bus, DeviceState *dev)
 return d;
 }
 
+char *qdev_get_own_fw_dev_path_from_handler(BusState *bus, DeviceState *dev)
+{
+Object *obj = OBJECT(dev);
+
+return fw_path_provider_try_get_dev_path(obj, bus, dev);
+}
+
 static int qdev_get_fw_dev_path_helper(DeviceState *dev, char *p, int size)
 {
 int l = 0;
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 15a226f..4e673f9 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -342,6 +342,7 @@ void qbus_reset_all_fn(void *opaque);
 BusState *sysbus_get_default(void);
 
 char *qdev_get_fw_dev_path(DeviceState *dev);
+char *qdev_get_own_fw_dev_path_from_handler(BusState *bus, DeviceState *dev);
 
 /**
  * @qdev_machine_init
-- 
1.7.12.4

[Qemu-devel] [PATCH v2 3/5] vhost-scsi: realize the TYPE_FW_PATH_PROVIDER interface

2015-01-28 Thread arei.gonglei

From: Gonglei 

In the way, we can make the bootindex property take effect.
At the meanwhile, the firmware path name of vhost-scsi is
"channel@channel/vhost-scsi@target,lun".

Signed-off-by: Gonglei 
---
 hw/scsi/vhost-scsi.c   | 20 
 include/hw/virtio/vhost-scsi.h |  3 +++
 2 files changed, 23 insertions(+)

diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 9c4f613..dc9076e 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -24,6 +24,7 @@
 #include "hw/virtio/virtio-scsi.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "hw/fw-path-provider.h"
 
 /* Features supported by host kernel. */
 static const int kernel_feature_bits[] = {
@@ -271,6 +272,19 @@ static void vhost_scsi_unrealize(DeviceState *dev, Error 
**errp)
 virtio_scsi_common_unrealize(dev, errp);
 }
 
+/*
+ * Implementation of an interface to adjust firmware path
+ * for the bootindex property handling.
+ */
+static char *vhost_scsi_get_fw_dev_path(FWPathProvider *p, BusState *bus,
+DeviceState *dev)
+{
+VHostSCSI *s = VHOST_SCSI(dev);
+/* format: channel@channel/vhost-scsi@target,lun */
+return g_strdup_printf("channel@%x/%s@%x,%x", s->channel,
+   qdev_fw_name(dev), s->target, s->lun);
+}
+
 static Property vhost_scsi_properties[] = {
 DEFINE_VHOST_SCSI_PROPERTIES(VHostSCSI, parent_obj.conf),
 DEFINE_PROP_END_OF_LIST(),
@@ -280,6 +294,7 @@ static void vhost_scsi_class_init(ObjectClass *klass, void 
*data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
+FWPathProviderClass *fwc = FW_PATH_PROVIDER_CLASS(klass);
 
 dc->props = vhost_scsi_properties;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
@@ -288,6 +303,7 @@ static void vhost_scsi_class_init(ObjectClass *klass, void 
*data)
 vdc->get_features = vhost_scsi_get_features;
 vdc->set_config = vhost_scsi_set_config;
 vdc->set_status = vhost_scsi_set_status;
+fwc->get_dev_path = vhost_scsi_get_fw_dev_path;
 }
 
 static void vhost_scsi_instance_init(Object *obj)
@@ -304,6 +320,10 @@ static const TypeInfo vhost_scsi_info = {
 .instance_size = sizeof(VHostSCSI),
 .class_init = vhost_scsi_class_init,
 .instance_init = vhost_scsi_instance_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_FW_PATH_PROVIDER },
+{ }
+},
 };
 
 static void virtio_register_types(void)
diff --git a/include/hw/virtio/vhost-scsi.h b/include/hw/virtio/vhost-scsi.h
index ed50289..c0056c2 100644
--- a/include/hw/virtio/vhost-scsi.h
+++ b/include/hw/virtio/vhost-scsi.h
@@ -61,6 +61,9 @@ typedef struct VHostSCSI {
 
 struct vhost_dev dev;
 int32_t bootindex;
+int channel;
+int target;
+int lun;
 } VHostSCSI;
 
 #define DEFINE_VHOST_SCSI_PROPERTIES(_state, _conf_field) \
-- 
1.7.12.4

[Qemu-devel] [-prfix=PATCH v2 RFC 5/6] qemu-iotests: s390x: fix test 051

2015-01-28 Thread Xiao Guang Chen

From: Mao Chuan Li 

The tests for device type "ide_cd" are skipped for the s390 platform.
The default device id of hard disk on the s390 platform differs to that
of the x86 platform. A new variable device_id is defined and "virtio0"
set for the s390 platform. A s390 platform specific output file is also
needed.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li 
---
 tests/qemu-iotests/051 |  91 +---
 tests/qemu-iotests/051.s390-virtio.out | 377 +
 2 files changed, 439 insertions(+), 29 deletions(-)
 create mode 100644 tests/qemu-iotests/051.s390-virtio.out

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 11c858f..2b600df 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -137,13 +137,19 @@ run_qemu -drive if=ide
 run_qemu -drive if=virtio
 run_qemu -drive if=scsi
 
-run_qemu -drive if=none,id=disk -device ide-cd,drive=disk
-run_qemu -drive if=none,id=disk -device lsi53c895a -device scsi-cd,drive=disk
-
-run_qemu -drive if=none,id=disk -device ide-drive,drive=disk
-run_qemu -drive if=none,id=disk -device ide-hd,drive=disk
-run_qemu -drive if=none,id=disk -device lsi53c895a -device scsi-disk,drive=disk
-run_qemu -drive if=none,id=disk -device lsi53c895a -device scsi-hd,drive=disk
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+;;
+*)
+run_qemu -drive if=none,id=disk -device ide-cd,drive=disk
+run_qemu -drive if=none,id=disk -device lsi53c895a -device 
scsi-cd,drive=disk
+
+run_qemu -drive if=none,id=disk -device ide-drive,drive=disk
+run_qemu -drive if=none,id=disk -device ide-hd,drive=disk
+run_qemu -drive if=none,id=disk -device lsi53c895a -device 
scsi-disk,drive=disk
+run_qemu -drive if=none,id=disk -device lsi53c895a -device 
scsi-hd,drive=disk
+;;
+esac
 
 echo
 echo === Read-only ===
@@ -157,13 +163,19 @@ run_qemu -drive file="$TEST_IMG",if=ide,readonly=on
 run_qemu -drive file="$TEST_IMG",if=virtio,readonly=on
 run_qemu -drive file="$TEST_IMG",if=scsi,readonly=on
 
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-cd,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-cd,drive=disk
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+;;
+*)
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-cd,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-cd,drive=disk
 
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-drive,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-hd,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-disk,drive=disk
-run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-hd,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-drive,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
ide-hd,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-disk,drive=disk
+run_qemu -drive file="$TEST_IMG",if=none,id=disk,readonly=on -device 
lsi53c895a -device scsi-hd,drive=disk
+;;
+esac
 
 echo
 echo === Cache modes ===
@@ -172,12 +184,24 @@ echo
 # Cannot use the test image because cache=none might not work on the host FS
 # Use cdrom so that we won't get errors about missing media
 
-run_qemu -drive media=cdrom,cache=none
-run_qemu -drive media=cdrom,cache=directsync
-run_qemu -drive media=cdrom,cache=writeback
-run_qemu -drive media=cdrom,cache=writethrough
-run_qemu -drive media=cdrom,cache=unsafe
-run_qemu -drive media=cdrom,cache=invalid_value
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+run_qemu -drive if=scsi,media=cdrom,cache=none
+run_qemu -drive if=scsi,media=cdrom,cache=directsync
+run_qemu -drive if=scsi,media=cdrom,cache=writeback
+run_qemu -drive if=scsi,media=cdrom,cache=writethrough
+run_qemu -drive if=scsi,media=cdrom,cache=unsafe
+run_qemu -drive if=scsi,media=cdrom,cache=invalid_value
+;;
+*)
+run_qemu -drive media=cdrom,cache=none
+run_qemu -drive media=cdrom,cache=directsync
+run_qemu -drive media=cdrom,cache=writeback
+run_qemu -drive media=cdrom,cache=writethrough
+run_qemu -drive media=cdrom,cache=unsafe
+run_qemu -drive media=cdrom,cache=invalid_value
+;;
+esac
 
 echo
 echo === Specifying the protocol layer ===
@@ -241,28 +265,37 @@ echo
 echo === Snapshot mode ===
 echo
 
+case "$QEMU_DEFAULT_MACHINE" in
+s390-virtio)
+device_id="virtio0"
+;;
+*)
+device_id="ide0-hd0"
+;;
+esac
+
 $QEMU_IO -c "write -P 0x11 0 4k" "$TEST_IMG" | _filter_qemu_io
 
-echo 'qem

[Qemu-devel] [-prfix=PATCH v2 RFC 4/6] qemu-iotests: s390x: fix test 041

2015-01-28 Thread Xiao Guang Chen

From: Mao Chuan Li 

There is no such device 'ide-cd' defined on the s390 platform, so
test_medium_not_found() test is skipped.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li  
---
 tests/qemu-iotests/041 | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 59a8f73..4fb1b34 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -197,6 +197,9 @@ class TestSingleDrive(ImageMirroringTestCase):
 'target image does not match source after mirroring')
 
 def test_medium_not_found(self):
+   if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('drive-mirror', device='ide1-cd0', sync='full',
  target=target_img)
 self.assert_qmp(result, 'error/class', 'GenericError')
@@ -867,6 +870,9 @@ class TestRepairQuorum(ImageMirroringTestCase):
 if not self.has_quorum():
 return
 
+if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('drive-mirror', device='ide1-cd0', sync='full',
  node_name='repair0',
  replaces='img1',
-- 
1.9.1

[Qemu-devel] [-prfix=PATCH v2 RFC 2/6] qemu-iotests: qemu machine type support

2015-01-28 Thread Xiao Guang Chen

From: Mao Chuan Li 

This patch adds qemu machine type support to the io test suite. Based on
the qemu default machine type the reference output file can now vary
from the default to a machine specific output file if necessary. That
shall allow all platforms to use this test suite.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li 
---
 tests/qemu-iotests/check | 5 +
 tests/qemu-iotests/common.config | 1 +
 tests/qemu-iotests/iotests.py| 1 +
 3 files changed, 7 insertions(+)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 8ca4011..fc0351d 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -323,6 +323,11 @@ do
 fi
 
 reference="$source_iotests/$seq.out"
+reference_machine="$source_iotests/$seq.$QEMU_DEFAULT_MACHINE.out"
+if [ -f $reference_machine ]; then
+reference=$reference_machine
+fi
+
 if [ "$CACHEMODE" = "none" ]; then
 [ -f "$source_iotests/$seq.out.nocache" ] && 
reference="$source_iotests/$seq.out.nocache"
 fi
diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
index bd6790b..73e25da 100644
--- a/tests/qemu-iotests/common.config
+++ b/tests/qemu-iotests/common.config
@@ -107,6 +107,7 @@ export QEMU=$QEMU_PROG
 export QEMU_IMG=$QEMU_IMG_PROG
 export QEMU_IO="$QEMU_IO_PROG $QEMU_IO_OPTIONS"
 export QEMU_NBD=$QEMU_NBD_PROG
+export QEMU_DEFAULT_MACHINE=$($QEMU -machine ? | awk '/(default)/{print $1}')
 
 [ -f /etc/qemu-iotest.config ]   && . /etc/qemu-iotest.config
 
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f57f154..69dee95 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -39,6 +39,7 @@ imgproto = os.environ.get('IMGPROTO', 'file')
 test_dir = os.environ.get('TEST_DIR', '/var/tmp')
 output_dir = os.environ.get('OUTPUT_DIR', '.')
 cachemode = os.environ.get('CACHEMODE')
+qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
 
-- 
1.9.1

[Qemu-devel] [-prfix=PATCH v2 RFC 3/6] qemu-iotests: run qemu with -nodefaults

2015-01-28 Thread Xiao Guang Chen

From: Mao Chuan Li 

This patch fixes an io test suite issue that was introduced with the
commit c88930a6866e74953e931ae749781e98e486e5c8 'qemu-char: Permit only
a single "stdio" character device'. The option supresses the creation of
default devices.

Reviewed-by: Michael Mueller 
Signed-off-by: Mao Chuan Li 
---
 tests/qemu-iotests/common| 1 +
 tests/qemu-iotests/common.config | 2 +-
 tests/qemu-iotests/common.qemu   | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 9e12bec..f3fa6dd 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -51,6 +51,7 @@ export IMGOPTS=""
 export CACHEMODE="writeback"
 export QEMU_IO_OPTIONS=""
 export CACHEMODE_IS_DEFAULT=true
+export QEMU_OPTIONS="-nodefaults"
 
 for r
 do
diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
index 73e25da..3025921 100644
--- a/tests/qemu-iotests/common.config
+++ b/tests/qemu-iotests/common.config
@@ -103,7 +103,7 @@ if [ -z "$QEMU_NBD_PROG" ]; then
 export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
 fi
 
-export QEMU=$QEMU_PROG
+export QEMU="$QEMU_PROG $QEMU_OPTIONS"
 export QEMU_IMG=$QEMU_IMG_PROG
 export QEMU_IO="$QEMU_IO_PROG $QEMU_IO_OPTIONS"
 export QEMU_NBD=$QEMU_NBD_PROG
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index ee7ebb4..ea27536 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -153,7 +153,7 @@ function _launch_qemu()
 mkfifo "${fifo_out}"
 mkfifo "${fifo_in}"
 
-"${QEMU}" -nographic -serial none ${comm} -machine accel=qtest "${@}" 2>&1 
\
+${QEMU} -nographic -serial none ${comm} -machine accel=qtest "${@}" 2>&1 \
 >"${fifo_out}" 
\
 <"${fifo_in}" &
 QEMU_PID[${_QEMU_HANDLE}]=$!
-- 
1.9.1

[Qemu-devel] [-prfix=PATCH v2 RFC 6/6] qemu-iotests: s390x: fix test 055

2015-01-28 Thread Xiao Guang Chen

From: Mao Chuan Li 

There is no such device 'ide-cd' defined on the s390 platform, so
test_medium_not_found() test is skipped.

Reviewed-by:   Michael Mueller 
Signed-off-by: Mao Chuan Li  
---
 tests/qemu-iotests/055 | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 451b67d..5cf487b 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -90,6 +90,9 @@ class TestSingleDrive(iotests.QMPTestCase):
 'target image does not match source after backup')
 
 def test_medium_not_found(self):
+if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('drive-backup', device='ide1-cd0',
  target=target_img, sync='full')
 self.assert_qmp(result, 'error/class', 'GenericError')
@@ -252,6 +255,9 @@ class TestSingleTransaction(iotests.QMPTestCase):
 'target image does not match source after backup')
 
 def test_medium_not_found(self):
+if iotests.qemu_default_machine == 's390-virtio':
+   return
+
 result = self.vm.qmp('transaction', actions=[{
 'type': 'drive-backup',
 'data': { 'device': 'ide1-cd0',
-- 
1.9.1

[Qemu-devel] [-prfix=PATCH v2 RFC 1/6] qemu-iotests: fix tests 067, 071 and 087

2015-01-28 Thread Xiao Guang Chen

From: Xiao Guang Chen 

Due to no default floppy and cdrom for guests any more, output files for test
case 067, 071 and 087 should be editted to make the test pass.

Use virtio-blk instead of virtio-blk-pci as the device driver for test 067.
For virtio-blk-pci is equal to virt-blk as device driver but other platform such
as s390 may not recognize virtio-blk-pci.

Reviewed-by: Michael Mueller 
Signed-off-by: Xiao Guang Chen 
---
 tests/qemu-iotests/067 |  8 
 tests/qemu-iotests/067.out | 26 +-
 tests/qemu-iotests/071.out | 12 
 tests/qemu-iotests/087.out | 18 +++---
 4 files changed, 20 insertions(+), 44 deletions(-)

diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067
index d025192..6cddf10 100755
--- a/tests/qemu-iotests/067
+++ b/tests/qemu-iotests/067
@@ -56,7 +56,7 @@ echo
 echo === -drive/-device and device_del ===
 echo
 
-run_qemu -drive file=$TEST_IMG,format=$IMGFMT,if=none,id=disk -device 
virtio-blk-pci,drive=disk,id=virtio0 <

[Qemu-devel] [-prfix=PATCH v2 RFC 0/6] Update tests/qemu-iotests failing cases for the s390 platform

2015-01-28 Thread Xiao Guang Chen

From: Xiao Guang Chen 

v2:
1. Drop the patches for test 039 for it has been fixed in upstream.
2. Integrate patches for test 071, 067 and 087.
3. Keep the other patches.

v1:
1. updated the test suite to be default-machine-type-aware, from the previous 
platform-aware
2. created a new patch "qemu-iotests: run qemu with -nodefaults" to counterpart 
the impact from the commit:
c88930a6866e74953e931ae749781e98e486e5c8
qemu-char: Permit only a single "stdio" character device

When more than one is used, the terminal settings aren't restored
correctly on exit.  Fixable.  However, such usage makes no sense,
because the users race for input, so outlaw it instead.

If you want to connect multiple things to stdio, use the mux
chardev.
3. updated all the checking of platform name to the current machine name

Mao Chuan Li (5):
  qemu-iotests: qemu machine type support
  qemu-iotests: run qemu with -nodefaults
  qemu-iotests: s390x: fix test 041
  qemu-iotests: s390x: fix test 051
  qemu-iotests: s390x: fix test 055

Xiao Guang Chen (1):
  qemu-iotests: fix tests 067, 071 and 087

 tests/qemu-iotests/041 |   6 +
 tests/qemu-iotests/051 |  91 +---
 tests/qemu-iotests/051.s390-virtio.out | 377 +
 tests/qemu-iotests/055 |   6 +
 tests/qemu-iotests/067 |   8 +-
 tests/qemu-iotests/067.out |  26 +--
 tests/qemu-iotests/071.out |  12 +-
 tests/qemu-iotests/087.out |  18 +-
 tests/qemu-iotests/check   |   5 +
 tests/qemu-iotests/common  |   1 +
 tests/qemu-iotests/common.config   |   3 +-
 tests/qemu-iotests/common.qemu |   2 +-
 tests/qemu-iotests/iotests.py  |   1 +
 13 files changed, 481 insertions(+), 75 deletions(-)
 create mode 100644 tests/qemu-iotests/051.s390-virtio.out

-- 
1.9.1

Re: [Qemu-devel] [PATCH v3] spapr_vio/spapr_iommu: Move VIO bypass where it belongs

2015-01-28 Thread David Gibson

On Thu, Jan 29, 2015 at 04:04:58PM +1100, Alexey Kardashevskiy wrote:
> Instead of tweaking a TCE table device by adding there a bypass flag,
> let's add an alias to RAM and IOMMU memory region, and enable/disable
> those according to the selected bypass mode.
> This way IOMMU memory region can have size of the actual window rather
> than ram_size which is essential for upcoming DDW support.
> 
> This moves bypass logic to VIO layer and keeps @bypass flag in TCE table
> for migration compatibility only. This replaces spapr_tce_set_bypass()
> calls with explicit assignment to avoid confusion as the function could
> do something more that just syncing the @bypass flag.
> 
> This adds a pointer to VIO device into the sPAPRTCETable struct to provide
> the sPAPRTCETable device a way to update bypass mode for the VIO device.
> 
> Signed-off-by: Alexey Kardashevskiy 

It looks correct to me, although I don't love the fact that the bypass
is controlled by the VIO layer, but the state is stored in the TCE
table.  We should fix that at some point, although it will require
some awkward changing to the migration stream.

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpUe28fKhrQ5.pgp
Description: PGP signature

[Qemu-devel] [PATCH v3] spapr_vio/spapr_iommu: Move VIO bypass where it belongs

2015-01-28 Thread Alexey Kardashevskiy

Instead of tweaking a TCE table device by adding there a bypass flag,
let's add an alias to RAM and IOMMU memory region, and enable/disable
those according to the selected bypass mode.
This way IOMMU memory region can have size of the actual window rather
than ram_size which is essential for upcoming DDW support.

This moves bypass logic to VIO layer and keeps @bypass flag in TCE table
for migration compatibility only. This replaces spapr_tce_set_bypass()
calls with explicit assignment to avoid confusion as the function could
do something more that just syncing the @bypass flag.

This adds a pointer to VIO device into the sPAPRTCETable struct to provide
the sPAPRTCETable device a way to update bypass mode for the VIO device.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v3:
* s/spapr_tce_set_bypass/spapr_vio_set_bypass/
* added sPAPRTCETable::vdev to avoid messing with Object::parent

v2:
* kept @bypass in sPAPRTCETable not to break migration
---
 hw/ppc/spapr_iommu.c   | 26 --
 hw/ppc/spapr_vio.c | 28 ++--
 include/hw/ppc/spapr.h |  2 +-
 include/hw/ppc/spapr_vio.h |  4 
 4 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index da47474..f9f85c5 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -25,6 +25,7 @@
 #include "trace.h"
 
 #include "hw/ppc/spapr.h"
+#include "hw/ppc/spapr_vio.h"
 
 #include 
 
@@ -72,9 +73,7 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion 
*iommu, hwaddr addr,
 .perm = IOMMU_NONE,
 };
 
-if (tcet->bypass) {
-ret.perm = IOMMU_RW;
-} else if ((addr >> tcet->page_shift) < tcet->nb_table) {
+if ((addr >> tcet->page_shift) < tcet->nb_table) {
 /* Check if we are in bound */
 hwaddr page_mask = IOMMU_PAGE_MASK(tcet->page_shift);
 
@@ -90,10 +89,22 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion 
*iommu, hwaddr addr,
 return ret;
 }
 
+static int spapr_tce_table_post_load(void *opaque, int version_id)
+{
+sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
+
+if (tcet->vdev) {
+spapr_vio_set_bypass(tcet->vdev, tcet->bypass);
+}
+
+return 0;
+}
+
 static const VMStateDescription vmstate_spapr_tce_table = {
 .name = "spapr_iommu",
 .version_id = 2,
 .minimum_version_id = 2,
+.post_load = spapr_tce_table_post_load,
 .fields  = (VMStateField []) {
 /* Sanity check */
 VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable),
@@ -131,7 +142,8 @@ static int spapr_tce_table_realize(DeviceState *dev)
 trace_spapr_iommu_new_table(tcet->liobn, tcet, tcet->table, tcet->fd);
 
 memory_region_init_iommu(&tcet->iommu, OBJECT(dev), &spapr_iommu_ops,
- "iommu-spapr", ram_size);
+ "iommu-spapr",
+ (uint64_t)tcet->nb_table << tcet->page_shift);
 
 QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
 
@@ -191,17 +203,11 @@ MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet)
 return &tcet->iommu;
 }
 
-void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass)
-{
-tcet->bypass = bypass;
-}
-
 static void spapr_tce_reset(DeviceState *dev)
 {
 sPAPRTCETable *tcet = SPAPR_TCE_TABLE(dev);
 size_t table_size = tcet->nb_table * sizeof(uint64_t);
 
-tcet->bypass = false;
 memset(tcet->table, 0, table_size);
 }
 
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index dc9e46a..7c49158 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -322,6 +322,18 @@ static void spapr_vio_quiesce_one(VIOsPAPRDevice *dev)
 free_crq(dev);
 }
 
+void spapr_vio_set_bypass(VIOsPAPRDevice *dev, bool bypass)
+{
+if (!dev->tcet) {
+return;
+}
+
+memory_region_set_enabled(&dev->mrbypass, bypass);
+memory_region_set_enabled(spapr_tce_get_iommu(dev->tcet), !bypass);
+
+dev->tcet->bypass = bypass;
+}
+
 static void rtas_set_tce_bypass(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 uint32_t token,
 uint32_t nargs, target_ulong args,
@@ -348,7 +360,7 @@ static void rtas_set_tce_bypass(PowerPCCPU *cpu, 
sPAPREnvironment *spapr,
 return;
 }
 
-spapr_tce_set_bypass(dev->tcet, !!enable);
+spapr_vio_set_bypass(dev, !!enable);
 
 rtas_st(rets, 0, RTAS_OUT_SUCCESS);
 }
@@ -407,6 +419,7 @@ static void spapr_vio_busdev_reset(DeviceState *qdev)
 
 dev->signal_state = 0;
 
+spapr_vio_set_bypass(dev, false);
 if (pc->reset) {
 pc->reset(dev);
 }
@@ -456,12 +469,23 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
 
 if (pc->rtce_window_size) {
 uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
+
+memory_region_init(&dev->mrroot, OBJECT(dev), "iommu-spapr-root",
+   ram_size);
+memory_region_init_alias(&dev->mrbypass, OBJECT(dev),
+

Re: [Qemu-devel] [PATCH v2] spapr_vio/spapr_iommu: Move VIO bypass where it belongs

2015-01-28 Thread David Gibson

On Thu, Jan 29, 2015 at 11:48:46AM +1100, Alexey Kardashevskiy wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 01/29/2015 11:31 AM, David Gibson wrote:
> > On Wed, Jan 28, 2015 at 07:49:48PM +1100, Alexey Kardashevskiy wrote:
> >> On 01/28/2015 08:21 AM, Alexey Kardashevskiy wrote:
> >>> On 01/27/2015 05:13 PM, Alexey Kardashevskiy wrote:
> > [snip]
>  diff --git a/include/hw/ppc/spapr_vio.h
>  b/include/hw/ppc/spapr_vio.h index 46edc2a..6ad55d1 100644 ---
>  a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@
>  -64,6 +64,8 @@ struct VIOsPAPRDevice { target_ulong
>  signal_state; VIOsPAPR_CRQ crq; AddressSpace as; +
>  MemoryRegion mrroot; +MemoryRegion mrbypass; sPAPRTCETable
>  *tcet; };
>  
>  
> >>> 
> >>> 
> >> 
> >> 
> >> I believe doing something like this is way too disguising because
> >> of tobj->parent?
> > 
> > It's kinda ugly, but it's the best way I can think of.
> 
> 
> 
> Better than having VIOsPAPRDevice pointer in sPAPRTCETable? I would
> expect a object_get_parent() helper to exist but it is not there and I
> believe this is for a reason (may be it will be get rid of some time
> later)...


Uh.. I think someone who knows qom better than me will have to answer
that..

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpXuC_9APAVb.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v1] vhost-user: fix not send all hugepage files to vhost-user

2015-01-28 Thread Linhaifeng

Hi,Michael S.Tsirkin

The vhost-user device will not work if there are two numa nodes in VM.

Should we fix this bug or ignore it ?

On 2014/12/18 13:06, Linhaifeng wrote:
> 
> 
> On 2014/12/17 14:02, haifeng@huawei.com wrote:
>> From: linhaifeng 
>>
>> If we create VM with two or more numa nodes qemu will create two
>> or more hugepage files but qemu only send one hugepage file fd
>> to vhost-user when VM's memory size is 2G and with two numa nodes.
>>
>> Signed-off-by: linhaifeng 
>> ---
>>  hw/virtio/vhost-user.c  | 78 
>> ++---
>>  hw/virtio/vhost.c   | 13 
>>  linux-headers/linux/vhost.h |  7 
>>  3 files changed, 73 insertions(+), 25 deletions(-)
>>
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index aefe0bb..439cbba 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -24,6 +24,10 @@
>>  #include 
>>  
>>  #define VHOST_MEMORY_MAX_NREGIONS8
>> +/* FIXME: same as the max number of numa node?*/
>> +#define HUGEPAGE_MAX_FILES   8
>> +
>> +#define RAM_SHARED (1 << 1)
>>  
>>  typedef enum VhostUserRequest {
>>  VHOST_USER_NONE = 0,
>> @@ -41,14 +45,15 @@ typedef enum VhostUserRequest {
>>  VHOST_USER_SET_VRING_KICK = 12,
>>  VHOST_USER_SET_VRING_CALL = 13,
>>  VHOST_USER_SET_VRING_ERR = 14,
>> -VHOST_USER_MAX
>> +VHOST_USER_MMAP_HUGEPAGE_FILE = 15,
>> +VHOST_USER_UNMAP_HUGEPAGE_FILE = 16,
>> +VHOST_USER_MAX,
>>  } VhostUserRequest;
>>  
>>  typedef struct VhostUserMemoryRegion {
>>  uint64_t guest_phys_addr;
>>  uint64_t memory_size;
>>  uint64_t userspace_addr;
>> -uint64_t mmap_offset;
>>  } VhostUserMemoryRegion;
>>  
>>  typedef struct VhostUserMemory {
>> @@ -57,6 +62,16 @@ typedef struct VhostUserMemory {
>>  VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
>>  } VhostUserMemory;
>>  
>> +typedef struct HugepageMemoryInfo {
>> +uint64_t base_addr;
>> +uint64_t size;
>> +}HugeMemInfo;
>> +
>> +typedef struct HugepageInfo {
>> +uint32_t num;
>> +HugeMemInfo files[HUGEPAGE_MAX_FILES];
>> +}HugepageInfo;
>> +
>>  typedef struct VhostUserMsg {
>>  VhostUserRequest request;
>>  
>> @@ -71,6 +86,7 @@ typedef struct VhostUserMsg {
>>  struct vhost_vring_state state;
>>  struct vhost_vring_addr addr;
>>  VhostUserMemory memory;
>> +HugepageInfo huge_info;
>>  };
>>  } QEMU_PACKED VhostUserMsg;
>>  
>> @@ -104,7 +120,9 @@ static unsigned long int 
>> ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
>>  VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
>>  VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
>>  VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
>> -VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */
>> +VHOST_SET_VRING_ERR,/* VHOST_USER_SET_VRING_ERR */
>> +VHOST_MMAP_HUGEPAGE_FILE,  /* VHOST_USER_MMAP_HUGEPAGE_FILE */
>> +VHOST_UNMAP_HUGEPAGE_FILE, /* VHOST_USER_UNMAP_HUGEPAGE_FILE */
>>  };
>>  
>>  static VhostUserRequest vhost_user_request_translate(unsigned long int 
>> request)
>> @@ -190,6 +208,7 @@ static int vhost_user_call(struct vhost_dev *dev, 
>> unsigned long int request,
>>  int fds[VHOST_MEMORY_MAX_NREGIONS];
>>  int i, fd;
>>  size_t fd_num = 0;
>> +RAMBlock *block;
>>  
>>  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER);
>>  
>> @@ -213,37 +232,46 @@ static int vhost_user_call(struct vhost_dev *dev, 
>> unsigned long int request,
>>  case VHOST_RESET_OWNER:
>>  break;
>>  
>> -case VHOST_SET_MEM_TABLE:
>> -for (i = 0; i < dev->mem->nregions; ++i) {
>> -struct vhost_memory_region *reg = dev->mem->regions + i;
>> -ram_addr_t ram_addr;
>> +case VHOST_MMAP_HUGEPAGE_FILE:
>> +qemu_mutex_lock_ramlist();
>>  
>> -assert((uintptr_t)reg->userspace_addr == reg->userspace_addr);
>> -qemu_ram_addr_from_host((void *)(uintptr_t)reg->userspace_addr, 
>> &ram_addr);
>> -fd = qemu_get_ram_fd(ram_addr);
>> -if (fd > 0) {
>> -msg.memory.regions[fd_num].userspace_addr = 
>> reg->userspace_addr;
>> -msg.memory.regions[fd_num].memory_size  = reg->memory_size;
>> -msg.memory.regions[fd_num].guest_phys_addr = 
>> reg->guest_phys_addr;
>> -msg.memory.regions[fd_num].mmap_offset = 
>> reg->userspace_addr -
>> -(uintptr_t) qemu_get_ram_block_host_ptr(ram_addr);
>> -assert(fd_num < VHOST_MEMORY_MAX_NREGIONS);
>> -fds[fd_num++] = fd;
>> +/* Get hugepage file informations */
>> +QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>> +if (block->flags & RAM_SHARED && block->fd > 0) {
>> +msg.huge_info.files[fd_num].size = block->length;
>> +msg.huge_info.files[fd_num].base_addr = block->host;
>> +

Re: [Qemu-devel] [PATCH 2/2] hw/ppc/spapr Add qemu_register_boot_set for SPAPR

2015-01-28 Thread Gonglei

On 2015/1/29 8:42, Alexander Graf wrote:

> 
> 
> On 29.01.15 01:40, Gonglei wrote:
>> On 2015/1/26 17:35, Alexander Graf wrote:
>>
>>> On 01/26/2015 11:49 AM, Dinar Valeev wrote:
 On 01/24/2015 12:04 AM, Alexander Graf wrote:
>
>
> On 23.01.15 23:51, dval...@suse.de wrote:
>> From: Dinar Valeev 
>>
>> In order to have -boot once=d functioning, it is required to have
>> qemu_register_boot_set
>>
>> qemu-system-ppc64 -enable-kvm -boot once=d
>>
>> Ready!
>> 0 > dev /chosen   ok
>> 0 > .properties
>> ...
>> qemu,boot-device d
>> ...
>> 0 > reset-all
>>
>> Ready!
>> 0 > dev /chosen   ok
>> 0 > .properties
>> ...
>> qemu,boot-device cdn
>> ...
>>
>> Signed-off-by: Dinar Valeev 
>> ---
>>   hw/ppc/spapr.c | 12 
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 3d2cfa3..38b03fc 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -314,6 +314,16 @@ static void add_str(GString *s, const gchar *s1)
>>   g_string_append_len(s, s1, strlen(s1) + 1);
>>   }
>>
>> +static void spapr_boot_set(void *opaque, const char *boot_device,
>> +   Error **errp)
>> +{
>> +int offset;
>> +offset = fdt_path_offset(opaque, "/chosen");
>> +fdt_setprop_string(opaque, offset, "qemu,boot-device", boot_device);
>> +
>> +}
>> +
>> +
>>   static void *spapr_create_fdt_skel(hwaddr initrd_base,
>>  hwaddr initrd_size,
>>  hwaddr kernel_size,
>> @@ -414,6 +424,8 @@ static void *spapr_create_fdt_skel(hwaddr 
>> initrd_base,
>>   if (boot_device) {
>>   _FDT((fdt_property_string(fdt, "qemu,boot-device", 
>> boot_device)));
>>   }
>> +qemu_register_boot_set(spapr_boot_set, fdt);
>
> If you simply move the code above (the _FDT() one) from create_fdt_skel
> to spapr_finalize_fdt() you should have the same net effect and much
> cleaner code :).
 I've tried your proposal, on reset boot-device property stays "d"
>>>
>>> Ugh, the machine field doesn't change on reset. I think it'd be a lot more 
>>> intuitive if it did. Can you try with the patch below applied as well?
>>>
>>
>> This approach is not good because boot_set_handler is NULL and return error 
>> directly.
>> Please using qemu_register_boot_set for this purpose.
> 
> I'd personally prefer if we get rid of qemu_register_boot_set
> completely. It duplicates the reset logic as well as information holder
> locality (machine struct vs parameter).
> 

Maybe yes. But lots of other machines do not  register
reset callback. So those machines using qemu_register_boot_set()
register a handler callback achieve this purpose.

Regards,
-Gonglei

Re: [Qemu-devel] [PATCH v2 0/5] Common unplug and unplug request cb for memory and CPU hot-unplug

2015-01-28 Thread Zhu Guihua

Any comments about this series?
If no, could anyone help to merge this?
Thanks.

Regards,
Zhu

On Wed, 2015-01-28 at 15:45 +0800, Zhu Guihua wrote:
> Memory and CPU hot unplug are both asynchronous procedures.
> When the unplug operation happens, unplug request cb is called first.
> And when guest OS finished handling unplug, unplug cb will be called
> to do the real removal of device.
> 
> They both need pc-machine, piix4 and ich9 unplug and unplug request cb.
> So this patchset introduces these commom functions as part1, and memory
> and CPU hot-unplug will come soon as part 2 and 3.
> 
> This patch-set is based on QEmu 2.2
> 
> v2:
> - Commit messages changes
> 
> Tang Chen (5):
>   acpi, pc: Add hotunplug request cb for pc machine.
>   acpi, ich9: Add hotunplug request cb for ich9.
>   acpi, pc: Add unplug cb for pc machine.
>   acpi, ich9: Add unplug cb for ich9.
>   acpi, piix4: Add unplug cb for piix4.
> 
>  hw/acpi/ich9.c | 14 ++
>  hw/acpi/piix4.c|  8 
>  hw/i386/pc.c   | 16 
>  hw/isa/lpc_ich9.c  | 14 --
>  include/hw/acpi/ich9.h |  4 
>  5 files changed, 54 insertions(+), 2 deletions(-)
>

Re: [Qemu-devel] [PATCH 0/6] Trivial cleanups around g_malloc()

2015-01-28 Thread Gonglei

On 2015/1/29 10:21, Eric Blake wrote:

> On 01/28/2015 05:31 PM, Gonglei wrote:
>> On 2015/1/28 19:16, Markus Armbruster wrote:
>>
>>> I screwed up cc: qemu-trivial, my apologies.
>>>
> 
 Markus Armbruster (6):
   onenand: g_malloc() can't fail, bury dead error handling
   rtl8139: g_malloc() can't fail, bury dead error handling
   kvm: g_malloc() can't fail, bury dead error handling
   rdma: g_malloc0() can't fail, bury dead error handling
   vnc: g_realloc() can't fail, bury dead error handling
   translate-all: Use g_try_malloc() for dynamic translator buffer

> 
>>>
>>
>> For serials:
> 
> s/serials/series/

Yes, thanks :)

> 
>> Reviewed-by: Gonglei 
>>
>> Regards,
>> -Gonglei
>>
>>
>>
>>
>>
>

Re: [Qemu-devel] [PATCH 0/6] Trivial cleanups around g_malloc()

2015-01-28 Thread Eric Blake

On 01/28/2015 05:31 PM, Gonglei wrote:
> On 2015/1/28 19:16, Markus Armbruster wrote:
> 
>> I screwed up cc: qemu-trivial, my apologies.
>>

>>> Markus Armbruster (6):
>>>   onenand: g_malloc() can't fail, bury dead error handling
>>>   rtl8139: g_malloc() can't fail, bury dead error handling
>>>   kvm: g_malloc() can't fail, bury dead error handling
>>>   rdma: g_malloc0() can't fail, bury dead error handling
>>>   vnc: g_realloc() can't fail, bury dead error handling
>>>   translate-all: Use g_try_malloc() for dynamic translator buffer
>>>

>>
> 
> For serials:

s/serials/series/

> Reviewed-by: Gonglei 
> 
> Regards,
> -Gonglei
> 
> 
> 
> 
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v6 5/5] qemu-iotests: Add 093 for IO throttling

2015-01-28 Thread Fam Zheng

On Thu, 01/29 08:53, Fam Zheng wrote:
> On Wed, 01/28 11:22, Max Reitz wrote:
> > On 2015-01-27 at 21:28, Fam Zheng wrote:
> > >This case utilizes qemu-io command "aio_{read,write} -q" to verify the
> > >effectiveness of IO throttling options.
> > >
> > >It's implemented by driving the vm timer from qtest protocol, so the
> > >throttling timers are signaled with determinied time duration. Then we
> > >verify the completed IO requests are within 10% error of bps and iops
> > >limits.
> > >
> > >"null" protocol is used as the disk backend so that no actual disk IO is
> > >performed on host, this will make the blockstats much more
> > >deterministic. Both "null-aio" and "null-co" are covered, which is also
> > >a simple cross validation test for the driver code.
> > >
> > >Signed-off-by: Fam Zheng 
> > >---
> > >  tests/qemu-iotests/093 | 120 
> > > +
> > >  tests/qemu-iotests/093.out |   5 ++
> > >  tests/qemu-iotests/group   |   1 +
> > >  3 files changed, 126 insertions(+)
> > >  create mode 100755 tests/qemu-iotests/093
> > >  create mode 100644 tests/qemu-iotests/093.out
> > >
> > >diff --git a/tests/qemu-iotests/093 b/tests/qemu-iotests/093
> > >new file mode 100755
> > >index 000..2866536
> > >--- /dev/null
> > >+++ b/tests/qemu-iotests/093
> > >@@ -0,0 +1,120 @@
> > >+#!/usr/bin/env python
> > >+#
> > >+# Tests for IO throttling
> > >+#
> > >+# Copyright (C) 2015 Red Hat, Inc.
> > >+#
> > >+# This program is free software; you can redistribute it and/or modify
> > >+# it under the terms of the GNU General Public License as published by
> > >+# the Free Software Foundation; either version 2 of the License, or
> > >+# (at your option) any later version.
> > >+#
> > >+# This program is distributed in the hope that it will be useful,
> > >+# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > >+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > >+# GNU General Public License for more details.
> > >+#
> > >+# You should have received a copy of the GNU General Public License
> > >+# along with this program.  If not, see .
> > >+#
> > >+
> > >+import iotests
> > >+
> > >+class ThrottleTestCase(iotests.QMPTestCase):
> > >+test_img = "null-aio://"
> > >+
> > >+def blockstats(self, device):
> > >+result = self.vm.qmp("query-blockstats")
> > >+for r in result['return']:
> > >+if r['device'] == device:
> > >+stat = r['stats']
> > >+return stat['rd_bytes'], stat['rd_operations'], 
> > >stat['wr_bytes'], stat['wr_operations']
> > >+raise Exception("Device not found for blockstats: %s" % device)
> > >+
> > >+def setUp(self):
> > >+self.vm = iotests.VM().add_drive(self.test_img)
> > >+self.vm.launch()
> > >+
> > >+def tearDown(self):
> > >+self.vm.shutdown()
> > >+
> > >+def do_test_throttle(self, seconds, params):
> > >+def check_limit(limit, num):
> > >+# IO throttling algorithm is discrete, allow 10% error so the 
> > >test
> > >+# is more robust
> > >+return limit == 0 or \
> > >+   (num < seconds * limit * 1.1
> > >+   and num > seconds * limit * 0.9)
> > >+
> > >+nsec_per_sec = 10
> > >+
> > >+params['device'] = 'drive0'
> > >+
> > >+result = self.vm.qmp("block_set_io_throttle", conv_keys=False, 
> > >**params)
> > >+self.assert_qmp(result, 'return', {})
> > >+
> > >+# Set vm clock to a known value
> > >+ns = seconds * nsec_per_sec
> > >+self.vm.qtest("clock_step %d" % ns)
> > >+
> > >+# Submit enough requests. They will drain bps_max and iops_max, 
> > >but the
> > >+# rest requests won't get executed until we advance the virtual 
> > >clock
> > >+# with qtest interface
> > >+rq_size = 512
> > >+rd_nr = max(params['bps'] / rq_size / 2,
> > >+params['bps_rd'] / rq_size,
> > >+params['iops'] / 2,
> > >+params['iops_rd']) + \
> > >+params['bps_max'] / rq_size / 2 + \
> > >+params['iops_max']
> > 
> > I guess the divisions by two are because those values represent read and
> > write operations combined. Shouldn't iops_max be divided by two, too, then?
> > 
> > >+rd_nr *= seconds * 2
> > >+wr_nr = max(params['bps'] / rq_size / 2,
> > >+params['bps_wr'] / rq_size,
> > >+params['iops'] / 2,
> > >+params['iops_wr']) + \
> > >+params['bps_max'] / rq_size / 2 + \
> > >+params['iops_max']
> > >+wr_nr *= seconds * 2
> > >+for i in range(rd_nr):
> > >+self.vm.hmp_qemu_io("drive0", "aio_read %d %d" % (i * 
> > >rq_size, rq_size))
> > >+for i in range(wr_nr):
> > >+self.vm.hmp_qemu_i

Re: [Qemu-devel] [PATCH] target-ppc: Use right page size with hash table lookup

2015-01-28 Thread David Gibson

On Mon, Jan 26, 2015 at 07:51:58PM +0530, Aneesh Kumar K.V wrote:
> We look at two sizes specified in ISA (4K, 64K). If not found matching,
> we consider it 16MB.
> 
> Without this patch we would fail to lookup address above 16MB range.
> Below 16MB happened to work before because the kernel have a liner
> mapping and we always looked up hash for 0xc000. The
> actual real address was computed by using the 16MB offset
> with the real address found with the above hash.
> 
> Without Fix:
> (gdb) x/16x 0xc100
> 0xc100 :   Cannot access memory at 
> address 0xc100
> (gdb)
> 
> With Fix:
> (gdb)  x/16x 0xc100
> 0xc100 :   0x  0x
>   0x  0x
> 0xc110 :   0x  0x
>   0x  0x
> 0xc120 :   0x  0x
>   0x  0x
> 0xc130 :   0x  0x
>   0x  0x
> 
> Signed-off-by: Aneesh Kumar K.V 

This doesn't fully implement the sllp page size encodings, but it's
certainly better than what's there now.

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp7vUWRYh0op.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v1 07/13] spapr: Start all the threads of CPU core when core is hotplugged

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:14AM +0530, Bharata B Rao wrote:
> PowerPC kernel adds or removes CPUs in core granularity and hence
> onlines/offlines all the SMT threads of a core during hot plug/unplug.
> Support this notion by starting all SMT threads of a core when a core
> is hotplugged.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a293a59..4347471 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1376,6 +1376,8 @@ static void spapr_drc_reset(void *opaque)
>  }
>  }
>  
> +static const char *current_cpu_model;

More new global variables?  Please don't.

>  /* pSeries LPAR / sPAPR hardware init */
>  static void ppc_spapr_init(MachineState *machine)
>  {
> @@ -1473,6 +1475,8 @@ static void ppc_spapr_init(MachineState *machine)
>  }
>  }
>  
> +current_cpu_model = cpu_model;
> +
>  /* allocate RAM */
>  spapr->ram_limit = ram_size;
>  spapr->maxram_limit = machine->maxram_size;
> @@ -1912,10 +1916,31 @@ static void spapr_cpu_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  PowerPCCPU *cpu = POWERPC_CPU(cs);
>  sPAPRDRConnector *drc =
>  spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, 
> cpu->cpu_dt_id);
> +int id = ppc_get_vcpu_dt_id(cpu);
> +int smt = kvmppc_smt_threads();
> +int i;
> +
> +/*
> + * SMT threads return from here, only main thread (core) will
> + * continue, create threads and signal hotplug event to the guest.
> + */
> +if ((id % smt) != 0) {
> +return;
> +}
>  
>  /* TODO: Check if DR is enabled ? */
>  g_assert(drc);
>  
> +/* Start rest of the SMT threads of the hot plugged core */
> +for (i = 1; i < smp_threads; i++) {
> +cpu = cpu_ppc_init(current_cpu_model);
> +if (cpu == NULL) {
> +fprintf(stderr, "Unable to find PowerPC CPU definition\n");
> +exit(1);
> +}
> +spapr_cpu_reset(cpu);
> +}
> +
>  spapr_cpu_reset(POWERPC_CPU(CPU(dev)));
>  spapr_cpu_hotplug_add(dev, cs);
>  spapr_hotplug_req_add_event(drc);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp6eU0B_f8Bw.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v1 10/13] cpus, spapr: reclaim allocated vCPU objects

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:17AM +0530, Bharata B Rao wrote:
> From: Gu Zheng 

This needs a commit message, it's not at all clear from the 1-line description.

> 
> Signed-off-by: Gu Zheng 
> Signed-off-by: Bharata B Rao 
>(added spapr bits)
> ---
>  cpus.c   | 44 
>  hw/ppc/spapr.c   | 14 -
>  include/qom/cpu.h| 11 ++
>  include/sysemu/kvm.h |  1 +
>  kvm-all.c| 57 
> +++-
>  5 files changed, 125 insertions(+), 2 deletions(-)

The generic and PAPR specific parts should probably be divided into
different patches, since they'll want to go via different trees.

> diff --git a/cpus.c b/cpus.c
> index 1b5168a..98b7199 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -871,6 +871,24 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void 
> *data), void *data)
>  qemu_cpu_kick(cpu);
>  }
>  
> +static void qemu_kvm_destroy_vcpu(CPUState *cpu)
> +{
> +CPU_REMOVE(cpu);
> +
> +if (kvm_destroy_vcpu(cpu) < 0) {
> +fprintf(stderr, "kvm_destroy_vcpu failed.\n");
> +exit(1);
> +}
> +
> +object_unparent(OBJECT(cpu));
> +}
> +
> +static void qemu_tcg_destroy_vcpu(CPUState *cpu)
> +{
> +CPU_REMOVE(cpu);
> +object_unparent(OBJECT(cpu));
> +}
> +
>  static void flush_queued_work(CPUState *cpu)
>  {
>  struct qemu_work_item *wi;
> @@ -964,6 +982,11 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
>  }
>  }
>  qemu_kvm_wait_io_event(cpu);
> +if (cpu->exit && !cpu_can_run(cpu)) {
> +qemu_kvm_destroy_vcpu(cpu);
> +qemu_mutex_unlock(&qemu_global_mutex);
> +return NULL;
> +}
>  }
>  
>  return NULL;
> @@ -1018,6 +1041,7 @@ static void tcg_exec_all(void);
>  static void *qemu_tcg_cpu_thread_fn(void *arg)
>  {
>  CPUState *cpu = arg;
> +CPUState *remove_cpu = NULL;
>  
>  qemu_tcg_init_cpu_signals();
>  qemu_thread_get_self(cpu->thread);
> @@ -1052,6 +1076,16 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
>  }
>  }
>  qemu_tcg_wait_io_event();
> +CPU_FOREACH(cpu) {
> +if (cpu->exit && !cpu_can_run(cpu)) {
> +remove_cpu = cpu;
> +break;
> +}
> +}
> +if (remove_cpu) {
> +qemu_tcg_destroy_vcpu(remove_cpu);
> +remove_cpu = NULL;
> +}
>  }
>  
>  return NULL;
> @@ -1208,6 +1242,13 @@ void resume_all_vcpus(void)
>  }
>  }
>  
> +void cpu_remove(CPUState *cpu)
> +{
> +cpu->stop = true;
> +cpu->exit = true;
> +qemu_cpu_kick(cpu);
> +}
> +
>  /* For temporary buffers for forming a name */
>  #define VCPU_THREAD_NAME_SIZE 16
>  
> @@ -1402,6 +1443,9 @@ static void tcg_exec_all(void)
>  break;
>  }
>  } else if (cpu->stop || cpu->stopped) {
> +if (cpu->exit) {
> +next_cpu = CPU_NEXT(cpu);
> +}
>  break;
>  }
>  }
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index ec793b1..44405b2 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1910,7 +1910,19 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, 
> CPUState *cs)
>  
>  static void spapr_cpu_release(DeviceState *dev, void *opaque)
>  {
> -/* Release vCPU */
> +CPUState *cs;
> +int i;
> +int id = ppc_get_vcpu_dt_id(POWERPC_CPU(CPU(dev)));
> +
> +for (i = id; i < id + smp_threads; i++) {
> +CPU_FOREACH(cs) {
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +
> +if (i == ppc_get_vcpu_dt_id(cpu)) {
> +cpu_remove(cs);
> +}
> +}
> +}
>  }
>  
>  static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs)
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 2098f1c..30fd0cd 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -206,6 +206,7 @@ struct kvm_run;
>   * @halted: Nonzero if the CPU is in suspended state.
>   * @stop: Indicates a pending stop request.
>   * @stopped: Indicates the CPU has been artificially stopped.
> + * @exit: Indicates the CPU has exited due to an unplug operation.
>   * @tcg_exit_req: Set to force TCG to stop executing linked TBs for this
>   *   CPU and return to its top level loop.
>   * @singlestep_enabled: Flags for single-stepping.
> @@ -249,6 +250,7 @@ struct CPUState {
>  bool created;
>  bool stop;
>  bool stopped;
> +bool exit;
>  volatile sig_atomic_t exit_request;
>  uint32_t interrupt_request;
>  int singlestep_enabled;
> @@ -305,6 +307,7 @@ struct CPUState {
>  QTAILQ_HEAD(CPUTailQ, CPUState);
>  extern struct CPUTailQ cpus;
>  #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
> +#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
>  #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
>  #define CPU_FOREACH_SAFE(cpu, next_cpu) \
>

Re: [Qemu-devel] [RFC PATCH v1 04/13] spapr: Factor out CPU initialization code into realizefn

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:11AM +0530, Bharata B Rao wrote:
> Move some CPU initialization code from machine init function to
> CPU realizefn so that it can be used from CPU hotplug path too.
> 
> With the inclusion of ppc.h in translate_init.c, explicit *irq_init()
> function definitions aren't required, remove them.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c  | 29 +
>  include/hw/ppc/spapr.h  |  3 +++
>  target-ppc/translate_init.c | 43 ++-
>  3 files changed, 30 insertions(+), 45 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 779d364..f49b0fa 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -81,8 +81,6 @@
>  
>  #define MIN_RMA_SLOF128UL
>  
> -#define TIMEBASE_FREQ   51200ULL
> -
>  #define MAX_CPUS255
>  
>  #define PHANDLE_XICP0x
> @@ -971,7 +969,7 @@ static void ppc_spapr_reset(void)
>  
>  }
>  
> -static void spapr_cpu_reset(void *opaque)
> +void spapr_cpu_reset(void *opaque)
>  {
>  PowerPCCPU *cpu = opaque;
>  CPUState *cs = CPU(cpu);
> @@ -1387,7 +1385,6 @@ static void ppc_spapr_init(MachineState *machine)
>  const char *initrd_filename = machine->initrd_filename;
>  const char *boot_device = machine->boot_order;
>  PowerPCCPU *cpu;
> -CPUPPCState *env;
>  PCIHostState *phb;
>  int i;
>  MemoryRegion *sysmem = get_system_memory();
> @@ -1472,30 +1469,6 @@ static void ppc_spapr_init(MachineState *machine)
>  fprintf(stderr, "Unable to find PowerPC CPU definition\n");
>  exit(1);
>  }
> -env = &cpu->env;
> -
> -/* Set time-base frequency to 512 MHz */
> -cpu_ppc_tb_init(env, TIMEBASE_FREQ);
> -
> -/* PAPR always has exception vectors in RAM not ROM. To ensure this,
> - * MSR[IP] should never be set.
> - */
> -env->msr_mask &= ~(1 << 6);
> -
> -/* Tell KVM that we're in PAPR mode */
> -if (kvm_enabled()) {
> -kvmppc_set_papr(cpu);
> -}
> -
> -if (cpu->max_compat) {
> -if (ppc_set_compat(cpu, cpu->max_compat) < 0) {
> -exit(1);
> -}
> -}
> -
> -xics_cpu_setup(spapr->icp, cpu);
> -
> -qemu_register_reset(spapr_cpu_reset, cpu);
>  }
>  
>  /* allocate RAM */
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index b1a0838..831db6b 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -478,6 +478,8 @@ struct sPAPRTCETable {
>  QLIST_ENTRY(sPAPRTCETable) list;
>  };
>  
> +#define TIMEBASE_FREQ   51200ULL
> +
>  void spapr_events_init(sPAPREnvironment *spapr);
>  void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
>  int spapr_h_cas_compose_response(target_ulong addr, target_ulong size);
> @@ -494,5 +496,6 @@ int spapr_tcet_dma_dt(void *fdt, int node_off, const char 
> *propname,
>sPAPRTCETable *tcet);
>  void spapr_hotplug_req_add_event(sPAPRDRConnector *drc);
>  void spapr_hotplug_req_remove_event(sPAPRDRConnector *drc);
> +void spapr_cpu_reset(void *opaque);
>  
>  #endif /* !defined (__HW_SPAPR_H__) */
> diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
> index 72cc9d0..9c642a5 100644
> --- a/target-ppc/translate_init.c
> +++ b/target-ppc/translate_init.c
> @@ -30,29 +30,14 @@
>  #include "qemu/error-report.h"
>  #include "qapi/visitor.h"
>  #include "hw/qdev-properties.h"
> +#include "hw/ppc/spapr.h"
> +#include "hw/ppc/ppc.h"
>  
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
>  //#define PPC_DUMP_SPR_ACCESSES
>  /* #define USE_APPLE_GDB */
>  
> -/* For user-mode emulation, we don't emulate any IRQ controller */
> -#if defined(CONFIG_USER_ONLY)
> -#define PPC_IRQ_INIT_FN(name)
>  \
> -static inline void glue(glue(ppc, name),_irq_init) (CPUPPCState *env)
>  \
> -{
>  \
> -}
> -#else
> -#define PPC_IRQ_INIT_FN(name)
>  \
> -void glue(glue(ppc, name),_irq_init) (CPUPPCState *env);
> -#endif
> -
> -PPC_IRQ_INIT_FN(40x);
> -PPC_IRQ_INIT_FN(6xx);
> -PPC_IRQ_INIT_FN(970);
> -PPC_IRQ_INIT_FN(POWER7);
> -PPC_IRQ_INIT_FN(e500);
> -
>  /* Generic callbacks:
>   * do nothing but store/retrieve spr value
>   */
> @@ -8905,6 +8890,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  CPUState *cs = CPU(dev);
>  PowerPCCPU *cpu = POWERPC_CPU(dev);
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> +CPUPPCState *env = &cpu->env;
>  Error *local_err = NULL;
>  #if !defined(CONFIG_USER_ONLY)
>  int max_smt = kvm_enabled() ? kvmppc_smt_threads() : 1;
> @@ -8965,6 +8951,29 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  
>  qemu_ini

Re: [Qemu-devel] [RFC PATCH v1 02/13] spapr: Add DRC dt entries for CPUs

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:09AM +0530, Bharata B Rao wrote:
> Advertise CPU DR-capability to the guest via device tree.
> 
> Signed-off-by: Bharata B Rao 
> Signed-off-by: Michael Roth 
>[spapr_drc_reset implementation]

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp2veJ0f0QOL.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v1 01/13] spapr: enable PHB/CPU/LMB hotplug for pseries-2.3

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:08AM +0530, Bharata B Rao wrote:
> From: Michael Roth 
> 
> Introduce an sPAPRMachineClass sub-class of MachineClass to
> handle sPAPR-specific machine configuration properties.
> 
> The 'dr_phb[cpu,lmb]_enabled' field of that class can be set as
> part of machine-specific init code, and is then propagated
> to sPAPREnvironment to conditional enable creation of DRC
> objects and device-tree description to facilitate hotplug
> of PHBs/CPUs/LMBs.
> 
> Since we can't migrate this state to older machine types,
> default the option to false and only enable it for new
> machine types.
> 
> Signed-off-by: Michael Roth 
> Signed-off-by: Bharata B Rao 
>   [Added CPU and LMB bits]

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpLFltx5k5N5.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v1 06/13] spapr: CPU hotplug support

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:13AM +0530, Bharata B Rao wrote:
> Support CPU hotplug via device-add command. Use the exising EPOW event
> infrastructure to send CPU hotplug notification to the guest.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c  | 205 
> +++-
>  hw/ppc/spapr_events.c   |   8 +-
>  target-ppc/translate_init.c |   6 ++
>  3 files changed, 215 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 515d770..a293a59 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -330,6 +330,8 @@ static void add_str(GString *s, const gchar *s1)
>  g_string_append_len(s, s1, strlen(s1) + 1);
>  }
>  
> +uint32_t cpus_per_socket;
> +
>  static void *spapr_create_fdt_skel(hwaddr initrd_base,
> hwaddr initrd_size,
> hwaddr kernel_size,
> @@ -350,9 +352,9 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>  unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
>  QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
>  unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
> -uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>  char *buf;
>  
> +cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
>  add_str(hypertas, "hcall-pft");
>  add_str(hypertas, "hcall-term");
>  add_str(hypertas, "hcall-dabr");
> @@ -1744,12 +1746,209 @@ static void spapr_nmi(NMIState *n, int cpu_index, 
> Error **errp)
>  }
>  }
>  
> +/* TODO: Duplicates code from spapr_create_fdt_skel(), Fix this */

Uh, yeah, you should fix this.  I think you probably want to move the
filling out of the cpu dt information from create_fdt_skel() to
finalize_fdt(), then you should be able to use a common helper function.

> +static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
> +int drc_index)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +CPUPPCState *env = &cpu->env;
> +PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> +int index = ppc_get_vcpu_dt_id(cpu);
> +uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
> +   0x, 0x};
> +uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
> +uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 10;
> +uint32_t page_sizes_prop[64];
> +size_t page_sizes_prop_size;
> +int smpt = ppc_get_compat_smt_threads(cpu);
> +uint32_t servers_prop[smpt];
> +uint32_t gservers_prop[smpt * 2];
> +int i;
> +uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> +uint32_t associativity[] = {cpu_to_be32(0x5),
> +cpu_to_be32(0x0),
> +cpu_to_be32(0x0),
> +cpu_to_be32(0x0),
> +cpu_to_be32(cs->numa_node),
> +cpu_to_be32(index)};
> +
> +_FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
> +_FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
> +
> +_FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
> +_FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
> +env->dcache_line_size)));
> +_FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
> +env->dcache_line_size)));
> +_FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
> +env->icache_line_size)));
> +_FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
> +env->icache_line_size)));
> +
> +if (pcc->l1_dcache_size) {
> +_FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
> +pcc->l1_dcache_size)));
> +} else {
> +fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
> +}
> +if (pcc->l1_icache_size) {
> +_FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
> +pcc->l1_icache_size)));
> +} else {
> +fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
> +}
> +
> +_FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
> +_FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
> +_FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
> +_FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
> +_FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
> +
> +if (env->spr_cb[SPR_PURR].oea_read) {
> +_FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
> +}
> +
> +if (env->mmu_model & POWERPC_MMU_1TSEG) {
> +_FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
> +   segs, sizeof(segs;
> +}
> +
> +/* Advertise VMX/VSX (vector extensions) if available
> + *   0 / no property == no vector extensi

Re: [Qemu-devel] [RFC PATCH v1 05/13] spapr: Support ibm, lrdr-capacity device tree property

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:12AM +0530, Bharata B Rao wrote:
> Add support for ibm,lrdr-capacity since this is needed by the guest
> kernel to know about the possible hot-pluggable CPUs and Memory.
> 
> Define minimum hotpluggable memory size as 256MB and start storing maximum
> possible memory for the guest in sPAPREnvironment.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c |  3 ++-
>  hw/ppc/spapr_rtas.c| 28 ++--
>  include/hw/ppc/spapr.h |  6 --
>  3 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index f49b0fa..515d770 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -775,7 +775,7 @@ static void spapr_finalize_fdt(sPAPREnvironment *spapr,
>  }
>  
>  /* RTAS */
> -ret = spapr_rtas_device_tree_setup(fdt, rtas_addr, rtas_size);
> +ret = spapr_rtas_device_tree_setup(spapr, fdt, rtas_addr, rtas_size);
>  if (ret < 0) {
>  fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
>  }
> @@ -1473,6 +1473,7 @@ static void ppc_spapr_init(MachineState *machine)
>  
>  /* allocate RAM */
>  spapr->ram_limit = ram_size;
> +spapr->maxram_limit = machine->maxram_size;

There's now a bunch of duplication between sPAPREnvironment and
MachineState - it looks like we should really fold sPAPREnvironment
into a subclass of MachineState, but that's probably a project for
another day.

>  memory_region_allocate_system_memory(ram, NULL, "ppc_spapr.ram",
>   spapr->ram_limit);
>  memory_region_add_subregion(sysmem, 0, ram);
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index d847f45..e8a0f21 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -29,6 +29,7 @@
>  #include "sysemu/char.h"
>  #include "hw/qdev.h"
>  #include "sysemu/device_tree.h"
> +#include "sysemu/cpus.h"
>  
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
> @@ -551,11 +552,12 @@ void spapr_rtas_register(int token, const char *name, 
> spapr_rtas_fn fn)
>  rtas_table[token].fn = fn;
>  }
>  
> -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> - hwaddr rtas_size)
> +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> + hwaddr rtas_addr, hwaddr rtas_size)
>  {
>  int ret;
>  int i;
> +uint32_t lrdr_capacity[5];
>  
>  ret = fdt_add_mem_rsv(fdt, rtas_addr, rtas_size);
>  if (ret < 0) {
> @@ -604,6 +606,28 @@ int spapr_rtas_device_tree_setup(void *fdt, hwaddr 
> rtas_addr,
>  }
>  
>  }
> +
> +ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#address-cells", 0x2);
> +if (ret < 0) {
> +fprintf(stderr, "Couldn't add #address-cells rtas property\n");
> +}
> +
> +ret = qemu_fdt_setprop_cell(fdt, "/rtas", "#size-cells", 0x2);
> +if (ret < 0) {
> +fprintf(stderr, "Couldn't add #size-cells rtas property\n");
> +}

It's not clear what adding #address-cells and #size-cells has to do
with this, and these properties generally don't make sense on a node
without children.

> +lrdr_capacity[0] = cpu_to_be32(spapr->maxram_limit >> 32);
> +lrdr_capacity[1] = cpu_to_be32(spapr->maxram_limit & 0x);
> +lrdr_capacity[2] = 0;
> +lrdr_capacity[3] = cpu_to_be32(SPAPR_MIN_MEMORY_BLOCK_SIZE);
> +lrdr_capacity[4] = cpu_to_be32(max_cpus/smp_threads);
> +ret = qemu_fdt_setprop(fdt, "/rtas", "ibm,lrdr-capacity", lrdr_capacity,
> + sizeof(lrdr_capacity));
> +if (ret < 0) {
> +fprintf(stderr, "Couldn't add ibm,lrdr-capacity rtas property\n");
> +}
> +
>  return 0;
>  }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index 831db6b..ae8b4e1 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -18,6 +18,7 @@ typedef struct sPAPREnvironment {
>  XICSState *icp;
>  
>  hwaddr ram_limit;
> +hwaddr maxram_limit;
>  void *htab;
>  uint32_t htab_shift;
>  hwaddr rma_size;
> @@ -444,8 +445,8 @@ void spapr_rtas_register(int token, const char *name, 
> spapr_rtas_fn fn);
>  target_ulong spapr_rtas_call(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>   uint32_t token, uint32_t nargs, target_ulong 
> args,
>   uint32_t nret, target_ulong rets);
> -int spapr_rtas_device_tree_setup(void *fdt, hwaddr rtas_addr,
> - hwaddr rtas_size);
> +int spapr_rtas_device_tree_setup(sPAPREnvironment *spapr, void *fdt,
> + hwaddr rtas_addr, hwaddr rtas_size);
>  
>  #define SPAPR_TCE_PAGE_SHIFT   12
>  #define SPAPR_TCE_PAGE_SIZE(1ULL << SPAPR_TCE_PAGE_SHIFT)
> @@ -479,6 +480,7 @@ struct sPAPRTCETable {
>  };
>  
>  #define TIMEBASE_FREQ   51200ULL
> +#define SPAPR_MIN_MEMORY_BLOCK_SIZE (1 << 28) /* 256MB */
>  
>  void spapr_events_init(sPAPR

Re: [Qemu-devel] [RFC PATCH v1 09/13] spapr: CPU hot unplug support

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:16AM +0530, Bharata B Rao wrote:
> Support hot removal of CPU for sPAPR guests.
> 
> Signed-off-by: Bharata B Rao 
> ---
>  hw/ppc/spapr.c | 43 +++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 4347471..ec793b1 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1908,6 +1908,22 @@ static void spapr_cpu_hotplug_add(DeviceState *dev, 
> CPUState *cs)
>  drck->attach(drc, dev, fdt, offset, false);
>  }
>  
> +static void spapr_cpu_release(DeviceState *dev, void *opaque)
> +{
> +/* Release vCPU */

Um.. should this actually do something?

> +}
> +
> +static void spapr_cpu_hotplug_remove(DeviceState *dev, CPUState *cs)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +int id = ppc_get_vcpu_dt_id(cpu);
> +sPAPRDRConnector *drc =
> +spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> +sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> +
> +drck->detach(drc, dev, spapr_cpu_release, NULL);
> +}
> +
>  static void spapr_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>  Error **errp)
>  {
> @@ -1948,6 +1964,21 @@ static void spapr_cpu_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  return;
>  }
>  
> +static void spapr_cpu_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +Error **errp)
> +{
> +Error *local_err = NULL;

Unused variable.

> +CPUState *cs = CPU(dev);
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +sPAPRDRConnector *drc =
> +spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, 
> cpu->cpu_dt_id);
> +
> +spapr_cpu_hotplug_remove(dev, cs);
> +spapr_hotplug_req_remove_event(drc);
> +error_propagate(errp, local_err);
> +return;
> +}
> +
>  static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
>DeviceState *dev, Error **errp)
>  {
> @@ -1958,6 +1989,16 @@ static void spapr_machine_device_plug(HotplugHandler 
> *hotplug_dev,
>  }
>  }
>  
> +static void spapr_machine_device_unplug(HotplugHandler *hotplug_dev,
> +  DeviceState *dev, Error **errp)
> +{
> +if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> +if (dev->hotplugged) {
> +spapr_cpu_unplug(hotplug_dev, dev, errp);
> +}
> +}
> +}
> +
>  static HotplugHandler *spapr_get_hotpug_handler(MachineState *machine,
>   DeviceState *dev)
>  {
> @@ -1986,6 +2027,8 @@ static void spapr_machine_class_init(ObjectClass *oc, 
> void *data)
>  mc->get_hotplug_handler = spapr_get_hotpug_handler;
>  
>  hc->plug = spapr_machine_device_plug;
> +hc->unplug = spapr_machine_device_unplug;
> +
>  smc->dr_phb_enabled = false;
>  smc->dr_cpu_enabled = false;
>  smc->dr_lmb_enabled = false;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpdzCwcwgLxd.pgp
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v1 03/13] spapr: Consider max_cpus during xics initialization

2015-01-28 Thread David Gibson

On Thu, Jan 08, 2015 at 11:40:10AM +0530, Bharata B Rao wrote:
> Use max_cpus instead of smp_cpus when intializating xics system. Also
> report max_cpus in ibm,interrupt-server-ranges device tree property of
> interrupt controller node.
> 
> Signed-off-by: Bharata B Rao 

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgp77wi1wP5IL.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH] linux-user/syscall.c: Let all lock_user_struct() and unlock_user_struct() paired with each other

2015-01-28 Thread Chen Gang S

On 1/29/15 06:36, Peter Maydell wrote:
> On 28 January 2015 at 22:09, Chen Gang S  wrote:
>>  - Is what I said above really correct (e.g. is linux-user really mainly
>>for cpu emulation)?.
> 
> Not really. linux-user is mainly for running single Linux binaries.
> It has a secondary use for running gcc test binaries which think
> they are "bare metal" but actually use some kind of semihosting API.
> (You should check whether tile has one of those.)
> 
> As well as linux-user mode, QEMU has system emulation mode, where
> we emulate a complete machine.
> 
> Both modes need CPU emulation.
> 

OK, thanks.

For coding and test, is linux-user a good starting position for me? (I
guess it is).


Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH v5 2/2] Xen: Use the ioreq-server API when available

2015-01-28 Thread Don Slutz



On 01/28/15 19:05, Don Slutz wrote:
> On 01/28/15 14:32, Don Slutz wrote:
>> On 12/05/14 05:50, Paul Durrant wrote:
>>> The ioreq-server API added to Xen 4.5 offers better security than
>>> the existing Xen/QEMU interface because the shared pages that are
>>> used to pass emulation request/results back and forth are removed
>>> from the guest's memory space before any requests are serviced.
>>> This prevents the guest from mapping these pages (they are in a
>>> well known location) and attempting to attack QEMU by synthesizing
>>> its own request structures. Hence, this patch modifies configure
>>> to detect whether the API is available, and adds the necessary
>>> code to use the API if it is.
>>
>> This patch (which is now on xenbits qemu staging) is causing me
>> issues.
>>
> 
> I have found the key.
> 
> The following will reproduce my issue:
> 
> 1) xl create -p 
> 2) read one of HVM_PARAM_IOREQ_PFN, HVM_PARAM_BUFIOREQ_PFN, or
>HVM_PARAM_BUFIOREQ_EVTCHN
> 3) xl unpause new guest
> 
> The guest will hang in hvmloader.
> 
> More in thread:
> 
> 
> Subject: Re: [Xen-devel] [PATCH] ioreq-server: handle
> IOREQ_TYPE_PCI_CONFIG in assist function
> References: <1422385589-17316-1-git-send-email-wei.l...@citrix.com>
> 
> 

Opps, That thread does not make sense to include what I have found.

Here is the info I was going to send there:


Using QEMU upstream master (or xenbits qemu staging), you do not have a
default ioreq server.  And so hvm_select_ioreq_server() returns NULL for
hvmloader's iorequest to:

CPU4  0 (+   0)  HANDLE_PIO [ port = 0x0cfe size = 2 dir = 1 ]

(I added this xentrace to figure out what is happening, and I have
a lot of data about it, if any one wants it.)

To get a guest hang instead of calling hvm_complete_assist_req()
for some of hvmloader's pci_read() calls, you can do the following:


1) xl create -p 
2) read one of HVM_PARAM_IOREQ_PFN, HVM_PARAM_BUFIOREQ_PFN, or
   HVM_PARAM_BUFIOREQ_EVTCHN
3) xl unpause new guest

The guest will hang in hvmloader.

The read of HVM_PARAM_IOREQ_PFN will cause a default ioreq server to
be created and directed to the QEMU upsteam that is not a default
ioreq server.  This read also creates the extra event channels that
I see.

I see that hvmop_create_ioreq_server() prevents you from creating
an is_default ioreq_server, so QEMU is not able to do.

Not sure where we go from here.

   -Don Slutz


> -Don Slutz
> 
> 
>> So far I have tracked it back to hvm_select_ioreq_server()
>> which selects the "default_ioreq_server".  Since I have one 1
>> QEMU, it is both the "default_ioreq_server" and an enabled
>> 2nd ioreq_server.  I am continuing to understand why my changes
>> are causing this.  More below.
>>
>> This patch causes QEMU to only call xc_evtchn_bind_interdomain()
>> for the enabled 2nd ioreq_server.  So when (if)
>> hvm_select_ioreq_server() selects the "default_ioreq_server", the
>> guest hangs on an I/O.
>>
>> Using the debug key 'e':
>>
>> (XEN) [2015-01-28 18:57:07] 'e' pressed -> dumping event-channel info
>> (XEN) [2015-01-28 18:57:07] Event channel information for domain 0:
>> (XEN) [2015-01-28 18:57:07] Polling vCPUs: {}
>> (XEN) [2015-01-28 18:57:07] port [p/m/s]
>> (XEN) [2015-01-28 18:57:07]1 [0/0/0]: s=5 n=0 x=0 v=0
>> (XEN) [2015-01-28 18:57:07]2 [0/0/0]: s=6 n=0 x=0
>> (XEN) [2015-01-28 18:57:07]3 [0/0/0]: s=6 n=0 x=0
>> (XEN) [2015-01-28 18:57:07]4 [0/0/0]: s=5 n=0 x=0 v=1
>> (XEN) [2015-01-28 18:57:07]5 [0/0/0]: s=6 n=0 x=0
>> (XEN) [2015-01-28 18:57:07]6 [0/0/0]: s=6 n=0 x=0
>> (XEN) [2015-01-28 18:57:07]7 [0/0/0]: s=5 n=1 x=0 v=0
>> (XEN) [2015-01-28 18:57:07]8 [0/0/0]: s=6 n=1 x=0
>> (XEN) [2015-01-28 18:57:07]9 [0/0/0]: s=6 n=1 x=0
>> (XEN) [2015-01-28 18:57:07]   10 [0/0/0]: s=5 n=1 x=0 v=1
>> (XEN) [2015-01-28 18:57:07]   11 [0/0/0]: s=6 n=1 x=0
>> (XEN) [2015-01-28 18:57:07]   12 [0/0/0]: s=6 n=1 x=0
>> (XEN) [2015-01-28 18:57:07]   13 [0/0/0]: s=5 n=2 x=0 v=0
>> (XEN) [2015-01-28 18:57:07]   14 [0/0/0]: s=6 n=2 x=0
>> (XEN) [2015-01-28 18:57:07]   15 [0/0/0]: s=6 n=2 x=0
>> (XEN) [2015-01-28 18:57:07]   16 [0/0/0]: s=5 n=2 x=0 v=1
>> (XEN) [2015-01-28 18:57:07]   17 [0/0/0]: s=6 n=2 x=0
>> (XEN) [2015-01-28 18:57:07]   18 [0/0/0]: s=6 n=2 x=0
>> (XEN) [2015-01-28 18:57:07]   19 [0/0/0]: s=5 n=3 x=0 v=0
>> (XEN) [2015-01-28 18:57:07]   20 [0/0/0]: s=6 n=3 x=0
>> (XEN) [2015-01-28 18:57:07]   21 [0/0/0]: s=6 n=3 x=0
>> (XEN) [2015-01-28 18:57:07]   22 [0/0/0]: s=5 n=3 x=0 v=1
>> (XEN) [2015-01-28 18:57:07]   23 [0/0/0]: s=6 n=3 x=0
>> (XEN) [2015-01-28 18:57:07]   24 [0/0/0]: s=6 n=3 x=0
>> (XEN) [2015-01-28 18:57:07]   25 [0/0/0]: s=5 n=4 x=0 v=0
>> (XEN) [2015-01-28 18:57:07]   26 [0/0/0]: s=6 n=4 x=0
>> (XEN) [2015-01-28 18:57:07]   27 [0/0/0]: s=6 n=4 x=0
>> (XEN) [2015-01-28 18:57:07]   28 [0/0/0]: s=5 n=4 x=0 v=1
>> (XEN) [2015-01-28 18:57:07]   29 [0/0/

Re: [Qemu-devel] [PATCH v6 5/5] qemu-iotests: Add 093 for IO throttling

2015-01-28 Thread Fam Zheng

On Wed, 01/28 11:22, Max Reitz wrote:
> On 2015-01-27 at 21:28, Fam Zheng wrote:
> >This case utilizes qemu-io command "aio_{read,write} -q" to verify the
> >effectiveness of IO throttling options.
> >
> >It's implemented by driving the vm timer from qtest protocol, so the
> >throttling timers are signaled with determinied time duration. Then we
> >verify the completed IO requests are within 10% error of bps and iops
> >limits.
> >
> >"null" protocol is used as the disk backend so that no actual disk IO is
> >performed on host, this will make the blockstats much more
> >deterministic. Both "null-aio" and "null-co" are covered, which is also
> >a simple cross validation test for the driver code.
> >
> >Signed-off-by: Fam Zheng 
> >---
> >  tests/qemu-iotests/093 | 120 
> > +
> >  tests/qemu-iotests/093.out |   5 ++
> >  tests/qemu-iotests/group   |   1 +
> >  3 files changed, 126 insertions(+)
> >  create mode 100755 tests/qemu-iotests/093
> >  create mode 100644 tests/qemu-iotests/093.out
> >
> >diff --git a/tests/qemu-iotests/093 b/tests/qemu-iotests/093
> >new file mode 100755
> >index 000..2866536
> >--- /dev/null
> >+++ b/tests/qemu-iotests/093
> >@@ -0,0 +1,120 @@
> >+#!/usr/bin/env python
> >+#
> >+# Tests for IO throttling
> >+#
> >+# Copyright (C) 2015 Red Hat, Inc.
> >+#
> >+# This program is free software; you can redistribute it and/or modify
> >+# it under the terms of the GNU General Public License as published by
> >+# the Free Software Foundation; either version 2 of the License, or
> >+# (at your option) any later version.
> >+#
> >+# This program is distributed in the hope that it will be useful,
> >+# but WITHOUT ANY WARRANTY; without even the implied warranty of
> >+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >+# GNU General Public License for more details.
> >+#
> >+# You should have received a copy of the GNU General Public License
> >+# along with this program.  If not, see .
> >+#
> >+
> >+import iotests
> >+
> >+class ThrottleTestCase(iotests.QMPTestCase):
> >+test_img = "null-aio://"
> >+
> >+def blockstats(self, device):
> >+result = self.vm.qmp("query-blockstats")
> >+for r in result['return']:
> >+if r['device'] == device:
> >+stat = r['stats']
> >+return stat['rd_bytes'], stat['rd_operations'], 
> >stat['wr_bytes'], stat['wr_operations']
> >+raise Exception("Device not found for blockstats: %s" % device)
> >+
> >+def setUp(self):
> >+self.vm = iotests.VM().add_drive(self.test_img)
> >+self.vm.launch()
> >+
> >+def tearDown(self):
> >+self.vm.shutdown()
> >+
> >+def do_test_throttle(self, seconds, params):
> >+def check_limit(limit, num):
> >+# IO throttling algorithm is discrete, allow 10% error so the 
> >test
> >+# is more robust
> >+return limit == 0 or \
> >+   (num < seconds * limit * 1.1
> >+   and num > seconds * limit * 0.9)
> >+
> >+nsec_per_sec = 10
> >+
> >+params['device'] = 'drive0'
> >+
> >+result = self.vm.qmp("block_set_io_throttle", conv_keys=False, 
> >**params)
> >+self.assert_qmp(result, 'return', {})
> >+
> >+# Set vm clock to a known value
> >+ns = seconds * nsec_per_sec
> >+self.vm.qtest("clock_step %d" % ns)
> >+
> >+# Submit enough requests. They will drain bps_max and iops_max, but 
> >the
> >+# rest requests won't get executed until we advance the virtual 
> >clock
> >+# with qtest interface
> >+rq_size = 512
> >+rd_nr = max(params['bps'] / rq_size / 2,
> >+params['bps_rd'] / rq_size,
> >+params['iops'] / 2,
> >+params['iops_rd']) + \
> >+params['bps_max'] / rq_size / 2 + \
> >+params['iops_max']
> 
> I guess the divisions by two are because those values represent read and
> write operations combined. Shouldn't iops_max be divided by two, too, then?
> 
> >+rd_nr *= seconds * 2
> >+wr_nr = max(params['bps'] / rq_size / 2,
> >+params['bps_wr'] / rq_size,
> >+params['iops'] / 2,
> >+params['iops_wr']) + \
> >+params['bps_max'] / rq_size / 2 + \
> >+params['iops_max']
> >+wr_nr *= seconds * 2
> >+for i in range(rd_nr):
> >+self.vm.hmp_qemu_io("drive0", "aio_read %d %d" % (i * rq_size, 
> >rq_size))
> >+for i in range(wr_nr):
> >+self.vm.hmp_qemu_io("drive0", "aio_write %d %d" % (i * rq_size, 
> >rq_size))
> >+
> >+start_rd_bytes, start_rd_iops, start_wr_bytes, start_wr_iops = 
> >self.blockstats('drive0')
> >+
> >+self.vm.qtest("clock_step %d" % ns)
> >+end_rd_bytes, end_rd_iops, end_wr_bytes, end_wr

Re: [Qemu-devel] [PATCH v2] spapr_vio/spapr_iommu: Move VIO bypass where it belongs

2015-01-28 Thread Alexey Kardashevskiy

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 01/29/2015 11:31 AM, David Gibson wrote:
> On Wed, Jan 28, 2015 at 07:49:48PM +1100, Alexey Kardashevskiy wrote:
>> On 01/28/2015 08:21 AM, Alexey Kardashevskiy wrote:
>>> On 01/27/2015 05:13 PM, Alexey Kardashevskiy wrote:
> [snip]
 diff --git a/include/hw/ppc/spapr_vio.h
 b/include/hw/ppc/spapr_vio.h index 46edc2a..6ad55d1 100644 ---
 a/include/hw/ppc/spapr_vio.h +++ b/include/hw/ppc/spapr_vio.h @@
 -64,6 +64,8 @@ struct VIOsPAPRDevice { target_ulong
 signal_state; VIOsPAPR_CRQ crq; AddressSpace as; +
 MemoryRegion mrroot; +MemoryRegion mrbypass; sPAPRTCETable
 *tcet; };
 
 
>>> 
>>> 
>> 
>> 
>> I believe doing something like this is way too disguising because
>> of tobj->parent?
> 
> It's kinda ugly, but it's the best way I can think of.



Better than having VIOsPAPRDevice pointer in sPAPRTCETable? I would
expect a object_get_parent() helper to exist but it is not there and I
believe this is for a reason (may be it will be get rid of some time
later)...


> So, I think this is a reasonable approach, but it does need a comment 
> saying why the hack is necessary.
> 
>> 
>> 
>> 
>> --- a/hw/ppc/spapr_iommu.c +++ b/hw/ppc/spapr_iommu.c @@ -25,6 +25,7
>> @@ #include "trace.h"
>> 
>> #include "hw/ppc/spapr.h" +#include "hw/ppc/spapr_vio.h"
>> 
>> #include 
>> 
>> @@ -88,10 +89,26 @@ static IOMMUTLBEntry 
>> spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr, return
>> ret; }
>> 
>> +static int spapr_tce_table_post_load(void *opaque, int version_id) 
>> +{ +sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque); +Object
>> *tobj = OBJECT(tcet); +Object *vobj =
>> object_dynamic_cast(tobj->parent, TYPE_VIO_SPAPR_DEVICE); + +if
>> (!vobj) { +return 0; +} + +
>> spapr_vio_set_bypass((VIOsPAPRDevice *) vobj, tcet->bypass); + +
>> return 0; +} + static const VMStateDescription
>> vmstate_spapr_tce_table = { .name = "spapr_iommu", .version_id = 2, 
>> .minimum_version_id = 2, +.post_load =
>> spapr_tce_table_post_load, .fields  = (VMStateField []) { /*
>> Sanity check */ VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable),
>> 
>> 
>> 
> 


- -- 
Alexey
-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iQIcBAEBAgAGBQJUyYNtAAoJEIYTPdgrwSC54t0P/0SGpr+UX7qlwKKs89bwVc//
dzUS4wPXeH7zYkLughLprEh3uINY9x7z8euqSJg1wPoh54F6QX9PDb8ipKavie+a
NlxDeqVNh+eWLR/rI1OxMbg/PcQSnqCN13EHJLiz69xkdEZsNReLKUFueMrk4Grl
H4l3+GDVx6HbBSTWawOmcpN4Bcz0/nRgsYf6RytIk9ZP60zplbNkrCt3jwPV87O3
/+g73eBvpusiY+8NIEPbzAEssMcTj+JDYB5Dy5JlsOYewQFyXiiXaVSdRnNIPB4o
tSaD93LYvkNEYXACJ7awsfWWx45tvWjt0GtO0YUU2vGD/cAWjsW8Q76f1DjNZdvD
wZwhJF0cieaRnlrNMlCE199DyxcKDqDsENEnRUsySE8qq1uHGXo1dlyn+rA3Spc7
jEfwxkmtpHf4l9JAB01hGmq37eYPPKS9uyjXwqtKZuDea0ObVEtsmXiSZnOho82D
+jJzr1KUDUMFcWw7ZWxXJ4I1wCt0u3krRWtO4vD0RXnjWgKUM8MbuNGFBFUt1qtF
8M1KTrm06jW4w0h19nBSP2ed7j2ObE2KJebE7I/tl//gxgQo4PmR/cwZv8EXpHCC
KNz6SDvUSboJ9i02Gxkk7GWKYPbGthQkrICxTfVwKGyms4O58/KiH8T/Koey/BLJ
92imwmKo3VtddbQfe5Ay
=dlFh
-END PGP SIGNATURE-

Re: [Qemu-devel] [PATCH 2/2] hw/ppc/spapr Add qemu_register_boot_set for SPAPR

2015-01-28 Thread Alexander Graf



On 29.01.15 01:40, Gonglei wrote:
> On 2015/1/26 17:35, Alexander Graf wrote:
> 
>> On 01/26/2015 11:49 AM, Dinar Valeev wrote:
>>> On 01/24/2015 12:04 AM, Alexander Graf wrote:


 On 23.01.15 23:51, dval...@suse.de wrote:
> From: Dinar Valeev 
>
> In order to have -boot once=d functioning, it is required to have
> qemu_register_boot_set
>
> qemu-system-ppc64 -enable-kvm -boot once=d
>
> Ready!
> 0 > dev /chosen   ok
> 0 > .properties
> ...
> qemu,boot-device d
> ...
> 0 > reset-all
>
> Ready!
> 0 > dev /chosen   ok
> 0 > .properties
> ...
> qemu,boot-device cdn
> ...
>
> Signed-off-by: Dinar Valeev 
> ---
>   hw/ppc/spapr.c | 12 
>   1 file changed, 12 insertions(+)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 3d2cfa3..38b03fc 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -314,6 +314,16 @@ static void add_str(GString *s, const gchar *s1)
>   g_string_append_len(s, s1, strlen(s1) + 1);
>   }
>
> +static void spapr_boot_set(void *opaque, const char *boot_device,
> +   Error **errp)
> +{
> +int offset;
> +offset = fdt_path_offset(opaque, "/chosen");
> +fdt_setprop_string(opaque, offset, "qemu,boot-device", boot_device);
> +
> +}
> +
> +
>   static void *spapr_create_fdt_skel(hwaddr initrd_base,
>  hwaddr initrd_size,
>  hwaddr kernel_size,
> @@ -414,6 +424,8 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
>   if (boot_device) {
>   _FDT((fdt_property_string(fdt, "qemu,boot-device", 
> boot_device)));
>   }
> +qemu_register_boot_set(spapr_boot_set, fdt);

 If you simply move the code above (the _FDT() one) from create_fdt_skel
 to spapr_finalize_fdt() you should have the same net effect and much
 cleaner code :).
>>> I've tried your proposal, on reset boot-device property stays "d"
>>
>> Ugh, the machine field doesn't change on reset. I think it'd be a lot more 
>> intuitive if it did. Can you try with the patch below applied as well?
>>
> 
> This approach is not good because boot_set_handler is NULL and return error 
> directly.
> Please using qemu_register_boot_set for this purpose.

I'd personally prefer if we get rid of qemu_register_boot_set
completely. It duplicates the reset logic as well as information holder
locality (machine struct vs parameter).


Alex

Re: [Qemu-devel] [PATCH 2/2] bootdevice: update boot_order in MachineState

2015-01-28 Thread Gonglei

On 2015/1/29 6:22, Dinar Valeev wrote:

> On 01/28/2015 02:48 AM, Gonglei wrote:
>> On 2015/1/27 18:49, Dinar Valeev wrote:
>>
>>> On 01/27/2015 10:18 AM, Gonglei wrote:
 On 2015/1/27 16:57, Dinar Valeev wrote:

> On 01/27/2015 03:51 AM, Gonglei wrote:
>> On 2015/1/27 7:52, dval...@suse.de wrote:
>>
>>> From: Dinar Valeev 
>>>
>>> on sPAPR we need to update boot_order in MachineState in case it
>>> got changed on reset.
>>>
>>> Signed-off-by: Dinar Valeev 
>>> ---
>>>  bootdevice.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/bootdevice.c b/bootdevice.c
>>> index 5914417..4f11a06 100644
>>> --- a/bootdevice.c
>>> +++ b/bootdevice.c
>>> @@ -26,6 +26,7 @@
>>>  #include "qapi/visitor.h"
>>>  #include "qemu/error-report.h"
>>>  #include "hw/hw.h"
>>> +#include "hw/boards.h"
>>>  
>>>  typedef struct FWBootEntry FWBootEntry;
>>>  
>>> @@ -50,6 +51,8 @@ void qemu_register_boot_set(QEMUBootSetHandler *func, 
>>> void *opaque)
>>>  void qemu_boot_set(const char *boot_order, Error **errp)
>>>  {
>>>  Error *local_err = NULL;
>>> +MachineState *machine = MACHINE(qdev_get_machine());
>>> +machine->boot_order = boot_order;
>>>  
>>>  if (!boot_set_handler) {
>>>  error_setg(errp, "no function defined to set boot device list 
>>> for"
>>
>> Have you registered boot set handler on ppc/sPAPR platform by calling
>> qemu_register_boot_set()? Otherwise qemu_boot_set function
>>  will return error.
> No, I set boot_order on each machine reset. My tests are showing it works 
> without an error.

 That's interesting. Does this function be called?
>>> Yes, then simply returns.
 Would you debug it by setting a breakpoint ?
>>> I added a trace event.
>>>  if (!boot_set_handler) {
>>> +trace_qemu_boot_set(boot_order);
>>>  error_setg(errp, "no function defined to set boot device list for"
>>>   " this architecture");
>>>  return;
>>>
>>> And I see this now in qemu's monitor. Still I don't see error message.
>>
>> That's because NULL is passed to this function in restore_boot_order()
>> the error is ignored (commit f183993). I have seen the previous conversation
>> about your patch serials. And I think this is the reason which
>> you moved machine->boot_order = boot_order before
>> checking boot_set_handler variable based on Alexander's
>> suggestion, right? But I think this is not a good idea.
> Yes
> 
> Any proposal how this can be done differently?
> 

I have replied your previous thread. :)

Regards,
-Gonglei

> It seems I'm almost alone who wants -boot once=X option to get fixed for 
> sPAPR. We use it in test automation, and we need to be sure that we boot from 
> hard disk once installation is done.
> 
> Thanks,
> Dinar
>>
>> Regards,
>> -Gonglei
>>
>>
>

Re: [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough

2015-01-28 Thread Chen, Tiejun


On 2015/1/28 19:12, Wei Liu wrote:

On Wed, Jan 28, 2015 at 08:42:56AM +0800, Chen, Tiejun wrote:

On 2015/1/27 22:40, Ian Jackson wrote:

Chen, Tiejun writes ("Re: [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine 
property to support IGD GFX passthrough"):

On 2015/1/23 8:43, Chen, Tiejun wrote:

On 2015/1/22 8:51, Chen, Tiejun wrote:

At this point I just concern here if we still use 'gfx_passthrou', at
least it may look like a backward compatibility with older versions of
qemu in Xen side, qemu-xen-traditional. But I'd like to follow your
final option.

...

Any feedback to this option I should follow here?


Ping...


I think this is a question that qemu upstream should answer.



Actually this is just commented by Gerd in qemu upstream. So now looks in
Xen side you guys don't have any objection to use 'igd-passthru' as well. If
yes, I'm fine to adopt this.



Yes, we would like to stay in line with upstream.


Wei,

Thanks for your response.



Just remember to handle old option in libxl if your old option is already
released by some older version of QEMUs.


I just drop that old option, -gfx_passthru, if we're under qemu upstream 
circumstance, like this,


--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -318,7 +318,10 @@ static char ** 
libxl__build_device_model_args_old(libxl__gc *gc,

 flexarray_vappend(dm_args, "-net", "none", NULL);
 }
 if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
-flexarray_append(dm_args, "-gfx_passthru");
+if (b_info->device_model_version !=
+LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
+flexarray_append(dm_args, "-gfx_passthru");
+}
 }
 } else {
 if (!sdl && !vnc)
@@ -702,7 +705,10 @@ static char ** 
libxl__build_device_model_args_new(libxl__gc *gc,

 flexarray_append(dm_args, "none");
 }
 if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
-flexarray_append(dm_args, "-gfx_passthru");
+if (b_info->device_model_version !=
+LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
+flexarray_append(dm_args, "-gfx_passthru");
+}
 }
 } else {
 if (!sdl && !vnc) {

Is this good enough?

Thanks
Tiejun

Re: [Qemu-devel] [PATCH 2/2] hw/ppc/spapr Add qemu_register_boot_set for SPAPR

2015-01-28 Thread Gonglei

On 2015/1/26 17:35, Alexander Graf wrote:

> On 01/26/2015 11:49 AM, Dinar Valeev wrote:
>> On 01/24/2015 12:04 AM, Alexander Graf wrote:
>>>
>>>
>>> On 23.01.15 23:51, dval...@suse.de wrote:
 From: Dinar Valeev 

 In order to have -boot once=d functioning, it is required to have
 qemu_register_boot_set

 qemu-system-ppc64 -enable-kvm -boot once=d

 Ready!
 0 > dev /chosen   ok
 0 > .properties
 ...
 qemu,boot-device d
 ...
 0 > reset-all

 Ready!
 0 > dev /chosen   ok
 0 > .properties
 ...
 qemu,boot-device cdn
 ...

 Signed-off-by: Dinar Valeev 
 ---
   hw/ppc/spapr.c | 12 
   1 file changed, 12 insertions(+)

 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index 3d2cfa3..38b03fc 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -314,6 +314,16 @@ static void add_str(GString *s, const gchar *s1)
   g_string_append_len(s, s1, strlen(s1) + 1);
   }

 +static void spapr_boot_set(void *opaque, const char *boot_device,
 +   Error **errp)
 +{
 +int offset;
 +offset = fdt_path_offset(opaque, "/chosen");
 +fdt_setprop_string(opaque, offset, "qemu,boot-device", boot_device);
 +
 +}
 +
 +
   static void *spapr_create_fdt_skel(hwaddr initrd_base,
  hwaddr initrd_size,
  hwaddr kernel_size,
 @@ -414,6 +424,8 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
   if (boot_device) {
   _FDT((fdt_property_string(fdt, "qemu,boot-device", 
 boot_device)));
   }
 +qemu_register_boot_set(spapr_boot_set, fdt);
>>>
>>> If you simply move the code above (the _FDT() one) from create_fdt_skel
>>> to spapr_finalize_fdt() you should have the same net effect and much
>>> cleaner code :).
>> I've tried your proposal, on reset boot-device property stays "d"
> 
> Ugh, the machine field doesn't change on reset. I think it'd be a lot more 
> intuitive if it did. Can you try with the patch below applied as well?
> 

This approach is not good because boot_set_handler is NULL and return error 
directly.
Please using qemu_register_boot_set for this purpose.

Regards,
-Gonglei

> 
> diff --git a/bootdevice.c b/bootdevice.c
> index 5914417..3b750ff 100644
> --- a/bootdevice.c
> +++ b/bootdevice.c
> @@ -50,6 +50,7 @@ void qemu_register_boot_set(QEMUBootSetHandler *func, void 
> *opaque)
>  void qemu_boot_set(const char *boot_order, Error **errp)
>  {
>  Error *local_err = NULL;
> +MachineState *machine = MACHINE(qdev_get_machine());
> 
>  if (!boot_set_handler) {
>  error_setg(errp, "no function defined to set boot device list for"
> @@ -63,6 +64,7 @@ void qemu_boot_set(const char *boot_order, Error **errp)
>  return;
>  }
> 
> +machine->boot_order = boot_order;
>  boot_set_handler(boot_set_opaque, boot_order, errp);
>  }
> 
> 
> 
> Thanks,
> 
> Alex
> 
>

Re: [Qemu-devel] [PATCH] pseries: Limit PCI host bridge "index" value

2015-01-28 Thread David Gibson

Ping agraf,

Any word on merging this?

On Wed, Jan 14, 2015 at 01:33:39PM +1100, David Gibson wrote:
> pseries guests can have large numbers of PCI host bridges.  To avoid the
> user having to specify a number of different configuration values for every
> one, the device supports an "index" property which is a shorthand setting
> the various window and configuration addresses from a predefined sensible
> set.
> 
> There are some problems with the details at present:
>   * The "index" propery is signed, but negative values will create PCI
> windows below where we expect, potentially colliding with other devices
>   * No limit is imposed on the "index" property and large values can
> translate to extremely large window addresses.  With PCI passthrough in
> particular this can mean we exceed various mapping and physical address
> limits causing the guest host bridge to not work in strange ways.
> 
> This patch addresses this, by making "index" unsigned, and imposing a
> limit.  Currently the limit allows indices from 0..255 which is probably
> enough host bridges for the time being.  It's fairly easy to extend if
> we discover we need more.
> 
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr_pci.c  | 8 +++-
>  include/hw/pci-host/spapr.h | 4 +++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 21b95b3..6deeb19 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -501,6 +501,12 @@ static void spapr_phb_realize(DeviceState *dev, Error 
> **errp)
>  return;
>  }
>  
> +if (sphb->index > SPAPR_PCI_MAX_INDEX) {
> +error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)",
> +   SPAPR_PCI_MAX_INDEX);
> +return;
> +}
> +
>  sphb->buid = SPAPR_PCI_BASE_BUID + sphb->index;
>  sphb->dma_liobn = SPAPR_PCI_BASE_LIOBN + sphb->index;
>  
> @@ -669,7 +675,7 @@ static void spapr_phb_reset(DeviceState *qdev)
>  }
>  
>  static Property spapr_phb_properties[] = {
> -DEFINE_PROP_INT32("index", sPAPRPHBState, index, -1),
> +DEFINE_PROP_UINT32("index", sPAPRPHBState, index, -1),
>  DEFINE_PROP_UINT64("buid", sPAPRPHBState, buid, -1),
>  DEFINE_PROP_UINT32("liobn", sPAPRPHBState, dma_liobn, -1),
>  DEFINE_PROP_UINT64("mem_win_addr", sPAPRPHBState, mem_win_addr, -1),
> diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
> index 4ea2a0d..876ecf0 100644
> --- a/include/hw/pci-host/spapr.h
> +++ b/include/hw/pci-host/spapr.h
> @@ -64,7 +64,7 @@ typedef struct spapr_pci_msi_mig {
>  struct sPAPRPHBState {
>  PCIHostState parent_obj;
>  
> -int32_t index;
> +uint32_t index;
>  uint64_t buid;
>  char *dtbusname;
>  
> @@ -94,6 +94,8 @@ struct sPAPRPHBVFIOState {
>  int32_t iommugroupid;
>  };
>  
> +#define SPAPR_PCI_MAX_INDEX  255
> +
>  #define SPAPR_PCI_BASE_BUID  0x8002000ULL
>  
>  #define SPAPR_PCI_WINDOW_BASE0x100ULL

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpIMjuEdXsNm.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 0/6] Trivial cleanups around g_malloc()

2015-01-28 Thread Gonglei

On 2015/1/28 19:16, Markus Armbruster wrote:

> I screwed up cc: qemu-trivial, my apologies.
> 
> Markus Armbruster  writes:
> 
>> I'm routing these patches through qemu-trivial, even though some of
>> them touch actively maintained code, and could go through the
>> respective tree instead:
>>
>> * PATCH 1 block (Kevin, Stefan)
>>
>> * PATCH 3 KVM (Paolo)
>>
>> * PATCH 4 migration (Juan, Amit)
>>
>> * PATCH 5 VNC (Gerd)
>>
>> If you want me to reroute any of them, let me know.
>>
>> Markus Armbruster (6):
>>   onenand: g_malloc() can't fail, bury dead error handling
>>   rtl8139: g_malloc() can't fail, bury dead error handling
>>   kvm: g_malloc() can't fail, bury dead error handling
>>   rdma: g_malloc0() can't fail, bury dead error handling
>>   vnc: g_realloc() can't fail, bury dead error handling
>>   translate-all: Use g_try_malloc() for dynamic translator buffer
>>
>>  hw/block/onenand.c |  8 +---
>>  hw/net/rtl8139.c   | 14 --
>>  kvm-all.c  |  4 
>>  migration/rdma.c   |  3 ---
>>  translate-all.c|  2 +-
>>  ui/vnc.c   |  4 
>>  6 files changed, 2 insertions(+), 33 deletions(-)
> 

For serials:
Reviewed-by: Gonglei 

Regards,
-Gonglei

Re: [Qemu-devel] [PATCH v2] spapr_vio/spapr_iommu: Move VIO bypass where it belongs

2015-01-28 Thread David Gibson

On Tue, Jan 27, 2015 at 05:13:32PM +1100, Alexey Kardashevskiy wrote:
> Instead of tweaking a TCE table device by adding there a bypass flag,
> let's add an alias to RAM and IOMMU memory region, and enable/disable
> those according to the selected bypass mode.
> This way IOMMU memory region can have size of the actual window rather
> than ram_size which is essential for upcoming DDW support.
> 
> This moves bypass logic to VIO layer and keeps @bypass flag in TCE table
> for migration compatibility only. This replaces spapr_tce_set_bypass()
> calls with explicit assignment to avoid confusion as the function could
> do something more that just syncing the @bypass flag.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> Changes:
> v2:
> * kept @bypass in sPAPRTCETable not to break migration

So, the bypass field definition in the struct should probably get a
comment explaining how it's only used for migration compatibility now.

Apart from that:

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpRUMIaPJkTu.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v2] spapr_vio/spapr_iommu: Move VIO bypass where it belongs

2015-01-28 Thread David Gibson

On Wed, Jan 28, 2015 at 07:49:48PM +1100, Alexey Kardashevskiy wrote:
> On 01/28/2015 08:21 AM, Alexey Kardashevskiy wrote:
> > On 01/27/2015 05:13 PM, Alexey Kardashevskiy wrote:
[snip]
> >> diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
> >> index 46edc2a..6ad55d1 100644
> >> --- a/include/hw/ppc/spapr_vio.h
> >> +++ b/include/hw/ppc/spapr_vio.h
> >> @@ -64,6 +64,8 @@ struct VIOsPAPRDevice {
> >>  target_ulong signal_state;
> >>  VIOsPAPR_CRQ crq;
> >>  AddressSpace as;
> >> +MemoryRegion mrroot;
> >> +MemoryRegion mrbypass;
> >>  sPAPRTCETable *tcet;
> >>  };
> >>  
> >>
> > 
> > 
> 
> 
> I believe doing something like this is way too disguising because of
> tobj->parent?

It's kinda ugly, but it's the best way I can think of.

So, I think this is a reasonable approach, but it does need a comment
saying why the hack is necessary.

> 
> 
> 
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -25,6 +25,7 @@
>  #include "trace.h"
> 
>  #include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_vio.h"
> 
>  #include 
> 
> @@ -88,10 +89,26 @@ static IOMMUTLBEntry
> spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr,
>  return ret;
>  }
> 
> +static int spapr_tce_table_post_load(void *opaque, int version_id)
> +{
> +sPAPRTCETable *tcet = SPAPR_TCE_TABLE(opaque);
> +Object *tobj = OBJECT(tcet);
> +Object *vobj = object_dynamic_cast(tobj->parent, TYPE_VIO_SPAPR_DEVICE);
> +
> +if (!vobj) {
> +return 0;
> +}
> +
> +spapr_vio_set_bypass((VIOsPAPRDevice *) vobj, tcet->bypass);
> +
> +return 0;
> +}
> +
>  static const VMStateDescription vmstate_spapr_tce_table = {
>  .name = "spapr_iommu",
>  .version_id = 2,
>  .minimum_version_id = 2,
> +.post_load = spapr_tce_table_post_load,
>  .fields  = (VMStateField []) {
>  /* Sanity check */
>  VMSTATE_UINT32_EQUAL(liobn, sPAPRTCETable),
> 
> 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpnypLU8hg6Y.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH 4/4] usb: Pair g_malloc() with g_free(), not free()

2015-01-28 Thread Gonglei

On 2015/1/28 22:54, Markus Armbruster wrote:

> Spotted by Coverity with preview checker ALLOC_FREE_MISMATCH enabled
> and my "coverity: Model g_free() isn't necessarily free()" model patch
> applied.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  hw/usb/desc-msos.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/usb/desc-msos.c b/hw/usb/desc-msos.c
> index 334d1ae..32c3600 100644
> --- a/hw/usb/desc-msos.c
> +++ b/hw/usb/desc-msos.c
> @@ -231,7 +231,7 @@ int usb_desc_msos(const USBDesc *desc,  USBPacket *p,
>  length = len;
>  }
>  memcpy(dest, buf, length);
> -free(buf);
> +g_free(buf);
>  
>  p->actual_length = length;
>  return 0;

This patch had been added Gerd's usb queue :)

http://lists.gnu.org/archive/html/qemu-devel/2015-01/msg01998.html

Regards,
-Gonglei

Re: [Qemu-devel] [PATCH 2/4] qemu-option: Pair g_malloc() with g_free(), not free()

2015-01-28 Thread Gonglei

On 2015/1/28 22:54, Markus Armbruster wrote:

> Spotted by Coverity with preview checker ALLOC_FREE_MISMATCH enabled
> and my "coverity: Model g_free() isn't necessarily free()" model patch
> applied.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  util/qemu-option.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Gonglei

Re: [Qemu-devel] [PATCH 3/4] spapr_vio: Pair g_malloc() with g_free(), not free()

2015-01-28 Thread Gonglei

On 2015/1/28 22:54, Markus Armbruster wrote:

> Spotted by Coverity with preview checker ALLOC_FREE_MISMATCH enabled
> and my "coverity: Model g_free() isn't necessarily free()" model patch
> applied.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  hw/ppc/spapr_vio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Gonglei

Re: [Qemu-devel] [PATCH 1/4] qemu-option: Replace pointless use of g_malloc0() by g_malloc()

2015-01-28 Thread Gonglei

On 2015/1/28 22:54, Markus Armbruster wrote:

> get_opt_value() takes a write-only buffer, so zeroing it is pointless.
> We don't do it elsewhere, either.
> 
> Signed-off-by: Markus Armbruster 
> ---
>  util/qemu-option.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Gonglei

Re: [Qemu-devel] [PATCH RFC v6 07/20] virtio: allow virtio-1 queue layout

2015-01-28 Thread David Gibson

On Wed, Jan 28, 2015 at 05:07:01PM +0100, Cornelia Huck wrote:
> On Thu, 22 Jan 2015 13:06:09 +1100
> David Gibson  wrote:
> 
> > On Thu, Dec 11, 2014 at 02:25:09PM +0100, Cornelia Huck wrote:
> > > For virtio-1 devices, we allow a more complex queue layout that doesn't
> > > require descriptor table and rings on a physically-contigous memory area:
> > > add virtio_queue_set_rings() to allow transports to set this up.
> > > 
> > > Signed-off-by: Cornelia Huck 
> > > ---
> > >  hw/virtio/virtio-mmio.c|3 +++
> > >  hw/virtio/virtio.c |   53 
> > > 
> > >  include/hw/virtio/virtio.h |3 +++
> > >  3 files changed, 40 insertions(+), 19 deletions(-)
> > > 
> > > diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
> > > index 43b7e02..0c9b63b 100644
> > > --- a/hw/virtio/virtio-mmio.c
> > > +++ b/hw/virtio/virtio-mmio.c
> > > @@ -244,8 +244,11 @@ static void virtio_mmio_write(void *opaque, hwaddr 
> > > offset, uint64_t value,
> > >  case VIRTIO_MMIO_QUEUENUM:
> > >  DPRINTF("mmio_queue write %d max %d\n", (int)value, 
> > > VIRTQUEUE_MAX_SIZE);
> > >  virtio_queue_set_num(vdev, vdev->queue_sel, value);
> > > +/* Note: only call this function for legacy devices */
> > 
> > It's not clear to me if this is an assertion that this *does* only
> > call the function for legacy devices or a fixme, that it *should* only
> > call the function for legacy devices.
> 
> It's more like a note to whoever takes the virtio-mmio legacy device
> code and writes a virtio-1 virtio-mmio device.
> 
> Does
> /* Note: this function must only be called for legacy devices */
> make that intention clearer?

Yes, I think that's better.

> 
> > 
> > > +virtio_queue_update_rings(vdev, vdev->queue_sel);
> > >  break;
> > >  case VIRTIO_MMIO_QUEUEALIGN:
> > > +/* Note: this is only valid for legacy devices */
> > >  virtio_queue_set_align(vdev, vdev->queue_sel, value);
> > >  break;
> > >  case VIRTIO_MMIO_QUEUEPFN:
> 
> (...)
> 
> > >  /* virt queue functions */
> > > -static void virtqueue_init(VirtQueue *vq)
> > > +void virtio_queue_update_rings(VirtIODevice *vdev, int n)
> > 
> > Perhaps something in the name to emphasise that this is only for  > devices?
> 
> virtio_queue_legacy_update_rings()? Maybe a bit long...

There aren't many callers, so I think long is ok in this case.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpCiQsz1zm1Q.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH RFC v6 05/20] virtio: support more feature bits

2015-01-28 Thread David Gibson

On Wed, Jan 28, 2015 at 04:59:45PM +0100, Cornelia Huck wrote:
> On Thu, 22 Jan 2015 12:43:43 +1100
> David Gibson  wrote:
> 
> > On Thu, Dec 11, 2014 at 02:25:07PM +0100, Cornelia Huck wrote:
> > > With virtio-1, we support more than 32 feature bits. Let's extend both
> > > host and guest features to 64, which should suffice for a while.
> > > 
> > > vhost and migration have been ignored for now.
> > 
> > [snip]
> > 
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > index f6c0379..08141c7 100644
> > > --- a/include/hw/virtio/virtio.h
> > > +++ b/include/hw/virtio/virtio.h
> > > @@ -55,6 +55,12 @@
> > >  /* A guest should never accept this.  It implies negotiation is broken. 
> > > */
> > >  #define VIRTIO_F_BAD_FEATURE 30
> > >  
> > > +/* v1.0 compliant. */
> > > +#define VIRTIO_F_VERSION_1  32
> > 
> > This is already in the kernel header, isn't it?

> 
> Yes. But nearly all files include this header but not the kernel
> header.

Can't you change that?  Or this file include the kernel header?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgpSdE9hXJ5e2.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v5 2/2] Xen: Use the ioreq-server API when available

2015-01-28 Thread Don Slutz

On 01/28/15 14:32, Don Slutz wrote:
> On 12/05/14 05:50, Paul Durrant wrote:
>> The ioreq-server API added to Xen 4.5 offers better security than
>> the existing Xen/QEMU interface because the shared pages that are
>> used to pass emulation request/results back and forth are removed
>> from the guest's memory space before any requests are serviced.
>> This prevents the guest from mapping these pages (they are in a
>> well known location) and attempting to attack QEMU by synthesizing
>> its own request structures. Hence, this patch modifies configure
>> to detect whether the API is available, and adds the necessary
>> code to use the API if it is.
> 
> This patch (which is now on xenbits qemu staging) is causing me
> issues.
> 

I have found the key.

The following will reproduce my issue:

1) xl create -p 
2) read one of HVM_PARAM_IOREQ_PFN, HVM_PARAM_BUFIOREQ_PFN, or
   HVM_PARAM_BUFIOREQ_EVTCHN
3) xl unpause new guest

The guest will hang in hvmloader.

More in thread:


Subject: Re: [Xen-devel] [PATCH] ioreq-server: handle
IOREQ_TYPE_PCI_CONFIG in assist function
References: <1422385589-17316-1-git-send-email-wei.l...@citrix.com>


-Don Slutz


> So far I have tracked it back to hvm_select_ioreq_server()
> which selects the "default_ioreq_server".  Since I have one 1
> QEMU, it is both the "default_ioreq_server" and an enabled
> 2nd ioreq_server.  I am continuing to understand why my changes
> are causing this.  More below.
> 
> This patch causes QEMU to only call xc_evtchn_bind_interdomain()
> for the enabled 2nd ioreq_server.  So when (if)
> hvm_select_ioreq_server() selects the "default_ioreq_server", the
> guest hangs on an I/O.
> 
> Using the debug key 'e':
> 
> (XEN) [2015-01-28 18:57:07] 'e' pressed -> dumping event-channel info
> (XEN) [2015-01-28 18:57:07] Event channel information for domain 0:
> (XEN) [2015-01-28 18:57:07] Polling vCPUs: {}
> (XEN) [2015-01-28 18:57:07] port [p/m/s]
> (XEN) [2015-01-28 18:57:07]1 [0/0/0]: s=5 n=0 x=0 v=0
> (XEN) [2015-01-28 18:57:07]2 [0/0/0]: s=6 n=0 x=0
> (XEN) [2015-01-28 18:57:07]3 [0/0/0]: s=6 n=0 x=0
> (XEN) [2015-01-28 18:57:07]4 [0/0/0]: s=5 n=0 x=0 v=1
> (XEN) [2015-01-28 18:57:07]5 [0/0/0]: s=6 n=0 x=0
> (XEN) [2015-01-28 18:57:07]6 [0/0/0]: s=6 n=0 x=0
> (XEN) [2015-01-28 18:57:07]7 [0/0/0]: s=5 n=1 x=0 v=0
> (XEN) [2015-01-28 18:57:07]8 [0/0/0]: s=6 n=1 x=0
> (XEN) [2015-01-28 18:57:07]9 [0/0/0]: s=6 n=1 x=0
> (XEN) [2015-01-28 18:57:07]   10 [0/0/0]: s=5 n=1 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   11 [0/0/0]: s=6 n=1 x=0
> (XEN) [2015-01-28 18:57:07]   12 [0/0/0]: s=6 n=1 x=0
> (XEN) [2015-01-28 18:57:07]   13 [0/0/0]: s=5 n=2 x=0 v=0
> (XEN) [2015-01-28 18:57:07]   14 [0/0/0]: s=6 n=2 x=0
> (XEN) [2015-01-28 18:57:07]   15 [0/0/0]: s=6 n=2 x=0
> (XEN) [2015-01-28 18:57:07]   16 [0/0/0]: s=5 n=2 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   17 [0/0/0]: s=6 n=2 x=0
> (XEN) [2015-01-28 18:57:07]   18 [0/0/0]: s=6 n=2 x=0
> (XEN) [2015-01-28 18:57:07]   19 [0/0/0]: s=5 n=3 x=0 v=0
> (XEN) [2015-01-28 18:57:07]   20 [0/0/0]: s=6 n=3 x=0
> (XEN) [2015-01-28 18:57:07]   21 [0/0/0]: s=6 n=3 x=0
> (XEN) [2015-01-28 18:57:07]   22 [0/0/0]: s=5 n=3 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   23 [0/0/0]: s=6 n=3 x=0
> (XEN) [2015-01-28 18:57:07]   24 [0/0/0]: s=6 n=3 x=0
> (XEN) [2015-01-28 18:57:07]   25 [0/0/0]: s=5 n=4 x=0 v=0
> (XEN) [2015-01-28 18:57:07]   26 [0/0/0]: s=6 n=4 x=0
> (XEN) [2015-01-28 18:57:07]   27 [0/0/0]: s=6 n=4 x=0
> (XEN) [2015-01-28 18:57:07]   28 [0/0/0]: s=5 n=4 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   29 [0/0/0]: s=6 n=4 x=0
> (XEN) [2015-01-28 18:57:07]   30 [0/0/0]: s=6 n=4 x=0
> (XEN) [2015-01-28 18:57:07]   31 [0/0/0]: s=5 n=5 x=0 v=0
> (XEN) [2015-01-28 18:57:07]   32 [0/0/0]: s=6 n=5 x=0
> (XEN) [2015-01-28 18:57:07]   33 [0/0/0]: s=6 n=5 x=0
> (XEN) [2015-01-28 18:57:07]   34 [0/0/0]: s=5 n=5 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   35 [0/0/0]: s=6 n=5 x=0
> (XEN) [2015-01-28 18:57:07]   36 [0/0/0]: s=6 n=5 x=0
> (XEN) [2015-01-28 18:57:07]   37 [0/0/0]: s=5 n=6 x=0 v=0
> (XEN) [2015-01-28 18:57:07]   38 [0/0/0]: s=6 n=6 x=0
> (XEN) [2015-01-28 18:57:07]   39 [0/0/0]: s=6 n=6 x=0
> (XEN) [2015-01-28 18:57:07]   40 [0/0/0]: s=5 n=6 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   41 [0/0/0]: s=6 n=6 x=0
> (XEN) [2015-01-28 18:57:07]   42 [0/0/0]: s=6 n=6 x=0
> (XEN) [2015-01-28 18:57:07]   43 [0/0/0]: s=5 n=7 x=0 v=0
> (XEN) [2015-01-28 18:57:07]   44 [0/0/0]: s=6 n=7 x=0
> (XEN) [2015-01-28 18:57:07]   45 [0/0/0]: s=6 n=7 x=0
> (XEN) [2015-01-28 18:57:07]   46 [0/0/0]: s=5 n=7 x=0 v=1
> (XEN) [2015-01-28 18:57:07]   47 [0/0/0]: s=6 n=7 x=0
> (XEN) [2015-01-28 18:57:07]   48 [0/0/0]: s=6 n=7 x=0
> (XEN) [2015-01-28 18:57:07]   49 [0/0/0]: s=3 n=0 x=0 d=0 p=58
> (XEN) [2015-01-28 18:57:07]   50 [0/0/0]:

Re: [Qemu-devel] [PATCH] target-mips: use CP0EnLo_XI instead of magic number

2015-01-28 Thread Maciej W. Rozycki

On Mon, 26 Jan 2015, Leon Alrae wrote:

> Signed-off-by: Leon Alrae 
> ---

 Enthusiastically:

Reviewed-by: Maciej W. Rozycki 

 However...

> diff --git a/target-mips/translate.c b/target-mips/translate.c
> index 635192c..77d89be 100644
> --- a/target-mips/translate.c
> +++ b/target-mips/translate.c
> @@ -4947,7 +4947,7 @@ static void gen_mfc0(DisasContext *ctx, TCGv arg, int 
> reg, int sel)
>  #if defined(TARGET_MIPS64)
>  if (ctx->rxi) {
>  TCGv tmp = tcg_temp_new();
> -tcg_gen_andi_tl(tmp, arg, (3ull << 62));
> +tcg_gen_andi_tl(tmp, arg, (3ull << CP0EnLo_XI));
>  tcg_gen_shri_tl(tmp, tmp, 32);

... don't we need to do:

tcg_gen_andi_tl(arg, arg, ~(3ull << CP0EnLo_XI));

here and for EntryLo1 as well (for LPA-enabled processors)?

>  tcg_gen_or_tl(arg, arg, tmp);
>  tcg_temp_free(tmp);

 And do we want to have CP0C3_LPA set in the few templates that do in the 
first place?  AFAICT we don't really implement LPA so this bit will 
confuse software.  Of course implementing it would be another option, not 
very complicated AFAICS, and if we can track the requirement to update 
MFC0 at that time, then the clean-up I mentioned above can be deferred 
until then.

  Maciej

Re: [Qemu-devel] [PATCH 04/21] block: Add bdrv_close_all() handlers

2015-01-28 Thread Max Reitz


On 2015-01-28 at 17:05, Eric Blake wrote:

On 01/26/2015 02:22 PM, Paolo Bonzini wrote:


On 26/01/2015 22:13, Max Reitz wrote:

An eject blocker would also break backwards-compatibility though.  What
about an eject notifier?  Would that concept make sense?

It does make sense (in that it is the way I would implement "just do
what we always did"), but I just don't like it for the fact that it
makes NBD a special snowflake. I can live with it, though.

Yes, it's weird.  But this is just the backwards-compatible solution.

I'm okay with implementing only the new solution, but:

- the old QMP (and HMP?) commands must be removed

Back-compat is a bear to figure out. From libvirt's perspective, it is
okay to require a newer libvirt in order to drive newer qemu (the new
libvirt knowing how to probe whether old or new commands exist, and use
the right one); but it is still awkward any time upgrading qemu without
libvirt causes things to break gratuitously.


- the new command probably must not reuse the same BB as the guest, and
I am not sure that this is possible.

We've had that design goal in the back of our minds for some time
(sharing a single BDS among multiple devices) - but I don't think it has
actually happened yet, so if this is the first time we make it happen,
there may be lots of details to get right.  But it makes the most sense
(exporting and NBD disk is a form of creating a _new_ BB - distinct from
a guest-visible device, but both uses are definitely backends; and
sharing the same BDS among both backends makes total sense, so that the
drive visible to the guest can change medium without invalidating the
NBD serving up the old contents).


Well, I've looked up the discussion Markus, Kevin and me were having; 
our result was that some users may find it useful to have an own BB for 
an NBD server, while others may want to re-use an existing BB. The 
former would be requested by creating an NBD server on a node-name 
instead of giving a device name; if given a device name, NBD will reuse 
that BB, and will detach itself on eject.


Somehow I lost track of that final detail (I blame Christmas and New 
Year), so that's indeed what we decided upon. I will implement it in v2 
(an eject notifier for BBs, which is then used by NBD).


For the sake of completeness: Currently, it is impossible to have 
multiple BBs per BDS, so we cannot yet implement having a separate BB 
for the NBD server. This issue is however unrelated to this series, so 
we should be fine for now.


Max

Re: [Qemu-devel] [PATCH 04/21] block: Add bdrv_close_all() handlers

2015-01-28 Thread Max Reitz


On 2015-01-28 at 17:44, Eric Blake wrote:

On 01/26/2015 12:27 PM, Max Reitz wrote:

Every time a reference to a BlockBackend is taken, a notifier for
bdrv_close_all() has to be deposited so the reference holder can
relinquish its reference when bdrv_close_all() is called. That notifier
should be revoked on a bdrv_unref() call.


In addition to the design question about whether NBD exports should be
their own new BB,


@@ -198,8 +207,12 @@ void blk_ref(BlockBackend *blk)
   * If this drops it to zero, destroy @blk.
   * For convenience, do nothing if @blk is null.
   */
-void blk_unref(BlockBackend *blk)
+void blk_unref(BlockBackend *blk, Notifier *close_all_notifier)
  {
+if (close_all_notifier) {
+notifier_remove(close_all_notifier);
+}
+
  if (blk) {

Does removing a notifier when blk is NULL make sense?


Considering the pattern is probably "allocate notifier; if that fails, 
goto fail; create BB; if that fails, goto fail; ...; fail: blk_unref(); 
free notifier", the is indeed a case where the notifier may be non-NULL 
but not yet registered (if "create BB" failed). Therefore, to simplify 
the caller code, notifier_remove() should only be called if blk is not 
NULL, right.


Will fix.

Max

Re: [Qemu-devel] [PATCH 04/21] block: Add bdrv_close_all() handlers

2015-01-28 Thread Eric Blake

On 01/26/2015 12:27 PM, Max Reitz wrote:
> Every time a reference to a BlockBackend is taken, a notifier for
> bdrv_close_all() has to be deposited so the reference holder can
> relinquish its reference when bdrv_close_all() is called. That notifier
> should be revoked on a bdrv_unref() call.
> 

In addition to the design question about whether NBD exports should be
their own new BB,

> @@ -198,8 +207,12 @@ void blk_ref(BlockBackend *blk)
>   * If this drops it to zero, destroy @blk.
>   * For convenience, do nothing if @blk is null.
>   */
> -void blk_unref(BlockBackend *blk)
> +void blk_unref(BlockBackend *blk, Notifier *close_all_notifier)
>  {
> +if (close_all_notifier) {
> +notifier_remove(close_all_notifier);
> +}
> +
>  if (blk) {

Does removing a notifier when blk is NULL make sense?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] linux-user/syscall.c: Let all lock_user_struct() and unlock_user_struct() paired with each other

2015-01-28 Thread Peter Maydell

On 28 January 2015 at 22:09, Chen Gang S  wrote:
>  - Is what I said above really correct (e.g. is linux-user really mainly
>for cpu emulation)?.

Not really. linux-user is mainly for running single Linux binaries.
It has a secondary use for running gcc test binaries which think
they are "bare metal" but actually use some kind of semihosting API.
(You should check whether tile has one of those.)

As well as linux-user mode, QEMU has system emulation mode, where
we emulate a complete machine.

Both modes need CPU emulation.

-- PMM

Re: [Qemu-devel] [PATCH 04/11] target-arm: Define correct mmu_idx values and pass them in TB flags

2015-01-28 Thread Peter Maydell

On 28 January 2015 at 21:57, Greg Bellows  wrote:
> After getting through patch 9, I wonder if the TB NS bit can also be removed
> as it is implied in the MMU index.

No, because for a 32-bit EL3 we are always running under a Secure
translation regime (S1E3) but the TBFLAG_NS bit may be either 0 or
1 depending on the value of the SCR.NS bit.

-- PMM

Re: [Qemu-devel] [PATCH 09/11] target-arm: Use mmu_idx in get_phys_addr()

2015-01-28 Thread Peter Maydell

On 28 January 2015 at 21:37, Greg Bellows  wrote:
>
>> +/* Return true if the translation regime is using LPAE format page tables
>> */
>> +static inline bool regime_using_lpae_format(CPUARMState *env,
>> +ARMMMUIdx mmu_idx)
>> +{
>> +int el = regime_el(env, mmu_idx);
>> +if (el == 2 || arm_el_is_aa64(env, el)) {
>
>
> For the life of me, I can not figure out why EL2 is wired to always use
> LPAE.   Is this stated in the spec somewhere?  I found places where EL2
> registers can vary depending on TTBCR.EAE bit settings which implies it is
> not always true.

The only translation regimes controlled by EL2 are:
 (1) EL2's own stage 1 translations
 (2) the stage 2 translations

These must both be LPAE format: see the v8 ARM ARM section G4.4:
"the translation tables for the Non-secure PL2 stage 1 translations,
and for the Non-secure PL1&0 stage 2 translations, must use the
Long-descriptor translation table format." v7 ARM ARM B3.3 has
similar text.

(Basically, short-descriptors are obsolete and are only supported in
the pre-Virtualization translation regimes, ie AArch32 EL1/3.)

> I realize EL2 is not supported yet, but wouldn't this break AArch32 if it
> were and the LPAE feature was not enabled?

Implementations with Virtualization must include LPAE.

>> -static bool get_level1_table_address(CPUARMState *env, uint32_t *table,
>> - uint32_t address)
>> +static bool get_level1_table_address(CPUARMState *env, ARMMMUIdx mmu_idx,
>> + uint32_t *table, uint32_t address)
>>  {
>> -/* Get the TCR bank based on our security state */
>> -TCR *tcr = &env->cp15.tcr_el[arm_is_secure(env) ? 3 : 1];
>> +/* Note that we can only get here for an AArch32 PL0/PL1 lookup */
>> +int el = regime_el(env, mmu_idx);
>> +TCR *tcr = regime_tcr(env, mmu_idx);
>>
>> -/* We only get here if EL1 is running in AArch32. If EL3 is running
>> in
>> - * AArch32 there is a secure and non-secure instance of the
>> translation
>> - * table registers.
>> - */
>>  if (address & tcr->mask) {
>>  if (tcr->raw_tcr & TTBCR_PD1) {
>>  /* Translation table walk disabled for TTBR1 */
>>  return false;
>>  }
>> -*table = A32_BANKED_CURRENT_REG_GET(env, ttbr1) & 0xc000;
>> +*table = env->cp15.ttbr1_el[el] & 0xc000;
>
>
> Perhaps you plan to address this in a separate patch, but I believe TTBR1 is
> only applicable to EL1 and EL0 in AArch64.

It's true that TTBR1 is only for EL0/EL1, but see the comment at the
start of the function -- we can't get here except for EL0 and EL1,
because this function is only used for some kinds of short-descriptor
tables.

>> @@ -4663,7 +4736,12 @@ static int get_phys_addr_v5(CPUARMState *env,
>> uint32_t address, int access_type,
>>  desc = ldl_phys(cs->as, table);
>>  type = (desc & 3);
>>  domain = (desc >> 5) & 0x0f;
>> -domain_prot = (A32_BANKED_CURRENT_REG_GET(env, dacr) >> (domain * 2))
>> & 3;
>> +if (regime_el(env, mmu_idx) == 1) {
>> +dacr = env->cp15.dacr_ns;
>> +} else {
>> +dacr = env->cp15.dacr_s;
>> +}
>> +domain_prot = (dacr >> (domain * 2)) & 3;
>
>
> Is there a reason that you did not add a regime_dacr() here like you did for
> SCTLR and TCR?

Didn't seem necessary, since we know we only need to deal with S vs NS,
and the concept isn't generally applicable to most regimes. If the
dacr in env->cp15 was stored as dacr[4] I'd have used dacr[el] as
I do above for the ttbr1_el[], but it isn't, hence the conditional.

The TCR and SCTLR are used in LPAE format page tables so they apply
for the whole set of translation regimes.

> Also, the ARMv8 ARM makes it sound like DACR has no value in AArch64.

Well, the DACR is only relevant to short-descriptor format page tables,
so it's only consulted for AArch32 translations, and there's no
equivalent register in AArch64. (There is a DACR32_EL2 64 bit register,
but that is only there so a hypervisor can save and restore the state
of a 32 bit VM (at EL1) that is using short-descriptor page tables.)

> However, if it did have meaning in AArch64, then for S1SE1 would we be
> accessing the wrong bank as regime_el returns 1?  This working off the
> understanding that an address reference from an instruction executed in
> S/EL1 and AArch64 would generate such an index.

We can only get here for regime S1SE1 if:
 * EL3 is AArch64
 * EL1 is AArch32

Since EL3 is 64 bit, there is no banking of registers and regardless
of whether EL1 is Secure or NonSecure we want the one and only
register (which is in dacr_ns). (Compare the way we use
ttbr1_el[regime_el(env, mmu_idx)] and so get TTBR1_EL1 whether
this is Secure EL1 or NonSecure EL1.)

If EL3 is 32 bit then there is banking of registers, but it's
not possible to get here for S1SE1 in that case (only for S EL3
and NS EL1).

>> +/* TODO:
>> + * This code

Re: [Qemu-devel] [PATCH 03/21] block: Add bdrv_close_all() notifiers

2015-01-28 Thread Eric Blake

On 01/26/2015 12:27 PM, Max Reitz wrote:
> This adds a list of notifiers to be invoked on bdrv_close_all().
> 
> Signed-off-by: Max Reitz 
> ---
>  block.c   | 10 ++
>  include/block/block.h |  2 ++
>  2 files changed, 12 insertions(+)

Reviewed-by: Eric Blake 


-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 2/2] bootdevice: update boot_order in MachineState

2015-01-28 Thread Dinar Valeev

On 01/28/2015 02:48 AM, Gonglei wrote:
> On 2015/1/27 18:49, Dinar Valeev wrote:
> 
>> On 01/27/2015 10:18 AM, Gonglei wrote:
>>> On 2015/1/27 16:57, Dinar Valeev wrote:
>>>
 On 01/27/2015 03:51 AM, Gonglei wrote:
> On 2015/1/27 7:52, dval...@suse.de wrote:
>
>> From: Dinar Valeev 
>>
>> on sPAPR we need to update boot_order in MachineState in case it
>> got changed on reset.
>>
>> Signed-off-by: Dinar Valeev 
>> ---
>>  bootdevice.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/bootdevice.c b/bootdevice.c
>> index 5914417..4f11a06 100644
>> --- a/bootdevice.c
>> +++ b/bootdevice.c
>> @@ -26,6 +26,7 @@
>>  #include "qapi/visitor.h"
>>  #include "qemu/error-report.h"
>>  #include "hw/hw.h"
>> +#include "hw/boards.h"
>>  
>>  typedef struct FWBootEntry FWBootEntry;
>>  
>> @@ -50,6 +51,8 @@ void qemu_register_boot_set(QEMUBootSetHandler *func, 
>> void *opaque)
>>  void qemu_boot_set(const char *boot_order, Error **errp)
>>  {
>>  Error *local_err = NULL;
>> +MachineState *machine = MACHINE(qdev_get_machine());
>> +machine->boot_order = boot_order;
>>  
>>  if (!boot_set_handler) {
>>  error_setg(errp, "no function defined to set boot device list 
>> for"
>
> Have you registered boot set handler on ppc/sPAPR platform by calling
> qemu_register_boot_set()? Otherwise qemu_boot_set function
>  will return error.
 No, I set boot_order on each machine reset. My tests are showing it works 
 without an error.
>>>
>>> That's interesting. Does this function be called?
>> Yes, then simply returns.
>>> Would you debug it by setting a breakpoint ?
>> I added a trace event.
>>  if (!boot_set_handler) {
>> +trace_qemu_boot_set(boot_order);
>>  error_setg(errp, "no function defined to set boot device list for"
>>   " this architecture");
>>  return;
>>
>> And I see this now in qemu's monitor. Still I don't see error message.
> 
> That's because NULL is passed to this function in restore_boot_order()
> the error is ignored (commit f183993). I have seen the previous conversation
> about your patch serials. And I think this is the reason which
> you moved machine->boot_order = boot_order before
> checking boot_set_handler variable based on Alexander's
> suggestion, right? But I think this is not a good idea.
Yes

Any proposal how this can be done differently?

It seems I'm almost alone who wants -boot once=X option to get fixed for sPAPR. 
We use it in test automation, and we need to be sure that we boot from hard 
disk once installation is done.

Thanks,
Dinar
> 
> Regards,
> -Gonglei
> 
>

Re: [Qemu-devel] [PATCH 02/21] quorum: Fix close path

2015-01-28 Thread Eric Blake

On 01/26/2015 12:27 PM, Max Reitz wrote:
> bdrv_unref() can lead to bdrv_close(), which in turn will result in
> bdrv_drain_all(). This function will later be called blk_drain_all() and
> iterate only over the BlockBackends for which blk_is_inserted() holds
> true; therefore, bdrv_is_inserted() and thus quorum_is_inserted() will
> probably be called.
> 
> This patch makes quorum_is_inserted() aware of the fact that some
> children may have been closed already.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/quorum.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 

>  for (i = 0; i < s->num_children; i++) {
> -if (!bdrv_is_inserted(s->bs[i])) {
> +if (s->bs[i] && !bdrv_is_inserted(s->bs[i])) {
>  return 0;

May have a minor conflict if you fix the earlier series to use bool for
this function's return type.  But that should not get in the way of:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 01/21] block: Guard remaining unsafe blk_bs() callers

2015-01-28 Thread Eric Blake

On 01/26/2015 12:27 PM, Max Reitz wrote:
> There are cases where it is probably (!) not necessary to check whether
> the return value of blk_bs() is non-NULL, and those are places after
> blk_new_open(). In every other place, though, there has to be some check
> to make sure that the return value of blk_bs() is non-NULL before it is
> used.
> 
> This patch adds that check to hopefully all of the remaining places.

I did not audit for places you might have missed, but concur that these
needed it.

> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c  | 18 ++
>  hw/block/xen_disk.c |  4 +++-
>  2 files changed, 13 insertions(+), 9 deletions(-)


Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 04/21] block: Add bdrv_close_all() handlers

2015-01-28 Thread Eric Blake

On 01/26/2015 02:22 PM, Paolo Bonzini wrote:
> 
> 
> On 26/01/2015 22:13, Max Reitz wrote:

>>> An eject blocker would also break backwards-compatibility though.  What
>>> about an eject notifier?  Would that concept make sense?
>>
>> It does make sense (in that it is the way I would implement "just do
>> what we always did"), but I just don't like it for the fact that it
>> makes NBD a special snowflake. I can live with it, though.
> 
> Yes, it's weird.  But this is just the backwards-compatible solution.
> 
> I'm okay with implementing only the new solution, but:
> 
> - the old QMP (and HMP?) commands must be removed

Back-compat is a bear to figure out. From libvirt's perspective, it is
okay to require a newer libvirt in order to drive newer qemu (the new
libvirt knowing how to probe whether old or new commands exist, and use
the right one); but it is still awkward any time upgrading qemu without
libvirt causes things to break gratuitously.

> 
> - the new command probably must not reuse the same BB as the guest, and
> I am not sure that this is possible.

We've had that design goal in the back of our minds for some time
(sharing a single BDS among multiple devices) - but I don't think it has
actually happened yet, so if this is the first time we make it happen,
there may be lots of details to get right.  But it makes the most sense
(exporting and NBD disk is a form of creating a _new_ BB - distinct from
a guest-visible device, but both uses are definitely backends; and
sharing the same BDS among both backends makes total sense, so that the
drive visible to the guest can change medium without invalidating the
NBD serving up the old contents).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 50/50] iotests: Add test for change-related QMP commands

2015-01-28 Thread Max Reitz


On 2015-01-28 at 16:57, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/118 | 649 +
  tests/qemu-iotests/118.out |   5 +
  tests/qemu-iotests/group   |   1 +
  3 files changed, 655 insertions(+)
  create mode 100755 tests/qemu-iotests/118
  create mode 100644 tests/qemu-iotests/118.out


Looks like lots of useful tests and good coverage, including things like
no-op open of an already open tray.  I didn't spot any obvious flaws
(but my python is weak), or think of any obvious missing tests, so feel
free to add:

Reviewed-by: Eric Blake 


Thank you very much for reviewing this series! I really did not think 
someone would jump on this so fast. :-)


Max

Re: [Qemu-devel] [PATCH] linux-user/syscall.c: Let all lock_user_struct() and unlock_user_struct() paired with each other

2015-01-28 Thread Chen Gang S


Thank you for your work, too.

Next month, I shall start tile qemu, I guess, for coding, I shall start
from linux-user (which is mainly for cpu emulation).

 - For each patch, I should make at least a valuable change, and pass
   related test.

 - Each month, I should make 3 patches at least.

 - Hope I can finish tile for linux-user within 2015-06-30: finish all
   tile cpu common instructions emulation, and pass all related test.

Welcome any ideas, suggestions, and completions, e.g.

 - Is what I said above really correct (e.g. is linux-user really mainly
   for cpu emulation)?.

 - For each patch, are there additional information, requirements or
   suggestions which I should notice about?

 - If I can finish 3 patches per month, is it possible to finish tile
   cpu common instructions emulation for linux-user within 2015-06-30?


Thanks.

On 1/28/15 22:27, Riku Voipio wrote:
> Hi,
> 
> First of all, thanks Chen for taking time to improve the linux-user
> codebase in qemu!
> 
> On Mon, Jan 26, 2015 at 03:01:52PM +, Peter Maydell wrote:
>> On 26 January 2015 at 14:59, Chen Gang S  wrote:
>>> On 1/26/15 06:10, Peter Maydell wrote:
 I would just like the commit message to be clear about the
 scope of the work the patch covers. If the patch is just "Fix
 mismatched lock/unlock calls in IPC struct conversion functions"
 then that's fine, but the commit message should say that. At the
 moment the commit message is very vague.

>>>
>>> OK, thanks.
>>>
>>> I am not quite familiar with this file, so I describe the modification
>>> by function name, e.g. lock_user_struct() and unlick_user_struct() in
>>> the patch subject.
> 
>> In a big file I think it's often more useful to describe the
>> functions which are being changed. My suggested subject would be:
>  
>> "Fix mismatched lock/unlock calls in IPC struct conversion functions"
>  
>> Riku can decide if he wants a v2 or will just fix it up as he
>> applies it to his linux-user tree.
> 
> No need for v2, I've change the title in my tree.
> 
> Riku
> 
> 

-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

Re: [Qemu-devel] [PATCH RESEND 50/50] iotests: Add test for change-related QMP commands

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/118 | 649 
> +
>  tests/qemu-iotests/118.out |   5 +
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 655 insertions(+)
>  create mode 100755 tests/qemu-iotests/118
>  create mode 100644 tests/qemu-iotests/118.out
> 

Looks like lots of useful tests and good coverage, including things like
no-op open of an already open tray.  I didn't spot any obvious flaws
(but my python is weak), or think of any obvious missing tests, so feel
free to add:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 49/50] iotests: More options for VM.add_drive()

2015-01-28 Thread Eric Blake

On 01/28/2015 02:28 PM, Max Reitz wrote:
> On 2015-01-28 at 16:27, Eric Blake wrote:
>> On 01/27/2015 12:46 PM, Max Reitz wrote:
>>> This patch allows specifying the interface to be used for the drive, and
>>> makes specifying a path optional (if the path is None, the "file" option
>>> will be omitted, thus creating an empty drive).
>>>
>>> Signed-off-by: Max Reitz 
>>> ---
>>>   tests/qemu-iotests/iotests.py | 9 ++---
>>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> +
>>> +if not path is None:
>>> +options.append('file=%s' % path)
>> Works, but it is more idiomatic to use 'if path is not None:', based on
>> a grep of the source tree.
> 
> Did I ever mention I don't know Python very well?

The feeling is mutual.  So it's the blind leading the blind :)

> 
> That sounds a lot better, will do, thanks.
> 
>> With that minor switch,
>> Reviewed-by: Eric Blake 
> 
> On to the last patch! :-)
> 
> Max
> 
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 04/11] target-arm: Define correct mmu_idx values and pass them in TB flags

2015-01-28 Thread Greg Bellows

On Fri, Jan 23, 2015 at 12:20 PM, Peter Maydell 
wrote:

> We currently claim that for ARM the mmu_idx should simply be the current
> exception level. However this isn't actually correct -- secure EL0 and EL1
> should have separate indexes from non-secure EL0 and EL1 since their
> VA->PA mappings may differ. We also will want an index for stage 2
> translations when we properly support EL2.
>
> Define and document all seven mmu index values that we require, and
> pass the mmu index in the TB flags rather than exception level or
> priv/user bit.
>
> This change doesn't update the get_phys_addr() code, so our page
> table walking still assumes a simplistic "user or priv?" model for
> the moment.
>
> Signed-off-by: Peter Maydell 
> ---
> This leaves some odd gaps in the TB flags usage. I will circle
> back and clean this up later (including moving the other common
> flags like the singlestep ones to the top of the flags word),
> but I didn't want to bloat this patchseries further.
> ---
>  target-arm/cpu.h   | 113
> -
>  target-arm/helper.c|   3 +-
>  target-arm/translate-a64.c |   5 +-
>  target-arm/translate.c |   5 +-
>  target-arm/translate.h |   3 +-
>  5 files changed, 101 insertions(+), 28 deletions(-)
>
> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
> index 3eb00f4..cf7b9ab 100644
> --- a/target-arm/cpu.h
> +++ b/target-arm/cpu.h
> @@ -98,7 +98,7 @@ typedef uint32_t ARMReadCPFunc(void *opaque, int cp_info,
>
>  struct arm_boot_info;
>
> -#define NB_MMU_MODES 4
> +#define NB_MMU_MODES 7
>
>  /* We currently assume float and double are IEEE single and double
> precision respectively.
> @@ -1572,13 +1572,92 @@ static inline CPUARMState *cpu_init(const char
> *cpu_model)
>  #define cpu_signal_handler cpu_arm_signal_handler
>  #define cpu_list arm_cpu_list
>
> -/* MMU modes definitions */
> +/* ARM has the following "translation regimes" (as the ARM ARM calls
> them):
> + *
> + * If EL3 is 64-bit:
> + *  + NonSecure EL1 & 0 stage 1
> + *  + NonSecure EL1 & 0 stage 2
> + *  + NonSecure EL2
> + *  + Secure EL1 & EL0
> + *  + Secure EL3
> + * If EL3 is 32-bit:
> + *  + NonSecure PL1 & 0 stage 1
> + *  + NonSecure PL1 & 0 stage 2
> + *  + NonSecure PL2
> + *  + Secure PL0 & PL1
> + * (reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)
> + *
> + * For QEMU, an mmu_idx is not quite the same as a translation regime
> because:
> + *  1. we need to split the "EL1 & 0" regimes into two mmu_idxes, because
> they
> + * may differ in access permissions even if the VA->PA map is the same
> + *  2. we want to cache in our TLB the full VA->IPA->PA lookup for a
> stage 1+2
> + * translation, which means that we have one mmu_idx that deals with
> two
> + * concatenated translation regimes [this sort of combined s1+2 TLB is
> + * architecturally permitted]
> + *  3. we don't need to allocate an mmu_idx to translations that we won't
> be
> + * handling via the TLB. The only way to do a stage 1 translation
> without
> + * the immediate stage 2 translation is via the ATS or AT system
> insns,
> + * which can be slow-pathed and always do a page table walk.
> + *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
> + * translation regimes, because they map reasonably well to each other
> + * and they can't both be active at the same time.
> + * This gives us the following list of mmu_idx values:
> + *
> + * NS EL0 (aka NS PL0) stage 1+2
> + * NS EL1 (aka NS PL1) stage 1+2
> + * NS EL2 (aka NS PL2)
> + * S EL3 (aka S PL1)
> + * S EL0 (aka S PL0)
> + * S EL1 (not used if EL3 is 32 bit)
> + * NS EL0+1 stage 2
> + *
> + * (The last of these is an mmu_idx because we want to be able to use the
> TLB
> + * for the accesses done as part of a stage 1 page table walk, rather than
> + * having to walk the stage 2 page table over and over.)
> + *
> + * Our enumeration includes at the end some entries which are not "true"
> + * mmu_idx values in that they don't have corresponding TLBs and are only
> + * valid for doing slow path page table walks.
> + *
> + * The constant names here are patterned after the general style of the
> names
> + * of the AT/ATS operations.
> + * The values used are carefully arranged to make mmu_idx => EL lookup
> easy.
> + */
> +typedef enum ARMMMUIdx {
> +ARMMMUIdx_S12NSE0 = 0,
> +ARMMMUIdx_S12NSE1 = 1,
> +ARMMMUIdx_S1E2 = 2,
> +ARMMMUIdx_S1E3 = 3,
> +ARMMMUIdx_S1SE0 = 4,
> +ARMMMUIdx_S1SE1 = 5,
> +ARMMMUIdx_S2NS = 6,
> +/* Indexes below here don't have TLBs and are used only for AT system
> + * instructions or for the first stage of an S12 page table walk.
> + */
> +ARMMMUIdx_S1NSE0 = 7,
> +ARMMMUIdx_S1NSE1 = 8,
> +} ARMMMUIdx;
> +
>  #define MMU_MODE0_SUFFIX _user
>  #define MMU_MODE1_SUFFIX _kernel
>  #define MMU_USER_IDX 0
> +
> +/* Return the exception level we're running at if this is our mmu_idx */
> +static inline int arm_mm

Re: [Qemu-devel] [PATCH 09/11] target-arm: Use mmu_idx in get_phys_addr()

2015-01-28 Thread Greg Bellows

On Fri, Jan 23, 2015 at

12:20 PM, Peter Maydell  wrote:

> Now we have the mmu_idx in get_phys_addr(), use it correctly to
> determine the behaviour of virtual to physical address translations,
> rather than using just an is_user flag and the current CPU state.
>
> Some TODO comments have been added to indicate where changes will
> need to be made to add EL2 and 64-bit EL3 support.
>
> Signed-off-by: Peter Maydell 
> ---
>  target-arm/helper.c | 200
> +++-
>  1 file changed, 151 insertions(+), 49 deletions(-)
>
> diff --git a/target-arm/helper.c b/target-arm/helper.c
> index 0ae04eb..0a06bbe 100644
> --- a/target-arm/helper.c
> +++ b/target-arm/helper.c
> @@ -4556,13 +4556,88 @@ void arm_cpu_do_interrupt(CPUState *cs)
>  cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
>  }
>
> +
> +/* Return the exception level which controls this address translation
> regime */
> +static inline uint32_t regime_el(CPUARMState *env, ARMMMUIdx mmu_idx)
> +{
> +switch (mmu_idx) {
> +case ARMMMUIdx_S2NS:
> +case ARMMMUIdx_S1E2:
> +return 2;
> +case ARMMMUIdx_S1E3:
> +return 3;
> +case ARMMMUIdx_S1SE0:
> +return arm_el_is_aa64(env, 3) ? 1 : 3;
> +case ARMMMUIdx_S1SE1:
> +case ARMMMUIdx_S1NSE0:
> +case ARMMMUIdx_S1NSE1:
> +return 1;
> +default:
> +g_assert_not_reached();
> +}
> +}
> +
> +/* Return the SCTLR value which controls this address translation regime
> */
> +static inline uint32_t regime_sctlr(CPUARMState *env, ARMMMUIdx mmu_idx)
> +{
> +return env->cp15.sctlr_el[regime_el(env, mmu_idx)];
> +}
> +
> +/* Return true if the specified stage of address translation is disabled
> */
> +static inline bool regime_translation_disabled(CPUARMState *env,
> +   ARMMMUIdx mmu_idx)
> +{
> +if (mmu_idx == ARMMMUIdx_S2NS) {
> +return (env->cp15.hcr_el2 & HCR_VM) == 0;
> +}
> +return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
> +}
> +
> +/* Return the TCR controlling this translation regime */
> +static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
> +{
> +if (mmu_idx == ARMMMUIdx_S2NS) {
> +/* TODO: return VTCR_EL2 */
> +g_assert_not_reached();
> +}
> +return &env->cp15.tcr_el[regime_el(env, mmu_idx)];
> +}
> +
> +/* Return true if the translation regime is using LPAE format page tables
> */
> +static inline bool regime_using_lpae_format(CPUARMState *env,
> +ARMMMUIdx mmu_idx)
> +{
> +int el = regime_el(env, mmu_idx);
> +if (el == 2 || arm_el_is_aa64(env, el)) {
>

For the life of me, I can not figure out why EL2 is wired to always use
LPAE.   Is this stated in the spec somewhere?  I found places where EL2
registers can vary depending on TTBCR.EAE bit settings which implies it is
not always true.

I realize EL2 is not supported yet, but wouldn't this break AArch32 if it
were and the LPAE feature was not enabled?


> +return true;
> +}
> +if (arm_feature(env, ARM_FEATURE_LPAE)
> +&& (regime_tcr(env, mmu_idx)->raw_tcr & TTBCR_EAE)) {
> +return true;
> +}
> +return false;
> +}
> +
> +static inline bool regime_is_user(CPUARMState *env, ARMMMUIdx mmu_idx)
> +{
> +switch (mmu_idx) {
> +case ARMMMUIdx_S1SE0:
> +case ARMMMUIdx_S1NSE0:
> +return true;
> +default:
> +return false;
> +}
> +}
> +
>  /* Check section/page access permissions.
> Returns the page protection flags, or zero if the access is not
> permitted.  */
> -static inline int check_ap(CPUARMState *env, int ap, int domain_prot,
> -   int access_type, int is_user)
> +static inline int check_ap(CPUARMState *env, ARMMMUIdx mmu_idx,
> +   int ap, int domain_prot,
> +   int access_type)
>  {
>int prot_ro;
> +  bool is_user = regime_is_user(env, mmu_idx);
>
>if (domain_prot == 3) {
>  return PAGE_READ | PAGE_WRITE;
> @@ -4580,7 +4655,7 @@ static inline int check_ap(CPUARMState *env, int ap,
> int domain_prot,
>}
>if (access_type == 1)
>return 0;
> -  switch (A32_BANKED_CURRENT_REG_GET(env, sctlr) & (SCTLR_S |
> SCTLR_R)) {
> +  switch (regime_sctlr(env, mmu_idx) & (SCTLR_S | SCTLR_R)) {
>case SCTLR_S:
>return is_user ? 0 : PAGE_READ;
>case SCTLR_R:
> @@ -4612,35 +4687,32 @@ static inline int check_ap(CPUARMState *env, int
> ap, int domain_prot,
>}
>  }
>
> -static bool get_level1_table_address(CPUARMState *env, uint32_t *table,
> - uint32_t address)
> +static bool get_level1_table_address(CPUARMState *env, ARMMMUIdx mmu_idx,
> + uint32_t *table, uint32_t address)
>  {
> -/* Get the TCR bank based on our security state */
> -TCR *tcr = &env->cp15.tcr_el[arm_is_secure(env) ? 3 : 1]

Re: [Qemu-devel] [PATCH RESEND 49/50] iotests: More options for VM.add_drive()

2015-01-28 Thread Max Reitz


On 2015-01-28 at 16:27, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

This patch allows specifying the interface to be used for the drive, and
makes specifying a path optional (if the path is None, the "file" option
will be omitted, thus creating an empty drive).

Signed-off-by: Max Reitz 
---
  tests/qemu-iotests/iotests.py | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

+
+if not path is None:
+options.append('file=%s' % path)

Works, but it is more idiomatic to use 'if path is not None:', based on
a grep of the source tree.


Did I ever mention I don't know Python very well?

That sounds a lot better, will do, thanks.


With that minor switch,
Reviewed-by: Eric Blake 


On to the last patch! :-)

Max

Re: [Qemu-devel] [PATCH RESEND 49/50] iotests: More options for VM.add_drive()

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> This patch allows specifying the interface to be used for the drive, and
> makes specifying a path optional (if the path is None, the "file" option
> will be omitted, thus creating an empty drive).
> 
> Signed-off-by: Max Reitz 
> ---
>  tests/qemu-iotests/iotests.py | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 

> +
> +if not path is None:
> +options.append('file=%s' % path)

Works, but it is more idiomatic to use 'if path is not None:', based on
a grep of the source tree.

With that minor switch,
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 48/50] hmp: Add read-only option to change command

2015-01-28 Thread Max Reitz


On 2015-01-28 at 16:22, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

Expose the new read-only option of 'blockdev-change-medium' for the
'change' HMP command.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
  hmp-commands.hx | 20 +---
  hmp.c   | 21 -
  2 files changed, 37 insertions(+), 4 deletions(-)


Already reviewed...

  
+@var{read-only} may be used to change the read-only status of the device. It

+accepts the following values:
+
+@table @var
+@item retain
+Retains the current status; this is the default.
+
+@item ro
+Makes the device read-only.
+
+@item rw
+Makes the device writable.

Hmm.  I suggested in 47/50 to rename the QMP enum values to something
longer, which would affect this on a rebase.  On the other hand, it
would be nice to make the HMP counterpart support BOTH spellings for
convenience ('read-write' for legibility, 'rw' for brevity in typing);
and I have no clue if tab completion starts to play a role either.  Up
to you if you want to add complexity or leave things simple and stupid
(the choices we make in HMP for user-friendliness have no bearing on
what libvirt will want to do, so I really have no strong preference).

...so my review still stands if nothing changes, and is probably worth
dropping if you respin to add user-friendly complexity.


I probably won't care for brevity too much. Again, if we want to allow 
shortcuts later on, it won't break compatibility so we don't really need 
to worry about it now (although I guess if we don't add it now, we won't 
add it ever).



  const char *arg = qdict_get_try_str(qdict, "arg");
+const char *read_only = qdict_get_try_str(qdict, "read-only");
+BlockdevChangeReadOnlyMode read_only_mode = 0;
  Error *err = NULL;
  
  if (strcmp(device, "vnc") == 0) {

+if (read_only) {
+monitor_printf(mon, "Parameter 'read-only' is invalid for VNC");

Hmm. In HMP, you don't type the parameter name (it is implicit based on
the rules for converting positional HMP arguments into dictionary
entries for use in the rest of the code).  In particular, that means
that if I type 'change vnc password rw', that means that 'rw' will be in
the "read-only" parameter position, even though it has nothing to do
with read-only.  [hmm - how did 'change vnc password XXX' work anyways,
since the .params for 'change' didn't allow a third argument before your
patch?]


Well, "help change" will tell you that that parameter is named 
"read-only", so I think mentioning the parameter name (instead of 
"garbage at the end of line" or something like that) is fine.


Max


At any rate, the idea I mentioned in the QMP patch about naming the
parameter 'read-mode' instead of 'read-only' might make for nicer error
messages.

Re: [Qemu-devel] [PATCH RESEND 48/50] hmp: Add read-only option to change command

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Expose the new read-only option of 'blockdev-change-medium' for the
> 'change' HMP command.
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> ---
>  hmp-commands.hx | 20 +---
>  hmp.c   | 21 -
>  2 files changed, 37 insertions(+), 4 deletions(-)
> 

Already reviewed...

>  
> +@var{read-only} may be used to change the read-only status of the device. It
> +accepts the following values:
> +
> +@table @var
> +@item retain
> +Retains the current status; this is the default.
> +
> +@item ro
> +Makes the device read-only.
> +
> +@item rw
> +Makes the device writable.

Hmm.  I suggested in 47/50 to rename the QMP enum values to something
longer, which would affect this on a rebase.  On the other hand, it
would be nice to make the HMP counterpart support BOTH spellings for
convenience ('read-write' for legibility, 'rw' for brevity in typing);
and I have no clue if tab completion starts to play a role either.  Up
to you if you want to add complexity or leave things simple and stupid
(the choices we make in HMP for user-friendliness have no bearing on
what libvirt will want to do, so I really have no strong preference).

...so my review still stands if nothing changes, and is probably worth
dropping if you respin to add user-friendly complexity.

>  const char *arg = qdict_get_try_str(qdict, "arg");
> +const char *read_only = qdict_get_try_str(qdict, "read-only");
> +BlockdevChangeReadOnlyMode read_only_mode = 0;
>  Error *err = NULL;
>  
>  if (strcmp(device, "vnc") == 0) {
> +if (read_only) {
> +monitor_printf(mon, "Parameter 'read-only' is invalid for VNC");

Hmm. In HMP, you don't type the parameter name (it is implicit based on
the rules for converting positional HMP arguments into dictionary
entries for use in the rest of the code).  In particular, that means
that if I type 'change vnc password rw', that means that 'rw' will be in
the "read-only" parameter position, even though it has nothing to do
with read-only.  [hmm - how did 'change vnc password XXX' work anyways,
since the .params for 'change' didn't allow a third argument before your
patch?]

At any rate, the idea I mentioned in the QMP patch about naming the
parameter 'read-mode' instead of 'read-only' might make for nicer error
messages.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 47/50] blockdev: Add read-only option to blockdev-change-medium

2015-01-28 Thread Max Reitz


On 2015-01-28 at 16:08, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

Add an option to qmp_blockdev_change_medium() which allows changing the
read-only status of the block device whose medium is changed.

Some drives do not have a inherently fixed read-only status; for
instance, floppy disks can be set read-only or writable independently of
the drive. Some users may find it useful to be able to therefore change
the read-only status of a block device when changing the medium.

Signed-off-by: Max Reitz 
---
  blockdev.c   | 25 -
  hmp.c|  2 +-
  qapi/block-core.json | 24 +++-
  qmp-commands.hx  | 24 +++-
  qmp.c|  3 ++-
  5 files changed, 73 insertions(+), 5 deletions(-)

  
  ##

+# @BlockdevChangeReadOnlyMode:
+#
+# Specifies the new read-only mode of a block device subject to the
+# @blockdev-change-medium command.
+#
+# @retain:  Retains the current read-only mode
+#
+# @ro:  Makes the device read-only
+#
+# @rw:  Makes the device writable
+#
+# Since: 2.3
+##
+{ 'enum': 'BlockdevChangeReadOnlyMode',
+  'data': ['retain', 'ro', 'rw'] }

Bike-shedding; would 'read-only' and 'read-write' look any better than
abbreviations?  Doesn't affect functionality, though.


I don't mind either way.


+
+
+##
  # @blockdev-change-medium:
  #
  # Changes the medium inserted into a block device by ejecting the current 
medium
@@ -1799,12 +1817,16 @@
  # @format:  #optional, format to open the new image with (defaults to the
  #   probed format)
  #
+# @read-only:   #optional, change the read-only mode of the device; defaults to
+#   'retain'

"read-only":"rw" looks weird.  Maybe naming this "read-mode" instead of
"read-only" would help.  Again, bikeshedding that doesn't affect
functionality, but worth considering for the interface cleanliness.


Well, actually it's write-mode, because reading will always be possible. :-)

"access" would be another possibility, or "read-only-mode".


So functionally, if nothing changes, you can add:
Reviewed-by: Eric Blake 

But if you change the interface on a respin, drop my R-b to make sure I
check and still like the new naming conventions.


Understood.

Max

Re: [Qemu-devel] [PATCH RESEND 45/50] qmp: Introduce blockdev-change-medium

2015-01-28 Thread Max Reitz


On 2015-01-28 at 16:01, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

Introduce a new QMP command 'blockdev-change-medium' which is intended
to replace the 'change' command for block devices. The existing function
qmp_change_blockdev() is accordingly renamed to
qmp_blockdev_change_medium().

Signed-off-by: Max Reitz 
---
  blockdev.c|  7 ---
  include/sysemu/blockdev.h |  2 --
  qapi-schema.json  |  3 ++-
  qapi/block-core.json  | 23 +++
  qmp-commands.hx   | 31 +++
  qmp.c |  2 +-
  6 files changed, 61 insertions(+), 7 deletions(-)

+++ b/qapi-schema.json
@@ -1649,7 +1649,8 @@
  #  device between when these calls are executed is undefined.
  #
  # Notes:  It is strongly recommended that this interface is not used 
especially
-# for changing block devices.
+# for changing block devices.  Please use blockdev-change-medium
+# instead (for VNC, please use change-vnc-password).

Not grammatically wrong, but still feels a bit awkward.  Maybe better
worded as:

This interface is deprecated, and it is strongly recommended that you
avoid using it.  For changing block devices, use blockdev-change-medium;
for changing VNC parameters, use change-vnc-password.


Thanks, I'll change it.


  ##
+# @blockdev-change-medium:
+#
+# Changes the medium inserted into a block device by ejecting the current 
medium
+# and loading a new image file which is inserted as the new medium (this 
command
+# combines blockdev-open-tray, blockdev-remove-medium, blockdev-insert-medium
+# and blockdev-close-tray).
+#
+# @device:  block device name
+#
+# @filename:filename of the new image to be loaded
+#
+# @format:  #optional, format to open the new image with (defaults to the
+#   probed format)
+#
+# Since: 2.3
+##
+{ 'command': 'blockdev-change-medium',
+  'data': { 'device': 'str',
+'filename': 'str',
+'*format': 'str' } }

Intentional that there is no way to specify 'force'?  I can live with
that (force is a sledgehammer; and someone that can justify using it can
just use the lower-level functions themselves.  No need to bloat the
nice wrapper interface with something that is usually not needed).


Yes, I'd rather drop 'force' here.


I'm unclear on whether this command will wait for the tray open to
happen, or if it fails in the middle if the tray was locked and a
request sent to the guest but the guest did not act fast enough to
unlock things.  As such, I'm not quite sure if this interface is missing
any parameters.


Well, I'd rather call this interface deprecated from the start (it's 
basically as deprecated as 'change' is, only that 'change' is even worse 
because it unites multiple different commands) than making it suitable 
for any situation.


There are two points from my perspective: First, this is mainly a 
replacement for 'change'. Since 'change' did not have a 'force' 
parameter, I don't see the need to have one here either. Second, we can 
always add additional parameters later (without breaking compatibility); 
for instance, in this series still the read-only mode parameter will be 
added.



+Examples:
+
+1. Change a removable medium
+
+-> { "execute": "blockdev-change-medium",
+ "arguments": { "device": "ide1-cd0",
+"filename": "/srv/images/Fedora-12-x86_64-DVD.iso",

Wow - you're still testing a Fedora 12 image?


No, I got that file name from the 'change' example. :-)


I'm a bit reluctant to mark this one reviewed, without more discussion.


No problem. I'll have to send a v2 anyway.

Max

Re: [Qemu-devel] [PATCH RESEND 47/50] blockdev: Add read-only option to blockdev-change-medium

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Add an option to qmp_blockdev_change_medium() which allows changing the
> read-only status of the block device whose medium is changed.
> 
> Some drives do not have a inherently fixed read-only status; for
> instance, floppy disks can be set read-only or writable independently of
> the drive. Some users may find it useful to be able to therefore change
> the read-only status of a block device when changing the medium.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c   | 25 -
>  hmp.c|  2 +-
>  qapi/block-core.json | 24 +++-
>  qmp-commands.hx  | 24 +++-
>  qmp.c|  3 ++-
>  5 files changed, 73 insertions(+), 5 deletions(-)
> 

>  
>  ##
> +# @BlockdevChangeReadOnlyMode:
> +#
> +# Specifies the new read-only mode of a block device subject to the
> +# @blockdev-change-medium command.
> +#
> +# @retain:  Retains the current read-only mode
> +#
> +# @ro:  Makes the device read-only
> +#
> +# @rw:  Makes the device writable
> +#
> +# Since: 2.3
> +##
> +{ 'enum': 'BlockdevChangeReadOnlyMode',
> +  'data': ['retain', 'ro', 'rw'] }

Bike-shedding; would 'read-only' and 'read-write' look any better than
abbreviations?  Doesn't affect functionality, though.

> +
> +
> +##
>  # @blockdev-change-medium:
>  #
>  # Changes the medium inserted into a block device by ejecting the current 
> medium
> @@ -1799,12 +1817,16 @@
>  # @format:  #optional, format to open the new image with (defaults to the
>  #   probed format)
>  #
> +# @read-only:   #optional, change the read-only mode of the device; defaults 
> to
> +#   'retain'

"read-only":"rw" looks weird.  Maybe naming this "read-mode" instead of
"read-only" would help.  Again, bikeshedding that doesn't affect
functionality, but worth considering for the interface cleanliness.

So functionally, if nothing changes, you can add:
Reviewed-by: Eric Blake 

But if you change the interface on a respin, drop my R-b to make sure I
check and still like the new naming conventions.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 46/50] hmp: Use blockdev-change-medium for change command

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Use separate code paths for the two overloaded functions of the 'change'
> HMP command, and invoke the 'blockdev-change-medium' QMP command if used
> on a block device (by calling qmp_blockdev_change_medium()).
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 
> ---
>  hmp.c | 27 +++
>  1 file changed, 15 insertions(+), 12 deletions(-)

Oh wow - this one survived untouched from an earlier posting :)

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 45/50] qmp: Introduce blockdev-change-medium

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Introduce a new QMP command 'blockdev-change-medium' which is intended
> to replace the 'change' command for block devices. The existing function
> qmp_change_blockdev() is accordingly renamed to
> qmp_blockdev_change_medium().
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c|  7 ---
>  include/sysemu/blockdev.h |  2 --
>  qapi-schema.json  |  3 ++-
>  qapi/block-core.json  | 23 +++
>  qmp-commands.hx   | 31 +++
>  qmp.c |  2 +-
>  6 files changed, 61 insertions(+), 7 deletions(-)
> 

> +++ b/qapi-schema.json
> @@ -1649,7 +1649,8 @@
>  #  device between when these calls are executed is undefined.
>  #
>  # Notes:  It is strongly recommended that this interface is not used 
> especially
> -# for changing block devices.
> +# for changing block devices.  Please use blockdev-change-medium
> +# instead (for VNC, please use change-vnc-password).

Not grammatically wrong, but still feels a bit awkward.  Maybe better
worded as:

This interface is deprecated, and it is strongly recommended that you
avoid using it.  For changing block devices, use blockdev-change-medium;
for changing VNC parameters, use change-vnc-password.

>  ##
> +# @blockdev-change-medium:
> +#
> +# Changes the medium inserted into a block device by ejecting the current 
> medium
> +# and loading a new image file which is inserted as the new medium (this 
> command
> +# combines blockdev-open-tray, blockdev-remove-medium, blockdev-insert-medium
> +# and blockdev-close-tray).
> +#
> +# @device:  block device name
> +#
> +# @filename:filename of the new image to be loaded
> +#
> +# @format:  #optional, format to open the new image with (defaults to the
> +#   probed format)
> +#
> +# Since: 2.3
> +##
> +{ 'command': 'blockdev-change-medium',
> +  'data': { 'device': 'str',
> +'filename': 'str',
> +'*format': 'str' } }

Intentional that there is no way to specify 'force'?  I can live with
that (force is a sledgehammer; and someone that can justify using it can
just use the lower-level functions themselves.  No need to bloat the
nice wrapper interface with something that is usually not needed).

I'm unclear on whether this command will wait for the tray open to
happen, or if it fails in the middle if the tray was locked and a
request sent to the guest but the guest did not act fast enough to
unlock things.  As such, I'm not quite sure if this interface is missing
any parameters.

> +Examples:
> +
> +1. Change a removable medium
> +
> +-> { "execute": "blockdev-change-medium",
> + "arguments": { "device": "ide1-cd0",
> +"filename": 
> "/srv/images/Fedora-12-x86_64-DVD.iso",

Wow - you're still testing a Fedora 12 image?

I'm a bit reluctant to mark this one reviewed, without more discussion.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 44/50] block: Inquire tray state before tray-moved events

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> blk_dev_change_media_cb() is called for all potential tray movements;
> however, it is possible to request closing the tray but nothing actually
> happening (on a floppy disk drive without a medium).
> 
> Thus, the actual tray status should be inquired before sending a
> tray-moved event (and an event should be sent whenever the status
> changed).
> 
> Checking @load is now superfluous; it was necessary because it was
> possible to change a medium without having explicitly opened the tray
> and closed it again (or it might have been possible, at least). This is
> no longer possible, though.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/block-backend.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 43/50] blockdev: Implement change with basic operations

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Implement 'change' on block devices by calling blockdev-open-tray,
> blockdev-remove-medium, blockdev-insert-medium (a variation of that
> which does not need a node-name) and blockdev-close-tray.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c | 191 
> +++--
>  1 file changed, 72 insertions(+), 119 deletions(-)
> 

> +
> +qmp_blockdev_open_tray(device, false, false, &err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +qmp_blockdev_remove_medium(device, &err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +qmp_blockdev_insert_anon_medium(device, medium_bs, &err);
> +if (err) {
> +error_propagate(errp, err);
> +return;
> +}
> +
> +qmp_blockdev_close_tray(device, errp);

So if we fail anywhere in the middle, the device is left in an
unspecified state, and a followup query-block would be needed to learn
where we failed.  I guess that's life when you use a high-level command
instead of chaining low-level ones yourself.  I can live with it.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 39/50] blockdev: Add blockdev-close-tray

2015-01-28 Thread Max Reitz


On 2015-01-28 at 14:58, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

Signed-off-by: Max Reitz 
---
  blockdev.c   | 22 ++
  qapi/block-core.json | 14 ++
  qmp-commands.hx  | 33 +
  3 files changed, 69 insertions(+)

+void qmp_blockdev_close_tray(const char *device, Error **errp)
+{
+BlockBackend *blk;
+
+blk = blk_by_name(device);
+if (!blk) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return;
+}
+
+if (!blk_dev_has_removable_media(blk)) {
+error_setg(errp, "Device '%s' is not removable", device);
+return;
+}
+
+if (!blk_dev_is_tray_open(blk)) {
+return;
+}

Is it worth documenting that this (and the one in 38/50) are intentional
no-ops if the tray is already in the desired state?


There's certainly no harm in documenting it, so I might as well just do it.

Max


Reviewed-by: Eric Blake

Re: [Qemu-devel] [PATCH RESEND 42/50] blockdev: Implement eject with basic operations

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Implement 'eject' by calling blockdev-open-tray and
> blockdev-remove-medium.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

Hmm. If you decide to enforce node-name only on the low-level command,
but still want this high-level command to support BDS->BB lookup for
convenience, you might need additional code here.  I don't know if that
is an argument in favor or against supporting BDS->BB as convenience at
the low level.  I don't have a strong enough opinion towards either
decision, so I'll let others chime in or let your original choice be
good enough.  Therefore:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 41/50] blockdev: Add blockdev-insert-medium

2015-01-28 Thread Max Reitz


On 2015-01-28 at 15:18, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

And a helper function for that which directly takes a pointer to the BDS
to be inserted instead of its node-name (which will be used for
implementing 'change' using blockdev-insert-medium).

Signed-off-by: Max Reitz 
---
  blockdev.c   | 43 +++
  qapi/block-core.json | 17 +
  qmp-commands.hx  | 38 ++
  3 files changed, 98 insertions(+)

+static void qmp_blockdev_insert_anon_medium(const char *device,
+BlockDriverState *bs, Error **errp)
+{
+BlockBackend *blk;
+
+blk = blk_by_name(device);
+if (!blk) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return;
+}
+
+if (!blk_dev_has_removable_media(blk)) {
+error_setg(errp, "Device '%s' is not removable", device);
+return;
+}
+
+if (!blk_dev_is_tray_open(blk)) {
+error_setg(errp, "Tray of device '%s' is not open", device);
+return;
+}
+
+if (blk_bs(blk)) {
+error_setg(errp, "There already is a medium in device '%s'", device);
+return;
+}

Good, you didn't implement hot-swap semantics of replacing an existing
medium (that gets too confusing; so I _like_ that you force the user to
consider all the steps through multiple low-level commands).



+
+Example (1):

I'll quit pointing it out; but if you clean up the useless (1) in one
patch, do it across the series.


Yes, will do.


+
+-> { "execute": "blockdev-add",
+ "arguments": { "options": { "id": "backend0",
+ "node-name": "node0",

Why is 'id' needed?


Oops, because I forgot removing it after I added the functionality for 
creating a BDS tree without a BB. I'll remove it, thanks for catching it.


Max


Isn't the point of this command sequence to create
a BDS tree that is NOT tied to a BB, and then use insert-medium to make
the association after the fact?  We don't need to create a BB named
'backend0' if we are immediately going to reuse 'node0' in a different
BB (true, we have somewhat anticipated the idea of sharing BDS tree
among multiple BB, but haven't quite turned that on before now).


+ "driver": "raw",
+ "file": { "driver": "file",
+   "filename": "fedora.iso" } } } }
+
+<- { "return": {} }
+
+-> { "execute": "blockdev-insert-medium",
+ "arguments": { "device": "ide1-cd0",
+"node-name": "node0" } }
+
+<- { "return": {} }

If you can either explain why you used 'id' in the example, or remove
that parameter, you can add:
Reviewed-by: Eric Blake

Re: [Qemu-devel] [PATCH RESEND 41/50] blockdev: Add blockdev-insert-medium

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> And a helper function for that which directly takes a pointer to the BDS
> to be inserted instead of its node-name (which will be used for
> implementing 'change' using blockdev-insert-medium).
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c   | 43 +++
>  qapi/block-core.json | 17 +
>  qmp-commands.hx  | 38 ++
>  3 files changed, 98 insertions(+)
> 

> +static void qmp_blockdev_insert_anon_medium(const char *device,
> +BlockDriverState *bs, Error 
> **errp)
> +{
> +BlockBackend *blk;
> +
> +blk = blk_by_name(device);
> +if (!blk) {
> +error_set(errp, QERR_DEVICE_NOT_FOUND, device);
> +return;
> +}
> +
> +if (!blk_dev_has_removable_media(blk)) {
> +error_setg(errp, "Device '%s' is not removable", device);
> +return;
> +}
> +
> +if (!blk_dev_is_tray_open(blk)) {
> +error_setg(errp, "Tray of device '%s' is not open", device);
> +return;
> +}
> +
> +if (blk_bs(blk)) {
> +error_setg(errp, "There already is a medium in device '%s'", device);
> +return;
> +}

Good, you didn't implement hot-swap semantics of replacing an existing
medium (that gets too confusing; so I _like_ that you force the user to
consider all the steps through multiple low-level commands).


> +
> +Example (1):

I'll quit pointing it out; but if you clean up the useless (1) in one
patch, do it across the series.

> +
> +-> { "execute": "blockdev-add",
> + "arguments": { "options": { "id": "backend0",
> + "node-name": "node0",

Why is 'id' needed?  Isn't the point of this command sequence to create
a BDS tree that is NOT tied to a BB, and then use insert-medium to make
the association after the fact?  We don't need to create a BB named
'backend0' if we are immediately going to reuse 'node0' in a different
BB (true, we have somewhat anticipated the idea of sharing BDS tree
among multiple BB, but haven't quite turned that on before now).

> + "driver": "raw",
> + "file": { "driver": "file",
> +   "filename": "fedora.iso" } } } }
> +
> +<- { "return": {} }
> +
> +-> { "execute": "blockdev-insert-medium",
> + "arguments": { "device": "ide1-cd0",
> +"node-name": "node0" } }
> +
> +<- { "return": {} }

If you can either explain why you used 'id' in the example, or remove
that parameter, you can add:
Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/1] block: change default memory alignment for block requests to 4096

2015-01-28 Thread Denis V. Lunev


On 28/01/15 23:07, Paolo Bonzini wrote:


On 28/01/2015 19:49, Denis V. Lunev wrote:

The following sequence
 int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
 for (i = 0; i < 10; i++)
 write(fd, buf, 4096);
performs 10% better if buf is aligned to 4096 bytes rather then to
512 bytes on HDD with 512/4096 logical/physical sector size.

The difference is quite reliable.

The 10% difference, however, is probably not enough to cover the cost of
providing a bounce buffer if a guest is (rightfully) using a 512-byte
aligned buffer: bs->bl.opt_mem_alignment is in fact badly named and it
should be bs->bl.min_mem_alignment instead.

Instead, you probably should patch bdrv_opt_mem_align to return at least
4096, and leave the detection logic intact.  This will let
qemu_blockalign return a properly aligned buffer to qemu-img and other
in-process allocations, without negatively affecting the guest.

Thanks,

Paolo

ok, this looks good to me :)



Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
  block.c   | 4 ++--
  block/raw-posix.c | 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index d45e4dd..bc5d1e7 100644
--- a/block.c
+++ b/block.c
@@ -543,7 +543,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
  bs->bl.max_transfer_length = bs->file->bl.max_transfer_length;
  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
  } else {
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = 4096;
  }
  
  if (bs->backing_hd) {

@@ -966,7 +966,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
  
  bs->open_flags = flags;

  bs->guest_block_size = 512;
-bs->request_alignment = 512;
+bs->request_alignment = 4096;
  bs->zero_beyond_eof = true;
  open_flags = bdrv_open_flags(bs, flags);
  bs->read_only = !(open_flags & BDRV_O_RDWR);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index ec38fee..d1b3388 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -266,7 +266,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  if (!s->buf_align) {
  size_t align;
  buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+for (align = 4096; align <= MAX_BLOCKSIZE; align <<= 1) {
  if (pread(fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
  s->buf_align = align;
  break;
@@ -278,7 +278,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  if (!bs->request_alignment) {
  size_t align;
  buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+for (align = 4096; align <= MAX_BLOCKSIZE; align <<= 1) {
  if (pread(fd, buf, align, 0) >= 0) {
  bs->request_alignment = align;
  break;

Re: [Qemu-devel] [PATCH RESEND 40/50] blockdev: Add blockdev-remove-medium

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c   | 25 +
>  qapi/block-core.json | 13 +
>  qmp-commands.hx  | 43 +++
>  3 files changed, 81 insertions(+)
> 

>  
> +void qmp_blockdev_remove_medium(const char *device, Error **errp)
> +{

> +if (!blk_dev_is_tray_open(blk)) {
> +error_setg(errp, "Tray of device '%s' is not open", device);
> +return;
> +}
> +
> +if (blk_bs(blk)) {
> +blk_remove_bs(blk);

Another intentional no-op if there is already no medium; worth
documenting alongside the other touchups.

>  
> +##
> +# @blockdev-remove-medium:
> +#
> +# Removes a medium (a block driver state tree) from a block device. That 
> block
> +# device's tray must currently be open.
> +#
> +# @device: block device name
> +#
> +# Since: 2.3
> +##
> +{ 'command': 'blockdev-remove-medium',
> +  'data': { 'device': 'str' } }

Just thinking aloud - obviously, this is a device operation.  But do we
want to allow specifying the device by the node-name of the BDS being
removed?  (I suppose the same applies to 38 [open a tray by the name of
the BDS in the tray] and 39 [close a tray that has the given BDS
inserted]).  But I'm fine with naming the parameter 'device', even if we
allow for a BDS->BB lookup when actually resolving the user's input,
since that would only be a convenience (and not like other block API
that specifically operate on nodes of a BDS tree rather than a device).

Furthermore, the counterpart command for inserting a medium (later in
this series) is one case where we CAN'T do the BDS->BB lookup
(generally, insertion will fail if a BDS node is already in the BB
device, unless you implement swap semantics, but that would make it
confusing to insert one BDS into the device referenced by another
BDS->BB lookup); and symmetry argues that if that command supports ONLY
a BB name, then all of the related commands are just fine using 'device'
as their parameter name to imply BB name.

So I'm fine with the naming you've used so far.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/1] block: change default memory alignment for block requests to 4096

2015-01-28 Thread Paolo Bonzini



On 28/01/2015 19:49, Denis V. Lunev wrote:
> The following sequence
> int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
> for (i = 0; i < 10; i++)
> write(fd, buf, 4096);
> performs 10% better if buf is aligned to 4096 bytes rather then to
> 512 bytes on HDD with 512/4096 logical/physical sector size.
> 
> The difference is quite reliable.

The 10% difference, however, is probably not enough to cover the cost of
providing a bounce buffer if a guest is (rightfully) using a 512-byte
aligned buffer: bs->bl.opt_mem_alignment is in fact badly named and it
should be bs->bl.min_mem_alignment instead.

Instead, you probably should patch bdrv_opt_mem_align to return at least
4096, and leave the detection logic intact.  This will let
qemu_blockalign return a properly aligned buffer to qemu-img and other
in-process allocations, without negatively affecting the guest.

Thanks,

Paolo

> Signed-off-by: Denis V. Lunev 
> CC: Kevin Wolf 
> CC: Stefan Hajnoczi 
> ---
>  block.c   | 4 ++--
>  block/raw-posix.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index d45e4dd..bc5d1e7 100644
> --- a/block.c
> +++ b/block.c
> @@ -543,7 +543,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error 
> **errp)
>  bs->bl.max_transfer_length = bs->file->bl.max_transfer_length;
>  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
>  } else {
> -bs->bl.opt_mem_alignment = 512;
> +bs->bl.opt_mem_alignment = 4096;
>  }
>  
>  if (bs->backing_hd) {
> @@ -966,7 +966,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
> BlockDriverState *file,
>  
>  bs->open_flags = flags;
>  bs->guest_block_size = 512;
> -bs->request_alignment = 512;
> +bs->request_alignment = 4096;
>  bs->zero_beyond_eof = true;
>  open_flags = bdrv_open_flags(bs, flags);
>  bs->read_only = !(open_flags & BDRV_O_RDWR);
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index ec38fee..d1b3388 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -266,7 +266,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
> fd, Error **errp)
>  if (!s->buf_align) {
>  size_t align;
>  buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
> -for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
> +for (align = 4096; align <= MAX_BLOCKSIZE; align <<= 1) {
>  if (pread(fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
>  s->buf_align = align;
>  break;
> @@ -278,7 +278,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
> fd, Error **errp)
>  if (!bs->request_alignment) {
>  size_t align;
>  buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
> -for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
> +for (align = 4096; align <= MAX_BLOCKSIZE; align <<= 1) {
>  if (pread(fd, buf, align, 0) >= 0) {
>  bs->request_alignment = align;
>  break;
>

Re: [Qemu-devel] [PATCH RESEND 38/50] blockdev: Add blockdev-open-tray

2015-01-28 Thread Max Reitz


On 2015-01-28 at 14:56, Eric Blake wrote:

On 01/27/2015 12:46 PM, Max Reitz wrote:

Signed-off-by: Max Reitz 
---
  blockdev.c   | 48 
  qapi/block-core.json | 21 +
  qmp-commands.hx  | 37 +
  3 files changed, 106 insertions(+)

+Example (1):
+

With only one example, the (1) doesn't add much.  Minor whether you
leave it in or take it out.


I'll remove it.

Max


+-> { "execute": "blockdev-open-tray",
+ "arguments": { "device": "ide1-cd0" } }
+
+<- { "timestamp": { "seconds": 1418751016,
+"microseconds": 716996 },
+ "event": "DEVICE_TRAY_MOVED",
+ "data": { "device": "ide1-cd0",
+   "tray-open": true } }
+
+<- { "return": {} }

Nice inclusion of the event alongside the return code!

Reviewed-by: Eric Blake

Re: [Qemu-devel] [PATCH v2 1/1] qemu-img: Add QEMU_PKGVERSION to QEMU_IMG_VERSION

2015-01-28 Thread Don Slutz

Ping.

On 01/09/15 11:21, Eric Blake wrote:
> On 01/09/2015 08:17 AM, Don Slutz wrote:
>> This is the same way vl.c handles this.
>>
>> Signed-off-by: Don Slutz 
>> ---
>>  qemu-img.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Reviewed-by: Eric Blake 
> 
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index 7876258..b6e9220 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -35,7 +35,7 @@
>>  #include "block/qapi.h"
>>  #include 
>>  
>> -#define QEMU_IMG_VERSION "qemu-img version " QEMU_VERSION \
>> +#define QEMU_IMG_VERSION "qemu-img version " QEMU_VERSION QEMU_PKGVERSION \
>>", Copyright (c) 2004-2008 Fabrice Bellard\n"
>>  
>>  typedef struct img_cmd_t {
>>
>

Re: [Qemu-devel] [PATCH 1/1] block: change default memory alignment for block requests to 4096

2015-01-28 Thread Denis V. Lunev


On 28/01/15 21:49, Denis V. Lunev wrote:

The following sequence
 int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
 for (i = 0; i < 10; i++)
 write(fd, buf, 4096);
performs 10% better if buf is aligned to 4096 bytes rather then to
512 bytes on HDD with 512/4096 logical/physical sector size.

The difference is quite reliable.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Stefan Hajnoczi 
---
  block.c   | 4 ++--
  block/raw-posix.c | 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index d45e4dd..bc5d1e7 100644
--- a/block.c
+++ b/block.c
@@ -543,7 +543,7 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
  bs->bl.max_transfer_length = bs->file->bl.max_transfer_length;
  bs->bl.opt_mem_alignment = bs->file->bl.opt_mem_alignment;
  } else {
-bs->bl.opt_mem_alignment = 512;
+bs->bl.opt_mem_alignment = 4096;
  }
  
  if (bs->backing_hd) {

@@ -966,7 +966,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
  
  bs->open_flags = flags;

  bs->guest_block_size = 512;
-bs->request_alignment = 512;
+bs->request_alignment = 4096;
  bs->zero_beyond_eof = true;
  open_flags = bdrv_open_flags(bs, flags);
  bs->read_only = !(open_flags & BDRV_O_RDWR);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index ec38fee..d1b3388 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -266,7 +266,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  if (!s->buf_align) {
  size_t align;
  buf = qemu_memalign(MAX_BLOCKSIZE, 2 * MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+for (align = 4096; align <= MAX_BLOCKSIZE; align <<= 1) {
  if (pread(fd, buf + align, MAX_BLOCKSIZE, 0) >= 0) {
  s->buf_align = align;
  break;
@@ -278,7 +278,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
fd, Error **errp)
  if (!bs->request_alignment) {
  size_t align;
  buf = qemu_memalign(s->buf_align, MAX_BLOCKSIZE);
-for (align = 512; align <= MAX_BLOCKSIZE; align <<= 1) {
+for (align = 4096; align <= MAX_BLOCKSIZE; align <<= 1) {
  if (pread(fd, buf, align, 0) >= 0) {
  bs->request_alignment = align;
  break;

sorry, the patch is wrong. It breaks 'make check-block'.
I will redo it and perform more testing.

request-alignment related changes are wrong :(
I have run tests without them but added them as
a obvious last minute addition.

Re: [Qemu-devel] [PATCH RESEND 39/50] blockdev: Add blockdev-close-tray

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c   | 22 ++
>  qapi/block-core.json | 14 ++
>  qmp-commands.hx  | 33 +
>  3 files changed, 69 insertions(+)
> 

> +void qmp_blockdev_close_tray(const char *device, Error **errp)
> +{
> +BlockBackend *blk;
> +
> +blk = blk_by_name(device);
> +if (!blk) {
> +error_set(errp, QERR_DEVICE_NOT_FOUND, device);
> +return;
> +}
> +
> +if (!blk_dev_has_removable_media(blk)) {
> +error_setg(errp, "Device '%s' is not removable", device);
> +return;
> +}
> +
> +if (!blk_dev_is_tray_open(blk)) {
> +return;
> +}

Is it worth documenting that this (and the one in 38/50) are intentional
no-ops if the tray is already in the desired state?

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 38/50] blockdev: Add blockdev-open-tray

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c   | 48 
>  qapi/block-core.json | 21 +
>  qmp-commands.hx  | 37 +
>  3 files changed, 106 insertions(+)
> 

> +Example (1):
> +

With only one example, the (1) doesn't add much.  Minor whether you
leave it in or take it out.

> +-> { "execute": "blockdev-open-tray",
> + "arguments": { "device": "ide1-cd0" } }
> +
> +<- { "timestamp": { "seconds": 1418751016,
> +"microseconds": 716996 },
> + "event": "DEVICE_TRAY_MOVED",
> + "data": { "device": "ide1-cd0",
> +   "tray-open": true } }
> +
> +<- { "return": {} }

Nice inclusion of the event alongside the return code!

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 37/50] block: Add blk_remove_bs()

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> This function removes the BlockDriverState associated with the given
> BlockBackend from that BB and sets the BDS pointer in the BB to NULL.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/block-backend.c  | 22 +-
>  include/sysemu/block-backend.h |  1 +
>  2 files changed, 22 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [Bug 1292234] Re: qcow2 image corruption on non-extent filesystems (ext3)

2015-01-28 Thread Chris J Arges

** No longer affects: qemu

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1292234

Title:
  qcow2 image corruption on non-extent filesystems (ext3)

Status in qemu package in Ubuntu:
  In Progress

Bug description:
  The security team uses a tool (http://bazaar.launchpad.net/~ubuntu-
  bugcontrol/ubuntu-qa-tools/master/view/head:/vm-tools/uvt) that uses
  libvirt snapshots quite a bit. I noticed after upgrading to trusty
  some time ago that qemu 1.7 (and the qemu 2.0 in the candidate ppa)
  has had stability problems such that the disk/partition table seems to
  be corrupted after removing a libvirt snapshot and then creating
  another with the same name. I don't have a very simple reproducer, but
  had enough that hallyn suggested I file a bug. First off:

  qemu-kvm 2.0~git-20140307.4c288ac-0ubuntu2

  $ cat /proc/version_signature
  Ubuntu 3.13.0-16.36-generic 3.13.5

  $ qemu-img info ./forhallyn-trusty-amd64.img
  image: ./forhallyn-trusty-amd64.img
  file format: qcow2
  virtual size: 8.0G (8589934592 bytes)
  disk size: 4.0G
  cluster_size: 65536
  Format specific information:
  compat: 0.10

  Steps to reproduce:
  1. create a virtual machine. For a simplified reproducer, I used virt-manager 
with:
    OS type: Linux
    Version: Ubuntu 14.04
    Memory: 768
    CPUs: 1

    Select managed or existing (Browse, new volume)
  Create a new storage volume:
    qcow2
    Max capacity: 8192
    Allocation: 0

    Advanced:
  NAT
  kvm
  x86_64
  firmware: default

  2. install a VM. I used trusty-desktop-amd64.iso from Jan 23 since it
  seems like I can hit the bug more reliably if I have lots of updates
  in a dist-upgrade. I have seen this with lucid-trusty guests that are
  i386 and amd64. After the install, reboot and then cleanly shutdown

  3. Backup the image file somewhere since steps 1 and 2 take a while :)

  4. Execute the following commands which are based on what our uvt tool
  does:

  $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"
  $ virsh snapshot-current --name forhallyn-trusty-amd64
  pristine
  $ virsh start forhallyn-trusty-amd64
  $ virsh snapshot-list forhallyn-trusty-amd64 # this is showing as shutoff 
after start, this might be different with qemu 1.5

  in guest:
  sudo apt-get update
  sudo apt-get dist-upgrade
  780 upgraded...
  shutdown -h now

  $ virsh snapshot-delete forhallyn-trusty-amd64 pristine --children
  $ virsh snapshot-create-as forhallyn-trusty-amd64 pristine "uvt snapshot"

  $ virsh start forhallyn-trusty-amd64 # this command works, but there
  is often disk corruption

  The idea behind the above is to create a new VM with a pristine
  snapshot that we could revert later if we wanted. Instead, we boot the
  VM, run apt-get dist-upgrade, cleanly shutdown and then remove the old
  'pristine' snapshot and create a new 'pristine' snapshot. The
  intention is to update the VM and the pristine snapshot so that when
  we boot the next time, we boot from the updated VM and can revert back
  to the updated VM.

  After running 'virsh start' after doing snapshot-delete/snapshot-
  create-as, the disk may be corrupted. This can be seen with grub
  failing to find .mod files, the kernel not booting, init failing, etc.

  This does not seem to be related to the machine type used. Ie, pc-
  i440fx-1.5, pc-i440fx-1.7 and pc-i440fx-2.0 all fail with qemu 2.0,
  pc-i440fx-1.5 and pc-i440fx-1.7 fail with qemu 1.7 and pc-i440fx-1.5
  works fine with qemu 1.5.

  Only workaround I know if is to downgrade qemu to 1.5.0+dfsg-
  3ubuntu5.4 from Ubuntu 13.10.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1292234/+subscriptions

Re: [Qemu-devel] [PATCH v5 2/2] Xen: Use the ioreq-server API when available

2015-01-28 Thread Don Slutz

On 12/05/14 05:50, Paul Durrant wrote:
> The ioreq-server API added to Xen 4.5 offers better security than
> the existing Xen/QEMU interface because the shared pages that are
> used to pass emulation request/results back and forth are removed
> from the guest's memory space before any requests are serviced.
> This prevents the guest from mapping these pages (they are in a
> well known location) and attempting to attack QEMU by synthesizing
> its own request structures. Hence, this patch modifies configure
> to detect whether the API is available, and adds the necessary
> code to use the API if it is.

This patch (which is now on xenbits qemu staging) is causing me
issues.

So far I have tracked it back to hvm_select_ioreq_server()
which selects the "default_ioreq_server".  Since I have one 1
QEMU, it is both the "default_ioreq_server" and an enabled
2nd ioreq_server.  I am continuing to understand why my changes
are causing this.  More below.

This patch causes QEMU to only call xc_evtchn_bind_interdomain()
for the enabled 2nd ioreq_server.  So when (if)
hvm_select_ioreq_server() selects the "default_ioreq_server", the
guest hangs on an I/O.

Using the debug key 'e':

(XEN) [2015-01-28 18:57:07] 'e' pressed -> dumping event-channel info
(XEN) [2015-01-28 18:57:07] Event channel information for domain 0:
(XEN) [2015-01-28 18:57:07] Polling vCPUs: {}
(XEN) [2015-01-28 18:57:07] port [p/m/s]
(XEN) [2015-01-28 18:57:07]1 [0/0/0]: s=5 n=0 x=0 v=0
(XEN) [2015-01-28 18:57:07]2 [0/0/0]: s=6 n=0 x=0
(XEN) [2015-01-28 18:57:07]3 [0/0/0]: s=6 n=0 x=0
(XEN) [2015-01-28 18:57:07]4 [0/0/0]: s=5 n=0 x=0 v=1
(XEN) [2015-01-28 18:57:07]5 [0/0/0]: s=6 n=0 x=0
(XEN) [2015-01-28 18:57:07]6 [0/0/0]: s=6 n=0 x=0
(XEN) [2015-01-28 18:57:07]7 [0/0/0]: s=5 n=1 x=0 v=0
(XEN) [2015-01-28 18:57:07]8 [0/0/0]: s=6 n=1 x=0
(XEN) [2015-01-28 18:57:07]9 [0/0/0]: s=6 n=1 x=0
(XEN) [2015-01-28 18:57:07]   10 [0/0/0]: s=5 n=1 x=0 v=1
(XEN) [2015-01-28 18:57:07]   11 [0/0/0]: s=6 n=1 x=0
(XEN) [2015-01-28 18:57:07]   12 [0/0/0]: s=6 n=1 x=0
(XEN) [2015-01-28 18:57:07]   13 [0/0/0]: s=5 n=2 x=0 v=0
(XEN) [2015-01-28 18:57:07]   14 [0/0/0]: s=6 n=2 x=0
(XEN) [2015-01-28 18:57:07]   15 [0/0/0]: s=6 n=2 x=0
(XEN) [2015-01-28 18:57:07]   16 [0/0/0]: s=5 n=2 x=0 v=1
(XEN) [2015-01-28 18:57:07]   17 [0/0/0]: s=6 n=2 x=0
(XEN) [2015-01-28 18:57:07]   18 [0/0/0]: s=6 n=2 x=0
(XEN) [2015-01-28 18:57:07]   19 [0/0/0]: s=5 n=3 x=0 v=0
(XEN) [2015-01-28 18:57:07]   20 [0/0/0]: s=6 n=3 x=0
(XEN) [2015-01-28 18:57:07]   21 [0/0/0]: s=6 n=3 x=0
(XEN) [2015-01-28 18:57:07]   22 [0/0/0]: s=5 n=3 x=0 v=1
(XEN) [2015-01-28 18:57:07]   23 [0/0/0]: s=6 n=3 x=0
(XEN) [2015-01-28 18:57:07]   24 [0/0/0]: s=6 n=3 x=0
(XEN) [2015-01-28 18:57:07]   25 [0/0/0]: s=5 n=4 x=0 v=0
(XEN) [2015-01-28 18:57:07]   26 [0/0/0]: s=6 n=4 x=0
(XEN) [2015-01-28 18:57:07]   27 [0/0/0]: s=6 n=4 x=0
(XEN) [2015-01-28 18:57:07]   28 [0/0/0]: s=5 n=4 x=0 v=1
(XEN) [2015-01-28 18:57:07]   29 [0/0/0]: s=6 n=4 x=0
(XEN) [2015-01-28 18:57:07]   30 [0/0/0]: s=6 n=4 x=0
(XEN) [2015-01-28 18:57:07]   31 [0/0/0]: s=5 n=5 x=0 v=0
(XEN) [2015-01-28 18:57:07]   32 [0/0/0]: s=6 n=5 x=0
(XEN) [2015-01-28 18:57:07]   33 [0/0/0]: s=6 n=5 x=0
(XEN) [2015-01-28 18:57:07]   34 [0/0/0]: s=5 n=5 x=0 v=1
(XEN) [2015-01-28 18:57:07]   35 [0/0/0]: s=6 n=5 x=0
(XEN) [2015-01-28 18:57:07]   36 [0/0/0]: s=6 n=5 x=0
(XEN) [2015-01-28 18:57:07]   37 [0/0/0]: s=5 n=6 x=0 v=0
(XEN) [2015-01-28 18:57:07]   38 [0/0/0]: s=6 n=6 x=0
(XEN) [2015-01-28 18:57:07]   39 [0/0/0]: s=6 n=6 x=0
(XEN) [2015-01-28 18:57:07]   40 [0/0/0]: s=5 n=6 x=0 v=1
(XEN) [2015-01-28 18:57:07]   41 [0/0/0]: s=6 n=6 x=0
(XEN) [2015-01-28 18:57:07]   42 [0/0/0]: s=6 n=6 x=0
(XEN) [2015-01-28 18:57:07]   43 [0/0/0]: s=5 n=7 x=0 v=0
(XEN) [2015-01-28 18:57:07]   44 [0/0/0]: s=6 n=7 x=0
(XEN) [2015-01-28 18:57:07]   45 [0/0/0]: s=6 n=7 x=0
(XEN) [2015-01-28 18:57:07]   46 [0/0/0]: s=5 n=7 x=0 v=1
(XEN) [2015-01-28 18:57:07]   47 [0/0/0]: s=6 n=7 x=0
(XEN) [2015-01-28 18:57:07]   48 [0/0/0]: s=6 n=7 x=0
(XEN) [2015-01-28 18:57:07]   49 [0/0/0]: s=3 n=0 x=0 d=0 p=58
(XEN) [2015-01-28 18:57:07]   50 [0/0/0]: s=5 n=0 x=0 v=9
(XEN) [2015-01-28 18:57:07]   51 [0/0/0]: s=4 n=0 x=0 p=9 i=9
(XEN) [2015-01-28 18:57:07]   52 [0/0/0]: s=5 n=0 x=0 v=2
(XEN) [2015-01-28 18:57:07]   53 [0/0/0]: s=4 n=4 x=0 p=16 i=16
(XEN) [2015-01-28 18:57:07]   54 [0/0/0]: s=4 n=0 x=0 p=17 i=17
(XEN) [2015-01-28 18:57:07]   55 [0/0/0]: s=4 n=6 x=0 p=18 i=18
(XEN) [2015-01-28 18:57:07]   56 [0/0/0]: s=4 n=0 x=0 p=8 i=8
(XEN) [2015-01-28 18:57:07]   57 [0/0/0]: s=4 n=0 x=0 p=19 i=19
(XEN) [2015-01-28 18:57:07]   58 [0/0/0]: s=3 n=0 x=0 d=0 p=49
(XEN) [2015-01-28 18:57:07]   59 [0/0/0]: s=5 n=0 x=0 v=3
(XEN) [2015

Re: [Qemu-devel] [PATCH RESEND 36/50] blockdev: Allow more options for BB-less BDS tree

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Most of the options which blockdev_init() parses for both the
> BlockBackend and the root BDS are valid for just the root BDS as well
> (e.g. read-only). This patch allows specifying these options even if not
> creating a BlockBackend.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c | 156 
> ++---
>  1 file changed, 150 insertions(+), 6 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH RESEND 35/50] blockdev: Pull out blockdev option extraction

2015-01-28 Thread Eric Blake

On 01/27/2015 12:46 PM, Max Reitz wrote:
> Extract some of the blockdev option extraction code from blockdev_init()
> into an own function. This simplifies blockdev_init() and will allow

s/an/its/

> reusing the code in a different function added in a follow-up patch.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c | 201 
> +
>  1 file changed, 108 insertions(+), 93 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

1 2 3 >

1 - 100 of 276 matches

Mail list logo