Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Peter Crosthwaite
Hi Pavel,

On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk
pavel.dovga...@ispras.ru wrote:
 Hello!

 We want to publish set of patches related to the reverse execution and 
 deterministic replay of qemu.
 Our implementation of deterministic replay can be used for deterministic and 
 reverse debugging of
 guest code through gdb remote interface.

 Execution recording writes non-deterministic events log, which can be later 
 used for replaying the
 execution anywhere and for unlimited number of times. It also supports 
 checkpointing for faster
 rewinding during reverse debugging. Execution replaying reads the log and 
 replays all
 non-deterministic events including external input, hardware clocks, and 
 interrupts.

 Reverse execution has the following features:
  * Deterministically replays whole system execution and all contents of the 
 memory,
state of the hardware devices, clocks, and screen of the VM.
  * Writes execution log into the file for later replaying for multiple times
on different machines.
  * Supports i386, x86_64, and ARM hardware platforms.
  * Performs deterministic replay of all operations with keyboard, mouse, 
 network adapters,
audio devices, serial interfaces, and physical USB devices connected to 
 the emulator.
  * Provides support for gdb reverse debugging commands like reverse-step and 
 reverse-continue.
  * Supports auto-checkpointing for convenient reverse debugging.
  * Allows going to the live execution from the replay mode.
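
The record/replay idea in the feature list above can be illustrated with a minimal sketch. It is not taken from the patch series: the event kinds, the fixed-size log, and all function names are assumptions made for illustration. Each non-deterministic event is tagged with the instruction count at which it occurred; replay delivers it again once the guest reaches the same count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds; the real series may classify events differently. */
enum ev_kind { EV_INPUT, EV_CLOCK, EV_INTERRUPT };

struct event {
    uint64_t icount;    /* instructions executed when the event occurred */
    enum ev_kind kind;
    uint64_t value;     /* payload: key code, clock value, IRQ number, ... */
};

#define LOG_CAPACITY 64

struct event_log {
    struct event ev[LOG_CAPACITY];
    size_t len;         /* number of recorded events */
    size_t pos;         /* next event to deliver during replay */
};

/* Record mode: append one event. Returns 0, or -1 if the log is full
 * or the event would break the non-decreasing icount order. */
int log_record(struct event_log *l, uint64_t icount,
               enum ev_kind kind, uint64_t value)
{
    assert(l != NULL);
    if (l->len == LOG_CAPACITY) {
        return -1;
    }
    if (l->len > 0 && l->ev[l->len - 1].icount > icount) {
        return -1;
    }
    l->ev[l->len++] = (struct event){ icount, kind, value };
    return 0;
}

/* Replay mode: if the next logged event is due at or before icount,
 * copy it to *out and return 1; otherwise return 0. */
int log_replay_next(struct event_log *l, uint64_t icount, struct event *out)
{
    assert(l != NULL && out != NULL);
    if (l->pos < l->len && l->ev[l->pos].icount <= icount) {
        *out = l->ev[l->pos++];
        return 1;
    }
    return 0;
}

/* Restart replay from the beginning; a checkpoint would instead restore
 * pos together with the saved machine state. */
void log_rewind(struct event_log *l)
{
    assert(l != NULL);
    l->pos = 0;
}
```

Because the log is ordered by instruction count, the same file can be replayed any number of times; rewinding is just resetting the read position.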

 Our implementation is completely tested for qemu 1.5 and is in beta state for 
 2.0.50.

 Some details about our implementation of reverse execution can be found in 
 paper:
 http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html


Add relevant implementation details to the git commit messages.

 Can anyone review our patches?


Fred Konrad is doing a series on reverse exe at the moment. CC. Is this
an independent implementation of the same thing or are you building on
it?

I suggest posting a full RFC, this looks to me just like a cover
letter but without a series.

Note that we are going into hard freeze imminently so there will be
some delay for merge.

Regards,
Peter

 Pavel Dovgaluk






Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Pavel Dovgaluk
 -Original Message-
 From: peter.crosthwa...@petalogix.com 
 [mailto:peter.crosthwa...@petalogix.com] On Behalf Of
 Peter Crosthwaite
 Sent: Friday, June 27, 2014 10:11 AM
 To: Pavel Dovgaluk; Fréderic Konrad
 Cc: qemu-devel@nongnu.org Developers; Paolo Bonzini
 Subject: Re: [Qemu-devel] Reverse execution and deterministic replay
 
 Hi Pavel,
 
 On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk
 pavel.dovga...@ispras.ru wrote:
  Hello!
 
  We want to publish set of patches related to the reverse execution and 
  deterministic replay
 of qemu.
  Our implementation of deterministic replay can be used for deterministic 
  and reverse
 debugging of
  guest code through gdb remote interface.
 
  Execution recording writes non-deterministic events log, which can be later 
  used for
 replaying the
  execution anywhere and for unlimited number of times. It also supports 
  checkpointing for
 faster
  rewinding during reverse debugging. Execution replaying reads the log and 
  replays all
  non-deterministic events including external input, hardware clocks, and 
  interrupts.
 
  Reverse execution has the following features:
   * Deterministically replays whole system execution and all contents of the 
  memory,
 state of the hardware devices, clocks, and screen of the VM.
   * Writes execution log into the file for later replaying for multiple 
  times
 on different machines.
   * Supports i386, x86_64, and ARM hardware platforms.
   * Performs deterministic replay of all operations with keyboard, mouse, 
  network adapters,
 audio devices, serial interfaces, and physical USB devices connected to 
  the emulator.
   * Provides support for gdb reverse debugging commands like reverse-step 
  and reverse-
 continue.
   * Supports auto-checkpointing for convenient reverse debugging.
   * Allows going to the live execution from the replay mode.
 
  Our implementation is completely tested for qemu 1.5 and is in beta state 
  for 2.0.50.
 
  Some details about our implementation of reverse execution can be found in 
  paper:
  http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
 
 
 Add relevant implementation details to the git commit messages.

Do you mean describing the details in patches that I should submit?

  Can anyone review our patches?
 
 
 Fred Konrad is doing a series on reverse exe at the moment. CC. Is this
 an independent implementation of the same thing or are you building on
 it?

Our implementation is independent of Fred Konrad's series.

 I suggest posting a full RFC, this looks to me just like a cover
 letter but without a series.

Of course I will post a full RFC with details of implementation.

 
 Note that we are going into hard freeze imminently so there will be
 some delay for merge.

Pavel Dovgaluk




Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2

2014-06-27 Thread Kevin Wolf
Am 27.06.2014 um 06:59 hat Paolo Bonzini geschrieben:
 Il 27/06/2014 03:15, Ming Lei ha scritto:
 On Thu, Jun 26, 2014 at 11:57 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 We can implement (advisory) calls like bdrv_plug/bdrv_unplug in order to
 restore the previous levels of performance.
 
 Yes, that is also what I am thinking, or interfaces like bdrv_queue_io()
 and bdrv_submit_io(), which may match with aio interfaces.
 
 Would you like to try preparing a patch?

Note that there is already an interface in block.c that takes multiple
requests at once, bdrv_aio_multiwrite(). It is currently used by
virtio-blk, even though not in dataplane mode. It also submits
individual requests to the block drivers currently, so effectively it
doesn't make a difference, just the problem occurs in the block layer
instead of the device.

We should either improve bdrv_aio_multiwrite() to submit the requests in
a batch to the block drivers, add a bdrv_aio_multiwrite() and use it for
dataplane as well (possibly with a flag for disabling the request merging
if we want to keep the current behaviour for dataplane); or, if we
consider it a bad interface, replace it altogether with the new thing
even for normal virtio-blk.

If this makes a difference for dataplane, it probably makes a difference
for all block devices.

Kevin
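
The plug/unplug batching idea discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: `io_queue`, `queue_plug()`, `queue_unplug()` and `queue_io()` are hypothetical names, not existing QEMU block-layer functions, and the backend is reduced to a pair of counters.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BATCH 32

struct io_req {
    long sector;
    int nb_sectors;
};

/* Hypothetical per-device queue; not QEMU's BlockDriverState. */
struct io_queue {
    struct io_req pending[MAX_BATCH];
    size_t n_pending;
    int plugged;            /* nesting depth of plug calls */
    unsigned submissions;   /* how many times the backend was invoked */
    unsigned requests;      /* how many requests the backend received */
};

/* Stand-in for one batched submission to the backend. */
static void backend_submit(struct io_queue *q, size_t count)
{
    q->submissions++;
    q->requests += (unsigned)count;
}

static void queue_flush(struct io_queue *q)
{
    if (q->n_pending > 0) {
        backend_submit(q, q->n_pending);
        q->n_pending = 0;
    }
}

/* Advisory: the caller is about to queue several requests. */
void queue_plug(struct io_queue *q)
{
    q->plugged++;
}

/* The outermost unplug submits everything queued so far as one batch. */
void queue_unplug(struct io_queue *q)
{
    assert(q->plugged > 0);
    if (--q->plugged == 0) {
        queue_flush(q);
    }
}

/* Unplugged: submit immediately. Plugged: hold until unplug or until
 * the batch buffer is full. */
void queue_io(struct io_queue *q, struct io_req req)
{
    q->pending[q->n_pending++] = req;
    if (!q->plugged || q->n_pending == MAX_BATCH) {
        queue_flush(q);
    }
}
```

With this shape, a caller that never plugs keeps the current one-request-per-submission behaviour, which is why the calls can be purely advisory.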



[Qemu-devel] [PATCH 3/3] ppc/spapr: Fix MAX_CPUS to 255

2014-06-27 Thread Nikunj A Dadhania
MAX_CPUS 256 is inconsistent with qemu supporting up to 255 CPUs. This
MAX_CPUS number was percolated back to virsh capabilities with wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 33f77d2..eab0f5f 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -79,7 +79,7 @@
 
 #define TIMEBASE_FREQ   51200ULL
 
-#define MAX_CPUS                256
+#define MAX_CPUS                255
 
 #define PHANDLE_XICP0x
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 2/3] spapr: add uuid/host details to device tree

2014-06-27 Thread Nikunj A Dadhania
Useful for identifying the guest/host uniquely within the
guest. Adds the following properties to the guest root node:

vm,uuid - UUID of the guest
host-model - Host model number
host-serial - Host machine serial number
hypervisor - Hypervisor type (set to kvm when running under KVM)

Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
---
 hw/ppc/spapr.c   | 19 +++
 target-ppc/kvm.c | 44 +++-
 target-ppc/kvm_ppc.h | 12 
 3 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a8ba916..33f77d2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -319,6 +319,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
 unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
 uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
+char char_buf[512];
 
 add_str(hypertas, "hcall-pft");
 add_str(hypertas, "hcall-term");
@@ -348,6 +349,24 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 _FDT((fdt_property_string(fdt, "model", "IBM pSeries (emulated by qemu)")));
 _FDT((fdt_property_string(fdt, "compatible", "qemu,pseries")));
 
+if (kvm_enabled()) {
+_FDT((fdt_property_string(fdt, "hypervisor", "kvm")));
+}
+
+/*
+ * Add info to guest to identify which host it is being run on
+ * and what the uuid of the guest is
+ */
+memset(char_buf, 0, sizeof(char_buf));
+if (!kvmppc_get_host_model(char_buf, sizeof(char_buf))) {
+_FDT((fdt_property_string(fdt, "host-model", char_buf)));
+memset(char_buf, 0, sizeof(char_buf));
+}
+if (!kvmppc_get_host_serial(char_buf, sizeof(char_buf))) {
+_FDT((fdt_property_string(fdt, "host-serial", char_buf)));
+}
+_FDT((fdt_property(fdt, "vm,uuid", qemu_uuid, 16)));
+
 _FDT((fdt_property_cell(fdt, "#address-cells", 0x2)));
 _FDT((fdt_property_cell(fdt, "#size-cells", 0x2)));
 
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 2d87108..25091f8 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -1369,7 +1369,7 @@ static int read_cpuinfo(const char *field, char *value, 
int len)
 }
 
 do {
-if(!fgets(line, sizeof(line), f)) {
+if (!fgets(line, sizeof(line), f)) {
 break;
 }
 if (!strncmp(line, field, field_len)) {
@@ -1404,6 +1404,48 @@ uint32_t kvmppc_get_tbfreq(void)
 return retval;
 }
 
+int32_t kvmppc_get_host_serial(char *value, int len)
+{
+FILE *f;
+int ret = -1;
+char line[512];
+
+memset(line, 0, sizeof(line));
+f = fopen("/proc/device-tree/system-id", "r");
+if (!f) {
+return ret;
+}
+
+if (fgets(line, sizeof(line), f)) {
+snprintf(value, len, "IBM,%s", line);
+ret = 0;
+}
+fclose(f);
+
+return ret;
+}
+
+int32_t kvmppc_get_host_model(char *value, int len)
+{
+FILE *f;
+int ret = -1;
+char line[512];
+
+memset(line, 0, sizeof(line));
+f = fopen("/proc/device-tree/model", "r");
+if (!f) {
+return ret;
+}
+
+if (fgets(line, sizeof(line), f)) {
+snprintf(value, len, "IBM,%s", line);
+ret = 0;
+}
+fclose(f);
+
+return ret;
+}
+
 /* Try to find a device tree node for a CPU with clock-frequency property */
 static int kvmppc_find_cpu_dt(char *buf, int buf_len)
 {
diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
index 1118122..6fa3314 100644
--- a/target-ppc/kvm_ppc.h
+++ b/target-ppc/kvm_ppc.h
@@ -19,6 +19,8 @@ uint32_t kvmppc_get_tbfreq(void);
 uint64_t kvmppc_get_clockfreq(void);
 uint32_t kvmppc_get_vmx(void);
 uint32_t kvmppc_get_dfp(void);
+int32_t kvmppc_get_host_model(char *buf, int buf_len);
+int32_t kvmppc_get_host_serial(char *buf, int buf_len);
 int kvmppc_get_hasidle(CPUPPCState *env);
 int kvmppc_get_hypercall(CPUPPCState *env, uint8_t *buf, int buf_len);
 int kvmppc_set_interrupt(PowerPCCPU *cpu, int irq, int level);
@@ -60,6 +62,16 @@ static inline uint32_t kvmppc_get_tbfreq(void)
 return 0;
 }
 
+static inline int32_t kvmppc_get_host_model(char *buf, int buf_len)
+{
+return 0;
+}
+
+static inline int32_t kvmppc_get_host_serial(char *buf, int buf_len)
+{
+return 0;
+}
+
 static inline uint64_t kvmppc_get_clockfreq(void)
 {
 return 0;
-- 
1.8.3.1
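
The two helpers in the patch above share one pattern: read the first line of a device-tree file and store it with a prefix. A standalone sketch of that pattern follows. The function names are hypothetical and not part of the patch, and stripping a trailing newline is an added assumption about what a caller would want.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line from an open stream and store prefix + line in value.
 * Returns 0 on success, -1 if nothing could be read. */
int read_prefixed_stream(FILE *f, const char *prefix, char *value, size_t len)
{
    char line[512];

    assert(f != NULL && prefix != NULL && value != NULL && len > 0);
    if (!fgets(line, sizeof(line), f)) {
        return -1;
    }
    line[strcspn(line, "\n")] = '\0';   /* drop a trailing newline, if any */
    snprintf(value, len, "%s%s", prefix, line);
    return 0;
}

/* Same, but opens the file at path first. Returns -1 if it cannot be opened. */
int read_prefixed_file(const char *path, const char *prefix,
                       char *value, size_t len)
{
    FILE *f = fopen(path, "r");
    int ret;

    if (!f) {
        return -1;
    }
    ret = read_prefixed_stream(f, prefix, value, len);
    fclose(f);
    return ret;
}
```

A caller would use it as `read_prefixed_file("/proc/device-tree/model", "IBM,", buf, sizeof(buf))`. One point that may be worth checking in the patch itself: the non-KVM stubs return 0, which a caller testing `!ret` reads as success with an empty buffer.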




[Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call

2014-06-27 Thread Nikunj A Dadhania
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.

The Linux kernel calls this only when the extended version of os-term is
implemented, to make sure that a return to the Linux kernel is guaranteed.

CC: Benjamin Herrenschmidt b...@au1.ibm.com
CC: Anton Blanchard an...@samba.org
CC: Alexander Graf ag...@suse.de
Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com

---

v2: rebase to ppcnext
v3: Do not stop the VM, and update comments
---
 hw/ppc/spapr_rtas.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 9ba1ba6..2da33c8 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -29,6 +29,8 @@
 #include "sysemu/char.h"
 #include "hw/qdev.h"
 #include "sysemu/device_tree.h"
+#include "qapi/qmp/qjson.h"
+#include "monitor/monitor.h"
 
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
@@ -277,6 +279,41 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
 rtas_st(rets, 0, ret);
 }
 
+static void rtas_ibm_os_term(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args,
+uint32_t nret, target_ulong rets)
+{
+target_ulong ret = 0;
+QObject *data;
+
+data = qobject_from_jsonf("{ 'action': %s }", "pause");
+monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
+qobject_decref(data);
+
+rtas_st(rets, 0, ret);
+}
+
+/*
+ * According to PAPR, rtas ibm,os-term, does not gaurantee a return
+ * back to the guest cpu.
+ *
+ * While an additional ibm,extended-os-term property indicates that
+ * rtas call return will always occur. Below function implements a
+ * place holder for the same.
+ */
+static void rtas_ibm_ext_os_term(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args,
+uint32_t nret, target_ulong rets)
+{
+target_ulong ret = RTAS_OUT_NOT_SUPPORTED;
+
+rtas_st(rets, 0, ret);
+}
+
 static struct rtas_call {
 const char *name;
 spapr_rtas_fn fn;
@@ -404,6 +441,10 @@ static void core_rtas_register_types(void)
 spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
 "ibm,set-system-parameter",
 rtas_ibm_set_system_parameter);
+spapr_rtas_register("ibm,os-term",
+rtas_ibm_os_term);
+spapr_rtas_register("ibm,extended-os-term",
+rtas_ibm_ext_os_term);
 }
 
 type_init(core_rtas_register_types)
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call

2014-06-27 Thread Alexey Kardashevskiy
On 06/27/2014 04:47 PM, Nikunj A Dadhania wrote:
 PAPR compliant guest calls this in absence of kdump. This finally
 reaches the guest and can be handled according to the policies set by
 higher level tools(like taking dump) for further analysis by tools like
 crash.
 
 The Linux kernel calls this only when the extended version of os-term is
 implemented, to make sure that a return to the Linux kernel is guaranteed.
 
 CC: Benjamin Herrenschmidt b...@au1.ibm.com
 CC: Anton Blanchard an...@samba.org
 CC: Alexander Graf ag...@suse.de
 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 
 ---
 
 v2: rebase to ppcnext
 v3: Do not stop the VM, and update comments
 ---
  hw/ppc/spapr_rtas.c | 41 +
  1 file changed, 41 insertions(+)
 
 diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
 index 9ba1ba6..2da33c8 100644
 --- a/hw/ppc/spapr_rtas.c
 +++ b/hw/ppc/spapr_rtas.c
 @@ -29,6 +29,8 @@
  #include sysemu/char.h
  #include hw/qdev.h
  #include sysemu/device_tree.h
 +#include qapi/qmp/qjson.h
 +#include monitor/monitor.h
  
  #include hw/ppc/spapr.h
  #include hw/ppc/spapr_vio.h
 @@ -277,6 +279,41 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU 
 *cpu,
  rtas_st(rets, 0, ret);
  }
  
 +static void rtas_ibm_os_term(PowerPCCPU *cpu,
 +sPAPREnvironment *spapr,
 +uint32_t token, uint32_t nargs,
 +target_ulong args,
 +uint32_t nret, target_ulong rets)
 +{
 +target_ulong ret = 0;
 +QObject *data;
 +
 +data = qobject_from_jsonf({ 'action': %s }, pause);
 +monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
 +qobject_decref(data);
 +
 +rtas_st(rets, 0, ret);
 +}
 +
 +/*
 + * According to PAPR, rtas ibm,os-term, does not gaurantee a return
 + * back to the guest cpu.
 + *
 + * While an additional ibm,extended-os-term property indicates that
 + * rtas call return will always occur. Below function implements a
 + * place holder for the same.
 + */
 +static void rtas_ibm_ext_os_term(PowerPCCPU *cpu,
 +sPAPREnvironment *spapr,
 +uint32_t token, uint32_t nargs,
 +target_ulong args,
 +uint32_t nret, target_ulong rets)
 +{
 +target_ulong ret = RTAS_OUT_NOT_SUPPORTED;
 +
 +rtas_st(rets, 0, ret);
 +}
 +
  static struct rtas_call {
  const char *name;
  spapr_rtas_fn fn;
 @@ -404,6 +441,10 @@ static void core_rtas_register_types(void)
  spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
  ibm,set-system-parameter,
  rtas_ibm_set_system_parameter);
 +spapr_rtas_register(ibm,os-term,
 +rtas_ibm_os_term);


This just won't compile, spapr_rtas_register() takes 3 parameters now.
Tokens for ibm,os-term and ibm,extended-os-term are already defined,
just use them.



 +spapr_rtas_register(ibm,extended-os-term,
 +rtas_ibm_ext_os_term);
  }
  
  type_init(core_rtas_register_types)
 


ps. please (please) do not use my ibm's email in public :)

-- 
Alexey Kardashevskiy
IBM OzLabs, LTC Team

e-mail: a...@au1.ibm.com
notes: Alexey Kardashevskiy/Australia/IBM




Re: [Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call

2014-06-27 Thread Nikunj A Dadhania
Alexey Kardashevskiy a...@au1.ibm.com writes:

 On 06/27/2014 04:47 PM, Nikunj A Dadhania wrote:
 PAPR compliant guest calls this in absence of kdump. This finally
 reaches the guest and can be handled according to the policies set by
 higher level tools(like taking dump) for further analysis by tools like
 crash.
 
 The Linux kernel calls this only when the extended version of os-term is
 implemented, to make sure that a return to the Linux kernel is guaranteed.
 
 CC: Benjamin Herrenschmidt b...@au1.ibm.com
 CC: Anton Blanchard an...@samba.org
 CC: Alexander Graf ag...@suse.de
 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com
 
  static struct rtas_call {
  const char *name;
  spapr_rtas_fn fn;
 @@ -404,6 +441,10 @@ static void core_rtas_register_types(void)
  spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
  ibm,set-system-parameter,
  rtas_ibm_set_system_parameter);
 +spapr_rtas_register(ibm,os-term,
 +rtas_ibm_os_term);


 This just won't compile, spapr_rtas_register() takes 3 parameters now.

duh, i missed that update :(

Resending

 Tokens for ibm,os-term and ibm,extended-os-term are already defined,
 just use them.



 +spapr_rtas_register(ibm,extended-os-term,
 +rtas_ibm_ext_os_term);
  }
  
  type_init(core_rtas_register_types)
 


 ps. please (please) do not use my ibm's email in public :)

Sure.

Regards
Nikunj




Re: [Qemu-devel] [PATCH for 2.1] qdev: correctly send DEVICE_DELETED for recursively-deleted devices

2014-06-27 Thread Markus Armbruster
Paolo Bonzini pbonz...@redhat.com writes:

 When a device is unparented (i.e. made completely hidden from management)
 we want to send a DEVICE_DELETED event only if the device actually was
 realized.  This avoids raising DEVICE_DELETED events when device_add
 fails.

 However, this does not work right for recursively-deleted
 devices: the whole tree is _first_ unrealized, _then_ unparented.
 Then device_unparent sees realized==false and fails to trigger
 the event.  The solution is simply to move have_realized into
 the DeviceState struct.  If device_add fails, we never set the
 new field to true and DEVICE_DELETED is not sent.

 Fixes qemu-iotests testcase 067.

Suggest to add "Broken in commit 5942a19" here, to make it clear that
it's a recent regression.

 Reported-by: Markus Armbruster arm...@redhat.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/core/qdev.c | 5 +++--
  include/hw/qdev-core.h | 1 +
  2 files changed, 4 insertions(+), 2 deletions(-)

 diff --git a/hw/core/qdev.c b/hw/core/qdev.c
 index d1eba3c..c520415 100644
 --- a/hw/core/qdev.c
 +++ b/hw/core/qdev.c
 @@ -848,6 +848,7 @@ static void device_set_realized(Object *obj, bool value, 
 Error **errp)
   if (value && !dev->realized) {
[...]
  if (dev->hotplugged && local_err == NULL) {
  device_reset(dev);
  }
 +dev->pending_deleted_event = false;

Unset on completion of unrealized -> realized transition.

  } else if (!value && dev->realized) {
  QLIST_FOREACH(bus, &dev->child_bus, sibling) {
  object_property_set_bool(OBJECT(bus), false, "realized",
 @@ -862,6 +863,7 @@ static void device_set_realized(Object *obj, bool value, 
 Error **errp)
  if (dc->unrealize && local_err == NULL) {
  dc->unrealize(dev, &local_err);
  }
 +dev->pending_deleted_event = true;

Set on completion of realized -> unrealized transition.

  }
  
  if (local_err != NULL) {
 @@ -972,7 +974,6 @@ static void device_unparent(Object *obj)
  {
  DeviceState *dev = DEVICE(obj);
  BusState *bus;
 -bool have_realized = dev->realized;
  
  if (dev->realized) {
  object_property_set_bool(obj, false, "realized", NULL);
 @@ -988,7 +989,7 @@ static void device_unparent(Object *obj)
  }
  
  /* Only send event if the device had been completely realized */
 -if (have_realized) {
 +if (dev->pending_deleted_event) {
  gchar *path = object_get_canonical_path(OBJECT(dev));
  
  qapi_event_send_device_deleted(!!dev->id, dev->id, path, 
 &error_abort);

Let's see whether I understand how this works.  Please correct
misunderstandings.

device_unparent() runs right before device deletion, and only then.

First thing it does is setting property "realized" to false.

Does nothing if the device has never been completely realized.
dev->pending_deleted_event remains in its initial state false.
DEVICE_DELETED not sent.  Good.

Else, the device was completely realized at some time.  If it is
currently realized, we get a transition to unrealized right now, setting
dev->pending_deleted_event.  Else, the last transition must have been
realized -> unrealized, setting dev->pending_deleted_event.  Since it
gets unset only on unrealized -> realized, it's still set.

Therefore, dev->pending_deleted_event is set if and only if the device
has been completely realized.

 diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
 index 9221cfc..0799ff2 100644
 --- a/include/hw/qdev-core.h
 +++ b/include/hw/qdev-core.h
 @@ -156,6 +156,7 @@ struct DeviceState {
  
  const char *id;
  bool realized;
 +bool pending_deleted_event;
  QemuOpts *opts;
  int hotplugged;
  BusState *parent_bus;

Reviewed-by: Markus Armbruster arm...@redhat.com

(Tested, too, but my r-by subsumes that here)



Re: [Qemu-devel] [PATCH 1/3 v3] ppc: spapr-rtas - implement os-term rtas call

2014-06-27 Thread Nikunj A Dadhania
Nikunj A Dadhania nik...@linux.vnet.ibm.com writes:

 PAPR compliant guest calls this in absence of kdump. This finally
 reaches the guest and can be handled according to the policies set by
 higher level tools(like taking dump) for further analysis by tools like
 crash.

 The Linux kernel calls this only when the extended version of os-term is
 implemented, to make sure that a return to the Linux kernel is guaranteed.

 CC: Benjamin Herrenschmidt b...@au1.ibm.com
 CC: Anton Blanchard an...@samba.org
 CC: Alexander Graf ag...@suse.de
 Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com

 ---

 v2: rebase to ppcnext
 v3: Do not stop the VM, and update comments
 ---
  hw/ppc/spapr_rtas.c | 41 +
  1 file changed, 41 insertions(+)

 diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
 index 9ba1ba6..2da33c8 100644
 --- a/hw/ppc/spapr_rtas.c
 +++ b/hw/ppc/spapr_rtas.c
 @@ -29,6 +29,8 @@
  #include sysemu/char.h
  #include hw/qdev.h
  #include sysemu/device_tree.h
 +#include qapi/qmp/qjson.h
 +#include monitor/monitor.h

  #include hw/ppc/spapr.h
  #include hw/ppc/spapr_vio.h
 @@ -277,6 +279,41 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU 
 *cpu,
  rtas_st(rets, 0, ret);
  }

 +static void rtas_ibm_os_term(PowerPCCPU *cpu,
 +sPAPREnvironment *spapr,
 +uint32_t token, uint32_t nargs,
 +target_ulong args,
 +uint32_t nret, target_ulong rets)
 +{
 +target_ulong ret = 0;
 +QObject *data;
 +
 +data = qobject_from_jsonf({ 'action': %s }, pause);
 +monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
 +qobject_decref(data);

The above has also changed; the newer API is qapi_event_send_guest_panicked().

Regards
Nikunj




Re: [Qemu-devel] [v5][PATCH 2/5] xen, gfx passthrough: create pseudo intel isa bridge

2014-06-27 Thread Chen, Tiejun

On 2014/6/25 17:58, Chen, Tiejun wrote:

On 2014/6/25 17:44, Michael S. Tsirkin wrote:

On Wed, Jun 25, 2014 at 05:28:48PM +0800, Chen, Tiejun wrote:

On 2014/6/25 17:21, Michael S. Tsirkin wrote:

On Wed, Jun 25, 2014 at 05:14:30PM +0800, Chen, Tiejun wrote:

On 2014/6/25 17:04, Michael S. Tsirkin wrote:

On Wed, Jun 25, 2014 at 04:48:02PM +0800, Chen, Tiejun wrote:

On 2014/6/25 16:43, Michael S. Tsirkin wrote:

On Wed, Jun 25, 2014 at 04:39:07PM +0800, Chen, Tiejun wrote:

In fact it's exactly what passthrough does.
I wonder if more bits from ./hw/i386/kvm/pci-assign.c
can be reused. How do you poke at the host device? sysfs?


Yes, sysfs.

Thanks
Tiejun


Then you should be able to re-use large chunks of
./hw/i386/kvm/pci-assign.c: basically everything
that deals with emulation.


Do you mean those hooks to get info from the real device? Xen has its own
wrapper, xen_host_pci_get_block(), so we always go through it in the Xen
scenario.

Thanks
Tiejun


Yes and that's not good.  We have two pieces of code doing mostly
identical things slightly differently.
hw/i386/kvm/pci-assign.c is a bit younger so it's cleaner,
but these really need to be unified.



Sorry, take a look at this again,

xen_host_pci_get_block(XenHostPCIDevice *d, int pos, uint8_t *buf,
int len)
|
+ xen_host_pci_config_read(d, pos, buf, len)
|
+ pread(d->config_fd, buf, len, pos)

I think this should be the same as in kvm.
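
The call chain above ends in a pread() on the device's sysfs config file. A minimal sketch of that access pattern follows. The function names are hypothetical and this is not the Xen or KVM code; it only shows the technique of reading PCI config space bytes at an offset through a file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Open <sysfs_dev_dir>/config read-only. Returns the fd, or -errno. */
int host_pci_open_config(const char *sysfs_dev_dir)
{
    char path[256];
    int n = snprintf(path, sizeof(path), "%s/config", sysfs_dev_dir);
    int fd;

    if (n < 0 || (size_t)n >= sizeof(path)) {
        return -ENAMETOOLONG;
    }
    fd = open(path, O_RDONLY);
    return fd < 0 ? -errno : fd;
}

/* Read exactly len bytes of config space starting at offset pos.
 * Returns 0 on success, -errno on error, -EIO if the file is too short. */
int host_pci_config_read(int config_fd, int pos, uint8_t *buf, int len)
{
    assert(buf != NULL && pos >= 0 && len >= 0);
    while (len > 0) {
        ssize_t rc = pread(config_fd, buf, (size_t)len, (off_t)pos);

        if (rc < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, retry the same read */
            }
            return -errno;
        }
        if (rc == 0) {
            return -EIO;        /* hit end of file before len bytes */
        }
        buf += rc;
        pos += (int)rc;
        len -= (int)rc;
    }
    return 0;
}
```

For a device at a sysfs directory such as /sys/bus/pci/devices/0000:00:02.0, reading two bytes at offset 0 would return the vendor ID. Since both hypervisor backends can end up at the same pread(), a shared helper of this shape is one possible starting point for the unification discussed in the thread.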

Thanks
Tiejun


get_block is trivial.

I really mean the whole PT infrastructure for
- discovering host devices through sysfs
- virtualizing devices

rom, bars, msi ...
the list goes on.

logic is mostly the same.



It looks like you mean we can unify the entire PT infrastructure between kvm
and xen inside qemu. But I'm afraid it's not easy to do in a short time, so
maybe we can queue this as the next phase.

Thanks
Tiejun


I'm afraid once we merge your code, you'll lose interest :)



Currently our first priority is to push this feature upstream, so we would
like to defer anything that does not really need to be addressed now. Of course
I hope the point we're discussing is not such a thing :)

But I can promise here that I'd like to do this optimization with your guidance
next :)


At least, don't add duplicate code for ROM.



Let me try this.



It's not as easy as expected.

KVM always works with the AssignedDevice structure, which is only 
activated when kvm_enabled() is true, and all properties are set on that 
structure.


In the Xen case, the similar structure, XenHostPCIDevice, cannot easily be 
converted into AssignedDevice. So this means we would have to 
split assigned_dev_load_option_rom() line by line for Xen and KVM, 
respectively.


After this attempt I really agree that we need to unify the PT infrastructure 
between KVM and Xen; I can't understand why we originally 
introduced two ways to do the same thing :(


Do you have a better idea? If not, I would prefer to handle this entirely as a 
separate follow-up. I'm afraid I can't get it done this time.


Thanks
Tiejun



Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2

2014-06-27 Thread Paolo Bonzini

Il 27/06/2014 08:23, Kevin Wolf ha scritto:

Note that there is already an interface in block.c that takes multiple
requests at once, bdrv_aio_multiwrite(). It is currently used by
virtio-blk, even though not in dataplane mode. It also submits
individual requests to the block drivers currently, so effectively it
doesn't make a difference, just the problem occurs in the block layer
instead of the device.

We should either improve bdrv_aio_multiwrite() to submit the requests in
a batch to the block drivers, add a bdrv_aio_multiwrite() and use it for
dataplane as well (possibly with a flag for disabling the request merging
if we want to keep the current behaviour for dataplane); or, if we
consider it a bad interface, replace it altogether with the new thing
even for normal virtio-blk.


In fact, what's the status of Fam's patches to unify request processing 
between dataplane and non-dataplane?  They would add multiwrite support 
(also rerror/werror and blockstats).


I was hoping that they could get in 2.1.

Paolo


If this makes a difference for dataplane, it probably makes a difference
for all block devices.





[Qemu-devel] [PATCH v4] ppc: spapr-rtas - implement os-term rtas call

2014-06-27 Thread Nikunj A Dadhania
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.

The Linux kernel calls this only when the extended version of os-term is
implemented, to make sure that a return to the Linux kernel is guaranteed.

CC: Benjamin Herrenschmidt b...@au1.ibm.com
CC: Anton Blanchard an...@samba.org
CC: Alexander Graf ag...@suse.de
Signed-off-by: Nikunj A Dadhania nik...@linux.vnet.ibm.com

---

v2: rebase to ppcnext
v3: Do not stop the VM, and update comments
v4: update spapr_register_rtas and qapi_event changes
---
 hw/ppc/spapr_rtas.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 9ba1ba6..b11de41 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -277,6 +277,38 @@ static void rtas_ibm_set_system_parameter(PowerPCCPU *cpu,
 rtas_st(rets, 0, ret);
 }
 
+static void rtas_ibm_os_term(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args,
+uint32_t nret, target_ulong rets)
+{
+target_ulong ret = 0;
+
+qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE, &error_abort);
+
+rtas_st(rets, 0, ret);
+}
+
+/*
+ * According to PAPR, rtas ibm,os-term, does not gaurantee a return
+ * back to the guest cpu.
+ *
+ * While an additional ibm,extended-os-term property indicates that
+ * rtas call return will always occur. Below function implements a
+ * place holder for the same.
+ */
+static void rtas_ibm_ext_os_term(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args,
+uint32_t nret, target_ulong rets)
+{
+target_ulong ret = RTAS_OUT_NOT_SUPPORTED;
+
+rtas_st(rets, 0, ret);
+}
+
 static struct rtas_call {
 const char *name;
 spapr_rtas_fn fn;
@@ -404,6 +436,10 @@ static void core_rtas_register_types(void)
 spapr_rtas_register(RTAS_IBM_SET_SYSTEM_PARAMETER,
 "ibm,set-system-parameter",
 rtas_ibm_set_system_parameter);
+spapr_rtas_register(RTAS_IBM_OS_TERM, "ibm,os-term",
+rtas_ibm_os_term);
+spapr_rtas_register(RTAS_IBM_EXTENDED_OS_TERM, "ibm,extended-os-term",
+rtas_ibm_ext_os_term);
 }
 
 type_init(core_rtas_register_types)
-- 
1.8.3.1
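
The three-argument registration used in v4 (token, name, handler) suggests a token-indexed dispatch table. The sketch below shows that general technique only. The token range, the handler signature, and the return codes are assumptions for illustration and are not taken from hw/ppc/spapr_rtas.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token range and return code, chosen for the example. */
#define TOKEN_BASE          0x2000
#define TOKEN_MAX           0x2010
#define RTAS_NOT_SUPPORTED  (-3)

typedef int (*rtas_fn)(unsigned nargs, unsigned nret);

struct rtas_entry {
    const char *name;   /* name advertised to the guest */
    rtas_fn fn;         /* handler invoked for this token */
};

static struct rtas_entry rtas_table[TOKEN_MAX - TOKEN_BASE];

/* Register a handler under a fixed token. Returns 0 on success, -1 if the
 * token is out of range or already taken. */
int rtas_register(int token, const char *name, rtas_fn fn)
{
    assert(name != NULL && fn != NULL);
    if (token < TOKEN_BASE || token >= TOKEN_MAX) {
        return -1;
    }
    token -= TOKEN_BASE;
    if (rtas_table[token].name != NULL) {
        return -1;
    }
    rtas_table[token].name = name;
    rtas_table[token].fn = fn;
    return 0;
}

/* Dispatch a call by token; unknown tokens report "not supported". */
int rtas_call(int token, unsigned nargs, unsigned nret)
{
    if (token >= TOKEN_BASE && token < TOKEN_MAX) {
        struct rtas_entry *e = &rtas_table[token - TOKEN_BASE];

        if (e->fn != NULL) {
            return e->fn(nargs, nret);
        }
    }
    return RTAS_NOT_SUPPORTED;
}

/* Example handler standing in for an os-term style call. */
int demo_os_term(unsigned nargs, unsigned nret)
{
    (void)nargs;
    (void)nret;
    return 0;
}
```

Fixing each token at registration time keeps the numbering stable no matter in which order handlers are registered, which is a plausible reason for preferring predefined tokens over ones assigned on the fly.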




Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Frederic Konrad

On 27/06/2014 08:11, Peter Crosthwaite wrote:

Hi Pavel,

On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk
pavel.dovga...@ispras.ru wrote:

Hello!

We want to publish set of patches related to the reverse execution and 
deterministic replay of qemu.
Our implementation of deterministic replay can be used for deterministic and 
reverse debugging of
guest code through gdb remote interface.

Execution recording writes non-deterministic events log, which can be later 
used for replaying the
execution anywhere and for unlimited number of times. It also supports 
checkpointing for faster
rewinding during reverse debugging. Execution replaying reads the log and 
replays all
non-deterministic events including external input, hardware clocks, and 
interrupts.

Reverse execution has the following features:
  * Deterministically replays whole system execution and all contents of the 
memory,
state of the hardware devices, clocks, and screen of the VM.
  * Writes execution log into the file for later replaying for multiple times
on different machines.
  * Supports i386, x86_64, and ARM hardware platforms.
  * Performs deterministic replay of all operations with keyboard, mouse, 
network adapters,
audio devices, serial interfaces, and physical USB devices connected to the 
emulator.
  * Provides support for gdb reverse debugging commands like reverse-step and 
reverse-continue.
  * Supports auto-checkpointing for convenient reverse debugging.
  * Allows going to the live execution from the replay mode.

Our implementation is completely tested for qemu 1.5 and is in beta state for 
2.0.50.

Some details about our implementation of reverse execution can be found in 
paper:
http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html


Add relevant implementation details to the git commit messages.


Can anyone review our patches?


Fred Konrad is doing a series on reverse exe at the moment. CC. Is this
an independent implementation of the same thing or are you building on
it?


Hi,

Yes, it seems we are doing the same thing, except that we use icount as the
instruction counter and you created a new instruction counter?

Our approach has the advantage of working everywhere icount works, but the
disadvantage of requiring icount for reverse execution.

I think we can support both ways, so that reverse execution will work on other
architectures once an instruction counter is added to them.

I'm sure your patches will add to our solution, and I can review them
when you send them.

It would help if you rebase them on the patch set that is currently on 
the list:

[RFC PATCH v5 00/13] Reverse execution. (I sent it two days ago.)

Thanks,
Fred


I suggest posting a full RFC, this looks to me just like a cover
letter but without a series.

Note that we are going into hard freeze imminently so there will be
some delay for merge.

Regards,
Peter


Pavel Dovgaluk








Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support

2014-06-27 Thread Christian Borntraeger
On 26/06/14 16:42, Alexander Graf wrote:
 
 On 26.06.14 16:29, Jens Freimann wrote:
 Conny, Alex, Christian,

 here are some fixes for the s390-ccw bios. It's a mixture of
 additional features (DASD IPL support for different formats)
 and cleanups.
 
 From a quick glimpse it looks quite clean and straightforward, but I'd like 
 to make sure we get rid completely of the static sector size assumption.

Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you then?
 
 Also, are we guaranteed that virtio always uses 512 byte block size? Or was 
 that just an internal API thing?

The virtio-blk API always talks in 512 byte sectors, no matter the block size.
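
As a minimal sketch of what the 512-byte convention means for a caller: assuming a hypothetical helper named blocks_to_virtio_sectors() (it is not part of the s390-ccw bios), a logical block number on a device with a larger block size could be translated into the sector units the API expects roughly like this. The constant name and the assumption that the block size is a multiple of 512 are illustrative.

```c
#include <stdint.h>

/* virtio-blk requests address the disk in 512-byte sectors. */
#define VIRTIO_SECTOR_SIZE 512u

/*
 * Hypothetical helper (not from the s390-ccw bios): translate a logical
 * block number into the first 512-byte sector of that block.
 * block_size is assumed to be a non-zero multiple of 512.
 */
static uint64_t blocks_to_virtio_sectors(uint64_t block, uint32_t block_size)
{
    return block * (block_size / VIRTIO_SECTOR_SIZE);
}
```

With a 4096-byte block size, block 3 starts at sector 24; with a 512-byte block size the mapping is the identity.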

Overall this is a nice improvement of the boot code - if possible I would like 
to see that in 2.1.

Conny, can you carry that in your tree (with s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)?

Acked-by: Christian Borntraeger borntrae...@de.ibm.com

for the series.


Christian




Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2

2014-06-27 Thread Ming Lei
On Fri, Jun 27, 2014 at 12:59 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 On 27/06/2014 03:15, Ming Lei wrote:

 On Thu, Jun 26, 2014 at 11:57 PM, Paolo Bonzini pbonz...@redhat.com
 wrote:

 We can implement (advisory) calls like bdrv_plug/bdrv_unplug in order to
 restore the previous levels of performance.


 Yes, that is also what I am thinking, or interfaces like bdrv_queue_io()
 and bdrv_submit_io(), which may match with aio interfaces.


 Would you like to try preparing a patch?

OK, let me try to do that.




 Note that some fallout of the conversion was expected.  Dataplane told us
 experimentally what level of performance could be reached, but was a dead
 end in terms of functionality.  Now Stefan added a whole lot of
 functionality to dataplane (accounting, throttling, file formats and
 protocols, thread-pool based I/O, etc.) and we need to bring back any
 performance we lost in the process.


 These features are very good, but it looks like the conversion is a bit early, :-(


 Dataplane is still (and has always been) experimental.  For now, it's a
 playground to get rid of the big QEMU lock in hot paths.  As such,
 performance going up and down is expected.  The good thing is that every
 performance improvement we do now will not be restricted to dataplane, it
 can be applied just as well to any other device.

Yes, virtio-scsi may benefit from the improvement too, and other
block devices too.


Thanks,
-- 
Ming Lei



Re: [Qemu-devel] [RFC PATCH v5 00/13] Reverse execution.

2014-06-27 Thread Frederic Konrad

On 26/06/2014 17:52, Sebastian Tanase wrote:

Hello,

I'll be sending a new version (V3) of the patches on Monday. The patches add 
QemuOpts
handling to the -icount option. If you want, I can send only the part of the 
patch
that adds QemuOpts support.

Best regards,

Sebastian Tanase


Hi,

Yes, it would be nice if you could split the patch:
one patch making icount a QemuOpts and a second adding the align option.

So I can pick the first part.

I can do that for you if you want.

Thanks,
Fred



- Original Message -

From: Paolo Bonzini pbonz...@redhat.com
To: Frederic Konrad fred.kon...@greensocs.com, qemu-devel@nongnu.org
Cc: peter maydell peter.mayd...@linaro.org, quint...@redhat.com, mark burton 
mark.bur...@greensocs.com,
dgilb...@redhat.com, amit shah amit.s...@redhat.com, vilan...@ac.upc.edu, 
sebastian tanase
sebastian.tan...@openwide.fr, camille begue camille.be...@openwide.fr
Sent: Thursday, 26 June 2014 17:32:57
Subject: Re: [Qemu-devel] [RFC PATCH v5 00/13] Reverse execution.

On 26/06/2014 17:11, Frederic Konrad wrote:


Are you talking of this patch on the list:
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03039.html
?

It seems to include the align option too. Is it possible to split
it up?

Sure, you can split it up and when the original authors will rebase
they
will be able to add align on top.

Paolo






Re: [Qemu-devel] [PATCH] target-arm: Implement vCPU reset via KVM_ARM_VCPU_INIT for 32-bit CPUs

2014-06-27 Thread Diana Craciun

On 06/26/2014 08:16 PM, Peter Maydell wrote:

Implement kvm_arm_vcpu_init() as a simple call to arm_arm_vcpu_init()
(which uses the KVM_ARM_VCPU_INIT vcpu ioctl to tell the kernel
to re-initialize the vCPU), rather than via the complicated code
which saves a copy of the register state on first init and then
writes it back to the kernel. This is much simpler and brings the
32-bit KVM code into line with the 64-bit code.


Signed-off-by: Peter Maydell peter.mayd...@linaro.org
---
The kernel has always supported being able to call VCPU_INIT
multiple times for this reset effect; I just didn't realize it
was possible when I wrote the original reset code.

When kvm64.c grows support for system registers we can probably
coalesce the two kvm_arm_reset_cpu() functions into one.

I also have a vague recollection that somebody reported that
we had an actual bug in this area that this patch would fix;
however I can't now find that in the mailing list archives :-(


I did: http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg03131.html




Testing appreciated: my ARMv7 box is being a bit flaky at the
moment; I don't *think* the occasional weird stuff I see is
the effect of this patch but it's hard to be certain.


I will test your patch in the following days.

Diana




[Qemu-devel] [PATCH v6 0/5] Support Archipelago as a QEMU block backend

2014-06-27 Thread Chrysostomos Nanakos
v6:
 - Split v5 1/4 patch into two different patches. First one implements
   QMP structured options and the second one implements bdrv_parse_filename().

v5:
 - Remove useless qemu_aio_count variable from BDRVArchipelagoState struct.
 - Cleanup xseg signal descriptor, call xseg_quit_local_signal() when closing
   block device.
 - Fix ds and volname leaks.
 - Make xseg request handler thread joinable and wait until exits before
   destroying condition variables and mutexes. Thanks to Stefan Hajnoczi for
   pointing this out.
 - Remove error_propagate() useless call.
 - Use memcpy instead of strncpy.
 - Remove check after trying to allocate memory with g_malloc().
 - Remove pipe code and complete AIO by introducing QEMU bottom-half.
 - Add Archipelago shared memory segment name in options list and QMP.
 - Remove functions archipelago_aio_read()/_write() and introduce new
   and simpler function, __archipelago_submit_request().
   Refactor archipelago_aio_segmented_rw() function.
 - Enable Archipelago support in qemu-iotests

v4:
 - Move Archipelago QMP support from qapi-schema.json file to
   qapi/block-core.json. Fix various typographic errors, thanks to
   Kevin Wolf and Eric Blake.
 - Use new .create_opts format, define new QemuOptsList structure and refactor
   qemu_archipelago_create function.

v3:
 - Break down initial patch from one to three. First patch implements
   Archipelago QEMU block backend with read/write functionality.
   Second patch implements .bdrv_create() and adds support for creating
   Archipelago images. Third patch adds QMP support.
 - Remove global variable g_xseg_init, make xseg_initialize(), xseg_join()
   and xseg_leave() reentrant and thread-safe.
 - Introduce new enum BlockdevOptionsArchipelago for the QMP support.

v2:
 - Implement .bdrv_parse_filename() function to convert the shortcut version
   with a single string to the individual options.
 - Remove global variables and move relevant fields to ArchipelagoAIOCB struct.
 - Remove ArchipelagoConf struct and use the relevant fields as individual
   arguments.
 - Remove ArchipelagoCB struct and use ArchipelagoAIOCB instead.
 - Remove ArchipelagoThread struct and move relevant fields to
   ArchipelagoAIOCB instead. Now one I/O thread is spawned per device to
   handle all async I/O requests.
 - Remove double data copy, use qemu_iovec_from_buf() and copy data directly
   to the destination buffer.
 - Remove archipelago_aio_bh_cb() function, a full request is completed in
   qemu_archipelago_complete_aio() instead.
 - Resolve proposed changes from Kevin Wolf and miscellaneous style issues.


Chrysostomos Nanakos (5):
  block: Support Archipelago as a QEMU block backend
  block/archipelago: Implement bdrv_parse_filename()
  block/archipelago: Add support for creating images
  QMP: Add support for Archipelago
  qemu-iotests: add support for Archipelago protocol

 MAINTAINERS  |6 +
 block/Makefile.objs  |2 +
 block/archipelago.c  | 1103 ++
 configure|   40 ++
 qapi/block-core.json |   39 +-
 tests/qemu-iotests/common|6 +
 tests/qemu-iotests/common.rc |9 +-
 7 files changed, 1201 insertions(+), 4 deletions(-)
 create mode 100644 block/archipelago.c

-- 
1.7.10.4




[Qemu-devel] [PATCH v6 1/5] block: Support Archipelago as a QEMU block backend

2014-06-27 Thread Chrysostomos Nanakos
VM Image on Archipelago volume is specified like this:

file.driver=archipelago,file.volume=volumename[,file.mport=mapperd_port[,
file.vport=vlmcd_port][,file.segment=segment_name]]

'archipelago' is the protocol.

'mport' is the port number on which mapperd is listening. This is optional
and if not specified, QEMU will make Archipelago use the default port.

'vport' is the port number on which vlmcd is listening. This is optional
and if not specified, QEMU will make Archipelago use the default port.

'segment' is the name of the shared memory segment Archipelago stack is using.
This is optional and if not specified, QEMU will make Archipelago use the
default value, 'archipelago'.

Examples:

file.driver=archipelago,file.volume=my_vm_volume
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234
file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
file.vport=1234,file.segment=my_segment

Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
---
 MAINTAINERS |6 +
 block/Makefile.objs |2 +
 block/archipelago.c |  819 +++
 configure   |   40 +++
 4 files changed, 867 insertions(+)
 create mode 100644 block/archipelago.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 9b93edd..58ef1e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -999,3 +999,9 @@ SSH
 M: Richard W.M. Jones rjo...@redhat.com
 S: Supported
 F: block/ssh.c
+
+ARCHIPELAGO
+M: Chrysostomos Nanakos cnana...@grnet.gr
+M: Chrysostomos Nanakos ch...@include.gr
+S: Maintained
+F: block/archipelago.c
diff --git a/block/Makefile.objs b/block/Makefile.objs
index fd88c03..858d2b3 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -17,6 +17,7 @@ block-obj-$(CONFIG_LIBNFS) += nfs.o
 block-obj-$(CONFIG_CURL) += curl.o
 block-obj-$(CONFIG_RBD) += rbd.o
 block-obj-$(CONFIG_GLUSTERFS) += gluster.o
+block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
 endif
 
@@ -35,5 +36,6 @@ gluster.o-cflags   := $(GLUSTERFS_CFLAGS)
 gluster.o-libs := $(GLUSTERFS_LIBS)
 ssh.o-cflags   := $(LIBSSH2_CFLAGS)
 ssh.o-libs := $(LIBSSH2_LIBS)
+archipelago.o-libs := $(ARCHIPELAGO_LIBS)
 qcow.o-libs:= -lz
 linux-aio.o-libs   := -laio
diff --git a/block/archipelago.c b/block/archipelago.c
new file mode 100644
index 000..c56826a
--- /dev/null
+++ b/block/archipelago.c
@@ -0,0 +1,819 @@
+/*
+ * QEMU Block driver for Archipelago
+ *
+ * Copyright 2014 GRNET S.A. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *   1. Redistributions of source code must retain the above
+ *  copyright notice, this list of conditions and the following
+ *  disclaimer.
+ *   2. Redistributions in binary form must reproduce the above
+ *  copyright notice, this list of conditions and the following
+ *  disclaimer in the documentation and/or other materials
+ *  provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY GRNET S.A. ``AS IS'' AND ANY EXPRESS
+ * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL GRNET S.A OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+ * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * The views and conclusions contained in the software and
+ * documentation are those of the authors and should not be
+ * interpreted as representing official policies, either expressed
+ * or implied, of GRNET S.A.
+ */
+
+/*
+* VM Image on Archipelago volume is specified like this:
+*
+* file.driver=archipelago,file.volume=volumename[,file.mport=mapperd_port[,
+* file.vport=vlmcd_port][,file.segment=segment_name]]
+*
+* 'archipelago' is the protocol.
+*
+* 'mport' is the port number on which mapperd is listening. This is optional
+* and if not specified, QEMU will make Archipelago to use the default port.
+*
+* 'vport' is the port number on which vlmcd is listening. This is optional
+* and if not specified, QEMU will make Archipelago to use the default port.
+*
+* 'segment' is the name of the shared memory segment Archipelago stack is using.
+* This is optional and if not specified, QEMU will make Archipelago to use the
+* default value, 'archipelago'.
+*
+* Examples:
+*
+* file.driver=archipelago,file.volume=my_vm_volume
+* 

[Qemu-devel] [PATCH v6 4/5] QMP: Add support for Archipelago

2014-06-27 Thread Chrysostomos Nanakos
Introduce new enum BlockdevOptionsArchipelago.

@volume:  #Name of the Archipelago volume image

@mport:   #'mport' is the port number on which mapperd is
  listening. This is optional and if not specified,
  QEMU will make Archipelago use the default port.

@vport:   #'vport' is the port number on which vlmcd is
  listening. This is optional and if not specified,
  QEMU will make Archipelago use the default port.

@segment: #optional The name of the shared memory segment
  Archipelago stack is using. This is optional
  and if not specified, QEMU will make Archipelago
  use the default value, 'archipelago'.

Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
---
 qapi/block-core.json |   39 ---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index af6b436..55eb152 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -190,8 +190,8 @@
 # @ro: true if the backing device was open read-only
 #
 # @drv: the name of the block format used to open the backing device. As of
-#   0.14.0 this can be: 'blkdebug', 'bochs', 'cloop', 'cow', 'dmg',
-#   'file', 'file', 'ftp', 'ftps', 'host_cdrom', 'host_device',
+#   0.14.0 this can be: 'archipelago', 'blkdebug', 'bochs', 'cloop', 'cow',
+#   'dmg', 'file', 'file', 'ftp', 'ftps', 'host_cdrom', 'host_device',
 #   'host_floppy', 'http', 'https', 'nbd', 'parallels', 'qcow',
 #   'qcow2', 'raw', 'tftp', 'vdi', 'vmdk', 'vpc', 'vvfat'
 #
@@ -1077,7 +1077,7 @@
 # Since: 2.0
 ##
 { 'enum': 'BlockdevDriver',
-  'data': [ 'file', 'host_device', 'host_cdrom', 'host_floppy',
+  'data': [ 'archipelago', 'file', 'host_device', 'host_cdrom', 'host_floppy',
 'http', 'https', 'ftp', 'ftps', 'tftp', 'vvfat', 'blkdebug',
 'blkverify', 'bochs', 'cloop', 'cow', 'dmg', 'parallels', 'qcow',
 'qcow2', 'qed', 'raw', 'vdi', 'vhdx', 'vmdk', 'vpc', 'quorum' ] }
@@ -1207,6 +1207,38 @@
 '*pass-discard-snapshot': 'bool',
 '*pass-discard-other': 'bool' } }
 
+
+##
+# @BlockdevOptionsArchipelago
+#
+# Driver specific block device options for Archipelago.
+#
+# @volume:  Name of the Archipelago volume image
+#
+#
+# @mport:   #optional The port number on which mapperd is
+#   listening. This is optional
+#   and if not specified, QEMU will make Archipelago
+#   use the default port.
+#
+# @vport:   #optional The port number on which vlmcd is
+#   listening. This is optional
+#   and if not specified, QEMU will make Archipelago
+#   use the default port.
+#
+# @segment: #optional The name of the shared memory segment
+#   Archipelago stack is using. This is optional
+#   and if not specified, QEMU will make Archipelago
+#   use the default value, 'archipelago'.
+# Since: 2.1
+##
+{ 'type': 'BlockdevOptionsArchipelago',
+  'data': { 'volume': 'str',
+'*mport': 'int',
+'*vport': 'int',
+'*segment': 'str' } }
+
+
 ##
 # @BlkdebugEvent
 #
@@ -1347,6 +1379,7 @@
   'base': 'BlockdevOptionsBase',
   'discriminator': 'driver',
   'data': {
+  'archipelago':'BlockdevOptionsArchipelago',
   'file':   'BlockdevOptionsFile',
   'host_device':'BlockdevOptionsFile',
   'host_cdrom': 'BlockdevOptionsFile',
-- 
1.7.10.4




[Qemu-devel] [PATCH v6 2/5] block/archipelago: Implement bdrv_parse_filename()

2014-06-27 Thread Chrysostomos Nanakos
VM Image on Archipelago volume can also be specified like this:

file=archipelago:volumename[/mport=mapperd_port[:vport=vlmcd_port][:
segment=segment_name]]

Examples:

file=archipelago:my_vm_volume
file=archipelago:my_vm_volume/mport=123
file=archipelago:my_vm_volume/mport=123:vport=1234
file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment
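
The shorthand above can be illustrated with a small standalone parser. This is only a sketch under stated assumptions, not the code from the patch: struct arch_opts, parse_port() and parse_archipelago() are hypothetical names, fixed-size buffers stand in for QEMU's allocation helpers, and, unlike the patch, unknown keys are rejected rather than ignored.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed fields; -1 means "not given". */
struct arch_opts {
    char volume[64];
    char segment[64];
    long mport;
    long vport;
};

/* Parse a decimal port; returns 0 on success, -1 on empty or bad input. */
static int parse_port(const char *s, long *out)
{
    char *end = NULL;
    long v = strtol(s, &end, 10);

    if (end == s || *end != '\0' || v < 0) {
        return -1;
    }
    *out = v;
    return 0;
}

/*
 * Parse "archipelago:volume[/key=value[:key=value]...]".
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_archipelago(const char *filename, struct arch_opts *o)
{
    static const char prefix[] = "archipelago:";
    const size_t plen = sizeof(prefix) - 1;
    char buf[256];
    char *opts, *tok;

    o->volume[0] = o->segment[0] = '\0';
    o->mport = o->vport = -1;

    if (strncmp(filename, prefix, plen) != 0 ||
        strlen(filename + plen) >= sizeof(buf)) {
        return -1;
    }
    strcpy(buf, filename + plen);

    /* Split the volume name from the optional option list. */
    opts = strchr(buf, '/');
    if (opts) {
        *opts++ = '\0';
    }
    if (buf[0] == '\0' || strlen(buf) >= sizeof(o->volume)) {
        return -1;
    }
    strcpy(o->volume, buf);

    /* Options are separated by ':' and may appear in any order here. */
    for (tok = opts ? strtok(opts, ":") : NULL; tok; tok = strtok(NULL, ":")) {
        if (strncmp(tok, "mport=", 6) == 0) {
            if (parse_port(tok + 6, &o->mport) < 0) {
                return -1;
            }
        } else if (strncmp(tok, "vport=", 6) == 0) {
            if (parse_port(tok + 6, &o->vport) < 0) {
                return -1;
            }
        } else if (strncmp(tok, "segment=", 8) == 0) {
            if (strlen(tok + 8) >= sizeof(o->segment)) {
                return -1;
            }
            strcpy(o->segment, tok + 8);
        } else {
            return -1;
        }
    }
    return 0;
}
```

Leaving mport and vport at -1 when absent mirrors the idea that the driver falls back to its defaults for anything not given on the command line.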

Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
---
 block/archipelago.c |  139 ++-
 1 file changed, 137 insertions(+), 2 deletions(-)

diff --git a/block/archipelago.c b/block/archipelago.c
index c56826a..3549454 100644
--- a/block/archipelago.c
+++ b/block/archipelago.c
@@ -40,6 +40,11 @@
 * file.driver=archipelago,file.volume=volumename[,file.mport=mapperd_port[,
 * file.vport=vlmcd_port][,file.segment=segment_name]]
 *
+* or
+*
+* file=archipelago:volumename[/mport=mapperd_port[:vport=vlmcd_port][:
+* segment=segment_name]]
+*
 * 'archipelago' is the protocol.
 *
 * 'mport' is the port number on which mapperd is listening. This is optional
@@ -57,11 +62,20 @@
 * file.driver=archipelago,file.volume=my_vm_volume
 * file.driver=archipelago,file.volume=my_vm_volume,file.mport=123
 * file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
-* file.vport=1234
+*  file.vport=1234
 * file.driver=archipelago,file.volume=my_vm_volume,file.mport=123,
-* file.vport=1234,file.segment=my_segment
+*  file.vport=1234,file.segment=my_segment
+*
+* or
+*
+* file=archipelago:my_vm_volume
+* file=archipelago:my_vm_volume/mport=123
+* file=archipelago:my_vm_volume/mport=123:vport=1234
+* file=archipelago:my_vm_volume/mport=123:vport=1234:segment=my_segment
+*
 */
 
+#include "qemu-common.h"
 #include "block/block_int.h"
 #include "qemu/error-report.h"
 #include "qemu/thread.h"
@@ -333,6 +347,126 @@ static void qemu_archipelago_complete_aio(void *opaque)
 g_free(reqdata);
 }
 
+static void xseg_find_port(char *pstr, const char *needle, xport *aport)
+{
+    const char *a;
+    char *endptr = NULL;
+    unsigned long port;
+    if (strstart(pstr, needle, &a)) {
+        if (strlen(a) > 0) {
+            port = strtoul(a, &endptr, 10);
+            if (strlen(endptr)) {
+                *aport = -2;
+                return;
+            }
+            *aport = (xport) port;
+        }
+    }
+}
+
+static void xseg_find_segment(char *pstr, const char *needle,
+                              char **segment_name)
+{
+    const char *a;
+    if (strstart(pstr, needle, &a)) {
+        if (strlen(a) > 0) {
+            *segment_name = g_strdup(a);
+        }
+    }
+}
+
+static void parse_filename_opts(const char *filename, Error **errp,
+                                char **volume, char **segment_name,
+                                xport *mport, xport *vport)
+{
+    const char *start;
+    char *tokens[4], *ds;
+    int idx;
+    xport lmport = NoPort, lvport = NoPort;
+
+    strstart(filename, "archipelago:", &start);
+
+    ds = g_strdup(start);
+    tokens[0] = strtok(ds, "/");
+    tokens[1] = strtok(NULL, ":");
+    tokens[2] = strtok(NULL, ":");
+    tokens[3] = strtok(NULL, "\0");
+
+    if (!strlen(tokens[0])) {
+        error_setg(errp, "volume name must be specified first");
+        g_free(ds);
+        return;
+    }
+
+    for (idx = 1; idx < 4; idx++) {
+        if (tokens[idx] != NULL) {
+            if (strstart(tokens[idx], "mport=", NULL)) {
+                xseg_find_port(tokens[idx], "mport=", &lmport);
+            }
+            if (strstart(tokens[idx], "vport=", NULL)) {
+                xseg_find_port(tokens[idx], "vport=", &lvport);
+            }
+            if (strstart(tokens[idx], "segment=", NULL)) {
+                xseg_find_segment(tokens[idx], "segment=", segment_name);
+            }
+        }
+    }
+
+    if ((lmport == -2) || (lvport == -2)) {
+        error_setg(errp, "mport and/or vport must be set");
+        g_free(ds);
+        return;
+    }
+    *volume = g_strdup(tokens[0]);
+    *mport = lmport;
+    *vport = lvport;
+    g_free(ds);
+}
+
+static void archipelago_parse_filename(const char *filename, QDict *options,
+                                       Error **errp)
+{
+    const char *start;
+    char *volume = NULL, *segment_name = NULL;
+    xport mport = NoPort, vport = NoPort;
+
+    if (qdict_haskey(options, ARCHIPELAGO_OPT_VOLUME)
+            || qdict_haskey(options, ARCHIPELAGO_OPT_SEGMENT)
+            || qdict_haskey(options, ARCHIPELAGO_OPT_MPORT)
+            || qdict_haskey(options, ARCHIPELAGO_OPT_VPORT)) {
+        error_setg(errp, "volume/mport/vport/segment and a file name may not be "
+                         "specified at the same time");
+        return;
+    }
+
+    if (!strstart(filename, "archipelago:", &start)) {
+        error_setg(errp, "File name must start with 'archipelago:'");
+        return;
+    }
+
+    if (!strlen(start) || strstart(start, "/", NULL)) {
+        error_setg(errp, "volume name must be specified");
+        return;
+    }
+
+parse_filename_opts(filename, 

Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Peter Maydell
On 27 June 2014 06:18, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
 Our implementation is completely tested for qemu 1.5 and is in beta state for 
 2.0.50.

Note that you should post patches against current QEMU master;
patches against old releases like 1.5 are not something we could
use.

thanks
-- PMM



[Qemu-devel] [PATCH v6 3/5] block/archipelago: Add support for creating images

2014-06-27 Thread Chrysostomos Nanakos
qemu-img archipelago:volumename[/mport=mapperd_port[:vport=vlmcd_port]
 [:segment=segment_name]] [size]

Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
---
 block/archipelago.c |  149 +++
 1 file changed, 149 insertions(+)

diff --git a/block/archipelago.c b/block/archipelago.c
index 3549454..3d5aff1 100644
--- a/block/archipelago.c
+++ b/block/archipelago.c
@@ -613,6 +613,140 @@ err_exit:
     xseg_leave(s->xseg);
 }
 
+static int qemu_archipelago_create_volume(Error **errp, const char *volname,
+                                          char *segment_name,
+                                          uint64_t size, xport mportno,
+                                          xport vportno)
+{
+    int ret, targetlen;
+    struct xseg *xseg = NULL;
+    struct xseg_request *req;
+    struct xseg_request_clone *xclone;
+    struct xseg_port *port;
+    xport srcport = NoPort, sport = NoPort;
+    char *target;
+
+    /* Try default values if none has been set */
+    if (mportno == (xport) -1) {
+        mportno = 1001;
+    }
+
+    if (vportno == (xport) -1) {
+        vportno = 501;
+    }
+
+    if (segment_name == NULL) {
+        segment_name = g_strdup("archipelago");
+    }
+
+    if (xseg_initialize()) {
+        error_setg(errp, "Cannot initialize XSEG");
+        return -1;
+    }
+
+    xseg = xseg_join((char *)"posix", segment_name,
+                     (char *)"posixfd", NULL);
+
+    if (!xseg) {
+        error_setg(errp, "Cannot join XSEG shared memory segment");
+        return -1;
+    }
+
+    port = xseg_bind_dynport(xseg);
+    srcport = port->portno;
+    init_local_signal(xseg, sport, srcport);
+
+    req = xseg_get_request(xseg, srcport, mportno, X_ALLOC);
+    if (!req) {
+        error_setg(errp, "Cannot get XSEG request");
+        return -1;
+    }
+
+    targetlen = strlen(volname);
+    ret = xseg_prep_request(xseg, req, targetlen,
+                            sizeof(struct xseg_request_clone));
+    if (ret < 0) {
+        error_setg(errp, "Cannot prepare XSEG request");
+        goto err_exit;
+    }
+
+    target = xseg_get_target(xseg, req);
+    if (!target) {
+        error_setg(errp, "Cannot get XSEG target.\n");
+        goto err_exit;
+    }
+    memcpy(target, volname, targetlen);
+    xclone = (struct xseg_request_clone *) xseg_get_data(xseg, req);
+    memset(xclone->target, 0 , XSEG_MAX_TARGETLEN);
+    xclone->targetlen = 0;
+    xclone->size = size;
+    req->offset = 0;
+    req->size = req->datalen;
+    req->op = X_CLONE;
+
+    xport p = xseg_submit(xseg, req, srcport, X_ALLOC);
+    if (p == NoPort) {
+        error_setg(errp, "Could not submit XSEG request");
+        goto err_exit;
+    }
+    xseg_signal(xseg, p);
+
+    ret = wait_reply(xseg, srcport, port, req);
+    if (ret < 0) {
+        error_setg(errp, "wait_reply() error.");
+    }
+
+    xseg_put_request(xseg, req, srcport);
+    xseg_quit_local_signal(xseg, srcport);
+    xseg_leave_dynport(xseg, port);
+    xseg_leave(xseg);
+    return ret;
+
+err_exit:
+    xseg_put_request(xseg, req, srcport);
+    xseg_quit_local_signal(xseg, srcport);
+    xseg_leave_dynport(xseg, port);
+    xseg_leave(xseg);
+    return -1;
+}
+
+static int qemu_archipelago_create(const char *filename,
+                                   QemuOpts *options,
+                                   Error **errp)
+{
+    int ret = 0;
+    uint64_t total_size = 0;
+    char *volname = NULL, *segment_name = NULL;
+    const char *start;
+    xport mport = NoPort, vport = NoPort;
+
+    if (!strstart(filename, "archipelago:", &start)) {
+        error_setg(errp, "File name must start with 'archipelago:'");
+        return -1;
+    }
+
+    if (!strlen(start) || strstart(start, "/", NULL)) {
+        error_setg(errp, "volume name must be specified");
+        return -1;
+    }
+
+    parse_filename_opts(filename, errp, &volname, &segment_name, &mport, &vport);
+    total_size = qemu_opt_get_size_del(options, BLOCK_OPT_SIZE, 0);
+
+    /* Create an Archipelago volume */
+    ret = qemu_archipelago_create_volume(errp, volname, segment_name,
+                                         total_size, mport,
+                                         vport);
+
+    if (volname) {
+        g_free(volname);
+    }
+    if (segment_name) {
+        g_free(segment_name);
+    }
+    return ret;
+}
+
 static void qemu_archipelago_aio_cancel(BlockDriverAIOCB *blockacb)
 {
 ArchipelagoAIOCB *aio_cb = (ArchipelagoAIOCB *) blockacb;
@@ -925,6 +1059,19 @@ static int64_t qemu_archipelago_getlength(BlockDriverState *bs)
     return ret;
 }
 
+static QemuOptsList qemu_archipelago_create_opts = {
+    .name = "archipelago-create-opts",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_archipelago_create_opts.head),
+    .desc = {
+        {
+            .name = BLOCK_OPT_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "Virtual disk size"
+        },
+        { /* end of list */ }
+    }
+};
+
 static BlockDriverAIOCB 

[Qemu-devel] [PATCH v6 5/5] qemu-iotests: add support for Archipelago protocol

2014-06-27 Thread Chrysostomos Nanakos
Signed-off-by: Chrysostomos Nanakos cnana...@grnet.gr
---
 tests/qemu-iotests/common|6 ++
 tests/qemu-iotests/common.rc |9 -
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 0aaf84d..a0e35c4 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -153,6 +153,7 @@ check options
     -nbd                test nbd
     -ssh                test ssh
     -nfs                test nfs
+    -archipelago        test archipelago
     -xdiff              graphical mode diff
     -nocache            use O_DIRECT on backing file
     -misalign           misalign memory allocations
@@ -264,6 +265,11 @@ testlist options
 xpand=false
 ;;
 
+-archipelago)
+IMGPROTO=archipelago
+xpand=false
+;;
+
 -nocache)
 CACHEMODE=none
 CACHEMODE_IS_DEFAULT=false
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 195c564..8ef1a52 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -64,6 +64,8 @@ elif [ "$IMGPROTO" = "ssh" ]; then
 elif [ "$IMGPROTO" = "nfs" ]; then
 TEST_DIR=nfs://127.0.0.1/$TEST_DIR
 TEST_IMG=$TEST_DIR/t.$IMGFMT
+elif [ "$IMGPROTO" = "archipelago" ]; then
+TEST_IMG=archipelago:at.$IMGFMT
 else
 TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
 fi
@@ -163,7 +165,8 @@ _make_test_img()
 -e "s# lazy_refcounts=\\(on\\|off\\)##g" \
 -e "s# block_size=[0-9]\\+##g" \
 -e "s# block_state_zero=\\(on\\|off\\)##g" \
--e "s# log_size=[0-9]\\+##g"
+-e "s# log_size=[0-9]\\+##g" \
+-e "s/archipelago:a/TEST_DIR\//g"
 
 # Start an NBD server on the image file, which is what we'll be talking to
 if [ $IMGPROTO = nbd ]; then
@@ -206,6 +209,10 @@ _cleanup_test_img()
 rbd --no-progress rm "$TEST_DIR/t.$IMGFMT" > /dev/null
 ;;
 
+archipelago)
+vlmc remove "at.$IMGFMT" > /dev/null
+;;
+
 sheepdog)
 collie vdi delete $TEST_DIR/t.$IMGFMT
 ;;
-- 
1.7.10.4




Re: [Qemu-devel] [PATCH] tcg/ppc: Fix failure in tcg_out_mem_long

2014-06-27 Thread Greg Kurz
On Thu, 26 Jun 2014 21:26:00 -0700
Richard Henderson r...@twiddle.net wrote:

 With rt != r0 on loads, we use rt for scratch.  If we need an index
 register different from base, we can't use rt, but r0 is usable.
 
 Signed-off-by: Richard Henderson r...@twiddle.net
 ---
 This ought to fix the problem that Greg reported.
 

Thanks Richard !

 That we need to use --enable-debug-tcg to see the assert, and that I
 didn't previously do testing with that is disappointing.  I'm thinking
 that we ought to do something like gcc wrt --enable-checking=release
 vs development, so that we can't do normal development without these
 asserts enabled.  More on that later...
 
 
 r~

Makes sense.

Cheers.

--
Greg

 ---
  tcg/ppc/tcg-target.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c
 index c83fd9f..dd84e76 100644
 --- a/tcg/ppc/tcg-target.c
 +++ b/tcg/ppc/tcg-target.c
 @@ -805,7 +805,10 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
 
      /* For unaligned, or very large offsets, use the indexed form.  */
      if (offset & align || offset != (int32_t)offset) {
 -        tcg_debug_assert(rs != base && (!is_store || rs != rt));
 +        if (rs == base) {
 +            rs = TCG_REG_R0;
 +        }
 +        tcg_debug_assert(!is_store || rs != rt);
          tcg_out_movi(s, TCG_TYPE_PTR, rs, orig);
          tcg_out32(s, opx | TAB(rt, base, rs));
          return;



-- 
Gregory Kurz kurzg...@fr.ibm.com
 gk...@linux.vnet.ibm.com
Software Engineer @ IBM/Meiosys  http://www.ibm.com
Tel +33 (0)562 165 496

Anarchy is about taking complete responsibility for yourself.
Alan Moore.




[Qemu-devel] [PATCH] Allow mismatched virtio config-len

2014-06-27 Thread Dr. David Alan Gilbert (git)
From: Dr. David Alan Gilbert dgilb...@redhat.com

Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.

Unfortunately, there are cases where this isn't true, the one
we found it on was the wce addition in virtio-blk.

Allow mismatched config-lengths:
   *) If the version on the wire is shorter then ensure that the
  remainder is 0xff filled (as virtio_config_read does on
  out of range reads)
   *) If the version on the wire is longer, load what we have space
  for and skip the rest.
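
The two cases can be sketched outside of QEMU with plain buffers. This is an illustration only: load_config() is a hypothetical function, and the wire data is modelled as an in-memory array rather than a QEMUFile stream, so "skipping the rest" reduces to not copying it (a real stream would still have to consume those bytes).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the migration load path: copy wire_len bytes
 * of incoming config into a device buffer that holds dev_len bytes.
 */
static void load_config(uint8_t *dev, size_t dev_len,
                        const uint8_t *wire, size_t wire_len)
{
    if (wire_len < dev_len) {
        /* Shorter on the wire: 0xff-fill first, then load what arrived. */
        memset(dev, 0xff, dev_len);
        memcpy(dev, wire, wire_len);
    } else {
        /* Equal or longer: load what fits and ignore the excess. */
        memcpy(dev, wire, dev_len);
    }
}
```

Filling with 0xff matches what the message says an out-of-range config read returns, so a newer device sees the missing fields the same way a guest reading past the old config would.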

Signed-off-by: Dr. David Alan Gilbert dgilb...@redhat.com
---
 hw/virtio/virtio.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a3082d5..2b11142 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -927,11 +927,33 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
     }
     config_len = qemu_get_be32(f);
     if (config_len != vdev->config_len) {
-        error_report("Unexpected config length 0x%x. Expected 0x%zx",
-                     config_len, vdev->config_len);
-        return -1;
+        /*
+         * Unfortunately the reality is that there are cases where we
+         * see mismatched config lengths, so we have to deal with them
+         * rather than rejecting them.
+         */
+
+        if (config_len < vdev->config_len) {
+            /* This is normal in some devices when they add a new option */
+            memset(vdev->config, 0xff, vdev->config_len);
+            qemu_get_buffer(f, vdev->config, config_len);
+        } else {
+            int32_t diff;
+            /* config_len > vdev->config_len
+             * This is rarer, but is here to allow us to fix the case above
+             */
+            qemu_get_buffer(f, vdev->config, vdev->config_len);
+            /*
+             * Even though we expect the diff to be small, we can't use
+             * qemu_file_skip because it's not safe for a large skip.
+             */
+            for (diff = config_len - vdev->config_len; diff > 0; diff--) {
+                qemu_get_byte(f);
+            }
+        }
+    } else {
+        qemu_get_buffer(f, vdev->config, vdev->config_len);
     }
-    qemu_get_buffer(f, vdev->config, vdev->config_len);
 
     num = qemu_get_be32(f);
 
-- 
1.9.3
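
For readers who want to see the two mismatch cases in isolation, here is a
minimal sketch on plain byte buffers. It is not the QEMU code: load_config()
and its parameters are made up for the example and stand in for the
QEMUFile-based reads in virtio_load().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU function): copy a config blob of
 * wire_len bytes into a device config area of dev_len bytes.
 * Returns the number of wire bytes consumed, which is always wire_len.
 */
static size_t load_config(uint8_t *dev_cfg, size_t dev_len,
                          const uint8_t *wire, size_t wire_len)
{
    assert(dev_cfg != NULL && wire != NULL);

    if (wire_len < dev_len) {
        /* Shorter on the wire: pre-fill with 0xff, then overlay. */
        memset(dev_cfg, 0xff, dev_len);
        memcpy(dev_cfg, wire, wire_len);
    } else {
        /* Equal or longer on the wire: keep what fits, skip the rest. */
        memcpy(dev_cfg, wire, dev_len);
    }
    return wire_len;
}
```

Returning the full wire length in both branches mirrors the requirement that
the stream position ends up after the config blob regardless of how much of
it was kept.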




Re: [Qemu-devel] [v5][PATCH 4/5] xen, gfx passthrough: create host bridge to passthrough

2014-06-27 Thread Chen, Tiejun

On 2014/6/25 14:24, Paolo Bonzini wrote:

On 25/06/2014 04:17, Tiejun Chen wrote:

+    if (xen_enabled() && xen_has_gfx_passthru) {
+        d = pci_create_simple(b, 0, TYPE_I440FX_XEN_PCI_DEVICE);
+        *pi440fx_state = I440FX_XEN_PCI_DEVICE(d);
+        pci_create_pch(b);
+    } else {
+        d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
+        *pi440fx_state = I440FX_PCI_DEVICE(d);
+    }


As mentioned in the review of v4, this should be a separate,
Xen-specific machine.  pci_create_pch should not be called in generic PC
code.



I track this path:

qemu_register_pc_machine(xenfv_machine);
|
+ .init = pc_xen_hvm_init,
|
+ pc_init_pci(machine);
|
+ pc_init1(machine, 1, 1);
|
+ i440fx_init()

So how should I separate this out to make it Xen-specific? Or do you mean
we need to create a new machine to address this scenario? But such a
machine would be the same as xenfv_machine except for this small piece of code.


If you don't want this to affect other cases, we could move this chunk of
code into a function guarded by CONFIG_XEN. But that is not good either,
since this is a device feature, so KVM may need it one day.



Thanks
Tiejun



Re: [Qemu-devel] [PATCH for 2.1] qdev: correctly send DEVICE_DELETED for recursively-deleted devices

2014-06-27 Thread Andreas Färber
On 27.06.2014 09:16, Markus Armbruster wrote:
 Paolo Bonzini pbonz...@redhat.com writes:
 
 When a device is unparented (i.e. made completely hidden from management)
 we want to send a DEVICE_DELETED event only if the device actually was
 realized.  This avoids raising DEVICE_DELETED events when device_add
 fails.

 However, this does not work right for recursively-deleted
 devices: the whole tree is _first_ unrealized, _then_ unparented.
 Then device_unparent sees realized==false and fails to trigger
 the event.  The solution is simply to move have_realized into
 the DeviceState struct.  If device_add fails, we never set the
 new field to true and DEVICE_DELETED is not sent.

 Fixes qemu-iotests testcase 067.
 
 Suggest to add "Broken in commit 5942a19" here, to make it clear that
 it's a recent regression.

I vaguely recall that something like this was in Bandan's RFC (that I
assume the above commit forward-ported, the subject would be handy to
mention too), but once again without any explanation why, so I saw no
need to apply that during hardfreeze.

Andreas

 Reported-by: Markus Armbruster arm...@redhat.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 Reviewed-by: Markus Armbruster arm...@redhat.com
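
The ordering problem described in the commit message can be shown with a
small standalone model. This is only a sketch: the Dev struct, its field
names and the helper functions are illustrative and are not the DeviceState
code from the patch.

```c
#include <stdbool.h>

/* Minimal stand-in for a device; field names are illustrative only. */
typedef struct Dev {
    bool realized;         /* current state, cleared on unrealize */
    bool pending_deleted;  /* set once realize succeeds, sticky */
    int deleted_events;    /* counts emitted "deleted" notifications */
} Dev;

static void dev_realize(Dev *d)
{
    d->realized = true;
    d->pending_deleted = true;
}

static void dev_unrealize(Dev *d)
{
    d->realized = false;
}

/* Emit the event based on the sticky flag, not on the current state. */
static void dev_unparent(Dev *d)
{
    if (d->realized) {
        dev_unrealize(d);
    }
    if (d->pending_deleted) {
        d->deleted_events++;
        d->pending_deleted = false;
    }
}
```

A device that was realized and then unrealized by its parent before being
unparented still gets its event, while one that never realized (the failed
device_add case) gets none.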

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 1/4] mips/kvm: Init EBase to correct KSEG0

2014-06-27 Thread Aurelien Jarno
On Thu, Jun 26, 2014 at 10:44:22AM +0100, James Hogan wrote:
 The EBase CP0 register is initialised to 0x80000000, however with KVM
 the guest's KSEG0 is at 0x40000000. The incorrect value doesn't get
 passed to KVM yet as KVM doesn't implement the EBase register, however
 we should set it correctly now so as not to break migration/loadvm to a
 future version of QEMU that does support EBase.
 
 Signed-off-by: James Hogan james.ho...@imgtec.com
 Cc: Aurelien Jarno aurel...@aurel32.net
 Cc: Paolo Bonzini pbonz...@redhat.com
 ---
  target-mips/translate.c | 8 +++++++-
  1 file changed, 7 insertions(+), 1 deletion(-)
 
 diff --git a/target-mips/translate.c b/target-mips/translate.c
 index 2f91959ed7b1..d7b8c4dbc81a 100644
 --- a/target-mips/translate.c
 +++ b/target-mips/translate.c
 @@ -28,6 +28,7 @@
  
  #include "exec/helper-proto.h"
  #include "exec/helper-gen.h"
 +#include "sysemu/kvm.h"
  
  #define MIPS_DEBUG_DISAS 0
  //#define MIPS_DEBUG_SIGN_EXTENSIONS
 @@ -16076,7 +16077,12 @@ void cpu_state_reset(CPUMIPSState *env)
      env->CP0_Random = env->tlb->nb_tlb - 1;
      env->tlb->tlb_in_use = env->tlb->nb_tlb;
      env->CP0_Wired = 0;
 -    env->CP0_EBase = 0x80000000 | (cs->cpu_index & 0x3FF);
 +    env->CP0_EBase = (cs->cpu_index & 0x3FF);
 +    if (kvm_enabled()) {
 +        env->CP0_EBase |= 0x40000000;
 +    } else {
 +        env->CP0_EBase |= 0x80000000;
 +    }
      env->CP0_Status = (1 << CP0St_BEV) | (1 << CP0St_ERL);
      /* vectored interrupts not implemented, timer on int 7,
         no performance counters. */

Reviewed-by: Aurelien Jarno aurel...@aurel32.net
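
As a quick illustration of the reset value being computed, here is a
standalone sketch. It assumes the usual MIPS32 KSEG0 base of 0x80000000 and a
KVM guest KSEG0 at 0x40000000; ebase_reset_value() and the two macros are
hypothetical names, not QEMU identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed segment bases: native MIPS32 KSEG0 and the KVM guest KSEG0. */
#define KSEG0_BASE_NATIVE 0x80000000u
#define KSEG0_BASE_KVM    0x40000000u

/*
 * Hypothetical helper (not part of QEMU): compute the EBase reset value
 * for a CPU, keeping the CPU number in the low 10 bits.
 */
static uint32_t ebase_reset_value(unsigned cpu_index, bool kvm)
{
    uint32_t ebase = cpu_index & 0x3FF;

    ebase |= kvm ? KSEG0_BASE_KVM : KSEG0_BASE_NATIVE;
    return ebase;
}
```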

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH 2/4] mips_malta: Change default KVM cpu to 24Kc (no FP)

2014-06-27 Thread Aurelien Jarno
On Thu, Jun 26, 2014 at 10:44:23AM +0100, James Hogan wrote:
 Change the default Malta CPU model for when KVM is enabled to 24Kc which
 doesn't have floating point support compared to the 24Kf.
 
 The resulting incorrect Config CP0 register value doesn't get passed to
 KVM yet as KVM doesn't expose it, however we should ensure it is set
 correctly now to reduce the risk of breaking migration/loadvm to a
 future version of QEMU/Linux that does support them.
 
 Signed-off-by: James Hogan james.ho...@imgtec.com
 Cc: Aurelien Jarno aurel...@aurel32.net
 Cc: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/mips/mips_malta.c | 7 ++++++-
  1 file changed, 6 insertions(+), 1 deletion(-)
 
 diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
 index 2868ee5b0307..c0841991f4e9 100644
 --- a/hw/mips/mips_malta.c
 +++ b/hw/mips/mips_malta.c
 @@ -949,7 +949,12 @@ void mips_malta_init(MachineState *machine)
  #ifdef TARGET_MIPS64
          cpu_model = "20Kc";
  #else
 -        cpu_model = "24Kf";
 +        if (kvm_enabled()) {
 +            /* Don't enable FPU on KVM yet */
 +            cpu_model = "24Kc";
 +        } else {
 +            cpu_model = "24Kf";
 +        }
  #endif
      }

Given the explanations in the other mails, that looks fine to me. That
said, I think we should at least warn the user that we are disabling some
features, instead of doing it silently. This is what is done, for example,
on x86 when a CPU feature is not available.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH 3/4] mips_malta: Remove incorrect KVM T&E references

2014-06-27 Thread Aurelien Jarno
On Thu, Jun 26, 2014 at 10:44:24AM +0100, James Hogan wrote:
 Fix the error message and code comments relating to KVM not supporting
 booting from the flash mapping when no kernel is provided. The issue is
 a general MIPS KVM issue and isn't specific to the Trap & Emulate
 version of MIPS KVM.
 
 Reported-by: Andreas Färber afaer...@suse.de
 Signed-off-by: James Hogan james.ho...@imgtec.com
 Cc: Aurelien Jarno aurel...@aurel32.net
 Cc: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/mips/mips_malta.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
 index c0841991f4e9..76cf5f2c48f4 100644
 --- a/hw/mips/mips_malta.c
 +++ b/hw/mips/mips_malta.c
 @@ -1033,7 +1033,7 @@ void mips_malta_init(MachineState *machine)
      fl_idx++;
      if (kernel_filename) {
          ram_low_size = MIN(ram_size, 256 << 20);
 -        /* For KVM T&E we reserve 1MB of RAM for running bootloader */
 +        /* For KVM we reserve 1MB of RAM for running bootloader */
          if (kvm_enabled()) {
              ram_low_size -= 0x100000;
              bootloader_run_addr = 0x40000000 + ram_low_size;
 @@ -1057,10 +1057,10 @@ void mips_malta_init(MachineState *machine)
                               bootloader_run_addr, kernel_entry);
          }
      } else {
 -        /* The flash region isn't executable from a KVM T&E guest */
 +        /* The flash region isn't executable from a KVM guest */
          if (kvm_enabled()) {
              error_report("KVM enabled but no -kernel argument was specified. "
 -                         "Booting from flash is not supported with KVM T&E.");
 +                         "Booting from flash is not supported with KVM.");
              exit(1);
          }
          /* Load firmware from flash. */

Reviewed-by: Aurelien Jarno aurel...@aurel32.net


-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH 4/4] mips_malta: Catch kernels linked at wrong address

2014-06-27 Thread Aurelien Jarno
On Thu, Jun 26, 2014 at 10:44:25AM +0100, James Hogan wrote:
 Add error reporting if the wrong type of kernel is provided for the
 current mode of acceleration.
 
 Currently a KVM kernel linked at 0x40000000 can't be used with TCG, and
 a normal kernel linked at 0x80000000 can't be used with KVM.
 
 Signed-off-by: James Hogan james.ho...@imgtec.com
 Cc: Aurelien Jarno aurel...@aurel32.net
 Cc: Paolo Bonzini pbonz...@redhat.com
 ---
  hw/mips/mips_malta.c | 14 ++++++++++++++
  1 file changed, 14 insertions(+)
 
 diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
 index 76cf5f2c48f4..95df42e6a4d5 100644
 --- a/hw/mips/mips_malta.c
 +++ b/hw/mips/mips_malta.c
 @@ -792,9 +792,23 @@ static int64_t load_kernel (void)
                  loaderparams.kernel_filename);
          exit(1);
      }
 +
 +    /* Sanity check where the kernel has been linked */
      if (kvm_enabled()) {
 +        if (kernel_entry & 0x80000000ll) {
 +            error_report("KVM guest kernels must be linked in useg. "
 +                         "Did you forget to enable CONFIG_KVM_GUEST?");
 +            exit(1);
 +        }
 +
          xlate_to_kseg0 = cpu_mips_kvm_um_phys_to_kseg0;
      } else {
 +        if (!(kernel_entry & 0x80000000ll)) {
 +            error_report("KVM guest kernels aren't supported with TCG. "
 +                         "Did you unintentionally enable CONFIG_KVM_GUEST?");
 +            exit(1);
 +        }
 +
          xlate_to_kseg0 = cpu_mips_phys_to_kseg0;
      }

Reviewed-by: Aurelien Jarno aurel...@aurel32.net
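
The check boils down to testing bit 31 of the entry point. A minimal sketch,
assuming kernel segments start at 0x80000000 and KVM guest kernels live below
that in useg; kernel_entry_ok() is a hypothetical helper, not something from
the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether a kernel entry point is acceptable
 * for the current accelerator. Bit 31 set means a kernel-segment address
 * (0x80000000 and above); bit 31 clear means a useg address, which is
 * where KVM guest kernels are linked.
 */
static bool kernel_entry_ok(int64_t kernel_entry, bool kvm)
{
    bool in_kernel_segment = (kernel_entry & 0x80000000ll) != 0;

    return kvm ? !in_kernel_segment : in_kernel_segment;
}
```

Because only bit 31 is examined, a sign-extended 32-bit entry address gives
the same answer as the zero-extended one.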
 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] [PATCH v5 0/3] s390: Support for Hotplug of Standby Memory

2014-06-27 Thread Igor Mammedov
On Wed, 25 Jun 2014 10:26:57 -0400
Matthew Rosato mjros...@linux.vnet.ibm.com wrote:

 This patchset adds support in s390 for a pool of standby memory,
 which can be set online/offline by the guest (ie, via chmem).
 The standby pool of memory is allocated as the difference between 
 the initial memory setting and the maxmem setting.
 As part of this work, additional results are provided for the 
 Read SCP Information SCLP, and new implementation is added for the 
 Read Storage Element Information, Attach Storage Element, 
 Assign Storage and Unassign Storage SCLPs, which enables the s390 
 guest to manipulate the standby memory pool.
 
 This patchset is based on work originally done by Jeng-Fang (Nick)
 Wang.
Could you add a short description of how to test it, please?

 
 Changes for v5:
  * Since ACPI memory hotplug is now in, removed Igor's patches 
from this set.
  * Updated sclp.c to use object_resolve_path() instead of 
object_property_find().
 
 Changes for v4:
  * Remove initialization code from get_sclp_memory_hotplug_dev()
and place in its own function, init_sclp_memory_hotplug_dev().
  * Add hint to qemu-options.hx to note the fact that the memory 
size specified via -m might be forced to a boundary.
  * Account for the legacy s390 machine, which does not support 
memory hotplug.
  * Fix a bug in sclp.c - Change memory hotplug device parent to 
sysbus.
  * Pulled latest version of Igor's patch. 
 
 Matthew Rosato (3):
   sclp-s390: Add device to manage s390 memory hotplug
   virtio-ccw: Include standby memory when calculating storage increment
   sclp-s390: Add memory hotplug SCLPs
 
  hw/s390x/s390-virtio-ccw.c |  46 +--
  hw/s390x/sclp.c            | 289 +++-
  include/hw/s390x/sclp.h    |  20 +++
  qemu-options.hx            |   3 +-
  target-s390x/cpu.h         |  18 +++
  target-s390x/kvm.c         |   5 +
  6 files changed, 366 insertions(+), 15 deletions(-)
 




Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support

2014-06-27 Thread Alexander Graf


 On 27.06.2014 at 09:53, Christian Borntraeger borntrae...@de.ibm.com wrote:
 
 On 26/06/14 16:42, Alexander Graf wrote:
 
 On 26.06.14 16:29, Jens Freimann wrote:
 Conny, Alex, Christian,
 
 here are some fixes for the s390-ccw bios. It's a mixture of
 additional features (DASD IPL support for different formats)
 and cleanups.
 
 From a quick glimpse it looks quite clean and straight forward, but I'd like 
 to make sure we get rid completely of the static sector size assumption.
 
 Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you then?

I'm not 100% convinced that we're safe on all users of SECTOR_SIZE. So please 
make sure to replace the occasions manually and audit every single one.

Alex

 
 Also, are we guaranteed that virtio always uses 512 byte block size? Or was 
 that just an internal API thing?
 
 The virtio-blk API always talks in 512 byte sectors, no matter the block size.
 
 Overall this is a nice improvement of the boot code - if possible I would 
 like to see that in 2.1.
 
 Conny, can you carry that in your tree (with s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)?
 
 Acked-by: Christian Borntraeger borntrae...@de.ibm.com
 
 for the series.
 
 
 Christian
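
To make the "512 byte sectors, no matter the block size" point concrete, here
is a small standalone sketch. VIRTIO_SECTOR_SIZE is assumed to be 512 as
stated in the thread; block_to_virtio_sector() is a hypothetical helper and
not a function from the s390-ccw bios.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_SECTOR_SIZE 512u  /* fixed unit of virtio-blk requests */

/*
 * Hypothetical helper: convert a logical block number on a device with
 * block_size bytes per block into a 512-byte virtio sector number.
 */
static uint64_t block_to_virtio_sector(uint64_t block_nr, uint32_t block_size)
{
    /* The sketch only handles block sizes that are multiples of 512. */
    assert(block_size != 0 && block_size % VIRTIO_SECTOR_SIZE == 0);

    return block_nr * (block_size / VIRTIO_SECTOR_SIZE);
}
```

With a 4096-byte block size, block 10 starts at virtio sector 80; with a
512-byte block size the two numbers coincide.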
 



Re: [Qemu-devel] [v5][PATCH 3/5] xen, gfx passthrough: support Intel IGD passthrough with VT-D

2014-06-27 Thread Chen, Tiejun

On 2014/6/25 15:04, Michael S. Tsirkin wrote:

On Wed, Jun 25, 2014 at 10:17:19AM +0800, Tiejun Chen wrote:

Some registers of Intel IGD are mapped in host bridge, so it needs to


[snip]


  static int is_vga_passthrough(XenHostPCIDevice *dev)
  {
@@ -291,3 +292,158 @@ static int create_pseudo_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
      XEN_PT_LOG(dev, "The pseudo Intel PCH ISA bridge created.\n");
      return 0;
  }
+
+int pci_create_pch(PCIBus *bus)



Please prefix all xen specific non static functions
with xen_ or something like this.


Okay.


pci_ is for pci core.

In fact it's a good idea to do this for static functions
as well, in case we add a conflicting function in
some header.


+{
+    XenHostPCIDevice hdev;
+    int r = 0;
+
+    if (!xen_has_gfx_passthru) {
+        return r;
+    }
+
+    r = xen_host_pci_device_get(&hdev, 0, 0, 0x1f, 0);
+    if (r) {
+        XEN_PT_ERR(NULL, "Failed to find Intel PCH on host\n");
+        goto err;
+    }
+
+    if (hdev.vendor_id == PCI_VENDOR_ID_INTEL) {
+        r = create_pseudo_pch_isa_bridge(bus, &hdev);
+        if (r) {
+            XEN_PT_ERR(NULL, "Failed to create PCH ISA bridge.\n");
+            goto err;
+        }
+    }


Does it work on non intel?


IGD means this should work on Intel platform.


It seems to return success.


Okay, I'd like to change this to return void.


Maybe you should just verify that vendor and device
ID have the expected values on the host, and


Vendor id is enough.


fail otherwise.


+
+    xen_host_pci_device_put(&hdev);
+
+err:
+    return r;
+}
+
+/*
+ * Currently we just pass this physical host bridge for IGD, 00:02.0.
+ *
+ * Here pci_dev is just that host bridge, so we have to get that real
+ * passthrough device by that given devfn to further confirm.
+ */



confirm what?


So change like:

* passthrough device by that given devfn to avoid other devices access.


Comments like this need to document what function does.

Maybe

/* Can we support IGD passthrough for this device?
  * We require ... XYZ - fill in here
  */


+static int is_igd_passthrough(PCIDevice *pci_dev)
+{
+    PCIDevice *f = pci_dev->bus->devices[PCI_DEVFN(2, 0)];
+    if (pci_dev->bus->devices[PCI_DEVFN(2, 0)]) {
+        XenPCIPassthroughState *s = DO_UPCAST(XenPCIPassthroughState, dev, f);
+        return (is_vga_passthrough(&s->real_device)
+                && (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL));
+    } else {
+        return 0;
+    }
+}
+
+void igd_pci_write(PCIDevice *pci_dev, uint32_t config_addr,
+                   uint32_t val, int len)


Same here, xen_ everywhere please.


Okay.




+{
+    XenHostPCIDevice dev;
+    int r;
+
+    /* IGD read/write is through the host bridge.
+     * ISA bridge is only for detect purpose. In i915 driver it will
+     * probe ISA bridge to discover the IGD, see comment in i915_drv.c:
+     * intel_detect_pch().


You mean in linux kernel I guess?


So change like,

* probe ISA bridge to discover the IGD, see comment in Linux:i915_drv.c:




+ */
+
+    assert(pci_dev->devfn == 0x00);
+
+    if (!is_igd_passthrough(pci_dev)) {
+        goto write_default;
+    }
+
+    /* Just work for the i915 driver. */
+    switch (config_addr) {
+    case 0x58:  /* PAVPC Offset */
+        break;
+    default:
+        /* Just sets the emulated values. */
+        goto write_default;
+    }
+
+    /* Host write */
+    r = xen_host_pci_device_get(&dev, 0, 0, 0, 0);
+    if (r) {
+        XEN_PT_ERR(pci_dev, "Can't get pci_dev_host_bridge\n");
+        abort();
+    }
+
+    r = xen_host_pci_set_block(&dev, config_addr, (uint8_t *)&val, len);
+    if (r) {
+        XEN_PT_ERR(pci_dev, "Can't get pci_dev_host_bridge\n");
+        abort();
+    }



Cleaner:

if (config_addr == 0x58) {


We may add other offsets in the future, so we'd better keep them in a 
switch().



    /* Host write */
    r = xen_host_pci_device_get(&dev, 0, 0, 0, 0);
    if (r) {
        XEN_PT_ERR(pci_dev, "Can't get pci_dev_host_bridge\n");
        abort();
    }

    r = xen_host_pci_set_block(&dev, config_addr, (uint8_t *)&val, len);
    if (r) {
        XEN_PT_ERR(pci_dev, "Can't get pci_dev_host_bridge\n");
        abort();
    }
}

Note this does not work on e.g. BE.


Why do we need to take BE into consideration here? Shouldn't PCI already be LE?


The best way is really to make the register writeable in wmask.
Then
    pci_default_write_config(pci_dev, config_addr, val, len);
    if (range_covers_byte(addr, len, 0x58)) {
        r = xen_host_pci_set_block(&dev, config_addr,
                                   pci_dev->config + config_addr, len);
    }
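
For reference, the kind of test that a "does this access touch that byte"
check performs can be sketched standalone as below. access_covers_byte() is a
made-up name for the illustration; it is written from the description in this
thread and is not a copy of any QEMU helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Standalone sketch: return true if the access starting at offset and
 * spanning len bytes includes the given byte address. The subtraction
 * form avoids overflow in offset + len.
 */
static bool access_covers_byte(uint64_t offset, uint64_t len, uint64_t byte)
{
    return offset <= byte && byte - offset < len;
}
```

A 4-byte config write at 0x56 therefore counts as touching 0x58, while a
4-byte write at 0x54 does not.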






+
+    xen_host_pci_device_put(&dev);
+
+    return;
+
+write_default:
+    pci_default_write_config(pci_dev, config_addr, val, len);
+}
+
+uint32_t igd_pci_read(PCIDevice *pci_dev, uint32_t config_addr, int len)
+{
+XenHostPCIDevice 

Re: [Qemu-devel] [v5][PATCH 5/5] xen, gfx passthrough: add opregion mapping

2014-06-27 Thread Chen, Tiejun

On 2014/6/25 15:13, Michael S. Tsirkin wrote:

On Wed, Jun 25, 2014 at 10:17:21AM +0800, Tiejun Chen wrote:


[snip]


diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 507165c..25147cf 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -63,7 +63,7 @@ typedef int (*xen_pt_conf_byte_read)
  #define XEN_PT_BAR_UNMAPPED (-1)

  #define PCI_CAP_MAX 48
-
+#define PCI_INTEL_OPREGION 0xfc



XEN_ please

PCI_CAP_MAX should be fixed too.


They are specific to PCI, not XEN. Why should we add such a prefix?






[snip]



+    if (igd_guest_opregion) {
+        ret = xc_domain_memory_mapping(xen_xc, xen_domid,
+                (unsigned long)(igd_guest_opregion >> XC_PAGE_SHIFT),
+                (unsigned long)(igd_host_opregion >> XC_PAGE_SHIFT),


don't spread casts all around.
Should be a last resort.


Okay.




+                3,
+                DPCI_REMOVE_MAPPING);
+        if (ret) {
+            return ret;
+        }
+    }
+
  return 0;
  }

@@ -447,3 +462,52 @@ err_out:
      XEN_PT_ERR(pci_dev, "Can't get pci_dev_host_bridge\n");
  return -1;
  }
+
+uint32_t igd_read_opregion(XenPCIPassthroughState *s)
+{
+    uint32_t val = 0;
+
+    if (igd_guest_opregion == 0) {


!igd_guest_opregion is shorter and does the same,


Okay.




+        return val;
+    }
+
+    val = igd_guest_opregion;
+
+    XEN_PT_LOG(&s->dev, "Read opregion val=%x\n", val);
+    return val;
+}
+
+void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val)
+{
+    int ret;
+
+    if (igd_guest_opregion) {
+        XEN_PT_LOG(&s->dev, "opregion register already been set, ignoring %x\n",
+                   val);
+        return;
+    }
+
+    xen_host_pci_get_block(&s->real_device, PCI_INTEL_OPREGION,
+            (uint8_t *)&igd_host_opregion, 4);
+    igd_guest_opregion = (unsigned long)(val & ~0xfff)
+            | (igd_host_opregion & 0xfff);
+


Clearly broken on BE.


I still can't understand why we need to address this in the BE case.


Maybe not important here but writing clean code is
just as easy.
uint8_t igd_host_opregion[4];

...

    xen_host_pci_get_block(&s->real_device, PCI_INTEL_OPREGION,
                           igd_host_opregion, sizeof igd_host_opregion);

    igd_guest_opregion = (val & ~0xfff) |
        (pci_get_word(igd_host_opregion) & 0xfff);

0xfff should be a macro too to avoid duplication.



Okay.

Thanks
Tiejun
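
On the BE question: the concern is that casting a uint32_t pointer to
uint8_t * and filling it byte by byte only yields the intended value on a
little-endian host. A minimal host-independent sketch follows; read_le32()
and guest_opregion() are hypothetical helpers written for this illustration,
and the 0xfff page-offset mask is taken from the code quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: assemble a 32-bit value from four bytes stored in
 * little-endian order, as PCI config space is. The result is the same on
 * little-endian and big-endian hosts because no pointer cast is involved.
 */
static uint32_t read_le32(const uint8_t *buf)
{
    assert(buf != NULL);

    return (uint32_t)buf[0] |
           ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 24);
}

/* Combine the page-aligned guest address with the host's page offset. */
static uint32_t guest_opregion(uint32_t val, const uint8_t host_reg[4])
{
    const uint32_t page_offset_mask = 0xfff;

    return (val & ~page_offset_mask) |
           (read_le32(host_reg) & page_offset_mask);
}
```

Naming the mask once also covers the remark about not repeating 0xfff.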



Re: [Qemu-devel] [patch qemu] net: move queue number into NICPeers

2014-06-27 Thread Stefan Hajnoczi
On Mon, May 26, 2014 at 12:04:08PM +0200, Jiri Pirko wrote:
 It indicates the number of elements in ncs field and makes sense to have
 int inside NICPeers. Also in parse_netdev we do not need to access
 container and work with NICPeers only.
 
 Signed-off-by: Jiri Pirko j...@resnulli.us
 ---
  hw/core/qdev-properties-system.c | 3 +--
  hw/net/virtio-net.c  | 2 +-
  include/net/net.h| 2 +-
  net/net.c| 4 ++--
  4 files changed, 5 insertions(+), 6 deletions(-)

Thanks, applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan




Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support

2014-06-27 Thread Christian Borntraeger
On 27/06/14 11:05, Alexander Graf wrote:
 
 
 On 27.06.2014 at 09:53, Christian Borntraeger 
 borntrae...@de.ibm.com wrote:

 On 26/06/14 16:42, Alexander Graf wrote:

 On 26.06.14 16:29, Jens Freimann wrote:
 Conny, Alex, Christian,

 here are some fixes for the s390-ccw bios. It's a mixture of
 additional features (DASD IPL support for different formats)
 and cleanups.

 From a quick glimpse it looks quite clean and straight forward, but I'd 
 like to make sure we get rid completely of the static sector size 
 assumption.

 Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you then?
 
 I'm not 100% convinced that we're safe on all users of SECTOR_SIZE. So please 
 make sure to replace the occasions manually and audit every single one.

Yes, a mindless sed would also replace VIRTIO_SECTOR_SIZE with 
VIRTIO_MAX_SECTOR_SIZE.
Fortunately there are only 3 places in bootmap.c. Should be simple enough to 
review.


 
 Alex
 

 Also, are we guaranteed that virtio always uses 512 byte block size? Or was 
 that just an internal API thing?

 The virtio-blk API always talks in 512 byte sectors, no matter the block 
 size.

 Overall this is a nice improvement of the boot code - if possible I would 
 like to see that in 2.1.

 Conny, can you carry that in your tree (with 
 s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)?

 Acked-by: Christian Borntraeger borntrae...@de.ibm.com

 for the series.


 Christian

 




Re: [Qemu-devel] [PATCH 4/5] PPC: e500: Support platform devices

2014-06-27 Thread Eric Auger
On 06/04/2014 02:28 PM, Alexander Graf wrote:
 For e500 our approach to supporting platform devices is to create a simple
 bus from the guest's point of view within which we map platform devices
 dynamically.
 
 We allocate memory regions always within the platform hole in address
 space and map IRQs to predetermined IRQ lines that are reserved for platform
 device usage.
 
 This maps really nicely into device tree logic, so we can just tell the
 guest about our virtual simple bus in device tree as well.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 ---
  default-configs/ppc-softmmu.mak   |   1 +
  default-configs/ppc64-softmmu.mak |   1 +
  hw/ppc/e500.c                     | 221 ++
  hw/ppc/e500.h                     |   1 +
  hw/ppc/e500plat.c                 |   1 +
  5 files changed, 225 insertions(+)
 
 diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
 index 33f8d84..d6ec8b9 100644
 --- a/default-configs/ppc-softmmu.mak
 +++ b/default-configs/ppc-softmmu.mak
 @@ -45,6 +45,7 @@ CONFIG_PREP=y
  CONFIG_MAC=y
  CONFIG_E500=y
  CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
 +CONFIG_PLATFORM=y
  # For PReP
  CONFIG_MC146818RTC=y
  CONFIG_ETSEC=y
 diff --git a/default-configs/ppc64-softmmu.mak b/default-configs/ppc64-softmmu.mak
 index 37a15b7..06677bf 100644
 --- a/default-configs/ppc64-softmmu.mak
 +++ b/default-configs/ppc64-softmmu.mak
 @@ -45,6 +45,7 @@ CONFIG_PSERIES=y
  CONFIG_PREP=y
  CONFIG_MAC=y
  CONFIG_E500=y
 +CONFIG_PLATFORM=y
  CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
  # For pSeries
  CONFIG_XICS=$(CONFIG_PSERIES)
 diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
 index 33d54b3..bc26215 100644
 --- a/hw/ppc/e500.c
 +++ b/hw/ppc/e500.c
 @@ -36,6 +36,7 @@
  #include "exec/address-spaces.h"
  #include "qemu/host-utils.h"
  #include "hw/pci-host/ppce500.h"
 +#include "hw/platform/device.h"
  
  #define EPAPR_MAGIC                (0x45504150)
  #define BINARY_DEVICE_TREE_FILE    "mpc8544ds.dtb"
 @@ -47,6 +48,14 @@
  
  #define RAM_SIZES_ALIGN            (64UL << 20)
  
 +#define E500_PLATFORM_BASE         0xF0000000ULL
 +#define E500_PLATFORM_HOLE         (128ULL * 1024 * 1024) /* 128 MB */
 +#define E500_PLATFORM_PAGE_SHIFT   12
 +#define E500_PLATFORM_HOLE_PAGES   (E500_PLATFORM_HOLE >> \
 +                                    E500_PLATFORM_PAGE_SHIFT)
 +#define E500_PLATFORM_FIRST_IRQ    5
 +#define E500_PLATFORM_NUM_IRQS     10
 +
  /* TODO: parameterize */
  #define MPC8544_CCSRBAR_BASE       0xE0000000ULL
  #define MPC8544_CCSRBAR_SIZE       0x00100000ULL
 @@ -122,6 +131,62 @@ static void dt_serial_create(void *fdt, unsigned long long offset,
  }
  }
  
 +typedef struct PlatformDevtreeData {
 +    void *fdt;
 +    const char *mpic;
 +    int irq_start;
 +    const char *node;
 +} PlatformDevtreeData;
 +
 +static int platform_device_create_devtree(Object *obj, void *opaque)
 +{
 +    PlatformDevtreeData *data = opaque;
 +    Object *dev;
 +    PlatformDeviceState *pdev;
 +
 +    dev = object_dynamic_cast(obj, TYPE_PLATFORM_DEVICE);
 +    pdev = (PlatformDeviceState *)dev;
 +
 +    if (!pdev) {
 +        /* Container, traverse it for children */
 +        return object_child_foreach(obj, platform_device_create_devtree, data);
 +    }
 +
 +    return 0;
 +}
 +
 +static void platform_create_devtree(void *fdt, const char *node, uint64_t addr,
 +                                    const char *mpic, int irq_start,
 +                                    int nr_irqs)
 +{
 +    const char platcomp[] = "qemu,platform\0simple-bus";
 +    PlatformDevtreeData data;
 +
 +    /* Create a /platform node that we can put all devices into */
 +
 +    qemu_fdt_add_subnode(fdt, node);
 +    qemu_fdt_setprop(fdt, node, "compatible", platcomp, sizeof(platcomp));
 +    qemu_fdt_setprop_string(fdt, node, "device_type", "platform");
 +
 +    /* Our platform hole is less than 32bit big, so 1 cell is enough for address
 +       and size */
 +    qemu_fdt_setprop_cells(fdt, node, "#size-cells", 1);
 +    qemu_fdt_setprop_cells(fdt, node, "#address-cells", 1);
 +    qemu_fdt_setprop_cells(fdt, node, "ranges", 0, addr >> 32, addr,
 +                           E500_PLATFORM_HOLE);
 +
 +    qemu_fdt_setprop_phandle(fdt, node, "interrupt-parent", mpic);
 +
 +    /* Loop through all devices and create nodes for known ones */
 +
 +    data.fdt = fdt;
 +    data.mpic = mpic;
 +    data.irq_start = irq_start;
 +    data.node = node;
 +
 +    platform_device_create_devtree(qdev_get_machine(), &data);
 +}
 +
  static int ppce500_load_device_tree(MachineState *machine,
  PPCE500Params *params,
  hwaddr addr,
 @@ -379,6 +444,12 @@ static int ppce500_load_device_tree(MachineState *machine,
      qemu_fdt_setprop_cell(fdt, pci, "#address-cells", 3);
      qemu_fdt_setprop_string(fdt, "/aliases", "pci0", pci);
  
 +    if (params->has_platform) {
 +        platform_create_devtree(fdt, "/platform", 

Re: [Qemu-devel] [PATCH qom v2 1/4] sdhci: Fix misuse of qemu_free_irqs()

2014-06-27 Thread Andreas Färber
On 18.06.2014 09:54, Peter Crosthwaite wrote:
 From: Andreas Färber afaer...@suse.de
 
 It does a g_free() on the pointer.
 
 Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 Reviewed-by: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Andreas Färber afaer...@suse.de
 Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

Thanks for picking this up and reviewing, applied to qom-next with
extended commit message:
https://github.com/afaerber/qemu-cpu/commits/qom-next

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 0/5] qemu-char/monitor: make monitor_puts thread safe

2014-06-27 Thread Stefan Hajnoczi
On Tue, Jun 03, 2014 at 06:39:05PM +0200, Paolo Bonzini wrote:
 Even though virtio-blk-dataplane mostly synchronizes with the block layer
 by means of the AioContext, we still need to introduce mutexes for other
 QEMU subsystems that the dataplane thread might encounter on its way.
 Adding rerror/werror support, for example, means that the dataplane
 thread will have to generate QMP events.
 
 monitor_puts is the entry point for generating QMP responses and events.
 Making it thread-safe lets virtio-blk-dataplane threads generate QMP
 events; because the same entry point is also used for responses, a
 response and an event will never be intertwined.
 
 Protection is inserted at both the qemu-char and monitor levels.
 A generic mutex is necessary in qemu_chr_fe_write so that
 qemu_chr_fe_write_all does not break its output; we reuse that
 mutex in some of the character devices.
 
 There is no need to protect against removal of the monitor's backend,
 since the monitor itself cannot be removed.
 
 Paolo Bonzini (6):
   qemu-char: introduce qemu_chr_alloc
   qemu-char: do not call chr_write directly
   qemu-char: move pty_chr_update_read_handler around
   qemu-char: make writes thread-safe
   monitor: protect outbuf with mutex
   monitor: protect event emission
 
  backends/baum.c       |   2 +-
  backends/msmouse.c    |   2 +-
  include/sysemu/char.h |  20 ++--
  monitor.c             |  55 ++
  qemu-char.c           | 125 +-
  spice-qemu-char.c     |   2 +-
  ui/console.c          |   2 +-
  7 files changed, 149 insertions(+), 59 deletions(-)

Modulo Fam's missing unlock comment:

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com




Re: [Qemu-devel] [PATCH for 2.1 0/2] Fix commit of oversized layer

2014-06-27 Thread Kevin Wolf
On 25.06.2014 at 22:55, Jeff Cody wrote:
 This fixes a regression in block-commit; if the top image is larger than the
 base image, we attempt to resize the base image.  The regression is that we
 fail the image truncate operation, returning -EBUSY.

Thanks, applied to the block branch.

One thing I'm not sure about is whether commit (all of synchronous,
live and live on active layer) should check the RESIZE blocker before
resizing the backing file.

In general, it feels like it would be the right thing to do, especially
considering the goal of operation categories in the final state, but on
the other hand it means that RESIZE would have to be excluded from
bs->backing_blocker, too, allowing standalone resize commands on backing
files. Not sure that this would be a good idea...

Kevin



Re: [Qemu-devel] [PATCH qom v2 2/4] hw: Fix qemu_allocate_irqs() leaks

2014-06-27 Thread Andreas Färber
On 18.06.2014 09:55, Peter Crosthwaite wrote:
 From: Andreas Färber afaer...@suse.de
 
 Replace qemu_allocate_irqs(foo, bar, 1)[0]
 with qemu_allocate_irq(foo, bar, 0).
 
 This avoids leaking the dereferenced qemu_irq *.
 
 Cc: Kirill Batuzov batuz...@ispras.ru
 Cc: Markus Armbruster arm...@redhat.com
 Cc: Peter Maydell peter.mayd...@linaro.org
 Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 Reviewed-by: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Andreas Färber afaer...@suse.de
 [PC Changes:
  * Applied change to instance in sh4/sh7750.c
 ]
 Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 ---
 Changed since 1:
 Applied change to instance in sh4/sh7750.c (Kirill review)
[...]
 diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c
 index 4a39357..9ccd770 100644
 --- a/hw/sh4/sh7750.c
 +++ b/hw/sh4/sh7750.c
 @@ -838,6 +838,5 @@ SH7750State *sh7750_init(SuperHCPU *cpu, MemoryRegion *sysmem)
  qemu_irq sh7750_irl(SH7750State *s)
  {
      sh_intc_toggle_source(sh_intc_source(&s->intc, IRL), 1, 0); /* enable */
 -    return qemu_allocate_irqs(sh_intc_set_irl, sh_intc_source(&s->intc, IRL),
 -                               1)[0];
 +    return qemu_allocate_irq(sh_intc_set_irl, sh_intc_source(&s->intc, IRL),
 +                             1);

Thanks for catching this, my grep expression failed due to the line
break. But shouldn't this be 0 due to the zero-based index, as per my
commit message? Will fix up unless I hear objections.

Regards,
Andreas

  }
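
For illustration, a minimal standalone sketch of why the last argument differs between the two calls. The types and the sketch_* names are hypothetical stand-ins, not the QEMU implementation; the only assumption taken from the thread is that the array variant takes a count while the single variant takes a zero-based line index.

```c
#include <stdlib.h>

typedef void (*sketch_irq_handler)(void *opaque, int n, int level);

/* Hypothetical stand-in for an IRQ object. */
typedef struct SketchIRQ {
    sketch_irq_handler handler;
    void *opaque;
    int n;                      /* zero-based line index */
} SketchIRQ;

/* Single allocation: the third argument is the line index, not a count. */
SketchIRQ *sketch_allocate_irq(sketch_irq_handler handler, void *opaque, int n)
{
    SketchIRQ *irq = malloc(sizeof(*irq));
    if (irq) {
        irq->handler = handler;
        irq->opaque = opaque;
        irq->n = n;
    }
    return irq;
}

/*
 * Array allocation: the third argument is a count. The caller owns the
 * returned array as well as its elements, so writing
 * sketch_allocate_irqs(h, o, 1)[0] loses the only pointer to the array.
 */
SketchIRQ **sketch_allocate_irqs(sketch_irq_handler handler, void *opaque,
                                 int count)
{
    SketchIRQ **irqs = calloc((size_t)count, sizeof(*irqs));
    if (!irqs) {
        return NULL;
    }
    for (int i = 0; i < count; i++) {
        irqs[i] = sketch_allocate_irq(handler, opaque, i);
    }
    return irqs;
}
```

In this model, sketch_allocate_irqs(h, o, 1)[0] yields an IRQ with n == 0, so the leak-free equivalent is sketch_allocate_irq(h, o, 0).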

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 00/10] pc-bios/s390-ccw: Add DASD IPL support

2014-06-27 Thread Cornelia Huck
On Fri, 27 Jun 2014 11:27:12 +0200
Christian Borntraeger borntrae...@de.ibm.com wrote:

 On 27/06/14 11:05, Alexander Graf wrote:
  
  
  Am 27.06.2014 um 09:53 schrieb Christian Borntraeger 
  borntrae...@de.ibm.com:
 
  On 26/06/14 16:42, Alexander Graf wrote:
 
  On 26.06.14 16:29, Jens Freimann wrote:
  Conny, Alex, Christian,
 
  here are some fixes for the s390-ccw bios. It's a mixture of
  additional features (DASD IPL support for different formats)
  and cleanups.
 
  From a quick glimpse it looks quite clean and straightforward, but I'd 
  like to make sure we completely get rid of the static sector size 
  assumption.
 
  Should be. I guess s/SECTOR_SIZE/MAX_SECTOR_SIZE/g would be ok for you 
  then?
  
  I'm not 100% convinced that we're safe on all users of SECTOR_SIZE. So 
  please make sure to replace the occasions manually and audit every single 
  one.
 
 Yes, a mindless sed would also replace VIRTIO_SECTOR_SIZE with 
 VIRTIO_MAX_SECTOR_SIZE.
 Fortunately there are only 3 places in bootmap.c. Should be simple enough to 
 review.

Yes, all places that use it want a MAX_SECTOR_SIZE. All places using
the actual sector size are now using the helper function.

 
  Also, are we guaranteed that virtio always uses 512 byte block size? Or 
  was that just an internal API thing?
 
  The virtio-blk API always talks in 512 byte sectors, no matter the block 
  size.
 
  Overall this is a nice improvement of the boot code - if possible I would 
  like to see that in 2.1.
 
  Conny, can you carry that in your tree (with 
  s/SECTOR_SIZE/MAX_SECTOR_SIZE/g)?
 
  Acked-by: Christian Borntraeger borntrae...@de.ibm.com
 
  for the series.

Will push out shortly.

Unless there are objections, I'll send a pull request for this.
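
As a side note on the 512-byte point above, here is a minimal sketch of converting a logical block number into the 512-byte sector units that virtio-blk requests use. The function names and the SKETCH_ constant are hypothetical and not taken from the s390-ccw bios; the sketch assumes the block size is a multiple of 512.

```c
#include <stdint.h>

/* virtio-blk requests are expressed in 512-byte sectors. */
#define SKETCH_VIRTIO_SECTOR_SIZE 512u

/* Number of 512-byte sectors that make up one logical block. */
static inline uint32_t sketch_sectors_per_block(uint32_t block_size)
{
    return block_size / SKETCH_VIRTIO_SECTOR_SIZE;
}

/* First 512-byte sector of logical block 'block'. */
static inline uint64_t sketch_block_to_sector(uint64_t block,
                                              uint32_t block_size)
{
    return block * sketch_sectors_per_block(block_size);
}
```

With a 4096-byte block size, block 3 starts at sector 24; with 512-byte blocks the mapping is the identity.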




Re: [Qemu-devel] [PATCH qom v2 0/4] QOMify IRQs

2014-06-27 Thread Andreas Färber
Am 25.06.2014 11:39, schrieb Peter Crosthwaite:
 Ping!
 
 This is fully reviewed and should be ready for a merge. I'd like to see
 this through for 2.1.

I have been very wary of applying the QOM conversion without full device
test coverage, similar to realization. People actually testing this
conversion would've been more reaffirming than a bit of review - the
hardfreeze can but does not necessarily uncover all corner cases. But
time is running out, so I intend to apply the series unless I discover
issues.

qtests for missing devices or statistics on how incomplete our coverage
actually is are appreciated, as always.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH v11 1/3] sPAPR: Implement EEH RTAS calls

2014-06-27 Thread Gavin Shan
On Thu, Jun 26, 2014 at 12:46:50PM +0200, Alexander Graf wrote:

On 26.06.14 12:43, Gavin Shan wrote:
On Thu, Jun 26, 2014 at 12:30:16PM +0200, Alexander Graf wrote:
On 26.06.14 03:35, Gavin Shan wrote:
The emulation for EEH RTAS requests from guest isn't covered
by QEMU yet and the patch implements them.

The patch defines constants used by EEH RTAS calls and adds
callback sPAPRPHBClass::eeh_handler, which is going to be used
this way:

1. RTAS calls are received in spapr_pci.c, sanity check is done
there.
2. RTAS handlers handle what they can. If there is something it
cannot handle and sPAPRPHBClass::eeh_handler callback is defined,
it is called.
3. sPAPRPHBClass::eeh_handler is only implemented for VFIO now. It
does ioctl() to the IOMMU container fd to complete the call. Error
codes from that ioctl() are transferred back to the guest.
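
The dispatch described in steps 1-3 can be sketched in isolation as follows. This is only an illustration under assumptions: SketchPHB, sketch_handle_eeh_request and sketch_demo_handler are hypothetical names, and the real code looks the callback up through the QOM class rather than a plain struct field.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchPHB SketchPHB;
typedef int (*sketch_eeh_handler)(SketchPHB *phb, uint32_t req, uint32_t opt);

/* Hypothetical stand-in for a PHB with an optional EEH callback. */
struct SketchPHB {
    uint64_t buid;
    sketch_eeh_handler eeh_handler;   /* NULL if no backend handles EEH */
};

/* A 64-bit BUID is assembled from two 32-bit RTAS arguments. */
uint64_t sketch_make_buid(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Forward the request to the backend, or report that nobody handles it. */
int sketch_handle_eeh_request(SketchPHB *phb, uint32_t req, uint32_t opt)
{
    if (!phb || !phb->eeh_handler) {
        return -ENOENT;
    }
    return phb->eeh_handler(phb, req, opt);
}

/* Demo backend used only to exercise the dispatch path. */
int sketch_demo_handler(SketchPHB *phb, uint32_t req, uint32_t opt)
{
    (void)phb;
    return (int)(req + opt);
}
```

A negative return value is what the RTAS wrappers would translate into a parameter error for the guest.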

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  hw/ppc/spapr_pci.c  | 240 
 
  include/hw/pci-host/spapr.h |   7 ++
  include/hw/ppc/spapr.h  |  33 ++
  3 files changed, 280 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 131434b..8712051 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -422,6 +422,233 @@ static void 
rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
  rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
  }
+static int rtas_handle_eeh_request(sPAPREnvironment *spapr,
+   uint64_t buid, uint32_t req, uint32_t opt)
+{
+sPAPRPHBState *sphb = spapr_find_phb(spapr, buid);
+sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+
+if (!sphb || !info->eeh_handler) {
+return -ENOENT;
+}
+
+return info->eeh_handler(sphb, req, opt);
+}
+
+static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args, uint32_t nret,
+target_ulong rets)
+{
+uint32_t addr, option;
+uint64_t buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+int ret;
+
+if ((nargs != 4) || (nret != 1)) {
+goto param_error_exit;
+}
+
+addr = rtas_ld(args, 0);
+option = rtas_ld(args, 3);
+switch (option) {
+case RTAS_EEH_ENABLE:
+if (!find_dev(spapr, buid, addr)) {
+goto param_error_exit;
+}
+break;
+case RTAS_EEH_DISABLE:
+case RTAS_EEH_THAW_IO:
+case RTAS_EEH_THAW_DMA:
+break;
+default:
+goto param_error_exit;
+}
+
+ret = rtas_handle_eeh_request(spapr, buid,
+  RTAS_EEH_REQ_SET_OPTION, option);
+if (ret >= 0) {
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+return;
+}
+
+param_error_exit:
+rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
+   sPAPREnvironment *spapr,
+   uint32_t token, uint32_t nargs,
+   target_ulong args, uint32_t 
nret,
+   target_ulong rets)
+{
+uint32_t addr, option;
+uint64_t buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+sPAPRPHBState *sphb = spapr_find_phb(spapr, buid);
+sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+PCIDevice *pdev;
+
+if (!sphb || !info->eeh_handler) {
+goto param_error_exit;
+}
+
+if ((nargs != 4) || (nret != 2)) {
+goto param_error_exit;
+}
+
+addr = rtas_ld(args, 0);
+option = rtas_ld(args, 3);
+if (option != RTAS_GET_PE_ADDR && option != RTAS_GET_PE_MODE) {
+goto param_error_exit;
+}
+
+pdev = find_dev(spapr, buid, addr);
+if (!pdev) {
+goto param_error_exit;
+}
+
+/*
+ * For now, we always have bus level PE whose address
+ * has format 00BBSS00. The guest OS might regard
+ * PE address 0 as invalid. We avoid that simply by
+ * extending it with one.
+ */
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+if (option == RTAS_GET_PE_ADDR) {
+rtas_st(rets, 1, (pci_bus_num(pdev->bus) << 16) + 1);
+} else {
+rtas_st(rets, 1, RTAS_PE_MODE_SHARED);
+}
+
+return;
+
+param_error_exit:
+rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args, uint32_t 
nret,
+target_ulong rets)
+{
+uint64_t buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+int ret;
+
+if ((nargs != 3) || (nret != 4  

Re: [Qemu-devel] VNC memory corruption during resolution change

2014-06-27 Thread Peter Lieven
Found the issue:

 During a resolution change in Windows 7 it sometimes switches to an
 intermediate resolution where server_stride % cmp_bytes != 0. The memory
 corruption happens where the guest fb is copied to the server fb. It can
 easily be fixed by truncating cmp_bytes in vnc_refresh_server_surface.
 But looking at the code, it seems that none of the encoders called in
 vnc_send_framebuffer_update really care about
 w > pixman_image_get_width(vd->server). I will send a patch that removes
 all DIV_ROUND_UPs for now to avoid corruption. There are almost no real
 resolutions out there where width % 16 != 0. If we find some we might need
 to either decrease VNC_DIRTY_PIXELS_PER_BIT or make it dynamic depending
 on the resolution.
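
To make the truncation idea concrete, here is a minimal standalone sketch of a row comparison that clamps the last chunk when the stride is not a multiple of the chunk size. sketch_sync_row and its parameters are hypothetical; this is not the code in vnc_refresh_server_surface, only the clamping technique applied to plain byte buffers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare one row chunk by chunk and copy only the chunks that changed.
 * The final chunk is clamped to the bytes that remain in the row, so
 * neither the compare nor the copy runs past 'stride'.
 * Returns the number of chunks that were copied.
 */
size_t sketch_sync_row(unsigned char *server, const unsigned char *guest,
                       size_t stride, size_t cmp_bytes)
{
    size_t copied = 0;

    if (cmp_bytes == 0) {
        return 0;               /* avoid an endless loop on bad input */
    }
    for (size_t off = 0; off < stride; off += cmp_bytes) {
        size_t n = stride - off;
        if (n > cmp_bytes) {
            n = cmp_bytes;      /* full chunk */
        }
        if (memcmp(server + off, guest + off, n) != 0) {
            memcpy(server + off, guest + off, n);
            copied++;
        }
    }
    return copied;
}
```

With stride 10 and cmp_bytes 4 the chunks are 4, 4 and 2 bytes long; without the clamp the third chunk would touch 2 bytes beyond the row.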

Peter


Am 26.06.2014 17:44, schrieb Peter Lieven:
 Hi all,

 while playing around with the vmware vga driver I noticed that there seems
 to be a race condition when the resolution is changed. I was able to trigger
 this also with std vga. Attached valgrind produces always an output similar 
 to this:

 ==3346== Thread 1:
 ==3346== Invalid read of size 8
 ==3346==at 0x4C2D108: memcpy@@GLIBC_2.14 (in 
 /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==3346==by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
 ==3346==by 0x400F19: vnc_refresh (vnc.c:2753)
 ==3346==by 0x3DA903: dpy_refresh (console.c:1416)
 ==3346==by 0x3D6D93: gui_update (console.c:194)
 ==3346==by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
 ==3346==by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
 ==3346==by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
 ==3346==by 0x3649CF: main_loop_wait (main-loop.c:490)
 ==3346==by 0x406540: main_loop (vl.c:2051)
 ==3346==by 0x40DEA0: main (vl.c:4507)
 ==3346==  Address 0x12555180 is not stack'd, malloc'd or (recently) free'd
 ==3346==
 ==3346== Invalid write of size 8
 ==3346==at 0x4C2D10D: memcpy@@GLIBC_2.14 (in 
 /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==3346==by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
 ==3346==by 0x400F19: vnc_refresh (vnc.c:2753)
 ==3346==by 0x3DA903: dpy_refresh (console.c:1416)
 ==3346==by 0x3D6D93: gui_update (console.c:194)
 ==3346==by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
 ==3346==by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
 ==3346==by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
 ==3346==by 0x3649CF: main_loop_wait (main-loop.c:490)
 ==3346==by 0x406540: main_loop (vl.c:2051)
 ==3346==by 0x40DEA0: main (vl.c:4507)
 ==3346==  Address 0x15731080 is not stack'd, malloc'd or (recently) free'd
 ==3346==
 ==3346== Invalid read of size 8
 ==3346==at 0x4C2D11A: memcpy@@GLIBC_2.14 (in 
 /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==3346==by 0x400DB2: vnc_refresh_server_surface (vnc.c:2723)
 ==3346==by 0x400F19: vnc_refresh (vnc.c:2753)
 ==3346==by 0x3DA903: dpy_refresh (console.c:1416)
 ==3346==by 0x3D6D93: gui_update (console.c:194)
 ==3346==by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
 ==3346==by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
 ==3346==by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
 ==3346==by 0x3649CF: main_loop_wait (main-loop.c:490)
 ==3346==by 0x406540: main_loop (vl.c:2051)
 ==3346==by 0x40DEA0: main (vl.c:4507)
 ==3346==  Address 0x12555170 is not stack'd, malloc'd or (recently) free'd
 ==3346==
 ==3346== Invalid read of size 1
 ==3346==at 0x4C2DCC0: bcmp (in 
 /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==3346==by 0x400D91: vnc_refresh_server_surface (vnc.c:2720)
 ==3346==by 0x400F19: vnc_refresh (vnc.c:2753)
 ==3346==by 0x3DA903: dpy_refresh (console.c:1416)
 ==3346==by 0x3D6D93: gui_update (console.c:194)
 ==3346==by 0x3B06C0: timerlist_run_timers (qemu-timer.c:488)
 ==3346==by 0x3B072C: qemu_clock_run_timers (qemu-timer.c:499)
 ==3346==by 0x3B0B4F: qemu_clock_run_all_timers (qemu-timer.c:605)
 ==3346==by 0x3649CF: main_loop_wait (main-loop.c:490)
 ==3346==by 0x406540: main_loop (vl.c:2051)
 ==3346==by 0x40DEA0: main (vl.c:4507)
 ==3346==  Address 0x15731050 is 0 bytes after a block of size 196,560 alloc'd
 ==3346==at 0x4C29DB4: calloc (in 
 /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==3346==by 0x70C8B1A: ??? (in 
 /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
 ==3346==by 0x70C8BF4: ??? (in 
 /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.30.2)
 ==3346==by 0x3FAECC: vnc_dpy_switch (vnc.c:590)
 ==3346==by 0x3DA87C: dpy_gfx_replace_surface (console.c:1404)
 ==3346==by 0x3DBCF0: qemu_console_resize (console.c:1857)
 ==3346==by 0x450A39: vga_draw_text (vga.c:1344)
 ==3346==by 0x4521B0: vga_update_display (vga.c:1910)
 ==3346==by 0x2A665B: vmsvga_update_display (vmware_vga.c:1071)
 ==3346==by 0x3D7087: graphic_hw_update (console.c:256)
 ==3346==

Re: [Qemu-devel] About AddressSpace in intel-iommu emulation

2014-06-27 Thread Jan Kiszka
On 2014-06-27 07:46, Le Tan wrote:
 2014-06-27 12:55 GMT+08:00 Paolo Bonzini pbonz...@redhat.com:
 Il 27/06/2014 04:08, Le Tan ha scritto:

 1. In struct IOMMUTLBEntry, I think the addr_mask field should be the
 mask of the page offset, right? But I see different usages of this
 field. In spapr_tce_translate_iommu(), the addr_mask field is assigned
 with the mask of the page offset. However, in pbm_translate_iommu(),
 in the passthrough case, the addr_mask field seems to be assigned the
 mask of the page number. Is there any problem here?


 The intended usage is the one of spapr_tce_translate_iommu().  In practice
 it doesn't matter, both work.


 2. For q35, how to identify origination of DMA requests? The VT-d
 manual says we should use source-id(for PCI-Express devices, it is
 requester identifier) to map devices to domains. What is the related
 part in QEMU? Where can I get the source-id of a DMA request?


 You need to create a different AddressSpace for each PCI bus or device.
 
 How to create a different AddressSpace for each device? I thought a
 AddressSpace just belongs to a PCI bus before. The paging structures
 for different functions of the same device can also be different, too.
 So maybe we should create a different AddressSpace for each function?
 How to achieve it? Could you give me some more hints or is there any
 existing example in QEMU?

I would suggest to study the apb IOMMU implementation Paolo referenced
and the PCI layer functions used by that code. Specifically,
pci_setup_iommu takes a callback that is supposed to return an address
space to be used for a particular device. For apb, it's the same for all
devices on a bus, but that's not required...
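
As a rough illustration of "a callback that returns an address space for a particular device", the sketch below keeps one lazily created address space per devfn. All names (SketchAddrSpace, SketchIommuState, sketch_iommu_find_as) are hypothetical stand-ins and do not use the QEMU types; the assumption is only that the callback receives an opaque state pointer and the device's devfn.

```c
#include <stdlib.h>

#define SKETCH_DEVFN_MAX 256    /* 32 devices x 8 functions per bus */

/* Hypothetical stand-in for an address space object. */
typedef struct SketchAddrSpace {
    int devfn;
} SketchAddrSpace;

/* Per-bus IOMMU state: one slot per function, filled on first use. */
typedef struct SketchIommuState {
    SketchAddrSpace *per_devfn[SKETCH_DEVFN_MAX];
} SketchIommuState;

/*
 * Callback shape: given the opaque state and a devfn, return the address
 * space to use for that function. Returning the same pointer for every
 * devfn would model the one-space-per-bus behaviour instead.
 */
SketchAddrSpace *sketch_iommu_find_as(void *opaque, int devfn)
{
    SketchIommuState *s = opaque;

    if (devfn < 0 || devfn >= SKETCH_DEVFN_MAX) {
        return NULL;
    }
    if (!s->per_devfn[devfn]) {
        SketchAddrSpace *as = calloc(1, sizeof(*as));
        if (!as) {
            return NULL;
        }
        as->devfn = devfn;
        s->per_devfn[devfn] = as;
    }
    return s->per_devfn[devfn];
}
```

Because the lookup is keyed by devfn, two functions of the same device get distinct address spaces, which matches the per-function paging structures mentioned in the question.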

Jan






Re: [Qemu-devel] [PATCH 0/3] virtio-blk: Suppress error action on r/w beyond end

2014-06-27 Thread Stefan Hajnoczi
On Thu, Jun 05, 2014 at 02:15:33PM +0200, Markus Armbruster wrote:
 When a device model's I/O operation fails, we execute the error
 action.  This lets layers above QEMU implement thin provisioning, or
 attempt to correct errors before they reach the guest.  But when the
 I/O operation fails because it's invalid, reporting the error to the
 guest is the only sensible action.
 
 This short series does exactly that for virtio-blk.  I intend to do
 the same for IDE and SCSI.
 
 Markus Armbruster (3):
   virtio-blk: Factor common checks out of virtio_blk_handle_read/write()
   virtio-blk: Bypass error action and I/O accounting on invalid r/w
   virtio-blk: Treat read/write beyond end as invalid
 
  hw/block/virtio-blk.c | 45 +
  1 file changed, 29 insertions(+), 16 deletions(-)
 
 -- 
 1.9.3
 

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com




Re: [Qemu-devel] [PATCH 3/3] virtio-blk: Treat read/write beyond end as invalid

2014-06-27 Thread Stefan Hajnoczi
On Mon, Jun 23, 2014 at 02:57:36PM +0200, Markus Armbruster wrote:
 Markus Armbruster arm...@redhat.com writes:
 
  Stefan Hajnoczi stefa...@redhat.com writes:
 
  On Thu, Jun 05, 2014 at 02:15:36PM +0200, Markus Armbruster wrote:
  +if (sector > total_sectors || nb_sectors > total_sectors - sector) {
  +return false;
  +}
 
  if (sector >= total_sectors || ...) {
 
  I suspect reading bdrv_check_byte_request() put the '>' in my brain:
 
  if ((offset > len) || (len - offset < size))
  return -EIO;
 
  Don't we need offset >= len here?
 
 Just remembered: we don't, because we allow I/O at offset len provided
 size is zero.
 
 Same reasoning applies to my patch.

Okay.  I didn't remember the offset=eof length=0 thing.
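
For reference, a minimal standalone sketch of the range check discussed above. sketch_request_in_range is a hypothetical name; the comparison itself is the one from the patch, written so that the subtraction cannot wrap and so that a zero-length request at end of device is accepted.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [sector, sector + nb_sectors) lies within the device.
 * Checking sector first guarantees total_sectors - sector does not wrap,
 * and comparing against the remainder avoids computing sector + nb_sectors,
 * which could overflow.
 */
bool sketch_request_in_range(uint64_t sector, uint64_t nb_sectors,
                             uint64_t total_sectors)
{
    if (sector > total_sectors || nb_sectors > total_sectors - sector) {
        return false;
    }
    return true;
}
```

Using '>' rather than '>=' for the first comparison is what permits sector == total_sectors with nb_sectors == 0.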

Stefan




Re: [Qemu-devel] [PATCH] docs/multiple-iothreads.txt: add documentation on IOThread programming

2014-06-27 Thread Stefan Hajnoczi
On Mon, Jun 09, 2014 at 09:29:31AM -0600, Eric Blake wrote:
 On 06/09/2014 07:59 AM, Stefan Hajnoczi wrote:
  This document explains how IOThreads and the main loop are related,
  especially how to write code that can run in an IOThread.  Currently on
  virtio-blk-data-plane uses these techniques.  The next obvious target is
  virtio-scsi; there has also been work on virtio-net.
  
  Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
  ---
   docs/multiple-iothreads.txt | 124 
  
   1 file changed, 124 insertions(+)
   create mode 100644 docs/multiple-iothreads.txt
  
  diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
  new file mode 100644
  index 000..f2b008d
  --- /dev/null
  +++ b/docs/multiple-iothreads.txt
  @@ -0,0 +1,124 @@
  +This document explains the IOThread feature and how to write code that runs
  +outside the QEMU global mutex.
 
 Pre-existing epidemic in this directory, but should you assert copyright
 and a license?

Yes, I'm happy to do that.




Re: [Qemu-devel] [PATCH] docs/multiple-iothreads.txt: add documentation on IOThread programming

2014-06-27 Thread Stefan Hajnoczi
On Mon, Jun 09, 2014 at 04:11:29PM +0200, Paolo Bonzini wrote:
 +The main loop and IOThreads
 +---
 +QEMU is an event-driven program that can do several things at once using an
 +event loop.  The VNC server and the QMP monitor are both processed from the
 +same event loop which monitors their file descriptors until they become
 +readable and then invokes a callback.
 +
 +The default event loop is called the main loop (see main-loop.c).  It is
 +possible to create additional event loop threads using -object
 +iothread,id=my-iothread.
 +
 +Side note: The main loop and IOThread are both event loops but their code is
 +not shared completely.  Sometimes it is useful to remember that although 
 they
 +are conceptually similar they are currently not interchangeable.
 
 Actually, the main loop does include all the iothread code.  So you could
 say that the main loop is a superset of the iothread.

Not quite.  The main loop includes AioContext but it does not use
iothread.c (IOThread).

 + * LEGACY timer_new_ms() - create a timer
 + * LEGACY qemu_bh_new() - create a BH
 + * LEGACY qemu_aio_wait() - run an event loop iteration
 
 also seems to be unused except for qemu-io-cmds.c (and easily removed from
 there).
 
 Perhaps add a note (here or elsewhere) that timer_new_ms/qemu_bh_new should
 never be used in the block layer?

I'll note it further down where the block layer is mentioned.




[Qemu-devel] [RFC PATCH 0/3] cpu: add device_add foo-x86_64-cpu support

2014-06-27 Thread Gu Zheng
This series is based on the previous patchset from Chen Fan:
https://lists.nongnu.org/archive/html/qemu-devel/2014-05/msg02360.html

This series makes cpu hotplug work with device_add and makes
-device foo-x86_64-cpu available. The apic-id property can be set on
the command line; if it is not set, we use the first unoccupied APIC ID
as the default for the new CPU.
When hotplugging a cpu with device_add, the additional APIC ID check is
done after cpu object initialization, unlike the 'cpu_add' command,
which checks 'ids' at the beginning.

Chen Fan (2):
  cpu: introduce CpuTopoInfo structure for argument simplification
  cpu: add device_add foo-x86_64-cpu support

Gu Zheng (1):
  qom/cpu: move register_vmstate to common CPUClass.realizefn

 exec.c  |   32 ++---
 hw/intc/apic_common.c   |3 +-
 include/hw/i386/apic_internal.h |3 +-
 include/qom/cpu.h   |3 ++
 qdev-monitor.c  |1 +
 qom/cpu.c   |2 +
 target-i386/cpu.c   |   76 --
 target-i386/topology.h  |   51 ++
 8 files changed, 135 insertions(+), 36 deletions(-)

-- 
1.7.7




[Qemu-devel] [RFC PATCH 1/3] cpu: introduce CpuTopoInfo structure for argument simplification

2014-06-27 Thread Gu Zheng
Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com
Reviewed-by: Eduardo Habkost ehabk...@redhat.com
Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
---
 target-i386/topology.h |   33 +
 1 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/target-i386/topology.h b/target-i386/topology.h
index 07a6c5f..e9ff89c 100644
--- a/target-i386/topology.h
+++ b/target-i386/topology.h
@@ -47,6 +47,12 @@
  */
 typedef uint32_t apic_id_t;
 
+typedef struct X86CPUTopoInfo {
+unsigned pkg_id;
+unsigned core_id;
+unsigned smt_id;
+} X86CPUTopoInfo;
+
 /* Return the bit width needed for 'count' IDs
  */
 static unsigned apicid_bitwidth_for_count(unsigned count)
@@ -92,13 +98,11 @@ static inline unsigned apicid_pkg_offset(unsigned nr_cores, 
unsigned nr_threads)
  */
 static inline apic_id_t apicid_from_topo_ids(unsigned nr_cores,
  unsigned nr_threads,
- unsigned pkg_id,
- unsigned core_id,
- unsigned smt_id)
+ const X86CPUTopoInfo *topo)
 {
-return (pkg_id  << apicid_pkg_offset(nr_cores, nr_threads)) |
-   (core_id << apicid_core_offset(nr_cores, nr_threads)) |
-   smt_id;
+return (topo->pkg_id  << apicid_pkg_offset(nr_cores, nr_threads)) |
+   (topo->core_id << apicid_core_offset(nr_cores, nr_threads)) |
+   topo->smt_id;
 }
 
 /* Calculate thread/core/package IDs for a specific topology,
@@ -107,14 +111,12 @@ static inline apic_id_t apicid_from_topo_ids(unsigned 
nr_cores,
 static inline void x86_topo_ids_from_idx(unsigned nr_cores,
  unsigned nr_threads,
  unsigned cpu_index,
- unsigned *pkg_id,
- unsigned *core_id,
- unsigned *smt_id)
+ X86CPUTopoInfo *topo)
 {
 unsigned core_index = cpu_index / nr_threads;
-*smt_id = cpu_index % nr_threads;
-*core_id = core_index % nr_cores;
-*pkg_id = core_index / nr_cores;
+topo->smt_id = cpu_index % nr_threads;
+topo->core_id = core_index % nr_cores;
+topo->pkg_id = core_index / nr_cores;
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
@@ -125,10 +127,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned 
nr_cores,
 unsigned nr_threads,
 unsigned cpu_index)
 {
-unsigned pkg_id, core_id, smt_id;
-x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index,
-  &pkg_id, &core_id, &smt_id);
-return apicid_from_topo_ids(nr_cores, nr_threads, pkg_id, core_id, smt_id);
+X86CPUTopoInfo topo;
+x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index, &topo);
+return apicid_from_topo_ids(nr_cores, nr_threads, &topo);
 }
 
 #endif /* TARGET_I386_TOPOLOGY_H */
-- 
1.7.7
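
For readers following the arithmetic, here is a minimal standalone sketch of the index-to-APIC-ID mapping using the X86CPUTopoInfo struct from the patch. The sketch_* helpers are hypothetical, and the field offsets are an assumption: the SMT field is taken to be as wide as needed for nr_threads IDs and the core field as wide as needed for nr_cores IDs, which the hunks above reference but do not show.

```c
#include <stdint.h>

typedef uint32_t apic_id_t;

typedef struct X86CPUTopoInfo {
    unsigned pkg_id;
    unsigned core_id;
    unsigned smt_id;
} X86CPUTopoInfo;

/* Bits needed to hold IDs 0..count-1 (assumes 1 <= count <= 2^31). */
static unsigned sketch_bitwidth_for_count(unsigned count)
{
    unsigned width = 0;
    while (width < 31 && (1u << width) < count) {
        width++;
    }
    return width;
}

/* Split a linear CPU index into thread/core/package IDs. */
static void sketch_topo_ids_from_idx(unsigned nr_cores, unsigned nr_threads,
                                     unsigned cpu_index, X86CPUTopoInfo *topo)
{
    unsigned core_index = cpu_index / nr_threads;
    topo->smt_id = cpu_index % nr_threads;
    topo->core_id = core_index % nr_cores;
    topo->pkg_id = core_index / nr_cores;
}

/* Pack the three IDs into one APIC ID (assumed layout: pkg | core | smt). */
static apic_id_t sketch_apicid_from_topo(unsigned nr_cores,
                                         unsigned nr_threads,
                                         const X86CPUTopoInfo *topo)
{
    unsigned core_offset = sketch_bitwidth_for_count(nr_threads);
    unsigned pkg_offset = core_offset + sketch_bitwidth_for_count(nr_cores);

    return (topo->pkg_id << pkg_offset) |
           (topo->core_id << core_offset) |
           topo->smt_id;
}

static apic_id_t sketch_apicid_from_cpu_idx(unsigned nr_cores,
                                            unsigned nr_threads,
                                            unsigned cpu_index)
{
    X86CPUTopoInfo topo;
    sketch_topo_ids_from_idx(nr_cores, nr_threads, cpu_index, &topo);
    return sketch_apicid_from_topo(nr_cores, nr_threads, &topo);
}
```

With 3 cores and 2 threads per core, the core field needs 2 bits, so CPU index 6 (package 1, core 0, thread 0) maps to APIC ID 8 rather than 6: APIC IDs are not contiguous when a count is not a power of two.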




[Qemu-devel] [RFC PATCH 2/3] qom/cpu: move register_vmstate to common CPUClass.realizefn

2014-06-27 Thread Gu Zheng
Move cpu vmstate registration from cpu_exec_init into cpu_common_realizefn,
and apic vmstate registration into x86_cpu_apic_realize. Use
cc->get_arch_id as the instance id, as suggested by Igor, to
fix the migration issue.

Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
---
 exec.c  |   32 +++-
 hw/intc/apic_common.c   |3 +--
 include/hw/i386/apic_internal.h |3 ++-
 include/qom/cpu.h   |2 ++
 qom/cpu.c   |2 ++
 target-i386/cpu.c   |   12 +---
 6 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/exec.c b/exec.c
index 4e179a6..61ad996 100644
--- a/exec.c
+++ b/exec.c
@@ -468,10 +468,28 @@ void tcg_cpu_address_space_init(CPUState *cpu, 
AddressSpace *as)
 }
 #endif
 
+void cpu_vmstate_register(CPUState *cpu)
+{
+CPUClass *cc = CPU_GET_CLASS(cpu);
+int cpu_index = cc->get_arch_id(cpu);
+
+if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
+vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
+}
+#if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
+register_savevm(NULL, "cpu", cpu_index, CPU_SAVE_VERSION,
+cpu_save, cpu_load, cpu->env_ptr);
+assert(cc->vmsd == NULL);
+assert(qdev_get_vmsd(DEVICE(cpu)) == NULL);
+#endif
+if (cc->vmsd != NULL) {
+vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
+}
+}
+
 void cpu_exec_init(CPUArchState *env)
 {
 CPUState *cpu = ENV_GET_CPU(env);
-CPUClass *cc = CPU_GET_CLASS(cpu);
 CPUState *some_cpu;
 int cpu_index;
 
@@ -494,18 +512,6 @@ void cpu_exec_init(CPUArchState *env)
 #if defined(CONFIG_USER_ONLY)
 cpu_list_unlock();
 #endif
-if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
-vmstate_register(NULL, cpu_index, &vmstate_cpu_common, cpu);
-}
-#if defined(CPU_SAVE_VERSION) && !defined(CONFIG_USER_ONLY)
-register_savevm(NULL, "cpu", cpu_index, CPU_SAVE_VERSION,
-cpu_save, cpu_load, env);
-assert(cc->vmsd == NULL);
-assert(qdev_get_vmsd(DEVICE(cpu)) == NULL);
-#endif
-if (cc->vmsd != NULL) {
-vmstate_register(NULL, cpu_index, cc->vmsd, cpu);
-}
 }
 
 #if defined(TARGET_HAS_ICE)
diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index ce3d903..029f67d 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -345,7 +345,7 @@ static int apic_dispatch_post_load(void *opaque, int 
version_id)
 return 0;
 }
 
-static const VMStateDescription vmstate_apic_common = {
+const VMStateDescription vmstate_apic_common = {
 .name = "apic",
 .version_id = 3,
 .minimum_version_id = 3,
@@ -391,7 +391,6 @@ static void apic_common_class_init(ObjectClass *klass, void 
*data)
 ICCDeviceClass *idc = ICC_DEVICE_CLASS(klass);
 DeviceClass *dc = DEVICE_CLASS(klass);
 
-dc->vmsd = &vmstate_apic_common;
 dc->reset = apic_reset_common;
 dc->props = apic_properties_common;
 idc->realize = apic_common_realize;
diff --git a/include/hw/i386/apic_internal.h b/include/hw/i386/apic_internal.h
index 83e2a42..8a645cf 100644
--- a/include/hw/i386/apic_internal.h
+++ b/include/hw/i386/apic_internal.h
@@ -23,6 +23,7 @@
 #include "exec/memory.h"
 #include "hw/cpu/icc_bus.h"
 #include "qemu/timer.h"
+#include "migration/vmstate.h"
 
 /* APIC Local Vector Table */
 #define APIC_LVT_TIMER  0
@@ -136,7 +137,7 @@ typedef struct VAPICState {
 } QEMU_PACKED VAPICState;
 
 extern bool apic_report_tpr_access;
-
+extern const VMStateDescription vmstate_apic_common;
 void apic_report_irq_delivered(int delivered);
 bool apic_next_timer(APICCommonState *s, int64_t current_time);
 void apic_enable_tpr_access_reporting(DeviceState *d, bool enable);
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 4b352a2..87eecd2 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -548,6 +548,8 @@ void cpu_interrupt(CPUState *cpu, int mask);
 
 #endif /* USER_ONLY */
 
+void cpu_vmstate_register(CPUState *cpu);
+
 #ifdef CONFIG_SOFTMMU
 static inline void cpu_unassigned_access(CPUState *cpu, hwaddr addr,
  bool is_write, bool is_exec,
diff --git a/qom/cpu.c b/qom/cpu.c
index fada2d4..5158343 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -296,6 +296,8 @@ static void cpu_common_realizefn(DeviceState *dev, Error 
**errp)
 {
 CPUState *cpu = CPU(dev);
 
+cpu_vmstate_register(cpu);
+
 if (dev->hotplugged) {
 cpu_synchronize_post_init(cpu);
 notifier_list_notify(&cpu_added_notifiers, dev);
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 8983457..10f6d53 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2554,13 +2554,19 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error 
**errp)
 
 static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp)
 {
-if (cpu->apic_state == NULL) {
+DeviceState *apic_state = cpu->apic_state;
+CPUClass *cc = CPU_GET_CLASS(CPU(cpu));
+
+if (apic_state == NULL) {
 return;
 }
 
-if 

[Qemu-devel] [RFC PATCH 3/3] cpu: add device_add foo-x86_64-cpu support

2014-06-27 Thread Gu Zheng
From: Chen Fan chen.fan.f...@cn.fujitsu.com

Add support for device_add foo-x86_64-cpu. Additional checks of the
APIC ID are added to x86_cpuid_set_apic_id() and x86_cpu_apic_create()
to catch duplicates. Besides, in order to support -device/device_add
foo-x86_64-cpu without a specified APIC ID, we add a new function,
get_free_apic_id(), which provides the first free APIC ID each time to
avoid APIC ID duplicates.
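
The first-free search described here can be sketched in isolation as below. The function-pointer parameters and the sketch_demo_* helpers are hypothetical, added only so the example is self-contained; the loop and the fall-through return mirror get_free_apic_id() in the diff that follows.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t (*sketch_id_from_index_fn)(unsigned idx);
typedef bool (*sketch_id_in_use_fn)(uint32_t id);

/*
 * Walk the CPU indexes in order and return the first APIC ID that no
 * existing CPU uses. If every slot is taken, return the ID for index
 * max_cpus, which is beyond the valid range and can be rejected later.
 */
uint32_t sketch_first_free_id(unsigned max_cpus,
                              sketch_id_from_index_fn id_from_index,
                              sketch_id_in_use_fn in_use)
{
    for (unsigned i = 0; i < max_cpus; i++) {
        uint32_t id = id_from_index(i);
        if (!in_use(id)) {
            return id;
        }
    }
    return id_from_index(max_cpus);
}

/* Demo mapping: index i has ID 2*i (IDs need not be contiguous). */
uint32_t sketch_demo_id_from_index(unsigned idx)
{
    return idx * 2;
}

/* Demo occupancy: IDs 0 and 2 are already taken. */
bool sketch_demo_in_use(uint32_t id)
{
    return id < 4;
}
```

The out-of-range result for a full machine relies on a later limit check to turn it into an error, which is what the added check in x86_cpu_apic_create() appears intended to do.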

Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com
Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
---
 include/qom/cpu.h  |1 +
 qdev-monitor.c |1 +
 target-i386/cpu.c  |   64 +++-
 target-i386/topology.h |   18 +
 4 files changed, 83 insertions(+), 1 deletions(-)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 87eecd2..87bd652 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -291,6 +291,7 @@ struct CPUState {
 QTAILQ_HEAD(CPUTailQ, CPUState);
 extern struct CPUTailQ cpus;
 #define CPU_NEXT(cpu) QTAILQ_NEXT(cpu, node)
+#define CPU_REMOVE(cpu) QTAILQ_REMOVE(&cpus, cpu, node)
 #define CPU_FOREACH(cpu) QTAILQ_FOREACH(cpu, &cpus, node)
 #define CPU_FOREACH_SAFE(cpu, next_cpu) \
 QTAILQ_FOREACH_SAFE(cpu, &cpus, node, next_cpu)
diff --git a/qdev-monitor.c b/qdev-monitor.c
index f87f3d8..48327c8 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -24,6 +24,7 @@
 #include "qmp-commands.h"
 #include "sysemu/arch_init.h"
 #include "qemu/config-file.h"
+#include "qom/object_interfaces.h"
 
 /*
  * Aliases were a bad idea from the start.  Let's keep them
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 10f6d53..b058b70 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -49,6 +49,7 @@
 #include "hw/i386/apic_internal.h"
 #endif
 
+#include "qom/object_interfaces.h"
 
 /* Cache topology CPUID constants: */
 
@@ -1550,6 +1551,7 @@ static void x86_cpuid_set_apic_id(Object *obj, Visitor 
*v, void *opaque,
 const int64_t max = UINT32_MAX;
 Error *error = NULL;
 int64_t value;
+X86CPUTopoInfo topo;
 
 if (dev->realized) {
 error_setg(errp, "Attempt to set property '%s' on '%s' after "
@@ -1569,10 +1571,24 @@ static void x86_cpuid_set_apic_id(Object *obj, Visitor 
*v, void *opaque,
 return;
 }
 
+if (value > x86_cpu_apic_id_from_index(max_cpus - 1)) {
+error_setg(errp, "CPU with APIC ID %" PRIi64
+" is more than MAX APIC ID limits", value);
+return;
+}
+
+x86_topo_ids_from_apic_id(smp_cores, smp_threads, value, &topo);
+if (topo.smt_id >= smp_threads || topo.core_id >= smp_cores) {
+error_setg(errp, "CPU with APIC ID %" PRIi64 " does not match "
+   "topology configuration.", value);
+return;
+}
+
 if ((value != cpu->env.cpuid_apic_id) && cpu_exists(value)) {
 error_setg(errp, "CPU with APIC ID %" PRIi64 " exists", value);
 return;
 }
+
 cpu->env.cpuid_apic_id = value;
 }
 
@@ -1994,12 +2010,22 @@ out:
 return cpu;
 }
 
+static void x86_cpu_cpudef_instance_init(Object *obj)
+{
+DeviceState *dev = DEVICE(obj);
+
+dev->hotplugged = true;
+}
+
 static void x86_cpu_cpudef_class_init(ObjectClass *oc, void *data)
 {
 X86CPUDefinition *cpudef = data;
 X86CPUClass *xcc = X86_CPU_CLASS(oc);
+DeviceClass *dc = DEVICE_CLASS(oc);
 
 xcc->cpu_def = cpudef;
+
+dc->cannot_instantiate_with_device_add_yet = false;
 }
 
 static void x86_register_cpudef_type(X86CPUDefinition *def)
@@ -2008,6 +2034,8 @@ static void x86_register_cpudef_type(X86CPUDefinition 
*def)
 TypeInfo ti = {
 .name = typename,
 .parent = TYPE_X86_CPU,
+.instance_size = sizeof(X86CPU),
+.instance_init = x86_cpu_cpudef_instance_init,
 .class_init = x86_cpu_cpudef_class_init,
 .class_data = def,
 };
@@ -2544,8 +2572,17 @@ static void x86_cpu_apic_create(X86CPU *cpu, Error 
**errp)
 return;
 }
 
+if (env->cpuid_apic_id > x86_cpu_apic_id_from_index(max_cpus - 1)) {
+error_setg(errp, "CPU with APIC ID %" PRIi32
+ " is more than MAX APIC ID:%" PRIi32,
+env->cpuid_apic_id,
+x86_cpu_apic_id_from_index(max_cpus - 1));
+return;
+}
+
 object_property_add_child(OBJECT(cpu), "apic",
   OBJECT(cpu->apic_state), NULL);
+
 qdev_prop_set_uint8(cpu->apic_state, "id", env->cpuid_apic_id);
 /* TODO: convert to link */
 apic = APIC_COMMON(cpu->apic_state);
@@ -2681,6 +2718,21 @@ uint32_t x86_cpu_apic_id_from_index(unsigned int 
cpu_index)
 }
 }
 
+static uint32_t get_free_apic_id(void)
+{
+int i;
+
+for (i = 0; i < max_cpus; i++) {
+uint32_t id = x86_cpu_apic_id_from_index(i);
+
+if (!cpu_exists(id)) {
+return id;
+}
+}
+
+return x86_cpu_apic_id_from_index(max_cpus);
+}
+
 static void x86_cpu_initfn(Object *obj)
 {
 CPUState *cs = CPU(obj);
@@ -2688,7 +2740,9 @@ static void x86_cpu_initfn(Object 

Re: [Qemu-devel] [RFC PATCH 1/3] cpu: introduce CpuTopoInfo structure for argument simplification

2014-06-27 Thread Gu Zheng
Correct the author.
From: Chen Fan chen.fan.f...@cn.fujitsu.com

On 06/27/2014 06:03 PM, Gu Zheng wrote:

 Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com
 Reviewed-by: Eduardo Habkost ehabk...@redhat.com
 Signed-off-by: Gu Zheng guz.f...@cn.fujitsu.com
 ---
  target-i386/topology.h |   33 +
  1 files changed, 17 insertions(+), 16 deletions(-)
 
 diff --git a/target-i386/topology.h b/target-i386/topology.h
 index 07a6c5f..e9ff89c 100644
 --- a/target-i386/topology.h
 +++ b/target-i386/topology.h
 @@ -47,6 +47,12 @@
   */
  typedef uint32_t apic_id_t;
  
 +typedef struct X86CPUTopoInfo {
 +unsigned pkg_id;
 +unsigned core_id;
 +unsigned smt_id;
 +} X86CPUTopoInfo;
 +
  /* Return the bit width needed for 'count' IDs
   */
  static unsigned apicid_bitwidth_for_count(unsigned count)
 @@ -92,13 +98,11 @@ static inline unsigned apicid_pkg_offset(unsigned 
 nr_cores, unsigned nr_threads)
   */
  static inline apic_id_t apicid_from_topo_ids(unsigned nr_cores,
   unsigned nr_threads,
 - unsigned pkg_id,
 - unsigned core_id,
 - unsigned smt_id)
 + const X86CPUTopoInfo *topo)
  {
 -return (pkg_id << apicid_pkg_offset(nr_cores, nr_threads)) |
 -   (core_id << apicid_core_offset(nr_cores, nr_threads)) |
 -   smt_id;
 +return (topo->pkg_id << apicid_pkg_offset(nr_cores, nr_threads)) |
 +   (topo->core_id << apicid_core_offset(nr_cores, nr_threads)) |
 +   topo->smt_id;
  }
  
  /* Calculate thread/core/package IDs for a specific topology,
 @@ -107,14 +111,12 @@ static inline apic_id_t apicid_from_topo_ids(unsigned 
 nr_cores,
  static inline void x86_topo_ids_from_idx(unsigned nr_cores,
   unsigned nr_threads,
   unsigned cpu_index,
 - unsigned *pkg_id,
 - unsigned *core_id,
 - unsigned *smt_id)
 + X86CPUTopoInfo *topo)
  {
  unsigned core_index = cpu_index / nr_threads;
 -*smt_id = cpu_index % nr_threads;
 -*core_id = core_index % nr_cores;
 -*pkg_id = core_index / nr_cores;
 +topo->smt_id = cpu_index % nr_threads;
 +topo->core_id = core_index % nr_cores;
 +topo->pkg_id = core_index / nr_cores;
  }
  
  /* Make APIC ID for the CPU 'cpu_index'
 @@ -125,10 +127,9 @@ static inline apic_id_t x86_apicid_from_cpu_idx(unsigned 
 nr_cores,
  unsigned nr_threads,
  unsigned cpu_index)
  {
 -unsigned pkg_id, core_id, smt_id;
 -x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index,
 -  &pkg_id, &core_id, &smt_id);
 -return apicid_from_topo_ids(nr_cores, nr_threads, pkg_id, core_id, 
 smt_id);
 +X86CPUTopoInfo topo;
 +x86_topo_ids_from_idx(nr_cores, nr_threads, cpu_index, &topo);
 +return apicid_from_topo_ids(nr_cores, nr_threads, &topo);
  }
  
  #endif /* TARGET_I386_TOPOLOGY_H */
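
For readers following the refactoring, here is a minimal standalone sketch of
the same computation outside QEMU. The X86CPUTopoInfo structure and the
division/modulo logic follow the patch above. The bit-width and offset helpers
(bitwidth_for_count, core_offset, pkg_offset) are simplified stand-ins written
for this example and are an assumption, not the code from
target-i386/topology.h.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t apic_id_t;

/* Same shape as the structure introduced by the patch. */
typedef struct X86CPUTopoInfo {
    unsigned pkg_id;
    unsigned core_id;
    unsigned smt_id;
} X86CPUTopoInfo;

/* Number of bits needed to hold 'count' distinct IDs (0 .. count - 1).
 * Simplified stand-in written for this sketch. */
unsigned bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count != 0; count >>= 1) {
        width++;
    }
    return width;
}

/* The core ID field starts right above the SMT ID field. */
unsigned core_offset(unsigned nr_threads)
{
    return bitwidth_for_count(nr_threads);
}

/* The package ID field starts above the core ID field. */
unsigned pkg_offset(unsigned nr_cores, unsigned nr_threads)
{
    return core_offset(nr_threads) + bitwidth_for_count(nr_cores);
}

/* Split a linear CPU index into package/core/thread IDs. */
void topo_ids_from_idx(unsigned nr_cores, unsigned nr_threads,
                       unsigned cpu_index, X86CPUTopoInfo *topo)
{
    unsigned core_index = cpu_index / nr_threads;

    topo->smt_id = cpu_index % nr_threads;
    topo->core_id = core_index % nr_cores;
    topo->pkg_id = core_index / nr_cores;
}

/* Pack the three IDs into one APIC ID. */
apic_id_t apicid_from_topo_ids(unsigned nr_cores, unsigned nr_threads,
                               const X86CPUTopoInfo *topo)
{
    return (topo->pkg_id << pkg_offset(nr_cores, nr_threads)) |
           (topo->core_id << core_offset(nr_threads)) |
           topo->smt_id;
}

/* Convenience wrapper: one struct is passed by address between the helpers. */
apic_id_t apicid_from_cpu_idx(unsigned nr_cores, unsigned nr_threads,
                              unsigned cpu_index)
{
    X86CPUTopoInfo topo;

    topo_ids_from_idx(nr_cores, nr_threads, cpu_index, &topo);
    return apicid_from_topo_ids(nr_cores, nr_threads, &topo);
}
```

With 3 cores and 2 threads per package, the SMT field is 1 bit wide and the
core field is 2 bits wide in this sketch, so CPU index 7 (package 1, core 0,
thread 1) maps to APIC ID 9.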





Re: [Qemu-devel] [PATCH qom v2 2/4] hw: Fix qemu_allocate_irqs() leaks

2014-06-27 Thread Peter Crosthwaite
On Fri, Jun 27, 2014 at 7:45 PM, Andreas Färber afaer...@suse.de wrote:
 Am 18.06.2014 09:55, schrieb Peter Crosthwaite:
 From: Andreas Färber afaer...@suse.de

 Replace qemu_allocate_irqs(foo, bar, 1)[0]
 with qemu_allocate_irq(foo, bar, 0).

 This avoids leaking the dereferenced qemu_irq *.

 Cc: Kirill Batuzov batuz...@ispras.ru
 Cc: Markus Armbruster arm...@redhat.com
 Cc: Peter Maydell peter.mayd...@linaro.org
 Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 Reviewed-by: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Andreas Färber afaer...@suse.de
 [PC Changes:
  * Applied change to instance in sh4/sh7750.c
 ]
 Signed-off-by: Peter Crosthwaite peter.crosthwa...@xilinx.com
 ---
 Changed since 1:
 Applied change to instance in sh4/sh7750.c (Kirill review)
 [...]
 diff --git a/hw/sh4/sh7750.c b/hw/sh4/sh7750.c
 index 4a39357..9ccd770 100644
 --- a/hw/sh4/sh7750.c
 +++ b/hw/sh4/sh7750.c
 @@ -838,6 +838,5 @@ SH7750State *sh7750_init(SuperHCPU *cpu, MemoryRegion 
 *sysmem)
  qemu_irq sh7750_irl(SH7750State *s)
  {
  sh_intc_toggle_source(sh_intc_source(s->intc, IRL), 1, 0); /* enable */
 -return qemu_allocate_irqs(sh_intc_set_irl, sh_intc_source(s->intc, 
 IRL),
 -   1)[0];
 +return qemu_allocate_irq(sh_intc_set_irl, sh_intc_source(s->intc, 
 IRL), 1);

 Thanks for catching this, my grep expression failed due to the line
 break. But shouldn't this be 0 due to the zero-based index, as per my
 commit message? Will fix up unless I hear objections.


Yep, sorry.

Regards,
Peter

 Regards,
 Andreas

  }

 --
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
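
The leak and the zero-based index discussed in this thread can be illustrated
with a small toy model. Everything below (ToyIRQ, toy_allocate_irq,
toy_allocate_irqs, old_pattern, new_pattern) is hypothetical and written only
for this sketch; it mimics the shape of the calls in the patch, not QEMU's
actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for this sketch only; not QEMU's real types. */
typedef void (*toy_irq_handler)(void *opaque, int n, int level);

typedef struct ToyIRQ {
    toy_irq_handler handler;
    void *opaque;
    int n;                  /* line number handed back to the handler */
} ToyIRQ;

/* Allocate a single IRQ object for line number 'n'. */
ToyIRQ *toy_allocate_irq(toy_irq_handler handler, void *opaque, int n)
{
    ToyIRQ *irq = malloc(sizeof(*irq));

    if (irq) {
        irq->handler = handler;
        irq->opaque = opaque;
        irq->n = n;
    }
    return irq;
}

/* Allocate an array of 'count' IRQ pointers, numbered 0 .. count - 1.
 * The caller owns the array as well as the objects it points to. */
ToyIRQ **toy_allocate_irqs(toy_irq_handler handler, void *opaque, int count)
{
    ToyIRQ **irqs;
    int i;

    assert(count > 0);
    irqs = calloc((size_t)count, sizeof(*irqs));
    if (!irqs) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        irqs[i] = toy_allocate_irq(handler, opaque, i);
    }
    return irqs;
}

/* Old pattern: element 0 is returned, but the pointer to the array itself
 * is dropped, so the array can never be freed (deliberate leak here). */
ToyIRQ *old_pattern(toy_irq_handler handler, void *opaque)
{
    return toy_allocate_irqs(handler, opaque, 1)[0];
}

/* New pattern: no intermediate array. The last argument is the line
 * number, so the equivalent of "first of one" is 0, not 1. */
ToyIRQ *new_pattern(toy_irq_handler handler, void *opaque)
{
    return toy_allocate_irq(handler, opaque, 0);
}
```

In this model the third argument changes meaning between the two functions
(a count for the plural form, a line number for the singular one), which is
why a mechanical replacement that keeps the literal 1 yields line number 1
instead of 0.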




[Qemu-devel] [PATCH v2] docs/multiple-iothreads.txt: add documentation on IOThread programming

2014-06-27 Thread Stefan Hajnoczi
This document explains how IOThreads and the main loop are related,
especially how to write code that can run in an IOThread.  Currently
only virtio-blk-data-plane uses these techniques.  The next obvious
target is virtio-scsi; there has also been work on virtio-net.

Signed-off-by: Stefan Hajnoczi stefa...@redhat.com
---
v2:
 * Mention AioContext file descriptor monitoring is POSIX host only [Paolo]
 * Add note that block layer code must use AioContext APIs [Paolo]
 * Add copyright and license header [Eric]
 * Add missing comma [Eric]
 * Fix s/on/only/ typo in commit description [Fam]

 docs/multiple-iothreads.txt | 134 
 1 file changed, 134 insertions(+)
 create mode 100644 docs/multiple-iothreads.txt

diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
new file mode 100644
index 000..01d2491
--- /dev/null
+++ b/docs/multiple-iothreads.txt
@@ -0,0 +1,134 @@
+Copyright (c) 2014 Red Hat Inc.
+
+This work is licensed under the terms of the GNU GPL, version 2.  See
+the COPYING file in the top-level directory.
+
+
+This document explains the IOThread feature and how to write code that runs
+outside the QEMU global mutex.
+
+The main loop and IOThreads
+---------------------------
+QEMU is an event-driven program that can do several things at once using an
+event loop.  The VNC server and the QMP monitor are both processed from the
+same event loop, which monitors their file descriptors until they become
+readable and then invokes a callback.
+
+The default event loop is called the main loop (see main-loop.c).  It is
+possible to create additional event loop threads using -object
+iothread,id=my-iothread.
+
+Side note: The main loop and IOThread are both event loops but their code is
+not shared completely.  Sometimes it is useful to remember that although they
+are conceptually similar they are currently not interchangeable.
+
+Why IOThreads are useful
+------------------------
+IOThreads allow the user to control the placement of work.  The main loop is a
+scalability bottleneck on hosts with many CPUs.  Work can be spread across
+several IOThreads instead of just one main loop.  When set up correctly this
+can improve I/O latency and reduce jitter seen by the guest.
+
+The main loop is also deeply associated with the QEMU global mutex, which is a
+scalability bottleneck in itself.  vCPU threads and the main loop use the QEMU
+global mutex to serialize execution of QEMU code.  This mutex is necessary
+because a lot of QEMU's code historically was not thread-safe.
+
+The fact that all I/O processing is done in a single main loop and that the
+QEMU global mutex is contended by all vCPU threads and the main loop explain
+why it is desirable to place work into IOThreads.
+
+The experimental virtio-blk data-plane implementation has been benchmarked and
+shows these effects:
+ftp://public.dhe.ibm.com/linux/pdfs/KVM_Virtualized_IO_Performance_Paper.pdf
+
+How to program for IOThreads
+----------------------------
+The main difference between legacy code and new code that can run in an
+IOThread is dealing explicitly with the event loop object, AioContext
+(see include/block/aio.h).  Code that only works in the main loop
+implicitly uses the main loop's AioContext.  Code that supports running
+in IOThreads must be aware of its AioContext.
+
+AioContext supports the following services:
+ * File descriptor monitoring (read/write/error on POSIX hosts)
+ * Event notifiers (inter-thread signalling)
+ * Timers
+ * Bottom Halves (BH) deferred callbacks
+
+There are several old APIs that use the main loop AioContext:
+ * LEGACY qemu_aio_set_fd_handler() - monitor a file descriptor
+ * LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
+ * LEGACY timer_new_ms() - create a timer
+ * LEGACY qemu_bh_new() - create a BH
+ * LEGACY qemu_aio_wait() - run an event loop iteration
+
+Since they implicitly work on the main loop they cannot be used in code that
+runs in an IOThread.  They might cause a crash or deadlock if called from an
+IOThread since the QEMU global mutex is not held.
+
+Instead, use the AioContext functions directly (see include/block/aio.h):
+ * aio_set_fd_handler() - monitor a file descriptor
+ * aio_set_event_notifier() - monitor an event notifier
+ * aio_timer_new() - create a timer
+ * aio_bh_new() - create a BH
+ * aio_poll() - run an event loop iteration
+
+The AioContext can be obtained from the IOThread using
+iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
+Code that takes an AioContext argument works both in IOThreads or the main
+loop, depending on which AioContext instance the caller passes in.
+
+How to synchronize with an IOThread
+-----------------------------------
+AioContext is not thread-safe so some rules must be followed when using file
+descriptors, event notifiers, timers, or BHs across threads:
+
+1. AioContext functions can be called safely from file descriptor, event
+notifier, 

Re: [Qemu-devel] [PATCH v1 1/1] char: cadence_uart: Convert to realize()

2014-06-27 Thread Peter Maydell
On 27 June 2014 01:11, Peter Crosthwaite peter.crosthwa...@xilinx.com wrote:
 On Tue, Jun 24, 2014 at 4:06 PM, Alistair Francis
 alistair.fran...@xilinx.com wrote:
 SysBusDevice::init is deprecated. Convert to Object::init and
 Device::realize as prescribed by QOM conventions.

 Signed-off-by: Alistair Francis alistair.fran...@xilinx.com

 Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 CC Peter for target-arm.

I think at this point given we're quite close to hardfreeze
I'd prefer not to take this, since it's just cleanup.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v2] hw/net/eepro100: Implement read-only bits in MDI registers

2014-06-27 Thread Stefan Hajnoczi
On Mon, Jun 09, 2014 at 04:03:08PM +0100, Peter Maydell wrote:
 Although we defined an eepro100_mdi_mask[] array indicating which bits
 in the registers are read-only, we weren't actually doing anything with
 it. Make the MDI register-write code use it rather than manually making
 register 1 read-only and leaving the rest as reads-as-written. (The
 special-case handling of register 0 remains as before since its mask is
 all-zeros and the special casing happens before we apply the masking.)
 
 Signed-off-by: Peter Maydell peter.mayd...@linaro.org
 Message-id: 1402159924-13853-1-git-send-email-peter.mayd...@linaro.org
 ---
 No code change, but I fixed the errors in the commit message.
 
  hw/net/eepro100.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

Thanks, applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan




Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Andreas Färber
Am 26.06.2014 14:01, schrieb Alexander Graf:
 
 On 20.06.14 08:43, Peter Crosthwaite wrote:
 On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:
 Platforms without ISA and/or PCI have had a seriously hard time in
 the dynamic
 device creation world of QEMU. Devices on these were modeled as
 SysBus devices
 which can only be instantiated in machine files, not through -device.

 Why is that so?

 Well, SysBus is trying to be incredibly generic. It allows you to
 plug any
 interrupt sender into any other interrupt receiver. It allows you to map
 a device's memory regions into any other random memory region. All of
 that
 only works from C code or via really complicated command line
 arguments under
 discussion upstream right now.

 What you are doing seem to me to be an extension of SysBus - you are
 defining the same interfaces as sysbus but also adding some machine
 specifics wiring info. I think it's a candidate for QOM inheritance to
 avoid having to dup all the sysbus device models for both regular
 sysbus and platform bus. I think your functionality should be added as
 one of

 1: an interface that can be added to sysbus devices
 2: a new abstraction that inherits from SYS_BUS_DEVICE
 3: just new features to the sysbus core.

 Then both of us are using the same suite of device models and the
 differences between our approaches are limited to machine level
 instantiation method. My gut says #2 is the cleanest.
 
 The more I think about it the more I believe #3 would be the cleanest.
 The only thing my platform devices do in addition to sysbus devices is
 that it exposes qdev properties to give mapping code hints where a
 device wants to be mapped.
 
 If we just add qdev properties for all the possible hints in generic
 sysbus core code, we should be able to automatically convert all devices
 into dynamically allocatable devices. Whether they actually do get
 mapped and the generation of device tree chunks still stays in the
 machine file's court.

As discussed offline with Alex, one issue I see is that this would be
encouraging people to add more devices to an artificial global bus in
/machine/unassigned that we've been trying to obsolete, rather than
sitting down and please creating an e500 SoC object as a start. Maybe we
should start generating a list of shame for 2.1. ;)
Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would
make me much happier than using SysBus as is.

The pure QOM approach would be link properties instead of a bus, but
then the machine needs to know how many slots there shall be in
advance. Note that the docking procedure is always initiated from the
realizing device, whether bus or no bus.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Pavel Dovgaluk
 -Original Message-
 From: Frederic Konrad [mailto:fred.kon...@greensocs.com]
 Sent: Friday, June 27, 2014 11:48 AM
 To: Pavel Dovgaluk
 Cc: Peter Crosthwaite; Paolo Bonzini; qemu-devel@nongnu.org Developers; Mark 
 Burton
 Subject: Re: [Qemu-devel] Reverse execution and deterministic replay
 
 On 27/06/2014 08:11, Peter Crosthwaite wrote:
  Hi Pavel,
 
  On Fri, Jun 27, 2014 at 3:18 PM, Pavel Dovgaluk
  pavel.dovga...@ispras.ru wrote:
  Hello!
 
  We want to publish set of patches related to the reverse execution and 
  deterministic replay
 of qemu.
  Our implementation of deterministic replay can be used for deterministic 
  and reverse
 debugging of
  guest code through gdb remote interface.
 
  Execution recording writes non-deterministic events log, which can be 
  later used for
 replaying the
  execution anywhere and for unlimited number of times. It also supports 
  checkpointing for
 faster
  rewinding during reverse debugging. Execution replaying reads the log and 
  replays all
  non-deterministic events including external input, hardware clocks, and 
  interrupts.
 
  Reverse execution has the following features:
* Deterministically replays whole system execution and all contents of 
  the memory,
  state of the hardware devices, clocks, and screen of the VM.
* Writes execution log into the file for later replaying for multiple 
  times
  on different machines.
* Supports i386, x86_64, and ARM hardware platforms.
* Performs deterministic replay of all operations with keyboard, mouse, 
  network adapters,
  audio devices, serial interfaces, and physical USB devices connected 
  to the emulator.
* Provides support for gdb reverse debugging commands like reverse-step 
  and reverse-
 continue.
* Supports auto-checkpointing for convenient reverse debugging.
* Allows going to the live execution from the replay mode.
 
  Our implementation is completely tested for qemu 1.5 and is in beta state 
  for 2.0.50.
 
  Some details about our implementation of reverse execution can be found in 
  paper:
  http://www.computer.org/csdl/proceedings/csmr/2012/4666/00/4666a553-abs.html
 
  Add relevant implementation details to the git commit messages.
 
  Can anyone review our patches?
 
  Fred Konrad is doing a series on reverse exe at the moment. CC. Is this
  an independent implementation of the same thing or are you building on
  it?
 
 Hi,
 
 Yes seems we are doing the same thing only we use icount as an instruction
 counter and you created a new instruction counter?

Yes, we created a new instruction-accurate counter.

 This has the advantage of working everywhere icount works but the
 disadvantage of having to use icount for reverse execution.

The major disadvantage of icount is that it's updated only on TB boundaries.
When one instruction in the middle of the block uses virtual clock, it could
have different values for different divisions of the code to TB. E.g. you can
stop the execution using the debugger in the middle of the block. 
It will lead to creation of the new block starting from the next instruction
(which previously was in the middle of the TB). Reading virtual clock by this
instruction can give you different values.

 I think we can use both ways, so that reverse execution works on other
 architectures until an instruction counter is added to them.
 
 I'm sure your patches will add to our solution and I can review your patches
 when you'll send them.
 
 It would help if you rebase them on the patch set that is currently on
 the list:
 [RFC PATCH v5 00/13] Reverse execution. I sent two days ago.

We do not use icount at all. We record virtual time into the replay log instead.
But we implemented an icount-like feature, which computes the values of virtual
clock and TSC using our internal instruction counter.

 
 Thanks,
 Fred
 
  I suggest posting a full RFC, this looks to me just like a cover
  letter but without a series.
 
  Note that we are going into hard freeze imminently so there will be
  some delay for merge.
 
  Regards,
  Peter
 
  Pavel Dovgaluk
 
 
 


Pavel Dovgaluk




[Qemu-devel] [PATCH] ui/vnc: avoid memory corruption if width % VNC_DIRTY_PIXELS_PER_BIT != 0

2014-06-27 Thread Peter Lieven
During a resolution change in Windows 7 it sometimes happens that Windows
changes to an intermediate resolution where server_stride % cmp_bytes != 0
(in vnc_refresh_server_surface).
The problem that causes memory corruption is where the guest fb is copied to
the server fb.
It could easily be fixed by truncating cmp_bytes in
vnc_refresh_server_surface. But by looking at the code it seems that none of
the encoders called in vnc_send_framebuffer_update really cares about
w > pixman_image_get_width(vd->server). This patch will therefore remove all
DIV_ROUND_UPs for now to avoid corruption or illegal reads. I think there are
almost no real resolutions out there where width % 16 != 0. If we really find
some we might need to either decrease VNC_DIRTY_PIXELS_PER_BIT or make it
dynamic depending on the resolution.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven p...@kamp.de
---
 ui/vnc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index 14a86c3..9e37d47 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -577,7 +577,7 @@ void *vnc_server_fb_ptr(VncDisplay *vd, int x, int y)
 memset(bitmap, 0x00, sizeof(bitmap));\
 for (y = 0; y < h; y++) {\
 bitmap_set(bitmap[y], 0,\
-   DIV_ROUND_UP(w, VNC_DIRTY_PIXELS_PER_BIT));\
+   w / VNC_DIRTY_PIXELS_PER_BIT);\
 } \
 }
 
@@ -2738,7 +2738,7 @@ static int vnc_refresh_server_surface(VncDisplay *vd)
 }
 guest_ptr += x * cmp_bytes;
 
-for (; x < DIV_ROUND_UP(width, VNC_DIRTY_PIXELS_PER_BIT);
+for (; x < width / VNC_DIRTY_PIXELS_PER_BIT;
  x++, guest_ptr += cmp_bytes, server_ptr += cmp_bytes) {
 if (!test_and_clear_bit(x, vd->guest.dirty[y])) {
 continue;
-- 
1.7.9.5
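
The arithmetic behind the change can be checked in isolation. The sketch
below assumes VNC_DIRTY_PIXELS_PER_BIT is 16, as implied by the
"width % 16 != 0" remark in the commit message, and uses the usual
DIV_ROUND_UP definition. The two helper functions are hypothetical names
written for this example; they only compute how many pixels per row the dirty
bits span.

```c
#include <assert.h>

/* Assumed value, taken from the "width % 16" remark in the commit message. */
#define VNC_DIRTY_PIXELS_PER_BIT 16
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Pixels spanned per row when the dirty-bit count is rounded up (old code). */
int pixels_covered_round_up(int width)
{
    assert(width >= 0);
    return DIV_ROUND_UP(width, VNC_DIRTY_PIXELS_PER_BIT) *
           VNC_DIRTY_PIXELS_PER_BIT;
}

/* Pixels spanned per row when the dirty-bit count is truncated (new code). */
int pixels_covered_truncated(int width)
{
    assert(width >= 0);
    return (width / VNC_DIRTY_PIXELS_PER_BIT) * VNC_DIRTY_PIXELS_PER_BIT;
}
```

For a width of 1366 (an illustrative value, not one taken from the report),
rounding up spans 1376 pixels, 10 past the end of the row, while truncating
spans 1360 and leaves the last 6 pixels outside dirty tracking. For widths
that are a multiple of 16, such as 1280, both variants span exactly the row.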




Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Peter Maydell
On 27 June 2014 11:35, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
 The major disadvantage of icount is that it's updated only on TB boundaries.
 When one instruction in the middle of the block uses virtual clock, it could
 have different values for different divisions of the code to TB.

This is only true if the instruction is incorrectly not
marked as being I/O. The idea behind icount is that in
general we update it on TB boundaries (it's much faster
than doing it once per insn) but for those places which
do turn out to need an exact icount we then retranslate
the block to get the instruction-to-icount-adjustment
mapping.

It wouldn't surprise me if this turned out to have some
bugs in corner cases, but fixing these issues seems to
me like a much better design than ignoring icount completely
and reimplementing a second instruction counter.

thanks
-- PMM
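
The adjustment described above can be illustrated with a toy model. This is
not QEMU's icount code: ToyBlock, toy_enter_block and toy_exact_count are
hypothetical names, and the model assumes the coarse counter is charged for a
whole block on entry and that the position of an instruction inside its block
is known when an exact value is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model for this sketch; names are hypothetical, not QEMU's. */
typedef struct ToyBlock {
    unsigned len;           /* number of guest instructions in the block */
} ToyBlock;

/* Coarse counter: charge the whole block when it is entered. */
uint64_t toy_enter_block(uint64_t executed_before, const ToyBlock *tb)
{
    return executed_before + tb->len;
}

/* Exact number of instructions completed before instruction 'index'
 * (0-based) of the block: take back the charge for the instructions
 * of this block that have not run yet. */
uint64_t toy_exact_count(uint64_t coarse, const ToyBlock *tb, unsigned index)
{
    assert(index < tb->len);
    return coarse - (tb->len - index);
}
```

In this model the exact value does not depend on where the block boundaries
fall: an instruction at index 2 of a 5-instruction block entered at count 100
sees 102, and it still sees 102 if the same code is split into a
2-instruction block followed by a 3-instruction block.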



Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Peter Crosthwaite
On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote:
 Am 26.06.2014 14:01, schrieb Alexander Graf:

 On 20.06.14 08:43, Peter Crosthwaite wrote:
 On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:
 Platforms without ISA and/or PCI have had a seriously hard time in
 the dynamic
 device creation world of QEMU. Devices on these were modeled as
 SysBus devices
 which can only be instantiated in machine files, not through -device.

 Why is that so?

 Well, SysBus is trying to be incredibly generic. It allows you to
 plug any
 interrupt sender into any other interrupt receiver. It allows you to map
 a device's memory regions into any other random memory region. All of
 that
 only works from C code or via really complicated command line
 arguments under
 discussion upstream right now.

 What you are doing seem to me to be an extension of SysBus - you are
 defining the same interfaces as sysbus but also adding some machine
 specifics wiring info. I think it's a candidate for QOM inheritance to
 avoid having to dup all the sysbus device models for both regular
 sysbus and platform bus. I think your functionality should be added as
 one of

 1: an interface that can be added to sysbus devices
 2: a new abstraction that inherits from SYS_BUS_DEVICE
 3: just new features to the sysbus core.

 Then both of us are using the same suite of device models and the
 differences between our approaches are limited to machine level
 instantiation method. My gut says #2 is the cleanest.

 The more I think about it the more I believe #3 would be the cleanest.
 The only thing my platform devices do in addition to sysbus devices is
 that it exposes qdev properties to give mapping code hints where a
 device wants to be mapped.

 If we just add qdev properties for all the possible hints in generic
 sysbus core code, we should be able to automatically convert all devices
 into dynamically allocatable devices. Whether they actually do get
 mapped and the generation of device tree chunks still stays in the
 machine file's court.

 As discussed offline with Alex, one issue I see is that this would be
 encouraging people to add more devices to an artificial global bus in
 /machine/unassigned that we've been trying to obsolete, rather than
 sitting down and please creating an e500 SoC object as a start. Maybe we
 should start generating a list of shame for 2.1. ;)
 Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would
 make me much happier than using SysBus as is.


Do you mean address_space_memory (as used by sysbus_mmio_map)? We all
hate that global singleton, but can we decouple it from sysbus, which
is not the root cause of that problem? sysbus_mmio_map usages just
need to be replaced with sysbus_mmio_get_region and you can create
whatever hierarchy you want using unchanged sysbus devices.

Even if we phase out the global singleton and the SysBus bus, the
sysbus device abstraction is still sound and should be usable
busless. Then there's no need for a tree-wide change to implement Alex's
feature for all devs (assuming his plugger can be made to work
hintless?).

Regards,
Peter

 The pure QOM approach would be link properties instead of a bus, but
 then the machine needs to know how many slots there shall be in
 advance. Note that the docking procedure is always initiated from the
 realizing device, whether bus or no bus.

 Regards,
 Andreas

 --
 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg




Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Pavel Dovgaluk
 On 27 June 2014 11:35, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
  The major disadvantage of icount is that it's updated only on TB boundaries.
  When one instruction in the middle of the block uses virtual clock, it could
  have different values for different divisions of the code to TB.
 
 This is only true if the instruction is incorrectly not
 marked as being I/O. The idea behind icount is that in
 general we update it on TB boundaries (it's much faster
 than doing it once per insn) but for those places which
 do turn out to need an exact icount we then retranslate
 the block to get the instruction-to-icount-adjustment
 mapping.

I see. But if we want virtual clock in real mode then we still
should create new timer (based on icount code).

 It wouldn't surprise me if this turned out to have some
 bugs in corner cases, but fixing these issues seems to
 me like a much better design than ignoring icount completely
 and reimplementing a second instruction counter.

When we started the implementation, we didn't have enough resources
to fix all such bugs. That is why we selected such a conservative
approach. But I believe that in the future we will adopt icount
for replay purposes.

Pavel Dovgaluk




Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Andreas Färber
Am 27.06.2014 12:54, schrieb Peter Crosthwaite:
 On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote:
 Am 26.06.2014 14:01, schrieb Alexander Graf:
 On 20.06.14 08:43, Peter Crosthwaite wrote:
 On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:
 Platforms without ISA and/or PCI have had a seriously hard time in
 the dynamic
 device creation world of QEMU. Devices on these were modeled as
 SysBus devices
 which can only be instantiated in machine files, not through -device.

 Why is that so?

 Well, SysBus is trying to be incredibly generic. It allows you to
 plug any
 interrupt sender into any other interrupt receiver. It allows you to map
 a device's memory regions into any other random memory region. All of
 that
 only works from C code or via really complicated command line
 arguments under
 discussion upstream right now.

 What you are doing seem to me to be an extension of SysBus - you are
 defining the same interfaces as sysbus but also adding some machine
 specifics wiring info. I think it's a candidate for QOM inheritance to
 avoid having to dup all the sysbus device models for both regular
 sysbus and platform bus. I think your functionality should be added as
 one of

 1: an interface that can be added to sysbus devices
 2: a new abstraction that inherits from SYS_BUS_DEVICE
 3: just new features to the sysbus core.

 Then both of us are using the same suite of device models and the
 differences between our approaches are limited to machine level
 instantiation method. My gut says #2 is the cleanest.

 The more I think about it the more I believe #3 would be the cleanest.
 The only thing my platform devices do in addition to sysbus devices is
 that it exposes qdev properties to give mapping code hints where a
 device wants to be mapped.

 If we just add qdev properties for all the possible hints in generic
 sysbus core code, we should be able to automatically convert all devices
 into dynamically allocatable devices. Whether they actually do get
 mapped and the generation of device tree chunks still stays in the
 machine file's court.

 As discussed offline with Alex, one issue I see is that this would be
 encouraging people to add more devices to an artificial global bus in
 /machine/unassigned that we've been trying to obsolete, rather than
 sitting down and please creating an e500 SoC object as a start. Maybe we
 should start generating a list of shame for 2.1. ;)
 Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would
 make me much happier than using SysBus as is.

 
 Do you mean address_space_memory (as used by sysbus_mmio_map)?

No, I mean the QOM composition model. When we think of using -device,
then they will go to /machine/peripheral/id or
/machine/peripheral-anon/device[n]; in your case that means that you get
a flat list of devices rather than a structure matching your device
tree. And like I said above, in both your and Alex' case SysBus is
something that has no real place in the composition tree unless we go
from that single unholy qdev-required bus to buses as they exist in the
hardware, like Anthony suggested long time ago. Alex' problem with that
is that he doesn't want to implement the same UART logic for 50
different-but-same buses, so some form of reuse or inheritance would be
needed.

Disclaimer: I have not yet reviewed this series, I was commenting on
abstract ideas that Alex requested feedback for.

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg



Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Alexander Graf


On 27.06.14 13:17, Andreas Färber wrote:

Am 27.06.2014 12:54, schrieb Peter Crosthwaite:

On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote:

Am 26.06.2014 14:01, schrieb Alexander Graf:

On 20.06.14 08:43, Peter Crosthwaite wrote:

On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:

Platforms without ISA and/or PCI have had a seriously hard time in
the dynamic
device creation world of QEMU. Devices on these were modeled as
SysBus devices
which can only be instantiated in machine files, not through -device.

Why is that so?

Well, SysBus is trying to be incredibly generic. It allows you to
plug any
interrupt sender into any other interrupt receiver. It allows you to map
a device's memory regions into any other random memory region. All of
that
only works from C code or via really complicated command line
arguments under
discussion upstream right now.


What you are doing seem to me to be an extension of SysBus - you are
defining the same interfaces as sysbus but also adding some machine
specifics wiring info. I think it's a candidate for QOM inheritance to
avoid having to dup all the sysbus device models for both regular
sysbus and platform bus. I think your functionality should be added as
one of

1: an interface that can be added to sysbus devices
2: a new abstraction that inherits from SYS_BUS_DEVICE
3: just new features to the sysbus core.

Then both of us are using the same suite of device models and the
differences between our approaches are limited to machine level
instantiation method. My gut says #2 is the cleanest.

The more I think about it the more I believe #3 would be the cleanest.
The only thing my platform devices do in addition to sysbus devices is
that it exposes qdev properties to give mapping code hints where a
device wants to be mapped.

If we just add qdev properties for all the possible hints in generic
sysbus core code, we should be able to automatically convert all devices
into dynamically allocatable devices. Whether they actually do get
mapped and the generation of device tree chunks still stays in the
machine file's court.

As discussed offline with Alex, one issue I see is that this would be
encouraging people to add more devices to an artificial global bus in
/machine/unassigned that we've been trying to obsolete, rather than
sitting down and please creating an e500 SoC object as a start. Maybe we
should start generating a list of shame for 2.1. ;)
Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would
make me much happier than using SysBus as is.


Do you mean address_space_memory (as used by sysbus_mmio_map)?

No, I mean the QOM composition model. When we think of using -device,
then they will go to /machine/peripheral/id or
/machine/peripheral-anon/device[n]; in your case that means that you get
a flat list of devices rather than a structure matching your device
tree. And like I said above, in both your and Alex' case SysBus is
something that has no real place in the composition tree unless we go
from that single unholy qdev-required bus to buses as they exist in the
hardware, like Anthony suggested a long time ago. Alex' problem with that
is that he doesn't want to implement the same UART logic for 50
different-but-same buses, so some form of reuse or inheritance would be
needed.

Disclaimer: I have not yet reviewed this series, I was commenting on
abstract ideas that Alex requested feedback for.


I think we can all agree that the sysbus bus is not a bus per se. So 
conceptually, what's the difference between a device attached to a 
non-bus and a device not attached to a bus at all? And why can't we 
convert sysbus to not be a bus anymore?



Alex




[Qemu-devel] [PULL 02/10] pc-bios/s390-ccw: cleanup and enhance bootmap defintions

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Add declarations to describe structure of different dasd IPL sources
(eckd and fba). Move the structure definitions to a new header bootmap.h.
While we are at it, change structs to typedefs.
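
As a rough illustration of the struct-to-typedef change (a sketch, not
the contents of bootmap.h): the field layout below is taken from the
struct scsi_blockptr removed in this patch, the typedef name ScsiBlockPtr
is the one used in the new code, and the exact form of the typedef as
well as the entries_per_sector() helper are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old style: a tagged struct, always spelled "struct scsi_blockptr". */
struct scsi_blockptr {
    uint64_t blockno;
    uint16_t size;
    uint16_t blockct;
    uint8_t reserved[4];
} __attribute__ ((packed));

/* New style: same on-disk layout, named through a typedef. */
typedef struct ScsiBlockPtr {
    uint64_t blockno;
    uint16_t size;
    uint16_t blockct;
    uint8_t reserved[4];
} __attribute__ ((packed)) ScsiBlockPtr;

/* The on-disk format depends on this being exactly 16 bytes. */
_Static_assert(sizeof(ScsiBlockPtr) == 16, "unexpected ScsiBlockPtr size");

/* Hypothetical helper: how many block pointers fit into one sector. */
static size_t entries_per_sector(size_t sector_size)
{
    assert(sector_size % sizeof(ScsiBlockPtr) == 0);
    return sector_size / sizeof(ScsiBlockPtr);
}
```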

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c |   66 +++-
 pc-bios/s390-ccw/bootmap.h |  254 
 2 files changed, 269 insertions(+), 51 deletions(-)
 create mode 100644 pc-bios/s390-ccw/bootmap.h

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 753c288..c216030 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -9,6 +9,7 @@
  */
 
 #include "s390-ccw.h"
+#include "bootmap.h"
 
 /* #define DEBUG_FALLBACK */
 
@@ -20,41 +21,6 @@
 do { } while (0)
 #endif
 
-struct scsi_blockptr {
-uint64_t blockno;
-uint16_t size;
-uint16_t blockct;
-uint8_t reserved[4];
-} __attribute__ ((packed));
-
-struct component_entry {
-struct scsi_blockptr data;
-uint8_t pad[7];
-uint8_t component_type;
-uint64_t load_address;
-} __attribute((packed));
-
-struct component_header {
-uint8_t magic[4];
-uint8_t type;
-uint8_t reserved[27];
-} __attribute((packed));
-
-struct mbr {
-uint8_t magic[4];
-uint32_t version_id;
-uint8_t reserved[8];
-struct scsi_blockptr blockptr;
-} __attribute__ ((packed));
-
-#define ZIPL_MAGIC  zIPL
-
-#define ZIPL_COMP_HEADER_IPL0x00
-#define ZIPL_COMP_HEADER_DUMP   0x01
-
-#define ZIPL_COMP_ENTRY_LOAD0x02
-#define ZIPL_COMP_ENTRY_EXEC0x01
-
 /* Scratch space */
 static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE)));
 
@@ -118,8 +84,6 @@ static int zipl_magic(uint8_t *ptr)
 return 1;
 }
 
-#define FREE_SPACE_FILLER '\xAA'
-
 static inline bool unused_space(const void *p, unsigned int size)
 {
 int i;
@@ -133,10 +97,10 @@ static inline bool unused_space(const void *p, unsigned 
int size)
 return true;
 }
 
-static int zipl_load_segment(struct component_entry *entry)
+static int zipl_load_segment(ComponentEntry *entry)
 {
-const int max_entries = (SECTOR_SIZE / sizeof(struct scsi_blockptr));
-struct scsi_blockptr *bprs = (void *)sec;
+const int max_entries = (SECTOR_SIZE / sizeof(ScsiBlockPtr));
+ScsiBlockPtr *bprs = (void *)sec;
 const int bprs_size = sizeof(sec);
 uint64_t blockno;
 long address;
@@ -170,7 +134,7 @@ static int zipl_load_segment(struct component_entry *entry)
 }
 
 if (bprs[i].blockct == 0  unused_space(bprs[i + 1],
-sizeof(struct scsi_blockptr))) {
+sizeof(ScsiBlockPtr))) {
 /* This is a continue pointer.
  * This ptr is the last one in the current script section.
  * I.e. the next ptr must point to the unused memory area.
@@ -195,14 +159,14 @@ fail:
 }
 
 /* Run a zipl program */
-static int zipl_run(struct scsi_blockptr *pte)
+static int zipl_run(ScsiBlockPtr *pte)
 {
-struct component_header *header;
-struct component_entry *entry;
+ComponentHeader *header;
+ComponentEntry *entry;
 uint8_t tmp_sec[SECTOR_SIZE];
 
 virtio_read(pte-blockno, tmp_sec);
-header = (struct component_header *)tmp_sec;
+header = (ComponentHeader *)tmp_sec;
 
 if (!zipl_magic(tmp_sec)) {
 goto fail;
@@ -215,7 +179,7 @@ static int zipl_run(struct scsi_blockptr *pte)
 dputs(start loading images\n);
 
 /* Load image(s) into RAM */
-entry = (struct component_entry *)(header[1]);
+entry = (ComponentEntry *)(header[1]);
 while (entry-component_type == ZIPL_COMP_ENTRY_LOAD) {
 if (zipl_load_segment(entry)  0) {
 goto fail;
@@ -244,11 +208,11 @@ fail:
 
 int zipl_load(void)
 {
-struct mbr *mbr = (void *)sec;
+ScsiMbr *mbr = (void *)sec;
 uint8_t *ns, *ns_end;
 int program_table_entries = 0;
-int pte_len = sizeof(struct scsi_blockptr);
-struct scsi_blockptr *prog_table_entry;
+const int pte_len = sizeof(ScsiBlockPtr);
+ScsiBlockPtr *prog_table_entry;
 const char *error = ;
 
 /* Grab the MBR */
@@ -276,7 +240,7 @@ int zipl_load(void)
 
 ns_end = sec + SECTOR_SIZE;
 for (ns = (sec + pte_len); (ns + pte_len)  ns_end; ns++) {
-prog_table_entry = (struct scsi_blockptr *)ns;
+prog_table_entry = (ScsiBlockPtr *)ns;
 if (!prog_table_entry-blockno) {
 break;
 }
@@ -292,7 +256,7 @@ int zipl_load(void)
 
 /* Run the default entry */
 
-prog_table_entry = (struct scsi_blockptr *)(sec + pte_len);
+prog_table_entry = (ScsiBlockPtr *)(sec + pte_len);
 
 return zipl_run(prog_table_entry);
 
diff --git a/pc-bios/s390-ccw/bootmap.h 

[Qemu-devel] [PULL 03/10] pc-bios/s390-ccw: handle different sector sizes

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Use the virtio device's configuration to figure out the disk geometry
and use a sector size based upon the layout.
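
The pattern, roughly: buffers are sized for the largest supported sector
(MAX_SECTOR_SIZE), while loops are bounded by the block size reported at
run time. The sketch below is a simplified stand-alone illustration, not
the bios code; get_block_size(), current_block_size and count_entries()
are hypothetical stand-ins, and the entry walk steps a whole entry at a
time for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SECTOR_SIZE 4096

/* Scratch buffer sized for the largest sector we are prepared to see. */
static uint8_t sec[MAX_SECTOR_SIZE];

/* Hypothetical stand-in for the value read from the device config. */
static size_t current_block_size = 512;

static size_t get_block_size(void)
{
    return current_block_size;
}

/*
 * Count table entries in sec[], skipping entry 0 (the header) and
 * stopping at the first entry whose leading 64-bit block number is 0.
 * Only the first get_block_size() bytes of the buffer are valid.
 */
static size_t count_entries(size_t entry_len)
{
    const size_t block_size = get_block_size();
    size_t off, n = 0;

    assert(block_size <= MAX_SECTOR_SIZE);
    assert(entry_len >= sizeof(uint64_t));

    for (off = entry_len; off + entry_len <= block_size; off += entry_len) {
        uint64_t blockno;

        memcpy(&blockno, sec + off, sizeof(blockno));
        if (blockno == 0) {
            break;
        }
        n++;
    }
    return n;
}
```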

[CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g]
Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c  |   12 +++---
 pc-bios/s390-ccw/s390-ccw.h |2 +-
 pc-bios/s390-ccw/virtio.c   |   96 ---
 pc-bios/s390-ccw/virtio.h   |   48 ++
 4 files changed, 147 insertions(+), 11 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index c216030..fa2ca26 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -10,6 +10,7 @@
 
 #include s390-ccw.h
 #include bootmap.h
+#include virtio.h
 
 /* #define DEBUG_FALLBACK */
 
@@ -22,7 +23,8 @@
 #endif
 
 /* Scratch space */
-static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE)));
+static uint8_t sec[MAX_SECTOR_SIZE]
+__attribute__((__aligned__(MAX_SECTOR_SIZE)));
 
 typedef struct ResetInfo {
 uint32_t ipl_mask;
@@ -99,7 +101,7 @@ static inline bool unused_space(const void *p, unsigned int 
size)
 
 static int zipl_load_segment(ComponentEntry *entry)
 {
-const int max_entries = (SECTOR_SIZE / sizeof(ScsiBlockPtr));
+const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr));
 ScsiBlockPtr *bprs = (void *)sec;
 const int bprs_size = sizeof(sec);
 uint64_t blockno;
@@ -163,7 +165,7 @@ static int zipl_run(ScsiBlockPtr *pte)
 {
 ComponentHeader *header;
 ComponentEntry *entry;
-uint8_t tmp_sec[SECTOR_SIZE];
+uint8_t tmp_sec[MAX_SECTOR_SIZE];
 
 virtio_read(pte-blockno, tmp_sec);
 header = (ComponentHeader *)tmp_sec;
@@ -187,7 +189,7 @@ static int zipl_run(ScsiBlockPtr *pte)
 
 entry++;
 
-if ((uint8_t *)(entry[1])  (tmp_sec + SECTOR_SIZE)) {
+if ((uint8_t *)(entry[1])  (tmp_sec + MAX_SECTOR_SIZE)) {
 goto fail;
 }
 }
@@ -238,7 +240,7 @@ int zipl_load(void)
 goto fail;
 }
 
-ns_end = sec + SECTOR_SIZE;
+ns_end = sec + virtio_get_block_size();
 for (ns = (sec + pte_len); (ns + pte_len)  ns_end; ns++) {
 prog_table_entry = (ScsiBlockPtr *)ns;
 if (!prog_table_entry-blockno) {
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index fe1dd22..b6c0a5b 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -130,6 +130,6 @@ static inline void yield(void)
   : memory, cc);
 }
 
-#define SECTOR_SIZE 512
+#define MAX_SECTOR_SIZE 4096
 
 #endif /* S390_CCW_H */
diff --git a/pc-bios/s390-ccw/virtio.c b/pc-bios/s390-ccw/virtio.c
index c845b14..31b23b0 100644
--- a/pc-bios/s390-ccw/virtio.c
+++ b/pc-bios/s390-ccw/virtio.c
@@ -202,7 +202,7 @@ static int vring_wait_reply(struct vring *vr, int timeout)
  *   Virtio block  *
  ***/
 
-static int virtio_read_many(ulong sector, void *load_addr, int sec_num)
+int virtio_read_many(ulong sector, void *load_addr, int sec_num)
 {
 struct virtio_blk_outhdr out_hdr;
 u8 status;
@@ -211,12 +211,12 @@ static int virtio_read_many(ulong sector, void 
*load_addr, int sec_num)
 /* Tell the host we want to read */
 out_hdr.type = VIRTIO_BLK_T_IN;
 out_hdr.ioprio = 99;
-out_hdr.sector = sector;
+out_hdr.sector = virtio_sector_adjust(sector);
 
 vring_send_buf(block, out_hdr, sizeof(out_hdr), VRING_DESC_F_NEXT);
 
 /* This is where we want to receive data */
-vring_send_buf(block, load_addr, SECTOR_SIZE * sec_num,
+vring_send_buf(block, load_addr, virtio_get_block_size() * sec_num,
VRING_DESC_F_WRITE | VRING_HIDDEN_IS_CHAIN |
VRING_DESC_F_NEXT);
 
@@ -244,7 +244,7 @@ unsigned long virtio_load_direct(ulong rec_list1, ulong 
rec_list2,
 int sec_len = rec_list2  48;
 ulong addr = (ulong)load_addr;
 
-if (sec_len != SECTOR_SIZE) {
+if (sec_len != virtio_get_block_size()) {
 return -1;
 }
 
@@ -253,7 +253,7 @@ unsigned long virtio_load_direct(ulong rec_list1, ulong 
rec_list2,
 if (status) {
 virtio_panic(I/O Error);
 }
-addr += sec_num * SECTOR_SIZE;
+addr += sec_num * virtio_get_block_size();
 
 return addr;
 }
@@ -263,15 +263,95 @@ int virtio_read(ulong sector, void *load_addr)
 return virtio_read_many(sector, load_addr, 1);
 }
 
+static VirtioBlkConfig blk_cfg = {};
+static bool guessed_disk_nature;
+
+bool virtio_guessed_disk_nature(void)
+{
+return guessed_disk_nature;
+}
+
+void virtio_assume_scsi(void)
+{
+guessed_disk_nature = true;
+blk_cfg.blk_size = 512;
+}
+
+void virtio_assume_eckd(void)
+{
+guessed_disk_nature = true;
+

[Qemu-devel] [PULL 00/10] for-2.1: s390-ccw bios patches

2014-06-27 Thread Cornelia Huck
Here are some s390-ccw bios patches I'd like to see in 2.1. Being able
to finally boot from dasd is quite a useful feature. Please consider pulling.

The following changes since commit ff4873cb8c81db89668d8b56e19e57b852edb5f5:

  coroutine-win32.c: Add noinline attribute to work around gcc bug (2014-06-26 
14:08:14 +0100)

are available in the git repository at:

  git://github.com/cohuck/qemu.git tags/s390x-20140627

for you to fetch changes up to 77416f4075a673a27cfe5a7a34e93c0fa9810e35:

  pc-bios/s390-ccw: update binary (2014-06-27 12:11:53 +0200)


A series of patches to the s390-ccw bios:
- code cleanup
- improved error reporting
- most important, support to ipl (boot) from ECKD DASD (CDL, LDL or CMS
  formatted)



Eugene (jno) Dvurechenski (9):
  pc-bios/s390-ccw: make checkpatch happy
  pc-bios/s390-ccw: cleanup and enhance bootmap defintions
  pc-bios/s390-ccw: handle different sector sizes
  pc-bios/s390-ccw: add some utility code
  pc-bios/s390-ccw: Unify error handling
  pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs
  pc-bios/s390-ccw: factor out ipl code
  pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD
  pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD

Jens Freimann (1):
  pc-bios/s390-ccw: update binary

 pc-bios/s390-ccw.img  |  Bin 9432 - 17624 bytes
 pc-bios/s390-ccw/bootmap.c|  445 -
 pc-bios/s390-ccw/bootmap.h|  344 +++
 pc-bios/s390-ccw/main.c   |   13 +-
 pc-bios/s390-ccw/s390-ccw.h   |   38 ++--
 pc-bios/s390-ccw/sclp-ascii.c |4 +-
 pc-bios/s390-ccw/virtio.c |  122 +--
 pc-bios/s390-ccw/virtio.h |   50 -
 8 files changed, 837 insertions(+), 179 deletions(-)
 create mode 100644 pc-bios/s390-ccw/bootmap.h

-- 
1.7.9.5




[Qemu-devel] [PULL 04/10] pc-bios/s390-ccw: add some utility code

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

IPL_assert(term,message) is introduced to handle error conditions.
ebcdic_to_ascii() to convert chars (mostly to print VOLSERs).
read_block() provision for unified block-number handling.
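
For readers unfamiliar with EBCDIC, a minimal stand-alone sketch of the
VOLSER conversion follows. It is not the table-driven code from this
patch: ebcdic_char_to_ascii() and volser_to_ascii() are hypothetical
names, and only space, A-Z and 0-9 are translated (everything else
becomes '.'). Note that the index is taken as unsigned char, so it cannot
go negative on platforms where plain char is signed.

```c
#include <assert.h>
#include <stddef.h>

/* Translate one EBCDIC byte; unknown code points map to '.'. */
static char ebcdic_char_to_ascii(unsigned char c)
{
    if (c == 0x40) {
        return ' ';
    }
    if (c >= 0xC1 && c <= 0xC9) {           /* A..I */
        return (char)('A' + (c - 0xC1));
    }
    if (c >= 0xD1 && c <= 0xD9) {           /* J..R */
        return (char)('J' + (c - 0xD1));
    }
    if (c >= 0xE2 && c <= 0xE9) {           /* S..Z */
        return (char)('S' + (c - 0xE2));
    }
    if (c >= 0xF0 && c <= 0xF9) {           /* 0..9 */
        return (char)('0' + (c - 0xF0));
    }
    return '.';
}

/* Convert a 6-byte EBCDIC volume serial into a NUL-terminated string. */
static void volser_to_ascii(const unsigned char *volser,
                            char *out, size_t out_len)
{
    size_t i;

    assert(out_len >= 7);
    for (i = 0; i < 6; i++) {
        out[i] = ebcdic_char_to_ascii(volser[i]);
    }
    out[6] = '\0';
}
```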

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Reviewed-by: David Hildenbrand d...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c |   15 +---
 pc-bios/s390-ccw/bootmap.h |   83 
 2 files changed, 84 insertions(+), 14 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index fa2ca26..bb8dd69 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -86,25 +86,12 @@ static int zipl_magic(uint8_t *ptr)
 return 1;
 }
 
-static inline bool unused_space(const void *p, unsigned int size)
-{
-int i;
-const unsigned char *m = p;
-
-for (i = 0; i  size; i++) {
-if (m[i] != FREE_SPACE_FILLER) {
-return false;
-}
-}
-return true;
-}
-
 static int zipl_load_segment(ComponentEntry *entry)
 {
 const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr));
 ScsiBlockPtr *bprs = (void *)sec;
 const int bprs_size = sizeof(sec);
-uint64_t blockno;
+block_number_t blockno;
 long address;
 int i;
 
diff --git a/pc-bios/s390-ccw/bootmap.h b/pc-bios/s390-ccw/bootmap.h
index 59267b0..1846632 100644
--- a/pc-bios/s390-ccw/bootmap.h
+++ b/pc-bios/s390-ccw/bootmap.h
@@ -12,6 +12,10 @@
 #define _PC_BIOS_S390_CCW_BOOTMAP_H
 
 #include "s390-ccw.h"
+#include "virtio.h"
+
+typedef uint64_t block_number_t;
+#define NULL_BLOCK_NR 0x
 
 #define FREE_SPACE_FILLER '\xAA'
 
@@ -251,4 +255,83 @@ typedef struct IplVolumeLabel {
 };
 } __attribute__((packed)) IplVolumeLabel;
 
+/* utility code below */
+
+static inline void IPL_assert(bool term, const char *message)
+{
+    if (!term) {
+        sclp_print("\n! ");
+        sclp_print(message);
+        virtio_panic(" !\n"); /* no return */
+    }
+}
+
+static const unsigned char ebc2asc[256] =
+  /* 0123456789abcdef0123456789abcdef */
+ /* 1F */
+ /* 3F */
+ ...(+|.!$*);. /* 5F first.chr.here.is.real.space 
*/
+-/.,%_?.`:#@'=\/* 7F */
+.abcdefghi...jklmnopqr.. /* 9F */
+..stuvwxyz.. /* BF */
+.ABCDEFGHI...JKLMNOPQR.. /* DF */
+..STUVWXYZ..0123456789..;/* FF */
+
+static inline void ebcdic_to_ascii(const char *src,
+                                   char *dst,
+                                   unsigned int size)
+{
+    unsigned int i;
+    for (i = 0; i < size; i++) {
+        unsigned c = src[i];
+        dst[i] = ebc2asc[c];
+    }
+}
+
+static inline void print_volser(const void *volser)
+{
+    char ascii[8];
+
+    ebcdic_to_ascii((char *)volser, ascii, 6);
+    ascii[6] = '\0';
+    sclp_print("VOLSER=[");
+    sclp_print(ascii);
+    sclp_print("]\n");
+}
+
+static inline bool unused_space(const void *p, size_t size)
+{
+    size_t i;
+    const unsigned char *m = p;
+
+    for (i = 0; i < size; i++) {
+        if (m[i] != FREE_SPACE_FILLER) {
+            return false;
+        }
+    }
+    return true;
+}
+
+static inline bool is_null_block_number(block_number_t x)
+{
+    return x == NULL_BLOCK_NR;
+}
+
+static inline void read_block(block_number_t blockno,
+                              void *buffer,
+                              const char *errmsg)
+{
+    IPL_assert(virtio_read(blockno, buffer) == 0, errmsg);
+}
+
+static inline bool block_size_ok(uint32_t block_size)
+{
+    return block_size == virtio_get_block_size();
+}
+
+static inline bool magic_match(const void *data, const void *magic)
+{
+    return *((uint32_t *)data) == *((uint32_t *)magic);
+}
+
 #endif /* _PC_BIOS_S390_CCW_BOOTMAP_H */
-- 
1.7.9.5




[Qemu-devel] [PULL 05/10] pc-bios/s390-ccw: Unify error handling

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Convert to IPL_assert and friends
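
The shape of the conversion, as a stand-alone sketch rather than the bios
code: each "if (error) goto fail;" becomes a single assertion that panics
with a message. Everything named below (ipl_assert, panic_handler,
fake_read, read_block_or_die, recording_panic) is a hypothetical
stand-in; in the bios the panic path does not return, whereas here it is
a replaceable hook so the behaviour can be exercised on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*panic_fn)(const char *msg);

/* Default behaviour: print the message and stop, like a firmware panic. */
static void default_panic(const char *msg)
{
    fprintf(stderr, "\n! %s !\n", msg);
    exit(1);
}

/* Test double: remember the message instead of terminating. */
static const char *last_panic;

static void recording_panic(const char *msg)
{
    last_panic = msg;
}

static panic_fn panic_handler = default_panic;

/* One call replaces an if/goto-fail pair at every call site. */
static void ipl_assert(bool cond, const char *msg)
{
    assert(msg != NULL);
    if (!cond) {
        panic_handler(msg);
    }
}

/* Hypothetical block read: blocks 0..7 exist, anything else fails. */
static int fake_read(unsigned long blockno, void *buf)
{
    (void)buf;
    return blockno < 8 ? 0 : -1;
}

/* Callers no longer need to check or propagate a return code. */
static void read_block_or_die(unsigned long blockno, void *buf,
                              const char *errmsg)
{
    ipl_assert(fake_read(blockno, buf) == 0, errmsg);
}
```

The trade-off is that callers can no longer recover from an error, which
is acceptable here because a failed IPL has nothing to fall back to.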

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c  |   82 +++
 pc-bios/s390-ccw/main.c |   13 ---
 pc-bios/s390-ccw/s390-ccw.h |2 +-
 3 files changed, 31 insertions(+), 66 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index bb8dd69..1866a20 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -86,7 +86,7 @@ static int zipl_magic(uint8_t *ptr)
 return 1;
 }
 
-static int zipl_load_segment(ComponentEntry *entry)
+static void zipl_load_segment(ComponentEntry *entry)
 {
 const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr));
 ScsiBlockPtr *bprs = (void *)sec;
@@ -103,10 +103,8 @@ static int zipl_load_segment(ComponentEntry *entry)
 
 do {
 memset(bprs, FREE_SPACE_FILLER, bprs_size);
-if (virtio_read(blockno, (uint8_t *)bprs)) {
-debug_print_int(failed reading bprs at, blockno);
-goto fail;
-}
+debug_print_int(reading bprs at, blockno);
+read_block(blockno, bprs, zipl_load_segment: cannot read block);
 
 for (i = 0;; i++) {
 u64 *cur_desc = (void *)bprs[i];
@@ -134,21 +132,13 @@ static int zipl_load_segment(ComponentEntry *entry)
 }
 address = virtio_load_direct(cur_desc[0], cur_desc[1], 0,
  (void *)address);
-if (address == -1) {
-goto fail;
-}
+IPL_assert(address != -1, zipl_load_segment: wrong IPL address);
 }
 } while (blockno);
-
-return 0;
-
-fail:
-sclp_print(failed loading segment\n);
-return -1;
 }
 
 /* Run a zipl program */
-static int zipl_run(ScsiBlockPtr *pte)
+static void zipl_run(ScsiBlockPtr *pte)
 {
 ComponentHeader *header;
 ComponentEntry *entry;
@@ -157,75 +147,53 @@ static int zipl_run(ScsiBlockPtr *pte)
 virtio_read(pte-blockno, tmp_sec);
 header = (ComponentHeader *)tmp_sec;
 
-if (!zipl_magic(tmp_sec)) {
-goto fail;
-}
+IPL_assert(zipl_magic(tmp_sec), zipl_run: zipl_magic);
 
-if (header-type != ZIPL_COMP_HEADER_IPL) {
-goto fail;
-}
+IPL_assert(header-type == ZIPL_COMP_HEADER_IPL,
+   zipl_run: wrong header type);
 
 dputs(start loading images\n);
 
 /* Load image(s) into RAM */
 entry = (ComponentEntry *)(header[1]);
 while (entry-component_type == ZIPL_COMP_ENTRY_LOAD) {
-if (zipl_load_segment(entry)  0) {
-goto fail;
-}
+zipl_load_segment(entry);
 
 entry++;
 
-if ((uint8_t *)(entry[1])  (tmp_sec + MAX_SECTOR_SIZE)) {
-goto fail;
-}
+IPL_assert((uint8_t *)(entry[1]) = (tmp_sec + MAX_SECTOR_SIZE),
+   zipl_run: wrong entry size);
 }
 
-if (entry-component_type != ZIPL_COMP_ENTRY_EXEC) {
-goto fail;
-}
+IPL_assert(entry-component_type == ZIPL_COMP_ENTRY_EXEC,
+   zipl_run: no EXEC entry);
 
 /* should not return */
 jump_to_IPL_code(entry-load_address);
-
-return 0;
-
-fail:
-sclp_print(failed running zipl\n);
-return -1;
 }
 
-int zipl_load(void)
+void zipl_load(void)
 {
 ScsiMbr *mbr = (void *)sec;
 uint8_t *ns, *ns_end;
 int program_table_entries = 0;
 const int pte_len = sizeof(ScsiBlockPtr);
 ScsiBlockPtr *prog_table_entry;
-const char *error = ;
 
 /* Grab the MBR */
-virtio_read(0, (void *)mbr);
+read_block(0, mbr, zipl_load: cannot read block 0);
 
 dputs(checking magic\n);
 
-if (!zipl_magic(mbr-magic)) {
-error = zipl_magic 1;
-goto fail;
-}
+IPL_assert(zipl_magic(mbr-magic), zipl_load: zipl_magic 1);
 
 debug_print_int(program table, mbr-blockptr.blockno);
 
 /* Parse the program table */
-if (virtio_read(mbr-blockptr.blockno, sec)) {
-error = virtio_read;
-goto fail;
-}
+read_block(mbr-blockptr.blockno, sec,
+   zipl_load: cannot read program table);
 
-if (!zipl_magic(sec)) {
-error = zipl_magic 2;
-goto fail;
-}
+IPL_assert(zipl_magic(sec), zipl_load: zipl_magic 2);
 
 ns_end = sec + virtio_get_block_size();
 for (ns = (sec + pte_len); (ns + pte_len)  ns_end; ns++) {
@@ -239,19 +207,11 @@ int zipl_load(void)
 
 debug_print_int(program table entries, program_table_entries);
 
-if (!program_table_entries) {
-goto fail;
-}
+IPL_assert(program_table_entries, zipl_load: no program table);
 
 /* Run the default entry */
 
 prog_table_entry = (ScsiBlockPtr *)(sec + pte_len);
 
-return 

[Qemu-devel] [PULL 08/10] pc-bios/s390-ccw: IPL from CDL-formatted ECKD DASD

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Add code that allows us to start from ECKD DASD using the z/OS
compatible disk layout (CDL), which is the most common format for ECKD
DASD.
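
The core of the addressing is a cylinder/head/sector to linear block
conversion. The sketch below restates it as stand-alone C under some
assumptions: Geometry, Chs and the chs_*() helpers are hypothetical
names, and the mask and shift operators (head & 0xfff0, shifted left by
12, extending the cylinder number) are inferred from the constants in the
patch. Unlike the patch, this sketch always checks the cylinder bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical disk geometry, as reported by the device. */
typedef struct Geometry {
    uint64_t cylinders;
    uint64_t heads;
    uint64_t sectors;     /* sectors per track */
} Geometry;

/* Hypothetical ECKD address: sectors are numbered starting at 1. */
typedef struct Chs {
    uint16_t cylinder;
    uint16_t head;        /* high 12 bits extend the cylinder number */
    uint8_t sector;
} Chs;

static uint64_t chs_cylinder(const Chs *p)
{
    return p->cylinder + ((uint64_t)(p->head & 0xfff0) << 12);
}

static uint64_t chs_head(const Chs *p)
{
    return p->head & 0x000f;
}

static bool chs_valid(const Geometry *g, const Chs *p)
{
    return chs_head(p) < g->heads
        && p->sector >= 1
        && p->sector <= g->sectors
        && chs_cylinder(p) < g->cylinders;
}

/* Linear block number, counting from zero. */
static uint64_t chs_to_block(const Geometry *g, const Chs *p)
{
    assert(chs_valid(g, p));
    return g->sectors * g->heads * chs_cylinder(p)
         + g->sectors * chs_head(p)
         + p->sector
         - 1;
}
```

With 15 heads and 12 sectors per track, cylinder 1, head 2, sector 3
works out to 12 * 15 * 1 + 12 * 2 + 3 - 1 = 206.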

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c |  168 
 1 file changed, 168 insertions(+)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 3c08f82..beda4d6 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -74,6 +74,171 @@ static void jump_to_IPL_code(uint64_t address)
 }
 
 /***
+ * IPL an ECKD DASD (CDL or LDL/CMS format)
+ */
+
+static unsigned char _bprs[8*1024]; /* guessed max ECKD sector size */
+const int max_bprs_entries = sizeof(_bprs) / sizeof(ExtEckdBlockPtr);
+
+static bool eckd_valid_address(BootMapPointer *p)
+{
+    const uint64_t cylinder = p->eckd.cylinder
+                            + ((p->eckd.head & 0xfff0) << 12);
+    const uint64_t head = p->eckd.head & 0x000f;
+
+    if (head >= virtio_get_heads()
+        ||  p->eckd.sector > virtio_get_sectors()
+        ||  p->eckd.sector <= 0) {
+        return false;
+    }
+
+    if (!virtio_guessed_disk_nature() && cylinder >= virtio_get_cylinders()) {
+        return false;
+    }
+
+    return true;
+}
+
+static block_number_t eckd_block_num(BootMapPointer *p)
+{
+    const uint64_t sectors = virtio_get_sectors();
+    const uint64_t heads = virtio_get_heads();
+    const uint64_t cylinder = p->eckd.cylinder
+                            + ((p->eckd.head & 0xfff0) << 12);
+    const uint64_t head = p->eckd.head & 0x000f;
+    const block_number_t block = sectors * heads * cylinder
+                               + sectors * head
+                               + p->eckd.sector
+                               - 1; /* block nr starts with zero */
+    return block;
+}
+
+static block_number_t load_eckd_segments(block_number_t blk, uint64_t *address)
+{
+block_number_t block_nr;
+int j, rc;
+BootMapPointer *bprs = (void *)_bprs;
+bool more_data;
+
+memset(_bprs, FREE_SPACE_FILLER, sizeof(_bprs));
+read_block(blk, bprs, BPRS read failed);
+
+do {
+more_data = false;
+for (j = 0;; j++) {
+block_nr = eckd_block_num((void *)(bprs[j].xeckd));
+if (is_null_block_number(block_nr)) { /* end of chunk */
+break;
+}
+
+/* we need the updated blockno for the next indirect entry
+ * in the chain, but don't want to advance address
+ */
+if (j == (max_bprs_entries - 1)) {
+break;
+}
+
+IPL_assert(block_size_ok(bprs[j].xeckd.bptr.size),
+   bad chunk block size);
+IPL_assert(eckd_valid_address(bprs[j]), bad chunk ECKD addr);
+
+if ((bprs[j].xeckd.bptr.count == 0)  unused_space((bprs[j+1]),
+sizeof(EckdBlockPtr))) {
+/* This is a continue pointer.
+ * This ptr should be the last one in the current
+ * script section.
+ * I.e. the next ptr must point to the unused memory area
+ */
+memset(_bprs, FREE_SPACE_FILLER, sizeof(_bprs));
+read_block(block_nr, bprs, BPRS continuation read failed);
+more_data = true;
+break;
+}
+
+/* Load (count+1) blocks of code at (block_nr)
+ * to memory (address).
+ */
+rc = virtio_read_many(block_nr, (void *)(*address),
+  bprs[j].xeckd.bptr.count+1);
+IPL_assert(rc == 0, code chunk read failed);
+
+*address += (bprs[j].xeckd.bptr.count+1) * virtio_get_block_size();
+}
+} while (more_data);
+return block_nr;
+}
+
+static void run_eckd_boot_script(block_number_t mbr_block_nr)
+{
+int i;
+block_number_t block_nr;
+uint64_t address;
+ScsiMbr *scsi_mbr = (void *)sec;
+BootMapScript *bms = (void *)sec;
+
+memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+read_block(mbr_block_nr, sec, Cannot read MBR);
+
+block_nr = eckd_block_num((void *)(scsi_mbr-blockptr));
+
+memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+read_block(block_nr, sec, Cannot read Boot Map Script);
+
+for (i = 0; bms-entry[i].type == BOOT_SCRIPT_LOAD; i++) {
+address = bms-entry[i].address.load_address;
+block_nr = eckd_block_num((bms-entry[i].blkptr));
+
+do {
+block_nr = load_eckd_segments(block_nr, address);
+} while (block_nr != -1);
+}
+
+IPL_assert(bms-entry[i].type == BOOT_SCRIPT_EXEC,
+   Unknown script entry 

[Qemu-devel] [PULL 10/10] pc-bios/s390-ccw: update binary

2014-06-27 Thread Cornelia Huck
From: Jens Freimann jf...@linux.vnet.ibm.com

Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw.img |  Bin 9432 - 17624 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/pc-bios/s390-ccw.img b/pc-bios/s390-ccw.img
index 
1c7f7640fc0c5f2505c4f1114a21b3b712852dbc..603e19e003d574b24bb3b97bacda2bf38077e8fd
 100644
GIT binary patch
literal 17624

[Qemu-devel] [PULL 01/10] pc-bios/s390-ccw: make checkpatch happy

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Remove tabs, tweak whitespace and comments.

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c|   37 -
 pc-bios/s390-ccw/s390-ccw.h   |   20 ++--
 pc-bios/s390-ccw/sclp-ascii.c |4 ++--
 pc-bios/s390-ccw/virtio.c |   26 +-
 pc-bios/s390-ccw/virtio.h |2 +-
 5 files changed, 46 insertions(+), 43 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 5ee3fcb..753c288 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -10,7 +10,7 @@
 
 #include s390-ccw.h
 
-// #define DEBUG_FALLBACK
+/* #define DEBUG_FALLBACK */
 
 #ifdef DEBUG_FALLBACK
 #define dputs(txt) \
@@ -47,13 +47,13 @@ struct mbr {
 struct scsi_blockptr blockptr;
 } __attribute__ ((packed));
 
-#define ZIPL_MAGIC zIPL
+#define ZIPL_MAGIC  zIPL
 
-#define ZIPL_COMP_HEADER_IPL   0x00
-#define ZIPL_COMP_HEADER_DUMP  0x01
+#define ZIPL_COMP_HEADER_IPL0x00
+#define ZIPL_COMP_HEADER_DUMP   0x01
 
-#define ZIPL_COMP_ENTRY_LOAD   0x02
-#define ZIPL_COMP_ENTRY_EXEC   0x01
+#define ZIPL_COMP_ENTRY_LOAD0x02
+#define ZIPL_COMP_ENTRY_EXEC0x01
 
 /* Scratch space */
 static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE)));
@@ -107,8 +107,8 @@ static void jump_to_IPL_code(uint64_t address)
 /* Check for ZIPL magic. Returns 0 if not matched. */
 static int zipl_magic(uint8_t *ptr)
 {
-uint32_t *p = (void*)ptr;
-uint32_t *z = (void*)ZIPL_MAGIC;
+uint32_t *p = (void *)ptr;
+uint32_t *z = (void *)ZIPL_MAGIC;
 
 if (*p != *z) {
 debug_print_int(invalid magic, *p);
@@ -136,7 +136,7 @@ static inline bool unused_space(const void *p, unsigned int 
size)
 static int zipl_load_segment(struct component_entry *entry)
 {
 const int max_entries = (SECTOR_SIZE / sizeof(struct scsi_blockptr));
-struct scsi_blockptr *bprs = (void*)sec;
+struct scsi_blockptr *bprs = (void *)sec;
 const int bprs_size = sizeof(sec);
 uint64_t blockno;
 long address;
@@ -156,16 +156,18 @@ static int zipl_load_segment(struct component_entry 
*entry)
 }
 
 for (i = 0;; i++) {
-u64 *cur_desc = (void*)bprs[i];
+u64 *cur_desc = (void *)bprs[i];
 
 blockno = bprs[i].blockno;
-if (!blockno)
+if (!blockno) {
 break;
+}
 
 /* we need the updated blockno for the next indirect entry in the
chain, but don't want to advance address */
-if (i == (max_entries - 1))
+if (i == (max_entries - 1)) {
 break;
+}
 
 if (bprs[i].blockct == 0  unused_space(bprs[i + 1],
 sizeof(struct scsi_blockptr))) {
@@ -178,9 +180,10 @@ static int zipl_load_segment(struct component_entry *entry)
 break;
 }
 address = virtio_load_direct(cur_desc[0], cur_desc[1], 0,
- (void*)address);
-if (address == -1)
+ (void *)address);
+if (address == -1) {
 goto fail;
+}
 }
 } while (blockno);
 
@@ -220,7 +223,7 @@ static int zipl_run(struct scsi_blockptr *pte)
 
 entry++;
 
-if ((uint8_t*)(entry[1])  (tmp_sec + SECTOR_SIZE)) {
+if ((uint8_t *)(entry[1])  (tmp_sec + SECTOR_SIZE)) {
 goto fail;
 }
 }
@@ -241,7 +244,7 @@ fail:
 
 int zipl_load(void)
 {
-struct mbr *mbr = (void*)sec;
+struct mbr *mbr = (void *)sec;
 uint8_t *ns, *ns_end;
 int program_table_entries = 0;
 int pte_len = sizeof(struct scsi_blockptr);
@@ -249,7 +252,7 @@ int zipl_load(void)
 const char *error = ;
 
 /* Grab the MBR */
-virtio_read(0, (void*)mbr);
+virtio_read(0, (void *)mbr);
 
 dputs(checking magic\n);
 
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index 5e871ac..fe1dd22 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -34,10 +34,10 @@ typedef unsigned long long __u64;
 #define PAGE_SIZE 4096
 
 #ifndef EIO
-#define EIO1
+#define EIO 1
 #endif
 #ifndef EBUSY
-#define EBUSY  2
+#define EBUSY   2
 #endif
 #ifndef NULL
 #define NULL0
@@ -57,7 +57,7 @@ void sclp_setup(void);
 
 /* virtio.c */
 unsigned long virtio_load_direct(ulong rec_list1, ulong rec_list2,
-ulong subchan_id, void *load_addr);
+ ulong subchan_id, void *load_addr);
 bool virtio_is_blk(struct subchannel_id schid);
 void virtio_setup_block(struct 

Re: [Qemu-devel] [PATCH v4 0/6] iotests: Allow out-of-tree run

2014-06-27 Thread Markus Armbruster
Max Reitz mre...@redhat.com writes:

 On 07.06.2014 23:21, Max Reitz wrote:
 On 24.05.2014 23:24, Max Reitz wrote:
 This series enables qemu-iotests to be run in a build tree outside of
 the source tree. It also makes the tests use the command for invoking
 the Python interpreter specified through configure instead of always
 using /usr/bin/env python.

 Ping; I do understand that this series is not urgent, but since I
 realized out-of-tree builds to be probably superior, I personally
 base all my own patches on this series, as I don't want to fiddle
 around with the iotests. Therefore, I'd be glad if someone would
 review the remaining patches so it can be merged soon. :-)

 Ping again. Because this is just convenient for development, I don't
 need it in any specific release, though.

I haven't found the time for a proper review, and I can't promise one
right now, so I should probably keep my mouth where my money is, but
here goes anyway: unless running tests is utterly trivial, tests will
not be run, and avoidable mistakes happen.

Case in point: I spent a non-trivial chunk of time yesterday to debug
three regressions clearly visible in iotests.  I did not scold the
people involved in getting the regressions committed for not running
these tests, because I feel strongly I can't demand tests to be run that
require instructions more complex than make WHATEVER.

I don't think this is just convenient for development.  I'd say it's a
must-have.



Re: [Qemu-devel] [PATCH] Allow mismatched virtio config-len

2014-06-27 Thread Paolo Bonzini

Il 27/06/2014 10:34, Dr. David Alan Gilbert (git) ha scritto:

From: Dr. David Alan Gilbert dgilb...@redhat.com

Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.

Unfortunately, there are cases where this isn't true; the one
we found it on was the wce addition in virtio-blk.


Indeed, the alternative here is to break migration.

As a follow up, it would be nice to let the bus detect whether the 
config_len change is valid or not.


For virtio-mmio and s390, mst said that config length must always match 
(luckily, these machines aren't versioned so they are not affected by 
the wce change).


For virtio-pci, it is okay as long as the old_length + 
VIRTIO_PCI_REGION_SIZE(vdev) and new_length + 
VIRTIO_PCI_REGION_SIZE(vdev) do not cross a power of two.
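
A minimal sketch of the virtio-pci condition above, assuming "do not cross a power of two" means that both totals round up to the same power of two (so the BAR size stays the same). The helper names are hypothetical, and `region_size` merely stands in for VIRTIO_PCI_REGION_SIZE(vdev); none of this is existing QEMU API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round v up to the next power of two (v must be in 1..2^31). */
static uint32_t round_up_pow2(uint32_t v)
{
    uint32_t p = 1;

    assert(v >= 1 && v <= 0x80000000u);
    while (p < v) {
        p <<= 1;
    }
    return p;
}

/*
 * Hypothetical check: a config_len change is acceptable when the old and
 * the new total size land in the same power-of-two bucket.
 */
static bool config_len_change_ok(uint32_t region_size,
                                 uint32_t old_len, uint32_t new_len)
{
    return round_up_pow2(region_size + old_len) ==
           round_up_pow2(region_size + new_len);
}
```

With a made-up region size of 24 bytes, growing the config from 4 to 8 bytes keeps the total within 32, while growing it from 8 to 9 bytes pushes the total past 32 and would be rejected.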


Paolo


Allow mismatched config-lengths:
   *) If the version on the wire is shorter then ensure that the
  remainder is 0xff filled (as virtio_config_read does on
  out of range reads)
   *) If the version on the wire is longer, load what we have space
  for and skip the rest.

Signed-off-by: Dr. David Alan Gilbert dgilb...@redhat.com
---
 hw/virtio/virtio.c | 30 ++
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a3082d5..2b11142 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -927,11 +927,33 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f)
     }
     config_len = qemu_get_be32(f);
     if (config_len != vdev->config_len) {
-        error_report("Unexpected config length 0x%x. Expected 0x%zx",
-                     config_len, vdev->config_len);
-        return -1;
+        /*
+         * Unfortunately the reality is that there are cases where we
+         * see mismatched config lengths, so we have to deal with them
+         * rather than rejecting them.
+         */
+
+        if (config_len < vdev->config_len) {
+            /* This is normal in some devices when they add a new option */
+            memset(vdev->config, 0xff, vdev->config_len);
+            qemu_get_buffer(f, vdev->config, config_len);
+        } else {
+            int32_t diff;
+            /* config_len > vdev->config_len
+             * This is rarer, but is here to allow us to fix the case above
+             */
+            qemu_get_buffer(f, vdev->config, vdev->config_len);
+            /*
+             * Even though we expect the diff to be small, we can't use
+             * qemu_file_skip because it's not safe for a large skip.
+             */
+            for (diff = config_len - vdev->config_len; diff > 0; diff--) {
+                qemu_get_byte(f);
+            }
+        }
+    } else {
+        qemu_get_buffer(f, vdev->config, vdev->config_len);
     }
-    qemu_get_buffer(f, vdev->config, vdev->config_len);
 
     num = qemu_get_be32(f);
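
The load policy from the commit message (pad with 0xff when the wire copy is shorter, load what fits and skip the rest when it is longer) can be tried out in isolation. The following is only a sketch under stated assumptions: `struct stream`, `stream_read()` and `load_config()` are hypothetical stand-ins for QEMUFile and its accessors, not QEMU code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the migration stream (not QEMU's QEMUFile). */
struct stream {
    const uint8_t *data;
    size_t len;
    size_t pos;
};

/* Copy up to n bytes from the stream into dst; returns bytes copied. */
static size_t stream_read(struct stream *s, uint8_t *dst, size_t n)
{
    size_t avail = s->len - s->pos;

    if (n > avail) {
        n = avail;
    }
    memcpy(dst, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Load a config blob of wire_len bytes into a device buffer of dev_len. */
static void load_config(struct stream *s, uint8_t *config,
                        size_t dev_len, size_t wire_len)
{
    if (wire_len < dev_len) {
        /* Shorter on the wire: the tail reads back as 0xff. */
        memset(config, 0xff, dev_len);
        stream_read(s, config, wire_len);
    } else {
        uint8_t scratch;
        size_t extra = wire_len - dev_len;

        /* Equal or longer: take what fits, then skip byte by byte. */
        stream_read(s, config, dev_len);
        while (extra-- > 0) {
            stream_read(s, &scratch, 1);
        }
    }
}
```

Skipping byte by byte mirrors the loop in the patch; the equal-length case falls through the second branch with nothing left to skip.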







[Qemu-devel] [PULL 07/10] pc-bios/s390-ccw: factor out ipl code

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Move the scsi-disk specific ipl code from zipl_load() into a new
function ipl_scsi(). This makes it easier to add ipl routines for other
disk types.

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c |   83 
 1 file changed, 45 insertions(+), 38 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index 1866a20..3c08f82 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -12,7 +12,9 @@
 #include bootmap.h
 #include virtio.h
 
+#ifdef DEBUG
 /* #define DEBUG_FALLBACK */
+#endif
 
 #ifdef DEBUG_FALLBACK
 #define dputs(txt) \
@@ -23,8 +25,7 @@
 #endif
 
 /* Scratch space */
-static uint8_t sec[MAX_SECTOR_SIZE]
-__attribute__((__aligned__(MAX_SECTOR_SIZE)));
+static uint8_t sec[MAX_SECTOR_SIZE*4] __attribute__((__aligned__(PAGE_SIZE)));
 
 typedef struct ResetInfo {
 uint32_t ipl_mask;
@@ -72,19 +73,9 @@ static void jump_to_IPL_code(uint64_t address)
 virtio_panic(\n! IPL returns !\n);
 }
 
-/* Check for ZIPL magic. Returns 0 if not matched. */
-static int zipl_magic(uint8_t *ptr)
-{
-uint32_t *p = (void *)ptr;
-uint32_t *z = (void *)ZIPL_MAGIC;
-
-if (*p != *z) {
-debug_print_int(invalid magic, *p);
-virtio_panic(invalid magic);
-}
-
-return 1;
-}
+/***
+ * IPL a SCSI disk
+ */
 
 static void zipl_load_segment(ComponentEntry *entry)
 {
@@ -92,8 +83,10 @@ static void zipl_load_segment(ComponentEntry *entry)
 ScsiBlockPtr *bprs = (void *)sec;
 const int bprs_size = sizeof(sec);
 block_number_t blockno;
-long address;
+uint64_t address;
 int i;
+char err_msg[] = zIPL failed to read BPRS at 0x;
+char *blk_no = err_msg[30]; /* where to print blockno in (those ZZs) */
 
 blockno = entry-data.blockno;
 address = entry-load_address;
@@ -103,11 +96,11 @@ static void zipl_load_segment(ComponentEntry *entry)
 
 do {
 memset(bprs, FREE_SPACE_FILLER, bprs_size);
-debug_print_int(reading bprs at, blockno);
-read_block(blockno, bprs, zipl_load_segment: cannot read block);
+fill_hex_val(blk_no, blockno, sizeof(blockno));
+read_block(blockno, bprs, err_msg);
 
 for (i = 0;; i++) {
-u64 *cur_desc = (void *)bprs[i];
+uint64_t *cur_desc = (void *)bprs[i];
 
 blockno = bprs[i].blockno;
 if (!blockno) {
@@ -132,7 +125,7 @@ static void zipl_load_segment(ComponentEntry *entry)
 }
 address = virtio_load_direct(cur_desc[0], cur_desc[1], 0,
  (void *)address);
-IPL_assert(address != -1, zipl_load_segment: wrong IPL address);
+IPL_assert(address != -1, zIPL load segment failed);
 }
 } while (blockno);
 }
@@ -144,13 +137,11 @@ static void zipl_run(ScsiBlockPtr *pte)
 ComponentEntry *entry;
 uint8_t tmp_sec[MAX_SECTOR_SIZE];
 
-virtio_read(pte-blockno, tmp_sec);
+read_block(pte-blockno, tmp_sec, Cannot read header);
 header = (ComponentHeader *)tmp_sec;
 
-IPL_assert(zipl_magic(tmp_sec), zipl_run: zipl_magic);
-
-IPL_assert(header-type == ZIPL_COMP_HEADER_IPL,
-   zipl_run: wrong header type);
+IPL_assert(magic_match(tmp_sec, ZIPL_MAGIC), No zIPL magic);
+IPL_assert(header-type == ZIPL_COMP_HEADER_IPL, Bad header type);
 
 dputs(start loading images\n);
 
@@ -162,17 +153,16 @@ static void zipl_run(ScsiBlockPtr *pte)
 entry++;
 
 IPL_assert((uint8_t *)(entry[1]) = (tmp_sec + MAX_SECTOR_SIZE),
-   zipl_run: wrong entry size);
+   Wrong entry value);
 }
 
-IPL_assert(entry-component_type == ZIPL_COMP_ENTRY_EXEC,
-   zipl_run: no EXEC entry);
+IPL_assert(entry-component_type == ZIPL_COMP_ENTRY_EXEC, No EXEC entry);
 
 /* should not return */
 jump_to_IPL_code(entry-load_address);
 }
 
-void zipl_load(void)
+static void ipl_scsi(void)
 {
 ScsiMbr *mbr = (void *)sec;
 uint8_t *ns, *ns_end;
@@ -180,20 +170,16 @@ void zipl_load(void)
 const int pte_len = sizeof(ScsiBlockPtr);
 ScsiBlockPtr *prog_table_entry;
 
-/* Grab the MBR */
-read_block(0, mbr, zipl_load: cannot read block 0);
-
-dputs(checking magic\n);
-
-IPL_assert(zipl_magic(mbr-magic), zipl_load: zipl_magic 1);
+/* The 0-th block (MBR) was already read into sec[] */
 
+sclp_print(Using SCSI scheme.\n);
 debug_print_int(program table, mbr-blockptr.blockno);
 
 /* Parse the program table */
 read_block(mbr-blockptr.blockno, sec,
-   zipl_load: cannot read program table);
+ 

Re: [Qemu-devel] [PATCH 4/5] PPC: e500: Support platform devices

2014-06-27 Thread Alexander Graf


On 27.06.14 11:29, Eric Auger wrote:

On 06/04/2014 02:28 PM, Alexander Graf wrote:

For e500 our approach to supporting platform devices is to create a simple
bus from the guest's point of view within which we map platform devices
dynamically.

We allocate memory regions always within the platform hole in address
space and map IRQs to predetermined IRQ lines that are reserved for platform
device usage.

This maps really nicely into device tree logic, so we can just tell the
guest about our virtual simple bus in device tree as well.
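
To illustrate the allocation scheme described above, here is a minimal sketch that hands out page-aligned regions inside a fixed hole and IRQ lines from a reserved range. It uses the hole size, page shift and IRQ range from the patch, but the simple bump allocator and all names (`struct platform_alloc`, `platform_alloc_region()`, `platform_alloc_irq()`) are assumptions made for illustration and are not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

#define PLATFORM_HOLE        (128ULL * 1024 * 1024) /* 128 MB */
#define PLATFORM_PAGE_SHIFT  12
#define PLATFORM_HOLE_PAGES  (PLATFORM_HOLE >> PLATFORM_PAGE_SHIFT)
#define PLATFORM_FIRST_IRQ   5
#define PLATFORM_NUM_IRQS    10
#define PLATFORM_ALLOC_FAIL  UINT64_MAX

/* Hypothetical allocator state: next free page and IRQs handed out. */
struct platform_alloc {
    uint64_t next_page;
    int used_irqs;
};

/* Returns the offset of the region inside the hole, or PLATFORM_ALLOC_FAIL. */
static uint64_t platform_alloc_region(struct platform_alloc *a, uint64_t size)
{
    uint64_t pages, offset;

    if (size == 0 || size > PLATFORM_HOLE) {
        return PLATFORM_ALLOC_FAIL;
    }
    /* Round the request up to whole pages. */
    pages = (size + (1ULL << PLATFORM_PAGE_SHIFT) - 1) >> PLATFORM_PAGE_SHIFT;
    if (pages > PLATFORM_HOLE_PAGES - a->next_page) {
        return PLATFORM_ALLOC_FAIL;
    }
    offset = a->next_page << PLATFORM_PAGE_SHIFT;
    a->next_page += pages;
    return offset;
}

/* Returns the next reserved IRQ line, or -1 when all are in use. */
static int platform_alloc_irq(struct platform_alloc *a)
{
    if (a->used_irqs >= PLATFORM_NUM_IRQS) {
        return -1;
    }
    return PLATFORM_FIRST_IRQ + a->used_irqs++;
}
```

The returned offsets are relative to the start of the hole, so the guest-physical address is the platform base plus the offset.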

Signed-off-by: Alexander Graf ag...@suse.de
---
  default-configs/ppc-softmmu.mak   |   1 +
  default-configs/ppc64-softmmu.mak |   1 +
  hw/ppc/e500.c | 221 ++
  hw/ppc/e500.h |   1 +
  hw/ppc/e500plat.c |   1 +
  5 files changed, 225 insertions(+)

diff --git a/default-configs/ppc-softmmu.mak b/default-configs/ppc-softmmu.mak
index 33f8d84..d6ec8b9 100644
--- a/default-configs/ppc-softmmu.mak
+++ b/default-configs/ppc-softmmu.mak
@@ -45,6 +45,7 @@ CONFIG_PREP=y
  CONFIG_MAC=y
  CONFIG_E500=y
  CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
+CONFIG_PLATFORM=y
  # For PReP
  CONFIG_MC146818RTC=y
  CONFIG_ETSEC=y
diff --git a/default-configs/ppc64-softmmu.mak 
b/default-configs/ppc64-softmmu.mak
index 37a15b7..06677bf 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -45,6 +45,7 @@ CONFIG_PSERIES=y
  CONFIG_PREP=y
  CONFIG_MAC=y
  CONFIG_E500=y
+CONFIG_PLATFORM=y
  CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
  # For pSeries
  CONFIG_XICS=$(CONFIG_PSERIES)
diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index 33d54b3..bc26215 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -36,6 +36,7 @@
  #include exec/address-spaces.h
  #include qemu/host-utils.h
  #include hw/pci-host/ppce500.h
+#include hw/platform/device.h
  
  #define EPAPR_MAGIC(0x45504150)

  #define BINARY_DEVICE_TREE_FILEmpc8544ds.dtb
@@ -47,6 +48,14 @@
  
  #define RAM_SIZES_ALIGN(64UL  20)
  
+#define E500_PLATFORM_BASE 0xF000ULL

+#define E500_PLATFORM_HOLE         (128ULL * 1024 * 1024) /* 128 MB */
+#define E500_PLATFORM_PAGE_SHIFT   12
+#define E500_PLATFORM_HOLE_PAGES   (E500_PLATFORM_HOLE >> \
+                                    E500_PLATFORM_PAGE_SHIFT)
+#define E500_PLATFORM_FIRST_IRQ    5
+#define E500_PLATFORM_NUM_IRQS     10
+
  /* TODO: parameterize */
  #define MPC8544_CCSRBAR_BASE   0xE000ULL
  #define MPC8544_CCSRBAR_SIZE   0x0010ULL
@@ -122,6 +131,62 @@ static void dt_serial_create(void *fdt, unsigned long long 
offset,
  }
  }
  
+typedef struct PlatformDevtreeData {

+void *fdt;
+const char *mpic;
+int irq_start;
+const char *node;
+} PlatformDevtreeData;
+
+static int platform_device_create_devtree(Object *obj, void *opaque)
+{
+PlatformDevtreeData *data = opaque;
+Object *dev;
+PlatformDeviceState *pdev;
+
+dev = object_dynamic_cast(obj, TYPE_PLATFORM_DEVICE);
+pdev = (PlatformDeviceState *)dev;
+
+if (!pdev) {
+/* Container, traverse it for children */
+return object_child_foreach(obj, platform_device_create_devtree, data);
+}
+
+return 0;
+}
+
+static void platform_create_devtree(void *fdt, const char *node, uint64_t addr,
+const char *mpic, int irq_start,
+int nr_irqs)
+{
+const char platcomp[] = qemu,platform\0simple-bus;
+PlatformDevtreeData data;
+
+/* Create a /platform node that we can put all devices into */
+
+qemu_fdt_add_subnode(fdt, node);
+qemu_fdt_setprop(fdt, node, compatible, platcomp, sizeof(platcomp));
+qemu_fdt_setprop_string(fdt, node, device_type, platform);
+
+/* Our platform hole is less than 32bit big, so 1 cell is enough for 
address
+   and size */
+qemu_fdt_setprop_cells(fdt, node, #size-cells, 1);
+qemu_fdt_setprop_cells(fdt, node, #address-cells, 1);
+qemu_fdt_setprop_cells(fdt, node, ranges, 0, addr  32, addr,
+   E500_PLATFORM_HOLE);
+
+qemu_fdt_setprop_phandle(fdt, node, interrupt-parent, mpic);
+
+/* Loop through all devices and create nodes for known ones */
+
+data.fdt = fdt;
+data.mpic = mpic;
+data.irq_start = irq_start;
+data.node = node;
+
+platform_device_create_devtree(qdev_get_machine(), data);
+}
+
  static int ppce500_load_device_tree(MachineState *machine,
  PPCE500Params *params,
  hwaddr addr,
@@ -379,6 +444,12 @@ static int ppce500_load_device_tree(MachineState *machine,
  qemu_fdt_setprop_cell(fdt, pci, #address-cells, 3);
  qemu_fdt_setprop_string(fdt, /aliases, pci0, pci);
  
+if (params-has_platform) {

+platform_create_devtree(fdt, /platform, E500_PLATFORM_BASE,
+   mpic, 

[Qemu-devel] [PULL 06/10] pc-bios/s390-ccw: Add fill_hex_val func to provide better msgs

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Factor out helper function for dumping a hex value into a buffer.

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/s390-ccw.h |   16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index 29468fb..959aed0 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -86,15 +86,21 @@ static inline void fill_hex(char *out, unsigned char val)
 out[1] = hex[val  0xf];
 }
 
-static inline void print_int(const char *desc, u64 addr)
+static inline void fill_hex_val(char *out, void *ptr, unsigned size)
 {
-unsigned char *addr_c = (unsigned char *)addr;
-char out[] = : 0x\n;
+unsigned char *value = ptr;
 unsigned int i;
 
-for (i = 0; i  sizeof(addr); i++) {
-fill_hex(out[4 + (i*2)], addr_c[i]);
+for (i = 0; i  size; i++) {
+fill_hex(out[i*2], value[i]);
 }
+}
+
+static inline void print_int(const char *desc, u64 addr)
+{
+char out[] = : 0x\n;
+
+fill_hex_val(out[4], addr, sizeof(addr));
 
 sclp_print(desc);
 sclp_print(out);
-- 
1.7.9.5
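
For readers who want to try the helper outside the BIOS, here is a standalone sketch of the same idea: one function writes a byte as two hex digits, and a second walks an arbitrary value byte by byte. It follows the shape of the patch, but the `const` qualifier and the lowercase digit table are choices made for this sketch:

```c
#include <assert.h>
#include <string.h>

/* Write val as two hex digits at out[0..1]; no terminator is added. */
static void fill_hex(char *out, unsigned char val)
{
    const char hex[] = "0123456789abcdef";

    out[0] = hex[(val >> 4) & 0xf];
    out[1] = hex[val & 0xf];
}

/* Dump size bytes starting at ptr as 2 * size hex digits into out. */
static void fill_hex_val(char *out, const void *ptr, unsigned size)
{
    const unsigned char *value = ptr;
    unsigned int i;

    for (i = 0; i < size; i++) {
        fill_hex(&out[i * 2], value[i]);
    }
}
```

The bytes are dumped in memory order. On a big-endian target such as s390 that reads as the numeric value; on a little-endian host the digit pairs of an integer would appear byte-reversed. The caller has to provide a buffer with room for two characters per byte.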




Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Pavel Dovgaluk
 On 27 June 2014 11:35, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
  The major disadvantage of icount is that it's updated only on TB boundaries.
  When one instruction in the middle of the block uses the virtual clock, it could
  see different values depending on how the code is divided into TBs.
 
 This is only true if the instruction is incorrectly not
 marked as being I/O. The idea behind icount is that in
 general we update it on TB boundaries (it's much faster
 than doing it once per insn) but for those places which
 do turn out to need an exact icount we then retranslate
 the block to get the instruction-to-icount-adjustment
 mapping.

I forgot about one more issue.
When qemu stops execution on the breakpoint, the icount
is decreased by the number of instructions in the block.
But in this case the last instruction is not executed and
should not affect the counter.

Pavel Dovgaluk





[Qemu-devel] [PULL 09/10] pc-bios/s390-ccw: IPL from LDL/CMS-formatted ECKD DASD

2014-06-27 Thread Cornelia Huck
From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Add code that allows us to start from two further ECKD DASD disk
layouts: LDL (Linux disk layout) and CMS (cms-formatted disk).
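
The resulting detection order (SCSI and CDL from block 0, then CMS and LDL from block 2) can be sketched as a small standalone classifier. This is an illustration under stated assumptions: the enum and function names are hypothetical, each magic is assumed to sit at offset 0 of its block, and the byte strings are "IPL1", "CMS1" and "LNX1" encoded in EBCDIC:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum boot_layout {
    LAYOUT_SCSI,
    LAYOUT_ECKD_CDL,
    LAYOUT_ECKD_CMS,
    LAYOUT_ECKD_LDL,
    LAYOUT_UNKNOWN
};

/* Compare the first four bytes of a block with a 4-byte magic. */
static int magic_is(const uint8_t *block, const char *magic)
{
    return memcmp(block, magic, 4) == 0;
}

/* Hypothetical classifier: block0 and block2 are already-read sectors. */
static enum boot_layout detect_layout(const uint8_t *block0,
                                      const uint8_t *block2)
{
    if (magic_is(block0, "zIPL")) {
        return LAYOUT_SCSI;
    }
    if (magic_is(block0, "\xc9\xd7\xd3\xf1")) {   /* "IPL1" in EBCDIC */
        return LAYOUT_ECKD_CDL;
    }
    if (magic_is(block2, "\xc3\xd4\xe2\xf1")) {   /* "CMS1" in EBCDIC */
        return LAYOUT_ECKD_CMS;
    }
    if (magic_is(block2, "\xd3\xd5\xe7\xf1")) {   /* "LNX1" in EBCDIC */
        return LAYOUT_ECKD_LDL;
    }
    return LAYOUT_UNKNOWN;
}
```

In the patch itself block 2 is only read after the block 0 checks have failed; the sketch takes both blocks up front purely to stay self-contained.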

Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
 pc-bios/s390-ccw/bootmap.c |   92 
 pc-bios/s390-ccw/bootmap.h |7 
 2 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index beda4d6..fa54abb 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -80,6 +80,17 @@ static void jump_to_IPL_code(uint64_t address)
 static unsigned char _bprs[8*1024]; /* guessed max ECKD sector size */
 const int max_bprs_entries = sizeof(_bprs) / sizeof(ExtEckdBlockPtr);
 
+static inline void verify_boot_info(BootInfo *bip)
+{
+IPL_assert(magic_match(bip-magic, ZIPL_MAGIC), No zIPL magic);
+IPL_assert(bip-version == BOOT_INFO_VERSION, Wrong zIPL version);
+IPL_assert(bip-bp_type == BOOT_INFO_BP_TYPE_IPL, DASD is not for IPL);
+IPL_assert(bip-dev_type == BOOT_INFO_DEV_TYPE_ECKD, DASD is not ECKD);
+IPL_assert(bip-flags == BOOT_INFO_FLAGS_ARCH, Not for this arch);
+IPL_assert(block_size_ok(bip-bp.ipl.bm_ptr.eckd.bptr.size),
+   Bad block size in zIPL section of the 1st record.);
+}
+
 static bool eckd_valid_address(BootMapPointer *p)
 {
 const uint64_t cylinder = p-eckd.cylinder
@@ -198,19 +209,15 @@ static void run_eckd_boot_script(block_number_t 
mbr_block_nr)
 jump_to_IPL_code(bms-entry[i].address.load_address); /* no return */
 }
 
-static void ipl_eckd(void)
+static void ipl_eckd_cdl(void)
 {
 XEckdMbr *mbr;
 Ipl2 *ipl2 = (void *)sec;
 IplVolumeLabel *vlbl = (void *)sec;
 block_number_t block_nr;
 
-sclp_print(Using ECKD scheme.\n);
-if (virtio_guessed_disk_nature()) {
-sclp_print(Using guessed DASD geometry.\n);
-virtio_assume_eckd();
-}
 /* we have just read the block #0 and recognized it as IPL1 */
+sclp_print(CDL\n);
 
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
 read_block(1, ipl2, Cannot read IPL2 record at block 1);
@@ -238,6 +245,57 @@ static void ipl_eckd(void)
 /* no return */
 }
 
+static void ipl_eckd_ldl(ECKD_IPL_mode_t mode)
+{
+LDL_VTOC *vlbl = (void *)sec; /* already read, 3rd block */
+char msg[4] = { '?', '.', '\n', '\0' };
+block_number_t block_nr;
+BootInfo *bip;
+
+sclp_print((mode == ECKD_CMS) ? CMS : LDL);
+sclp_print( version );
+switch (vlbl-LDL_version) {
+case LDL1_VERSION:
+msg[0] = '1';
+break;
+case LDL2_VERSION:
+msg[0] = '2';
+break;
+default:
+msg[0] = vlbl->LDL_version;
+msg[0] &= 0x0f; /* convert EBCDIC   */
+msg[0] |= 0x30; /* to ASCII (digit) */
+msg[1] = '?';
+break;
+}
+sclp_print(msg);
+print_volser(vlbl-volser);
+
+/* DO NOT read BootMap pointer (only one, xECKD) at block #2 */
+
+memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+read_block(0, sec, Cannot read block 0);
+bip = (void *)(sec + 0x70); /* boot info is eckd mbr for LDL */
+verify_boot_info(bip);
+
+block_nr = eckd_block_num((void *)(bip-bp.ipl.bm_ptr.eckd.bptr));
+run_eckd_boot_script(block_nr);
+/* no return */
+}
+
+static void ipl_eckd(ECKD_IPL_mode_t mode)
+{
+switch (mode) {
+case ECKD_CDL:
+ipl_eckd_cdl(); /* no return */
+case ECKD_CMS:
+case ECKD_LDL:
+ipl_eckd_ldl(mode); /* no return */
+default:
+virtio_panic(\n! Unknown ECKD IPL mode !\n);
+}
+}
+
 /***
  * IPL a SCSI disk
  */
@@ -374,6 +432,7 @@ static void ipl_scsi(void)
 void zipl_load(void)
 {
 ScsiMbr *mbr = (void *)sec;
+LDL_VTOC *vlbl = (void *)sec;
 
 /* Grab the MBR */
 memset(sec, FREE_SPACE_FILLER, sizeof(sec));
@@ -384,8 +443,27 @@ void zipl_load(void)
 if (magic_match(mbr-magic, ZIPL_MAGIC)) {
 ipl_scsi(); /* no return */
 }
+
+/* We have failed to follow the SCSI scheme, so */
+sclp_print(Using ECKD scheme.\n);
+if (virtio_guessed_disk_nature()) {
+sclp_print(Using guessed DASD geometry.\n);
+virtio_assume_eckd();
+}
+
 if (magic_match(mbr-magic, IPL1_MAGIC)) {
-ipl_eckd(); /* CDL ECKD; no return */
+ipl_eckd(ECKD_CDL); /* no return */
+}
+
+/* LDL/CMS? */
+memset(sec, FREE_SPACE_FILLER, sizeof(sec));
+read_block(2, vlbl, Cannot read block 2);
+
+if (magic_match(vlbl-magic, CMS1_MAGIC)) {
+ipl_eckd(ECKD_CMS); /* no return */
+}
+if (magic_match(vlbl-magic, LNX1_MAGIC)) {
+ipl_eckd(ECKD_LDL); /* no return */
 }
 
 virtio_panic(\n* invalid MBR 

Re: [Qemu-devel] [PATCH 2/4] mips_malta: Change default KVM cpu to 24Kc (no FP)

2014-06-27 Thread Paolo Bonzini

Il 27/06/2014 10:43, Aurelien Jarno ha scritto:

On Thu, Jun 26, 2014 at 10:44:23AM +0100, James Hogan wrote:

Change the default Malta CPU model for when KVM is enabled to 24Kc which
doesn't have floating point support compared to the 24Kf.

The resulting incorrect Config CP0 register value doesn't get passed to
KVM yet as KVM doesn't expose it, however we should ensure it is set
correctly now to reduce the risk of breaking migration/loadvm to a
future version of QEMU/Linux that does support them.

Signed-off-by: James Hogan james.ho...@imgtec.com
Cc: Aurelien Jarno aurel...@aurel32.net
Cc: Paolo Bonzini pbonz...@redhat.com
---
 hw/mips/mips_malta.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 2868ee5b0307..c0841991f4e9 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -949,7 +949,12 @@ void mips_malta_init(MachineState *machine)
 #ifdef TARGET_MIPS64
         cpu_model = "20Kc";
 #else
-        cpu_model = "24Kf";
+        if (kvm_enabled()) {
+            /* Don't enable FPU on KVM yet */
+            cpu_model = "24Kc";
+        } else {
+            cpu_model = "24Kf";
+        }
 #endif
     }


Given the explanations in the other mails, that looks fine to me. That
said, I think we should at least warn the user that we are disabling some
features, instead of doing it silently. This is what is done for example
on x86 when a CPU feature is not available.


I agree.  James, can you send v2 of this patch only?

Paolo



Re: [Qemu-devel] [v5][PATCH 4/5] xen, gfx passthrough: create host bridge to passthrough

2014-06-27 Thread Paolo Bonzini

Il 27/06/2014 10:34, Chen, Tiejun ha scritto:



So how do we make this specific to Xen? Or do you mean we need to
create a new machine to address this scenario? Actually, this is the
same as xenfv_machine except for this little bit of code.


Yes, please create a new machine so that -M pc doesn't have any of 
these hacks.


Note that -M xenfv is obsolete, Xen can now use -M pc (i.e. the 
default).


Paolo



Re: [Qemu-devel] [PATCH] ahci.c: mask the interrupt on complete flag to allow ahci.c to read the correct size for the PRDT

2014-06-27 Thread Paolo Bonzini

Il 27/06/2014 01:28, Reza Jelveh ha scritto:

+static int prdt_tbl_entry_size(const AHCI_SG tbl) {
+  return (le32_to_cpu(tbl.flags_size) & AHCI_PRDT_SIZE_MASK) + 1;
+}


Apart from the incorrect indentation/formatting here, the patch seems okay.

How can this be reproduced?
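
For context, a minimal sketch of the masking in question. It assumes the layout from my reading of the AHCI specification: the low 22 bits of the last PRDT dword hold the byte count minus one, and bit 31 is the interrupt-on-completion flag. The macro values and the function name are assumptions for this sketch; the value of AHCI_PRDT_SIZE_MASK used by the patch is not shown in this mail:

```c
#include <assert.h>
#include <stdint.h>

#define PRDT_IOC_FLAG   0x80000000u /* interrupt on completion (assumed) */
#define PRDT_SIZE_MASK  0x003fffffu /* byte count minus one (assumed) */

/* Size in bytes of one PRDT entry, given the CPU-endian flags_size dword. */
static uint32_t prdt_entry_size(uint32_t flags_size)
{
    return (flags_size & PRDT_SIZE_MASK) + 1;
}
```

Without the mask, an entry with the interrupt flag set would be read as a transfer of more than 2 GB instead of, say, 512 bytes.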

Paolo



Re: [Qemu-devel] [PATCH v1] trace: add qemu_system_powerdown_request and qemu_system_shutdown_request trace events

2014-06-27 Thread Stefan Hajnoczi
On Sun, Jun 22, 2014 at 02:43:03AM +0800, Yang Zhiyong wrote:
 We have seen cases where the guest doesn't stop successfully
 even though it was instructed to shut down.
 
 The root cause is usually not in QEMU.  However, QEMU is often
 suspected at the beginning just because the issue occurred in a
 virtualization environment.
 
 Therefore, we need to confirm that QEMU received the shutdown
 request (from the virsh shutdown command, virt-manager, or stopping
 the QEMU process) and raised the ACPI irq to the VM, so that we can
 confirm that the problem belongs to the guest OS rather than to
 QEMU itself.
 
 When we stop guests by the virsh shutdown command, virt-manager,
 or stopping the QEMU process, qemu_system_powerdown_request() or
 qemu_system_shutdown_request() is called. Then the functions below
 in main_loop_should_exit() of vl.c are called roughly in the
 following order.
   
   if (qemu_powerdown_requested()) 
   qemu_system_powerdown()
   monitor_protocol_event(QEVENT_POWERDOWN, NULL)
   
   OR
   
   if (qemu_shutdown_requested())
   monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
 
 The tracepoint of monitor_protocol_event() already exists, but no
 tracepoints are defined for qemu_system_powerdown_request() and 
 qemu_system_shutdown_request(). So this patch adds two tracepoints for 
 the two functions. We believe that it will become much easier to 
 isolate the problem mentioned above by these tracepoints.
 
 
 Signed-off-by: Yang Zhiyong yangzy.f...@cn.fujitsu.com
 
 ---
  trace-events |2 ++
  vl.c |2 ++
  2 files changed, 4 insertions(+), 0 deletions(-)

Thanks, applied to my tracing tree:
https://github.com/stefanha/qemu/commits/tracing

Stefan




Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Alexander Graf


On 27.06.14 12:30, Andreas Färber wrote:

Am 26.06.2014 14:01, schrieb Alexander Graf:

On 20.06.14 08:43, Peter Crosthwaite wrote:

On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:

Platforms without ISA and/or PCI have had a seriously hard time in
the dynamic
device creation world of QEMU. Devices on these were modeled as
SysBus devices
which can only be instantiated in machine files, not through -device.

Why is that so?

Well, SysBus is trying to be incredibly generic. It allows you to
plug any
interrupt sender into any other interrupt receiver. It allows you to map
a device's memory regions into any other random memory region. All of
that
only works from C code or via really complicated command line
arguments under
discussion upstream right now.


What you are doing seems to me to be an extension of SysBus - you are
defining the same interfaces as sysbus but also adding some machine
specifics wiring info. I think it's a candidate for QOM inheritance to
avoid having to dup all the sysbus device models for both regular
sysbus and platform bus. I think your functionality should be added as
one of

1: an interface that can be added to sysbus devices
2: a new abstraction that inherits from SYS_BUS_DEVICE
3: just new features to the sysbus core.

Then both of us are using the same suite of device models and the
differences between our approaches are limited to machine level
instantiation method. My gut says #2 is the cleanest.

The more I think about it the more I believe #3 would be the cleanest.
The only thing my platform devices do in addition to sysbus devices is
that they expose qdev properties to give mapping code hints where a
device wants to be mapped.

If we just add qdev properties for all the possible hints in generic
sysbus core code, we should be able to automatically convert all devices
into dynamically allocatable devices. Whether they actually do get
mapped, and the generation of device tree chunks, still stays in the
machine file's court.

As discussed offline with Alex, one issue I see is that this would be
encouraging people to add more devices to an artificial global bus in
/machine/unassigned that we've been trying to obsolete, rather than
sitting down and creating an e500 SoC object as a start. Maybe we
should start generating a list of shame for 2.1. ;)
Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would
make me much happier than using SysBus as is.

The pure QOM approach would be link properties instead of a bus, but
then the machine needs to know how many slots there shall be in
advance. Note that the docking procedure is always initiated from the
realizing device, whether bus or no bus.


So my goal is to make life easy for users, not to fulfill some wet 
Anthony dreams :). And as a user, I want to be able to say -device foo 
and have that device created, like I do with PCI devices today.


There are 2 approaches to this that I can see:

  1) A new special type of bus that allows for dynamic allocation and 
that knows a flat numbering scheme
  2) Individual devices that get attached to whatever the machine file 
thinks makes it happy (basically emulating the above bus, but with more 
flexibility)


I implemented option 1 with the Platform bus. It's basically an 
abstraction of the Sys/AXI/AMBA idea but only with a single bus 
implementation, as everything else would just be ridiculously redundant 
(and if necessary could be implemented as a subclass on top of the 
bridge device). People didn't like it.


I implemented option 2 with the Platform devices - this patch set. 
People didn't like it because it duplicates SysBus devices - and it does.


I'm implementing 2 as an add-on of SysBusDevice now which to me really 
isn't too much different from a dangling QOM device.


Linking devices by force (set IRQ0 to MPIC IRQ 32, map region0 to 
physical address space offset 0x12300) is a nice thing to have for 
people who know what they're doing. That matches probably about 0.1% 
of our user base - I personally am not included there. We *have* to have 
a mechanism to make device creation easy for users if we want to have any.



Alex




Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Alexander Graf


On 27.06.14 12:54, Peter Crosthwaite wrote:

On Fri, Jun 27, 2014 at 8:30 PM, Andreas Färber afaer...@suse.de wrote:

Am 26.06.2014 14:01, schrieb Alexander Graf:

On 20.06.14 08:43, Peter Crosthwaite wrote:

On Wed, Jun 4, 2014 at 10:28 PM, Alexander Graf ag...@suse.de wrote:

Platforms without ISA and/or PCI have had a seriously hard time in
the dynamic
device creation world of QEMU. Devices on these were modeled as
SysBus devices
which can only be instantiated in machine files, not through -device.

Why is that so?

Well, SysBus is trying to be incredibly generic. It allows you to
plug any
interrupt sender into any other interrupt receiver. It allows you to map
a device's memory regions into any other random memory region. All of
that
only works from C code or via really complicated command line
arguments under
discussion upstream right now.


What you are doing seems to me to be an extension of SysBus - you are
defining the same interfaces as sysbus but also adding some machine
specifics wiring info. I think it's a candidate for QOM inheritance to
avoid having to dup all the sysbus device models for both regular
sysbus and platform bus. I think your functionality should be added as
one of

1: an interface that can be added to sysbus devices
2: a new abstraction that inherits from SYS_BUS_DEVICE
3: just new features to the sysbus core.

Then both of us are using the same suite of device models and the
differences between our approaches are limited to machine level
instantiation method. My gut says #2 is the cleanest.

The more I think about it the more I believe #3 would be the cleanest.
The only thing my platform devices do in addition to sysbus devices is
that it exposes qdev properties to give mapping code hints where a
device wants to be mapped.

If we just add qdev properties for all the possible hints in generic
sysbus core code, we should be able to automatically convert all devices
into dynamically allocatable devices. Whether they actually do get
mapped and the generation of device tree chunks still stays in the
machine file's court.

As discussed offline with Alex, one issue I see is that this would be
encouraging people to add more devices to an artificial global bus in
/machine/unassigned that we've been trying to obsolete, rather than
sitting down and creating an e500 SoC object as a start. Maybe we
should start generating a list of shame for 2.1. ;)
Instantiating a new [Sys/AXI/AMBA/...]Bus inside that SoC object would
make me much happier than using SysBus as is.


Do you mean address_space_memory (as used by sysbus_mmio_map)? We all
hate that global singleton, but can we decouple it from sysbus which
is not the root cause of that problem? sysbus_mmio_map usages just
need to be replaced with sysbus_mmio_get_region and you can create
whatever hierarchy you want using unchanged sysbus devices.

Even if we phase out the global singleton and the SysBus bus, the
sysbus device abstraction is still sound and should be usable
busless. Then there's no need for a tree-wide change to implement Alex's
feature for all devs (assuming his plugger can be made to work
hintless?).


The plugger works just fine when you don't give hints - then it's up to 
dynamic allocation (same as PCI).
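
As a rough illustration of hinted versus dynamic placement, here is a minimal sketch in C. It is not the plugger from this series: PlatformWindow, PLATFORM_DYNAMIC and platform_place_region() are hypothetical names, and the 4 KiB alignment is an assumption. An explicit hint is honoured if it fits the window; without a hint, the next free aligned offset is handed out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no hint given, allocate dynamically". */
#define PLATFORM_DYNAMIC UINT64_MAX

/* Hypothetical allocator state for one platform MMIO window. */
typedef struct {
    uint64_t base;      /* guest physical start of the window */
    uint64_t size;      /* length of the window in bytes */
    uint64_t next_free; /* offset of the next unallocated byte */
} PlatformWindow;

/*
 * Choose an offset inside the window for a region of `len` bytes.
 * With a hint, the hint is used verbatim if the region fits.
 * Without one, a simple bump allocator with 4 KiB alignment is used.
 * This sketch does not check hinted regions for overlap with
 * dynamically placed ones.
 */
static bool platform_place_region(PlatformWindow *w, uint64_t hint,
                                  uint64_t len, uint64_t *offset)
{
    if (len == 0 || len > w->size) {
        return false;
    }

    if (hint != PLATFORM_DYNAMIC) {
        if (hint > w->size - len) {
            return false;
        }
        *offset = hint;
        return true;
    }

    uint64_t start = (w->next_free + 0xfffULL) & ~0xfffULL;
    if (start > w->size - len) {
        return false;
    }
    *offset = start;
    w->next_free = start + len;
    return true;
}
```

With a window of 0x100000 bytes, two unhinted 0x100-byte regions land at offsets 0 and 0x1000, while a region hinted at 0x12300 is placed exactly there.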


Yes I fully agree with you here.


Alex




Re: [Qemu-devel] [PULL 03/10] pc-bios/s390-ccw: handle different sector sizes

2014-06-27 Thread Alexander Graf


On 27.06.14 13:25, Cornelia Huck wrote:

From: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com

Use the virtio device's configuration to figure out the disk geometry
and use a sector size based upon the layout.

[CH: s/SECTOR_SIZE/MAX_SECTOR_SIZE/g]
Acked-by: Christian Borntraeger borntrae...@de.ibm.com
Signed-off-by: Eugene (jno) Dvurechenski j...@linux.vnet.ibm.com
Signed-off-by: Jens Freimann jf...@linux.vnet.ibm.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
---
  pc-bios/s390-ccw/bootmap.c  |   12 +++---
  pc-bios/s390-ccw/s390-ccw.h |2 +-
  pc-bios/s390-ccw/virtio.c   |   96 ---
  pc-bios/s390-ccw/virtio.h   |   48 ++
  4 files changed, 147 insertions(+), 11 deletions(-)

diff --git a/pc-bios/s390-ccw/bootmap.c b/pc-bios/s390-ccw/bootmap.c
index c216030..fa2ca26 100644
--- a/pc-bios/s390-ccw/bootmap.c
+++ b/pc-bios/s390-ccw/bootmap.c
@@ -10,6 +10,7 @@
  
  #include "s390-ccw.h"

  #include "bootmap.h"
+#include "virtio.h"
  
  /* #define DEBUG_FALLBACK */
  
@@ -22,7 +23,8 @@

  #endif
  
  /* Scratch space */

-static uint8_t sec[SECTOR_SIZE] __attribute__((__aligned__(SECTOR_SIZE)));
+static uint8_t sec[MAX_SECTOR_SIZE]
+__attribute__((__aligned__(MAX_SECTOR_SIZE)));
  
  typedef struct ResetInfo {

  uint32_t ipl_mask;
@@ -99,7 +101,7 @@ static inline bool unused_space(const void *p, unsigned int size)
  
  static int zipl_load_segment(ComponentEntry *entry)

  {
-const int max_entries = (SECTOR_SIZE / sizeof(ScsiBlockPtr));
+const int max_entries = (MAX_SECTOR_SIZE / sizeof(ScsiBlockPtr));


Is this really safe to increase? Doesn't max_entries depend on the real 
sector size?
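
To illustrate the concern, a minimal sketch: if a pointer table on disk occupies one real sector, the number of entries it holds follows from the real sector size, not from the size of the scratch buffer. BlockPtr is a stand-in whose layout is an assumption (it is not ScsiBlockPtr), and the value 4096 for MAX_SECTOR_SIZE is illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound used to size the scratch buffer; illustrative value. */
#define MAX_SECTOR_SIZE 4096

/* Stand-in for an on-disk block pointer; the real layout may differ. */
typedef struct {
    uint64_t blockno;
} BlockPtr;

/* Number of pointer entries that fit into one sector of the given size. */
static size_t entries_per_sector(size_t sector_size)
{
    return sector_size / sizeof(BlockPtr);
}
```

On a disk with 512-byte sectors, entries_per_sector(512) is much smaller than entries_per_sector(MAX_SECTOR_SIZE), so a loop bounded by the latter would walk past the table that was actually read.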



Alex




Re: [Qemu-devel] [PATCH v11 1/3] sPAPR: Implement EEH RTAS calls

2014-06-27 Thread Alexander Graf


On 27.06.14 11:53, Gavin Shan wrote:

On Thu, Jun 26, 2014 at 12:46:50PM +0200, Alexander Graf wrote:

On 26.06.14 12:43, Gavin Shan wrote:

On Thu, Jun 26, 2014 at 12:30:16PM +0200, Alexander Graf wrote:

On 26.06.14 03:35, Gavin Shan wrote:

The emulation for EEH RTAS requests from guest isn't covered
by QEMU yet and the patch implements them.

The patch defines constants used by EEH RTAS calls and adds
callback sPAPRPHBClass::eeh_handler, which is going to be used
this way:

1. RTAS calls are received in spapr_pci.c, sanity check is done
there.
2. RTAS handlers handle what they can. If there is something it
cannot handle and sPAPRPHBClass::eeh_handler callback is defined,
it is called.
3. sPAPRPHBClass::eeh_handler is only implemented for VFIO now. It
does ioctl() to the IOMMU container fd to complete the call. Error
codes from that ioctl() are transferred back to the guest.

Signed-off-by: Gavin Shan gws...@linux.vnet.ibm.com
---
  hw/ppc/spapr_pci.c  | 240 
  include/hw/pci-host/spapr.h |   7 ++
  include/hw/ppc/spapr.h  |  33 ++
  3 files changed, 280 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 131434b..8712051 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -422,6 +422,233 @@ static void rtas_ibm_query_interrupt_source_number(PowerPCCPU *cpu,
  rtas_st(rets, 2, 1);/* 0 == level; 1 == edge */
  }
+static int rtas_handle_eeh_request(sPAPREnvironment *spapr,
+   uint64_t buid, uint32_t req, uint32_t opt)
+{
+sPAPRPHBState *sphb = spapr_find_phb(spapr, buid);
+sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+
+if (!sphb || !info->eeh_handler) {
+return -ENOENT;
+}
+
+return info->eeh_handler(sphb, req, opt);
+}
+
+static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args, uint32_t nret,
+target_ulong rets)
+{
+uint32_t addr, option;
+uint64_t buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+int ret;
+
+if ((nargs != 4) || (nret != 1)) {
+goto param_error_exit;
+}
+
+addr = rtas_ld(args, 0);
+option = rtas_ld(args, 3);
+switch (option) {
+case RTAS_EEH_ENABLE:
+if (!find_dev(spapr, buid, addr)) {
+goto param_error_exit;
+}
+break;
+case RTAS_EEH_DISABLE:
+case RTAS_EEH_THAW_IO:
+case RTAS_EEH_THAW_DMA:
+break;
+default:
+goto param_error_exit;
+}
+
+ret = rtas_handle_eeh_request(spapr, buid,
+  RTAS_EEH_REQ_SET_OPTION, option);
+if (ret >= 0) {
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+return;
+}
+
+param_error_exit:
+rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
+   sPAPREnvironment *spapr,
+   uint32_t token, uint32_t nargs,
+   target_ulong args, uint32_t nret,
+   target_ulong rets)
+{
+uint32_t addr, option;
+uint64_t buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+sPAPRPHBState *sphb = spapr_find_phb(spapr, buid);
+sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+PCIDevice *pdev;
+
+if (!sphb || !info->eeh_handler) {
+goto param_error_exit;
+}
+
+if ((nargs != 4) || (nret != 2)) {
+goto param_error_exit;
+}
+
+addr = rtas_ld(args, 0);
+option = rtas_ld(args, 3);
+if (option != RTAS_GET_PE_ADDR && option != RTAS_GET_PE_MODE) {
+goto param_error_exit;
+}
+
+pdev = find_dev(spapr, buid, addr);
+if (!pdev) {
+goto param_error_exit;
+}
+
+/*
+ * For now, we always have bus level PE whose address
+ * has format 00BBSS00. The guest OS might regard
+ * PE address 0 as invalid. We avoid that simply by
+ * extending it with one.
+ */
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+if (option == RTAS_GET_PE_ADDR) {
+rtas_st(rets, 1, (pci_bus_num(pdev->bus) << 16) + 1);
+} else {
+rtas_st(rets, 1, RTAS_PE_MODE_SHARED);
+}
+
+return;
+
+param_error_exit:
+rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
+}
+
+static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
+sPAPREnvironment *spapr,
+uint32_t token, uint32_t nargs,
+target_ulong args, uint32_t nret,
+target_ulong rets)
+{
+uint64_t buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2);
+int ret;
+
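
The two bit manipulations in the quoted patch, assembling the 64-bit BUID from two 32-bit RTAS arguments and deriving a non-zero bus-level PE address, can be sketched in isolation. The helper names make_buid() and bus_pe_addr() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Combine the high and low 32-bit halves into one 64-bit BUID. */
static uint64_t make_buid(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Bus-level PE address: the bus number goes into bits 16..23, and 1 is
 * added so that bus 0 does not produce PE address 0, which a guest
 * might regard as invalid.
 */
static uint32_t bus_pe_addr(uint8_t bus_num)
{
    return ((uint32_t)bus_num << 16) + 1;
}
```

The cast to uint64_t before the shift matters: shifting a 32-bit value by 32 is undefined in C.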

Re: [Qemu-devel] [PATCH V3] qemu-img create: add 'nocow' option

2014-06-27 Thread Stefan Hajnoczi
On Mon, Jun 23, 2014 at 05:17:02PM +0800, Chunyan Liu wrote:
 Add 'nocow' option so that users could have a chance to set NOCOW flag to
 newly created files. It's useful on btrfs file system to enhance performance.
 
 Btrfs has low performance when hosting VM images, even more so when the guests
 in those VMs are also using btrfs as their file system. One way to mitigate
 this poor performance is to turn off the COW attribute on VM files. Generally,
 there are two ways to turn off COW on btrfs: a) by mounting the fs with
 nodatacow, so that all newly created files are NOCOW; b) per file, by adding
 the NOCOW file attribute. This can only be done on empty or new files.
 
 This patch takes the second approach: depending on the option, it adds NOCOW
 per file.
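
For reference, the per-file approach described above can be sketched as follows. This is a minimal, Linux-only illustration and not the code from the patch: set_nocow() is a hypothetical helper, while FS_IOC_GETFLAGS, FS_IOC_SETFLAGS and FS_NOCOW_FL come from <linux/fs.h>. On file systems that do not support the flag the ioctl simply fails and the negative errno is returned.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Try to set the NOCOW attribute on an already opened, still empty file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int set_nocow(int fd)
{
    int attr;

    /* Read the current inode flags so the other bits are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &attr) != 0) {
        return -errno;
    }

    attr |= FS_NOCOW_FL;

    if (ioctl(fd, FS_IOC_SETFLAGS, &attr) != 0) {
        return -errno;
    }
    return 0;
}
```

A caller that treats NOCOW purely as an optimisation could ignore the return value and continue creating the image.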
 
 For most block drivers, the file is created in raw-posix.c, so the NOCOW flag
 ioctl only needs to be issued there.
 
 There are some exceptions, such as block/vpc.c and block/vdi.c, which create
 the file by calling qemu_open directly. For those, the same NOCOW flag ioctl
 is issued in each of them separately.
 
 Signed-off-by: Chunyan Liu cy...@suse.com
 ---
 Changes to v2:
   * based on QemuOpts instead of old QEMUOptionParameters
   * add nocow description in man page and html doc
 
   Old v2 is here:
   http://lists.gnu.org/archive/html/qemu-devel/2013-11/msg02429.html
 
 ---
  block/cow.c   |  5 +
  block/qcow.c  |  5 +
  block/qcow2.c |  5 +
  block/qed.c   | 11 ---
  block/raw-posix.c | 25 +
  block/vdi.c   | 29 +
  block/vhdx.c  |  5 +
  block/vmdk.c  | 11 ---
  block/vpc.c   | 29 +
  include/block/block_int.h |  1 +
  qemu-doc.texi | 16 
  qemu-img.texi | 16 
  12 files changed, 152 insertions(+), 6 deletions(-)

Are you sure it's necessary to touch all image formats in order to pass
through the nocow option?  Looking at bdrv_img_create() I think it will
work without touching all image formats since both drv->create_opts and
proto_drv->create_opts are appended:

void bdrv_img_create(const char *filename, const char *fmt,
 const char *base_filename, const char *base_fmt,
 char *options, uint64_t img_size, int flags,
 Error **errp, bool quiet)
{
QemuOptsList *create_opts = NULL;
...
create_opts = qemu_opts_append(create_opts, drv->create_opts);
create_opts = qemu_opts_append(create_opts, proto_drv->create_opts);

/* Create parameter list with default values */
opts = qemu_opts_create(create_opts, NULL, 0, &error_abort);
qemu_opt_set_number(opts, BLOCK_OPT_SIZE, img_size);

/* Parse -o options */
if (options) {
if (qemu_opts_do_parse(opts, options, NULL) != 0) {
error_setg(errp, "Invalid options for file format '%s'", fmt);
goto out;
}
}
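
The effect of appending both lists can be shown with a small standalone sketch. It is not QEMU's qemu_opts_append(): OptDesc, opts_append() and opts_has() are hypothetical, and skipping duplicate names is an assumption made for this illustration. The point is only that an option declared by the protocol driver becomes valid for every format layered on top of it.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option descriptor; a NULL name terminates a list. */
typedef struct {
    const char *name;
} OptDesc;

/* Count the entries of a NULL-terminated list (NULL list counts as empty). */
static size_t opts_count(const OptDesc *list)
{
    size_t n = 0;
    while (list && list[n].name) {
        n++;
    }
    return n;
}

/* Return true if the list contains an option with the given name. */
static bool opts_has(const OptDesc *list, const char *name)
{
    for (size_t i = 0; list && list[i].name; i++) {
        if (strcmp(list[i].name, name) == 0) {
            return true;
        }
    }
    return false;
}

/* Grow dst (may be NULL) by the entries of extra, skipping duplicates. */
static OptDesc *opts_append(OptDesc *dst, const OptDesc *extra)
{
    size_t n = opts_count(dst);
    size_t m = opts_count(extra);
    OptDesc *out = realloc(dst, (n + m + 1) * sizeof(*out));

    if (!out) {
        abort(); /* out of memory; keep the sketch simple */
    }
    out[n].name = NULL; /* terminate, also covers dst == NULL */
    for (size_t i = 0; i < m; i++) {
        if (!opts_has(out, extra[i].name)) {
            out[n++] = extra[i];
            out[n].name = NULL;
        }
    }
    return out;
}
```

Appending a format list {size, cluster_size} and a protocol list {size, nocow} yields a merged list in which nocow is accepted without the format driver declaring it.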




Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Paolo Bonzini

On 27/06/2014 13:24, Alexander Graf wrote:


I think we can all agree that the sysbus bus is not a bus per se. So
conceptually, what's the difference between a device attached to a
non-bus and a device not attached to a bus at all? And why can't we
convert sysbus to not be a bus anymore?


I think there is no difference, and I don't think moving out of sysbus 
is really a goal that we need to pursue.


I agree with Andreas that having a SoC object as father of sysbus 
(instead of nothing at all) would be slightly better.


We could also make TYPE_MACHINE a subclass of TYPE_DEVICE, to have an 
obvious place for this SoC object.


Paolo



Re: [Qemu-devel] [PATCH trivial v2] block.c: Add return value for bdrv_append_temp_snapshot() to avoid incorrect failure processing issue

2014-06-27 Thread Stefan Hajnoczi
On Tue, Jun 24, 2014 at 01:01:52PM +0200, Markus Armbruster wrote:
 Kevin Wolf kw...@redhat.com writes:
 
  On 23.06.2014 at 17:28, Chen Gang wrote:
  When a failure occurs, 'ret' needs to be set; otherwise the function may
  return 0 and indicate success. error_propagate() also needs to be called
  only once within a function.
  
  It is abnormal for bdrv_append_temp_snapshot() to provide no return value
  yet still set errp when an error occurs, although it holds a return value
  internally.
  
  So let bdrv_append_temp_snapshot() pass its internal return value to the
  caller, which makes everything behave normally and fixes the issue too.
  
  Signed-off-by: Chen Gang gang.chen.5...@gmail.com
 
  What does this fix?
 
 It fixes the return value of bdrv_open() when
 bdrv_append_temp_snapshot() fails.  Before this patch, it returns a
 positive value, which is wrong.  After the patch, it returns the
 negative error code bdrv_append_temp_snapshot() now returns.

Exactly.  I asked for the -errno return because otherwise bdrv_open()
would have no accurate errno.
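
The convention being discussed, where a helper returns a negative errno in addition to setting an error object and the caller passes that value on, can be sketched like this. The function names and the string-based error pointer are hypothetical simplifications, not the block layer's real signatures.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: on failure it sets *errp and returns a negative
 * errno value; on success it returns 0 and leaves *errp untouched.
 */
static int append_temp_snapshot(int simulate_failure, const char **errp)
{
    if (simulate_failure) {
        *errp = "could not create temporary snapshot";
        return -EIO;
    }
    return 0;
}

/*
 * Hypothetical caller: because the helper returns its own error code,
 * the caller can propagate an accurate negative errno instead of
 * returning a stale or positive value.
 */
static int open_image(int simulate_failure, const char **errp)
{
    int ret = append_temp_snapshot(simulate_failure, errp);

    if (ret < 0) {
        return ret;
    }
    return 0;
}
```

Without the return value the caller would only know that errp was set and would have to invent an error code.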

Stefan




Re: [Qemu-devel] Reverse execution and deterministic replay

2014-06-27 Thread Peter Maydell
On 27 June 2014 12:31, Pavel Dovgaluk pavel.dovga...@ispras.ru wrote:
 I forgot about one more issue.
 When qemu stops execution on the breakpoint, the icount
 is decreased by the number of instructions in the block.
 But in this case the last instruction is not executed and
 should not affect the counter.

Yes, indeed, that's the sort of edge case bug we should fix.
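
A minimal sketch of the accounting issue, assuming the counter is charged for a whole block up front. The function names are hypothetical and this is not QEMU's icount code; it only shows that instructions which were not executed need to be credited back when execution stops early.

```c
#include <stdint.h>

/* Charge the instruction budget for a whole block before running it. */
static int64_t charge_block(int64_t budget, uint32_t insns_in_block)
{
    return budget - (int64_t)insns_in_block;
}

/*
 * If execution stopped after `executed` instructions (for example at a
 * breakpoint on a later instruction), credit back the ones that never ran.
 */
static int64_t refund_unexecuted(int64_t budget, uint32_t insns_in_block,
                                 uint32_t executed)
{
    return budget + (int64_t)(insns_in_block - executed);
}
```

For a five-instruction block with a breakpoint on the last instruction, only four instructions should count: a budget of 100 becomes 95 after charging and 96 after the refund.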

-- PMM



Re: [Qemu-devel] [PATCH 0/5] Platform device support

2014-06-27 Thread Peter Maydell
On 27 June 2014 12:48, Paolo Bonzini pbonz...@redhat.com wrote:
 We could also make TYPE_MACHINE a subclass of TYPE_DEVICE, to have an
 obvious place for this SoC object.

Why isn't TYPE_MACHINE a subclass of TYPE_DEVICE anyway?

thanks
-- PMM



[Qemu-devel] [PULL 14/32] target-ppc: Remove unused gen_qemu_ld8s()

2014-06-27 Thread Alexander Graf
From: Peter Maydell peter.mayd...@linaro.org

The gen_qemu_ld8s() function is unused; remove it.

Signed-off-by: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 target-ppc/translate.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index b501655..b23933f 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -2662,11 +2662,6 @@ static inline void gen_qemu_ld8u(DisasContext *ctx, TCGv arg1, TCGv arg2)
 tcg_gen_qemu_ld8u(arg1, arg2, ctx->mem_idx);
 }
 
-static inline void gen_qemu_ld8s(DisasContext *ctx, TCGv arg1, TCGv arg2)
-{
-tcg_gen_qemu_ld8s(arg1, arg2, ctx->mem_idx);
-}
-
 static inline void gen_qemu_ld16u(DisasContext *ctx, TCGv arg1, TCGv arg2)
 {
 TCGMemOp op = MO_UW | ctx->default_tcg_memop_mask;
-- 
1.8.1.4




[Qemu-devel] [PULL 03/32] linux-user: Identify Addition Hardware Capabilities for PowerPC

2014-06-27 Thread Alexander Graf
From: Tom Musta tommu...@gmail.com

Add VSX, DFP and ISA 2.06 to the bits identified in the AT_HWCAP
entry of the AUXV.

Signed-off-by: Tom Musta tommu...@gmail.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 linux-user/elfload.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 64d23fa..9a41882 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -749,6 +749,8 @@ static uint32_t get_elf_hwcap(void)
Altivec/FP/SPE support.  Anything else is just a bonus.  */
 #define GET_FEATURE(flag, feature)  \
 do { if (cpu->env.insns_flags & flag) { features |= feature; } } while (0)
+#define GET_FEATURE2(flag, feature)  \
+do { if (cpu->env.insns_flags2 & flag) { features |= feature; } } while (0)
 GET_FEATURE(PPC_64B, QEMU_PPC_FEATURE_64);
 GET_FEATURE(PPC_FLOAT, QEMU_PPC_FEATURE_HAS_FPU);
 GET_FEATURE(PPC_ALTIVEC, QEMU_PPC_FEATURE_HAS_ALTIVEC);
@@ -757,7 +759,13 @@ static uint32_t get_elf_hwcap(void)
 GET_FEATURE(PPC_SPE_DOUBLE, QEMU_PPC_FEATURE_HAS_EFP_DOUBLE);
 GET_FEATURE(PPC_BOOKE, QEMU_PPC_FEATURE_BOOKE);
 GET_FEATURE(PPC_405_MAC, QEMU_PPC_FEATURE_HAS_4xxMAC);
+GET_FEATURE2(PPC2_DFP, QEMU_PPC_FEATURE_HAS_DFP);
+GET_FEATURE2(PPC2_VSX, QEMU_PPC_FEATURE_HAS_VSX);
+GET_FEATURE2((PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206 |
+  PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206),
+  QEMU_PPC_FEATURE_ARCH_2_06);
 #undef GET_FEATURE
+#undef GET_FEATURE2
 
 return features;
 }
-- 
1.8.1.4
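
The flag-to-hwcap mapping used by the patch above can be tried out in isolation. In this sketch all bit values and the INSN2_* / HWCAP_* names are made up for illustration; only the shape of the GET_FEATURE2 macro follows the patch. Note that when several flags are OR-ed together, the test is true if any one of them is set.

```c
#include <stdint.h>

/* Hypothetical instruction-set flag bits (not QEMU's real values). */
#define INSN2_DFP       (1u << 0)
#define INSN2_VSX       (1u << 1)
#define INSN2_ISA206_A  (1u << 2)
#define INSN2_ISA206_B  (1u << 3)

/* Hypothetical hwcap bits (not the real AT_HWCAP values). */
#define HWCAP_DFP       (1u << 10)
#define HWCAP_VSX       (1u << 7)
#define HWCAP_ARCH_2_06 (1u << 8)

/* Translate CPU instruction flags into the hwcap word. */
static uint32_t hwcap_from_flags2(uint64_t insns_flags2)
{
    uint32_t features = 0;

#define GET_FEATURE2(flag, feature) \
    do { if (insns_flags2 & (flag)) { features |= (feature); } } while (0)
    GET_FEATURE2(INSN2_DFP, HWCAP_DFP);
    GET_FEATURE2(INSN2_VSX, HWCAP_VSX);
    /* True if either of the two ISA 2.06 flags is present. */
    GET_FEATURE2(INSN2_ISA206_A | INSN2_ISA206_B, HWCAP_ARCH_2_06);
#undef GET_FEATURE2

    return features;
}
```

Parenthesising the macro arguments keeps an OR-ed mask from interacting with the `&` operator's precedence.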



