date:20160228

Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay

2016-02-28 Thread Kevin Wolf

Am 29.02.2016 um 08:03 hat Pavel Dovgalyuk geschrieben:
> > From: Kevin Wolf [mailto:kw...@redhat.com]
> > Am 25.02.2016 um 10:06 hat Pavel Dovgalyuk geschrieben:
> > > There is one problem with flush event - callbacks for flush are called for
> > > all layers and I couldn't synchronize them correctly yet.
> > > I'll probably have to add new callback to block driver, which handles
> > > flush request for the whole stack of the drivers.
> > 
> > Flushes should be treated more or less the same a writes, I think.
> 
> But bdrv_co_flush has different structure and does not allow synchronization
> of the top layer. Here is the patch for fixing this:
> 
> diff --git a/block/io.c b/block/io.c
> index a69bfc4..9e05dfe 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2369,6 +2369,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>  }
>  
>  tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
> +/* Write back all layers by calling one driver function */
> +if (bs->drv->bdrv_co_flush) {
> +ret = bs->drv->bdrv_co_flush(bs);
> +goto out;
> +}
> +
>  /* Write back cached data to the OS even with cache=unsafe */
>  BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
>  if (bs->drv->bdrv_co_flush_to_os) {
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 9ef823a..9cc2c58 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -176,6 +176,13 @@ struct BlockDriver {
>  int (*bdrv_inactivate)(BlockDriverState *bs);
>  
>  /*
> + * Flushes all data for all layers by calling bdrv_co_flush for 
> underlying
> + * layers, if needed. This function is needed for deterministic
> + * synchronization of the flush finishing callback.
> + */
> +int coroutine_fn (*bdrv_co_flush)(BlockDriverState *bs);
> +
> +/*
>   * Flushes all data that was already written to the OS all the way down 
> to
>   * the disk (for example raw-posix calls fsync()).
>   */
> 
> 
> Then I'll have just to implement bdrv_co_flush for blkreplay layer.
> This callback will manually invoke bdrv_co_flush for underlying layer
> allowing synchronization of finishing callback.

The real problem is that the lower layer (bs->file) is automatically
flushed, introducing non-determinism outside of the protecting blkreplay
driver, correct?

Hm...

My first thought was maybe introduce a bool BlockDriver.no_flush_parent
which avoids the automatic flush for the parent node. Then you could use
.bdrv_co_flush_to_os() like all other drivers do.

But you're right that your callback wouldn't just flush to the OS, but
you would have to implement the flush on the lower layer and the
handling of BDRV_O_NO_FLUSH in the blkreplay driver, so introducing a
new callback for a "whole" flush might indeed be cleaner.

I don't particularly like it, but I don't see a better option, so it's
okay with me.

Kevin

Re: [Qemu-devel] [PATCH v3] net: netmap: probe netmap interface for virtio-net header

2016-02-28 Thread Jason Wang



On 02/24/2016 06:30 PM, Vincenzo Maffione wrote:
> Previous implementation of has_ufo, has_vnet_hdr, has_vnet_hdr_len, etc.
> did not really probe for virtio-net header support for the netmap
> interface attached to the backend. These callbacks were correct for
> VALE ports, but incorrect for hardware NICs, pipes, monitors, etc.
>
> This patch fixes the implementation to work properly with all kinds
> of netmap ports.
>
> Signed-off-by: Vincenzo Maffione 
> ---
>  net/netmap.c | 59 ++-
>  1 file changed, 38 insertions(+), 21 deletions(-)
>

Applied to -net.

Thanks

Re: [Qemu-devel] [PATCH v2 2/2] filter-buffer: Add status_changed callback processing

2016-02-28 Thread Hailiang Zhang


On 2016/2/29 15:27, Jason Wang wrote:



On 02/29/2016 09:46 AM, zhanghailiang wrote:

While the status of filter-buffer changing from 'on' to 'off',
it need to release all the buffered packets, and delete the related
timer, while switch from 'off' to 'on', it need to resume the release
packets timer.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
Cc: Yang Hongyang 
---
v2:
- New patch
---
  net/filter-buffer.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 12ad2e3..ed3f19e 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -124,6 +124,24 @@ static void filter_buffer_setup(NetFilterState *nf, Error 
**errp)
  }
  }

+static void filter_buffer_status_changed(NetFilterState *nf, Error **errp)
+{
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (!strcmp(nf->status, "off")) {
+if (s->interval) {
+timer_del(&s->release_timer);
+}
+filter_buffer_flush(nf);
+} else {
+if (s->interval) {
+timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
+filter_buffer_release_timer, nf);
+timer_mod(&s->release_timer,
+qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval);
+}


The code looks duplicated with filter_buffer_setup().



Yea, extract them into a new helper ? filter_buffer_setup_timer() ?


+}
+}
  static void filter_buffer_class_init(ObjectClass *oc, void *data)
  {
  NetFilterClass *nfc = NETFILTER_CLASS(oc);
@@ -131,6 +149,7 @@ static void filter_buffer_class_init(ObjectClass *oc, void 
*data)
  nfc->setup = filter_buffer_setup;
  nfc->cleanup = filter_buffer_cleanup;
  nfc->receive_iov = filter_buffer_receive_iov;
+nfc->status_changed = filter_buffer_status_changed;
  }

  static void filter_buffer_get_interval(Object *obj, Visitor *v,



.

Re: [Qemu-devel] [PATCH v2 1/2] filter: Add 'status' property for filter object

2016-02-28 Thread Hailiang Zhang


On 2016/2/29 15:26, Jason Wang wrote:



On 02/29/2016 09:46 AM, zhanghailiang wrote:

With this property, users can control if this filter is 'on'
or 'off'. The default behavior for filter is 'on'.

For some types of filters, they may need to react to status changing,
So here, we introduced status changing callback/notifier for filter class.

We will skip the disabled ('off') filter when delivering packets in net layer.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
Cc: Yang Hongyang 
---
v2:
- Split the processing of buffer-filter into a new patch (Jason)
- Use 'status' instead of 'enabled' to store the filter state (Jason)
- Rename FilterDisable() callback to FilterStatusChanged(Jason)
---


Thanks, looks good, just few nits.


  include/net/filter.h |  4 
  net/filter.c | 42 ++
  qemu-options.hx  |  4 +++-
  3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5639976..ebef0dc 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -36,12 +36,15 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
 int iovcnt,
 NetPacketSent *sent_cb);

+typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
+
  typedef struct NetFilterClass {
  ObjectClass parent_class;

  /* optional */
  FilterSetup *setup;
  FilterCleanup *cleanup;
+FilterStatusChanged *status_changed;
  /* mandatory */
  FilterReceiveIOV *receive_iov;
  } NetFilterClass;
@@ -55,6 +58,7 @@ struct NetFilterState {
  char *netdev_id;
  NetClientState *netdev;
  NetFilterDirection direction;
+char *status;


Let's use bool instead.



Er, then status=true means 'on' ? false means 'off' ?
That looks odd.  What about using 'bool status_on' ?



.

Re: [Qemu-devel] [PATCHv2 5/7] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 06:06 PM, David Gibson wrote:

Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
 1. automatically sizes its DMA window to match the host IOMMU
 2. checks if the attached VFIO container is backed by the
VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.  It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.

(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.

It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
reduces it to just a stub for backwards compatibility.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 




--
Alexey

Re: [Qemu-devel] [PATCHv2 3/7] spapr_pci: Eliminate class callbacks

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 06:06 PM, David Gibson wrote:

The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 




--
Alexey

Re: [Qemu-devel] [PATCHv2 4/7] spapr_pci: Allow EEH on spapr-pci-host-bridge

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 06:06 PM, David Gibson wrote:

Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 




--
Alexey

Re: [Qemu-devel] [PATCHv2 1/7] vfio: Start improving VFIO/EEH interface

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 06:06 PM, David Gibson wrote:

At present the code handling IBM's Enhanced Error Handling (EEH) interface
on VFIO devices operates by bypassing the usual VFIO logic with
vfio_container_ioctl().  That's a poorly designed interface with unclear
semantics about exactly what can be operated on.

In particular it operates on a single vfio container internally (hence the
name), but takes an address space and group id, from which it deduces the
container in a rather roundabout way.  groupids are something that code
outside vfio shouldn't even be aware of.

This patch creates new interfaces for EEH operations.  Internally we
have vfio_eeh_container_op() which takes a VFIOContainer object
directly.  For external use we have vfio_eeh_as_ok() which determines
if an AddressSpace is usable for EEH (at present this means it has a
single container and at most a single group attached), and
vfio_eeh_as_op() which will perform an operation on an AddressSpace in
the unambiguous case, and otherwise returns an error.

This interface still isn't great, but it's enough of an improvement to
allow a number of cleanups in other places.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 




--
Alexey

Re: [Qemu-devel] [PATCH v2 2/2] filter-buffer: Add status_changed callback processing

2016-02-28 Thread Jason Wang



On 02/29/2016 09:46 AM, zhanghailiang wrote:
> While the status of filter-buffer changing from 'on' to 'off',
> it need to release all the buffered packets, and delete the related
> timer, while switch from 'off' to 'on', it need to resume the release
> packets timer.
>
> Signed-off-by: zhanghailiang 
> Cc: Jason Wang 
> Cc: Yang Hongyang 
> ---
> v2:
> - New patch
> ---
>  net/filter-buffer.c | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/net/filter-buffer.c b/net/filter-buffer.c
> index 12ad2e3..ed3f19e 100644
> --- a/net/filter-buffer.c
> +++ b/net/filter-buffer.c
> @@ -124,6 +124,24 @@ static void filter_buffer_setup(NetFilterState *nf, 
> Error **errp)
>  }
>  }
>  
> +static void filter_buffer_status_changed(NetFilterState *nf, Error **errp)
> +{
> +FilterBufferState *s = FILTER_BUFFER(nf);
> +
> +if (!strcmp(nf->status, "off")) {
> +if (s->interval) {
> +timer_del(&s->release_timer);
> +}
> +filter_buffer_flush(nf);
> +} else {
> +if (s->interval) {
> +timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
> +filter_buffer_release_timer, nf);
> +timer_mod(&s->release_timer,
> +qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval);
> +}

The code looks duplicated with filter_buffer_setup().

> +}
> +}
>  static void filter_buffer_class_init(ObjectClass *oc, void *data)
>  {
>  NetFilterClass *nfc = NETFILTER_CLASS(oc);
> @@ -131,6 +149,7 @@ static void filter_buffer_class_init(ObjectClass *oc, 
> void *data)
>  nfc->setup = filter_buffer_setup;
>  nfc->cleanup = filter_buffer_cleanup;
>  nfc->receive_iov = filter_buffer_receive_iov;
> +nfc->status_changed = filter_buffer_status_changed;
>  }
>  
>  static void filter_buffer_get_interval(Object *obj, Visitor *v,

Re: [Qemu-devel] [PATCH v2 1/2] filter: Add 'status' property for filter object

2016-02-28 Thread Jason Wang



On 02/29/2016 09:46 AM, zhanghailiang wrote:
> With this property, users can control if this filter is 'on'
> or 'off'. The default behavior for filter is 'on'.
>
> For some types of filters, they may need to react to status changing,
> So here, we introduced status changing callback/notifier for filter class.
>
> We will skip the disabled ('off') filter when delivering packets in net layer.
>
> Signed-off-by: zhanghailiang 
> Cc: Jason Wang 
> Cc: Yang Hongyang 
> ---
> v2:
> - Split the processing of buffer-filter into a new patch (Jason)
> - Use 'status' instead of 'enabled' to store the filter state (Jason)
> - Rename FilterDisable() callback to FilterStatusChanged(Jason)
> ---

Thanks, looks good, just few nits.

>  include/net/filter.h |  4 
>  net/filter.c | 42 ++
>  qemu-options.hx  |  4 +++-
>  3 files changed, 49 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/filter.h b/include/net/filter.h
> index 5639976..ebef0dc 100644
> --- a/include/net/filter.h
> +++ b/include/net/filter.h
> @@ -36,12 +36,15 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
> int iovcnt,
> NetPacketSent *sent_cb);
>  
> +typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
> +
>  typedef struct NetFilterClass {
>  ObjectClass parent_class;
>  
>  /* optional */
>  FilterSetup *setup;
>  FilterCleanup *cleanup;
> +FilterStatusChanged *status_changed;
>  /* mandatory */
>  FilterReceiveIOV *receive_iov;
>  } NetFilterClass;
> @@ -55,6 +58,7 @@ struct NetFilterState {
>  char *netdev_id;
>  NetClientState *netdev;
>  NetFilterDirection direction;
> +char *status;

Let's use bool instead.

[Qemu-devel] An ivshmem + chardev peculiarity

2016-02-28 Thread Markus Armbruster

The ivshmem client/server protocol is one-way: only the server sends.
The only way clients can communicate is by connect and close.

Since the ivshmem device model uses a chardev for its connection to the
server, both connect and close happen in chardev code.

Example: monitor command chardev-add connects, the server starts
sending.  chardev-add succeeds.  device_add fails somehow.  Now you must
chardev-remove to close the connection.

>From the server's point of view, a peer connected and disconnected.  It
duly announces this to already connected peers.  This is a bit weird,
but it shouldn't cause problems.

If you forget to chardev-remove, the "peer" stays connected.  Can be
declared PEBKAC.

Connect on realize and close on realize failure and unrealize would be
neater, I think.  Chardevs don't let me do that, as far as I can tell.
Ideas?

There's one exception to the behavior I just described: on version
mismatch, the device model calls qemu_chr_delete().  I consider that a
bug, and I'm going to fix it.

Re: [Qemu-devel] [PATCH] net/filter-redirector:Add filter-redirector

2016-02-28 Thread Jason Wang

On 02/24/2016 05:03 PM, Zhang Chen wrote:
>
> If queue=rx, filter-redirector will get the packet that guest send,
> then redirect
> to outdev(if none, do nothing). but queue=rx/tx/all not related to
> indev. please
> look the flow chart below. queue=xxx just work for one
> way(filter->outdev).
>
>   filter
> +
> |
> |
>redirector   |
> +-+
> |   | |
> |   | |
> |   | |
>indev ++ +>  outdev
> | |   |
> | |   |
> | |   |
> +-+
>   |
>   |
>   v
>filter
>
>   |
>
>   |
>
>   v
>filter  filter .. guest
>

This looks a violation on the assumption of current filter behavior. 
Each filter should only talk to the 'next' or 'prev' filter on the chain
(depends on the direction) or netdev when queue=rx or netdev's peer when
queue=tx.

And in fact there's subtle differences with your patch:

When queue='all' since you force nf->netdev as sender, direction is
NET_FILTER_DIRECTION_TX, the packet will be passed to 'next' filter on
the chain.
When queue='rx', direction is NET_FILTER_DIRECTION_RX, the packet will
be pass to 'prev' filter on the chain.

So as you can see, 'all' is ambiguous here. I think we should keep
current behavior by redirecting traffic to netdev when queue='rx'. For
queue='all', maybe we need redirect the traffic to both netdev and
netdev's peer.

Re: [Qemu-devel] [RFC PATCH v0 3/6] spapr: Represent boot CPUs as spapr-cpu-core devices

2016-02-28 Thread David Gibson

On Mon, Feb 29, 2016 at 11:05:32AM +0530, Bharata B Rao wrote:
> On Fri, Feb 26, 2016 at 04:18:57PM +0100, Igor Mammedov wrote:
> > On Thu, 25 Feb 2016 21:52:39 +0530
> > Bharata B Rao  wrote:
[snip]
> > > @@ -2209,6 +2251,7 @@ static void 
> > > spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> > >DeviceState *dev, Error **errp)
> > >  {
> > >  sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> > > +sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev);
> > >  
> > >  if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
> > >  int node;
> > > @@ -2245,6 +2288,11 @@ static void 
> > > spapr_machine_device_plug(HotplugHandler *hotplug_dev,
> > >  }
> > >  
> > >  spapr_memory_plug(hotplug_dev, dev, node, errp);
> > > +} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
> > here probably should be CORE and not TYPE_CPU,
> > then if board needs to set some state for child threads it
> > could ennumerate child of core here and do the required job.
> 
> Hmm, we have things to set for CPUs and cores as well from hotplug
> handler. So the above code is for CPUs and I have spapr_core_plug()
> which is called conditionally when device is of type TYPE_SPAPR_CPU_CORE.
> 
> Given that ->plug() is called during realization of both individual CPU
> threads as well as their parent core, I thought handling the setup for
> them separately like the above is simpler, no ?

So, this bothered me a bit as well.  I think having the hotplug
handler for the threads is misleading, since it suggests you can
individually hotplug them, when in fact the intention is just that
this is part of the hotplug code path for a whole core.

My suggestion earlier was to have the core realize code itself perform
this configuration on the sub-threads.  However, Igor's suggestion of,
essentially, an explicit loop over the threads in the core hotplug
handler would be fine by me also.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[Qemu-devel] [PATCHv2 3/7] spapr_pci: Eliminate class callbacks

2016-02-28 Thread David Gibson

The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci.c  | 44 ++--
 hw/ppc/spapr_pci_vfio.c | 18 +++---
 include/hw/pci-host/spapr.h | 37 +
 3 files changed, 62 insertions(+), 37 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index e8edad3..be05757 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -92,6 +92,13 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, 
uint64_t buid,
 return pci_find_device(phb->bus, bus_num, devfn);
 }
 
+static bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
+{
+sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+
+return spc->eeh_available;
+}
+
 static uint32_t rtas_pci_cfgaddr(uint32_t arg)
 {
 /* This handles the encoding of extended config space addresses */
@@ -440,7 +447,6 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
 target_ulong rets)
 {
 sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
 uint32_t addr, option;
 uint64_t buid;
 int ret;
@@ -458,12 +464,11 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
 goto param_error_exit;
 }
 
-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spapr_phb_eeh_available(sphb)) {
 goto param_error_exit;
 }
 
-ret = spc->eeh_set_option(sphb, addr, option);
+ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option);
 rtas_st(rets, 0, ret);
 return;
 
@@ -478,7 +483,6 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
target_ulong rets)
 {
 sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
 PCIDevice *pdev;
 uint32_t addr, option;
 uint64_t buid;
@@ -493,8 +497,7 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
 goto param_error_exit;
 }
 
-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spapr_phb_eeh_available(sphb)) {
 goto param_error_exit;
 }
 
@@ -534,7 +537,6 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
 target_ulong rets)
 {
 sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
 uint64_t buid;
 int state, ret;
 
@@ -548,12 +550,11 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU 
*cpu,
 goto param_error_exit;
 }
 
-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_get_state) {
+if (!spapr_phb_eeh_available(sphb)) {
 goto param_error_exit;
 }
 
-ret = spc->eeh_get_state(sphb, &state);
+ret = spapr_phb_vfio_eeh_get_state(sphb, &state);
 rtas_st(rets, 0, ret);
 if (ret != RTAS_OUT_SUCCESS) {
 return;
@@ -578,7 +579,6 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
 target_ulong rets)
 {
 sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
 uint32_t option;
 uint64_t buid;
 int ret;
@@ -594,12 +594,11 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
 goto param_error_exit;
 }
 
-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_reset) {
+if (!spapr_phb_eeh_available(sphb)) {
 goto param_error_exit;
 }
 
-ret = spc->eeh_reset(sphb, option);
+ret = spapr_phb_vfio_eeh_reset(sphb, option);
 rtas_st(rets, 0, ret);
 return;
 
@@ -614,7 +613,6 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
   target_ulong rets)
 {
 sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
 uint64_t buid;
 int ret;
 
@@ -628,12 +626,11 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
 goto param_error_exit;
 }
 
-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_configure) {
+if (!spapr_phb_eeh_available(sphb)) {
 goto param_error_exit;
 }
 
-ret = spc->eeh_configure(sphb);
+ret = spapr_phb_vfio_eeh_configure(sphb);
 rtas_st(rets, 0, ret);
 return;
 
@@ -649,7 +646,6 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu,
target_ulong rets)
 {
 sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
 int option;
 uint64_t buid;
 
@@ -663,8 +659,7 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu,
 goto param_error_exit;
 }
 
-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spa

[Qemu-devel] [PATCHv2 2/7] spapr_pci: Switch to vfio_eeh_as_op() interface

2016-02-28 Thread David Gibson

This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
interface.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_pci_vfio.c | 50 -
 1 file changed, 16 insertions(+), 34 deletions(-)

diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 2f3752e..b1e8e8e 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -73,15 +73,9 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
*sphb, Error **errp)
 spapr_tce_get_iommu(tcet));
 }
 
-static void spapr_phb_vfio_eeh_reenable(sPAPRPHBVFIOState *svphb)
+static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb)
 {
-struct vfio_eeh_pe_op op = {
-.argsz = sizeof(op),
-.op= VFIO_EEH_PE_ENABLE
-};
-
-vfio_container_ioctl(&svphb->phb.iommu_as,
- svphb->iommugroupid, VFIO_EEH_PE_OP, &op);
+vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE);
 }
 
 static void spapr_phb_vfio_reset(DeviceState *qdev)
@@ -92,19 +86,18 @@ static void spapr_phb_vfio_reset(DeviceState *qdev)
  * ensures that the contained PCI devices will work properly
  * after reboot.
  */
-spapr_phb_vfio_eeh_reenable(SPAPR_PCI_VFIO_HOST_BRIDGE(qdev));
+spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
 }
 
 static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
  unsigned int addr, int option)
 {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
+uint32_t op;
 int ret;
 
 switch (option) {
 case RTAS_EEH_DISABLE:
-op.op = VFIO_EEH_PE_DISABLE;
+op = VFIO_EEH_PE_DISABLE;
 break;
 case RTAS_EEH_ENABLE: {
 PCIHostState *phb;
@@ -122,21 +115,20 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState 
*sphb,
 return RTAS_OUT_PARAM_ERROR;
 }
 
-op.op = VFIO_EEH_PE_ENABLE;
+op = VFIO_EEH_PE_ENABLE;
 break;
 }
 case RTAS_EEH_THAW_IO:
-op.op = VFIO_EEH_PE_UNFREEZE_IO;
+op = VFIO_EEH_PE_UNFREEZE_IO;
 break;
 case RTAS_EEH_THAW_DMA:
-op.op = VFIO_EEH_PE_UNFREEZE_DMA;
+op = VFIO_EEH_PE_UNFREEZE_DMA;
 break;
 default:
 return RTAS_OUT_PARAM_ERROR;
 }
 
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
 if (ret < 0) {
 return RTAS_OUT_HW_ERROR;
 }
@@ -146,13 +138,9 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState 
*sphb,
 
 static int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state)
 {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
 int ret;
 
-op.op = VFIO_EEH_PE_GET_STATE;
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE);
 if (ret < 0) {
 return RTAS_OUT_PARAM_ERROR;
 }
@@ -206,28 +194,26 @@ static void spapr_phb_vfio_eeh_pre_reset(sPAPRPHBState 
*sphb)
 
 static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option)
 {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
+uint32_t op;
 int ret;
 
 switch (option) {
 case RTAS_SLOT_RESET_DEACTIVATE:
-op.op = VFIO_EEH_PE_RESET_DEACTIVATE;
+op = VFIO_EEH_PE_RESET_DEACTIVATE;
 break;
 case RTAS_SLOT_RESET_HOT:
 spapr_phb_vfio_eeh_pre_reset(sphb);
-op.op = VFIO_EEH_PE_RESET_HOT;
+op = VFIO_EEH_PE_RESET_HOT;
 break;
 case RTAS_SLOT_RESET_FUNDAMENTAL:
 spapr_phb_vfio_eeh_pre_reset(sphb);
-op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
+op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
 break;
 default:
 return RTAS_OUT_PARAM_ERROR;
 }
 
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
 if (ret < 0) {
 return RTAS_OUT_HW_ERROR;
 }
@@ -237,13 +223,9 @@ static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, 
int option)
 
 static int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
 {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
 int ret;
 
-op.op = VFIO_EEH_PE_CONFIGURE;
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE

[Qemu-devel] [PATCHv2 4/7] spapr_pci: Allow EEH on spapr-pci-host-bridge

2016-02-28 Thread David Gibson

Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci.c  | 10 ++
 hw/ppc/spapr_pci_vfio.c |  6 +-
 include/hw/pci-host/spapr.h |  6 +-
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index be05757..4ac5273 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -42,6 +42,8 @@
 #include "hw/ppc/spapr_drc.h"
 #include "sysemu/device_tree.h"
 
+#include "hw/vfio/vfio.h"
+
 /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
 #define RTAS_QUERY_FN   0
 #define RTAS_CHANGE_FN  1
@@ -92,13 +94,6 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, 
uint64_t buid,
 return pci_find_device(phb->bus, bus_num, devfn);
 }
 
-static bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
-{
-sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-
-return spc->eeh_available;
-}
-
 static uint32_t rtas_pci_cfgaddr(uint32_t arg)
 {
 /* This handles the encoding of extended config space addresses */
@@ -1563,7 +1558,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 dc->cannot_instantiate_with_device_add_yet = false;
 spc->finish_realize = spapr_phb_finish_realize;
-spc->eeh_available = false;
 hp->plug = spapr_phb_hot_plug_child;
 hp->unplug = spapr_phb_hot_unplug_child;
 }
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 10fa88a..16a4a8f 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -73,6 +73,11 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
*sphb, Error **errp)
 spapr_tce_get_iommu(tcet));
 }
 
+bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
+{
+return vfio_eeh_as_ok(&sphb->iommu_as);
+}
+
 static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb)
 {
 vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE);
@@ -240,7 +245,6 @@ static void spapr_phb_vfio_class_init(ObjectClass *klass, 
void *data)
 
 dc->props = spapr_phb_vfio_properties;
 spc->finish_realize = spapr_phb_vfio_finish_realize;
-spc->eeh_available = true;
 }
 
 static const TypeInfo spapr_phb_vfio_info = {
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 0b936c6..19a95e0 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -49,7 +49,6 @@ struct sPAPRPHBClass {
 PCIHostBridgeClass parent_class;
 
 void (*finish_realize)(sPAPRPHBState *sphb, Error **errp);
-bool eeh_available;
 };
 
 typedef struct spapr_pci_msi {
@@ -136,6 +135,7 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, 
uint64_t buid,
 
 /* VFIO EEH hooks */
 #ifdef CONFIG_LINUX
+bool spapr_phb_eeh_available(sPAPRPHBState *sphb);
 int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
   unsigned int addr, int option);
 int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state);
@@ -143,6 +143,10 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int 
option);
 int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb);
 void spapr_phb_vfio_reset(DeviceState *qdev);
 #else
+static inline bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
+{
+return false;
+}
 static inline int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
 unsigned int addr, int option)
 {
-- 
2.5.0

[Qemu-devel] [PATCHv2 6/7] spapr_pci: Remove finish_realize hook

2016-02-28 Thread David Gibson

Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
we can fold that implementation into its (single) caller, and remove the
hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
well.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr_pci.c  | 25 +
 include/hw/pci-host/spapr.h | 12 
 2 files changed, 5 insertions(+), 32 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 4ac5273..3d1145e 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1224,11 +1224,12 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 SysBusDevice *s = SYS_BUS_DEVICE(dev);
 sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s);
 PCIHostState *phb = PCI_HOST_BRIDGE(s);
-sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(s);
 char *namebuf;
 int i;
 PCIBus *bus;
 uint64_t msi_window_size = 4096;
+sPAPRTCETable *tcet;
+uint32_t nb_table;
 
 if (sphb->index != (uint32_t)-1) {
 hwaddr windows_base;
@@ -1380,33 +1381,20 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 }
 }
 
-if (!info->finish_realize) {
-error_setg(errp, "finish_realize not defined");
-return;
-}
-
-info->finish_realize(sphb, errp);
-
-sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
-}
-
-static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp)
-{
-sPAPRTCETable *tcet;
-uint32_t nb_table;
-
 nb_table = sphb->dma_win_size >> SPAPR_TCE_PAGE_SHIFT;
 tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn,
0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
 if (!tcet) {
 error_setg(errp, "Unable to create TCE table for %s",
sphb->dtbusname);
-return ;
+return;
 }
 
 /* Register default 32bit DMA window */
 memory_region_add_subregion(&sphb->iommu_root, sphb->dma_win_addr,
 spapr_tce_get_iommu(tcet));
+
+sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
 }
 
 static int spapr_phb_children_reset(Object *child, void *opaque)
@@ -1547,7 +1535,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
 {
 PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
 DeviceClass *dc = DEVICE_CLASS(klass);
-sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass);
 HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass);
 
 hc->root_bus_path = spapr_phb_root_bus_path;
@@ -1557,7 +1544,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
 dc->vmsd = &vmstate_spapr_pci;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 dc->cannot_instantiate_with_device_add_yet = false;
-spc->finish_realize = spapr_phb_finish_realize;
 hp->plug = spapr_phb_hot_plug_child;
 hp->unplug = spapr_phb_hot_unplug_child;
 }
@@ -1567,7 +1553,6 @@ static const TypeInfo spapr_phb_info = {
 .parent= TYPE_PCI_HOST_BRIDGE,
 .instance_size = sizeof(sPAPRPHBState),
 .class_init= spapr_phb_class_init,
-.class_size= sizeof(sPAPRPHBClass),
 .interfaces= (InterfaceInfo[]) {
 { TYPE_HOTPLUG_HANDLER },
 { }
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index a08235e..03ee006 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -32,20 +32,8 @@
 #define SPAPR_PCI_HOST_BRIDGE(obj) \
 OBJECT_CHECK(sPAPRPHBState, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE)
 
-#define SPAPR_PCI_HOST_BRIDGE_CLASS(klass) \
- OBJECT_CLASS_CHECK(sPAPRPHBClass, (klass), TYPE_SPAPR_PCI_HOST_BRIDGE)
-#define SPAPR_PCI_HOST_BRIDGE_GET_CLASS(obj) \
- OBJECT_GET_CLASS(sPAPRPHBClass, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE)
-
-typedef struct sPAPRPHBClass sPAPRPHBClass;
 typedef struct sPAPRPHBState sPAPRPHBState;
 
-struct sPAPRPHBClass {
-PCIHostBridgeClass parent_class;
-
-void (*finish_realize)(sPAPRPHBState *sphb, Error **errp);
-};
-
 typedef struct spapr_pci_msi {
 uint32_t first_irq;
 uint32_t num;
-- 
2.5.0

[Qemu-devel] [PATCHv2 1/7] vfio: Start improving VFIO/EEH interface

2016-02-28 Thread David Gibson

At present the code handling IBM's Enhanced Error Handling (EEH) interface
on VFIO devices operates by bypassing the usual VFIO logic with
vfio_container_ioctl().  That's a poorly designed interface with unclear
semantics about exactly what can be operated on.

In particular it operates on a single vfio container internally (hence the
name), but takes an address space and group id, from which it deduces the
container in a rather roundabout way.  groupids are something that code
outside vfio shouldn't even be aware of.

This patch creates new interfaces for EEH operations.  Internally we
have vfio_eeh_container_op() which takes a VFIOContainer object
directly.  For external use we have vfio_eeh_as_ok() which determines
if an AddressSpace is usable for EEH (at present this means it has a
single container and at most a single group attached), and
vfio_eeh_as_op() which will perform an operation on an AddressSpace in
the unambiguous case, and otherwise returns an error.

This interface still isn't great, but it's enough of an improvement to
allow a number of cleanups in other places.

Signed-off-by: David Gibson 
---
 hw/vfio/common.c   | 84 ++
 include/hw/vfio/vfio.h |  2 ++
 2 files changed, 86 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 607ec70..b61118e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1003,3 +1003,87 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
groupid,
 
 return vfio_container_do_ioctl(as, groupid, req, param);
 }
+
+/*
+ * Interfaces for IBM EEH (Enhanced Error Handling)
+ */
+static bool vfio_eeh_container_ok(VFIOContainer *container)
+{
+/* A broken kernel implementation means EEH operations won't work
+ * correctly if there are multiple groups in a container.  So
+ * return true only if there is exactly one group attached to the
+ * container */
+
+if (QLIST_EMPTY(&container->group_list)) {
+return false;
+}
+
+if (QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) {
+return false;
+}
+
+return true;
+}
+
+static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op)
+{
+struct vfio_eeh_pe_op pe_op = {
+.argsz = sizeof(pe_op),
+.op = op,
+};
+int ret;
+
+if (!vfio_eeh_container_ok(container)) {
+error_report("vfio/eeh: EEH_PE_OP 0x%x called on container"
+ " with multiple groups", op);
+return -EPERM;
+}
+
+ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op);
+if (ret < 0) {
+error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op);
+return -errno;
+}
+
+return 0;
+}
+
+static VFIOContainer *vfio_eeh_as_container(AddressSpace *as)
+{
+VFIOAddressSpace *space = vfio_get_address_space(as);
+VFIOContainer *container = NULL;
+
+if (QLIST_EMPTY(&space->containers)) {
+/* No containers to act on */
+goto out;
+}
+
+container = QLIST_FIRST(&space->containers);
+
+if (QLIST_NEXT(container, next)) {
+/* We don't yet have logic to synchronize EEH state across
+ * multiple containers */
+container = NULL;
+goto out;
+}
+
+out:
+vfio_put_address_space(space);
+return container;
+}
+
+bool vfio_eeh_as_ok(AddressSpace *as)
+{
+VFIOContainer *container = vfio_eeh_as_container(as);
+
+return (container != NULL) && vfio_eeh_container_ok(container);
+}
+
+int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
+{
+VFIOContainer *container = vfio_eeh_as_container(as);
+
+/* Shouldn't be called unless vfio_eeh_as_ok() returned true */
+assert(container);
+return vfio_eeh_container_op(container, op);
+}
diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h
index 0b26cd8..fd3933b 100644
--- a/include/hw/vfio/vfio.h
+++ b/include/hw/vfio/vfio.h
@@ -5,5 +5,7 @@
 
 extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
 int req, void *param);
+bool vfio_eeh_as_ok(AddressSpace *as);
+int vfio_eeh_as_op(AddressSpace *as, uint32_t op);
 
 #endif
-- 
2.5.0

[Qemu-devel] [PATCHv2 5/7] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge

2016-02-28 Thread David Gibson

Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
1. automatically sizes its DMA window to match the host IOMMU
2. checks if the attached VFIO container is backed by the
   VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.  It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.

(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.

It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
reduces it to just a stub for backwards compatibility.

Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci_vfio.c | 61 +
 include/hw/pci-host/spapr.h | 11 
 2 files changed, 17 insertions(+), 55 deletions(-)

diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 16a4a8f..9e15924 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -23,54 +23,29 @@
 #include "hw/pci/msix.h"
 #include "linux/vfio.h"
 #include "hw/vfio/vfio.h"
+#include "qemu/error-report.h"
 
-static Property spapr_phb_vfio_properties[] = {
-DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
-DEFINE_PROP_END_OF_LIST(),
-};
+#define TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE "spapr-pci-vfio-host-bridge"
 
-static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp)
-{
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_iommu_spapr_tce_info info = { .argsz = sizeof(info) };
-int ret;
-sPAPRTCETable *tcet;
-uint32_t liobn = svphb->phb.dma_liobn;
+#define SPAPR_PCI_VFIO_HOST_BRIDGE(obj) \
+OBJECT_CHECK(sPAPRPHBVFIOState, (obj), TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE)
 
-if (svphb->iommugroupid == -1) {
-error_setg(errp, "Wrong IOMMU group ID %d", svphb->iommugroupid);
-return;
-}
+typedef struct sPAPRPHBVFIOState sPAPRPHBVFIOState;
 
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_CHECK_EXTENSION,
-   (void *) VFIO_SPAPR_TCE_IOMMU);
-if (ret != 1) {
-error_setg_errno(errp, -ret,
- "spapr-vfio: SPAPR extension is not supported");
-return;
-}
+struct sPAPRPHBVFIOState {
+sPAPRPHBState phb;
 
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
-if (ret) {
-error_setg_errno(errp, -ret,
- "spapr-vfio: get info from container failed");
-return;
-}
+int32_t iommugroupid;
+};
 
-tcet = spapr_tce_new_table(DEVICE(sphb), liobn, info.dma32_window_start,
-   SPAPR_TCE_PAGE_SHIFT,
-   info.dma32_window_size >> SPAPR_TCE_PAGE_SHIFT,
-   true);
-if (!tcet) {
-error_setg(errp, "spapr-vfio: failed to create VFIO TCE table");
-return;
-}
+static Property spapr_phb_vfio_properties[] = {
+DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1),
+DEFINE_PROP_END_OF_LIST(),
+};
 
-/* Register default 32bit DMA window */
-memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset,
-spapr_tce_get_iommu(tcet));
+static void spapr_phb_vfio_instance_init(Object *obj)
+{
+error_report("spapr-pci-vfio-host-bridge is deprecated");
 }
 
 bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
@@ -241,18 +216,16 @@ int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
 static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
-sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass);
 
 dc->props = spapr_phb_vfio_properties;
-spc->finish_realize = spapr_phb_vfio_finish_realize;
 }
 
 static const TypeInfo spapr_phb_vfio_info = {
 .name  = TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE,
 .parent= TYPE_SPAPR_PCI_HOST_BRIDGE,
 .instance_size = sizeof(sPAPRPHBVFIOState),
+.instance_init = spapr_phb_vfio_instance_init,
 .class_init= spapr_phb_vfio_class_init,
-.class_size= sizeof(sPAPRPHBClass),
 };
 
 static void spapr_pci_vfio_register_types(void)
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 19a95e0..a08235e 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -28,14 +28,10 @@
 #in

[Qemu-devel] [PATCHv2 7/7] vfio: Eliminate vfio_container_ioctl()

2016-02-28 Thread David Gibson

vfio_container_ioctl() was a bad interface that bypassed abstraction
boundaries, had semantics that sat uneasily with its name, and was unsafe
in many realistic circumstances.  Now that spapr-pci-vfio-host-bridge has
been folded into spapr-pci-host-bridge, there are no more users, so remove
it.

Signed-off-by: David Gibson 
Reviewed-by: Alexey Kardashevskiy 
---
 hw/vfio/common.c   | 45 -
 include/hw/vfio/vfio.h |  2 --
 2 files changed, 47 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index b61118e..55e87d3 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -959,51 +959,6 @@ void vfio_put_base_device(VFIODevice *vbasedev)
 close(vbasedev->fd);
 }
 
-static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
-   int req, void *param)
-{
-VFIOGroup *group;
-VFIOContainer *container;
-int ret = -1;
-
-group = vfio_get_group(groupid, as);
-if (!group) {
-error_report("vfio: group %d not registered", groupid);
-return ret;
-}
-
-container = group->container;
-if (group->container) {
-ret = ioctl(container->fd, req, param);
-if (ret < 0) {
-error_report("vfio: failed to ioctl %d to container: ret=%d, %s",
- _IOC_NR(req) - VFIO_BASE, ret, strerror(errno));
-}
-}
-
-vfio_put_group(group);
-
-return ret;
-}
-
-int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
- int req, void *param)
-{
-/* We allow only certain ioctls to the container */
-switch (req) {
-case VFIO_CHECK_EXTENSION:
-case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
-case VFIO_EEH_PE_OP:
-break;
-default:
-/* Return an error on unknown requests */
-error_report("vfio: unsupported ioctl %X", req);
-return -1;
-}
-
-return vfio_container_do_ioctl(as, groupid, req, param);
-}
-
 /*
  * Interfaces for IBM EEH (Enhanced Error Handling)
  */
diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h
index fd3933b..7153604 100644
--- a/include/hw/vfio/vfio.h
+++ b/include/hw/vfio/vfio.h
@@ -3,8 +3,6 @@
 
 #include "qemu/typedefs.h"
 
-extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
-int req, void *param);
 bool vfio_eeh_as_ok(AddressSpace *as);
 int vfio_eeh_as_op(AddressSpace *as, uint32_t op);
 
-- 
2.5.0

[Qemu-devel] [PATCHv2 0/7] Allow EEH on spapr-pci-host-bridge devices

2016-02-28 Thread David Gibson

For historical reasons, the spapr machine type has two PCI host bridge
implementations: spapr-pci-host-bridge, and
spapr-vfio-pci-host-bridge.  The latter was (duh) designed for VFIO
devices, but later reworks mean it's not necessary for that, and VFIO
can be used on the regular host bridge.

The only remaining difference is that EEH (Enhanced Error Handling -
IBM's interface with similar purpose to AER) is only supported on the
spapr-vfio-pci-host-bridge device.

This series corrects this, allowing EEH operations on the regular host
bridge.  That allows the special VFIO host bridge (we leave a stub,
for backwards compatibility).

EEH is only supported for VFIO devices, for the time being, and due to
bugs in the kernel implementation, it is only usable when the host
bridge only has devices from a single (host-side) IOMMU group
("Partitionable Endpoint", in IBM terminology).  That's an annoying
limitation which we hope to lift someday, but it's no worse than now,
since the spapr-vfio-pci-host-bridge only permits devices from a
single IOMMU group in any case (although it doesn't properly enforce
that).

I wrote these a while back, and I'm not sure why they got sidelined -
I suspect I was looking for testing from Gavin Shan, not realising
then that he was no longer working on EEH.  In any case, I'm hoping we
can squeeze this change into 2.6, so we don't need to carry the broken
spapr-vfio-pci-host-bridge device any longer.


Alex,

If you could look at the 2 patches in this series affecting core VFIO
code ASAP, that would be really helpful.  Also, if you could let me
know if you would like to take those through your tree, or if I can
take them through mine, with your ack.

Changes since v1:
  * Don't consider containers with no groups as being able to support
EEH, which is a safer option
  * To avoid build problems on non-Linux hosts, I no longer merge most
of the code into spapr_pci.c, instead leaving it in helpers in
spapr_pci_vfio.c, just no longer part of its own special device
type (this approach suggested by Alexey Kardashevskiy).

David Gibson (7):
  vfio: Start improving VFIO/EEH interface
  spapr_pci: Switch to vfio_eeh_as_op() interface
  spapr_pci: Eliminate class callbacks
  spapr_pci: Allow EEH on spapr-pci-host-bridge
  spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
  spapr_pci: Remove finish_realize hook
  vfio: Eliminate vfio_container_ioctl()

 hw/ppc/spapr_pci.c  |  63 +++--
 hw/ppc/spapr_pci_vfio.c | 131 +++-
 hw/vfio/common.c| 101 +++---
 include/hw/pci-host/spapr.h |  64 +-
 include/hw/vfio/vfio.h  |   4 +-
 5 files changed, 173 insertions(+), 190 deletions(-)

-- 
2.5.0

Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay

2016-02-28 Thread Pavel Dovgalyuk

> From: Kevin Wolf [mailto:kw...@redhat.com]
> Am 25.02.2016 um 10:06 hat Pavel Dovgalyuk geschrieben:
> > There is one problem with flush event - callbacks for flush are called for
> > all layers and I couldn't synchronize them correctly yet.
> > I'll probably have to add new callback to block driver, which handles
> > flush request for the whole stack of the drivers.
> 
> Flushes should be treated more or less the same a writes, I think.

But bdrv_co_flush has different structure and does not allow synchronization
of the top layer. Here is the patch for fixing this:

diff --git a/block/io.c b/block/io.c
index a69bfc4..9e05dfe 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2369,6 +2369,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 }
 
 tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
+/* Write back all layers by calling one driver function */
+if (bs->drv->bdrv_co_flush) {
+ret = bs->drv->bdrv_co_flush(bs);
+goto out;
+}
+
 /* Write back cached data to the OS even with cache=unsafe */
 BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
 if (bs->drv->bdrv_co_flush_to_os) {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 9ef823a..9cc2c58 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -176,6 +176,13 @@ struct BlockDriver {
 int (*bdrv_inactivate)(BlockDriverState *bs);
 
 /*
+ * Flushes all data for all layers by calling bdrv_co_flush for underlying
+ * layers, if needed. This function is needed for deterministic
+ * synchronization of the flush finishing callback.
+ */
+int coroutine_fn (*bdrv_co_flush)(BlockDriverState *bs);
+
+/*
  * Flushes all data that was already written to the OS all the way down to
  * the disk (for example raw-posix calls fsync()).
  */


Then I'll have just to implement bdrv_co_flush for blkreplay layer.
This callback will manually invoke bdrv_co_flush for underlying layer
allowing synchronization of finishing callback.

Pavel Dovgalyuk

[Qemu-devel] [PULL 0/8] ppc-for-2.6 queue 20160229

2016-02-28 Thread David Gibson

To probably no-one's surprise, my "hopefully last" pull request before
the soft freeze has turned out not to be the last.  Here are a few
more patches for qemu-2.6.

The following changes since commit 6e378dd214fbbae8138ff011ec3de7ddf13a445f:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160226' 
into staging (2016-02-26 16:02:00 +)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.6-20160229

for you to fetch changes up to a005b3ef50439b5bc6b2eb0b5bda8e8c7c2368bf:

  xics: report errors with the QEMU Error API (2016-02-28 16:19:02 +1100)


ppc patch queue for 2016-02-29

Some more accumulated patches for target-ppc, pseries machine type and
related devices to fit in before the qemu-2.6 soft freeze.
* Mostly bugfixes and small cleanups for spapr and Mac platforms


Greg Kurz (7):
  spapr_rng: disable hotpluggability
  spapr_pci: kill useless variable in rtas_ibm_change_msi()
  spapr_pci: fix irq leak in RTAS ibm,change-msi
  spapr: disable vmdesc submission for old machines
  spapr: skip configuration section during migration of older machines
  migration: allow machine to enforce configuration section migration
  xics: report errors with the QEMU Error API

Hervé Poussineau (1):
  dbdma: warn when using unassigned channel

 hw/core/machine.c | 21 +
 hw/intc/xics.c| 13 +
 hw/misc/macio/mac_dbdma.c | 25 +++--
 hw/ppc/spapr.c|  2 ++
 hw/ppc/spapr_events.c |  3 ++-
 hw/ppc/spapr_pci.c| 31 +--
 hw/ppc/spapr_rng.c|  1 +
 hw/ppc/spapr_vio.c|  7 ---
 include/hw/boards.h   |  1 +
 include/hw/ppc/xics.h |  5 +++--
 migration/savevm.c| 10 --
 qemu-options.hx   |  3 ++-
 trace-events  |  2 --
 13 files changed, 97 insertions(+), 27 deletions(-)

[Qemu-devel] [PULL 6/8] spapr: skip configuration section during migration of older machines

2016-02-28 Thread David Gibson

From: Greg Kurz 

Since QEMU 2.4, we have a configuration section in the migration stream.
This must be skipped for older machines, like it is already done for x86.

This patch fixes the migration of pseries-2.3 from/to QEMU 2.3, but it
breaks migration of the same machine from/to QEMU 2.4/2.4.1/2.5. We do
that anyway because QEMU 2.3 is likely to be more widely deployed than
newer QEMU versions.

Fixes: 61964c23e5ddd5a33f15699e45ce126f879e3e33
Signed-off-by: Greg Kurz 
Reviewed-by: Laurent Vivier 
Reviewed-by: Juan Quintela 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3b4a557..e9d4abf 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2427,6 +2427,7 @@ static void 
spapr_machine_2_3_instance_options(MachineState *machine)
 spapr_machine_2_4_instance_options(machine);
 savevm_skip_section_footers();
 global_state_set_optional();
+savevm_skip_configuration();
 }
 
 static void spapr_machine_2_3_class_options(MachineClass *mc)
-- 
2.5.0

[Qemu-devel] [PULL 1/8] spapr_rng: disable hotpluggability

2016-02-28 Thread David Gibson

From: Greg Kurz 

It is currently possible to hotplug a spapr_rng device but QEMU crashes
when we try to hot unplug:

ERROR:hw/core/qdev.c:295:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted

This happens because spapr_rng isn't plugged to any bus and sPAPR does
not provide hotplug support for it: qdev_get_hotplug_handler() hence
return NULL and we hit the assertion.

And anyway, it doesn't make much sense to unplug this device since hcalls
cannot be unregistered. Even the idea of hotplugging a RNG device instead
of declaring it on the QEMU command line looks weird.

This patch simply disables hotpluggability for the spapr-rng class.

Signed-off-by: Greg Kurz 
Reviewed-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_rng.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr_rng.c b/hw/ppc/spapr_rng.c
index 8484fcf..a39d472 100644
--- a/hw/ppc/spapr_rng.c
+++ b/hw/ppc/spapr_rng.c
@@ -170,6 +170,7 @@ static void spapr_rng_class_init(ObjectClass *oc, void 
*data)
 dc->realize = spapr_rng_realize;
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 dc->props = spapr_rng_properties;
+dc->hotpluggable = false;
 }
 
 static const TypeInfo spapr_rng_info = {
-- 
2.5.0

[Qemu-devel] [PULL 2/8] spapr_pci: kill useless variable in rtas_ibm_change_msi()

2016-02-28 Thread David Gibson

From: Greg Kurz 

The num local variable is initialized to zero and has no writer.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index cca9257..19dd6db 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -275,7 +275,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 unsigned int req_num = rtas_ld(args, 4); /* 0 == remove all */
 unsigned int seq_num = rtas_ld(args, 5);
 unsigned int ret_intr_type;
-unsigned int irq, max_irqs = 0, num = 0;
+unsigned int irq, max_irqs = 0;
 sPAPRPHBState *phb = NULL;
 PCIDevice *pdev = NULL;
 spapr_pci_msi *msi;
@@ -316,10 +316,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 
 xics_free(spapr->icp, msi->first_irq, msi->num);
 if (msi_present(pdev)) {
-spapr_msi_setmsg(pdev, 0, false, 0, num);
+spapr_msi_setmsg(pdev, 0, false, 0, 0);
 }
 if (msix_present(pdev)) {
-spapr_msi_setmsg(pdev, 0, true, 0, num);
+spapr_msi_setmsg(pdev, 0, true, 0, 0);
 }
 g_hash_table_remove(phb->msi, &config_addr);
 
-- 
2.5.0

[Qemu-devel] [PULL 3/8] spapr_pci: fix irq leak in RTAS ibm, change-msi

2016-02-28 Thread David Gibson

From: Greg Kurz 

This RTAS call is used to request new interrupts or to free all interrupts.

If the driver has already allocated interrupts and asks again for a non-null
number of irqs, then the rtas_ibm_change_msi() function will silently leak
the previous interrupts.

It happens because xics_free() is only called when the driver releases all
interrupts (!req_num case). Note that the previously allocated spapr_pci_msi
is not leaked because the GHashTable is created with destroy functions and
g_hash_table_insert() hence frees the old value.

This patch makes sure any previously allocated MSIs are released when a
new allocation succeeds.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 19dd6db..9b2b546 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -305,9 +305,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 return;
 }
 
+msi = (spapr_pci_msi *) g_hash_table_lookup(phb->msi, &config_addr);
+
 /* Releasing MSIs */
 if (!req_num) {
-msi = (spapr_pci_msi *) g_hash_table_lookup(phb->msi, &config_addr);
 if (!msi) {
 trace_spapr_pci_msi("Releasing wrong config", config_addr);
 rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
@@ -360,6 +361,12 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 return;
 }
 
+/* Release previous MSIs */
+if (msi) {
+xics_free(spapr->icp, msi->first_irq, msi->num);
+g_hash_table_remove(phb->msi, &config_addr);
+}
+
 /* Setup MSI/MSIX vectors in the device (via cfgspace or MSIX BAR) */
 spapr_msi_setmsg(pdev, SPAPR_PCI_MSI_WINDOW, ret_intr_type == 
RTAS_TYPE_MSIX,
  irq, req_num);
-- 
2.5.0

[Qemu-devel] [PULL 8/8] xics: report errors with the QEMU Error API

2016-02-28 Thread David Gibson

From: Greg Kurz 

Using the return value to report errors is error prone:
- xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors
  on 0
- xics_alloc_block() returns the unclear value of ics->offset - 1 on error
  but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0

This patch adds an errp argument to xics_alloc() and xics_alloc_block() to
report errors. The return value of these functions is a valid IRQ number
if errp is NULL. It is undefined otherwise.

The corresponding error traces get promotted to error messages. Note that
the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also
moves to xics_alloc(). Similar error message consolidation isn't really
applicable to xics_alloc_block() because callers have extra context (device
config address, MSI or MSIX).

This fixes the issues mentioned above.

Based on previous work from Brian W. Hart.

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/intc/xics.c| 13 +
 hw/ppc/spapr_events.c |  3 ++-
 hw/ppc/spapr_pci.c| 16 ++--
 hw/ppc/spapr_vio.c|  7 ---
 include/hw/ppc/xics.h |  5 +++--
 trace-events  |  2 --
 6 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index e66ae32..213a370 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -712,7 +712,7 @@ static int ics_find_free_block(ICSState *ics, int num, int 
alignnum)
 return -1;
 }
 
-int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi)
+int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi, Error **errp)
 {
 ICSState *ics = &icp->ics[src];
 int irq;
@@ -720,14 +720,14 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, 
bool lsi)
 if (irq_hint) {
 assert(src == xics_find_source(icp, irq_hint));
 if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
-trace_xics_alloc_failed_hint(src, irq_hint);
+error_setg(errp, "can't allocate IRQ %d: already in use", 
irq_hint);
 return -1;
 }
 irq = irq_hint;
 } else {
 irq = ics_find_free_block(ics, 1, 1);
 if (irq < 0) {
-trace_xics_alloc_failed_no_left(src);
+error_setg(errp, "can't allocate IRQ: no IRQ left");
 return -1;
 }
 irq += ics->offset;
@@ -743,7 +743,8 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, bool 
lsi)
  * Allocate block of consecutive IRQs, and return the number of the first IRQ 
in the block.
  * If align==true, aligns the first IRQ number to num.
  */
-int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align)
+int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align,
+ Error **errp)
 {
 int i, first = -1;
 ICSState *ics = &icp->ics[src];
@@ -763,6 +764,10 @@ int xics_alloc_block(XICSState *icp, int src, int num, 
bool lsi, bool align)
 } else {
 first = ics_find_free_block(ics, num, 1);
 }
+if (first < 0) {
+error_setg(errp, "can't find a free %d-IRQ block", num);
+return -1;
+}
 
 if (first >= 0) {
 for (i = first; i < first + num; ++i) {
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index f5eac4b..39f4682 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -588,7 +588,8 @@ out_no_events:
 void spapr_events_init(sPAPRMachineState *spapr)
 {
 QTAILQ_INIT(&spapr->pending_events);
-spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
+spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false,
+&error_fatal);
 spapr->epow_notifier.notify = spapr_powerdown_req;
 qemu_register_powerdown_notifier(&spapr->epow_notifier);
 spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b2b546..e8edad3 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -280,6 +280,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 PCIDevice *pdev = NULL;
 spapr_pci_msi *msi;
 int *config_addr_key;
+Error *err = NULL;
 
 switch (func) {
 case RTAS_CHANGE_MSI_FN:
@@ -354,9 +355,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 
 /* Allocate MSIs */
 irq = xics_alloc_block(spapr->icp, 0, req_num, false,
-   ret_intr_type == RTAS_TYPE_MSI);
-if (!irq) {
-error_report("Cannot allocate MSIs for device %x", config_addr);
+   ret_intr_type == RTAS_TYPE_MSI, &err);
+if (err) {
+error_reportf_err(err, "Can't allocate MSIs for device %x: ",
+  config_addr);
 rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
 return;
 }
@@ -1367,10 +1369,12 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
 /* Initialize the LSI table */
 for (i = 0; i < P

[Qemu-devel] [PULL 7/8] migration: allow machine to enforce configuration section migration

2016-02-28 Thread David Gibson

From: Greg Kurz 

Migration of pseries-2.3 doesn't have configuration section. Unfortunately,
QEMU 2.4/2.4.1/2.5 are buggy and always stream and expect the configuration
section, and break migration both ways.

This patch introduces a property which allows to enforce a configuration
section for machines who don't have one.

It can be set at startup:

-machine enforce-config-section=on

or later from the QEMU monitor:

qom-set /machine enforce-config-section on

It is up to the tooling to set or unset this property according to the
version of the QEMU at the other end of the pipe.

Signed-off-by: Greg Kurz 
Reviewed-by: Laurent Vivier 
Reviewed-by: Juan Quintela 
Signed-off-by: David Gibson 
---
 hw/core/machine.c   | 21 +
 include/hw/boards.h |  1 +
 migration/savevm.c  | 10 --
 qemu-options.hx |  3 ++-
 4 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 6d1a0d8..a8c4680 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -312,6 +312,21 @@ static bool machine_get_suppress_vmdesc(Object *obj, Error 
**errp)
 return ms->suppress_vmdesc;
 }
 
+static void machine_set_enforce_config_section(Object *obj, bool value,
+ Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+ms->enforce_config_section = value;
+}
+
+static bool machine_get_enforce_config_section(Object *obj, Error **errp)
+{
+MachineState *ms = MACHINE(obj);
+
+return ms->enforce_config_section;
+}
+
 static int error_on_sysbus_device(SysBusDevice *sbdev, void *opaque)
 {
 error_report("Option '-device %s' cannot be handled by this machine",
@@ -467,6 +482,12 @@ static void machine_initfn(Object *obj)
 object_property_set_description(obj, "suppress-vmdesc",
 "Set on to disable self-describing 
migration",
 NULL);
+object_property_add_bool(obj, "enforce-config-section",
+ machine_get_enforce_config_section,
+ machine_set_enforce_config_section, NULL);
+object_property_set_description(obj, "enforce-config-section",
+"Set on to enforce configuration section 
migration",
+NULL);
 
 /* Register notifier when init is done for sysbus sanity checks */
 ms->sysbus_notifier.notify = machine_init_notify;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index de3b3bd..b5d7eae 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -127,6 +127,7 @@ struct MachineState {
 char *firmware;
 bool iommu;
 bool suppress_vmdesc;
+bool enforce_config_section;
 
 ram_addr_t ram_size;
 ram_addr_t maxram_size;
diff --git a/migration/savevm.c b/migration/savevm.c
index b459156..96e7db5 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -878,13 +878,19 @@ bool qemu_savevm_state_blocked(Error **errp)
 return false;
 }
 
+static bool enforce_config_section(void)
+{
+MachineState *machine = MACHINE(qdev_get_machine());
+return machine->enforce_config_section;
+}
+
 void qemu_savevm_state_header(QEMUFile *f)
 {
 trace_savevm_state_header();
 qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
 qemu_put_be32(f, QEMU_VM_FILE_VERSION);
 
-if (!savevm_state.skip_configuration) {
+if (!savevm_state.skip_configuration || enforce_config_section()) {
 qemu_put_byte(f, QEMU_VM_CONFIGURATION);
 vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0);
 }
@@ -1883,7 +1889,7 @@ int qemu_loadvm_state(QEMUFile *f)
 return -ENOTSUP;
 }
 
-if (!savevm_state.skip_configuration) {
+if (!savevm_state.skip_configuration || enforce_config_section()) {
 if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) {
 error_report("Configuration section missing");
 return -EINVAL;
diff --git a/qemu-options.hx b/qemu-options.hx
index 599db94..144e6a9 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -43,7 +43,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "aes-key-wrap=on|off controls support for AES key wrapping 
(default=on)\n"
 "dea-key-wrap=on|off controls support for DEA key wrapping 
(default=on)\n"
 "suppress-vmdesc=on|off disables self-describing migration 
(default=off)\n"
-"nvdimm=on|off controls NVDIMM support (default=off)\n",
+"nvdimm=on|off controls NVDIMM support (default=off)\n"
+"enforce-config-section=on|off enforce configuration 
section migration (default=off)\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
-- 
2.5.0

[Qemu-devel] [PULL 5/8] dbdma: warn when using unassigned channel

2016-02-28 Thread David Gibson

From: Hervé Poussineau 

With this, it's easier to know if a guest uses an invalid and/or unimplemented
DMA channel.

Signed-off-by: Hervé Poussineau 
Reviewed-by: Thomas Huth 
Acked-by: Mark Cave-Ayland 
Signed-off-by: David Gibson 
---
 hw/misc/macio/mac_dbdma.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index d81dea7..6051f17 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -557,11 +557,13 @@ void DBDMA_register_channel(void *dbdma, int nchan, 
qemu_irq irq,
 
 DBDMA_DPRINTF("DBDMA_register_channel 0x%x\n", nchan);
 
+assert(rw);
+assert(flush);
+
 ch->irq = irq;
 ch->rw = rw;
 ch->flush = flush;
 ch->io.opaque = opaque;
-ch->io.channel = ch;
 }
 
 static void
@@ -775,6 +777,20 @@ static void dbdma_reset(void *opaque)
 memset(s->channels[i].regs, 0, DBDMA_SIZE);
 }
 
+static void dbdma_unassigned_rw(DBDMA_io *io)
+{
+DBDMA_channel *ch = io->channel;
+qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n",
+  __func__, ch->channel);
+}
+
+static void dbdma_unassigned_flush(DBDMA_io *io)
+{
+DBDMA_channel *ch = io->channel;
+qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n",
+  __func__, ch->channel);
+}
+
 void* DBDMA_init (MemoryRegion **dbdma_mem)
 {
 DBDMAState *s;
@@ -784,8 +800,13 @@ void* DBDMA_init (MemoryRegion **dbdma_mem)
 
 for (i = 0; i < DBDMA_CHANNELS; i++) {
 DBDMA_io *io = &s->channels[i].io;
+DBDMA_channel *ch = &s->channels[i];
 qemu_iovec_init(&io->iov, 1);
-s->channels[i].channel = i;
+
+ch->rw = dbdma_unassigned_rw;
+ch->flush = dbdma_unassigned_flush;
+ch->channel = i;
+ch->io.channel = ch;
 }
 
 memory_region_init_io(&s->mem, NULL, &dbdma_ops, s, "dbdma", 0x1000);
-- 
2.5.0

[Qemu-devel] [PULL 4/8] spapr: disable vmdesc submission for old machines

2016-02-28 Thread David Gibson

From: Greg Kurz 

Since QEMU 2.3, we have a vmdesc section in the migration stream.
This section is not mandatory but when migrating a pseries-2.2
machine from QEMU 2.2, you get a warning at the destination:

qemu-system-ppc64: Expected vmdescription section, but got 0

The warning goes away if we decide to skip vmdesc as well for
older pseries, like it is already done for pc's.

This can only be observed with -cpu POWER7 because POWER8
cannot migrate from QEMU 2.2 to 2.3 (insns_flags2 mismatch).

Signed-off-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index c119f55..3b4a557 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2452,6 +2452,7 @@ DEFINE_SPAPR_MACHINE(2_3, "2.3", false);
 static void spapr_machine_2_2_instance_options(MachineState *machine)
 {
 spapr_machine_2_3_instance_options(machine);
+machine->suppress_vmdesc = true;
 }
 
 static void spapr_machine_2_2_class_options(MachineClass *mc)
-- 
2.5.0

Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code

2016-02-28 Thread David Gibson

On Mon, Feb 29, 2016 at 03:25:24PM +1100, Alexey Kardashevskiy wrote:
> On 02/29/2016 02:45 PM, Alexey Kardashevskiy wrote:
> >On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote:
> >>On 02/26/2016 10:31 PM, David Gibson wrote:
> >>>Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure()
> >>>into rtas_ibm_configure_pe().
> >>>
> >>>Signed-off-by: David Gibson 
> >>
> >>Reviewed-by: Alexey Kardashevskiy 
> >
> >Aaaand this breaks mingw32:
> >
> >   CCppc64-softmmu/hw/ppc/spapr_pci.o
> >/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:46:24: fatal error:
> >linux/vfio.h: No such file or directory
> >compilation terminated.
> >/home/aik/p/qemu-dwg-eeh/rules.mak:57: recipe for target
> >'hw/ppc/spapr_pci.o' failed
> >make[1]: *** [hw/ppc/spapr_pci.o] Error 1
> >make[1]: *** Waiting for unfinished jobs
> >Makefile:186: recipe for target 'subdir-ppc64-softmmu' failed
> >make: *** [subdir-ppc64-softmmu] Error 2
> >make: Leaving directory '/scratch/aik/p/qemu-dwg-eeh--ppc64_mingw32-build'
> >
> >
> >
> >>
> >>>---
> >>>  hw/ppc/spapr_pci.c  | 11 +--
> >>>  hw/ppc/spapr_pci_vfio.c | 12 
> >>>  include/hw/pci-host/spapr.h |  1 -
> >>>  3 files changed, 9 insertions(+), 15 deletions(-)
> >>>
> >>>diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> >>>index d1e5222..fa633a9 100644
> >>>--- a/hw/ppc/spapr_pci.c
> >>>+++ b/hw/ppc/spapr_pci.c
> >>>@@ -42,6 +42,9 @@
> >>>  #include "hw/ppc/spapr_drc.h"
> >>>  #include "sysemu/device_tree.h"
> >>>
> >>>+#include "hw/vfio/vfio.h"
> >>>+#include 
> >>>+
> 
> This is missing:
> 
> #ifdef CONFIG_LINUX
> #include 
> #endif
> 
> and below where you use symbols from linux/vfio.h.
> 
> 
> My version of this rework did convert class callbacks to exported helpers
> and keep these helpers (plus stubs) in spapr_pci_vfio.c, with one #ifdef
> CONFIG_LINUX. Looked quite clean to me...

Yeah, good idea.  I'd forgotten the case of non-Linux builds.  I'll do
something similar in v2.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v0 2/6] spapr: CPU core device

2016-02-28 Thread Bharata B Rao

On Fri, Feb 26, 2016 at 12:13:39PM -0600, Michael Roth wrote:
> Quoting Bharata B Rao (2016-02-25 10:22:38)
> > Add sPAPR specific CPU core device that is based on generic CPU core device.
> > Creating this core device will result in creation of all the CPU thread
> > devices that are part of this core.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/Makefile.objs|   1 +
> >  hw/ppc/spapr_cpu_core.c | 210 
> > 
> >  include/hw/ppc/spapr_cpu_core.h |  32 ++
> >  3 files changed, 243 insertions(+)
> >  create mode 100644 hw/ppc/spapr_cpu_core.c
> >  create mode 100644 include/hw/ppc/spapr_cpu_core.h
> > 
> > diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> > index c1ffc77..5cc6608 100644
> > --- a/hw/ppc/Makefile.objs
> > +++ b/hw/ppc/Makefile.objs
> > @@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o
> >  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
> >  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> >  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> > +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o
> >  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
> >  obj-y += spapr_pci_vfio.o
> >  endif
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > new file mode 100644
> > index 000..c44eb61
> > --- /dev/null
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -0,0 +1,210 @@
> > +/*
> > + * sPAPR CPU core device, acts as container of CPU thread devices.
> > + *
> > + * Copyright (C) 2016 Bharata B Rao 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or 
> > later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +#include "hw/cpu/core.h"
> > +#include "hw/ppc/spapr_cpu_core.h"
> > +#include "hw/ppc/spapr.h"
> > +#include "hw/boards.h"
> > +#include "qemu/error-report.h"
> > +#include "qapi/visitor.h"
> > +#include 
> > +
> > +static int spapr_cpu_core_realize_child(Object *child, void *opaque)
> > +{
> > +Error **errp = opaque;
> > +
> > +object_property_set_bool(child, true, "realized", errp);
> > +if (*errp) {
> > +return 1;
> > +}
> > +return 0;
> > +}
> > +
> > +static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
> > +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> > +Error *local_err = NULL;
> > +
> > +if (!core->nr_threads) {
> > +error_setg(errp, "nr_threads property can't be 0");
> > +return;
> > +}
> > +
> > +if (!core->cpu_model) {
> > +error_setg(errp, "cpu_model property isn't set");
> > +return;
> > +}
> > +
> > +/*
> > + * TODO: If slot isn't specified, plug this core into
> > + * an existing empty slot.
> > + */
> > +if (!core->slot) {
> > +error_setg(errp, "slot property isn't set");
> > +return;
> > +}
> > +
> > +object_property_set_link(OBJECT(spapr), OBJECT(core), core->slot,
> > + &local_err);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> > +return;
> > +}
> > +
> > +object_child_foreach(OBJECT(dev), spapr_cpu_core_realize_child, errp);
> > +}
> > +
> > +/*
> > + * This creates the CPU threads for a given @core.
> > + *
> > + * In order to create the threads, we need two inputs - number of
> > + * threads and the cpu_model. These are set as core object's properties.
> > + * When both of them become available/set, this routine will be called from
> > + * either property's set handler to create the threads.
> > + *
> > + * TODO: Dependence of threads creation on two properties is resulting
> > + * in this not-so-clean way of creating threads from either of the
> > + * property setters based on the order in which they get set. Check if
> > + * this can be handled in a better manner.
> > + */
> > +static void spapr_cpu_core_create_threads(sPAPRCPUCore *core, Error **errp)
> > +{
> > +int i;
> > +
> > +for (i = 0; i < core->nr_threads; i++) {
> > +char id[32];
> > +char type[32];
> > +
> > +snprintf(type, sizeof(type), "%s-%s", core->cpu_model,
> > + TYPE_POWERPC_CPU);
> > +object_initialize(&core->threads[i], sizeof(core->threads[i]), 
> > type);
> > +
> > +snprintf(id, sizeof(id), "thread[%d]", i);
> > +object_property_add_child(OBJECT(core), id, 
> > OBJECT(&core->threads[i]),
> > +  errp);
> > +}
> > +}
> > +
> > +static char *spapr_cpu_core_prop_get_slot(Object *obj, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
> > +return core->slot;
> > +}
> > +
> > +static void spapr_cpu_core_prop_set_slot(Object *obj, const char *val,
> > +  Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
>

Re: [Qemu-devel] [RFC PATCH v0 5/6] qmp, spapr: Show hot-plugged/pluggable CPU slots in the Machine

2016-02-28 Thread Bharata B Rao

On Fri, Feb 26, 2016 at 08:58:05AM -0700, Eric Blake wrote:
> On 02/25/2016 09:22 AM, Bharata B Rao wrote:
> > Implement query cpu-slots that provides information about hot-plugged
> > as well as hot-pluggable CPU slots that the machine supports.
> > 
> > TODO: As Eric suggested use enum for type instead of str.
> > TODO: @hotplug-granularity probably isn't required.
> 
> I guess this is still marked TODO because the series is still RFC?

Yes.

> 
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> 
> > +++ b/qapi-schema.json
> > @@ -4083,3 +4083,88 @@
> >  ##
> >  { 'enum': 'ReplayMode',
> >'data': [ 'none', 'record', 'play' ] }
> > +
> > +##
> > +# @CPUInfo:
> > +#
> > +# Information about CPUs
> > +#
> > +# @arch-id: Arch-specific ID for the CPU.
> > +#
> > +# @type: QOM type of the CPU.
> > +#
> > +# @thread: Thread ID of the CPU.
> > +#
> > +# @core: Core ID of the CPU.
> > +#
> > +# @socket: Socket ID of the CPU.
> > +#
> > +# @node: NUMA node to which the CPU belongs.
> 
> Please add the '#optional' tag to the fields which are not always present.
> 

Sure.

> > +#
> > +# @qom-path: QOM path of the CPU object
> > +#
> > +# Since: 2.6
> > +##
> > +
> > +{ 'struct': 'CPUInfo',
> > +  'data': { 'arch-id': 'int',
> > +'type': 'str',
> 
> The TODO in the commit message mentions that this should be converted to
> an enum.
> 
> > +'*thread': 'int',
> > +'*core': 'int',
> > +'*socket' : 'int',
> > +'*node' : 'int',
> > +'*qom-path': 'str'
> > +  }
> 
> But looking better than the previous round.

Thanks for the review. Do you have any comments on the applicability/suitability
of this interface from libvirt point of view for performing device_add based
CPU hotplug ?

Regards,
Bharata.

Re: [Qemu-devel] [RFC PATCH v0 2/6] spapr: CPU core device

2016-02-28 Thread Bharata B Rao

On Fri, Feb 26, 2016 at 11:46:19AM +0100, Thomas Huth wrote:
> On 25.02.2016 17:22, Bharata B Rao wrote:
> > Add sPAPR specific CPU core device that is based on generic CPU core device.
> > Creating this core device will result in creation of all the CPU thread
> > devices that are part of this core.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> ...
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > new file mode 100644
> > index 000..c44eb61
> > --- /dev/null
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -0,0 +1,210 @@
> > +/*
> > + * sPAPR CPU core device, acts as container of CPU thread devices.
> > + *
> > + * Copyright (C) 2016 Bharata B Rao 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or 
> > later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +#include "hw/cpu/core.h"
> > +#include "hw/ppc/spapr_cpu_core.h"
> > +#include "hw/ppc/spapr.h"
> > +#include "hw/boards.h"
> > +#include "qemu/error-report.h"
> > +#include "qapi/visitor.h"
> > +#include 
> > +
> > +static int spapr_cpu_core_realize_child(Object *child, void *opaque)
> > +{
> > +Error **errp = opaque;
> > +
> > +object_property_set_bool(child, true, "realized", errp);
> > +if (*errp) {
> > +return 1;
> > +}
> > +return 0;
> > +}
> > +
> > +static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
> > +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> > +Error *local_err = NULL;
> > +
> > +if (!core->nr_threads) {
> > +error_setg(errp, "nr_threads property can't be 0");
> > +return;
> > +}
> > +
> > +if (!core->cpu_model) {
> > +error_setg(errp, "cpu_model property isn't set");
> > +return;
> > +}
> > +
> > +/*
> > + * TODO: If slot isn't specified, plug this core into
> > + * an existing empty slot.
> > + */
> > +if (!core->slot) {
> > +error_setg(errp, "slot property isn't set");
> > +return;
> > +}
> > +
> > +object_property_set_link(OBJECT(spapr), OBJECT(core), core->slot,
> > + &local_err);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> > +return;
> > +}
> > +
> > +object_child_foreach(OBJECT(dev), spapr_cpu_core_realize_child, errp);
> > +}
> > +
> > +/*
> > + * This creates the CPU threads for a given @core.
> > + *
> > + * In order to create the threads, we need two inputs - number of
> > + * threads and the cpu_model. These are set as core object's properties.
> > + * When both of them become available/set, this routine will be called from
> > + * either property's set handler to create the threads.
> > + *
> > + * TODO: Dependence of threads creation on two properties is resulting
> > + * in this not-so-clean way of creating threads from either of the
> > + * property setters based on the order in which they get set. Check if
> > + * this can be handled in a better manner.
> > + */
> > +static void spapr_cpu_core_create_threads(sPAPRCPUCore *core, Error **errp)
> > +{
> > +int i;
> > +
> > +for (i = 0; i < core->nr_threads; i++) {
> > +char id[32];
> > +char type[32];
> > +
> > +snprintf(type, sizeof(type), "%s-%s", core->cpu_model,
> > + TYPE_POWERPC_CPU);
> > +object_initialize(&core->threads[i], sizeof(core->threads[i]), 
> > type);
> > +
> > +snprintf(id, sizeof(id), "thread[%d]", i);
> > +object_property_add_child(OBJECT(core), id, 
> > OBJECT(&core->threads[i]),
> > +  errp);
> 
> Need to check errp here to see whether something went wrong?

Yes, I will use local_err and them propagate.

> 
> > +}
> > +}
> > +
> > +static char *spapr_cpu_core_prop_get_slot(Object *obj, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
> > +return core->slot;
> > +}
> > +
> > +static void spapr_cpu_core_prop_set_slot(Object *obj, const char *val,
> > +  Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
> > +core->slot = g_strdup(val);
> > +}
> > +
> > +static char *spapr_cpu_core_prop_get_cpu_model(Object *obj, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
> > +return core->cpu_model;
> > +}
> > +
> > +static void spapr_cpu_core_prop_set_cpu_model(Object *obj, const char *val,
> > +  Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +MachineState *machine = MACHINE(qdev_get_machine());
> > +
> > +/*
> > + * cpu_model can't be different from what is specified with -cpu
> > + */
> > +if (strcmp(val, machine->cpu_model)) {
> > +   error_setg(errp, "cpu_model should be %s", machine->cpu_model);
> 
> s/should/must/

sure :)

> 
> > +   return

Re: [Qemu-devel] [RFC PATCH v2 1/3] vGPU Core driver

2016-02-28 Thread Tian, Kevin

> From: Kirti Wankhede
> Sent: Wednesday, February 24, 2016 12:24 AM
> 
> Design for vGPU Driver:
> Main purpose of vGPU driver is to provide a common interface for vGPU
> management that can be used by differnt GPU drivers.
> 
> This module would provide a generic interface to create the device, add
> it to vGPU bus, add device to IOMMU group and then add it to vfio group.
> 
> High Level block diagram:
> 
> +--+vgpu_register_driver()+---+
> | __init() +->+   |
> |  |  |   |
> |  +<-+vgpu.ko|
> | vgpu_vfio.ko |   probe()/remove()   |   |
> |  |+-+   +-+
> +--+| +---+---+ |
> | ^ |
> | callback| |
> | +---++|
> | |vgpu_register_device()   |
> | |||
> +---^-+-++-+--+-+
> | nvidia.ko ||  i915.ko   |
> |   |||
> +---+++
> 
> vGPU driver provides two types of registration interfaces:
> 1. Registration interface for vGPU bus driver:
> 
> /**
>   * struct vgpu_driver - vGPU device driver
>   * @name: driver name
>   * @probe: called when new device created
>   * @remove: called when device removed
>   * @driver: device driver structure
>   *
>   **/
> struct vgpu_driver {
>  const char *name;
>  int  (*probe)  (struct device *dev);
>  void (*remove) (struct device *dev);
>  struct device_driverdriver;
> };
> 
> int  vgpu_register_driver(struct vgpu_driver *drv, struct module *owner);
> void vgpu_unregister_driver(struct vgpu_driver *drv);
> 
> VFIO bus driver for vgpu, should use this interface to register with
> vGPU driver. With this, VFIO bus driver for vGPU devices is responsible
> to add vGPU device to VFIO group.
> 
> 2. GPU driver interface
> GPU driver interface provides GPU driver the set APIs to manage GPU driver
> related work in their own driver. APIs are to:
> - vgpu_supported_config: provide supported configuration list by the GPU.
> - vgpu_create: to allocate basic resouces in GPU driver for a vGPU device.
> - vgpu_destroy: to free resources in GPU driver during vGPU device destroy.
> - vgpu_start: to initiate vGPU initialization process from GPU driver when VM
>   boots and before QEMU starts.
> - vgpu_shutdown: to teardown vGPU resources during VM teardown.
> - read : read emulation callback.
> - write: write emulation callback.
> - vgpu_set_irqs: send interrupt configuration information that QEMU sets.
> - vgpu_bar_info: to provice BAR size and its flags for the vGPU device.
> - validate_map_request: to validate remap pfn request.
> 
> This registration interface should be used by GPU drivers to register
> each physical device to vGPU driver.
> 
> Updated this patch with couple of more functions in GPU driver interface
> which were discussed during v1 version of this RFC.
> 
> Thanks,
> Kirti.
> 
> Signed-off-by: Kirti Wankhede 
> Signed-off-by: Neo Jia 

Hi, Kirti/Neo,

Thanks a lot for you updated version. Having not looked into detail
code, first come with some high level comments.

First, in a glimpse the majority of the code (possibly >95%) is device
agnostic, though we call it vgpu today. Just thinking about the
extensibility and usability of this framework, would it be better to 
name it in a way that any other type of I/O device can be fit into 
this framework? I don't have a good idea of the name now, but 
a simple idea is to replace vgpu with vdev (vdev-core, vfio-vdev,
vfio-iommu-type1-vdev, etc.), and then underlying GPU drivers are
just one category of users of this general vdev framework. In the
future it's easily extended to support other I/O virtualization based 
on similar vgpu concept;

Second, are these 3 patches already working with nvidia device,
or are they just conceptual implementation w/o completing actual
test yet? We'll start moving our implementation toward this direction
too, so would be good to know the current status and how we can
further cooperate to move forward. Based on that we can start 
giving more comments on next level detail.

Thanks
Kevin

Re: [Qemu-devel] [RFC PATCH v0 3/6] spapr: Represent boot CPUs as spapr-cpu-core devices

2016-02-28 Thread Bharata B Rao

On Fri, Feb 26, 2016 at 04:18:57PM +0100, Igor Mammedov wrote:
> On Thu, 25 Feb 2016 21:52:39 +0530
> Bharata B Rao  wrote:
> 
> > Initialize boot CPUs as spapr-cpu-core devices and create links from
> > machine object to these core devices. These links can be considered
> > as CPU slots in which core devices will get hot-plugged. spapr-cpu-core
> > device's slot property indicates the slot where it is plugged. Information
> > about all the CPU slots can be obtained by walking these links.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/spapr.c | 65 
> > +++---
> >  include/hw/ppc/spapr.h |  3 +++
> >  2 files changed, 60 insertions(+), 8 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index e214a34..1f0d232 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -64,6 +64,7 @@
> >  
> >  #include "hw/compat.h"
> >  #include "qemu-common.h"
> > +#include "hw/ppc/spapr_cpu_core.h"
> >  
> >  #include 
> >  
> > @@ -1720,6 +1721,21 @@ static void spapr_validate_node_memory(MachineState 
> > *machine, Error **errp)
> >  }
> >  }
> >  
> > +/*
> > + * Check to see if core is being hot-plugged into an already populated 
> > slot.
> > + */
> > +static void spapr_cpu_core_allow_set_link(Object *obj, const char *name,
> > +  Object *val, Error **errp)
> > +{
> > +Object *core = object_property_get_link(qdev_get_machine(), name, 
> > NULL);
> > +
> > +if (core) {
> > +char *path = object_get_canonical_path(core);
> > +error_setg(errp, "Slot %s already populated with %s", name, path);
> > +g_free(path);
> > +}
> > +}
> > +
> >  /* pSeries LPAR / sPAPR hardware init */
> >  static void ppc_spapr_init(MachineState *machine)
> >  {
> > @@ -1728,7 +1744,6 @@ static void ppc_spapr_init(MachineState *machine)
> >  const char *kernel_filename = machine->kernel_filename;
> >  const char *kernel_cmdline = machine->kernel_cmdline;
> >  const char *initrd_filename = machine->initrd_filename;
> > -PowerPCCPU *cpu;
> >  PCIHostState *phb;
> >  int i;
> >  MemoryRegion *sysmem = get_system_memory();
> > @@ -1742,6 +1757,8 @@ static void ppc_spapr_init(MachineState *machine)
> >  long load_limit, fw_size;
> >  bool kernel_le = false;
> >  char *filename;
> > +int spapr_cores = smp_cpus / smp_threads;
> > +int spapr_max_cores = max_cpus / smp_threads;
> >  
> >  msi_supported = true;
> >  
> > @@ -1800,13 +1817,38 @@ static void ppc_spapr_init(MachineState *machine)
> >  if (machine->cpu_model == NULL) {
> >  machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
> >  }
> > -for (i = 0; i < smp_cpus; i++) {
> > -cpu = cpu_ppc_init(machine->cpu_model);
> > -if (cpu == NULL) {
> > -error_report("Unable to find PowerPC CPU definition");
> > -exit(1);
> > +
> > +spapr->cores = g_malloc0(spapr_max_cores * sizeof(Object));
> souldn't it be sizeof(Object *) 

Yes, I meant to store the pointers to Object here, will change.

> 
> > +
> > +for (i = 0; i < spapr_max_cores; i++) {
> > +Object *spapr_cpu_core  = object_new(TYPE_SPAPR_CPU_CORE);
> you allocate spapr_max_cores cores but set links to only to spapr_cores only,
> it looks like all spapr_cpu_core objects are leaked for range 
> spapr_cores..spapr_max_cores
> 

Yes, as I replied in an ealier thread, the intention is to create links
for all possible cores, but create only those many cores required for CPUs
specified with -smp from machine init. The links will be set only for those
cores. Hotplugged cores will get created and the links will be set for them
during device_add. So the changed code now looks like this:

for (i = 0; i < spapr_max_cores; i++) {
char name[32];

/*
 * Create links from machine objects to all possible cores.
 */
snprintf(name, sizeof(name), "%s[%d]", SPAPR_MACHINE_CPU_CORE_PROP, i);
object_property_add_link(OBJECT(spapr), name, TYPE_SPAPR_CPU_CORE,
 (Object **)&spapr->cores[i],
 spapr_cpu_core_allow_set_link, 0,
 &error_fatal);

/*
 * Create cores and set link from machine object to core object for
 * boot time CPUs and realize them.
 */
if (i < spapr_cores) {
Object *core  = object_new(TYPE_SPAPR_CPU_CORE);

object_property_set_str(core, machine->cpu_model, "cpu_model",
&error_fatal);
object_property_set_int(core, smp_threads, "nr_threads",
&error_fatal);
object_property_set_str(core, name, SPAPR_CPU_CORE_SLOT_PROP,
&error_fatal);
object_property_set_bool(core, true, "realized", &error_fatal);
}
}

> 
>

Re: [Qemu-devel] [PATCH V7 0/2] net/filter-mirror:add filter-mirror and unit test

2016-02-28 Thread Jason Wang



On 02/26/2016 04:12 PM, Zhang Chen wrote:
> Filter-mirror is a netfilter plugin.
> It gives qemu the ability to mirror
> packets to a chardev.
>
> v7:
>  - fix mktemp() to mkstemp()
>
> v6:
>  - Address Jason's comments.
>
> v5:
>  - Address Jason's comments.
>
> v4:
>  - Address Jason's comments.
>
> v3:
>  - Add filter-mirror unit test according
>to Jason's comments
>  - Address zhanghailiang's comments.
>  - Address Jason's comments.
>
> v2:
>  - Address zhanghailiang's comments.
>  - Address Eric Blake's comments.
>  - Address Yang Hongyang's comments.
>  - Address Dave's comments.
>
> v1:
>  initial patch.
>
>
> Zhang Chen (2):
>   net/filter-mirror:Add filter-mirror
>   tests/test-filter-mirror:add filter-mirror unit test
>
>  net/Makefile.objs  |   1 +
>  net/filter-mirror.c| 181 
> +
>  qemu-options.hx|   5 ++
>  tests/.gitignore   |   1 +
>  tests/Makefile |   2 +
>  tests/test-filter-mirror.c |  89 ++
>  vl.c   |   3 +-
>  7 files changed, 281 insertions(+), 1 deletion(-)
>  create mode 100644 net/filter-mirror.c
>  create mode 100644 tests/test-filter-mirror.c
>

Applied to -net.

Thanks

Re: [Qemu-devel] Installing guest OS in VM

2016-02-28 Thread Fam Zheng

On Sun, 02/28 10:37, Sarah Khan wrote:
> Hi,
> I have been looking forward to participate in outreachy round 12.So, I was
> trying to install guest OS in VM according to instructions as given on this
> link:
> http://wiki.qemu-project.org/Hosts/Linux but I am stuck after the
> instruction which tells to boot PC-BIOS which is
> 
> bin/debug/native/x86_64-softmmu/qemu-system-x86_64 -L pc-bios
> 
> After running this I get:VNC server running on '127.0.0.1;5900'and nothing.

If you recompile with sdl or gtk support, you will get a window showing the
guest console. It looks like you have neither, but you can still use "vncviewer
127.0.0.1:5900" to connect to the vnc server.

Fam

> 
> Can someone suggest that I am going correct or not? Also after this
> how to go about this command:
> ./qemu-img create -f qcow2 test.qcow2 16G
> 
> 
> 
> Thanks,
> 
> Sarah

[Qemu-devel] Installing guest OS in VM

2016-02-28 Thread Sarah Khan

Hi,
I have been looking forward to participate in outreachy round 12.So, I was
trying to install guest OS in VM according to instructions as given on this
link:
http://wiki.qemu-project.org/Hosts/Linux but I am stuck after the
instruction which tells to boot PC-BIOS which is

bin/debug/native/x86_64-softmmu/qemu-system-x86_64 -L pc-bios

After running this I get:VNC server running on '127.0.0.1;5900'and nothing.

Can someone suggest that I am going correct or not? Also after this
how to go about this command:
./qemu-img create -f qcow2 test.qcow2 16G



Thanks,

Sarah

Re: [Qemu-devel] [RFC PATCH v0 2/6] spapr: CPU core device

2016-02-28 Thread David Gibson

On Fri, Feb 26, 2016 at 12:13:39PM -0600, Michael Roth wrote:
> Quoting Bharata B Rao (2016-02-25 10:22:38)
> > Add sPAPR specific CPU core device that is based on generic CPU core device.
> > Creating this core device will result in creation of all the CPU thread
> > devices that are part of this core.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/Makefile.objs|   1 +
> >  hw/ppc/spapr_cpu_core.c | 210 
> > 
> >  include/hw/ppc/spapr_cpu_core.h |  32 ++
> >  3 files changed, 243 insertions(+)
> >  create mode 100644 hw/ppc/spapr_cpu_core.c
> >  create mode 100644 include/hw/ppc/spapr_cpu_core.h
> > 
> > diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> > index c1ffc77..5cc6608 100644
> > --- a/hw/ppc/Makefile.objs
> > +++ b/hw/ppc/Makefile.objs
> > @@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o
> >  obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o
> >  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
> >  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
> > +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o
> >  ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
> >  obj-y += spapr_pci_vfio.o
> >  endif
> > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> > new file mode 100644
> > index 000..c44eb61
> > --- /dev/null
> > +++ b/hw/ppc/spapr_cpu_core.c
> > @@ -0,0 +1,210 @@
> > +/*
> > + * sPAPR CPU core device, acts as container of CPU thread devices.
> > + *
> > + * Copyright (C) 2016 Bharata B Rao 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or 
> > later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +#include "hw/cpu/core.h"
> > +#include "hw/ppc/spapr_cpu_core.h"
> > +#include "hw/ppc/spapr.h"
> > +#include "hw/boards.h"
> > +#include "qemu/error-report.h"
> > +#include "qapi/visitor.h"
> > +#include 
> > +
> > +static int spapr_cpu_core_realize_child(Object *child, void *opaque)
> > +{
> > +Error **errp = opaque;
> > +
> > +object_property_set_bool(child, true, "realized", errp);
> > +if (*errp) {
> > +return 1;
> > +}
> > +return 0;
> > +}
> > +
> > +static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
> > +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> > +Error *local_err = NULL;
> > +
> > +if (!core->nr_threads) {
> > +error_setg(errp, "nr_threads property can't be 0");
> > +return;
> > +}
> > +
> > +if (!core->cpu_model) {
> > +error_setg(errp, "cpu_model property isn't set");
> > +return;
> > +}
> > +
> > +/*
> > + * TODO: If slot isn't specified, plug this core into
> > + * an existing empty slot.
> > + */
> > +if (!core->slot) {
> > +error_setg(errp, "slot property isn't set");
> > +return;
> > +}
> > +
> > +object_property_set_link(OBJECT(spapr), OBJECT(core), core->slot,
> > + &local_err);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> > +return;
> > +}
> > +
> > +object_child_foreach(OBJECT(dev), spapr_cpu_core_realize_child, errp);
> > +}
> > +
> > +/*
> > + * This creates the CPU threads for a given @core.
> > + *
> > + * In order to create the threads, we need two inputs - number of
> > + * threads and the cpu_model. These are set as core object's properties.
> > + * When both of them become available/set, this routine will be called from
> > + * either property's set handler to create the threads.
> > + *
> > + * TODO: Dependence of threads creation on two properties is resulting
> > + * in this not-so-clean way of creating threads from either of the
> > + * property setters based on the order in which they get set. Check if
> > + * this can be handled in a better manner.
> > + */
> > +static void spapr_cpu_core_create_threads(sPAPRCPUCore *core, Error **errp)
> > +{
> > +int i;
> > +
> > +for (i = 0; i < core->nr_threads; i++) {
> > +char id[32];
> > +char type[32];
> > +
> > +snprintf(type, sizeof(type), "%s-%s", core->cpu_model,
> > + TYPE_POWERPC_CPU);
> > +object_initialize(&core->threads[i], sizeof(core->threads[i]), 
> > type);
> > +
> > +snprintf(id, sizeof(id), "thread[%d]", i);
> > +object_property_add_child(OBJECT(core), id, 
> > OBJECT(&core->threads[i]),
> > +  errp);
> > +}
> > +}
> > +
> > +static char *spapr_cpu_core_prop_get_slot(Object *obj, Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
> > +return core->slot;
> > +}
> > +
> > +static void spapr_cpu_core_prop_set_slot(Object *obj, const char *val,
> > +  Error **errp)
> > +{
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj);
> > +
>

[Qemu-devel] [Bug 893208] Re: qemu on ARM hosts can't boot i386 image

2016-02-28 Thread Zack Callendish

It doesn't work on my XU4 either. The supported virtualization would
probably work for ARM images but it's not something many people need.

What's the holdup, dear devs?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/893208

Title:
  qemu on ARM hosts can't boot i386 image

Status in QEMU:
  Confirmed
Status in Linaro QEMU:
  New

Bug description:
  If you apply some workarounds for bug 870990, bug 883133 and bug
  883136 QEMU still cannot boot the i386
  debian_squeeze_i386_standard.qcow2 image from
  http://people.debian.org/~aurel32/qemu/i386/ -- grub starts to boot
  but something causes the system to reset just before display of the
  blue-background grub menu, so we go round in a loop forever. This
  image boots OK on i386 hosted qemu so this indicates some kind of ARM-
  host specific bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/893208/+subscriptions

Re: [Qemu-devel] [PATCH 02/12] spapr_pci: Switch to vfio_eeh_as_op() interface

2016-02-28 Thread David Gibson

On Mon, Feb 29, 2016 at 12:43:05PM +1100, Alexey Kardashevskiy wrote:
> On 02/26/2016 10:31 PM, David Gibson wrote:
> >This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
> >broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
> >interface.
> >
> >Signed-off-by: David Gibson mak
> 
> Where is  that "mak" from? :)

Oops, no idea.  Fixed.

> Reviewed-by: Alexey Kardashevskiy 
> 
> 
> >---
> >  hw/ppc/spapr_pci_vfio.c | 50 
> > -
> >  1 file changed, 16 insertions(+), 34 deletions(-)
> >
> >diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
> >index 2f3752e..b1e8e8e 100644
> >--- a/hw/ppc/spapr_pci_vfio.c
> >+++ b/hw/ppc/spapr_pci_vfio.c
> >@@ -73,15 +73,9 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
> >*sphb, Error **errp)
> >  spapr_tce_get_iommu(tcet));
> >  }
> >
> >-static void spapr_phb_vfio_eeh_reenable(sPAPRPHBVFIOState *svphb)
> >+static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb)
> >  {
> >-struct vfio_eeh_pe_op op = {
> >-.argsz = sizeof(op),
> >-.op= VFIO_EEH_PE_ENABLE
> >-};
> >-
> >-vfio_container_ioctl(&svphb->phb.iommu_as,
> >- svphb->iommugroupid, VFIO_EEH_PE_OP, &op);
> >+vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE);
> >  }
> >
> >  static void spapr_phb_vfio_reset(DeviceState *qdev)
> >@@ -92,19 +86,18 @@ static void spapr_phb_vfio_reset(DeviceState *qdev)
> >   * ensures that the contained PCI devices will work properly
> >   * after reboot.
> >   */
> >-spapr_phb_vfio_eeh_reenable(SPAPR_PCI_VFIO_HOST_BRIDGE(qdev));
> >+spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
> >  }
> >
> >  static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
> >   unsigned int addr, int option)
> >  {
> >-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> >-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
> >+uint32_t op;
> >  int ret;
> >
> >  switch (option) {
> >  case RTAS_EEH_DISABLE:
> >-op.op = VFIO_EEH_PE_DISABLE;
> >+op = VFIO_EEH_PE_DISABLE;
> >  break;
> >  case RTAS_EEH_ENABLE: {
> >  PCIHostState *phb;
> >@@ -122,21 +115,20 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState 
> >*sphb,
> >  return RTAS_OUT_PARAM_ERROR;
> >  }
> >
> >-op.op = VFIO_EEH_PE_ENABLE;
> >+op = VFIO_EEH_PE_ENABLE;
> >  break;
> >  }
> >  case RTAS_EEH_THAW_IO:
> >-op.op = VFIO_EEH_PE_UNFREEZE_IO;
> >+op = VFIO_EEH_PE_UNFREEZE_IO;
> >  break;
> >  case RTAS_EEH_THAW_DMA:
> >-op.op = VFIO_EEH_PE_UNFREEZE_DMA;
> >+op = VFIO_EEH_PE_UNFREEZE_DMA;
> >  break;
> >  default:
> >  return RTAS_OUT_PARAM_ERROR;
> >  }
> >
> >-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
> >-   VFIO_EEH_PE_OP, &op);
> >+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
> >  if (ret < 0) {
> >  return RTAS_OUT_HW_ERROR;
> >  }
> >@@ -146,13 +138,9 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState 
> >*sphb,
> >
> >  static int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state)
> >  {
> >-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> >-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
> >  int ret;
> >
> >-op.op = VFIO_EEH_PE_GET_STATE;
> >-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
> >-   VFIO_EEH_PE_OP, &op);
> >+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE);
> >  if (ret < 0) {
> >  return RTAS_OUT_PARAM_ERROR;
> >  }
> >@@ -206,28 +194,26 @@ static void spapr_phb_vfio_eeh_pre_reset(sPAPRPHBState 
> >*sphb)
> >
> >  static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option)
> >  {
> >-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
> >-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
> >+uint32_t op;
> >  int ret;
> >
> >  switch (option) {
> >  case RTAS_SLOT_RESET_DEACTIVATE:
> >-op.op = VFIO_EEH_PE_RESET_DEACTIVATE;
> >+op = VFIO_EEH_PE_RESET_DEACTIVATE;
> >  break;
> >  case RTAS_SLOT_RESET_HOT:
> >  spapr_phb_vfio_eeh_pre_reset(sphb);
> >-op.op = VFIO_EEH_PE_RESET_HOT;
> >+op = VFIO_EEH_PE_RESET_HOT;
> >  break;
> >  case RTAS_SLOT_RESET_FUNDAMENTAL:
> >  spapr_phb_vfio_eeh_pre_reset(sphb);
> >-op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
> >+op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
> >  break;
> >  default:
> >  return RTAS_OUT_PARAM_ERROR;
> >  }
> >
> >-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
> >-   VFIO_EEH_PE_OP, &op);
> >+ret =

Re: [Qemu-devel] [PATCH 01/12] vfio: Start improving VFIO/EEH interface

2016-02-28 Thread David Gibson

On Mon, Feb 29, 2016 at 11:58:54AM +1100, Alexey Kardashevskiy wrote:
> On 02/26/2016 10:31 PM, David Gibson wrote:
> >At present the code handling IBM's Enhanced Error Handling (EEH) interface
> >on VFIO devices operates by bypassing the usual VFIO logic with
> >vfio_container_ioctl().  That's a poorly designed interface with unclear
> >semantics about exactly what can be operated on.
> >
> >In particular it operates on a single vfio container internally (hence the
> >name), but takes an address space and group id, from which it deduces the
> >container in a rather roundabout way.  groupids are something that code
> >outside vfio shouldn't even be aware of.
> >
> >This patch creates new interfaces for EEH operations.  Internally we
> >have vfio_eeh_container_op() which takes a VFIOContainer object
> >directly.  For external use we have vfio_eeh_as_ok() which determines
> >if an AddressSpace is usable for EEH (at present this means it has a
> >single container and at most a single group attached), and
> >vfio_eeh_as_op() which will perform an operation on an AddressSpace in
> >the unambiguous case, and otherwise returns an error.
> >
> >This interface still isn't great, but it's enough of an improvement to
> >allow a number of cleanups in other places.
> >
> >Signed-off-by: David Gibson 
> >---
> >  hw/vfio/common.c   | 77 
> > ++
> >  include/hw/vfio/vfio.h |  2 ++
> >  2 files changed, 79 insertions(+)
> >
> >diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >index 607ec70..e419241 100644
> >--- a/hw/vfio/common.c
> >+++ b/hw/vfio/common.c
> >@@ -1003,3 +1003,80 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
> >groupid,
> >
> >  return vfio_container_do_ioctl(as, groupid, req, param);
> >  }
> >+
> >+/*
> >+ * Interfaces for IBM EEH (Enhanced Error Handling)
> >+ */
> >+static bool vfio_eeh_container_ok(VFIOContainer *container)
> >+{
> >+/* A broken kernel implementation means EEH operations won't work
> >+ * correctly if there are multiple groups in a container */
> >+
> >+if (!QLIST_EMPTY(&container->group_list)
> >+&& QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) 
> >{
> >+return false;
> >+}
> >+
> >+return true;
> 
> If &container->group_list is empty, this helper returns "true". Does not
> look right, does it?...

Hmm.. my thinking was that EEH was safe (if a no-op) on a container
with no groups.  But, thinking about it again I'm not sure that the
state of previous EEH ops will get transferred to a group added to the
container later.  So I think returning false on an empty container
probably is safer.

I'll change it.

> >+}
> >+
> >+static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op)
> >+{
> >+struct vfio_eeh_pe_op pe_op = {
> >+.argsz = sizeof(pe_op),
> >+.op = op,
> >+};
> >+int ret;
> >+
> >+if (!vfio_eeh_container_ok(container)) {
> >+error_report("vfio/eeh: EEH_PE_OP 0x%x called on container"
> >+ " with multiple groups", op);
> >+return -EPERM;
> >+}
> >+
> >+ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op);
> >+if (ret < 0) {
> >+error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op);
> >+return -errno;
> >+}
> >+
> >+return 0;
> >+}
> >+
> >+static VFIOContainer *vfio_eeh_as_container(AddressSpace *as)
> >+{
> >+VFIOAddressSpace *space = vfio_get_address_space(as);
> >+VFIOContainer *container = NULL;
> >+
> >+if (QLIST_EMPTY(&space->containers)) {
> >+/* No containers to act on */
> >+goto out;
> >+}
> >+
> >+container = QLIST_FIRST(&space->containers);
> >+
> >+if (QLIST_NEXT(container, next)) {
> >+/* We don't yet have logic to synchronize EEH state across
> >+ * multiple containers */
> >+container = NULL;
> >+goto out;
> >+}
> >+
> >+out:
> >+vfio_put_address_space(space);
> >+return container;
> >+}
> >+
> >+bool vfio_eeh_as_ok(AddressSpace *as)
> >+{
> >+VFIOContainer *container = vfio_eeh_as_container(as);
> >+
> >+return (container != NULL) && vfio_eeh_container_ok(container);
> >+}
> >+
> >+int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
> >+{
> >+VFIOContainer *container = vfio_eeh_as_container(as);
> >+
> >+return vfio_eeh_container_op(container, op);
> 
> 
> vfio_eeh_as_ok() checks for (container != NULL) but this one does not,
> should not it?

Well.. you're not supposed to call as_op() if as_ok() returned false,
so it should be safe.  I'll add an assert to make this clearer.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [Qemu-ppc] Migrating decrementer

2016-02-28 Thread David Gibson

On Fri, Feb 26, 2016 at 12:29:51PM +, Mark Cave-Ayland wrote:
> On 26/02/16 04:35, David Gibson wrote:
> 
> >> Sign. And let me try that again, this time after caffeine:
> >>
> >> cpu_start/resume():
> >> cpu->tb_env->tb_offset =
> >> muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL),
> >>  cpu->tb_env->tb_freq, NANOSECONDS_PER_SECOND) +
> >> cpu->tb_env->tb_offset -
> >> cpu_get_host_ticks();
> >>
> >> This should translate to: at CPU start, calculate the difference between
> >> the current guest virtual timebase and the host timebase, storing the
> >> difference in cpu->tb_env->tb_offset.
> > 
> > Ummm... I think that's right.  Except that you need to make sure you
> > calculate the tb_offset just once, and set the same value to all guest
> > CPUs.  Otherwise the guest TBs may be slightly out of sync with each
> > other, which is bad (the host should have already ensure that all host
> > TBs are in sync with each other).
> 
> Nods. The reason I really like this solution is because it correctly
> handles both paused/live machines and simplifies the migration logic
> significantly. In fact, with this solution the only information you
> would need in vmstate_ppc_timebase for migration would be the current
> tb_offset so the receiving host can calculate the guest timebase from
> the virtual clock accordingly.

> > We really should make helper routines that each Power machine type can
> > use for this.  Unfortunately we can't put it directly into the common
> > ppc cpu migration code because of the requirement to keep the TBs
> > synced across the machine.
> 
> Effectively I believe there are 2 cases here: TCG and KVM. For TCG all
> of this is a no-op since tb/decr are already derived from the virtual
> clock + tb_offset already so that really just leaves KVM.
> 
> >From what you're saying we would need 2 hooks for KVM here: one on
> machine start/resume which updates tb_offset for all vCPUs and one for
> vCPU resume which updates just that one particular vCPU.
> 
> Just curious as to your comment regarding helper routines for each Power
> machine type - is there any reason not to enable this globally for all
> Power machine types? If tb_offset isn't supported by the guest then the
> problem with not being able to handle a jump in timebase post-migration
> still remains exactly the same.

Well, I can't see a place to put it globally.  We can't put it in the
general vCPU stuff, because that would migrate each CPU's timebase
independently, but we want to migrate as a system wide operation, to
ensure the TBs are all synchronized in the destination guest.

To do the platform wide stuff, it pretty much has to be in the machine
type.

> The other question of course relates to the reason this thread was
> started which is will this approach still support migrating the
> decrementer? My feeling is that this would still work in that we would
> encode the number of ticks before the decrementer reaches its interrupt
> value into vmstate_ppc_timebase, whether high or low. For TCG it is easy
> to ensure that the decrementer will still fire in n ticks time
> post-migration (which solves my particular use case), but I don't have a
> feeling as to whether this is possible, or indeed desirable for KVM.

Yes, for TCG it should be fairly straightforward.  The DECR should be
calculated from the timebase.  We do need to check it on incoming
migration though, and check when we need to refire the decrementer
interrupt.

For KVM we'll need to load an appropriate value into the real
decrementer.  We probably want to migrate a difference between the TB
and the decrementer.  What could get hairy here is that there are a
number of different variants between ppc models on how exactly the
decrementer interrupt triggers: is it edge-triggered on 1->0
transition, edge-triggered on 0->-1 transition, or level triggered on
the DECR's sign bit.  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 10/12] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge

2016-02-28 Thread David Gibson

On Mon, Feb 29, 2016 at 12:42:28PM +1100, Alexey Kardashevskiy wrote:
> On 02/26/2016 10:32 PM, David Gibson wrote:
> >Now that the regular spapr-pci-host-bridge can handle EEH, there are only
> >two things that spapr-pci-vfio-host-bridge does differently:
> > 1. automatically sizes its DMA window to match the host IOMMU
> > 2. checks if the attached VFIO container is backed by the
> >VFIO_SPAPR_TCE_IOMMU type on the host
> >
> >(1) is not particularly useful, since the default window used by the
> >regular host bridge will work with the host IOMMU configuration on all
> >current systems anyway.
> >
> >Plus, automatically changing guest visible configuration (such as the DMA
> >window) based on host settings is generally a bad idea.
> 
> btw why exactly is it a bad idea?

So, as noted, in this case it's probably ok.  But in general changing
configuration based on host state will break migration, because the
source and destination VMs, with apparently identical setup won't
actually be the same.

It also makes it more difficult for management layers to know exactly
what they're constructing, which is potentially an issue even for this
non-migratable device.
> 
> 
> Anyway,
> Reviewed-by: Alexey Kardashevskiy 
> 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH v0 4/6] spapr: CPU hotplug support

2016-02-28 Thread Bharata B Rao

On Fri, Feb 26, 2016 at 02:51:41PM +1100, David Gibson wrote:
> On Thu, Feb 25, 2016 at 09:52:40PM +0530, Bharata B Rao wrote:
> > Set up device tree entries for the hotplugged CPU core and use the
> > exising EPOW event infrastructure to send CPU hotplug notification to
> > the guest.
> > 
> > Signed-off-by: Bharata B Rao 
> > ---
> >  hw/ppc/spapr.c | 136 
> > -
> >  hw/ppc/spapr_events.c  |   3 ++
> >  hw/ppc/spapr_rtas.c|  24 +
> >  include/hw/ppc/spapr.h |   1 +
> >  4 files changed, 163 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 1f0d232..780cd00 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -603,6 +603,18 @@ static void spapr_populate_cpu_dt(CPUState *cs, void 
> > *fdt, int offset,
> >  size_t page_sizes_prop_size;
> >  uint32_t vcpus_per_socket = smp_threads * smp_cores;
> >  uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
> > +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> > +sPAPRDRConnector *drc;
> > +sPAPRDRConnectorClass *drck;
> > +int drc_index;
> > +
> > +if (smc->dr_cpu_enabled) {
> > +drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
> > +g_assert(drc);
> > +drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> > +drc_index = drck->get_index(drc);
> > +_FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", 
> > drc_index)));
> > +}
> >  
> >  /* Note: we keep CI large pages off for now because a 64K capable guest
> >   * provisioned with large pages might otherwise try to map a qemu
> > @@ -987,6 +999,16 @@ static void spapr_finalize_fdt(sPAPRMachineState 
> > *spapr,
> >  _FDT(spapr_drc_populate_dt(fdt, 0, NULL, 
> > SPAPR_DR_CONNECTOR_TYPE_LMB));
> >  }
> >  
> > +if (smc->dr_cpu_enabled) {
> > +int offset = fdt_path_offset(fdt, "/cpus");
> > +ret = spapr_drc_populate_dt(fdt, offset, NULL,
> > +SPAPR_DR_CONNECTOR_TYPE_CPU);
> > +if (ret < 0) {
> > +fprintf(stderr, "Couldn't set up CPU DR device tree 
> > properties\n");
> > +exit(1);
> > +}
> > +}
> > +
> >  _FDT((fdt_pack(fdt)));
> >  
> >  if (fdt_totalsize(fdt) > FDT_MAX_SIZE) {
> > @@ -1759,6 +1781,7 @@ static void ppc_spapr_init(MachineState *machine)
> >  char *filename;
> >  int spapr_cores = smp_cpus / smp_threads;
> >  int spapr_max_cores = max_cpus / smp_threads;
> > +int smt = kvmppc_smt_threads();
> >  
> >  msi_supported = true;
> >  
> > @@ -1813,6 +1836,15 @@ static void ppc_spapr_init(MachineState *machine)
> >  spapr_validate_node_memory(machine, &error_fatal);
> >  }
> >  
> > +if (smc->dr_cpu_enabled) {
> > +for (i = 0; i < spapr_max_cores; i++) {
> > +sPAPRDRConnector *drc =
> > +spapr_dr_connector_new(OBJECT(spapr),
> > +   SPAPR_DR_CONNECTOR_TYPE_CPU, i * 
> > smt);
> > +qemu_register_reset(spapr_drc_reset, drc);
> > +}
> > +}
> > +
> >  /* init CPUs */
> >  if (machine->cpu_model == NULL) {
> >  machine->cpu_model = kvm_enabled() ? "host" : "POWER7";
> > @@ -2247,6 +2279,88 @@ out:
> >  error_propagate(errp, local_err);
> >  }
> >  
> > +static void *spapr_populate_hotplug_cpu_dt(DeviceState *dev, CPUState *cs,
> > +   int *fdt_offset,
> > +   sPAPRMachineState *spapr)
> > +{
> > +PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +DeviceClass *dc = DEVICE_GET_CLASS(cs);
> > +int id = ppc_get_vcpu_dt_id(cpu);
> > +void *fdt;
> > +int offset, fdt_size;
> > +char *nodename;
> > +
> > +fdt = create_device_tree(&fdt_size);
> > +nodename = g_strdup_printf("%s@%x", dc->fw_name, id);
> > +offset = fdt_add_subnode(fdt, 0, nodename);
> > +
> > +spapr_populate_cpu_dt(cs, fdt, offset, spapr);
> > +g_free(nodename);
> > +
> > +*fdt_offset = offset;
> > +return fdt;
> > +}
> > +
> > +static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> > +Error **errp)
> > +{
> > +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine());
> > +sPAPRMachineState *ms = SPAPR_MACHINE(qdev_get_machine());
> > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
> > +PowerPCCPU *cpu = &core->threads[0];
> > +CPUState *cs = CPU(cpu);
> > +int id = ppc_get_vcpu_dt_id(cpu);
> > +sPAPRDRConnector *drc =
> > +spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id);
> > +sPAPRDRConnectorClass *drck;
> > +Error *local_err = NULL;
> > +void *fdt = NULL;
> > +int fdt_offset = 0;
> > +
> > +if (!smc->dr_cpu_enabled) {
> > +/*
> > + * This is a cold plugged CPU core but the mac

Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 02:45 PM, Alexey Kardashevskiy wrote:

On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote:

On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure()
into rtas_ibm_configure_pe().

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


Aaaand this breaks mingw32:

   CCppc64-softmmu/hw/ppc/spapr_pci.o
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:46:24: fatal error:
linux/vfio.h: No such file or directory
compilation terminated.
/home/aik/p/qemu-dwg-eeh/rules.mak:57: recipe for target
'hw/ppc/spapr_pci.o' failed
make[1]: *** [hw/ppc/spapr_pci.o] Error 1
make[1]: *** Waiting for unfinished jobs
Makefile:186: recipe for target 'subdir-ppc64-softmmu' failed
make: *** [subdir-ppc64-softmmu] Error 2
make: Leaving directory '/scratch/aik/p/qemu-dwg-eeh--ppc64_mingw32-build'






---
  hw/ppc/spapr_pci.c  | 11 +--
  hw/ppc/spapr_pci_vfio.c | 12 
  include/hw/pci-host/spapr.h |  1 -
  3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index d1e5222..fa633a9 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -42,6 +42,9 @@
  #include "hw/ppc/spapr_drc.h"
  #include "sysemu/device_tree.h"

+#include "hw/vfio/vfio.h"
+#include 
+


This is missing:

#ifdef CONFIG_LINUX
#include 
#endif

and below where you use symbols from linux/vfio.h.


My version of this rework did convert class callbacks to exported helpers 
and keep these helpers (plus stubs) in spapr_pci_vfio.c, with one #ifdef 
CONFIG_LINUX. Looked quite clean to me...





  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
  #define RTAS_QUERY_FN   0
  #define RTAS_CHANGE_FN  1
@@ -628,8 +631,12 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-ret = spapr_phb_vfio_eeh_configure(sphb);
-rtas_st(rets, 0, ret);
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE);
+if (ret < 0) {
+goto param_error_exit;
+}
+
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
  return;

  param_error_exit:
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 10fa88a..4ac5736 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -221,18 +221,6 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb,
int option)
  return RTAS_OUT_SUCCESS;
  }

-int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
-{
-int ret;
-
-ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE);
-if (ret < 0) {
-return RTAS_OUT_PARAM_ERROR;
-}
-
-return RTAS_OUT_SUCCESS;
-}
-
  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index cc1b82c..f02ef51 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -140,6 +140,5 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
unsigned int addr, int option);
  int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state);
  int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option);
-int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb);

  #endif /* __HW_SPAPR_PCI_H__ */



--
Alexey

Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote:

On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure()
into rtas_ibm_configure_pe().

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


Aaaand this breaks mingw32:

  CCppc64-softmmu/hw/ppc/spapr_pci.o
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:46:24: fatal error: 
linux/vfio.h: No such file or directory

compilation terminated.
/home/aik/p/qemu-dwg-eeh/rules.mak:57: recipe for target 
'hw/ppc/spapr_pci.o' failed

make[1]: *** [hw/ppc/spapr_pci.o] Error 1
make[1]: *** Waiting for unfinished jobs
Makefile:186: recipe for target 'subdir-ppc64-softmmu' failed
make: *** [subdir-ppc64-softmmu] Error 2
make: Leaving directory '/scratch/aik/p/qemu-dwg-eeh--ppc64_mingw32-build'






---
  hw/ppc/spapr_pci.c  | 11 +--
  hw/ppc/spapr_pci_vfio.c | 12 
  include/hw/pci-host/spapr.h |  1 -
  3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index d1e5222..fa633a9 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -42,6 +42,9 @@
  #include "hw/ppc/spapr_drc.h"
  #include "sysemu/device_tree.h"

+#include "hw/vfio/vfio.h"
+#include 
+
  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
  #define RTAS_QUERY_FN   0
  #define RTAS_CHANGE_FN  1
@@ -628,8 +631,12 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-ret = spapr_phb_vfio_eeh_configure(sphb);
-rtas_st(rets, 0, ret);
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE);
+if (ret < 0) {
+goto param_error_exit;
+}
+
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
  return;

  param_error_exit:
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 10fa88a..4ac5736 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -221,18 +221,6 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb,
int option)
  return RTAS_OUT_SUCCESS;
  }

-int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
-{
-int ret;
-
-ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE);
-if (ret < 0) {
-return RTAS_OUT_PARAM_ERROR;
-}
-
-return RTAS_OUT_SUCCESS;
-}
-
  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index cc1b82c..f02ef51 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -140,6 +140,5 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
unsigned int addr, int option);
  int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state);
  int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option);
-int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb);

  #endif /* __HW_SPAPR_PCI_H__ */







--
Alexey

Re: [Qemu-devel] [PATCH 03/12] spapr_pci: Eliminate class callbacks

2016-02-28 Thread Alexey Kardashevskiy


On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote:

On 02/26/2016 10:31 PM, David Gibson wrote:

The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.


Reviewed-by: Alexey Kardashevskiy 



This actually breaks mingw32 build:


  LINK  ppc64-softmmu/qemu-system-ppc64.exe
hw/ppc/spapr_pci.o: In function `rtas_ibm_set_eeh_option':
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:471: undefined reference to 
`spapr_phb_vfio_eeh_set_option'

hw/ppc/spapr_pci.o: In function `rtas_ibm_read_slot_reset_state2':
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:557: undefined reference to 
`spapr_phb_vfio_eeh_get_state'

hw/ppc/spapr_pci.o: In function `rtas_ibm_set_slot_reset':
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:601: undefined reference to 
`spapr_phb_vfio_eeh_reset'

hw/ppc/spapr_pci.o: In function `rtas_ibm_configure_pe':
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:633: undefined reference to 
`spapr_phb_vfio_eeh_configure'

hw/ppc/spapr_pci.o: In function `spapr_phb_reset':
/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:1434: undefined reference to 
`spapr_phb_vfio_reset'

collect2: error: ld returned 1 exit status







Signed-off-by: David Gibson 
---
  hw/ppc/spapr_pci.c  | 44
++--
  hw/ppc/spapr_pci_vfio.c | 18 +++---
  include/hw/pci-host/spapr.h | 13 +
  3 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b2b546..d1e5222 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -92,6 +92,13 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState
*spapr, uint64_t buid,
  return pci_find_device(phb->bus, bus_num, devfn);
  }

+static bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
+{
+sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+
+return spc->eeh_available;
+}
+
  static uint32_t rtas_pci_cfgaddr(uint32_t arg)
  {
  /* This handles the encoding of extended config space addresses */
@@ -438,7 +445,6 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
  target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint32_t addr, option;
  uint64_t buid;
  int ret;
@@ -456,12 +462,11 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_set_option(sphb, addr, option);
+ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option);
  rtas_st(rets, 0, ret);
  return;

@@ -476,7 +481,6 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU
*cpu,
 target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  PCIDevice *pdev;
  uint32_t addr, option;
  uint64_t buid;
@@ -491,8 +495,7 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU
*cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

@@ -532,7 +535,6 @@ static void
rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
  target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint64_t buid;
  int state, ret;

@@ -546,12 +548,11 @@ static void
rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_get_state) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_get_state(sphb, &state);
+ret = spapr_phb_vfio_eeh_get_state(sphb, &state);
  rtas_st(rets, 0, ret);
  if (ret != RTAS_OUT_SUCCESS) {
  return;
@@ -576,7 +577,6 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint32_t option;
  uint64_t buid;
  int ret;
@@ -592,12 +592,11 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_reset) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_reset(sphb, option);
+ret = spapr_phb_vfio_eeh_reset(sphb, option);
  rtas_s

Re: [Qemu-devel] [PATCH] Qemu/KVM: Remove x2apic feature from CPU model when kernel_irqchip is off

2016-02-28 Thread Lan Tianyu

On 2016年02月27日 03:54, Eduardo Habkost wrote:
> On Thu, Feb 25, 2016 at 11:15:12PM +0800, Lan Tianyu wrote:
>> x2apic feature is in the kvm_default_props and automatically added to all
>> CPU models when KVM is enabled. But userspace devices don't support x2apic
>> which can't be enabled without the in-kernel irqchip. It will trigger
>> warning of "host doesn't support requested feature: CPUID.01H:ECX.x2apic
>> [bit 21]" when kernel_irqchip is off. This patch is to fix it via removing
>> x2apic feature when kernel_irqchip is off.
>>
>> Signed-off-by: Lan Tianyu 
>> ---
>>  target-i386/cpu.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/target-i386/cpu.c b/target-i386/cpu.c
>> index c78f824..298fb62 100644
>> --- a/target-i386/cpu.c
>> +++ b/target-i386/cpu.c
>> @@ -2125,6 +2125,10 @@ static void x86_cpu_load_def(X86CPU *cpu, 
>> X86CPUDefinition *def, Error **errp)
>>  
>>  /* Special cases not set in the X86CPUDefinition structs: */
>>  if (kvm_enabled()) {
>> +if (!kvm_irqchip_in_kernel()) {
>> +x86_cpu_change_kvm_default("x2apic", "off");
> 
> This should be NULL instead of "off".

I tried "NULL" before. But some cpus modules(E,G SandyBridge,
IvyBridge, haswell) already have x2apic feature in their default
features of struct X86CPUDefinition, passing "NULL" is not to add x2apic
feature to the cpu module and will not help to remove x2apic feature for
these cpu modules. So I changed "NULL" to "off".

> Otherwise, the warning will
> be disabled if using "-cpu ...,+x2apic".
> 

kvm_arch_get_supported_cpuid() always returns no x2apic support when
kernel_irqchip is off and so it still triggers warning with "-cpu
...,+x2apic".

#qemu-system-x86_64 -cpu qemu64,+x2apic -machine kernel-irqchip=off
warning: TCG doesn't support requested feature: CPUID.01H:ECX.x2apic
[bit 21]


>> +}
>> +
> 
> This function runs multiple times (once for each VCPU being
> created). Should be harmless, though. Eventually we could move
> the whole kvm_default_props logic outside cpu.c, to a KVM-x86
> accel class or to common PC initialization code.

Good suggestion and we can try that later.

> 
> 
>>  x86_cpu_apply_props(cpu, kvm_default_props);
>>  }
>>  
>> -- 
>> 1.9.3
>>
> 


-- 
Best regards
Tianyu Lan

[Qemu-devel] [PATCH 4/6] memory: Drop MemoryRegion.ram_addr

2016-02-28 Thread Fam Zheng

All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).

Signed-off-by: Fam Zheng 
---
 cputlb.c  |  4 +--
 exec.c|  3 ++-
 hw/misc/ivshmem.c |  9 ---
 include/exec/memory.h |  1 -
 kvm-all.c |  3 ++-
 memory.c  | 71 ---
 6 files changed, 39 insertions(+), 52 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 3973030..2f7a166 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -416,8 +416,8 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 /* Write access calls the I/O callback.  */
 te->addr_write = address | TLB_MMIO;
 } else if (memory_region_is_ram(section->mr)
-   && cpu_physical_memory_is_clean(section->mr->ram_addr
-   + xlat)) {
+   && cpu_physical_memory_is_clean(
+memory_region_get_ram_addr(section->mr) + xlat)) {
 te->addr_write = address | TLB_NOTDIRTY;
 } else {
 te->addr_write = address;
diff --git a/exec.c b/exec.c
index 83e3f7d..6ed4203 100644
--- a/exec.c
+++ b/exec.c
@@ -2699,7 +2699,8 @@ MemTxResult address_space_read_continue(AddressSpace *as, 
hwaddr addr,
 }
 } else {
 /* RAM case */
-ptr = qemu_get_ram_ptr(mr->ram_block, mr->ram_addr + addr1);
+ptr = qemu_get_ram_ptr(mr->ram_block,
+   memory_region_get_ram_addr(mr) + addr1);
 memcpy(buf, ptr, l);
 }
 
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 48b7a34..1838bc8 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -400,7 +400,7 @@ static int create_shared_memory_BAR(IVShmemState *s, int 
fd, uint8_t attr,
 
 memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s), "ivshmem.bar2",
s->ivshmem_size, ptr);
-qemu_set_ram_fd(s->ivshmem.ram_addr, fd);
+qemu_set_ram_fd(memory_region_get_ram_addr(&s->ivshmem), fd);
 vmstate_register_ram(&s->ivshmem, DEVICE(s));
 memory_region_add_subregion(&s->bar, 0, &s->ivshmem);
 
@@ -661,7 +661,8 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, 
int size)
 }
 memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s),
"ivshmem.bar2", s->ivshmem_size, map_ptr);
-qemu_set_ram_fd(s->ivshmem.ram_addr, incoming_fd);
+qemu_set_ram_fd(memory_region_get_ram_addr(&s->ivshmem),
+incoming_fd);
 vmstate_register_ram(&s->ivshmem, DEVICE(s));
 
 IVSHMEM_DPRINTF("guest h/w addr = %p, size = %" PRIu64 "\n",
@@ -996,8 +997,10 @@ static void pci_ivshmem_exit(PCIDevice *dev)
  strerror(errno));
 }
 
-if ((fd = qemu_get_ram_fd(s->ivshmem.ram_addr)) != -1)
+fd = qemu_get_ram_fd(memory_region_get_ram_addr(&s->ivshmem));
+if (fd != -1) {
 close(fd);
+}
 }
 
 vmstate_unregister_ram(&s->ivshmem, DEVICE(dev));
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 810d2c0..2de7898 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -169,7 +169,6 @@ struct MemoryRegion {
 bool flush_coalesced_mmio;
 bool global_locking;
 uint8_t dirty_log_mask;
-ram_addr_t ram_addr;
 RAMBlock *ram_block;
 Object *owner;
 const MemoryRegionIOMMUOps *iommu_ops;
diff --git a/kvm-all.c b/kvm-all.c
index a65e73f..161200e 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -366,7 +366,8 @@ static void kvm_log_stop(MemoryListener *listener,
 static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
  unsigned long *bitmap)
 {
-ram_addr_t start = section->offset_within_region + section->mr->ram_addr;
+ram_addr_t start = section->offset_within_region +
+   memory_region_get_ram_addr(section->mr);
 ram_addr_t pages = int128_get64(section->size) / getpagesize();
 
 cpu_physical_memory_set_dirty_lebitmap(bitmap, start, pages);
diff --git a/memory.c b/memory.c
index 769825e..b221f3c 100644
--- a/memory.c
+++ b/memory.c
@@ -858,12 +858,12 @@ static void memory_region_destructor_none(MemoryRegion 
*mr)
 
 static void memory_region_destructor_ram(MemoryRegion *mr)
 {
-qemu_ram_free(mr->ram_addr);
+qemu_ram_free(memory_region_get_ram_addr(mr));
 }
 
 static void memory_region_destructor_rom_device(MemoryRegion *mr)
 {
-qemu_ram_free(mr->ram_addr & TARGET_PAGE_MASK);
+qemu_ram_free(memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK);
 }
 
 static bool memory_region_need_escape(char c)
@@ -994,7 +994,6 @@ static void memory_region_initfn(Object *obj)
 ObjectProperty *op;
 
 mr->ops = &unassigned_mem_ops;
-mr->ram_addr = RAM_ADDR_INVALID;
 mr->e

[Qemu-devel] [PATCH 3/6] memory: Implement memory_region_get_ram_addr with mr->ram_block

2016-02-28 Thread Fam Zheng

Signed-off-by: Fam Zheng 
---
 include/exec/memory.h | 8 +---
 memory.c  | 5 +
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index d5284c2..810d2c0 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -978,14 +978,8 @@ void memory_region_add_subregion_overlap(MemoryRegion *mr,
 /**
  * memory_region_get_ram_addr: Get the ram address associated with a memory
  * region
- *
- * DO NOT USE THIS FUNCTION.  This is a temporary workaround while the Xen
- * code is being reworked.
  */
-static inline ram_addr_t memory_region_get_ram_addr(MemoryRegion *mr)
-{
-return mr->ram_addr;
-}
+ram_addr_t memory_region_get_ram_addr(MemoryRegion *mr);
 
 uint64_t memory_region_get_alignment(const MemoryRegion *mr);
 /**
diff --git a/memory.c b/memory.c
index fe70075..769825e 100644
--- a/memory.c
+++ b/memory.c
@@ -1596,6 +1596,11 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr)
 return ptr + offset;
 }
 
+ram_addr_t memory_region_get_ram_addr(MemoryRegion *mr)
+{
+return mr->ram_block->offset;
+}
+
 void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error 
**errp)
 {
 assert(mr->ram_addr != RAM_ADDR_INVALID);
-- 
2.4.3

[Qemu-devel] [PATCH 5/6] exec: Pass RAMBlock pointer to qemu_ram_free

2016-02-28 Thread Fam Zheng

The only caller now knows exactly which RAMBlock to free, so it's not
necessary to do the lookup.

Signed-off-by: Fam Zheng 
---
 exec.c  | 21 +++--
 include/exec/ram_addr.h |  2 +-
 memory.c|  4 ++--
 3 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/exec.c b/exec.c
index 6ed4203..ad8b826 100644
--- a/exec.c
+++ b/exec.c
@@ -1751,22 +1751,15 @@ static void reclaim_ramblock(RAMBlock *block)
 g_free(block);
 }
 
-void qemu_ram_free(ram_addr_t addr)
+void qemu_ram_free(RAMBlock *block)
 {
-RAMBlock *block;
-
 qemu_mutex_lock_ramlist();
-QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
-if (addr == block->offset) {
-QLIST_REMOVE_RCU(block, next);
-ram_list.mru_block = NULL;
-/* Write list before version */
-smp_wmb();
-ram_list.version++;
-call_rcu(block, reclaim_ramblock, rcu);
-break;
-}
-}
+QLIST_REMOVE_RCU(block, next);
+ram_list.mru_block = NULL;
+/* Write list before version */
+smp_wmb();
+ram_list.version++;
+call_rcu(block, reclaim_ramblock, rcu);
 qemu_mutex_unlock_ramlist();
 }
 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 865e19b..5adf7a4 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -108,7 +108,7 @@ RAMBlock *qemu_ram_alloc_resizeable(ram_addr_t size, 
ram_addr_t max_size,
 int qemu_get_ram_fd(ram_addr_t addr);
 void qemu_set_ram_fd(ram_addr_t addr, int fd);
 void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
-void qemu_ram_free(ram_addr_t addr);
+void qemu_ram_free(RAMBlock *block);
 
 int qemu_ram_resize(ram_addr_t base, ram_addr_t newsize, Error **errp);
 
diff --git a/memory.c b/memory.c
index b221f3c..32d2912 100644
--- a/memory.c
+++ b/memory.c
@@ -858,12 +858,12 @@ static void memory_region_destructor_none(MemoryRegion 
*mr)
 
 static void memory_region_destructor_ram(MemoryRegion *mr)
 {
-qemu_ram_free(memory_region_get_ram_addr(mr));
+qemu_ram_free(mr->ram_block);
 }
 
 static void memory_region_destructor_rom_device(MemoryRegion *mr)
 {
-qemu_ram_free(memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK);
+qemu_ram_free(mr->ram_block);
 }
 
 static bool memory_region_need_escape(char c)
-- 
2.4.3

[Qemu-devel] [PATCH 6/6] exec: Introduce AddressSpaceDispatch.mru_section

2016-02-28 Thread Fam Zheng

Under heavy workloads the lookup will likely end up with the same
MemoryRegionSection from last time. Using a pointer to cache the result, like
ram_list.mru_block, significantly reduce computation cost of
address_space_translate.

During address space topology update, as->dispatch will be reallocated
so the pointer is invalidated automatically.

Perf reports a visible drop on the cpu usage.  Before:

   + 2.06% phys_page_find
   + 0.95% address_space_translate_internal
   + 0.80% address_space_translate

After:
   + 0.78% address_space_translate
   + 0.77% address_space_translate_internal
   + 0.69% address_space_lookup_region

Signed-off-by: Fam Zheng 
---
 exec.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index ad8b826..3232861 100644
--- a/exec.c
+++ b/exec.c
@@ -135,6 +135,7 @@ typedef struct PhysPageMap {
 struct AddressSpaceDispatch {
 struct rcu_head rcu;
 
+MemoryRegionSection *mru_section;
 /* This is a multi-level map on the physical address space.
  * The bottom level has pointers to MemoryRegionSections.
  */
@@ -342,14 +343,26 @@ static MemoryRegionSection 
*address_space_lookup_region(AddressSpaceDispatch *d,
 hwaddr addr,
 bool resolve_subpage)
 {
-MemoryRegionSection *section;
+MemoryRegionSection *section = atomic_read(&d->mru_section);
 subpage_t *subpage;
+bool update;
 
-section = phys_page_find(d->phys_map, addr, d->map.nodes, d->map.sections);
+if (section && section != &d->map.sections[PHYS_SECTION_UNASSIGNED] &&
+range_covers_byte(section->offset_within_address_space,
+  section->size.lo, addr)) {
+update = false;
+} else {
+section = phys_page_find(d->phys_map, addr, d->map.nodes,
+ d->map.sections);
+update = true;
+}
 if (resolve_subpage && section->mr->subpage) {
 subpage = container_of(section->mr, subpage_t, iomem);
 section = &d->map.sections[subpage->sub_section[SUBPAGE_IDX(addr)]];
 }
+if (update) {
+atomic_set(&d->mru_section, section);
+}
 return section;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH 2/6] memory: Move assignment to ram_block to memory_region_init_*

2016-02-28 Thread Fam Zheng

We don't force "const" qualifiers with pointers in QEMU, but it's still
good to keep a clean function interface. Assigning to mr->ram_block is
in this sense ugly - one initializer mutating its owning object's state.

Move it to memory_region_init_*, where mr->ram_addr is assigned.

Signed-off-by: Fam Zheng 
---
 exec.c   | 1 -
 memory.c | 5 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 2b14b79..83e3f7d 100644
--- a/exec.c
+++ b/exec.c
@@ -1711,7 +1711,6 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, 
ram_addr_t max_size,
 error_propagate(errp, local_err);
 return NULL;
 }
-mr->ram_block = new_block;
 return new_block;
 }
 
diff --git a/memory.c b/memory.c
index ae13ba9..fe70075 100644
--- a/memory.c
+++ b/memory.c
@@ -1233,6 +1233,7 @@ void memory_region_init_ram(MemoryRegion *mr,
 mr->terminates = true;
 mr->destructor = memory_region_destructor_ram;
 ram_block = qemu_ram_alloc(size, mr, errp);
+mr->ram_block = ram_block;
 mr->ram_addr = ram_block->offset;
 mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
 }
@@ -1254,6 +1255,7 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
 mr->terminates = true;
 mr->destructor = memory_region_destructor_ram;
 ram_block = qemu_ram_alloc_resizeable(size, max_size, resized, mr, errp);
+mr->ram_block = ram_block;
 mr->ram_addr = ram_block->offset;
 mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
 }
@@ -1274,6 +1276,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
 mr->terminates = true;
 mr->destructor = memory_region_destructor_ram;
 ram_block = qemu_ram_alloc_from_file(size, mr, share, path, errp);
+mr->ram_block = ram_block;
 mr->ram_addr = ram_block->offset;
 mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
 }
@@ -1296,6 +1299,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
 /* qemu_ram_alloc_from_ptr cannot fail with ptr != NULL.  */
 assert(ptr != NULL);
 ram_block = qemu_ram_alloc_from_ptr(size, ptr, mr, &error_fatal);
+mr->ram_block = ram_block;
 mr->ram_addr = ram_block->offset;
 }
 
@@ -1333,6 +1337,7 @@ void memory_region_init_rom_device(MemoryRegion *mr,
 mr->rom_device = true;
 mr->destructor = memory_region_destructor_rom_device;
 ram_block = qemu_ram_alloc(size, mr, errp);
+mr->ram_block = ram_block;
 mr->ram_addr = ram_block->offset;
 }
 
-- 
2.4.3

[Qemu-devel] [PATCH 0/6] memory: Clean up MemoryRegion.ram_addr and optimize address_space_translate

2016-02-28 Thread Fam Zheng

The first four patches drop ram_addr from MemoryRegion on top of Gonglei's
optimization.

The next patch simplifies qemu_ram_free a bit by passing the RAMBlock pointer.

The last patch speeds up address_space_translate with a cache pointer inside
the AddressSpaceDispatch.



Fam Zheng (6):
  exec: Return RAMBlock pointer from allocating functions
  memory: Move assignment to ram_block to memory_region_init_*
  memory: Implement memory_region_get_ram_addr with mr->ram_block
  memory: Drop MemoryRegion.ram_addr
  exec: Pass RAMBlock pointer to qemu_ram_free
  exec: Introduce AddressSpaceDispatch.mru_section

 cputlb.c|  4 +--
 exec.c  | 93 -
 hw/misc/ivshmem.c   |  9 +++--
 include/exec/memory.h   |  9 +
 include/exec/ram_addr.h | 24 ++---
 kvm-all.c   |  3 +-
 memory.c| 56 -
 7 files changed, 101 insertions(+), 97 deletions(-)

-- 
2.4.3

[Qemu-devel] [PATCH 1/6] exec: Return RAMBlock pointer from allocating functions

2016-02-28 Thread Fam Zheng

Previously we return RAMBlock.offset; now return the pointer to the
whole structure.

ram_block_add returns void now, error is completely passed with errp.

Signed-off-by: Fam Zheng 
---
 exec.c  | 51 +
 include/exec/ram_addr.h | 22 ++---
 memory.c| 25 +++-
 3 files changed, 53 insertions(+), 45 deletions(-)

diff --git a/exec.c b/exec.c
index c62c439..2b14b79 100644
--- a/exec.c
+++ b/exec.c
@@ -1554,7 +1554,7 @@ static void dirty_memory_extend(ram_addr_t old_ram_size,
 }
 }
 
-static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
+static void ram_block_add(RAMBlock *new_block, Error **errp)
 {
 RAMBlock *block;
 RAMBlock *last_block = NULL;
@@ -1573,7 +1573,6 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
 if (err) {
 error_propagate(errp, err);
 qemu_mutex_unlock_ramlist();
-return -1;
 }
 } else {
 new_block->host = phys_mem_alloc(new_block->max_length,
@@ -1583,7 +1582,6 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
  "cannot set up guest memory '%s'",
  memory_region_name(new_block->mr));
 qemu_mutex_unlock_ramlist();
-return -1;
 }
 memory_try_enable_merging(new_block->host, new_block->max_length);
 }
@@ -1631,22 +1629,19 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
 kvm_setup_guest_memory(new_block->host, new_block->max_length);
 }
 }
-
-return new_block->offset;
 }
 
 #ifdef __linux__
-ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
-bool share, const char *mem_path,
-Error **errp)
+RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
+   bool share, const char *mem_path,
+   Error **errp)
 {
 RAMBlock *new_block;
-ram_addr_t addr;
 Error *local_err = NULL;
 
 if (xen_enabled()) {
 error_setg(errp, "-mem-path not supported with Xen");
-return -1;
+return NULL;
 }
 
 if (phys_mem_alloc != qemu_anon_ram_alloc) {
@@ -1657,7 +1652,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
MemoryRegion *mr,
  */
 error_setg(errp,
"-mem-path not supported with this accelerator");
-return -1;
+return NULL;
 }
 
 size = HOST_PAGE_ALIGN(size);
@@ -1670,29 +1665,28 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
MemoryRegion *mr,
  mem_path, errp);
 if (!new_block->host) {
 g_free(new_block);
-return -1;
+return NULL;
 }
 
-addr = ram_block_add(new_block, &local_err);
+ram_block_add(new_block, &local_err);
 if (local_err) {
 g_free(new_block);
 error_propagate(errp, local_err);
-return -1;
+return NULL;
 }
-return addr;
+return new_block;
 }
 #endif
 
 static
-ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
-   void (*resized)(const char*,
-   uint64_t length,
-   void *host),
-   void *host, bool resizeable,
-   MemoryRegion *mr, Error **errp)
+RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
+  void (*resized)(const char*,
+  uint64_t length,
+  void *host),
+  void *host, bool resizeable,
+  MemoryRegion *mr, Error **errp)
 {
 RAMBlock *new_block;
-ram_addr_t addr;
 Error *local_err = NULL;
 
 size = HOST_PAGE_ALIGN(size);
@@ -1711,29 +1705,28 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, 
ram_addr_t max_size,
 if (resizeable) {
 new_block->flags |= RAM_RESIZEABLE;
 }
-addr = ram_block_add(new_block, &local_err);
+ram_block_add(new_block, &local_err);
 if (local_err) {
 g_free(new_block);
 error_propagate(errp, local_err);
-return -1;
+return NULL;
 }
-
 mr->ram_block = new_block;
-return addr;
+return new_block;
 }
 
-ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
+RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
MemoryRegion *mr, Error **errp)
 {
 return qemu_ram_alloc_internal(size, size, NULL, host, false, mr, errp);
 }
 
-ram_addr_t qemu_ram_a

Re: [Qemu-devel] [Qemu-arm] [PATCH v2 1/3] hw/timer: Add ASPEED AST2400 timer device model

2016-02-28 Thread Andrew Jeffery

On Fri, 2016-02-26 at 10:20 +, Peter Maydell wrote:
> On 26 February 2016 at 03:14, Andrew Jeffery  wrote:
> > 
> > Hi Peter,
> > 
> > On Thu, 2016-02-25 at 16:11 +, Peter Maydell wrote:
> > > 
> > > On 16 February 2016 at 11:34, Andrew Jeffery  wrote:
> > > > 
> > > > Implement basic AST2400 timer functionality: Timers can be configured,
> > > > enabled, reset and disabled.
> > > > 
> > > > A number of hardware features are not implemented:
> > > > 
> > > > * Timer Overflow interrupts
> > > > * Clock value matching
> > > > * Pulse generation
> > > > 
> > > > The implementation is enough to boot the Linux kernel configured with
> > > > aspeed_defconfig.
> > > Thanks; this mostly looks in reasonable shape; I have some comments below.
> > > 
> > > Do we have a datasheet for this chip ?
> > Unfortunately I don't know of a publicly available datasheet. What's
> > the best way to proceed in this case?
> We have devices in the tree that are either based on non-public datasheets
> or occasionally reverse engineered from Linux kernel drivers. That's OK,
> but it's nice to be clear in a comment at the top what the source is,
> so people maintaining it later know how much to trust the current code
> and (if possible) where to look for clarification.

No worries, I'll add notes in the header comments for each of the new
files.

> 
> > 
> > > 
> > > All source files need to #include "qemu/osdep.h" as their first
> > > include. That then means you don't need to include assert.h or
> > > stdio.h yourself.
> > > 
> > > What do we need from qemu/main-loop.h?
> > I'm using it for qemu_bh_new() which is required by ptimer, who registers
> > the aspeed_timer_tick() callback into the main loop timer handling.
> OK, no problem.
> 
> > 
> > > 
> > > > 
> > > > +static void aspeed_timer_irq_update(AspeedTimer *t)
> > > > +{
> > > > +qemu_set_irq(t->irq, t->enabled);
> > > Surely the timer doesn't assert its IRQ line all
> > > the time it's enabled? This doesn't look like the right condition.
> > So I think this is correct despite how it looks. There's some cruft
> > from modelling the implementation off of another timer that's probably
> > obscuring things, which should be fixed. aspeed_timer_irq_update()
> > is only called from aspeed_timer_tick(), so I'll just merge the two.
> > Then by renaming aspeed_timer_tick() to aspeed_timer_expire() as
> > mentioned above, this won't look so strange? I've read through the
> > timer handling code (the processing loop in timerlist_run_timers())
> > and my understanding is it has the behaviour we want - callback on
> > expiry, not on ticks - which is not how it reads as above.
> Usually functions in QEMU called thingy_irq_update() are the ones
> that do "look at current state of device and assert IRQ as
> necessary"; often this is "mask irq raw state against some
> irq-masking bit". Merging this into the timer expire function will
> probably help. (Is there no register bit that the guest can query
> that indicates "timer expired" or "raw interrupt state" ?)

It doesn't appear so - overflow interrupts can be enabled or disabled,
but it doesn't appear that there's a way to poll whether the timer has
expired. It doesn't look like it can be inferred either as the counter
status register doesn't stick at zero, rather resets back to the reload
register value (and my interpretation is it continues to count down if
the enabled bit remains set). Further, the interrupt on overflow bit
doesn't appear to be set by the kernel driver, but the two match
registers are initialised to zero which ensures an interrupt will be
fired. Non-zero match registers aren't currently supported by the
device model. I'll try to make this all clearer in the code.

> 
> > 
> > > 
> > > You should implement a VMState struct for device migration,
> > > and wire it up here via dc->vmsd.
> > I'll look into it. I started experimenting with a VMState struct
> > early on in the implementation but threw it away as it wasn't my
> > primary focus at the time.
> We insist on vmstate structs for all new devices, because
> they're fairly easy to implement, and if the original
> submitter doesn't implement one then the device becomes
> a landmine for any user trying migration or vmstate snapshots,
> because it will silently misbehave.

Yeah, no worries, v3 of the series will have VMState structs for all
devices.

Cheers,

Andrew

> 
> thanks
> -- PMM


signature.asc
Description: This is a digitally signed message part

[Qemu-devel] [PATCH v2 2/2] filter-buffer: Add status_changed callback processing

2016-02-28 Thread zhanghailiang

While the status of filter-buffer changing from 'on' to 'off',
it need to release all the buffered packets, and delete the related
timer, while switch from 'off' to 'on', it need to resume the release
packets timer.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
Cc: Yang Hongyang 
---
v2:
- New patch
---
 net/filter-buffer.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/net/filter-buffer.c b/net/filter-buffer.c
index 12ad2e3..ed3f19e 100644
--- a/net/filter-buffer.c
+++ b/net/filter-buffer.c
@@ -124,6 +124,24 @@ static void filter_buffer_setup(NetFilterState *nf, Error 
**errp)
 }
 }
 
+static void filter_buffer_status_changed(NetFilterState *nf, Error **errp)
+{
+FilterBufferState *s = FILTER_BUFFER(nf);
+
+if (!strcmp(nf->status, "off")) {
+if (s->interval) {
+timer_del(&s->release_timer);
+}
+filter_buffer_flush(nf);
+} else {
+if (s->interval) {
+timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL,
+filter_buffer_release_timer, nf);
+timer_mod(&s->release_timer,
+qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval);
+}
+}
+}
 static void filter_buffer_class_init(ObjectClass *oc, void *data)
 {
 NetFilterClass *nfc = NETFILTER_CLASS(oc);
@@ -131,6 +149,7 @@ static void filter_buffer_class_init(ObjectClass *oc, void 
*data)
 nfc->setup = filter_buffer_setup;
 nfc->cleanup = filter_buffer_cleanup;
 nfc->receive_iov = filter_buffer_receive_iov;
+nfc->status_changed = filter_buffer_status_changed;
 }
 
 static void filter_buffer_get_interval(Object *obj, Visitor *v,
-- 
1.8.3.1

[Qemu-devel] [PATCH v2 0/2] Introduce 'status' property for netfilter

2016-02-28 Thread zhanghailiang

This is picked from COLO series, which is to realize the new 'status'
property for filter.

With this property, users can control if the filter is enabled or
disabled.

zhanghailiang (2):
  filter: Add 'status' property for filter object
  filter-buffer: Add status_changed callback processing

 include/net/filter.h |  4 
 net/filter-buffer.c  | 19 +++
 net/filter.c | 42 ++
 qemu-options.hx  |  4 +++-
 4 files changed, 68 insertions(+), 1 deletion(-)

-- 
1.8.3.1

[Qemu-devel] [PATCH v2 1/2] filter: Add 'status' property for filter object

2016-02-28 Thread zhanghailiang

With this property, users can control if this filter is 'on'
or 'off'. The default behavior for filter is 'on'.

For some types of filters, they may need to react to status changing,
So here, we introduced status changing callback/notifier for filter class.

We will skip the disabled ('off') filter when delivering packets in net layer.

Signed-off-by: zhanghailiang 
Cc: Jason Wang 
Cc: Yang Hongyang 
---
v2:
- Split the processing of buffer-filter into a new patch (Jason)
- Use 'status' instead of 'enabled' to store the filter state (Jason)
- Rename FilterDisable() callback to FilterStatusChanged(Jason)
---
 include/net/filter.h |  4 
 net/filter.c | 42 ++
 qemu-options.hx  |  4 +++-
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/net/filter.h b/include/net/filter.h
index 5639976..ebef0dc 100644
--- a/include/net/filter.h
+++ b/include/net/filter.h
@@ -36,12 +36,15 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc,
int iovcnt,
NetPacketSent *sent_cb);
 
+typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp);
+
 typedef struct NetFilterClass {
 ObjectClass parent_class;
 
 /* optional */
 FilterSetup *setup;
 FilterCleanup *cleanup;
+FilterStatusChanged *status_changed;
 /* mandatory */
 FilterReceiveIOV *receive_iov;
 } NetFilterClass;
@@ -55,6 +58,7 @@ struct NetFilterState {
 char *netdev_id;
 NetClientState *netdev;
 NetFilterDirection direction;
+char *status;
 QTAILQ_ENTRY(NetFilterState) next;
 };
 
diff --git a/net/filter.c b/net/filter.c
index d2a514e..68b7bba 100644
--- a/net/filter.c
+++ b/net/filter.c
@@ -17,6 +17,11 @@
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
 
+static inline bool qemu_can_skip_netfilter(NetFilterState *nf)
+{
+return !!strcmp(nf->status, "on");
+}
+
 ssize_t qemu_netfilter_receive(NetFilterState *nf,
NetFilterDirection direction,
NetClientState *sender,
@@ -25,6 +30,9 @@ ssize_t qemu_netfilter_receive(NetFilterState *nf,
int iovcnt,
NetPacketSent *sent_cb)
 {
+if (qemu_can_skip_netfilter(nf)) {
+return 0;
+}
 if (nf->direction == direction ||
 nf->direction == NET_FILTER_DIRECTION_ALL) {
 return NETFILTER_GET_CLASS(OBJECT(nf))->receive_iov(
@@ -134,8 +142,39 @@ static void netfilter_set_direction(Object *obj, int 
direction, Error **errp)
 nf->direction = direction;
 }
 
+static char *netfilter_get_status(Object *obj, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+
+return g_strdup(nf->status);
+}
+
+static void netfilter_set_status(Object *obj, const char *str, Error **errp)
+{
+NetFilterState *nf = NETFILTER(obj);
+NetFilterClass *nfc = NETFILTER_GET_CLASS(obj);
+
+if (!strcmp(nf->status, str)) {
+return;
+}
+if (strcmp(str, "on") && strcmp(str, "off")) {
+error_setg(errp, "Invalid value for netfilter status, "
+ "should be 'on' or 'off'");
+return;
+}
+g_free(nf->status);
+nf->status = g_strdup(str);
+if (nfc->status_changed) {
+nfc->status_changed(nf, errp);
+}
+}
+
 static void netfilter_init(Object *obj)
 {
+NetFilterState *nf = NETFILTER(obj);
+
+nf->status = g_strdup("on");
+
 object_property_add_str(obj, "netdev",
 netfilter_get_netdev_id, netfilter_set_netdev_id,
 NULL);
@@ -143,6 +182,9 @@ static void netfilter_init(Object *obj)
  NetFilterDirection_lookup,
  netfilter_get_direction, netfilter_set_direction,
  NULL);
+object_property_add_str(obj, "status",
+netfilter_get_status, netfilter_set_status,
+NULL);
 }
 
 static void netfilter_complete(UserCreatable *uc, Error **errp)
diff --git a/qemu-options.hx b/qemu-options.hx
index 599db94..5291347 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3787,11 +3787,13 @@ version by providing the @var{passwordid} parameter. 
This provides
 the ID of a previously created @code{secret} object containing the
 password for decryption.
 
-@item -object 
filter-buffer,id=@var{id},netdev=@var{netdevid},interval=@var{t}[,queue=@var{all|rx|tx}]
+@item -object 
filter-buffer,id=@var{id},netdev=@var{netdevid},interval=@var{t}[,queue=@var{all|rx|tx}][,status=@var{on|off}]
 
 Interval @var{t} can't be 0, this filter batches the packet delivery: all
 packets arriving in a given interval on netdev @var{netdevid} are delayed
 until the end of the interval. Interval is in microseconds.
+@option{status} is optional that indicate whether the netfilter is
+on (enabled) or off (disabled), the default status for netf

Re: [Qemu-devel] [PATCH 09/12] spapr_pci: Allow EEH on spapr-pci-host-bridge

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:32 PM, David Gibson wrote:

Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 5 +
  hw/ppc/spapr_pci_vfio.c | 1 -
  include/hw/pci-host/spapr.h | 1 -
  3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index be18bf6..336ffd2 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -97,9 +97,7 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, 
uint64_t buid,

  static bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
  {
-sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-
-return spc->eeh_available;
+return vfio_eeh_as_ok(&sphb->iommu_as);
  }

  static uint32_t rtas_pci_cfgaddr(uint32_t arg)
@@ -1680,7 +1678,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
  set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
  dc->cannot_instantiate_with_device_add_yet = false;
  spc->finish_realize = spapr_phb_finish_realize;
-spc->eeh_available = false;
  hp->plug = spapr_phb_hot_plug_child;
  hp->unplug = spapr_phb_hot_unplug_child;
  }
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 0a0f31a..ecbcc5c 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -80,7 +80,6 @@ static void spapr_phb_vfio_class_init(ObjectClass *klass, 
void *data)

  dc->props = spapr_phb_vfio_properties;
  spc->finish_realize = spapr_phb_vfio_finish_realize;
-spc->eeh_available = true;
  }

  static const TypeInfo spapr_phb_vfio_info = {
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index d32750e..9313209 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -49,7 +49,6 @@ struct sPAPRPHBClass {
  PCIHostBridgeClass parent_class;

  void (*finish_realize)(sPAPRPHBState *sphb, Error **errp);
-bool eeh_available;
  };

  typedef struct spapr_pci_msi {




--
Alexey

Re: [Qemu-devel] [PATCH 11/12] spapr_pci: Remove finish_realize hook

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:32 PM, David Gibson wrote:

Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
we can fold that implementation into its (single) caller, and remove the
hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
well.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 25 +
  include/hw/pci-host/spapr.h | 12 
  2 files changed, 5 insertions(+), 32 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 336ffd2..2faaa03 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1335,11 +1335,12 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
  SysBusDevice *s = SYS_BUS_DEVICE(dev);
  sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s);
  PCIHostState *phb = PCI_HOST_BRIDGE(s);
-sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(s);
  char *namebuf;
  int i;
  PCIBus *bus;
  uint64_t msi_window_size = 4096;
+sPAPRTCETable *tcet;
+uint32_t nb_table;

  if (sphb->index != (uint32_t)-1) {
  hwaddr windows_base;
@@ -1489,33 +1490,20 @@ static void spapr_phb_realize(DeviceState *dev, Error 
**errp)
  }
  }

-if (!info->finish_realize) {
-error_setg(errp, "finish_realize not defined");
-return;
-}
-
-info->finish_realize(sphb, errp);
-
-sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
-}
-
-static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp)
-{
-sPAPRTCETable *tcet;
-uint32_t nb_table;
-
  nb_table = sphb->dma_win_size >> SPAPR_TCE_PAGE_SHIFT;
  tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn,
 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false);
  if (!tcet) {
  error_setg(errp, "Unable to create TCE table for %s",
 sphb->dtbusname);
-return ;
+return;
  }

  /* Register default 32bit DMA window */
  memory_region_add_subregion(&sphb->iommu_root, sphb->dma_win_addr,
  spapr_tce_get_iommu(tcet));
+
+sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free);
  }

  static int spapr_phb_children_reset(Object *child, void *opaque)
@@ -1667,7 +1655,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
  {
  PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
  DeviceClass *dc = DEVICE_CLASS(klass);
-sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass);
  HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass);

  hc->root_bus_path = spapr_phb_root_bus_path;
@@ -1677,7 +1664,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void 
*data)
  dc->vmsd = &vmstate_spapr_pci;
  set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
  dc->cannot_instantiate_with_device_add_yet = false;
-spc->finish_realize = spapr_phb_finish_realize;
  hp->plug = spapr_phb_hot_plug_child;
  hp->unplug = spapr_phb_hot_unplug_child;
  }
@@ -1687,7 +1673,6 @@ static const TypeInfo spapr_phb_info = {
  .parent= TYPE_PCI_HOST_BRIDGE,
  .instance_size = sizeof(sPAPRPHBState),
  .class_init= spapr_phb_class_init,
-.class_size= sizeof(sPAPRPHBClass),
  .interfaces= (InterfaceInfo[]) {
  { TYPE_HOTPLUG_HANDLER },
  { }
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 7110222..a603feb 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -32,20 +32,8 @@
  #define SPAPR_PCI_HOST_BRIDGE(obj) \
  OBJECT_CHECK(sPAPRPHBState, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE)

-#define SPAPR_PCI_HOST_BRIDGE_CLASS(klass) \
- OBJECT_CLASS_CHECK(sPAPRPHBClass, (klass), TYPE_SPAPR_PCI_HOST_BRIDGE)
-#define SPAPR_PCI_HOST_BRIDGE_GET_CLASS(obj) \
- OBJECT_GET_CLASS(sPAPRPHBClass, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE)
-
-typedef struct sPAPRPHBClass sPAPRPHBClass;
  typedef struct sPAPRPHBState sPAPRPHBState;

-struct sPAPRPHBClass {
-PCIHostBridgeClass parent_class;
-
-void (*finish_realize)(sPAPRPHBState *sphb, Error **errp);
-};
-
  typedef struct spapr_pci_msi {
  uint32_t first_irq;
  uint32_t num;




--
Alexey

Re: [Qemu-devel] [PATCH 08/12] spapr_pci: Fold spapr_phb_vfio_reset() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_vfio_reset() into
spapr_phb_reset().

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 13 -
  hw/ppc/spapr_pci_vfio.c | 16 
  2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 26d08ad..be18bf6 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1531,13 +1531,24 @@ static int spapr_phb_children_reset(Object *child, void 
*opaque)
  return 0;
  }

+static void spapr_phb_eeh_reenable(sPAPRPHBState *sphb)
+{
+vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE);
+}
+
  static void spapr_phb_reset(DeviceState *qdev)
  {
  /* Reset the IOMMU state */
  object_child_foreach(OBJECT(qdev), spapr_phb_children_reset, NULL);

  if (spapr_phb_eeh_available(SPAPR_PCI_HOST_BRIDGE(qdev))) {
-spapr_phb_vfio_reset(qdev);
+/*
+ * The PE might be in frozen state. To reenable the EEH
+ * functionality on it will clean the frozen state, which
+ * ensures that the contained PCI devices will work properly
+ * after reboot.
+ */
+spapr_phb_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
  }
  }

diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index cccd444..0a0f31a 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -73,22 +73,6 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
*sphb, Error **errp)
  spapr_tce_get_iommu(tcet));
  }

-static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb)
-{
-vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE);
-}
-
-void spapr_phb_vfio_reset(DeviceState *qdev)
-{
-/*
- * The PE might be in frozen state. To reenable the EEH
- * functionality on it will clean the frozen state, which
- * ensures that the contained PCI devices will work properly
- * after reboot.
- */
-spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
-}
-
  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);




--
Alexey

Re: [Qemu-devel] [PATCH 12/12] vfio: Eliminate vfio_container_ioctl()

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:32 PM, David Gibson wrote:

vfio_container_ioctl() was a bad interface that bypassed abstraction
boundaries, had semantics that sat uneasily with its name, and was unsafe
in many realistic circumstances.  Now that spapr-pci-vfio-host-bridge has
been folded into spapr-pci-host-bridge, there are no more users, so remove
it.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/vfio/common.c   | 45 -
  include/hw/vfio/vfio.h |  2 --
  2 files changed, 47 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e419241..26e91ad 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -959,51 +959,6 @@ void vfio_put_base_device(VFIODevice *vbasedev)
  close(vbasedev->fd);
  }

-static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid,
-   int req, void *param)
-{
-VFIOGroup *group;
-VFIOContainer *container;
-int ret = -1;
-
-group = vfio_get_group(groupid, as);
-if (!group) {
-error_report("vfio: group %d not registered", groupid);
-return ret;
-}
-
-container = group->container;
-if (group->container) {
-ret = ioctl(container->fd, req, param);
-if (ret < 0) {
-error_report("vfio: failed to ioctl %d to container: ret=%d, %s",
- _IOC_NR(req) - VFIO_BASE, ret, strerror(errno));
-}
-}
-
-vfio_put_group(group);
-
-return ret;
-}
-
-int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
- int req, void *param)
-{
-/* We allow only certain ioctls to the container */
-switch (req) {
-case VFIO_CHECK_EXTENSION:
-case VFIO_IOMMU_SPAPR_TCE_GET_INFO:
-case VFIO_EEH_PE_OP:
-break;
-default:
-/* Return an error on unknown requests */
-error_report("vfio: unsupported ioctl %X", req);
-return -1;
-}
-
-return vfio_container_do_ioctl(as, groupid, req, param);
-}
-
  /*
   * Interfaces for IBM EEH (Enhanced Error Handling)
   */
diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h
index fd3933b..7153604 100644
--- a/include/hw/vfio/vfio.h
+++ b/include/hw/vfio/vfio.h
@@ -3,8 +3,6 @@

  #include "qemu/typedefs.h"

-extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
-int req, void *param);
  bool vfio_eeh_as_ok(AddressSpace *as);
  int vfio_eeh_as_op(AddressSpace *as, uint32_t op);





--
Alexey

Re: [Qemu-devel] [PATCH 07/12] spapr_pci: Fold spapr_phb_vfio_eeh_set_option() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_eeh_set_option() into
rtas_ibm_set_eeh_option().

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 43 +++--
  hw/ppc/spapr_pci_vfio.c | 47 -
  include/hw/pci-host/spapr.h |  4 
  3 files changed, 41 insertions(+), 53 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index eaae7e2..26d08ad 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -450,6 +450,7 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
  sPAPRPHBState *sphb;
  uint32_t addr, option;
  uint64_t buid;
+uint32_t op;
  int ret;

  if ((nargs != 4) || (nret != 1)) {
@@ -469,8 +470,46 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option);
-rtas_st(rets, 0, ret);
+switch (option) {
+case RTAS_EEH_DISABLE:
+op = VFIO_EEH_PE_DISABLE;
+break;
+case RTAS_EEH_ENABLE: {
+PCIHostState *phb;
+PCIDevice *pdev;
+
+/*
+ * The EEH functionality is enabled on basis of PCI device,
+ * instead of PE. We need check the validity of the PCI
+ * device address.
+ */
+phb = PCI_HOST_BRIDGE(sphb);
+pdev = pci_find_device(phb->bus,
+   (addr >> 16) & 0xFF, (addr >> 8) & 0xFF);
+if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
+goto param_error_exit;
+}
+
+op = VFIO_EEH_PE_ENABLE;
+break;
+}
+case RTAS_EEH_THAW_IO:
+op = VFIO_EEH_PE_UNFREEZE_IO;
+break;
+case RTAS_EEH_THAW_DMA:
+op = VFIO_EEH_PE_UNFREEZE_DMA;
+break;
+default:
+goto param_error_exit;
+}
+
+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
+if (ret < 0) {
+rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+return;
+}
+
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
  return;

  param_error_exit:
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index c87f2e4..cccd444 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -89,53 +89,6 @@ void spapr_phb_vfio_reset(DeviceState *qdev)
  spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
  }

-int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
-  unsigned int addr, int option)
-{
-uint32_t op;
-int ret;
-
-switch (option) {
-case RTAS_EEH_DISABLE:
-op = VFIO_EEH_PE_DISABLE;
-break;
-case RTAS_EEH_ENABLE: {
-PCIHostState *phb;
-PCIDevice *pdev;
-
-/*
- * The EEH functionality is enabled on basis of PCI device,
- * instead of PE. We need check the validity of the PCI
- * device address.
- */
-phb = PCI_HOST_BRIDGE(sphb);
-pdev = pci_find_device(phb->bus,
-   (addr >> 16) & 0xFF, (addr >> 8) & 0xFF);
-if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
-return RTAS_OUT_PARAM_ERROR;
-}
-
-op = VFIO_EEH_PE_ENABLE;
-break;
-}
-case RTAS_EEH_THAW_IO:
-op = VFIO_EEH_PE_UNFREEZE_IO;
-break;
-case RTAS_EEH_THAW_DMA:
-op = VFIO_EEH_PE_UNFREEZE_DMA;
-break;
-default:
-return RTAS_OUT_PARAM_ERROR;
-}
-
-ret = vfio_eeh_as_op(&sphb->iommu_as, op);
-if (ret < 0) {
-return RTAS_OUT_HW_ERROR;
-}
-
-return RTAS_OUT_SUCCESS;
-}
-
  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 55237fc..d32750e 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -135,8 +135,4 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, 
uint64_t buid,
uint32_t config_addr);
  void spapr_phb_vfio_reset(DeviceState *qdev);

-/* VFIO EEH hooks */
-int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
-  unsigned int addr, int option);
-
  #endif /* __HW_SPAPR_PCI_H__ */




--
Alexey

Re: [Qemu-devel] [PATCH 05/12] spapr_pci: Fold spapr_phb_vfio_eeh_reset() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_reset() into
rtas_ibm_set_slot_reset().  We move several functions of which it was
the only caller (spapr_phb_eeh_clear_dev_msix(),
spapr_phb_eeh_clear_bus_msix() and spapr_phb_eeh_pre_reset()) into
spapr_pci.c along with it.

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 68 --
  hw/ppc/spapr_pci_vfio.c | 72 -
  include/hw/pci-host/spapr.h |  1 -
  3 files changed, 66 insertions(+), 75 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index fa633a9..96cdca2 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -573,6 +573,48 @@ param_error_exit:
  rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
  }

+static void spapr_phb_eeh_clear_dev_msix(PCIBus *bus, PCIDevice *pdev,
+ void *opaque)
+{
+/* Check if the device is VFIO PCI device */
+if (!object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
+return;
+}
+
+/*
+ * The MSIx table will be cleaned out by reset. We need
+ * disable it so that it can be reenabled properly. Also,
+ * the cached MSIx table should be cleared as it's not
+ * reflecting the contents in hardware.
+ */
+if (msix_enabled(pdev)) {
+uint16_t flags;
+
+flags = pci_host_config_read_common(pdev,
+pdev->msix_cap + PCI_MSIX_FLAGS,
+pci_config_size(pdev), 2);
+flags &= ~PCI_MSIX_FLAGS_ENABLE;
+pci_host_config_write_common(pdev,
+ pdev->msix_cap + PCI_MSIX_FLAGS,
+ pci_config_size(pdev), flags, 2);
+}
+
+msix_reset(pdev);
+}
+
+static void spapr_phb_eeh_clear_bus_msix(PCIBus *bus, void *opaque)
+{
+   pci_for_each_device(bus, pci_bus_num(bus),
+   spapr_phb_eeh_clear_dev_msix, NULL);
+}
+
+static void spapr_phb_eeh_pre_reset(sPAPRPHBState *sphb)
+{
+   PCIHostState *phb = PCI_HOST_BRIDGE(sphb);
+
+   pci_for_each_bus(phb->bus, spapr_phb_eeh_clear_bus_msix, NULL);
+}
+
  static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  sPAPRMachineState *spapr,
  uint32_t token, uint32_t nargs,
@@ -582,6 +624,7 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  sPAPRPHBState *sphb;
  uint32_t option;
  uint64_t buid;
+uint32_t op;
  int ret;

  if ((nargs != 4) || (nret != 1)) {
@@ -599,8 +642,29 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-ret = spapr_phb_vfio_eeh_reset(sphb, option);
-rtas_st(rets, 0, ret);
+switch (option) {
+case RTAS_SLOT_RESET_DEACTIVATE:
+op = VFIO_EEH_PE_RESET_DEACTIVATE;
+break;
+case RTAS_SLOT_RESET_HOT:
+spapr_phb_eeh_pre_reset(sphb);
+op = VFIO_EEH_PE_RESET_HOT;
+break;
+case RTAS_SLOT_RESET_FUNDAMENTAL:
+spapr_phb_eeh_pre_reset(sphb);
+op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
+break;
+default:
+goto param_error_exit;
+}
+
+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
+if (ret < 0) {
+rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
+return;
+}
+
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
  return;

  param_error_exit:
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 4ac5736..54075e0 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -149,78 +149,6 @@ int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int 
*state)
  return RTAS_OUT_SUCCESS;
  }

-static void spapr_phb_vfio_eeh_clear_dev_msix(PCIBus *bus,
-  PCIDevice *pdev,
-  void *opaque)
-{
-/* Check if the device is VFIO PCI device */
-if (!object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
-return;
-}
-
-/*
- * The MSIx table will be cleaned out by reset. We need
- * disable it so that it can be reenabled properly. Also,
- * the cached MSIx table should be cleared as it's not
- * reflecting the contents in hardware.
- */
-if (msix_enabled(pdev)) {
-uint16_t flags;
-
-flags = pci_host_config_read_common(pdev,
-pdev->msix_cap + PCI_MSIX_FLAGS,
-pci_config_size(pdev), 2);
-flags &= ~PCI_MSIX_FLAGS_ENABLE;
-pci_host_config_write_common(pdev,
- pdev->msix_cap + PCI_MSIX_FLAGS,
- pci_config_size(pdev), flags, 2);
-}
-
-msix_reset(pdev);
-}
-
-static void spapr_phb_vfio_eeh_clear_bus_msix(PCIBus *bus, void *opaque

Re: [Qemu-devel] [PATCH 03/12] spapr_pci: Eliminate class callbacks

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.


Reviewed-by: Alexey Kardashevskiy 



Signed-off-by: David Gibson 
---
  hw/ppc/spapr_pci.c  | 44 ++--
  hw/ppc/spapr_pci_vfio.c | 18 +++---
  include/hw/pci-host/spapr.h | 13 +
  3 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 9b2b546..d1e5222 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -92,6 +92,13 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, 
uint64_t buid,
  return pci_find_device(phb->bus, bus_num, devfn);
  }

+static bool spapr_phb_eeh_available(sPAPRPHBState *sphb)
+{
+sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
+
+return spc->eeh_available;
+}
+
  static uint32_t rtas_pci_cfgaddr(uint32_t arg)
  {
  /* This handles the encoding of extended config space addresses */
@@ -438,7 +445,6 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
  target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint32_t addr, option;
  uint64_t buid;
  int ret;
@@ -456,12 +462,11 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_set_option(sphb, addr, option);
+ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option);
  rtas_st(rets, 0, ret);
  return;

@@ -476,7 +481,6 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
 target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  PCIDevice *pdev;
  uint32_t addr, option;
  uint64_t buid;
@@ -491,8 +495,7 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_set_option) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

@@ -532,7 +535,6 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
  target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint64_t buid;
  int state, ret;

@@ -546,12 +548,11 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU 
*cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_get_state) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_get_state(sphb, &state);
+ret = spapr_phb_vfio_eeh_get_state(sphb, &state);
  rtas_st(rets, 0, ret);
  if (ret != RTAS_OUT_SUCCESS) {
  return;
@@ -576,7 +577,6 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint32_t option;
  uint64_t buid;
  int ret;
@@ -592,12 +592,11 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_reset) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_reset(sphb, option);
+ret = spapr_phb_vfio_eeh_reset(sphb, option);
  rtas_st(rets, 0, ret);
  return;

@@ -612,7 +611,6 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  uint64_t buid;
  int ret;

@@ -626,12 +624,11 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb);
-if (!spc->eeh_configure) {
+if (!spapr_phb_eeh_available(sphb)) {
  goto param_error_exit;
  }

-ret = spc->eeh_configure(sphb);
+ret = spapr_phb_vfio_eeh_configure(sphb);
  rtas_st(rets, 0, ret);
  return;

@@ -647,7 +644,6 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu,
 target_ulong rets)
  {
  sPAPRPHBState *sphb;
-sPAPRPHBClass *spc;
  int option;
  uint64_t buid;

@@ -661,8 +657,7 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu,
  goto param_e

Re: [Qemu-devel] [PATCH 06/12] spapr_pci: Fold spapr_phb_vfio_eeh_get_state() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_get_state9)
into rtas_ibm_read_slot_reset_state2().

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 12 ++--
  hw/ppc/spapr_pci_vfio.c | 13 -
  include/hw/pci-host/spapr.h |  1 -
  3 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 96cdca2..eaae7e2 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -539,7 +539,7 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu,
  {
  sPAPRPHBState *sphb;
  uint64_t buid;
-int state, ret;
+int ret;

  if ((nargs != 3) || (nret != 4 && nret != 5)) {
  goto param_error_exit;
@@ -555,13 +555,13 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU 
*cpu,
  goto param_error_exit;
  }

-ret = spapr_phb_vfio_eeh_get_state(sphb, &state);
-rtas_st(rets, 0, ret);
-if (ret != RTAS_OUT_SUCCESS) {
-return;
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE);
+if (ret < 0) {
+goto param_error_exit;
  }

-rtas_st(rets, 1, state);
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
+rtas_st(rets, 1, ret);
  rtas_st(rets, 2, RTAS_EEH_SUPPORT);
  rtas_st(rets, 3, RTAS_EEH_PE_UNAVAIL_INFO);
  if (nret >= 5) {
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 54075e0..c87f2e4 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -136,19 +136,6 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
  return RTAS_OUT_SUCCESS;
  }

-int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state)
-{
-int ret;
-
-ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE);
-if (ret < 0) {
-return RTAS_OUT_PARAM_ERROR;
-}
-
-*state = ret;
-return RTAS_OUT_SUCCESS;
-}
-
  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index d31162b..55237fc 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -138,6 +138,5 @@ void spapr_phb_vfio_reset(DeviceState *qdev);
  /* VFIO EEH hooks */
  int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
unsigned int addr, int option);
-int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state);

  #endif /* __HW_SPAPR_PCI_H__ */




--
Alexey

Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure()
into rtas_ibm_configure_pe().

Signed-off-by: David Gibson 


Reviewed-by: Alexey Kardashevskiy 


---
  hw/ppc/spapr_pci.c  | 11 +--
  hw/ppc/spapr_pci_vfio.c | 12 
  include/hw/pci-host/spapr.h |  1 -
  3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index d1e5222..fa633a9 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -42,6 +42,9 @@
  #include "hw/ppc/spapr_drc.h"
  #include "sysemu/device_tree.h"

+#include "hw/vfio/vfio.h"
+#include 
+
  /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
  #define RTAS_QUERY_FN   0
  #define RTAS_CHANGE_FN  1
@@ -628,8 +631,12 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu,
  goto param_error_exit;
  }

-ret = spapr_phb_vfio_eeh_configure(sphb);
-rtas_st(rets, 0, ret);
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE);
+if (ret < 0) {
+goto param_error_exit;
+}
+
+rtas_st(rets, 0, RTAS_OUT_SUCCESS);
  return;

  param_error_exit:
diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 10fa88a..4ac5736 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -221,18 +221,6 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int 
option)
  return RTAS_OUT_SUCCESS;
  }

-int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
-{
-int ret;
-
-ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE);
-if (ret < 0) {
-return RTAS_OUT_PARAM_ERROR;
-}
-
-return RTAS_OUT_SUCCESS;
-}
-
  static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index cc1b82c..f02ef51 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -140,6 +140,5 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
unsigned int addr, int option);
  int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state);
  int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option);
-int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb);

  #endif /* __HW_SPAPR_PCI_H__ */




--
Alexey

Re: [Qemu-devel] [PATCH 02/12] spapr_pci: Switch to vfio_eeh_as_op() interface

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
interface.

Signed-off-by: David Gibson mak


Where is  that "mak" from? :)


Reviewed-by: Alexey Kardashevskiy 



---
  hw/ppc/spapr_pci_vfio.c | 50 -
  1 file changed, 16 insertions(+), 34 deletions(-)

diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
index 2f3752e..b1e8e8e 100644
--- a/hw/ppc/spapr_pci_vfio.c
+++ b/hw/ppc/spapr_pci_vfio.c
@@ -73,15 +73,9 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState 
*sphb, Error **errp)
  spapr_tce_get_iommu(tcet));
  }

-static void spapr_phb_vfio_eeh_reenable(sPAPRPHBVFIOState *svphb)
+static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb)
  {
-struct vfio_eeh_pe_op op = {
-.argsz = sizeof(op),
-.op= VFIO_EEH_PE_ENABLE
-};
-
-vfio_container_ioctl(&svphb->phb.iommu_as,
- svphb->iommugroupid, VFIO_EEH_PE_OP, &op);
+vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE);
  }

  static void spapr_phb_vfio_reset(DeviceState *qdev)
@@ -92,19 +86,18 @@ static void spapr_phb_vfio_reset(DeviceState *qdev)
   * ensures that the contained PCI devices will work properly
   * after reboot.
   */
-spapr_phb_vfio_eeh_reenable(SPAPR_PCI_VFIO_HOST_BRIDGE(qdev));
+spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev));
  }

  static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb,
   unsigned int addr, int option)
  {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
+uint32_t op;
  int ret;

  switch (option) {
  case RTAS_EEH_DISABLE:
-op.op = VFIO_EEH_PE_DISABLE;
+op = VFIO_EEH_PE_DISABLE;
  break;
  case RTAS_EEH_ENABLE: {
  PCIHostState *phb;
@@ -122,21 +115,20 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState 
*sphb,
  return RTAS_OUT_PARAM_ERROR;
  }

-op.op = VFIO_EEH_PE_ENABLE;
+op = VFIO_EEH_PE_ENABLE;
  break;
  }
  case RTAS_EEH_THAW_IO:
-op.op = VFIO_EEH_PE_UNFREEZE_IO;
+op = VFIO_EEH_PE_UNFREEZE_IO;
  break;
  case RTAS_EEH_THAW_DMA:
-op.op = VFIO_EEH_PE_UNFREEZE_DMA;
+op = VFIO_EEH_PE_UNFREEZE_DMA;
  break;
  default:
  return RTAS_OUT_PARAM_ERROR;
  }

-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
  if (ret < 0) {
  return RTAS_OUT_HW_ERROR;
  }
@@ -146,13 +138,9 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState 
*sphb,

  static int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state)
  {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
  int ret;

-op.op = VFIO_EEH_PE_GET_STATE;
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE);
  if (ret < 0) {
  return RTAS_OUT_PARAM_ERROR;
  }
@@ -206,28 +194,26 @@ static void spapr_phb_vfio_eeh_pre_reset(sPAPRPHBState 
*sphb)

  static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option)
  {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
+uint32_t op;
  int ret;

  switch (option) {
  case RTAS_SLOT_RESET_DEACTIVATE:
-op.op = VFIO_EEH_PE_RESET_DEACTIVATE;
+op = VFIO_EEH_PE_RESET_DEACTIVATE;
  break;
  case RTAS_SLOT_RESET_HOT:
  spapr_phb_vfio_eeh_pre_reset(sphb);
-op.op = VFIO_EEH_PE_RESET_HOT;
+op = VFIO_EEH_PE_RESET_HOT;
  break;
  case RTAS_SLOT_RESET_FUNDAMENTAL:
  spapr_phb_vfio_eeh_pre_reset(sphb);
-op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
+op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
  break;
  default:
  return RTAS_OUT_PARAM_ERROR;
  }

-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
-   VFIO_EEH_PE_OP, &op);
+ret = vfio_eeh_as_op(&sphb->iommu_as, op);
  if (ret < 0) {
  return RTAS_OUT_HW_ERROR;
  }
@@ -237,13 +223,9 @@ static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, 
int option)

  static int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb)
  {
-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb);
-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) };
  int ret;

-op.op = VFIO_EEH_PE_CONFIGURE;
-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb

Re: [Qemu-devel] [PATCH 10/12] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:32 PM, David Gibson wrote:

Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
 1. automatically sizes its DMA window to match the host IOMMU
 2. checks if the attached VFIO container is backed by the
VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.


btw why exactly is it a bad idea?


Anyway,
Reviewed-by: Alexey Kardashevskiy 


--
Alexey

Re: [Qemu-devel] [PATCH 01/12] vfio: Start improving VFIO/EEH interface

2016-02-28 Thread Alexey Kardashevskiy


On 02/26/2016 10:31 PM, David Gibson wrote:

At present the code handling IBM's Enhanced Error Handling (EEH) interface
on VFIO devices operates by bypassing the usual VFIO logic with
vfio_container_ioctl().  That's a poorly designed interface with unclear
semantics about exactly what can be operated on.

In particular it operates on a single vfio container internally (hence the
name), but takes an address space and group id, from which it deduces the
container in a rather roundabout way.  groupids are something that code
outside vfio shouldn't even be aware of.

This patch creates new interfaces for EEH operations.  Internally we
have vfio_eeh_container_op() which takes a VFIOContainer object
directly.  For external use we have vfio_eeh_as_ok() which determines
if an AddressSpace is usable for EEH (at present this means it has a
single container and at most a single group attached), and
vfio_eeh_as_op() which will perform an operation on an AddressSpace in
the unambiguous case, and otherwise returns an error.

This interface still isn't great, but it's enough of an improvement to
allow a number of cleanups in other places.

Signed-off-by: David Gibson 
---
  hw/vfio/common.c   | 77 ++
  include/hw/vfio/vfio.h |  2 ++
  2 files changed, 79 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 607ec70..e419241 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1003,3 +1003,80 @@ int vfio_container_ioctl(AddressSpace *as, int32_t 
groupid,

  return vfio_container_do_ioctl(as, groupid, req, param);
  }
+
+/*
+ * Interfaces for IBM EEH (Enhanced Error Handling)
+ */
+static bool vfio_eeh_container_ok(VFIOContainer *container)
+{
+/* A broken kernel implementation means EEH operations won't work
+ * correctly if there are multiple groups in a container */
+
+if (!QLIST_EMPTY(&container->group_list)
+&& QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) {
+return false;
+}
+
+return true;


If &container->group_list is empty, this helper returns "true". Does not 
look right, does it?...




+}
+
+static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op)
+{
+struct vfio_eeh_pe_op pe_op = {
+.argsz = sizeof(pe_op),
+.op = op,
+};
+int ret;
+
+if (!vfio_eeh_container_ok(container)) {
+error_report("vfio/eeh: EEH_PE_OP 0x%x called on container"
+ " with multiple groups", op);
+return -EPERM;
+}
+
+ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op);
+if (ret < 0) {
+error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op);
+return -errno;
+}
+
+return 0;
+}
+
+static VFIOContainer *vfio_eeh_as_container(AddressSpace *as)
+{
+VFIOAddressSpace *space = vfio_get_address_space(as);
+VFIOContainer *container = NULL;
+
+if (QLIST_EMPTY(&space->containers)) {
+/* No containers to act on */
+goto out;
+}
+
+container = QLIST_FIRST(&space->containers);
+
+if (QLIST_NEXT(container, next)) {
+/* We don't yet have logic to synchronize EEH state across
+ * multiple containers */
+container = NULL;
+goto out;
+}
+
+out:
+vfio_put_address_space(space);
+return container;
+}
+
+bool vfio_eeh_as_ok(AddressSpace *as)
+{
+VFIOContainer *container = vfio_eeh_as_container(as);
+
+return (container != NULL) && vfio_eeh_container_ok(container);
+}
+
+int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
+{
+VFIOContainer *container = vfio_eeh_as_container(as);
+
+return vfio_eeh_container_op(container, op);



vfio_eeh_as_ok() checks for (container != NULL) but this one does not, 
should not it?




+}
diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h
index 0b26cd8..fd3933b 100644
--- a/include/hw/vfio/vfio.h
+++ b/include/hw/vfio/vfio.h
@@ -5,5 +5,7 @@

  extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid,
  int req, void *param);
+bool vfio_eeh_as_ok(AddressSpace *as);
+int vfio_eeh_as_op(AddressSpace *as, uint32_t op);

  #endif




--
Alexey

Re: [Qemu-devel] [PATCH v2] xics: report errors with the QEMU Error API

2016-02-28 Thread David Gibson

On Fri, Feb 26, 2016 at 10:44:07AM +0100, Greg Kurz wrote:
> Using the return value to report errors is error prone:
> - xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors
>   on 0
> - xics_alloc_block() returns the unclear value of ics->offset - 1 on error
>   but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0
> 
> This patch adds an errp argument to xics_alloc() and xics_alloc_block() to
> report errors. The return value of these functions is a valid IRQ number
> if errp is NULL. It is undefined otherwise.
> 
> The corresponding error traces get promotted to error messages. Note that
> the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also
> moves to xics_alloc(). Similar error message consolidation isn't really
> applicable to xics_alloc_block() because callers have extra context (device
> config address, MSI or MSIX).
> 
> This fixes the issues mentioned above.
> 
> Based on previous work from Brian W. Hart.
> 
> Signed-off-by: Greg Kurz 

Merged, thanks.

> ---
> v2: - reverted to non-void xics_alloc() and xics_alloc_block()
> - consolidated error message in xics_alloc()
> - pass &error_fatal instead of NULL in spapr_events_init()
> ---
> 
>  hw/intc/xics.c|   13 +
>  hw/ppc/spapr_events.c |3 ++-
>  hw/ppc/spapr_pci.c|   16 ++--
>  hw/ppc/spapr_vio.c|7 ---
>  include/hw/ppc/xics.h |5 +++--
>  trace-events  |2 --
>  6 files changed, 28 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index e66ae328819a..213a3709254c 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -712,7 +712,7 @@ static int ics_find_free_block(ICSState *ics, int num, 
> int alignnum)
>  return -1;
>  }
>  
> -int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi)
> +int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi, Error **errp)
>  {
>  ICSState *ics = &icp->ics[src];
>  int irq;
> @@ -720,14 +720,14 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, 
> bool lsi)
>  if (irq_hint) {
>  assert(src == xics_find_source(icp, irq_hint));
>  if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) {
> -trace_xics_alloc_failed_hint(src, irq_hint);
> +error_setg(errp, "can't allocate IRQ %d: already in use", 
> irq_hint);
>  return -1;
>  }
>  irq = irq_hint;
>  } else {
>  irq = ics_find_free_block(ics, 1, 1);
>  if (irq < 0) {
> -trace_xics_alloc_failed_no_left(src);
> +error_setg(errp, "can't allocate IRQ: no IRQ left");
>  return -1;
>  }
>  irq += ics->offset;
> @@ -743,7 +743,8 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, 
> bool lsi)
>   * Allocate block of consecutive IRQs, and return the number of the first 
> IRQ in the block.
>   * If align==true, aligns the first IRQ number to num.
>   */
> -int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align)
> +int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align,
> + Error **errp)
>  {
>  int i, first = -1;
>  ICSState *ics = &icp->ics[src];
> @@ -763,6 +764,10 @@ int xics_alloc_block(XICSState *icp, int src, int num, 
> bool lsi, bool align)
>  } else {
>  first = ics_find_free_block(ics, num, 1);
>  }
> +if (first < 0) {
> +error_setg(errp, "can't find a free %d-IRQ block", num);
> +return -1;
> +}
>  
>  if (first >= 0) {
>  for (i = first; i < first + num; ++i) {
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index f5eac4b5441c..39f4682f957f 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -588,7 +588,8 @@ out_no_events:
>  void spapr_events_init(sPAPRMachineState *spapr)
>  {
>  QTAILQ_INIT(&spapr->pending_events);
> -spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false);
> +spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false,
> +&error_fatal);
>  spapr->epow_notifier.notify = spapr_powerdown_req;
>  qemu_register_powerdown_notifier(&spapr->epow_notifier);
>  spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception",
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 9b2b546b541c..e8edad3ab7c3 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -280,6 +280,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  PCIDevice *pdev = NULL;
>  spapr_pci_msi *msi;
>  int *config_addr_key;
> +Error *err = NULL;
>  
>  switch (func) {
>  case RTAS_CHANGE_MSI_FN:
> @@ -354,9 +355,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  
>  /* Allocate MSIs */
>  irq = xics_alloc_block(spapr->icp, 0, req_num, false,
> -   ret_intr_type == RTAS_TYPE_MSI);

Re: [Qemu-devel] [PATCH v2 01/16] tcg-mips: Always use tcg_debug_assert

2016-02-28 Thread Aurelien Jarno

On 2016-02-15 14:42, Richard Henderson wrote:
> Signed-off-by: Richard Henderson 
> ---
>  tcg/mips/tcg-target.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/tcg/mips/tcg-target.c b/tcg/mips/tcg-target.c
> index 2dc4998..ebb936d 100644
> --- a/tcg/mips/tcg-target.c
> +++ b/tcg/mips/tcg-target.c
> @@ -128,7 +128,7 @@ static inline uint32_t reloc_pc16_val(tcg_insn_unit *pc, 
> tcg_insn_unit *target)
>  {
>  /* Let the compiler perform the right-shift as part of the arithmetic.  
> */
>  ptrdiff_t disp = target - (pc + 1);
> -assert(disp == (int16_t)disp);
> +tcg_debug_assert(disp == (int16_t)disp);
>  return disp & 0x;
>  }
>  
> @@ -139,7 +139,7 @@ static inline void reloc_pc16(tcg_insn_unit *pc, 
> tcg_insn_unit *target)
>  
>  static inline uint32_t reloc_26_val(tcg_insn_unit *pc, tcg_insn_unit *target)
>  {
> -assertuintptr_t)pc ^ (uintptr_t)target) & 0xf000) == 0);
> +tcg_debug_assertuintptr_t)pc ^ (uintptr_t)target) & 0xf000) == 
> 0);
>  return ((uintptr_t)target >> 2) & 0x3ff;
>  }
>  
> @@ -151,8 +151,8 @@ static inline void reloc_26(tcg_insn_unit *pc, 
> tcg_insn_unit *target)
>  static void patch_reloc(tcg_insn_unit *code_ptr, int type,
>  intptr_t value, intptr_t addend)
>  {
> -assert(type == R_MIPS_PC16);
> -assert(addend == 0);
> +tcg_debug_assert(type == R_MIPS_PC16);
> +tcg_debug_assert(addend == 0);
>  reloc_pc16(code_ptr, (tcg_insn_unit *)value);
>  }
>  
> @@ -433,7 +433,7 @@ static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, 
> void *target)
>  if ((from ^ dest) & -(1 << 28)) {
>  return false;
>  }
> -assert((dest & 3) == 0);
> +tcg_debug_assert((dest & 3) == 0);
>  
>  inst = opc;
>  inst |= (dest >> 2) & 0x3ff;
> @@ -808,9 +808,9 @@ static void tcg_out_setcond2(TCGContext *s, TCGCond cond, 
> TCGReg ret,
>  TCGReg tmp0 = TCG_TMP0;
>  TCGReg tmp1 = ret;
>  
> -assert(ret != TCG_TMP0);
> +tcg_debug_assert(ret != TCG_TMP0);
>  if (ret == ah || ret == bh) {
> -assert(ret != TCG_TMP1);
> +tcg_debug_assert(ret != TCG_TMP1);
>  tmp1 = TCG_TMP1;
>  }
>  
> @@ -1471,8 +1471,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode 
> opc,
>  case INDEX_op_and_i32:
>  if (c2 && a2 != (uint16_t)a2) {
>  int msb = ctz32(~a2) - 1;
> -assert(use_mips32r2_instructions);
> -assert(is_p2m1(a2));
> +tcg_debug_assert(use_mips32r2_instructions);
> +tcg_debug_assert(is_p2m1(a2));
>  tcg_out_opc_bf(s, OPC_EXT, a0, a1, msb, 0);
>  break;
>  }

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH v2 00/16] tcg mips64 and mips r6 improvements

2016-02-28 Thread Aurelien Jarno

On 2016-02-15 14:42, Richard Henderson wrote:
> Changes since v1:
>   * Some bugs pointed out by Mark fixed.
>   * Canonicalize the whole file on tcg_debug_assert.
>   * Switch bswap code to subroutine earlier; the first patch is
> standalone for mips32, and there is no longer an intermediate
> patch with inline bswap for mips64.
>   * Use NAL for pre-r6 mips64 loading of the slow path return address.
> 

Thanks a lot for working on that, it's something I have on my TODO list
for months. I have finally found time to have a look and give a try over
the week-end (sorry about the delay).

It seems to work perfectly for 64-bit guests on mips64el but 32-bit
guests end-up quickly in a segmentation fault. It's easily reproducible 
by starting qemu-system-i386 on a mips64el host, it crashes when
executing seabios.

More problematic it seems that the patch "Adjust qemu_ld/st for mips64"
causes a regression on at least a big-endian 32-bit host running
qemu-system-i386. It is reproducible by booting a Debian i386 wheezy
guest on such a system. Unfortunately the week-end was too short for
finding the issue, I'll continue looking in the next days.

I have a few comments on the individual patches, I'll send them asap.
Note that I don't have an R6 machine, so I haven't been able to test
that part.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [QEMU] Windows XP / Windows 95 / MS-DOS 6 regressions

2016-02-28 Thread Hervé Poussineau


Hi,

I currently see some regressions on Microsoft operating systems.

1) Windows XP bugchecks since commit:
commit 7f0b7141b4c7deab51efd8ee1e83eab2d9b7a9ea
Author: Richard Henderson 
Date:   Mon Jul 6 17:29:59 2015 +0100

target-i386: Perform set/reset_inhibit_irq inline

With helpers that can be reused for other things.

Signed-off-by: Richard Henderson 

I'm starting QEMU with -cpu pentium2.
Attached patch can be applied on master to work-around the problem.
Another work-around is to start with -enable-kvm.

2) Windows 95 bugchecks since commit:
commit d6a2914984c89fa0a3125b9842e0cbf68de79a3d
Author: Richard Henderson 
Date:   Thu Dec 17 11:19:19 2015 -0800

target-i386: Use gen_lea_v_seg in gen_lea_modrm

Centralize handling of segment bases.

Signed-off-by: Richard Henderson 
Message-Id: <1450379966-28198-4-git-send-email-...@twiddle.net>
Signed-off-by: Paolo Bonzini 

(with 88c73d16ad1b6c22a2ab082064d0d521f756296a and 
4987783400667147ada01a5bdcce53f11b822888 cherry-picked)

I'm starting QEMU with -cpu pentium.
I've not searched for a work-around.

3) MS-DOS 6 freezes when loading himem.sys since commit:
commit 1906b2af7c2345037d9b2fdf484b457b5acd09d1
Author: Richard Henderson 
Date: Thu Jul 2 13:59:21 2015 +0100

target-i386: Rearrange processing of 0F 01

Rather than nesting tests of OP, MOD, and RM, decode them
all at once with a switch. Fixes incorrect decoding of
AMD Pacifica extensions (aka vmrun et al) via op==2 path.

Signed-off-by: Richard Henderson 

I'm starting QEMU with -cpu 486.
It works on master if I add -enable-kvm

Hervé
>From 0e66ca87ac7c94219ab49cfbed6f586c51c697de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Herv=C3=A9=20Poussineau?= 
Date: Sun, 28 Feb 2016 15:00:29 +0100
Subject: [PATCH] target-i386: partially revert 'Perform set/reset_inhibit_irq
 inline'

This partially reverts commit 7f0b7141b4c7deab51efd8ee1e83eab2d9b7a9ea to fix Windows XP.
---
 target-i386/cc_helper.c | 5 +
 target-i386/helper.h| 1 +
 target-i386/translate.c | 3 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/target-i386/cc_helper.c b/target-i386/cc_helper.c
index 83af223..e6f4fe9 100644
--- a/target-i386/cc_helper.c
+++ b/target-i386/cc_helper.c
@@ -383,3 +383,8 @@ void helper_sti_vm(CPUX86State *env)
 }
 }
 #endif
+
+void helper_set_inhibit_irq(CPUX86State *env)
+{
+env->hflags |= HF_INHIBIT_IRQ_MASK;
+}
diff --git a/target-i386/helper.h b/target-i386/helper.h
index e33451a..bdba6bd 100644
--- a/target-i386/helper.h
+++ b/target-i386/helper.h
@@ -70,6 +70,7 @@ DEF_HELPER_1(cli, void, env)
 DEF_HELPER_1(sti, void, env)
 DEF_HELPER_1(clac, void, env)
 DEF_HELPER_1(stac, void, env)
+DEF_HELPER_1(set_inhibit_irq, void, env)
 DEF_HELPER_3(boundw, void, env, tl, int)
 DEF_HELPER_3(boundl, void, env, tl, int)
 DEF_HELPER_1(rsm, void, env)
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 9171929..4ec94b9 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -6780,7 +6780,8 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
 /* interruptions are enabled only the first insn after sti */
 /* If several instructions disable interrupts, only the
_first_ does it */
-gen_set_hflag(s, HF_INHIBIT_IRQ_MASK);
+if (!(s->tb->flags & HF_INHIBIT_IRQ_MASK))
+gen_helper_set_inhibit_irq(cpu_env);
 /* give a chance to handle pending irqs */
 gen_jmp_im(s->pc - s->cs_base);
 gen_eob(s);
-- 
2.1.4

Re: [Qemu-devel] [PATCH v2 0/9] Add i.MX6 (Single/Dual/Quad) support

2016-02-28 Thread Jean-Christophe DUBOIS


Peter,

Are you OK if I merge the 2 series (i.MX6 and SPI for i.MX6) into a 
single series?


Thanks.

JC

Le 08/02/2016 23:08, Jean-Christophe Dubois a écrit :

This patch series adds support for the Freescale i.MX6 processor.

For now we only support the following devices:
* up to 4 Cortex A9 cores
* A9 MPCORE (SCU, GIC, TWD)
* 5 i.MX UARTs
* 2 EPIT timers
* 1 GPT timer
* 7 GPIO controllers
* 6 SDHC controllers
* 1 CCM device
* 1 SRC device
* 3 I2C devices
* various ROM/RAM areas.

This also adds the sabrelite board as a an actual platform for i.MX6.

This series was tested by booting a 4.4 linux kernel (using the
imx_v6_v7_defconfig file as kernel configuration).

Note1: as sabrelite uses uart2 as console, you have to redirect the second
QEMU serial port to stdout.
 qemu-system-arm -M sabrelite -display none ... -serial null -serial stdio

Note2: as the SPI controller is not yet supported in QEMU for i.MX you need
to disable SPI controllers in Linux DTS tree. Otherwise Linux would hang
during the boot waiting for a SPI device to respond.

Note3: You need to disable the GPIO section related to physical push buttons
in the Linux DTS tree in order to avoid excecive interrupt triggering on
GPIO devices for non existant buttons.

Jean-Christophe Dubois (9):
   i.MX: Allow GPT timer to rollover.
   i.MX: Rename CCM NOCLK to CLK_NONE for naming consistency.
   i.MX: Remove CCM useless clock computation handling.
   i.MX: Add the CLK_IPG_HIGH clock
   i.MX: Add i.MX6 CCM and ANALOG device.
   i.MX: Add i.MX6 System Reset Controller device.
   i.MX: Add i.MX6 SOC implementation.
   i.MX: Add sabrelite i.MX6 emulation.
   i.MX: Add missing descriptions in devices.

  default-configs/arm-softmmu.mak |   1 +
  hw/arm/Makefile.objs|   1 +
  hw/arm/fsl-imx25.c  |   1 +
  hw/arm/fsl-imx31.c  |   1 +
  hw/arm/fsl-imx6.c   | 407 +
  hw/arm/sabrelite.c  |  93 +
  hw/i2c/imx_i2c.c|   1 +
  hw/misc/Makefile.objs   |   2 +
  hw/misc/imx25_ccm.c |  29 +-
  hw/misc/imx31_ccm.c |  35 +-
  hw/misc/imx6_ccm.c  | 773 
  hw/misc/imx6_src.c  | 353 ++
  hw/net/imx_fec.c|   1 +
  hw/timer/imx_epit.c |   8 +-
  hw/timer/imx_gpt.c  |  43 +--
  include/hw/arm/fsl-imx6.h   | 447 +++
  include/hw/misc/imx6_ccm.h  | 197 ++
  include/hw/misc/imx6_src.h  |  73 
  include/hw/misc/imx_ccm.h   |  10 +-
  19 files changed, 2384 insertions(+), 92 deletions(-)
  create mode 100644 hw/arm/fsl-imx6.c
  create mode 100644 hw/arm/sabrelite.c
  create mode 100644 hw/misc/imx6_ccm.c
  create mode 100644 hw/misc/imx6_src.c
  create mode 100644 include/hw/arm/fsl-imx6.h
  create mode 100644 include/hw/misc/imx6_ccm.h
  create mode 100644 include/hw/misc/imx6_src.h

Re: [Qemu-devel] [PATCH v1 16/17] loader: Add data swap option to load-elf

2016-02-28 Thread Peter Crosthwaite

On Sun, Feb 28, 2016 at 7:28 AM, Peter Maydell  wrote:
> On 27 February 2016 at 23:14, Peter Crosthwaite
>  wrote:
>> On Tue, Jan 19, 2016 at 9:53 AM, Peter Maydell  
>> wrote:
>>> Can we have a doc comment so we have something that defines what
>>> values data_swab accepts? (it's not just a bool).
>>>
>>
>> This is difficult to capture without writing to whole documentation
>> for load_elf. So here goes:
>
> Thanks; a couple of typos below, but otherwise looks good.
>
>
>> /** load_elf:
>>  * @filename: Path of ELF file
>>  * @translate_fn: optional function to translate load addresses
>>  * @translate_opaque: opaque data passed to @translate_fn
>>  * @pentry: Populated with program entry point. Ignored if NULL.
>>  * @lowaddr: Populated with lowest loaded address. Ignored if NULL.
>>  * @highaddr: Populated with highest loaded address. Ignored if NULL.
>>  * @bigendian: Expected ELF endianness. 0 for LE otherwise BE
>>  * @elf_machine: Expected ELF machine type
>>  * @clear_lsb: Set to mask off LSB of addresses (Some arch's use this
>
> Can we just write out "architectures" here?
>
>>for non-address data)
>>  * @data_swab: Set to order of byte swapping for data. 0 for no swap, 1
>>  * for swapping bytes within halfwords, 2 for bytes within
>>  * words and 3 for within doublewords.
>>  *
>>  * Load an ELF file's contents to the emulated systems address space.
>
> "system's"
>
>>  * Clients may optionally specify a callback to perform address
>>  * translations. @pentry, @lowaddr and @highaddr are optional pointers
>>  * which will be populated with various load information. @bigendian and
>>  * @elf_machine give the expected endianness and machine for the ELF the
>>  * load will fail it the target ELF does not match. Some architectures
>
> "if"
>
>>  * have some arch-specific behaviours that come into effect when their
>
> "architecture"
>

All fixed. Thanks for the pre-review.

Regards,
Peter

>>  * particular values for @elf_machine are set.
>>  */
>
> thanks
> -- PMM

Re: [Qemu-devel] [PATCH v4 02/10] pc: init pcms->apic_id_limit once and use it throughout pc.c

2016-02-28 Thread Marcel Apfelbaum


On 02/26/2016 03:59 PM, Igor Mammedov wrote:

Signed-off-by: Igor Mammedov 
---
  hw/i386/pc.c | 45 +++--
  1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0aeefd2..151a64c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -700,18 +700,6 @@ static uint32_t x86_cpu_apic_id_from_index(unsigned int 
cpu_index)
  }
  }

-/* Calculates the limit to CPU APIC ID values
- *
- * This function returns the limit for the APIC ID value, so that all
- * CPU APIC IDs are < pc_apic_id_limit().
- *
- * This is used for FW_CFG_MAX_CPUS. See comments on bochs_bios_init().
- */
-static unsigned int pc_apic_id_limit(unsigned int max_cpus)
-{
-return x86_cpu_apic_id_from_index(max_cpus - 1) + 1;
-}
-
  static void pc_build_smbios(FWCfgState *fw_cfg)
  {
  uint8_t *smbios_tables, *smbios_anchor;
@@ -749,12 +737,11 @@ static void pc_build_smbios(FWCfgState *fw_cfg)
  }
  }

-static FWCfgState *bochs_bios_init(AddressSpace *as)
+static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms)
  {
  FWCfgState *fw_cfg;
  uint64_t *numa_fw_cfg;
  int i, j;
-unsigned int apic_id_limit = pc_apic_id_limit(max_cpus);

  fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 4, as);

@@ -772,7 +759,7 @@ static FWCfgState *bochs_bios_init(AddressSpace *as)
   * [1] The only kind of "CPU identifier" used between SeaBIOS and QEMU is
   * the APIC ID, not the "CPU index"
   */
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)apic_id_limit);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)pcms->apic_id_limit);
  fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
  fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES,
   acpi_tables, acpi_tables_len);
@@ -790,11 +777,11 @@ static FWCfgState *bochs_bios_init(AddressSpace *as)
   * of nodes, one word for each VCPU->node and one word for each node to
   * hold the amount of memory.
   */
-numa_fw_cfg = g_new0(uint64_t, 1 + apic_id_limit + nb_numa_nodes);
+numa_fw_cfg = g_new0(uint64_t, 1 + pcms->apic_id_limit + nb_numa_nodes);
  numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
  for (i = 0; i < max_cpus; i++) {
  unsigned int apic_id = x86_cpu_apic_id_from_index(i);
-assert(apic_id < apic_id_limit);
+assert(apic_id < pcms->apic_id_limit);
  for (j = 0; j < nb_numa_nodes; j++) {
  if (test_bit(i, numa_info[j].node_cpu)) {
  numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
@@ -803,10 +790,11 @@ static FWCfgState *bochs_bios_init(AddressSpace *as)
  }
  }
  for (i = 0; i < nb_numa_nodes; i++) {
-numa_fw_cfg[apic_id_limit + 1 + i] = 
cpu_to_le64(numa_info[i].node_mem);
+numa_fw_cfg[pcms->apic_id_limit + 1 + i] =
+cpu_to_le64(numa_info[i].node_mem);
  }
  fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
- (1 + apic_id_limit + nb_numa_nodes) *
+ (1 + pcms->apic_id_limit + nb_numa_nodes) *
   sizeof(*numa_fw_cfg));

  return fw_cfg;
@@ -1120,7 +1108,6 @@ void pc_cpus_init(PCMachineState *pcms)
  int i;
  X86CPU *cpu = NULL;
  MachineState *machine = MACHINE(pcms);
-unsigned long apic_id_limit;

  /* init CPUs */
  if (machine->cpu_model == NULL) {
@@ -1131,10 +1118,17 @@ void pc_cpus_init(PCMachineState *pcms)
  #endif
  }

-apic_id_limit = pc_apic_id_limit(max_cpus);
-if (apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
-error_report("max_cpus is too large. APIC ID of last CPU is %lu",
- apic_id_limit - 1);
+/* Calculates the limit to CPU APIC ID values
+ *
+ * Limit for the APIC ID value, so that all
+ * CPU APIC IDs are < pcms->apic_id_limit.
+ *
+ * This is used for FW_CFG_MAX_CPUS. See comments on bochs_bios_init().
+ */
+pcms->apic_id_limit = x86_cpu_apic_id_from_index(max_cpus - 1) + 1;
+if (pcms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
+error_report("max_cpus is too large. APIC ID of last CPU is %u",
+ pcms->apic_id_limit - 1);
  exit(1);
  }

@@ -1187,7 +1181,6 @@ void pc_guest_info_init(PCMachineState *pcms)
  {
  int i, j;

-pcms->apic_id_limit = pc_apic_id_limit(max_cpus);
  pcms->apic_xrupt_override = kvm_allows_irq0_override();
  pcms->numa_nodes = nb_numa_nodes;
  pcms->node_mem = g_malloc0(pcms->numa_nodes *
@@ -1372,7 +1365,7 @@ void pc_memory_init(PCMachineState *pcms,
  option_rom_mr,
  1);

-fw_cfg = bochs_bios_init(&address_space_memory);
+fw_cfg = bochs_bios_init(&address_space_memory, pcms);

  rom_set_fw(fw_cfg);





Looks OK to me.

Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel

Re: [Qemu-devel] [PATCH v4 04/10] pc: acpi: cleanup qdev_get_machine() calls

2016-02-28 Thread Marcel Apfelbaum


On 02/26/2016 03:59 PM, Igor Mammedov wrote:

cache qdev_get_machine() result in acpi_setup/acpi_build_update
time and pass it as an argument to child functions that need it.

Signed-off-by: Igor Mammedov 
---
  hw/i386/acpi-build.c | 31 +++
  1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 52c9470..c750435 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -362,9 +362,9 @@ build_fadt(GArray *table_data, GArray *linker, AcpiPmInfo 
*pm,
  }

  static void
-build_madt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu)
+build_madt(GArray *table_data, GArray *linker, PCMachineState *pcms,
+   AcpiCpuInfo *cpu)
  {
-PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
  int madt_start = table_data->len;

  AcpiMultipleApicTable *madt;
@@ -1938,13 +1938,12 @@ static Aml *build_q35_osc_method(void)
  static void
  build_dsdt(GArray *table_data, GArray *linker,
 AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc,
-   PcPciInfo *pci)
+   PcPciInfo *pci, MachineState *machine)
  {
  CrsRangeEntry *entry;
  Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs;
  GPtrArray *mem_ranges = g_ptr_array_new_with_free_func(crs_range_free);
  GPtrArray *io_ranges = g_ptr_array_new_with_free_func(crs_range_free);
-MachineState *machine = MACHINE(qdev_get_machine());
  PCMachineState *pcms = PC_MACHINE(machine);
  uint32_t nr_mem = machine->ram_slots;
  int root_bus_limit = 0xFF;
@@ -2367,7 +2366,7 @@ acpi_build_srat_memory(AcpiSratMemoryAffinity *numamem, 
uint64_t base,
  }

  static void
-build_srat(GArray *table_data, GArray *linker)
+build_srat(GArray *table_data, GArray *linker, MachineState *machine)
  {
  AcpiSystemResourceAffinityTable *srat;
  AcpiSratProcessorAffinity *core;
@@ -2377,7 +2376,7 @@ build_srat(GArray *table_data, GArray *linker)
  uint64_t curnode;
  int srat_start, numa_start, slots;
  uint64_t mem_len, mem_base, next_base;
-PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
+PCMachineState *pcms = PC_MACHINE(machine);
  ram_addr_t hotplugabble_address_space_size =
  object_property_get_int(OBJECT(pcms), PC_MACHINE_MEMHP_REGION_SIZE,
  NULL);
@@ -2581,17 +2580,17 @@ static bool acpi_has_iommu(void)
  return intel_iommu && !ambiguous;
  }

-static bool acpi_has_nvdimm(void)
+static bool acpi_has_nvdimm(MachineState *machine)
  {
-PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
+PCMachineState *pcms = PC_MACHINE(machine);

  return pcms->nvdimm;
  }

  static
-void acpi_build(AcpiBuildTables *tables)
+void acpi_build(AcpiBuildTables *tables, MachineState *machine)
  {
-PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
+PCMachineState *pcms = PC_MACHINE(machine);
  PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
  GArray *table_offsets;
  unsigned facs, dsdt, rsdt, fadt;
@@ -2629,7 +2628,7 @@ void acpi_build(AcpiBuildTables *tables)

  /* DSDT is pointed to by FADT */
  dsdt = tables_blob->len;
-build_dsdt(tables_blob, tables->linker, &cpu, &pm, &misc, &pci);
+build_dsdt(tables_blob, tables->linker, &cpu, &pm, &misc, &pci, machine);

  /* Count the size of the DSDT and SSDT, we will need it for legacy
   * sizing of ACPI tables.
@@ -2644,7 +2643,7 @@ void acpi_build(AcpiBuildTables *tables)
  aml_len += tables_blob->len - fadt;

  acpi_add_table(table_offsets, tables_blob);
-build_madt(tables_blob, tables->linker, &cpu);
+build_madt(tables_blob, tables->linker, pcms, &cpu);

  if (misc.has_hpet) {
  acpi_add_table(table_offsets, tables_blob);
@@ -2661,7 +2660,7 @@ void acpi_build(AcpiBuildTables *tables)
  }
  if (pcms->numa_nodes) {
  acpi_add_table(table_offsets, tables_blob);
-build_srat(tables_blob, tables->linker);
+build_srat(tables_blob, tables->linker, machine);
  }
  if (acpi_get_mcfg(&mcfg)) {
  acpi_add_table(table_offsets, tables_blob);
@@ -2672,7 +2671,7 @@ void acpi_build(AcpiBuildTables *tables)
  build_dmar_q35(tables_blob, tables->linker);
  }

-if (acpi_has_nvdimm()) {
+if (acpi_has_nvdimm(machine)) {
  nvdimm_build_acpi(table_offsets, tables_blob, tables->linker);
  }

@@ -2766,7 +2765,7 @@ static void acpi_build_update(void *build_opaque)

  acpi_build_tables_init(&tables);

-acpi_build(&tables);
+acpi_build(&tables, MACHINE(qdev_get_machine()));

  acpi_ram_update(build_state->table_mr, tables.table_data);

@@ -2831,7 +2830,7 @@ void acpi_setup(void)
  acpi_set_pci_info();

  acpi_build_tables_init(&tables);
-acpi_build(&tables);
+acpi_build(&tables, MACHINE(pcms));

  /* Now expose it all to Guest */
  build_state->table_mr = acpi_add_rom_blob(build_state, tables.table_data,




Reviewe

Re: [Qemu-devel] Can't compile QEMU 2.5.0 on Arch Linux ARM

2016-02-28 Thread XJDHDR

>> {standard input}:9097: Error: bad instruction `lock'
>> {standard input}:9097: Error: bad instruction `addl $0,0(%rsp)'
>> {standard input}:9412: Error: bad instruction `lock'
>> {standard input}:9412: Error: bad instruction `addl $0,0(%rsp)'
>> /home/share/qemu-devel/QEMU/src/qemu-2.5.0/rules.mak:57: recipe for target
>> 'hw/display/qxl.o' failed
>> make: *** [hw/display/qxl.o] Error 1
>> ==> ERROR: A failure occurred in build().
>> Aborting...

>This should really not happen because it implies that we
>somehow ended up with x86 inline assembly.

>Some things to test:
>(1) if you configure QEMU with --disable-spice, does it build
>OK, or does it just fail later on on some other source file?‎

Thank you ‎for the assistance Peter

I have managed to try your suggestion and made some progress. After I added 
"--disable-spice", the code compiled without any error. This implies that Spice 
was causing my problem.

While this may indicate a problem with the spice portion of QEMU's code, I am 
more convinced that I didn't compile Spice properly and that this is what is 
causing me problems. I might test this or I might simply not bother with Spice.

Kind regards
Xavier de Rauville 

Sent from my BlackBerry Passport.

[Qemu-devel] [Bug 1488363] Re: qemu 2.4.0 hangs using vfio for pci passthrough of graphics card

2016-02-28 Thread Peter Maloney

With seabios 1.9.1-1 and qemu 2.5.0-1 manjaro packages (which as far as
I know have no patches), it seems to work now.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1488363

Title:
  qemu 2.4.0 hangs using vfio for pci passthrough of graphics card

Status in QEMU:
  New

Bug description:
  2.3.0 (manjaro distro package) works fine. 2.4.0 (manjaro or the arch
  vanilla one) hangs on the SeaBIOS screen when saying "Press F12 for
  boot menu". All tested with the same hardware, OS, command and
  configuration. It also starts without the GPU passed through, even
  with the USB passed through. I am using the latest SeaBIOS 1.8.2.

  The release notes say:
   VFIO
  Support for resetting AMD Bonaire and Hawaii GPUs
  Platform device passthrough support for Calxeda xgmac devices 
  
  So maybe something there broke it.
  
  I am using the arch qemu 2.4.0 PKGBUILD (modified to have make -j8 and 
removed iscsi, gluster, ceph, etc.), which uses vanilla sources and no patches. 
https://projects.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/qemu

  I am not using a frontend. I am using a script I wrote that generates
  the command below.

  Guest OS here would be 64 bit windows 7, but it didn't start so that's
  not relevant. Also a Manjaro Linux VM won't start.

  CPU is AMD FX-8150; board is Gigabyte GA-990FXA-UD5 (990FX chipset).

  full command line (without the \ after each line) is:

  qemu-system-x86_64
  -enable-kvm
  -M q35
  -m 3584
  -cpu host
  -boot c
  -smp 7,sockets=1,cores=7,threads=1
  -vga none
  -device ioh3420,bus=pcie.0,addr=1c.0,port=1,chassis=1,id=root.1
  -device 
vfio-pci,host=04:00.0,bus=root.1,multifunction=on,x-vga=on,addr=0.0,romfile=Sapphire.R7260X.1024.131106.rom
  -device vfio-pci,host=00:14.2,bus=pcie.0
  -device vfio-pci,host=00:16.0,bus=root.1
  -device vfio-pci,host=00:16.2,bus=root.1
  -usb
  -device ahci,bus=pcie.0,id=ahci
  -drive 
file=/dev/data/vm1,id=disk1,format=raw,if=virtio,index=0,media=disk,discard=on
  -drive media=cdrom,id=cdrom,index=5,media=cdrom
  -netdev type=tap,id=net0,ifname=tap-vm1
  -device virtio-net-pci,netdev=net0,mac=00:01:02:03:04:05
  -monitor stdio
  -boot menu=on

  
  $ lspci -nn | grep -E "04:00.0|00:14.2|00:16.0|00:16.2"
  00:14.2 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 
Azalia (Intel HDA) [1002:4383] (rev 40)
  00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] 
SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
  00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] 
SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
  04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
[AMD/ATI] Bonaire XTX [Radeon R7 260X] [1002:6658]

  
  Also I have this one that also hangs:
  05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
[AMD/ATI] Juniper XT [Radeon HD 6770] [1002:68ba]

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1488363/+subscriptions

[Qemu-devel] [Bug 893208] Re: qemu on ARM hosts can't boot i386 image

2016-02-28 Thread Peter Maydell

The "holdup" is simply that nobody who is interested in this issue has
written a patch like that Paolo proposed in comment #13. (Mostly people
either want to run ARM or other guest images in emulation on x86, or
they're running ARM images with hardware virtualization on ARM hardware.
Trying to run x86 images in emulation on ARM hosts is much less common.)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/893208

Title:
  qemu on ARM hosts can't boot i386 image

Status in QEMU:
  Confirmed
Status in Linaro QEMU:
  New

Bug description:
  If you apply some workarounds for bug 870990, bug 883133 and bug
  883136 QEMU still cannot boot the i386
  debian_squeeze_i386_standard.qcow2 image from
  http://people.debian.org/~aurel32/qemu/i386/ -- grub starts to boot
  but something causes the system to reset just before display of the
  blue-background grub menu, so we go round in a loop forever. This
  image boots OK on i386 hosted qemu so this indicates some kind of ARM-
  host specific bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/893208/+subscriptions

Re: [Qemu-devel] [PATCH v1 16/17] loader: Add data swap option to load-elf

2016-02-28 Thread Peter Maydell

On 27 February 2016 at 23:14, Peter Crosthwaite
 wrote:
> On Tue, Jan 19, 2016 at 9:53 AM, Peter Maydell  
> wrote:
>> Can we have a doc comment so we have something that defines what
>> values data_swab accepts? (it's not just a bool).
>>
>
> This is difficult to capture without writing to whole documentation
> for load_elf. So here goes:

Thanks; a couple of typos below, but otherwise looks good.


> /** load_elf:
>  * @filename: Path of ELF file
>  * @translate_fn: optional function to translate load addresses
>  * @translate_opaque: opaque data passed to @translate_fn
>  * @pentry: Populated with program entry point. Ignored if NULL.
>  * @lowaddr: Populated with lowest loaded address. Ignored if NULL.
>  * @highaddr: Populated with highest loaded address. Ignored if NULL.
>  * @bigendian: Expected ELF endianness. 0 for LE otherwise BE
>  * @elf_machine: Expected ELF machine type
>  * @clear_lsb: Set to mask off LSB of addresses (Some arch's use this

Can we just write out "architectures" here?

>for non-address data)
>  * @data_swab: Set to order of byte swapping for data. 0 for no swap, 1
>  * for swapping bytes within halfwords, 2 for bytes within
>  * words and 3 for within doublewords.
>  *
>  * Load an ELF file's contents to the emulated systems address space.

"system's"

>  * Clients may optionally specify a callback to perform address
>  * translations. @pentry, @lowaddr and @highaddr are optional pointers
>  * which will be populated with various load information. @bigendian and
>  * @elf_machine give the expected endianness and machine for the ELF the
>  * load will fail it the target ELF does not match. Some architectures

"if"

>  * have some arch-specific behaviours that come into effect when their

"architecture"

>  * particular values for @elf_machine are set.
>  */

thanks
-- PMM

Re: [Qemu-devel] [PATCH] kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35

2016-02-28 Thread Jan Kiszka

On 2016-02-28 12:52, Rita Sinha wrote:
> The split IRQ chip mode via KVM_CAP_SPLIT_IRQCHIP was introduced with commit
> 15eafc2e602ff8c37c6e132eb8c63fec8fc17175 but was broken for q35. This patch

You may abbreviate hashes in logs. I usually use 10 digits, some even
only 8. There is a risk of collisions, but the context will resolve it
again for the reader.

> makes kernel_irqchip=split functional for q35.

Signed-off is missing.

> ---
>  hw/i386/pc_q35.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 46522c9..ff6a5af 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -39,6 +39,7 @@
>  #include "hw/kvm/clock.h"
>  #include "hw/pci-host/q35.h"
>  #include "exec/address-spaces.h"
> +#include "hw/i386/pc.h"
>  #include "hw/i386/ich9.h"
>  #include "hw/smbios/smbios.h"
>  #include "hw/ide/pci.h"
> @@ -145,7 +146,7 @@ static void pc_q35_init(MachineState *machine)
>  
>  /* irq lines */
>  gsi_state = g_malloc0(sizeof(*gsi_state));
> -if (kvm_irqchip_in_kernel()) {
> +if (kvm_ioapic_in_kernel()) {
>  kvm_pc_setup_irq_routing(pcmc->pci_enabled);
>  gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
>   GSI_NUM_PINS);
> @@ -192,7 +193,7 @@ static void pc_q35_init(MachineState *machine)
>  /*end early*/
>  isa_bus_irqs(isa_bus, gsi);
>  
> -if (kvm_irqchip_in_kernel()) {
> +if (kvm_pit_in_kernel()) {

kvm_pic_in_kernel

>  i8259 = kvm_i8259_init(isa_bus);
>  } else if (xen_enabled()) {
>  i8259 = xen_interrupt_controller_init();
> 

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 6/9] i.MX: Add i.MX6 System Reset Controller device.

2016-02-28 Thread Jean-Christophe DUBOIS


Le 27/02/2016 18:43, Peter Maydell a écrit :

On 27 February 2016 at 16:57, Jean-Christophe DUBOIS
 wrote:

Hi Peter and Peter,

I need to test that the changes I did for PSCI (factor out on/off code) do
not introduce any regression.

On which QEMU target should I test my changes to PSCI to check I didn't mess
up anything?

The 'virt' board uses PSCI.


Is there some quick howto somewhere on the 'virt' plateform (which linux 
defconfig to use, which DT tree to use, which command line t use, ...) ?


Thanks

JC



thanks
-- PMM

[Qemu-devel] [PATCH] kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35

2016-02-28 Thread Rita Sinha

The split IRQ chip mode via KVM_CAP_SPLIT_IRQCHIP was introduced with commit
15eafc2e602ff8c37c6e132eb8c63fec8fc17175 but was broken for q35. This patch
makes kernel_irqchip=split functional for q35.
---
 hw/i386/pc_q35.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 46522c9..ff6a5af 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -39,6 +39,7 @@
 #include "hw/kvm/clock.h"
 #include "hw/pci-host/q35.h"
 #include "exec/address-spaces.h"
+#include "hw/i386/pc.h"
 #include "hw/i386/ich9.h"
 #include "hw/smbios/smbios.h"
 #include "hw/ide/pci.h"
@@ -145,7 +146,7 @@ static void pc_q35_init(MachineState *machine)
 
 /* irq lines */
 gsi_state = g_malloc0(sizeof(*gsi_state));
-if (kvm_irqchip_in_kernel()) {
+if (kvm_ioapic_in_kernel()) {
 kvm_pc_setup_irq_routing(pcmc->pci_enabled);
 gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
  GSI_NUM_PINS);
@@ -192,7 +193,7 @@ static void pc_q35_init(MachineState *machine)
 /*end early*/
 isa_bus_irqs(isa_bus, gsi);
 
-if (kvm_irqchip_in_kernel()) {
+if (kvm_pit_in_kernel()) {
 i8259 = kvm_i8259_init(isa_bus);
 } else if (xen_enabled()) {
 i8259 = xen_interrupt_controller_init();
-- 
2.7.1

[Qemu-devel] [PATCH v4 0/2] doc/memory: cleanup

2016-02-28 Thread Cao jin

see each commit message

Cao jin (2):
  doc/memory: fix inconsistency between code and doc
  doc/memory: remove the stray extra '-'

 docs/memory.txt | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

-- 
2.1.0

[Qemu-devel] [PATCH v4 2/2] doc/memory: remove the stray extra '-'

2016-02-28 Thread Cao jin

Signed-off-by: Cao jin 
Reviewed-by: Peter Maydell 
---
 docs/memory.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/memory.txt b/docs/memory.txt
index 746d794..cd6657f 100644
--- a/docs/memory.txt
+++ b/docs/memory.txt
@@ -185,8 +185,8 @@ an MMIO region mapped at 0x0, size 0x6000, priority 2. B 
currently has two
 of its own subregions: D of size 0x1000 at offset 0 and E of size 0x1000 at
 offset 0x2000. As a diagram:
 
-0  1000   2000   3000   4000   5000   6000   70008000
-|--|--|--|--|--|--|--|---|
+0  1000   2000   3000   4000   5000   6000   7000   8000
+|--|--|--|--|--|--|--|--|
   A:[   ]
   C:[]
   B:  [  ]
-- 
2.1.0

[Qemu-devel] [PATCH v4 1/2] doc/memory: fix inconsistency between code and doc

2016-02-28 Thread Cao jin

change ".valid.aligned" to ".valid.unaligned", and also modify its description,
make the text parallel to the existing .impl.unaligned doc.

Signed-off-by: Cao jin 
---
 docs/memory.txt | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/memory.txt b/docs/memory.txt
index 8745f76..746d794 100644
--- a/docs/memory.txt
+++ b/docs/memory.txt
@@ -297,8 +297,9 @@ various constraints can be supplied to control how these 
callbacks are called:
  - .valid.min_access_size, .valid.max_access_size define the access sizes
(in bytes) which the device accepts; accesses outside this range will
have device and bus specific behaviour (ignored, or machine check)
- - .valid.aligned specifies that the device only accepts naturally aligned
-   accesses.  Unaligned accesses invoke device and bus specific behaviour.
+ - .valid.unaligned specifies that the *device being modelled* supports
+   unaligned accesses; if false, unaligned accesses will invoke the appropriate
+   bus or CPU specific behaviour.
  - .impl.min_access_size, .impl.max_access_size define the access sizes
(in bytes) supported by the *implementation*; other access sizes will be
emulated using the ones available.  For example a 4-byte write will be
-- 
2.1.0

Re: [Qemu-devel] [PATCH v3 1/2] doc/memory: fix typo

2016-02-28 Thread Cao jin




On 02/26/2016 07:36 PM, Peter Maydell wrote:

On 26 February 2016 at 11:29, Cao jin  wrote:




Needs to be a semicolon before the "if".

Either "something something. If foo" or "something something; if foo"
are correct as punctuation. Since we're aiming to follow the same
construction as the existing text for .impl.unaligned, we want the
semicolon and lowercase.



interesting punctuation knowledge to me(as a non-native speaker). Thanks 
for the comment:)


--
Yours Sincerely,

Cao jin

[Qemu-devel] ODP: Quick folder sharing to arm quest from host

2016-02-28 Thread Krzeminski, Marcin (Nokia - PL/Wroclaw)

So it seem that again it was my lack of knowledge from virtualization area.
It was thta simple as adding virtio-mmio device to machine model.
Now I can easy share files between host and arm quest.

Thanks,
Marcin

Od: Krzeminski, Marcin (Nokia - PL/Wroclaw)
Wysłano: 25 lutego 2016 13:18
Do: qemu-devel@nongnu.org
DW: Peter Maydell
Temat: Quick folder sharing to arm quest from host

Hello,

I need to upload as fast as possible an app to arm linux quest from x86 host.
If I understand correctly 9p and  can not be used since it needs working kvm.
There is build in auto configuration for samba that can be used but it 
generates problem with privileges,
but maybe there is other way to share folder from host to quest?


Best regards,
Marcin Krzemiński

94 matches

Mail list logo