Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
Am 29.02.2016 um 08:03 hat Pavel Dovgalyuk geschrieben: > > From: Kevin Wolf [mailto:kw...@redhat.com] > > Am 25.02.2016 um 10:06 hat Pavel Dovgalyuk geschrieben: > > > There is one problem with flush event - callbacks for flush are called for > > > all layers and I couldn't synchronize them correctly yet. > > > I'll probably have to add new callback to block driver, which handles > > > flush request for the whole stack of the drivers. > > > > Flushes should be treated more or less the same a writes, I think. > > But bdrv_co_flush has different structure and does not allow synchronization > of the top layer. Here is the patch for fixing this: > > diff --git a/block/io.c b/block/io.c > index a69bfc4..9e05dfe 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -2369,6 +2369,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs) > } > > tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH); > +/* Write back all layers by calling one driver function */ > +if (bs->drv->bdrv_co_flush) { > +ret = bs->drv->bdrv_co_flush(bs); > +goto out; > +} > + > /* Write back cached data to the OS even with cache=unsafe */ > BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS); > if (bs->drv->bdrv_co_flush_to_os) { > diff --git a/include/block/block_int.h b/include/block/block_int.h > index 9ef823a..9cc2c58 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -176,6 +176,13 @@ struct BlockDriver { > int (*bdrv_inactivate)(BlockDriverState *bs); > > /* > + * Flushes all data for all layers by calling bdrv_co_flush for > underlying > + * layers, if needed. This function is needed for deterministic > + * synchronization of the flush finishing callback. > + */ > +int coroutine_fn (*bdrv_co_flush)(BlockDriverState *bs); > + > +/* > * Flushes all data that was already written to the OS all the way down > to > * the disk (for example raw-posix calls fsync()). > */ > > > Then I'll have just to implement bdrv_co_flush for blkreplay layer. > This callback will manually invoke bdrv_co_flush for underlying layer > allowing synchronization of finishing callback. The real problem is that the lower layer (bs->file) is automatically flushed, introducing non-determinism outside of the protecting blkreplay driver, correct? Hm... My first thought was maybe introduce a bool BlockDriver.no_flush_parent which avoids the automatic flush for the parent node. Then you could use .bdrv_co_flush_to_os() like all other drivers do. But you're right that your callback wouldn't just flush to the OS, but you would have to implement the flush on the lower layer and the handling of BDRV_O_NO_FLUSH in the blkreplay driver, so introducing a new callback for a "whole" flush might indeed be cleaner. I don't particularly like it, but I don't see a better option, so it's okay with me. Kevin
Re: [Qemu-devel] [PATCH v3] net: netmap: probe netmap interface for virtio-net header
On 02/24/2016 06:30 PM, Vincenzo Maffione wrote: > Previous implementation of has_ufo, has_vnet_hdr, has_vnet_hdr_len, etc. > did not really probe for virtio-net header support for the netmap > interface attached to the backend. These callbacks were correct for > VALE ports, but incorrect for hardware NICs, pipes, monitors, etc. > > This patch fixes the implementation to work properly with all kinds > of netmap ports. > > Signed-off-by: Vincenzo Maffione > --- > net/netmap.c | 59 ++- > 1 file changed, 38 insertions(+), 21 deletions(-) > Applied to -net. Thanks
Re: [Qemu-devel] [PATCH v2 2/2] filter-buffer: Add status_changed callback processing
On 2016/2/29 15:27, Jason Wang wrote: On 02/29/2016 09:46 AM, zhanghailiang wrote: While the status of filter-buffer changing from 'on' to 'off', it need to release all the buffered packets, and delete the related timer, while switch from 'off' to 'on', it need to resume the release packets timer. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v2: - New patch --- net/filter-buffer.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/net/filter-buffer.c b/net/filter-buffer.c index 12ad2e3..ed3f19e 100644 --- a/net/filter-buffer.c +++ b/net/filter-buffer.c @@ -124,6 +124,24 @@ static void filter_buffer_setup(NetFilterState *nf, Error **errp) } } +static void filter_buffer_status_changed(NetFilterState *nf, Error **errp) +{ +FilterBufferState *s = FILTER_BUFFER(nf); + +if (!strcmp(nf->status, "off")) { +if (s->interval) { +timer_del(&s->release_timer); +} +filter_buffer_flush(nf); +} else { +if (s->interval) { +timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL, +filter_buffer_release_timer, nf); +timer_mod(&s->release_timer, +qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); +} The code looks duplicated with filter_buffer_setup(). Yea, extract them into a new helper ? filter_buffer_setup_timer() ? +} +} static void filter_buffer_class_init(ObjectClass *oc, void *data) { NetFilterClass *nfc = NETFILTER_CLASS(oc); @@ -131,6 +149,7 @@ static void filter_buffer_class_init(ObjectClass *oc, void *data) nfc->setup = filter_buffer_setup; nfc->cleanup = filter_buffer_cleanup; nfc->receive_iov = filter_buffer_receive_iov; +nfc->status_changed = filter_buffer_status_changed; } static void filter_buffer_get_interval(Object *obj, Visitor *v, .
Re: [Qemu-devel] [PATCH v2 1/2] filter: Add 'status' property for filter object
On 2016/2/29 15:26, Jason Wang wrote: On 02/29/2016 09:46 AM, zhanghailiang wrote: With this property, users can control if this filter is 'on' or 'off'. The default behavior for filter is 'on'. For some types of filters, they may need to react to status changing, So here, we introduced status changing callback/notifier for filter class. We will skip the disabled ('off') filter when delivering packets in net layer. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v2: - Split the processing of buffer-filter into a new patch (Jason) - Use 'status' instead of 'enabled' to store the filter state (Jason) - Rename FilterDisable() callback to FilterStatusChanged(Jason) --- Thanks, looks good, just few nits. include/net/filter.h | 4 net/filter.c | 42 ++ qemu-options.hx | 4 +++- 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/include/net/filter.h b/include/net/filter.h index 5639976..ebef0dc 100644 --- a/include/net/filter.h +++ b/include/net/filter.h @@ -36,12 +36,15 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc, int iovcnt, NetPacketSent *sent_cb); +typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp); + typedef struct NetFilterClass { ObjectClass parent_class; /* optional */ FilterSetup *setup; FilterCleanup *cleanup; +FilterStatusChanged *status_changed; /* mandatory */ FilterReceiveIOV *receive_iov; } NetFilterClass; @@ -55,6 +58,7 @@ struct NetFilterState { char *netdev_id; NetClientState *netdev; NetFilterDirection direction; +char *status; Let's use bool instead. Er, then status=true means 'on' ? false means 'off' ? That looks odd. What about using 'bool status_on' ? .
Re: [Qemu-devel] [PATCHv2 5/7] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
On 02/29/2016 06:06 PM, David Gibson wrote: Now that the regular spapr-pci-host-bridge can handle EEH, there are only two things that spapr-pci-vfio-host-bridge does differently: 1. automatically sizes its DMA window to match the host IOMMU 2. checks if the attached VFIO container is backed by the VFIO_SPAPR_TCE_IOMMU type on the host (1) is not particularly useful, since the default window used by the regular host bridge will work with the host IOMMU configuration on all current systems anyway. Plus, automatically changing guest visible configuration (such as the DMA window) based on host settings is generally a bad idea. It's not definitively broken, since spapr-pci-vfio-host-bridge is only supposed to support VFIO devices which can't be migrated anyway, but still. (2) is not really useful, because if a guest tries to configure EEH on a different host IOMMU, the first call will fail and that will be that. It's possible there are scripts or tools out there which expect spapr-pci-vfio-host-bridge, so we don't remove it entirely. This patch reduces it to just a stub for backwards compatibility. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy -- Alexey
Re: [Qemu-devel] [PATCHv2 3/7] spapr_pci: Eliminate class callbacks
On 02/29/2016 06:06 PM, David Gibson wrote: The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the special groupid field in sPAPRPHBVFIOState. So we can simplify, removing the class specific callbacks with direct calls based on a simple spapr_phb_eeh_enabled() helper. For now we implement that in terms of a boolean in the class, but we'll continue to clean that up later. On its own this is a rather strange way of doing things, but it's a useful intermediate step to further cleanups. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy -- Alexey
Re: [Qemu-devel] [PATCHv2 4/7] spapr_pci: Allow EEH on spapr-pci-host-bridge
On 02/29/2016 06:06 PM, David Gibson wrote: Now that the EEH code is independent of the special spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI host bridges instead. We do this by changing spapr_phb_eeh_available() to be based on the vfio_eeh_as_ok() call instead of the host bridge class. Because the value of vfio_eeh_as_ok() can change with devices being hotplugged or unplugged, this can potentially lead to some strange edge cases where the guest starts using EEH, then it starts failing because of a change in status. However, it's not really any worse than the current situation. Cases that would have worked previously will still work (i.e. VFIO devices from at most one VFIO IOMMU group per vPHB), it's just that it's no longer necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy -- Alexey
Re: [Qemu-devel] [PATCHv2 1/7] vfio: Start improving VFIO/EEH interface
On 02/29/2016 06:06 PM, David Gibson wrote: At present the code handling IBM's Enhanced Error Handling (EEH) interface on VFIO devices operates by bypassing the usual VFIO logic with vfio_container_ioctl(). That's a poorly designed interface with unclear semantics about exactly what can be operated on. In particular it operates on a single vfio container internally (hence the name), but takes an address space and group id, from which it deduces the container in a rather roundabout way. groupids are something that code outside vfio shouldn't even be aware of. This patch creates new interfaces for EEH operations. Internally we have vfio_eeh_container_op() which takes a VFIOContainer object directly. For external use we have vfio_eeh_as_ok() which determines if an AddressSpace is usable for EEH (at present this means it has a single container and at most a single group attached), and vfio_eeh_as_op() which will perform an operation on an AddressSpace in the unambiguous case, and otherwise returns an error. This interface still isn't great, but it's enough of an improvement to allow a number of cleanups in other places. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy -- Alexey
Re: [Qemu-devel] [PATCH v2 2/2] filter-buffer: Add status_changed callback processing
On 02/29/2016 09:46 AM, zhanghailiang wrote: > While the status of filter-buffer changing from 'on' to 'off', > it need to release all the buffered packets, and delete the related > timer, while switch from 'off' to 'on', it need to resume the release > packets timer. > > Signed-off-by: zhanghailiang > Cc: Jason Wang > Cc: Yang Hongyang > --- > v2: > - New patch > --- > net/filter-buffer.c | 19 +++ > 1 file changed, 19 insertions(+) > > diff --git a/net/filter-buffer.c b/net/filter-buffer.c > index 12ad2e3..ed3f19e 100644 > --- a/net/filter-buffer.c > +++ b/net/filter-buffer.c > @@ -124,6 +124,24 @@ static void filter_buffer_setup(NetFilterState *nf, > Error **errp) > } > } > > +static void filter_buffer_status_changed(NetFilterState *nf, Error **errp) > +{ > +FilterBufferState *s = FILTER_BUFFER(nf); > + > +if (!strcmp(nf->status, "off")) { > +if (s->interval) { > +timer_del(&s->release_timer); > +} > +filter_buffer_flush(nf); > +} else { > +if (s->interval) { > +timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL, > +filter_buffer_release_timer, nf); > +timer_mod(&s->release_timer, > +qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); > +} The code looks duplicated with filter_buffer_setup(). > +} > +} > static void filter_buffer_class_init(ObjectClass *oc, void *data) > { > NetFilterClass *nfc = NETFILTER_CLASS(oc); > @@ -131,6 +149,7 @@ static void filter_buffer_class_init(ObjectClass *oc, > void *data) > nfc->setup = filter_buffer_setup; > nfc->cleanup = filter_buffer_cleanup; > nfc->receive_iov = filter_buffer_receive_iov; > +nfc->status_changed = filter_buffer_status_changed; > } > > static void filter_buffer_get_interval(Object *obj, Visitor *v,
Re: [Qemu-devel] [PATCH v2 1/2] filter: Add 'status' property for filter object
On 02/29/2016 09:46 AM, zhanghailiang wrote: > With this property, users can control if this filter is 'on' > or 'off'. The default behavior for filter is 'on'. > > For some types of filters, they may need to react to status changing, > So here, we introduced status changing callback/notifier for filter class. > > We will skip the disabled ('off') filter when delivering packets in net layer. > > Signed-off-by: zhanghailiang > Cc: Jason Wang > Cc: Yang Hongyang > --- > v2: > - Split the processing of buffer-filter into a new patch (Jason) > - Use 'status' instead of 'enabled' to store the filter state (Jason) > - Rename FilterDisable() callback to FilterStatusChanged(Jason) > --- Thanks, looks good, just few nits. > include/net/filter.h | 4 > net/filter.c | 42 ++ > qemu-options.hx | 4 +++- > 3 files changed, 49 insertions(+), 1 deletion(-) > > diff --git a/include/net/filter.h b/include/net/filter.h > index 5639976..ebef0dc 100644 > --- a/include/net/filter.h > +++ b/include/net/filter.h > @@ -36,12 +36,15 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc, > int iovcnt, > NetPacketSent *sent_cb); > > +typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp); > + > typedef struct NetFilterClass { > ObjectClass parent_class; > > /* optional */ > FilterSetup *setup; > FilterCleanup *cleanup; > +FilterStatusChanged *status_changed; > /* mandatory */ > FilterReceiveIOV *receive_iov; > } NetFilterClass; > @@ -55,6 +58,7 @@ struct NetFilterState { > char *netdev_id; > NetClientState *netdev; > NetFilterDirection direction; > +char *status; Let's use bool instead.
[Qemu-devel] An ivshmem + chardev peculiarity
The ivshmem client/server protocol is one-way: only the server sends. The only way clients can communicate is by connect and close. Since the ivshmem device model uses a chardev for its connection to the server, both connect and close happen in chardev code. Example: monitor command chardev-add connects, the server starts sending. chardev-add succeeds. device_add fails somehow. Now you must chardev-remove to close the connection. >From the server's point of view, a peer connected and disconnected. It duly announces this to already connected peers. This is a bit weird, but it shouldn't cause problems. If you forget to chardev-remove, the "peer" stays connected. Can be declared PEBKAC. Connect on realize and close on realize failure and unrealize would be neater, I think. Chardevs don't let me do that, as far as I can tell. Ideas? There's one exception to the behavior I just described: on version mismatch, the device model calls qemu_chr_delete(). I consider that a bug, and I'm going to fix it.
Re: [Qemu-devel] [PATCH] net/filter-redirector:Add filter-redirector
On 02/24/2016 05:03 PM, Zhang Chen wrote: > > If queue=rx, filter-redirector will get the packet that guest send, > then redirect > to outdev(if none, do nothing). but queue=rx/tx/all not related to > indev. please > look the flow chart below. queue=xxx just work for one > way(filter->outdev). > > filter > + > | > | >redirector | > +-+ > | | | > | | | > | | | >indev ++ +> outdev > | | | > | | | > | | | > +-+ > | > | > v >filter > > | > > | > > v >filter filter .. guest > This looks a violation on the assumption of current filter behavior. Each filter should only talk to the 'next' or 'prev' filter on the chain (depends on the direction) or netdev when queue=rx or netdev's peer when queue=tx. And in fact there's subtle differences with your patch: When queue='all' since you force nf->netdev as sender, direction is NET_FILTER_DIRECTION_TX, the packet will be passed to 'next' filter on the chain. When queue='rx', direction is NET_FILTER_DIRECTION_RX, the packet will be pass to 'prev' filter on the chain. So as you can see, 'all' is ambiguous here. I think we should keep current behavior by redirecting traffic to netdev when queue='rx'. For queue='all', maybe we need redirect the traffic to both netdev and netdev's peer.
Re: [Qemu-devel] [RFC PATCH v0 3/6] spapr: Represent boot CPUs as spapr-cpu-core devices
On Mon, Feb 29, 2016 at 11:05:32AM +0530, Bharata B Rao wrote: > On Fri, Feb 26, 2016 at 04:18:57PM +0100, Igor Mammedov wrote: > > On Thu, 25 Feb 2016 21:52:39 +0530 > > Bharata B Rao wrote: [snip] > > > @@ -2209,6 +2251,7 @@ static void > > > spapr_machine_device_plug(HotplugHandler *hotplug_dev, > > >DeviceState *dev, Error **errp) > > > { > > > sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine()); > > > +sPAPRMachineState *ms = SPAPR_MACHINE(hotplug_dev); > > > > > > if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) { > > > int node; > > > @@ -2245,6 +2288,11 @@ static void > > > spapr_machine_device_plug(HotplugHandler *hotplug_dev, > > > } > > > > > > spapr_memory_plug(hotplug_dev, dev, node, errp); > > > +} else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { > > here probably should be CORE and not TYPE_CPU, > > then if board needs to set some state for child threads it > > could ennumerate child of core here and do the required job. > > Hmm, we have things to set for CPUs and cores as well from hotplug > handler. So the above code is for CPUs and I have spapr_core_plug() > which is called conditionally when device is of type TYPE_SPAPR_CPU_CORE. > > Given that ->plug() is called during realization of both individual CPU > threads as well as their parent core, I thought handling the setup for > them separately like the above is simpler, no ? So, this bothered me a bit as well. I think having the hotplug handler for the threads is misleading, since it suggests you can individually hotplug them, when in fact the intention is just that this is part of the hotplug code path for a whole core. My suggestion earlier was to have the core realize code itself perform this configuration on the sub-threads. However, Igor's suggestion of, essentially, an explicit loop over the threads in the core hotplug handler would be fine by me also. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
[Qemu-devel] [PATCHv2 3/7] spapr_pci: Eliminate class callbacks
The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the special groupid field in sPAPRPHBVFIOState. So we can simplify, removing the class specific callbacks with direct calls based on a simple spapr_phb_eeh_enabled() helper. For now we implement that in terms of a boolean in the class, but we'll continue to clean that up later. On its own this is a rather strange way of doing things, but it's a useful intermediate step to further cleanups. Signed-off-by: David Gibson --- hw/ppc/spapr_pci.c | 44 ++-- hw/ppc/spapr_pci_vfio.c | 18 +++--- include/hw/pci-host/spapr.h | 37 + 3 files changed, 62 insertions(+), 37 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index e8edad3..be05757 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -92,6 +92,13 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, return pci_find_device(phb->bus, bus_num, devfn); } +static bool spapr_phb_eeh_available(sPAPRPHBState *sphb) +{ +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); + +return spc->eeh_available; +} + static uint32_t rtas_pci_cfgaddr(uint32_t arg) { /* This handles the encoding of extended config space addresses */ @@ -440,7 +447,6 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint32_t addr, option; uint64_t buid; int ret; @@ -458,12 +464,11 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_set_option(sphb, addr, option); +ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option); rtas_st(rets, 0, ret); return; @@ -478,7 +483,6 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; PCIDevice *pdev; uint32_t addr, option; uint64_t buid; @@ -493,8 +497,7 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } @@ -534,7 +537,6 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint64_t buid; int state, ret; @@ -548,12 +550,11 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_get_state) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_get_state(sphb, &state); +ret = spapr_phb_vfio_eeh_get_state(sphb, &state); rtas_st(rets, 0, ret); if (ret != RTAS_OUT_SUCCESS) { return; @@ -578,7 +579,6 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint32_t option; uint64_t buid; int ret; @@ -594,12 +594,11 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_reset) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_reset(sphb, option); +ret = spapr_phb_vfio_eeh_reset(sphb, option); rtas_st(rets, 0, ret); return; @@ -614,7 +613,6 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint64_t buid; int ret; @@ -628,12 +626,11 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_configure) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_configure(sphb); +ret = spapr_phb_vfio_eeh_configure(sphb); rtas_st(rets, 0, ret); return; @@ -649,7 +646,6 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; int option; uint64_t buid; @@ -663,8 +659,7 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spa
[Qemu-devel] [PATCHv2 2/7] spapr_pci: Switch to vfio_eeh_as_op() interface
This switches all EEH on VFIO operations in spapr_pci_vfio.c from the broken vfio_container_ioctl() interface to the new vfio_as_eeh_op() interface. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci_vfio.c | 50 - 1 file changed, 16 insertions(+), 34 deletions(-) diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 2f3752e..b1e8e8e 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -73,15 +73,9 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) spapr_tce_get_iommu(tcet)); } -static void spapr_phb_vfio_eeh_reenable(sPAPRPHBVFIOState *svphb) +static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb) { -struct vfio_eeh_pe_op op = { -.argsz = sizeof(op), -.op= VFIO_EEH_PE_ENABLE -}; - -vfio_container_ioctl(&svphb->phb.iommu_as, - svphb->iommugroupid, VFIO_EEH_PE_OP, &op); +vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE); } static void spapr_phb_vfio_reset(DeviceState *qdev) @@ -92,19 +86,18 @@ static void spapr_phb_vfio_reset(DeviceState *qdev) * ensures that the contained PCI devices will work properly * after reboot. */ -spapr_phb_vfio_eeh_reenable(SPAPR_PCI_VFIO_HOST_BRIDGE(qdev)); +spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); } static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; +uint32_t op; int ret; switch (option) { case RTAS_EEH_DISABLE: -op.op = VFIO_EEH_PE_DISABLE; +op = VFIO_EEH_PE_DISABLE; break; case RTAS_EEH_ENABLE: { PCIHostState *phb; @@ -122,21 +115,20 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, return RTAS_OUT_PARAM_ERROR; } -op.op = VFIO_EEH_PE_ENABLE; +op = VFIO_EEH_PE_ENABLE; break; } case RTAS_EEH_THAW_IO: -op.op = VFIO_EEH_PE_UNFREEZE_IO; +op = VFIO_EEH_PE_UNFREEZE_IO; break; case RTAS_EEH_THAW_DMA: -op.op = VFIO_EEH_PE_UNFREEZE_DMA; +op = VFIO_EEH_PE_UNFREEZE_DMA; break; default: return RTAS_OUT_PARAM_ERROR; } -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, op); if (ret < 0) { return RTAS_OUT_HW_ERROR; } @@ -146,13 +138,9 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, static int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; int ret; -op.op = VFIO_EEH_PE_GET_STATE; -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE); if (ret < 0) { return RTAS_OUT_PARAM_ERROR; } @@ -206,28 +194,26 @@ static void spapr_phb_vfio_eeh_pre_reset(sPAPRPHBState *sphb) static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; +uint32_t op; int ret; switch (option) { case RTAS_SLOT_RESET_DEACTIVATE: -op.op = VFIO_EEH_PE_RESET_DEACTIVATE; +op = VFIO_EEH_PE_RESET_DEACTIVATE; break; case RTAS_SLOT_RESET_HOT: spapr_phb_vfio_eeh_pre_reset(sphb); -op.op = VFIO_EEH_PE_RESET_HOT; +op = VFIO_EEH_PE_RESET_HOT; break; case RTAS_SLOT_RESET_FUNDAMENTAL: spapr_phb_vfio_eeh_pre_reset(sphb); -op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL; +op = VFIO_EEH_PE_RESET_FUNDAMENTAL; break; default: return RTAS_OUT_PARAM_ERROR; } -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, op); if (ret < 0) { return RTAS_OUT_HW_ERROR; } @@ -237,13 +223,9 @@ static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) static int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; int ret; -op.op = VFIO_EEH_PE_CONFIGURE; -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE
[Qemu-devel] [PATCHv2 4/7] spapr_pci: Allow EEH on spapr-pci-host-bridge
Now that the EEH code is independent of the special spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI host bridges instead. We do this by changing spapr_phb_eeh_available() to be based on the vfio_eeh_as_ok() call instead of the host bridge class. Because the value of vfio_eeh_as_ok() can change with devices being hotplugged or unplugged, this can potentially lead to some strange edge cases where the guest starts using EEH, then it starts failing because of a change in status. However, it's not really any worse than the current situation. Cases that would have worked previously will still work (i.e. VFIO devices from at most one VFIO IOMMU group per vPHB), it's just that it's no longer necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified. Signed-off-by: David Gibson --- hw/ppc/spapr_pci.c | 10 ++ hw/ppc/spapr_pci_vfio.c | 6 +- include/hw/pci-host/spapr.h | 6 +- 3 files changed, 12 insertions(+), 10 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index be05757..4ac5273 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -42,6 +42,8 @@ #include "hw/ppc/spapr_drc.h" #include "sysemu/device_tree.h" +#include "hw/vfio/vfio.h" + /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */ #define RTAS_QUERY_FN 0 #define RTAS_CHANGE_FN 1 @@ -92,13 +94,6 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, return pci_find_device(phb->bus, bus_num, devfn); } -static bool spapr_phb_eeh_available(sPAPRPHBState *sphb) -{ -sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); - -return spc->eeh_available; -} - static uint32_t rtas_pci_cfgaddr(uint32_t arg) { /* This handles the encoding of extended config space addresses */ @@ -1563,7 +1558,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); dc->cannot_instantiate_with_device_add_yet = false; spc->finish_realize = spapr_phb_finish_realize; -spc->eeh_available = false; hp->plug = spapr_phb_hot_plug_child; hp->unplug = spapr_phb_hot_unplug_child; } diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 10fa88a..16a4a8f 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -73,6 +73,11 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) spapr_tce_get_iommu(tcet)); } +bool spapr_phb_eeh_available(sPAPRPHBState *sphb) +{ +return vfio_eeh_as_ok(&sphb->iommu_as); +} + static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb) { vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE); @@ -240,7 +245,6 @@ static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) dc->props = spapr_phb_vfio_properties; spc->finish_realize = spapr_phb_vfio_finish_realize; -spc->eeh_available = true; } static const TypeInfo spapr_phb_vfio_info = { diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 0b936c6..19a95e0 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -49,7 +49,6 @@ struct sPAPRPHBClass { PCIHostBridgeClass parent_class; void (*finish_realize)(sPAPRPHBState *sphb, Error **errp); -bool eeh_available; }; typedef struct spapr_pci_msi { @@ -136,6 +135,7 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, /* VFIO EEH hooks */ #ifdef CONFIG_LINUX +bool spapr_phb_eeh_available(sPAPRPHBState *sphb); int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option); int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state); @@ -143,6 +143,10 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option); int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb); void spapr_phb_vfio_reset(DeviceState *qdev); #else +static inline bool spapr_phb_eeh_available(sPAPRPHBState *sphb) +{ +return false; +} static inline int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option) { -- 2.5.0
[Qemu-devel] [PATCHv2 6/7] spapr_pci: Remove finish_realize hook
Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is only one implementation of the finish_realize hook in sPAPRPHBClass. So, we can fold that implementation into its (single) caller, and remove the hook. That's the last thing left in sPAPRPHBClass, so that can go away as well. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 25 + include/hw/pci-host/spapr.h | 12 2 files changed, 5 insertions(+), 32 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 4ac5273..3d1145e 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -1224,11 +1224,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) SysBusDevice *s = SYS_BUS_DEVICE(dev); sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s); PCIHostState *phb = PCI_HOST_BRIDGE(s); -sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(s); char *namebuf; int i; PCIBus *bus; uint64_t msi_window_size = 4096; +sPAPRTCETable *tcet; +uint32_t nb_table; if (sphb->index != (uint32_t)-1) { hwaddr windows_base; @@ -1380,33 +1381,20 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) } } -if (!info->finish_realize) { -error_setg(errp, "finish_realize not defined"); -return; -} - -info->finish_realize(sphb, errp); - -sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); -} - -static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) -{ -sPAPRTCETable *tcet; -uint32_t nb_table; - nb_table = sphb->dma_win_size >> SPAPR_TCE_PAGE_SHIFT; tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn, 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false); if (!tcet) { error_setg(errp, "Unable to create TCE table for %s", sphb->dtbusname); -return ; +return; } /* Register default 32bit DMA window */ memory_region_add_subregion(&sphb->iommu_root, sphb->dma_win_addr, spapr_tce_get_iommu(tcet)); + +sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); } static int spapr_phb_children_reset(Object *child, void *opaque) @@ -1547,7 +1535,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) { PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); -sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass); HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass); hc->root_bus_path = spapr_phb_root_bus_path; @@ -1557,7 +1544,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) dc->vmsd = &vmstate_spapr_pci; set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); dc->cannot_instantiate_with_device_add_yet = false; -spc->finish_realize = spapr_phb_finish_realize; hp->plug = spapr_phb_hot_plug_child; hp->unplug = spapr_phb_hot_unplug_child; } @@ -1567,7 +1553,6 @@ static const TypeInfo spapr_phb_info = { .parent= TYPE_PCI_HOST_BRIDGE, .instance_size = sizeof(sPAPRPHBState), .class_init= spapr_phb_class_init, -.class_size= sizeof(sPAPRPHBClass), .interfaces= (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, { } diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index a08235e..03ee006 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -32,20 +32,8 @@ #define SPAPR_PCI_HOST_BRIDGE(obj) \ OBJECT_CHECK(sPAPRPHBState, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE) -#define SPAPR_PCI_HOST_BRIDGE_CLASS(klass) \ - OBJECT_CLASS_CHECK(sPAPRPHBClass, (klass), TYPE_SPAPR_PCI_HOST_BRIDGE) -#define SPAPR_PCI_HOST_BRIDGE_GET_CLASS(obj) \ - OBJECT_GET_CLASS(sPAPRPHBClass, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE) - -typedef struct sPAPRPHBClass sPAPRPHBClass; typedef struct sPAPRPHBState sPAPRPHBState; -struct sPAPRPHBClass { -PCIHostBridgeClass parent_class; - -void (*finish_realize)(sPAPRPHBState *sphb, Error **errp); -}; - typedef struct spapr_pci_msi { uint32_t first_irq; uint32_t num; -- 2.5.0
[Qemu-devel] [PATCHv2 1/7] vfio: Start improving VFIO/EEH interface
At present the code handling IBM's Enhanced Error Handling (EEH) interface on VFIO devices operates by bypassing the usual VFIO logic with vfio_container_ioctl(). That's a poorly designed interface with unclear semantics about exactly what can be operated on. In particular it operates on a single vfio container internally (hence the name), but takes an address space and group id, from which it deduces the container in a rather roundabout way. groupids are something that code outside vfio shouldn't even be aware of. This patch creates new interfaces for EEH operations. Internally we have vfio_eeh_container_op() which takes a VFIOContainer object directly. For external use we have vfio_eeh_as_ok() which determines if an AddressSpace is usable for EEH (at present this means it has a single container and at most a single group attached), and vfio_eeh_as_op() which will perform an operation on an AddressSpace in the unambiguous case, and otherwise returns an error. This interface still isn't great, but it's enough of an improvement to allow a number of cleanups in other places. Signed-off-by: David Gibson --- hw/vfio/common.c | 84 ++ include/hw/vfio/vfio.h | 2 ++ 2 files changed, 86 insertions(+) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 607ec70..b61118e 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1003,3 +1003,87 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, return vfio_container_do_ioctl(as, groupid, req, param); } + +/* + * Interfaces for IBM EEH (Enhanced Error Handling) + */ +static bool vfio_eeh_container_ok(VFIOContainer *container) +{ +/* A broken kernel implementation means EEH operations won't work + * correctly if there are multiple groups in a container. So + * return true only if there is exactly one group attached to the + * container */ + +if (QLIST_EMPTY(&container->group_list)) { +return false; +} + +if (QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) { +return false; +} + +return true; +} + +static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) +{ +struct vfio_eeh_pe_op pe_op = { +.argsz = sizeof(pe_op), +.op = op, +}; +int ret; + +if (!vfio_eeh_container_ok(container)) { +error_report("vfio/eeh: EEH_PE_OP 0x%x called on container" + " with multiple groups", op); +return -EPERM; +} + +ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op); +if (ret < 0) { +error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op); +return -errno; +} + +return 0; +} + +static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) +{ +VFIOAddressSpace *space = vfio_get_address_space(as); +VFIOContainer *container = NULL; + +if (QLIST_EMPTY(&space->containers)) { +/* No containers to act on */ +goto out; +} + +container = QLIST_FIRST(&space->containers); + +if (QLIST_NEXT(container, next)) { +/* We don't yet have logic to synchronize EEH state across + * multiple containers */ +container = NULL; +goto out; +} + +out: +vfio_put_address_space(space); +return container; +} + +bool vfio_eeh_as_ok(AddressSpace *as) +{ +VFIOContainer *container = vfio_eeh_as_container(as); + +return (container != NULL) && vfio_eeh_container_ok(container); +} + +int vfio_eeh_as_op(AddressSpace *as, uint32_t op) +{ +VFIOContainer *container = vfio_eeh_as_container(as); + +/* Shouldn't be called unless vfio_eeh_as_ok() returned true */ +assert(container); +return vfio_eeh_container_op(container, op); +} diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h index 0b26cd8..fd3933b 100644 --- a/include/hw/vfio/vfio.h +++ b/include/hw/vfio/vfio.h @@ -5,5 +5,7 @@ extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid, int req, void *param); +bool vfio_eeh_as_ok(AddressSpace *as); +int vfio_eeh_as_op(AddressSpace *as, uint32_t op); #endif -- 2.5.0
[Qemu-devel] [PATCHv2 5/7] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
Now that the regular spapr-pci-host-bridge can handle EEH, there are only two things that spapr-pci-vfio-host-bridge does differently: 1. automatically sizes its DMA window to match the host IOMMU 2. checks if the attached VFIO container is backed by the VFIO_SPAPR_TCE_IOMMU type on the host (1) is not particularly useful, since the default window used by the regular host bridge will work with the host IOMMU configuration on all current systems anyway. Plus, automatically changing guest visible configuration (such as the DMA window) based on host settings is generally a bad idea. It's not definitively broken, since spapr-pci-vfio-host-bridge is only supposed to support VFIO devices which can't be migrated anyway, but still. (2) is not really useful, because if a guest tries to configure EEH on a different host IOMMU, the first call will fail and that will be that. It's possible there are scripts or tools out there which expect spapr-pci-vfio-host-bridge, so we don't remove it entirely. This patch reduces it to just a stub for backwards compatibility. Signed-off-by: David Gibson --- hw/ppc/spapr_pci_vfio.c | 61 + include/hw/pci-host/spapr.h | 11 2 files changed, 17 insertions(+), 55 deletions(-) diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 16a4a8f..9e15924 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -23,54 +23,29 @@ #include "hw/pci/msix.h" #include "linux/vfio.h" #include "hw/vfio/vfio.h" +#include "qemu/error-report.h" -static Property spapr_phb_vfio_properties[] = { -DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1), -DEFINE_PROP_END_OF_LIST(), -}; +#define TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE "spapr-pci-vfio-host-bridge" -static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) -{ -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_iommu_spapr_tce_info info = { .argsz = sizeof(info) }; -int ret; -sPAPRTCETable *tcet; -uint32_t liobn = svphb->phb.dma_liobn; +#define SPAPR_PCI_VFIO_HOST_BRIDGE(obj) \ +OBJECT_CHECK(sPAPRPHBVFIOState, (obj), TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE) -if (svphb->iommugroupid == -1) { -error_setg(errp, "Wrong IOMMU group ID %d", svphb->iommugroupid); -return; -} +typedef struct sPAPRPHBVFIOState sPAPRPHBVFIOState; -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_CHECK_EXTENSION, - (void *) VFIO_SPAPR_TCE_IOMMU); -if (ret != 1) { -error_setg_errno(errp, -ret, - "spapr-vfio: SPAPR extension is not supported"); -return; -} +struct sPAPRPHBVFIOState { +sPAPRPHBState phb; -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info); -if (ret) { -error_setg_errno(errp, -ret, - "spapr-vfio: get info from container failed"); -return; -} +int32_t iommugroupid; +}; -tcet = spapr_tce_new_table(DEVICE(sphb), liobn, info.dma32_window_start, - SPAPR_TCE_PAGE_SHIFT, - info.dma32_window_size >> SPAPR_TCE_PAGE_SHIFT, - true); -if (!tcet) { -error_setg(errp, "spapr-vfio: failed to create VFIO TCE table"); -return; -} +static Property spapr_phb_vfio_properties[] = { +DEFINE_PROP_INT32("iommu", sPAPRPHBVFIOState, iommugroupid, -1), +DEFINE_PROP_END_OF_LIST(), +}; -/* Register default 32bit DMA window */ -memory_region_add_subregion(&sphb->iommu_root, tcet->bus_offset, -spapr_tce_get_iommu(tcet)); +static void spapr_phb_vfio_instance_init(Object *obj) +{ +error_report("spapr-pci-vfio-host-bridge is deprecated"); } bool spapr_phb_eeh_available(sPAPRPHBState *sphb) @@ -241,18 +216,16 @@ int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb) static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); -sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass); dc->props = spapr_phb_vfio_properties; -spc->finish_realize = spapr_phb_vfio_finish_realize; } static const TypeInfo spapr_phb_vfio_info = { .name = TYPE_SPAPR_PCI_VFIO_HOST_BRIDGE, .parent= TYPE_SPAPR_PCI_HOST_BRIDGE, .instance_size = sizeof(sPAPRPHBVFIOState), +.instance_init = spapr_phb_vfio_instance_init, .class_init= spapr_phb_vfio_class_init, -.class_size= sizeof(sPAPRPHBClass), }; static void spapr_pci_vfio_register_types(void) diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 19a95e0..a08235e 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -28,14 +28,10 @@ #in
[Qemu-devel] [PATCHv2 7/7] vfio: Eliminate vfio_container_ioctl()
vfio_container_ioctl() was a bad interface that bypassed abstraction boundaries, had semantics that sat uneasily with its name, and was unsafe in many realistic circumstances. Now that spapr-pci-vfio-host-bridge has been folded into spapr-pci-host-bridge, there are no more users, so remove it. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/vfio/common.c | 45 - include/hw/vfio/vfio.h | 2 -- 2 files changed, 47 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index b61118e..55e87d3 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -959,51 +959,6 @@ void vfio_put_base_device(VFIODevice *vbasedev) close(vbasedev->fd); } -static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, - int req, void *param) -{ -VFIOGroup *group; -VFIOContainer *container; -int ret = -1; - -group = vfio_get_group(groupid, as); -if (!group) { -error_report("vfio: group %d not registered", groupid); -return ret; -} - -container = group->container; -if (group->container) { -ret = ioctl(container->fd, req, param); -if (ret < 0) { -error_report("vfio: failed to ioctl %d to container: ret=%d, %s", - _IOC_NR(req) - VFIO_BASE, ret, strerror(errno)); -} -} - -vfio_put_group(group); - -return ret; -} - -int vfio_container_ioctl(AddressSpace *as, int32_t groupid, - int req, void *param) -{ -/* We allow only certain ioctls to the container */ -switch (req) { -case VFIO_CHECK_EXTENSION: -case VFIO_IOMMU_SPAPR_TCE_GET_INFO: -case VFIO_EEH_PE_OP: -break; -default: -/* Return an error on unknown requests */ -error_report("vfio: unsupported ioctl %X", req); -return -1; -} - -return vfio_container_do_ioctl(as, groupid, req, param); -} - /* * Interfaces for IBM EEH (Enhanced Error Handling) */ diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h index fd3933b..7153604 100644 --- a/include/hw/vfio/vfio.h +++ b/include/hw/vfio/vfio.h @@ -3,8 +3,6 @@ #include "qemu/typedefs.h" -extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid, -int req, void *param); bool vfio_eeh_as_ok(AddressSpace *as); int vfio_eeh_as_op(AddressSpace *as, uint32_t op); -- 2.5.0
[Qemu-devel] [PATCHv2 0/7] Allow EEH on spapr-pci-host-bridge devices
For historical reasons, the spapr machine type has two PCI host bridge implementations: spapr-pci-host-bridge, and spapr-vfio-pci-host-bridge. The latter was (duh) designed for VFIO devices, but later reworks mean it's not necessary for that, and VFIO can be used on the regular host bridge. The only remaining difference is that EEH (Enhanced Error Handling - IBM's interface with similar purpose to AER) is only supported on the spapr-vfio-pci-host-bridge device. This series corrects this, allowing EEH operations on the regular host bridge. That allows the special VFIO host bridge (we leave a stub, for backwards compatibility). EEH is only supported for VFIO devices, for the time being, and due to bugs in the kernel implementation, it is only usable when the host bridge only has devices from a single (host-side) IOMMU group ("Partitionable Endpoint", in IBM terminology). That's an annoying limitation which we hope to lift someday, but it's no worse than now, since the spapr-vfio-pci-host-bridge only permits devices from a single IOMMU group in any case (although it doesn't properly enforce that). I wrote these a while back, and I'm not sure why they got sidelined - I suspect I was looking for testing from Gavin Shan, not realising then that he was no longer working on EEH. In any case, I'm hoping we can squeeze this change into 2.6, so we don't need to carry the broken spapr-vfio-pci-host-bridge device any longer. Alex, If you could look at the 2 patches in this series affecting core VFIO code ASAP, that would be really helpful. Also, if you could let me know if you would like to take those through your tree, or if I can take them through mine, with your ack. Changes since v1: * Don't consider containers with no groups as being able to support EEH, which is a safer option * To avoid build problems on non-Linux hosts, I no longer merge most of the code into spapr_pci.c, instead leaving it in helpers in spapr_pci_vfio.c, just no longer part of its own special device type (this approach suggested by Alexey Kardashevskiy). David Gibson (7): vfio: Start improving VFIO/EEH interface spapr_pci: Switch to vfio_eeh_as_op() interface spapr_pci: Eliminate class callbacks spapr_pci: Allow EEH on spapr-pci-host-bridge spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge spapr_pci: Remove finish_realize hook vfio: Eliminate vfio_container_ioctl() hw/ppc/spapr_pci.c | 63 +++-- hw/ppc/spapr_pci_vfio.c | 131 +++- hw/vfio/common.c| 101 +++--- include/hw/pci-host/spapr.h | 64 +- include/hw/vfio/vfio.h | 4 +- 5 files changed, 173 insertions(+), 190 deletions(-) -- 2.5.0
Re: [Qemu-devel] [PATCH 3/3] replay: introduce block devices record/replay
> From: Kevin Wolf [mailto:kw...@redhat.com] > Am 25.02.2016 um 10:06 hat Pavel Dovgalyuk geschrieben: > > There is one problem with flush event - callbacks for flush are called for > > all layers and I couldn't synchronize them correctly yet. > > I'll probably have to add new callback to block driver, which handles > > flush request for the whole stack of the drivers. > > Flushes should be treated more or less the same a writes, I think. But bdrv_co_flush has different structure and does not allow synchronization of the top layer. Here is the patch for fixing this: diff --git a/block/io.c b/block/io.c index a69bfc4..9e05dfe 100644 --- a/block/io.c +++ b/block/io.c @@ -2369,6 +2369,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs) } tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH); +/* Write back all layers by calling one driver function */ +if (bs->drv->bdrv_co_flush) { +ret = bs->drv->bdrv_co_flush(bs); +goto out; +} + /* Write back cached data to the OS even with cache=unsafe */ BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS); if (bs->drv->bdrv_co_flush_to_os) { diff --git a/include/block/block_int.h b/include/block/block_int.h index 9ef823a..9cc2c58 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -176,6 +176,13 @@ struct BlockDriver { int (*bdrv_inactivate)(BlockDriverState *bs); /* + * Flushes all data for all layers by calling bdrv_co_flush for underlying + * layers, if needed. This function is needed for deterministic + * synchronization of the flush finishing callback. + */ +int coroutine_fn (*bdrv_co_flush)(BlockDriverState *bs); + +/* * Flushes all data that was already written to the OS all the way down to * the disk (for example raw-posix calls fsync()). */ Then I'll have just to implement bdrv_co_flush for blkreplay layer. This callback will manually invoke bdrv_co_flush for underlying layer allowing synchronization of finishing callback. Pavel Dovgalyuk
[Qemu-devel] [PULL 0/8] ppc-for-2.6 queue 20160229
To probably no-one's surprise, my "hopefully last" pull request before the soft freeze has turned out not to be the last. Here are a few more patches for qemu-2.6. The following changes since commit 6e378dd214fbbae8138ff011ec3de7ddf13a445f: Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160226' into staging (2016-02-26 16:02:00 +) are available in the git repository at: git://github.com/dgibson/qemu.git tags/ppc-for-2.6-20160229 for you to fetch changes up to a005b3ef50439b5bc6b2eb0b5bda8e8c7c2368bf: xics: report errors with the QEMU Error API (2016-02-28 16:19:02 +1100) ppc patch queue for 2016-02-29 Some more accumulated patches for target-ppc, pseries machine type and related devices to fit in before the qemu-2.6 soft freeze. * Mostly bugfixes and small cleanups for spapr and Mac platforms Greg Kurz (7): spapr_rng: disable hotpluggability spapr_pci: kill useless variable in rtas_ibm_change_msi() spapr_pci: fix irq leak in RTAS ibm,change-msi spapr: disable vmdesc submission for old machines spapr: skip configuration section during migration of older machines migration: allow machine to enforce configuration section migration xics: report errors with the QEMU Error API Hervé Poussineau (1): dbdma: warn when using unassigned channel hw/core/machine.c | 21 + hw/intc/xics.c| 13 + hw/misc/macio/mac_dbdma.c | 25 +++-- hw/ppc/spapr.c| 2 ++ hw/ppc/spapr_events.c | 3 ++- hw/ppc/spapr_pci.c| 31 +-- hw/ppc/spapr_rng.c| 1 + hw/ppc/spapr_vio.c| 7 --- include/hw/boards.h | 1 + include/hw/ppc/xics.h | 5 +++-- migration/savevm.c| 10 -- qemu-options.hx | 3 ++- trace-events | 2 -- 13 files changed, 97 insertions(+), 27 deletions(-)
[Qemu-devel] [PULL 6/8] spapr: skip configuration section during migration of older machines
From: Greg Kurz Since QEMU 2.4, we have a configuration section in the migration stream. This must be skipped for older machines, like it is already done for x86. This patch fixes the migration of pseries-2.3 from/to QEMU 2.3, but it breaks migration of the same machine from/to QEMU 2.4/2.4.1/2.5. We do that anyway because QEMU 2.3 is likely to be more widely deployed than newer QEMU versions. Fixes: 61964c23e5ddd5a33f15699e45ce126f879e3e33 Signed-off-by: Greg Kurz Reviewed-by: Laurent Vivier Reviewed-by: Juan Quintela Signed-off-by: David Gibson --- hw/ppc/spapr.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 3b4a557..e9d4abf 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2427,6 +2427,7 @@ static void spapr_machine_2_3_instance_options(MachineState *machine) spapr_machine_2_4_instance_options(machine); savevm_skip_section_footers(); global_state_set_optional(); +savevm_skip_configuration(); } static void spapr_machine_2_3_class_options(MachineClass *mc) -- 2.5.0
[Qemu-devel] [PULL 1/8] spapr_rng: disable hotpluggability
From: Greg Kurz It is currently possible to hotplug a spapr_rng device but QEMU crashes when we try to hot unplug: ERROR:hw/core/qdev.c:295:qdev_unplug: assertion failed: (hotplug_ctrl) Aborted This happens because spapr_rng isn't plugged to any bus and sPAPR does not provide hotplug support for it: qdev_get_hotplug_handler() hence return NULL and we hit the assertion. And anyway, it doesn't make much sense to unplug this device since hcalls cannot be unregistered. Even the idea of hotplugging a RNG device instead of declaring it on the QEMU command line looks weird. This patch simply disables hotpluggability for the spapr-rng class. Signed-off-by: Greg Kurz Reviewed-by: Thomas Huth Signed-off-by: David Gibson --- hw/ppc/spapr_rng.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/ppc/spapr_rng.c b/hw/ppc/spapr_rng.c index 8484fcf..a39d472 100644 --- a/hw/ppc/spapr_rng.c +++ b/hw/ppc/spapr_rng.c @@ -170,6 +170,7 @@ static void spapr_rng_class_init(ObjectClass *oc, void *data) dc->realize = spapr_rng_realize; set_bit(DEVICE_CATEGORY_MISC, dc->categories); dc->props = spapr_rng_properties; +dc->hotpluggable = false; } static const TypeInfo spapr_rng_info = { -- 2.5.0
[Qemu-devel] [PULL 2/8] spapr_pci: kill useless variable in rtas_ibm_change_msi()
From: Greg Kurz The num local variable is initialized to zero and has no writer. Signed-off-by: Greg Kurz Signed-off-by: David Gibson --- hw/ppc/spapr_pci.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index cca9257..19dd6db 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -275,7 +275,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr, unsigned int req_num = rtas_ld(args, 4); /* 0 == remove all */ unsigned int seq_num = rtas_ld(args, 5); unsigned int ret_intr_type; -unsigned int irq, max_irqs = 0, num = 0; +unsigned int irq, max_irqs = 0; sPAPRPHBState *phb = NULL; PCIDevice *pdev = NULL; spapr_pci_msi *msi; @@ -316,10 +316,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr, xics_free(spapr->icp, msi->first_irq, msi->num); if (msi_present(pdev)) { -spapr_msi_setmsg(pdev, 0, false, 0, num); +spapr_msi_setmsg(pdev, 0, false, 0, 0); } if (msix_present(pdev)) { -spapr_msi_setmsg(pdev, 0, true, 0, num); +spapr_msi_setmsg(pdev, 0, true, 0, 0); } g_hash_table_remove(phb->msi, &config_addr); -- 2.5.0
[Qemu-devel] [PULL 3/8] spapr_pci: fix irq leak in RTAS ibm, change-msi
From: Greg Kurz This RTAS call is used to request new interrupts or to free all interrupts. If the driver has already allocated interrupts and asks again for a non-null number of irqs, then the rtas_ibm_change_msi() function will silently leak the previous interrupts. It happens because xics_free() is only called when the driver releases all interrupts (!req_num case). Note that the previously allocated spapr_pci_msi is not leaked because the GHashTable is created with destroy functions and g_hash_table_insert() hence frees the old value. This patch makes sure any previously allocated MSIs are released when a new allocation succeeds. Signed-off-by: Greg Kurz Signed-off-by: David Gibson --- hw/ppc/spapr_pci.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 19dd6db..9b2b546 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -305,9 +305,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr, return; } +msi = (spapr_pci_msi *) g_hash_table_lookup(phb->msi, &config_addr); + /* Releasing MSIs */ if (!req_num) { -msi = (spapr_pci_msi *) g_hash_table_lookup(phb->msi, &config_addr); if (!msi) { trace_spapr_pci_msi("Releasing wrong config", config_addr); rtas_st(rets, 0, RTAS_OUT_HW_ERROR); @@ -360,6 +361,12 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr, return; } +/* Release previous MSIs */ +if (msi) { +xics_free(spapr->icp, msi->first_irq, msi->num); +g_hash_table_remove(phb->msi, &config_addr); +} + /* Setup MSI/MSIX vectors in the device (via cfgspace or MSIX BAR) */ spapr_msi_setmsg(pdev, SPAPR_PCI_MSI_WINDOW, ret_intr_type == RTAS_TYPE_MSIX, irq, req_num); -- 2.5.0
[Qemu-devel] [PULL 8/8] xics: report errors with the QEMU Error API
From: Greg Kurz Using the return value to report errors is error prone: - xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors on 0 - xics_alloc_block() returns the unclear value of ics->offset - 1 on error but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0 This patch adds an errp argument to xics_alloc() and xics_alloc_block() to report errors. The return value of these functions is a valid IRQ number if errp is NULL. It is undefined otherwise. The corresponding error traces get promotted to error messages. Note that the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also moves to xics_alloc(). Similar error message consolidation isn't really applicable to xics_alloc_block() because callers have extra context (device config address, MSI or MSIX). This fixes the issues mentioned above. Based on previous work from Brian W. Hart. Signed-off-by: Greg Kurz Signed-off-by: David Gibson --- hw/intc/xics.c| 13 + hw/ppc/spapr_events.c | 3 ++- hw/ppc/spapr_pci.c| 16 ++-- hw/ppc/spapr_vio.c| 7 --- include/hw/ppc/xics.h | 5 +++-- trace-events | 2 -- 6 files changed, 28 insertions(+), 18 deletions(-) diff --git a/hw/intc/xics.c b/hw/intc/xics.c index e66ae32..213a370 100644 --- a/hw/intc/xics.c +++ b/hw/intc/xics.c @@ -712,7 +712,7 @@ static int ics_find_free_block(ICSState *ics, int num, int alignnum) return -1; } -int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi) +int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi, Error **errp) { ICSState *ics = &icp->ics[src]; int irq; @@ -720,14 +720,14 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi) if (irq_hint) { assert(src == xics_find_source(icp, irq_hint)); if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) { -trace_xics_alloc_failed_hint(src, irq_hint); +error_setg(errp, "can't allocate IRQ %d: already in use", irq_hint); return -1; } irq = irq_hint; } else { irq = ics_find_free_block(ics, 1, 1); if (irq < 0) { -trace_xics_alloc_failed_no_left(src); +error_setg(errp, "can't allocate IRQ: no IRQ left"); return -1; } irq += ics->offset; @@ -743,7 +743,8 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi) * Allocate block of consecutive IRQs, and return the number of the first IRQ in the block. * If align==true, aligns the first IRQ number to num. */ -int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align) +int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align, + Error **errp) { int i, first = -1; ICSState *ics = &icp->ics[src]; @@ -763,6 +764,10 @@ int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align) } else { first = ics_find_free_block(ics, num, 1); } +if (first < 0) { +error_setg(errp, "can't find a free %d-IRQ block", num); +return -1; +} if (first >= 0) { for (i = first; i < first + num; ++i) { diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c index f5eac4b..39f4682 100644 --- a/hw/ppc/spapr_events.c +++ b/hw/ppc/spapr_events.c @@ -588,7 +588,8 @@ out_no_events: void spapr_events_init(sPAPRMachineState *spapr) { QTAILQ_INIT(&spapr->pending_events); -spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false); +spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false, +&error_fatal); spapr->epow_notifier.notify = spapr_powerdown_req; qemu_register_powerdown_notifier(&spapr->epow_notifier); spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception", diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 9b2b546..e8edad3 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -280,6 +280,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr, PCIDevice *pdev = NULL; spapr_pci_msi *msi; int *config_addr_key; +Error *err = NULL; switch (func) { case RTAS_CHANGE_MSI_FN: @@ -354,9 +355,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, sPAPRMachineState *spapr, /* Allocate MSIs */ irq = xics_alloc_block(spapr->icp, 0, req_num, false, - ret_intr_type == RTAS_TYPE_MSI); -if (!irq) { -error_report("Cannot allocate MSIs for device %x", config_addr); + ret_intr_type == RTAS_TYPE_MSI, &err); +if (err) { +error_reportf_err(err, "Can't allocate MSIs for device %x: ", + config_addr); rtas_st(rets, 0, RTAS_OUT_HW_ERROR); return; } @@ -1367,10 +1369,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) /* Initialize the LSI table */ for (i = 0; i < P
[Qemu-devel] [PULL 7/8] migration: allow machine to enforce configuration section migration
From: Greg Kurz Migration of pseries-2.3 doesn't have configuration section. Unfortunately, QEMU 2.4/2.4.1/2.5 are buggy and always stream and expect the configuration section, and break migration both ways. This patch introduces a property which allows to enforce a configuration section for machines who don't have one. It can be set at startup: -machine enforce-config-section=on or later from the QEMU monitor: qom-set /machine enforce-config-section on It is up to the tooling to set or unset this property according to the version of the QEMU at the other end of the pipe. Signed-off-by: Greg Kurz Reviewed-by: Laurent Vivier Reviewed-by: Juan Quintela Signed-off-by: David Gibson --- hw/core/machine.c | 21 + include/hw/boards.h | 1 + migration/savevm.c | 10 -- qemu-options.hx | 3 ++- 4 files changed, 32 insertions(+), 3 deletions(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index 6d1a0d8..a8c4680 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -312,6 +312,21 @@ static bool machine_get_suppress_vmdesc(Object *obj, Error **errp) return ms->suppress_vmdesc; } +static void machine_set_enforce_config_section(Object *obj, bool value, + Error **errp) +{ +MachineState *ms = MACHINE(obj); + +ms->enforce_config_section = value; +} + +static bool machine_get_enforce_config_section(Object *obj, Error **errp) +{ +MachineState *ms = MACHINE(obj); + +return ms->enforce_config_section; +} + static int error_on_sysbus_device(SysBusDevice *sbdev, void *opaque) { error_report("Option '-device %s' cannot be handled by this machine", @@ -467,6 +482,12 @@ static void machine_initfn(Object *obj) object_property_set_description(obj, "suppress-vmdesc", "Set on to disable self-describing migration", NULL); +object_property_add_bool(obj, "enforce-config-section", + machine_get_enforce_config_section, + machine_set_enforce_config_section, NULL); +object_property_set_description(obj, "enforce-config-section", +"Set on to enforce configuration section migration", +NULL); /* Register notifier when init is done for sysbus sanity checks */ ms->sysbus_notifier.notify = machine_init_notify; diff --git a/include/hw/boards.h b/include/hw/boards.h index de3b3bd..b5d7eae 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -127,6 +127,7 @@ struct MachineState { char *firmware; bool iommu; bool suppress_vmdesc; +bool enforce_config_section; ram_addr_t ram_size; ram_addr_t maxram_size; diff --git a/migration/savevm.c b/migration/savevm.c index b459156..96e7db5 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -878,13 +878,19 @@ bool qemu_savevm_state_blocked(Error **errp) return false; } +static bool enforce_config_section(void) +{ +MachineState *machine = MACHINE(qdev_get_machine()); +return machine->enforce_config_section; +} + void qemu_savevm_state_header(QEMUFile *f) { trace_savevm_state_header(); qemu_put_be32(f, QEMU_VM_FILE_MAGIC); qemu_put_be32(f, QEMU_VM_FILE_VERSION); -if (!savevm_state.skip_configuration) { +if (!savevm_state.skip_configuration || enforce_config_section()) { qemu_put_byte(f, QEMU_VM_CONFIGURATION); vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0); } @@ -1883,7 +1889,7 @@ int qemu_loadvm_state(QEMUFile *f) return -ENOTSUP; } -if (!savevm_state.skip_configuration) { +if (!savevm_state.skip_configuration || enforce_config_section()) { if (qemu_get_byte(f) != QEMU_VM_CONFIGURATION) { error_report("Configuration section missing"); return -EINVAL; diff --git a/qemu-options.hx b/qemu-options.hx index 599db94..144e6a9 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -43,7 +43,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \ "aes-key-wrap=on|off controls support for AES key wrapping (default=on)\n" "dea-key-wrap=on|off controls support for DEA key wrapping (default=on)\n" "suppress-vmdesc=on|off disables self-describing migration (default=off)\n" -"nvdimm=on|off controls NVDIMM support (default=off)\n", +"nvdimm=on|off controls NVDIMM support (default=off)\n" +"enforce-config-section=on|off enforce configuration section migration (default=off)\n", QEMU_ARCH_ALL) STEXI @item -machine [type=]@var{name}[,prop=@var{value}[,...]] -- 2.5.0
[Qemu-devel] [PULL 5/8] dbdma: warn when using unassigned channel
From: Hervé Poussineau With this, it's easier to know if a guest uses an invalid and/or unimplemented DMA channel. Signed-off-by: Hervé Poussineau Reviewed-by: Thomas Huth Acked-by: Mark Cave-Ayland Signed-off-by: David Gibson --- hw/misc/macio/mac_dbdma.c | 25 +++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c index d81dea7..6051f17 100644 --- a/hw/misc/macio/mac_dbdma.c +++ b/hw/misc/macio/mac_dbdma.c @@ -557,11 +557,13 @@ void DBDMA_register_channel(void *dbdma, int nchan, qemu_irq irq, DBDMA_DPRINTF("DBDMA_register_channel 0x%x\n", nchan); +assert(rw); +assert(flush); + ch->irq = irq; ch->rw = rw; ch->flush = flush; ch->io.opaque = opaque; -ch->io.channel = ch; } static void @@ -775,6 +777,20 @@ static void dbdma_reset(void *opaque) memset(s->channels[i].regs, 0, DBDMA_SIZE); } +static void dbdma_unassigned_rw(DBDMA_io *io) +{ +DBDMA_channel *ch = io->channel; +qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n", + __func__, ch->channel); +} + +static void dbdma_unassigned_flush(DBDMA_io *io) +{ +DBDMA_channel *ch = io->channel; +qemu_log_mask(LOG_GUEST_ERROR, "%s: use of unassigned channel %d\n", + __func__, ch->channel); +} + void* DBDMA_init (MemoryRegion **dbdma_mem) { DBDMAState *s; @@ -784,8 +800,13 @@ void* DBDMA_init (MemoryRegion **dbdma_mem) for (i = 0; i < DBDMA_CHANNELS; i++) { DBDMA_io *io = &s->channels[i].io; +DBDMA_channel *ch = &s->channels[i]; qemu_iovec_init(&io->iov, 1); -s->channels[i].channel = i; + +ch->rw = dbdma_unassigned_rw; +ch->flush = dbdma_unassigned_flush; +ch->channel = i; +ch->io.channel = ch; } memory_region_init_io(&s->mem, NULL, &dbdma_ops, s, "dbdma", 0x1000); -- 2.5.0
[Qemu-devel] [PULL 4/8] spapr: disable vmdesc submission for old machines
From: Greg Kurz Since QEMU 2.3, we have a vmdesc section in the migration stream. This section is not mandatory but when migrating a pseries-2.2 machine from QEMU 2.2, you get a warning at the destination: qemu-system-ppc64: Expected vmdescription section, but got 0 The warning goes away if we decide to skip vmdesc as well for older pseries, like it is already done for pc's. This can only be observed with -cpu POWER7 because POWER8 cannot migrate from QEMU 2.2 to 2.3 (insns_flags2 mismatch). Signed-off-by: Greg Kurz Signed-off-by: David Gibson --- hw/ppc/spapr.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index c119f55..3b4a557 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2452,6 +2452,7 @@ DEFINE_SPAPR_MACHINE(2_3, "2.3", false); static void spapr_machine_2_2_instance_options(MachineState *machine) { spapr_machine_2_3_instance_options(machine); +machine->suppress_vmdesc = true; } static void spapr_machine_2_2_class_options(MachineClass *mc) -- 2.5.0
Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code
On Mon, Feb 29, 2016 at 03:25:24PM +1100, Alexey Kardashevskiy wrote: > On 02/29/2016 02:45 PM, Alexey Kardashevskiy wrote: > >On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote: > >>On 02/26/2016 10:31 PM, David Gibson wrote: > >>>Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure() > >>>into rtas_ibm_configure_pe(). > >>> > >>>Signed-off-by: David Gibson > >> > >>Reviewed-by: Alexey Kardashevskiy > > > >Aaaand this breaks mingw32: > > > > CCppc64-softmmu/hw/ppc/spapr_pci.o > >/home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:46:24: fatal error: > >linux/vfio.h: No such file or directory > >compilation terminated. > >/home/aik/p/qemu-dwg-eeh/rules.mak:57: recipe for target > >'hw/ppc/spapr_pci.o' failed > >make[1]: *** [hw/ppc/spapr_pci.o] Error 1 > >make[1]: *** Waiting for unfinished jobs > >Makefile:186: recipe for target 'subdir-ppc64-softmmu' failed > >make: *** [subdir-ppc64-softmmu] Error 2 > >make: Leaving directory '/scratch/aik/p/qemu-dwg-eeh--ppc64_mingw32-build' > > > > > > > >> > >>>--- > >>> hw/ppc/spapr_pci.c | 11 +-- > >>> hw/ppc/spapr_pci_vfio.c | 12 > >>> include/hw/pci-host/spapr.h | 1 - > >>> 3 files changed, 9 insertions(+), 15 deletions(-) > >>> > >>>diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > >>>index d1e5222..fa633a9 100644 > >>>--- a/hw/ppc/spapr_pci.c > >>>+++ b/hw/ppc/spapr_pci.c > >>>@@ -42,6 +42,9 @@ > >>> #include "hw/ppc/spapr_drc.h" > >>> #include "sysemu/device_tree.h" > >>> > >>>+#include "hw/vfio/vfio.h" > >>>+#include > >>>+ > > This is missing: > > #ifdef CONFIG_LINUX > #include > #endif > > and below where you use symbols from linux/vfio.h. > > > My version of this rework did convert class callbacks to exported helpers > and keep these helpers (plus stubs) in spapr_pci_vfio.c, with one #ifdef > CONFIG_LINUX. Looked quite clean to me... Yeah, good idea. I'd forgotten the case of non-Linux builds. I'll do something similar in v2. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [RFC PATCH v0 2/6] spapr: CPU core device
On Fri, Feb 26, 2016 at 12:13:39PM -0600, Michael Roth wrote: > Quoting Bharata B Rao (2016-02-25 10:22:38) > > Add sPAPR specific CPU core device that is based on generic CPU core device. > > Creating this core device will result in creation of all the CPU thread > > devices that are part of this core. > > > > Signed-off-by: Bharata B Rao > > --- > > hw/ppc/Makefile.objs| 1 + > > hw/ppc/spapr_cpu_core.c | 210 > > > > include/hw/ppc/spapr_cpu_core.h | 32 ++ > > 3 files changed, 243 insertions(+) > > create mode 100644 hw/ppc/spapr_cpu_core.c > > create mode 100644 include/hw/ppc/spapr_cpu_core.h > > > > diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs > > index c1ffc77..5cc6608 100644 > > --- a/hw/ppc/Makefile.objs > > +++ b/hw/ppc/Makefile.objs > > @@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o > > obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o > > obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o > > obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o > > +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o > > ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy) > > obj-y += spapr_pci_vfio.o > > endif > > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c > > new file mode 100644 > > index 000..c44eb61 > > --- /dev/null > > +++ b/hw/ppc/spapr_cpu_core.c > > @@ -0,0 +1,210 @@ > > +/* > > + * sPAPR CPU core device, acts as container of CPU thread devices. > > + * > > + * Copyright (C) 2016 Bharata B Rao > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2 or > > later. > > + * See the COPYING file in the top-level directory. > > + */ > > +#include "hw/cpu/core.h" > > +#include "hw/ppc/spapr_cpu_core.h" > > +#include "hw/ppc/spapr.h" > > +#include "hw/boards.h" > > +#include "qemu/error-report.h" > > +#include "qapi/visitor.h" > > +#include > > + > > +static int spapr_cpu_core_realize_child(Object *child, void *opaque) > > +{ > > +Error **errp = opaque; > > + > > +object_property_set_bool(child, true, "realized", errp); > > +if (*errp) { > > +return 1; > > +} > > +return 0; > > +} > > + > > +static void spapr_cpu_core_realize(DeviceState *dev, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev)); > > +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine()); > > +Error *local_err = NULL; > > + > > +if (!core->nr_threads) { > > +error_setg(errp, "nr_threads property can't be 0"); > > +return; > > +} > > + > > +if (!core->cpu_model) { > > +error_setg(errp, "cpu_model property isn't set"); > > +return; > > +} > > + > > +/* > > + * TODO: If slot isn't specified, plug this core into > > + * an existing empty slot. > > + */ > > +if (!core->slot) { > > +error_setg(errp, "slot property isn't set"); > > +return; > > +} > > + > > +object_property_set_link(OBJECT(spapr), OBJECT(core), core->slot, > > + &local_err); > > +if (local_err) { > > +error_propagate(errp, local_err); > > +return; > > +} > > + > > +object_child_foreach(OBJECT(dev), spapr_cpu_core_realize_child, errp); > > +} > > + > > +/* > > + * This creates the CPU threads for a given @core. > > + * > > + * In order to create the threads, we need two inputs - number of > > + * threads and the cpu_model. These are set as core object's properties. > > + * When both of them become available/set, this routine will be called from > > + * either property's set handler to create the threads. > > + * > > + * TODO: Dependence of threads creation on two properties is resulting > > + * in this not-so-clean way of creating threads from either of the > > + * property setters based on the order in which they get set. Check if > > + * this can be handled in a better manner. > > + */ > > +static void spapr_cpu_core_create_threads(sPAPRCPUCore *core, Error **errp) > > +{ > > +int i; > > + > > +for (i = 0; i < core->nr_threads; i++) { > > +char id[32]; > > +char type[32]; > > + > > +snprintf(type, sizeof(type), "%s-%s", core->cpu_model, > > + TYPE_POWERPC_CPU); > > +object_initialize(&core->threads[i], sizeof(core->threads[i]), > > type); > > + > > +snprintf(id, sizeof(id), "thread[%d]", i); > > +object_property_add_child(OBJECT(core), id, > > OBJECT(&core->threads[i]), > > + errp); > > +} > > +} > > + > > +static char *spapr_cpu_core_prop_get_slot(Object *obj, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + > > +return core->slot; > > +} > > + > > +static void spapr_cpu_core_prop_set_slot(Object *obj, const char *val, > > + Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + >
Re: [Qemu-devel] [RFC PATCH v0 5/6] qmp, spapr: Show hot-plugged/pluggable CPU slots in the Machine
On Fri, Feb 26, 2016 at 08:58:05AM -0700, Eric Blake wrote: > On 02/25/2016 09:22 AM, Bharata B Rao wrote: > > Implement query cpu-slots that provides information about hot-plugged > > as well as hot-pluggable CPU slots that the machine supports. > > > > TODO: As Eric suggested use enum for type instead of str. > > TODO: @hotplug-granularity probably isn't required. > > I guess this is still marked TODO because the series is still RFC? Yes. > > > > > Signed-off-by: Bharata B Rao > > --- > > > +++ b/qapi-schema.json > > @@ -4083,3 +4083,88 @@ > > ## > > { 'enum': 'ReplayMode', > >'data': [ 'none', 'record', 'play' ] } > > + > > +## > > +# @CPUInfo: > > +# > > +# Information about CPUs > > +# > > +# @arch-id: Arch-specific ID for the CPU. > > +# > > +# @type: QOM type of the CPU. > > +# > > +# @thread: Thread ID of the CPU. > > +# > > +# @core: Core ID of the CPU. > > +# > > +# @socket: Socket ID of the CPU. > > +# > > +# @node: NUMA node to which the CPU belongs. > > Please add the '#optional' tag to the fields which are not always present. > Sure. > > +# > > +# @qom-path: QOM path of the CPU object > > +# > > +# Since: 2.6 > > +## > > + > > +{ 'struct': 'CPUInfo', > > + 'data': { 'arch-id': 'int', > > +'type': 'str', > > The TODO in the commit message mentions that this should be converted to > an enum. > > > +'*thread': 'int', > > +'*core': 'int', > > +'*socket' : 'int', > > +'*node' : 'int', > > +'*qom-path': 'str' > > + } > > But looking better than the previous round. Thanks for the review. Do you have any comments on the applicability/suitability of this interface from libvirt point of view for performing device_add based CPU hotplug ? Regards, Bharata.
Re: [Qemu-devel] [RFC PATCH v0 2/6] spapr: CPU core device
On Fri, Feb 26, 2016 at 11:46:19AM +0100, Thomas Huth wrote: > On 25.02.2016 17:22, Bharata B Rao wrote: > > Add sPAPR specific CPU core device that is based on generic CPU core device. > > Creating this core device will result in creation of all the CPU thread > > devices that are part of this core. > > > > Signed-off-by: Bharata B Rao > > --- > ... > > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c > > new file mode 100644 > > index 000..c44eb61 > > --- /dev/null > > +++ b/hw/ppc/spapr_cpu_core.c > > @@ -0,0 +1,210 @@ > > +/* > > + * sPAPR CPU core device, acts as container of CPU thread devices. > > + * > > + * Copyright (C) 2016 Bharata B Rao > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2 or > > later. > > + * See the COPYING file in the top-level directory. > > + */ > > +#include "hw/cpu/core.h" > > +#include "hw/ppc/spapr_cpu_core.h" > > +#include "hw/ppc/spapr.h" > > +#include "hw/boards.h" > > +#include "qemu/error-report.h" > > +#include "qapi/visitor.h" > > +#include > > + > > +static int spapr_cpu_core_realize_child(Object *child, void *opaque) > > +{ > > +Error **errp = opaque; > > + > > +object_property_set_bool(child, true, "realized", errp); > > +if (*errp) { > > +return 1; > > +} > > +return 0; > > +} > > + > > +static void spapr_cpu_core_realize(DeviceState *dev, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev)); > > +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine()); > > +Error *local_err = NULL; > > + > > +if (!core->nr_threads) { > > +error_setg(errp, "nr_threads property can't be 0"); > > +return; > > +} > > + > > +if (!core->cpu_model) { > > +error_setg(errp, "cpu_model property isn't set"); > > +return; > > +} > > + > > +/* > > + * TODO: If slot isn't specified, plug this core into > > + * an existing empty slot. > > + */ > > +if (!core->slot) { > > +error_setg(errp, "slot property isn't set"); > > +return; > > +} > > + > > +object_property_set_link(OBJECT(spapr), OBJECT(core), core->slot, > > + &local_err); > > +if (local_err) { > > +error_propagate(errp, local_err); > > +return; > > +} > > + > > +object_child_foreach(OBJECT(dev), spapr_cpu_core_realize_child, errp); > > +} > > + > > +/* > > + * This creates the CPU threads for a given @core. > > + * > > + * In order to create the threads, we need two inputs - number of > > + * threads and the cpu_model. These are set as core object's properties. > > + * When both of them become available/set, this routine will be called from > > + * either property's set handler to create the threads. > > + * > > + * TODO: Dependence of threads creation on two properties is resulting > > + * in this not-so-clean way of creating threads from either of the > > + * property setters based on the order in which they get set. Check if > > + * this can be handled in a better manner. > > + */ > > +static void spapr_cpu_core_create_threads(sPAPRCPUCore *core, Error **errp) > > +{ > > +int i; > > + > > +for (i = 0; i < core->nr_threads; i++) { > > +char id[32]; > > +char type[32]; > > + > > +snprintf(type, sizeof(type), "%s-%s", core->cpu_model, > > + TYPE_POWERPC_CPU); > > +object_initialize(&core->threads[i], sizeof(core->threads[i]), > > type); > > + > > +snprintf(id, sizeof(id), "thread[%d]", i); > > +object_property_add_child(OBJECT(core), id, > > OBJECT(&core->threads[i]), > > + errp); > > Need to check errp here to see whether something went wrong? Yes, I will use local_err and them propagate. > > > +} > > +} > > + > > +static char *spapr_cpu_core_prop_get_slot(Object *obj, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + > > +return core->slot; > > +} > > + > > +static void spapr_cpu_core_prop_set_slot(Object *obj, const char *val, > > + Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + > > +core->slot = g_strdup(val); > > +} > > + > > +static char *spapr_cpu_core_prop_get_cpu_model(Object *obj, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + > > +return core->cpu_model; > > +} > > + > > +static void spapr_cpu_core_prop_set_cpu_model(Object *obj, const char *val, > > + Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > +MachineState *machine = MACHINE(qdev_get_machine()); > > + > > +/* > > + * cpu_model can't be different from what is specified with -cpu > > + */ > > +if (strcmp(val, machine->cpu_model)) { > > + error_setg(errp, "cpu_model should be %s", machine->cpu_model); > > s/should/must/ sure :) > > > + return
Re: [Qemu-devel] [RFC PATCH v2 1/3] vGPU Core driver
> From: Kirti Wankhede > Sent: Wednesday, February 24, 2016 12:24 AM > > Design for vGPU Driver: > Main purpose of vGPU driver is to provide a common interface for vGPU > management that can be used by differnt GPU drivers. > > This module would provide a generic interface to create the device, add > it to vGPU bus, add device to IOMMU group and then add it to vfio group. > > High Level block diagram: > > +--+vgpu_register_driver()+---+ > | __init() +->+ | > | | | | > | +<-+vgpu.ko| > | vgpu_vfio.ko | probe()/remove() | | > | |+-+ +-+ > +--+| +---+---+ | > | ^ | > | callback| | > | +---++| > | |vgpu_register_device() | > | ||| > +---^-+-++-+--+-+ > | nvidia.ko || i915.ko | > | ||| > +---+++ > > vGPU driver provides two types of registration interfaces: > 1. Registration interface for vGPU bus driver: > > /** > * struct vgpu_driver - vGPU device driver > * @name: driver name > * @probe: called when new device created > * @remove: called when device removed > * @driver: device driver structure > * > **/ > struct vgpu_driver { > const char *name; > int (*probe) (struct device *dev); > void (*remove) (struct device *dev); > struct device_driverdriver; > }; > > int vgpu_register_driver(struct vgpu_driver *drv, struct module *owner); > void vgpu_unregister_driver(struct vgpu_driver *drv); > > VFIO bus driver for vgpu, should use this interface to register with > vGPU driver. With this, VFIO bus driver for vGPU devices is responsible > to add vGPU device to VFIO group. > > 2. GPU driver interface > GPU driver interface provides GPU driver the set APIs to manage GPU driver > related work in their own driver. APIs are to: > - vgpu_supported_config: provide supported configuration list by the GPU. > - vgpu_create: to allocate basic resouces in GPU driver for a vGPU device. > - vgpu_destroy: to free resources in GPU driver during vGPU device destroy. > - vgpu_start: to initiate vGPU initialization process from GPU driver when VM > boots and before QEMU starts. > - vgpu_shutdown: to teardown vGPU resources during VM teardown. > - read : read emulation callback. > - write: write emulation callback. > - vgpu_set_irqs: send interrupt configuration information that QEMU sets. > - vgpu_bar_info: to provice BAR size and its flags for the vGPU device. > - validate_map_request: to validate remap pfn request. > > This registration interface should be used by GPU drivers to register > each physical device to vGPU driver. > > Updated this patch with couple of more functions in GPU driver interface > which were discussed during v1 version of this RFC. > > Thanks, > Kirti. > > Signed-off-by: Kirti Wankhede > Signed-off-by: Neo Jia Hi, Kirti/Neo, Thanks a lot for you updated version. Having not looked into detail code, first come with some high level comments. First, in a glimpse the majority of the code (possibly >95%) is device agnostic, though we call it vgpu today. Just thinking about the extensibility and usability of this framework, would it be better to name it in a way that any other type of I/O device can be fit into this framework? I don't have a good idea of the name now, but a simple idea is to replace vgpu with vdev (vdev-core, vfio-vdev, vfio-iommu-type1-vdev, etc.), and then underlying GPU drivers are just one category of users of this general vdev framework. In the future it's easily extended to support other I/O virtualization based on similar vgpu concept; Second, are these 3 patches already working with nvidia device, or are they just conceptual implementation w/o completing actual test yet? We'll start moving our implementation toward this direction too, so would be good to know the current status and how we can further cooperate to move forward. Based on that we can start giving more comments on next level detail. Thanks Kevin
Re: [Qemu-devel] [RFC PATCH v0 3/6] spapr: Represent boot CPUs as spapr-cpu-core devices
On Fri, Feb 26, 2016 at 04:18:57PM +0100, Igor Mammedov wrote: > On Thu, 25 Feb 2016 21:52:39 +0530 > Bharata B Rao wrote: > > > Initialize boot CPUs as spapr-cpu-core devices and create links from > > machine object to these core devices. These links can be considered > > as CPU slots in which core devices will get hot-plugged. spapr-cpu-core > > device's slot property indicates the slot where it is plugged. Information > > about all the CPU slots can be obtained by walking these links. > > > > Signed-off-by: Bharata B Rao > > --- > > hw/ppc/spapr.c | 65 > > +++--- > > include/hw/ppc/spapr.h | 3 +++ > > 2 files changed, 60 insertions(+), 8 deletions(-) > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > index e214a34..1f0d232 100644 > > --- a/hw/ppc/spapr.c > > +++ b/hw/ppc/spapr.c > > @@ -64,6 +64,7 @@ > > > > #include "hw/compat.h" > > #include "qemu-common.h" > > +#include "hw/ppc/spapr_cpu_core.h" > > > > #include > > > > @@ -1720,6 +1721,21 @@ static void spapr_validate_node_memory(MachineState > > *machine, Error **errp) > > } > > } > > > > +/* > > + * Check to see if core is being hot-plugged into an already populated > > slot. > > + */ > > +static void spapr_cpu_core_allow_set_link(Object *obj, const char *name, > > + Object *val, Error **errp) > > +{ > > +Object *core = object_property_get_link(qdev_get_machine(), name, > > NULL); > > + > > +if (core) { > > +char *path = object_get_canonical_path(core); > > +error_setg(errp, "Slot %s already populated with %s", name, path); > > +g_free(path); > > +} > > +} > > + > > /* pSeries LPAR / sPAPR hardware init */ > > static void ppc_spapr_init(MachineState *machine) > > { > > @@ -1728,7 +1744,6 @@ static void ppc_spapr_init(MachineState *machine) > > const char *kernel_filename = machine->kernel_filename; > > const char *kernel_cmdline = machine->kernel_cmdline; > > const char *initrd_filename = machine->initrd_filename; > > -PowerPCCPU *cpu; > > PCIHostState *phb; > > int i; > > MemoryRegion *sysmem = get_system_memory(); > > @@ -1742,6 +1757,8 @@ static void ppc_spapr_init(MachineState *machine) > > long load_limit, fw_size; > > bool kernel_le = false; > > char *filename; > > +int spapr_cores = smp_cpus / smp_threads; > > +int spapr_max_cores = max_cpus / smp_threads; > > > > msi_supported = true; > > > > @@ -1800,13 +1817,38 @@ static void ppc_spapr_init(MachineState *machine) > > if (machine->cpu_model == NULL) { > > machine->cpu_model = kvm_enabled() ? "host" : "POWER7"; > > } > > -for (i = 0; i < smp_cpus; i++) { > > -cpu = cpu_ppc_init(machine->cpu_model); > > -if (cpu == NULL) { > > -error_report("Unable to find PowerPC CPU definition"); > > -exit(1); > > + > > +spapr->cores = g_malloc0(spapr_max_cores * sizeof(Object)); > souldn't it be sizeof(Object *) Yes, I meant to store the pointers to Object here, will change. > > > + > > +for (i = 0; i < spapr_max_cores; i++) { > > +Object *spapr_cpu_core = object_new(TYPE_SPAPR_CPU_CORE); > you allocate spapr_max_cores cores but set links to only to spapr_cores only, > it looks like all spapr_cpu_core objects are leaked for range > spapr_cores..spapr_max_cores > Yes, as I replied in an ealier thread, the intention is to create links for all possible cores, but create only those many cores required for CPUs specified with -smp from machine init. The links will be set only for those cores. Hotplugged cores will get created and the links will be set for them during device_add. So the changed code now looks like this: for (i = 0; i < spapr_max_cores; i++) { char name[32]; /* * Create links from machine objects to all possible cores. */ snprintf(name, sizeof(name), "%s[%d]", SPAPR_MACHINE_CPU_CORE_PROP, i); object_property_add_link(OBJECT(spapr), name, TYPE_SPAPR_CPU_CORE, (Object **)&spapr->cores[i], spapr_cpu_core_allow_set_link, 0, &error_fatal); /* * Create cores and set link from machine object to core object for * boot time CPUs and realize them. */ if (i < spapr_cores) { Object *core = object_new(TYPE_SPAPR_CPU_CORE); object_property_set_str(core, machine->cpu_model, "cpu_model", &error_fatal); object_property_set_int(core, smp_threads, "nr_threads", &error_fatal); object_property_set_str(core, name, SPAPR_CPU_CORE_SLOT_PROP, &error_fatal); object_property_set_bool(core, true, "realized", &error_fatal); } } > >
Re: [Qemu-devel] [PATCH V7 0/2] net/filter-mirror:add filter-mirror and unit test
On 02/26/2016 04:12 PM, Zhang Chen wrote: > Filter-mirror is a netfilter plugin. > It gives qemu the ability to mirror > packets to a chardev. > > v7: > - fix mktemp() to mkstemp() > > v6: > - Address Jason's comments. > > v5: > - Address Jason's comments. > > v4: > - Address Jason's comments. > > v3: > - Add filter-mirror unit test according >to Jason's comments > - Address zhanghailiang's comments. > - Address Jason's comments. > > v2: > - Address zhanghailiang's comments. > - Address Eric Blake's comments. > - Address Yang Hongyang's comments. > - Address Dave's comments. > > v1: > initial patch. > > > Zhang Chen (2): > net/filter-mirror:Add filter-mirror > tests/test-filter-mirror:add filter-mirror unit test > > net/Makefile.objs | 1 + > net/filter-mirror.c| 181 > + > qemu-options.hx| 5 ++ > tests/.gitignore | 1 + > tests/Makefile | 2 + > tests/test-filter-mirror.c | 89 ++ > vl.c | 3 +- > 7 files changed, 281 insertions(+), 1 deletion(-) > create mode 100644 net/filter-mirror.c > create mode 100644 tests/test-filter-mirror.c > Applied to -net. Thanks
Re: [Qemu-devel] Installing guest OS in VM
On Sun, 02/28 10:37, Sarah Khan wrote: > Hi, > I have been looking forward to participate in outreachy round 12.So, I was > trying to install guest OS in VM according to instructions as given on this > link: > http://wiki.qemu-project.org/Hosts/Linux but I am stuck after the > instruction which tells to boot PC-BIOS which is > > bin/debug/native/x86_64-softmmu/qemu-system-x86_64 -L pc-bios > > After running this I get:VNC server running on '127.0.0.1;5900'and nothing. If you recompile with sdl or gtk support, you will get a window showing the guest console. It looks like you have neither, but you can still use "vncviewer 127.0.0.1:5900" to connect to the vnc server. Fam > > Can someone suggest that I am going correct or not? Also after this > how to go about this command: > ./qemu-img create -f qcow2 test.qcow2 16G > > > > Thanks, > > Sarah
[Qemu-devel] Installing guest OS in VM
Hi, I have been looking forward to participate in outreachy round 12.So, I was trying to install guest OS in VM according to instructions as given on this link: http://wiki.qemu-project.org/Hosts/Linux but I am stuck after the instruction which tells to boot PC-BIOS which is bin/debug/native/x86_64-softmmu/qemu-system-x86_64 -L pc-bios After running this I get:VNC server running on '127.0.0.1;5900'and nothing. Can someone suggest that I am going correct or not? Also after this how to go about this command: ./qemu-img create -f qcow2 test.qcow2 16G Thanks, Sarah
Re: [Qemu-devel] [RFC PATCH v0 2/6] spapr: CPU core device
On Fri, Feb 26, 2016 at 12:13:39PM -0600, Michael Roth wrote: > Quoting Bharata B Rao (2016-02-25 10:22:38) > > Add sPAPR specific CPU core device that is based on generic CPU core device. > > Creating this core device will result in creation of all the CPU thread > > devices that are part of this core. > > > > Signed-off-by: Bharata B Rao > > --- > > hw/ppc/Makefile.objs| 1 + > > hw/ppc/spapr_cpu_core.c | 210 > > > > include/hw/ppc/spapr_cpu_core.h | 32 ++ > > 3 files changed, 243 insertions(+) > > create mode 100644 hw/ppc/spapr_cpu_core.c > > create mode 100644 include/hw/ppc/spapr_cpu_core.h > > > > diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs > > index c1ffc77..5cc6608 100644 > > --- a/hw/ppc/Makefile.objs > > +++ b/hw/ppc/Makefile.objs > > @@ -4,6 +4,7 @@ obj-y += ppc.o ppc_booke.o > > obj-$(CONFIG_PSERIES) += spapr.o spapr_vio.o spapr_events.o > > obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o > > obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o > > +obj-$(CONFIG_PSERIES) += spapr_cpu_core.o > > ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy) > > obj-y += spapr_pci_vfio.o > > endif > > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c > > new file mode 100644 > > index 000..c44eb61 > > --- /dev/null > > +++ b/hw/ppc/spapr_cpu_core.c > > @@ -0,0 +1,210 @@ > > +/* > > + * sPAPR CPU core device, acts as container of CPU thread devices. > > + * > > + * Copyright (C) 2016 Bharata B Rao > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2 or > > later. > > + * See the COPYING file in the top-level directory. > > + */ > > +#include "hw/cpu/core.h" > > +#include "hw/ppc/spapr_cpu_core.h" > > +#include "hw/ppc/spapr.h" > > +#include "hw/boards.h" > > +#include "qemu/error-report.h" > > +#include "qapi/visitor.h" > > +#include > > + > > +static int spapr_cpu_core_realize_child(Object *child, void *opaque) > > +{ > > +Error **errp = opaque; > > + > > +object_property_set_bool(child, true, "realized", errp); > > +if (*errp) { > > +return 1; > > +} > > +return 0; > > +} > > + > > +static void spapr_cpu_core_realize(DeviceState *dev, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev)); > > +sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine()); > > +Error *local_err = NULL; > > + > > +if (!core->nr_threads) { > > +error_setg(errp, "nr_threads property can't be 0"); > > +return; > > +} > > + > > +if (!core->cpu_model) { > > +error_setg(errp, "cpu_model property isn't set"); > > +return; > > +} > > + > > +/* > > + * TODO: If slot isn't specified, plug this core into > > + * an existing empty slot. > > + */ > > +if (!core->slot) { > > +error_setg(errp, "slot property isn't set"); > > +return; > > +} > > + > > +object_property_set_link(OBJECT(spapr), OBJECT(core), core->slot, > > + &local_err); > > +if (local_err) { > > +error_propagate(errp, local_err); > > +return; > > +} > > + > > +object_child_foreach(OBJECT(dev), spapr_cpu_core_realize_child, errp); > > +} > > + > > +/* > > + * This creates the CPU threads for a given @core. > > + * > > + * In order to create the threads, we need two inputs - number of > > + * threads and the cpu_model. These are set as core object's properties. > > + * When both of them become available/set, this routine will be called from > > + * either property's set handler to create the threads. > > + * > > + * TODO: Dependence of threads creation on two properties is resulting > > + * in this not-so-clean way of creating threads from either of the > > + * property setters based on the order in which they get set. Check if > > + * this can be handled in a better manner. > > + */ > > +static void spapr_cpu_core_create_threads(sPAPRCPUCore *core, Error **errp) > > +{ > > +int i; > > + > > +for (i = 0; i < core->nr_threads; i++) { > > +char id[32]; > > +char type[32]; > > + > > +snprintf(type, sizeof(type), "%s-%s", core->cpu_model, > > + TYPE_POWERPC_CPU); > > +object_initialize(&core->threads[i], sizeof(core->threads[i]), > > type); > > + > > +snprintf(id, sizeof(id), "thread[%d]", i); > > +object_property_add_child(OBJECT(core), id, > > OBJECT(&core->threads[i]), > > + errp); > > +} > > +} > > + > > +static char *spapr_cpu_core_prop_get_slot(Object *obj, Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + > > +return core->slot; > > +} > > + > > +static void spapr_cpu_core_prop_set_slot(Object *obj, const char *val, > > + Error **errp) > > +{ > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(obj); > > + >
[Qemu-devel] [Bug 893208] Re: qemu on ARM hosts can't boot i386 image
It doesn't work on my XU4 either. The supported virtualization would probably work for ARM images but it's not something many people need. What's the holdup, dear devs? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/893208 Title: qemu on ARM hosts can't boot i386 image Status in QEMU: Confirmed Status in Linaro QEMU: New Bug description: If you apply some workarounds for bug 870990, bug 883133 and bug 883136 QEMU still cannot boot the i386 debian_squeeze_i386_standard.qcow2 image from http://people.debian.org/~aurel32/qemu/i386/ -- grub starts to boot but something causes the system to reset just before display of the blue-background grub menu, so we go round in a loop forever. This image boots OK on i386 hosted qemu so this indicates some kind of ARM- host specific bug. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/893208/+subscriptions
Re: [Qemu-devel] [PATCH 02/12] spapr_pci: Switch to vfio_eeh_as_op() interface
On Mon, Feb 29, 2016 at 12:43:05PM +1100, Alexey Kardashevskiy wrote: > On 02/26/2016 10:31 PM, David Gibson wrote: > >This switches all EEH on VFIO operations in spapr_pci_vfio.c from the > >broken vfio_container_ioctl() interface to the new vfio_as_eeh_op() > >interface. > > > >Signed-off-by: David Gibson mak > > Where is that "mak" from? :) Oops, no idea. Fixed. > Reviewed-by: Alexey Kardashevskiy > > > >--- > > hw/ppc/spapr_pci_vfio.c | 50 > > - > > 1 file changed, 16 insertions(+), 34 deletions(-) > > > >diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c > >index 2f3752e..b1e8e8e 100644 > >--- a/hw/ppc/spapr_pci_vfio.c > >+++ b/hw/ppc/spapr_pci_vfio.c > >@@ -73,15 +73,9 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState > >*sphb, Error **errp) > > spapr_tce_get_iommu(tcet)); > > } > > > >-static void spapr_phb_vfio_eeh_reenable(sPAPRPHBVFIOState *svphb) > >+static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb) > > { > >-struct vfio_eeh_pe_op op = { > >-.argsz = sizeof(op), > >-.op= VFIO_EEH_PE_ENABLE > >-}; > >- > >-vfio_container_ioctl(&svphb->phb.iommu_as, > >- svphb->iommugroupid, VFIO_EEH_PE_OP, &op); > >+vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE); > > } > > > > static void spapr_phb_vfio_reset(DeviceState *qdev) > >@@ -92,19 +86,18 @@ static void spapr_phb_vfio_reset(DeviceState *qdev) > > * ensures that the contained PCI devices will work properly > > * after reboot. > > */ > >-spapr_phb_vfio_eeh_reenable(SPAPR_PCI_VFIO_HOST_BRIDGE(qdev)); > >+spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); > > } > > > > static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, > > unsigned int addr, int option) > > { > >-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); > >-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; > >+uint32_t op; > > int ret; > > > > switch (option) { > > case RTAS_EEH_DISABLE: > >-op.op = VFIO_EEH_PE_DISABLE; > >+op = VFIO_EEH_PE_DISABLE; > > break; > > case RTAS_EEH_ENABLE: { > > PCIHostState *phb; > >@@ -122,21 +115,20 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState > >*sphb, > > return RTAS_OUT_PARAM_ERROR; > > } > > > >-op.op = VFIO_EEH_PE_ENABLE; > >+op = VFIO_EEH_PE_ENABLE; > > break; > > } > > case RTAS_EEH_THAW_IO: > >-op.op = VFIO_EEH_PE_UNFREEZE_IO; > >+op = VFIO_EEH_PE_UNFREEZE_IO; > > break; > > case RTAS_EEH_THAW_DMA: > >-op.op = VFIO_EEH_PE_UNFREEZE_DMA; > >+op = VFIO_EEH_PE_UNFREEZE_DMA; > > break; > > default: > > return RTAS_OUT_PARAM_ERROR; > > } > > > >-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, > >- VFIO_EEH_PE_OP, &op); > >+ret = vfio_eeh_as_op(&sphb->iommu_as, op); > > if (ret < 0) { > > return RTAS_OUT_HW_ERROR; > > } > >@@ -146,13 +138,9 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState > >*sphb, > > > > static int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state) > > { > >-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); > >-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; > > int ret; > > > >-op.op = VFIO_EEH_PE_GET_STATE; > >-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, > >- VFIO_EEH_PE_OP, &op); > >+ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE); > > if (ret < 0) { > > return RTAS_OUT_PARAM_ERROR; > > } > >@@ -206,28 +194,26 @@ static void spapr_phb_vfio_eeh_pre_reset(sPAPRPHBState > >*sphb) > > > > static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) > > { > >-sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); > >-struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; > >+uint32_t op; > > int ret; > > > > switch (option) { > > case RTAS_SLOT_RESET_DEACTIVATE: > >-op.op = VFIO_EEH_PE_RESET_DEACTIVATE; > >+op = VFIO_EEH_PE_RESET_DEACTIVATE; > > break; > > case RTAS_SLOT_RESET_HOT: > > spapr_phb_vfio_eeh_pre_reset(sphb); > >-op.op = VFIO_EEH_PE_RESET_HOT; > >+op = VFIO_EEH_PE_RESET_HOT; > > break; > > case RTAS_SLOT_RESET_FUNDAMENTAL: > > spapr_phb_vfio_eeh_pre_reset(sphb); > >-op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL; > >+op = VFIO_EEH_PE_RESET_FUNDAMENTAL; > > break; > > default: > > return RTAS_OUT_PARAM_ERROR; > > } > > > >-ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, > >- VFIO_EEH_PE_OP, &op); > >+ret =
Re: [Qemu-devel] [PATCH 01/12] vfio: Start improving VFIO/EEH interface
On Mon, Feb 29, 2016 at 11:58:54AM +1100, Alexey Kardashevskiy wrote: > On 02/26/2016 10:31 PM, David Gibson wrote: > >At present the code handling IBM's Enhanced Error Handling (EEH) interface > >on VFIO devices operates by bypassing the usual VFIO logic with > >vfio_container_ioctl(). That's a poorly designed interface with unclear > >semantics about exactly what can be operated on. > > > >In particular it operates on a single vfio container internally (hence the > >name), but takes an address space and group id, from which it deduces the > >container in a rather roundabout way. groupids are something that code > >outside vfio shouldn't even be aware of. > > > >This patch creates new interfaces for EEH operations. Internally we > >have vfio_eeh_container_op() which takes a VFIOContainer object > >directly. For external use we have vfio_eeh_as_ok() which determines > >if an AddressSpace is usable for EEH (at present this means it has a > >single container and at most a single group attached), and > >vfio_eeh_as_op() which will perform an operation on an AddressSpace in > >the unambiguous case, and otherwise returns an error. > > > >This interface still isn't great, but it's enough of an improvement to > >allow a number of cleanups in other places. > > > >Signed-off-by: David Gibson > >--- > > hw/vfio/common.c | 77 > > ++ > > include/hw/vfio/vfio.h | 2 ++ > > 2 files changed, 79 insertions(+) > > > >diff --git a/hw/vfio/common.c b/hw/vfio/common.c > >index 607ec70..e419241 100644 > >--- a/hw/vfio/common.c > >+++ b/hw/vfio/common.c > >@@ -1003,3 +1003,80 @@ int vfio_container_ioctl(AddressSpace *as, int32_t > >groupid, > > > > return vfio_container_do_ioctl(as, groupid, req, param); > > } > >+ > >+/* > >+ * Interfaces for IBM EEH (Enhanced Error Handling) > >+ */ > >+static bool vfio_eeh_container_ok(VFIOContainer *container) > >+{ > >+/* A broken kernel implementation means EEH operations won't work > >+ * correctly if there are multiple groups in a container */ > >+ > >+if (!QLIST_EMPTY(&container->group_list) > >+&& QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) > >{ > >+return false; > >+} > >+ > >+return true; > > If &container->group_list is empty, this helper returns "true". Does not > look right, does it?... Hmm.. my thinking was that EEH was safe (if a no-op) on a container with no groups. But, thinking about it again I'm not sure that the state of previous EEH ops will get transferred to a group added to the container later. So I think returning false on an empty container probably is safer. I'll change it. > >+} > >+ > >+static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) > >+{ > >+struct vfio_eeh_pe_op pe_op = { > >+.argsz = sizeof(pe_op), > >+.op = op, > >+}; > >+int ret; > >+ > >+if (!vfio_eeh_container_ok(container)) { > >+error_report("vfio/eeh: EEH_PE_OP 0x%x called on container" > >+ " with multiple groups", op); > >+return -EPERM; > >+} > >+ > >+ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op); > >+if (ret < 0) { > >+error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op); > >+return -errno; > >+} > >+ > >+return 0; > >+} > >+ > >+static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) > >+{ > >+VFIOAddressSpace *space = vfio_get_address_space(as); > >+VFIOContainer *container = NULL; > >+ > >+if (QLIST_EMPTY(&space->containers)) { > >+/* No containers to act on */ > >+goto out; > >+} > >+ > >+container = QLIST_FIRST(&space->containers); > >+ > >+if (QLIST_NEXT(container, next)) { > >+/* We don't yet have logic to synchronize EEH state across > >+ * multiple containers */ > >+container = NULL; > >+goto out; > >+} > >+ > >+out: > >+vfio_put_address_space(space); > >+return container; > >+} > >+ > >+bool vfio_eeh_as_ok(AddressSpace *as) > >+{ > >+VFIOContainer *container = vfio_eeh_as_container(as); > >+ > >+return (container != NULL) && vfio_eeh_container_ok(container); > >+} > >+ > >+int vfio_eeh_as_op(AddressSpace *as, uint32_t op) > >+{ > >+VFIOContainer *container = vfio_eeh_as_container(as); > >+ > >+return vfio_eeh_container_op(container, op); > > > vfio_eeh_as_ok() checks for (container != NULL) but this one does not, > should not it? Well.. you're not supposed to call as_op() if as_ok() returned false, so it should be safe. I'll add an assert to make this clearer. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [Qemu-ppc] Migrating decrementer
On Fri, Feb 26, 2016 at 12:29:51PM +, Mark Cave-Ayland wrote: > On 26/02/16 04:35, David Gibson wrote: > > >> Sign. And let me try that again, this time after caffeine: > >> > >> cpu_start/resume(): > >> cpu->tb_env->tb_offset = > >> muldiv64(qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL), > >> cpu->tb_env->tb_freq, NANOSECONDS_PER_SECOND) + > >> cpu->tb_env->tb_offset - > >> cpu_get_host_ticks(); > >> > >> This should translate to: at CPU start, calculate the difference between > >> the current guest virtual timebase and the host timebase, storing the > >> difference in cpu->tb_env->tb_offset. > > > > Ummm... I think that's right. Except that you need to make sure you > > calculate the tb_offset just once, and set the same value to all guest > > CPUs. Otherwise the guest TBs may be slightly out of sync with each > > other, which is bad (the host should have already ensure that all host > > TBs are in sync with each other). > > Nods. The reason I really like this solution is because it correctly > handles both paused/live machines and simplifies the migration logic > significantly. In fact, with this solution the only information you > would need in vmstate_ppc_timebase for migration would be the current > tb_offset so the receiving host can calculate the guest timebase from > the virtual clock accordingly. > > We really should make helper routines that each Power machine type can > > use for this. Unfortunately we can't put it directly into the common > > ppc cpu migration code because of the requirement to keep the TBs > > synced across the machine. > > Effectively I believe there are 2 cases here: TCG and KVM. For TCG all > of this is a no-op since tb/decr are already derived from the virtual > clock + tb_offset already so that really just leaves KVM. > > >From what you're saying we would need 2 hooks for KVM here: one on > machine start/resume which updates tb_offset for all vCPUs and one for > vCPU resume which updates just that one particular vCPU. > > Just curious as to your comment regarding helper routines for each Power > machine type - is there any reason not to enable this globally for all > Power machine types? If tb_offset isn't supported by the guest then the > problem with not being able to handle a jump in timebase post-migration > still remains exactly the same. Well, I can't see a place to put it globally. We can't put it in the general vCPU stuff, because that would migrate each CPU's timebase independently, but we want to migrate as a system wide operation, to ensure the TBs are all synchronized in the destination guest. To do the platform wide stuff, it pretty much has to be in the machine type. > The other question of course relates to the reason this thread was > started which is will this approach still support migrating the > decrementer? My feeling is that this would still work in that we would > encode the number of ticks before the decrementer reaches its interrupt > value into vmstate_ppc_timebase, whether high or low. For TCG it is easy > to ensure that the decrementer will still fire in n ticks time > post-migration (which solves my particular use case), but I don't have a > feeling as to whether this is possible, or indeed desirable for KVM. Yes, for TCG it should be fairly straightforward. The DECR should be calculated from the timebase. We do need to check it on incoming migration though, and check when we need to refire the decrementer interrupt. For KVM we'll need to load an appropriate value into the real decrementer. We probably want to migrate a difference between the TB and the decrementer. What could get hairy here is that there are a number of different variants between ppc models on how exactly the decrementer interrupt triggers: is it edge-triggered on 1->0 transition, edge-triggered on 0->-1 transition, or level triggered on the DECR's sign bit. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH 10/12] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
On Mon, Feb 29, 2016 at 12:42:28PM +1100, Alexey Kardashevskiy wrote: > On 02/26/2016 10:32 PM, David Gibson wrote: > >Now that the regular spapr-pci-host-bridge can handle EEH, there are only > >two things that spapr-pci-vfio-host-bridge does differently: > > 1. automatically sizes its DMA window to match the host IOMMU > > 2. checks if the attached VFIO container is backed by the > >VFIO_SPAPR_TCE_IOMMU type on the host > > > >(1) is not particularly useful, since the default window used by the > >regular host bridge will work with the host IOMMU configuration on all > >current systems anyway. > > > >Plus, automatically changing guest visible configuration (such as the DMA > >window) based on host settings is generally a bad idea. > > btw why exactly is it a bad idea? So, as noted, in this case it's probably ok. But in general changing configuration based on host state will break migration, because the source and destination VMs, with apparently identical setup won't actually be the same. It also makes it more difficult for management layers to know exactly what they're constructing, which is potentially an issue even for this non-migratable device. > > > Anyway, > Reviewed-by: Alexey Kardashevskiy > > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [RFC PATCH v0 4/6] spapr: CPU hotplug support
On Fri, Feb 26, 2016 at 02:51:41PM +1100, David Gibson wrote: > On Thu, Feb 25, 2016 at 09:52:40PM +0530, Bharata B Rao wrote: > > Set up device tree entries for the hotplugged CPU core and use the > > exising EPOW event infrastructure to send CPU hotplug notification to > > the guest. > > > > Signed-off-by: Bharata B Rao > > --- > > hw/ppc/spapr.c | 136 > > - > > hw/ppc/spapr_events.c | 3 ++ > > hw/ppc/spapr_rtas.c| 24 + > > include/hw/ppc/spapr.h | 1 + > > 4 files changed, 163 insertions(+), 1 deletion(-) > > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > > index 1f0d232..780cd00 100644 > > --- a/hw/ppc/spapr.c > > +++ b/hw/ppc/spapr.c > > @@ -603,6 +603,18 @@ static void spapr_populate_cpu_dt(CPUState *cs, void > > *fdt, int offset, > > size_t page_sizes_prop_size; > > uint32_t vcpus_per_socket = smp_threads * smp_cores; > > uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)}; > > +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine()); > > +sPAPRDRConnector *drc; > > +sPAPRDRConnectorClass *drck; > > +int drc_index; > > + > > +if (smc->dr_cpu_enabled) { > > +drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index); > > +g_assert(drc); > > +drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); > > +drc_index = drck->get_index(drc); > > +_FDT((fdt_setprop_cell(fdt, offset, "ibm,my-drc-index", > > drc_index))); > > +} > > > > /* Note: we keep CI large pages off for now because a 64K capable guest > > * provisioned with large pages might otherwise try to map a qemu > > @@ -987,6 +999,16 @@ static void spapr_finalize_fdt(sPAPRMachineState > > *spapr, > > _FDT(spapr_drc_populate_dt(fdt, 0, NULL, > > SPAPR_DR_CONNECTOR_TYPE_LMB)); > > } > > > > +if (smc->dr_cpu_enabled) { > > +int offset = fdt_path_offset(fdt, "/cpus"); > > +ret = spapr_drc_populate_dt(fdt, offset, NULL, > > +SPAPR_DR_CONNECTOR_TYPE_CPU); > > +if (ret < 0) { > > +fprintf(stderr, "Couldn't set up CPU DR device tree > > properties\n"); > > +exit(1); > > +} > > +} > > + > > _FDT((fdt_pack(fdt))); > > > > if (fdt_totalsize(fdt) > FDT_MAX_SIZE) { > > @@ -1759,6 +1781,7 @@ static void ppc_spapr_init(MachineState *machine) > > char *filename; > > int spapr_cores = smp_cpus / smp_threads; > > int spapr_max_cores = max_cpus / smp_threads; > > +int smt = kvmppc_smt_threads(); > > > > msi_supported = true; > > > > @@ -1813,6 +1836,15 @@ static void ppc_spapr_init(MachineState *machine) > > spapr_validate_node_memory(machine, &error_fatal); > > } > > > > +if (smc->dr_cpu_enabled) { > > +for (i = 0; i < spapr_max_cores; i++) { > > +sPAPRDRConnector *drc = > > +spapr_dr_connector_new(OBJECT(spapr), > > + SPAPR_DR_CONNECTOR_TYPE_CPU, i * > > smt); > > +qemu_register_reset(spapr_drc_reset, drc); > > +} > > +} > > + > > /* init CPUs */ > > if (machine->cpu_model == NULL) { > > machine->cpu_model = kvm_enabled() ? "host" : "POWER7"; > > @@ -2247,6 +2279,88 @@ out: > > error_propagate(errp, local_err); > > } > > > > +static void *spapr_populate_hotplug_cpu_dt(DeviceState *dev, CPUState *cs, > > + int *fdt_offset, > > + sPAPRMachineState *spapr) > > +{ > > +PowerPCCPU *cpu = POWERPC_CPU(cs); > > +DeviceClass *dc = DEVICE_GET_CLASS(cs); > > +int id = ppc_get_vcpu_dt_id(cpu); > > +void *fdt; > > +int offset, fdt_size; > > +char *nodename; > > + > > +fdt = create_device_tree(&fdt_size); > > +nodename = g_strdup_printf("%s@%x", dc->fw_name, id); > > +offset = fdt_add_subnode(fdt, 0, nodename); > > + > > +spapr_populate_cpu_dt(cs, fdt, offset, spapr); > > +g_free(nodename); > > + > > +*fdt_offset = offset; > > +return fdt; > > +} > > + > > +static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev, > > +Error **errp) > > +{ > > +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(qdev_get_machine()); > > +sPAPRMachineState *ms = SPAPR_MACHINE(qdev_get_machine()); > > +sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev)); > > +PowerPCCPU *cpu = &core->threads[0]; > > +CPUState *cs = CPU(cpu); > > +int id = ppc_get_vcpu_dt_id(cpu); > > +sPAPRDRConnector *drc = > > +spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, id); > > +sPAPRDRConnectorClass *drck; > > +Error *local_err = NULL; > > +void *fdt = NULL; > > +int fdt_offset = 0; > > + > > +if (!smc->dr_cpu_enabled) { > > +/* > > + * This is a cold plugged CPU core but the mac
Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code
On 02/29/2016 02:45 PM, Alexey Kardashevskiy wrote: On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote: On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure() into rtas_ibm_configure_pe(). Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy Aaaand this breaks mingw32: CCppc64-softmmu/hw/ppc/spapr_pci.o /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:46:24: fatal error: linux/vfio.h: No such file or directory compilation terminated. /home/aik/p/qemu-dwg-eeh/rules.mak:57: recipe for target 'hw/ppc/spapr_pci.o' failed make[1]: *** [hw/ppc/spapr_pci.o] Error 1 make[1]: *** Waiting for unfinished jobs Makefile:186: recipe for target 'subdir-ppc64-softmmu' failed make: *** [subdir-ppc64-softmmu] Error 2 make: Leaving directory '/scratch/aik/p/qemu-dwg-eeh--ppc64_mingw32-build' --- hw/ppc/spapr_pci.c | 11 +-- hw/ppc/spapr_pci_vfio.c | 12 include/hw/pci-host/spapr.h | 1 - 3 files changed, 9 insertions(+), 15 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index d1e5222..fa633a9 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -42,6 +42,9 @@ #include "hw/ppc/spapr_drc.h" #include "sysemu/device_tree.h" +#include "hw/vfio/vfio.h" +#include + This is missing: #ifdef CONFIG_LINUX #include #endif and below where you use symbols from linux/vfio.h. My version of this rework did convert class callbacks to exported helpers and keep these helpers (plus stubs) in spapr_pci_vfio.c, with one #ifdef CONFIG_LINUX. Looked quite clean to me... /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */ #define RTAS_QUERY_FN 0 #define RTAS_CHANGE_FN 1 @@ -628,8 +631,12 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, goto param_error_exit; } -ret = spapr_phb_vfio_eeh_configure(sphb); -rtas_st(rets, 0, ret); +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE); +if (ret < 0) { +goto param_error_exit; +} + +rtas_st(rets, 0, RTAS_OUT_SUCCESS); return; param_error_exit: diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 10fa88a..4ac5736 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -221,18 +221,6 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) return RTAS_OUT_SUCCESS; } -int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb) -{ -int ret; - -ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE); -if (ret < 0) { -return RTAS_OUT_PARAM_ERROR; -} - -return RTAS_OUT_SUCCESS; -} - static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index cc1b82c..f02ef51 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -140,6 +140,5 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option); int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state); int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option); -int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb); #endif /* __HW_SPAPR_PCI_H__ */ -- Alexey
Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code
On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote: On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure() into rtas_ibm_configure_pe(). Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy Aaaand this breaks mingw32: CCppc64-softmmu/hw/ppc/spapr_pci.o /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:46:24: fatal error: linux/vfio.h: No such file or directory compilation terminated. /home/aik/p/qemu-dwg-eeh/rules.mak:57: recipe for target 'hw/ppc/spapr_pci.o' failed make[1]: *** [hw/ppc/spapr_pci.o] Error 1 make[1]: *** Waiting for unfinished jobs Makefile:186: recipe for target 'subdir-ppc64-softmmu' failed make: *** [subdir-ppc64-softmmu] Error 2 make: Leaving directory '/scratch/aik/p/qemu-dwg-eeh--ppc64_mingw32-build' --- hw/ppc/spapr_pci.c | 11 +-- hw/ppc/spapr_pci_vfio.c | 12 include/hw/pci-host/spapr.h | 1 - 3 files changed, 9 insertions(+), 15 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index d1e5222..fa633a9 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -42,6 +42,9 @@ #include "hw/ppc/spapr_drc.h" #include "sysemu/device_tree.h" +#include "hw/vfio/vfio.h" +#include + /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */ #define RTAS_QUERY_FN 0 #define RTAS_CHANGE_FN 1 @@ -628,8 +631,12 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, goto param_error_exit; } -ret = spapr_phb_vfio_eeh_configure(sphb); -rtas_st(rets, 0, ret); +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE); +if (ret < 0) { +goto param_error_exit; +} + +rtas_st(rets, 0, RTAS_OUT_SUCCESS); return; param_error_exit: diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 10fa88a..4ac5736 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -221,18 +221,6 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) return RTAS_OUT_SUCCESS; } -int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb) -{ -int ret; - -ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE); -if (ret < 0) { -return RTAS_OUT_PARAM_ERROR; -} - -return RTAS_OUT_SUCCESS; -} - static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index cc1b82c..f02ef51 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -140,6 +140,5 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option); int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state); int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option); -int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb); #endif /* __HW_SPAPR_PCI_H__ */ -- Alexey
Re: [Qemu-devel] [PATCH 03/12] spapr_pci: Eliminate class callbacks
On 02/29/2016 12:43 PM, Alexey Kardashevskiy wrote: On 02/26/2016 10:31 PM, David Gibson wrote: The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the special groupid field in sPAPRPHBVFIOState. So we can simplify, removing the class specific callbacks with direct calls based on a simple spapr_phb_eeh_enabled() helper. For now we implement that in terms of a boolean in the class, but we'll continue to clean that up later. On its own this is a rather strange way of doing things, but it's a useful intermediate step to further cleanups. Reviewed-by: Alexey Kardashevskiy This actually breaks mingw32 build: LINK ppc64-softmmu/qemu-system-ppc64.exe hw/ppc/spapr_pci.o: In function `rtas_ibm_set_eeh_option': /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:471: undefined reference to `spapr_phb_vfio_eeh_set_option' hw/ppc/spapr_pci.o: In function `rtas_ibm_read_slot_reset_state2': /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:557: undefined reference to `spapr_phb_vfio_eeh_get_state' hw/ppc/spapr_pci.o: In function `rtas_ibm_set_slot_reset': /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:601: undefined reference to `spapr_phb_vfio_eeh_reset' hw/ppc/spapr_pci.o: In function `rtas_ibm_configure_pe': /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:633: undefined reference to `spapr_phb_vfio_eeh_configure' hw/ppc/spapr_pci.o: In function `spapr_phb_reset': /home/aik/p/qemu-dwg-eeh/hw/ppc/spapr_pci.c:1434: undefined reference to `spapr_phb_vfio_reset' collect2: error: ld returned 1 exit status Signed-off-by: David Gibson --- hw/ppc/spapr_pci.c | 44 ++-- hw/ppc/spapr_pci_vfio.c | 18 +++--- include/hw/pci-host/spapr.h | 13 + 3 files changed, 38 insertions(+), 37 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 9b2b546..d1e5222 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -92,6 +92,13 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, return pci_find_device(phb->bus, bus_num, devfn); } +static bool spapr_phb_eeh_available(sPAPRPHBState *sphb) +{ +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); + +return spc->eeh_available; +} + static uint32_t rtas_pci_cfgaddr(uint32_t arg) { /* This handles the encoding of extended config space addresses */ @@ -438,7 +445,6 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint32_t addr, option; uint64_t buid; int ret; @@ -456,12 +462,11 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_set_option(sphb, addr, option); +ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option); rtas_st(rets, 0, ret); return; @@ -476,7 +481,6 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; PCIDevice *pdev; uint32_t addr, option; uint64_t buid; @@ -491,8 +495,7 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } @@ -532,7 +535,6 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint64_t buid; int state, ret; @@ -546,12 +548,11 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_get_state) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_get_state(sphb, &state); +ret = spapr_phb_vfio_eeh_get_state(sphb, &state); rtas_st(rets, 0, ret); if (ret != RTAS_OUT_SUCCESS) { return; @@ -576,7 +577,6 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint32_t option; uint64_t buid; int ret; @@ -592,12 +592,11 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_reset) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_reset(sphb, option); +ret = spapr_phb_vfio_eeh_reset(sphb, option); rtas_s
Re: [Qemu-devel] [PATCH] Qemu/KVM: Remove x2apic feature from CPU model when kernel_irqchip is off
On 2016年02月27日 03:54, Eduardo Habkost wrote: > On Thu, Feb 25, 2016 at 11:15:12PM +0800, Lan Tianyu wrote: >> x2apic feature is in the kvm_default_props and automatically added to all >> CPU models when KVM is enabled. But userspace devices don't support x2apic >> which can't be enabled without the in-kernel irqchip. It will trigger >> warning of "host doesn't support requested feature: CPUID.01H:ECX.x2apic >> [bit 21]" when kernel_irqchip is off. This patch is to fix it via removing >> x2apic feature when kernel_irqchip is off. >> >> Signed-off-by: Lan Tianyu >> --- >> target-i386/cpu.c | 4 >> 1 file changed, 4 insertions(+) >> >> diff --git a/target-i386/cpu.c b/target-i386/cpu.c >> index c78f824..298fb62 100644 >> --- a/target-i386/cpu.c >> +++ b/target-i386/cpu.c >> @@ -2125,6 +2125,10 @@ static void x86_cpu_load_def(X86CPU *cpu, >> X86CPUDefinition *def, Error **errp) >> >> /* Special cases not set in the X86CPUDefinition structs: */ >> if (kvm_enabled()) { >> +if (!kvm_irqchip_in_kernel()) { >> +x86_cpu_change_kvm_default("x2apic", "off"); > > This should be NULL instead of "off". I tried "NULL" before. But some cpus modules(E,G SandyBridge, IvyBridge, haswell) already have x2apic feature in their default features of struct X86CPUDefinition, passing "NULL" is not to add x2apic feature to the cpu module and will not help to remove x2apic feature for these cpu modules. So I changed "NULL" to "off". > Otherwise, the warning will > be disabled if using "-cpu ...,+x2apic". > kvm_arch_get_supported_cpuid() always returns no x2apic support when kernel_irqchip is off and so it still triggers warning with "-cpu ...,+x2apic". #qemu-system-x86_64 -cpu qemu64,+x2apic -machine kernel-irqchip=off warning: TCG doesn't support requested feature: CPUID.01H:ECX.x2apic [bit 21] >> +} >> + > > This function runs multiple times (once for each VCPU being > created). Should be harmless, though. Eventually we could move > the whole kvm_default_props logic outside cpu.c, to a KVM-x86 > accel class or to common PC initialization code. Good suggestion and we can try that later. > > >> x86_cpu_apply_props(cpu, kvm_default_props); >> } >> >> -- >> 1.9.3 >> > -- Best regards Tianyu Lan
[Qemu-devel] [PATCH 4/6] memory: Drop MemoryRegion.ram_addr
All references to mr->ram_addr are replaced by memory_region_get_ram_addr(mr) (except for a few assertions that are replaced with mr->ram_block). Signed-off-by: Fam Zheng --- cputlb.c | 4 +-- exec.c| 3 ++- hw/misc/ivshmem.c | 9 --- include/exec/memory.h | 1 - kvm-all.c | 3 ++- memory.c | 71 --- 6 files changed, 39 insertions(+), 52 deletions(-) diff --git a/cputlb.c b/cputlb.c index 3973030..2f7a166 100644 --- a/cputlb.c +++ b/cputlb.c @@ -416,8 +416,8 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr, /* Write access calls the I/O callback. */ te->addr_write = address | TLB_MMIO; } else if (memory_region_is_ram(section->mr) - && cpu_physical_memory_is_clean(section->mr->ram_addr - + xlat)) { + && cpu_physical_memory_is_clean( +memory_region_get_ram_addr(section->mr) + xlat)) { te->addr_write = address | TLB_NOTDIRTY; } else { te->addr_write = address; diff --git a/exec.c b/exec.c index 83e3f7d..6ed4203 100644 --- a/exec.c +++ b/exec.c @@ -2699,7 +2699,8 @@ MemTxResult address_space_read_continue(AddressSpace *as, hwaddr addr, } } else { /* RAM case */ -ptr = qemu_get_ram_ptr(mr->ram_block, mr->ram_addr + addr1); +ptr = qemu_get_ram_ptr(mr->ram_block, + memory_region_get_ram_addr(mr) + addr1); memcpy(buf, ptr, l); } diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c index 48b7a34..1838bc8 100644 --- a/hw/misc/ivshmem.c +++ b/hw/misc/ivshmem.c @@ -400,7 +400,7 @@ static int create_shared_memory_BAR(IVShmemState *s, int fd, uint8_t attr, memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s), "ivshmem.bar2", s->ivshmem_size, ptr); -qemu_set_ram_fd(s->ivshmem.ram_addr, fd); +qemu_set_ram_fd(memory_region_get_ram_addr(&s->ivshmem), fd); vmstate_register_ram(&s->ivshmem, DEVICE(s)); memory_region_add_subregion(&s->bar, 0, &s->ivshmem); @@ -661,7 +661,8 @@ static void ivshmem_read(void *opaque, const uint8_t *buf, int size) } memory_region_init_ram_ptr(&s->ivshmem, OBJECT(s), "ivshmem.bar2", s->ivshmem_size, map_ptr); -qemu_set_ram_fd(s->ivshmem.ram_addr, incoming_fd); +qemu_set_ram_fd(memory_region_get_ram_addr(&s->ivshmem), +incoming_fd); vmstate_register_ram(&s->ivshmem, DEVICE(s)); IVSHMEM_DPRINTF("guest h/w addr = %p, size = %" PRIu64 "\n", @@ -996,8 +997,10 @@ static void pci_ivshmem_exit(PCIDevice *dev) strerror(errno)); } -if ((fd = qemu_get_ram_fd(s->ivshmem.ram_addr)) != -1) +fd = qemu_get_ram_fd(memory_region_get_ram_addr(&s->ivshmem)); +if (fd != -1) { close(fd); +} } vmstate_unregister_ram(&s->ivshmem, DEVICE(dev)); diff --git a/include/exec/memory.h b/include/exec/memory.h index 810d2c0..2de7898 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -169,7 +169,6 @@ struct MemoryRegion { bool flush_coalesced_mmio; bool global_locking; uint8_t dirty_log_mask; -ram_addr_t ram_addr; RAMBlock *ram_block; Object *owner; const MemoryRegionIOMMUOps *iommu_ops; diff --git a/kvm-all.c b/kvm-all.c index a65e73f..161200e 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -366,7 +366,8 @@ static void kvm_log_stop(MemoryListener *listener, static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section, unsigned long *bitmap) { -ram_addr_t start = section->offset_within_region + section->mr->ram_addr; +ram_addr_t start = section->offset_within_region + + memory_region_get_ram_addr(section->mr); ram_addr_t pages = int128_get64(section->size) / getpagesize(); cpu_physical_memory_set_dirty_lebitmap(bitmap, start, pages); diff --git a/memory.c b/memory.c index 769825e..b221f3c 100644 --- a/memory.c +++ b/memory.c @@ -858,12 +858,12 @@ static void memory_region_destructor_none(MemoryRegion *mr) static void memory_region_destructor_ram(MemoryRegion *mr) { -qemu_ram_free(mr->ram_addr); +qemu_ram_free(memory_region_get_ram_addr(mr)); } static void memory_region_destructor_rom_device(MemoryRegion *mr) { -qemu_ram_free(mr->ram_addr & TARGET_PAGE_MASK); +qemu_ram_free(memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK); } static bool memory_region_need_escape(char c) @@ -994,7 +994,6 @@ static void memory_region_initfn(Object *obj) ObjectProperty *op; mr->ops = &unassigned_mem_ops; -mr->ram_addr = RAM_ADDR_INVALID; mr->e
[Qemu-devel] [PATCH 3/6] memory: Implement memory_region_get_ram_addr with mr->ram_block
Signed-off-by: Fam Zheng --- include/exec/memory.h | 8 +--- memory.c | 5 + 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index d5284c2..810d2c0 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -978,14 +978,8 @@ void memory_region_add_subregion_overlap(MemoryRegion *mr, /** * memory_region_get_ram_addr: Get the ram address associated with a memory * region - * - * DO NOT USE THIS FUNCTION. This is a temporary workaround while the Xen - * code is being reworked. */ -static inline ram_addr_t memory_region_get_ram_addr(MemoryRegion *mr) -{ -return mr->ram_addr; -} +ram_addr_t memory_region_get_ram_addr(MemoryRegion *mr); uint64_t memory_region_get_alignment(const MemoryRegion *mr); /** diff --git a/memory.c b/memory.c index fe70075..769825e 100644 --- a/memory.c +++ b/memory.c @@ -1596,6 +1596,11 @@ void *memory_region_get_ram_ptr(MemoryRegion *mr) return ptr + offset; } +ram_addr_t memory_region_get_ram_addr(MemoryRegion *mr) +{ +return mr->ram_block->offset; +} + void memory_region_ram_resize(MemoryRegion *mr, ram_addr_t newsize, Error **errp) { assert(mr->ram_addr != RAM_ADDR_INVALID); -- 2.4.3
[Qemu-devel] [PATCH 5/6] exec: Pass RAMBlock pointer to qemu_ram_free
The only caller now knows exactly which RAMBlock to free, so it's not necessary to do the lookup. Signed-off-by: Fam Zheng --- exec.c | 21 +++-- include/exec/ram_addr.h | 2 +- memory.c| 4 ++-- 3 files changed, 10 insertions(+), 17 deletions(-) diff --git a/exec.c b/exec.c index 6ed4203..ad8b826 100644 --- a/exec.c +++ b/exec.c @@ -1751,22 +1751,15 @@ static void reclaim_ramblock(RAMBlock *block) g_free(block); } -void qemu_ram_free(ram_addr_t addr) +void qemu_ram_free(RAMBlock *block) { -RAMBlock *block; - qemu_mutex_lock_ramlist(); -QLIST_FOREACH_RCU(block, &ram_list.blocks, next) { -if (addr == block->offset) { -QLIST_REMOVE_RCU(block, next); -ram_list.mru_block = NULL; -/* Write list before version */ -smp_wmb(); -ram_list.version++; -call_rcu(block, reclaim_ramblock, rcu); -break; -} -} +QLIST_REMOVE_RCU(block, next); +ram_list.mru_block = NULL; +/* Write list before version */ +smp_wmb(); +ram_list.version++; +call_rcu(block, reclaim_ramblock, rcu); qemu_mutex_unlock_ramlist(); } diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 865e19b..5adf7a4 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -108,7 +108,7 @@ RAMBlock *qemu_ram_alloc_resizeable(ram_addr_t size, ram_addr_t max_size, int qemu_get_ram_fd(ram_addr_t addr); void qemu_set_ram_fd(ram_addr_t addr, int fd); void *qemu_get_ram_block_host_ptr(ram_addr_t addr); -void qemu_ram_free(ram_addr_t addr); +void qemu_ram_free(RAMBlock *block); int qemu_ram_resize(ram_addr_t base, ram_addr_t newsize, Error **errp); diff --git a/memory.c b/memory.c index b221f3c..32d2912 100644 --- a/memory.c +++ b/memory.c @@ -858,12 +858,12 @@ static void memory_region_destructor_none(MemoryRegion *mr) static void memory_region_destructor_ram(MemoryRegion *mr) { -qemu_ram_free(memory_region_get_ram_addr(mr)); +qemu_ram_free(mr->ram_block); } static void memory_region_destructor_rom_device(MemoryRegion *mr) { -qemu_ram_free(memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK); +qemu_ram_free(mr->ram_block); } static bool memory_region_need_escape(char c) -- 2.4.3
[Qemu-devel] [PATCH 6/6] exec: Introduce AddressSpaceDispatch.mru_section
Under heavy workloads the lookup will likely end up with the same MemoryRegionSection from last time. Using a pointer to cache the result, like ram_list.mru_block, significantly reduce computation cost of address_space_translate. During address space topology update, as->dispatch will be reallocated so the pointer is invalidated automatically. Perf reports a visible drop on the cpu usage. Before: + 2.06% phys_page_find + 0.95% address_space_translate_internal + 0.80% address_space_translate After: + 0.78% address_space_translate + 0.77% address_space_translate_internal + 0.69% address_space_lookup_region Signed-off-by: Fam Zheng --- exec.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/exec.c b/exec.c index ad8b826..3232861 100644 --- a/exec.c +++ b/exec.c @@ -135,6 +135,7 @@ typedef struct PhysPageMap { struct AddressSpaceDispatch { struct rcu_head rcu; +MemoryRegionSection *mru_section; /* This is a multi-level map on the physical address space. * The bottom level has pointers to MemoryRegionSections. */ @@ -342,14 +343,26 @@ static MemoryRegionSection *address_space_lookup_region(AddressSpaceDispatch *d, hwaddr addr, bool resolve_subpage) { -MemoryRegionSection *section; +MemoryRegionSection *section = atomic_read(&d->mru_section); subpage_t *subpage; +bool update; -section = phys_page_find(d->phys_map, addr, d->map.nodes, d->map.sections); +if (section && section != &d->map.sections[PHYS_SECTION_UNASSIGNED] && +range_covers_byte(section->offset_within_address_space, + section->size.lo, addr)) { +update = false; +} else { +section = phys_page_find(d->phys_map, addr, d->map.nodes, + d->map.sections); +update = true; +} if (resolve_subpage && section->mr->subpage) { subpage = container_of(section->mr, subpage_t, iomem); section = &d->map.sections[subpage->sub_section[SUBPAGE_IDX(addr)]]; } +if (update) { +atomic_set(&d->mru_section, section); +} return section; } -- 2.4.3
[Qemu-devel] [PATCH 2/6] memory: Move assignment to ram_block to memory_region_init_*
We don't force "const" qualifiers with pointers in QEMU, but it's still good to keep a clean function interface. Assigning to mr->ram_block is in this sense ugly - one initializer mutating its owning object's state. Move it to memory_region_init_*, where mr->ram_addr is assigned. Signed-off-by: Fam Zheng --- exec.c | 1 - memory.c | 5 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/exec.c b/exec.c index 2b14b79..83e3f7d 100644 --- a/exec.c +++ b/exec.c @@ -1711,7 +1711,6 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size, error_propagate(errp, local_err); return NULL; } -mr->ram_block = new_block; return new_block; } diff --git a/memory.c b/memory.c index ae13ba9..fe70075 100644 --- a/memory.c +++ b/memory.c @@ -1233,6 +1233,7 @@ void memory_region_init_ram(MemoryRegion *mr, mr->terminates = true; mr->destructor = memory_region_destructor_ram; ram_block = qemu_ram_alloc(size, mr, errp); +mr->ram_block = ram_block; mr->ram_addr = ram_block->offset; mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0; } @@ -1254,6 +1255,7 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr, mr->terminates = true; mr->destructor = memory_region_destructor_ram; ram_block = qemu_ram_alloc_resizeable(size, max_size, resized, mr, errp); +mr->ram_block = ram_block; mr->ram_addr = ram_block->offset; mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0; } @@ -1274,6 +1276,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr, mr->terminates = true; mr->destructor = memory_region_destructor_ram; ram_block = qemu_ram_alloc_from_file(size, mr, share, path, errp); +mr->ram_block = ram_block; mr->ram_addr = ram_block->offset; mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0; } @@ -1296,6 +1299,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr, /* qemu_ram_alloc_from_ptr cannot fail with ptr != NULL. */ assert(ptr != NULL); ram_block = qemu_ram_alloc_from_ptr(size, ptr, mr, &error_fatal); +mr->ram_block = ram_block; mr->ram_addr = ram_block->offset; } @@ -1333,6 +1337,7 @@ void memory_region_init_rom_device(MemoryRegion *mr, mr->rom_device = true; mr->destructor = memory_region_destructor_rom_device; ram_block = qemu_ram_alloc(size, mr, errp); +mr->ram_block = ram_block; mr->ram_addr = ram_block->offset; } -- 2.4.3
[Qemu-devel] [PATCH 0/6] memory: Clean up MemoryRegion.ram_addr and optimize address_space_translate
The first four patches drop ram_addr from MemoryRegion on top of Gonglei's optimization. The next patch simplifies qemu_ram_free a bit by passing the RAMBlock pointer. The last patch speeds up address_space_translate with a cache pointer inside the AddressSpaceDispatch. Fam Zheng (6): exec: Return RAMBlock pointer from allocating functions memory: Move assignment to ram_block to memory_region_init_* memory: Implement memory_region_get_ram_addr with mr->ram_block memory: Drop MemoryRegion.ram_addr exec: Pass RAMBlock pointer to qemu_ram_free exec: Introduce AddressSpaceDispatch.mru_section cputlb.c| 4 +-- exec.c | 93 - hw/misc/ivshmem.c | 9 +++-- include/exec/memory.h | 9 + include/exec/ram_addr.h | 24 ++--- kvm-all.c | 3 +- memory.c| 56 - 7 files changed, 101 insertions(+), 97 deletions(-) -- 2.4.3
[Qemu-devel] [PATCH 1/6] exec: Return RAMBlock pointer from allocating functions
Previously we return RAMBlock.offset; now return the pointer to the whole structure. ram_block_add returns void now, error is completely passed with errp. Signed-off-by: Fam Zheng --- exec.c | 51 + include/exec/ram_addr.h | 22 ++--- memory.c| 25 +++- 3 files changed, 53 insertions(+), 45 deletions(-) diff --git a/exec.c b/exec.c index c62c439..2b14b79 100644 --- a/exec.c +++ b/exec.c @@ -1554,7 +1554,7 @@ static void dirty_memory_extend(ram_addr_t old_ram_size, } } -static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp) +static void ram_block_add(RAMBlock *new_block, Error **errp) { RAMBlock *block; RAMBlock *last_block = NULL; @@ -1573,7 +1573,6 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp) if (err) { error_propagate(errp, err); qemu_mutex_unlock_ramlist(); -return -1; } } else { new_block->host = phys_mem_alloc(new_block->max_length, @@ -1583,7 +1582,6 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp) "cannot set up guest memory '%s'", memory_region_name(new_block->mr)); qemu_mutex_unlock_ramlist(); -return -1; } memory_try_enable_merging(new_block->host, new_block->max_length); } @@ -1631,22 +1629,19 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp) kvm_setup_guest_memory(new_block->host, new_block->max_length); } } - -return new_block->offset; } #ifdef __linux__ -ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr, -bool share, const char *mem_path, -Error **errp) +RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr, + bool share, const char *mem_path, + Error **errp) { RAMBlock *new_block; -ram_addr_t addr; Error *local_err = NULL; if (xen_enabled()) { error_setg(errp, "-mem-path not supported with Xen"); -return -1; +return NULL; } if (phys_mem_alloc != qemu_anon_ram_alloc) { @@ -1657,7 +1652,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr, */ error_setg(errp, "-mem-path not supported with this accelerator"); -return -1; +return NULL; } size = HOST_PAGE_ALIGN(size); @@ -1670,29 +1665,28 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr, mem_path, errp); if (!new_block->host) { g_free(new_block); -return -1; +return NULL; } -addr = ram_block_add(new_block, &local_err); +ram_block_add(new_block, &local_err); if (local_err) { g_free(new_block); error_propagate(errp, local_err); -return -1; +return NULL; } -return addr; +return new_block; } #endif static -ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size, - void (*resized)(const char*, - uint64_t length, - void *host), - void *host, bool resizeable, - MemoryRegion *mr, Error **errp) +RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size, + void (*resized)(const char*, + uint64_t length, + void *host), + void *host, bool resizeable, + MemoryRegion *mr, Error **errp) { RAMBlock *new_block; -ram_addr_t addr; Error *local_err = NULL; size = HOST_PAGE_ALIGN(size); @@ -1711,29 +1705,28 @@ ram_addr_t qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size, if (resizeable) { new_block->flags |= RAM_RESIZEABLE; } -addr = ram_block_add(new_block, &local_err); +ram_block_add(new_block, &local_err); if (local_err) { g_free(new_block); error_propagate(errp, local_err); -return -1; +return NULL; } - mr->ram_block = new_block; -return addr; +return new_block; } -ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, +RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, MemoryRegion *mr, Error **errp) { return qemu_ram_alloc_internal(size, size, NULL, host, false, mr, errp); } -ram_addr_t qemu_ram_a
Re: [Qemu-devel] [Qemu-arm] [PATCH v2 1/3] hw/timer: Add ASPEED AST2400 timer device model
On Fri, 2016-02-26 at 10:20 +, Peter Maydell wrote: > On 26 February 2016 at 03:14, Andrew Jeffery wrote: > > > > Hi Peter, > > > > On Thu, 2016-02-25 at 16:11 +, Peter Maydell wrote: > > > > > > On 16 February 2016 at 11:34, Andrew Jeffery wrote: > > > > > > > > Implement basic AST2400 timer functionality: Timers can be configured, > > > > enabled, reset and disabled. > > > > > > > > A number of hardware features are not implemented: > > > > > > > > * Timer Overflow interrupts > > > > * Clock value matching > > > > * Pulse generation > > > > > > > > The implementation is enough to boot the Linux kernel configured with > > > > aspeed_defconfig. > > > Thanks; this mostly looks in reasonable shape; I have some comments below. > > > > > > Do we have a datasheet for this chip ? > > Unfortunately I don't know of a publicly available datasheet. What's > > the best way to proceed in this case? > We have devices in the tree that are either based on non-public datasheets > or occasionally reverse engineered from Linux kernel drivers. That's OK, > but it's nice to be clear in a comment at the top what the source is, > so people maintaining it later know how much to trust the current code > and (if possible) where to look for clarification. No worries, I'll add notes in the header comments for each of the new files. > > > > > > > > > All source files need to #include "qemu/osdep.h" as their first > > > include. That then means you don't need to include assert.h or > > > stdio.h yourself. > > > > > > What do we need from qemu/main-loop.h? > > I'm using it for qemu_bh_new() which is required by ptimer, who registers > > the aspeed_timer_tick() callback into the main loop timer handling. > OK, no problem. > > > > > > > > > > > > > > +static void aspeed_timer_irq_update(AspeedTimer *t) > > > > +{ > > > > +qemu_set_irq(t->irq, t->enabled); > > > Surely the timer doesn't assert its IRQ line all > > > the time it's enabled? This doesn't look like the right condition. > > So I think this is correct despite how it looks. There's some cruft > > from modelling the implementation off of another timer that's probably > > obscuring things, which should be fixed. aspeed_timer_irq_update() > > is only called from aspeed_timer_tick(), so I'll just merge the two. > > Then by renaming aspeed_timer_tick() to aspeed_timer_expire() as > > mentioned above, this won't look so strange? I've read through the > > timer handling code (the processing loop in timerlist_run_timers()) > > and my understanding is it has the behaviour we want - callback on > > expiry, not on ticks - which is not how it reads as above. > Usually functions in QEMU called thingy_irq_update() are the ones > that do "look at current state of device and assert IRQ as > necessary"; often this is "mask irq raw state against some > irq-masking bit". Merging this into the timer expire function will > probably help. (Is there no register bit that the guest can query > that indicates "timer expired" or "raw interrupt state" ?) It doesn't appear so - overflow interrupts can be enabled or disabled, but it doesn't appear that there's a way to poll whether the timer has expired. It doesn't look like it can be inferred either as the counter status register doesn't stick at zero, rather resets back to the reload register value (and my interpretation is it continues to count down if the enabled bit remains set). Further, the interrupt on overflow bit doesn't appear to be set by the kernel driver, but the two match registers are initialised to zero which ensures an interrupt will be fired. Non-zero match registers aren't currently supported by the device model. I'll try to make this all clearer in the code. > > > > > > > > > You should implement a VMState struct for device migration, > > > and wire it up here via dc->vmsd. > > I'll look into it. I started experimenting with a VMState struct > > early on in the implementation but threw it away as it wasn't my > > primary focus at the time. > We insist on vmstate structs for all new devices, because > they're fairly easy to implement, and if the original > submitter doesn't implement one then the device becomes > a landmine for any user trying migration or vmstate snapshots, > because it will silently misbehave. Yeah, no worries, v3 of the series will have VMState structs for all devices. Cheers, Andrew > > thanks > -- PMM signature.asc Description: This is a digitally signed message part
[Qemu-devel] [PATCH v2 2/2] filter-buffer: Add status_changed callback processing
While the status of filter-buffer changing from 'on' to 'off', it need to release all the buffered packets, and delete the related timer, while switch from 'off' to 'on', it need to resume the release packets timer. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v2: - New patch --- net/filter-buffer.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/net/filter-buffer.c b/net/filter-buffer.c index 12ad2e3..ed3f19e 100644 --- a/net/filter-buffer.c +++ b/net/filter-buffer.c @@ -124,6 +124,24 @@ static void filter_buffer_setup(NetFilterState *nf, Error **errp) } } +static void filter_buffer_status_changed(NetFilterState *nf, Error **errp) +{ +FilterBufferState *s = FILTER_BUFFER(nf); + +if (!strcmp(nf->status, "off")) { +if (s->interval) { +timer_del(&s->release_timer); +} +filter_buffer_flush(nf); +} else { +if (s->interval) { +timer_init_us(&s->release_timer, QEMU_CLOCK_VIRTUAL, +filter_buffer_release_timer, nf); +timer_mod(&s->release_timer, +qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + s->interval); +} +} +} static void filter_buffer_class_init(ObjectClass *oc, void *data) { NetFilterClass *nfc = NETFILTER_CLASS(oc); @@ -131,6 +149,7 @@ static void filter_buffer_class_init(ObjectClass *oc, void *data) nfc->setup = filter_buffer_setup; nfc->cleanup = filter_buffer_cleanup; nfc->receive_iov = filter_buffer_receive_iov; +nfc->status_changed = filter_buffer_status_changed; } static void filter_buffer_get_interval(Object *obj, Visitor *v, -- 1.8.3.1
[Qemu-devel] [PATCH v2 0/2] Introduce 'status' property for netfilter
This is picked from COLO series, which is to realize the new 'status' property for filter. With this property, users can control if the filter is enabled or disabled. zhanghailiang (2): filter: Add 'status' property for filter object filter-buffer: Add status_changed callback processing include/net/filter.h | 4 net/filter-buffer.c | 19 +++ net/filter.c | 42 ++ qemu-options.hx | 4 +++- 4 files changed, 68 insertions(+), 1 deletion(-) -- 1.8.3.1
[Qemu-devel] [PATCH v2 1/2] filter: Add 'status' property for filter object
With this property, users can control if this filter is 'on' or 'off'. The default behavior for filter is 'on'. For some types of filters, they may need to react to status changing, So here, we introduced status changing callback/notifier for filter class. We will skip the disabled ('off') filter when delivering packets in net layer. Signed-off-by: zhanghailiang Cc: Jason Wang Cc: Yang Hongyang --- v2: - Split the processing of buffer-filter into a new patch (Jason) - Use 'status' instead of 'enabled' to store the filter state (Jason) - Rename FilterDisable() callback to FilterStatusChanged(Jason) --- include/net/filter.h | 4 net/filter.c | 42 ++ qemu-options.hx | 4 +++- 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/include/net/filter.h b/include/net/filter.h index 5639976..ebef0dc 100644 --- a/include/net/filter.h +++ b/include/net/filter.h @@ -36,12 +36,15 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc, int iovcnt, NetPacketSent *sent_cb); +typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp); + typedef struct NetFilterClass { ObjectClass parent_class; /* optional */ FilterSetup *setup; FilterCleanup *cleanup; +FilterStatusChanged *status_changed; /* mandatory */ FilterReceiveIOV *receive_iov; } NetFilterClass; @@ -55,6 +58,7 @@ struct NetFilterState { char *netdev_id; NetClientState *netdev; NetFilterDirection direction; +char *status; QTAILQ_ENTRY(NetFilterState) next; }; diff --git a/net/filter.c b/net/filter.c index d2a514e..68b7bba 100644 --- a/net/filter.c +++ b/net/filter.c @@ -17,6 +17,11 @@ #include "qom/object_interfaces.h" #include "qemu/iov.h" +static inline bool qemu_can_skip_netfilter(NetFilterState *nf) +{ +return !!strcmp(nf->status, "on"); +} + ssize_t qemu_netfilter_receive(NetFilterState *nf, NetFilterDirection direction, NetClientState *sender, @@ -25,6 +30,9 @@ ssize_t qemu_netfilter_receive(NetFilterState *nf, int iovcnt, NetPacketSent *sent_cb) { +if (qemu_can_skip_netfilter(nf)) { +return 0; +} if (nf->direction == direction || nf->direction == NET_FILTER_DIRECTION_ALL) { return NETFILTER_GET_CLASS(OBJECT(nf))->receive_iov( @@ -134,8 +142,39 @@ static void netfilter_set_direction(Object *obj, int direction, Error **errp) nf->direction = direction; } +static char *netfilter_get_status(Object *obj, Error **errp) +{ +NetFilterState *nf = NETFILTER(obj); + +return g_strdup(nf->status); +} + +static void netfilter_set_status(Object *obj, const char *str, Error **errp) +{ +NetFilterState *nf = NETFILTER(obj); +NetFilterClass *nfc = NETFILTER_GET_CLASS(obj); + +if (!strcmp(nf->status, str)) { +return; +} +if (strcmp(str, "on") && strcmp(str, "off")) { +error_setg(errp, "Invalid value for netfilter status, " + "should be 'on' or 'off'"); +return; +} +g_free(nf->status); +nf->status = g_strdup(str); +if (nfc->status_changed) { +nfc->status_changed(nf, errp); +} +} + static void netfilter_init(Object *obj) { +NetFilterState *nf = NETFILTER(obj); + +nf->status = g_strdup("on"); + object_property_add_str(obj, "netdev", netfilter_get_netdev_id, netfilter_set_netdev_id, NULL); @@ -143,6 +182,9 @@ static void netfilter_init(Object *obj) NetFilterDirection_lookup, netfilter_get_direction, netfilter_set_direction, NULL); +object_property_add_str(obj, "status", +netfilter_get_status, netfilter_set_status, +NULL); } static void netfilter_complete(UserCreatable *uc, Error **errp) diff --git a/qemu-options.hx b/qemu-options.hx index 599db94..5291347 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -3787,11 +3787,13 @@ version by providing the @var{passwordid} parameter. This provides the ID of a previously created @code{secret} object containing the password for decryption. -@item -object filter-buffer,id=@var{id},netdev=@var{netdevid},interval=@var{t}[,queue=@var{all|rx|tx}] +@item -object filter-buffer,id=@var{id},netdev=@var{netdevid},interval=@var{t}[,queue=@var{all|rx|tx}][,status=@var{on|off}] Interval @var{t} can't be 0, this filter batches the packet delivery: all packets arriving in a given interval on netdev @var{netdevid} are delayed until the end of the interval. Interval is in microseconds. +@option{status} is optional that indicate whether the netfilter is +on (enabled) or off (disabled), the default status for netf
Re: [Qemu-devel] [PATCH 09/12] spapr_pci: Allow EEH on spapr-pci-host-bridge
On 02/26/2016 10:32 PM, David Gibson wrote: Now that the EEH code is independent of the special spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI host bridges instead. We do this by changing spapr_phb_eeh_available() to be based on the vfio_eeh_as_ok() call instead of the host bridge class. Because the value of vfio_eeh_as_ok() can change with devices being hotplugged or unplugged, this can potentially lead to some strange edge cases where the guest starts using EEH, then it starts failing because of a change in status. However, it's not really any worse than the current situation. Cases that would have worked previously will still work (i.e. VFIO devices from at most one VFIO IOMMU group per vPHB), it's just that it's no longer necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 5 + hw/ppc/spapr_pci_vfio.c | 1 - include/hw/pci-host/spapr.h | 1 - 3 files changed, 1 insertion(+), 6 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index be18bf6..336ffd2 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -97,9 +97,7 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, static bool spapr_phb_eeh_available(sPAPRPHBState *sphb) { -sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); - -return spc->eeh_available; +return vfio_eeh_as_ok(&sphb->iommu_as); } static uint32_t rtas_pci_cfgaddr(uint32_t arg) @@ -1680,7 +1678,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); dc->cannot_instantiate_with_device_add_yet = false; spc->finish_realize = spapr_phb_finish_realize; -spc->eeh_available = false; hp->plug = spapr_phb_hot_plug_child; hp->unplug = spapr_phb_hot_unplug_child; } diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 0a0f31a..ecbcc5c 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -80,7 +80,6 @@ static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) dc->props = spapr_phb_vfio_properties; spc->finish_realize = spapr_phb_vfio_finish_realize; -spc->eeh_available = true; } static const TypeInfo spapr_phb_vfio_info = { diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index d32750e..9313209 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -49,7 +49,6 @@ struct sPAPRPHBClass { PCIHostBridgeClass parent_class; void (*finish_realize)(sPAPRPHBState *sphb, Error **errp); -bool eeh_available; }; typedef struct spapr_pci_msi { -- Alexey
Re: [Qemu-devel] [PATCH 11/12] spapr_pci: Remove finish_realize hook
On 02/26/2016 10:32 PM, David Gibson wrote: Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is only one implementation of the finish_realize hook in sPAPRPHBClass. So, we can fold that implementation into its (single) caller, and remove the hook. That's the last thing left in sPAPRPHBClass, so that can go away as well. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 25 + include/hw/pci-host/spapr.h | 12 2 files changed, 5 insertions(+), 32 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 336ffd2..2faaa03 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -1335,11 +1335,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) SysBusDevice *s = SYS_BUS_DEVICE(dev); sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s); PCIHostState *phb = PCI_HOST_BRIDGE(s); -sPAPRPHBClass *info = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(s); char *namebuf; int i; PCIBus *bus; uint64_t msi_window_size = 4096; +sPAPRTCETable *tcet; +uint32_t nb_table; if (sphb->index != (uint32_t)-1) { hwaddr windows_base; @@ -1489,33 +1490,20 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) } } -if (!info->finish_realize) { -error_setg(errp, "finish_realize not defined"); -return; -} - -info->finish_realize(sphb, errp); - -sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); -} - -static void spapr_phb_finish_realize(sPAPRPHBState *sphb, Error **errp) -{ -sPAPRTCETable *tcet; -uint32_t nb_table; - nb_table = sphb->dma_win_size >> SPAPR_TCE_PAGE_SHIFT; tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn, 0, SPAPR_TCE_PAGE_SHIFT, nb_table, false); if (!tcet) { error_setg(errp, "Unable to create TCE table for %s", sphb->dtbusname); -return ; +return; } /* Register default 32bit DMA window */ memory_region_add_subregion(&sphb->iommu_root, sphb->dma_win_addr, spapr_tce_get_iommu(tcet)); + +sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); } static int spapr_phb_children_reset(Object *child, void *opaque) @@ -1667,7 +1655,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) { PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass); DeviceClass *dc = DEVICE_CLASS(klass); -sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_CLASS(klass); HotplugHandlerClass *hp = HOTPLUG_HANDLER_CLASS(klass); hc->root_bus_path = spapr_phb_root_bus_path; @@ -1677,7 +1664,6 @@ static void spapr_phb_class_init(ObjectClass *klass, void *data) dc->vmsd = &vmstate_spapr_pci; set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); dc->cannot_instantiate_with_device_add_yet = false; -spc->finish_realize = spapr_phb_finish_realize; hp->plug = spapr_phb_hot_plug_child; hp->unplug = spapr_phb_hot_unplug_child; } @@ -1687,7 +1673,6 @@ static const TypeInfo spapr_phb_info = { .parent= TYPE_PCI_HOST_BRIDGE, .instance_size = sizeof(sPAPRPHBState), .class_init= spapr_phb_class_init, -.class_size= sizeof(sPAPRPHBClass), .interfaces= (InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, { } diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 7110222..a603feb 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -32,20 +32,8 @@ #define SPAPR_PCI_HOST_BRIDGE(obj) \ OBJECT_CHECK(sPAPRPHBState, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE) -#define SPAPR_PCI_HOST_BRIDGE_CLASS(klass) \ - OBJECT_CLASS_CHECK(sPAPRPHBClass, (klass), TYPE_SPAPR_PCI_HOST_BRIDGE) -#define SPAPR_PCI_HOST_BRIDGE_GET_CLASS(obj) \ - OBJECT_GET_CLASS(sPAPRPHBClass, (obj), TYPE_SPAPR_PCI_HOST_BRIDGE) - -typedef struct sPAPRPHBClass sPAPRPHBClass; typedef struct sPAPRPHBState sPAPRPHBState; -struct sPAPRPHBClass { -PCIHostBridgeClass parent_class; - -void (*finish_realize)(sPAPRPHBState *sphb, Error **errp); -}; - typedef struct spapr_pci_msi { uint32_t first_irq; uint32_t num; -- Alexey
Re: [Qemu-devel] [PATCH 08/12] spapr_pci: Fold spapr_phb_vfio_reset() into spapr_pci code
On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_vfio_reset() into spapr_phb_reset(). Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 13 - hw/ppc/spapr_pci_vfio.c | 16 2 files changed, 12 insertions(+), 17 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 26d08ad..be18bf6 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -1531,13 +1531,24 @@ static int spapr_phb_children_reset(Object *child, void *opaque) return 0; } +static void spapr_phb_eeh_reenable(sPAPRPHBState *sphb) +{ +vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE); +} + static void spapr_phb_reset(DeviceState *qdev) { /* Reset the IOMMU state */ object_child_foreach(OBJECT(qdev), spapr_phb_children_reset, NULL); if (spapr_phb_eeh_available(SPAPR_PCI_HOST_BRIDGE(qdev))) { -spapr_phb_vfio_reset(qdev); +/* + * The PE might be in frozen state. To reenable the EEH + * functionality on it will clean the frozen state, which + * ensures that the contained PCI devices will work properly + * after reboot. + */ +spapr_phb_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); } } diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index cccd444..0a0f31a 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -73,22 +73,6 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) spapr_tce_get_iommu(tcet)); } -static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb) -{ -vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE); -} - -void spapr_phb_vfio_reset(DeviceState *qdev) -{ -/* - * The PE might be in frozen state. To reenable the EEH - * functionality on it will clean the frozen state, which - * ensures that the contained PCI devices will work properly - * after reboot. - */ -spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); -} - static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); -- Alexey
Re: [Qemu-devel] [PATCH 12/12] vfio: Eliminate vfio_container_ioctl()
On 02/26/2016 10:32 PM, David Gibson wrote: vfio_container_ioctl() was a bad interface that bypassed abstraction boundaries, had semantics that sat uneasily with its name, and was unsafe in many realistic circumstances. Now that spapr-pci-vfio-host-bridge has been folded into spapr-pci-host-bridge, there are no more users, so remove it. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/vfio/common.c | 45 - include/hw/vfio/vfio.h | 2 -- 2 files changed, 47 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index e419241..26e91ad 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -959,51 +959,6 @@ void vfio_put_base_device(VFIODevice *vbasedev) close(vbasedev->fd); } -static int vfio_container_do_ioctl(AddressSpace *as, int32_t groupid, - int req, void *param) -{ -VFIOGroup *group; -VFIOContainer *container; -int ret = -1; - -group = vfio_get_group(groupid, as); -if (!group) { -error_report("vfio: group %d not registered", groupid); -return ret; -} - -container = group->container; -if (group->container) { -ret = ioctl(container->fd, req, param); -if (ret < 0) { -error_report("vfio: failed to ioctl %d to container: ret=%d, %s", - _IOC_NR(req) - VFIO_BASE, ret, strerror(errno)); -} -} - -vfio_put_group(group); - -return ret; -} - -int vfio_container_ioctl(AddressSpace *as, int32_t groupid, - int req, void *param) -{ -/* We allow only certain ioctls to the container */ -switch (req) { -case VFIO_CHECK_EXTENSION: -case VFIO_IOMMU_SPAPR_TCE_GET_INFO: -case VFIO_EEH_PE_OP: -break; -default: -/* Return an error on unknown requests */ -error_report("vfio: unsupported ioctl %X", req); -return -1; -} - -return vfio_container_do_ioctl(as, groupid, req, param); -} - /* * Interfaces for IBM EEH (Enhanced Error Handling) */ diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h index fd3933b..7153604 100644 --- a/include/hw/vfio/vfio.h +++ b/include/hw/vfio/vfio.h @@ -3,8 +3,6 @@ #include "qemu/typedefs.h" -extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid, -int req, void *param); bool vfio_eeh_as_ok(AddressSpace *as); int vfio_eeh_as_op(AddressSpace *as, uint32_t op); -- Alexey
Re: [Qemu-devel] [PATCH 07/12] spapr_pci: Fold spapr_phb_vfio_eeh_set_option() into spapr_pci code
On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_eeh_set_option() into rtas_ibm_set_eeh_option(). Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 43 +++-- hw/ppc/spapr_pci_vfio.c | 47 - include/hw/pci-host/spapr.h | 4 3 files changed, 41 insertions(+), 53 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index eaae7e2..26d08ad 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -450,6 +450,7 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, sPAPRPHBState *sphb; uint32_t addr, option; uint64_t buid; +uint32_t op; int ret; if ((nargs != 4) || (nret != 1)) { @@ -469,8 +470,46 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, goto param_error_exit; } -ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option); -rtas_st(rets, 0, ret); +switch (option) { +case RTAS_EEH_DISABLE: +op = VFIO_EEH_PE_DISABLE; +break; +case RTAS_EEH_ENABLE: { +PCIHostState *phb; +PCIDevice *pdev; + +/* + * The EEH functionality is enabled on basis of PCI device, + * instead of PE. We need check the validity of the PCI + * device address. + */ +phb = PCI_HOST_BRIDGE(sphb); +pdev = pci_find_device(phb->bus, + (addr >> 16) & 0xFF, (addr >> 8) & 0xFF); +if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { +goto param_error_exit; +} + +op = VFIO_EEH_PE_ENABLE; +break; +} +case RTAS_EEH_THAW_IO: +op = VFIO_EEH_PE_UNFREEZE_IO; +break; +case RTAS_EEH_THAW_DMA: +op = VFIO_EEH_PE_UNFREEZE_DMA; +break; +default: +goto param_error_exit; +} + +ret = vfio_eeh_as_op(&sphb->iommu_as, op); +if (ret < 0) { +rtas_st(rets, 0, RTAS_OUT_HW_ERROR); +return; +} + +rtas_st(rets, 0, RTAS_OUT_SUCCESS); return; param_error_exit: diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index c87f2e4..cccd444 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -89,53 +89,6 @@ void spapr_phb_vfio_reset(DeviceState *qdev) spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); } -int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, - unsigned int addr, int option) -{ -uint32_t op; -int ret; - -switch (option) { -case RTAS_EEH_DISABLE: -op = VFIO_EEH_PE_DISABLE; -break; -case RTAS_EEH_ENABLE: { -PCIHostState *phb; -PCIDevice *pdev; - -/* - * The EEH functionality is enabled on basis of PCI device, - * instead of PE. We need check the validity of the PCI - * device address. - */ -phb = PCI_HOST_BRIDGE(sphb); -pdev = pci_find_device(phb->bus, - (addr >> 16) & 0xFF, (addr >> 8) & 0xFF); -if (!pdev || !object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { -return RTAS_OUT_PARAM_ERROR; -} - -op = VFIO_EEH_PE_ENABLE; -break; -} -case RTAS_EEH_THAW_IO: -op = VFIO_EEH_PE_UNFREEZE_IO; -break; -case RTAS_EEH_THAW_DMA: -op = VFIO_EEH_PE_UNFREEZE_DMA; -break; -default: -return RTAS_OUT_PARAM_ERROR; -} - -ret = vfio_eeh_as_op(&sphb->iommu_as, op); -if (ret < 0) { -return RTAS_OUT_HW_ERROR; -} - -return RTAS_OUT_SUCCESS; -} - static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 55237fc..d32750e 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -135,8 +135,4 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, uint32_t config_addr); void spapr_phb_vfio_reset(DeviceState *qdev); -/* VFIO EEH hooks */ -int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, - unsigned int addr, int option); - #endif /* __HW_SPAPR_PCI_H__ */ -- Alexey
Re: [Qemu-devel] [PATCH 05/12] spapr_pci: Fold spapr_phb_vfio_eeh_reset() into spapr_pci code
On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_reset() into rtas_ibm_set_slot_reset(). We move several functions of which it was the only caller (spapr_phb_eeh_clear_dev_msix(), spapr_phb_eeh_clear_bus_msix() and spapr_phb_eeh_pre_reset()) into spapr_pci.c along with it. Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 68 -- hw/ppc/spapr_pci_vfio.c | 72 - include/hw/pci-host/spapr.h | 1 - 3 files changed, 66 insertions(+), 75 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index fa633a9..96cdca2 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -573,6 +573,48 @@ param_error_exit: rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); } +static void spapr_phb_eeh_clear_dev_msix(PCIBus *bus, PCIDevice *pdev, + void *opaque) +{ +/* Check if the device is VFIO PCI device */ +if (!object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { +return; +} + +/* + * The MSIx table will be cleaned out by reset. We need + * disable it so that it can be reenabled properly. Also, + * the cached MSIx table should be cleared as it's not + * reflecting the contents in hardware. + */ +if (msix_enabled(pdev)) { +uint16_t flags; + +flags = pci_host_config_read_common(pdev, +pdev->msix_cap + PCI_MSIX_FLAGS, +pci_config_size(pdev), 2); +flags &= ~PCI_MSIX_FLAGS_ENABLE; +pci_host_config_write_common(pdev, + pdev->msix_cap + PCI_MSIX_FLAGS, + pci_config_size(pdev), flags, 2); +} + +msix_reset(pdev); +} + +static void spapr_phb_eeh_clear_bus_msix(PCIBus *bus, void *opaque) +{ + pci_for_each_device(bus, pci_bus_num(bus), + spapr_phb_eeh_clear_dev_msix, NULL); +} + +static void spapr_phb_eeh_pre_reset(sPAPRPHBState *sphb) +{ + PCIHostState *phb = PCI_HOST_BRIDGE(sphb); + + pci_for_each_bus(phb->bus, spapr_phb_eeh_clear_bus_msix, NULL); +} + static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, sPAPRMachineState *spapr, uint32_t token, uint32_t nargs, @@ -582,6 +624,7 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, sPAPRPHBState *sphb; uint32_t option; uint64_t buid; +uint32_t op; int ret; if ((nargs != 4) || (nret != 1)) { @@ -599,8 +642,29 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, goto param_error_exit; } -ret = spapr_phb_vfio_eeh_reset(sphb, option); -rtas_st(rets, 0, ret); +switch (option) { +case RTAS_SLOT_RESET_DEACTIVATE: +op = VFIO_EEH_PE_RESET_DEACTIVATE; +break; +case RTAS_SLOT_RESET_HOT: +spapr_phb_eeh_pre_reset(sphb); +op = VFIO_EEH_PE_RESET_HOT; +break; +case RTAS_SLOT_RESET_FUNDAMENTAL: +spapr_phb_eeh_pre_reset(sphb); +op = VFIO_EEH_PE_RESET_FUNDAMENTAL; +break; +default: +goto param_error_exit; +} + +ret = vfio_eeh_as_op(&sphb->iommu_as, op); +if (ret < 0) { +rtas_st(rets, 0, RTAS_OUT_HW_ERROR); +return; +} + +rtas_st(rets, 0, RTAS_OUT_SUCCESS); return; param_error_exit: diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 4ac5736..54075e0 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -149,78 +149,6 @@ int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state) return RTAS_OUT_SUCCESS; } -static void spapr_phb_vfio_eeh_clear_dev_msix(PCIBus *bus, - PCIDevice *pdev, - void *opaque) -{ -/* Check if the device is VFIO PCI device */ -if (!object_dynamic_cast(OBJECT(pdev), "vfio-pci")) { -return; -} - -/* - * The MSIx table will be cleaned out by reset. We need - * disable it so that it can be reenabled properly. Also, - * the cached MSIx table should be cleared as it's not - * reflecting the contents in hardware. - */ -if (msix_enabled(pdev)) { -uint16_t flags; - -flags = pci_host_config_read_common(pdev, -pdev->msix_cap + PCI_MSIX_FLAGS, -pci_config_size(pdev), 2); -flags &= ~PCI_MSIX_FLAGS_ENABLE; -pci_host_config_write_common(pdev, - pdev->msix_cap + PCI_MSIX_FLAGS, - pci_config_size(pdev), flags, 2); -} - -msix_reset(pdev); -} - -static void spapr_phb_vfio_eeh_clear_bus_msix(PCIBus *bus, void *opaque
Re: [Qemu-devel] [PATCH 03/12] spapr_pci: Eliminate class callbacks
On 02/26/2016 10:31 PM, David Gibson wrote: The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the special groupid field in sPAPRPHBVFIOState. So we can simplify, removing the class specific callbacks with direct calls based on a simple spapr_phb_eeh_enabled() helper. For now we implement that in terms of a boolean in the class, but we'll continue to clean that up later. On its own this is a rather strange way of doing things, but it's a useful intermediate step to further cleanups. Reviewed-by: Alexey Kardashevskiy Signed-off-by: David Gibson --- hw/ppc/spapr_pci.c | 44 ++-- hw/ppc/spapr_pci_vfio.c | 18 +++--- include/hw/pci-host/spapr.h | 13 + 3 files changed, 38 insertions(+), 37 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 9b2b546..d1e5222 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -92,6 +92,13 @@ PCIDevice *spapr_pci_find_dev(sPAPRMachineState *spapr, uint64_t buid, return pci_find_device(phb->bus, bus_num, devfn); } +static bool spapr_phb_eeh_available(sPAPRPHBState *sphb) +{ +sPAPRPHBClass *spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); + +return spc->eeh_available; +} + static uint32_t rtas_pci_cfgaddr(uint32_t arg) { /* This handles the encoding of extended config space addresses */ @@ -438,7 +445,6 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint32_t addr, option; uint64_t buid; int ret; @@ -456,12 +462,11 @@ static void rtas_ibm_set_eeh_option(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_set_option(sphb, addr, option); +ret = spapr_phb_vfio_eeh_set_option(sphb, addr, option); rtas_st(rets, 0, ret); return; @@ -476,7 +481,6 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; PCIDevice *pdev; uint32_t addr, option; uint64_t buid; @@ -491,8 +495,7 @@ static void rtas_ibm_get_config_addr_info2(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_set_option) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } @@ -532,7 +535,6 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint64_t buid; int state, ret; @@ -546,12 +548,11 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_get_state) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_get_state(sphb, &state); +ret = spapr_phb_vfio_eeh_get_state(sphb, &state); rtas_st(rets, 0, ret); if (ret != RTAS_OUT_SUCCESS) { return; @@ -576,7 +577,6 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint32_t option; uint64_t buid; int ret; @@ -592,12 +592,11 @@ static void rtas_ibm_set_slot_reset(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_reset) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_reset(sphb, option); +ret = spapr_phb_vfio_eeh_reset(sphb, option); rtas_st(rets, 0, ret); return; @@ -612,7 +611,6 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; uint64_t buid; int ret; @@ -626,12 +624,11 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, goto param_error_exit; } -spc = SPAPR_PCI_HOST_BRIDGE_GET_CLASS(sphb); -if (!spc->eeh_configure) { +if (!spapr_phb_eeh_available(sphb)) { goto param_error_exit; } -ret = spc->eeh_configure(sphb); +ret = spapr_phb_vfio_eeh_configure(sphb); rtas_st(rets, 0, ret); return; @@ -647,7 +644,6 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu, target_ulong rets) { sPAPRPHBState *sphb; -sPAPRPHBClass *spc; int option; uint64_t buid; @@ -661,8 +657,7 @@ static void rtas_ibm_slot_error_detail(PowerPCCPU *cpu, goto param_e
Re: [Qemu-devel] [PATCH 06/12] spapr_pci: Fold spapr_phb_vfio_eeh_get_state() into spapr_pci code
On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_get_state9) into rtas_ibm_read_slot_reset_state2(). Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 12 ++-- hw/ppc/spapr_pci_vfio.c | 13 - include/hw/pci-host/spapr.h | 1 - 3 files changed, 6 insertions(+), 20 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index 96cdca2..eaae7e2 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -539,7 +539,7 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, { sPAPRPHBState *sphb; uint64_t buid; -int state, ret; +int ret; if ((nargs != 3) || (nret != 4 && nret != 5)) { goto param_error_exit; @@ -555,13 +555,13 @@ static void rtas_ibm_read_slot_reset_state2(PowerPCCPU *cpu, goto param_error_exit; } -ret = spapr_phb_vfio_eeh_get_state(sphb, &state); -rtas_st(rets, 0, ret); -if (ret != RTAS_OUT_SUCCESS) { -return; +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE); +if (ret < 0) { +goto param_error_exit; } -rtas_st(rets, 1, state); +rtas_st(rets, 0, RTAS_OUT_SUCCESS); +rtas_st(rets, 1, ret); rtas_st(rets, 2, RTAS_EEH_SUPPORT); rtas_st(rets, 3, RTAS_EEH_PE_UNAVAIL_INFO); if (nret >= 5) { diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 54075e0..c87f2e4 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -136,19 +136,6 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, return RTAS_OUT_SUCCESS; } -int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state) -{ -int ret; - -ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE); -if (ret < 0) { -return RTAS_OUT_PARAM_ERROR; -} - -*state = ret; -return RTAS_OUT_SUCCESS; -} - static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index d31162b..55237fc 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -138,6 +138,5 @@ void spapr_phb_vfio_reset(DeviceState *qdev); /* VFIO EEH hooks */ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option); -int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state); #endif /* __HW_SPAPR_PCI_H__ */ -- Alexey
Re: [Qemu-devel] [PATCH 04/12] spapr_pci: Fold spapr_phb_vfio_eeh_configure() into spapr_pci code
On 02/26/2016 10:31 PM, David Gibson wrote: Simplify the sPAPR PCI code by folding spapr_phb_vfio_eeh_configure() into rtas_ibm_configure_pe(). Signed-off-by: David Gibson Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci.c | 11 +-- hw/ppc/spapr_pci_vfio.c | 12 include/hw/pci-host/spapr.h | 1 - 3 files changed, 9 insertions(+), 15 deletions(-) diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index d1e5222..fa633a9 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -42,6 +42,9 @@ #include "hw/ppc/spapr_drc.h" #include "sysemu/device_tree.h" +#include "hw/vfio/vfio.h" +#include + /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */ #define RTAS_QUERY_FN 0 #define RTAS_CHANGE_FN 1 @@ -628,8 +631,12 @@ static void rtas_ibm_configure_pe(PowerPCCPU *cpu, goto param_error_exit; } -ret = spapr_phb_vfio_eeh_configure(sphb); -rtas_st(rets, 0, ret); +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE); +if (ret < 0) { +goto param_error_exit; +} + +rtas_st(rets, 0, RTAS_OUT_SUCCESS); return; param_error_exit: diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 10fa88a..4ac5736 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -221,18 +221,6 @@ int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) return RTAS_OUT_SUCCESS; } -int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb) -{ -int ret; - -ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_CONFIGURE); -if (ret < 0) { -return RTAS_OUT_PARAM_ERROR; -} - -return RTAS_OUT_SUCCESS; -} - static void spapr_phb_vfio_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index cc1b82c..f02ef51 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -140,6 +140,5 @@ int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option); int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state); int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option); -int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb); #endif /* __HW_SPAPR_PCI_H__ */ -- Alexey
Re: [Qemu-devel] [PATCH 02/12] spapr_pci: Switch to vfio_eeh_as_op() interface
On 02/26/2016 10:31 PM, David Gibson wrote: This switches all EEH on VFIO operations in spapr_pci_vfio.c from the broken vfio_container_ioctl() interface to the new vfio_as_eeh_op() interface. Signed-off-by: David Gibson mak Where is that "mak" from? :) Reviewed-by: Alexey Kardashevskiy --- hw/ppc/spapr_pci_vfio.c | 50 - 1 file changed, 16 insertions(+), 34 deletions(-) diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 2f3752e..b1e8e8e 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -73,15 +73,9 @@ static void spapr_phb_vfio_finish_realize(sPAPRPHBState *sphb, Error **errp) spapr_tce_get_iommu(tcet)); } -static void spapr_phb_vfio_eeh_reenable(sPAPRPHBVFIOState *svphb) +static void spapr_phb_vfio_eeh_reenable(sPAPRPHBState *sphb) { -struct vfio_eeh_pe_op op = { -.argsz = sizeof(op), -.op= VFIO_EEH_PE_ENABLE -}; - -vfio_container_ioctl(&svphb->phb.iommu_as, - svphb->iommugroupid, VFIO_EEH_PE_OP, &op); +vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_ENABLE); } static void spapr_phb_vfio_reset(DeviceState *qdev) @@ -92,19 +86,18 @@ static void spapr_phb_vfio_reset(DeviceState *qdev) * ensures that the contained PCI devices will work properly * after reboot. */ -spapr_phb_vfio_eeh_reenable(SPAPR_PCI_VFIO_HOST_BRIDGE(qdev)); +spapr_phb_vfio_eeh_reenable(SPAPR_PCI_HOST_BRIDGE(qdev)); } static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, unsigned int addr, int option) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; +uint32_t op; int ret; switch (option) { case RTAS_EEH_DISABLE: -op.op = VFIO_EEH_PE_DISABLE; +op = VFIO_EEH_PE_DISABLE; break; case RTAS_EEH_ENABLE: { PCIHostState *phb; @@ -122,21 +115,20 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, return RTAS_OUT_PARAM_ERROR; } -op.op = VFIO_EEH_PE_ENABLE; +op = VFIO_EEH_PE_ENABLE; break; } case RTAS_EEH_THAW_IO: -op.op = VFIO_EEH_PE_UNFREEZE_IO; +op = VFIO_EEH_PE_UNFREEZE_IO; break; case RTAS_EEH_THAW_DMA: -op.op = VFIO_EEH_PE_UNFREEZE_DMA; +op = VFIO_EEH_PE_UNFREEZE_DMA; break; default: return RTAS_OUT_PARAM_ERROR; } -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, op); if (ret < 0) { return RTAS_OUT_HW_ERROR; } @@ -146,13 +138,9 @@ static int spapr_phb_vfio_eeh_set_option(sPAPRPHBState *sphb, static int spapr_phb_vfio_eeh_get_state(sPAPRPHBState *sphb, int *state) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; int ret; -op.op = VFIO_EEH_PE_GET_STATE; -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, VFIO_EEH_PE_GET_STATE); if (ret < 0) { return RTAS_OUT_PARAM_ERROR; } @@ -206,28 +194,26 @@ static void spapr_phb_vfio_eeh_pre_reset(sPAPRPHBState *sphb) static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; +uint32_t op; int ret; switch (option) { case RTAS_SLOT_RESET_DEACTIVATE: -op.op = VFIO_EEH_PE_RESET_DEACTIVATE; +op = VFIO_EEH_PE_RESET_DEACTIVATE; break; case RTAS_SLOT_RESET_HOT: spapr_phb_vfio_eeh_pre_reset(sphb); -op.op = VFIO_EEH_PE_RESET_HOT; +op = VFIO_EEH_PE_RESET_HOT; break; case RTAS_SLOT_RESET_FUNDAMENTAL: spapr_phb_vfio_eeh_pre_reset(sphb); -op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL; +op = VFIO_EEH_PE_RESET_FUNDAMENTAL; break; default: return RTAS_OUT_PARAM_ERROR; } -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid, - VFIO_EEH_PE_OP, &op); +ret = vfio_eeh_as_op(&sphb->iommu_as, op); if (ret < 0) { return RTAS_OUT_HW_ERROR; } @@ -237,13 +223,9 @@ static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option) static int spapr_phb_vfio_eeh_configure(sPAPRPHBState *sphb) { -sPAPRPHBVFIOState *svphb = SPAPR_PCI_VFIO_HOST_BRIDGE(sphb); -struct vfio_eeh_pe_op op = { .argsz = sizeof(op) }; int ret; -op.op = VFIO_EEH_PE_CONFIGURE; -ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb
Re: [Qemu-devel] [PATCH 10/12] spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
On 02/26/2016 10:32 PM, David Gibson wrote: Now that the regular spapr-pci-host-bridge can handle EEH, there are only two things that spapr-pci-vfio-host-bridge does differently: 1. automatically sizes its DMA window to match the host IOMMU 2. checks if the attached VFIO container is backed by the VFIO_SPAPR_TCE_IOMMU type on the host (1) is not particularly useful, since the default window used by the regular host bridge will work with the host IOMMU configuration on all current systems anyway. Plus, automatically changing guest visible configuration (such as the DMA window) based on host settings is generally a bad idea. btw why exactly is it a bad idea? Anyway, Reviewed-by: Alexey Kardashevskiy -- Alexey
Re: [Qemu-devel] [PATCH 01/12] vfio: Start improving VFIO/EEH interface
On 02/26/2016 10:31 PM, David Gibson wrote: At present the code handling IBM's Enhanced Error Handling (EEH) interface on VFIO devices operates by bypassing the usual VFIO logic with vfio_container_ioctl(). That's a poorly designed interface with unclear semantics about exactly what can be operated on. In particular it operates on a single vfio container internally (hence the name), but takes an address space and group id, from which it deduces the container in a rather roundabout way. groupids are something that code outside vfio shouldn't even be aware of. This patch creates new interfaces for EEH operations. Internally we have vfio_eeh_container_op() which takes a VFIOContainer object directly. For external use we have vfio_eeh_as_ok() which determines if an AddressSpace is usable for EEH (at present this means it has a single container and at most a single group attached), and vfio_eeh_as_op() which will perform an operation on an AddressSpace in the unambiguous case, and otherwise returns an error. This interface still isn't great, but it's enough of an improvement to allow a number of cleanups in other places. Signed-off-by: David Gibson --- hw/vfio/common.c | 77 ++ include/hw/vfio/vfio.h | 2 ++ 2 files changed, 79 insertions(+) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 607ec70..e419241 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1003,3 +1003,80 @@ int vfio_container_ioctl(AddressSpace *as, int32_t groupid, return vfio_container_do_ioctl(as, groupid, req, param); } + +/* + * Interfaces for IBM EEH (Enhanced Error Handling) + */ +static bool vfio_eeh_container_ok(VFIOContainer *container) +{ +/* A broken kernel implementation means EEH operations won't work + * correctly if there are multiple groups in a container */ + +if (!QLIST_EMPTY(&container->group_list) +&& QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) { +return false; +} + +return true; If &container->group_list is empty, this helper returns "true". Does not look right, does it?... +} + +static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) +{ +struct vfio_eeh_pe_op pe_op = { +.argsz = sizeof(pe_op), +.op = op, +}; +int ret; + +if (!vfio_eeh_container_ok(container)) { +error_report("vfio/eeh: EEH_PE_OP 0x%x called on container" + " with multiple groups", op); +return -EPERM; +} + +ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op); +if (ret < 0) { +error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op); +return -errno; +} + +return 0; +} + +static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) +{ +VFIOAddressSpace *space = vfio_get_address_space(as); +VFIOContainer *container = NULL; + +if (QLIST_EMPTY(&space->containers)) { +/* No containers to act on */ +goto out; +} + +container = QLIST_FIRST(&space->containers); + +if (QLIST_NEXT(container, next)) { +/* We don't yet have logic to synchronize EEH state across + * multiple containers */ +container = NULL; +goto out; +} + +out: +vfio_put_address_space(space); +return container; +} + +bool vfio_eeh_as_ok(AddressSpace *as) +{ +VFIOContainer *container = vfio_eeh_as_container(as); + +return (container != NULL) && vfio_eeh_container_ok(container); +} + +int vfio_eeh_as_op(AddressSpace *as, uint32_t op) +{ +VFIOContainer *container = vfio_eeh_as_container(as); + +return vfio_eeh_container_op(container, op); vfio_eeh_as_ok() checks for (container != NULL) but this one does not, should not it? +} diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h index 0b26cd8..fd3933b 100644 --- a/include/hw/vfio/vfio.h +++ b/include/hw/vfio/vfio.h @@ -5,5 +5,7 @@ extern int vfio_container_ioctl(AddressSpace *as, int32_t groupid, int req, void *param); +bool vfio_eeh_as_ok(AddressSpace *as); +int vfio_eeh_as_op(AddressSpace *as, uint32_t op); #endif -- Alexey
Re: [Qemu-devel] [PATCH v2] xics: report errors with the QEMU Error API
On Fri, Feb 26, 2016 at 10:44:07AM +0100, Greg Kurz wrote: > Using the return value to report errors is error prone: > - xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors > on 0 > - xics_alloc_block() returns the unclear value of ics->offset - 1 on error > but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0 > > This patch adds an errp argument to xics_alloc() and xics_alloc_block() to > report errors. The return value of these functions is a valid IRQ number > if errp is NULL. It is undefined otherwise. > > The corresponding error traces get promotted to error messages. Note that > the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also > moves to xics_alloc(). Similar error message consolidation isn't really > applicable to xics_alloc_block() because callers have extra context (device > config address, MSI or MSIX). > > This fixes the issues mentioned above. > > Based on previous work from Brian W. Hart. > > Signed-off-by: Greg Kurz Merged, thanks. > --- > v2: - reverted to non-void xics_alloc() and xics_alloc_block() > - consolidated error message in xics_alloc() > - pass &error_fatal instead of NULL in spapr_events_init() > --- > > hw/intc/xics.c| 13 + > hw/ppc/spapr_events.c |3 ++- > hw/ppc/spapr_pci.c| 16 ++-- > hw/ppc/spapr_vio.c|7 --- > include/hw/ppc/xics.h |5 +++-- > trace-events |2 -- > 6 files changed, 28 insertions(+), 18 deletions(-) > > diff --git a/hw/intc/xics.c b/hw/intc/xics.c > index e66ae328819a..213a3709254c 100644 > --- a/hw/intc/xics.c > +++ b/hw/intc/xics.c > @@ -712,7 +712,7 @@ static int ics_find_free_block(ICSState *ics, int num, > int alignnum) > return -1; > } > > -int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi) > +int xics_alloc(XICSState *icp, int src, int irq_hint, bool lsi, Error **errp) > { > ICSState *ics = &icp->ics[src]; > int irq; > @@ -720,14 +720,14 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, > bool lsi) > if (irq_hint) { > assert(src == xics_find_source(icp, irq_hint)); > if (!ICS_IRQ_FREE(ics, irq_hint - ics->offset)) { > -trace_xics_alloc_failed_hint(src, irq_hint); > +error_setg(errp, "can't allocate IRQ %d: already in use", > irq_hint); > return -1; > } > irq = irq_hint; > } else { > irq = ics_find_free_block(ics, 1, 1); > if (irq < 0) { > -trace_xics_alloc_failed_no_left(src); > +error_setg(errp, "can't allocate IRQ: no IRQ left"); > return -1; > } > irq += ics->offset; > @@ -743,7 +743,8 @@ int xics_alloc(XICSState *icp, int src, int irq_hint, > bool lsi) > * Allocate block of consecutive IRQs, and return the number of the first > IRQ in the block. > * If align==true, aligns the first IRQ number to num. > */ > -int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align) > +int xics_alloc_block(XICSState *icp, int src, int num, bool lsi, bool align, > + Error **errp) > { > int i, first = -1; > ICSState *ics = &icp->ics[src]; > @@ -763,6 +764,10 @@ int xics_alloc_block(XICSState *icp, int src, int num, > bool lsi, bool align) > } else { > first = ics_find_free_block(ics, num, 1); > } > +if (first < 0) { > +error_setg(errp, "can't find a free %d-IRQ block", num); > +return -1; > +} > > if (first >= 0) { > for (i = first; i < first + num; ++i) { > diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c > index f5eac4b5441c..39f4682f957f 100644 > --- a/hw/ppc/spapr_events.c > +++ b/hw/ppc/spapr_events.c > @@ -588,7 +588,8 @@ out_no_events: > void spapr_events_init(sPAPRMachineState *spapr) > { > QTAILQ_INIT(&spapr->pending_events); > -spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false); > +spapr->check_exception_irq = xics_alloc(spapr->icp, 0, 0, false, > +&error_fatal); > spapr->epow_notifier.notify = spapr_powerdown_req; > qemu_register_powerdown_notifier(&spapr->epow_notifier); > spapr_rtas_register(RTAS_CHECK_EXCEPTION, "check-exception", > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c > index 9b2b546b541c..e8edad3ab7c3 100644 > --- a/hw/ppc/spapr_pci.c > +++ b/hw/ppc/spapr_pci.c > @@ -280,6 +280,7 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, > sPAPRMachineState *spapr, > PCIDevice *pdev = NULL; > spapr_pci_msi *msi; > int *config_addr_key; > +Error *err = NULL; > > switch (func) { > case RTAS_CHANGE_MSI_FN: > @@ -354,9 +355,10 @@ static void rtas_ibm_change_msi(PowerPCCPU *cpu, > sPAPRMachineState *spapr, > > /* Allocate MSIs */ > irq = xics_alloc_block(spapr->icp, 0, req_num, false, > - ret_intr_type == RTAS_TYPE_MSI);
Re: [Qemu-devel] [PATCH v2 01/16] tcg-mips: Always use tcg_debug_assert
On 2016-02-15 14:42, Richard Henderson wrote: > Signed-off-by: Richard Henderson > --- > tcg/mips/tcg-target.c | 18 +- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/tcg/mips/tcg-target.c b/tcg/mips/tcg-target.c > index 2dc4998..ebb936d 100644 > --- a/tcg/mips/tcg-target.c > +++ b/tcg/mips/tcg-target.c > @@ -128,7 +128,7 @@ static inline uint32_t reloc_pc16_val(tcg_insn_unit *pc, > tcg_insn_unit *target) > { > /* Let the compiler perform the right-shift as part of the arithmetic. > */ > ptrdiff_t disp = target - (pc + 1); > -assert(disp == (int16_t)disp); > +tcg_debug_assert(disp == (int16_t)disp); > return disp & 0x; > } > > @@ -139,7 +139,7 @@ static inline void reloc_pc16(tcg_insn_unit *pc, > tcg_insn_unit *target) > > static inline uint32_t reloc_26_val(tcg_insn_unit *pc, tcg_insn_unit *target) > { > -assertuintptr_t)pc ^ (uintptr_t)target) & 0xf000) == 0); > +tcg_debug_assertuintptr_t)pc ^ (uintptr_t)target) & 0xf000) == > 0); > return ((uintptr_t)target >> 2) & 0x3ff; > } > > @@ -151,8 +151,8 @@ static inline void reloc_26(tcg_insn_unit *pc, > tcg_insn_unit *target) > static void patch_reloc(tcg_insn_unit *code_ptr, int type, > intptr_t value, intptr_t addend) > { > -assert(type == R_MIPS_PC16); > -assert(addend == 0); > +tcg_debug_assert(type == R_MIPS_PC16); > +tcg_debug_assert(addend == 0); > reloc_pc16(code_ptr, (tcg_insn_unit *)value); > } > > @@ -433,7 +433,7 @@ static bool tcg_out_opc_jmp(TCGContext *s, MIPSInsn opc, > void *target) > if ((from ^ dest) & -(1 << 28)) { > return false; > } > -assert((dest & 3) == 0); > +tcg_debug_assert((dest & 3) == 0); > > inst = opc; > inst |= (dest >> 2) & 0x3ff; > @@ -808,9 +808,9 @@ static void tcg_out_setcond2(TCGContext *s, TCGCond cond, > TCGReg ret, > TCGReg tmp0 = TCG_TMP0; > TCGReg tmp1 = ret; > > -assert(ret != TCG_TMP0); > +tcg_debug_assert(ret != TCG_TMP0); > if (ret == ah || ret == bh) { > -assert(ret != TCG_TMP1); > +tcg_debug_assert(ret != TCG_TMP1); > tmp1 = TCG_TMP1; > } > > @@ -1471,8 +1471,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode > opc, > case INDEX_op_and_i32: > if (c2 && a2 != (uint16_t)a2) { > int msb = ctz32(~a2) - 1; > -assert(use_mips32r2_instructions); > -assert(is_p2m1(a2)); > +tcg_debug_assert(use_mips32r2_instructions); > +tcg_debug_assert(is_p2m1(a2)); > tcg_out_opc_bf(s, OPC_EXT, a0, a1, msb, 0); > break; > } Reviewed-by: Aurelien Jarno -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH v2 00/16] tcg mips64 and mips r6 improvements
On 2016-02-15 14:42, Richard Henderson wrote: > Changes since v1: > * Some bugs pointed out by Mark fixed. > * Canonicalize the whole file on tcg_debug_assert. > * Switch bswap code to subroutine earlier; the first patch is > standalone for mips32, and there is no longer an intermediate > patch with inline bswap for mips64. > * Use NAL for pre-r6 mips64 loading of the slow path return address. > Thanks a lot for working on that, it's something I have on my TODO list for months. I have finally found time to have a look and give a try over the week-end (sorry about the delay). It seems to work perfectly for 64-bit guests on mips64el but 32-bit guests end-up quickly in a segmentation fault. It's easily reproducible by starting qemu-system-i386 on a mips64el host, it crashes when executing seabios. More problematic it seems that the patch "Adjust qemu_ld/st for mips64" causes a regression on at least a big-endian 32-bit host running qemu-system-i386. It is reproducible by booting a Debian i386 wheezy guest on such a system. Unfortunately the week-end was too short for finding the issue, I'll continue looking in the next days. I have a few comments on the individual patches, I'll send them asap. Note that I don't have an R6 machine, so I haven't been able to test that part. Aurelien -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [QEMU] Windows XP / Windows 95 / MS-DOS 6 regressions
Hi, I currently see some regressions on Microsoft operating systems. 1) Windows XP bugchecks since commit: commit 7f0b7141b4c7deab51efd8ee1e83eab2d9b7a9ea Author: Richard Henderson Date: Mon Jul 6 17:29:59 2015 +0100 target-i386: Perform set/reset_inhibit_irq inline With helpers that can be reused for other things. Signed-off-by: Richard Henderson I'm starting QEMU with -cpu pentium2. Attached patch can be applied on master to work-around the problem. Another work-around is to start with -enable-kvm. 2) Windows 95 bugchecks since commit: commit d6a2914984c89fa0a3125b9842e0cbf68de79a3d Author: Richard Henderson Date: Thu Dec 17 11:19:19 2015 -0800 target-i386: Use gen_lea_v_seg in gen_lea_modrm Centralize handling of segment bases. Signed-off-by: Richard Henderson Message-Id: <1450379966-28198-4-git-send-email-...@twiddle.net> Signed-off-by: Paolo Bonzini (with 88c73d16ad1b6c22a2ab082064d0d521f756296a and 4987783400667147ada01a5bdcce53f11b822888 cherry-picked) I'm starting QEMU with -cpu pentium. I've not searched for a work-around. 3) MS-DOS 6 freezes when loading himem.sys since commit: commit 1906b2af7c2345037d9b2fdf484b457b5acd09d1 Author: Richard Henderson Date: Thu Jul 2 13:59:21 2015 +0100 target-i386: Rearrange processing of 0F 01 Rather than nesting tests of OP, MOD, and RM, decode them all at once with a switch. Fixes incorrect decoding of AMD Pacifica extensions (aka vmrun et al) via op==2 path. Signed-off-by: Richard Henderson I'm starting QEMU with -cpu 486. It works on master if I add -enable-kvm Hervé >From 0e66ca87ac7c94219ab49cfbed6f586c51c697de Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Herv=C3=A9=20Poussineau?= Date: Sun, 28 Feb 2016 15:00:29 +0100 Subject: [PATCH] target-i386: partially revert 'Perform set/reset_inhibit_irq inline' This partially reverts commit 7f0b7141b4c7deab51efd8ee1e83eab2d9b7a9ea to fix Windows XP. --- target-i386/cc_helper.c | 5 + target-i386/helper.h| 1 + target-i386/translate.c | 3 ++- 3 files changed, 8 insertions(+), 1 deletion(-) diff --git a/target-i386/cc_helper.c b/target-i386/cc_helper.c index 83af223..e6f4fe9 100644 --- a/target-i386/cc_helper.c +++ b/target-i386/cc_helper.c @@ -383,3 +383,8 @@ void helper_sti_vm(CPUX86State *env) } } #endif + +void helper_set_inhibit_irq(CPUX86State *env) +{ +env->hflags |= HF_INHIBIT_IRQ_MASK; +} diff --git a/target-i386/helper.h b/target-i386/helper.h index e33451a..bdba6bd 100644 --- a/target-i386/helper.h +++ b/target-i386/helper.h @@ -70,6 +70,7 @@ DEF_HELPER_1(cli, void, env) DEF_HELPER_1(sti, void, env) DEF_HELPER_1(clac, void, env) DEF_HELPER_1(stac, void, env) +DEF_HELPER_1(set_inhibit_irq, void, env) DEF_HELPER_3(boundw, void, env, tl, int) DEF_HELPER_3(boundl, void, env, tl, int) DEF_HELPER_1(rsm, void, env) diff --git a/target-i386/translate.c b/target-i386/translate.c index 9171929..4ec94b9 100644 --- a/target-i386/translate.c +++ b/target-i386/translate.c @@ -6780,7 +6780,8 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s, /* interruptions are enabled only the first insn after sti */ /* If several instructions disable interrupts, only the _first_ does it */ -gen_set_hflag(s, HF_INHIBIT_IRQ_MASK); +if (!(s->tb->flags & HF_INHIBIT_IRQ_MASK)) +gen_helper_set_inhibit_irq(cpu_env); /* give a chance to handle pending irqs */ gen_jmp_im(s->pc - s->cs_base); gen_eob(s); -- 2.1.4
Re: [Qemu-devel] [PATCH v2 0/9] Add i.MX6 (Single/Dual/Quad) support
Peter, Are you OK if I merge the 2 series (i.MX6 and SPI for i.MX6) into a single series? Thanks. JC Le 08/02/2016 23:08, Jean-Christophe Dubois a écrit : This patch series adds support for the Freescale i.MX6 processor. For now we only support the following devices: * up to 4 Cortex A9 cores * A9 MPCORE (SCU, GIC, TWD) * 5 i.MX UARTs * 2 EPIT timers * 1 GPT timer * 7 GPIO controllers * 6 SDHC controllers * 1 CCM device * 1 SRC device * 3 I2C devices * various ROM/RAM areas. This also adds the sabrelite board as a an actual platform for i.MX6. This series was tested by booting a 4.4 linux kernel (using the imx_v6_v7_defconfig file as kernel configuration). Note1: as sabrelite uses uart2 as console, you have to redirect the second QEMU serial port to stdout. qemu-system-arm -M sabrelite -display none ... -serial null -serial stdio Note2: as the SPI controller is not yet supported in QEMU for i.MX you need to disable SPI controllers in Linux DTS tree. Otherwise Linux would hang during the boot waiting for a SPI device to respond. Note3: You need to disable the GPIO section related to physical push buttons in the Linux DTS tree in order to avoid excecive interrupt triggering on GPIO devices for non existant buttons. Jean-Christophe Dubois (9): i.MX: Allow GPT timer to rollover. i.MX: Rename CCM NOCLK to CLK_NONE for naming consistency. i.MX: Remove CCM useless clock computation handling. i.MX: Add the CLK_IPG_HIGH clock i.MX: Add i.MX6 CCM and ANALOG device. i.MX: Add i.MX6 System Reset Controller device. i.MX: Add i.MX6 SOC implementation. i.MX: Add sabrelite i.MX6 emulation. i.MX: Add missing descriptions in devices. default-configs/arm-softmmu.mak | 1 + hw/arm/Makefile.objs| 1 + hw/arm/fsl-imx25.c | 1 + hw/arm/fsl-imx31.c | 1 + hw/arm/fsl-imx6.c | 407 + hw/arm/sabrelite.c | 93 + hw/i2c/imx_i2c.c| 1 + hw/misc/Makefile.objs | 2 + hw/misc/imx25_ccm.c | 29 +- hw/misc/imx31_ccm.c | 35 +- hw/misc/imx6_ccm.c | 773 hw/misc/imx6_src.c | 353 ++ hw/net/imx_fec.c| 1 + hw/timer/imx_epit.c | 8 +- hw/timer/imx_gpt.c | 43 +-- include/hw/arm/fsl-imx6.h | 447 +++ include/hw/misc/imx6_ccm.h | 197 ++ include/hw/misc/imx6_src.h | 73 include/hw/misc/imx_ccm.h | 10 +- 19 files changed, 2384 insertions(+), 92 deletions(-) create mode 100644 hw/arm/fsl-imx6.c create mode 100644 hw/arm/sabrelite.c create mode 100644 hw/misc/imx6_ccm.c create mode 100644 hw/misc/imx6_src.c create mode 100644 include/hw/arm/fsl-imx6.h create mode 100644 include/hw/misc/imx6_ccm.h create mode 100644 include/hw/misc/imx6_src.h
Re: [Qemu-devel] [PATCH v1 16/17] loader: Add data swap option to load-elf
On Sun, Feb 28, 2016 at 7:28 AM, Peter Maydell wrote: > On 27 February 2016 at 23:14, Peter Crosthwaite > wrote: >> On Tue, Jan 19, 2016 at 9:53 AM, Peter Maydell >> wrote: >>> Can we have a doc comment so we have something that defines what >>> values data_swab accepts? (it's not just a bool). >>> >> >> This is difficult to capture without writing to whole documentation >> for load_elf. So here goes: > > Thanks; a couple of typos below, but otherwise looks good. > > >> /** load_elf: >> * @filename: Path of ELF file >> * @translate_fn: optional function to translate load addresses >> * @translate_opaque: opaque data passed to @translate_fn >> * @pentry: Populated with program entry point. Ignored if NULL. >> * @lowaddr: Populated with lowest loaded address. Ignored if NULL. >> * @highaddr: Populated with highest loaded address. Ignored if NULL. >> * @bigendian: Expected ELF endianness. 0 for LE otherwise BE >> * @elf_machine: Expected ELF machine type >> * @clear_lsb: Set to mask off LSB of addresses (Some arch's use this > > Can we just write out "architectures" here? > >>for non-address data) >> * @data_swab: Set to order of byte swapping for data. 0 for no swap, 1 >> * for swapping bytes within halfwords, 2 for bytes within >> * words and 3 for within doublewords. >> * >> * Load an ELF file's contents to the emulated systems address space. > > "system's" > >> * Clients may optionally specify a callback to perform address >> * translations. @pentry, @lowaddr and @highaddr are optional pointers >> * which will be populated with various load information. @bigendian and >> * @elf_machine give the expected endianness and machine for the ELF the >> * load will fail it the target ELF does not match. Some architectures > > "if" > >> * have some arch-specific behaviours that come into effect when their > > "architecture" > All fixed. Thanks for the pre-review. Regards, Peter >> * particular values for @elf_machine are set. >> */ > > thanks > -- PMM
Re: [Qemu-devel] [PATCH v4 02/10] pc: init pcms->apic_id_limit once and use it throughout pc.c
On 02/26/2016 03:59 PM, Igor Mammedov wrote: Signed-off-by: Igor Mammedov --- hw/i386/pc.c | 45 +++-- 1 file changed, 19 insertions(+), 26 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 0aeefd2..151a64c 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -700,18 +700,6 @@ static uint32_t x86_cpu_apic_id_from_index(unsigned int cpu_index) } } -/* Calculates the limit to CPU APIC ID values - * - * This function returns the limit for the APIC ID value, so that all - * CPU APIC IDs are < pc_apic_id_limit(). - * - * This is used for FW_CFG_MAX_CPUS. See comments on bochs_bios_init(). - */ -static unsigned int pc_apic_id_limit(unsigned int max_cpus) -{ -return x86_cpu_apic_id_from_index(max_cpus - 1) + 1; -} - static void pc_build_smbios(FWCfgState *fw_cfg) { uint8_t *smbios_tables, *smbios_anchor; @@ -749,12 +737,11 @@ static void pc_build_smbios(FWCfgState *fw_cfg) } } -static FWCfgState *bochs_bios_init(AddressSpace *as) +static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms) { FWCfgState *fw_cfg; uint64_t *numa_fw_cfg; int i, j; -unsigned int apic_id_limit = pc_apic_id_limit(max_cpus); fw_cfg = fw_cfg_init_io_dma(BIOS_CFG_IOPORT, BIOS_CFG_IOPORT + 4, as); @@ -772,7 +759,7 @@ static FWCfgState *bochs_bios_init(AddressSpace *as) * [1] The only kind of "CPU identifier" used between SeaBIOS and QEMU is * the APIC ID, not the "CPU index" */ -fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)apic_id_limit); +fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)pcms->apic_id_limit); fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size); fw_cfg_add_bytes(fw_cfg, FW_CFG_ACPI_TABLES, acpi_tables, acpi_tables_len); @@ -790,11 +777,11 @@ static FWCfgState *bochs_bios_init(AddressSpace *as) * of nodes, one word for each VCPU->node and one word for each node to * hold the amount of memory. */ -numa_fw_cfg = g_new0(uint64_t, 1 + apic_id_limit + nb_numa_nodes); +numa_fw_cfg = g_new0(uint64_t, 1 + pcms->apic_id_limit + nb_numa_nodes); numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes); for (i = 0; i < max_cpus; i++) { unsigned int apic_id = x86_cpu_apic_id_from_index(i); -assert(apic_id < apic_id_limit); +assert(apic_id < pcms->apic_id_limit); for (j = 0; j < nb_numa_nodes; j++) { if (test_bit(i, numa_info[j].node_cpu)) { numa_fw_cfg[apic_id + 1] = cpu_to_le64(j); @@ -803,10 +790,11 @@ static FWCfgState *bochs_bios_init(AddressSpace *as) } } for (i = 0; i < nb_numa_nodes; i++) { -numa_fw_cfg[apic_id_limit + 1 + i] = cpu_to_le64(numa_info[i].node_mem); +numa_fw_cfg[pcms->apic_id_limit + 1 + i] = +cpu_to_le64(numa_info[i].node_mem); } fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg, - (1 + apic_id_limit + nb_numa_nodes) * + (1 + pcms->apic_id_limit + nb_numa_nodes) * sizeof(*numa_fw_cfg)); return fw_cfg; @@ -1120,7 +1108,6 @@ void pc_cpus_init(PCMachineState *pcms) int i; X86CPU *cpu = NULL; MachineState *machine = MACHINE(pcms); -unsigned long apic_id_limit; /* init CPUs */ if (machine->cpu_model == NULL) { @@ -1131,10 +1118,17 @@ void pc_cpus_init(PCMachineState *pcms) #endif } -apic_id_limit = pc_apic_id_limit(max_cpus); -if (apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) { -error_report("max_cpus is too large. APIC ID of last CPU is %lu", - apic_id_limit - 1); +/* Calculates the limit to CPU APIC ID values + * + * Limit for the APIC ID value, so that all + * CPU APIC IDs are < pcms->apic_id_limit. + * + * This is used for FW_CFG_MAX_CPUS. See comments on bochs_bios_init(). + */ +pcms->apic_id_limit = x86_cpu_apic_id_from_index(max_cpus - 1) + 1; +if (pcms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) { +error_report("max_cpus is too large. APIC ID of last CPU is %u", + pcms->apic_id_limit - 1); exit(1); } @@ -1187,7 +1181,6 @@ void pc_guest_info_init(PCMachineState *pcms) { int i, j; -pcms->apic_id_limit = pc_apic_id_limit(max_cpus); pcms->apic_xrupt_override = kvm_allows_irq0_override(); pcms->numa_nodes = nb_numa_nodes; pcms->node_mem = g_malloc0(pcms->numa_nodes * @@ -1372,7 +1365,7 @@ void pc_memory_init(PCMachineState *pcms, option_rom_mr, 1); -fw_cfg = bochs_bios_init(&address_space_memory); +fw_cfg = bochs_bios_init(&address_space_memory, pcms); rom_set_fw(fw_cfg); Looks OK to me. Reviewed-by: Marcel Apfelbaum Thanks, Marcel
Re: [Qemu-devel] [PATCH v4 04/10] pc: acpi: cleanup qdev_get_machine() calls
On 02/26/2016 03:59 PM, Igor Mammedov wrote: cache qdev_get_machine() result in acpi_setup/acpi_build_update time and pass it as an argument to child functions that need it. Signed-off-by: Igor Mammedov --- hw/i386/acpi-build.c | 31 +++ 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 52c9470..c750435 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -362,9 +362,9 @@ build_fadt(GArray *table_data, GArray *linker, AcpiPmInfo *pm, } static void -build_madt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu) +build_madt(GArray *table_data, GArray *linker, PCMachineState *pcms, + AcpiCpuInfo *cpu) { -PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); int madt_start = table_data->len; AcpiMultipleApicTable *madt; @@ -1938,13 +1938,12 @@ static Aml *build_q35_osc_method(void) static void build_dsdt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc, - PcPciInfo *pci) + PcPciInfo *pci, MachineState *machine) { CrsRangeEntry *entry; Aml *dsdt, *sb_scope, *scope, *dev, *method, *field, *pkg, *crs; GPtrArray *mem_ranges = g_ptr_array_new_with_free_func(crs_range_free); GPtrArray *io_ranges = g_ptr_array_new_with_free_func(crs_range_free); -MachineState *machine = MACHINE(qdev_get_machine()); PCMachineState *pcms = PC_MACHINE(machine); uint32_t nr_mem = machine->ram_slots; int root_bus_limit = 0xFF; @@ -2367,7 +2366,7 @@ acpi_build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base, } static void -build_srat(GArray *table_data, GArray *linker) +build_srat(GArray *table_data, GArray *linker, MachineState *machine) { AcpiSystemResourceAffinityTable *srat; AcpiSratProcessorAffinity *core; @@ -2377,7 +2376,7 @@ build_srat(GArray *table_data, GArray *linker) uint64_t curnode; int srat_start, numa_start, slots; uint64_t mem_len, mem_base, next_base; -PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +PCMachineState *pcms = PC_MACHINE(machine); ram_addr_t hotplugabble_address_space_size = object_property_get_int(OBJECT(pcms), PC_MACHINE_MEMHP_REGION_SIZE, NULL); @@ -2581,17 +2580,17 @@ static bool acpi_has_iommu(void) return intel_iommu && !ambiguous; } -static bool acpi_has_nvdimm(void) +static bool acpi_has_nvdimm(MachineState *machine) { -PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +PCMachineState *pcms = PC_MACHINE(machine); return pcms->nvdimm; } static -void acpi_build(AcpiBuildTables *tables) +void acpi_build(AcpiBuildTables *tables, MachineState *machine) { -PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +PCMachineState *pcms = PC_MACHINE(machine); PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms); GArray *table_offsets; unsigned facs, dsdt, rsdt, fadt; @@ -2629,7 +2628,7 @@ void acpi_build(AcpiBuildTables *tables) /* DSDT is pointed to by FADT */ dsdt = tables_blob->len; -build_dsdt(tables_blob, tables->linker, &cpu, &pm, &misc, &pci); +build_dsdt(tables_blob, tables->linker, &cpu, &pm, &misc, &pci, machine); /* Count the size of the DSDT and SSDT, we will need it for legacy * sizing of ACPI tables. @@ -2644,7 +2643,7 @@ void acpi_build(AcpiBuildTables *tables) aml_len += tables_blob->len - fadt; acpi_add_table(table_offsets, tables_blob); -build_madt(tables_blob, tables->linker, &cpu); +build_madt(tables_blob, tables->linker, pcms, &cpu); if (misc.has_hpet) { acpi_add_table(table_offsets, tables_blob); @@ -2661,7 +2660,7 @@ void acpi_build(AcpiBuildTables *tables) } if (pcms->numa_nodes) { acpi_add_table(table_offsets, tables_blob); -build_srat(tables_blob, tables->linker); +build_srat(tables_blob, tables->linker, machine); } if (acpi_get_mcfg(&mcfg)) { acpi_add_table(table_offsets, tables_blob); @@ -2672,7 +2671,7 @@ void acpi_build(AcpiBuildTables *tables) build_dmar_q35(tables_blob, tables->linker); } -if (acpi_has_nvdimm()) { +if (acpi_has_nvdimm(machine)) { nvdimm_build_acpi(table_offsets, tables_blob, tables->linker); } @@ -2766,7 +2765,7 @@ static void acpi_build_update(void *build_opaque) acpi_build_tables_init(&tables); -acpi_build(&tables); +acpi_build(&tables, MACHINE(qdev_get_machine())); acpi_ram_update(build_state->table_mr, tables.table_data); @@ -2831,7 +2830,7 @@ void acpi_setup(void) acpi_set_pci_info(); acpi_build_tables_init(&tables); -acpi_build(&tables); +acpi_build(&tables, MACHINE(pcms)); /* Now expose it all to Guest */ build_state->table_mr = acpi_add_rom_blob(build_state, tables.table_data, Reviewe
Re: [Qemu-devel] Can't compile QEMU 2.5.0 on Arch Linux ARM
>> {standard input}:9097: Error: bad instruction `lock' >> {standard input}:9097: Error: bad instruction `addl $0,0(%rsp)' >> {standard input}:9412: Error: bad instruction `lock' >> {standard input}:9412: Error: bad instruction `addl $0,0(%rsp)' >> /home/share/qemu-devel/QEMU/src/qemu-2.5.0/rules.mak:57: recipe for target >> 'hw/display/qxl.o' failed >> make: *** [hw/display/qxl.o] Error 1 >> ==> ERROR: A failure occurred in build(). >> Aborting... >This should really not happen because it implies that we >somehow ended up with x86 inline assembly. >Some things to test: >(1) if you configure QEMU with --disable-spice, does it build >OK, or does it just fail later on on some other source file? Thank you for the assistance Peter I have managed to try your suggestion and made some progress. After I added "--disable-spice", the code compiled without any error. This implies that Spice was causing my problem. While this may indicate a problem with the spice portion of QEMU's code, I am more convinced that I didn't compile Spice properly and that this is what is causing me problems. I might test this or I might simply not bother with Spice. Kind regards Xavier de Rauville Sent from my BlackBerry Passport.
[Qemu-devel] [Bug 1488363] Re: qemu 2.4.0 hangs using vfio for pci passthrough of graphics card
With seabios 1.9.1-1 and qemu 2.5.0-1 manjaro packages (which as far as I know have no patches), it seems to work now. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1488363 Title: qemu 2.4.0 hangs using vfio for pci passthrough of graphics card Status in QEMU: New Bug description: 2.3.0 (manjaro distro package) works fine. 2.4.0 (manjaro or the arch vanilla one) hangs on the SeaBIOS screen when saying "Press F12 for boot menu". All tested with the same hardware, OS, command and configuration. It also starts without the GPU passed through, even with the USB passed through. I am using the latest SeaBIOS 1.8.2. The release notes say: VFIO Support for resetting AMD Bonaire and Hawaii GPUs Platform device passthrough support for Calxeda xgmac devices So maybe something there broke it. I am using the arch qemu 2.4.0 PKGBUILD (modified to have make -j8 and removed iscsi, gluster, ceph, etc.), which uses vanilla sources and no patches. https://projects.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/qemu I am not using a frontend. I am using a script I wrote that generates the command below. Guest OS here would be 64 bit windows 7, but it didn't start so that's not relevant. Also a Manjaro Linux VM won't start. CPU is AMD FX-8150; board is Gigabyte GA-990FXA-UD5 (990FX chipset). full command line (without the \ after each line) is: qemu-system-x86_64 -enable-kvm -M q35 -m 3584 -cpu host -boot c -smp 7,sockets=1,cores=7,threads=1 -vga none -device ioh3420,bus=pcie.0,addr=1c.0,port=1,chassis=1,id=root.1 -device vfio-pci,host=04:00.0,bus=root.1,multifunction=on,x-vga=on,addr=0.0,romfile=Sapphire.R7260X.1024.131106.rom -device vfio-pci,host=00:14.2,bus=pcie.0 -device vfio-pci,host=00:16.0,bus=root.1 -device vfio-pci,host=00:16.2,bus=root.1 -usb -device ahci,bus=pcie.0,id=ahci -drive file=/dev/data/vm1,id=disk1,format=raw,if=virtio,index=0,media=disk,discard=on -drive media=cdrom,id=cdrom,index=5,media=cdrom -netdev type=tap,id=net0,ifname=tap-vm1 -device virtio-net-pci,netdev=net0,mac=00:01:02:03:04:05 -monitor stdio -boot menu=on $ lspci -nn | grep -E "04:00.0|00:14.2|00:16.0|00:16.2" 00:14.2 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA) [1002:4383] (rev 40) 00:16.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397] 00:16.2 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396] 04:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XTX [Radeon R7 260X] [1002:6658] Also I have this one that also hangs: 05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Juniper XT [Radeon HD 6770] [1002:68ba] To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1488363/+subscriptions
[Qemu-devel] [Bug 893208] Re: qemu on ARM hosts can't boot i386 image
The "holdup" is simply that nobody who is interested in this issue has written a patch like that Paolo proposed in comment #13. (Mostly people either want to run ARM or other guest images in emulation on x86, or they're running ARM images with hardware virtualization on ARM hardware. Trying to run x86 images in emulation on ARM hosts is much less common.) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/893208 Title: qemu on ARM hosts can't boot i386 image Status in QEMU: Confirmed Status in Linaro QEMU: New Bug description: If you apply some workarounds for bug 870990, bug 883133 and bug 883136 QEMU still cannot boot the i386 debian_squeeze_i386_standard.qcow2 image from http://people.debian.org/~aurel32/qemu/i386/ -- grub starts to boot but something causes the system to reset just before display of the blue-background grub menu, so we go round in a loop forever. This image boots OK on i386 hosted qemu so this indicates some kind of ARM- host specific bug. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/893208/+subscriptions
Re: [Qemu-devel] [PATCH v1 16/17] loader: Add data swap option to load-elf
On 27 February 2016 at 23:14, Peter Crosthwaite wrote: > On Tue, Jan 19, 2016 at 9:53 AM, Peter Maydell > wrote: >> Can we have a doc comment so we have something that defines what >> values data_swab accepts? (it's not just a bool). >> > > This is difficult to capture without writing to whole documentation > for load_elf. So here goes: Thanks; a couple of typos below, but otherwise looks good. > /** load_elf: > * @filename: Path of ELF file > * @translate_fn: optional function to translate load addresses > * @translate_opaque: opaque data passed to @translate_fn > * @pentry: Populated with program entry point. Ignored if NULL. > * @lowaddr: Populated with lowest loaded address. Ignored if NULL. > * @highaddr: Populated with highest loaded address. Ignored if NULL. > * @bigendian: Expected ELF endianness. 0 for LE otherwise BE > * @elf_machine: Expected ELF machine type > * @clear_lsb: Set to mask off LSB of addresses (Some arch's use this Can we just write out "architectures" here? >for non-address data) > * @data_swab: Set to order of byte swapping for data. 0 for no swap, 1 > * for swapping bytes within halfwords, 2 for bytes within > * words and 3 for within doublewords. > * > * Load an ELF file's contents to the emulated systems address space. "system's" > * Clients may optionally specify a callback to perform address > * translations. @pentry, @lowaddr and @highaddr are optional pointers > * which will be populated with various load information. @bigendian and > * @elf_machine give the expected endianness and machine for the ELF the > * load will fail it the target ELF does not match. Some architectures "if" > * have some arch-specific behaviours that come into effect when their "architecture" > * particular values for @elf_machine are set. > */ thanks -- PMM
Re: [Qemu-devel] [PATCH] kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35
On 2016-02-28 12:52, Rita Sinha wrote: > The split IRQ chip mode via KVM_CAP_SPLIT_IRQCHIP was introduced with commit > 15eafc2e602ff8c37c6e132eb8c63fec8fc17175 but was broken for q35. This patch You may abbreviate hashes in logs. I usually use 10 digits, some even only 8. There is a risk of collisions, but the context will resolve it again for the reader. > makes kernel_irqchip=split functional for q35. Signed-off is missing. > --- > hw/i386/pc_q35.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c > index 46522c9..ff6a5af 100644 > --- a/hw/i386/pc_q35.c > +++ b/hw/i386/pc_q35.c > @@ -39,6 +39,7 @@ > #include "hw/kvm/clock.h" > #include "hw/pci-host/q35.h" > #include "exec/address-spaces.h" > +#include "hw/i386/pc.h" > #include "hw/i386/ich9.h" > #include "hw/smbios/smbios.h" > #include "hw/ide/pci.h" > @@ -145,7 +146,7 @@ static void pc_q35_init(MachineState *machine) > > /* irq lines */ > gsi_state = g_malloc0(sizeof(*gsi_state)); > -if (kvm_irqchip_in_kernel()) { > +if (kvm_ioapic_in_kernel()) { > kvm_pc_setup_irq_routing(pcmc->pci_enabled); > gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state, > GSI_NUM_PINS); > @@ -192,7 +193,7 @@ static void pc_q35_init(MachineState *machine) > /*end early*/ > isa_bus_irqs(isa_bus, gsi); > > -if (kvm_irqchip_in_kernel()) { > +if (kvm_pit_in_kernel()) { kvm_pic_in_kernel > i8259 = kvm_i8259_init(isa_bus); > } else if (xen_enabled()) { > i8259 = xen_interrupt_controller_init(); > Jan signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 6/9] i.MX: Add i.MX6 System Reset Controller device.
Le 27/02/2016 18:43, Peter Maydell a écrit : On 27 February 2016 at 16:57, Jean-Christophe DUBOIS wrote: Hi Peter and Peter, I need to test that the changes I did for PSCI (factor out on/off code) do not introduce any regression. On which QEMU target should I test my changes to PSCI to check I didn't mess up anything? The 'virt' board uses PSCI. Is there some quick howto somewhere on the 'virt' plateform (which linux defconfig to use, which DT tree to use, which command line t use, ...) ? Thanks JC thanks -- PMM
[Qemu-devel] [PATCH] kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35
The split IRQ chip mode via KVM_CAP_SPLIT_IRQCHIP was introduced with commit 15eafc2e602ff8c37c6e132eb8c63fec8fc17175 but was broken for q35. This patch makes kernel_irqchip=split functional for q35. --- hw/i386/pc_q35.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 46522c9..ff6a5af 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -39,6 +39,7 @@ #include "hw/kvm/clock.h" #include "hw/pci-host/q35.h" #include "exec/address-spaces.h" +#include "hw/i386/pc.h" #include "hw/i386/ich9.h" #include "hw/smbios/smbios.h" #include "hw/ide/pci.h" @@ -145,7 +146,7 @@ static void pc_q35_init(MachineState *machine) /* irq lines */ gsi_state = g_malloc0(sizeof(*gsi_state)); -if (kvm_irqchip_in_kernel()) { +if (kvm_ioapic_in_kernel()) { kvm_pc_setup_irq_routing(pcmc->pci_enabled); gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state, GSI_NUM_PINS); @@ -192,7 +193,7 @@ static void pc_q35_init(MachineState *machine) /*end early*/ isa_bus_irqs(isa_bus, gsi); -if (kvm_irqchip_in_kernel()) { +if (kvm_pit_in_kernel()) { i8259 = kvm_i8259_init(isa_bus); } else if (xen_enabled()) { i8259 = xen_interrupt_controller_init(); -- 2.7.1
[Qemu-devel] [PATCH v4 0/2] doc/memory: cleanup
see each commit message Cao jin (2): doc/memory: fix inconsistency between code and doc doc/memory: remove the stray extra '-' docs/memory.txt | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) -- 2.1.0
[Qemu-devel] [PATCH v4 2/2] doc/memory: remove the stray extra '-'
Signed-off-by: Cao jin Reviewed-by: Peter Maydell --- docs/memory.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/memory.txt b/docs/memory.txt index 746d794..cd6657f 100644 --- a/docs/memory.txt +++ b/docs/memory.txt @@ -185,8 +185,8 @@ an MMIO region mapped at 0x0, size 0x6000, priority 2. B currently has two of its own subregions: D of size 0x1000 at offset 0 and E of size 0x1000 at offset 0x2000. As a diagram: -0 1000 2000 3000 4000 5000 6000 70008000 -|--|--|--|--|--|--|--|---| +0 1000 2000 3000 4000 5000 6000 7000 8000 +|--|--|--|--|--|--|--|--| A:[ ] C:[] B: [ ] -- 2.1.0
[Qemu-devel] [PATCH v4 1/2] doc/memory: fix inconsistency between code and doc
change ".valid.aligned" to ".valid.unaligned", and also modify its description, make the text parallel to the existing .impl.unaligned doc. Signed-off-by: Cao jin --- docs/memory.txt | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/memory.txt b/docs/memory.txt index 8745f76..746d794 100644 --- a/docs/memory.txt +++ b/docs/memory.txt @@ -297,8 +297,9 @@ various constraints can be supplied to control how these callbacks are called: - .valid.min_access_size, .valid.max_access_size define the access sizes (in bytes) which the device accepts; accesses outside this range will have device and bus specific behaviour (ignored, or machine check) - - .valid.aligned specifies that the device only accepts naturally aligned - accesses. Unaligned accesses invoke device and bus specific behaviour. + - .valid.unaligned specifies that the *device being modelled* supports + unaligned accesses; if false, unaligned accesses will invoke the appropriate + bus or CPU specific behaviour. - .impl.min_access_size, .impl.max_access_size define the access sizes (in bytes) supported by the *implementation*; other access sizes will be emulated using the ones available. For example a 4-byte write will be -- 2.1.0
Re: [Qemu-devel] [PATCH v3 1/2] doc/memory: fix typo
On 02/26/2016 07:36 PM, Peter Maydell wrote: On 26 February 2016 at 11:29, Cao jin wrote: Needs to be a semicolon before the "if". Either "something something. If foo" or "something something; if foo" are correct as punctuation. Since we're aiming to follow the same construction as the existing text for .impl.unaligned, we want the semicolon and lowercase. interesting punctuation knowledge to me(as a non-native speaker). Thanks for the comment:) -- Yours Sincerely, Cao jin
[Qemu-devel] ODP: Quick folder sharing to arm quest from host
So it seem that again it was my lack of knowledge from virtualization area. It was thta simple as adding virtio-mmio device to machine model. Now I can easy share files between host and arm quest. Thanks, Marcin Od: Krzeminski, Marcin (Nokia - PL/Wroclaw) Wysłano: 25 lutego 2016 13:18 Do: qemu-devel@nongnu.org DW: Peter Maydell Temat: Quick folder sharing to arm quest from host Hello, I need to upload as fast as possible an app to arm linux quest from x86 host. If I understand correctly 9p and can not be used since it needs working kvm. There is build in auto configuration for samba that can be used but it generates problem with privileges, but maybe there is other way to share folder from host to quest? Best regards, Marcin Krzemiński