On Tue, Jan 22, 2019 at 01:24:48AM -0700, Jan Beulich wrote:
>>>> On 22.01.19 at 06:50, <chao....@intel.com> wrote:
>> On Wed, Jan 16, 2019 at 11:38:23AM +0100, Roger Pau Monné wrote:
>>>On Wed, Jan 16, 2019 at 04:17:30PM +0800, Chao Gao wrote:
>>>> @@ -1529,6 +1591,8 @@ int deassign_device(struct domain *d, u16 seg, u8
>>>> bus, u8 devfn)
>>>>      if ( !pdev )
>>>>          return -ENODEV;
>>>>
>>>> +    pci_unmap_msi(pdev);
>>>
>>>Just want to make sure, since deassign_device will be called for both
>>>PV and HVM domains. AFAICT pci_unmap_msi is safe to call when the
>>>device is assigned to a PV guest, but would like your confirmation.
>>
>> Tested with a PV guest loaded by Pygrub. PV guest doesn't suffer the
>> msi-x issue I want to fix.
>>
>> With these three patches applied, I got some error messages from Xen
>> and Dom0 as follow:
>>
>> (XEN) irq.c:2176: dom3: forcing unbind of pirq 332
>> (XEN) irq.c:2176: dom3: forcing unbind of pirq 331
>> (XEN) irq.c:2176: dom3: forcing unbind of pirq 328
>> (XEN) irq.c:2148: dom3: pirq 359 not mapped
>> [ 2887.067685] xen:events: unmap irq failed -22
>> (XEN) irq.c:2148: dom3: pirq 358 not mapped
>> [ 2887.075917] xen:events: unmap irq failed -22
>> (XEN) irq.c:2148: dom3: pirq 357 not mapped
>>
>> It seems, the cause of such error is that pirq-s are unmapped and forcibly
>> unbound on deassignment; subsequent unmapping pirq issued by dom0 fail.
>> From some aspects, this error is expected. Because with this patch,
>> pirq-s are expected to be mapped by qemu or dom0 kernel (for pv case) before
>> deassignment and mapping/binding pirq after deassignment should fail.
>>
>> So what's your opinion on handling such error? We should figure out another
>> method to fix msi-x issue to avoid such error or suppress these errors in
>> qemu and linux kernel?
>
>The "forcing unbind" ones are probably fine to leave alone, but
>the errors would better be avoided in Xen (i.e. without a need
>to also change qemu and/or Linux). Since you don't really say
>when / why these errors now surface, it's hard to suggest what
>might be best to do.

With these patches applied, the errors surface in three cases:

1. destroying a PV guest with assigned devices via "xl destroy"
2. hot-unplugging an assigned device from a PV guest
3. shutting down a PV guest by executing "init 0" in the guest (only for
   devices whose driver doesn't clean up MSI-X on shutdown)

The reason is that, when detaching a device from a domain, the toolstack
always calls xc_deassign_device() before
libxl__device_pci_remove_xenstore(). The latter notifies xen_pciback to
clean up the PCI device. I guess unbinding and unmapping the pirqs are
steps of that cleanup (just like qemu's role in device deassignment for an
HVM guest). But with this patch, the pirqs are already forcibly unmapped
when xc_deassign_device() is called. So when xen_pciback then tries to
unmap the pirqs as usual, Xen reports that the pirq isn't mapped and
propagates the error to xen_pciback.

Thanks
Chao
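
P.S. To make the ordering concrete, here is a toy model in C. It is only a
sketch under stated assumptions: every name in it (model_domain,
model_map_pirq, model_unmap_pirq, model_deassign_device,
model_backend_cleanup, model_detach) is made up for illustration and does
not exist in Xen, libxl or xen_pciback. The only detail taken from the log
above is that the failing unmap returns -22, i.e. -EINVAL.

```c
/* Toy model only: none of these names exist in Xen, libxl or xen_pciback. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PIRQS 8

/* Hypothetical per-domain state: which pirqs are currently mapped. */
struct model_domain {
    bool pirq_mapped[MODEL_NR_PIRQS];
};

/* Map a pirq, as qemu or the dom0 kernel would do before deassignment. */
int model_map_pirq(struct model_domain *d, unsigned int pirq)
{
    assert(d != NULL);
    if ( pirq >= MODEL_NR_PIRQS )
        return -EINVAL;
    d->pirq_mapped[pirq] = true;
    return 0;
}

/* Unmap a pirq; fails with -EINVAL if the pirq is not (or no longer) mapped. */
int model_unmap_pirq(struct model_domain *d, unsigned int pirq)
{
    assert(d != NULL);
    if ( pirq >= MODEL_NR_PIRQS || !d->pirq_mapped[pirq] )
        return -EINVAL;
    d->pirq_mapped[pirq] = false;
    return 0;
}

/* Step 1: deassignment; with the patch it forcibly unmaps every pirq. */
void model_deassign_device(struct model_domain *d, bool forced_unmap)
{
    if ( !forced_unmap )
        return;
    for ( unsigned int pirq = 0; pirq < MODEL_NR_PIRQS; pirq++ )
        d->pirq_mapped[pirq] = false;
}

/* Step 2: the backend cleanup still unmaps the pirq it believes it owns. */
int model_backend_cleanup(struct model_domain *d, unsigned int pirq)
{
    return model_unmap_pirq(d, pirq);
}

/* Replay the detach sequence and return what the backend cleanup sees. */
int model_detach(unsigned int pirq, bool forced_unmap)
{
    struct model_domain d = { { false } };
    int rc = model_map_pirq(&d, pirq);

    if ( rc )
        return rc;
    model_deassign_device(&d, forced_unmap);
    return model_backend_cleanup(&d, pirq);
}
```

In this model, model_detach(3, true) returns -EINVAL, which mirrors the
"unmap irq failed -22" messages, while model_detach(3, false) returns 0
because the cleanup step is then the first and only unmap.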