On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
>
> When a vPCI is removed for a PCI device it is possible that we have
> scheduled a delayed work for map/unmap operations for that device.
> For example, the following scenario can illustrate the problem:
>
> pci_physdev_op
>     pci_add_device
>         init_bars -> modify_bars -> defer_map ->
>             raise_softirq(SCHEDULE_SOFTIRQ)
>     iommu_add_device <- FAILS
>     vpci_remove_device -> xfree(pdev->vpci)
>
> leave_hypervisor_to_guest
>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>
> For the hardware domain we continue execution as the worse that
> could happen is that MMIO mappings are left in place when the
> device has been deassigned
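
For reference, the consumer of the deferred work looks roughly like this
(a simplified sketch of vpci_process_pending(), not the verbatim code; the
decoding update and error handling are abbreviated):

bool vpci_process_pending(struct vcpu *v)
{
    if ( v->vpci.mem )                      /* deferred {un}map pending */
    {
        struct map_data data = {
            .d = v->domain,
            .map = v->vpci.map,
        };
        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);

        if ( rc == -ERESTART )
            return true;

        /*
         * The vCPU still holds the pdev recorded when the work was
         * deferred; if vpci_remove_device() has run in the meantime,
         * pdev->vpci has been freed and the dereference below hits
         * freed (or NULL) memory.
         */
        spin_lock(&v->vpci.pdev->vpci->lock);
        modify_decoding(v->vpci.pdev, v->vpci.cmd, v->vpci.rom_only);
        spin_unlock(&v->vpci.pdev->vpci->lock);

        rangeset_destroy(v->vpci.mem);
        v->vpci.mem = NULL;
    }

    return false;
}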
Is continuing safe in this case? I.e. isn't there the risk of a NULL
deref?

> For unprivileged domains that get a failure in the middle of a vPCI
> {un}map operation we need to destroy them, as we don't know in which
> state the p2m is. This can only happen in vpci_process_pending for
> DomUs as they won't be allowed to call pci_add_device.

You saying "we need to destroy them" made me look for a new
domain_crash() that you add, but there is none. What is this about?

> @@ -165,6 +164,18 @@ bool vpci_process_pending(struct vcpu *v)
>          return false;
>      }
>  
> +void vpci_cancel_pending(const struct pci_dev *pdev)
> +{
> +    struct vcpu *v = current;
> +
> +    /* Cancel any pending work now. */

Doesn't "any" include pending work on all vCPU-s of the guest, not just
current? Is current even relevant (as in: part of the correct domain),
considering ...

> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -51,6 +51,8 @@ void vpci_remove_device(struct pci_dev *pdev)
>          xfree(r);
>      }
>      spin_unlock(&pdev->vpci->lock);
> +
> +    vpci_cancel_pending(pdev);

... this code path, when coming here from pci_{add,remove}_device()?

I can agree that there's a problem here, but I think you need to
properly (i.e. in a race free manner) drain pending work.

Jan
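
For illustration, one possible shape of such a drain, covering all vCPU-s
of the device's owner rather than just current (an untested sketch only;
using vcpu_pause() as the synchronisation point is an assumption made
here, not something established in this thread):

void vpci_cancel_pending(const struct pci_dev *pdev)
{
    struct domain *d = pdev->domain;
    struct vcpu *v;

    for_each_vcpu ( d, v )
    {
        if ( v->vpci.pdev != pdev )
            continue;

        /*
         * Stop the vCPU from (re)entering vpci_process_pending() while
         * its deferred state is torn down.  The current vCPU cannot
         * pause itself, but it also cannot be inside the softirq
         * handler while executing this function.
         */
        if ( v != current )
            vcpu_pause(v);

        rangeset_destroy(v->vpci.mem);
        v->vpci.mem = NULL;
        v->vpci.pdev = NULL;

        if ( v != current )
            vcpu_unpause(v);
    }
}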