On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
>
> When a vPCI is removed for a PCI device it is possible that we have
> scheduled a delayed work for map/unmap operations for that device.
> For example, the following scenario can illustrate the problem:
>
> pci_physdev_op
>     pci_add_device
>         init_bars -> modify_bars -> defer_map ->
>             raise_softirq(SCHEDULE_SOFTIRQ)
>     iommu_add_device <- FAILS
>     vpci_remove_device -> xfree(pdev->vpci)
>
> leave_hypervisor_to_guest
>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>
> For the hardware domain we continue execution as the worse that
> could happen is that MMIO mappings are left in place when the
> device has been deassigned
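
For reference, the consumer of the deferred work looks roughly like this
(a simplified sketch of vpci_process_pending(), not the verbatim code; the
decoding update and error handling are abbreviated):

bool vpci_process_pending(struct vcpu *v)
{
    if ( v->vpci.mem )                      /* deferred {un}map pending */
    {
        struct map_data data = {
            .d = v->domain,
            .map = v->vpci.map,
        };
        int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data);

        if ( rc == -ERESTART )
            return true;

        /*
         * The vCPU still holds the pdev recorded when the work was
         * deferred; if vpci_remove_device() has run in the meantime,
         * pdev->vpci has been freed and the dereference below hits
         * freed (or NULL) memory.
         */
        spin_lock(&v->vpci.pdev->vpci->lock);
        modify_decoding(v->vpci.pdev, v->vpci.cmd, v->vpci.rom_only);
        spin_unlock(&v->vpci.pdev->vpci->lock);

        rangeset_destroy(v->vpci.mem);
        v->vpci.mem = NULL;
    }

    return false;
}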
Is continuing safe in this case? I.e. isn't there the risk of a NULL
deref?

> For unprivileged domains that get a failure in the middle of a vPCI
> {un}map operation we need to destroy them, as we don't know in which
> state the p2m is. This can only happen in vpci_process_pending for
> DomUs as they won't be allowed to call pci_add_device.

You saying "we need to destroy them" made me look for a new
domain_crash() that you add, but there is none. What is this about?

> @@ -165,6 +164,18 @@ bool vpci_process_pending(struct vcpu *v)
>          return false;
>      }
>  
> +void vpci_cancel_pending(const struct pci_dev *pdev)
> +{
> +    struct vcpu *v = current;
> +
> +    /* Cancel any pending work now. */

Doesn't "any" include pending work on all vCPU-s of the guest, not just
current? Is current even relevant (as in: part of the correct domain),
considering ...

> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -51,6 +51,8 @@ void vpci_remove_device(struct pci_dev *pdev)
>          xfree(r);
>      }
>      spin_unlock(&pdev->vpci->lock);
> +
> +    vpci_cancel_pending(pdev);

... this code path, when coming here from pci_{add,remove}_device()?

I can agree that there's a problem here, but I think you need to
properly (i.e. in a race free manner) drain pending work.

Jan
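
For illustration, one possible shape of such a drain, covering all vCPU-s
of the device's owner rather than just current (an untested sketch only;
using vcpu_pause() as the synchronisation point is an assumption made
here, not something established in this thread):

void vpci_cancel_pending(const struct pci_dev *pdev)
{
    struct domain *d = pdev->domain;
    struct vcpu *v;

    for_each_vcpu ( d, v )
    {
        if ( v->vpci.pdev != pdev )
            continue;

        /*
         * Stop the vCPU from (re)entering vpci_process_pending() while
         * its deferred state is torn down.  The current vCPU cannot
         * pause itself, but it also cannot be inside the softirq
         * handler while executing this function.
         */
        if ( v != current )
            vcpu_pause(v);

        rangeset_destroy(v->vpci.mem);
        v->vpci.mem = NULL;
        v->vpci.pdev = NULL;

        if ( v != current )
            vcpu_unpause(v);
    }
}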