Hi Alex,

On 19/09/2017 18:58, Alex Williamson wrote:
> With virtual PCI-Express chipsets, we now see userspace/guest drivers
> trying to match the physical MPS setting to a virtual downstream port.
> Of course a lone physical device surrounded by virtual interconnects
> cannot make a correct decision for a proper MPS setting.  Instead,
> let's virtualize the MPS control register so that writes through to
> hardware are disallowed.  Userspace drivers like QEMU assume they can
> write anything to the device and we'll filter out anything dangerous.
> Since mismatched MPS can lead to AER and other faults, let's add it
> to the kernel side rather than relying on userspace virtualization to
> handle it.
> 
> Signed-off-by: Alex Williamson <alex.william...@redhat.com>
> ---
> 
> Do we have any reason to suspect that a userspace driver has any
> dependencies on the physical MPS setting or is this only tuning the
> protocol layer and it's transparent to the driver?  Note that per the
> PCI spec, a device supporting only 128B MPS can hardwire the control
> register to 000b, but it doesn't seem PCIe compliant to hardwire it to
> any given value, such as would be the appearance if we exposed this as
> a read-only register rather than virtualizing it.  QEMU would then be
> responsible for virtualizing it, which makes coordinating the upgrade
> troublesome.
> 
>  drivers/vfio/pci/vfio_pci_config.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_config.c 
> b/drivers/vfio/pci/vfio_pci_config.c
> index 5628fe114347..91335e6de88a 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -849,11 +849,13 @@ static int __init init_pci_cap_exp_perm(struct 
> perm_bits *perm)
>  
>       /*
>        * Allow writes to device control fields, except devctl_phantom,
> -      * which could confuse IOMMU, and the ARI bit in devctl2, which
> +      * which could confuse IOMMU, MPS, which can break communication
> +      * with other physical devices, and the ARI bit in devctl2, which
>        * is set at probe time.  FLR gets virtualized via our writefn.
>        */
>       p_setw(perm, PCI_EXP_DEVCTL,
> -            PCI_EXP_DEVCTL_BCR_FLR, ~PCI_EXP_DEVCTL_PHANTOM);
> +            PCI_EXP_DEVCTL_BCR_FLR | PCI_EXP_DEVCTL_PAYLOAD,
> +            ~PCI_EXP_DEVCTL_PHANTOM);
>       p_setw(perm, PCI_EXP_DEVCTL2, NO_VIRT, ~PCI_EXP_DEVCTL2_ARI);
Is it correct that the read value still will be the one written by the
guest?

I see the MMRS can take the read MPS value in some pcie_bus_config
values. So a consequence could be that the applied MMRS (which is not
virtualized) is lower than what is set by host, due to a guest pcie root
port MPSS for instance.

So if the above is not totally wrong, shouldn't we virtualize MMRS as well?

Thanks

Eric

>       return 0;
>  }
> 

Reply via email to