common: Support device dirty page tracking with vIOMMU

Joao Martins Thu, 23 Feb 2023 13:30:37 -0800

On 23/02/2023 20:55, Jason Gunthorpe wrote:
> On Thu, Feb 23, 2023 at 01:06:33PM -0700, Alex Williamson wrote:
>>> #2 is the presumption that the guest is using an identity map.
>> Isn't it reasonable to require that a device support dirty tracking for
>> the entire extent of its DMA address width in order to support this
>> feature?
> 
> No, 2**64 is too big a number to be reasonable.
> 
+1


> Ideally we'd work it the other way and tell the vIOMMU that the vHW
> only supports a limited number of address bits for the translation, eg
> through the ACPI tables. Then the dirty tracking could safely cover
> the larger of all system memory or the limited IOVA address space.
> 
> Or even better figure out how to get interrupt remapping without IOMMU
> support :\

FWIW, that's generally why I use `iommu=pt`: all I want is interrupt
remapping, not the DMA remapping part. And this is going to be especially
relevant with these new boxes that easily surpass the 255 dedicated physical
CPUs mark with just two sockets.

The only other alternative I can see is to rely on an IOMMU attribute for DMA
translation. Today you can actually toggle that 'off' in VT-d (and I can imagine
the same thing working for AMD-vIOMMU). On Intel it just omits the 39-bit
address-width cap, which means it doesn't have virtual addressing. Similar to
what Avihai already does for MAX_IOVA, we would do the same for DMA_TRANSLATION,
and let each vIOMMU implementation support that.
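
As a rough sketch of that idea (all names below are hypothetical and not code
from this series): if each vIOMMU reported whether it does DMA translation and
how many IOVA bits it advertises, the dirty tracking range could be picked
along these lines. It assumes that a guest without DMA translation can only
DMA to guest RAM, and otherwise covers the larger of system memory or the
limited IOVA space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vIOMMU attributes; this is not QEMU's actual API. */
struct viommu_attrs {
    bool dma_translation; /* false: no DMA remapping offered to the guest */
    uint8_t aw_bits;      /* IOVA address width advertised to the guest */
};

/*
 * Highest address (inclusive) that device dirty tracking would need to
 * cover. ram_top is the highest guest physical address (inclusive).
 */
static uint64_t dirty_tracking_max_addr(const struct viommu_attrs *a,
                                        uint64_t ram_top)
{
    uint64_t max_iova;

    assert(a->aw_bits >= 1 && a->aw_bits <= 64);

    /* Assumption: without DMA translation, DMA only targets guest RAM. */
    if (!a->dma_translation) {
        return ram_top;
    }

    /* Avoid shifting by 64, which is undefined behaviour in C. */
    max_iova = a->aw_bits == 64 ? UINT64_MAX
                                : (UINT64_C(1) << a->aw_bits) - 1;

    /* Cover the larger of system memory or the limited IOVA space. */
    return max_iova > ram_top ? max_iova : ram_top;
}
```

With a 39-bit width and 4G of guest RAM this would track up to 2^39 - 1, while
with DMA translation off it would only need to track guest RAM.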

But to be honest I am not sure how robust relying on that is, as it doesn't
really represent a hardware implementation. Without a vIOMMU you have a (KVM) PV
op in new *guest* kernels that (ab)uses some unused bits in the IOAPIC for a
24-bit DestID. But this only works with new guests and hypervisors; old *guests*
running kernels older than 5.15 won't work.

... So iommu=pt really is the most convenient right now :/

        Joao

Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU
