On 19.04.2023 22:46, Andrew Cooper wrote:
> On 19/04/2023 11:46 am, Jan Beulich wrote:
>> We don't need to invalidate caches here; all we're after is that earlier
>> writes have made it to main memory (and aiui even that just in case).
>>
>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
>> ---
>> This, aiui, being an analogue to uses of iommu_sync_cache() (just not
>> range restricted), I wonder whether it shouldn't be conditional upon
>> iommu_non_coherent. Then again I'm vaguely under the impression that
>> we had been here before, possibly even as far as questioning the need
>> for this call altogether.
> 
> I'd far rather we fix it property than continue to massage around the
> sides of known-broken logic.

I agree in principle, but you're not asking that I actually amend this
single-line change with all the work you outline, are you? To be quite
honest, what you ask for really is something the VT-d maintainer(s)
(i.e. Kevin as it presently stands) ought to be doing (or really have
done long ago) ...

Jan

> Coherency, or not, of the memory accesses of an IOMMU is a per-IOMMU
> property, not a system-wide property.  What the iommu_non_coherent
> global boolean currently does for us is enforce cache maintenance on all
> IOMMUs, even the coherent ones, when any single IOMMU in the system is
> non-coherent.
> 
> Inside the IOMMU driver, cache maintenance should depend on iommu->ecap
> alone, disregarding anything else.  This will improve the performance on
> any system with mixed coherent and non-coherent IOMMUs, without any
> other behavioural impact.
> 
> The complication comes in in p2m-ept when we're sharing EPT and IOMMU
> pagetables, because the EPT can be accessed by more than one IOMMU if
> the guest has devices behind different IOMMUs.
> 
> But we know the devices assigned to the domain.  They're almost always
> static for the lifetime of the domain, and generally single device only,
> so there may be value in considering a per-domain "I've got a
> non-coherent IOMMU" flag, because we absolutely don't want to be walking
> the list of attached IOMMUs on each EPT update.
> 
> 
> But...
> 
> SandyBridge era systems are the only ones I'm aware of where the IOMMUs
> advertise themselves as non-coherent.  And on that generation we quirk
> the superpage capability off anyway, meaning they are ineligible for
> HAP-PT sharing anyway.
> 
> And if we are doing HAP-PT sharing, the cache maintenance for regular
> EPT updates will be crippling for CPU performance, especially as the
> software bit updates are not relevant to the IOMMU anyway.
> 
> A better option would be to simply disallow HAP-PT sharing when there's
> a non-coherent IOMMU in the system, or (slightly more fine grained) to
> disallow adding a device behind a non-coherent IOMMU to a domain using
> HAP-PT sharing (as there's a disable now anyway).
> 
> Either will reduce the complexity of Xen's code without any loss of
> functionality in real systems AFAICT.
> 
> ~Andrew


Reply via email to