On 2022/1/14 15:22, Jason Wang wrote:
On Fri, Jan 14, 2022 at 3:13 PM Peter Xu <pet...@redhat.com> wrote:
On Fri, Jan 14, 2022 at 01:58:07PM +0800, Jason Wang wrote:
Right, but I think you meant to do this only when scalable mode is disabled.
Yes IMHO it will definitely suit the !scalable case, since that's exactly what
we did before. What I'm also wondering is whether, even if scalable mode is
enabled but no "real" pasid is used, so that all the translations go through the
default pasid that is stored in the device context entry, we can skip checking it.
The latter is the "hacky" part mentioned above.
The problem I see is that we can't know what PASID is used as default
without reading the context entry?
Can the default NO_PASID be used in a mixture with !NO_PASID use cases on the
same device? If that's possible, then I agree..
My understanding is that it is possible.
My previous idea was based on the assumption that if NO_PASID is used on one
device, then all translations will be based on NO_PASID, but now I'm not sure
of it.
Actually, what I meant is:
device 1 using transactions without PASID with RID2PASID 1
device 2 using transactions without PASID with RID2PASID 2
Interesting series, Jason.
I haven't read through all your code yet, so just a quick comment. RID2PASID 1
and RID2PASID 2 may be the same. The VT-d spec defines an RPS bit in the ecap
register. If it is reported as 0, the RID_PASID (previously called
RID2PASID :-)) field of the scalable mode context entry is not supported, and
a PASID value of 0 will be used for transactions without PASID. So in the
code, you may check the RPS bit to see whether the RID_PASID value is the
same for all devices.
Regards,
Yi Liu
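The RPS check described above could be sketched roughly as follows. This is only an illustration, not code from the series: the `sketch_*` names and the `SketchContextEntry` type are hypothetical, and the bit positions (RPS at bit 49 of the extended capability register, RID_PASID in the low 20 bits of the second 64-bit word of the scalable-mode context entry) are assumptions that should be checked against the VT-d spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: RPS is bit 49 of the extended capability (ecap) register. */
#define SKETCH_ECAP_RPS          (1ULL << 49)
/* Assumption: RID_PASID is the low 20 bits of the entry's second qword. */
#define SKETCH_CE_RID_PASID_MASK 0xfffffULL

/* Hypothetical stand-in for a 256-bit scalable-mode context entry. */
typedef struct SketchContextEntry {
    uint64_t val[4];
} SketchContextEntry;

/* True if the (virtual) IOMMU reports support for the RID_PASID field. */
static bool sketch_rps_supported(uint64_t ecap)
{
    return (ecap & SKETCH_ECAP_RPS) != 0;
}

/*
 * PASID to use for a request that carries no PASID.
 * With RPS == 0 the field is not supported and PASID 0 is used for every
 * device; otherwise the value is per-device and must be read from the
 * context entry.
 */
static uint32_t sketch_default_pasid(uint64_t ecap,
                                     const SketchContextEntry *ce)
{
    assert(ce != NULL);
    if (!sketch_rps_supported(ecap)) {
        return 0;
    }
    return (uint32_t)(ce->val[1] & SKETCH_CE_RID_PASID_MASK);
}
```

Under these assumptions, only the RPS == 0 case allows a fixed default PASID without reading the context entry first.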
Then we can't assume a default pasid here.
The other thing to mention is, if we postpone the iotlb lookup to be after
context entry, then logically we can have per-device iotlb, that means we can
replace IntelIOMMUState.iotlb with VTDAddressSpace.iotlb in the future, too,
which can also be more efficient.
Right but we still need to limit the total slots and ATS is a better
way to deal with the IOTLB bottleneck actually.
I think it depends on how the iotlb ghash is implemented. Logically I think if
we can split the cache to be per-device it'll be slightly better, because we
don't need to iterate over iotlbs of other devices during lookup anymore;
meanwhile each iotlb entry takes less space too (no devfn needed anymore).
So we've already used sid in the IOTLB hash, I wonder how much we can
gain from this.
Thanks
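To make the trade-off concrete, here is a rough sketch of the two keying schemes being compared. The shift values and the `sketch_*` names are assumptions for illustration only, not the actual QEMU definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: gfn in bits 0..35, source id in 36..51, level in 52..55. */
#define SKETCH_SID_SHIFT 36
#define SKETCH_LVL_SHIFT 52

/* Key for a single global IOTLB hash shared by all devices. */
static uint64_t sketch_global_key(uint64_t gfn, uint16_t sid, uint32_t level)
{
    assert(gfn < (1ULL << SKETCH_SID_SHIFT));
    assert(level < 16);
    return gfn | ((uint64_t)sid << SKETCH_SID_SHIFT)
               | ((uint64_t)level << SKETCH_LVL_SHIFT);
}

/* Key for a hypothetical per-device IOTLB: the source id is implied. */
static uint64_t sketch_per_device_key(uint64_t gfn, uint32_t level)
{
    assert(gfn < (1ULL << SKETCH_SID_SHIFT));
    assert(level < 16);
    return gfn | ((uint64_t)level << SKETCH_LVL_SHIFT);
}
```

With the source id folded into the key, a hash lookup in the shared table already separates entries from different devices, so the per-device variant mainly saves the sid bits in each key.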
Thanks,
--
Peter Xu
--
Regards,
Yi Liu