>-----Original Message----- >From: Liu, Yi L <yi.l....@intel.com> >Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache >utilization > >On 2025/6/6 18:04, Zhenzhong Duan wrote: >> There are many call sites referencing context entry by calling >> vtd_dev_to_context_entry() which will traverse the DMAR table. >> >> In most cases we can use cached context entry in vtd_as->context_cache_entry >> except when its entry is stale. Currently only global and domain context >> invalidation stale it. >> >> So introduce a helper function vtd_as_to_context_entry() to fetch from cache >> before trying with vtd_dev_to_context_entry(). > >The cached context entry is now protected by vtd_iommu_lock(). While not >all caller of vtd_dev_to_context_entry() are under this lock. > >Also, the cached context entry is created in the translate path. IMHO, >this path is not supposed to be triggered for passthrough devices. >While this may need double check and may change in the future. But let's >see if any locking issue with the current code.
Good finding, yes. Previously I thought translation path updates cc_entry->context_entry after cc_entry->context_cache_gen. In vtd_as_to_context_entry() cc_entry->context_cache_gen is checked first, so there was no real race. But I still missed a memory barrier like below: @@ -2277,6 +2286,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, cc_entry->context_cache_gen, s->context_cache_gen); cc_entry->context_entry = ce; + smp_wmb(); cc_entry->context_cache_gen = s->context_cache_gen; } Another option I can think of is adding lock to cache reading like below: @@ -1659,11 +1659,15 @@ static int vtd_as_to_context_entry(VTDAddressSpace *vtd_as, VTDContextEntry *ce) uint8_t devfn = vtd_as->devfn; VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry; + vtd_iommu_lock(s); + /* Try to fetch context-entry from cache first */ if (cc_entry->context_cache_gen == s->context_cache_gen) { *ce = cc_entry->context_entry; + vtd_iommu_unlock(s); return 0; } else { + vtd_iommu_unlock(s); return vtd_dev_to_context_entry(s, bus_num, devfn, ce); } } Which one do you prefer? Thanks Zhenzhong