>-----Original Message-----
>From: Liu, Yi L <yi.l....@intel.com>
>Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache
>utilization
>
>On 2025/6/6 18:04, Zhenzhong Duan wrote:
>> There are many call sites referencing context entry by calling
>> vtd_dev_to_context_entry() which will traverse the DMAR table.
>>
>> In most cases we can use cached context entry in vtd_as->context_cache_entry
>> except when its entry is stale. Currently only global and domain context
>> invalidation stale it.
>>
>> So introduce a helper function vtd_as_to_context_entry() to fetch from cache
>> before trying with vtd_dev_to_context_entry().
>
>The cached context entry is now protected by vtd_iommu_lock(). While not
>all caller of vtd_dev_to_context_entry() are under this lock.
>
>Also, the cached context entry is created in the translate path. IMHO,
>this path is not supposed to be triggered for passthrough devices.
>While this may need double check and may change in the future. But let's
>see if any locking issue with the current code.

Good finding, yes.
Previously I thought translation path updates cc_entry->context_entry after 
cc_entry->context_cache_gen.
In vtd_as_to_context_entry() cc_entry->context_cache_gen is checked first, so 
there was no real race.
But I still missed a memory barrier like below:

@@ -2277,6 +2286,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
                                   cc_entry->context_cache_gen,
                                   s->context_cache_gen);
         cc_entry->context_entry = ce;
+        smp_wmb();
         cc_entry->context_cache_gen = s->context_cache_gen;
     }
 
Another option I can think of is adding lock to cache reading like below:

@@ -1659,11 +1659,15 @@ static int vtd_as_to_context_entry(VTDAddressSpace 
*vtd_as, VTDContextEntry *ce)
     uint8_t devfn = vtd_as->devfn;
     VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;

+    vtd_iommu_lock(s);
+
     /* Try to fetch context-entry from cache first */
     if (cc_entry->context_cache_gen == s->context_cache_gen) {
         *ce = cc_entry->context_entry;
+        vtd_iommu_unlock(s);
         return 0;
     } else {
+        vtd_iommu_unlock(s);
         return vtd_dev_to_context_entry(s, bus_num, devfn, ce);
     }
 }

Which one do you prefer?

Thanks
Zhenzhong

Reply via email to