>-----Original Message----- >From: Liu, Yi L <yi.l....@intel.com> >Subject: Re: [PATCH v5 12/21] intel_iommu: Handle PASID entry addition > >On 2025/9/1 17:03, Duan, Zhenzhong wrote: > >>>> } >>>> >>>> +/* >>>> + * This function walks over PASID range within [start, end) in a single >>>> + * PASID table for entries matching @info type/did, then retrieve/create >>>> + * vtd_as and fill associated pasid entry cache. >>>> + */ >>>> +static void vtd_sm_pasid_table_walk_one(IntelIOMMUState *s, >>>> + dma_addr_t pt_base, >>>> + int start, >>>> + int end, >>>> + VTDPASIDCacheInfo >>> *info) >>>> +{ >>>> + VTDPASIDEntry pe; >>>> + int pasid = start; >>>> + >>>> + while (pasid < end) { >>>> + if (!vtd_get_pe_in_pasid_leaf_table(s, pasid, pt_base, &pe) >>>> + && vtd_pe_present(&pe)) { >>>> + int bus_n = pci_bus_num(info->bus), devfn = info->devfn; >>>> + uint16_t sid = PCI_BUILD_BDF(bus_n, devfn); >>>> + VTDPASIDCacheEntry *pc_entry; >>>> + VTDAddressSpace *vtd_as; >>>> + >>>> + vtd_iommu_lock(s); >>>> + /* >>>> + * When indexed by rid2pasid, vtd_as should have been >>> created, >>>> + * e.g., by PCI subsystem. For other iommu pasid, we >need >>> to >>>> + * create vtd_as dynamically. Other iommu pasid is same >>> value >>> since you don't support somthing else than rid2pasid, I would drop that >>> and simplify the code. See below. >>>> + * as PCI's pasid, so it's used as input of >vtd_find_add_as(). >>>> + */ >>>> + vtd_as = vtd_as_from_iommu_pasid_locked(s, sid, pasid); >>>> + vtd_iommu_unlock(s); >>>> + if (!vtd_as) { >>>> + vtd_as = vtd_find_add_as(s, info->bus, devfn, pasid); >>> you could check the vtd_as already exists here per the rid2pasid support >>> limitation >> >> In this series, I do include some basic codes for non-rid2pasid because they >share some common code with rid2pasid and we already have emulated >rid2pasid support in vIOMMU for a long time, it's not bad to accumulate some >supporting code for non-rid2pasid for passthrough device. But I can do the >factor out if you insist to have only rid_pasid code. > >I think it's a reasonable ask. :)
OK, will do. > >> >>>> + } >>>> + >>>> + if ((info->type == VTD_PASID_CACHE_DOMSI || >>>> + info->type == VTD_PASID_CACHE_PASIDSI) && >>>> + (info->did != VTD_SM_PASID_ENTRY_DID(&pe))) { >>>> + /* >>>> + * VTD_PASID_CACHE_DOMSI and >>> VTD_PASID_CACHE_PASIDSI >>>> + * requires domain id check. If domain id check fail, >>> fails >>>> + * go to next pasid. >>>> + */ >>>> + pasid++; >>>> + continue; >>>> + } >>>> + >>>> + pc_entry = &vtd_as->pasid_cache_entry; >>>> + /* >>>> + * pasid cache update and clear are handled in >>>> + * vtd_flush_pasid_locked(), only care new pasid entry >>> here. >>>> + */ >>>> + if (!pc_entry->valid) { >>>> + pc_entry->pasid_entry = pe; >>>> + pc_entry->valid = true; >>>> + } >>>> + } >>>> + pasid++; >>>> + } >>>> +} >>>> + >>>> +/* >>>> + * In VT-d scalable mode translation, PASID dir + PASID table is used. >>>> + * This function aims at looping over a range of PASIDs in the given >>>> + * two level table to identify the pasid config in guest. >>>> + */ >>>> +static void vtd_sm_pasid_table_walk(IntelIOMMUState *s, >>>> + dma_addr_t pdt_base, >>>> + int start, int end, >>>> + VTDPASIDCacheInfo *info) >>>> +{ >>>> + VTDPASIDDirEntry pdire; >>>> + int pasid = start; >>>> + int pasid_next; >>>> + dma_addr_t pt_base; >>>> + >>>> + while (pasid < end) { >>>> + pasid_next = >>>> + (pasid + VTD_PASID_TBL_ENTRY_NUM) & >>> ~(VTD_PASID_TBL_ENTRY_NUM - 1); >>>> + pasid_next = pasid_next < end ? pasid_next : end; >>>> + >>>> + if (!vtd_get_pdire_from_pdir_table(pdt_base, pasid, &pdire) >>>> + && vtd_pdire_present(&pdire)) { >>>> + pt_base = pdire.val & >>> VTD_PASID_TABLE_BASE_ADDR_MASK; >>>> + vtd_sm_pasid_table_walk_one(s, pt_base, pasid, >>> pasid_next, info); >>>> + } >>>> + pasid = pasid_next; >>>> + } >>>> +} >>>> + >>>> +static void vtd_replay_pasid_bind_for_dev(IntelIOMMUState *s, >>>> + int start, int end, >>>> + VTDPASIDCacheInfo >>> *info) >>>> +{ >>>> + VTDContextEntry ce; >>>> + >>>> + if (!vtd_dev_to_context_entry(s, pci_bus_num(info->bus), >>> info->devfn, >>>> + &ce)) { >>>> + uint32_t max_pasid; >>>> + >>>> + max_pasid = vtd_sm_ce_get_pdt_entry_num(&ce) * >>> VTD_PASID_TBL_ENTRY_NUM; >>>> + if (end > max_pasid) { >>>> + end = max_pasid; >>>> + } >>>> + vtd_sm_pasid_table_walk(s, >>>> + >>> VTD_CE_GET_PASID_DIR_TABLE(&ce), >>>> + start, >>>> + end, >>>> + info); >>>> + } >>>> +} >>>> + >>>> +/* >>>> + * This function replays the guest pasid bindings by walking the two level >>>> + * guest PASID table. For each valid pasid entry, it finds or creates a >>>> + * vtd_as and caches pasid entry in vtd_as. >>>> + */ >>>> +static void vtd_replay_guest_pasid_bindings(IntelIOMMUState *s, >>>> + >VTDPASIDCacheInfo >>> *pc_info) >>>> +{ >>>> + /* >>>> + * Currently only Requests-without-PASID is supported, as >vIOMMU >>> doesn't >>>> + * support RPS(RID-PASID Support), pasid scope is fixed to [0, 1). >>>> + */ >>>> + int start = 0, end = 1; >>>> + VTDHostIOMMUDevice *vtd_hiod; >>>> + VTDPASIDCacheInfo walk_info; >>>> + GHashTableIter as_it; >>>> + >>>> + switch (pc_info->type) { >>>> + case VTD_PASID_CACHE_PASIDSI: >>>> + start = pc_info->pasid; >>>> + end = pc_info->pasid + 1; >>> if you never replay a range, you could simplify the code for now because >>> some code paths are not properly tested >> >> OK. Instead of assignment of start and end variable, maybe just an >assert(!pc_info->pasid). > >I think there are two reasons for this range replay. > >1) as a preparation for patch 16 of this series. >2) support domain selective or global pasid cache invalidation Exactly. > >> >>>> + /* fall through */ >>>> + case VTD_PASID_CACHE_DOMSI: >>> Why can't we have other invalidation types along with request without >>> PASID? It is not obvious to me at least why it couldn't be used by the >>> guest. Would deserve a comment in the commit desc I think. >> >> Other invalidation types are indeed used, just in pasid scope [0, 1), because >[start, end) are already initialized to [0, 1), nothing more here, so just >break. > >hmmm. The fixed scope makes the range replay a fake one. It's better >holding the range replay logic for now and add it when there is >non-rid_pasid support for passthrough devices. Yes, will delete the non-rid_pasid code. Thanks Zhenzhong