On Tue, Jul 09, 2024 at 07:14:19AM +0000, Mostafa Saleh wrote: > Hi Jean, > > On Thu, Jul 04, 2024 at 07:12:35PM +0100, Jean-Philippe Brucker wrote: > > On Mon, Jul 01, 2024 at 11:02:31AM +0000, Mostafa Saleh wrote: > > > In the next patch, combine_tlb() will be added which combines 2 TLB > > > entries into one for nested translations, which chooses the granule > > > and level from the smallest entry. > > > > > > This means that with nested translation, an entry can be cached with > > > the granule of stage-2 and not stage-1. > > > > > > However, currently, the lookup for an IOVA is done with input stage > > > granule, which is stage-1 for nested configuration, which will not > > > work with the above logic. > > > This patch reworks lookup in that case, so it falls back to stage-2 > > > granule if no entry is found using stage-1 granule. > > > > Why not initialize tt_combined to the minimum granule of stages 1 and 2? > > It looks like you introduced it for this. I'm wondering if we lookup the > > wrong IOVA if changing the granule size after the address is masked in > > smmu_translate() > > I am not sure I fully understand, but I don’t think that would work as it is > not guaranteed that the minimum granule is the one that would be cached, > as we might hit block mappings. > > The IOVA at first is masked with the first stage mask for the expected page > address, and the lookup logic would mask the address for each level look up, > so It should match the alignment of the cached page of that granule and level, > and as the combine logic is done with the aligned_addr it is guaranteed by > construction that it has to be aligned with stage-1.
I missed something, this is what I had in mind initially: * s1 granule is 64k, s2 granule is 4k * the tlb already contains a translations for IOVA 0x30000, tg=4k * now we lookup IOVA 0x31000. Masked with the s1 granule, aligned_addr is 0x30000. Not found at first because lookup is with tg=64k, but then we call smmu_iotlb_lookup_all_levels() again with the s2 granule and the same IOVA, which returns the wrong translation But it's not actually possible, because if cfg->stage == SMMU_NESTED, then in smmu_translate() we end up with } else { /* Stage2. */ tt_combined.granule_sz = cfg->s2cfg.granule_sz; So I think the condition (cfg->stage == SMMU_NESTED) && (cfg->s2cfg.granule_sz != tt->granule_sz) in this patch is never true? Then the following scenario: * s1 granule is 4k, s2 granule is 64k * we lookup IOVA A, miss. The translation gets cached with granule 4k * we lookup IOVA A again, but with tt->granule_sz = 64k so we'll never find the entry? I guess we want to start the lookup with the smallest granule, and then if the s1 and s2 granules differ, retry with the other one. Or with SMMU_NESTED, start with the s1 granule and keep this patch to fallback to s2 granule, but without masking the IOVA in smmu_translate() (it will be masked correctly by smmu_iotlb_lookup_all_levels()). Thanks, Jean > > Thanks, > Mostafa > > > > > Thanks, > > Jean > > > > > > > > Signed-off-by: Mostafa Saleh <smost...@google.com> > > > --- > > > hw/arm/smmu-common.c | 36 ++++++++++++++++++++++++++++++++++-- > > > 1 file changed, 34 insertions(+), 2 deletions(-) > > > > > > diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c > > > index 21982621c0..0840b5cffd 100644 > > > --- a/hw/arm/smmu-common.c > > > +++ b/hw/arm/smmu-common.c > > > @@ -66,8 +66,10 @@ SMMUIOTLBKey smmu_get_iotlb_key(int asid, int vmid, > > > uint64_t iova, > > > return key; > > > } > > > > > > -SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg, > > > - SMMUTransTableInfo *tt, hwaddr iova) > > > +static SMMUTLBEntry *smmu_iotlb_lookup_all_levels(SMMUState *bs, > > > + SMMUTransCfg *cfg, > > > + SMMUTransTableInfo *tt, > > > + hwaddr iova) > > > { > > > uint8_t tg = (tt->granule_sz - 10) / 2; > > > uint8_t inputsize = 64 - tt->tsz; > > > @@ -88,6 +90,36 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, > > > SMMUTransCfg *cfg, > > > } > > > level++; > > > } > > > + return entry; > > > +} > > > + > > > +/** > > > + * smmu_iotlb_lookup - Look up for a TLB entry. > > > + * @bs: SMMU state which includes the TLB instance > > > + * @cfg: Configuration of the translation > > > + * @tt: Translation table info (granule and tsz) > > > + * @iova: IOVA address to lookup > > > + * > > > + * returns a valid entry on success, otherwise NULL. > > > + * In case of nested translation, tt can be updated to include > > > + * the granule of the found entry as it might different from > > > + * the IOVA granule. > > > + */ > > > +SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg *cfg, > > > + SMMUTransTableInfo *tt, hwaddr iova) > > > +{ > > > + SMMUTLBEntry *entry = NULL; > > > + > > > + entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova); > > > + /* > > > + * For nested translation also try the s2 granule, as the TLB will > > > insert > > > + * it if the size of s2 tlb entry was smaller. > > > + */ > > > + if (!entry && (cfg->stage == SMMU_NESTED) && > > > + (cfg->s2cfg.granule_sz != tt->granule_sz)) { > > > + tt->granule_sz = cfg->s2cfg.granule_sz; > > > + entry = smmu_iotlb_lookup_all_levels(bs, cfg, tt, iova); > > > + } > > > > > > if (entry) { > > > cfg->iotlb_hits++; > > > -- > > > 2.45.2.803.g4e1b14247a-goog > > >