> From: Jason Gunthorpe <[email protected]>
> Sent: Thursday, September 4, 2025 1:47 AM
> 
> AMD IOMMU v1 is unique in supporting contiguous pages with a variable size
> and it can decode the full 64 bit VA space. Unlike other x86 page tables
> this explicitly does not do sign extension as part of allowing the entire
> 64 bit VA space to be supported.
> 
> The general design is quite similar to the x86 PAE format, except with a
> 6th level and quite different PTE encoding.
> 
> This format is the only one that uses the PT_FEAT_DYNAMIC_TOP feature in
> the existing code as the existing AMDv1 code starts out with a 3 level
> table and adds levels on the fly if more IOVA is needed.
> 
> Comparing the performance of several operations to the existing version:
> 
> iommu_map()
>    pgsz  ,avg new,old ns, min new,old ns  , min % (+ve is better)
>      2^12,     65,64    ,      62,61      ,  -1.01
>      2^13,     70,66    ,      67,62      ,  -8.08
>      2^14,     73,69    ,      71,65      ,  -9.09
>      2^15,     78,75    ,      75,71      ,  -5.05
>      2^16,     89,89    ,      86,84      ,  -2.02
>      2^17,    128,121   ,     124,112     , -10.10
>      2^18,    175,175   ,     170,163     ,  -4.04
>      2^19,    264,306   ,     261,279     ,   6.06
>      2^20,    444,525   ,     438,489     ,  10.10
>      2^21,     60,62    ,      58,59      ,   1.01
>  256*2^12,    381,1833  ,     367,1795    ,  79.79
>  256*2^21,    375,1623  ,     356,1555    ,  77.77
>  256*2^30,    356,1338  ,     349,1277    ,  72.72
> 
> iommu_unmap()
>    pgsz  ,avg new,old ns, min new,old ns  , min % (+ve is better)
>      2^12,     76,89    ,      71,86      ,  17.17
>      2^13,     79,89    ,      75,86      ,  12.12
>      2^14,     78,90    ,      74,86      ,  13.13
>      2^15,     82,89    ,      74,86      ,  13.13
>      2^16,     79,89    ,      74,86      ,  13.13
>      2^17,     81,89    ,      77,87      ,  11.11
>      2^18,     90,92    ,      87,89      ,   2.02
>      2^19,     91,93    ,      88,90      ,   2.02
>      2^20,     96,95    ,      91,92      ,   1.01
>      2^21,     72,88    ,      68,85      ,  20.20
>  256*2^12,    372,6583  ,     364,6251    ,  94.94
>  256*2^21,    398,6032  ,     392,5758    ,  93.93
>  256*2^30,    396,5665  ,     389,5258    ,  92.92

The data here doesn't match the numbers in the cover letter, though the
difference doesn't affect the conclusion. 😊

> +
> +if IOMMU_PT
> +config IOMMU_PT_AMDV1
> +     tristate "IOMMU page table for 64-bit AMD IOMMU v1"

Remove "64-bit"? I don't think there was ever a 32-bit format.

> +
> +static inline unsigned int amdv1pt_table_item_lg2sz(const struct pt_state *pts)
> +{
> +     return PT_GRANULE_LG2SZ +
> +            (PT_TABLEMEM_LG2SZ - ilog2(PT_ITEM_WORD_SIZE)) * pts->level;
> +}
> +#define pt_table_item_lg2sz amdv1pt_table_item_lg2sz

This is identical to the version in pt_fmt_defaults.h.
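
For reference, a standalone sketch of what the formula evaluates to per
level. The constants are my assumptions for illustration (4 KiB granule,
4 KiB table memory, 8-byte items, 6 levels) and the ex_ names are
hypothetical, not taken from the patch:

```c
#include <assert.h>

/* Assumed values for illustration only, not copied from the patch. */
#define EX_GRANULE_LG2SZ   12u /* 4 KiB granule */
#define EX_TABLEMEM_LG2SZ  12u /* 4 KiB of memory per table */
#define EX_ITEM_WORD_LG2SZ 3u  /* ilog2(8): 8-byte items */
#define EX_MAX_LEVELS      6u

/* log2 of the number of bytes covered by one item at the given level */
static unsigned int ex_table_item_lg2sz(unsigned int level)
{
	assert(level < EX_MAX_LEVELS);
	return EX_GRANULE_LG2SZ +
	       (EX_TABLEMEM_LG2SZ - EX_ITEM_WORD_LG2SZ) * level;
}
```

Under those assumptions each level adds 9 bits, so level 0 items cover
2^12 bytes, level 1 items 2^21 and level 2 items 2^30.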

> +static inline void
> +amdv1pt_install_leaf_entry(struct pt_state *pts, pt_oaddr_t oa,
> +                        unsigned int oasz_lg2,
> +                        const struct pt_write_attrs *attrs)
> +{
> +     unsigned int isz_lg2 = pt_table_item_lg2sz(pts);
> +     u64 *tablep = pt_cur_table(pts, u64) + pts->index;

Check that the index is aligned to oasz_lg2.
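
Roughly what I have in mind, as a standalone sketch. The helper name is
hypothetical, and I am assuming that a leaf of size 2^oasz_lg2 spans
2^(oasz_lg2 - isz_lg2) consecutive items, so the starting index should be
a multiple of that count:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if a leaf of size 2^oasz_lg2 may
 * start at 'index' in a table whose items each cover 2^isz_lg2 bytes.
 */
static bool ex_leaf_index_aligned(unsigned int index, unsigned int isz_lg2,
				  unsigned int oasz_lg2)
{
	unsigned int num_items_lg2;

	/* A leaf cannot be smaller than one item at this level. */
	if (oasz_lg2 < isz_lg2)
		return false;

	num_items_lg2 = oasz_lg2 - isz_lg2;
	/* Guard the shift below against an out-of-range count. */
	if (num_items_lg2 >= 32)
		return false;

	/* The low num_items_lg2 bits of the index must be zero. */
	return (index & ((1u << num_items_lg2) - 1)) == 0;
}
```

With 4 KiB items (isz_lg2 = 12), a 64 KiB contiguous leaf (oasz_lg2 = 16)
covers 16 items, so index 0 or 16 would pass and index 8 would not. In the
patch this could be a debug assertion before the entries are written.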

Reviewed-by: Kevin Tian <[email protected]>
