On Fri, Sep 24, 2021 at 11:52:14AM +0200, Jan Beulich wrote:
> No separate feature flags exist which would control availability of
> these; the only restriction is HATS (establishing the maximum number of
> page table levels in general), and even that has a lower bound of 4.
> Thus we can unconditionally announce 2M, 1G, and 512G mappings. (Via
> non-default page sizes the implementation in principle permits arbitrary
> size mappings, but these require multiple identical leaf PTEs to be
> written, which isn't all that different from having to write multiple
> consecutive PTEs with increasing frame numbers. IMO that's therefore
> beneficial only on hardware where suitable TLBs exist; I'm unaware of
> such hardware.)

Also, replacing/shattering such non-standard page sizes will require
more logic, so unless there's a performance benefit I would just skip
using them.
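
To make that concrete (standalone sketch with invented names, not the
actual Xen helpers): a non-default size mapping still needs one PTE
write per covered slot, just with identical contents instead of an
incrementing frame number:

#include <stdint.h>

#define PTE_MFN_SHIFT 12

/* Stand-in for an atomic PTE write (write_atomic() in the real code). */
static void write_pte(volatile uint64_t *p, uint64_t v)
{
    *p = v;
}

/* Non-default page size: every covered slot gets the *same* leaf PTE. */
static void map_nondefault_size(volatile uint64_t *table, unsigned int first,
                                unsigned int nr, uint64_t pte)
{
    for ( unsigned int i = 0; i < nr; i++ )
        write_pte(&table[first + i], pte);
}

/* Default page sizes: same loop, just with an incrementing frame number
   OR-ed into the control bits held in 'pte'. */
static void map_default_size(volatile uint64_t *table, unsigned int first,
                             unsigned int nr, uint64_t pte, uint64_t mfn)
{
    for ( unsigned int i = 0; i < nr; i++ )
        write_pte(&table[first + i], pte | ((mfn + i) << PTE_MFN_SHIFT));
}

Same number of stores and table walks either way, so without TLBs able
to make use of the odd sizes we'd be taking on the extra
shattering/replacement logic for no gain.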

> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> ---
> I'm not fully sure about allowing 512G mappings: The scheduling-for-
> freeing of intermediate page tables can take quite a while when
> replacing a tree of 4k mappings by a single 512G one. Plus (or otoh)
> there's no present code path via which 512G chunks of memory could be
> allocated (and hence mapped) anyway.

I would limit this to 1G, which is what we also support for CPU page
tables.
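
For reference, a quick back-of-the-envelope (standalone C, not Xen
code) for the worst case of the scheduling-for-freeing, i.e. a fully
populated tree of 4k mappings being replaced by a single superpage:

#include <stdio.h>

#define PTES_PER_TABLE 512UL

int main(void)
{
    /* Level of the superpage leaf entry: 2 = 2M, 3 = 1G, 4 = 512G. */
    for ( unsigned int level = 2; level <= 4; level++ )
    {
        unsigned long tables = 0, per_level = 1;

        /* One table directly underneath the leaf, 512 times more per
           further level down to the 4k one. */
        for ( unsigned int l = level - 1; l; l-- )
        {
            tables += per_level;
            per_level *= PTES_PER_TABLE;
        }

        printf("level %u superpage: %lu page tables to free\n",
               level, tables);
    }

    return 0;
}

That's 1 table for a 2M replacement and 513 for 1G, but 262657 for
512G, which to me is the kind of jump that justifies stopping at 1G,
at least while there's no code path allocating 512G chunks anyway.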

> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -32,12 +32,13 @@ static unsigned int pfn_to_pde_idx(unsig
>  }
>  
>  static union amd_iommu_pte clear_iommu_pte_present(unsigned long l1_mfn,
> -                                                   unsigned long dfn)
> +                                                   unsigned long dfn,
> +                                                   unsigned int level)
>  {
>      union amd_iommu_pte *table, *pte, old;
>  
>      table = map_domain_page(_mfn(l1_mfn));
> -    pte = &table[pfn_to_pde_idx(dfn, 1)];
> +    pte = &table[pfn_to_pde_idx(dfn, level)];
>      old = *pte;
>  
>      write_atomic(&pte->raw, 0);
> @@ -288,10 +289,31 @@ static int iommu_pde_from_dfn(struct dom
>      return 0;
>  }
>  
> +static void queue_free_pt(struct domain *d, mfn_t mfn, unsigned int next_level)

Nit: should the last parameter be named level rather than next_level?
AFAICT it's the level of the mfn parameter.

Should we also assert that level (or next_level) is always != 0 for
extra safety?
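
I.e. something along these lines (just to show what I mean, keeping
whatever the function body already does):

static void queue_free_pt(struct domain *d, mfn_t mfn, unsigned int level)
{
    ASSERT(level);

    /* ... existing body, operating on the table at 'mfn' ... */
}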

Thanks, Roger.
