Re: [PATCH 08/22] x86/pv: rewrite how building PV dom0 handles domheap mappings

2024-01-10 Thread El Yandouzi, Elias

Hi Jan,

I have been looking at this series recently and tried my best
to address your comments. I'll reply to the other patches shortly too.

On 22/12/2022 11:48, Jan Beulich wrote:
> On 16.12.2022 12:48, Julien Grall wrote:
>> From: Hongyan Xia
>>
>> Building a PV dom0 is allocating from the domheap but uses it like the
>> xenheap. This is clearly wrong. Fix.
>
> "Clearly wrong" would mean there's a bug here, at least under certain
> conditions. But there isn't: Even on huge systems, due to running on
> idle page tables, all memory is mapped at present.

I agree with you, I'll rephrase the commit message.

>> @@ -711,22 +715,32 @@ int __init dom0_construct_pv(struct domain *d,
>>  v->arch.pv.event_callback_cs= FLAT_COMPAT_KERNEL_CS;
>>  }
>>
>> +#define UNMAP_MAP_AND_ADVANCE(mfn_var, virt_var, maddr) \
>> +do {\
>> +UNMAP_DOMAIN_PAGE(virt_var);\
>
> Not much point using the macro when ...
>
>> +mfn_var = maddr_to_mfn(maddr);  \
>> +maddr += PAGE_SIZE; \
>> +virt_var = map_domain_page(mfn_var);\
>
> ... the variable gets reset again to non-NULL unconditionally right
> away.

Sure, I'll change that.

>> +} while ( false )
>
> This being a local macro and all use sites passing mpt_alloc as the
> last argument, I think that parameter wants dropping, which would
> improve readability.

I have to disagree. It wouldn't improve readability but would only make
things more obscure. I'll keep the macro as is.

>> @@ -792,9 +808,9 @@ int __init dom0_construct_pv(struct domain *d,
>>  if ( !l3e_get_intpte(*l3tab) )
>>  {
>>  maddr_to_page(mpt_alloc)->u.inuse.type_info =
>> PGT_l2_page_table;
>> -l2tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
>> -clear_page(l2tab);
>> -*l3tab = l3e_from_paddr(__pa(l2tab), L3_PROT);
>> +UNMAP_MAP_AND_ADVANCE(l2start_mfn, l2start, mpt_alloc);
>> +clear_page(l2start);
>> +*l3tab = l3e_from_mfn(l2start_mfn, L3_PROT);
>
> The l2start you map on the last iteration here can be re-used ...
>
>> @@ -805,9 +821,17 @@ int __init dom0_construct_pv(struct domain *d,
>>  unmap_domain_page(l2t);
>>  }
>
> ... in the code the tail of which is visible here, eliminating a
> redundant map/unmap pair.

Good catch, I'll remove the redundant pair.

>> @@ -977,8 +1001,12 @@ int __init dom0_construct_pv(struct domain *d,
>>   * !CONFIG_VIDEO case so the logic here can be simplified.
>>   */
>>  if ( pv_shim )
>> +{
>> +l4start = map_domain_page(l4start_mfn);
>>  pv_shim_setup_dom(d, l4start, v_start, vxenstore_start,
>> vconsole_start,
>> vphysmap_start, si);
>> +UNMAP_DOMAIN_PAGE(l4start);
>> +}
>
> The, at the first glance, redundant re-mapping of the L4 table here could
> do with explaining in the description. However, I further wonder in how
> far in shim mode eliminating the direct map is actually useful. Which is
> to say that I question the need for this change in the first place. Or
> wait - isn't this (unlike the rest of this patch) actually a bug fix? At
> this point we're on the domain's page tables, which may not cover the
> page the L4 is allocated at (if a truly huge shim was configured). So I
> guess the change is needed but wants breaking out, allowing to at least
> consider whether to backport it.

I will create a separate patch for this change.

> Jan





Re: [PATCH 08/22] x86/pv: rewrite how building PV dom0 handles domheap mappings

2022-12-22 Thread Jan Beulich
On 16.12.2022 12:48, Julien Grall wrote:
> From: Hongyan Xia
> 
> Building a PV dom0 is allocating from the domheap but uses it like the
> xenheap. This is clearly wrong. Fix.

"Clearly wrong" would mean there's a bug here, at least under certain
conditions. But there isn't: Even on huge systems, due to running on
idle page tables, all memory is mapped at present.

> @@ -711,22 +715,32 @@ int __init dom0_construct_pv(struct domain *d,
>  v->arch.pv.event_callback_cs= FLAT_COMPAT_KERNEL_CS;
>  }
>  
> +#define UNMAP_MAP_AND_ADVANCE(mfn_var, virt_var, maddr) \
> +do {\
> +UNMAP_DOMAIN_PAGE(virt_var);\

Not much point using the macro when ...

> +mfn_var = maddr_to_mfn(maddr);  \
> +maddr += PAGE_SIZE; \
> +virt_var = map_domain_page(mfn_var);\

... the variable gets reset again to non-NULL unconditionally right
away.

> +} while ( false )

This being a local macro and all use sites passing mpt_alloc as the
last argument, I think that parameter wants dropping, which would
improve readability.
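
To make the two remarks on the macro concrete, here is a minimal standalone sketch of one possible reading of them. It is not the Xen code: mfn_t, maddr_to_mfn(), map_domain_page() and unmap_domain_page() are mocked stand-ins (assumptions) so that the snippet builds outside the hypervisor, and fake_ram, live_mappings and demo_walk() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define NR_FRAMES  4

/* Mock stand-ins for Xen's type and helpers (assumptions, not the real ones). */
typedef struct { unsigned long m; } mfn_t;

static unsigned char fake_ram[NR_FRAMES][PAGE_SIZE]; /* pretend machine frames */
static int live_mappings;                            /* mappings currently held */

static mfn_t maddr_to_mfn(uint64_t ma)
{
    return (mfn_t){ (unsigned long)(ma >> PAGE_SHIFT) };
}

static void *map_domain_page(mfn_t mfn)
{
    assert(mfn.m < NR_FRAMES);
    live_mappings++;
    return fake_ram[mfn.m];
}

static void unmap_domain_page(const void *va)
{
    if ( va )
        live_mappings--;
}

static uint64_t mpt_alloc; /* next free machine address, as in the patch */

/*
 * Plain unmap_domain_page() is enough here: virt_var is overwritten a few
 * lines further down, so NULLing it first buys nothing.  mpt_alloc is
 * referenced directly instead of being passed as a third argument.
 */
#define UNMAP_MAP_AND_ADVANCE(mfn_var, virt_var) \
do {                                             \
    unmap_domain_page(virt_var);                 \
    mfn_var = maddr_to_mfn(mpt_alloc);           \
    mpt_alloc += PAGE_SIZE;                      \
    virt_var = map_domain_page(mfn_var);         \
} while ( false )

/* Walk three frames and report how many mappings are still held. */
int demo_walk(void)
{
    mfn_t mfn;
    void *va = NULL;

    mpt_alloc = 0;
    live_mappings = 0;
    for ( int i = 0; i < 3; i++ )
        UNMAP_MAP_AND_ADVANCE(mfn, va);

    /* Only the mapping from the final step is left for the caller to drop. */
    return live_mappings;
}
```

In this mock, demo_walk() leaves exactly one mapping live, the one from the final step. Whether hiding mpt_alloc inside the macro reads better is the judgement call under discussion; the sketch only shows that the two-argument form is mechanically possible.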

> @@ -792,9 +808,9 @@ int __init dom0_construct_pv(struct domain *d,
>  if ( !l3e_get_intpte(*l3tab) )
>  {
>  maddr_to_page(mpt_alloc)->u.inuse.type_info = 
> PGT_l2_page_table;
> -l2tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
> -clear_page(l2tab);
> -*l3tab = l3e_from_paddr(__pa(l2tab), L3_PROT);
> +UNMAP_MAP_AND_ADVANCE(l2start_mfn, l2start, mpt_alloc);
> +clear_page(l2start);
> +*l3tab = l3e_from_mfn(l2start_mfn, L3_PROT);

The l2start you map on the last iteration here can be re-used ...

> @@ -805,9 +821,17 @@ int __init dom0_construct_pv(struct domain *d,
>  unmap_domain_page(l2t);
>  }

... in the code the tail of which is visible here, eliminating a
redundant map/unmap pair.
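
A generic illustration of what "redundant map/unmap pair" means, again as a standalone sketch with mocked helpers rather than the actual dom0_construct_pv() code. The map_calls counter and both fill_*() functions are hypothetical; they only contrast dropping and re-establishing a mapping with re-using the one left over from the last loop iteration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_FRAMES 4

typedef struct { unsigned long m; } mfn_t;   /* mock of Xen's mfn_t */

static unsigned long fake_ram[NR_FRAMES][8]; /* tiny pretend page tables */
static int map_calls;                        /* number of map operations */

static void *map_domain_page(mfn_t mfn)
{
    assert(mfn.m < NR_FRAMES);
    map_calls++;
    return fake_ram[mfn.m];
}

static void unmap_domain_page(const void *va)
{
    (void)va; /* nothing to undo in this mock */
}

/* Variant A: drop the mapping after the loop, then map the same frame again. */
int fill_with_remap(void)
{
    unsigned long *l2 = NULL;
    mfn_t mfn = { 0 };

    map_calls = 0;
    for ( unsigned long i = 0; i < NR_FRAMES; i++ )
    {
        unmap_domain_page(l2);
        mfn.m = i;
        l2 = map_domain_page(mfn);
        l2[0] = i;
    }
    unmap_domain_page(l2);

    l2 = map_domain_page(mfn); /* redundant: same frame as just unmapped */
    l2[1] = 0xff;
    unmap_domain_page(l2);

    return map_calls;
}

/* Variant B: keep using the mapping left over from the last iteration. */
int fill_with_reuse(void)
{
    unsigned long *l2 = NULL;
    mfn_t mfn = { 0 };

    map_calls = 0;
    for ( unsigned long i = 0; i < NR_FRAMES; i++ )
    {
        unmap_domain_page(l2);
        mfn.m = i;
        l2 = map_domain_page(mfn);
        l2[0] = i;
    }

    l2[1] = 0xff; /* still mapped from the final iteration */
    unmap_domain_page(l2);

    return map_calls;
}
```

Both variants leave the mock frames in the same state; the second simply performs one map operation fewer. This only applies when the frame touched after the loop is the one mapped on the last iteration.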

> @@ -977,8 +1001,12 @@ int __init dom0_construct_pv(struct domain *d,
>   * !CONFIG_VIDEO case so the logic here can be simplified.
>   */
>  if ( pv_shim )
> +{
> +l4start = map_domain_page(l4start_mfn);
>  pv_shim_setup_dom(d, l4start, v_start, vxenstore_start, 
> vconsole_start,
>vphysmap_start, si);
> +UNMAP_DOMAIN_PAGE(l4start);
> +}

The, at the first glance, redundant re-mapping of the L4 table here could
do with explaining in the description. However, I further wonder in how
far in shim mode eliminating the direct map is actually useful. Which is
to say that I question the need for this change in the first place. Or
wait - isn't this (unlike the rest of this patch) actually a bug fix? At
this point we're on the domain's page tables, which may not cover the
page the L4 is allocated at (if a truly huge shim was configured). So I
guess the change is needed but wants breaking out, allowing to at least
consider whether to backport it.

Jan