From: Yu Zhang <[email protected]> Sent: Monday, December 8, 2025 
9:11 PM
> 

The "Subject:" line prefix for this patch should probably be "Drivers: hv:"
to be consistent with most other changes to this source code file.

> Previously, the allocation of per-CPU output argument pages was restricted
> to root partitions or those operating in VTL mode.
> 
> Remove this restriction to support guest IOMMU related hypercalls, which
> require valid output pages to function correctly.

The thinking here isn't quite correct. Just because a hypercall produces output
doesn't mean that Linux needs to allocate a page for the output that is separate
from the input. It's perfectly OK to use the same page for both input and 
output,
as long as the two areas don't overlap. Yes, the page is called
"hyperv_pcpu_input_arg", but that's a historical artifact from before the time
it was realized that the same page can be used for both input and output.

Of course, if there's ever a hypercall that needs lots of input and lots of 
output
such that the combined size doesn't fit in a single page, then separate input
and output pages will be needed. But I'm skeptical that will ever happen. Rep
hypercalls could have large amounts of input and/or output, but I'd venture
that the rep count can always be managed so everything fits in a single page.

> 
> While unconditionally allocating per-CPU output pages scales with the number
> of vCPUs, and potentially adding overhead for guests that may not utilize the
> IOMMU, this change anticipates that future hypercalls from child partitions
> may also require these output pages.

I've heard the argument that the amount of overhead is modest relative to the
overall amount of memory that is typically in a VM, particularly VMs with high
vCPU counts. And I don't disagree. But on the flip side, why tie up memory when
there's no need to do so? I'd argue for dropping this patch, and changing the
two hypercall call sites in Patch 5 to just use part of the so-called hypercall 
input
page for the output as well. It's only a one-line change in each hypercall call 
site.

If folks really want to always allocate the separate output page, it's not an
issue that I'll continue to fight. But at least give a valid reason "why" in the
commit message.

Michael

> 
> Signed-off-by: Yu Zhang <[email protected]>
> ---
>  drivers/hv/hv_common.c | 21 ++++++---------------
>  1 file changed, 6 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index e109a620c83f..034fb2592884 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -255,11 +255,6 @@ static void hv_kmsg_dump_register(void)
>       }
>  }
> 
> -static inline bool hv_output_page_exists(void)
> -{
> -     return hv_parent_partition() || IS_ENABLED(CONFIG_HYPERV_VTL_MODE);
> -}
> -
>  void __init hv_get_partition_id(void)
>  {
>       struct hv_output_get_partition_id *output;
> @@ -371,11 +366,9 @@ int __init hv_common_init(void)
>       hyperv_pcpu_input_arg = alloc_percpu(void  *);
>       BUG_ON(!hyperv_pcpu_input_arg);
> 
> -     /* Allocate the per-CPU state for output arg for root */
> -     if (hv_output_page_exists()) {
> -             hyperv_pcpu_output_arg = alloc_percpu(void *);
> -             BUG_ON(!hyperv_pcpu_output_arg);
> -     }
> +     /* Allocate the per-CPU state for output arg*/
> +     hyperv_pcpu_output_arg = alloc_percpu(void *);
> +     BUG_ON(!hyperv_pcpu_output_arg);
> 
>       if (hv_parent_partition()) {
>               hv_synic_eventring_tail = alloc_percpu(u8 *);
> @@ -473,7 +466,7 @@ int hv_common_cpu_init(unsigned int cpu)
>       u8 **synic_eventring_tail;
>       u64 msr_vp_index;
>       gfp_t flags;
> -     const int pgcount = hv_output_page_exists() ? 2 : 1;
> +     const int pgcount = 2;
>       void *mem;
>       int ret = 0;
> 
> @@ -491,10 +484,8 @@ int hv_common_cpu_init(unsigned int cpu)
>               if (!mem)
>                       return -ENOMEM;
> 
> -             if (hv_output_page_exists()) {
> -                     outputarg = (void 
> **)this_cpu_ptr(hyperv_pcpu_output_arg);
> -                     *outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
> -             }
> +             outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> +             *outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
> 
>               if (!ms_hyperv.paravisor_present &&
>                   (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
> --
> 2.49.0


Reply via email to