On Mon, Aug 10, 2020 at 02:24:21PM -0700, Kan Liang wrote:
> Current perf can report both virtual addresses and physical addresses,
> but not the page size. Without the page size information of the utilized
> page, users cannot decide whether to promote/demote large pages to
> optimize memory usage.
> 
> Add a new sample type for the data page size.
> 
> Current perf already has a facility to collect data virtual addresses.
> A page walker is required to walk the page tables and calculate the
> page size from a given virtual address.
> 
> On some platforms, e.g., X86, the page walker is invoked in an NMI
> handler. So the page walker must be IRQ-safe and low overhead. Besides,
> the page walker should work for both user and kernel virtual addresses.
> The existing generic page walker, e.g., walk_page_range_novma(), is a
> little bit complex and doesn't guarantee IRQ safety. follow_page()
> only handles user virtual addresses.
> 
> Add a new function perf_get_page_size() to walk the page tables and
> calculate the page size. In the function:
> - Interrupts have to be disabled to prevent any teardown of the page
>   tables.
> - The size of a normal page is from the pre-defined page size macros.
> - The size of a compound page is retrieved from the helper function,
>   page_size().
> 
> Suggested-by: Peter Zijlstra <pet...@infradead.org>
> Signed-off-by: Kan Liang <kan.li...@linux.intel.com>

>  /* default value for data source */
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 52ca2093831c..32484accc7a3 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -143,8 +143,9 @@ enum perf_event_sample_format {
>       PERF_SAMPLE_PHYS_ADDR                   = 1U << 19,
>       PERF_SAMPLE_AUX                         = 1U << 20,
>       PERF_SAMPLE_CGROUP                      = 1U << 21,
> +     PERF_SAMPLE_DATA_PAGE_SIZE              = 1U << 22,
>  
> -     PERF_SAMPLE_MAX = 1U << 22,             /* non-ABI */
> +     PERF_SAMPLE_MAX = 1U << 23,             /* non-ABI */
>  
>       __PERF_SAMPLE_CALLCHAIN_EARLY           = 1ULL << 63, /* non-ABI; internal use */
>  };

> @@ -7151,6 +7269,9 @@ void perf_prepare_sample(struct perf_event_header *header,
>       }
>  #endif
>  
> +     if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)
> +             data->data_page_size = perf_get_page_size(data->addr);
> +

Since the page size is computed from data->addr, we could just require
that PERF_SAMPLE_DATA_PAGE_SIZE be set together with PERF_SAMPLE_ADDR.
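
One way such a requirement might be expressed is a dependency check on the
requested sample_type, rejecting the page-size bit when the address bit is
absent. The sketch below is a standalone user-space illustration, not code
from the patch: the helper name check_data_page_size_deps() is hypothetical,
and where such a check would live in the kernel is an assumption left open
here. The two bit values match the uapi header (PERF_SAMPLE_ADDR is bit 3;
PERF_SAMPLE_DATA_PAGE_SIZE is bit 22 as proposed in the quoted hunk).

```c
#include <errno.h>
#include <stdint.h>

/* Bit positions as in include/uapi/linux/perf_event.h and the quoted hunk. */
#define PERF_SAMPLE_ADDR            (1U << 3)
#define PERF_SAMPLE_DATA_PAGE_SIZE  (1U << 22)

/*
 * Hypothetical helper (not part of the patch): returns 0 when the
 * requested sample_type is acceptable, or -EINVAL when the data page
 * size is requested without the data address it is derived from.
 */
static int check_data_page_size_deps(uint64_t sample_type)
{
	if ((sample_type & PERF_SAMPLE_DATA_PAGE_SIZE) &&
	    !(sample_type & PERF_SAMPLE_ADDR))
		return -EINVAL;

	return 0;
}
```

With a check like this, a request for the page size alone would fail up
front instead of walking the page tables for an address that was never
sampled.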
