On Mon, Aug 10, 2020 at 02:24:21PM -0700, Kan Liang wrote:
> Current perf can report both virtual addresses and physical addresses,
> but not the page size. Without the page size information of the utilized
> page, users cannot decide whether to promote/demote large pages to
> optimize memory usage.
>
> Add a new sample type for the data page size.
>
> Current perf already has a facility to collect data virtual addresses.
> A page walker is required to walk the pages tables and calculate the
> page size from a given virtual address.
>
> On some platforms, e.g., X86, the page walker is invoked in an NMI
> handler. So the page walker must be IRQ-safe and low overhead. Besides,
> the page walker should work for both user and kernel virtual address.
> The existing generic page walker, e.g., walk_page_range_novma(), is a
> little bit complex and doesn't guarantee the IRQ-safe. The follow_page()
> is only for user-virtual address.
>
> Add a new function perf_get_page_size() to walk the page tables and
> calculate the page size. In the function:
> - Interrupts have to be disabled to prevent any teardown of the page
>   tables.
> - The size of a normal page is from the pre-defined page size macros.
> - The size of a compound page is retrieved from the helper function,
>   page_size().
>
> Suggested-by: Peter Zijlstra <pet...@infradead.org>
> Signed-off-by: Kan Liang <kan.li...@linux.intel.com>
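
For readers following along, the "size from the pre-defined page size macros" rule can be illustrated outside the kernel. The sketch below is not the patch's perf_get_page_size(); it assumes x86-64 with 4 KiB base pages, and the enum and the function name are hypothetical. It only shows how the level at which a walk finds a leaf entry maps to a page size.

```c
#include <stdint.h>

/*
 * Shift values for x86-64 with 4 KiB base pages. This is an assumption
 * of the sketch; other architectures and configurations differ.
 */
#define PAGE_SHIFT 12	/* 4 KiB */
#define PMD_SHIFT  21	/* 2 MiB */
#define PUD_SHIFT  30	/* 1 GiB */

/* Hypothetical: the level at which the walk found a leaf entry. */
enum leaf_level { LEAF_NONE, LEAF_PTE, LEAF_PMD, LEAF_PUD };

/* Page size in bytes for a leaf at the given level, 0 if nothing is mapped. */
uint64_t leaf_page_size(enum leaf_level level)
{
	switch (level) {
	case LEAF_PTE:
		return 1ULL << PAGE_SHIFT;
	case LEAF_PMD:
		return 1ULL << PMD_SHIFT;
	case LEAF_PUD:
		return 1ULL << PUD_SHIFT;
	default:
		return 0;
	}
}
```

Returning 0 for an unmapped address is a choice made for this sketch only; compound pages, which the changelog handles through page_size(), are left out.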
>  /* default value for data source */
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 52ca2093831c..32484accc7a3 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -143,8 +143,9 @@ enum perf_event_sample_format {
>  	PERF_SAMPLE_PHYS_ADDR			= 1U << 19,
>  	PERF_SAMPLE_AUX				= 1U << 20,
>  	PERF_SAMPLE_CGROUP			= 1U << 21,
> +	PERF_SAMPLE_DATA_PAGE_SIZE		= 1U << 22,
>
> -	PERF_SAMPLE_MAX = 1U << 22,		/* non-ABI */
> +	PERF_SAMPLE_MAX = 1U << 23,		/* non-ABI */
>
>  	__PERF_SAMPLE_CALLCHAIN_EARLY		= 1ULL << 63, /* non-ABI; internal use */
>  };
> @@ -7151,6 +7269,9 @@ void perf_prepare_sample(struct perf_event_header *header,
>  	}
>  #endif
>
> +	if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)
> +		data->data_page_size = perf_get_page_size(data->addr);
> +

We could just require that PERF_SAMPLE_DATA_PAGE_SIZE be combined with
PERF_SAMPLE_ADDR, since perf_get_page_size() is fed data->addr here.
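
Something like the below is what such a requirement might look like as a check on the requested sample_type. This is an untested, self-contained sketch and not part of the patch: the helper name check_data_page_size() is hypothetical, and where such a check would live in the kernel is left open. The two bit values are defined locally to match include/uapi/linux/perf_event.h, with the DATA_PAGE_SIZE bit taken from the quoted hunk.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Bit values as in include/uapi/linux/perf_event.h. They are defined
 * locally so the sketch builds without recent kernel headers.
 */
#define PERF_SAMPLE_ADDR		(1U << 3)
#define PERF_SAMPLE_DATA_PAGE_SIZE	(1U << 22)

/*
 * Hypothetical helper: reject a sample_type that asks for the data page
 * size without also asking for the data address it is derived from.
 */
int check_data_page_size(uint64_t sample_type)
{
	if ((sample_type & PERF_SAMPLE_DATA_PAGE_SIZE) &&
	    !(sample_type & PERF_SAMPLE_ADDR))
		return -EINVAL;

	return 0;
}
```

With a check along these lines, a request for PERF_SAMPLE_DATA_PAGE_SIZE alone would fail with -EINVAL, while PERF_SAMPLE_ADDR | PERF_SAMPLE_DATA_PAGE_SIZE would be accepted.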