2015-03-24 9:18 GMT+09:00 Namhyung Kim <namhy...@kernel.org>:
> On Tue, Mar 24, 2015 at 02:32:17AM +0900, Joonsoo Kim wrote:
>> 2015-03-23 15:30 GMT+09:00 Namhyung Kim <namhy...@kernel.org>:
>> > The perf kmem command records and analyzes kernel memory allocation
>> > only for SLAB objects.  This patch implements a simple page allocator
>> > analyzer using the kmem:mm_page_alloc and kmem:mm_page_free events.
>> >
>> > It adds two new options, --slab and --page.  The --slab option is
>> > for analyzing the SLAB allocator, which is what perf kmem currently does.
>> >
>> > The new --page option enables page allocator events and analyzes kernel
>> > memory usage in units of pages.  Currently, only the 'stat --alloc'
>> > subcommand is implemented.
>> >
>> > If neither --slab nor --page is specified, --slab is implied.
>> >
>> >   # perf kmem stat --page --alloc --line 10
>> >
>> >   -------------------------------------------------------------------------------------
>> >    Page             | Total alloc (KB) | Hits     | Order | Migration type | GFP flags
>> >   -------------------------------------------------------------------------------------
>> >    ffffea0015e48e00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea0015e47400 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea001440f600 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea001440cc00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea00140c6300 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea00140c5c00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea00140c5000 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea00140c4f00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea00140c4e00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ffffea00140c4d00 |               16 |        1 |     2 |    RECLAIMABLE |  00285250
>> >    ...              | ...              | ...      | ...   | ...            | ...
>> >   -------------------------------------------------------------------------------------
>>
>> The mm_page_alloc tracepoint prints the pfn as well as the struct page
>> pointer.  How about printing the pfn rather than the struct page pointer?
>
> I'd really like to have the pfn rather than the struct page pointer.
> But I don't know how to convert a page pointer to a pfn in userspace.
>
> The output of the tracepoint via the $debugfs/tracing/trace file is
> generated on the kernel side, so it can easily derive the pfn from the
> page pointer.  But the tracepoint itself only saves the page pointer,
> so we need to convert and print it in userspace.

Ah... I didn't realize that perf doesn't use the output of the
$debugfs/tracing/trace file.  So perf just reads the raw trace buffer
directly?  If the pfn were saved to the trace buffer, could perf print
the pfn rather than the struct page pointer?

> Yes, perf script (or libtraceevent) shows a pfn when printing those
> events.  But that value is bogus: it cannot determine the size of
> struct page, so the pointer arithmetic in the open-coded page_to_pfn()
> that is saved in the print_fmt of the tracepoint ends up being done as
> plain integer arithmetic.
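
To make that point concrete, here is a minimal userspace sketch of the two
calculations.  It assumes a vmemmap-style layout in which the pfn is the
index of the struct page in one large array.  The base address and the
64-byte struct page size are illustrative assumptions (typical x86_64
values), not something taken from this thread, and the function names are
hypothetical.

```c
#include <stdint.h>

/* Assumed values for illustration only; real kernels may differ. */
#define ASSUMED_VMEMMAP_BASE     0xffffea0000000000ULL /* start of the struct page array */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL                 /* sizeof(struct page) */

/* Kernel-style result: pointer subtraction scales by the element size. */
static uint64_t pfn_pointer_arith(uint64_t page_addr)
{
	return (page_addr - ASSUMED_VMEMMAP_BASE) / ASSUMED_STRUCT_PAGE_SIZE;
}

/* Parser-style result: the same expression on plain integers, no scaling. */
static uint64_t pfn_integer_arith(uint64_t page_addr)
{
	return page_addr - ASSUMED_VMEMMAP_BASE;
}
```

Under those assumptions, the first page in the table above
(ffffea0015e48e00) gives 0x579238 with pointer arithmetic but 0x15e48e00
with integer arithmetic, so the printed "pfn" is off by a factor of the
struct page size.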

How about the following change, and then making 'perf kmem' print the pfn?
If we store the pfn in the trace buffer, $debugfs/tracing/trace can still
print the same output as before, and 'perf kmem' can print the pfn as well.

Thanks.

diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index 4ad10ba..9dcfd0b 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -199,22 +199,22 @@ TRACE_EVENT(mm_page_alloc,
        TP_ARGS(page, order, gfp_flags, migratetype),

        TP_STRUCT__entry(
-               __field(        struct page *,  page            )
+               __field(        unsigned long,  pfn             )
                __field(        unsigned int,   order           )
                __field(        gfp_t,          gfp_flags       )
                __field(        int,            migratetype     )
        ),

        TP_fast_assign(
-               __entry->page           = page;
+               __entry->pfn            = page ? page_to_pfn(page) : -1;
                __entry->order          = order;
                __entry->gfp_flags      = gfp_flags;
                __entry->migratetype    = migratetype;
        ),

        TP_printk("page=%p pfn=%lu order=%d migratetype=%d gfp_flags=%s",
-               __entry->page,
-               __entry->page ? page_to_pfn(__entry->page) : 0,
+               __entry->pfn != -1 ? pfn_to_page(__entry->pfn) : NULL,
+               __entry->pfn != -1 ? __entry->pfn : 0,
                __entry->order,
                __entry->migratetype,
                show_gfp_flags(__entry->gfp_flags))
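
With a change like the one above, a userspace consumer would read the pfn
field directly and would only need to handle the -1 value stored for a
failed allocation.  The following is a rough sketch of that handling, not
actual perf code: PFN_ALLOC_FAILED and format_pfn() are hypothetical names,
and it assumes a 64-bit kernel so that the unsigned long field is 64 bits
wide.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name: -1 stored in an unsigned 64-bit field when page is NULL. */
#define PFN_ALLOC_FAILED UINT64_MAX

/* Format a recorded pfn; returns what snprintf() returns. */
static int format_pfn(char *buf, size_t len, uint64_t pfn)
{
	if (pfn == PFN_ALLOC_FAILED)
		return snprintf(buf, len, "(failed)");
	return snprintf(buf, len, "%" PRIu64, pfn);
}
```

Comparing against the sentinel rather than 0 keeps pfn 0 printable as a
real page frame.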