Hi Mark,

On 10/01/2019 14:27, Mark Rutland wrote:
> The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
> large its ringbuffer mmap should be. This can be configured to arbitrary
> values, which can be larger than the maximum possible allocation from
> kmalloc.
> 
> When this is configured to a suitably large value (e.g. thanks to the
> perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
> __alloc_pages_nodemask():
> 
> [  337.316688] WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8
> [  337.316694] Modules linked in:
> [  337.316704] CPU: 2 PID: 5666 Comm: perf Not tainted 5.0.0-rc1 #2669
> [  337.316708] Hardware name: ARM Juno development board (r0) (DT)
> [  337.316714] pstate: 20000005 (nzCv daif -PAN -UAO)
> [  337.316720] pc : __alloc_pages_nodemask+0x3f8/0xbc8
> [  337.316728] lr : alloc_pages_current+0x80/0xe8
> [  337.316732] sp : ffff000016eeb9e0
> [  337.316736] x29: ffff000016eeb9e0 x28: 0000000000080001
> [  337.316744] x27: 0000000000000000 x26: ffff0000111e21f0
> [  337.316751] x25: 0000000000000001 x24: 0000000000000000
> [  337.316757] x23: 0000000000080001 x22: 0000000000000000
> [  337.316762] x21: 0000000000000000 x20: 000000000000000b
> [  337.316768] x19: 000000000060c0c0 x18: 0000000000000000
> [  337.316773] x17: 0000000000000000 x16: 0000000000000000
> [  337.316779] x15: 0000000000000000 x14: 0000000000000000
> [  337.316784] x13: 0000000000000000 x12: 0000000000000000
> [  337.316789] x11: 0000000000100000 x10: 0000000000000000
> [  337.316795] x9 : 0000000010044400 x8 : 0000000080001000
> [  337.316800] x7 : 0000000000000000 x6 : ffff800975584700
> [  337.316806] x5 : 0000000000000000 x4 : ffff0000111cd6c8
> [  337.316811] x3 : 0000000000000000 x2 : 0000000000000000
> [  337.316816] x1 : 000000000000000b x0 : 000000000060c0c0
> [  337.316822] Call trace:
> [  337.316828]  __alloc_pages_nodemask+0x3f8/0xbc8
> [  337.316834]  alloc_pages_current+0x80/0xe8
> [  337.316841]  kmalloc_order+0x14/0x30
> [  337.316848]  __kmalloc+0x1dc/0x240
> [  337.316854]  rb_alloc+0x3c/0x170
> [  337.316860]  perf_mmap+0x3bc/0x470
> [  337.316867]  mmap_region+0x374/0x4f8
> [  337.316873]  do_mmap+0x300/0x430
> [  337.316878]  vm_mmap_pgoff+0xe4/0x110
> [  337.316884]  ksys_mmap_pgoff+0xc0/0x230
> [  337.316892]  __arm64_sys_mmap+0x28/0x38
> [  337.316899]  el0_svc_common+0xb4/0x118
> [  337.316905]  el0_svc_handler+0x2c/0x80
> [  337.316910]  el0_svc+0x8/0xc
> [  337.316915] ---[ end trace fa29167e20ef0c62 ]---
> 
> Let's avoid this by checking that the requested allocation is possible
> before calling kzalloc.
> 
> Reported-by: Julien Thierry <[email protected]>
> Signed-off-by: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> ---
>  kernel/events/ring_buffer.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 4a9937076331..309ef5a64af5 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -734,6 +734,9 @@ struct ring_buffer *rb_alloc(int nr_pages, long watermark, int cpu, int flags)
>       size = sizeof(struct ring_buffer);
>       size += nr_pages * sizeof(void *);
>  
> +     if (order_base_2(size) >= MAX_ORDER)
> +             goto fail;
> +

I see that kernel/events/ring_buffer.c has two versions of rb_alloc(),
selected by whether CONFIG_PERF_USE_VMALLOC is defined.

Since the warning comes from the kzalloc() call, I think we need to add
this check to both implementations of rb_alloc().
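
For illustration only, here is a userspace sketch of the kind of bound
such a check enforces: convert the allocation size in bytes to a page
order and refuse anything the page allocator could not satisfy. This is
not the kernel code. The sketch_* names are made up, and the constants
(4 KiB pages, MAX_ORDER of 11, a 512-byte ring_buffer header) are
assumptions chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for this sketch only; the real ones depend on the config. */
#define SKETCH_PAGE_SHIFT      12   /* 4 KiB pages */
#define SKETCH_MAX_ORDER       11   /* largest valid order is MAX_ORDER - 1 */
#define SKETCH_RB_HEADER_SIZE  512  /* stand-in for sizeof(struct ring_buffer) */

/* Smallest order such that (1 << order) pages cover 'size' bytes. */
static unsigned int sketch_get_order(size_t size)
{
	size_t page_size = (size_t)1 << SKETCH_PAGE_SHIFT;
	size_t pages = (size + page_size - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;

	return order;
}

/*
 * Mirrors the size computation quoted from rb_alloc(): a header plus one
 * pointer per data page. Returns false when the resulting order is too
 * large to allocate. Overflow of the multiplication is not handled here.
 */
static bool sketch_rb_size_ok(size_t nr_pages)
{
	size_t size = SKETCH_RB_HEADER_SIZE + nr_pages * sizeof(void *);

	return sketch_get_order(size) < SKETCH_MAX_ORDER;
}
```

With 8-byte pointers and nr_pages = 0x80001 (a guess based on the
register dump), the size is just over 4 MiB and the sketch computes
order 11, which appears consistent with the 0xb in x1 in the trace
above.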


With that change (or if for some reason the other rb_alloc() version
doesn't need the check):

Reviewed-by: Julien Thierry <[email protected]>

Thanks,

-- 
Julien Thierry
