Hi Steven and Mingo,

While trying to adjust the buffer size (echo <size> > /sys/kernel/debug/tracing/buffer_size_kb), we see that the kernel gets caught up in an infinite loop while traversing the "cpu_buffer->pages" list in rb_head_page_deactivate().

Looks like the last node of the list could be uninitialized, thus leading to infinite traversal. From the data that we captured:
000|rb_head_page_deactivate(inline)
| cpu_buffer = 0xFFFFFF8000671600 = kernel_size_le_lo32+0xFFFFFF652F6EE600 -> (
...
| pages = 0xFFFFFF80A909D980 = kernel_size_le_lo32+0xFFFFFF65D811A980 -> (
|   next = 0xFFFFFF80A909D200 = kernel_size_le_lo32+0xFFFFFF65D811A200 -> (
|     next = 0xFFFFFF80A909D580 = kernel_size_le_lo32+0xFFFFFF65D811A580 -> (
|       next = 0xFFFFFF8138D1CD00 = kernel_size_le_lo32+0xFFFFFF6667D99D00 -> (
|         next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
|           next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
|             next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0 -> (
|               next = 0xFFFFFF80006716F0 = kernel_size_le_lo32+0xFFFFFF652F6EE6F0,
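
To illustrate why a node whose "next" points back at itself would hang the walk, here is a minimal user-space sketch. It assumes that rb_head_page_deactivate() traverses cpu_buffer->pages with a list_for_each()-style loop (start at head->next, stop on reaching head again). The struct below is a simplified stand-in for the kernel's list_head, and walk_terminates() and link_ring() are hypothetical helper names used only for this illustration; this is not the kernel code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Walk the list the way a list_for_each()-style loop would: start at
 * head->next and stop once we are back at head. Unlike the kernel loop,
 * give up after max_steps nodes so that a broken list can be detected.
 * Returns true if the walk got back to head, false if it gave up.
 */
static bool walk_terminates(const struct list_head *head, size_t max_steps)
{
	const struct list_head *pos = head->next;
	size_t steps = 0;

	while (pos != head) {
		if (++steps > max_steps)
			return false;	/* never got back to head */
		pos = pos->next;
	}
	return true;
}

/* Link an array of n nodes into a well-formed circular list. */
static void link_ring(struct list_head *nodes, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		nodes[i].next = &nodes[(i + 1) % n];
		nodes[i].prev = &nodes[(i + n - 1) % n];
	}
}
```

With five nodes linked by link_ring(), walk_terminates(&nodes[0], 100) returns true. If the last node is then made to point at itself (nodes[4].next = &nodes[4]), which mirrors the repeated 0xFFFFFF80006716F0 entries in the dump above, the walk never gets back to the head and the function returns false. Without the step limit, the loop would spin forever.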

Wanted to check with you if there's any scenario that could lead us into this state.

Test details:
-- Arch: arm64
-- Kernel version: 5.4.30, running on Android
-- Test case: Running the following set of commands across reboots leads to this scenario

  atrace --async_start -z -c -b 120000 sched audio irq idle freq
  < Run any workload here >
  atrace --async_dump -z -c -b 1200000 sched audio irq idle freq > mytrace.trace
  atrace --async_stop > /dev/null
  echo 150000 > /sys/kernel/debug/tracing/buffer_size_kb
  echo 200000 > /sys/kernel/debug/tracing/buffer_size_kb
  reboot

Repeating the above sequence across reboots reproduces the issue: the "atrace" or "echo" command simply gets stuck while resizing the buffer. I'll try to reproduce the issue without atrace as well, but I'm wondering what could lead us to this state.

Thank you.
Raghavendra
