On Sat, 2019-09-14 at 17:36 -0400, Cebtenzzre wrote:
> Hi,
>
> I started a balance of one block group, and I saw this in dmesg:
>
> BTRFS info (device sdi1): balance: start -dvrange=2236714319872..2236714319873
> BTRFS info (device sdi1): relocating block group 2236714319872 flags data|raid0
> BTRFS info (device sdi1): found 1 extents
> BTRFS info (device sdi1): found 1 extents
> BTRFS info (device sdi1): found 1 extents
> BTRFS info (device sdi1): found 1 extents
> BTRFS info (device sdi1): found 1 extents
>
> It continued like that for a total of 754 lines until I rebooted. Before
> that, I captured some debug info. I ran this in my shell for a few
> seconds, where PID is the pid of the process that called the balance
> ioctl:
>
> integer i=0; while true; do sudo cat /proc/PID/stack >stack$i; sleep .01010101; i+=1; done
>
> Which effectively gave me stack samples at (close to) 99Hz. Maybe not
> ideal, but I was in a hurry and I didn't want my disks to sustain such
> heavy, repetitive I/O for too long.
>
> I've attached the stack samples as stacks.tar.gz. A few of them are
> empty. To me, it looks like the kernel never left the while (1) loop in
> btrfs_relocate_block_group. The kernel messages seem to confirm this.
>
> I am using Arch Linux with kernel version 5.2.14-arch2, and I specified
> "slub_debug=P,kmalloc-2k" in the kernel cmdline to detect and protect
> against a use-after-free that I found when I had KASAN enabled. Would
> that kernel parameter result in a silent retry if it hit the
> use-after-free?

Please disregard the quoted message. This behavior does appear to be a
result of using the slub_debug option instead of KASAN. It is not
directly caused by BTRFS.

-- 
Cebtenzzre <cebtenz...@gmail.com>