Replace-start and replace-cancel threads can race to create a messy situation leading to UAF. We use the scrub code to write the blocks on the replace target. So if we haven't have set the replace-scrub-running yet, without this patch we just ignore the error and free the target device. When this happens the system panics with UAF error.
Its nice to see that btrfs_dev_replace_finishing() already handles the ECANCELED (replace canceled) situation, but for an unknown reason we aren't using it to cleanup the replace cancel situation, instead we just let the replace cancel ioctl thread to cleanup the target device and return and out of synchronous with the scrub code. This patch 4/9, 5/9 and 6/9 uses the return code of btrfs_scrub_cancel() to check if the scrub was really running. And if its not then shall return an error to the user (replace not started error) so that user can retry replace cancel. And uses btrfs_dev_replace_finishing() code to cleanup after successful cancel of the replace scrub. Further, a suspended replace, when tries to restart, and if it fails (for example target device missing, or excl ops running) it goes to the started state, and so the cli 'btrfs replace status /mnt' hangs with no progress. So patches 2/9 and 3/9 fixes that. As the originals code idea of ECANCELED was limited to the situation of the error only and not user requested, there are unnecessary error log and warn log which 7/9 and 8/9 patches fixes. Patches 1/9 and 9/9 are good to have fixes. Makes a function static and code readability good. Testing: (I did some attempt to convert these into xfstests but need a mechanism where kernel thread can wait for user land script. I thought I could do it using ebfp, but needs more digging on how). As of now hand tested with using procfs to hold kernel thread at (wait_for_user(..)) until user land issues go. 1. umount /btrfs; wipefs -a /dev/sd[b-f] && mkfs.btrfs -fq /dev/sdb && mount /dev/sdb /btrfs && fillfs /btrfs 10000 btrfs replace start /dev/sdb /dev/sdc /btrfs wait_for_user("scrub running is set..waiting"); AND OR wait_for_user("scrub running is NOT set..waiting"); btrfs replace cancel /btrfs wait_for_user_go(); btrfs replace status /btrfs 2. umount /btrfs; wipefs -a /dev/sd[b-f] && mkfs.btrfs -fq /dev/sdb && mount /dev/sdb /btrfs && fillfs /btrfs 10000 btrfs replace start /dev/sdb /dev/sdc /btrfs wait_for_user("scrub running is set..waiting"); AND OR wait_for_user("scrub running is NOT set..waiting"); reboot mount -o device=/dev/sdc /dev/sdb /btrfs wait_for_user_go(); btrfs replace status /btrfs btrfs replace cancel /btrfs btrfs replace status /btrfs 3. umount /btrfs; wipefs -a /dev/sd[b-f] && mkfs.btrfs -fq /dev/sdb && mount /dev/sdb /btrfs && fillfs /btrfs 10000 btrfs replace start /dev/sdb /dev/sdc /btrfs wait_for_user("scrub running is set..waiting"); AND OR wait_for_user("scrub running is NOT set..waiting"); reboot mount -o degraded /dev/sdb /btrfs btrfs replace status /btrfs btrfs replace cancel /btrfs btrfs replace status /btrfs umount /btrfs mount /dev/sdb /btrfs Anand Jain (9): btrfs: mark btrfs_dev_replace_start() as static btrfs: replace go back to suspended if target missing btrfs: replace back to suspend state if EXCL OP is running btrfs: fix UAF due to race between replace start and cancel btrfs: replace cancel is successful if scrub cancel is successful btrfs: replace's scrub must not be running in replace suspended state btrfs: quiten warn if the replace is canceled at finish btrfs: user requsted replace cancel is not an error btrfs: add explicit check for replace result no error fs/btrfs/dev-replace.c | 90 ++++++++++++++++++++++++++++++++++---------------- fs/btrfs/dev-replace.h | 3 -- 2 files changed, 62 insertions(+), 31 deletions(-) -- 1.8.3.1