On Wed, Jan 30, 2019 at 02:45:00PM +0800, Anand Jain wrote:
> v3->v4: Fix list corruption as reported by btrfs/073 by David.
>            [1]
>            https://patchwork.kernel.org/patch/10705741/
>       Which I was able to reproduce with an instrumented kernel but not with
>       btrfs/073.
>         In v3 patch, it releases the fs_info::scrub_lock to destroy the
>         work queue, which raced with new scrub requests, overwriting the
>         scrub workers pointers. So in v4, it kills the function
>         scrub_workers_put(), and performs the destroy_workqueue in two
>         stages, with worker pointers copied locally.

> @@ -3932,9 +3925,16 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
>  
>       mutex_lock(&fs_info->scrub_lock);
>       dev->scrub_ctx = NULL;
> -     scrub_workers_put(fs_info);
> +     if (--fs_info->scrub_workers_refcnt == 0) {
> +             scrub_workers = fs_info->scrub_workers;
> +             scrub_wr_comp = fs_info->scrub_wr_completion_workers;
> +             scrub_parity = fs_info->scrub_parity_workers;
> +     }
>       mutex_unlock(&fs_info->scrub_lock);
>  
> +     btrfs_destroy_workqueue(scrub_workers);
> +     btrfs_destroy_workqueue(scrub_wr_comp);
> +     btrfs_destroy_workqueue(scrub_parity);

https://lore.kernel.org/linux-btrfs/1543554924-17397-2-git-send-email-anand.j...@oracle.com/

Compared to the previous version, it's almost the same I think. If
scrub_workers_get races between the unlock and destroy_workqueue, anything
that uses fs_info->scrub_workers will soon use freed memory.

The difference is that the worker pointers are read from fs_info under a
lock but are still used outside it. I haven't tested this version, but from
the analysis of the previous crash I don't see how v4 is supposed to be
better.
