Re: [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan

Filipe Manana Thu, 17 Dec 2020 10:23:58 -0800

On Thu, Dec 17, 2020 at 5:45 PM David Sterba <dste...@suse.cz> wrote:
>
> On Mon, Dec 14, 2020 at 10:10:45AM +0000, fdman...@kernel.org wrote:
> > +static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
> > +{
> > +     return btrfs_fs_closing(fs_info) ||
> > +             test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
> > +}
> > +
> >  static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >  {
> >       struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
> > @@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >       struct btrfs_trans_handle *trans = NULL;
> >       int err = -ENOMEM;
> >       int ret = 0;
> > +     bool stopped = false;
> >
> >       path = btrfs_alloc_path();
> >       if (!path)
> > @@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >       path->skip_locking = 1;
> >
> >       err = 0;
> > -     while (!err && !btrfs_fs_closing(fs_info)) {
> > +     while (!err && !(stopped = rescan_should_stop(fs_info))) {
> >               trans = btrfs_start_transaction(fs_info->fs_root, 0);
> >               if (IS_ERR(trans)) {
> >                       err = PTR_ERR(trans);
> > @@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >       }
> >
> >       mutex_lock(&fs_info->qgroup_rescan_lock);
> > -     if (!btrfs_fs_closing(fs_info))
> > +     if (!stopped)
> >               fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
> >       if (trans) {
> >               ret = update_qgroup_status_item(trans);
> > @@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >
> >       btrfs_end_transaction(trans);
> >
> > -     if (btrfs_fs_closing(fs_info)) {
> > +     if (stopped) {
>
> Thinking aloud, this is slightly different as it uses the cached status
> of fs_closing but there is mutex lock/unlock or transaction start/end
> between the checks so the status could change.
>
> But as the flow goes, we want to get fresh status in the while loop.
> Once it stops because of the fs_closing or remount request, the
> following code does the qgroup status update, wakeups, even though this
> means one more transaction. Remount needs to sync anyway and this should
> be no problem.


Yes, that and the fact that the rescan calls
complete_all(&fs_info->qgroup_rescan_completion) before it logs the
reason why it finished.

So it would be possible for remount to stop the rescan and then
complete, after which the rescan worker would log that an error
happened instead of logging that it was stopped. It's a very big
stretch for that to happen, but an error message would be confusing,
at least from a user's perspective.
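
To make the difference concrete, here is a rough userspace sketch of
the two approaches. It is not the kernel code: every fake_* name is a
made-up stand-in, and the callback only models a remount finishing in
the window after the loop exits and before the worker decides what to
report. Which message the real worker would print in that window
depends on its err value; the sketch only tracks whether the run is
reported as stopped or as something else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of btrfs_fs_info the check reads. */
struct fake_fs_info {
    bool closing;     /* models btrfs_fs_closing() */
    bool remounting;  /* models BTRFS_FS_STATE_REMOUNTING */
};

enum fake_report { FAKE_REPORT_STOPPED, FAKE_REPORT_OTHER };

static bool fake_should_stop(const struct fake_fs_info *fs)
{
    return fs->closing || fs->remounting;
}

/* Models the remount completing and clearing its state bit. */
static void fake_remount_done(struct fake_fs_info *fs)
{
    fs->remounting = false;
}

/*
 * Does up to 'steps' units of pretend work. 'after_loop' (may be NULL)
 * runs once the loop has exited. With use_cached the final decision uses
 * the value saved by the loop condition; otherwise it re-reads the state.
 */
static enum fake_report fake_rescan(struct fake_fs_info *fs, int steps,
                                    bool use_cached,
                                    void (*after_loop)(struct fake_fs_info *))
{
    bool stopped = false;
    int err = 0;
    int done = 0;

    assert(fs != NULL);

    /* Fresh check on every iteration, remembered in 'stopped'. */
    while (!err && !(stopped = fake_should_stop(fs)))
        err = (++done >= steps) ? 1 : 0;  /* 1: nothing left to scan */

    if (after_loop)
        after_loop(fs);

    if (use_cached ? stopped : fake_should_stop(fs))
        return FAKE_REPORT_STOPPED;
    return FAKE_REPORT_OTHER;
}
```

Starting with remounting set and passing fake_remount_done as the
callback, the cached variant still reports the run as stopped, while
the re-checking variant no longer sees the remount and reports
something else.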
