On 2018/11/19 下午10:15, fdman...@kernel.org wrote:
> From: Filipe Manana <fdman...@suse.com>
> 
> If the quota enable and snapshot creation ioctls are called concurrently
> we can get into a deadlock where the task enabling quotas will deadlock
> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> twice, or the task creating a snapshot tries to commit the transaction
> while the task enabling quota waits for the former task to commit the
> transaction while holding the mutex. The following time diagrams show how
> both cases happen.
> 
> First scenario:
> 
>            CPU 0                                    CPU 1
> 
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>    btrfs_quota_enable()
>     mutex_lock(fs_info->qgroup_ioctl_lock)
>     btrfs_start_transaction()
> 
>                                              btrfs_ioctl()
>                                               btrfs_ioctl_snap_create_v2
>                                                create_snapshot()
>                                                 --> adds snapshot to the
>                                                     list pending_snapshots
>                                                     of the current
>                                                     transaction
> 
>     btrfs_commit_transaction()
>      create_pending_snapshots()
>        create_pending_snapshot()
>         qgroup_account_snapshot()
>          btrfs_qgroup_inherit()
>          mutex_lock(fs_info->qgroup_ioctl_lock)
>           --> deadlock, mutex already locked
>               by this task at
>               btrfs_quota_enable()
> 
> Second scenario:
> 
>            CPU 0                                    CPU 1
> 
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>    btrfs_quota_enable()
>     mutex_lock(fs_info->qgroup_ioctl_lock)
>     btrfs_start_transaction()
> 
>                                              btrfs_ioctl()
>                                               btrfs_ioctl_snap_create_v2
>                                                create_snapshot()
>                                                 --> adds snapshot to the
>                                                     list pending_snapshots
>                                                     of the current
>                                                     transaction
> 
>                                                 btrfs_commit_transaction()
>                                                  --> waits for task at
>                                                      CPU 0 to release
>                                                      its transaction
>                                                      handle
> 
>     btrfs_commit_transaction()
>      --> sees another task started
>          the transaction commit first
>      --> releases its transaction
>          handle
>      --> waits for the transaction
>          commit to be completed by
>          the task at CPU 1
> 
>                                                  create_pending_snapshot()
>                                                   qgroup_account_snapshot()
>                                                    btrfs_qgroup_inherit()
>                                                     
> mutex_lock(fs_info->qgroup_ioctl_lock)
>                                                      --> deadlock, task at 
> CPU 0
>                                                          has the mutex locked 
> but
>                                                          it is waiting for us 
> to
>                                                          finish the 
> transaction
>                                                          commit
> 
> So fix this by setting the quota enabled flag in fs_info after committing
> the transaction at btrfs_quota_enable(). This ends up serializing quota
> enable and snapshot creation as if the snapshot creation happened just
> before the quota enable request. The quota rescan task, scheduled after
> committing the transaction in btrfs_quote_enable(), will do the accounting.
> 
> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
> snapshot")
> Signed-off-by: Filipe Manana <fdman...@suse.com>
> ---
> 
> V2: Added second deadlock example to changelog and changed the fix to deal
>     with that case as well.
> 
>  fs/btrfs/qgroup.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d4917c0cddf5..ae1358253b7b 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -1013,16 +1013,22 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>               btrfs_abort_transaction(trans, ret);
>               goto out_free_path;
>       }
> -     spin_lock(&fs_info->qgroup_lock);
> -     fs_info->quota_root = quota_root;
> -     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> -     spin_unlock(&fs_info->qgroup_lock);
>  
>       ret = btrfs_commit_transaction(trans);
>       trans = NULL;
>       if (ret)
>               goto out_free_path;

The main concern here is, if we don't set qgroup enabled bit before we
commit transaction, there will be a window where new tree modification
could happen before QGROUP_ENABLED bit set.

There may be some qgroup reserved space related problem in such case,
but I'm not 100% sure to foresee such problem.


The best way to do this is, commit trans first, and before any other one
trying to start transaction, we set the bit.
However I can't find such infrastructure now (I still remember we used
to have a pending bit to change quota enabled flag, but removed later)

Thanks,
Qu

>  
> +     /*
> +      * Set quota enabled flag after committing the transaction, to avoid
> +      * deadlocks on fs_info->qgroup_ioctl_lock with concurrent snapshot
> +      * creation.
> +      */
> +     spin_lock(&fs_info->qgroup_lock);
> +     fs_info->quota_root = quota_root;
> +     set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
> +     spin_unlock(&fs_info->qgroup_lock);
> +
>       ret = qgroup_rescan_init(fs_info, 0, 1);
>       if (!ret) {
>               qgroup_rescan_zero_tracking(fs_info);
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to