From: Chandan Rajendra <chan...@linux.vnet.ibm.com>

[ Upstream commit 97dcdea076ecef41ea4aaa23d4397c2f622e4265 ]

The following deadlock is seen when executing generic/113 test,

 
---------------------------------------------------------+----------------------------------------------------
  Direct I/O task                                           Fast fsync task
 
---------------------------------------------------------+----------------------------------------------------
  btrfs_direct_IO
    __blockdev_direct_IO
     do_blockdev_direct_IO
      do_direct_IO
       btrfs_get_blocks_direct
        while (blocks needs to written)
         get_more_blocks (first iteration)
          btrfs_get_blocks_direct
           btrfs_create_dio_extent
             down_read(&BTRFS_I(inode) >dio_sem)
             Create and add extent map and ordered extent
             up_read(&BTRFS_I(inode) >dio_sem)
                                                            btrfs_sync_file
                                                              
btrfs_log_dentry_safe
                                                               
btrfs_log_inode_parent
                                                                btrfs_log_inode
                                                                 
btrfs_log_changed_extents
                                                                  
down_write(&BTRFS_I(inode) >dio_sem)
                                                                   Collect new 
extent maps and ordered extents
                                                                    wait for 
ordered extent completion
         get_more_blocks (second iteration)
          btrfs_get_blocks_direct
           btrfs_create_dio_extent
             down_read(&BTRFS_I(inode) >dio_sem)
 
--------------------------------------------------------------------------------------------------------------

In the above description, Btrfs direct I/O code path has not yet started
submitting bios for file range covered by the initial ordered
extent. Meanwhile, The fast fsync task obtains the write semaphore and
waits for I/O on the ordered extent to get completed. However, the
Direct I/O task is now blocked on obtaining the read semaphore.

To resolve the deadlock, this commit modifies the Direct I/O code path
to obtain the read semaphore before invoking
__blockdev_direct_IO(). The semaphore is then given up after
__blockdev_direct_IO() returns. This allows the Direct I/O code to
complete I/O on all the ordered extents it creates.

Signed-off-by: Chandan Rajendra <chan...@linux.vnet.ibm.com>
Reviewed-by: Filipe Manana <fdman...@suse.com>
Signed-off-by: David Sterba <dste...@suse.com>
Signed-off-by: Sasha Levin <alexander.le...@verizon.com>
---
 fs/btrfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index bddbae796941..cada3f977baf 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7235,7 +7235,6 @@ static struct extent_map *btrfs_create_dio_extent(struct 
inode *inode,
        struct extent_map *em = NULL;
        int ret;
 
-       down_read(&BTRFS_I(inode)->dio_sem);
        if (type != BTRFS_ORDERED_NOCOW) {
                em = create_pinned_em(inode, start, len, orig_start,
                                      block_start, block_len, orig_block_len,
@@ -7254,7 +7253,6 @@ static struct extent_map *btrfs_create_dio_extent(struct 
inode *inode,
                em = ERR_PTR(ret);
        }
  out:
-       up_read(&BTRFS_I(inode)->dio_sem);
 
        return em;
 }
@@ -8707,6 +8705,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct 
iov_iter *iter)
                dio_data.unsubmitted_oe_range_start = (u64)offset;
                dio_data.unsubmitted_oe_range_end = (u64)offset;
                current->journal_info = &dio_data;
+               down_read(&BTRFS_I(inode)->dio_sem);
        } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
                                     &BTRFS_I(inode)->runtime_flags)) {
                inode_dio_end(inode);
@@ -8719,6 +8718,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct 
iov_iter *iter)
                                   iter, btrfs_get_blocks_direct, NULL,
                                   btrfs_submit_direct, flags);
        if (iov_iter_rw(iter) == WRITE) {
+               up_read(&BTRFS_I(inode)->dio_sem);
                current->journal_info = NULL;
                if (ret < 0 && ret != -EIOCBQUEUED) {
                        if (dio_data.reserve)
-- 
2.11.0

Reply via email to