4.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Harshad Shirwadkar <harsh...@google.com>

commit abbc3f9395c76d554a9ed27d4b1ebfb5d9b0e4ca upstream.

This patch fixes a race between the shutdown path and bio completion
handling. In the ext4 direct I/O path with async I/O, if starting the
journal handle fails after a bio has been submitted to the block layer,
ext4_direct_IO_write() would bail out and pretend that the I/O
failed. The caller then had no way of knowing whether the I/O had
actually been submitted. So instead, we return -EIOCBQUEUED in this
case. Now the caller knows that the I/O was submitted, and the bio
completion handler takes care of the error.
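
The return-value choice described above can be illustrated with a small
user-space sketch. It is not kernel code: pick_return() is a hypothetical
helper that models only the "if (!ret)" guard from the diff, and the
EIOCBQUEUED definition is copied in as an assumption because that constant
is kernel-internal and not exported to user space.

```c
#include <errno.h>

/* Kernel-internal value, defined here only for this user-space sketch. */
#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529
#endif

/*
 * Hypothetical helper, not a kernel function. It models the choice made
 * when ext4_journal_start() fails after the direct I/O call has returned:
 *   ret         - result of the direct I/O call so far
 *   journal_err - negative errno taken from the failed journal start
 */
static long pick_return(long ret, long journal_err)
{
	/* Only report the journal error if nothing else is pending. */
	if (!ret)
		ret = journal_err;
	return ret;
}
```

With an async bio already queued, pick_return(-EIOCBQUEUED, -EROFS) keeps
-EIOCBQUEUED, so the caller still waits for the completion handler. With
ret == 0, pick_return(0, -EROFS) reports the journal error as before.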

Tested: Ran the shutdown xfstest test 461 in a loop for over 2 hours across
4 machines, resulting in over 400 runs. Verified that the race didn't
occur. Before this fix, the race was usually seen within about 20-30
iterations.

Signed-off-by: Harshad Shirwadkar <harsh...@google.com>
Signed-off-by: Theodore Ts'o <ty...@mit.edu>
Cc: sta...@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

---
 fs/ext4/inode.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3767,10 +3767,18 @@ static ssize_t ext4_direct_IO_write(stru
                /* Credits for sb + inode write */
                handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
                if (IS_ERR(handle)) {
-                       /* This is really bad luck. We've written the data
-                        * but cannot extend i_size. Bail out and pretend
-                        * the write failed... */
-                       ret = PTR_ERR(handle);
+                       /*
+                        * We wrote the data but cannot extend
+                        * i_size. Bail out. In async io case, we do
+                        * not return error here because we have
+                        * already submitted the corresponding
+                        * bio. Returning error here makes the caller
+                        * think that this IO is done and failed
+                        * resulting in race with bio's completion
+                        * handler.
+                        */
+                       if (!ret)
+                               ret = PTR_ERR(handle);
                        if (inode->i_nlink)
                                ext4_orphan_del(NULL, inode);
 

