Hi Ross,

On Fri, 7 Jun 2019 at 18:21, Ross Lagerwall <ross.lagerw...@citrix.com> wrote:
> On 5/7/19 9:32 PM, Andreas Gruenbacher wrote:
> > Since commit 64bc06bb32ee ("gfs2: iomap buffered write support"), gfs2 is 
> > doing
> > buffered writes by starting a transaction in iomap_begin, writing a range of
> > pages, and ending that transaction in iomap_end.  This approach suffers from
> > two problems:
> >
> >    (1) Any allocations necessary for the write are done in iomap_begin, so 
> > when
> >    the data aren't journaled, there is no need for keeping the transaction 
> > open
> >    until iomap_end.
> >
> >    (2) Transactions keep the gfs2 log flush lock held.  When
> >    iomap_file_buffered_write calls balance_dirty_pages, this can end up 
> > calling
> >    gfs2_write_inode, which will try to flush the log.  This requires taking 
> > the
> >    log flush lock which is already held, resulting in a deadlock.
> >
> > Fix both of these issues by not keeping transactions open from iomap_begin 
> > to
> > iomap_end.  Instead, start a small transaction in page_prepare and end it in
> > page_done when necessary.
> >
> Unfortunately, this patch broke growing gfs2 filesystems. It is easy to
> reproduce:
>
> $ mkfs.gfs2 -t xxx:yyy /dev/xvdb  4369065
> $ mount /dev/xvdb /mnt
> $ gfs2_grow /mnt (doesn't finish)
> FS: Mount point:             /mnt
> FS: Device:                  /dev/xvdb
> FS: Size:                    4369062 (0x42aaa6)
> DEV: Length:                 13107200 (0xc80000)
> The file system will grow by 34133MB.
>
> Looking at the kernel log, I see it hits the following assertion and
> then hangs trying to withdraw the filesystem (which is a separate
> problem, presumably):
>
> gfs2: fsid=xxx:yyy.0: fatal: assertion "(nbuf <= tr->tr_blocks) &&
> (tr->tr_num_revoke <= tr->tr_revokes)" failed
>     function = gfs2_trans_end, file = fs/gfs2/trans.c, line = 117
> gfs2: fsid=xxx:yyy.0: about to withdraw this file system
>
> Rearranging the code so that it prints information about the transaction
> before the failed withdrawal attempt shows:
> gfs2: fsid=xxx:yyy.0: Transaction created at:
> iomap_write_begin.constprop.45+0xbc/0x380
> gfs2: fsid=xxx:yyy.0: blocks=1 revokes=0 reserved=8 touched=1
> gfs2: fsid=xxx:yyy.0: Buf 1/0 Databuf 1/0 Revoke 0/0
>
> Reverting this commit fixes the issue. Tested with git master as of
> today (16d72dd4891fe).

thanks for the error report. This turns out to be a rounding error in
gfs2_iomap_page_prepare; the attached patch should help.

Thanks,
Andreas

---
 fs/gfs2/bmap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index f42718d..d2a3f038 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -994,9 +994,11 @@ static void gfs2_write_unlock(struct inode *inode)
 static int gfs2_iomap_page_prepare(struct inode *inode, loff_t pos,
                                   unsigned len, struct iomap *iomap)
 {
+       unsigned int blocks;
        struct gfs2_sbd *sdp = GFS2_SB(inode);
 
-       return gfs2_trans_begin(sdp, RES_DINODE + (len >> inode->i_blkbits), 0);
+       blocks = (len + i_blocksize(inode) - 1) >> inode->i_blkbits;
+       return gfs2_trans_begin(sdp, RES_DINODE + blocks, 0);
 }
 
 static void gfs2_iomap_page_done(struct inode *inode, loff_t pos,
-- 
1.8.3.1

Reply via email to