Re: [Cluster-devel] gfs2 is unhappy on pagecache/for-next

Andreas Gruenbacher Mon, 20 Jun 2022 10:20:58 -0700

On Mon, Jun 20, 2022 at 8:21 AM Christoph Hellwig <h...@lst.de> wrote:
> On Sun, Jun 19, 2022 at 01:15:06PM +0100, Matthew Wilcox wrote:
> > On Sun, Jun 19, 2022 at 09:05:59AM +0200, Christoph Hellwig wrote:
> > > When trying to run xfstests on gfs2 (locally with the lock_nolock
> > > cluster manager) the first mount already hits this warning in
> > > inode_to_wb called from mark_buffer_dirty.  This all seems standard
> > > code from folio_account_dirtied, so not sure what is going on there.
> >
> > I don't think this is new to pagecache/for-next.
> > https://lore.kernel.org/linux-mm/cf8bc8dd-8e16-3590-a714-51203e6f4...@redhat.com/
>
> Indeed, I can reproduce this on mainline as well.  I just didn't
> expect a maintained file system to blow up on the very first mount
> in xfstests..


Yes, I'm aware of this. For all I know, we've been having this issue
since Tejun added this warning in 2015 in commit aaa2cacf8184
("writeback: add lockdep annotation to inode_to_wb()"), and I don't
know what to do about it. The only way of building a working version
of gfs2 currently is without CONFIG_LOCKDEP, or by removing that
warning.

My best guess is that it has to do with how gfs2 uses address spaces.
We have two address spaces attached to each inode: one for the inode's
data, and one for the inode's metadata. The "normal" data address
space works as it does on other filesystems. The metadata address
space is used to flush and purge ("truncate") an inode's metadata from
memory so that we can allow other cluster nodes to modify that inode.
The metadata can be spread out over the whole disk, but we want to
flush it in some sensible order; the address space allows that.
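
One way such a two-address-space setup could trip a lock check is sketched
below. This is a toy userspace model, not kernel code: every toy_* name is
hypothetical, and both the shape of the check and the choice of host for the
metadata mapping are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode;

/* Toy stand-in for an address space; NOT the kernel's struct address_space. */
struct toy_mapping {
    struct toy_inode *host;  /* inode this mapping belongs to */
    bool lock_held;          /* stands in for the mapping's own lock */
};

/* Toy stand-in for a VFS inode. */
struct toy_inode {
    struct toy_mapping *i_mapping;  /* the one mapping the inode points at */
    struct toy_mapping data;        /* "normal" data address space */
};

/* Hypothetical gfs2-like inode: data mapping plus a second, metadata mapping. */
struct toy_gfs2_inode {
    struct toy_inode vfs;
    struct toy_mapping meta;
};

static void toy_gfs2_inode_init(struct toy_gfs2_inode *gi)
{
    assert(gi != NULL);
    gi->vfs.data.host = &gi->vfs;
    gi->vfs.data.lock_held = false;
    gi->vfs.i_mapping = &gi->vfs.data;
    /* Assumption: the metadata mapping names the same inode as its host. */
    gi->meta.host = &gi->vfs;
    gi->meta.lock_held = false;
}

/* Assumed rule: the caller must hold the lock of the host inode's i_mapping. */
static bool toy_host_lock_held(const struct toy_mapping *m)
{
    return m->host->i_mapping->lock_held;
}

/* Dirty something in mapping m while holding only m's own lock. */
static bool toy_mark_dirty(struct toy_mapping *m)
{
    bool ok;

    m->lock_held = true;
    ok = toy_host_lock_held(m);  /* false means "the warning would fire" */
    m->lock_held = false;
    return ok;
}
```

In this model, toy_mark_dirty() on the data mapping passes the check because
the mapping is the host's i_mapping, while the same call on the metadata
mapping fails it, since the lock held is not the one the check looks at.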

We switched to that setup in commit 009d851837ab ("GFS2: Metadata
address space clean up") in 2009. Back then, each resource group also
had its own address space, but that was merged into a single address
space in commit 70d4ee94b370 (sd_aspace, "GFS2: Use only a single
address space for rgrps"). But then last year, Jan Kara basically said
that this has never worked and was never going to work [1]. More
recently, Willy pointed us at a similar looking fix in nilfs [2]. If I
understand that fix correctly, it would put us back into the state
before commit 009d851837ab ("GFS2: Metadata address space clean up"),
wasting an entire struct inode for each gfs2 inode for basically
nothing. Or maybe I'm just misunderstanding this whole crap.

Thanks,
Andreas

[1] Jan Kara on July 28, 2021:
https://listman.redhat.com/archives/cluster-devel/2021-July/021300.html

[2] Matthew Wilcox on May 22, 2022:
https://lore.kernel.org/lkml/yordhw5umhutq...@casper.infradead.org/
