Hi,

On Mon, 20 Apr 2009, Steven Whitehouse wrote:
> On Mon, 2009-04-20 at 15:14 +0200, Fabio M. Di Nitto wrote:
> >
> > On Mon, 2009-04-20 at 14:27 +0200, Kadlecsik Jozsef wrote:
> > >
> > > I reported on the linux-cluster mailing list (see thread
> > > https://www.redhat.com/archives/linux-cluster/2009-March/msg00176.html)
> > > that version cluster-2.03.11 freezes.
> > >
> > > As Wendy Cheng pointed out, the reason for this is that the filesystem API
> > > of the kernel changed and "put_inode" was replaced by "drop_inode": while
> > > "put_inode" was called without holding any lock, "drop_inode" is called
> > > under inode_lock held but the called gfs_sync_page_i may block
> > > (https://www.redhat.com/archives/linux-cluster/2009-April/msg00060.html).
> >
> Its not a replacement as such, they are different operations.
> ->put_inode() was called on each and every iput() (despite what the docs
> said, as Christoph noted) whereas ->drop_inode() is only called when the
> inode is being removed from the cache.

I knew I worded it incorrectly ;-).

> The purpose of drop inode is to allow the fs to override the decision
> about whether the inode needs to be deallocated (->delete_inode()) or not
> (->clear_inode()).
> > Christoph's patch is here:
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=33dcdac2df54e66c447ae03f58c95c7251aa5649
> >
> > > The patch below (applied in gfs-kernel/src/gfs/) fixes the bug by
> > > releasing the lock before calling gfs_sync_page_i and locking it back
> > > after the function call:
> > >
> > > --- gfs-orig/ops_super.c	2009-01-22 13:33:51.000000000 +0100
> > > +++ gfs/ops_super.c	2009-04-06 13:07:06.000000000 +0200
> > > @@ -9,6 +9,7 @@
> > >  #include <linux/statfs.h>
> > >  #include <linux/seq_file.h>
> > >  #include <linux/mount.h>
> > > +#include <linux/writeback.h>
> > >
> > >  #include "gfs.h"
> > >  #include "dio.h"
> > > @@ -68,8 +69,11 @@
> > >  	if (ip &&
> > >  	    !inode->i_nlink &&
> > >  	    S_ISREG(inode->i_mode) &&
> > > -	    !sdp->sd_args.ar_localcaching)
> > > +	    !sdp->sd_args.ar_localcaching) {
> > > +		spin_unlock(&inode_lock);
> > >  		gfs_sync_page_i(inode, DIO_START | DIO_WAIT);
> > > +		spin_lock(&inode_lock);
> > > +	}
> > >  	generic_drop_inode(inode);
> > >  }
> >
> You can't do this in this particular way because you must set the
> inode's state before releasing the inode_lock. That is done in
> generic_drop_inode() and the functions that it calls.
>
> Not to mention which, this doesn't seem to make any sense. It appears to
> be trying to write the inode's data back to disk, but if we've got to
> this point in the inode's life cycle, there should be nothing to write.
>
> Since its only called when i_nlink is 0, thats even more strange since
> (a) it could be moved to ->delete_inode() and (b) why do we care about
> forcing out dirty data for an inode thats being unlinked anyway....
>
> This patch raises more questions than it answers I think.

Do you mean that calling gfs_sync_page_i is superfluous here, and
consequently there is no need for a GFS-specific ->drop_inode function
at all?
Just for the record: without the unlock/lock pair, the node hosting a
mailman queue over GFS froze within a few seconds of the mailman queue
manager being started. With the patch above it ran smoothly for a longer
test period.

Best regards,
Jozsef
--
E-mail : [email protected], [email protected]
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: KFKI Research Institute for Particle and Nuclear Physics
         H-1525 Budapest 114, POB. 49, Hungary
