Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Thu, Feb 08, 2007 at 11:14:04PM -0800, Bill Huey wrote: > I have an adaptive lock implementation in my tree that eliminates > the contention between what looks like the IDE layer, work queues and > the anticipatory scheduler, but that's not a real problem unlike what > I've mentioned above. I can get you and others more specifics on the > problem if folks working on the lower layers want it. Correction, it eliminates all of the "live" contentions where the mutex owner isn't asleep already. bill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Thu, Feb 08, 2007 at 01:01:21AM -0800, Bill Huey wrote: > Christoph, > > The i_mutex lock the inode structure is also a source of contention > heavy when running a lot of parallel "find"s. I'm sure that folks > would be open to hearing suggestions regarding how to fix that. Christoph, And while you're at it, you should also know that dcache_lock is next in line to be nixed out of existence if possible. i_mutex is a bitch and I'm not even going to think about how to get rid of it since it's so widely used in many places (file systems aren't my think as well). Maybe some more precise tracking of contention paths would be useful to see if there's a pathological case creating a cascade of contention events so that can be nixed, don't know. About 1/10th of the lock stat events I've logged report that the owner of the rtmutex is the "current" on a runqueue some where. An adaptive lock would help with those contention events (spin for it while owner is running for the mutex release) but really, the contention should be avoided in the first place since they have a kind of (I think) polynomial increase in contention time as you add more processors to the mix. I have an adaptive lock implementation in my tree that eliminates the contention between what looks like the IDE layer, work queues and the anticipatory scheduler, but that's not a real problem unlike what I've mentioned above. I can get you and others more specifics on the problem if folks working on the lower layers want it. Other than that the -rt patch does quite well with instrumenting all sort of kernel behaviors that include contention and latency issues. So I don't need to tell you how valuable the -rt patch is for these issues since it's obvious, and I'm sure that you'll agree, that it's been instrumental at discovering many problems with the stock kernel. bill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Feb 7 2007 19:06, Christoph Hellwig wrote: >On Wed, Feb 07, 2007 at 07:03:05PM +0100, Jan Engelhardt wrote: > >> With filesystems that can turn on their quota after mount time (about >> every fs except xfs), I can surely have a ton of files open, and hence, >> if I understand correctly, have lots of inodes instantiated. > >Yes, you can in theory. But turning on quota on a filesystem in full >steam useage is not a common use case and thus there is no point in >optimizing for it. I put it to a test in a default scenario (quotaon sometime at boot). SUSE Linux 10.1 with reiserfs(usrquota,grpquota). This is the result: add_dquot_ref: 30 files in sb->s_files for hda2 add_dquot_ref: Restarting after 12 files add_dquot_ref: 30 files in sb->s_files for hda2 add_dquot_ref: Restarted 1 times add_dquot_ref: 30 files in sb->s_files for hda2 add_dquot_ref: Restarting after 12 files add_dquot_ref: 30 files in sb->s_files for hda2 add_dquot_ref: Restarted 1 times Surprisingly few. I had expected to see more instantiated (but not necessarily open) files. Patch for reference: Index: 18/fs/dquot.c === --- 18.orig/fs/dquot.c +++ 18/fs/dquot.c @@ -689,22 +689,35 @@ static int dqinit_needed(struct inode *i static void add_dquot_ref(struct super_block *sb, int type) { struct list_head *p; +int restart = 0; +int num_files; restart: file_list_lock(); +num_files = 0; +list_for_each(p, >s_files) { +++num_files; +} +printk(KERN_WARNING "add_dquot_ref: %d files in sb->s_files for %s\n", +num_files, sb->s_id); +num_files = 0; list_for_each(p, >s_files) { struct file *filp = list_entry(p, struct file, f_u.fu_list); struct inode *inode = filp->f_dentry->d_inode; +++num_files; if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) { struct dentry *dentry = dget(filp->f_dentry); file_list_unlock(); sb->dq_op->initialize(inode, type); dput(dentry); /* As we may have blocked we had better restart... */ +printk(KERN_WARNING "add_dquot_ref: Restarting after %d files\n", num_files); +++restart; goto restart; } } file_list_unlock(); +printk(KERN_WARNING "add_dquot_ref: Restarted %d times\n", restart); } /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean blocking) */ BTW, whilst looking for the function to return a readable name, I came across reiserfs_bdevname() and bdevname(). The former uses sb->s_id, the latter struct gendisk->hd_name+partition number. Could someone elaborate on why there are two ways? Jan -- ft: http://freshmeat.net/p/chaostables/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Tue, Feb 06, 2007 at 02:23:33PM +0100, Christoph Hellwig wrote: > Iterate over sb->s_inodes instead of sb->s_files in add_dquot_ref. > This reduces list search and lock hold time aswell as getting rid of > one of the few uses of file_list_lock which Ingo identified as a > scalability problem. Christoph, The i_mutex lock the inode structure is also a source of contention heavy when running a lot of parallel "find"s. I'm sure that folks would be open to hearing suggestions regarding how to fix that. bill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Tue, Feb 06, 2007 at 02:23:33PM +0100, Christoph Hellwig wrote: Iterate over sb-s_inodes instead of sb-s_files in add_dquot_ref. This reduces list search and lock hold time aswell as getting rid of one of the few uses of file_list_lock which Ingo identified as a scalability problem. Christoph, The i_mutex lock the inode structure is also a source of contention heavy when running a lot of parallel finds. I'm sure that folks would be open to hearing suggestions regarding how to fix that. bill - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Feb 7 2007 19:06, Christoph Hellwig wrote: On Wed, Feb 07, 2007 at 07:03:05PM +0100, Jan Engelhardt wrote: With filesystems that can turn on their quota after mount time (about every fs except xfs), I can surely have a ton of files open, and hence, if I understand correctly, have lots of inodes instantiated. Yes, you can in theory. But turning on quota on a filesystem in full steam useage is not a common use case and thus there is no point in optimizing for it. I put it to a test in a default scenario (quotaon sometime at boot). SUSE Linux 10.1 with reiserfs(usrquota,grpquota). This is the result: add_dquot_ref: 30 files in sb-s_files for hda2 add_dquot_ref: Restarting after 12 files add_dquot_ref: 30 files in sb-s_files for hda2 add_dquot_ref: Restarted 1 times add_dquot_ref: 30 files in sb-s_files for hda2 add_dquot_ref: Restarting after 12 files add_dquot_ref: 30 files in sb-s_files for hda2 add_dquot_ref: Restarted 1 times Surprisingly few. I had expected to see more instantiated (but not necessarily open) files. Patch for reference: Index: 18/fs/dquot.c === --- 18.orig/fs/dquot.c +++ 18/fs/dquot.c @@ -689,22 +689,35 @@ static int dqinit_needed(struct inode *i static void add_dquot_ref(struct super_block *sb, int type) { struct list_head *p; +int restart = 0; +int num_files; restart: file_list_lock(); +num_files = 0; +list_for_each(p, sb-s_files) { +++num_files; +} +printk(KERN_WARNING add_dquot_ref: %d files in sb-s_files for %s\n, +num_files, sb-s_id); +num_files = 0; list_for_each(p, sb-s_files) { struct file *filp = list_entry(p, struct file, f_u.fu_list); struct inode *inode = filp-f_dentry-d_inode; +++num_files; if (filp-f_mode FMODE_WRITE dqinit_needed(inode, type)) { struct dentry *dentry = dget(filp-f_dentry); file_list_unlock(); sb-dq_op-initialize(inode, type); dput(dentry); /* As we may have blocked we had better restart... */ +printk(KERN_WARNING add_dquot_ref: Restarting after %d files\n, num_files); +++restart; goto restart; } } file_list_unlock(); +printk(KERN_WARNING add_dquot_ref: Restarted %d times\n, restart); } /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean blocking) */ BTW, whilst looking for the function to return a readable name, I came across reiserfs_bdevname() and bdevname(). The former uses sb-s_id, the latter struct gendisk-hd_name+partition number. Could someone elaborate on why there are two ways? Jan -- ft: http://freshmeat.net/p/chaostables/ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Thu, Feb 08, 2007 at 01:01:21AM -0800, Bill Huey wrote: Christoph, The i_mutex lock the inode structure is also a source of contention heavy when running a lot of parallel finds. I'm sure that folks would be open to hearing suggestions regarding how to fix that. Christoph, And while you're at it, you should also know that dcache_lock is next in line to be nixed out of existence if possible. i_mutex is a bitch and I'm not even going to think about how to get rid of it since it's so widely used in many places (file systems aren't my think as well). Maybe some more precise tracking of contention paths would be useful to see if there's a pathological case creating a cascade of contention events so that can be nixed, don't know. About 1/10th of the lock stat events I've logged report that the owner of the rtmutex is the current on a runqueue some where. An adaptive lock would help with those contention events (spin for it while owner is running for the mutex release) but really, the contention should be avoided in the first place since they have a kind of (I think) polynomial increase in contention time as you add more processors to the mix. I have an adaptive lock implementation in my tree that eliminates the contention between what looks like the IDE layer, work queues and the anticipatory scheduler, but that's not a real problem unlike what I've mentioned above. I can get you and others more specifics on the problem if folks working on the lower layers want it. Other than that the -rt patch does quite well with instrumenting all sort of kernel behaviors that include contention and latency issues. So I don't need to tell you how valuable the -rt patch is for these issues since it's obvious, and I'm sure that you'll agree, that it's been instrumental at discovering many problems with the stock kernel. bill - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Thu, Feb 08, 2007 at 11:14:04PM -0800, Bill Huey wrote: I have an adaptive lock implementation in my tree that eliminates the contention between what looks like the IDE layer, work queues and the anticipatory scheduler, but that's not a real problem unlike what I've mentioned above. I can get you and others more specifics on the problem if folks working on the lower layers want it. Correction, it eliminates all of the live contentions where the mutex owner isn't asleep already. bill - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Wed, Feb 07, 2007 at 07:03:05PM +0100, Jan Engelhardt wrote: > With filesystems that can turn on their quota after mount time (about > every fs except xfs), I can surely have a ton of files open, and hence, > if I understand correctly, have lots of inodes instantiated. Yes, you can in theory. But turning on quota on a filesystem in full steam useage is not a common use case and thus there is no point in optimizing for it. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Feb 7 2007 08:22, Christoph Hellwig wrote: >> That loop has (and had) up to O(n^n) operations. Is there something which >> prevents this from going insane? > >I don't think so. Then again it's only called when you call quotaon on >a mounted filesystem, and normally you don't have that many inodes >instanciated at that time. With filesystems that can turn on their quota after mount time (about every fs except xfs), I can surely have a ton of files open, and hence, if I understand correctly, have lots of inodes instantiated. Jan -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Tue 06-02-07 15:50:01, Andrew Morton wrote: > On Tue, 6 Feb 2007 14:23:33 +0100 > Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > static void add_dquot_ref(struct super_block *sb, int type) > > { > > - struct list_head *p; > > + struct inode *inode; > > > > restart: > > - file_list_lock(); > > - list_for_each(p, >s_files) { > > - struct file *filp = list_entry(p, struct file, f_u.fu_list); > > - struct inode *inode = filp->f_path.dentry->d_inode; > > - if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) { > > - struct dentry *dentry = dget(filp->f_path.dentry); > > - file_list_unlock(); > > - sb->dq_op->initialize(inode, type); > > - dput(dentry); > > - /* As we may have blocked we had better restart... */ > > - goto restart; > > - } > > + spin_lock(_lock); > > + list_for_each_entry(inode, >s_inodes, i_sb_list) { > > + if (!atomic_read(>i_writecount)) > > + continue; > > + if (!dqinit_needed(inode, type)) > > + continue; > > + if (inode->i_state & (I_FREEING|I_WILL_FREE)) > > + continue; > > + > > + __iget(inode); > > + spin_unlock(_lock); > > + > > + sb->dq_op->initialize(inode, type); > > + iput(inode); > > + /* As we may have blocked we had better restart... */ > > + goto restart; > > } > > - file_list_unlock(); > > + spin_unlock(_lock); > > } > > That loop has (and had) up to O(n^n) operations. Is there something which > prevents this from going insane? Huh, I guess you meant O(n^2)... Yes, I know about this but as Christoph pointed out, the loop runs only when quotaon is called on a filesystem and for inodes somebody writes to. Usually, there are not so many such inodes. So I'm not sure it's worth the trouble to fix this. Honza -- Jan Kara <[EMAIL PROTECTED]> SuSE CR Labs - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Tue 06-02-07 15:50:01, Andrew Morton wrote: On Tue, 6 Feb 2007 14:23:33 +0100 Christoph Hellwig [EMAIL PROTECTED] wrote: static void add_dquot_ref(struct super_block *sb, int type) { - struct list_head *p; + struct inode *inode; restart: - file_list_lock(); - list_for_each(p, sb-s_files) { - struct file *filp = list_entry(p, struct file, f_u.fu_list); - struct inode *inode = filp-f_path.dentry-d_inode; - if (filp-f_mode FMODE_WRITE dqinit_needed(inode, type)) { - struct dentry *dentry = dget(filp-f_path.dentry); - file_list_unlock(); - sb-dq_op-initialize(inode, type); - dput(dentry); - /* As we may have blocked we had better restart... */ - goto restart; - } + spin_lock(inode_lock); + list_for_each_entry(inode, sb-s_inodes, i_sb_list) { + if (!atomic_read(inode-i_writecount)) + continue; + if (!dqinit_needed(inode, type)) + continue; + if (inode-i_state (I_FREEING|I_WILL_FREE)) + continue; + + __iget(inode); + spin_unlock(inode_lock); + + sb-dq_op-initialize(inode, type); + iput(inode); + /* As we may have blocked we had better restart... */ + goto restart; } - file_list_unlock(); + spin_unlock(inode_lock); } That loop has (and had) up to O(n^n) operations. Is there something which prevents this from going insane? Huh, I guess you meant O(n^2)... Yes, I know about this but as Christoph pointed out, the loop runs only when quotaon is called on a filesystem and for inodes somebody writes to. Usually, there are not so many such inodes. So I'm not sure it's worth the trouble to fix this. Honza -- Jan Kara [EMAIL PROTECTED] SuSE CR Labs - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Feb 7 2007 08:22, Christoph Hellwig wrote: That loop has (and had) up to O(n^n) operations. Is there something which prevents this from going insane? I don't think so. Then again it's only called when you call quotaon on a mounted filesystem, and normally you don't have that many inodes instanciated at that time. With filesystems that can turn on their quota after mount time (about every fs except xfs), I can surely have a ton of files open, and hence, if I understand correctly, have lots of inodes instantiated. Jan -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Wed, Feb 07, 2007 at 07:03:05PM +0100, Jan Engelhardt wrote: With filesystems that can turn on their quota after mount time (about every fs except xfs), I can surely have a ton of files open, and hence, if I understand correctly, have lots of inodes instantiated. Yes, you can in theory. But turning on quota on a filesystem in full steam useage is not a common use case and thus there is no point in optimizing for it. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Tue, Feb 06, 2007 at 03:50:01PM -0800, Andrew Morton wrote: > On Tue, 6 Feb 2007 14:23:33 +0100 > Christoph Hellwig <[EMAIL PROTECTED]> wrote: > > > static void add_dquot_ref(struct super_block *sb, int type) > > { > > - struct list_head *p; > > + struct inode *inode; > > > > restart: > > - file_list_lock(); > > - list_for_each(p, >s_files) { > > - struct file *filp = list_entry(p, struct file, f_u.fu_list); > > - struct inode *inode = filp->f_path.dentry->d_inode; > > - if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) { > > - struct dentry *dentry = dget(filp->f_path.dentry); > > - file_list_unlock(); > > - sb->dq_op->initialize(inode, type); > > - dput(dentry); > > - /* As we may have blocked we had better restart... */ > > - goto restart; > > - } > > + spin_lock(_lock); > > + list_for_each_entry(inode, >s_inodes, i_sb_list) { > > + if (!atomic_read(>i_writecount)) > > + continue; > > + if (!dqinit_needed(inode, type)) > > + continue; > > + if (inode->i_state & (I_FREEING|I_WILL_FREE)) > > + continue; > > + > > + __iget(inode); > > + spin_unlock(_lock); > > + > > + sb->dq_op->initialize(inode, type); > > + iput(inode); > > + /* As we may have blocked we had better restart... */ > > + goto restart; > > } > > - file_list_unlock(); > > + spin_unlock(_lock); > > } > > That loop has (and had) up to O(n^n) operations. Is there something which > prevents this from going insane? I don't think so. Then again it's only called when you call quotaon on a mounted filesystem, and normally you don't have that many inodes instanciated at that time. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Tue, 6 Feb 2007 14:23:33 +0100 Christoph Hellwig <[EMAIL PROTECTED]> wrote: > static void add_dquot_ref(struct super_block *sb, int type) > { > - struct list_head *p; > + struct inode *inode; > > restart: > - file_list_lock(); > - list_for_each(p, >s_files) { > - struct file *filp = list_entry(p, struct file, f_u.fu_list); > - struct inode *inode = filp->f_path.dentry->d_inode; > - if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) { > - struct dentry *dentry = dget(filp->f_path.dentry); > - file_list_unlock(); > - sb->dq_op->initialize(inode, type); > - dput(dentry); > - /* As we may have blocked we had better restart... */ > - goto restart; > - } > + spin_lock(_lock); > + list_for_each_entry(inode, >s_inodes, i_sb_list) { > + if (!atomic_read(>i_writecount)) > + continue; > + if (!dqinit_needed(inode, type)) > + continue; > + if (inode->i_state & (I_FREEING|I_WILL_FREE)) > + continue; > + > + __iget(inode); > + spin_unlock(_lock); > + > + sb->dq_op->initialize(inode, type); > + iput(inode); > + /* As we may have blocked we had better restart... */ > + goto restart; > } > - file_list_unlock(); > + spin_unlock(_lock); > } That loop has (and had) up to O(n^n) operations. Is there something which prevents this from going insane? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb->s_files and file_list_lock usage in dquot.c
On Tue 06-02-07 14:23:33, Christoph Hellwig wrote: > Iterate over sb->s_inodes instead of sb->s_files in add_dquot_ref. > This reduces list search and lock hold time aswell as getting rid of > one of the few uses of file_list_lock which Ingo identified as a > scalability problem. > > Previously we called dq_op->initialize for every inode handing of > a writeable file that wasn't initialized before. Now we're calling > it for every inode that has a non-zero i_writecount, aka a writeable > file descriptor refering to it. > > Thanks a lot to Jan Kara for running this patch through his quota > test harness. > > Note: this is ontop of '[PATCH] move remove_dquot_ref to dqout.c' > which I sent out yesterday. > > > Signed-off-by: Christoph Hellwig <[EMAIL PROTECTED]> Signed-off-by: Jan Kara <[EMAIL PROTECTED]> > Index: linux-2.6/fs/dquot.c > === > --- linux-2.6.orig/fs/dquot.c 2007-02-05 18:54:36.0 +0100 > +++ linux-2.6/fs/dquot.c 2007-02-05 18:59:48.0 +0100 > @@ -689,23 +689,27 @@ > /* This routine is guarded by dqonoff_mutex mutex */ > static void add_dquot_ref(struct super_block *sb, int type) > { > - struct list_head *p; > + struct inode *inode; > > restart: > - file_list_lock(); > - list_for_each(p, >s_files) { > - struct file *filp = list_entry(p, struct file, f_u.fu_list); > - struct inode *inode = filp->f_path.dentry->d_inode; > - if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) { > - struct dentry *dentry = dget(filp->f_path.dentry); > - file_list_unlock(); > - sb->dq_op->initialize(inode, type); > - dput(dentry); > - /* As we may have blocked we had better restart... */ > - goto restart; > - } > + spin_lock(_lock); > + list_for_each_entry(inode, >s_inodes, i_sb_list) { > + if (!atomic_read(>i_writecount)) > + continue; > + if (!dqinit_needed(inode, type)) > + continue; > + if (inode->i_state & (I_FREEING|I_WILL_FREE)) > + continue; > + > + __iget(inode); > + spin_unlock(_lock); > + > + sb->dq_op->initialize(inode, type); > + iput(inode); > + /* As we may have blocked we had better restart... */ > + goto restart; > } > - file_list_unlock(); > + spin_unlock(_lock); > } > > /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean > blocking) */ Honza -- Jan Kara <[EMAIL PROTECTED]> SuSE CR Labs - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] remove sb->s_files and file_list_lock usage in dquot.c
Iterate over sb->s_inodes instead of sb->s_files in add_dquot_ref. This reduces list search and lock hold time aswell as getting rid of one of the few uses of file_list_lock which Ingo identified as a scalability problem. Previously we called dq_op->initialize for every inode handing of a writeable file that wasn't initialized before. Now we're calling it for every inode that has a non-zero i_writecount, aka a writeable file descriptor refering to it. Thanks a lot to Jan Kara for running this patch through his quota test harness. Note: this is ontop of '[PATCH] move remove_dquot_ref to dqout.c' which I sent out yesterday. Signed-off-by: Christoph Hellwig <[EMAIL PROTECTED]> Index: linux-2.6/fs/dquot.c === --- linux-2.6.orig/fs/dquot.c 2007-02-05 18:54:36.0 +0100 +++ linux-2.6/fs/dquot.c2007-02-05 18:59:48.0 +0100 @@ -689,23 +689,27 @@ /* This routine is guarded by dqonoff_mutex mutex */ static void add_dquot_ref(struct super_block *sb, int type) { - struct list_head *p; + struct inode *inode; restart: - file_list_lock(); - list_for_each(p, >s_files) { - struct file *filp = list_entry(p, struct file, f_u.fu_list); - struct inode *inode = filp->f_path.dentry->d_inode; - if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) { - struct dentry *dentry = dget(filp->f_path.dentry); - file_list_unlock(); - sb->dq_op->initialize(inode, type); - dput(dentry); - /* As we may have blocked we had better restart... */ - goto restart; - } + spin_lock(_lock); + list_for_each_entry(inode, >s_inodes, i_sb_list) { + if (!atomic_read(>i_writecount)) + continue; + if (!dqinit_needed(inode, type)) + continue; + if (inode->i_state & (I_FREEING|I_WILL_FREE)) + continue; + + __iget(inode); + spin_unlock(_lock); + + sb->dq_op->initialize(inode, type); + iput(inode); + /* As we may have blocked we had better restart... */ + goto restart; } - file_list_unlock(); + spin_unlock(_lock); } /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean blocking) */ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] remove sb-s_files and file_list_lock usage in dquot.c
Iterate over sb-s_inodes instead of sb-s_files in add_dquot_ref. This reduces list search and lock hold time aswell as getting rid of one of the few uses of file_list_lock which Ingo identified as a scalability problem. Previously we called dq_op-initialize for every inode handing of a writeable file that wasn't initialized before. Now we're calling it for every inode that has a non-zero i_writecount, aka a writeable file descriptor refering to it. Thanks a lot to Jan Kara for running this patch through his quota test harness. Note: this is ontop of '[PATCH] move remove_dquot_ref to dqout.c' which I sent out yesterday. Signed-off-by: Christoph Hellwig [EMAIL PROTECTED] Index: linux-2.6/fs/dquot.c === --- linux-2.6.orig/fs/dquot.c 2007-02-05 18:54:36.0 +0100 +++ linux-2.6/fs/dquot.c2007-02-05 18:59:48.0 +0100 @@ -689,23 +689,27 @@ /* This routine is guarded by dqonoff_mutex mutex */ static void add_dquot_ref(struct super_block *sb, int type) { - struct list_head *p; + struct inode *inode; restart: - file_list_lock(); - list_for_each(p, sb-s_files) { - struct file *filp = list_entry(p, struct file, f_u.fu_list); - struct inode *inode = filp-f_path.dentry-d_inode; - if (filp-f_mode FMODE_WRITE dqinit_needed(inode, type)) { - struct dentry *dentry = dget(filp-f_path.dentry); - file_list_unlock(); - sb-dq_op-initialize(inode, type); - dput(dentry); - /* As we may have blocked we had better restart... */ - goto restart; - } + spin_lock(inode_lock); + list_for_each_entry(inode, sb-s_inodes, i_sb_list) { + if (!atomic_read(inode-i_writecount)) + continue; + if (!dqinit_needed(inode, type)) + continue; + if (inode-i_state (I_FREEING|I_WILL_FREE)) + continue; + + __iget(inode); + spin_unlock(inode_lock); + + sb-dq_op-initialize(inode, type); + iput(inode); + /* As we may have blocked we had better restart... */ + goto restart; } - file_list_unlock(); + spin_unlock(inode_lock); } /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean blocking) */ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Tue 06-02-07 14:23:33, Christoph Hellwig wrote: Iterate over sb-s_inodes instead of sb-s_files in add_dquot_ref. This reduces list search and lock hold time aswell as getting rid of one of the few uses of file_list_lock which Ingo identified as a scalability problem. Previously we called dq_op-initialize for every inode handing of a writeable file that wasn't initialized before. Now we're calling it for every inode that has a non-zero i_writecount, aka a writeable file descriptor refering to it. Thanks a lot to Jan Kara for running this patch through his quota test harness. Note: this is ontop of '[PATCH] move remove_dquot_ref to dqout.c' which I sent out yesterday. Signed-off-by: Christoph Hellwig [EMAIL PROTECTED] Signed-off-by: Jan Kara [EMAIL PROTECTED] Index: linux-2.6/fs/dquot.c === --- linux-2.6.orig/fs/dquot.c 2007-02-05 18:54:36.0 +0100 +++ linux-2.6/fs/dquot.c 2007-02-05 18:59:48.0 +0100 @@ -689,23 +689,27 @@ /* This routine is guarded by dqonoff_mutex mutex */ static void add_dquot_ref(struct super_block *sb, int type) { - struct list_head *p; + struct inode *inode; restart: - file_list_lock(); - list_for_each(p, sb-s_files) { - struct file *filp = list_entry(p, struct file, f_u.fu_list); - struct inode *inode = filp-f_path.dentry-d_inode; - if (filp-f_mode FMODE_WRITE dqinit_needed(inode, type)) { - struct dentry *dentry = dget(filp-f_path.dentry); - file_list_unlock(); - sb-dq_op-initialize(inode, type); - dput(dentry); - /* As we may have blocked we had better restart... */ - goto restart; - } + spin_lock(inode_lock); + list_for_each_entry(inode, sb-s_inodes, i_sb_list) { + if (!atomic_read(inode-i_writecount)) + continue; + if (!dqinit_needed(inode, type)) + continue; + if (inode-i_state (I_FREEING|I_WILL_FREE)) + continue; + + __iget(inode); + spin_unlock(inode_lock); + + sb-dq_op-initialize(inode, type); + iput(inode); + /* As we may have blocked we had better restart... */ + goto restart; } - file_list_unlock(); + spin_unlock(inode_lock); } /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean blocking) */ Honza -- Jan Kara [EMAIL PROTECTED] SuSE CR Labs - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Tue, 6 Feb 2007 14:23:33 +0100 Christoph Hellwig [EMAIL PROTECTED] wrote: static void add_dquot_ref(struct super_block *sb, int type) { - struct list_head *p; + struct inode *inode; restart: - file_list_lock(); - list_for_each(p, sb-s_files) { - struct file *filp = list_entry(p, struct file, f_u.fu_list); - struct inode *inode = filp-f_path.dentry-d_inode; - if (filp-f_mode FMODE_WRITE dqinit_needed(inode, type)) { - struct dentry *dentry = dget(filp-f_path.dentry); - file_list_unlock(); - sb-dq_op-initialize(inode, type); - dput(dentry); - /* As we may have blocked we had better restart... */ - goto restart; - } + spin_lock(inode_lock); + list_for_each_entry(inode, sb-s_inodes, i_sb_list) { + if (!atomic_read(inode-i_writecount)) + continue; + if (!dqinit_needed(inode, type)) + continue; + if (inode-i_state (I_FREEING|I_WILL_FREE)) + continue; + + __iget(inode); + spin_unlock(inode_lock); + + sb-dq_op-initialize(inode, type); + iput(inode); + /* As we may have blocked we had better restart... */ + goto restart; } - file_list_unlock(); + spin_unlock(inode_lock); } That loop has (and had) up to O(n^n) operations. Is there something which prevents this from going insane? - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] remove sb-s_files and file_list_lock usage in dquot.c
On Tue, Feb 06, 2007 at 03:50:01PM -0800, Andrew Morton wrote: On Tue, 6 Feb 2007 14:23:33 +0100 Christoph Hellwig [EMAIL PROTECTED] wrote: static void add_dquot_ref(struct super_block *sb, int type) { - struct list_head *p; + struct inode *inode; restart: - file_list_lock(); - list_for_each(p, sb-s_files) { - struct file *filp = list_entry(p, struct file, f_u.fu_list); - struct inode *inode = filp-f_path.dentry-d_inode; - if (filp-f_mode FMODE_WRITE dqinit_needed(inode, type)) { - struct dentry *dentry = dget(filp-f_path.dentry); - file_list_unlock(); - sb-dq_op-initialize(inode, type); - dput(dentry); - /* As we may have blocked we had better restart... */ - goto restart; - } + spin_lock(inode_lock); + list_for_each_entry(inode, sb-s_inodes, i_sb_list) { + if (!atomic_read(inode-i_writecount)) + continue; + if (!dqinit_needed(inode, type)) + continue; + if (inode-i_state (I_FREEING|I_WILL_FREE)) + continue; + + __iget(inode); + spin_unlock(inode_lock); + + sb-dq_op-initialize(inode, type); + iput(inode); + /* As we may have blocked we had better restart... */ + goto restart; } - file_list_unlock(); + spin_unlock(inode_lock); } That loop has (and had) up to O(n^n) operations. Is there something which prevents this from going insane? I don't think so. Then again it's only called when you call quotaon on a mounted filesystem, and normally you don't have that many inodes instanciated at that time. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/