Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread hui
On Sun, Jan 28, 2007 at 03:30:06PM +, Christoph Hellwig wrote:
> On Sun, Jan 28, 2007 at 04:21:06PM +0100, Ingo Molnar wrote:
> > > > sb->s_files is converted to a lock_list. furthermore to prevent the 
> > > > lock_list_head of getting too contended with concurrent add 
> > > > operations the add is buffered in per cpu filevecs.
> > > 
> > > NACK.  Please don't start using lockdep internals in core code.
> > 
> > what do you mean by that?
> 
> struct lock_list is an lockdep implementation detail and should not
> leak out and be used anywhere else.   If we want something similar it
> should be given a proper name and a header of it's own, but I don't
> think it's a valueable abstraction for the core kernel.

Christoph,

"lock list" has nothing to do with lockdep. It's a relatively new
data type used to construct concurrent linked lists using a spinlock
per entry.

bill

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Ingo Molnar

* Christoph Hellwig <[EMAIL PROTECTED]> wrote:

> > > NACK.  Please don't start using lockdep internals in core code.
> > 
> > what do you mean by that?
> 
> struct lock_list is an lockdep implementation detail and should not 
> leak out and be used anywhere else.  [...]

no, it is not. It is a new data type.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Christoph Hellwig
On Sun, Jan 28, 2007 at 04:32:43PM +0100, Peter Zijlstra wrote:
> please see patch 2/7, its unrelated to lockdep internals.

I can't see any 2/7 on lkml yet..

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Peter Zijlstra
On Sun, 2007-01-28 at 15:30 +, Christoph Hellwig wrote:
> On Sun, Jan 28, 2007 at 04:21:06PM +0100, Ingo Molnar wrote:
> > > > sb->s_files is converted to a lock_list. furthermore to prevent the 
> > > > lock_list_head of getting too contended with concurrent add 
> > > > operations the add is buffered in per cpu filevecs.
> > > 
> > > NACK.  Please don't start using lockdep internals in core code.
> > 
> > what do you mean by that?
> 
> struct lock_list is an lockdep implementation detail and should not
> leak out and be used anywhere else.   If we want something similar it
> should be given a proper name and a header of it's own, but I don't
> think it's a valueable abstraction for the core kernel.

please see patch 2/7, its unrelated to lockdep internals.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Christoph Hellwig
On Sun, Jan 28, 2007 at 04:21:06PM +0100, Ingo Molnar wrote:
> > > sb->s_files is converted to a lock_list. furthermore to prevent the 
> > > lock_list_head of getting too contended with concurrent add 
> > > operations the add is buffered in per cpu filevecs.
> > 
> > NACK.  Please don't start using lockdep internals in core code.
> 
> what do you mean by that?

struct lock_list is an lockdep implementation detail and should not
leak out and be used anywhere else.   If we want something similar it
should be given a proper name and a header of it's own, but I don't
think it's a valueable abstraction for the core kernel.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Ingo Molnar

* Christoph Hellwig <[EMAIL PROTECTED]> wrote:

> On Sun, Jan 28, 2007 at 12:51:22PM +0100, Peter Zijlstra wrote:
> > Break the protection of sb->s_files out from under the global 
> > file_list_lock.
> > 
> > sb->s_files is converted to a lock_list. furthermore to prevent the 
> > lock_list_head of getting too contended with concurrent add 
> > operations the add is buffered in per cpu filevecs.
> 
> NACK.  Please don't start using lockdep internals in core code.

what do you mean by that?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Christoph Hellwig
On Sun, Jan 28, 2007 at 12:51:22PM +0100, Peter Zijlstra wrote:
> Break the protection of sb->s_files out from under the global file_list_lock.
> 
> sb->s_files is converted to a lock_list. furthermore to prevent the 
> lock_list_head of getting too contended with concurrent add operations
> the add is buffered in per cpu filevecs.

NACK.  Please don't start using lockdep internals in core code.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 4/7] fs: break the file_list_lock for sb->s_files

2007-01-28 Thread Peter Zijlstra
Break the protection of sb->s_files out from under the global file_list_lock.

sb->s_files is converted to a lock_list. furthermore to prevent the 
lock_list_head of getting too contended with concurrent add operations
the add is buffered in per cpu filevecs.

This would ordinarily require a flush before a delete operation - to ensure
the to be deleted entry is indeed added to the list. This is avoided by 
storing a pointer to the filevec location in the not yet used list_head.
This pointer can then be used to clear the filevec entry before its actually
added.

The file_flag mess is a bit unfortunate - this can be removed by also
converting tty->tty_files to a lock_list (TODO).

Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 fs/dquot.c   |   10 +-
 fs/file_table.c  |  170 +++
 fs/open.c|2 
 fs/proc/generic.c|8 --
 fs/super.c   |7 -
 include/linux/fs.h   |   23 +
 mm/readahead.c   |2 
 security/selinux/selinuxfs.c |9 +-
 8 files changed, 194 insertions(+), 37 deletions(-)

Index: linux-2.6/fs/dquot.c
===
--- linux-2.6.orig/fs/dquot.c   2007-01-13 21:04:07.0 +0100
+++ linux-2.6/fs/dquot.c2007-01-27 21:07:44.0 +0100
@@ -688,23 +688,21 @@ static int dqinit_needed(struct inode *i
 /* This routine is guarded by dqonoff_mutex mutex */
 static void add_dquot_ref(struct super_block *sb, int type)
 {
-   struct list_head *p;
+   struct file *filp;
 
 restart:
-   file_list_lock();
-   list_for_each(p, >s_files) {
-   struct file *filp = list_entry(p, struct file, f_u.fu_list);
+   filevec_add_drain_all();
+   lock_list_for_each_entry(filp, >s_files, f_u.fu_llist) {
struct inode *inode = filp->f_path.dentry->d_inode;
if (filp->f_mode & FMODE_WRITE && dqinit_needed(inode, type)) {
struct dentry *dentry = dget(filp->f_path.dentry);
-   file_list_unlock();
+   lock_list_for_each_entry_stop(filp, f_u.fu_llist);
sb->dq_op->initialize(inode, type);
dput(dentry);
/* As we may have blocked we had better restart... */
goto restart;
}
}
-   file_list_unlock();
 }
 
 /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean 
blocking) */
Index: linux-2.6/fs/file_table.c
===
--- linux-2.6.orig/fs/file_table.c  2007-01-13 21:04:07.0 +0100
+++ linux-2.6/fs/file_table.c   2007-01-27 21:07:44.0 +0100
@@ -113,7 +113,7 @@ struct file *get_empty_filp(void)
goto fail_sec;
 
tsk = current;
-   INIT_LIST_HEAD(>f_u.fu_list);
+   INIT_LOCK_LIST_HEAD(>f_u.fu_llist);
atomic_set(>f_count, 1);
rwlock_init(>f_owner.lock);
f->f_uid = tsk->fsuid;
@@ -245,32 +245,175 @@ void put_filp(struct file *file)
}
 }
 
-void file_move(struct file *file, struct list_head *list)
+enum {
+   FILEVEC_SIZE = 15
+};
+
+struct filevec {
+   unsigned long nr;
+   struct file *files[FILEVEC_SIZE];
+};
+
+static DEFINE_PER_CPU(struct filevec, sb_fvec);
+
+static inline unsigned int filevec_size(struct filevec *fvec)
 {
-   if (!list)
-   return;
-   file_list_lock();
-   list_move(>f_u.fu_list, list);
-   file_list_unlock();
+   return FILEVEC_SIZE - fvec->nr;
+}
+
+static inline unsigned int filevec_count(struct filevec *fvec)
+{
+   return fvec->nr;
+}
+
+static inline void filevec_reinit(struct filevec *fvec)
+{
+   fvec->nr = 0;
+}
+
+static inline unsigned int filevec_add(struct filevec *fvec, struct file *filp)
+{
+   rcu_assign_pointer(fvec->files[fvec->nr], filp);
+
+   /*
+* Here we do icky stuff in order to avoid flushing the per cpu filevec
+* on list removal.
+*
+* We store the location on the per cpu filevec in the as of yet unused
+* fu_llist.next field and toggle bit 0 to indicate we done so. This
+* allows the removal code to set the filevec entry to NULL, thereby
+* avoiding the list add.
+*
+* Abuse the fu_llist.lock for protection.
+*/
+   spin_lock(>f_u.fu_llist.lock);
+   filp->f_u.fu_llist.next = (void *)>files[fvec->nr];
+   __set_bit(0, (void *)>f_u.fu_llist.next);
+   spin_unlock(>f_u.fu_llist.lock);
+
+   fvec->nr++;
+   return filevec_size(fvec);
+}
+
+static void __filevec_add(struct filevec *fvec)
+{
+   int i;
+
+   for (i = 0; i < filevec_count(fvec); i++) {
+   struct file *filp;
+
+   /*
+* see the comment in 

[PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Peter Zijlstra
Break the protection of sb-s_files out from under the global file_list_lock.

sb-s_files is converted to a lock_list. furthermore to prevent the 
lock_list_head of getting too contended with concurrent add operations
the add is buffered in per cpu filevecs.

This would ordinarily require a flush before a delete operation - to ensure
the to be deleted entry is indeed added to the list. This is avoided by 
storing a pointer to the filevec location in the not yet used list_head.
This pointer can then be used to clear the filevec entry before its actually
added.

The file_flag mess is a bit unfortunate - this can be removed by also
converting tty-tty_files to a lock_list (TODO).

Signed-off-by: Peter Zijlstra [EMAIL PROTECTED]
Signed-off-by: Ingo Molnar [EMAIL PROTECTED]
---
 fs/dquot.c   |   10 +-
 fs/file_table.c  |  170 +++
 fs/open.c|2 
 fs/proc/generic.c|8 --
 fs/super.c   |7 -
 include/linux/fs.h   |   23 +
 mm/readahead.c   |2 
 security/selinux/selinuxfs.c |9 +-
 8 files changed, 194 insertions(+), 37 deletions(-)

Index: linux-2.6/fs/dquot.c
===
--- linux-2.6.orig/fs/dquot.c   2007-01-13 21:04:07.0 +0100
+++ linux-2.6/fs/dquot.c2007-01-27 21:07:44.0 +0100
@@ -688,23 +688,21 @@ static int dqinit_needed(struct inode *i
 /* This routine is guarded by dqonoff_mutex mutex */
 static void add_dquot_ref(struct super_block *sb, int type)
 {
-   struct list_head *p;
+   struct file *filp;
 
 restart:
-   file_list_lock();
-   list_for_each(p, sb-s_files) {
-   struct file *filp = list_entry(p, struct file, f_u.fu_list);
+   filevec_add_drain_all();
+   lock_list_for_each_entry(filp, sb-s_files, f_u.fu_llist) {
struct inode *inode = filp-f_path.dentry-d_inode;
if (filp-f_mode  FMODE_WRITE  dqinit_needed(inode, type)) {
struct dentry *dentry = dget(filp-f_path.dentry);
-   file_list_unlock();
+   lock_list_for_each_entry_stop(filp, f_u.fu_llist);
sb-dq_op-initialize(inode, type);
dput(dentry);
/* As we may have blocked we had better restart... */
goto restart;
}
}
-   file_list_unlock();
 }
 
 /* Return 0 if dqput() won't block (note that 1 doesn't necessarily mean 
blocking) */
Index: linux-2.6/fs/file_table.c
===
--- linux-2.6.orig/fs/file_table.c  2007-01-13 21:04:07.0 +0100
+++ linux-2.6/fs/file_table.c   2007-01-27 21:07:44.0 +0100
@@ -113,7 +113,7 @@ struct file *get_empty_filp(void)
goto fail_sec;
 
tsk = current;
-   INIT_LIST_HEAD(f-f_u.fu_list);
+   INIT_LOCK_LIST_HEAD(f-f_u.fu_llist);
atomic_set(f-f_count, 1);
rwlock_init(f-f_owner.lock);
f-f_uid = tsk-fsuid;
@@ -245,32 +245,175 @@ void put_filp(struct file *file)
}
 }
 
-void file_move(struct file *file, struct list_head *list)
+enum {
+   FILEVEC_SIZE = 15
+};
+
+struct filevec {
+   unsigned long nr;
+   struct file *files[FILEVEC_SIZE];
+};
+
+static DEFINE_PER_CPU(struct filevec, sb_fvec);
+
+static inline unsigned int filevec_size(struct filevec *fvec)
 {
-   if (!list)
-   return;
-   file_list_lock();
-   list_move(file-f_u.fu_list, list);
-   file_list_unlock();
+   return FILEVEC_SIZE - fvec-nr;
+}
+
+static inline unsigned int filevec_count(struct filevec *fvec)
+{
+   return fvec-nr;
+}
+
+static inline void filevec_reinit(struct filevec *fvec)
+{
+   fvec-nr = 0;
+}
+
+static inline unsigned int filevec_add(struct filevec *fvec, struct file *filp)
+{
+   rcu_assign_pointer(fvec-files[fvec-nr], filp);
+
+   /*
+* Here we do icky stuff in order to avoid flushing the per cpu filevec
+* on list removal.
+*
+* We store the location on the per cpu filevec in the as of yet unused
+* fu_llist.next field and toggle bit 0 to indicate we done so. This
+* allows the removal code to set the filevec entry to NULL, thereby
+* avoiding the list add.
+*
+* Abuse the fu_llist.lock for protection.
+*/
+   spin_lock(filp-f_u.fu_llist.lock);
+   filp-f_u.fu_llist.next = (void *)fvec-files[fvec-nr];
+   __set_bit(0, (void *)filp-f_u.fu_llist.next);
+   spin_unlock(filp-f_u.fu_llist.lock);
+
+   fvec-nr++;
+   return filevec_size(fvec);
+}
+
+static void __filevec_add(struct filevec *fvec)
+{
+   int i;
+
+   for (i = 0; i  filevec_count(fvec); i++) {
+   struct file *filp;
+
+   /*
+* see the comment in 

Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Christoph Hellwig
On Sun, Jan 28, 2007 at 12:51:22PM +0100, Peter Zijlstra wrote:
 Break the protection of sb-s_files out from under the global file_list_lock.
 
 sb-s_files is converted to a lock_list. furthermore to prevent the 
 lock_list_head of getting too contended with concurrent add operations
 the add is buffered in per cpu filevecs.

NACK.  Please don't start using lockdep internals in core code.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Ingo Molnar

* Christoph Hellwig [EMAIL PROTECTED] wrote:

 On Sun, Jan 28, 2007 at 12:51:22PM +0100, Peter Zijlstra wrote:
  Break the protection of sb-s_files out from under the global 
  file_list_lock.
  
  sb-s_files is converted to a lock_list. furthermore to prevent the 
  lock_list_head of getting too contended with concurrent add 
  operations the add is buffered in per cpu filevecs.
 
 NACK.  Please don't start using lockdep internals in core code.

what do you mean by that?

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Christoph Hellwig
On Sun, Jan 28, 2007 at 04:21:06PM +0100, Ingo Molnar wrote:
   sb-s_files is converted to a lock_list. furthermore to prevent the 
   lock_list_head of getting too contended with concurrent add 
   operations the add is buffered in per cpu filevecs.
  
  NACK.  Please don't start using lockdep internals in core code.
 
 what do you mean by that?

struct lock_list is an lockdep implementation detail and should not
leak out and be used anywhere else.   If we want something similar it
should be given a proper name and a header of it's own, but I don't
think it's a valueable abstraction for the core kernel.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Peter Zijlstra
On Sun, 2007-01-28 at 15:30 +, Christoph Hellwig wrote:
 On Sun, Jan 28, 2007 at 04:21:06PM +0100, Ingo Molnar wrote:
sb-s_files is converted to a lock_list. furthermore to prevent the 
lock_list_head of getting too contended with concurrent add 
operations the add is buffered in per cpu filevecs.
   
   NACK.  Please don't start using lockdep internals in core code.
  
  what do you mean by that?
 
 struct lock_list is an lockdep implementation detail and should not
 leak out and be used anywhere else.   If we want something similar it
 should be given a proper name and a header of it's own, but I don't
 think it's a valueable abstraction for the core kernel.

please see patch 2/7, its unrelated to lockdep internals.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Christoph Hellwig
On Sun, Jan 28, 2007 at 04:32:43PM +0100, Peter Zijlstra wrote:
 please see patch 2/7, its unrelated to lockdep internals.

I can't see any 2/7 on lkml yet..

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread Ingo Molnar

* Christoph Hellwig [EMAIL PROTECTED] wrote:

   NACK.  Please don't start using lockdep internals in core code.
  
  what do you mean by that?
 
 struct lock_list is an lockdep implementation detail and should not 
 leak out and be used anywhere else.  [...]

no, it is not. It is a new data type.

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 4/7] fs: break the file_list_lock for sb-s_files

2007-01-28 Thread hui
On Sun, Jan 28, 2007 at 03:30:06PM +, Christoph Hellwig wrote:
 On Sun, Jan 28, 2007 at 04:21:06PM +0100, Ingo Molnar wrote:
sb-s_files is converted to a lock_list. furthermore to prevent the 
lock_list_head of getting too contended with concurrent add 
operations the add is buffered in per cpu filevecs.
   
   NACK.  Please don't start using lockdep internals in core code.
  
  what do you mean by that?
 
 struct lock_list is an lockdep implementation detail and should not
 leak out and be used anywhere else.   If we want something similar it
 should be given a proper name and a header of it's own, but I don't
 think it's a valueable abstraction for the core kernel.

Christoph,

lock list has nothing to do with lockdep. It's a relatively new
data type used to construct concurrent linked lists using a spinlock
per entry.

bill

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/