Re: Fix for SMP deadlock in autofs4

2001-04-21 Thread Jeremy Fitzhardinge

On Sat, Apr 21, 2001 at 02:21:38AM -0400, Alexander Viro wrote:
> Looks sane for me. However, I would add check for dentry being hashed and
> would skip the unhashed ones. Otherwise you can get a directory that
> had been removed but is still busy - doesn't look like a right thing to
> do. Jeremy?

It wouldn't hurt.  It can't happen in practice since unlink/rmdir happen
in very controlled ways (only the automount daemon is allowed to perform
those ops, so it will keep them in sync).

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-21 Thread Jeremy Fitzhardinge

On Fri, Apr 20, 2001 at 10:59:43PM -0700, Linus Torvalds wrote:
> It's untested, but looks fairly obvious. It removes the increment, and
> changes autofs4_expire() to properly bump the count of the returned dentry
> (and callers will dput() it when done). This may be unnecessarily careful,
> but it's the RightThing(tm) to do.

I suppose so.  It is pretty paranoid, because of autofs4's extra reference it
can't (shouldn't) ever drop to zero until the filesystem allows it to drop to
zero.  In other words, if it helps, it's hiding another bug.  But you're right,
if this were a general routine, it should definitely return with an elevated
count.

> Jeremy, would you mind verifying that this WorksForYou(tm)?

Looks fine to me.  I'll give it a spin.

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-21 Thread Alexander Viro



On Fri, 20 Apr 2001, Linus Torvalds wrote:

> 
> 
> On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:
> >
> > I kept the dget/put out caution and ignorance, but they're clearly
> > problematic.  I'm happy to drop them if holding dcache_lock is enough
> > to keep the tree stable while I traverse it.
> 
> How does this patch look to you people?
> 
> It's untested, but looks fairly obvious. It removes the increment, and
> changes autofs4_expire() to properly bump the count of the returned dentry
> (and callers will dput() it when done). This may be unnecessarily careful,
> but it's the RightThing(tm) to do.

Looks sane for me. However, I would add check for dentry being hashed and
would skip the unhashed ones. Otherwise you can get a directory that
had been removed but is still busy - doesn't look like a right thing to
do. Jeremy?
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-21 Thread Jeremy Fitzhardinge

On Fri, Apr 20, 2001 at 10:59:43PM -0700, Linus Torvalds wrote:
 It's untested, but looks fairly obvious. It removes the increment, and
 changes autofs4_expire() to properly bump the count of the returned dentry
 (and callers will dput() it when done). This may be unnecessarily careful,
 but it's the RightThing(tm) to do.

I suppose so.  It is pretty paranoid, because of autofs4's extra reference it
can't (shouldn't) ever drop to zero until the filesystem allows it to drop to
zero.  In other words, if it helps, it's hiding another bug.  But you're right,
if this were a general routine, it should definitely return with an elevated
count.

 Jeremy, would you mind verifying that this WorksForYou(tm)?

Looks fine to me.  I'll give it a spin.

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-21 Thread Jeremy Fitzhardinge

On Sat, Apr 21, 2001 at 02:21:38AM -0400, Alexander Viro wrote:
 Looks sane for me. However, I would add check for dentry being hashed and
 would skip the unhashed ones. Otherwise you can get a directory that
 had been removed but is still busy - doesn't look like a right thing to
 do. Jeremy?

It wouldn't hurt.  It can't happen in practice since unlink/rmdir happen
in very controlled ways (only the automount daemon is allowed to perform
those ops, so it will keep them in sync).

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-21 Thread Alexander Viro



On Fri, 20 Apr 2001, Linus Torvalds wrote:

 
 
 On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:
 
  I kept the dget/put out caution and ignorance, but they're clearly
  problematic.  I'm happy to drop them if holding dcache_lock is enough
  to keep the tree stable while I traverse it.
 
 How does this patch look to you people?
 
 It's untested, but looks fairly obvious. It removes the increment, and
 changes autofs4_expire() to properly bump the count of the returned dentry
 (and callers will dput() it when done). This may be unnecessarily careful,
 but it's the RightThing(tm) to do.

Looks sane for me. However, I would add check for dentry being hashed and
would skip the unhashed ones. Otherwise you can get a directory that
had been removed but is still busy - doesn't look like a right thing to
do. Jeremy?
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Linus Torvalds



On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:
>
> I kept the dget/put out caution and ignorance, but they're clearly
> problematic.  I'm happy to drop them if holding dcache_lock is enough
> to keep the tree stable while I traverse it.

How does this patch look to you people?

It's untested, but looks fairly obvious. It removes the increment, and
changes autofs4_expire() to properly bump the count of the returned dentry
(and callers will dput() it when done). This may be unnecessarily careful,
but it's the RightThing(tm) to do.

Jeremy, would you mind verifying that this WorksForYou(tm)?

Linus

-
diff -u --recursive --new-file pre5/linux/fs/autofs4/expire.c linux/fs/autofs4/expire.c
--- pre5/linux/fs/autofs4/expire.c  Mon Oct 23 21:57:38 2000
+++ linux/fs/autofs4/expire.c   Fri Apr 20 22:57:51 2001
@@ -98,8 +98,6 @@
 top, count));
this_parent = top;

-   count--;/* top is passed in after being dgot */
-
if (is_autofs4_dentry(top)) {
count--;
DPRINTK(("is_tree_busy: autofs; count=%d\n", count));
@@ -168,8 +166,6 @@
unsigned long timeout;
struct dentry *root = sb->s_root;
struct list_head *tmp;
-   struct dentry *d;
-   struct vfsmount *p;

if (!sbi->exp_timeout || !root)
return NULL;
@@ -208,23 +204,17 @@
 attempts if expire fails the first time */
ino->last_used = now;
}
-   p = mntget(mnt);
-   d = dget_locked(dentry);
-
-   if (!is_tree_busy(p, d)) {
+   if (!is_tree_busy(mnt, dentry)) {
DPRINTK(("autofs_expire: returning %p %.*s\n",
 dentry, (int)dentry->d_name.len, 
dentry->d_name.name));
/* Start from here next time */
list_del(>d_subdirs);
list_add(>d_subdirs, >d_child);
+   dget(dentry);
spin_unlock(_lock);

-   dput(d);
-   mntput(p);
return dentry;
}
-   dput(d);
-   mntput(p);
}
spin_unlock(_lock);

@@ -251,6 +241,7 @@
pkt.len = dentry->d_name.len;
memcpy(pkt.name, dentry->d_name.name, pkt.len);
pkt.name[pkt.len] = '\0';
+   dput(dentry);

if ( copy_to_user(pkt_p, , sizeof(struct autofs_packet_expire)) )
return -EFAULT;
@@ -278,6 +269,7 @@
de_info->flags |= AUTOFS_INF_EXPIRING;
ret = autofs4_wait(sbi, >d_name, NFY_EXPIRE);
de_info->flags &= ~AUTOFS_INF_EXPIRING;
+   dput(dentry);
}

return ret;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Jeremy Fitzhardinge

On Fri, Apr 20, 2001 at 03:53:45PM -0400, Alexander Viro wrote:
> > Why are we doing the mntget/dget at all? We hold the spinlock, so we know
> > they are not going away. Not doing the mntget/dget means that we (a) run
> > faster and (b) don't have the bug, because we don't need to put the damn
> > things.
> > 
> > Comments?
> 
> It looks like you are right, but I wonder how the hell did that code
> happen at all. Looks like somewhere around 2.4.0-test10-pre* dcache_lock
> was moved out of is_tree_busy() and covered dget/dput. Hmm... Might be
> my fault - I don't remember doing that, but...

I did it.  I couldn't see a point in continiously taking and releasing
the dcache lock, since it just increased complexity and expire is not
a performance-critical path (ie, it happens rarely).

I kept the dget/put out caution and ignorance, but they're clearly
problematic.  I'm happy to drop them if holding dcache_lock is enough
to keep the tree stable while I traverse it.

> Removing that will require an obvious change in is_tree_busy() (shift
> count by 1). However, the real question is WTF are we trying to 
> get in autofs4_expire() - it returns dentry without grabbing a
> reference to it. The only thing that saves us is that we have a
> ramfs-style situation (dentries are pinned until we rmdir) and
> everything up to the point where we silently forget about dentry
> is covered by BKL. Since ->rmdir() is under BKL too it's enough,
> but... Eww... 

The dentry it returns is always an autofs4 dentry, and autofs4 always
keeps a refcount on its dentries like ramfs (because like ramfs,
autofs4 exists only in the dcache).

> Jeremy, what are you really trying to do there? is_tree_busy()
> seems to be written in assumption that mnt/dentry is not a
> mountpoint but root of a subtree with something mounted on its
> leaves. And autofs4_expire() traverses the list of root's
> subdirectories, picks one that has nothing busy mounted in
> _its_ subdirectories and essentially pass the name to caller.
> Which sends that name (of first-level subdirectory) to
> userland.

Exactly right.

> Is that what you really want there? It looks very odd - why don't we pass
> the names of actual mountpoints? What's wrong with the case when foo/bar
> is busy, but foo/baz is not?

Say for example you have an autofs4 filesystem mounted on /net.  When you
do a "cd /net/host", all of host's exported NFS filesystems are mounted
on the directory /net/host; obviously the mountpoint /net/host is an
autofs4 directory.

autofs4_expire traverses the directories in its root and finds the ones
which are currently unused and have been idle for some time.  Since all
the filesystems mounted under /net/host are part of the same logical
tree, it examines them as a single unit so they can be umounted as a
single unit.

Note that /net/host may not itself be a mountpoint.  If host doesn't
export / but only, say, /home and /usr/local, then there'll be a tree of
skeleton directories in the autofs4 filesystem to create the paths
up to the mountpoints (so there'll be /net/host/{home,usr/local}).
But because everything under /net/host is treated as a single unit, it's
only correct for autofs4_expire to return /net/host, not /net/host/home
or /net/host/usr/local.

The simplifying assumption I make is that there's a single root directory
with a number of sub-directories; each subdirectory is treated as
a single unit.  They general case would be to mark some directories
as being the root of an atomic set, and other directories simply being
structural, but the need has never come up (and it can be worked around by
having nested autofs filesystems).

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Alexander Viro



On Fri, 20 Apr 2001, Linus Torvalds wrote:

> Why are we doing the mntget/dget at all? We hold the spinlock, so we know
> they are not going away. Not doing the mntget/dget means that we (a) run
> faster and (b) don't have the bug, because we don't need to put the damn
> things.
> 
> Comments?

It looks like you are right, but I wonder how the hell did that code
happen at all. Looks like somewhere around 2.4.0-test10-pre* dcache_lock
was moved out of is_tree_busy() and covered dget/dput. Hmm... Might be
my fault - I don't remember doing that, but...
 
Anyway, it looks like in that case we can forget about games with
->d_count/->mnt_count. Other cases when we do "safe" dput() under
spinlocks are done under _different_ spinlocks, so they are not
a problem.
 
Removing that will require an obvious change in is_tree_busy() (shift
count by 1). However, the real question is WTF are we trying to 
get in autofs4_expire() - it returns dentry without grabbing a
reference to it. The only thing that saves us is that we have a
ramfs-style situation (dentries are pinned until we rmdir) and
everything up to the point where we silently forget about dentry
is covered by BKL. Since ->rmdir() is under BKL too it's enough,
but... Eww... 

Jeremy, what are you really trying to do there? is_tree_busy()
seems to be written in assumption that mnt/dentry is not a
mountpoint but root of a subtree with something mounted on its
leaves. And autofs4_expire() traverses the list of root's
subdirectories, picks one that has nothing busy mounted in
_its_ subdirectories and essentially pass the name to caller.
Which sends that name (of first-level subdirectory) to
userland.

Is that what you really want there? It looks very odd - why don't we pass
the names of actual mountpoints? What's wrong with the case when foo/bar
is busy, but foo/baz is not?
Al


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Linus Torvalds



On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:
>
> This is a fix for a potential deadlock in autofs4's expire routine.

It's wrong.

I don't think we should be able to do a mntput() _either_ inside the
spinlock. The filesystem should not "know" that mntput is safe.

For this reason I don't think "dput_locked()" is the right answer either.

Why are we doing the mntget/dget at all? We hold the spinlock, so we know
they are not going away. Not doing the mntget/dget means that we (a) run
faster and (b) don't have the bug, because we don't need to put the damn
things.

Comments?

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Jeremy Fitzhardinge

On Fri, Apr 20, 2001 at 05:00:04AM -0400, Alexander Viro wrote:
> Frankly, I'd rather add dput_locked() in dcache.c. The bug is real and
> since autofs4 is not the only place like that... I'll look into that
> stuff.

Sounds fine.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Alexander Viro



On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:

> This is a fix for a potential deadlock in autofs4's expire routine.
> It tries to use dput() while holding the dcache_lock.  This isn't a
> problem in principle since dput() should only try to take the dcache_lock
> when the counter makes a transition to zero, which can't happen in
> this case.  Unfortunately the generic (and only) implementation of
> atomic_dec_and_lock always takes the lock, so deadlocks.

Frankly, I'd rather add dput_locked() in dcache.c. The bug is real and
since autofs4 is not the only place like that... I'll look into that
stuff.
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Fix for SMP deadlock in autofs4

2001-04-20 Thread Jeremy Fitzhardinge

This is a fix for a potential deadlock in autofs4's expire routine.
It tries to use dput() while holding the dcache_lock.  This isn't a
problem in principle since dput() should only try to take the dcache_lock
when the counter makes a transition to zero, which can't happen in
this case.  Unfortunately the generic (and only) implementation of
atomic_dec_and_lock always takes the lock, so deadlocks.

Obviously, this only effects SMP.  UP's wise avoidance of spinlocks
saves it once again.

The simple solution is simply to replace dput() with atomic_dec().
The count can't reach zero because we did a dget_locked() and held
dcache_lock the whole time, so we never need to worry about the rest of
the dput() logic.

--- ../2.4/fs/autofs4/expire.c  Wed Jan 31 00:20:50 2001
+++ fs/autofs4/expire.c Fri Apr 20 01:29:53 2001
@@ -223,7 +223,8 @@
mntput(p);
return dentry;
}
-   dput(d);
+
+   atomic_dec(>d_count); /* dput(), but we'll never hit zero */
mntput(p);
}
spin_unlock(_lock);

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Alexander Viro



On Fri, 20 Apr 2001, Linus Torvalds wrote:

 Why are we doing the mntget/dget at all? We hold the spinlock, so we know
 they are not going away. Not doing the mntget/dget means that we (a) run
 faster and (b) don't have the bug, because we don't need to put the damn
 things.
 
 Comments?

It looks like you are right, but I wonder how the hell did that code
happen at all. Looks like somewhere around 2.4.0-test10-pre* dcache_lock
was moved out of is_tree_busy() and covered dget/dput. Hmm... Might be
my fault - I don't remember doing that, but...
 
Anyway, it looks like in that case we can forget about games with
-d_count/-mnt_count. Other cases when we do "safe" dput() under
spinlocks are done under _different_ spinlocks, so they are not
a problem.
 
Removing that will require an obvious change in is_tree_busy() (shift
count by 1). However, the real question is WTF are we trying to 
get in autofs4_expire() - it returns dentry without grabbing a
reference to it. The only thing that saves us is that we have a
ramfs-style situation (dentries are pinned until we rmdir) and
everything up to the point where we silently forget about dentry
is covered by BKL. Since -rmdir() is under BKL too it's enough,
but... Eww... 

Jeremy, what are you really trying to do there? is_tree_busy()
seems to be written in assumption that mnt/dentry is not a
mountpoint but root of a subtree with something mounted on its
leaves. And autofs4_expire() traverses the list of root's
subdirectories, picks one that has nothing busy mounted in
_its_ subdirectories and essentially pass the name to caller.
Which sends that name (of first-level subdirectory) to
userland.

Is that what you really want there? It looks very odd - why don't we pass
the names of actual mountpoints? What's wrong with the case when foo/bar
is busy, but foo/baz is not?
Al


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Fix for SMP deadlock in autofs4

2001-04-20 Thread Jeremy Fitzhardinge

This is a fix for a potential deadlock in autofs4's expire routine.
It tries to use dput() while holding the dcache_lock.  This isn't a
problem in principle since dput() should only try to take the dcache_lock
when the counter makes a transition to zero, which can't happen in
this case.  Unfortunately the generic (and only) implementation of
atomic_dec_and_lock always takes the lock, so deadlocks.

Obviously, this only effects SMP.  UP's wise avoidance of spinlocks
saves it once again.

The simple solution is simply to replace dput() with atomic_dec().
The count can't reach zero because we did a dget_locked() and held
dcache_lock the whole time, so we never need to worry about the rest of
the dput() logic.

--- ../2.4/fs/autofs4/expire.c  Wed Jan 31 00:20:50 2001
+++ fs/autofs4/expire.c Fri Apr 20 01:29:53 2001
@@ -223,7 +223,8 @@
mntput(p);
return dentry;
}
-   dput(d);
+
+   atomic_dec(d-d_count); /* dput(), but we'll never hit zero */
mntput(p);
}
spin_unlock(dcache_lock);

J

 PGP signature


Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Alexander Viro



On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:

 This is a fix for a potential deadlock in autofs4's expire routine.
 It tries to use dput() while holding the dcache_lock.  This isn't a
 problem in principle since dput() should only try to take the dcache_lock
 when the counter makes a transition to zero, which can't happen in
 this case.  Unfortunately the generic (and only) implementation of
 atomic_dec_and_lock always takes the lock, so deadlocks.

Frankly, I'd rather add dput_locked() in dcache.c. The bug is real and
since autofs4 is not the only place like that... I'll look into that
stuff.
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Jeremy Fitzhardinge

On Fri, Apr 20, 2001 at 05:00:04AM -0400, Alexander Viro wrote:
 Frankly, I'd rather add dput_locked() in dcache.c. The bug is real and
 since autofs4 is not the only place like that... I'll look into that
 stuff.

Sounds fine.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Linus Torvalds



On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:

 This is a fix for a potential deadlock in autofs4's expire routine.

It's wrong.

I don't think we should be able to do a mntput() _either_ inside the
spinlock. The filesystem should not "know" that mntput is safe.

For this reason I don't think "dput_locked()" is the right answer either.

Why are we doing the mntget/dget at all? We hold the spinlock, so we know
they are not going away. Not doing the mntget/dget means that we (a) run
faster and (b) don't have the bug, because we don't need to put the damn
things.

Comments?

Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Linus Torvalds



On Fri, 20 Apr 2001, Jeremy Fitzhardinge wrote:

 I kept the dget/put out caution and ignorance, but they're clearly
 problematic.  I'm happy to drop them if holding dcache_lock is enough
 to keep the tree stable while I traverse it.

How does this patch look to you people?

It's untested, but looks fairly obvious. It removes the increment, and
changes autofs4_expire() to properly bump the count of the returned dentry
(and callers will dput() it when done). This may be unnecessarily careful,
but it's the RightThing(tm) to do.

Jeremy, would you mind verifying that this WorksForYou(tm)?

Linus

-
diff -u --recursive --new-file pre5/linux/fs/autofs4/expire.c linux/fs/autofs4/expire.c
--- pre5/linux/fs/autofs4/expire.c  Mon Oct 23 21:57:38 2000
+++ linux/fs/autofs4/expire.c   Fri Apr 20 22:57:51 2001
@@ -98,8 +98,6 @@
 top, count));
this_parent = top;

-   count--;/* top is passed in after being dgot */
-
if (is_autofs4_dentry(top)) {
count--;
DPRINTK(("is_tree_busy: autofs; count=%d\n", count));
@@ -168,8 +166,6 @@
unsigned long timeout;
struct dentry *root = sb-s_root;
struct list_head *tmp;
-   struct dentry *d;
-   struct vfsmount *p;

if (!sbi-exp_timeout || !root)
return NULL;
@@ -208,23 +204,17 @@
 attempts if expire fails the first time */
ino-last_used = now;
}
-   p = mntget(mnt);
-   d = dget_locked(dentry);
-
-   if (!is_tree_busy(p, d)) {
+   if (!is_tree_busy(mnt, dentry)) {
DPRINTK(("autofs_expire: returning %p %.*s\n",
 dentry, (int)dentry-d_name.len, 
dentry-d_name.name));
/* Start from here next time */
list_del(root-d_subdirs);
list_add(root-d_subdirs, dentry-d_child);
+   dget(dentry);
spin_unlock(dcache_lock);

-   dput(d);
-   mntput(p);
return dentry;
}
-   dput(d);
-   mntput(p);
}
spin_unlock(dcache_lock);

@@ -251,6 +241,7 @@
pkt.len = dentry-d_name.len;
memcpy(pkt.name, dentry-d_name.name, pkt.len);
pkt.name[pkt.len] = '\0';
+   dput(dentry);

if ( copy_to_user(pkt_p, pkt, sizeof(struct autofs_packet_expire)) )
return -EFAULT;
@@ -278,6 +269,7 @@
de_info-flags |= AUTOFS_INF_EXPIRING;
ret = autofs4_wait(sbi, dentry-d_name, NFY_EXPIRE);
de_info-flags = ~AUTOFS_INF_EXPIRING;
+   dput(dentry);
}

return ret;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Fix for SMP deadlock in autofs4

2001-04-20 Thread Jeremy Fitzhardinge

On Fri, Apr 20, 2001 at 03:53:45PM -0400, Alexander Viro wrote:
  Why are we doing the mntget/dget at all? We hold the spinlock, so we know
  they are not going away. Not doing the mntget/dget means that we (a) run
  faster and (b) don't have the bug, because we don't need to put the damn
  things.
  
  Comments?
 
 It looks like you are right, but I wonder how the hell did that code
 happen at all. Looks like somewhere around 2.4.0-test10-pre* dcache_lock
 was moved out of is_tree_busy() and covered dget/dput. Hmm... Might be
 my fault - I don't remember doing that, but...

I did it.  I couldn't see a point in continiously taking and releasing
the dcache lock, since it just increased complexity and expire is not
a performance-critical path (ie, it happens rarely).

I kept the dget/put out caution and ignorance, but they're clearly
problematic.  I'm happy to drop them if holding dcache_lock is enough
to keep the tree stable while I traverse it.

 Removing that will require an obvious change in is_tree_busy() (shift
 count by 1). However, the real question is WTF are we trying to 
 get in autofs4_expire() - it returns dentry without grabbing a
 reference to it. The only thing that saves us is that we have a
 ramfs-style situation (dentries are pinned until we rmdir) and
 everything up to the point where we silently forget about dentry
 is covered by BKL. Since -rmdir() is under BKL too it's enough,
 but... Eww... 

The dentry it returns is always an autofs4 dentry, and autofs4 always
keeps a refcount on its dentries like ramfs (because like ramfs,
autofs4 exists only in the dcache).

 Jeremy, what are you really trying to do there? is_tree_busy()
 seems to be written in assumption that mnt/dentry is not a
 mountpoint but root of a subtree with something mounted on its
 leaves. And autofs4_expire() traverses the list of root's
 subdirectories, picks one that has nothing busy mounted in
 _its_ subdirectories and essentially pass the name to caller.
 Which sends that name (of first-level subdirectory) to
 userland.

Exactly right.

 Is that what you really want there? It looks very odd - why don't we pass
 the names of actual mountpoints? What's wrong with the case when foo/bar
 is busy, but foo/baz is not?

Say for example you have an autofs4 filesystem mounted on /net.  When you
do a "cd /net/host", all of host's exported NFS filesystems are mounted
on the directory /net/host; obviously the mountpoint /net/host is an
autofs4 directory.

autofs4_expire traverses the directories in its root and finds the ones
which are currently unused and have been idle for some time.  Since all
the filesystems mounted under /net/host are part of the same logical
tree, it examines them as a single unit so they can be umounted as a
single unit.

Note that /net/host may not itself be a mountpoint.  If host doesn't
export / but only, say, /home and /usr/local, then there'll be a tree of
skeleton directories in the autofs4 filesystem to create the paths
up to the mountpoints (so there'll be /net/host/{home,usr/local}).
But because everything under /net/host is treated as a single unit, it's
only correct for autofs4_expire to return /net/host, not /net/host/home
or /net/host/usr/local.

The simplifying assumption I make is that there's a single root directory
with a number of sub-directories; each subdirectory is treated as
a single unit.  They general case would be to mark some directories
as being the root of an atomic set, and other directories simply being
structural, but the need has never come up (and it can be worked around by
having nested autofs filesystems).

J

 PGP signature