On Sun, Jan 16, 2011 at 12:30 AM, Al Viro <[email protected]> wrote:
>
>> > AFAICS, it keeps your write-free objectives and gets much saner API.
>> > Shout if you have problems with that...
>>
>> No that sounds good, I don't have a problem with it, although I don't
>> exactly understand what you're getting at in the second paragraph.
>
> OK, I have a hopefully sane implementation in tip of #untested.
>
> There's a fun problem with what you do in do_lookup(), BTW. Look:
> we enter do_lookup() with LOOKUP_RCU. We find dentry in dcache,
> everything's beautiful. The sucker has ->d_revalidate(). We go
> to need_revalidate. There we call do_revalidate(). It calls
> d_revalidate(), which calls ->d_revalidate() and instead of spitting
> into your eye and returning -ECHILD it happily returns 1. So
> do d_revalidate() and then do_revalidate(), without any further
> actions. do_revalidate() has returned our dentry, which is neither
> NULL nor ERR_PTR(), so back in do_lookup() we go to done.
>
> There we set path->mnt and path->dentry and call __follow_mount().
> And damn, it *is* a mountpoint. So we
> * do dput() on dentry we'd never grabbed a reference to
> * grab a reference to a different dentry (and remain in happy
> belief that we are in LOOKUP_RCU mode, and thus don't need to drop it)
> * grab a reference to vfsmount (via lookup_mnt()). Ditto (and
> I haven't checked if grabbing vfsmount_lock twice shared isn't a recipe
> for a deadlocky race with something grabbing it exclusive between these
> nested shared grabs).
> * if we are really unlucky and that mountpoint is, in turn,
> overmounted, we'll hit mntput(). While under vfsmount_lock.
>
> AFAICS, it's badly b0rken. And autofs really steps into that mess.
Ah, I think you're right there, good catch.
> As minimum, we'd need to split need_revalidate: and done: in RCU and non-RCU
> variants. I'm about to fall down right now after an all-nighter (and then
> some); if you put something together before I get up, please throw it
> my way.
It also has a problem with jumping back to need_lookup case there too.
I think separating out those two cases is reasonable.
>
> Note that the problem exists both in mainline and in mainline+automount
> patches; in the latter it's a bit nastier, but in principle the situation
> is the same, so I'd rather see a fix for that in front of automount queue.
Definitely agree.
I'm on the road at the moment so would much appreciate if you can
cut the patch, but I could suggest something along the lines of:
if (nd->flags & LOOKUP_RCU) {
unsigned seq;
*inode = nd->inode;
dentry = __d_lookup_rcu(parent, name, &seq, inode);
if (!dentry)
goto need_lookup_rcu;
/* Memory barrier in read_seqcount_begin of child is enough */
if (__read_seqcount_retry(&parent->d_seq, nd->seq))
return -ECHILD;
nd->seq = seq;
if (dentry->d_flags & DCACHE_OP_REVALIDATE) {
dentry = do_revalidate(dentry, nd);
if (!dentry)
goto need_lookup_rcu;
if (IS_ERR(dentry))
goto fail;
}
path->mnt = mnt;
path->dentry = dentry;
__follow_mount_rcu(nd, path, inode);
} else {
dentry = __d_lookup(parent, name);
if (!dentry)
goto need_lookup;
found:
if (dentry->d_flags & DCACHE_OP_REVALIDATE) {
dentry = do_revalidate(dentry, nd);
if (!dentry)
goto need_lookup;
if (IS_ERR(dentry))
goto fail;
}
path->mnt = mnt;
path->dentry = dentry;
__follow_mount(path);
*inode = path->dentry->d_inode;
}
return 0;
need_lookup_rcu:
if (nameidata_drop_rcu(nd))
return -ECHILD;
need_lookup:
...
_______________________________________________
autofs mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/autofs