On Wed, 16 Nov 2005, Ram Pai wrote:

> > > > 
> > > > Aren't there other paths that enter revalidate without holding the 
> > > > semaphore? chdir?
> > > 
> > > Looking at the code and it seemed to me that ->revalidate() function is
> > > always with the semaphore held. Atleast VFS seem to have that
> > > assumption.
> > 
> > My reading gives me the opposite impression.
> > 
> > The chdir example above calls the path walking routine which calls the 
> > lookup method with the semaphore held and revalidate without it held. 
> > Clearly, there are a number of other examples.
> > 
> > Certainly, I could be wrong and I'll be checking that.
> 
> I see your point. Looking more through the code it looks like
> the convention about how ->revalidate() gets called, seems to be
> inconsistent in VFS.
> 
> in do_lookup() which calls ->revalidate(), the semaphore is not-held.
> 
> Where as lookup_one_len() is expected to be called with the semaphore
> held. This function calls lookup_hash() which calls cached_lookup()
> which later calls ->revalidate(), and here ->revalidate() is called with
> the semaphore held.  Is this the source of the bug?

Yep. My focus has been very much on link_path_walk so I've missed this 
case.

I understood what was going on after reading your answer to Jeffs 
question. I realized then what lead to it doesn't matter, your point being 
that it's possible to cause this by calling lookup_one_len, as you describe.

> > 
> > What I can't work out is how getdents appears to be called without having 
> > called open. Is there anything more that you can tell me about how you 
> > have been able to demonstrate this error.
> > 
> Maybe you are missing the stubfs part. The stubfs is kind of
> in-the-middle filesystem which sits between the application and
> the autofs4. 

Yep. Got that from the other post to.

> 
> Will Taber: Am I saying this right?

Your original description was fine.

Thanks for putting in the effort to make it clear.

> 
> Take a look at the test patch for stubfs posted at
> http://www.sudhaa.com/~ram/readahead/stubfs.patch
> 

I will. Soon as I get a chance. Sounds interesting.

> 
> For clarity here is the scenario:
> 
> P1 executes 'ls' on a directory belonging to stubfs. 
>           stubfs's ->lookup() gets
>         called and it internally redirects that lookup to autofs4
>           by calling lookup_one_len() on /net/ram 
>           note: /net belongs to autofs4 and lookup_one_len() is
>            called holding the inode-semaphore of /net .
>            lookup_one_len() calls lookup_hash() which finds that there
>          is no cached dentry for 'ram', and hence allocates a dentry
>            and calls ->lookup() of autofs4. 
>            autofs4 adds the dentry to the dcache and calls its
>          ->revalidate() after releasing the semaphore.
>            ->revalidate tries to wake up the automounter daemon, and
>             goes to sleep on a waitq.
> 
> P2 executes 'ls' on another directory belonging to stubfs. 
>          stubfs's ->lookup()
>          gets called and it internally redirects that lookup to autofs4
>          by calling lookup_one_len() on /net/ram. lookup_one_len() is
>          called holding the inode-semaphore of /net.  
>          lookup_one_len() calls lookup_hash() which calls
>          cached_lookup(). cached_lookup() finds the dentry
>          corresponding to 'ram' in the dcache. So it calls
>          ->revalidate() on it. NOTE: this time autofs4's 
>          ->revalidate() is called holding the semaphore.
>          ->revalidate() goes to sleep on the same waitq 
>          waiting on the automounter to wake him up.
> 
> automouter: the automounter now comes in and tries to hold 
>            the semaphore on /net and deadlocks.
> 
> The question is: Who is the culprit?  stubfs?  VFS? or
>              autofs4?

I'm happy to fix it in autofs unless you feel we need to address the wider 
issue.

I'll put together a patch which takes account of this and pushes the 
hold/release down into try_to_fill_dentry. But I would like a little 
time to think about whether there may be other implications.

Ian

_______________________________________________
autofs mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/autofs

Reply via email to