Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-26 Thread Al Viro
On Sun, Oct 26, 2014 at 04:33:11PM -0700, Linus Torvalds wrote: > On Sun, Oct 26, 2014 at 2:57 PM, Al Viro wrote: > > .. snip .. > > in d_walk(), __list_del() instead of list_del() in __dentry_kill(), > > d_u.d_child > > turning into d_child everywhere, while d_alias turns into d_u.d_alias... >

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-26 Thread Linus Torvalds
On Sun, Oct 26, 2014 at 2:57 PM, Al Viro wrote: > .. snip .. > in d_walk(), __list_del() instead of list_del() in __dentry_kill(), > d_u.d_child > turning into d_child everywhere, while d_alias turns into d_u.d_alias... > > It looks like that way we would get no retries on the second pass. Comm

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-26 Thread Al Viro
On Sun, Oct 26, 2014 at 07:13:32PM +0000, Al Viro wrote: > The comment is not correct. dentry_kill() won't screw the pointer to > parent; it will, however, screw the pointer to next sibling. > > It used to screw the pointer to parent (which is what the first part of > condition was about). After

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-26 Thread Al Viro
On Sun, Oct 26, 2014 at 11:56:08AM -0700, Linus Torvalds wrote: > On Sat, Oct 25, 2014 at 8:57 PM, Al Viro wrote: > Your patch looks fine, and I don't think we can livelock - because we > always set 'seq' to 1 if we retry, and that causes us to get the > exclusive lock, so we'd better not then ent

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-26 Thread Linus Torvalds
On Sat, Oct 25, 2014 at 8:57 PM, Al Viro wrote: > [context for Linus] > > Fuzzer has triggered deadlock in d_walk() with rename_lock taken twice. > AFAICS, the plausible scenario is > (child->d_flags & DCACHE_DENTRY_KILLED) || > triggering while ascending to parent, on the

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-25 Thread Al Viro
[context for Linus] Fuzzer has triggered deadlock in d_walk() with rename_lock taken twice. AFAICS, the plausible scenario is (child->d_flags & DCACHE_DENTRY_KILLED) || triggering while ascending to parent, on the pass with rename_lock already held exclusive. In that case
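The scenario described above (a restart requested on the pass that already holds rename_lock exclusively) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code in fs/dcache.c: `model_seqlock`, `model_begin_or_lock()`, `model_done()` and `model_walk()` are hypothetical names, the even/odd `seq` convention is modelled on the kernel's read_seqbegin_or_lock() helper, and the real restart condition is reduced to a boolean parameter. Where a real spinlock would spin forever, the model counts the second acquisition instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a seqlock; NOT the kernel's seqlock_t. */
struct model_seqlock {
    unsigned sequence;  /* even value: no writer active */
    bool held;          /* exclusive lock currently held */
    int double_lock;    /* attempts to take the lock while already held */
};

static void model_lock(struct model_seqlock *sl)
{
    if (sl->held) {
        /* A real spinlock would deadlock here; the model only counts it. */
        sl->double_lock++;
        return;
    }
    sl->held = true;
}

static void model_unlock(struct model_seqlock *sl)
{
    assert(sl->held);
    sl->held = false;
}

/* Even seq: lockless pass, sample the sequence. Odd seq: take the lock. */
static void model_begin_or_lock(struct model_seqlock *sl, int *seq)
{
    if (!(*seq & 1))
        *seq = (int)sl->sequence;
    else
        model_lock(sl);
}

/* Drop the lock only if this pass actually took it. */
static void model_done(struct model_seqlock *sl, int seq)
{
    if (seq & 1)
        model_unlock(sl);
}

/*
 * Model of a two-pass walk. The first, lockless pass is assumed to lose a
 * race, so the walk retries with seq = 1 (exclusive). If
 * restart_on_locked_pass is true, the locked pass jumps back to the start
 * once without dropping the lock, which is the suspected bug pattern.
 * Returns the number of double-acquisition attempts observed.
 */
static int model_walk(bool restart_on_locked_pass)
{
    struct model_seqlock sl = {0};
    int seq = 0;
    int restarts = 0;
    bool raced = true;  /* assumption: the lockless pass always fails once */

again:
    model_begin_or_lock(&sl, &seq);
    if (raced) {
        raced = false;
        seq = 1;        /* retry under the exclusive lock */
        goto again;
    }
    if (restart_on_locked_pass && restarts == 0) {
        restarts++;
        goto again;     /* restart while the lock is still held */
    }
    model_done(&sl, seq);
    return sl.double_lock;
}
```

Calling `model_walk(false)` takes the lock once on the retry pass and releases it. Calling `model_walk(true)` reaches `model_lock()` a second time while the lock is still held, which corresponds to the "rename_lock taken twice" report.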

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-25 Thread Al Viro
On Sun, Oct 26, 2014 at 03:06:08AM +0000, Al Viro wrote: > > From a quick reading of the code it simply isn't possible for d_walk to > > take the lock twice short of memory corruption. And the fact that the > > code has not changed in years seems to suggest it isn't the obvious > > cause of d_wal

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-25 Thread Al Viro
On Sat, Oct 25, 2014 at 08:01:41PM -0700, Eric W. Biederman wrote: > Sasha Levin writes: > > > Hi all, > > > > While fuzzing with trinity inside a KVM tools guest running the latest -next > > kernel, I've stumbled on the following spew: > > Weird. I took a quick look and I don't see any changes

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-25 Thread Eric W. Biederman
Sasha Levin writes: > Hi all, > > While fuzzing with trinity inside a KVM tools guest running the latest -next > kernel, I've stumbled on the following spew: Weird. I took a quick look and I don't see any changes in d_walk in Al's tree or in Linus's kernel for years. Has read_seqbegin_or_

Re: fs: lockup on rename_mutex in fs/dcache.c:1035

2014-10-25 Thread Al Viro
On Sat, Oct 25, 2014 at 09:39:23PM -0400, Sasha Levin wrote: > [ 6172.870045] trinity-c55/12752 is trying to acquire lock: > [ 6172.870045] (rename_lock){+.+...}, at: d_walk > (include/linux/spinlock.h:309 fs/dcache.c:1035) > [ 6172.870045] > [ 6172.870045] but task is already holding lock: > [ 61