Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Paul E. McKenney
On Tue, Apr 18, 2017 at 05:20:38PM +0200, Johannes Berg wrote:
> On Tue, 2017-04-18 at 08:17 -0700, Paul E. McKenney wrote:
> 
> > > Again, no (S)RCU abuse here, just an ABBA deadlock.
> > 
> > OK, please accept my apologies for failing to follow the thread.
> 
> No worries - just wanted to clarify this in case I was missing
> something.
> 
> > I nevertheless reiterate my advice to run at least some tests with
> > CONFIG_PROVE_RCU=y.  And yes, it would be good to upgrade lockdep
> > to find the above theoretical deadlock.
> 
> Right. It won't matter for debugfs after this patchset, but it would
> indeed be nice - I failed in my attempt though :)

Hey, I was hoping!  ;-)

Thanx, Paul



Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Paul E. McKenney
On Tue, Apr 18, 2017 at 05:20:38PM +0200, Johannes Berg wrote:
> On Tue, 2017-04-18 at 08:17 -0700, Paul E. McKenney wrote:
> 
> > > Again, no (S)RCU abuse here, just an ABBA deadlock.
> > 
> > OK, please accept my apologies for failing to follow the thread.
> 
> No worries - just wanted to clarify this in case I was missing
> something.
> 
> > I nevertheless reiterate my advice to run at least some tests with
> > CONFIG_PROVE_RCU=y.  And yes, it would be good to upgrade lockdep
> > to find the above theoretical deadlock.
> 
> Right. It won't matter for debugfs after this patchset, but it would
> indeed be nice - I failed in my attempt though :)

Hey, I was hoping!  ;-)

Thanx, Paul



Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Johannes Berg
On Tue, 2017-04-18 at 08:17 -0700, Paul E. McKenney wrote:

> > Again, no (S)RCU abuse here, just an ABBA deadlock.
> 
> OK, please accept my apologies for failing to follow the thread.

No worries - just wanted to clarify this in case I was missing
something.

> I nevertheless reiterate my advice to run at least some tests with
> CONFIG_PROVE_RCU=y.  And yes, it would be good to upgrade lockdep
> to find the above theoretical deadlock.

Right. It won't matter for debugfs after this patchset, but it would
indeed be nice - I failed in my attempt though :)

johannes


Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Johannes Berg
On Tue, 2017-04-18 at 08:17 -0700, Paul E. McKenney wrote:

> > Again, no (S)RCU abuse here, just an ABBA deadlock.
> 
> OK, please accept my apologies for failing to follow the thread.

No worries - just wanted to clarify this in case I was missing
something.

> I nevertheless reiterate my advice to run at least some tests with
> CONFIG_PROVE_RCU=y.  And yes, it would be good to upgrade lockdep
> to find the above theoretical deadlock.

Right. It won't matter for debugfs after this patchset, but it would
indeed be nice - I failed in my attempt though :)

johannes


Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Paul E. McKenney
On Tue, Apr 18, 2017 at 03:40:32PM +0200, Johannes Berg wrote:
> On Tue, 2017-04-18 at 06:31 -0700, Paul E. McKenney wrote:
> > On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> > > On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> > > 
> > > > If you have not already done so, please run this with debug
> > > > enabled,
> > > > especially CONFIG_PROVE_LOCKING=y (which implies
> > > > CONFIG_PROVE_RCU=y).
> > > > This is important because there are configurations for which the
> > > > deadlocks you saw with SRCU turn into silent failure, including
> > > > memory corruption.
> > > > CONFIG_PROVE_RCU=y will catch many of those situations.
> > > 
> > > Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> > > enabled in the builds where we saw the problem, but I'm not sure.
> > 
> > CONFIG_PROVE_RCU=y will reliably catch things like this:
> > 
> > 1.  rcu_read_lock();
> > synchronize_rcu();
> > rcu_read_unlock();
> 
> Ok, that's not something that happens here either.
> 
> > 2.  rcu_read_lock();
> > schedule_timeout_interruptible(HZ);
> > rcu_read_unlock();
> 
> Neither is this happening.
> 
> > There are more, but this should get you the flavor of the types
> > of bugs CONFIG_PROVE_RCU=y can locate for you.
> 
> Makes sense. However, the issue at hand is what we (you and I)
> discussed earlier wrt. lockdep -- from SRCU's point of view everything
> is actually OK, except that the one thread is waiting for something and
> we can never finish the grace period, and thus synchronize_srcu() will
> never return. But there's no real SRCU bug here.
> 
> > > Nicolai probably never even ran into this problem, though it should
> > > be easy to reproduce.
> > 
> > I am just worried that the situation resulting in the earlier SRCU
> > deadlocks might be hiding behind CONFIG_PROVE_RCU=n,
> > CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.  Or some other bug
> > hiding behind some other set of Kconfig options.
> 
> There's no SRCU deadlock though. I know exactly why it happens, in my
> case, which is the following:
> 
> Thread 1
> userspace: read(debugfs_file_1)
> srcu_read_lock(_srcu); // in debugfs bowels
> wait_event_interruptible(...); // in my driver's debugfs read method
> 
> Thread 2:
> debugfs_remove(debugfs_file_2);
> srcu_synchronize(_srcu); // in debugfs bowels
> 
> 
> This is the live-lock. The deadlock is something I posited but never
> ran into:
> 
> CPU 1 CPU 2
> srcu_read_lock(_srcu);
>   rtnl_lock();
> rtnl_lock();
>   srcu_synchronize(_srcu);
> 
> Again, no (S)RCU abuse here, just an ABBA deadlock.

OK, please accept my apologies for failing to follow the thread.

I nevertheless reiterate my advice to run at least some tests with
CONFIG_PROVE_RCU=y.  And yes, it would be good to upgrade lockdep
to find the above theoretical deadlock.

Thanx, Paul



Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Paul E. McKenney
On Tue, Apr 18, 2017 at 03:40:32PM +0200, Johannes Berg wrote:
> On Tue, 2017-04-18 at 06:31 -0700, Paul E. McKenney wrote:
> > On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> > > On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> > > 
> > > > If you have not already done so, please run this with debug
> > > > enabled,
> > > > especially CONFIG_PROVE_LOCKING=y (which implies
> > > > CONFIG_PROVE_RCU=y).
> > > > This is important because there are configurations for which the
> > > > deadlocks you saw with SRCU turn into silent failure, including
> > > > memory corruption.
> > > > CONFIG_PROVE_RCU=y will catch many of those situations.
> > > 
> > > Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> > > enabled in the builds where we saw the problem, but I'm not sure.
> > 
> > CONFIG_PROVE_RCU=y will reliably catch things like this:
> > 
> > 1.  rcu_read_lock();
> > synchronize_rcu();
> > rcu_read_unlock();
> 
> Ok, that's not something that happens here either.
> 
> > 2.  rcu_read_lock();
> > schedule_timeout_interruptible(HZ);
> > rcu_read_unlock();
> 
> Neither is this happening.
> 
> > There are more, but this should get you the flavor of the types
> > of bugs CONFIG_PROVE_RCU=y can locate for you.
> 
> Makes sense. However, the issue at hand is what we (you and I)
> discussed earlier wrt. lockdep -- from SRCU's point of view everything
> is actually OK, except that the one thread is waiting for something and
> we can never finish the grace period, and thus synchronize_srcu() will
> never return. But there's no real SRCU bug here.
> 
> > > Nicolai probably never even ran into this problem, though it should
> > > be easy to reproduce.
> > 
> > I am just worried that the situation resulting in the earlier SRCU
> > deadlocks might be hiding behind CONFIG_PROVE_RCU=n,
> > CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.  Or some other bug
> > hiding behind some other set of Kconfig options.
> 
> There's no SRCU deadlock though. I know exactly why it happens, in my
> case, which is the following:
> 
> Thread 1
> userspace: read(debugfs_file_1)
> srcu_read_lock(_srcu); // in debugfs bowels
> wait_event_interruptible(...); // in my driver's debugfs read method
> 
> Thread 2:
> debugfs_remove(debugfs_file_2);
> srcu_synchronize(_srcu); // in debugfs bowels
> 
> 
> This is the live-lock. The deadlock is something I posited but never
> ran into:
> 
> CPU 1 CPU 2
> srcu_read_lock(_srcu);
>   rtnl_lock();
> rtnl_lock();
>   srcu_synchronize(_srcu);
> 
> Again, no (S)RCU abuse here, just an ABBA deadlock.

OK, please accept my apologies for failing to follow the thread.

I nevertheless reiterate my advice to run at least some tests with
CONFIG_PROVE_RCU=y.  And yes, it would be good to upgrade lockdep
to find the above theoretical deadlock.

Thanx, Paul



Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Johannes Berg
On Tue, 2017-04-18 at 06:31 -0700, Paul E. McKenney wrote:
> On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> > On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> > 
> > > If you have not already done so, please run this with debug
> > > enabled,
> > > especially CONFIG_PROVE_LOCKING=y (which implies
> > > CONFIG_PROVE_RCU=y).
> > > This is important because there are configurations for which the
> > > deadlocks you saw with SRCU turn into silent failure, including
> > > memory corruption.
> > > CONFIG_PROVE_RCU=y will catch many of those situations.
> > 
> > Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> > enabled in the builds where we saw the problem, but I'm not sure.
> 
> CONFIG_PROVE_RCU=y will reliably catch things like this:
> 
> 1.rcu_read_lock();
>   synchronize_rcu();
>   rcu_read_unlock();

Ok, that's not something that happens here either.

> 2.rcu_read_lock();
>   schedule_timeout_interruptible(HZ);
>   rcu_read_unlock();

Neither is this happening.

> There are more, but this should get you the flavor of the types
> of bugs CONFIG_PROVE_RCU=y can locate for you.

Makes sense. However, the issue at hand is what we (you and I)
discussed earlier wrt. lockdep -- from SRCU's point of view everything
is actually OK, except that the one thread is waiting for something and
we can never finish the grace period, and thus synchronize_srcu() will
never return. But there's no real SRCU bug here.

> > Nicolai probably never even ran into this problem, though it should
> > be easy to reproduce.
> 
> I am just worried that the situation resulting in the earlier SRCU
> deadlocks might be hiding behind CONFIG_PROVE_RCU=n,
> CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.  Or some other bug
> hiding behind some other set of Kconfig options.

There's no SRCU deadlock though. I know exactly why it happens, in my
case, which is the following:

Thread 1
userspace: read(debugfs_file_1)
srcu_read_lock(_srcu); // in debugfs bowels
wait_event_interruptible(...); // in my driver's debugfs read method

Thread 2:
debugfs_remove(debugfs_file_2);
srcu_synchronize(_srcu); // in debugfs bowels


This is the live-lock. The deadlock is something I posited but never
ran into:

CPU 1   CPU 2
srcu_read_lock(_srcu);
rtnl_lock();
rtnl_lock();
srcu_synchronize(_srcu);

Again, no (S)RCU abuse here, just an ABBA deadlock.

johannes


Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Johannes Berg
On Tue, 2017-04-18 at 06:31 -0700, Paul E. McKenney wrote:
> On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> > On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> > 
> > > If you have not already done so, please run this with debug
> > > enabled,
> > > especially CONFIG_PROVE_LOCKING=y (which implies
> > > CONFIG_PROVE_RCU=y).
> > > This is important because there are configurations for which the
> > > deadlocks you saw with SRCU turn into silent failure, including
> > > memory corruption.
> > > CONFIG_PROVE_RCU=y will catch many of those situations.
> > 
> > Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> > enabled in the builds where we saw the problem, but I'm not sure.
> 
> CONFIG_PROVE_RCU=y will reliably catch things like this:
> 
> 1.rcu_read_lock();
>   synchronize_rcu();
>   rcu_read_unlock();

Ok, that's not something that happens here either.

> 2.rcu_read_lock();
>   schedule_timeout_interruptible(HZ);
>   rcu_read_unlock();

Neither is this happening.

> There are more, but this should get you the flavor of the types
> of bugs CONFIG_PROVE_RCU=y can locate for you.

Makes sense. However, the issue at hand is what we (you and I)
discussed earlier wrt. lockdep -- from SRCU's point of view everything
is actually OK, except that the one thread is waiting for something and
we can never finish the grace period, and thus synchronize_srcu() will
never return. But there's no real SRCU bug here.

> > Nicolai probably never even ran into this problem, though it should
> > be easy to reproduce.
> 
> I am just worried that the situation resulting in the earlier SRCU
> deadlocks might be hiding behind CONFIG_PROVE_RCU=n,
> CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.  Or some other bug
> hiding behind some other set of Kconfig options.

There's no SRCU deadlock though. I know exactly why it happens, in my
case, which is the following:

Thread 1
userspace: read(debugfs_file_1)
srcu_read_lock(_srcu); // in debugfs bowels
wait_event_interruptible(...); // in my driver's debugfs read method

Thread 2:
debugfs_remove(debugfs_file_2);
srcu_synchronize(_srcu); // in debugfs bowels


This is the live-lock. The deadlock is something I posited but never
ran into:

CPU 1   CPU 2
srcu_read_lock(_srcu);
rtnl_lock();
rtnl_lock();
srcu_synchronize(_srcu);

Again, no (S)RCU abuse here, just an ABBA deadlock.

johannes


Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Paul E. McKenney
On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> 
> > If you have not already done so, please run this with debug enabled,
> > especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
> > This is important because there are configurations for which the
> > deadlocks you saw with SRCU turn into silent failure, including
> > memory corruption.
> > CONFIG_PROVE_RCU=y will catch many of those situations.
> 
> Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> enabled in the builds where we saw the problem, but I'm not sure.

CONFIG_PROVE_RCU=y will reliably catch things like this:

1.  rcu_read_lock();
synchronize_rcu();
rcu_read_unlock();

With CONFIG_PROVE_RCU=n and CONFIG_PREEMPT=n, this will result in
too-short grace periods, which can free things out from under the
read-side critical section, which in turn can result in arbitrary
memory corruption.  You might not even get a "scheduling while
atomic", though CONFIG_PREEMPT_COUNT=y will produce this message.

With CONFIG_PREEMPT=y, on the other hand, this should
deadlock in a manner similar to the earlier SRCU deadlocks
seen in debugfs.

2.  rcu_read_lock();
schedule_timeout_interruptible(HZ);
rcu_read_unlock();

With CONFIG_PROVE_RCU=y and CONFIG_PREEMPT=y, this will just
work, more or less.  Until someone runs with CONFIG_PREEMPT=n,
which will produce "scheduling while atomic".  (I have a
fix for this queued for 4.13, FWIW, so that in the future
CONFIG_PROVE_RCU=y and CONFIG_PREEMPT=y will complain about
this.  But for now, silent bug.)

There are more, but this should get you the flavor of the types
of bugs CONFIG_PROVE_RCU=y can locate for you.

> Can you say which configurations you're thinking of? And perhaps what
> kind of corruption you're thinking of also? I'm having a hard time
> imagining any corruption that should happen?

#1 is the silent corruption case given CONFIG_PROVE_RCU=n,
CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.

> Nicolai probably never even ran into this problem, though it should be
> easy to reproduce.

I am just worried that the situation resulting in the earlier SRCU
deadlocks might be hiding behind CONFIG_PROVE_RCU=n, CONFIG_PREEMPT=n,
and CONFIG_PREEMPT_COUNT=n.  Or some other bug hiding behind some
other set of Kconfig options.

Thanx, Paul



Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Paul E. McKenney
On Tue, Apr 18, 2017 at 11:39:27AM +0200, Johannes Berg wrote:
> On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:
> 
> > If you have not already done so, please run this with debug enabled,
> > especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
> > This is important because there are configurations for which the
> > deadlocks you saw with SRCU turn into silent failure, including
> > memory corruption.
> > CONFIG_PROVE_RCU=y will catch many of those situations.
> 
> Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
> enabled in the builds where we saw the problem, but I'm not sure.

CONFIG_PROVE_RCU=y will reliably catch things like this:

1.  rcu_read_lock();
synchronize_rcu();
rcu_read_unlock();

With CONFIG_PROVE_RCU=n and CONFIG_PREEMPT=n, this will result in
too-short grace periods, which can free things out from under the
read-side critical section, which in turn can result in arbitrary
memory corruption.  You might not even get a "scheduling while
atomic", though CONFIG_PREEMPT_COUNT=y will produce this message.

With CONFIG_PREEMPT=y, on the other hand, this should
deadlock in a manner similar to the earlier SRCU deadlocks
seen in debugfs.

2.  rcu_read_lock();
schedule_timeout_interruptible(HZ);
rcu_read_unlock();

With CONFIG_PROVE_RCU=y and CONFIG_PREEMPT=y, this will just
work, more or less.  Until someone runs with CONFIG_PREEMPT=n,
which will produce "scheduling while atomic".  (I have a
fix for this queued for 4.13, FWIW, so that in the future
CONFIG_PROVE_RCU=y and CONFIG_PREEMPT=y will complain about
this.  But for now, silent bug.)

There are more, but this should get you the flavor of the types
of bugs CONFIG_PROVE_RCU=y can locate for you.

> Can you say which configurations you're thinking of? And perhaps what
> kind of corruption you're thinking of also? I'm having a hard time
> imagining any corruption that should happen?

#1 is the silent corruption case given CONFIG_PROVE_RCU=n,
CONFIG_PREEMPT=n, and CONFIG_PREEMPT_COUNT=n.

> Nicolai probably never even ran into this problem, though it should be
> easy to reproduce.

I am just worried that the situation resulting in the earlier SRCU
deadlocks might be hiding behind CONFIG_PROVE_RCU=n, CONFIG_PREEMPT=n,
and CONFIG_PREEMPT_COUNT=n.  Or some other bug hiding behind some
other set of Kconfig options.

Thanx, Paul



Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Johannes Berg
On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:

> If you have not already done so, please run this with debug enabled,
> especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
> This is important because there are configurations for which the
> deadlocks you saw with SRCU turn into silent failure, including
> memory corruption.
> CONFIG_PROVE_RCU=y will catch many of those situations.

Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
enabled in the builds where we saw the problem, but I'm not sure.

Can you say which configurations you're thinking of? And perhaps what
kind of corruption you're thinking of also? I'm having a hard time
imagining any corruption that should happen?

Nicolai probably never even ran into this problem, though it should be
easy to reproduce.

johannes


Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-18 Thread Johannes Berg
On Mon, 2017-04-17 at 09:01 -0700, Paul E. McKenney wrote:

> If you have not already done so, please run this with debug enabled,
> especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
> This is important because there are configurations for which the
> deadlocks you saw with SRCU turn into silent failure, including
> memory corruption.
> CONFIG_PROVE_RCU=y will catch many of those situations.

Can you elaborate on that? I think we may have had CONFIG_PROVE_RCU
enabled in the builds where we saw the problem, but I'm not sure.

Can you say which configurations you're thinking of? And perhaps what
kind of corruption you're thinking of also? I'm having a hard time
imagining any corruption that should happen?

Nicolai probably never even ran into this problem, though it should be
easy to reproduce.

johannes


Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-17 Thread Paul E. McKenney
On Sun, Apr 16, 2017 at 11:51:37AM +0200, Nicolai Stange wrote:
> Currently, a dentry's debugfs_fsdata instance is allocated from
> debugfs_file_get() at first usage, i.e. at first file opening.
> 
> It won't ever get freed though.
> 
> Ideally, these instances would get freed after the last open file handle
> gets closed again, that is from the fops' ->release().
> 
> Unfortunately, we can't hook easily into those ->release()ers of files
> created through debugfs_create_file_unsafe(), i.e. of those proxied through
> debugfs_open_proxy_file_operations rather than through the "full"
> debugfs_full_proxy_file_operations proxy.
> 
> Hence, free unreferenced debugfs_fsdata instances from debugfs_file_put(),
> with the drawback of a potential allocation + deallocation per
> debugfs_file_get() + debugfs_file_put() pair, that is per fops invocation.
> 
> In addition to its former role of tracking pending fops, use the
> ->active_users for reference counting on the debugfs_fsdata instance
> itself. In particular, don't keep a dummy reference to be dropped from
> __debugfs_remove_file(): a d_delete()ed dentry and thus, request for
> completion notification, is now signaled by the d_unlinked() dentry itself.
> 
> Once ->active_users drops to zero (and the dentry is still intact), free
> the debugfs_fsdata instance from debugfs_file_put(). RCU protects any
> concurrent debugfs_file_get() attempts to get a hold of the instance here.
> Likewise for full_proxy_release() which lacks a call to debugfs_file_get().
> 
> Note that due to non-atomic updates to the d_unlinked() + ->d_fsdata pair,
> care must be taken in order to avoid races between debugfs_file_put() and
> debugfs_file_get() as well as __debugfs_remove_file(). Rather than
> introducing a global lock, exploit the fact that there will ever be only a
> single !d_unlinked() -> d_unlinked() transition and add memory barriers
> where needed. Given the lack of proper benchmarking, that debugfs fops
> aren't performance critical and that we've already got a potential
> allocation/deallocation pair anyway, the added code complexity might be
> highly questionable though.
> 
> Signed-off-by: Nicolai Stange 

If you have not already done so, please run this with debug enabled,
especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
This is important because there are configurations for which the deadlocks
you saw with SRCU turn into silent failure, including memory corruption.
CONFIG_PROVE_RCU=y will catch many of those situations.

(And yes, kfree_rcu() doesn't have that problem, but...)

Another issue called out inline.

Thanx, Paul

> ---
>  fs/debugfs/file.c | 102 
> --
>  fs/debugfs/inode.c|   8 +++-
>  fs/debugfs/internal.h |   1 +
>  3 files changed, 90 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
> index f4dfd7d0d625..b2cc25d44a39 100644
> --- a/fs/debugfs/file.c
> +++ b/fs/debugfs/file.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include "internal.h"
> @@ -78,10 +79,39 @@ int debugfs_file_get(struct dentry *dentry)
>   struct debugfs_fsdata *fsd;
>   void *d_fsd;
> 
> - d_fsd = READ_ONCE(dentry->d_fsdata);
> + rcu_read_lock();
> +retry:
> + d_fsd = rcu_dereference(dentry->d_fsdata);
>   if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
> + /*
> +  * Paired with the control dependency in
> +  * debugfs_file_put(): if we saw the debugfs_fsdata
> +  * instance "restored" there but not the dead dentry,
> +  * we'd erroneously instantiate a fresh debugfs_fsdata
> +  * instance below.
> +  */
> + smp_rmb();
> + if (d_unlinked(dentry)) {
> + rcu_read_unlock();
> + return -EIO;
> + }
> +
>   fsd = d_fsd;
> + if (!refcount_inc_not_zero(>active_users)) {
> + /*
> +  * A concurrent debugfs_file_put() dropped the
> +  * count to zero and is about to free the
> +  * debugfs_fsdata. Help out resetting the
> +  * ->d_fsdata and retry.
> +  */
> + d_fsd = (void *)((unsigned long)fsd->real_fops |
> + DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
> + RCU_INIT_POINTER(dentry->d_fsdata, d_fsd);

This is an infrequent race, I hope?  If on the other hand there is
a possibility of this branch being taken a huge number of times in
one call, it would be good to exit the RCU read-side critical section
before retrying.

> + goto retry;
> + }
> + rcu_read_unlock();
>   } else {
> + 

Re: [RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-17 Thread Paul E. McKenney
On Sun, Apr 16, 2017 at 11:51:37AM +0200, Nicolai Stange wrote:
> Currently, a dentry's debugfs_fsdata instance is allocated from
> debugfs_file_get() at first usage, i.e. at first file opening.
> 
> It won't ever get freed though.
> 
> Ideally, these instances would get freed after the last open file handle
> gets closed again, that is from the fops' ->release().
> 
> Unfortunately, we can't hook easily into those ->release()ers of files
> created through debugfs_create_file_unsafe(), i.e. of those proxied through
> debugfs_open_proxy_file_operations rather than through the "full"
> debugfs_full_proxy_file_operations proxy.
> 
> Hence, free unreferenced debugfs_fsdata instances from debugfs_file_put(),
> with the drawback of a potential allocation + deallocation per
> debugfs_file_get() + debugfs_file_put() pair, that is per fops invocation.
> 
> In addition to its former role of tracking pending fops, use the
> ->active_users for reference counting on the debugfs_fsdata instance
> itself. In particular, don't keep a dummy reference to be dropped from
> __debugfs_remove_file(): a d_delete()ed dentry and thus, request for
> completion notification, is now signaled by the d_unlinked() dentry itself.
> 
> Once ->active_users drops to zero (and the dentry is still intact), free
> the debugfs_fsdata instance from debugfs_file_put(). RCU protects any
> concurrent debugfs_file_get() attempts to get a hold of the instance here.
> Likewise for full_proxy_release() which lacks a call to debugfs_file_get().
> 
> Note that due to non-atomic updates to the d_unlinked() + ->d_fsdata pair,
> care must be taken in order to avoid races between debugfs_file_put() and
> debugfs_file_get() as well as __debugfs_remove_file(). Rather than
> introducing a global lock, exploit the fact that there will ever be only a
> single !d_unlinked() -> d_unlinked() transition and add memory barriers
> where needed. Given the lack of proper benchmarking, that debugfs fops
> aren't performance critical and that we've already got a potential
> allocation/deallocation pair anyway, the added code complexity might be
> highly questionable though.
> 
> Signed-off-by: Nicolai Stange 

If you have not already done so, please run this with debug enabled,
especially CONFIG_PROVE_LOCKING=y (which implies CONFIG_PROVE_RCU=y).
This is important because there are configurations for which the deadlocks
you saw with SRCU turn into silent failure, including memory corruption.
CONFIG_PROVE_RCU=y will catch many of those situations.

(And yes, kfree_rcu() doesn't have that problem, but...)

Another issue called out inline.

Thanx, Paul

> ---
>  fs/debugfs/file.c | 102 
> --
>  fs/debugfs/inode.c|   8 +++-
>  fs/debugfs/internal.h |   1 +
>  3 files changed, 90 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
> index f4dfd7d0d625..b2cc25d44a39 100644
> --- a/fs/debugfs/file.c
> +++ b/fs/debugfs/file.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> 
>  #include "internal.h"
> @@ -78,10 +79,39 @@ int debugfs_file_get(struct dentry *dentry)
>   struct debugfs_fsdata *fsd;
>   void *d_fsd;
> 
> - d_fsd = READ_ONCE(dentry->d_fsdata);
> + rcu_read_lock();
> +retry:
> + d_fsd = rcu_dereference(dentry->d_fsdata);
>   if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
> + /*
> +  * Paired with the control dependency in
> +  * debugfs_file_put(): if we saw the debugfs_fsdata
> +  * instance "restored" there but not the dead dentry,
> +  * we'd erroneously instantiate a fresh debugfs_fsdata
> +  * instance below.
> +  */
> + smp_rmb();
> + if (d_unlinked(dentry)) {
> + rcu_read_unlock();
> + return -EIO;
> + }
> +
>   fsd = d_fsd;
> + if (!refcount_inc_not_zero(>active_users)) {
> + /*
> +  * A concurrent debugfs_file_put() dropped the
> +  * count to zero and is about to free the
> +  * debugfs_fsdata. Help out resetting the
> +  * ->d_fsdata and retry.
> +  */
> + d_fsd = (void *)((unsigned long)fsd->real_fops |
> + DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
> + RCU_INIT_POINTER(dentry->d_fsdata, d_fsd);

This is an infrequent race, I hope?  If on the other hand there is
a possibility of this branch being taken a huge number of times in
one call, it would be good to exit the RCU read-side critical section
before retrying.

> + goto retry;
> + }
> + rcu_read_unlock();
>   } else {
> + rcu_read_unlock();
>   

[RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-16 Thread Nicolai Stange
Currently, a dentry's debugfs_fsdata instance is allocated from
debugfs_file_get() at first usage, i.e. at first file opening.

It won't ever get freed though.

Ideally, these instances would get freed after the last open file handle
gets closed again, that is from the fops' ->release().

Unfortunately, we can't hook easily into those ->release()ers of files
created through debugfs_create_file_unsafe(), i.e. of those proxied through
debugfs_open_proxy_file_operations rather than through the "full"
debugfs_full_proxy_file_operations proxy.

Hence, free unreferenced debugfs_fsdata instances from debugfs_file_put(),
with the drawback of a potential allocation + deallocation per
debugfs_file_get() + debugfs_file_put() pair, that is per fops invocation.

In addition to its former role of tracking pending fops, use the
->active_users for reference counting on the debugfs_fsdata instance
itself. In particular, don't keep a dummy reference to be dropped from
__debugfs_remove_file(): a d_delete()ed dentry and thus, request for
completion notification, is now signaled by the d_unlinked() dentry itself.

Once ->active_users drops to zero (and the dentry is still intact), free
the debugfs_fsdata instance from debugfs_file_put(). RCU protects any
concurrent debugfs_file_get() attempts to get a hold of the instance here.
Likewise for full_proxy_release() which lacks a call to debugfs_file_get().

Note that due to non-atomic updates to the d_unlinked() + ->d_fsdata pair,
care must be taken in order to avoid races between debugfs_file_put() and
debugfs_file_get() as well as __debugfs_remove_file(). Rather than
introducing a global lock, exploit the fact that there will ever be only a
single !d_unlinked() -> d_unlinked() transition and add memory barriers
where needed. Given the lack of proper benchmarking, that debugfs fops
aren't performance critical and that we've already got a potential
allocation/deallocation pair anyway, the added code complexity might be
highly questionable though.

Signed-off-by: Nicolai Stange 
---
 fs/debugfs/file.c | 102 --
 fs/debugfs/inode.c|   8 +++-
 fs/debugfs/internal.h |   1 +
 3 files changed, 90 insertions(+), 21 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index f4dfd7d0d625..b2cc25d44a39 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "internal.h"
@@ -78,10 +79,39 @@ int debugfs_file_get(struct dentry *dentry)
struct debugfs_fsdata *fsd;
void *d_fsd;
 
-   d_fsd = READ_ONCE(dentry->d_fsdata);
+   rcu_read_lock();
+retry:
+   d_fsd = rcu_dereference(dentry->d_fsdata);
if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
+   /*
+* Paired with the control dependency in
+* debugfs_file_put(): if we saw the debugfs_fsdata
+* instance "restored" there but not the dead dentry,
+* we'd erroneously instantiate a fresh debugfs_fsdata
+* instance below.
+*/
+   smp_rmb();
+   if (d_unlinked(dentry)) {
+   rcu_read_unlock();
+   return -EIO;
+   }
+
fsd = d_fsd;
+   if (!refcount_inc_not_zero(>active_users)) {
+   /*
+* A concurrent debugfs_file_put() dropped the
+* count to zero and is about to free the
+* debugfs_fsdata. Help out resetting the
+* ->d_fsdata and retry.
+*/
+   d_fsd = (void *)((unsigned long)fsd->real_fops |
+   DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
+   RCU_INIT_POINTER(dentry->d_fsdata, d_fsd);
+   goto retry;
+   }
+   rcu_read_unlock();
} else {
+   rcu_read_unlock();
fsd = kmalloc(sizeof(*fsd), GFP_KERNEL);
if (!fsd)
return -ENOMEM;
@@ -91,25 +121,28 @@ int debugfs_file_get(struct dentry *dentry)
refcount_set(>active_users, 1);
init_completion(>active_users_drained);
if (cmpxchg(>d_fsdata, d_fsd, fsd) != d_fsd) {
+   /*
+* Another debugfs_file_get() has installed a
+* debugfs_fsdata instance concurrently.
+* Free ours and retry to grab a reference on
+* the installed one.
+*/
kfree(fsd);
-   fsd = READ_ONCE(dentry->d_fsdata);
+   rcu_read_lock();
+   goto retry;
+   }
+   /*
+* In case of a successful 

[RFC PATCH 9/9] debugfs: free debugfs_fsdata instances

2017-04-16 Thread Nicolai Stange
Currently, a dentry's debugfs_fsdata instance is allocated from
debugfs_file_get() at first usage, i.e. at first file opening.

It won't ever get freed though.

Ideally, these instances would get freed after the last open file handle
gets closed again, that is from the fops' ->release().

Unfortunately, we can't hook easily into those ->release()ers of files
created through debugfs_create_file_unsafe(), i.e. of those proxied through
debugfs_open_proxy_file_operations rather than through the "full"
debugfs_full_proxy_file_operations proxy.

Hence, free unreferenced debugfs_fsdata instances from debugfs_file_put(),
with the drawback of a potential allocation + deallocation per
debugfs_file_get() + debugfs_file_put() pair, that is per fops invocation.

In addition to its former role of tracking pending fops, use the
->active_users for reference counting on the debugfs_fsdata instance
itself. In particular, don't keep a dummy reference to be dropped from
__debugfs_remove_file(): a d_delete()ed dentry and thus, request for
completion notification, is now signaled by the d_unlinked() dentry itself.

Once ->active_users drops to zero (and the dentry is still intact), free
the debugfs_fsdata instance from debugfs_file_put(). RCU protects any
concurrent debugfs_file_get() attempts to get a hold of the instance here.
Likewise for full_proxy_release() which lacks a call to debugfs_file_get().

Note that due to non-atomic updates to the d_unlinked() + ->d_fsdata pair,
care must be taken in order to avoid races between debugfs_file_put() and
debugfs_file_get() as well as __debugfs_remove_file(). Rather than
introducing a global lock, exploit the fact that there will ever be only a
single !d_unlinked() -> d_unlinked() transition and add memory barriers
where needed. Given the lack of proper benchmarking, that debugfs fops
aren't performance critical and that we've already got a potential
allocation/deallocation pair anyway, the added code complexity might be
highly questionable though.

Signed-off-by: Nicolai Stange 
---
 fs/debugfs/file.c | 102 --
 fs/debugfs/inode.c|   8 +++-
 fs/debugfs/internal.h |   1 +
 3 files changed, 90 insertions(+), 21 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index f4dfd7d0d625..b2cc25d44a39 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "internal.h"
@@ -78,10 +79,39 @@ int debugfs_file_get(struct dentry *dentry)
struct debugfs_fsdata *fsd;
void *d_fsd;
 
-   d_fsd = READ_ONCE(dentry->d_fsdata);
+   rcu_read_lock();
+retry:
+   d_fsd = rcu_dereference(dentry->d_fsdata);
if (!((unsigned long)d_fsd & DEBUGFS_FSDATA_IS_REAL_FOPS_BIT)) {
+   /*
+* Paired with the control dependency in
+* debugfs_file_put(): if we saw the debugfs_fsdata
+* instance "restored" there but not the dead dentry,
+* we'd erroneously instantiate a fresh debugfs_fsdata
+* instance below.
+*/
+   smp_rmb();
+   if (d_unlinked(dentry)) {
+   rcu_read_unlock();
+   return -EIO;
+   }
+
fsd = d_fsd;
+   if (!refcount_inc_not_zero(>active_users)) {
+   /*
+* A concurrent debugfs_file_put() dropped the
+* count to zero and is about to free the
+* debugfs_fsdata. Help out resetting the
+* ->d_fsdata and retry.
+*/
+   d_fsd = (void *)((unsigned long)fsd->real_fops |
+   DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
+   RCU_INIT_POINTER(dentry->d_fsdata, d_fsd);
+   goto retry;
+   }
+   rcu_read_unlock();
} else {
+   rcu_read_unlock();
fsd = kmalloc(sizeof(*fsd), GFP_KERNEL);
if (!fsd)
return -ENOMEM;
@@ -91,25 +121,28 @@ int debugfs_file_get(struct dentry *dentry)
refcount_set(>active_users, 1);
init_completion(>active_users_drained);
if (cmpxchg(>d_fsdata, d_fsd, fsd) != d_fsd) {
+   /*
+* Another debugfs_file_get() has installed a
+* debugfs_fsdata instance concurrently.
+* Free ours and retry to grab a reference on
+* the installed one.
+*/
kfree(fsd);
-   fsd = READ_ONCE(dentry->d_fsdata);
+   rcu_read_lock();
+   goto retry;
+   }
+   /*
+* In case of a successful cmpxchg() above, this