[PATCH v3 3/3] auth_gss: fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1

2019-06-12 Thread Wenbin Zeng
When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
gssp_rpc_create(), this increases netns refcount by 2, these refcounts are
supposed to be released in rpcsec_gss_exit_net(), but it will never happen
because rpcsec_gss_exit_net() is triggered only when netns refcount gets
to 0, specifically:
refcount=0 -> cleanup_net() -> ops_exit_list -> rpcsec_gss_exit_net
It is a deadlock situation here, refcount will never get to 0 unless
rpcsec_gss_exit_net() is called.

This fix introduced a new callback i.e. evict in struct proc_ns_operations,
which is called in nsfs_evict. Moving rpcsec_gss_exit_net to evict path
gives it a chance to get called and avoids the above deadlock situation.

Signed-off-by: Wenbin Zeng 
Cc: J. Bruce Fields 
---
 net/sunrpc/auth_gss/auth_gss.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 3fd56c0..3e76c8a 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -2136,14 +2136,14 @@ static __net_init int rpcsec_gss_init_net(struct net 
*net)
return gss_svc_init_net(net);
 }
 
-static __net_exit void rpcsec_gss_exit_net(struct net *net)
+static void rpcsec_gss_evict_net(struct net *net)
 {
gss_svc_shutdown_net(net);
 }
 
 static struct pernet_operations rpcsec_gss_net_ops = {
.init = rpcsec_gss_init_net,
-   .exit = rpcsec_gss_exit_net,
+   .evict = rpcsec_gss_evict_net,
 };
 
 /*
-- 
1.8.3.1



[PATCH v3 2/3] netns: add netns_evict into netns_operations

2019-06-12 Thread Wenbin Zeng
The newly added netns_evict() shall be called when the netns inode being
evicted. It provides another path to release netns refcounts, previously
netns_put() is the only choice, but it is not able to release all netns
refcount, for example, a rpc client holds two netns refcounts, these
refcounts are supposed to be released when the rpc client is freed, but
the code to free rpc client is normally triggered by put() callback only
when netns refcount gets to 0, specifically:
refcount=0 -> cleanup_net() -> ops_exit_list -> free rpc client
But netns refcount will never get to 0 before rpc client gets freed, to
break the deadlock, the code to free rpc client can be put into the newly
added netns_evict.

Signed-off-by: Wenbin Zeng 
Acked-by: David S. Miller 
---
 include/net/net_namespace.h |  1 +
 net/core/net_namespace.c| 12 
 2 files changed, 13 insertions(+)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 12689dd..c44306a 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -357,6 +357,7 @@ struct pernet_operations {
int (*init)(struct net *net);
void (*exit)(struct net *net);
void (*exit_batch)(struct list_head *net_exit_list);
+   void (*evict)(struct net *net);
unsigned int *id;
size_t size;
 };
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 7e6dcc6..0626fc4 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -1296,6 +1296,17 @@ static void netns_put(struct ns_common *ns)
put_net(to_net_ns(ns));
 }
 
+static void netns_evict(struct ns_common *ns)
+{
+   struct net *net = to_net_ns(ns);
+   const struct pernet_operations *ops;
+
+   list_for_each_entry_reverse(ops, &pernet_list, list) {
+   if (ops->evict)
+   ops->evict(net);
+   }
+}
+
 static int netns_install(struct nsproxy *nsproxy, struct ns_common *ns)
 {
struct net *net = to_net_ns(ns);
@@ -1319,6 +1330,7 @@ static struct user_namespace *netns_owner(struct 
ns_common *ns)
.type   = CLONE_NEWNET,
.get= netns_get,
.put= netns_put,
+   .evict  = netns_evict,
.install= netns_install,
.owner  = netns_owner,
 };
-- 
1.8.3.1



[PATCH v3 0/3] auth_gss: netns refcount leaks when use-gss-proxy==1

2019-06-12 Thread Wenbin Zeng
This patch series fixes an auth_gss bug that results in netns refcount
leaks when use-gss-proxy is set to 1.

The problem was found in privileged docker containers with gssproxy service
enabled and /proc/net/rpc/use-gss-proxy set to 1, the corresponding
struct net->count ends up at 2 after container gets killed, the consequence
is that the struct net cannot be freed.

It turns out that write_gssp() called gssp_rpc_create() to create a rpc
client, this increases net->count by 2; rpcsec_gss_exit_net() is supposed
to decrease net->count but it never gets called because its call-path is:
net->count==0 -> cleanup_net -> ops_exit_list -> rpcsec_gss_exit_net
Before rpcsec_gss_exit_net() gets called, net->count cannot reach 0, this
is a deadlock situation.

To fix the problem, we must break the deadlock, rpcsec_gss_exit_net()
should move out of the put() path and find another chance to get called,
I think nsfs_evict() is a good place to go, when netns inode gets evicted
we call rpcsec_gss_exit_net() to free the rpc client, this requires a new
callback i.e. evict to be added in struct proc_ns_operations, and add
netns_evict() as one of netns_operations as well.

v1->v2:
 * in nsfs_evict(), move ->evict() in front of ->put()
v2->v3:
 * rpcsec_gss_evict_net() directly call gss_svc_shutdown_net() regardless
   if gssp_clnt is null, this is exactly same to what rpcsec_gss_exit_net()
   previously did

Wenbin Zeng (3):
  nsfs: add evict callback into struct proc_ns_operations
  netns: add netns_evict into netns_operations
  auth_gss: fix deadlock that blocks rpcsec_gss_exit_net when
use-gss-proxy==1

 fs/nsfs.c  |  2 ++
 include/linux/proc_ns.h|  1 +
 include/net/net_namespace.h|  1 +
 net/core/net_namespace.c   | 12 
 net/sunrpc/auth_gss/auth_gss.c |  4 ++--
 5 files changed, 18 insertions(+), 2 deletions(-)

-- 
1.8.3.1



[PATCH v3 1/3] nsfs: add evict callback into struct proc_ns_operations

2019-06-12 Thread Wenbin Zeng
The newly added evict callback shall be called by nsfs_evict(). Currently
only put() callback is called in nsfs_evict(), it is not able to release
all netns refcount, for example, a rpc client holds two netns refcounts,
these refcounts are supposed to be released when the rpc client is freed,
but the code to free rpc client is normally triggered by put() callback
only when netns refcount gets to 0, specifically:
refcount=0 -> cleanup_net() -> ops_exit_list -> free rpc client
But netns refcount will never get to 0 before rpc client gets freed, to
break the deadlock, the code to free rpc client can be put into the newly
added evict callback.

Signed-off-by: Wenbin Zeng 
Cc: Al Viro 
---
 fs/nsfs.c   | 2 ++
 include/linux/proc_ns.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/nsfs.c b/fs/nsfs.c
index 60702d6..a122288 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -48,6 +48,8 @@ static void nsfs_evict(struct inode *inode)
 {
struct ns_common *ns = inode->i_private;
clear_inode(inode);
+   if (ns->ops->evict)
+   ns->ops->evict(ns);
ns->ops->put(ns);
 }
 
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index d31cb62..919f0d4 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -19,6 +19,7 @@ struct proc_ns_operations {
int type;
struct ns_common *(*get)(struct task_struct *task);
void (*put)(struct ns_common *ns);
+   void (*evict)(struct ns_common *ns);
int (*install)(struct nsproxy *nsproxy, struct ns_common *ns);
struct user_namespace *(*owner)(struct ns_common *ns);
struct ns_common *(*get_parent)(struct ns_common *ns);
-- 
1.8.3.1



Re: [PATCH v2 0/3] auth_gss: netns refcount leaks when use-gss-proxy==1

2019-06-12 Thread Wenbin Zeng
On Tue, May 14, 2019 at 09:03:31PM -0400, J. Bruce Fields wrote:
> Whoops, I was slow to test these.  I'm getting failuring krb5 nfs
> mounts, and the following the server's logs.  Dropping the three patches
> for now.
My bad, I should have found it earlier. Thank you for testing it, Bruce.

I figured it out, the problem that you saw is due to the following code:
the if-condition is incorrect here because sn->gssp_clnt==NULL doesn't mean
inexistence of 'use-gss-proxy':

-static __net_exit void rpcsec_gss_exit_net(struct net *net)
+static void rpcsec_gss_evict_net(struct net *net)
 {
-   gss_svc_shutdown_net(net);
+   struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+   if (sn->gssp_clnt)
+   gss_svc_shutdown_net(net);
 }

Simply using the original logic in rpcsec_gss_exit_net() should be fine,
i.e.:

-static __net_exit void rpcsec_gss_exit_net(struct net *net)
+static void rpcsec_gss_evict_net(struct net *net)
 {
gss_svc_shutdown_net(net);
 }

I'm going to submit v3 soon.

Wenbin.

> 
> --b.
> 
> [   40.894408] remove_proc_entry: removing non-empty directory 'net/rpc', 
> leaking at least 'use-gss-proxy'
> [   40.897352] WARNING: CPU: 2 PID: 31 at fs/proc/generic.c:683 
> remove_proc_entry+0x17d/0x190
> [   40.899373] Modules linked in: nfsd nfs_acl lockd grace auth_rpcgss sunrpc
> [   40.901335] CPU: 2 PID: 31 Comm: kworker/u8:1 Not tainted 
> 5.1.0-10733-g4f10d1cb695e #2220
> [   40.903759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> [   40.906972] Workqueue: netns cleanup_net
> [   40.907828] RIP: 0010:remove_proc_entry+0x17d/0x190
> [   40.908904] Code: 52 82 48 85 c0 48 8d 90 48 ff ff ff 48 0f 45 c2 48 8b 93 
> a8 00 00 00 4c 8b 80 d0 00 00 00 48 8b 92 d0 00 00 00 e8 a7 24 dc ff <0f> 0b 
> e9 52 ff ff ff e8 a7 21 dc ff 0f 1f 80 00 00 00 00 0f 1f 44
> [   40.912689] RSP: 0018:c9123d80 EFLAGS: 00010282
> [   40.913495] RAX:  RBX: 888079f96e40 RCX: 
> 
> [   40.914747] RDX: 88807fd24e80 RSI: 88807fd165b8 RDI: 
> 
> [   40.916107] RBP: 888079f96ef0 R08:  R09: 
> 
> [   40.917253] R10:  R11:  R12: 
> 88807cd76d68
> [   40.918508] R13: a0057000 R14: 8880683db200 R15: 
> 82970240
> [   40.919642] FS:  () GS:88807fd0() 
> knlGS:
> [   40.920956] CS:  0010 DS:  ES:  CR0: 80050033
> [   40.921867] CR2: 7f9d70010cb8 CR3: 7cc5c006 CR4: 
> 001606e0
> [   40.923044] Call Trace:
> [   40.923364]  sunrpc_exit_net+0xcc/0x190 [sunrpc]
> [   40.924069]  ops_exit_list.isra.0+0x36/0x70
> [   40.924713]  cleanup_net+0x1cb/0x2c0
> [   40.925182]  process_one_work+0x219/0x620
> [   40.925780]  worker_thread+0x3c/0x390
> [   40.926312]  ? process_one_work+0x620/0x620
> [   40.927015]  kthread+0x11d/0x140
> [   40.927430]  ? kthread_park+0x80/0x80
> [   40.927822]  ret_from_fork+0x3a/0x50
> [   40.928281] irq event stamp: 11688
> [   40.928780] hardirqs last  enabled at (11687): [] 
> console_unlock+0x41e/0x590
> [   40.930319] hardirqs last disabled at (11688): [] 
> trace_hardirqs_off_thunk+0x1a/0x1c
> [   40.932123] softirqs last  enabled at (11684): [] 
> __do_softirq+0x2c5/0x4c5
> [   40.933657] softirqs last disabled at (11673): [] 
> irq_exit+0x80/0x90
> 


[PATCH v2 1/3] nsfs: add evict callback into struct proc_ns_operations

2019-05-09 Thread Wenbin Zeng
The newly added evict callback shall be called by nsfs_evict(). Currently
only put() callback is called in nsfs_evict(), it is not able to release
all netns refcount, for example, a rpc client holds two netns refcounts,
these refcounts are supposed to be released when the rpc client is freed,
but the code to free rpc client is normally triggered by put() callback
only when netns refcount gets to 0, specifically:
refcount=0 -> cleanup_net() -> ops_exit_list -> free rpc client
But netns refcount will never get to 0 before rpc client gets freed, to
break the deadlock, the code to free rpc client can be put into the newly
added evict callback.

Signed-off-by: Wenbin Zeng 
---
 fs/nsfs.c   | 2 ++
 include/linux/proc_ns.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/nsfs.c b/fs/nsfs.c
index 60702d6..a122288 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -48,6 +48,8 @@ static void nsfs_evict(struct inode *inode)
 {
struct ns_common *ns = inode->i_private;
clear_inode(inode);
+   if (ns->ops->evict)
+   ns->ops->evict(ns);
ns->ops->put(ns);
 }
 
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index d31cb62..919f0d4 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -19,6 +19,7 @@ struct proc_ns_operations {
int type;
struct ns_common *(*get)(struct task_struct *task);
void (*put)(struct ns_common *ns);
+   void (*evict)(struct ns_common *ns);
int (*install)(struct nsproxy *nsproxy, struct ns_common *ns);
struct user_namespace *(*owner)(struct ns_common *ns);
struct ns_common *(*get_parent)(struct ns_common *ns);
-- 
1.8.3.1



[PATCH v2 0/3] auth_gss: netns refcount leaks when use-gss-proxy==1

2019-05-09 Thread Wenbin Zeng
This patch series fixes an auth_gss bug that results in netns refcount
leaks when use-gss-proxy is set to 1.

The problem was found in privileged docker containers with gssproxy service
enabled and /proc/net/rpc/use-gss-proxy set to 1, the corresponding
struct net->count ends up at 2 after container gets killed, the consequence
is that the struct net cannot be freed.

It turns out that write_gssp() called gssp_rpc_create() to create a rpc
client, this increases net->count by 2; rpcsec_gss_exit_net() is supposed
to decrease net->count but it never gets called because its call-path is:
net->count==0 -> cleanup_net -> ops_exit_list -> rpcsec_gss_exit_net
Before rpcsec_gss_exit_net() gets called, net->count cannot reach 0, this
is a deadlock situation.

To fix the problem, we must break the deadlock, rpcsec_gss_exit_net()
should move out of the put() path and find another chance to get called,
I think nsfs_evict() is a good place to go, when netns inode gets evicted
we call rpcsec_gss_exit_net() to free the rpc client, this requires a new
callback i.e. evict to be added in struct proc_ns_operations, and add
netns_evict() as one of netns_operations as well.

v1->v2:
 * in nsfs_evict(), move ->evict() in front of ->put()

Wenbin Zeng (3):
  nsfs: add evict callback into struct proc_ns_operations
  netns: add netns_evict into netns_operations
  auth_gss: fix deadlock that blocks rpcsec_gss_exit_net when
use-gss-proxy==1

 fs/nsfs.c  |  2 ++
 include/linux/proc_ns.h|  1 +
 include/net/net_namespace.h|  1 +
 net/core/net_namespace.c   | 12 
 net/sunrpc/auth_gss/auth_gss.c |  9 ++---
 5 files changed, 22 insertions(+), 3 deletions(-)

-- 
1.8.3.1



[PATCH v2 2/3] netns: add netns_evict into netns_operations

2019-05-09 Thread Wenbin Zeng
The newly added netns_evict() shall be called when the netns inode being
evicted. It provides another path to release netns refcounts, previously
netns_put() is the only choice, but it is not able to release all netns
refcount, for example, a rpc client holds two netns refcounts, these
refcounts are supposed to be released when the rpc client is freed, but
the code to free rpc client is normally triggered by put() callback only
when netns refcount gets to 0, specifically:
refcount=0 -> cleanup_net() -> ops_exit_list -> free rpc client
But netns refcount will never get to 0 before rpc client gets freed, to
break the deadlock, the code to free rpc client can be put into the newly
added netns_evict.

Signed-off-by: Wenbin Zeng 
---
 include/net/net_namespace.h |  1 +
 net/core/net_namespace.c| 12 
 2 files changed, 13 insertions(+)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 12689dd..c44306a 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -357,6 +357,7 @@ struct pernet_operations {
int (*init)(struct net *net);
void (*exit)(struct net *net);
void (*exit_batch)(struct list_head *net_exit_list);
+   void (*evict)(struct net *net);
unsigned int *id;
size_t size;
 };
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 7e6dcc6..0626fc4 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -1296,6 +1296,17 @@ static void netns_put(struct ns_common *ns)
put_net(to_net_ns(ns));
 }
 
+static void netns_evict(struct ns_common *ns)
+{
+   struct net *net = to_net_ns(ns);
+   const struct pernet_operations *ops;
+
+   list_for_each_entry_reverse(ops, &pernet_list, list) {
+   if (ops->evict)
+   ops->evict(net);
+   }
+}
+
 static int netns_install(struct nsproxy *nsproxy, struct ns_common *ns)
 {
struct net *net = to_net_ns(ns);
@@ -1319,6 +1330,7 @@ static struct user_namespace *netns_owner(struct 
ns_common *ns)
.type   = CLONE_NEWNET,
.get= netns_get,
.put= netns_put,
+   .evict  = netns_evict,
.install= netns_install,
.owner  = netns_owner,
 };
-- 
1.8.3.1



Re: [PATCH 1/3] nsfs: add evict callback into struct proc_ns_operations

2019-05-04 Thread Wenbin Zeng
On Thu, May 02, 2019 at 04:04:06AM +0100, Al Viro wrote:
> On Wed, May 01, 2019 at 02:42:23PM +0800, Wenbin Zeng wrote:
> > The newly added evict callback shall be called by nsfs_evict(). Currently
> > only put() callback is called in nsfs_evict(), it is not able to release
> > all netns refcount, for example, a rpc client holds two netns refcounts,
> > these refcounts are supposed to be released when the rpc client is freed,
> > but the code to free rpc client is normally triggered by put() callback
> > only when netns refcount gets to 0, specifically:
> > refcount=0 -> cleanup_net() -> ops_exit_list -> free rpc client
> > But netns refcount will never get to 0 before rpc client gets freed, to
> > break the deadlock, the code to free rpc client can be put into the newly
> > added evict callback.
> > 
> > Signed-off-by: Wenbin Zeng 
> > ---
> >  fs/nsfs.c   | 2 ++
> >  include/linux/proc_ns.h | 1 +
> >  2 files changed, 3 insertions(+)
> > 
> > diff --git a/fs/nsfs.c b/fs/nsfs.c
> > index 60702d6..5939b12 100644
> > --- a/fs/nsfs.c
> > +++ b/fs/nsfs.c
> > @@ -49,6 +49,8 @@ static void nsfs_evict(struct inode *inode)
> > struct ns_common *ns = inode->i_private;
> > clear_inode(inode);
> > ns->ops->put(ns);
> > +   if (ns->ops->evict)
> > +   ns->ops->evict(ns);
> 
> What's to guarantee that ns will not be freed by ->put()?
> Confused...

Hi Al, thank you very much. You are absolutely right.
->evict() should be called before ->put(), i.e.:

@@ -49,6 +49,8 @@ static void nsfs_evict(struct inode *inode)
struct ns_common *ns = inode->i_private;
clear_inode(inode);
+   if (ns->ops->evict)
+   ns->ops->evict(ns);
ns->ops->put(ns);
 }

Does this look good?


[PATCH 0/3] auth_gss: netns refcount leaks when use-gss-proxy==1

2019-04-30 Thread Wenbin Zeng
This patch series fixes an auth_gss bug that results in netns refcount leaks 
when use-gss-proxy is set to 1.

The problem was found in privileged docker containers with gssproxy service 
enabled and /proc/net/rpc/use-gss-proxy set to 1, the corresponding struct 
net->count ends up at 2 after container gets killed, the consequence is that 
the struct net cannot be freed.

It turns out that write_gssp() called gssp_rpc_create() to create a rpc client, 
this increases net->count by 2; rpcsec_gss_exit_net() is supposed to decrease 
net->count but it never gets called because its call-path is:
net->count==0 -> cleanup_net -> ops_exit_list -> rpcsec_gss_exit_net
Before rpcsec_gss_exit_net() gets called, net->count cannot reach 0, this is a 
deadlock situation.

To fix the problem, we must break the deadlock, rpcsec_gss_exit_net() should 
move out of the put() path and find another chance to get called, I think 
nsfs_evict() is a good place to go, when netns inode gets evicted we call 
rpcsec_gss_exit_net() to free the rpc client, this requires a new callback i.e. 
evict to be added in struct proc_ns_operations, and add netns_evict() as one of 
netns_operations as well.

Wenbin Zeng (3):
  nsfs: add evict callback into struct proc_ns_operations
  netns: add netns_evict into netns_operations
  auth_gss: fix deadlock that blocks rpcsec_gss_exit_net when
use-gss-proxy==1

 fs/nsfs.c  |  2 ++
 include/linux/proc_ns.h|  1 +
 include/net/net_namespace.h|  1 +
 net/core/net_namespace.c   | 12 
 net/sunrpc/auth_gss/auth_gss.c |  9 ++---
 5 files changed, 22 insertions(+), 3 deletions(-)

-- 
1.8.3.1