Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-15 Thread Tonghao Zhang
On Wed, Nov 15, 2017 at 12:48 AM, Cong Wang  wrote:
> On Tue, Nov 14, 2017 at 3:17 AM, Tonghao Zhang  
> wrote:
>> Hi cong, we increase the socket inuse in sk_allock ?
>
> How about sock_init_data()?
>
> Or at least you can replace get_net() with read_pnet()?
Hi cong, I guess it is better to count socket inuse in sk_alloc. It is
easy to maintain counter. In this way, we dont hold the refcnt again.
I send the v3. Can you review it again ? thanks.


Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-14 Thread Cong Wang
On Tue, Nov 14, 2017 at 3:17 AM, Tonghao Zhang  wrote:
> Hi cong, we increase the socket inuse in sk_allock ?

How about sock_init_data()?

Or at least you can replace get_net() with read_pnet()?


Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-14 Thread Tonghao Zhang
Hi cong, we increase the socket inuse in sk_allock ?

On Tue, Nov 14, 2017 at 1:58 PM, Cong Wang  wrote:
> On Mon, Nov 13, 2017 at 7:23 PM, 张军伟(基础平台部)
>  wrote:
>>
>>> On 14 Nov 2017, at 10:09, Cong Wang  wrote:
>>>
>>> On Mon, Nov 13, 2017 at 4:36 AM, Tonghao Zhang  
>>> wrote:
 From: Tonghao Zhang 

 This patch add a member in struct netns_core. and this is
 a counter for socket_inuse in the _net_ namespace. The patch
 will add/sub counter in the sock_alloc or sock_release. In
 the sock_alloc, the private data of inode saves the special
 _net_. When releasing it, we can access the special _net_ and
 dec the counter of socket in that namespace.

 By the way, we dont use the 'current->nsproxy->net_ns' in the
 sock_release. In one case,when one task exits, the 'do_exit'
 may set the current->nsproxy NULL, and then call the sock_release.
 Use the private data of inode, saving few bytes.
>>>
>>> Why do you need to hold netns refcnt for socket? sock already holds
>>> it, so you can just access it by sock_net(sock->sk) ?
>> I think you suggestion is
>> replace get_net(net) ===> with sock_net(sock->sk)
>
> Not literally, but essentially yes.
>
>
>>
>> If thus, I think it could not work.
>> Because sock->sk is assigned after sock_alloc.
>> What we change is done in sock_alloc.
>
> You don't have to do it in sock_alloc().
>
>
>> By the way, sock->sk may has been released(NULL) in sock_release.
>
> My point is since sock is always paired with a sk and sk already
> holds refcnt, you don't need to hold it again, it looks unnecessary.


Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-13 Thread Cong Wang
On Mon, Nov 13, 2017 at 7:23 PM, 张军伟(基础平台部)
 wrote:
>
>> On 14 Nov 2017, at 10:09, Cong Wang  wrote:
>>
>> On Mon, Nov 13, 2017 at 4:36 AM, Tonghao Zhang  
>> wrote:
>>> From: Tonghao Zhang 
>>>
>>> This patch add a member in struct netns_core. and this is
>>> a counter for socket_inuse in the _net_ namespace. The patch
>>> will add/sub counter in the sock_alloc or sock_release. In
>>> the sock_alloc, the private data of inode saves the special
>>> _net_. When releasing it, we can access the special _net_ and
>>> dec the counter of socket in that namespace.
>>>
>>> By the way, we dont use the 'current->nsproxy->net_ns' in the
>>> sock_release. In one case,when one task exits, the 'do_exit'
>>> may set the current->nsproxy NULL, and then call the sock_release.
>>> Use the private data of inode, saving few bytes.
>>
>> Why do you need to hold netns refcnt for socket? sock already holds
>> it, so you can just access it by sock_net(sock->sk) ?
> I think you suggestion is
> replace get_net(net) ===> with sock_net(sock->sk)

Not literally, but essentially yes.


>
> If thus, I think it could not work.
> Because sock->sk is assigned after sock_alloc.
> What we change is done in sock_alloc.

You don't have to do it in sock_alloc().


> By the way, sock->sk may has been released(NULL) in sock_release.

My point is since sock is always paired with a sk and sk already
holds refcnt, you don't need to hold it again, it looks unnecessary.


Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-13 Thread 基础平台部

> On 14 Nov 2017, at 10:09, Cong Wang  wrote:
> 
> On Mon, Nov 13, 2017 at 4:36 AM, Tonghao Zhang  
> wrote:
>> From: Tonghao Zhang 
>> 
>> This patch add a member in struct netns_core. and this is
>> a counter for socket_inuse in the _net_ namespace. The patch
>> will add/sub counter in the sock_alloc or sock_release. In
>> the sock_alloc, the private data of inode saves the special
>> _net_. When releasing it, we can access the special _net_ and
>> dec the counter of socket in that namespace.
>> 
>> By the way, we dont use the 'current->nsproxy->net_ns' in the
>> sock_release. In one case,when one task exits, the 'do_exit'
>> may set the current->nsproxy NULL, and then call the sock_release.
>> Use the private data of inode, saving few bytes.
> 
> Why do you need to hold netns refcnt for socket? sock already holds
> it, so you can just access it by sock_net(sock->sk) ?
I think you suggestion is 
replace get_net(net) ===> with sock_net(sock->sk) 
 
If thus, I think it could not work.
Because sock->sk is assigned after sock_alloc.
What we change is done in sock_alloc.
By the way, sock->sk may has been released(NULL) in sock_release.

Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-13 Thread Tonghao Zhang
Yes,  thanks for your advise. I will modify, test it and then submit v3

On Tue, Nov 14, 2017 at 10:09 AM, Cong Wang  wrote:
> On Mon, Nov 13, 2017 at 4:36 AM, Tonghao Zhang  
> wrote:
>> From: Tonghao Zhang 
>>
>> This patch add a member in struct netns_core. and this is
>> a counter for socket_inuse in the _net_ namespace. The patch
>> will add/sub counter in the sock_alloc or sock_release. In
>> the sock_alloc, the private data of inode saves the special
>> _net_. When releasing it, we can access the special _net_ and
>> dec the counter of socket in that namespace.
>>
>> By the way, we dont use the 'current->nsproxy->net_ns' in the
>> sock_release. In one case,when one task exits, the 'do_exit'
>> may set the current->nsproxy NULL, and then call the sock_release.
>> Use the private data of inode, saving few bytes.
>
> Why do you need to hold netns refcnt for socket? sock already holds
> it, so you can just access it by sock_net(sock->sk) ?


Re: [PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-13 Thread Cong Wang
On Mon, Nov 13, 2017 at 4:36 AM, Tonghao Zhang  wrote:
> From: Tonghao Zhang 
>
> This patch add a member in struct netns_core. and this is
> a counter for socket_inuse in the _net_ namespace. The patch
> will add/sub counter in the sock_alloc or sock_release. In
> the sock_alloc, the private data of inode saves the special
> _net_. When releasing it, we can access the special _net_ and
> dec the counter of socket in that namespace.
>
> By the way, we dont use the 'current->nsproxy->net_ns' in the
> sock_release. In one case,when one task exits, the 'do_exit'
> may set the current->nsproxy NULL, and then call the sock_release.
> Use the private data of inode, saving few bytes.

Why do you need to hold netns refcnt for socket? sock already holds
it, so you can just access it by sock_net(sock->sk) ?


[PATCH v2 net-next] socket: Move the socket inuse to namespace.

2017-11-13 Thread Tonghao Zhang
From: Tonghao Zhang 

This patch add a member in struct netns_core. and this is
a counter for socket_inuse in the _net_ namespace. The patch
will add/sub counter in the sock_alloc or sock_release. In
the sock_alloc, the private data of inode saves the special
_net_. When releasing it, we can access the special _net_ and
dec the counter of socket in that namespace.

By the way, we dont use the 'current->nsproxy->net_ns' in the
sock_release. In one case,when one task exits, the 'do_exit'
may set the current->nsproxy NULL, and then call the sock_release.
Use the private data of inode, saving few bytes.

Signed-off-by: Tonghao Zhang 
Signed-off-by: Martin Zhang 
---
fix typo and comment.
---
 include/net/netns/core.h |  1 +
 net/socket.c | 79 
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 78eb1ff75475..bef1bc8a1721 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -11,6 +11,7 @@ struct netns_core {
int sysctl_somaxconn;
 
struct prot_inuse __percpu *inuse;
+   int __percpu *socket_inuse;
 };
 
 #endif
diff --git a/net/socket.c b/net/socket.c
index c729625eb5d3..8a873d3fb2ab 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -163,12 +163,6 @@ static DEFINE_SPINLOCK(net_family_lock);
 static const struct net_proto_family __rcu *net_families[NPROTO] __read_mostly;
 
 /*
- * Statistics counters of the socket lists
- */
-
-static DEFINE_PER_CPU(int, sockets_in_use);
-
-/*
  * Support routines.
  * Move socket addresses back and forth across the kernel/user
  * divide and look after the messy bits.
@@ -549,6 +543,50 @@ static const struct inode_operations sockfs_inode_ops = {
.setattr = sockfs_setattr,
 };
 
+#ifdef CONFIG_PROC_FS
+static void socket_inuse_add(struct net *net, int val)
+{
+   __this_cpu_add(*net->core.socket_inuse, val);
+}
+
+static int socket_inuse_get(struct net *net)
+{
+   int cpu, res = 0;
+
+   for_each_possible_cpu(cpu)
+   res += *per_cpu_ptr(net->core.socket_inuse, cpu);
+
+   return res >= 0 ? res : 0;
+}
+
+static int __net_init socket_inuse_init_net(struct net *net)
+{
+   net->core.socket_inuse = alloc_percpu(int);
+   return net->core.socket_inuse ? 0 : -ENOMEM;
+}
+
+static void __net_exit socket_inuse_exit_net(struct net *net)
+{
+   free_percpu(net->core.socket_inuse);
+}
+
+static struct pernet_operations socket_inuse_ops = {
+   .init = socket_inuse_init_net,
+   .exit = socket_inuse_exit_net,
+};
+
+static __init int socket_inuse_init(void)
+{
+   if (register_pernet_subsys(&socket_inuse_ops))
+   panic("Cannot initialize socket inuse counters");
+
+   return 0;
+}
+
+core_initcall(socket_inuse_init);
+#endif /* CONFIG_PROC_FS */
+
+
 /**
  * sock_alloc  -   allocate a socket
  *
@@ -561,6 +599,7 @@ struct socket *sock_alloc(void)
 {
struct inode *inode;
struct socket *sock;
+   struct net *net;
 
inode = new_inode_pseudo(sock_mnt->mnt_sb);
if (!inode)
@@ -575,7 +614,15 @@ struct socket *sock_alloc(void)
inode->i_gid = current_fsgid();
inode->i_op = &sockfs_inode_ops;
 
-   this_cpu_add(sockets_in_use, 1);
+   net = current->nsproxy->net_ns;
+   /*
+* Save the _net_ to private data of inode. When we destroy the
+* socket, we can use it to access the _net_ and dec socket_inuse
+* counter.
+*/
+   inode->i_private = get_net(net);
+   socket_inuse_add(net, 1);
+
return sock;
 }
 EXPORT_SYMBOL(sock_alloc);
@@ -602,7 +649,10 @@ void sock_release(struct socket *sock)
if (rcu_dereference_protected(sock->wq, 1)->fasync_list)
pr_err("%s: fasync list not empty!\n", __func__);
 
-   this_cpu_sub(sockets_in_use, 1);
+   /* inode->i_private saves the _net_ address. */
+   socket_inuse_add(SOCK_INODE(sock)->i_private, -1);
+   put_net(SOCK_INODE(sock)->i_private);
+
if (!sock->file) {
iput(SOCK_INODE(sock));
return;
@@ -2645,17 +2695,8 @@ core_initcall(sock_init);/* early initcall */
 #ifdef CONFIG_PROC_FS
 void socket_seq_show(struct seq_file *seq)
 {
-   int cpu;
-   int counter = 0;
-
-   for_each_possible_cpu(cpu)
-   counter += per_cpu(sockets_in_use, cpu);
-
-   /* It can be negative, by the way. 8) */
-   if (counter < 0)
-   counter = 0;
-
-   seq_printf(seq, "sockets: used %d\n", counter);
+   seq_printf(seq, "sockets: used %d\n",
+  socket_inuse_get(seq->private));
 }
 #endif /* CONFIG_PROC_FS */
 
-- 
2.13.6