This patch series contains the changes needed to correctly set up
the infrastructure for PF_RDS sockets that use TCP as the transport in
multiple network namespaces.

Patch 1 in the series is the minimal set of changes to allow
a single instance of RDS-TCP to run in any (i.e., init_net or other)
namespace. The changes in this patch ensure that the
execution of 'modprobe [-r] rds_tcp' correctly sets up the kernel
TCP sockets relative to the current netns.

Patch 2 of the series further allows multiple RDS-TCP instances,
one per network namespace. The changes in this patch allow dynamic
creation/tear-down of RDS-TCP client and server sockets across all
current and future namespaces.

Comments are specifically invited about the following:

   There is some question in my mind as to whether Patch 2 should
   use register_pernet_subsys() or register_pernet_device(): due
   to the nature of the architecture, RDS/TCP is not a network device,
   but more accurately a subsystem that encapsulates an RDS packet into
   a TCP/IP header at the ksocket layer. However, the listen socket
   is created as part of the ->init in the pernet_operations, and the 
   connect/accept sockets get created in the kernel dynamically, with the
   intention that all of these sockets should be cleaned up as part of ->exit.

   Based on the comments in net_namespace.h, sockets would need
   to be cleaned up as part of a pernet operation, else they would
   hold up lo cleanup. In the current version of Patch 2, that cleanup is
   achieved after the ethernet devices have migrated, by the socket
   keepalive timeout, after which the ->exit will get called. I'm not sure
   there is a clean way to avoid this. As things stand, doing
   "ip netns delete <name>" would result in syslogd messages about
   "unregister_netdevice: waiting for lo to become free. Usage count .."
   being seen in the interval between ethernet device migration to
   init_net and the keepalive timeout.
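
To make the ordering concern above concrete, here is a small userspace
model of it. This is not kernel code and not what the patches do: every
name in it (model_net, model_pernet_ops, model_conn_open, and so on) is
hypothetical, and it assumes that each dynamically created connect/accept
socket pins the namespace with a reference while the listen socket
created in ->init does not. Under that assumption, ->exit cannot run
until the keepalive timeout has closed the last such socket.

```c
/* Userspace model only; all names are hypothetical, not kernel APIs. */
#include <assert.h>

struct model_net {
	int refcnt;       /* references pinning the namespace */
	int listen_sock;  /* 1 while the per-netns listen socket exists */
	int conn_socks;   /* dynamically created connect/accept sockets */
	int exited;       /* 1 once ->exit has run */
};

struct model_pernet_ops {
	int  (*init)(struct model_net *net);
	void (*exit)(struct model_net *net);
};

/* ->init: create the listen socket (assumed not to take a reference). */
static int model_init(struct model_net *net)
{
	net->listen_sock = 1;
	return 0;
}

/* ->exit: tear down whatever is left in this namespace. */
static void model_exit(struct model_net *net)
{
	net->listen_sock = 0;
	net->exited = 1;
}

static const struct model_pernet_ops model_ops = {
	.init = model_init,
	.exit = model_exit,
};

/* Drop one reference; ->exit runs only when the last one goes away. */
static void model_put_net(struct model_net *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0)
		model_ops.exit(net);
}

/* A connect/accept socket is created and (by assumption) pins the netns. */
static void model_conn_open(struct model_net *net)
{
	net->conn_socks++;
	net->refcnt++;
}

/* Keepalive timeout: close every connection and release its reference. */
static void model_keepalive_timeout(struct model_net *net)
{
	while (net->conn_socks > 0) {
		net->conn_socks--;
		model_put_net(net);
	}
}

/* Namespace creation: one reference for the creator, then ->init. */
static struct model_net model_net_create(void)
{
	struct model_net net = { .refcnt = 1 };

	model_ops.init(&net);
	return net;
}
```

In this model, calling model_net_create(), then model_conn_open(), then
model_put_net() (standing in for "ip netns delete") leaves exited at 0.
Only a later model_keepalive_timeout() drops the last reference and lets
->exit run, which corresponds to the window in which the "waiting for lo
to become free" messages would be seen.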
 
Patch 3 in this set is independent of the above two changes, and is
a bugfix/follow-up to eeb1bd5c encountered while testing the above.

Sowmini Varadhan (3):
  Make RDS-TCP work correctly when it is set up in a netns other than
    init_net
  Support multiple RDS-TCP listen endpoints, one per netns.
  sk_clone_lock() should only do get_net() if the parent is not a
    kernel socket

 net/core/sock.c       |    3 +-
 net/rds/bind.c        |    3 +-
 net/rds/connection.c  |   16 ++++---
 net/rds/ib.c          |    2 +-
 net/rds/ib_cm.c       |    4 +-
 net/rds/iw.c          |    2 +-
 net/rds/iw_cm.c       |    4 +-
 net/rds/rds.h         |   11 +++--
 net/rds/send.c        |    3 +-
 net/rds/tcp.c         |  116 ++++++++++++++++++++++++++++++++++++++++++-------
 net/rds/tcp.h         |    7 ++-
 net/rds/tcp_connect.c |    9 +++-
 net/rds/tcp_listen.c  |   40 ++++++-----------
 net/rds/transport.c   |    4 +-
 14 files changed, 155 insertions(+), 69 deletions(-)
