Re: [Lxc-users] netns: Issues with deleting virtual interfaces during namespace cleanup

2011-02-27 Thread Renato Westphal
Hello David,

You may try the patch below (kernel v2.6.35) and see if that helps. It
basically does what you asked for: during namespace cleanup, move back the
virtual interfaces to their original namespaces. I did some tests with veth
pairs and nested netns's and everything worked fine.

I think this should be the default behaviour. I would appreciate it if someone
could review/fix this patch and push it upstream.

Have a good day,
Renato.

2011/2/26 Daniel Lezcano daniel.lezc...@free.fr

 On 02/26/2011 05:59 PM, Ward, David - 0663 - MITLL wrote:
  (Apologies for the cross-post, but Thunderbird messed up the formatting
  when I sent this originally, and then I realized I sent it to the wrong
  list.)
 
  A patch was applied to the kernel in November 2008 that deletes virtual
  network interfaces when network namespaces are cleaned up
  (d0c082cea6dfb9b674b4f6e1e84025662dbd24e8). A discussion about this
  patch took place on this list
  (https://lists.linux-foundation.org/pipermail/containers/2008-October/013460.html),
  where Daniel Lezcano wrote:
 
After discussing with Benjamin, this patch means a user can no longer
manage a pool of virtual devices because they will be automatically
destroyed when the namespace exits. I don't think it is a big concern,
but just in case I am asking :)
 
  I currently have two use cases where this behavior is not desirable:
 
  1. I use a veth pair device to connect two containers together (as
  opposed to connecting a container to the host). To do this, I
  create the veth pair device manually in the host with iproute2
  (ip link add type veth). Then when I start each container, it
  pulls in one of the interfaces of the veth pair device with
  lxc.network.type = phys. When I stop one of the containers, its
  interface to the veth pair device is deleted instead of moved back
  to the host, so I can not just start the stopped container again
  and re-establish the same link.

 Maybe you can rely on the lxc configuration to do that.

 This assumes you always create the two containers in the same order.

 The first one:

 lxc.network.type=veth
 lxc.network.veth.pair=vethX

 The second one:

 lxc.network.type=phys
 lxc.network.link=vethX

 The drawback is you have to stop / start both of them.


 Otherwise, why don't you use the macvlan configuration ?

 For both containers:

 lxc.network.type=macvlan
 lxc.network.macvlan.mode=bridge
 lxc.network.link=dummy0


  2. I start a process in the host that creates a TUN/TAP interface,
  such as a VPN client. I pull the TUN/TAP interface into the
  container with lxc.network.type = phys. When the container
  exits, the TUN/TAP interface is deleted because it is a virtual
  interface, while the VPN client process continues to run in the
  host. Again I can not just start the container again with the
  same connection; I have to restart the VPN client.
 
  It makes sense that virtual network interfaces that get created inside a
  container should be deleted when the container exits. However, I feel
  that network interfaces from the host that get assigned to the container
  should be returned to the host when the container exits, whether they
  are physical or virtual.

 Wouldn't it make sense to add a configuration option for lxc to create such
 a device and handle the VPN client?

 There is the lxc.network.script.up option, where you can launch your VPN
 client. So if the tun/tap interface were added as a network option, lxc
 would create it for you and, once it is up, invoke the up script, where
 the VPN client is launched.
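As a rough illustration of the up-script idea, here is a minimal Python sketch of such a hook. It is not taken from lxc. The argument layout (container name, section, context, network type, optional host-side device) is an assumption to check against the lxc.conf man page for your version, and "vpn-client" with its "--config" and "--dev" flags is a placeholder, not a real program.

```python
import subprocess

# Assumed hook argument layout (verify against your lxc.conf man page):
#   <container> <section> <context> <type> [<host-side device>]
HOOK_KEYS = ("container", "section", "context", "net_type")


def parse_hook_args(argv):
    """Split the hook's positional arguments into a dict."""
    if len(argv) < len(HOOK_KEYS):
        raise ValueError("expected: <container> <section> <context> <type> [<device>]")
    args = dict(zip(HOOK_KEYS, argv[:4]))
    args["device"] = argv[4] if len(argv) > 4 else None
    return args


def vpn_command(args, config="/etc/vpn/client.conf"):
    """Build the command line for a placeholder VPN client."""
    if args["context"] != "up":
        raise ValueError("this hook only handles the 'up' context")
    # "vpn-client", "--config" and "--dev" are hypothetical names.
    cmd = ["vpn-client", "--config", config]
    if args["device"]:
        cmd += ["--dev", args["device"]]
    return cmd


def main(argv):
    """Start the client in the background so the hook returns promptly."""
    subprocess.Popen(vpn_command(parse_hook_args(argv)))
```

To use it as a hook, the script would call main(sys.argv[1:]) and be referenced from lxc.network.script.up. A matching down script would stop the client, once such an option exists.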

 The lxc.network.script.down does not exist yet, but it is quite easy to
 add the option.

 What do you think ?

  Can the kernel distinguish between network interfaces that were created
  inside the namespace, and network interfaces that were moved there?

 IMHO that would add more complexity to the network namespace code,
 especially for handling nested namespaces. Furthermore, it would impact the
 current design. I am not really in favor of that, as that was the initial
 behavior and it had limitations.
 ___
 Containers mailing list
 contain...@lists.linux-foundation.org
 https://lists.linux-foundation.org/mailman/listinfo/containers




-- 
Renato Westphal
commit 4b938c007d9a20d7ee6753083d7a9c6b1f098671
Author: Renato Westphal rwestp...@inf.ufrgs.br
Date:   Sun Feb 27 02:07:56 2011 -0300

netns: Preserve imported virtual interfaces during namespace cleanup

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index b21e405..7cce799 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1019,6 +1019,8 @@ struct net_device {
 #ifdef CONFIG_NET_NS
 	/* Network namespace this network device is inside */
 	struct net		*nd_net;
+	/* Initial network namespace of this network device */
+	struct net		*nd_init_net;
 #endif
 
 	/* mid-layer private */
diff --git a/net/core/dev.c b/net/core/dev.c
index f3a24c4..16d9bc4 100644

Re: [Lxc-users] netns: Issues with deleting virtual interfaces during namespace cleanup

2011-02-27 Thread Renato Westphal
Daniel/Eric,

You're completely right. This patch adds more problems than it solves.

I have a problem similar to David's, but I am now convinced that
it is better to handle it with the userspace tools.

Renato.

2011/2/27 Daniel Lezcano daniel.lezc...@free.fr:
 On 02/27/2011 06:16 AM, Renato Westphal wrote:

 Hello David,

 You may try the patch below (kernel v2.6.35) and see if that helps. It
 basically does what you asked for: during namespace cleanup, move back the
 virtual interfaces to their original namespaces. I did some tests with veth
 pairs and nested netns's and everything worked fine.

 I think this should be the default behaviour. I would appreciate it if someone
 could review/fix this patch and push it upstream.

 I don't think you should modify this. The automatic destruction behavior has
 been in place for a couple of years now, and the userspace components rely on
 it.

 Moreover, that will add extra complexity to the kernel, especially with
 nested namespaces. For example, suppose netns1 and netns2 are created, where
 netns2 is a child of netns1. You create a device in netns1, move it to netns2,
 and then netns1 exits. What happens to the device in netns2 when netns2 is
 destroyed? You would have to track the network namespace life cycle to stay
 consistent with the device's namespace of origin, and decide whether that
 namespace is dead or not.
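The scenario above can be played through in a small userspace model. This is only a toy sketch in Python, not kernel code. The Namespace and Device classes and the "return to origin" rule are assumptions made for illustration, with the origin field loosely mirroring the nd_init_net pointer added by the patch.

```python
class Namespace:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.alive = True
        self.devices = {}


class Device:
    def __init__(self, name, net):
        self.name = name
        self.net = net      # current namespace (cf. nd_net)
        self.origin = net   # namespace of creation (cf. nd_init_net)
        net.devices[name] = self


def move(dev, target):
    """Move a device from its current namespace into target."""
    del dev.net.devices[dev.name]
    dev.net = target
    target.devices[dev.name] = dev


def cleanup(net):
    """Namespace exit under a hypothetical 'return to origin' rule."""
    net.alive = False
    outcome = {}
    for dev in list(net.devices.values()):
        if dev.origin is net:
            # Created here: destroyed together with the namespace.
            del net.devices[dev.name]
            outcome[dev.name] = "destroyed"
        elif dev.origin.alive:
            move(dev, dev.origin)
            outcome[dev.name] = "returned to " + dev.origin.name
        else:
            # No good answer exists; the model only records the dilemma.
            del net.devices[dev.name]
            outcome[dev.name] = "origin " + dev.origin.name + " is gone"
    return outcome


netns1 = Namespace("netns1")
netns2 = Namespace("netns2", parent=netns1)
veth0 = Device("veth0", netns1)
move(veth0, netns2)
print(cleanup(netns1))  # netns1 holds no devices when it exits
print(cleanup(netns2))  # veth0's origin is already dead at this point
```

The last branch of cleanup() is where a real implementation would need a policy (destroy the device, fall back to an ancestor, or keep the origin alive), which is the life-cycle tracking referred to above.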

 No, really, I am not in favor of that.

 However, you could provide an interface on the device, e.g. a sysfs attribute,
 to flag it as non-destroyable-at-exit, so that it is kept untouched and
 moved back to the init_net_ns.
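A minimal sketch of how such a flag could change the cleanup decision, again as a Python toy model rather than kernel code. The keep_at_exit field is a hypothetical name for the proposed attribute. The baseline rule modelled here is that virtual devices are destroyed on namespace exit while physical ones go back to the initial namespace.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    virtual: bool
    keep_at_exit: bool = False  # hypothetical "non-destroyable-at-exit" flag


def cleanup_namespace(devices, init_net):
    """Empty a dying namespace and return the names of destroyed devices.

    Virtual devices are destroyed unless flagged; everything else is
    moved back to the initial namespace (modelled as a plain list).
    """
    destroyed = []
    for dev in devices:
        if dev.virtual and not dev.keep_at_exit:
            destroyed.append(dev.name)
        else:
            init_net.append(dev)
    devices.clear()
    return destroyed


init_net = []
container = [
    Device("eth1", virtual=False),
    Device("veth1", virtual=True),
    Device("tun0", virtual=True, keep_at_exit=True),
]
destroyed = cleanup_namespace(container, init_net)
print(destroyed)
print([dev.name for dev in init_net])
```

In this model the flagged tun0 survives the container exit without any tracking of where it was created, which is why the flag avoids the nested-namespace question entirely: the destination is always the initial namespace.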

 2011/2/26 Daniel Lezcano daniel.lezc...@free.fr

 On 02/26/2011 05:59 PM, Ward, David - 0663 - MITLL wrote:

 (Apologies for the cross-post, but Thunderbird messed up the formatting
 when I sent this originally, and then I realized I sent it to the wrong
 list.)

 A patch was applied to the kernel in November 2008 that deletes virtual
 network interfaces when network namespaces are cleaned up
 (d0c082cea6dfb9b674b4f6e1e84025662dbd24e8). A discussion about this
 patch took place on this list
 (https://lists.linux-foundation.org/pipermail/containers/2008-October/013460.html),

 where Daniel Lezcano wrote:

    After discussing with Benjamin, this patch means a user can no longer
    manage a pool of virtual devices because they will be automatically
    destroyed when the namespace exits. I don't think it is a big concern,
    but just in case I am asking :)

 I currently have two use cases where this behavior is not desirable:

 1. I use a veth pair device to connect two containers together (as
 opposed to connecting a container to the host). To do this, I
 create the veth pair device manually in the host with iproute2
 (ip link add type veth). Then when I start each container, it
 pulls in one of the interfaces of the veth pair device with
 lxc.network.type = phys. When I stop one of the containers, its
 interface to the veth pair device is deleted instead of moved back
 to the host, so I can not just start the stopped container again
 and re-establish the same link.

 Maybe you can rely on the lxc configuration to do that.

 This assumes you always create the two containers in the same order.

 The first one:

 lxc.network.type=veth
 lxc.network.veth.pair=vethX

 The second one:

 lxc.network.type=phys
 lxc.network.link=vethX

 The drawback is you have to stop / start both of them.


 Otherwise, why don't you use the macvlan configuration ?

 For both containers:

 lxc.network.type=macvlan
 lxc.network.macvlan.mode=bridge
 lxc.network.link=dummy0


 2. I start a process in the host that creates a TUN/TAP interface,
 such as a VPN client. I pull the TUN/TAP interface into the
 container with lxc.network.type = phys. When the container
 exits, the TUN/TAP interface is deleted because it is a virtual
 interface, while the VPN client process continues to run in the
 host. Again I can not just start the container again with the
 same connection; I have to restart the VPN client.

 It makes sense that virtual network interfaces that get created inside a
 container should be deleted when the container exits. However, I feel
 that network interfaces from the host that get assigned to the container
 should be returned to the host when the container exits, whether they
 are physical or virtual.

 Wouldn't it make sense to add a configuration option for lxc to create such
 a device and handle the VPN client?

 There is the lxc.network.script.up option, where you can launch your VPN
 client. So if the tun/tap interface were added as a network option, lxc
 would create it for you and, once it is up, invoke the up script, where
 the VPN client is launched.

 The lxc.network.script.down does not exist yet, but it is quite easy to
 add the option.

 What do you think ?

 Can the kernel distinguish between network interfaces that were created
 inside the namespace, and network interfaces that were moved there?

 IMHO that will add more