[Devel] Re: VRF-like use of Network Namespaces
I want to be able to type, say:

    ip vrf add vrf_name

to create a persisting network namespace, and then be able to add a net device to this namespace:

    ip link add dev tun0 vrf vrf_name

and then add a route to a subnet in this namespace using e.g.:

    ip route add 192.168.1.0/24 dev tun0 vrf vrf_name

I believe I can patch iproute2 (which provides the 'ip' config utility) to use setns() and unshare() to add new namespaces and to configure interfaces and routing in a namespace?

I will look more into it tomorrow :)

Thanks a lot for this awesome work anyway!

mathieu.

On Tue, Jun 8, 2010 at 11:06 PM, Daniel Lezcano <daniel.lezc...@free.fr> wrote:

> On 06/08/2010 07:12 PM, Mathieu Peresse wrote:
>> Looks good, thanks!
>>
>> Has anyone worked to make 'ip' use these facilities?
>>
>> If I understand correctly, from a network resource configuration perspective:
>>
>> - Creating a persisting namespace ('VRF') is equivalent to: create a namespace (using clone()), which creates a proc entry for that namespace, and then bind mount the file so that it stays open.
>
> From the same process, unshare (using unshare()), open /proc/self/ns/net, store the fd, unshare again, open /proc/self/ns/net, store the fd, ...
>
> A single process handles by this way several network namespaces. To switch from one namespace to another, just use the setns syscall.
>
> Well this is one example to use it, AFAIK you are looking for this very specific usage, no?
>
> Thanks
>   -- Daniel

-- 
a+
mathieu
___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel
[Devel] Re: VRF-like use of Network Namespaces
Hi,

[this is related to the use of Eric Biederman's new set of patches for named netns / netns switching]

ok so I successfully modified /sbin/ip. I can now:

- add/del a new netns by name: ip netns {addns,delns} ns_name
- The namespace files are mounted on /var/run/netns/ns_name (so you have to mkdir /var/run/netns/ for this to work).
- list netns: ip netns show
- use /sbin/ip in any named netns: ip -netns ns_name link show

(rough patch against current git tree attached)

I want now to move devices across namespaces using their filesystem names (instead of using PIDs...). I'm not sure I can do it in userspace with the current code yet, can I?

I saw there was a rtnetlink attribute to set the netns of a device but it uses the PID of a namespace owner to do so... within 'ip' I can refer to only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently points to), so I won't be able to move an interface from outside my namespace to my namespace...

I hope my explanation is clear and that this will get some interest... :)

BTW is this the right ML to post this on?

Thanks,
Mathieu.

On Tue, Jun 8, 2010 at 11:48 PM, Mathieu Peresse <mathieu.pere...@gmail.com> wrote:

> I want to be able to type say:
>
>     ip vrf add vrf_name
>
> to create a persisting network namespace, and then be able to add a net device to this namespace
>
>     ip link add dev tun0 vrf vrf_name
>
> and then add a route to a subnet in this namespace using e.g.
>
>     ip route add 192.168.1.0/24 dev tun0 vrf vrf_name
>
> I believe i can patch iproute2 (providing the 'ip' config utility) to use setns() and unshare() to add new namespaces and configure interfaces and routing in namespace ?
>
> I will look more into it tomorrow :)
>
> Thanks a lot for this awesome work anyways !
>
> mathieu.
>
> On Tue, Jun 8, 2010 at 11:06 PM, Daniel Lezcano <daniel.lezc...@free.fr> wrote:
>
>> On 06/08/2010 07:12 PM, Mathieu Peresse wrote:
>>> Looks good, thanks !
>>>
>>> Has anyone worked to make 'ip' use these facilities ?
>>>
>>> If I understand correctly, from a network resource configuration perspective:
>>>
>>> - Creating a persisting namespace ('VRF') is equivalent to: create a namespace (using clone()), which creates a proc entry for that namespace, and then bind mount the file so that it stays open.
>>
>> From the same process, unshare (using unshare()), open /proc/self/ns/net, store the fd, unshare again, open /proc/self/ns/net, store the fd, ...
>>
>> A single process handles by this way several network namespaces. To switch from one namespace to another, just use the setns syscall.
>>
>> Well this is one example to use it, AFAIK you are looking for this very specific usage no ?
>>
>> Thanks
>>   -- Daniel
>
> --
> a+
> mathieu

-- 
a+
mathieu

diff -pruN iproute2/ip/ip.c iproute2_netns/ip/ip.c
--- iproute2/ip/ip.c 2010-06-11 16:21:25.948671592 +0200
+++ iproute2_netns/ip/ip.c 2010-06-11 16:19:45.684672493 +0200
@@ -42,11 +42,11 @@ static void usage(void)
 "Usage: ip [ OPTIONS ] OBJECT { COMMAND | help }\n"
 "       ip [ -force ] -batch filename\n"
 "where  OBJECT := { link | addr | addrlabel | route | rule | neigh | ntable |\n"
-"                   tunnel | tuntap | maddr | mroute | mrule | monitor | xfrm }\n"
+"                   tunnel | tuntap | maddr | mroute | mrule | monitor | xfrm | netns }\n"
 "       OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 "                    -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
 "                    -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
-"                    -rc[vbuf] [size]}\n");
+"                    -rc[vbuf] [size] | -[n]etns netnsname}\n");
     exit(-1);
 }
@@ -75,6 +75,7 @@ static const struct cmd {
     { "tap", do_iptuntap },
     { "monitor", do_ipmonitor },
     { "xfrm", do_xfrm },
+    { "netns", do_ipnetns },
     { "mroute", do_multiroute },
     { "mrule", do_multirule },
     { "help", do_help },
@@ -225,6 +226,27 @@ int main(int argc, char **argv)
                 exit(-1);
             }
             rcvbuf = size;
+        } else if (matches(opt, "-netns") == 0) {
+            int nsfd;
+            char netns_path[255];
+            argc--;
+            argv++;
+            if (strlen(argv[1]) > NETNS_STR_MAXLEN) {
+                fprintf(stderr, "Invalid netns name.\n");
+                exit(-1);
+            }
+            strcpy(netns_path, NETNS_DIR);
+            strcat(netns_path, argv[1]);
+            if ((nsfd = open(netns_path, O_RDONLY)) < 0) {
+                fprintf(stderr, "Could not open netns file.\n");
+                exit(-1);
+            }
+            /* Change namespace for iface configuration */
+            if (setns(0, nsfd) < 0) {
+                fprintf(stderr, "setns() failed: %s\n",
+                        strerror(errno));
+                exit(-1);
+            }
         } else if (matches(opt, "-help") == 0) {
             usage();
         } else {
diff -pruN iproute2/ip/ip_common.h iproute2_netns/ip/ip_common.h
--- iproute2/ip/ip_common.h 2010-06-11 16:21:25.948671592 +0200
+++ iproute2_netns/ip/ip_common.h 2010-06-11 16:20:14.900724958 +0200
@@ -39,6 +39,7 @@ extern int do_multiaddr(int argc, char *
 extern int do_multiroute(int argc, char **argv);
 extern int do_multirule(int argc, char **argv);
 extern int do_xfrm(int argc, char **argv);
+extern int do_ipnetns(int argc, char
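The -netns option in the patch above builds the namespace path with strcpy()/strcat() after a length check. A minimal sketch of the same step using snprintf() follows. It assumes NETNS_DIR is /var/run/netns/ (the directory named in the mail) and a NETNS_STR_MAXLEN of 64; the patch's real definitions are not visible in the excerpt, and netns_build_path() is a hypothetical helper, not an iproute2 function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed values: the real definitions are not shown in the posted patch. */
#define NETNS_DIR "/var/run/netns/"
#define NETNS_STR_MAXLEN 64

/*
 * Build "<NETNS_DIR><name>" into out (capacity outlen).
 * Returns 0 on success, -1 if the name is empty, too long, contains '/',
 * is "." or "..", or if the result does not fit in out.
 */
int netns_build_path(const char *name, char *out, size_t outlen)
{
	size_t len;
	int n;

	assert(out != NULL && outlen > 0);

	if (name == NULL)
		return -1;
	len = strlen(name);
	if (len == 0 || len > NETNS_STR_MAXLEN)
		return -1;
	/* Reject anything that could escape the netns directory. */
	if (strchr(name, '/') != NULL)
		return -1;
	if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
		return -1;

	n = snprintf(out, outlen, "%s%s", NETNS_DIR, name);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

A caller would also want to check that an argument actually follows -netns before reading it; the other options in ip.c appear to do this, but the excerpt does not show it for the new branch.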
[Devel] Re: VRF-like use of Network Namespaces
Hi,

On Sun, Jun 13, 2010 at 11:59 AM, Eric W. Biederman <ebied...@xmission.com> wrote:

> Daniel Lezcano <daniel.lezc...@free.fr> writes:
>
>> On 06/11/2010 04:47 PM, Mathieu Peresse wrote:
>>> Hi,
>>>
>>> [this is related to the use of Eric Biederman's new set of patches for named netns / netns switching]
>>>
>>> ok so I successfully modified /sbin/ip. I can now:
>>>
>>> - add/del a new netns by name: ip netns {addns,delns} ns_name
>>> - The namespace files are mounted on /var/run/netns/ns_name (so you have to mkdir /var/run/netns/ for this to work).
>>
>> IMHO, the ip command is not suitable for this, it does not write anything to the fs.
>
> It does configuration by all kinds of means. As far as it goes I think the ip command is perfectly suitable in this particular situation. Having a vrf functionality in linux is very desirable.

I agree. And ip is just a cool tool :)

> Getting this into ip has the major advantage that we will have a defacto standard, and using IFLA_NET_NS_FD makes a lot more sense if everything is in ip.
>
>> You should write your own command, which can be a perl script using the 'unshare' command (util-linux package on my distro).
>>
>>     vrf create name
>>     vrf delete name
>>     vrf attach name
>>     vrf list
>>
>> vrf create will bind mount the ns at the place you decided in the script (eg. a tmpfs in order to keep the directory consistent across (unclean) reboots).
>>
>>> - list netns: ip netns show
>>> - use /sbin/ip in any named netns: ip -netns ns_name link show
>>>
>>> (rough patch against current git tree attached)
>>>
>>> I want now to move devices across namespaces using their filesystem names (instead of using PIDs...). I'm not sure I can do it in userspace with the current code yet, can I ?
>>
>> No, you can do that only with pids, but why don't you move the devices at the create time ? You have all the latitude to do that, no ?
>
> Does my published tree not have IFLA_NET_NS_FD in it?

No I don't think so... I'll have to check tomorrow at work though.

>>> I saw there was a rtnetlink attribute to set the netns of a device but it uses the PID of a namespace owner to do so... within 'ip' I can refer to only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently points to), so I won't be able to move an interface from outside my namespace to my namespace...
>>>
>>> I hope my explanation is clear and that this will get some interest... :)
>>
>> Your 'create' command can open a fd to its current netns, unshare a new namespace, bind mount it, and then return to the previously saved netns.
>>
>>> BTW is this the right ML to post this on ?
>>
>> Well, this is something related to a subsystem of the containers, so it has some interest but I would suggest to send to the netdev@ mailing list (net...@vger.kernel.org), maybe cc'ing this mailing list.
>
> Anyway it looks like time to post the core of my patchset for review, and get things moving on this.

Definitely :) Thanks.

> Eric

-- 
a+
mathieu
[Devel] Re: VRF-like use of Network Namespaces
Mathieu Peresse <mathieu.pere...@gmail.com> writes:

> Hi,
>
> [this is related to the use of Eric Biederman's new set of patches for named netns / netns switching]
>
> ok so I successfully modified /sbin/ip. I can now:
>
> - add/del a new netns by name: ip netns {addns,delns} ns_name
> - The namespace files are mounted on /var/run/netns/ns_name (so you have to mkdir /var/run/netns/ for this to work).
> - list netns: ip netns show
> - use /sbin/ip in any named netns: ip -netns ns_name link show
>
> (rough patch against current git tree attached)
>
> I want now to move devices across namespaces using their filesystem names (instead of using PIDs...). I'm not sure I can do it in userspace with the current code yet, can I ?
>
> I saw there was a rtnetlink attribute to set the netns of a device but it uses the PID of a namespace owner to do so... within 'ip' I can refer to only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently points to), so I won't be able to move an interface from outside my namespace to my namespace...
>
> I hope my explanation is clear and that this will get some interest... :)

If you look in my nsfd tree there is a new IFLA_NET_NS_FD attribute, so you should be able to update the existing code in ip that takes a netns by pid: do a name-based search first, and if you don't find the name and the value is numeric, do a search by pid.

> BTW is this the right ML to post this on ?

For hashing out the idea this is fine. Ultimately this conversation needs to hit netdev, before we merge all of this.

That rough patch looks particularly promising.

Eric
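The lookup order Eric describes (try the argument as a namespace name first, and fall back to treating it as a pid only if no such name exists and the value is numeric) could look roughly like the sketch below. The function and type names are hypothetical, and the directory is passed in rather than hard-coded. A real caller would then send IFLA_NET_NS_FD for the name case and IFLA_NET_NS_PID for the pid case; that netlink part is left out here.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result type: how a netns argument was interpreted. */
enum netns_ref_kind { NETNS_REF_NONE, NETNS_REF_NAME, NETNS_REF_PID };

struct netns_ref {
	enum netns_ref_kind kind;
	char path[256];	/* valid when kind == NETNS_REF_NAME */
	long pid;	/* valid when kind == NETNS_REF_PID */
};

/* Resolve arg as a name under dir first, then as a numeric pid. */
struct netns_ref netns_resolve(const char *dir, const char *arg)
{
	struct netns_ref ref = { NETNS_REF_NONE, "", 0 };
	char *end;
	long val;
	int n;

	assert(dir != NULL && arg != NULL);

	if (arg[0] == '\0' || strchr(arg, '/') != NULL)
		return ref;

	/* 1. Name-based search: does <dir>/<arg> exist? */
	n = snprintf(ref.path, sizeof(ref.path), "%s/%s", dir, arg);
	if (n > 0 && (size_t)n < sizeof(ref.path) &&
	    access(ref.path, F_OK) == 0) {
		ref.kind = NETNS_REF_NAME;
		return ref;
	}
	ref.path[0] = '\0';

	/* 2. No such name: accept the argument only if it is a positive number. */
	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno == 0 && *end == '\0' && val > 0) {
		ref.kind = NETNS_REF_PID;
		ref.pid = val;
	}
	return ref;
}
```

Checking the name first means a namespace that happens to be called "1234" shadows pid 1234, which is the precedence suggested in the mail.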
[Devel] Re: VRF-like use of Network Namespaces
Daniel Lezcano <daniel.lezc...@free.fr> writes:

> On 06/11/2010 04:47 PM, Mathieu Peresse wrote:
>> Hi,
>>
>> [this is related to the use of Eric Biederman's new set of patches for named netns / netns switching]
>>
>> ok so I successfully modified /sbin/ip. I can now:
>>
>> - add/del a new netns by name: ip netns {addns,delns} ns_name
>> - The namespace files are mounted on /var/run/netns/ns_name (so you have to mkdir /var/run/netns/ for this to work).
>
> IMHO, the ip command is not suitable for this, it does not write anything to the fs.

It does configuration by all kinds of means. As far as it goes I think the ip command is perfectly suitable in this particular situation. Having a vrf functionality in linux is very desirable.

Getting this into ip has the major advantage that we will have a defacto standard, and using IFLA_NET_NS_FD makes a lot more sense if everything is in ip.

> You should write your own command, which can be a perl script using the 'unshare' command (util-linux package on my distro).
>
>     vrf create name
>     vrf delete name
>     vrf attach name
>     vrf list
>
> vrf create will bind mount the ns at the place you decided in the script (eg. a tmpfs in order to keep the directory consistent across (unclean) reboots).
>
>> - list netns: ip netns show
>> - use /sbin/ip in any named netns: ip -netns ns_name link show
>>
>> (rough patch against current git tree attached)
>>
>> I want now to move devices across namespaces using their filesystem names (instead of using PIDs...). I'm not sure I can do it in userspace with the current code yet, can I ?
>
> No, you can do that only with pids, but why don't you move the devices at the create time ? You have all the latitude to do that, no ?

Does my published tree not have IFLA_NET_NS_FD in it?

>> I saw there was a rtnetlink attribute to set the netns of a device but it uses the PID of a namespace owner to do so... within 'ip' I can refer to only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently points to), so I won't be able to move an interface from outside my namespace to my namespace...
>>
>> I hope my explanation is clear and that this will get some interest... :)
>
> Your 'create' command can open a fd to its current netns, unshare a new namespace, bind mount it, and then return to the previously saved netns.
>
>> BTW is this the right ML to post this on ?
>
> Well, this is something related to a subsystem of the containers, so it has some interest but I would suggest to send to the netdev@ mailing list (net...@vger.kernel.org), maybe cc'ing this mailing list.

Anyway it looks like time to post the core of my patchset for review, and get things moving on this.

Eric
[Devel] Re: VRF-like use of Network Namespaces
On 06/13/2010 11:59 AM, Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezc...@free.fr> writes:
>
>> On 06/11/2010 04:47 PM, Mathieu Peresse wrote:
>>> Hi,
>>>
>>> [this is related to the use of Eric Biederman's new set of patches for named netns / netns switching]
>>>
>>> ok so I successfully modified /sbin/ip. I can now:
>>>
>>> - add/del a new netns by name: ip netns {addns,delns} ns_name
>>> - The namespace files are mounted on /var/run/netns/ns_name (so you have to mkdir /var/run/netns/ for this to work).
>>
>> IMHO, the ip command is not suitable for this, it does not write anything to the fs.
>
> It does configuration by all kinds of means. As far as it goes I think the ip command is perfectly suitable in this particular situation. Having a vrf functionality in linux is very desirable.

I agree it would be preferable to centralize everything in the ip command. But the approach proposed by Mathieu relies on the filesystem. I don't think there is another solution, but having the ip command mounting, writing and reading from this directory is a bit weird IMHO, maybe because it does not do that (or I missed something). And for this reason only, I find the ip command not suitable for this. But I am perfectly fine with the idea in general.

That makes me feel that maybe a 'netnsfs' is missing. IMHO, it is like we fork and we store the pid in /var/run/pid/1234.

On the other hand, the 'ip' command is run as root, so we can assume he knows what it does, like the 'mount' command writing to /etc/mtab.

> Getting this into ip has the major advantage that we will have a defacto standard, and using IFLA_NET_NS_FD makes a lot more sense if everything is in ip.

Sure, if the netdev guys are ok with writing into /var/run/netns, I won't argue against it.

>> You should write your own command, which can be a perl script using the 'unshare' command (util-linux package on my distro).
>>
>>     vrf create name
>>     vrf delete name
>>     vrf attach name
>>     vrf list
>>
>> vrf create will bind mount the ns at the place you decided in the script (eg. a tmpfs in order to keep the directory consistent across (unclean) reboots).
>>
>>> - list netns: ip netns show
>>> - use /sbin/ip in any named netns: ip -netns ns_name link show
>>>
>>> (rough patch against current git tree attached)
>>>
>>> I want now to move devices across namespaces using their filesystem names (instead of using PIDs...). I'm not sure I can do it in userspace with the current code yet, can I ?
>>
>> No, you can do that only with pids, but why don't you move the devices at the create time ? You have all the latitude to do that, no ?
>
> Does my published tree not have IFLA_NET_NS_FD in it?

Hmm, AFAICS no.

>>> I saw there was a rtnetlink attribute to set the netns of a device but it uses the PID of a namespace owner to do so... within 'ip' I can refer to only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently points to), so I won't be able to move an interface from outside my namespace to my namespace...
>>>
>>> I hope my explanation is clear and that this will get some interest... :)
>>
>> Your 'create' command can open a fd to its current netns, unshare a new namespace, bind mount it, and then return to the previously saved netns.
>>
>>> BTW is this the right ML to post this on ?
>>
>> Well, this is something related to a subsystem of the containers, so it has some interest but I would suggest to send to the netdev@ mailing list (net...@vger.kernel.org), maybe cc'ing this mailing list.
>
> Anyway it looks like time to post the core of my patchset for review, and get things moving on this.

Reviewing in progress ... ;)

Thanks
  -- Daniel
[Devel] Re: VRF-like use of Network Namespaces
MP> I saw there was a rtnetlink attribute to set the netns of a device but it
MP> uses the PID of a namespace owner to do so... within 'ip' i can refer to
MP> only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently
MP> points to), so I won't be able to move an interface from outside my
MP> namespace to my namespace...

Not just the owner, but any process in the namespace, AFAIK. So, you should be able to fork() a child, have that child setns() into the namespace of your choosing, and then move the device to the process of your child (since you now know the pid).

It's a little indirect, but it should work.

-- 
Dan Smith
IBM Linux Technology Center
email: da...@us.ibm.com
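A rough sketch of the indirection Dan describes: fork a helper, have it setns() into the target namespace, and use the helper's pid as the handle for the existing pid-based attribute. It assumes the setns(fd, nstype) interface as later exposed by glibc, whose argument order differs from the prototype call in the patch. The helper names are hypothetical, and joining a namespace needs CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE	/* for setns() and CLONE_NEWNET */
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a helper that joins the network namespace referred to by ns_path
 * and then sleeps. Returns the helper's pid, or -1 on failure.
 */
pid_t netns_spawn_holder(const char *ns_path)
{
	int nsfd, ready[2];
	pid_t pid;
	char ok = 0;

	assert(ns_path != NULL);

	nsfd = open(ns_path, O_RDONLY | O_CLOEXEC);
	if (nsfd < 0)
		return -1;
	if (pipe(ready) < 0) {
		close(nsfd);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(nsfd);
		close(ready[0]);
		close(ready[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: join the namespace, report success, then wait. */
		close(ready[0]);
		if (setns(nsfd, CLONE_NEWNET) < 0)
			_exit(1);
		close(nsfd);
		if (write(ready[1], "1", 1) != 1)
			_exit(1);
		close(ready[1]);
		for (;;)
			pause();
	}

	/* Parent: wait until the child is inside the namespace. */
	close(nsfd);
	close(ready[1]);
	if (read(ready[0], &ok, 1) != 1) {
		close(ready[0]);
		waitpid(pid, NULL, 0);	/* child exited before signalling */
		return -1;
	}
	close(ready[0]);
	return pid;
}

/* Stop and reap a helper started by netns_spawn_holder(). */
void netns_reap_holder(pid_t pid)
{
	kill(pid, SIGTERM);
	waitpid(pid, NULL, 0);
}
```

With the returned pid in hand, the device can be moved using the pid-based path, for example `ip link set dev tun0 netns PID`, after which the helper can be reaped.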
[Devel] Re: VRF-like use of Network Namespaces
On 06/11/2010 04:47 PM, Mathieu Peresse wrote:
> Hi,
>
> [this is related to the use of Eric Biederman's new set of patches for named netns / netns switching]
>
> ok so I successfully modified /sbin/ip. I can now:
>
> - add/del a new netns by name: ip netns {addns,delns} ns_name
> - The namespace files are mounted on /var/run/netns/ns_name (so you have to mkdir /var/run/netns/ for this to work).

IMHO, the ip command is not suitable for this, it does not write anything to the fs.

You should write your own command, which can be a perl script using the 'unshare' command (util-linux package on my distro).

    vrf create name
    vrf delete name
    vrf attach name
    vrf list

vrf create will bind mount the ns at the place you decided in the script (e.g. a tmpfs in order to keep the directory consistent across (unclean) reboots).

> - list netns: ip netns show
> - use /sbin/ip in any named netns: ip -netns ns_name link show
>
> (rough patch against current git tree attached)
>
> I want now to move devices across namespaces using their filesystem names (instead of using PIDs...). I'm not sure I can do it in userspace with the current code yet, can I ?

No, you can do that only with pids, but why don't you move the devices at create time? You have all the latitude to do that, no?

> I saw there was a rtnetlink attribute to set the netns of a device but it uses the PID of a namespace owner to do so... within 'ip' I can refer to only one namespace (i.e. the one that 'ip' task_struct->ns_proxy currently points to), so I won't be able to move an interface from outside my namespace to my namespace...
>
> I hope my explanation is clear and that this will get some interest... :)

Your 'create' command can open a fd to its current netns, unshare a new namespace, bind mount it, and then return to the previously saved netns.

> BTW is this the right ML to post this on ?

Well, this is something related to a subsystem of the containers, so it has some interest, but I would suggest sending it to the netdev@ mailing list (net...@vger.kernel.org), maybe cc'ing this mailing list.
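The 'create' sequence Daniel outlines (save a handle on the current netns, unshare, bind mount the new namespace, switch back) can be sketched as below. This assumes the setns(fd, nstype) and /proc/self/ns/net interfaces as they later appeared in mainline; netns_create_persistent() is a hypothetical name, and the steps after the file creation need CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE	/* for setns(), unshare() and CLONE_NEWNET */
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Create a new network namespace and pin it at target (a path such as
 * /var/run/netns/<name>), leaving the caller in its original namespace.
 * Returns 0 on success, -1 on failure.
 */
int netns_create_persistent(const char *target)
{
	int saved, fd, ret = -1;

	assert(target != NULL);

	/* 1. Keep a handle on the namespace we are in now. */
	saved = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
	if (saved < 0)
		return -1;

	/* 2. Create an empty file to serve as the bind-mount point. */
	fd = open(target, O_RDONLY | O_CREAT | O_EXCL, 0444);
	if (fd < 0)
		goto out_saved;
	close(fd);

	/* 3. Move this process into a fresh network namespace. */
	if (unshare(CLONE_NEWNET) < 0)
		goto out_unlink;

	/* 4. Pin the new namespace so it outlives this process. */
	if (mount("/proc/self/ns/net", target, "none", MS_BIND, NULL) < 0) {
		(void)setns(saved, CLONE_NEWNET);
		goto out_unlink;
	}

	/* 5. Return to the namespace saved in step 1. */
	if (setns(saved, CLONE_NEWNET) == 0)
		ret = 0;
	close(saved);
	return ret;

out_unlink:
	unlink(target);
out_saved:
	close(saved);
	return -1;
}
```

Creating the mount point before unshare() keeps the failure cases cheap: if the directory is missing or the name already exists, nothing about the process has changed yet.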
[Devel] Re: VRF-like use of Network Namespaces
On 06/08/2010 05:23 PM, Mathieu Peresse wrote:
> Hi all,
>
> I saw this post from Oct 2008: https://lists.linux-foundation.org/pipermail/containers/2008-October/013917.html, discussing how to manipulate network namespaces like we do with VRFs on Cisco routers (e.g. using normal network commands, plus appending vrf vrf_name at the end to manipulate the desired VRF), without the need to have processes bound to network namespaces.
>
> Are there any activities on this subject ?

There is a prototype here:

git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6.33-nsfd-v5.git

The description of what it does:

http://git.kernel.org/?p=linux/kernel/git/ebiederm/linux-2.6.33-nsfd-v5.git;a=commit;h=9c2f86a44d9ca93e78fd8e81a4e2a8c2a4cdb054

I don't know what is the status of this patchset and if Eric is willing to push it for the next kernel version.

Thanks
  -- Daniel
[Devel] Re: VRF-like use of Network Namespaces
On 06/08/2010 07:12 PM, Mathieu Peresse wrote:
> Looks good, thanks !
>
> Has anyone worked to make 'ip' use these facilities ?
>
> If I understand correctly, from a network resource configuration perspective:
>
> - Creating a persisting namespace ('VRF') is equivalent to: create a namespace (using clone()), which creates a proc entry for that namespace, and then bind mount the file so that it stays open.

From the same process, unshare (using unshare()), open /proc/self/ns/net, store the fd, unshare again, open /proc/self/ns/net, store the fd, ...

A single process handles by this way several network namespaces. To switch from one namespace to another, just use the setns syscall.

Well this is one example to use it, AFAIK you are looking for this very specific usage no ?

Thanks
  -- Daniel
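The single-process approach Daniel describes (unshare, open /proc/self/ns/net, keep the fd, repeat, then setns() to switch) might be sketched like this. The pool type and function names are hypothetical, the fixed capacity is arbitrary, and the code assumes the setns(fd, nstype) signature that was eventually merged rather than the prototype's argument order.

```c
#define _GNU_SOURCE	/* for setns(), unshare() and CLONE_NEWNET */
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

#define NETNS_POOL_MAX 8	/* arbitrary capacity for the sketch */

/* File descriptors for the namespaces this process has created. */
struct netns_pool {
	int fds[NETNS_POOL_MAX];
	int count;
};

void netns_pool_init(struct netns_pool *p)
{
	assert(p != NULL);
	p->count = 0;
}

/*
 * Unshare into a new network namespace and remember an fd for it.
 * Returns the index of the new namespace, or -1 on failure.
 * On success the process is left inside the new namespace.
 */
int netns_pool_new(struct netns_pool *p)
{
	int fd;

	if (p->count >= NETNS_POOL_MAX)
		return -1;
	if (unshare(CLONE_NEWNET) < 0)
		return -1;
	fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;
	p->fds[p->count] = fd;
	return p->count++;
}

/* Switch the calling process to a namespace created earlier. */
int netns_pool_switch(const struct netns_pool *p, int idx)
{
	if (idx < 0 || idx >= p->count)
		return -1;
	return setns(p->fds[idx], CLONE_NEWNET);
}

/* Close all fds; a namespace with no other user goes away with its fd. */
void netns_pool_close(struct netns_pool *p)
{
	while (p->count > 0)
		close(p->fds[--p->count]);
}
```

Because the open fds are the only thing keeping these namespaces alive, they vanish when the process exits, which is the difference from the bind-mount variant discussed in the quoted text.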