cross namespace interface notification for tun devices
Hey guys,

It's possible to create a tun device in a process in namespace A and then move that interface to namespace B. The controlling process in A needs to receive notifications when the interface is brought up or down. It can receive these notifications via netlink while the interface lives in A, but not once it moves to B.

Any tricks or APIs to get around this?

The best I've come up with is, in a sleep loop, writing to the tun device's fd something with a NULL or invalid payload. If the interface is down, the kernel returns -EIO. If the interface is up, the kernel returns -EINVAL. This seems to be a reliable distinguisher, but is a pretty insane way of doing it. And sleep loops are somewhat different from events too.

If there aren't any current APIs for receiving events directly from the fd of a tun interface, would this list be happy with a patch that adds one?

Thanks,
Jason
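For readers who want the probe spelled out, here is a minimal sketch of the sleep loop described above. The names `classify_probe`, `probe_tun`, and `watch_tun` are hypothetical, and the sketch assumes the -EIO/-EINVAL behaviour reported in this message, on a tun fd opened without IFF_NO_PI; it is an illustration of the workaround, not a recommended interface.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

enum tun_state { TUN_DOWN, TUN_UP, TUN_UNKNOWN };

/* Map the result of a zero-length write() on the tun fd to a link state.
 * Assumption taken from the message above: EIO = down, EINVAL = up. */
enum tun_state classify_probe(ssize_t ret, int err)
{
	if (ret >= 0)
		return TUN_UNKNOWN;	/* the write unexpectedly succeeded */
	if (err == EIO)
		return TUN_DOWN;
	if (err == EINVAL)
		return TUN_UP;
	return TUN_UNKNOWN;		/* some unrelated error */
}

/* Probe the tun fd once with a NULL, zero-length payload. */
enum tun_state probe_tun(int tun_fd)
{
	ssize_t ret = write(tun_fd, NULL, 0);

	return classify_probe(ret, errno);
}

/* Sleep loop: call on_change() only when the probed state differs from
 * the previous one. This approximates events but is still polling. */
void watch_tun(int tun_fd, unsigned int interval_sec,
	       void (*on_change)(enum tun_state))
{
	enum tun_state last = TUN_UNKNOWN;

	for (;;) {
		enum tun_state now = probe_tun(tun_fd);

		if (now != last) {
			on_change(now);
			last = now;
		}
		sleep(interval_sec);
	}
}
```

The latency of a state change is bounded by `interval_sec`, which is exactly the drawback of a sleep loop compared with a real event source.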
Re: cross namespace interface notification for tun devices
On 18/09/2017 at 20:47, Jason A. Donenfeld wrote:
> Hey guys,
>
> It's possible to create a tun device in a process in namespace A and
> then move that interface to namespace B. The controlling process in A
> needs to receive notifications on when the interface is brought up or
> down. It can receive these notifications via netlink while the
> interface lives in A but not when it moves to B.
>
> Any tricks or APIs to get around this?

There are two options.

1. Move the process to netns B, open the netlink socket and move the process back to netns A. The socket will remain in netns B and you will receive all netlink messages related to netns B.

2. Assign a nsid to netns B in netns A and use NETLINK_LISTEN_ALL_NSID on your netlink socket (see iproute2).

Regards,
Nicolas
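Option 2 can be sketched roughly as follows. The function names `open_all_nsid_link_socket` and `nsid_from_msghdr` are hypothetical, and the fallback definitions cover older userspace headers; the control-message handling follows the pattern iproute2 uses. Enabling the option needs CAP_NET_BROADCAST, so treat this as an outline under those assumptions rather than a finished monitor.

```c
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_LISTEN_ALL_NSID
#define NETLINK_LISTEN_ALL_NSID 8
#endif

/* Open an rtnetlink socket subscribed to link events, including those of
 * every netns that has an nsid assigned in the current netns.
 * Returns the fd, or -1 with errno set. */
int open_all_nsid_link_socket(void)
{
	struct sockaddr_nl addr;
	int one = 1;
	int fd = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_LINK;
	if (setsockopt(fd, SOL_NETLINK, NETLINK_LISTEN_ALL_NSID,
		       &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Extract the nsid attached to a message obtained with recvmsg().
 * Returns -1 when no nsid control message is present, which is the
 * case for events from the socket's own netns. */
int nsid_from_msghdr(struct msghdr *msg)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_NETLINK &&
		    cmsg->cmsg_type == NETLINK_LISTEN_ALL_NSID &&
		    cmsg->cmsg_len == CMSG_LEN(sizeof(int))) {
			int nsid;

			memcpy(&nsid, CMSG_DATA(cmsg), sizeof(nsid));
			return nsid;
		}
	}
	return -1;
}
```

A caller would pass a control buffer to recvmsg() on the returned fd and use the nsid to tell which namespace a given link event came from.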
Re: cross namespace interface notification for tun devices
On Mon, Oct 2, 2017 at 11:32 AM, Nicolas Dichtel wrote:
> 1. Move the process to netns B, open the netlink socket and move back the
> process to netns A. The socket will remain in netns B and you will receive all
> netlink messages related to netns B.
>
> 2. Assign a nsid to netns B in netns A and use NETLINK_LISTEN_ALL_NSID on your
> netlink socket (see iproute2).

Both of these seem to rely on the process knowing where the device is being moved and having access to that namespace. I don't think these two things are a given though. Unless I'm missing something?

Jason
Re: cross namespace interface notification for tun devices
On 02/10/2017 at 13:11, Jason A. Donenfeld wrote:
> On Mon, Oct 2, 2017 at 11:32 AM, Nicolas Dichtel wrote:
>> 1. Move the process to netns B, open the netlink socket and move back the
>> process to netns A. The socket will remain in netns B and you will receive
>> all netlink messages related to netns B.
>>
>> 2. Assign a nsid to netns B in netns A and use NETLINK_LISTEN_ALL_NSID on
>> your netlink socket (see iproute2).
>
> Both of these seem to rely on the process knowing where the device is
> being moved and having access to that namespace. I don't think these
> two things are a given though. Unless I'm missing something?

I had not understood correctly. Your control process cannot monitor or control an interface that is in an unknown/hidden netns. But x-netns interfaces are special: we already added a way to identify the peer netns for this kind of interface. If a get_link_net handler were added to the rtnl_link_ops of the tun driver, it would help to identify netns A when you are in netns B. But you need the opposite.

I already tried a patch to advertise via netlink the destination netns when an interface moves to a new netns. I think that it is valid for x-netns interfaces. As soon as you can identify the destination netns, your problem is solved, right?

Nicolas
Re: cross namespace interface notification for tun devices
On Mon, Sep 18, 2017 at 8:47 PM, Jason A. Donenfeld wrote:
> The best I've come up with is, in a sleep loop, writing to the tun
> device's fd something with a NULL or invalid payload. If the interface
> is down, the kernel returns -EIO. If the interface is up, the kernel
> returns -EINVAL. This seems to be a reliable distinguisher, but is a
> pretty insane way of doing it. And sleep loops are somewhat different
> from events too.

Specifically, I'm referring to the horrific hack exemplified in the attached .c file, in case anybody is curious about the details of what I'd rather not use.

/* The header names were stripped from the archived attachment; the
 * includes below are simply the ones this code needs. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

int main(int argc, char *argv[])
{
	/* If IFF_NO_PI is specified, this still sort of works but it
	 * bumps the device error counters, which we don't want, so
	 * it's best not to use this trick with IFF_NO_PI. */
	struct ifreq ifr = { .ifr_flags = IFF_TUN };
	int tun, sock, ret;

	tun = open("/dev/net/tun", O_RDWR);
	if (tun < 0) {
		perror("[-] open(/dev/net/tun)");
		return 1;
	}

	sock = socket(AF_INET, SOCK_DGRAM, 0);
	if (sock < 0) {
		perror("[-] socket(AF_INET, SOCK_DGRAM)");
		return 1;
	}

	/* The kernel fills in ifr.ifr_name with the allocated device name. */
	ret = ioctl(tun, TUNSETIFF, &ifr);
	if (ret < 0) {
		perror("[-] ioctl(TUNSETIFF)");
		return 1;
	}

	/* Interface is still down: the probe write should fail with EIO. */
	if (write(tun, NULL, 0) >= 0 || errno != EIO)
		perror("[-] write(if:down, NULL, 0) did not return -EIO");
	else
		fprintf(stderr, "[+] write(if:down, NULL, 0) returned -EIO: test successful\n");

	/* Bring the interface up, reusing the name stored in ifr. */
	ifr.ifr_flags = IFF_UP;
	ret = ioctl(sock, SIOCSIFFLAGS, &ifr);
	if (ret < 0) {
		perror("[-] ioctl(SIOCSIFFLAGS)");
		return 1;
	}

	/* Interface is up: the probe write should now fail with EINVAL. */
	if (write(tun, NULL, 0) >= 0 || errno != EINVAL)
		perror("[-] write(if:up, NULL, 0) did not return -EINVAL");
	else
		fprintf(stderr, "[+] write(if:up, NULL, 0) returned -EINVAL: test successful\n");

	return 0;
}
Re: cross namespace interface notification for tun devices
On Mon, Sep 18, 2017 at 11:47 AM, Jason A. Donenfeld wrote:
> Hey guys,
>
> It's possible to create a tun device in a process in namespace A and
> then move that interface to namespace B. The controlling process in A
> needs to receive notifications on when the interface is brought up or
> down. It can receive these notifications via netlink while the
> interface lives in A but not when it moves to B.

By "notification" I assume you mean netlink notification.

>
> Any tricks or APIs to get around this?

The question is why does the process in A still care about the device sitting in B?

Also, the process should be able to receive a last notification on IFF_UP|IFF_RUNNING before the device is finally moved to B. After this point, it should not have any relation to netns A any more, as if the device were completely gone.
Re: cross namespace interface notification for tun devices
On Tue, Sep 19, 2017 at 10:40 PM, Cong Wang wrote:
> By "notification" I assume you mean netlink notification.

Yes, netlink notification.

> The question is why does the process in A still care about
> the device sitting in B?
>
> Also, the process should be able to receive a last notification
> on IFF_UP|IFF_RUNNING before device is finally moved to B.
> After this point, it should not have any relation to netns A
> any more, like the device were completely gone.

That's very clearly not the case with a tun device. Tun devices work by letting a userspace process control the inputs (ndo_start_xmit) and outputs (netif_rx) of the actual network device. This controlling userspace process needs to know when its own interface that it controls goes up and down. In the kernel, we can do this by just checking dev->flags&IFF_UP, and receive notifications on ndo_open and ndo_stop. In userspace, the controlling process loses the ability to receive notifications like ndo_open/ndo_stop when the interface is moved to a new namespace. After the interface is moved to a namespace, the process will still control inputs and outputs (ndo_start_xmit and netif_rx), but it will no longer receive netlink notifications for the equivalent of ndo_open and ndo_stop. This is problematic.
Re: cross namespace interface notification for tun devices
On Tue, Sep 19, 2017 at 2:02 PM, Jason A. Donenfeld wrote:
> On Tue, Sep 19, 2017 at 10:40 PM, Cong Wang wrote:
>> By "notification" I assume you mean netlink notification.
>
> Yes, netlink notification.
>
>> The question is why does the process in A still care about
>> the device sitting in B?
>>
>> Also, the process should be able to receive a last notification
>> on IFF_UP|IFF_RUNNING before device is finally moved to B.
>> After this point, it should not have any relation to netns A
>> any more, like the device were completely gone.
>
> That's very clearly not the case with a tun device. Tun devices work
> by letting a userspace process control the inputs (ndo_start_xmit) and
> outputs (netif_rx) of the actual network device. This controlling
> userspace process needs to know when its own interface that it
> controls goes up and down. In the kernel, we can do this by just
> checking dev->flags&IFF_UP, and receive notifications on ndo_open and
> ndo_stop. In userspace, the controlling process loses the ability to
> receive notifications like ndo_open/ndo_stop when the interface is
> moved to a new namespace. After the interface is moved to a namespace,
> the process will still control inputs and outputs (ndo_start_xmit and
> netif_rx), but it will no longer receive netlink notifications for the
> equivalent of ndo_open and ndo_stop. This is problematic.

Sounds like we should set NETIF_F_NETNS_LOCAL for tun devices.

What is your legitimate use case for sending/receiving packets to/from a tun device in a different netns?
Re: cross namespace interface notification for tun devices
On Wed, 2017-09-20 at 11:29 -0700, Cong Wang wrote:
> On Tue, Sep 19, 2017 at 2:02 PM, Jason A. Donenfeld wrote:
> > On Tue, Sep 19, 2017 at 10:40 PM, Cong Wang wrote:
> > > By "notification" I assume you mean netlink notification.
> >
> > Yes, netlink notification.
> >
> > > The question is why does the process in A still care about
> > > the device sitting in B?
> > >
> > > Also, the process should be able to receive a last notification
> > > on IFF_UP|IFF_RUNNING before device is finally moved to B.
> > > After this point, it should not have any relation to netns A
> > > any more, like the device were completely gone.
> >
> > That's very clearly not the case with a tun device. Tun devices work
> > by letting a userspace process control the inputs (ndo_start_xmit) and
> > outputs (netif_rx) of the actual network device. This controlling
> > userspace process needs to know when its own interface that it
> > controls goes up and down. In the kernel, we can do this by just
> > checking dev->flags&IFF_UP, and receive notifications on ndo_open and
> > ndo_stop. In userspace, the controlling process loses the ability to
> > receive notifications like ndo_open/ndo_stop when the interface is
> > moved to a new namespace. After the interface is moved to a namespace,
> > the process will still control inputs and outputs (ndo_start_xmit and
> > netif_rx), but it will no longer receive netlink notifications for the
> > equivalent of ndo_open and ndo_stop. This is problematic.
>
> Sounds like we should set NETIF_F_NETNS_LOCAL for tun
> device.
>
> What is your legitimate use case of send/receive packet to/from
> a tun device in a different netns?

One thought: run openvpn in the master netns, but put its tun0 interface into an application's netns. Per-application VPN, essentially?

Or maybe that's not how people do this kind of thing, but it's a thought.

Dan
Re: cross namespace interface notification for tun devices
On Wed, Sep 20, 2017 at 8:29 PM, Cong Wang wrote:
> Sounds like we should set NETIF_F_NETNS_LOCAL for tun
> device.

Absolutely do not do this under any circumstances. This would be a regression and would break API compatibility. As I wrote in my first email, it's already possible to sleep-loop for that information using the tun device's fd; I'm just looking for a better event-based approach.

> What is your legitimate use case of send/receive packet to/from
> a tun device in a different netns?

Because sometimes it's very nice to be able to move network interfaces that use tun devices into different namespaces, for some xnamespace proxying. What Dan described in the email he just sent is exactly this use case. In WireGuard (a kernel thing), I have facilities for this -- https://www.wireguard.com/netns/ . Now I'm working on the userspace version and would like to expose the same utility.

Anyway, the purpose of me sending this message to the list was not to question the "legitimacy" of my application usage, but rather to elicit feedback on two specific things:

1. to determine if there's already a mechanism in place for this that I've overlooked; and
2. to determine particularities of me implementing a mechanism, if it's not already there.

I'm slightly more convinced that there isn't currently a mechanism for this. It seems like the easiest way, therefore, would be some kind of control message that could be poll'd for, using the existing per-process fd. That way there wouldn't be any violations of the current namespace situation, yet processes could still get event notifications as needed.