Re: l3mdev: Support for sockets bound to enslaved device
On 8/8/17 5:29 AM, Ondrej Zajicek wrote: > Hi > > We noticed that TCP outgoing sockets that are bound to enslaved devices > by SO_BINDTODEVICE fail to connect, while they work when bound to vrf > device instead. We noticed similar behavior on ICMP (i.e. ping -I vrf0 > worked, while ping -I eth0 does not if eth0 is enslaved to vrf0). On > the contrary, UDP outgoing sockets bound to enslaved devices worked > as expected. We tested that on Linux 4.9.30-2 from Debian. > > I found your patchset 'Support for sockets bound to enslaved device' > ( http://www.spinics.net/lists/netdev/msg448040.html ), which seems > to be related to the issue, but the description mentions services, > i.e. listening sockets. Does the patchset (or some other one) fixes > the issue also for outgoing sockets? > Yes, it works for both directions.
Re: l3mdev: Support for sockets bound to enslaved device
Hi We noticed that TCP outgoing sockets that are bound to enslaved devices by SO_BINDTODEVICE fail to connect, while they work when bound to vrf device instead. We noticed similar behavior on ICMP (i.e. ping -I vrf0 worked, while ping -I eth0 does not if eth0 is enslaved to vrf0). On the contrary, UDP outgoing sockets bound to enslaved devices worked as expected. We tested that on Linux 4.9.30-2 from Debian. I found your patchset 'Support for sockets bound to enslaved device' ( http://www.spinics.net/lists/netdev/msg448040.html ), which seems to be related to the issue, but the description mentions services, i.e. listening sockets. Does the patchset (or some other one) fixes the issue also for outgoing sockets? -- Elen sila lumenn' omentielvo Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org) OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net) "To err is human -- to blame it on a computer is even more so."
Re: [PATCH v3 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
On 8/7/17 12:39 PM, David Miller wrote: > Series applied, let's see if it builds this time :-) I did an allyesconfig build before sending just to make sure, so our mileage better not vary.
Re: [PATCH v3 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
From: David Ahern Date: Mon, 7 Aug 2017 08:44:15 -0700 > A missing piece to the VRF puzzle is the ability to bind sockets to > devices enslaved to a VRF. This patch set adds the enslaved device > index, sdif, to IPv4 and IPv6 socket lookups. The end result for users > is the following scope options for services: > > 1. "global" services - sockets not bound to any device > >Allows 1 service to work across all network interfaces with >connected sockets bound to the VRF the connection originates >(Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and > net.ipv4.udp_l3mdev_accept=1 for UDP) > > 2. "VRF" local services - sockets bound to a VRF > >Sockets work across all network interfaces enslaved to a VRF but >are limited to just the one VRF. > > 3. "device" services - sockets bound to a specific network interface > >Service works only through the one specific interface. > > v3 > - convert __inet_lookup_established in dccp_v4_err; missed in v2 > > v2 > - remove sk_lookup struct and add sdif as an argument to existing > functions > > Changes since RFC: > - no significant logic changes; mainly whitespace cleanups Series applied, let's see if it builds this time :-)
[PATCH v3 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
A missing piece to the VRF puzzle is the ability to bind sockets to devices enslaved to a VRF. This patch set adds the enslaved device index, sdif, to IPv4 and IPv6 socket lookups. The end result for users is the following scope options for services: 1. "global" services - sockets not bound to any device Allows 1 service to work across all network interfaces with connected sockets bound to the VRF the connection originates (Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and net.ipv4.udp_l3mdev_accept=1 for UDP) 2. "VRF" local services - sockets bound to a VRF Sockets work across all network interfaces enslaved to a VRF but are limited to just the one VRF. 3. "device" services - sockets bound to a specific network interface Service works only through the one specific interface. v3 - convert __inet_lookup_established in dccp_v4_err; missed in v2 v2 - remove sk_lookup struct and add sdif as an argument to existing functions Changes since RFC: - no significant logic changes; mainly whitespace cleanups David Ahern (7): net: ipv4: add second dif to udp socket lookups net: ipv4: add second dif to inet socket lookups net: ipv4: add second dif to raw socket lookups net: ipv4: add second dif to multicast source filter net: ipv6: add second dif to udp socket lookups net: ipv6: add second dif to inet6 socket lookups net: ipv6: add second dif to raw socket lookups include/linux/igmp.h | 3 +- include/linux/ipv6.h | 10 +++ include/net/inet6_hashtables.h | 22 -- include/net/inet_hashtables.h | 31 +++- include/net/ip.h | 10 +++ include/net/raw.h | 2 +- include/net/rawv6.h| 2 +- include/net/tcp.h | 20 + include/net/udp.h | 4 +-- net/dccp/ipv4.c| 4 +-- net/dccp/ipv6.c| 4 +-- net/ipv4/igmp.c| 6 ++-- net/ipv4/inet_hashtables.c | 27 ++--- net/ipv4/raw.c | 18 net/ipv4/raw_diag.c| 4 +-- net/ipv4/tcp_ipv4.c| 13 + net/ipv4/udp.c | 66 -- net/ipv4/udp_diag.c| 10 +++ net/ipv6/inet6_hashtables.c| 28 +++--- net/ipv6/raw.c | 13 + net/ipv6/tcp_ipv6.c| 13 + net/ipv6/udp.c | 47 -- net/netfilter/xt_TPROXY.c | 6 ++-- 23 files changed, 228 insertions(+), 135 deletions(-) -- 2.1.4
Re: [PATCH v2 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
On 8/7/17 12:51 AM, David Miller wrote: > David, I had to revert. You didn't convert dccp which breaks > the build. > > net/dccp/ipv4.c: In function ‘dccp_v4_err’: > net/dccp/ipv4.c:256:7: error: too few arguments to function > ‘__inet_lookup_established’ > sk = __inet_lookup_established(net, &dccp_hashinfo, >^ > In file included from net/dccp/ipv4.c:22:0: > ./include/net/inet_hashtables.h:289:14: note: declared here > struct sock *__inet_lookup_established(struct net *net, > ^ That's embarrassing. Will fix
Re: [PATCH v2 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
From: David Miller Date: Sun, 06 Aug 2017 21:39:38 -0700 (PDT) > From: David Ahern > Date: Fri, 4 Aug 2017 13:16:56 -0700 > >> A missing piece to the VRF puzzle is the ability to bind sockets to >> devices enslaved to a VRF. This patch set adds the enslaved device >> index, sdif, to IPv4 and IPv6 socket lookups. The end result for users >> is the following scope options for services: >> >> 1. "global" services - sockets not bound to any device >> >>Allows 1 service to work across all network interfaces with >>connected sockets bound to the VRF the connection originates >>(Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and >> net.ipv4.udp_l3mdev_accept=1 for UDP) >> >> 2. "VRF" local services - sockets bound to a VRF >> >>Sockets work across all network interfaces enslaved to a VRF but >>are limited to just the one VRF. >> >> 3. "device" services - sockets bound to a specific network interface >> >>Service works only through the one specific interface. >> >> v2 >> - remove sk_lookup struct and add sdif as an argument to existing >> functions >> >> Changes since RFC: >> - no significant logic changes; mainly whitespace cleanups > > Series applied, thanks David. David, I had to revert. You didn't convert dccp which breaks the build. net/dccp/ipv4.c: In function ‘dccp_v4_err’: net/dccp/ipv4.c:256:7: error: too few arguments to function ‘__inet_lookup_established’ sk = __inet_lookup_established(net, &dccp_hashinfo, ^ In file included from net/dccp/ipv4.c:22:0: ./include/net/inet_hashtables.h:289:14: note: declared here struct sock *__inet_lookup_established(struct net *net, ^ Thanks.
Re: [PATCH v2 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
From: David Ahern Date: Fri, 4 Aug 2017 13:16:56 -0700 > A missing piece to the VRF puzzle is the ability to bind sockets to > devices enslaved to a VRF. This patch set adds the enslaved device > index, sdif, to IPv4 and IPv6 socket lookups. The end result for users > is the following scope options for services: > > 1. "global" services - sockets not bound to any device > >Allows 1 service to work across all network interfaces with >connected sockets bound to the VRF the connection originates >(Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and > net.ipv4.udp_l3mdev_accept=1 for UDP) > > 2. "VRF" local services - sockets bound to a VRF > >Sockets work across all network interfaces enslaved to a VRF but >are limited to just the one VRF. > > 3. "device" services - sockets bound to a specific network interface > >Service works only through the one specific interface. > > v2 > - remove sk_lookup struct and add sdif as an argument to existing > functions > > Changes since RFC: > - no significant logic changes; mainly whitespace cleanups Series applied, thanks David.
Re: [PATCH net-next 00/10] net: l3mdev: Support for sockets bound to enslaved device
On 8/1/17 6:41 PM, David Miller wrote: > From: David Ahern > Date: Mon, 31 Jul 2017 20:13:16 -0700 > >> Existing code for socket lookups already pass in 6+ arguments. Rather >> than add another for the enslaved device index, the existing lookups >> are converted to use a new sk_lookup struct. From there, the enslaved >> device index becomes another element of the struct. > > Sorry, not gonna happen :-) > > I know it's difficult, but maybe we should think about why we're > passing so much crap into each lookup.\ The 'crap' is essential data to find a socket -- the hash table (common lookup functions for multiple backends), destination address, dest port, source address, source port and device index make 6 parameters to be matched. The socket data is what I consolidated into 1 struct. Ultimately that data has to make its way from high level wrappers to compute_score and INET{6}_MATCH at the very bottom. > > And perhaps, why it can't (for example) be constituted in the lookup > function itself given sufficient (relevant) context. There are several contexts for lookups -- ingress packets (ipv4 and v6, tcp, udp, updlite, dccp), error paths for those protocols, netfilter and socket diagnostic API. The 5 socket parameters do not have a consistent storage across those contexts. Further there are several layers of wrappers to what are the real lookup functions, each wrapper abstracting some piece of data. The call heirarchy for inet lookups for example: tcp_v4_early_demux __inet_lookup_established tcp_v4_rcv / dccp_v4_rcv __inet_lookup_skb __inet_lookup __inet_lookup_listener __inet_lookup_established tcp_v4_rcv - TCP_TW_SYN inet_lookup_listener __inet_lookup_listener nf_tproxy_get_sock_v4 inet_lookup_listener __inet_lookup_listener inet_lookup_established __inet_lookup_established inet_diag_find_one_icsk / nf_socket_get_sock_v4 inet_lookup __inet_lookup __inet_lookup_listener __inet_lookup_established tcp_v4_send_reset - MD5 path __inet_lookup_listener tcp_v4_err / dccp_v4_err __inet_lookup_established > > I think passing a big struct into the lookups, by reference, is a big > step backwards. > > For one thing, if you pass it by pointer then the compiler can't > potentially pass parts in registers even if it could. However > if you pass it by value, that's actually a possibility. > > But I'd like to avoid this on-stack blob altogether if possible. Just sent a v2 that just adds sdif to the existing functions.
[PATCH v2 net-next 0/7] net: l3mdev: Support for sockets bound to enslaved device
A missing piece to the VRF puzzle is the ability to bind sockets to devices enslaved to a VRF. This patch set adds the enslaved device index, sdif, to IPv4 and IPv6 socket lookups. The end result for users is the following scope options for services: 1. "global" services - sockets not bound to any device Allows 1 service to work across all network interfaces with connected sockets bound to the VRF the connection originates (Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and net.ipv4.udp_l3mdev_accept=1 for UDP) 2. "VRF" local services - sockets bound to a VRF Sockets work across all network interfaces enslaved to a VRF but are limited to just the one VRF. 3. "device" services - sockets bound to a specific network interface Service works only through the one specific interface. v2 - remove sk_lookup struct and add sdif as an argument to existing functions Changes since RFC: - no significant logic changes; mainly whitespace cleanups David Ahern (7): net: ipv4: add second dif to udp socket lookups net: ipv4: add second dif to inet socket lookups net: ipv4: add second dif to raw socket lookups net: ipv4: add second dif to multicast source filter net: ipv6: add second dif to udp socket lookups net: ipv6: add second dif to inet6 socket lookups net: ipv6: add second dif to raw socket lookups include/linux/igmp.h | 3 +- include/linux/ipv6.h | 10 +++ include/net/inet6_hashtables.h | 22 -- include/net/inet_hashtables.h | 31 +++- include/net/ip.h | 10 +++ include/net/raw.h | 2 +- include/net/rawv6.h| 2 +- include/net/tcp.h | 20 + include/net/udp.h | 4 +-- net/dccp/ipv4.c| 2 +- net/dccp/ipv6.c| 4 +-- net/ipv4/igmp.c| 6 ++-- net/ipv4/inet_hashtables.c | 27 ++--- net/ipv4/raw.c | 18 net/ipv4/raw_diag.c| 4 +-- net/ipv4/tcp_ipv4.c| 13 + net/ipv4/udp.c | 66 -- net/ipv4/udp_diag.c| 10 +++ net/ipv6/inet6_hashtables.c| 28 +++--- net/ipv6/raw.c | 13 + net/ipv6/tcp_ipv6.c| 13 + net/ipv6/udp.c | 47 -- net/netfilter/xt_TPROXY.c | 6 ++-- 23 files changed, 227 insertions(+), 134 deletions(-) -- 2.1.4
Re: [PATCH net-next 00/10] net: l3mdev: Support for sockets bound to enslaved device
From: David Ahern Date: Mon, 31 Jul 2017 20:13:16 -0700 > Existing code for socket lookups already pass in 6+ arguments. Rather > than add another for the enslaved device index, the existing lookups > are converted to use a new sk_lookup struct. From there, the enslaved > device index becomes another element of the struct. Sorry, not gonna happen :-) I know it's difficult, but maybe we should think about why we're passing so much crap into each lookup. And perhaps, why it can't (for example) be constituted in the lookup function itself given sufficient (relevant) context. I think passing a big struct into the lookups, by reference, is a big step backwards. For one thing, if you pass it by pointer then the compiler can't potentially pass parts in registers even if it could. However if you pass it by value, that's actually a possibility. But I'd like to avoid this on-stack blob altogether if possible. Thanks.
Re: [PATCH net-next 00/10] net: l3mdev: Support for sockets bound to enslaved device
On 8/1/17 8:15 AM, David Laight wrote: > From: David Ahern >> Sent: 01 August 2017 04:13 > ... >> Existing code for socket lookups already pass in 6+ arguments. Rather >> than add another for the enslaved device index, the existing lookups >> are converted to use a new sk_lookup struct. From there, the enslaved >> device index becomes another element of the struct. >> >> Patch 1 introduces sk_lookup struct and helper. > > I guess that socket lookup happens quite often! > Passing the lookup parameters in a structure might have a > measurable negative effect on performance - especially if the > structure isn't passed through to other functions. > > Have you made any performance mearurements? Before patches: IPv4 Test TCP_RR 23769.42 23862.59 23867.69 sum 71499.70 avg 23833 Test TCP_CRR8649.29 8650.94 8661.24 sum 25961.47 avg 8653 Test UDP_RR 26935.38 26813.30 26747.88 sum 80496.56 avg 26832 IPv6 Test TCP_RR 24708.10 24629.43 24593.75 sum 73931.28 avg 24643 Test TCP_CRR8432.82 8489.26 8474.82 sum 25396.90 avg 8465 Test UDP_RR 23607.57 23722.37 23713.80 sum 71043.74 avg 23681 # After patches: IPv4 Test TCP_RR 24204.41 23993.05 24129.18 sum 72326.64 avg 24108 Test TCP_CRR8690.31 8630.12 8620.88 sum 25941.31 avg 8647 Test UDP_RR 26653.26 26725.76 26587.70 sum 79966.72 avg 26655 IPv6 Test TCP_RR 24807.54 24698.30 24849.84 sum 74355.68 avg 24785 Test TCP_CRR8573.22 8640.02 8624.09 sum 25837.33 avg 8612 Test UDP_RR 23800.14 23747.01 23814.94 sum 71362.09 avg 23787 The middle columns are the results of each 30-second run and then the average of the 3 is on the end.
RE: [PATCH net-next 00/10] net: l3mdev: Support for sockets bound to enslaved device
From: David Ahern > Sent: 01 August 2017 04:13 ... > Existing code for socket lookups already pass in 6+ arguments. Rather > than add another for the enslaved device index, the existing lookups > are converted to use a new sk_lookup struct. From there, the enslaved > device index becomes another element of the struct. > > Patch 1 introduces sk_lookup struct and helper. I guess that socket lookup happens quite often! Passing the lookup parameters in a structure might have a measurable negative effect on performance - especially if the structure isn't passed through to other functions. Have you made any performance mearurements? David
[PATCH net-next 00/10] net: l3mdev: Support for sockets bound to enslaved device
A missing piece to the VRF puzzle is the ability to bind sockets to devices enslaved to a VRF. This patch set adds the enslaved device index, sdif, to IPv4 and IPv6 socket lookups. The end result for users is the following scope options for services: 1. "global" services - sockets not bound to any device Allows 1 service to work across all network interfaces with connected sockets bound to the VRF the connection originates (Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and net.ipv4.udp_l3mdev_accept=1 for UDP) 2. "VRF" local services - sockets bound to a VRF Sockets work across all network interfaces enslaved to a VRF but are limited to just the one VRF. 3. "device" services - sockets bound to a specific network interface Service works only through the one specific interface. Existing code for socket lookups already pass in 6+ arguments. Rather than add another for the enslaved device index, the existing lookups are converted to use a new sk_lookup struct. From there, the enslaved device index becomes another element of the struct. Patch 1 introduces sk_lookup struct and helper. Patches 2-4 convert udp, inet and socket lookups for IPv4 to use the new sk_lookup struct. Meant to be a conversion of IPv4 code only; no functional change intended. Patches 5-7 convert udp, inet and socket lookups for IPv6 to use the new sk_lookup struct. Meant to be a conversion of IPv6 code only; no functional change intended. Patch 8 adds sdif to the sk_lookup struct allowing lookups to consider a second device index. Patches 9-10 add support for the enslaved device index to ipv4 and ipv6 socket lookups. Changes since RFC: - no significant logic changes; mainly whitespace cleanups David Ahern (10): net: Add sk_lookup struct and helper net: ipv4: Convert udp socket lookups to new struct net: ipv4: Convert inet socket lookups to new struct net: ipv4: Convert raw sockets to sk_lookup net: ipv6: Convert udp socket lookups to new struct net: ipv6: Convert inet socket lookups to new struct net: ipv6: Convert raw sockets to sk_lookup net: Add sdif to sk_lookup net: ipv4: Support for sockets bound to enslaved device net: ipv6: Support for sockets bound to enslaved device include/linux/igmp.h| 3 +- include/linux/ipv6.h| 8 ++ include/net/inet6_hashtables.h | 44 - include/net/inet_hashtables.h | 67 ++--- include/net/ip.h| 10 ++ include/net/raw.h | 3 +- include/net/rawv6.h | 3 +- include/net/sock.h | 42 + include/net/tcp.h | 17 include/net/udp.h | 18 +--- net/dccp/ipv4.c | 19 +++- net/dccp/ipv6.c | 22 +++-- net/ipv4/igmp.c | 6 +- net/ipv4/inet_diag.c| 50 +++--- net/ipv4/inet_hashtables.c | 59 +++- net/ipv4/netfilter/nf_socket_ipv4.c | 16 +++- net/ipv4/raw.c | 77 +-- net/ipv4/raw_diag.c | 30 -- net/ipv4/tcp_ipv4.c | 64 + net/ipv4/udp.c | 175 ++ net/ipv4/udp_diag.c | 89 -- net/ipv6/inet6_hashtables.c | 75 --- net/ipv6/netfilter/nf_socket_ipv6.c | 16 +++- net/ipv6/raw.c | 44 + net/ipv6/tcp_ipv6.c | 63 + net/ipv6/udp.c | 181 net/netfilter/xt_TPROXY.c | 39 +--- 27 files changed, 759 insertions(+), 481 deletions(-) -- 2.1.4
[RFC PATCH 00/10] net: l3mdev: Support for sockets bound to enslaved device
A missing piece to the VRF puzzle is the ability to bind sockets to devices enslaved to a VRF. This patch set adds the enslaved device index, sdif, to IPv4 and IPv6 socket lookups. The end result for users is the following scope options for services: 1. "global" services - sockets not bound to any device Allows 1 service to work across all network interfaces with connected sockets bound to the VRF the connection originates (Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and net.ipv4.udp_l3mdev_accept=1 for UDP) 2. "VRF" local services - sockets bound to a VRF Sockets work across all network interfaces enslaved to a VRF but are limited to just the one VRF. 3. "device" services - sockets bound to a specific network interface Service works only through the one specific interface. Existing code for socket lookups already pass in 6+ arguments. Rather than add another for the enslaved device index, the existing lookups are converted to use a new sk_lookup struct. From there, the enslaved device index becomes another element of the struct. Patch 1 introduces sk_lookup struct and helper. Patches 2-4 convert udp, inet and socket lookups for IPv4 to use the new sk_lookup struct. Meant to be a conversion of IPv4 code only; no functional change intended. Patches 5-7 convert udp, inet and socket lookups for IPv6 to use the new sk_lookup struct. Meant to be a conversion of IPv6 code only; no functional change intended. Patch 8 adds sdif to the sk_lookup struct allowing lookups to consider a second device index. Patches 9-10 add support for the enslaved device index to ipv4 and ipv6 socket lookups. David Ahern (10): net: Add sk_lookup struct and helper net: ipv4: Convert udp socket lookups to new struct net: ipv4: Convert inet socket lookups to new struct net: ipv4: Convert raw sockets to sk_lookup net: ipv6: Convert udp socket lookups to new struct net: ipv6: Convert inet socket lookups to new struct net: ipv6: Convert raw sockets to sk_lookup net: Add sdif to sk_lookup net: ipv4: Support for sockets bound to enslaved device net: ipv6: Support for sockets bound to enslaved device include/linux/igmp.h| 3 +- include/linux/ipv6.h| 8 ++ include/net/inet6_hashtables.h | 44 - include/net/inet_hashtables.h | 67 ++--- include/net/ip.h| 10 ++ include/net/raw.h | 3 +- include/net/rawv6.h | 3 +- include/net/sock.h | 42 + include/net/tcp.h | 17 include/net/udp.h | 18 +--- net/dccp/ipv4.c | 19 +++- net/dccp/ipv6.c | 22 +++-- net/ipv4/igmp.c | 6 +- net/ipv4/inet_diag.c| 50 +++--- net/ipv4/inet_hashtables.c | 56 ++- net/ipv4/netfilter/nf_socket_ipv4.c | 16 +++- net/ipv4/raw.c | 77 +-- net/ipv4/raw_diag.c | 30 -- net/ipv4/tcp_ipv4.c | 64 + net/ipv4/udp.c | 175 ++ net/ipv4/udp_diag.c | 89 -- net/ipv6/inet6_hashtables.c | 72 +++--- net/ipv6/netfilter/nf_socket_ipv6.c | 16 +++- net/ipv6/raw.c | 44 + net/ipv6/tcp_ipv6.c | 63 + net/ipv6/udp.c | 181 net/netfilter/xt_TPROXY.c | 39 +--- 27 files changed, 754 insertions(+), 480 deletions(-) -- 2.1.4