Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Sat, Apr 9, 2016 at 11:02 AM, Eric Dumazetwrote: > On Sat, 2016-04-09 at 10:36 -0700, Alexander Duyck wrote: > >> For next version I plan to include a check in netif_skb_features that >> will clear NETIF_F_TSO_MANGLEID if the DF bit is not set. > > So it looks like we slowly but surely make the whole stack damn slow, to > support some extra features. > > How many instructions are needed nowadays for netif_skb_features() > alone ? > > More than 120 if I am not mistaken, if we count fast path in > skb_network_protocol() > > Note: I have a patch to remove the gso_min_segs thing, currently unused. > > 2 instructions saved ! Woo-hoo ! I was planning to drop the checks for TSO_MANGLEID and GSO_PARTIAL in a if statment based on skb_is_gso() since I had both GSO_PARTIAL to check for as well. - Alex
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Sat, 2016-04-09 at 10:36 -0700, Alexander Duyck wrote: > For next version I plan to include a check in netif_skb_features that > will clear NETIF_F_TSO_MANGLEID if the DF bit is not set. So it looks like we slowly but surely make the whole stack damn slow, to support some extra features. How many instructions are needed nowadays for netif_skb_features() alone ? More than 120 if I am not mistaken, if we count fast path in skb_network_protocol() Note: I have a patch to remove the gso_min_segs thing, currently unused. 2 instructions saved ! Woo-hoo !
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Sat, Apr 9, 2016 at 8:52 AM, Jesse Grosswrote: > On Fri, Apr 8, 2016 at 7:04 PM, Alexander Duyck > wrote: >> On Fri, Apr 8, 2016 at 2:40 PM, Jesse Gross wrote: >>> Maybe I missed it but I didn't see any checks for the DF bit being set >>> when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am >>> comfortable mangling my IDs in the DF case, I don't think this would >>> ever extend to non-DF packets. In the documentation you noted that it >>> is the driver's responsibility to do this check but I couldn't find it >>> in either ixgbe or igb. It would also be nice if the core stack could >>> enforce it somehow as well rather than each driver. >> >> Yeah I had glossed over that in the igb and ixgbe patches. A check is >> only really needed for the incrementing to non-incrementing case and I >> wasn't sure how common it was to have TCP with an IP header that >> didn't set the DF bit. In the case of the outer headers igb and ixgbe >> will increment the IP ID always so we don't have to worry about if DF >> is set of not there. For the inner headers I had fudged it a bit and >> didn't add the validation. If needed I can see about adding that >> shortly. > > TCP without the DF bit set is not the default but it is possible (it > can be enabled by setting /proc/sys/net/ipv4/ip_no_pmtu_disc). I also > did a quick check of some Internet services and at least some of them > seem to return TCP without DF, so it's not too rare. For next version I plan to include a check in netif_skb_features that will clear NETIF_F_TSO_MANGLEID if the DF bit is not set. - Alex
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Fri, Apr 8, 2016 at 7:04 PM, Alexander Duyckwrote: > On Fri, Apr 8, 2016 at 2:40 PM, Jesse Gross wrote: >> Maybe I missed it but I didn't see any checks for the DF bit being set >> when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am >> comfortable mangling my IDs in the DF case, I don't think this would >> ever extend to non-DF packets. In the documentation you noted that it >> is the driver's responsibility to do this check but I couldn't find it >> in either ixgbe or igb. It would also be nice if the core stack could >> enforce it somehow as well rather than each driver. > > Yeah I had glossed over that in the igb and ixgbe patches. A check is > only really needed for the incrementing to non-incrementing case and I > wasn't sure how common it was to have TCP with an IP header that > didn't set the DF bit. In the case of the outer headers igb and ixgbe > will increment the IP ID always so we don't have to worry about if DF > is set of not there. For the inner headers I had fudged it a bit and > didn't add the validation. If needed I can see about adding that > shortly. TCP without the DF bit set is not the default but it is possible (it can be enabled by setting /proc/sys/net/ipv4/ip_no_pmtu_disc). I also did a quick check of some Internet services and at least some of them seem to return TCP without DF, so it's not too rare.
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Fri, Apr 8, 2016 at 2:40 PM, Jesse Grosswrote: > On Thu, Apr 7, 2016 at 8:52 PM, Alexander Duyck > wrote: >> Just a thought. What if I replaced NETIF_F_TSO_FIXEDID with something >> that meant we could mange the IP ID like a NETIF_F_TSO_IPID_MANGLE >> (advice for better name welcome). Instead of the feature flag meaning >> we are going to transmit packets with a fixed ID it would mean we >> don't care about the ID and are free to mangle it as we see fit. The >> GSO type can retain the same meaning as far as that requiring the same >> ID for all, but the feature would mean we will take fixed and convert >> it to incrementing, or incrementing and convert it to fixed. > > I saw the new version of the code that you posted with this idea and > now that I understand it better, it seems like a reasonable choice to > me - it's nice that it is consistent with GRO and not tunnel specific. > It also makes behavior consistent across drivers in regards to > incrementing IDs in the default case, which was one of my concerns > from before. > > Maybe I missed it but I didn't see any checks for the DF bit being set > when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am > comfortable mangling my IDs in the DF case, I don't think this would > ever extend to non-DF packets. In the documentation you noted that it > is the driver's responsibility to do this check but I couldn't find it > in either ixgbe or igb. It would also be nice if the core stack could > enforce it somehow as well rather than each driver. Yeah I had glossed over that in the igb and ixgbe patches. A check is only really needed for the incrementing to non-incrementing case and I wasn't sure how common it was to have TCP with an IP header that didn't set the DF bit. In the case of the outer headers igb and ixgbe will increment the IP ID always so we don't have to worry about if DF is set of not there. For the inner headers I had fudged it a bit and didn't add the validation. If needed I can see about adding that shortly. - Alex
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Thu, Apr 7, 2016 at 8:52 PM, Alexander Duyckwrote: > Just a thought. What if I replaced NETIF_F_TSO_FIXEDID with something > that meant we could mange the IP ID like a NETIF_F_TSO_IPID_MANGLE > (advice for better name welcome). Instead of the feature flag meaning > we are going to transmit packets with a fixed ID it would mean we > don't care about the ID and are free to mangle it as we see fit. The > GSO type can retain the same meaning as far as that requiring the same > ID for all, but the feature would mean we will take fixed and convert > it to incrementing, or incrementing and convert it to fixed. I saw the new version of the code that you posted with this idea and now that I understand it better, it seems like a reasonable choice to me - it's nice that it is consistent with GRO and not tunnel specific. It also makes behavior consistent across drivers in regards to incrementing IDs in the default case, which was one of my concerns from before. Maybe I missed it but I didn't see any checks for the DF bit being set when we transmit a packet with NETIF_F_TSO_MANGLEID. Even if I am comfortable mangling my IDs in the DF case, I don't think this would ever extend to non-DF packets. In the documentation you noted that it is the driver's responsibility to do this check but I couldn't find it in either ixgbe or igb. It would also be nice if the core stack could enforce it somehow as well rather than each driver.
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Thu, Apr 7, 2016 at 4:22 PM, Jesse Grosswrote: > On Thu, Apr 7, 2016 at 7:32 PM, Alexander Duyck wrote: >> This patch adds support for a feature I am calling IP ID mangling. It is >> basically just another way of saying the IP IDs that are transmitted by the >> tunnel may not match up with what would normally be expected. Specifically >> what will happen is in the case of TSO the IP IDs on the headers will be a >> fixed value so a given TSO will repeat the same inner IP ID value gso_segs >> number of times. >> >> Signed-off-by: Alexander Duyck > > If I'm understanding this correctly, enabling IP ID mangling will help > performance on ixgbe since it will allow it to do GSO partial instead > of plain GSO but it will hurt performance on i40e since it will drop > from TSO to plain GSO. Right. However the option is currently defaulted to off, and can be enabled per tunnel endpoint. So if you had an ixgbe to i40e link you could enable it on the end with the ixgbe and you should see good performance in both directions. > Assuming that's right, it seems like it will make it hard to chose the > right setting without knowledge of which hardware is in use. I guess > what we really want is "I care about nicely incrementing IP IDs" vs. > "I don't care as long as the DF bit is set". That second case is > really what this flag is trying to say but it seems like it is > enforcing too much in the i40e case - I don't think anyone wants to go > out of their way to make IP IDs jump around if incrementing is faster. Right. The problem is trying to sort out all the GRO/GSO bits. I was probably being a bit too conservative after the last few iterations for the GRO fixes. Just a thought. What if I replaced NETIF_F_TSO_FIXEDID with something that meant we could mange the IP ID like a NETIF_F_TSO_IPID_MANGLE (advice for better name welcome). Instead of the feature flag meaning we are going to transmit packets with a fixed ID it would mean we don't care about the ID and are free to mangle it as we see fit. The GSO type can retain the same meaning as far as that requiring the same ID for all, but the feature would mean we will take fixed and convert it to incrementing, or incrementing and convert it to fixed. - Alex
Re: [RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
On Thu, Apr 7, 2016 at 7:32 PM, Alexander Duyckwrote: > This patch adds support for a feature I am calling IP ID mangling. It is > basically just another way of saying the IP IDs that are transmitted by the > tunnel may not match up with what would normally be expected. Specifically > what will happen is in the case of TSO the IP IDs on the headers will be a > fixed value so a given TSO will repeat the same inner IP ID value gso_segs > number of times. > > Signed-off-by: Alexander Duyck If I'm understanding this correctly, enabling IP ID mangling will help performance on ixgbe since it will allow it to do GSO partial instead of plain GSO but it will hurt performance on i40e since it will drop from TSO to plain GSO. Assuming that's right, it seems like it will make it hard to chose the right setting without knowledge of which hardware is in use. I guess what we really want is "I care about nicely incrementing IP IDs" vs. "I don't care as long as the DF bit is set". That second case is really what this flag is trying to say but it seems like it is enforcing too much in the i40e case - I don't think anyone wants to go out of their way to make IP IDs jump around if incrementing is faster.
[RFC PATCH 07/11] GENEVE: Add option to mangle IP IDs on inner headers when using TSO
This patch adds support for a feature I am calling IP ID mangling. It is basically just another way of saying the IP IDs that are transmitted by the tunnel may not match up with what would normally be expected. Specifically what will happen is in the case of TSO the IP IDs on the headers will be a fixed value so a given TSO will repeat the same inner IP ID value gso_segs number of times. Signed-off-by: Alexander Duyck--- drivers/net/geneve.c | 24 ++-- include/net/udp_tunnel.h |8 include/uapi/linux/if_link.h |1 + 3 files changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c index bc168894bda3..6352223d80c3 100644 --- a/drivers/net/geneve.c +++ b/drivers/net/geneve.c @@ -80,6 +80,7 @@ struct geneve_dev { #define GENEVE_F_UDP_ZERO_CSUM_TX BIT(0) #define GENEVE_F_UDP_ZERO_CSUM6_TX BIT(1) #define GENEVE_F_UDP_ZERO_CSUM6_RX BIT(2) +#define GENEVE_F_TCP_FIXEDID BIT(3) struct geneve_sock { boolcollect_md; @@ -702,9 +703,14 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb, int min_headroom; int err; bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM_TX); + int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL; skb_scrub_packet(skb, xnet); + if ((flags & GENEVE_F_TCP_FIXEDID) && skb_is_gso(skb) && + (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)) + type |= SKB_GSO_TCP_FIXEDID; + min_headroom = LL_RESERVED_SPACE(rt->dst.dev) + rt->dst.header_len + GENEVE_BASE_HLEN + opt_len + sizeof(struct iphdr); err = skb_cow_head(skb, min_headroom); @@ -713,7 +719,7 @@ static int geneve_build_skb(struct rtable *rt, struct sk_buff *skb, goto free_rt; } - skb = udp_tunnel_handle_offloads(skb, udp_sum); + skb = iptunnel_handle_offloads(skb, type); if (IS_ERR(skb)) { err = PTR_ERR(skb); goto free_rt; @@ -739,9 +745,14 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb, int min_headroom; int err; bool udp_sum = !(flags & GENEVE_F_UDP_ZERO_CSUM6_TX); + int type = udp_sum ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL; skb_scrub_packet(skb, xnet); + if ((flags & GENEVE_F_TCP_FIXEDID) && skb_is_gso(skb) && + (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)) + type |= SKB_GSO_TCP_FIXEDID; + min_headroom = LL_RESERVED_SPACE(dst->dev) + dst->header_len + GENEVE_BASE_HLEN + opt_len + sizeof(struct ipv6hdr); err = skb_cow_head(skb, min_headroom); @@ -750,7 +761,7 @@ static int geneve6_build_skb(struct dst_entry *dst, struct sk_buff *skb, goto free_dst; } - skb = udp_tunnel_handle_offloads(skb, udp_sum); + skb = iptunnel_handle_offloads(skb, type); if (IS_ERR(skb)) { err = PTR_ERR(skb); goto free_dst; @@ -1249,6 +1260,7 @@ static const struct nla_policy geneve_policy[IFLA_GENEVE_MAX + 1] = { [IFLA_GENEVE_UDP_CSUM] = { .type = NLA_U8 }, [IFLA_GENEVE_UDP_ZERO_CSUM6_TX] = { .type = NLA_U8 }, [IFLA_GENEVE_UDP_ZERO_CSUM6_RX] = { .type = NLA_U8 }, + [IFLA_GENEVE_IPID_MANGLE] = { .type = NLA_FLAG }, }; static int geneve_validate(struct nlattr *tb[], struct nlattr *data[]) @@ -1436,6 +1448,9 @@ static int geneve_newlink(struct net *net, struct net_device *dev, nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX])) flags |= GENEVE_F_UDP_ZERO_CSUM6_RX; + if (data[IFLA_GENEVE_IPID_MANGLE]) + flags |= GENEVE_F_TCP_FIXEDID; + return geneve_configure(net, dev, , vni, ttl, tos, label, dst_port, metadata, flags); } @@ -1460,6 +1475,7 @@ static size_t geneve_get_size(const struct net_device *dev) nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_CSUM */ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_TX */ nla_total_size(sizeof(__u8)) + /* IFLA_GENEVE_UDP_ZERO_CSUM6_RX */ + nla_total_size(0) + /* IFLA_GENEVE_IPID_MANGLE */ 0; } @@ -1505,6 +1521,10 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev) !!(geneve->flags & GENEVE_F_UDP_ZERO_CSUM6_RX))) goto nla_put_failure; + if ((geneve->flags & GENEVE_F_TCP_FIXEDID) && + nla_put_flag(skb, IFLA_GENEVE_IPID_MANGLE)) + goto nla_put_failure; + return 0; nla_put_failure: diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h index b83114077cee..c44d04259665 100644 --- a/include/net/udp_tunnel.h +++ b/include/net/udp_tunnel.h @@ -98,14 +98,6 @@ struct