Re: [ovs-dev] [PATCH] netdev-offload-tc: Disable offload of IPv6 fragments.
On 8/5/22 01:57, Marcelo Leitner wrote: > On Fri, Jul 29, 2022 at 10:38:36PM +0200, Ilya Maximets wrote: >> OVS kernel datapath and TC are parsing IPv6 fragments differently. >> For IPv6 later fragments, according to the original design [1], OVS >> always sets nw_proto=44 (IPPROTO_FRAMENT), regardless of the type >> of the L4 protocol. >> >> This leads to situation where flow for nw_proto=44 gets installed >> to TC, but packets can not match on it, causing all IPv6 later >> fragments to go to userspace significantly degrading performance. >> >> Disabling offload for such packets, so the flow can be installed >> to the OVS kernel datapath instead. Disabling for all IPv6 fragments >> including the first one, because it doesn't make a lot of sense to >> handle them separately. It may also cause potential problems with >> conntrack trying to re-assemble a packet from fragments handled by >> different datapaths (first in HW, later in OVS kernel). >> >> Checking both 'nw_proto' and 'nw_frag' as classifier might decide >> to match only on one of them and also nw_proto will not be 44 for >> the first fragment. >> >> The issue was hidden for some time due to incorrect behavior of the >> OVS kernel datapath that was recently fixed in kernel commit: >> >> 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 >> fragments") >> >> To allow offloading in the future either flow dissector in TC >> should be changed to parse packets in the same way as OVS does, >> or parsing in OVS kernel and userspace should be made configurable, >> so users can opt-in to the behavior change. Silent change of the >> behavior (change by default) is not an option, because existing >> OpenFlow pipelines may depend on a certain behavior. >> >> [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments >> >> Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") >> Signed-off-by: Ilya Maximets > > Reviewed-by: Marcelo Ricardo Leitner Thanks, Roi and Marcelo! I fixed the reported typo and the weird 2-spaces indentation in the is_ipv6_fragment_and_masked() function (not sure how did that happen). With that, applied and backported down to 2.13. Best regards, Ilya Maximets. > >> --- >> lib/netdev-offload-tc.c | 22 +- >> 1 file changed, 21 insertions(+), 1 deletion(-) >> >> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c >> index 29d0e63eb..9d37881a1 100644 >> --- a/lib/netdev-offload-tc.c >> +++ b/lib/netdev-offload-tc.c >> @@ -1459,6 +1459,21 @@ parse_put_flow_set_action(struct tc_flower *flower, >> struct tc_action *action, >> return 0; >> } >> >> +static bool >> +is_ipv6_fragment_and_masked(const struct flow *key, const struct flow *mask) >> +{ >> + if (key->dl_type != htons(ETH_P_IPV6)) { >> + return false; >> + } >> + if (mask->nw_proto && key->nw_proto == IPPROTO_FRAGMENT) { >> + return true; >> + } >> + if (key->nw_frag & (mask->nw_frag & FLOW_NW_FRAG_ANY)) { >> + return true; >> + } >> + return false; >> +} >> + >> static int >> test_key_and_mask(struct match *match) >> { >> @@ -1541,6 +1556,11 @@ test_key_and_mask(struct match *match) >> return EOPNOTSUPP; >> } >> >> +if (is_ipv6_fragment_and_masked(key, mask)) { >> +VLOG_DBG_RL(&rl, "offloading of IPv6 fragments isn't supported"); >> +return EOPNOTSUPP; >> +} >> + >> if (!is_all_zeros(mask, sizeof *mask)) { >> VLOG_DBG_RL(&rl, "offloading isn't supported, unknown attribute"); >> return EOPNOTSUPP; >> @@ -2099,7 +2119,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match >> *match, >> memset(&mask->arp_tha, 0, sizeof mask->arp_tha); >> } >> >> -if (is_ip_any(key)) { >> +if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) { >> flower.key.ip_proto = key->nw_proto; >> flower.mask.ip_proto = mask->nw_proto; >> mask->nw_proto = 0; >> -- >> 2.34.3 >> > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-offload-tc: Disable offload of IPv6 fragments.
On Fri, Jul 29, 2022 at 10:38:36PM +0200, Ilya Maximets wrote: > OVS kernel datapath and TC are parsing IPv6 fragments differently. > For IPv6 later fragments, according to the original design [1], OVS > always sets nw_proto=44 (IPPROTO_FRAMENT), regardless of the type > of the L4 protocol. > > This leads to situation where flow for nw_proto=44 gets installed > to TC, but packets can not match on it, causing all IPv6 later > fragments to go to userspace significantly degrading performance. > > Disabling offload for such packets, so the flow can be installed > to the OVS kernel datapath instead. Disabling for all IPv6 fragments > including the first one, because it doesn't make a lot of sense to > handle them separately. It may also cause potential problems with > conntrack trying to re-assemble a packet from fragments handled by > different datapaths (first in HW, later in OVS kernel). > > Checking both 'nw_proto' and 'nw_frag' as classifier might decide > to match only on one of them and also nw_proto will not be 44 for > the first fragment. > > The issue was hidden for some time due to incorrect behavior of the > OVS kernel datapath that was recently fixed in kernel commit: > > 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments") > > To allow offloading in the future either flow dissector in TC > should be changed to parse packets in the same way as OVS does, > or parsing in OVS kernel and userspace should be made configurable, > so users can opt-in to the behavior change. Silent change of the > behavior (change by default) is not an option, because existing > OpenFlow pipelines may depend on a certain behavior. > > [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments > > Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") > Signed-off-by: Ilya Maximets Reviewed-by: Marcelo Ricardo Leitner > --- > lib/netdev-offload-tc.c | 22 +- > 1 file changed, 21 insertions(+), 1 deletion(-) > > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c > index 29d0e63eb..9d37881a1 100644 > --- a/lib/netdev-offload-tc.c > +++ b/lib/netdev-offload-tc.c > @@ -1459,6 +1459,21 @@ parse_put_flow_set_action(struct tc_flower *flower, > struct tc_action *action, > return 0; > } > > +static bool > +is_ipv6_fragment_and_masked(const struct flow *key, const struct flow *mask) > +{ > + if (key->dl_type != htons(ETH_P_IPV6)) { > + return false; > + } > + if (mask->nw_proto && key->nw_proto == IPPROTO_FRAGMENT) { > + return true; > + } > + if (key->nw_frag & (mask->nw_frag & FLOW_NW_FRAG_ANY)) { > + return true; > + } > + return false; > +} > + > static int > test_key_and_mask(struct match *match) > { > @@ -1541,6 +1556,11 @@ test_key_and_mask(struct match *match) > return EOPNOTSUPP; > } > > +if (is_ipv6_fragment_and_masked(key, mask)) { > +VLOG_DBG_RL(&rl, "offloading of IPv6 fragments isn't supported"); > +return EOPNOTSUPP; > +} > + > if (!is_all_zeros(mask, sizeof *mask)) { > VLOG_DBG_RL(&rl, "offloading isn't supported, unknown attribute"); > return EOPNOTSUPP; > @@ -2099,7 +2119,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match > *match, > memset(&mask->arp_tha, 0, sizeof mask->arp_tha); > } > > -if (is_ip_any(key)) { > +if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) { > flower.key.ip_proto = key->nw_proto; > flower.mask.ip_proto = mask->nw_proto; > mask->nw_proto = 0; > -- > 2.34.3 > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] netdev-offload-tc: Disable offload of IPv6 fragments.
On 2022-07-29 11:38 PM, Ilya Maximets wrote: OVS kernel datapath and TC are parsing IPv6 fragments differently. For IPv6 later fragments, according to the original design [1], OVS always sets nw_proto=44 (IPPROTO_FRAMENT), regardless of the type of the L4 protocol. typo IPPROTO_FRAGMENT This leads to situation where flow for nw_proto=44 gets installed to TC, but packets can not match on it, causing all IPv6 later fragments to go to userspace significantly degrading performance. Disabling offload for such packets, so the flow can be installed to the OVS kernel datapath instead. Disabling for all IPv6 fragments including the first one, because it doesn't make a lot of sense to handle them separately. It may also cause potential problems with conntrack trying to re-assemble a packet from fragments handled by different datapaths (first in HW, later in OVS kernel). Checking both 'nw_proto' and 'nw_frag' as classifier might decide to match only on one of them and also nw_proto will not be 44 for the first fragment. The issue was hidden for some time due to incorrect behavior of the OVS kernel datapath that was recently fixed in kernel commit: 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments") To allow offloading in the future either flow dissector in TC should be changed to parse packets in the same way as OVS does, or parsing in OVS kernel and userspace should be made configurable, so users can opt-in to the behavior change. Silent change of the behavior (change by default) is not an option, because existing OpenFlow pipelines may depend on a certain behavior. [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") Signed-off-by: Ilya Maximets --- lib/netdev-offload-tc.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c index 29d0e63eb..9d37881a1 100644 --- a/lib/netdev-offload-tc.c +++ b/lib/netdev-offload-tc.c @@ -1459,6 +1459,21 @@ parse_put_flow_set_action(struct tc_flower *flower, struct tc_action *action, return 0; } +static bool +is_ipv6_fragment_and_masked(const struct flow *key, const struct flow *mask) +{ + if (key->dl_type != htons(ETH_P_IPV6)) { + return false; + } + if (mask->nw_proto && key->nw_proto == IPPROTO_FRAGMENT) { + return true; + } + if (key->nw_frag & (mask->nw_frag & FLOW_NW_FRAG_ANY)) { + return true; + } + return false; +} + static int test_key_and_mask(struct match *match) { @@ -1541,6 +1556,11 @@ test_key_and_mask(struct match *match) return EOPNOTSUPP; } +if (is_ipv6_fragment_and_masked(key, mask)) { +VLOG_DBG_RL(&rl, "offloading of IPv6 fragments isn't supported"); +return EOPNOTSUPP; +} + if (!is_all_zeros(mask, sizeof *mask)) { VLOG_DBG_RL(&rl, "offloading isn't supported, unknown attribute"); return EOPNOTSUPP; @@ -2099,7 +2119,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match, memset(&mask->arp_tha, 0, sizeof mask->arp_tha); } -if (is_ip_any(key)) { +if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) { flower.key.ip_proto = key->nw_proto; flower.mask.ip_proto = mask->nw_proto; mask->nw_proto = 0; Reviewed-by: Roi Dayan ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] netdev-offload-tc: Disable offload of IPv6 fragments.
OVS kernel datapath and TC are parsing IPv6 fragments differently. For IPv6 later fragments, according to the original design [1], OVS always sets nw_proto=44 (IPPROTO_FRAMENT), regardless of the type of the L4 protocol. This leads to situation where flow for nw_proto=44 gets installed to TC, but packets can not match on it, causing all IPv6 later fragments to go to userspace significantly degrading performance. Disabling offload for such packets, so the flow can be installed to the OVS kernel datapath instead. Disabling for all IPv6 fragments including the first one, because it doesn't make a lot of sense to handle them separately. It may also cause potential problems with conntrack trying to re-assemble a packet from fragments handled by different datapaths (first in HW, later in OVS kernel). Checking both 'nw_proto' and 'nw_frag' as classifier might decide to match only on one of them and also nw_proto will not be 44 for the first fragment. The issue was hidden for some time due to incorrect behavior of the OVS kernel datapath that was recently fixed in kernel commit: 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments") To allow offloading in the future either flow dissector in TC should be changed to parse packets in the same way as OVS does, or parsing in OVS kernel and userspace should be made configurable, so users can opt-in to the behavior change. Silent change of the behavior (change by default) is not an option, because existing OpenFlow pipelines may depend on a certain behavior. [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") Signed-off-by: Ilya Maximets --- lib/netdev-offload-tc.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c index 29d0e63eb..9d37881a1 100644 --- a/lib/netdev-offload-tc.c +++ b/lib/netdev-offload-tc.c @@ -1459,6 +1459,21 @@ parse_put_flow_set_action(struct tc_flower *flower, struct tc_action *action, return 0; } +static bool +is_ipv6_fragment_and_masked(const struct flow *key, const struct flow *mask) +{ + if (key->dl_type != htons(ETH_P_IPV6)) { + return false; + } + if (mask->nw_proto && key->nw_proto == IPPROTO_FRAGMENT) { + return true; + } + if (key->nw_frag & (mask->nw_frag & FLOW_NW_FRAG_ANY)) { + return true; + } + return false; +} + static int test_key_and_mask(struct match *match) { @@ -1541,6 +1556,11 @@ test_key_and_mask(struct match *match) return EOPNOTSUPP; } +if (is_ipv6_fragment_and_masked(key, mask)) { +VLOG_DBG_RL(&rl, "offloading of IPv6 fragments isn't supported"); +return EOPNOTSUPP; +} + if (!is_all_zeros(mask, sizeof *mask)) { VLOG_DBG_RL(&rl, "offloading isn't supported, unknown attribute"); return EOPNOTSUPP; @@ -2099,7 +2119,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match, memset(&mask->arp_tha, 0, sizeof mask->arp_tha); } -if (is_ip_any(key)) { +if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) { flower.key.ip_proto = key->nw_proto; flower.mask.ip_proto = mask->nw_proto; mask->nw_proto = 0; -- 2.34.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev