On 28 Feb 2022, at 19:01, Pieter Jansen van Vuuren wrote:
> On 25/02/2022 13:20, Eelco Chaudron wrote: >> CAUTION: This message has originated from an External Source. Please use >> proper judgment and caution when opening attachments, clicking links, or >> responding to this email. >> >> >> On 25 Feb 2022, at 13:57, Pieter Jansen van Vuuren wrote: >> >>> On 25/02/2022 10:08, Simon Horman wrote: >>>> Hi Pieter, >>>> >>>> nice to hear from you :) >>> >>> Hi Simon. Thank you for the insightful feedback. >>> >>>> >>>> On Tue, Feb 22, 2022 at 12:17:40PM +0000, pieter.jansen-van-vuu...@amd.com >>>> wrote: >>>>> From: Pieter Jansen van Vuuren <pieter.jansen-van-vuu...@amd.com> >>>>> >>>>> Previously when making use of the tc datapath to achieve decrement ttl, >>>>> we would install a filter that matches on the ttl/hoplimit field and use >>>>> a pedit set action to set the ttl/hoplimit to one less than the match >>>>> value. This results in a filter for each unique ttl value we want to >>>>> decrement. >>>>> >>>>> This patch introduces true decrement ttl which makes use of the pedit add >>>>> action. Adding 0xff is equivalent to decrementing the ttl/hoplimit >>>>> field. This also improves the hardware offload case by reducing the >>>>> number of filters required to support decrement ttl. In order to utilise >>>>> this, the following config option needs to be set to true. >>>> >>>> I'd be interested to understand the specific motivation for wanting >>>> to reduce flows. But I agree that in the general case it seems nice. >>> >>> I think main motivation is just relaxing some of the constraints on HW >>> offload slightly; although matching ip ttl and setting it to one less >>> than the match field is possible, it feels a bit wasteful if we really >>> only decrement the field. I concede that depending on the use case the >>> gain might be insignificant. Mostly I just wanted to start a discussion >>> on this as I think we can improve this area a bit. >>> >>>> >>>> One thing that I am curious about is the approach you have taken, >>>> which seems to be to patch pedit commands. But I am wondering if you >>>> considered plumbing the OpenFlow DEC_TTL command more directly into >>>> the ODP layer - perhaps hinging on the handling of OFPACT_DEC_TTL in >>>> do_xlate_actions(). >>> >>> Yes, I think that is good point. I think this ties in with Eelco's work; >>> https://patchwork.ozlabs.org/project/openvswitch/patch/162134902591.762107.15543938565196336653.st...@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com/ >>> >>> Possibly this patch needs to be rebased on top Eelco's v5 of the above. >>> >>>> >>>>> ovs-vsctl set Open_vSwitch . other_config:tc-pedit-add=true >>>> >>>>> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuu...@amd.com> >>>>> Reviewed-by: Alejandro Lucero Palau <aluce...@amd.com> >>>>> Reviewed-by: Martin Habets <martin.hab...@amd.com> >>>>> --- >>>>> .../linux/compat/include/linux/openvswitch.h | 1 + >>>>> lib/dpif-netdev.c | 1 + >>>>> lib/dpif.c | 1 + >>>>> lib/netdev-offload-tc.c | 16 ++++++++- >>>>> lib/netdev-offload.c | 12 +++++++ >>>>> lib/netdev-offload.h | 1 + >>>>> lib/odp-execute.c | 2 ++ >>>>> lib/odp-util.c | 4 +++ >>>>> lib/tc.c | 35 +++++++++++++++++-- >>>>> lib/tc.h | 1 + >>>>> ofproto/ofproto-dpif-ipfix.c | 1 + >>>>> ofproto/ofproto-dpif-sflow.c | 1 + >>>>> vswitchd/vswitch.xml | 15 ++++++++ >>>>> 13 files changed, 87 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/datapath/linux/compat/include/linux/openvswitch.h >>>>> b/datapath/linux/compat/include/linux/openvswitch.h >>>>> index 8d9300091..f7daffeb0 100644 >>>>> --- a/datapath/linux/compat/include/linux/openvswitch.h >>>>> +++ b/datapath/linux/compat/include/linux/openvswitch.h >>>>> @@ -1075,6 +1075,7 @@ enum ovs_action_attr { >>>>> OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */ >>>>> OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */ >>>>> #endif >>>>> + OVS_ACTION_ATTR_DEC_TTL, /* No argument. */ >>>> >>>> I'm not entirely sure, but I think that: >>>> * if this attribute is not used by the OVS kernel datapath then >>>> it should be inside the ifndef; otherwise >>>> * it should be above the ifndef >>>> >>>> Perhaps there is precedence, but it would strike me as unusual to implement >>>> a feature in the TC offload datapath but not in the OVS kernel datapath. >>> >>> Yeah, I see. I think you are correct; this needs to be implement in the >>> kernel >>> datapath as well and likely before we tackle the TC offload datapath. >> >> The kernel datapath (meaning the kernel module) already has the >> implementation. >> >>> I am hoping to rebase this on Eelco's patch that deals with this. >> >> The problem with the current v5(v4) patchset is some architectural things >> that need to be solved, which might make the actual benefits of having it >> offloaded not matter. I guess it depends on the actual use case, which was >> not clarified by Nokia. >> >> See the following email: >> https://mail.openvswitch.org/pipermail/ovs-dev/2021-July/386296.html >> Especially the part under “I was preparing my v5, and I noticed that a bunch >> of self-tests fail. I was wondering why I (and Matteo/Bindiya) never >> noticed. For me, it was because I was running make check on my dev system, >> which had an old kernel :( The datapath tests I was running on my DUT.” >> >> >> //Eelco > > I see, thank you Eelco for the clarification. So from the thread I gather that > there are at least 3 issues (corner cases); > 1. dec_ttl being non-reversible, solved with the clone action (in OVS > datapath). > As you pointed out this in not HW offloadable and will not be of much use for > TC datapath. Yes, this case will not help hardware offload, but it needs to be fixed, as without it the solution is not working per design. I mentioned I fixed this already in v5 but forgot to update the branch on github. I did now, so here is the commit: https://github.com/chaudron/ovs/commit/8bf3d1e6a7ae34bd8f9d091925db177bb27fbac0 > 2. encapsulation that copies the ttl field, which when using dec_ttl is set to > the incorrect value. Is this value "correct_ttl-1" or is it completely missing > as in it is "0"? It should be the same value as after the dec_ttl (if the encap is after the dec_ttl). If there are two dec_ttl actions before the encap, it should be original TTL - 2. In other words, it always needs to be the TTL value in the current stage of the pipeline. > 3. dec_ttl that stops processing the action list once hitting the exception > case. > Possibly solved with the clone action, similar to (1). Yes, I have not thought this through, but my first thought was that it should be possible with a clone action, but maybe better solutions exist. > Is my understanding here correct? > > It does seem non-trivial and I agree that this warrants some more > investigation. > > Thanks again for the insight here. > >> >>>> > <snip> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev