On 10 May 2024, at 15:06, Ilya Maximets wrote:
> On 5/10/24 14:01, Eelco Chaudron wrote: >> >> >> On 10 May 2024, at 12:45, Adrian Moreno wrote: >> >>> On 5/10/24 12:06 PM, Eelco Chaudron wrote: >>>> On 24 Apr 2024, at 21:53, Adrian Moreno wrote: >>>> >>>>> The new odp sample attributes allow userspace to specify a group_id and >>>>> user-defined cookie to be passed down to psample. >>>>> >>>>> Add support for parsing and formatting such action. >>>>> >>>>> Signed-off-by: Adrian Moreno <amore...@redhat.com> >>>> >>>> Hi Adrian, >>>> >>>> See my comments below inline. >>>> >>>> //Eelco >>>> >>>>> --- >>>>> include/linux/openvswitch.h | 49 +++++++++--- >>>>> lib/odp-execute.c | 3 + >>>>> lib/odp-util.c | 150 ++++++++++++++++++++++++++--------- >>>>> ofproto/ofproto-dpif-ipfix.c | 2 + >>>>> python/ovs/flow/odp.py | 2 + >>>>> tests/odp.at | 5 ++ >>>>> 6 files changed, 163 insertions(+), 48 deletions(-) >>>>> >>>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h >>>>> index d9fb991ef..3e748405c 100644 >>>>> --- a/include/linux/openvswitch.h >>>>> +++ b/include/linux/openvswitch.h >>>>> @@ -696,6 +696,7 @@ enum ovs_flow_attr { >>>>> #define OVS_UFID_F_OMIT_MASK (1 << 1) >>>>> #define OVS_UFID_F_OMIT_ACTIONS (1 << 2) >>>>> >>>>> +#define OVS_PSAMPLE_COOKIE_MAX_SIZE 16 >>>>> /** >>>>> * enum ovs_sample_attr - Attributes for %OVS_ACTION_ATTR_SAMPLE action. >>>>> * @OVS_SAMPLE_ATTR_PROBABILITY: 32-bit fraction of packets to sample >>>>> with >>>>> @@ -703,15 +704,27 @@ enum ovs_flow_attr { >>>>> * %UINT32_MAX samples all packets and intermediate values sample >>>>> intermediate >>>>> * fractions of packets. >>>>> * @OVS_SAMPLE_ATTR_ACTIONS: Set of actions to execute in sampling >>>>> event. >>>>> - * Actions are passed as nested attributes. >>>>> + * Actions are passed as nested attributes. Optional if >>>>> + * OVS_SAMPLE_ATTR_PSAMPLE_GROUP is set. >>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_GROUP: A 32-bit number to be used as psample >>>>> group. >>>>> + * Optional if OVS_SAMPLE_ATTR_ACTIONS is set. >>>>> + * @OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: A variable-length binary cookie >>>>> that, if >>>>> + * provided, will be copied to the psample cookie. >>>> >>>> Should we mention any limit on the actual length of the cookie? >>>> >>>> Any reason we explicitly call this psample? From an OVS perspective, this >>>> could just be >>>> SAMPLE_ATTR_FORWARD/OFFLOAD/DESTINATION/ENDPOINT/COLLECTOR/TARGET_COOKIE >>>> and _GROUP. Other datapaths, do not have PSAMPLE. >>>> >>> >>> See my response to your comment on the cover-letter for possible name >>> alternatives. >> >> ACK, we can continue the discussion there. >> >>>> Thinking about it more, from an OVS perspective we should probably not >>>> even send down a COOKIE, but we should send down an obs_domain_id and >>>> obs_point_id, and then the kernel can move this into a cookie. >>>> >>> >>> I did consider this. However, the opaque cookie fits well with the tc API >>> which does not have two 32-bit values but an opaque 128-bit cookie. So if >>> the action is offloaded OVS needs to "encode" the two 32-bit values into >>> the cookie anyways. >> >> Which is fine to me, the OVS API should be clear that we have two u32 >> entries. The cookie is implementation-specific, and we use the netlink >> format as our API for all DPs. > > > I'm not sure about this. It is a kernel interface and we can't deliver > two separate values from the psample. So, passing two separate values > to the kernel doesn't make a lot of sense. They are going to be mangled > into a single cookie anyway and we'll have to document that somehow. > We may as well always mangle the data from the beginning, document once > at the top level and save ourselves from documenting a million differences > between different implementations. While USDT could report them separately, > I'm not sure there is a ton of value in that. Our OVS action API is not tight to psample at all, so passing in a psample cookie does not make sense. We should not pass in any psample references to the sample action API. If it looks like the below, the API is clean and we can mangle it before passing it to the psample function. I see no need to document this in the kernel, as all the kernel does is provide the TC action cookie (unrelated to the OVS actions API). + OVS_SAMPLE_ATTR_OBS_DOMAIN, /* Observation domain id, u32 number. */ + OVS_SAMPLE_ATTR_OBS_POINT, /* Observation point id, u32 number. */ + OVS_SAMPLE_ATTR_SYSTEM_TARGET, /* Flag to additionally sample to system target. */ + OVS_SAMPLE_ATTR_SYSTEM_GROUP, /* System target group id, u32 number. */ >> >>>> The collector itself could be called system/local collector, or something >>>> like that. This way it can use for example psample for kernel and UDST >>>> probes for usespace. Both can pass the group and cookie (obs_domain_id and >>>> obs_point_id). >>>> >>>> Also, the presence of any of this should not dictate the psample action, >>>> we probably need a specific OVS_SAMPLE_ATTR_SYSTEM_TARGET type nl flag. >>>> >>> >>> It clearer and it might simplify option verification a bit, but isn't a bit >>> redundant? There is no flag for action execution for instance, the presence >>> of the attribute is enough. >>> >>> An alternative would be to have nested attributes, e.g: >>> >>> enum ovs_sample_attr { >>> OVS_SAMPLE_ATTR_UNSPEC, >>> OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */ >>> OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */ >>> OVS_SAMPLE_ATTR_SYSTEM, /* Nested OVS_SAMPLE_SYSTEM_ATTR_* >>> attributes. */ >>> >>> __OVS_SAMPLE_ATTR_MAX, >>> }; >>> >>> enum ovs_sample_system_attr { >>> OVS_SAMPLE_SYSTEM_ATTR_OBS_DOMAIN, /* u32 number */ >>> OVS_SAMPLE_SYSTEM_ATTR_OBS_POINT, /* u32 number */ >>> OVS_SAMPLE_SYSTEM_ATTR_COOKIE, /* Nested OVS_ACTION_ATTR_* >>> >>> __OVS_SSAMPLE_SYSTEM_ATTR_MAX, >>> }; >>> >>> WDYT? >>> >>>> So I would envision your delta to look something like this: >>>> >>>> enum ovs_sample_attr { >>>> OVS_SAMPLE_ATTR_UNSPEC, >>>> - OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */ >>>> - OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */ >>>> + OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */ >>>> + OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_ >>>> + * attributes. >>>> + */ >>>> + OVS_SAMPLE_ATTR_OBS_DOMAIN, /* Observation domain id, u32 >>>> number. */ >>>> + OVS_SAMPLE_ATTR_OBS_POINT, /* Observation point id, u32 number. >>>> */ >>>> + OVS_SAMPLE_ATTR_SYSTEM_TARGET, /* Flag to additionally sample to >>>> system target. */ >>>> + OVS_SAMPLE_ATTR_SYSTEM_GROUP, /* System target group id, u32 number. */ >>>> __OVS_SAMPLE_ATTR_MAX, >> >> I’ve been thinking about it a bit more (since I wrote this comment), and I >> guess we might not need the extra flag, as we could use the >> OVS_SAMPLE_ATTR_SYSTEM_GROUP as the key for doing a local system copy (or >> whatever name we come up with). The OBS DOMAIN/POINT can be used in the >> future by others. >> >>>>> * >>>>> - * Executes the specified actions with the given probability on a >>>>> per-packet >>>>> - * basis. >>>>> + * Either OVS_SAMPLE_ATTR_PSAMPLE_GROUP or OVS_SAMPLE_ATTR_ACTIONS must >>>>> be >>>>> + * specified. >>>>> + * >>>>> + * Executes the specified actions and/or sends the packet to psample >>>>> + * with the given probability on a per-packet basis. >>>>> */ >>>>> enum ovs_sample_attr { >>>>> OVS_SAMPLE_ATTR_UNSPEC, >>>>> - OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */ >>>>> - OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* attributes. */ >>>>> + OVS_SAMPLE_ATTR_PROBABILITY, /* u32 number */ >>>>> + OVS_SAMPLE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_ >>>> >>>> Missing * >>>> >>> >>> Yeah, it's a pitty the "*" makes checkpatch raise a warning. >>> >>>>> + * attributes. >>>>> + */ >>>>> + OVS_SAMPLE_ATTR_PSAMPLE_GROUP, /* u32 number */ >>>>> + OVS_SAMPLE_ATTR_PSAMPLE_COOKIE, /* binary */ >>>>> __OVS_SAMPLE_ATTR_MAX, >>>>> >>>>> #ifdef __KERNEL__ >>>>> @@ -722,13 +735,27 @@ enum ovs_sample_attr { >>>>> #define OVS_SAMPLE_ATTR_MAX (__OVS_SAMPLE_ATTR_MAX - 1) >>>>> >>>>> #ifdef __KERNEL__ >>>>> + >>>>> +/* Definition for flags in struct sample_arg. */ >>>>> +enum { >>>>> + /* When actions in sample will not change the flows. */ >>>>> + OVS_SAMPLE_ARG_FLAG_EXEC = 1 << 0, >>>>> + /* When set, the packet will be sent to psample. */ >>>>> + OVS_SAMPLE_ARG_FLAG_PSAMPLE = 1 << 1, >>>> >>>> Same on psample name here, should it be FLAG_SYSTEM_TARGET instead? See >>>> the above comment about an explicit option. >>>> >>>>> +}; >>>>> + >>>>> struct sample_arg { >>>>> - bool exec; /* When true, actions in sample will not >>>>> - * change flow keys. False otherwise. >>>>> - */ >>>>> - u32 probability; /* Same value as >>>>> - * 'OVS_SAMPLE_ATTR_PROBABILITY'. >>>>> - */ >>>>> + u16 flags; /* Flags that modify the behavior of the >>>>> + * action. See SAMPLE_ARG_FLAG_* >>>>> + */ >>>>> + u32 probability; /* Same value as >>>>> + * 'OVS_SAMPLE_ATTR_PROBABILITY'. >>>>> + */ >>>>> + u32 group_id; /* Same value as >>>>> + * 'OVS_SAMPLE_ATTR_PSAMPLE_GROUP'. >>>>> + */ >>>>> + u8 cookie_len; /* Length of psample cookie */ >>>>> + char cookie[OVS_PSAMPLE_COOKIE_MAX_SIZE]; /* psample cookie data. */ >>>> >>>> Would it make sense for the cookie also to be u8? >>>> >>> >>> Yes. >>> >>>>> }; >>>>> #endif >>>>> >>>>> diff --git a/lib/odp-execute.c b/lib/odp-execute.c >>>>> index 081e4d432..d8ee93429 100644 >>>>> --- a/lib/odp-execute.c >>>>> +++ b/lib/odp-execute.c >>>>> @@ -716,6 +716,9 @@ odp_execute_sample(void *dp, struct dp_packet >>>>> *packet, bool steal, >>>>> subactions = a; >>>>> break; >>>>> >>>>> + /* Ignored in userspace datapath. */ >>>>> + case OVS_SAMPLE_ATTR_PSAMPLE_GROUP: >>>>> + case OVS_SAMPLE_ATTR_PSAMPLE_COOKIE: > > Action that has these attributes should be marked as action that requires > datapath assistance. And we should never hit them here. > > We should have a test case covering a packet sent from a controller or > execution with a debug_slow flag. > > BTW, this code can be executed for kernel datapath as well if the packet > goes to userspace after upcall or comes from a controller, so the comment > is not correct. > > >>>>> case OVS_SAMPLE_ATTR_UNSPEC: >>>>> case __OVS_SAMPLE_ATTR_MAX: >>>>> default: >>>>> diff --git a/lib/odp-util.c b/lib/odp-util.c >>>>> index 21f34d955..611b5229e 100644 >>>>> --- a/lib/odp-util.c >>>>> +++ b/lib/odp-util.c >>>>> @@ -227,12 +227,16 @@ format_odp_sample_action(struct ds *ds, const >>>>> struct nlattr *attr, >>>>> { >>>>> static const struct nl_policy ovs_sample_policy[] = { >>>>> [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 }, >>>>> - [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED } >>>>> + [OVS_SAMPLE_ATTR_ACTIONS] = { .type = NL_A_NESTED, >>>>> + .optional = true }, >>>>> + [OVS_SAMPLE_ATTR_PSAMPLE_GROUP] = { .type = NL_A_U32, >>>>> + .optional = true }, >>>>> + [OVS_SAMPLE_ATTR_PSAMPLE_COOKIE] = { .type = NL_A_UNSPEC, >>>>> + .optional = true } >>>>> }; >>>>> struct nlattr *a[ARRAY_SIZE(ovs_sample_policy)]; >>>>> + const struct nlattr *nla; >>>>> double percentage; >>>>> - const struct nlattr *nla_acts; >>>>> - int len; >>>>> >>>>> ds_put_cstr(ds, "sample"); >>>>> >>>>> @@ -244,13 +248,33 @@ format_odp_sample_action(struct ds *ds, const >>>>> struct nlattr *attr, >>>>> percentage = (100.0 * >>>>> nl_attr_get_u32(a[OVS_SAMPLE_ATTR_PROBABILITY])) / >>>>> UINT32_MAX; >>>>> >>>>> - ds_put_format(ds, "(sample=%.1f%%,", percentage); >>>>> + ds_put_format(ds, "(sample=%.1f%%", percentage); >>>>> >>>>> - ds_put_cstr(ds, "actions("); >>>>> - nla_acts = nl_attr_get(a[OVS_SAMPLE_ATTR_ACTIONS]); >>>>> - len = nl_attr_get_size(a[OVS_SAMPLE_ATTR_ACTIONS]); >>>>> - format_odp_actions(ds, nla_acts, len, portno_names); >>>>> - ds_put_format(ds, "))"); >>>>> + nla = a[OVS_SAMPLE_ATTR_PSAMPLE_GROUP]; >>>>> + if (nla) { >>>>> + ds_put_format(ds, ",group_id=%d", nl_attr_get_u32(nla)); >>>>> + >>>>> + nla = a[OVS_SAMPLE_ATTR_PSAMPLE_COOKIE]; >>>>> + >>>>> + if (nla) { >>>>> + size_t i; >>>>> + const uint8_t *c = nl_attr_get(nla); >>>>> + ds_put_cstr(ds, ",cookie="); >>>>> + for (i = 0; i < nl_attr_get_size(nla); i++) { >>>>> + ds_put_format(ds, "%02x", c[i]); > > > We print userdata in some other actions as dot-separated bytes. > maybe do the same here? Otherwise we should at least add 0x to > the front to signify that it is a hex data. > > Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev