Eelco Chaudron <echau...@redhat.com> writes:

> On 19 Feb 2024, at 19:57, Aaron Conole wrote:
>
>> Eelco Chaudron <echau...@redhat.com> writes:
>>
>>> On 12 Feb 2024, at 15:15, Aaron Conole wrote:
>>>
>>>> Aaron Conole <acon...@redhat.com> writes:
>>>>
>>>>> Eelco Chaudron <echau...@redhat.com> writes:
>>>>>
>>>>>> On 2 Feb 2024, at 11:31, Adrian Moreno wrote:
>>>>>>
>>>>>>> On 2/1/24 10:02, Eelco Chaudron wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 31 Jan 2024, at 18:03, Aaron Conole wrote:
>>>>>>>>
>>>>>>>>> Eelco Chaudron <echau...@redhat.com> writes:
>>>>>>>>>
>>>>>>>>>> On 25 Jan 2024, at 21:55, Aaron Conole wrote:
>>>>>>>>>>
>>>>>>>>>>> From: Kevin Sprague <ksprague0...@gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> During normal operations, it is useful to understand when a 
>>>>>>>>>>> particular flow
>>>>>>>>>>> gets removed from the system. This can be useful when debugging 
>>>>>>>>>>> performance
>>>>>>>>>>> issues tied to ofproto flow changes, trying to determine deployed 
>>>>>>>>>>> traffic
>>>>>>>>>>> patterns, or while debugging dynamic systems where ports come and 
>>>>>>>>>>> go.
>>>>>>>>>>>
>>>>>>>>>>> Prior to this change, there was a lack of visibility around flow 
>>>>>>>>>>> expiration.
>>>>>>>>>>> The existing debugging infrastructure could tell us when a flow was 
>>>>>>>>>>> added to
>>>>>>>>>>> the datapath, but not when it was removed or why.
>>>>>>>>>>>
>>>>>>>>>>> This change introduces a USDT probe at the point where the 
>>>>>>>>>>> revalidator
>>>>>>>>>>> determines that the flow should be removed.  Additionally, we track 
>>>>>>>>>>> the
>>>>>>>>>>> reason for the flow eviction and provide that information as well.  
>>>>>>>>>>> With
>>>>>>>>>>> this change, we can track the complete flow lifecycle for the 
>>>>>>>>>>> netlink
>>>>>>>>>>> datapath by hooking the upcall tracepoint in kernel, the flow put 
>>>>>>>>>>> USDT, and
>>>>>>>>>>> the revaldiator USDT, letting us watch as flows are added and 
>>>>>>>>>>> removed from
>>>>>>>>>>> the kernel datapath.
>>>>>>>>>>>
>>>>>>>>>>> This change only enables this information via USDT probe, so it 
>>>>>>>>>>> won't be
>>>>>>>>>>> possible to access this information any other way (see:
>>>>>>>>>>> Documentation/topics/usdt-probes.rst).
>>>>>>>>>>>
>>>>>>>>>>> Also included is a script 
>>>>>>>>>>> (utilities/usdt-scripts/flow_reval_monitor.py)
>>>>>>>>>>> which serves as a demonstration of how the new USDT probe might be 
>>>>>>>>>>> used
>>>>>>>>>>> going forward.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Kevin Sprague <ksprague0...@gmail.com>
>>>>>>>>>>> Co-authored-by: Aaron Conole <acon...@redhat.com>
>>>>>>>>>>> Signed-off-by: Aaron Conole <acon...@redhat.com>
>>>>>>>>>>
>>>>>>>>>> Thanks for following this up Aaron! See comments on this patch
>>>>>> below. I have no additional comments on patch 2.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Eelco
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>>   Documentation/topics/usdt-probes.rst         |   1 +
>>>>>>>>>>>   ofproto/ofproto-dpif-upcall.c                |  42 +-
>>>>>>>>>>>   utilities/automake.mk                        |   3 +
>>>>>>>>>>>   utilities/usdt-scripts/flow_reval_monitor.py | 653 
>>>>>>>>>>> +++++++++++++++++++
>>>>>>>>>>>   4 files changed, 693 insertions(+), 6 deletions(-)
>>>>>>>>>>>   create mode 100755 utilities/usdt-scripts/flow_reval_monitor.py
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/Documentation/topics/usdt-probes.rst 
>>>>>>>>>>> b/Documentation/topics/usdt-probes.rst
>>>>>>>>>>> index e527f43bab..a8da9bb1f7 100644
>>>>>>>>>>> --- a/Documentation/topics/usdt-probes.rst
>>>>>>>>>>> +++ b/Documentation/topics/usdt-probes.rst
>>>>>>>>>>> @@ -214,6 +214,7 @@ Available probes in ``ovs_vswitchd``:
>>>>>>>>>>>   - dpif_recv:recv_upcall
>>>>>>>>>>>   - main:poll_block
>>>>>>>>>>>   - main:run_start
>>>>>>>>>>> +- revalidate:flow_result
>>>>>>>>>>>   - revalidate_ukey\_\_:entry
>>>>>>>>>>>   - revalidate_ukey\_\_:exit
>>>>>>>>>>>   - udpif_revalidator:start_dump
>>>>>>>>>>
>>>>>>>>>> You are missing the specific flow_result result section. This is 
>>>>>>>>>> from the previous patch:
>>>>>>>>>
>>>>>>>>> D'oh!  Thanks for catching it.  I'll re-add it.
>>>>>>>>>
>>>>>>>>>> @@ -358,6 +360,27 @@  See also the ``main:run_start`` probe above.
>>>>>>>>>>   - ``utilities/usdt-scripts/bridge_loop.bt``
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> +probe revalidate:flow_result
>>>>>>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>>>> +
>>>>>>>>>> +**Description**:
>>>>>>>>>> +This probe is triggered when the revalidator decides whether or not 
>>>>>>>>>> to
>>>>>>>>>> +revalidate a flow. ``reason`` is an enum that denotes that either 
>>>>>>>>>> the flow
>>>>>>>>>> +is being kept, or the reason why the flow is being deleted. The
>>>>>>>>>> +``flow_reval_monitor.py`` script uses this probe to notify users 
>>>>>>>>>> when flows
>>>>>>>>>> +matching user-provided criteria are deleted.
>>>>>>>>>> +
>>>>>>>>>> +**Arguments**:
>>>>>>>>>> +
>>>>>>>>>> +- *arg0*: ``(enum flow_del_reason) reason``
>>>>>>>>>> +- *arg1*: ``(struct udpif *) udpif``
>>>>>>>>>> +- *arg2*: ``(struct udpif_key *) ukey``
>>>>>>>>>> +
>>>>>>>>>> +**Script references**:
>>>>>>>>>> +
>>>>>>>>>> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
>>>>>>>>>> +
>>>>>>>>>> +
>>>>>>>>>>   Adding your own probes
>>>>>>>>>>   ----------------------
>>>>>>>>>>
>>>>>>>>>>> diff --git a/ofproto/ofproto-dpif-upcall.c 
>>>>>>>>>>> b/ofproto/ofproto-dpif-upcall.c
>>>>>>>>>>> index b5cbeed878..97d75833f7 100644
>>>>>>>>>>> --- a/ofproto/ofproto-dpif-upcall.c
>>>>>>>>>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>>>>>>>>>> @@ -269,6 +269,18 @@ enum ukey_state {
>>>>>>>>>>>   };
>>>>>>>>>>>   #define N_UKEY_STATES (UKEY_DELETED + 1)
>>>>>>>>>>>
>>>>>>>>>>> +enum flow_del_reason {
>>>>>>>>>>> +    FDR_REVALIDATE = 0,     /* The flow was revalidated. */
>>>>>>>>>>
>>>>>>>>>> It was called FDR_FLOW_LIVE before, which might make more
>>>>>> sense. As the flow is just NOT deleted. It might or might not have
>>>>>> been revalidated. Thoughts?
>>>>>>>>>
>>>>>>>>> I think it had to have been revalidated if we emit the reason, because
>>>>>>>>> we only emit the reason code after revalidation.  IE: there are many
>>>>>>>>> places where we skip revalidation but the flow stays live - and we 
>>>>>>>>> don't
>>>>>>>>> emit reasons in those cases.
>>>>>>>>>
>>>>>>>>> So at least for this patch, it MUST have been revalidated.  But maybe 
>>>>>>>>> in
>>>>>>>>> the future, we would want to catch cases where the flow hasn't been.  
>>>>>>>>> In
>>>>>>>>> that case, it makes sense to add the FDR_FLOW_LIVE at that time - I
>>>>>>>>> think.
>>>>>>>>>
>>>>>>>>> Maybe you disagree?
>>>>>>>>
>>>>>>>> Well, it depends on how you define revalidation, it might only have
>>>>>> updated the counters. i.e. it all depends on ‘bool need_revalidate =
>>>>>> ukey->reval_seq != reval_seq;’ in revalidate_ukey(). That was why I
>>>>>> opted for a more general name.
>>>>>>>>
>>>>>>>>>>> +    FDR_FLOW_IDLE,          /* The flow went unused and was 
>>>>>>>>>>> deleted. */
>>>>>>>>>>> +    FDR_TOO_EXPENSIVE,      /* The flow was too expensive to 
>>>>>>>>>>> revalidate. */
>>>>>>>>>>> +    FDR_FLOW_WILDCARDED,    /* The flow needed a narrower wildcard 
>>>>>>>>>>> mask. */
>>>>>>>>>>> +    FDR_BAD_ODP_FIT,        /* The flow had a bad ODP flow fit. */
>>>>>>>>>>> +    FDR_NO_OFPROTO,         /* The flow didn't have an associated 
>>>>>>>>>>> ofproto. */
>>>>>>>>>>> +    FDR_XLATION_ERROR,      /* There was an error translating the 
>>>>>>>>>>> flow. */
>>>>>>>>>>> +    FDR_AVOID_CACHING,      /* Flow deleted to avoid caching. */
>>>>>>>>>>> +    FDR_FLOW_LIMIT,         /* All flows being killed. */
>>>>>>>>>>
>>>>>>>>>> Looking at the comment from Han on FDR_PURGE, and this patch
>>>>>> needing another spin, we should probably add it.
>>>>>>>>>
>>>>>>>>> I can do that, sure.  In that case, we will need to have a new flow op
>>>>>>>>> added to revalidator_sweep__ so that we can catch it.  But in that 
>>>>>>>>> case,
>>>>>>>>> it will be a different usdt probe, so I still don't know if we need
>>>>>>>>> FDR_PURGE right?  WDYT?
>>>>>>>>
>>>>>>>> In revalidator_sweep__() you have sort of the following:
>>>>>>>>
>>>>>>>>                  if (purge || ukey_state == UKEY_INCONSISTENT) {
>>>>>>>>                      result = UKEY_DELETE;
>>>>>>>>                  } else if (!seq_mismatch) {
>>>>>>>>
>>>>>>>> And I’m afraid that if we use this tool to debug we miss the
>>>>>> ukey_state == UKEY_INCONSISTENT when debugging and spent a long time
>>>>>> figuring this out.
>>>>>>>> Maybe add something general like this (did not give it a lot of
>>>>>> thought), and only take the FDR_PURGE : FDR_UPDATE_FAIL results in the
>>>>>> script?
>>>>>>>>
>>>>>>>>   /* 'udpif_key's are responsible for tracking the little bit of state 
>>>>>>>> udpif
>>>>>>>> @@ -2991,13 +2993,13 @@ revalidator_sweep__(struct revalidator 
>>>>>>>> *revalidator, bool purge)
>>>>>>>>           uint64_t odp_actions_stub[1024 / 8];
>>>>>>>>           struct ofpbuf odp_actions = 
>>>>>>>> OFPBUF_STUB_INITIALIZER(odp_actions_stub);
>>>>>>>>
>>>>>>>> -        enum flow_del_reason reason = FDR_REVALIDATE;
>>>>>>>>           struct ukey_op ops[REVALIDATE_MAX_BATCH];
>>>>>>>>           struct udpif_key *ukey;
>>>>>>>>           struct umap *umap = &udpif->ukeys[i];
>>>>>>>>           size_t n_ops = 0;
>>>>>>>>
>>>>>>>>           CMAP_FOR_EACH(ukey, cmap_node, &umap->cmap) {
>>>>>>>> +            enum flow_del_reason reason = FDR_REVALIDATE;
>>>>>>>>               enum ukey_state ukey_state;
>>>>>>>>
>>>>>>>>               /* Handler threads could be holding a ukey lock while it 
>>>>>>>> installs a
>>>>>>>> @@ -3016,8 +3018,10 @@ revalidator_sweep__(struct revalidator 
>>>>>>>> *revalidator, bool purge)
>>>>>>>>
>>>>>>>>                   if (purge || ukey_state == UKEY_INCONSISTENT) {
>>>>>>>>                       result = UKEY_DELETE;
>>>>>>>> +                    reason = purge ? FDR_PURGE : FDR_UPDATE_FAIL;
>>>>>>>>                   } else if (!seq_mismatch) {
>>>>>>>>                       result = UKEY_KEEP;
>>>>>>>> +                    reason = FDR_REVALIDATE; //_KEEP
>>>>>>>>                   } else {
>>>>>>>>                       struct dpif_flow_stats stats;
>>>>>>>>                       COVERAGE_INC(revalidate_missed_dp_flow);
>>>>>>>> @@ -3030,6 +3034,8 @@ revalidator_sweep__(struct revalidator 
>>>>>>>> *revalidator, bool purge)
>>>>>>>>                       reval_op_init(&ops[n_ops++], result, udpif, 
>>>>>>>> ukey, &recircs,
>>>>>>>>                                     &odp_actions);
>>>>>>>>                   }
>>>>>>>> +                OVS_USDT_PROBE(revalidator_sweep__, flow_result, 
>>>>>>>> result,
>>>>>>>> +                               reason, udpif, ukey);
>>>>>>>>               }
>>>>>>>>               ovs_mutex_unlock(&ukey->mutex);
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> In addition in revalidator_sweep__() should the “enum
>>>>>> flow_del_reason reason = FDR_REVALIDATE;” not be moved to the
>>>>>> CMAP_FOR_EACH() loop?
>>>>>>>>
>>>>>>>>
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>>   /* 'udpif_key's are responsible for tracking the little bit of 
>>>>>>>>>>> state udpif
>>>>>>>>>>>    * needs to do flow expiration which can't be pulled directly 
>>>>>>>>>>> from the
>>>>>>>>>>>    * datapath.  They may be created by any handler or revalidator 
>>>>>>>>>>> thread at any
>>>>>>>>>>> @@ -2272,7 +2284,8 @@ populate_xcache(struct udpif *udpif, struct 
>>>>>>>>>>> udpif_key *ukey,
>>>>>>>>>>>   static enum reval_result
>>>>>>>>>>>   revalidate_ukey__(struct udpif *udpif, const struct udpif_key 
>>>>>>>>>>> *ukey,
>>>>>>>>>>>                     uint16_t tcp_flags, struct ofpbuf *odp_actions,
>>>>>>>>>>> -                  struct recirc_refs *recircs, struct xlate_cache 
>>>>>>>>>>> *xcache)
>>>>>>>>>>> +                  struct recirc_refs *recircs, struct xlate_cache 
>>>>>>>>>>> *xcache,
>>>>>>>>>>> +                  enum flow_del_reason *reason)
>>>>>>>>>>>   {
>>>>>>>>>>>       struct xlate_out *xoutp;
>>>>>>>>>>>       struct netflow *netflow;
>>>>>>>>>>> @@ -2293,11 +2306,13 @@ revalidate_ukey__(struct udpif *udpif,
>>>>>> const struct udpif_key *ukey,
>>>>>>>>>>>       netflow = NULL;
>>>>>>>>>>>
>>>>>>>>>>>       if (xlate_ukey(udpif, ukey, tcp_flags, &ctx)) {
>>>>>>>>>>> +        *reason = FDR_XLATION_ERROR;
>>>>>>>>>>>           goto exit;
>>>>>>>>>>>       }
>>>>>>>>>>>       xoutp = &ctx.xout;
>>>>>>>>>>>
>>>>>>>>>>>       if (xoutp->avoid_caching) {
>>>>>>>>>>> +        *reason = FDR_AVOID_CACHING;
>>>>>>>>>>>           goto exit;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> @@ -2311,6 +2326,7 @@ revalidate_ukey__(struct udpif *udpif,
>>>>> const struct udpif_key *ukey,
>>>>>>>>>>>           ofpbuf_clear(odp_actions);
>>>>>>>>>>>
>>>>>>>>>>>           if (!ofproto) {
>>>>>>>>>>> +            *reason = FDR_NO_OFPROTO;
>>>>>>>>>>>               goto exit;
>>>>>>>>>>>           }
>>>>>>>>>>>
>>>>>>>>>>> @@ -2322,6 +2338,7 @@ revalidate_ukey__(struct udpif *udpif,
>>>>> const struct udpif_key *ukey,
>>>>>>>>>>>       if (odp_flow_key_to_mask(ukey->mask, ukey->mask_len, 
>>>>>>>>>>> &dp_mask, &ctx.flow,
>>>>>>>>>>>                                NULL)
>>>>>>>>>>>           == ODP_FIT_ERROR) {
>>>>>>>>>>> +        *reason = FDR_BAD_ODP_FIT;
>>>>>>>>>>>           goto exit;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> @@ -2331,6 +2348,7 @@ revalidate_ukey__(struct udpif *udpif,
>>>>> const struct udpif_key *ukey,
>>>>>>>>>>>        * down.  Note that we do not know if the datapath has 
>>>>>>>>>>> ignored any of the
>>>>>>>>>>>        * wildcarded bits, so we may be overly conservative here. */
>>>>>>>>>>>       if (flow_wildcards_has_extra(&dp_mask, ctx.wc)) {
>>>>>>>>>>> +        *reason = FDR_FLOW_WILDCARDED;
>>>>>>>>>>>           goto exit;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> @@ -2400,7 +2418,7 @@ static enum reval_result
>>>>>>>>>>>   revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey,
>>>>>>>>>>>                   const struct dpif_flow_stats *stats,
>>>>>>>>>>>                   struct ofpbuf *odp_actions, uint64_t reval_seq,
>>>>>>>>>>> -                struct recirc_refs *recircs)
>>>>>>>>>>> +                struct recirc_refs *recircs, enum flow_del_reason 
>>>>>>>>>>> *reason)
>>>>>>>>>>>       OVS_REQUIRES(ukey->mutex)
>>>>>>>>>>>   {
>>>>>>>>>>>       bool need_revalidate = ukey->reval_seq != reval_seq;
>>>>>>>>>>> @@ -2430,8 +2448,12 @@ revalidate_ukey(struct udpif *udpif, struct 
>>>>>>>>>>> udpif_key *ukey,
>>>>>>>>>>>                   xlate_cache_clear(ukey->xcache);
>>>>>>>>>>>               }
>>>>>>>>>>>               result = revalidate_ukey__(udpif, ukey, 
>>>>>>>>>>> push.tcp_flags,
>>>>>>>>>>> -                                       odp_actions, recircs, 
>>>>>>>>>>> ukey->xcache);
>>>>>>>>>>> -        } /* else delete; too expensive to revalidate */
>>>>>>>>>>> +                                       odp_actions, recircs, 
>>>>>>>>>>> ukey->xcache,
>>>>>>>>>>> +                                       reason);
>>>>>>>>>>> +        } else {
>>>>>>>>>>> +            /* delete; too expensive to revalidate */
>>>>>>>>>>> +            *reason = FDR_TOO_EXPENSIVE;
>>>>>>>>>>> +        }
>>>>>>>>>>>       } else if (!push.n_packets || ukey->xcache
>>>>>>>>>>>                  || !populate_xcache(udpif, ukey, push.tcp_flags)) {
>>>>>>>>>>>           result = UKEY_KEEP;
>>>>>>>>>>> @@ -2831,6 +2853,7 @@ revalidate(struct revalidator *revalidator)
>>>>>>>>>>>           for (f = flows; f < &flows[n_dumped]; f++) {
>>>>>>>>>>>               long long int used = f->stats.used;
>>>>>>>>>>>               struct recirc_refs recircs = 
>>>>>>>>>>> RECIRC_REFS_EMPTY_INITIALIZER;
>>>>>>>>>>> +            enum flow_del_reason reason = FDR_REVALIDATE;
>>>>>>>>>>>               struct dpif_flow_stats stats = f->stats;
>>>>>>>>>>>               enum reval_result result;
>>>>>>>>>>>               struct udpif_key *ukey;
>>>>>>>>>>> @@ -2905,9 +2928,14 @@ revalidate(struct revalidator *revalidator)
>>>>>>>>>>>               }
>>>>>>>>>>>               if (kill_them_all || (used && used < now - max_idle)) 
>>>>>>>>>>> {
>>>>>>>>>>>                   result = UKEY_DELETE;
>>>>>>>>>>> +                if (kill_them_all) {
>>>>>>>>>>> +                    reason = FDR_FLOW_LIMIT;
>>>>>>>>>>> +                } else {
>>>>>>>>>>> +                    reason = FDR_FLOW_IDLE;
>>>>>>>>>>> +                }
>>>>>>>>>>>               } else {
>>>>>>>>>>>                   result = revalidate_ukey(udpif, ukey, &stats, 
>>>>>>>>>>> &odp_actions,
>>>>>>>>>>> -                                         reval_seq, &recircs);
>>>>>>>>>>> +                                         reval_seq, &recircs, 
>>>>>>>>>>> &reason);
>>>>>>>>>>>               }
>>>>>>>>>>>               ukey->dump_seq = dump_seq;
>>>>>>>>>>>
>>>>>>>>>>> @@ -2916,6 +2944,7 @@ revalidate(struct revalidator *revalidator)
>>>>>>>>>>>                   udpif_update_flow_pps(udpif, ukey, f);
>>>>>>>>>>>               }
>>>>>>>>>>>
>>>>>>>>>>> +            OVS_USDT_PROBE(revalidate, flow_result, reason, udpif, 
>>>>>>>>>>> ukey);
>>>>>>>
>>>>>>> I have been experimenting with several upcall tracking techniques
>>>>>> that would make it easier to correlate upcalls with their subsequent
>>>>>> related events.
>>>>>>> To achieve that, we need (among other things) some easy-to-compare
>>>>>> unique value in the events. For revalidation events, I think a good
>>>>>> candidate would be "ukey->ufid" and so does the script in this patch.
>>>>>>>
>>>>>>> However, requiring all external tools to know the layout of "struct
>>>>>> udpif_key" in order to get that value makes things quite complicated
>>>>>> for CORE tools (e.g: retis).
>>>>>>>
>>>>>>> With all this, would you consider adding the ufid to probe payload 
>>>>>>> directly?
>>>>>>
>>>>>> Dont see why we can not, but if we need anything else it would quickly
>>>>>> explode in too much arguments. I guess CORE needs a good solution for
>>>>>> userspace.
>>>>>
>>>>> I think having the ufid and udpif makes sense for now.  Actually, maybe
>>>>> for the long term we can do something as an appctl command which will
>>>>> dump the relevant datastructures.  That way a tool like this, which
>>>>> needs the pid anyway, can generate an appctl command against the daemon
>>>>> and get the exact right header layout.
>>>>>
>>>>> WDYT?
>>>>
>>>> Update on this - I have it ready to go with the exception of this
>>>> particular change.  The USDT probe code generation seems to have a bug,
>>>> or maybe it just won't work using these kinds of data types.  For
>>>> example, the bcc tools gets confused and ends up generating code for the
>>>> probe like:
>>>>
>>>>     __attribute__((always_inline))
>>>> static __always_inline int _bpf_readarg_usdt__flow_result_3(struct
>>> pt_regs *ctx, void *dest, size_t len) {
>>>>       if (len != sizeof(d)) return -1;
>>>> { u64 __addr = ctx->r13 + 40; __asm__ __volatile__("": : :"memory");
>>> uint128_t __res = 0x0; bpf_probe_read_user(&__res, sizeof(__res),
>>> (void *)__addr); *((d *)dest) = __res; }
>>>>       return 0;
>>>>     }
>>>>
>>>> This code isn't valid - the (d) variable isn't properly represented
>>>> here.  I guess maybe there's some issue with the way they use ctypes
>>>> internally?  It isn't clear to me.
>>>
>>> I miss some context, but is it because you point to ufid, which is a
>>> union (typedef union ovs_u128)? What if you point to the u32 member of
>>> the union?
>>
>> I'm not sure, but that doesn't seem like the best approach to me, tbh.
>>
>>>> We could work around this by making the ufid always pass as a void *,
>>>> and passing a length that tells the size.  However, if the underlying
>>>> data type ever changes, it will be totally incompatible.  I doubt it
>>>> would ever change, but at least with something stronger typed, we could
>>>> get a better exception that bubbled up.
>>>
>>> I do not like the pointer + length, as this adds another argument. I
>>> can’t remember which, but some tools had problems with 5 or more
>>> arguments.
>>
>> I do agree.  I also want to test the latest version of bcc-tools also,
>> to see if it really is just related to my f38 version of bcc-tools.
>> That also raises a different issue, but it can be something we solve in
>> the interrim with a version check.
>>
>>>> I'm okay with doing it this way, but just wanted to make sure that
>>>> retis, et. al. would be okay with interpreting it this way.
>>>>
>>>> Alternatively, there's definitely a bug there - we could omit the
>>>> argument entirely and rely on pahole or even pyelftools to help extract
>>>> the struct layout and get the ufid from the udpif_key struct.  The data
>>>> is already being sent, so it seems like duplication a bit to send it,
>>>> and becomes difficult in light of this bcc "bug?"
>>>
>>> I agree it’s overloading arguments with data already present, but I
>>> guess CORE can not yet read debug data from userspace
>>> applications. Adrian, I thought this was being worked on with the
>>> Pahole guys. If this is close we could just leave out this extra
>>> argument for now.
>>
>> That should be something doable, I think.  For example, there are crates
>> like gimli that parse DWARF data (but not sure if it is parsing v5,
>> yet).
>>
>>> Alternatively, we could ask the BCC guys if this is a known issue,
>>> with maybe a potential workaround. Just post it here
>>> https://github.com/iovisor/bcc/issues the respond quickly.
>>
>> They will tell to run with the latest version, I think.  The version I'm
>> using is what ships with Fedora 38, and seems quite out of date.
>
> ACK, will wait for your conclusion and the v9.

After tying to get it working with latest bcc-tools, the behavior is
actually much worse.  Whereas before the flow_reval_monitor script would
fail during the bpf compilation (with a clear error message), the latest
BCC tools will crash the python interpreter.

I'm going to omit the ufid details.  I think CO-RE will have support
anyway for this kinds of output, so v9 I will keep the args as-is and
when the support gets added to decode the debuginfo packages to pull
those structures, they will get the support.

> //Eelco
>
>>> <SNIP>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to