On 2/18/26 11:37 AM, Adrian Moreno via dev wrote:
> ofproto/trace is one of the most useful debugging tools OVS provides.
> However, its "offline" nature comes with limitations:
> - Users need to know exactly what their packets look like
> - Runtime information such as conntrack states has to be guessed
> 
> This RFC introduces the idea of upcall (live) tracing. In a nutshell,
> the idea is:
> - The user activates upcall tracing by specifying an openflow filter
> - When a packet is upcalled, OVS checks whether it matches the filter
>   and, if it does, collects traces from the xlate layer (the same
>   traces that ofproto/trace emits).
> - If a frozen state is created from this upcall, an ID is stored in
>   it, so that if a subsequent upcall resumes the frozen state, it
>   inherits that ID, the "trace_id".
> - Traces are stored in some kind of ring-buffer or fixed-size list
>   arranged by these tracing IDs.
> - Traces are accessed (printed) by the user after the experiment has
>   ended.
> 
> The code in this RFC is in an early state, but it can be used to play
> around with. I'm sending it out early to get some feedback.
> The following topics are not clear to me at the moment:
> 
> - Naming and layering: I have called the thing "upcall-tracing", and
>   the unixctl commands are "upcall/trace/{create,list,get,show}". I
>   tried to differentiate it from "ofproto/trace", but maybe this is
>   not very intuitive? A (kind of crazy) idea that I have as a
>   follow-up is to persist the trace_id into the udpif_key so we can
>   also trace revalidations associated with an upcall. In such a
>   scenario, "upcall/trace" might fall semantically short.
> 
> - ofproto vs dpif: Connected to the previous topic. The tracing
>   infrastructure is bound to the "udpif", the upcall engine, which is
>   part of the datapath (dpif), not the bridge. However, ofproto flows
>   are easier to write and more familiar to users, so from their PoV,
>   specifying a bridge and an ofp filter is nicer. Writing
>   "upcall/trace/create br0 in_port=myport,ip" and then visualizing a
>   trace associated with `system@ovs-system` feels weird.
> 
> - Trace ID != Packet ID: The RFC generates a trace_id for each new
>   upcall that matches the filter and persists the trace_id inside the
>   frozen_state. If the same packet gets recirculated and upcalled, it
>   will (nicely) inherit the same trace_id and be grouped together. We
>   see all the recirculation rounds of the same packet (same as with
>   ofproto/trace, but for real!). BUT, frozen_states are shared. If
>   another packet hits one, it will also inherit that ID and be printed
>   alongside the first. Is this good? Bad? Acceptable? Do we need to
>   use the packet's metadata instead of the frozen_state to persist
>   this trace_id?
> 
> - Another idea that I considered is binding the tracing to a specific
>   port, i.e: "upcall/trace/create br0 p1 {ofp_filter}". Although this
>   would deviate from the ofproto/trace syntax, it would make it easier
>   to avoid an extra flow match on traffic that is not from the
>   "port-under-test".
> 
> Of course any other feedback is hugely appreciated.
> 
> This RFC contains the core feature but lacks some configurability
> that might be interesting for the actual patch. Additionally, I would
> like to measure the performance impact of enabling it in a loaded
> system.
> 

Hi, Adrian.  Thanks for the set!  It's definitely an interesting idea.

I'm a little on the edge about adding a pile of new appctl APIs for yet
another tracing mechanism.  It might be better if we can incorporate
this functionality into what we already have, enhancing things people
are already using and familiar with.

Today we have ofproto/trace and ofproto/detrace, which retis uses, for
example, to get something close to a trace, but it is limited in what
it can report.  So, I was thinking we could just enhance the detrace
output with all the actual trace details.  We could do that relatively
easily by creating a new type of cache entry (XC_TRACE) and making
xlate_report() create those with the tracing type/text and maybe some
way of tracking the nesting level.  Then both the tracing code and
detrace could construct the same output from the cache.

This has a few advantages:

- Not only upcall tracing: revalidators will also populate the cache if
  the user turns tracing on, and clean it up when it is disabled.

- It's not a new API, just a couple more knobs (just one?) for the
  existing one, so retis could just get the benefits right away.

- We can still filter, if needed, or just populate traces for everything.

- The code sharing between the trace and detrace sounds nice. :)

Some disadvantages:

- Tracing everything may be expensive, but a filter (either the full
  packet filter or the input port filter) can solve this the same way
  as in the current implementation.  And we don't need to store extra
  data per packet in an additional buffer, just one trace per datapath
  flow, updated when there are changes in the pipeline.

- Filters may be tricky in the sense that revalidating a
  post-recirculation flow with a filter on the pre-recirculation flow
  will still require a flag or something in the frozen state (does it
  need to be a unique id?  A boolean flag 'trace_this' may be enough).
  But that's also not much different from the current implementation.

- Not a full trace.  detrace works per datapath flow, so in order to
  get the full trace one will need to detrace all relevant datapath
  flows.  However, I suspect the primary user may be retis, and it
  already tracks the packet, so it will know which datapath flows to
  detrace.

WDYT?

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
