On Thu, Jul 04, 2024 at 09:52:41AM GMT, Adrian Moreno wrote:
> (Was: Add psample support to NXAST_SAMPLE action)
>
> This is the userspace counterpart of the work being done in the kernel
> [1] which is still not merged (hence the RFC state). There, a new
> datapath action is added, called "psample".
>
> From the PoV of ovs-vswitchd, this new action is used to implement
> "local sampling". Local sampling (or lsample for short) is configured
> in a similar way as current per-flow IPFIX sampling, i.e: using the
> Flow_Sample_Collector_Set table and the NXAST_SAMPLE action.
>
> However, instead of sending the sample to an external IPFIX collector
> through the network, the sample is emitted using the new action and
> made available to a locally running sample collector.
>
> The specific way emit_sample sends the sample (and the way the local
> collector shall collect it) is datapath-specific.
> Currently, only the Linux kernel datapath implements it using
> the psample netlink multicast group.
>
> ~~ Configuration ~~
> Local sampling is configured via a new column in the
> Flow_Sample_Collector_Set (FSCS) table called "local_sample_group".
> Configuring this value is orthogonal to also associating the FSCS
> entry to an entry in the IPFIX table.
>
> Once that entry in the OVSDB is configured, NXAST_SAMPLE actions coming
> from the controller will be translated into the following odp action:
>
>     sample(sample={P}%, actions(emit_sample(group={G},cookie={C})))
>
> Where:
> P: Is the sampling probability from NXAST_SAMPLE
> G: Is the group id in the FSCS entry whose "id" matches the one in
> the NXAST_SAMPLE.
> C: Is a 64-bit cookie, the result of concatenating the obs_domain and
> obs_point from the NXAST_SAMPLE in network order, i.e:
> "htonl(obs_domain) << 32 | htonl(obs_point)"
> Notes:
> - The parent sample action might be omitted if the probability is
> 100% and there is no IPFIX sampling that requires the use of a
> meter.
>
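To make the translation above a bit more concrete, here is a minimal
Python sketch of how the cookie and the resulting odp action string could
be put together. It is only an illustration, not the ovs-vswitchd code.
Assumptions: "network order" is read as obs_domain first, then obs_point,
each as a big-endian 32-bit value; the cookie is printed as hex; and the
helper names (make_cookie, format_odp_action, ipfix_needs_meter) are
hypothetical.

```python
import struct


def make_cookie(obs_domain: int, obs_point: int) -> bytes:
    """Pack obs_domain and obs_point into an 8-byte cookie.

    Assumption: obs_domain comes first and both values are big-endian
    (network order) unsigned 32-bit integers.
    """
    return struct.pack("!II", obs_domain, obs_point)


def format_odp_action(probability_pct: float, group: int, cookie: bytes,
                      ipfix_needs_meter: bool = False) -> str:
    """Render the odp action following the template in the cover letter.

    The hex rendering of the cookie is an assumption made for this sketch.
    """
    emit = f"emit_sample(group={group},cookie=0x{cookie.hex()})"
    # The parent sample action might be omitted at 100% probability when
    # no IPFIX sampling requires the use of a meter.
    if probability_pct >= 100 and not ipfix_needs_meter:
        return emit
    return f"sample(sample={probability_pct:g}%, actions({emit}))"


if __name__ == "__main__":
    cookie = make_cookie(1, 2)
    print(format_odp_action(50, 10, cookie))
    print(format_odp_action(100, 10, cookie))
```

The second call shows the case from the note above, where only the inner
action remains.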
> ~~ Dpif-lsample ~~
> Internally, a new object called "dpif-lsample" is introduced to track
> the configured local sampling exporters and track statistics based on
> odp flow stats (using xcache).
> It exposes the list of configured exporters and their statistics on a
> new unixctl command called "lsample/show".
>
I just realized I forgot to add a comment explicitly stating that the
two sections below (which correspond to patches 11/13, 12/13 and
13/13) are new in this version of the RFC series.
I know this can be problematic given the late stage we're in, so I'll add
a bit of context on why I added them.
> ~~ Drop monitoring ~~
> A common use-case for this action can be to sample drops. However,
> adding sample actions to drops makes the existing drop statistics
> disappear. In order to fix this, patches 11 and 12 make use of explicit
> drop actions to ensure statistics still report drops even if sampled.
>
Drop monitoring and its interaction with local (or even non-local)
sampling have been discussed in the kernel series, as I originally tried
to solve the problem in the kernel. After some discussions with Ilya we
agreed to explore the solution to the problem in userspace. That is why
I feel these patches are related to the series.
In any case, IMHO, both patches fix existing bugs: Enabling sampling
(local or not, per-bridge or per-flow) should not hide drop statistics.
One visibility feature should not break an existing one.
> ~~ Extended OpenFlow sample action ~~
> Given the series aims at making sampling production ready, conntrack
> integration must be considered. A common use-case for stateful
> pipelines is to calculate the observation metadata at connection
> establishment, store it in ct_label and then use it for packets of
> established connections. However, this forces OVN to create a large number
> of OFP Flows (one per distinct cookie). Patch 13 solves this by allowing
> controllers to specify the obs_domain and point ids from another OFP
> field.
>
This is an addition that, although discussed informally, did not come
directly from the kernel series but from experimentation and interaction
with the OVN team.
It can be considered a follow-up optimization so if there is controversy
around it, I'm OK postponing it to a future release.
> ~~ Testing ~~
> The series includes a test utility program that can be executed by
> running "tests/ovstest test-psample". This utility listens
> to packets multicast by the psample module and prints them (also
> printing the obs_domain and obs_point ids).
>
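For anyone writing their own local collector, a rough sketch of how a
received cookie could be split back into the two ids is shown below. This
is not taken from test-psample. It assumes the cookie arrives as 8 raw
bytes with obs_domain first and both ids as big-endian 32-bit values, and
split_cookie is a hypothetical helper name.

```python
import struct
from typing import Tuple


def split_cookie(cookie: bytes) -> Tuple[int, int]:
    """Return (obs_domain, obs_point) from an 8-byte cookie.

    Assumption: obs_domain comes first and both ids are big-endian
    (network order) unsigned 32-bit integers.
    """
    if len(cookie) != 8:
        raise ValueError("expected an 8-byte cookie, got %d bytes" % len(cookie))
    obs_domain, obs_point = struct.unpack("!II", cookie)
    return obs_domain, obs_point


if __name__ == "__main__":
    domain, point = split_cookie(bytes.fromhex("0000000100000002"))
    print("obs_domain=%d obs_point=%d" % (domain, point))
```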
> ~~ HW Offload ~~
> tc offload is not being introduced in this series as existing sample
> or userspace actions are not currently offloadable. Also some
> improvements need to be implemented in tc for it to be feasible.
>
> ~~ DPDK datapath ~~
> By naming the action "psample" it was intentionally restricted to the
> Linux datapath only. A follow up task would be spawned to think of a
> good way of implementing local-samp