On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote: > On 6/19/24 08:35, Adrián Moreno wrote: > > On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote: > >> On 6/18/24 12:50, Adrián Moreno wrote: > >>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote: > >>>> On 6/18/24 09:00, Adrián Moreno wrote: > >>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote: > >>>>>> On 6/17/24 13:55, Ilya Maximets wrote: > >>>>>>> On 6/3/24 20:56, Adrian Moreno wrote: > >>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence, > >>>>>>>> observability-oriented. > >>>>>>>> > >>>>>>>> Apart from some corner case in which it's used a replacement of > >>>>>>>> clone() > >>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now, > >>>>>>>> local emit_sample. > >>>>>>>> > >>>>>>>> With this in mind, it doesn't make much sense to report > >>>>>>>> OVS_DROP_LAST_ACTION inside sample actions. > >>>>>>>> > >>>>>>>> For instance, if the flow: > >>>>>>>> > >>>>>>>> actions:sample(..,emit_sample(..)),2 > >>>>>>>> > >>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely > >>>>>>>> confusing for users since the packet did reach its destination. > >>>>>>>> > >>>>>>>> This patch makes internal action execution silently consume the skb > >>>>>>>> instead of notifying a drop for this case. > >>>>>>>> > >>>>>>>> Unfortunately, this patch does not remove all potential sources of > >>>>>>>> confusion since, if the sample action itself is the last action, e.g: > >>>>>>>> > >>>>>>>> actions:sample(..,emit_sample(..)) > >>>>>>>> > >>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we > >>>>>>>> aren't. > >>>>>>>> > >>>>>>>> Sadly, this case is difficult to solve without breaking the > >>>>>>>> optimization by which the skb is not cloned on last sample actions. > >>>>>>>> But, given explicit drop actions are now supported, OVS can just add > >>>>>>>> one > >>>>>>>> after the last sample() and rewrite the flow as: > >>>>>>>> > >>>>>>>> actions:sample(..,emit_sample(..)),drop > >>>>>>>> > >>>>>>>> Signed-off-by: Adrian Moreno <amore...@redhat.com> > >>>>>>>> --- > >>>>>>>> net/openvswitch/actions.c | 13 +++++++++++-- > >>>>>>>> 1 file changed, 11 insertions(+), 2 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c > >>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644 > >>>>>>>> --- a/net/openvswitch/actions.c > >>>>>>>> +++ b/net/openvswitch/actions.c > >>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos; > >>>>>>>> static struct action_flow_keys __percpu *flow_keys; > >>>>>>>> static DEFINE_PER_CPU(int, exec_actions_level); > >>>>>>>> > >>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb) > >>>>>>>> +{ > >>>>>>>> + /* Do not emit packet drops inside sample(). */ > >>>>>>>> + if (OVS_CB(skb)->probability) > >>>>>>>> + consume_skb(skb); > >>>>>>>> + else > >>>>>>>> + ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION); > >>>>>>>> +} > >>>>>>>> + > >>>>>>>> /* Make a clone of the 'key', using the pre-allocated percpu > >>>>>>>> 'flow_keys' > >>>>>>>> * space. Return NULL if out of key spaces. > >>>>>>>> */ > >>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct > >>>>>>>> sk_buff *skb, > >>>>>>>> if ((arg->probability != U32_MAX) && > >>>>>>>> (!arg->probability || get_random_u32() > arg->probability)) > >>>>>>>> { > >>>>>>>> if (last) > >>>>>>>> - ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION); > >>>>>>>> + ovs_drop_skb_last_action(skb); > >>>>>> > >>>>>> Always consuming the skb at this point makes sense, since having > >>>>>> smaple() > >>>>>> as a last action is a reasonable thing to have. But this looks more > >>>>>> like > >>>>>> a fix for the original drop reason patch set. > >>>>>> > >>>>> > >>>>> I don't think consuming the skb at this point makes sense. It was very > >>>>> intentionally changed to a drop since a very common use-case for > >>>>> sampling is drop-sampling, i.e: replacing an empty action list (that > >>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally, > >>>>> that replacement should not have any effect on the number of > >>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in > >>>>> the same way (only observed in one case). > >>>>> > >>>>> > >>>>>>>> return 0; > >>>>>>>> } > >>>>>>>> > >>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath > >>>>>>>> *dp, struct sk_buff *skb, > >>>>>>>> } > >>>>>>>> } > >>>>>>>> > >>>>>>>> - ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION); > >>>>>>>> + ovs_drop_skb_last_action(skb); > >>>>>>> > >>>>>>> I don't think I agree with this one. If we have a sample() action > >>>>>>> with > >>>>>>> a lot of different actions inside and we reached the end while the > >>>>>>> last > >>>>>>> action didn't consume the skb, then we should report that. E.g. > >>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2" should report that > >>>>>>> the > >>>>>>> cloned skb was dropped. "sample(push_vlan(),emit_sample())" should > >>>>>>> not. > >>>>>>> > >>>>> > >>>>> What is the use case for such action list? Having an action branch > >>>>> executed randomly doesn't make sense to me if it's not some > >>>>> observability thing (which IMHO should not trigger drops). > >>>> > >>>> It is exactly my point. A list of actions that doesn't end is some sort > >>>> of a terminal action (output, drop, etc) does not make a lot of sense and > >>>> hence should be signaled as an unexpected drop, so users can re-check the > >>>> pipeline in case they missed the terminal action somehow. > >>>> > >>>>> > >>>>>>> The only actions that are actually consuming the skb are "output", > >>>>>>> "userspace", "recirc" and now "emit_sample". "output" and "recirc" > >>>>>>> are > >>>>>>> consuming the skb "naturally" by stealing it when it is the last > >>>>>>> action. > >>>>>>> "userspace" has an explicit check to consume the skb if it is the last > >>>>>>> action. "emit_sample" should have the similar check. It should > >>>>>>> likely > >>>>>>> be added at the point of action introduction instead of having a > >>>>>>> separate > >>>>>>> patch. > >>>>>>> > >>>>> > >>>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the > >>>>> packet does not continue it's way through the datapath. > >>>> > >>>> After "output" the packet leaves the datapath too, i.e. does not continue > >>>> it's way through OVS datapath. > >>>> > >>> > >>> I meant a broader concept of "datapath". The packet continues. For the > >>> userspace action this is true only for the CONTROLLER ofp action but > >>> since the datapath does not know which action it's implementing, we > >>> cannot do better. > >> > >> It's not only controller() action. Packets can be brought to userspace > >> for various reason including just an explicit ask to execute some actions > >> in userspace. In any case the packet sent to userspace kind of reached its > >> destination and it's not the "datapath drops the packet" situation. > >> > >>> > >>>>> > >>>>> It would be very confusing if OVS starts monitoring drops and adds a > >>>>> bunch > >>>>> of flows such as "actions:emit_sample()" and suddently it stops > >>>>> reporting such > >>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here, > >>>>> we are just observing them. > >>>> > >>>> This might make sense from the higher logic in user space application, > >>>> but > >>>> it doesn't from the datapath perspective. And also, if the user adds the > >>>> 'emit_sample' action for drop monitring, they already know where to find > >>>> packet samples, they don't need to use tools like dropwatch anymore. > >>>> This packet is not dropped from the datapath perspective, it is sampled. > >>>> > >>>>> > >>>>> And if we change emit_sample to trigger a drop if it's the last action, > >>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the > >>>>> times > >>>>> which is also terribly confusing. > >>>> > >>>> If emit_sample is the last action, then skb should be consumed silently. > >>>> The same as for "output" and "userspace". > >>>> > >>>>> > >>>>> I think we should try to be clear and informative with what we > >>>>> _actually_ drop and not require the user that is just running > >>>>> "dropwatch" to understand the internals of the OVS module. > >>>> > >>>> If someone is already using sampling to watch their packet drops, why > >>>> would > >>>> they use dropwatch? > >>>> > >>>>> > >>>>> So if you don't want to accept the "observational" nature of sample(), > >>>>> the only other solution that does not bring even more confusion to OVS > >>>>> drops would be to have userspace add explicit drop actions. WDYT? > >>>>> > >>>> > >>>> These are not drops from the datapath perspective. Users can add > >>>> explicit > >>>> drop actions if they want to, but I'm really not sure why they would do > >>>> that > >>>> if they are already capturing all these packets in psample, sFlow or > >>>> IPFIX. > >>> > >>> Because there is not a single "user". Tools and systems can be built on > >>> top of tracepoints and samples and they might not be coordinated between > >>> them. Some observability application can be always enabled and doing > >>> constant network monitoring or statistics while other lower level tools > >>> can be run at certain moments to troubleshoot issues. > >>> > >>> In order to run dropwatch in a node you don't need to have rights to > >>> access the OpenFlow controller and ask it to change the OpenFlow rules > >>> or else dropwatch simply will not show actual packet drops. > >> > >> The point is that these are not drops in this scenario. The packet was > >> delivered to its destination and hence should not be reported as dropped. > >> In the observability use-case that you're describing even OpenFlow layer > >> in OVS doesn't know if these supposed to be treated as packet drops for > >> the user or if these are just samples with the sampling being the only > >> intended destination. For OpenFlow and OVS userspace components these > >> two scenarios are indistinguishable. Only the OpenFlow controller knows > >> that these rules were put in place because it was an ACL created by some > >> user or tool. And since OVS in user space can't make such a distinction, > >> kernel can't make it either, and so shouldn't guess what the user two > >> levels of abstraction higher up meant. > >> > >>> > >>> To me it seems obvious that drop sampling (via emit_sample) "includes" > >>> drop reporting via emit_sample. In both cases you get the packet > >>> headers, but in one case you also get OFP controller metadata. Now even > >>> if there is a system that uses both, does it make sense to push to them > >>> the responsibility of dealing with them being mutually exclusive? > >>> > >>> I think this makes debugging OVS datapath unnecessarily obscure when we > >>> know the packet is actually being dropped intentionally by OVS. > >> > >> I don't think we know that we're in a drop sampling scenario. We don't > >> have enough information even in OVS userspace to tell. > >> > >> And having different behavior between "userspace" and "emit_sample" in > >> the kernel may cause even more confusion, because now two ways of sampling > >> packets will result in packets showing up in dropwatch in one case, but > >> not in the other. > >> > >>> > >>> What's the problem with having OVS write the following? > >>> "sample(50%, emit_sample()),drop(0)" > >> > >> It's a valid sequence of actions, but we shouldn't guess what the end > >> user meant by putting those actions into the kernel. If we see such a > >> sequence in the kernel, then we should report an explicit drop. If > >> there was only the "sample(50%, emit_sample())" then we should simply > >> consume the skb as it reached its destination in the psample. > >> > >> For the question if OVS in user space should put explicit drop action > >> while preparing to emit sample, this doesn't sound reasonable for the > >> same reason - OVS in user space doesn't know what the intention was of > >> the user or tool that put the sampling action into OpenFlow pipeline. > >> > > > > I don't see it that way. The spec says that packets whose action sets > > (the result of classification) have no output action and no group action > > must be dropped. Even if OFP sample action is an extension, I don't see > > it invalidating that semantics. > > So, IMHO, OVS does know that a flow that is just sampled is a drop. > > This applies to "action sets", but most users are actually using "action > lists" supplied via "Apply-actions" OF instruction and the action sets > always remain empty. So, from the OF perspective, strictly speaking, we > are dropping every single packet. So, this is not a good analogy. > > > > >> I actually became more confused about what are we arguing about. > >> To recap: > >> > >> This patch My proposal > >> > >> 1. emit_sample() is the last consume consume > >> inside the sample() > >> > >> 2. the end of the action list consume drop > >> inside the sample() > >> > >> 3. emit_sample() is the last drop consume > >> outside the sample() > >> > >> 4. the end of the action list drop drop > >> outside the sample() > >> > >> 5. sample() is the last action consume consume > >> and probability failed > >> > >> > >> I don't think cases 1 and 3 should differ, i.e. the behavior should > >> be the same regardless of emit_sample() being inside or outside of > >> the sample(). As a side point, OVS in user space will omit the 100% > >> rate sample() action and will just list inner actions instead. This > >> means that 100% probability sampling will generate drops and 99% will > >> not. Doesn't sound right. > >> > > > > That's what I was refering to in the commit message, we still OVS to > > write: > > actions:sample(..,emit_sample(..)),drop > > > >> Case 2 should likely never happen, but I'd like to see a drop reported > >> if that ever happens, because it is not a meaningful list of actions. > >> > >> Best regards, Ilya Maximets. > >> > > > > I think we could drop this patch if we agree that OVS could write > > explicit drops when it knows the packet is being dropped and sampled > > (the action only has OFP sample actions). > > > > The drop could be placed inside the odp sample action to avoid > > breaking the clone optimization: > > actions:sample(50%, actions(emit_sample(),drop))) > > > > or outside if the sample itself is optimized out: > > actions:emit_sample(),drop > > > > IIUC, if we don't do that, we are saying that sampling is incompatible > > with decent drop reporting via kfree_skb infrastructure used by tools > > like dropwatch or retis (among many others). And I think that is > > unnecessarily and deliberately making OVS datapath more difficult to > > troubleshoot. > > This makes some sense, so let's ensure that semantics is consistent > within the kernel and discuss how to make the tools happy from the > user space perspective. > > But we shouldn't simply drop this patch, we still need to consume the > skb after emit_sample() when it is the last action. The same as we > do for the userpsace() action. Though it should be done at the point > of the action introduction. Having both actions consistent will allow > us to solve the observability problem for both in the same way by > adding explicit drop actions from user space.
OK. I'll resend the series dropping this patch (and consuming the skb apropriately). > > On a side note: > I wonder if probability-induced drop needs a separate reason... i.e. > it could have been consumed by emit_smaple()/userspace() but wasn't. > You mean in sample action "get_random_u32() > arg->probability"? It only makes sense to drop it if the last action so currently uses OVS_DROP_LAST_ACTION. > Best regards, Ilya Maximets. > Thanks for the great discussion. Adrián _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev