On 6/19/24 22:40, Adrián Moreno wrote: > On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote: >> On 6/19/24 08:35, Adrián Moreno wrote: >>> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote: >>>> On 6/18/24 12:50, Adrián Moreno wrote: >>>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote: >>>>>> On 6/18/24 09:00, Adrián Moreno wrote: >>>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote: >>>>>>>> On 6/17/24 13:55, Ilya Maximets wrote: >>>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote: >>>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence, >>>>>>>>>> observability-oriented. >>>>>>>>>> >>>>>>>>>> Apart from some corner case in which it's used a replacement of >>>>>>>>>> clone() >>>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now, >>>>>>>>>> local emit_sample. >>>>>>>>>> >>>>>>>>>> With this in mind, it doesn't make much sense to report >>>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions. >>>>>>>>>> >>>>>>>>>> For instance, if the flow: >>>>>>>>>> >>>>>>>>>> actions:sample(..,emit_sample(..)),2 >>>>>>>>>> >>>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely >>>>>>>>>> confusing for users since the packet did reach its destination. >>>>>>>>>> >>>>>>>>>> This patch makes internal action execution silently consume the skb >>>>>>>>>> instead of notifying a drop for this case. >>>>>>>>>> >>>>>>>>>> Unfortunately, this patch does not remove all potential sources of >>>>>>>>>> confusion since, if the sample action itself is the last action, e.g: >>>>>>>>>> >>>>>>>>>> actions:sample(..,emit_sample(..)) >>>>>>>>>> >>>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we >>>>>>>>>> aren't. >>>>>>>>>> >>>>>>>>>> Sadly, this case is difficult to solve without breaking the >>>>>>>>>> optimization by which the skb is not cloned on last sample actions. >>>>>>>>>> But, given explicit drop actions are now supported, OVS can just add >>>>>>>>>> one >>>>>>>>>> after the last sample() and rewrite the flow as: >>>>>>>>>> >>>>>>>>>> actions:sample(..,emit_sample(..)),drop >>>>>>>>>> >>>>>>>>>> Signed-off-by: Adrian Moreno <amore...@redhat.com> >>>>>>>>>> --- >>>>>>>>>> net/openvswitch/actions.c | 13 +++++++++++-- >>>>>>>>>> 1 file changed, 11 insertions(+), 2 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c >>>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644 >>>>>>>>>> --- a/net/openvswitch/actions.c >>>>>>>>>> +++ b/net/openvswitch/actions.c >>>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos; >>>>>>>>>> static struct action_flow_keys __percpu *flow_keys; >>>>>>>>>> static DEFINE_PER_CPU(int, exec_actions_level); >>>>>>>>>> >>>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb) >>>>>>>>>> +{ >>>>>>>>>> + /* Do not emit packet drops inside sample(). */ >>>>>>>>>> + if (OVS_CB(skb)->probability) >>>>>>>>>> + consume_skb(skb); >>>>>>>>>> + else >>>>>>>>>> + ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION); >>>>>>>>>> +} >>>>>>>>>> + >>>>>>>>>> /* Make a clone of the 'key', using the pre-allocated percpu >>>>>>>>>> 'flow_keys' >>>>>>>>>> * space. Return NULL if out of key spaces. >>>>>>>>>> */ >>>>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct >>>>>>>>>> sk_buff *skb, >>>>>>>>>> if ((arg->probability != U32_MAX) && >>>>>>>>>> (!arg->probability || get_random_u32() > arg->probability)) >>>>>>>>>> { >>>>>>>>>> if (last) >>>>>>>>>> - ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION); >>>>>>>>>> + ovs_drop_skb_last_action(skb); >>>>>>>> >>>>>>>> Always consuming the skb at this point makes sense, since having >>>>>>>> smaple() >>>>>>>> as a last action is a reasonable thing to have. But this looks more >>>>>>>> like >>>>>>>> a fix for the original drop reason patch set. >>>>>>>> >>>>>>> >>>>>>> I don't think consuming the skb at this point makes sense. It was very >>>>>>> intentionally changed to a drop since a very common use-case for >>>>>>> sampling is drop-sampling, i.e: replacing an empty action list (that >>>>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally, >>>>>>> that replacement should not have any effect on the number of >>>>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in >>>>>>> the same way (only observed in one case). >>>>>>> >>>>>>> >>>>>>>>>> return 0; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath >>>>>>>>>> *dp, struct sk_buff *skb, >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> - ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION); >>>>>>>>>> + ovs_drop_skb_last_action(skb); >>>>>>>>> >>>>>>>>> I don't think I agree with this one. If we have a sample() action >>>>>>>>> with >>>>>>>>> a lot of different actions inside and we reached the end while the >>>>>>>>> last >>>>>>>>> action didn't consume the skb, then we should report that. E.g. >>>>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2" should report that >>>>>>>>> the >>>>>>>>> cloned skb was dropped. "sample(push_vlan(),emit_sample())" should >>>>>>>>> not. >>>>>>>>> >>>>>>> >>>>>>> What is the use case for such action list? Having an action branch >>>>>>> executed randomly doesn't make sense to me if it's not some >>>>>>> observability thing (which IMHO should not trigger drops). >>>>>> >>>>>> It is exactly my point. A list of actions that doesn't end is some sort >>>>>> of a terminal action (output, drop, etc) does not make a lot of sense and >>>>>> hence should be signaled as an unexpected drop, so users can re-check the >>>>>> pipeline in case they missed the terminal action somehow. >>>>>> >>>>>>> >>>>>>>>> The only actions that are actually consuming the skb are "output", >>>>>>>>> "userspace", "recirc" and now "emit_sample". "output" and "recirc" >>>>>>>>> are >>>>>>>>> consuming the skb "naturally" by stealing it when it is the last >>>>>>>>> action. >>>>>>>>> "userspace" has an explicit check to consume the skb if it is the last >>>>>>>>> action. "emit_sample" should have the similar check. It should >>>>>>>>> likely >>>>>>>>> be added at the point of action introduction instead of having a >>>>>>>>> separate >>>>>>>>> patch. >>>>>>>>> >>>>>>> >>>>>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the >>>>>>> packet does not continue it's way through the datapath. >>>>>> >>>>>> After "output" the packet leaves the datapath too, i.e. does not continue >>>>>> it's way through OVS datapath. >>>>>> >>>>> >>>>> I meant a broader concept of "datapath". The packet continues. For the >>>>> userspace action this is true only for the CONTROLLER ofp action but >>>>> since the datapath does not know which action it's implementing, we >>>>> cannot do better. >>>> >>>> It's not only controller() action. Packets can be brought to userspace >>>> for various reason including just an explicit ask to execute some actions >>>> in userspace. In any case the packet sent to userspace kind of reached its >>>> destination and it's not the "datapath drops the packet" situation. >>>> >>>>> >>>>>>> >>>>>>> It would be very confusing if OVS starts monitoring drops and adds a >>>>>>> bunch >>>>>>> of flows such as "actions:emit_sample()" and suddently it stops >>>>>>> reporting such >>>>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here, >>>>>>> we are just observing them. >>>>>> >>>>>> This might make sense from the higher logic in user space application, >>>>>> but >>>>>> it doesn't from the datapath perspective. And also, if the user adds the >>>>>> 'emit_sample' action for drop monitring, they already know where to find >>>>>> packet samples, they don't need to use tools like dropwatch anymore. >>>>>> This packet is not dropped from the datapath perspective, it is sampled. >>>>>> >>>>>>> >>>>>>> And if we change emit_sample to trigger a drop if it's the last action, >>>>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the >>>>>>> times >>>>>>> which is also terribly confusing. >>>>>> >>>>>> If emit_sample is the last action, then skb should be consumed silently. >>>>>> The same as for "output" and "userspace". >>>>>> >>>>>>> >>>>>>> I think we should try to be clear and informative with what we >>>>>>> _actually_ drop and not require the user that is just running >>>>>>> "dropwatch" to understand the internals of the OVS module. >>>>>> >>>>>> If someone is already using sampling to watch their packet drops, why >>>>>> would >>>>>> they use dropwatch? >>>>>> >>>>>>> >>>>>>> So if you don't want to accept the "observational" nature of sample(), >>>>>>> the only other solution that does not bring even more confusion to OVS >>>>>>> drops would be to have userspace add explicit drop actions. WDYT? >>>>>>> >>>>>> >>>>>> These are not drops from the datapath perspective. Users can add >>>>>> explicit >>>>>> drop actions if they want to, but I'm really not sure why they would do >>>>>> that >>>>>> if they are already capturing all these packets in psample, sFlow or >>>>>> IPFIX. >>>>> >>>>> Because there is not a single "user". Tools and systems can be built on >>>>> top of tracepoints and samples and they might not be coordinated between >>>>> them. Some observability application can be always enabled and doing >>>>> constant network monitoring or statistics while other lower level tools >>>>> can be run at certain moments to troubleshoot issues. >>>>> >>>>> In order to run dropwatch in a node you don't need to have rights to >>>>> access the OpenFlow controller and ask it to change the OpenFlow rules >>>>> or else dropwatch simply will not show actual packet drops. >>>> >>>> The point is that these are not drops in this scenario. The packet was >>>> delivered to its destination and hence should not be reported as dropped. >>>> In the observability use-case that you're describing even OpenFlow layer >>>> in OVS doesn't know if these supposed to be treated as packet drops for >>>> the user or if these are just samples with the sampling being the only >>>> intended destination. For OpenFlow and OVS userspace components these >>>> two scenarios are indistinguishable. Only the OpenFlow controller knows >>>> that these rules were put in place because it was an ACL created by some >>>> user or tool. And since OVS in user space can't make such a distinction, >>>> kernel can't make it either, and so shouldn't guess what the user two >>>> levels of abstraction higher up meant. >>>> >>>>> >>>>> To me it seems obvious that drop sampling (via emit_sample) "includes" >>>>> drop reporting via emit_sample. In both cases you get the packet >>>>> headers, but in one case you also get OFP controller metadata. Now even >>>>> if there is a system that uses both, does it make sense to push to them >>>>> the responsibility of dealing with them being mutually exclusive? >>>>> >>>>> I think this makes debugging OVS datapath unnecessarily obscure when we >>>>> know the packet is actually being dropped intentionally by OVS. >>>> >>>> I don't think we know that we're in a drop sampling scenario. We don't >>>> have enough information even in OVS userspace to tell. >>>> >>>> And having different behavior between "userspace" and "emit_sample" in >>>> the kernel may cause even more confusion, because now two ways of sampling >>>> packets will result in packets showing up in dropwatch in one case, but >>>> not in the other. >>>> >>>>> >>>>> What's the problem with having OVS write the following? >>>>> "sample(50%, emit_sample()),drop(0)" >>>> >>>> It's a valid sequence of actions, but we shouldn't guess what the end >>>> user meant by putting those actions into the kernel. If we see such a >>>> sequence in the kernel, then we should report an explicit drop. If >>>> there was only the "sample(50%, emit_sample())" then we should simply >>>> consume the skb as it reached its destination in the psample. >>>> >>>> For the question if OVS in user space should put explicit drop action >>>> while preparing to emit sample, this doesn't sound reasonable for the >>>> same reason - OVS in user space doesn't know what the intention was of >>>> the user or tool that put the sampling action into OpenFlow pipeline. >>>> >>> >>> I don't see it that way. The spec says that packets whose action sets >>> (the result of classification) have no output action and no group action >>> must be dropped. Even if OFP sample action is an extension, I don't see >>> it invalidating that semantics. >>> So, IMHO, OVS does know that a flow that is just sampled is a drop. >> >> This applies to "action sets", but most users are actually using "action >> lists" supplied via "Apply-actions" OF instruction and the action sets >> always remain empty. So, from the OF perspective, strictly speaking, we >> are dropping every single packet. So, this is not a good analogy. >> >>> >>>> I actually became more confused about what are we arguing about. >>>> To recap: >>>> >>>> This patch My proposal >>>> >>>> 1. emit_sample() is the last consume consume >>>> inside the sample() >>>> >>>> 2. the end of the action list consume drop >>>> inside the sample() >>>> >>>> 3. emit_sample() is the last drop consume >>>> outside the sample() >>>> >>>> 4. the end of the action list drop drop >>>> outside the sample() >>>> >>>> 5. sample() is the last action consume consume >>>> and probability failed >>>> >>>> >>>> I don't think cases 1 and 3 should differ, i.e. the behavior should >>>> be the same regardless of emit_sample() being inside or outside of >>>> the sample(). As a side point, OVS in user space will omit the 100% >>>> rate sample() action and will just list inner actions instead. This >>>> means that 100% probability sampling will generate drops and 99% will >>>> not. Doesn't sound right. >>>> >>> >>> That's what I was refering to in the commit message, we still OVS to >>> write: >>> actions:sample(..,emit_sample(..)),drop >>> >>>> Case 2 should likely never happen, but I'd like to see a drop reported >>>> if that ever happens, because it is not a meaningful list of actions. >>>> >>>> Best regards, Ilya Maximets. >>>> >>> >>> I think we could drop this patch if we agree that OVS could write >>> explicit drops when it knows the packet is being dropped and sampled >>> (the action only has OFP sample actions). >>> >>> The drop could be placed inside the odp sample action to avoid >>> breaking the clone optimization: >>> actions:sample(50%, actions(emit_sample(),drop))) >>> >>> or outside if the sample itself is optimized out: >>> actions:emit_sample(),drop >>> >>> IIUC, if we don't do that, we are saying that sampling is incompatible >>> with decent drop reporting via kfree_skb infrastructure used by tools >>> like dropwatch or retis (among many others). And I think that is >>> unnecessarily and deliberately making OVS datapath more difficult to >>> troubleshoot. >> >> This makes some sense, so let's ensure that semantics is consistent >> within the kernel and discuss how to make the tools happy from the >> user space perspective. >> >> But we shouldn't simply drop this patch, we still need to consume the >> skb after emit_sample() when it is the last action. The same as we >> do for the userpsace() action. Though it should be done at the point >> of the action introduction. Having both actions consistent will allow >> us to solve the observability problem for both in the same way by >> adding explicit drop actions from user space. > > OK. I'll resend the series dropping this patch (and consuming the skb > apropriately).
Thanks! > >> >> On a side note: >> I wonder if probability-induced drop needs a separate reason... i.e. >> it could have been consumed by emit_smaple()/userspace() but wasn't. >> > > You mean in sample action "get_random_u32() > arg->probability"? > It only makes sense to drop it if the last action so currently uses > OVS_DROP_LAST_ACTION. Sure, but, for example: actions:sample(50%,userspace()) In 50% cases we will consume the skb, in 50% we will report a LAST_ACTION drop. Looks a little inconsistent. That's why I was saying that always consuming on probability check failure is a sane option. But if we have actions:sample(50%,userspace(),drop) Then reporting a drop makes more sense. So, I was thinking that maybe the LAST_ACTION is just not the right drop reason to report. e.g. something like OVS_DROP_SAMPLE_PROBABILITY may be more appropriate to report in both cases. Anyways, this is only kind of related to this set and may be a separate change if we decide it is needed. > >> Best regards, Ilya Maximets. >> > > Thanks for the great discussion. > Adrián > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev