On 12 Jul 2024, at 17:11, Adrián Moreno wrote:
> On Fri, Jul 12, 2024 at 04:35:04PM GMT, Eelco Chaudron wrote: >> On 12 Jul 2024, at 1:32, Adrian Moreno wrote: >> >>> When the flow translation results in a datapath action list whose last >>> action is an "observational" action, i.e: one generated for IPFIX, >>> sFlow or local sampling applications, the packet is actually going to be >>> dropped (and observed). >>> >>> In that case, add an explicit drop action so that drop statistics remain >>> accurate. >>> >>> Combine the "optimizations" and other odp_actions "tweaks" into a single >>> function. >> >> I'm running out of time to properly review (and look for alternatives), so >> I'll add some general comments, and let you and Ilya work out the details >> when I'm on PTO ;) >> >>> Signed-off-by: Adrian Moreno <amore...@redhat.com> >>> --- >>> ofproto/ofproto-dpif-xlate.c | 52 +++++++++++----- >>> ofproto/ofproto-dpif-xlate.h | 4 ++ >>> tests/drop-stats.at | 113 +++++++++++++++++++++++++++++++++++ >>> tests/ofproto-dpif.at | 47 ++++++++++++++- >>> 4 files changed, 200 insertions(+), 16 deletions(-) >>> >>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c >>> index 0a0964348..15e89e6a8 100644 >>> --- a/ofproto/ofproto-dpif-xlate.c >>> +++ b/ofproto/ofproto-dpif-xlate.c >>> @@ -3415,6 +3415,7 @@ compose_sample_action(struct xlate_ctx *ctx, >>> struct ofproto *ofproto = &ctx->xin->ofproto->up; >>> uint32_t meter_id = ofproto->slowpath_meter_id; >>> size_t cookie_offset = 0; >>> + size_t observe_offset; >>> >>> /* The meter action is only used to throttle userspace actions. >>> * If they are not needed and the sampling rate is 100%, avoid >>> generating >>> @@ -3432,6 +3433,7 @@ compose_sample_action(struct xlate_ctx *ctx, >>> } >>> >>> if (args->psample) { >>> + observe_offset = ctx->odp_actions->size; >>> odp_put_psample_action(ctx->odp_actions, >>> args->psample->group_id, >>> (void *) &args->psample->cookie, >>> @@ -3443,6 +3445,7 @@ compose_sample_action(struct xlate_ctx *ctx, >>> nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_METER, >>> meter_id); >>> } >>> >>> + observe_offset = ctx->odp_actions->size; >>> odp_port_t odp_port = ofp_port_to_odp_port( >>> ctx->xbridge, ctx->xin->flow.in_port.ofp_port); >>> uint32_t pid = dpif_port_get_pid(ctx->xbridge->dpif, odp_port); >>> @@ -3457,6 +3460,9 @@ compose_sample_action(struct xlate_ctx *ctx, >>> if (is_sample) { >>> nl_msg_end_nested(ctx->odp_actions, actions_offset); >>> nl_msg_end_nested(ctx->odp_actions, sample_offset); >>> + ctx->xout->last_observe_offset = sample_offset; >>> + } else { >>> + ctx->xout->last_observe_offset = observe_offset; >> >> Not sure why, but I do not like the idea of storing offsets in the ctx. >> Thinking out loud without any checking; We are looping through all actions >> in xlate_optimize_odp_actions() is there no easy way to figure it out there? > > I thought of this. The condition that must be met in order to > consider an action to be "observability" is: > > Its a userspace() action used for IPFIX/sFlow OR > It's a psample() action OR > It's a sample() action that inside has ( > psample action OR > userspace action usef for IPFIX/sFlow > ) > > So, we need to introspect the netlink attributes all the way to the > userspace action's cookie, decode it and look at the type to make sure > it's not, a metered controller action (which uses sample() as well). > > The resulting code was a bit ugly with all that instrospection. So I > tried this other way which I think is cleaner and more robust since > actions have to explicitly "opt-in" to be considered observational, > rather than the check happening somewhere else. > > Would you prefer the introspection alternative? I do not know what I like ;) If we have to choose between two not-so-nice approaches, the ctx option might be the cleanest. Maybe Ilya has a better idea... >> >>> } >>> >>> return cookie_offset; >>> @@ -8066,12 +8072,14 @@ xlate_wc_finish(struct xlate_ctx *ctx) >>> } >>> } >>> >>> -/* This will optimize the odp actions generated. For now, it will remove >>> - * trailing clone actions that are unnecessary. */ >>> +/* This will tweak the odp actions generated. For now, it will: >>> + * - Remove trailing clone actions that are unnecessary. >>> + * - Add an explicit drop action if the last action is observational or >>> the >>> + * action list is empty. */ >>> static void >>> -xlate_optimize_odp_actions(struct xlate_in *xin) >>> +xlate_tweak_odp_actions(struct xlate_ctx *ctx) >>> { >>> - struct ofpbuf *actions = xin->odp_actions; >>> + struct ofpbuf *actions = ctx->xin->odp_actions; >>> struct nlattr *last_action = NULL; >>> struct nlattr *a; >>> int left; >>> @@ -8085,9 +8093,16 @@ xlate_optimize_odp_actions(struct xlate_in *xin) >>> last_action = a; >>> } >>> >>> + if (!last_action) { >>> + if (ovs_explicit_drop_action_supported(ctx->xbridge->ofproto)) { >>> + put_drop_action(actions, XLATE_OK); >>> + } >>> + return; >>> + } >>> + >>> /* Remove the trailing clone() action, by directly embedding the nested >>> * actions. */ >>> - if (last_action && nl_attr_type(last_action) == OVS_ACTION_ATTR_CLONE) >>> { >>> + if (nl_attr_type(last_action) == OVS_ACTION_ATTR_CLONE) { >>> void *dest; >>> >>> nl_msg_reset_size(actions, >>> @@ -8096,6 +8111,16 @@ xlate_optimize_odp_actions(struct xlate_in *xin) >>> >>> dest = nl_msg_put_uninit(actions, nl_attr_get_size(last_action)); >>> memmove(dest, nl_attr_get(last_action), >>> nl_attr_get_size(last_action)); >>> + return; >>> + } >>> + >>> + /* If the last action of the list is an observability action, add an >>> + * explicit drop action so that drop statistics remain reliable. */ >> >> What if the sample was part of the last clone we are removing? >> > > That rare corner case is not covered. I guess I can check if the offset > is within the nested actions and adjust it accordingly, i.e: substract > (nl_attr_get(last_action) - dest). > >>> + if (ctx->xout->last_observe_offset != UINT32_MAX && >>> + (char *) last_action == (char *) actions->data + >> >> Change to unsigned char same as existing code? > > Sure, I don't mind. As long as pointer arithmetics work :-) > >> >>> + ctx->xout->last_observe_offset && >>> + ovs_explicit_drop_action_supported(ctx->xbridge->ofproto)) { >>> + put_drop_action(actions, XLATE_OK); >>> } >>> } >>> >>> @@ -8113,6 +8138,7 @@ xlate_actions(struct xlate_in *xin, struct xlate_out >>> *xout) >>> *xout = (struct xlate_out) { >>> .slow = 0, >>> .recircs = RECIRC_REFS_EMPTY_INITIALIZER, >>> + .last_observe_offset = UINT32_MAX, >>> }; >>> >>> struct xlate_cfg *xcfg = ovsrcu_get(struct xlate_cfg *, &xcfgp); >>> @@ -8541,17 +8567,15 @@ exit: >>> xout->slow = 0; >>> if (xin->odp_actions) { >>> ofpbuf_clear(xin->odp_actions); >>> + /* Make the drop explicit if the datapath supports it. */ >>> + if (ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) { >>> + put_drop_action(xin->odp_actions, ctx.error); >>> + } >>> } >>> } else { >>> - /* In the non-error case, see if we can further optimize the >>> datapath >>> - * rules by removing redundant (clone) actions. */ >>> - xlate_optimize_odp_actions(xin); >>> - } >>> - >>> - /* Install drop action if datapath supports explicit drop action. */ >>> - if (xin->odp_actions && !xin->odp_actions->size && >>> - ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) { >> >> If you make this configurable, please ensure the current behavior is >> maintained. >> > > Yes. It will be an opt-in. > >>> - put_drop_action(xin->odp_actions, ctx.error); >>> + /* In the non-error case, see if we can further optimize or tweak >>> + * datapath actions. */ >>> + xlate_tweak_odp_actions(&ctx); >>> } >>> >>> /* Since congestion drop and forwarding drop are not exactly >>> diff --git a/ofproto/ofproto-dpif-xlate.h b/ofproto/ofproto-dpif-xlate.h >>> index 08f9397d8..d973a634a 100644 >>> --- a/ofproto/ofproto-dpif-xlate.h >>> +++ b/ofproto/ofproto-dpif-xlate.h >>> @@ -61,6 +61,10 @@ struct xlate_out { >>> >>> /* Recirc action IDs on which references are held. */ >>> struct recirc_refs recircs; >>> + >>> + /* Keep track of the last action whose purpose is purely observational. >>> + * e.g: IPFIX, sFlow, local sampling. */ >>> + uint32_t last_observe_offset; >>> }; >>> >>> struct xlate_in { >>> diff --git a/tests/drop-stats.at b/tests/drop-stats.at >>> index 1d3af98da..820ca9c06 100644 >>> --- a/tests/drop-stats.at >>> +++ b/tests/drop-stats.at >>> @@ -191,3 +191,116 @@ ovs-appctl coverage/read-counter >>> drop_action_too_many_mpls_labels >>> >>> OVS_VSWITCHD_STOP(["/|WARN|/d"]) >>> AT_CLEANUP >>> + >>> +AT_SETUP([drop-stats - bridge sampling]) >>> + >>> +OVS_VSWITCHD_START([dnl >>> + set bridge br0 datapath_type=dummy \ >>> + protocols=OpenFlow10,OpenFlow13,OpenFlow14,OpenFlow15 -- \ >>> + add-port br0 p1 -- set Interface p1 type=dummy ofport_request=1]) >>> + >>> +AT_DATA([flows.txt], [dnl >>> +table=0,in_port=1,actions=drop >>> +]) >>> + >>> +AT_CHECK([ >>> + ovs-ofctl del-flows br0 >>> + ovs-ofctl -Oopenflow13 add-flows br0 flows.txt >>> + ovs-ofctl -Oopenflow13 dump-flows br0 | ofctl_strip | sort | grep >>> actions ], [0], [dnl >>> + in_port=1 actions=drop >>> +]) >>> + >>> +AT_CHECK([ovs-vsctl -- set bridge br0 ipfix=@fix -- \ >>> + --id=@fix create ipfix targets=\"127.0.0.1:4739\" \ >>> + sampling=1], >>> + [0], [ignore]) >>> + >>> +for i in $(seq 1 3); do >>> +AT_CHECK([ >>> + ovs-appctl netdev-dummy/receive p1 >>> 'in_port(1),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),ipv4(src=192.168.10.10,dst=192.168.10.30,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' >>> +], [0], [ignore]) >>> +done >>> + >>> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed >>> 's/used:[[0-9]].[[0-9]]*s/used:0.0/' | sort], [0], [flow-dump from the main >>> thread: >>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), >>> packets:2, bytes:212, used:0.0, >>> actions:userspace(pid=0,ipfix(output_port=4294967295)),drop >>> +]) >>> + >>> +ovs-appctl time/warp 5000 >> >> There is a mix of bare and AT_CHECK() time/warps. Ilya what is the preferred >> way? My preference will be: >> >> AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore]) >> >>> + >>> +AT_CHECK([ >>> +ovs-appctl coverage/read-counter drop_action_of_pipeline >>> +], [0], [dnl >>> +3 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP(["/sending to collector failed/d"]) >>> + >>> +AT_CLEANUP >>> +AT_SETUP([drop-stats - sampling action]) >> >> OVS_VSWITCHD_STOP(["/sending to collector failed/d"]) >> AT_CLEANUP >> >> AT_SETUP([drop-stats - sampling action]) >> >>> + >>> +OVS_VSWITCHD_START([dnl >>> + set bridge br0 datapath_type=dummy \ >>> + protocols=OpenFlow10,OpenFlow13,OpenFlow14,OpenFlow15 -- \ >>> + add-port br0 p1 -- set Interface p1 type=dummy ofport_request=1 -- \ >>> + add-port br0 p2 -- set Interface p2 type=dummy ofport_request=2]) >>> + >>> +AT_DATA([flows.txt], [dnl >>> +table=0,in_port=1,actions=sample(probability=65535,collector_set_id=1) >>> +table=0,in_port=2,actions=sample(probability=32767,collector_set_id=1),load:0->reg0 >>> +]) >>> + >>> +AT_CHECK([ >>> + ovs-ofctl del-flows br0 >>> + ovs-ofctl -Oopenflow13 add-flows br0 flows.txt >>> +]) >>> + >>> +AT_CHECK([ovs-vsctl --id=@br0 get Bridge br0 \ >>> + -- --id=@ipfix create IPFIX targets=\"127.0.0.1:4739\" >>> \ >>> + -- create Flow_Sample_Collector_Set id=1 bridge=@br0 \ >>> + ipfix=@ipfix], >>> + [0], [ignore]) >>> + >>> +for i in $(seq 1 3); do >>> +AT_CHECK([ >>> + ovs-appctl netdev-dummy/receive p1 \ >>> + >>> 'in_port(1),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),ipv4(src=192.168.10.10,dst=192.168.10.30,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' >>> +], [0], [ignore]) >>> +done >>> + >>> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed >>> 's/used:[[0-9]].[[0-9]]*s/used:0.0/' | sort], [0], [flow-dump from the main >>> thread: >>> +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), >>> packets:2, bytes:212, used:0.0, dnl >>> +actions:userspace(pid=0,flow_sample(probability=65535,collector_set_id=1,obs_domain_id=0,obs_point_id=0,output_port=4294967295)),drop >>> +]) >>> + >>> +ovs-appctl time/warp 5000 >>> + >>> +AT_CHECK([ >>> +ovs-appctl coverage/read-counter drop_action_of_pipeline >>> +], [0], [dnl >>> +3 >>> +]) >> >> Maybe (multiple times) >> >> AT_CHECK([ovs-appctl coverage/read-counter drop_action_of_pipeline], >> [0], [dnl >> 3 >> ]) >> >>> + >>> +AT_CHECK([ovs-appctl dpctl/del-flows]) >>> + >>> +for i in $(seq 1 3); do >>> +AT_CHECK([ >>> + ovs-appctl netdev-dummy/receive p2 \ >>> + >>> 'in_port(2),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),ipv4(src=192.168.10.10,dst=192.168.10.30,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)' >>> +], [0], [ignore]) >>> +done >>> + >>> +AT_CHECK([ovs-appctl dpctl/dump-flows | sed >>> 's/used:[[0-9]].[[0-9]]*s/used:0.0/' | sort], [0], [flow-dump from the main >>> thread: >>> +recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no), >>> packets:2, bytes:212, used:0.0, dnl >>> +actions:sample(sample=50.0%,actions(userspace(pid=0,flow_sample(probability=32767,collector_set_id=1,obs_domain_id=0,obs_point_id=0,output_port=4294967295)))),drop >>> +]) >>> + >>> +ovs-appctl time/warp 5000 >>> + >>> +AT_CHECK([ >>> +ovs-appctl coverage/read-counter drop_action_of_pipeline >>> +], [0], [dnl >>> +6 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP(["/sending to collector failed/d"]) >>> +AT_CLEANUP >>> diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at >>> index 5edb22fb5..c94dca9b1 100644 >>> --- a/tests/ofproto-dpif.at >>> +++ b/tests/ofproto-dpif.at >>> @@ -8082,7 +8082,7 @@ for i in `seq 1 3`; do >>> done >>> AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*\(packets:\)/\1/' | sed >>> 's/used:[[0-9]].[[0-9]]*s/used:0.001s/'], [0], [dnl >>> flow-dump from the main thread: >>> -packets:2, bytes:68, used:0.001s, >>> actions:userspace(pid=0,ipfix(output_port=4294967295)) >>> +packets:2, bytes:68, used:0.001s, >>> actions:userspace(pid=0,ipfix(output_port=4294967295)),drop >>> ]) >>> >>> AT_CHECK([ovs-appctl revalidator/purge]) >>> @@ -8118,7 +8118,7 @@ for i in `seq 1 3`; do >>> done >>> AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*\(packets:\)/\1/' | sed >>> 's/used:[[0-9]].[[0-9]]*s/used:0.001s/'], [0], [dnl >>> flow-dump from the main thread: >>> -packets:2, bytes:68, used:0.001s, >>> actions:sample(sample=100.0%,actions(meter(0),userspace(pid=0,ipfix(output_port=4294967295)))) >>> +packets:2, bytes:68, used:0.001s, >>> actions:sample(sample=100.0%,actions(meter(0),userspace(pid=0,ipfix(output_port=4294967295)))),drop >>> ]) >>> >>> dnl Remove the IPFIX configuration. >>> @@ -12295,3 +12295,46 @@ Datapath actions: EXPECTED_ACT >>> >>> OVS_VSWITCHD_STOP("/Enabling an unsupported feature is very dangerous/d") >>> AT_CLEANUP >>> + >>> +AT_SETUP([ofproto-dpif - Local sampling - drop]) >>> +OVS_VSWITCHD_START >>> +add_of_ports br0 1 2 >>> + >>> +AT_CHECK([ovs-appctl dpif/set-dp-features --force br0 psample true], [0], >>> [ignore]) >>> + >>> +AT_CHECK([ovs-vsctl -- --id=@br0 get Bridge br0 \ >>> + -- create Flow_Sample_Collector_Set id=1 bridge=@br0 >>> local_group_id=42], >>> + [0], [ignore]) >>> + >>> +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=slowpath pktps >>> stats bands=type=drop rate=2']) >>> + >>> +AT_DATA([flows.txt], [dnl >>> +in_port=1, >>> actions=sample(probability=32767,collector_set_id=1,obs_domain_id=100,obs_point_id=200) >>> +in_port=2, >>> actions=sample(probability=65535,collector_set_id=1,obs_domain_id=100,obs_point_id=200) >>> +]) >>> + >>> +AT_CHECK([ovs-ofctl --protocols=OpenFlow10 add-flows br0 flows.txt]) >>> + >>> +m4_define([TRACE_PKT], [m4_join([,], >>> + [eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800)], >>> + [ipv4(src=10.10.10.2,dst=10.10.10.1,proto=1,tos=1,ttl=128,frag=no)], >>> + [icmp(type=8,code=0)])]) >>> + >>> +m4_define([EXPECTED_ACT1], [m4_join([], >>> + [sample(sample=50.0%,actions(], >>> + [psample(group=42,cookie=0x64000000c8))),], >>> + [drop], >>> +)]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1) TRACE_PKT'], [0], >>> [stdout]) >>> +AT_CHECK([tail -1 stdout], [0], [dnl >>> +Datapath actions: EXPECTED_ACT1 >>> +]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2) TRACE_PKT'], [0], >>> [stdout]) >>> +AT_CHECK([tail -1 stdout], [0], [dnl >>> +Datapath actions: psample(group=42,cookie=0x64000000c8),drop >>> +]) >> >> Maybe a test case for a last action inside a clone, i.e. >> >> clone(psample(group=42,cookie=0x64000000c8)) --> >> psample(group=42,cookie=0x64000000c8),drop >> > > Ack. > >>> + >>> +OVS_VSWITCHD_STOP("/Enabling an unsupported feature is very dangerous/d") >>> +AT_CLEANUP >>> -- >>> 2.45.2 >> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev