On Fri, Jul 12, 2024 at 04:35:04PM GMT, Eelco Chaudron wrote:
> On 12 Jul 2024, at 1:32, Adrian Moreno wrote:
>
> > When the flow translation results in a datapath action list whose last
> > action is an "observational" action, i.e: one generated for IPFIX,
> > sFlow or local sampling applications, the packet is actually going to be
> > dropped (and observed).
> >
> > In that case, add an explicit drop action so that drop statistics remain
> > accurate.
> >
> > Combine the "optimizations" and other odp_actions "tweaks" into a single
> > function.
>
> I'm running out of time to properly review (and look for alternatives), so 
> I'll add some general comments, and let you and Ilya work out the details 
> when I'm on PTO ;)
>
> > Signed-off-by: Adrian Moreno <amore...@redhat.com>
> > ---
> >  ofproto/ofproto-dpif-xlate.c |  52 +++++++++++-----
> >  ofproto/ofproto-dpif-xlate.h |   4 ++
> >  tests/drop-stats.at          | 113 +++++++++++++++++++++++++++++++++++
> >  tests/ofproto-dpif.at        |  47 ++++++++++++++-
> >  4 files changed, 200 insertions(+), 16 deletions(-)
> >
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index 0a0964348..15e89e6a8 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -3415,6 +3415,7 @@ compose_sample_action(struct xlate_ctx *ctx,
> >      struct ofproto *ofproto = &ctx->xin->ofproto->up;
> >      uint32_t meter_id = ofproto->slowpath_meter_id;
> >      size_t cookie_offset = 0;
> > +    size_t observe_offset;
> >
> >      /* The meter action is only used to throttle userspace actions.
> >       * If they are not needed and the sampling rate is 100%, avoid 
> > generating
> > @@ -3432,6 +3433,7 @@ compose_sample_action(struct xlate_ctx *ctx,
> >      }
> >
> >      if (args->psample) {
> > +        observe_offset = ctx->odp_actions->size;
> >          odp_put_psample_action(ctx->odp_actions,
> >                                 args->psample->group_id,
> >                                 (void *) &args->psample->cookie,
> > @@ -3443,6 +3445,7 @@ compose_sample_action(struct xlate_ctx *ctx,
> >              nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_METER, 
> > meter_id);
> >          }
> >
> > +        observe_offset = ctx->odp_actions->size;
> >          odp_port_t odp_port = ofp_port_to_odp_port(
> >              ctx->xbridge, ctx->xin->flow.in_port.ofp_port);
> >          uint32_t pid = dpif_port_get_pid(ctx->xbridge->dpif, odp_port);
> > @@ -3457,6 +3460,9 @@ compose_sample_action(struct xlate_ctx *ctx,
> >      if (is_sample) {
> >          nl_msg_end_nested(ctx->odp_actions, actions_offset);
> >          nl_msg_end_nested(ctx->odp_actions, sample_offset);
> > +        ctx->xout->last_observe_offset = sample_offset;
> > +    } else {
> > +        ctx->xout->last_observe_offset = observe_offset;
>
> Not sure why, but I do not like the idea of storing offsets in the ctx. 
> Thinking out loud without any checking; We are looping through all actions in 
> xlate_optimize_odp_actions() is there no easy way to figure it out there?

I thought of this.  The condition that must be met in order to
consider an action to be "observability" is:

Its a userspace() action used for IPFIX/sFlow OR
It's a psample() action OR
It's a sample() action that inside has (
    psample action OR
    userspace action usef for IPFIX/sFlow
)

So, we need to introspect the netlink attributes all the way to the
userspace action's cookie, decode it and look at the type to make sure
it's not, a metered controller action (which uses sample() as well).

The resulting code was a bit ugly with all that instrospection. So I
tried this other way which I think is cleaner and more robust since
actions have to explicitly "opt-in" to be considered observational,
rather than the check happening somewhere else.

Would you prefer the introspection alternative?

>
> >      }
> >
> >      return cookie_offset;
> > @@ -8066,12 +8072,14 @@ xlate_wc_finish(struct xlate_ctx *ctx)
> >      }
> >  }
> >
> > -/* This will optimize the odp actions generated. For now, it will remove
> > - * trailing clone actions that are unnecessary. */
> > +/* This will tweak the odp actions generated. For now, it will:
> > + *  - Remove trailing clone actions that are unnecessary.
> > + *  - Add an explicit drop action if the last action is observational or 
> > the
> > + *    action list is empty. */
> >  static void
> > -xlate_optimize_odp_actions(struct xlate_in *xin)
> > +xlate_tweak_odp_actions(struct xlate_ctx *ctx)
> >  {
> > -    struct ofpbuf *actions = xin->odp_actions;
> > +    struct ofpbuf *actions = ctx->xin->odp_actions;
> >      struct nlattr *last_action = NULL;
> >      struct nlattr *a;
> >      int left;
> > @@ -8085,9 +8093,16 @@ xlate_optimize_odp_actions(struct xlate_in *xin)
> >          last_action = a;
> >      }
> >
> > +    if (!last_action) {
> > +        if (ovs_explicit_drop_action_supported(ctx->xbridge->ofproto)) {
> > +            put_drop_action(actions, XLATE_OK);
> > +        }
> > +        return;
> > +    }
> > +
> >      /* Remove the trailing clone() action, by directly embedding the nested
> >       * actions. */
> > -    if (last_action && nl_attr_type(last_action) == OVS_ACTION_ATTR_CLONE) 
> > {
> > +    if (nl_attr_type(last_action) == OVS_ACTION_ATTR_CLONE) {
> >          void *dest;
> >
> >          nl_msg_reset_size(actions,
> > @@ -8096,6 +8111,16 @@ xlate_optimize_odp_actions(struct xlate_in *xin)
> >
> >          dest = nl_msg_put_uninit(actions, nl_attr_get_size(last_action));
> >          memmove(dest, nl_attr_get(last_action), 
> > nl_attr_get_size(last_action));
> > +        return;
> > +    }
> > +
> > +    /* If the last action of the list is an observability action, add an
> > +     * explicit drop action so that drop statistics remain reliable. */
>
> What if the sample was part of the last clone we are removing?
>

That rare corner case is not covered. I guess I can check if the offset
is within the nested actions and adjust it accordingly, i.e: substract
(nl_attr_get(last_action) - dest).

> > +    if (ctx->xout->last_observe_offset != UINT32_MAX &&
> > +        (char *) last_action == (char *) actions->data +
>
> Change to unsigned char same as existing code?

Sure, I don't mind. As long as pointer arithmetics work :-)

>
> > +                                 ctx->xout->last_observe_offset &&
> > +        ovs_explicit_drop_action_supported(ctx->xbridge->ofproto)) {
> > +            put_drop_action(actions, XLATE_OK);
> >      }
> >  }
> >
> > @@ -8113,6 +8138,7 @@ xlate_actions(struct xlate_in *xin, struct xlate_out 
> > *xout)
> >      *xout = (struct xlate_out) {
> >          .slow = 0,
> >          .recircs = RECIRC_REFS_EMPTY_INITIALIZER,
> > +        .last_observe_offset = UINT32_MAX,
> >      };
> >
> >      struct xlate_cfg *xcfg = ovsrcu_get(struct xlate_cfg *, &xcfgp);
> > @@ -8541,17 +8567,15 @@ exit:
> >          xout->slow = 0;
> >          if (xin->odp_actions) {
> >              ofpbuf_clear(xin->odp_actions);
> > +            /* Make the drop explicit if the datapath supports it. */
> > +            if (ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) {
> > +                put_drop_action(xin->odp_actions, ctx.error);
> > +            }
> >          }
> >      } else {
> > -        /* In the non-error case, see if we can further optimize the 
> > datapath
> > -         * rules by removing redundant (clone) actions. */
> > -        xlate_optimize_odp_actions(xin);
> > -    }
> > -
> > -    /* Install drop action if datapath supports explicit drop action. */
> > -    if (xin->odp_actions && !xin->odp_actions->size &&
> > -        ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) {
>
> If you make this configurable, please ensure the current behavior is 
> maintained.
>

Yes. It will be an opt-in.

> > -        put_drop_action(xin->odp_actions, ctx.error);
> > +        /* In the non-error case, see if we can further optimize or tweak
> > +         * datapath actions. */
> > +        xlate_tweak_odp_actions(&ctx);
> >      }
> >
> >      /* Since congestion drop and forwarding drop are not exactly
> > diff --git a/ofproto/ofproto-dpif-xlate.h b/ofproto/ofproto-dpif-xlate.h
> > index 08f9397d8..d973a634a 100644
> > --- a/ofproto/ofproto-dpif-xlate.h
> > +++ b/ofproto/ofproto-dpif-xlate.h
> > @@ -61,6 +61,10 @@ struct xlate_out {
> >
> >      /* Recirc action IDs on which references are held. */
> >      struct recirc_refs recircs;
> > +
> > +    /* Keep track of the last action whose purpose is purely observational.
> > +     * e.g: IPFIX, sFlow, local sampling. */
> > +    uint32_t last_observe_offset;
> >  };
> >
> >  struct xlate_in {
> > diff --git a/tests/drop-stats.at b/tests/drop-stats.at
> > index 1d3af98da..820ca9c06 100644
> > --- a/tests/drop-stats.at
> > +++ b/tests/drop-stats.at
> > @@ -191,3 +191,116 @@ ovs-appctl coverage/read-counter 
> > drop_action_too_many_mpls_labels
> >
> >  OVS_VSWITCHD_STOP(["/|WARN|/d"])
> >  AT_CLEANUP
> > +
> > +AT_SETUP([drop-stats - bridge sampling])
> > +
> > +OVS_VSWITCHD_START([dnl
> > +    set bridge br0 datapath_type=dummy \
> > +        protocols=OpenFlow10,OpenFlow13,OpenFlow14,OpenFlow15 -- \
> > +    add-port br0 p1 -- set Interface p1 type=dummy ofport_request=1])
> > +
> > +AT_DATA([flows.txt], [dnl
> > +table=0,in_port=1,actions=drop
> > +])
> > +
> > +AT_CHECK([
> > +    ovs-ofctl del-flows br0
> > +    ovs-ofctl -Oopenflow13 add-flows br0 flows.txt
> > +    ovs-ofctl -Oopenflow13 dump-flows br0 | ofctl_strip | sort | grep 
> > actions ], [0], [dnl
> > + in_port=1 actions=drop
> > +])
> > +
> > +AT_CHECK([ovs-vsctl -- set bridge br0 ipfix=@fix -- \
> > +                    --id=@fix create ipfix targets=\"127.0.0.1:4739\" \
> > +                      sampling=1],
> > +         [0], [ignore])
> > +
> > +for i in $(seq 1 3); do
> > +AT_CHECK([
> > +    ovs-appctl netdev-dummy/receive p1 
> > 'in_port(1),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),ipv4(src=192.168.10.10,dst=192.168.10.30,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'
> > +], [0], [ignore])
> > +done
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 
> > 's/used:[[0-9]].[[0-9]]*s/used:0.0/' | sort], [0], [flow-dump from the main 
> > thread:
> > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
> >  packets:2, bytes:212, used:0.0, 
> > actions:userspace(pid=0,ipfix(output_port=4294967295)),drop
> > +])
> > +
> > +ovs-appctl time/warp 5000
>
> There is a mix of bare and AT_CHECK() time/warps. Ilya what is the preferred 
> way? My preference will be:
>
> AT_CHECK([ovs-appctl time/warp 5000], [0], [ignore])
>
> > +
> > +AT_CHECK([
> > +ovs-appctl coverage/read-counter drop_action_of_pipeline
> > +], [0], [dnl
> > +3
> > +])
> > +
> > +OVS_VSWITCHD_STOP(["/sending to collector failed/d"])
> > +
> > +AT_CLEANUP
> > +AT_SETUP([drop-stats - sampling action])
>
> OVS_VSWITCHD_STOP(["/sending to collector failed/d"])
> AT_CLEANUP
>
> AT_SETUP([drop-stats - sampling action])
>
> > +
> > +OVS_VSWITCHD_START([dnl
> > +    set bridge br0 datapath_type=dummy \
> > +        protocols=OpenFlow10,OpenFlow13,OpenFlow14,OpenFlow15 -- \
> > +    add-port br0 p1 -- set Interface p1 type=dummy ofport_request=1 -- \
> > +    add-port br0 p2 -- set Interface p2 type=dummy ofport_request=2])
> > +
> > +AT_DATA([flows.txt], [dnl
> > +table=0,in_port=1,actions=sample(probability=65535,collector_set_id=1)
> > +table=0,in_port=2,actions=sample(probability=32767,collector_set_id=1),load:0->reg0
> > +])
> > +
> > +AT_CHECK([
> > +    ovs-ofctl del-flows br0
> > +    ovs-ofctl -Oopenflow13 add-flows br0 flows.txt
> > +])
> > +
> > +AT_CHECK([ovs-vsctl --id=@br0 get Bridge br0 \
> > +                    -- --id=@ipfix create IPFIX targets=\"127.0.0.1:4739\" 
> > \
> > +                    -- create Flow_Sample_Collector_Set id=1 bridge=@br0 \
> > +                       ipfix=@ipfix],
> > +         [0], [ignore])
> > +
> > +for i in $(seq 1 3); do
> > +AT_CHECK([
> > +    ovs-appctl netdev-dummy/receive p1 \
> > +    
> > 'in_port(1),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),ipv4(src=192.168.10.10,dst=192.168.10.30,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'
> > +], [0], [ignore])
> > +done
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 
> > 's/used:[[0-9]].[[0-9]]*s/used:0.0/' | sort], [0], [flow-dump from the main 
> > thread:
> > +recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
> >  packets:2, bytes:212, used:0.0, dnl
> > +actions:userspace(pid=0,flow_sample(probability=65535,collector_set_id=1,obs_domain_id=0,obs_point_id=0,output_port=4294967295)),drop
> > +])
> > +
> > +ovs-appctl time/warp 5000
> > +
> > +AT_CHECK([
> > +ovs-appctl coverage/read-counter drop_action_of_pipeline
> > +], [0], [dnl
> > +3
> > +])
>
> Maybe (multiple times)
>
> AT_CHECK([ovs-appctl coverage/read-counter drop_action_of_pipeline],
>          [0], [dnl
> 3
> ])
>
> > +
> > +AT_CHECK([ovs-appctl dpctl/del-flows])
> > +
> > +for i in $(seq 1 3); do
> > +AT_CHECK([
> > +    ovs-appctl netdev-dummy/receive p2 \
> > +    
> > 'in_port(2),packet_type(ns=0,id=0),eth(src=3a:6d:d2:09:9c:ab,dst=1e:2c:e9:2a:66:9e),ipv4(src=192.168.10.10,dst=192.168.10.30,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0)'
> > +], [0], [ignore])
> > +done
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-flows | sed 
> > 's/used:[[0-9]].[[0-9]]*s/used:0.0/' | sort], [0], [flow-dump from the main 
> > thread:
> > +recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(frag=no),
> >  packets:2, bytes:212, used:0.0, dnl
> > +actions:sample(sample=50.0%,actions(userspace(pid=0,flow_sample(probability=32767,collector_set_id=1,obs_domain_id=0,obs_point_id=0,output_port=4294967295)))),drop
> > +])
> > +
> > +ovs-appctl time/warp 5000
> > +
> > +AT_CHECK([
> > +ovs-appctl coverage/read-counter drop_action_of_pipeline
> > +], [0], [dnl
> > +6
> > +])
> > +
> > +OVS_VSWITCHD_STOP(["/sending to collector failed/d"])
> > +AT_CLEANUP
> > diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
> > index 5edb22fb5..c94dca9b1 100644
> > --- a/tests/ofproto-dpif.at
> > +++ b/tests/ofproto-dpif.at
> > @@ -8082,7 +8082,7 @@ for i in `seq 1 3`; do
> >  done
> >  AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*\(packets:\)/\1/' | sed 
> > 's/used:[[0-9]].[[0-9]]*s/used:0.001s/'], [0], [dnl
> >  flow-dump from the main thread:
> > -packets:2, bytes:68, used:0.001s, 
> > actions:userspace(pid=0,ipfix(output_port=4294967295))
> > +packets:2, bytes:68, used:0.001s, 
> > actions:userspace(pid=0,ipfix(output_port=4294967295)),drop
> >  ])
> >
> >  AT_CHECK([ovs-appctl revalidator/purge])
> > @@ -8118,7 +8118,7 @@ for i in `seq 1 3`; do
> >  done
> >  AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*\(packets:\)/\1/' | sed 
> > 's/used:[[0-9]].[[0-9]]*s/used:0.001s/'], [0], [dnl
> >  flow-dump from the main thread:
> > -packets:2, bytes:68, used:0.001s, 
> > actions:sample(sample=100.0%,actions(meter(0),userspace(pid=0,ipfix(output_port=4294967295))))
> > +packets:2, bytes:68, used:0.001s, 
> > actions:sample(sample=100.0%,actions(meter(0),userspace(pid=0,ipfix(output_port=4294967295)))),drop
> >  ])
> >
> >  dnl Remove the IPFIX configuration.
> > @@ -12295,3 +12295,46 @@ Datapath actions: EXPECTED_ACT
> >
> >  OVS_VSWITCHD_STOP("/Enabling an unsupported feature is very dangerous/d")
> >  AT_CLEANUP
> > +
> > +AT_SETUP([ofproto-dpif - Local sampling - drop])
> > +OVS_VSWITCHD_START
> > +add_of_ports br0 1 2
> > +
> > +AT_CHECK([ovs-appctl dpif/set-dp-features --force br0 psample true], [0], 
> > [ignore])
> > +
> > +AT_CHECK([ovs-vsctl -- --id=@br0 get Bridge br0 \
> > +                    -- create Flow_Sample_Collector_Set id=1 bridge=@br0 
> > local_group_id=42],
> > +         [0], [ignore])
> > +
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=slowpath pktps 
> > stats bands=type=drop rate=2'])
> > +
> > +AT_DATA([flows.txt], [dnl
> > +in_port=1, 
> > actions=sample(probability=32767,collector_set_id=1,obs_domain_id=100,obs_point_id=200)
> > +in_port=2, 
> > actions=sample(probability=65535,collector_set_id=1,obs_domain_id=100,obs_point_id=200)
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --protocols=OpenFlow10 add-flows br0 flows.txt])
> > +
> > +m4_define([TRACE_PKT], [m4_join([,],
> > +    [eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800)],
> > +    [ipv4(src=10.10.10.2,dst=10.10.10.1,proto=1,tos=1,ttl=128,frag=no)],
> > +    [icmp(type=8,code=0)])])
> > +
> > +m4_define([EXPECTED_ACT1], [m4_join([],
> > +    [sample(sample=50.0%,actions(],
> > +    [psample(group=42,cookie=0x64000000c8))),],
> > +    [drop],
> > +)])
> > +
> > +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1) TRACE_PKT'], [0], 
> > [stdout])
> > +AT_CHECK([tail -1 stdout], [0], [dnl
> > +Datapath actions: EXPECTED_ACT1
> > +])
> > +
> > +AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(2) TRACE_PKT'], [0], 
> > [stdout])
> > +AT_CHECK([tail -1 stdout], [0], [dnl
> > +Datapath actions: psample(group=42,cookie=0x64000000c8),drop
> > +])
>
> Maybe a test case for a last action inside a clone, i.e.
>
>   clone(psample(group=42,cookie=0x64000000c8)) --> 
> psample(group=42,cookie=0x64000000c8),drop
>

Ack.

> > +
> > +OVS_VSWITCHD_STOP("/Enabling an unsupported feature is very dangerous/d")
> > +AT_CLEANUP
> > --
> > 2.45.2
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to