Re: [ovs-dev] [PATCH v6] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule

Ilya Maximets Wed, 30 Aug 2023 14:05:17 -0700

On 8/28/23 22:00, Mike Pattrick wrote:
> When the a revalidator thread is updating statistics for an XC_LEARN
> xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn.
> The revalidator will update stats for rules even if they are in a
> removed state or marked as invisible. However, ofproto_flow_mod_learn
> will detect if a flow has been removed and re-add it in that case. This
> can result in an old learn action replacing the new learn action that
> had replaced it in the first place.
> 
> This change adds a new stats_used parameter to ofproto_flow_mod_learn


Nit: The name changed in the code.

> allowing the caller to provide a timestamp that will be fed into the
> learned rule's modified time. The provided timestamp should be the time
> of the last packet activity. If stats_used is not set then the current
> time is used, as is the current behaviour.
> 
> This change also adds a check when replacing a learned rule to favour the
> newest rule.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2213892
> Signed-off-by: Mike Pattrick <[email protected]>
> ---
>  v2: Added additional checks if rule is removed
>  v3: v2 patch was corrupted in transit
>  v4: Added check against dpif flow stats
>  v5: Fixed typos, updated commit message
>      Changed timestamps to use datapath timestamp more consistently
>  v6: Added a unit test, changed some variable names and comments,
>      zero is no longer an acceptable value
> 
> Signed-off-by: Mike Pattrick <[email protected]>
> ---
>  ofproto/ofproto-dpif-xlate-cache.c |  2 +-
>  ofproto/ofproto-dpif-xlate.c       | 10 +++-
>  ofproto/ofproto-dpif.c             |  2 +-
>  ofproto/ofproto-provider.h         |  6 ++-
>  ofproto/ofproto.c                  | 53 ++++++++++++++++----
>  tests/learn.at                     | 77 ++++++++++++++++++++++++++++++
>  6 files changed, 136 insertions(+), 14 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-xlate-cache.c 
> b/ofproto/ofproto-dpif-xlate-cache.c
> index 9224ee2e6..2e1fcb3a6 100644
> --- a/ofproto/ofproto-dpif-xlate-cache.c
> +++ b/ofproto/ofproto-dpif-xlate-cache.c
> @@ -125,7 +125,7 @@ xlate_push_stats_entry(struct xc_entry *entry,
>      case XC_LEARN: {
>          enum ofperr error;
>          error = ofproto_flow_mod_learn(entry->learn.ofm, true,
> -                                       entry->learn.limit, NULL);
> +                                       entry->learn.limit, NULL, 
> stats->used);
>          if (error) {
>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>              VLOG_WARN_RL(&rl, "xcache LEARN action execution failed.");
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 47ea0f47e..d608a5f25 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -5700,8 +5700,16 @@ xlate_learn_action(struct xlate_ctx *ctx, const struct 
> ofpact_learn *learn)
>          if (!error) {
>              bool success = true;
>              if (ctx->xin->allow_side_effects) {
> +                long long int last_used;
> +
> +                if (ctx->xin->resubmit_stats) {
> +                    last_used = ctx->xin->resubmit_stats->used;
> +                } else {
> +                    last_used = time_msec();
> +                }
>                  error = ofproto_flow_mod_learn(ofm, ctx->xin->xcache != NULL,
> -                                               learn->limit, &success);
> +                                               learn->limit, &success,
> +                                               last_used);
>              } else if (learn->limit) {
>                  if (!ofm->temp_rule
>                      || ofm->temp_rule->state != RULE_INSERTED) {
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index e22ca757a..ba5706f6a 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -4880,7 +4880,7 @@ packet_xlate(struct ofproto *ofproto_, struct 
> ofproto_packet_out *opo)
>              if (entry->type == XC_LEARN) {
>                  struct ofproto_flow_mod *ofm = entry->learn.ofm;
>  
> -                error = ofproto_flow_mod_learn_refresh(ofm);
> +                error = ofproto_flow_mod_learn_refresh(ofm, time_msec());
>                  if (error) {
>                      goto error_out;
>                  }
> diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
> index 143ded690..9f7b8b6e8 100644
> --- a/ofproto/ofproto-provider.h
> +++ b/ofproto/ofproto-provider.h
> @@ -2027,9 +2027,11 @@ enum ofperr ofproto_flow_mod_init_for_learn(struct 
> ofproto *,
>                                              struct ofproto_flow_mod *)
>      OVS_EXCLUDED(ofproto_mutex);
>  enum ofperr ofproto_flow_mod_learn(struct ofproto_flow_mod *, bool keep_ref,
> -                                   unsigned limit, bool *below_limit)
> +                                   unsigned limit, bool *below_limit,
> +                                   long long int last_used)
>      OVS_EXCLUDED(ofproto_mutex);
> -enum ofperr ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm);
> +enum ofperr ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm,
> +                                           long long int last_used);
>  enum ofperr ofproto_flow_mod_learn_start(struct ofproto_flow_mod *ofm)
>      OVS_REQUIRES(ofproto_mutex);
>  void ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm)
> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> index dbf4958bc..444e15a11 100644
> --- a/ofproto/ofproto.c
> +++ b/ofproto/ofproto.c
> @@ -5472,7 +5472,8 @@ ofproto_flow_mod_init_for_learn(struct ofproto *ofproto,
>  }
>  
>  enum ofperr
> -ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm)
> +ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm,
> +                               long long int last_used)
>  {
>      enum ofperr error = 0;
>  
> @@ -5493,9 +5494,33 @@ ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod 
> *ofm)
>       * this function is executed the rule will be reinstated. */
>      if (rule->state == RULE_REMOVED) {
>          struct cls_rule cr;
> +        struct oftable *table = &rule->ofproto->tables[rule->table_id];
> +        ovs_version_t tables_version = rule->ofproto->tables_version;
> +
> +        if (!cls_rule_visible_in_version(&rule->cr, tables_version)) {
> +            const struct cls_rule *curr_cls_rule;
> +
> +            curr_cls_rule = classifier_find_rule_exactly(&table->cls,
> +                                                         &rule->cr,
> +                                                         tables_version);

It may be worth a comment why we do not compare any other characteristics
of a rule, e.g. cookie, timeouts, etc.

> +            if (curr_cls_rule) {
> +                struct rule *curr_rule = rule_from_cls_rule(curr_cls_rule);
> +                long long int curr_last_used;
> +
> +                ovs_mutex_lock(&curr_rule->mutex);
> +                curr_last_used = curr_rule->modified;
> +                ovs_mutex_unlock(&curr_rule->mutex);
> +
> +                if (curr_last_used > last_used) {
> +                    /* In the case of a newer visible rule, don't recreate 
> the
> +                     *  current rule. */
> +                    return 0;
> +                }
> +            }
> +        }
>  
> -        cls_rule_clone(&cr, &rule->cr);
>          ovs_mutex_lock(&rule->mutex);
> +        cls_rule_clone(&cr, &rule->cr);
>          error = ofproto_rule_create(rule->ofproto, &cr, rule->table_id,
>                                      rule->flow_cookie,
>                                      rule->idle_timeout,
> @@ -5506,6 +5531,7 @@ ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod 
> *ofm)
>                                      rule->match_tlv_bitmap,
>                                      rule->ofpacts_tlv_bitmap,
>                                      &ofm->temp_rule);
> +        ofm->temp_rule->modified = last_used;
>          ovs_mutex_unlock(&rule->mutex);
>          if (!error) {
>              ofproto_rule_unref(rule);   /* Release old reference. */
> @@ -5513,7 +5539,7 @@ ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod 
> *ofm)
>      } else {
>          /* Refresh the existing rule. */
>          ovs_mutex_lock(&rule->mutex);
> -        rule->modified = time_msec();
> +        rule->modified = last_used;
>          ovs_mutex_unlock(&rule->mutex);
>      }
>      return error;
> @@ -5565,10 +5591,13 @@ ofproto_flow_mod_learn_finish(struct ofproto_flow_mod 
> *ofm,
>  
>  /* Refresh 'ofm->temp_rule', for which the caller holds a reference, if 
> already
>   * in the classifier, insert it otherwise.  If the rule has already been
> - * removed from the classifier, a new rule is created using 'ofm->temp_rule' 
> as
> - * a template and the reference to the old 'ofm->temp_rule' is freed.  If
> - * 'keep_ref' is true, then a reference to the current rule is held, 
> otherwise
> - * it is released and 'ofm->temp_rule' is set to NULL.
> + * removed from the classifier and replaced by another rule, the last_used
> + * parameter is used to determine whether the newer rule is replaced or kept.
> + * If last_used is greater than the current rule's last modified time then a
> + * new rule is created using 'ofm->temp_rule' as a template and the reference
> + * to the old 'ofm->temp_rule' is freed. If 'keep_ref' is true, then a
> + * reference to the current rule is held, otherwise it is released and
> + * 'ofm->temp_rule' is set to NULL.

The updated comment is only talking about the case where there is another
rule n the classifier after removal and doesn't specify what happens in a
simple case where the rule is just removed.  And the 'current rule' term
is ambiguous here, because we have a rule that is current in the classifier
and the rule that is currently referenced by 'ofm->temp_rule'.  Both are
current and only the newly created one is new.

Nit: Also, please, put the 'last_used' paramenter name into single quotes
and double spaces between sentences for the style consistency.

>   *
>   * If 'limit' != 0, insertion will fail if there are more than 'limit' rules
>   * in the same table with the same cookie.  If insertion succeeds,
> @@ -5579,10 +5608,11 @@ ofproto_flow_mod_learn_finish(struct ofproto_flow_mod 
> *ofm,
>   * during the call. */
>  enum ofperr
>  ofproto_flow_mod_learn(struct ofproto_flow_mod *ofm, bool keep_ref,
> -                       unsigned limit, bool *below_limitp)
> +                       unsigned limit, bool *below_limitp,
> +                       long long int last_used)
>      OVS_EXCLUDED(ofproto_mutex)
>  {
> -    enum ofperr error = ofproto_flow_mod_learn_refresh(ofm);
> +    enum ofperr error = ofproto_flow_mod_learn_refresh(ofm, last_used);
>      struct rule *rule = ofm->temp_rule;
>      bool below_limit = true;
>  
> @@ -5615,6 +5645,11 @@ ofproto_flow_mod_learn(struct ofproto_flow_mod *ofm, 
> bool keep_ref,
>  
>              error = ofproto_flow_mod_learn_start(ofm);
>              if (!error) {
> +                /* ofproto_flow_mod_learn_start may have overwritten
> +                 * modified with current time. */
> +                ovs_mutex_lock(&ofm->temp_rule->mutex);
> +                ofm->temp_rule->modified = last_used;
> +                ovs_mutex_unlock(&ofm->temp_rule->mutex);
>                  error = ofproto_flow_mod_learn_finish(ofm, NULL);
>              }
>          } else {
> diff --git a/tests/learn.at b/tests/learn.at
> index d127fed34..bd0f88f38 100644
> --- a/tests/learn.at
> +++ b/tests/learn.at
> @@ -836,3 +836,80 @@ AT_CHECK([ovs-vsctl add-br br1 -- set b br1 
> datapath_type=dummy])
>  
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([learning action - flapping learn rule])
> +OVS_VSWITCHD_START
> +add_of_ports br0 1 2 3 4

Why we need 4 ports in this test?

> +
> +ovs-appctl time/stop
> +# Set up flow table for TCPv4 port learning.

Nothing depends on these packets to be TCP packets or even IPv4.
'TCPv4 port learning' aslo a confusing expression, it reads as
"learning of TCP ports", and not as "learning of ports on which
TCP packets arrive".

> +AT_CHECK([[ovs-ofctl -OOpenFlow15 add-flow br0 
> 'table=0,priority=2,in_port=1,actions=resubmit(,2)']])
> +AT_CHECK([[ovs-ofctl -OOpenFlow15 add-flow br0 
> 'table=0,priority=2,in_port=4,actions=resubmit(,2)']])
> +AT_CHECK([[ovs-ofctl -OOpenFlow15 add-flow br0 
> 'table=0,in_port=2,actions=normal']])
> +AT_CHECK([[ovs-ofctl -OOpenFlow15 add-flow br0 
> 'table=0,priority=2,actions=drop']])

This is an undefined behavior - the overlapping flow at the same priority.
The last two flows are not really needed.

Also, is there a need for OF15?

> +AT_CHECK([[ovs-ofctl -OOpenFlow15 add-flow br0 
> 'table=2,actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x0001020304050607,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:2']])
> +

This flow is confusing.  There is not VLAN or tunnels in this test.

Also, shorter cookie will be easier to use.

> +packet1='in_port(1),eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
> +packet2='in_port(4),eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
> +

There is no actual need to have 2 separate packets.
The 'in_port()' may be omitted as it is defered from the actual port used.

> +for i in `seq 1 2`; do

Maybe add a comment on why the loop is needed?

Nit:  it's better to use $(...) instead of `...`.
And actually there is no point in calling "seq 1 2" to produce "1 2".

> +    ovs-appctl revalidator/pause
> +    ovs-appctl netdev-dummy/receive p4 $packet2
> +    ovs-appctl time/warp 75
> +    ovs-appctl netdev-dummy/receive p1 $packet1
> +    ovs-appctl time/warp 75
> +    ovs-appctl netdev-dummy/receive p4 $packet2
> +    ovs-appctl time/warp 75
> +    ovs-appctl netdev-dummy/receive p1 $packet1
> +    ovs-appctl time/warp 75
> +    ovs-appctl revalidator/resume
> +    ovs-appctl revalidator/wait
> +
> +    AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sed -e 
> 's/n_packets=[[0-9]]*, n_bytes=[[0-9]]*, //' | sort], [0], [dnl
> + cookie=0x1020304050607, hard_timeout=3, 
> priority=1,vlan_tci=0x0000/0x0fff,dl_dst=50:54:00:00:00:06 
> actions=load:0->NXM_OF_VLAN_TCI[[]],load:0->NXM_NX_TUN_ID[[]],output:1
> + in_port=2 actions=NORMAL
> + priority=2 actions=drop
> + priority=2,in_port=1 actions=resubmit(,2)
> + priority=2,in_port=4 actions=resubmit(,2)
> + table=2, 
> actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x1020304050607,NXM_OF_VLAN_TCI[[0..11]],NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],load:0->NXM_OF_VLAN_TCI[[]],load:NXM_NX_TUN_ID[[]]->NXM_NX_TUN_ID[[]],output:OXM_OF_IN_PORT[[]]),output:2
> +NXST_FLOW reply:
> +])

There is no real need to dump any flows, except for a learned one.
Use --no-stats to omit stats, and cookie can be used as a filter parameter
for the ovs-ofctl.

It's hard to track what exactly we're checking here with so much output.

> +
> +    ovs-appctl revalidator/pause
> +    ovs-appctl netdev-dummy/receive p1 $packet1
> +    ovs-appctl time/warp 75
> +    ovs-appctl netdev-dummy/receive p4 $packet2
> +    ovs-appctl time/warp 75
> +    ovs-appctl netdev-dummy/receive p1 $packet1
> +    ovs-appctl time/warp 75
> +    ovs-appctl netdev-dummy/receive p4 $packet2
> +    ovs-appctl time/warp 75
> +
> +    ovs-appctl revalidator/resume
> +    ovs-appctl revalidator/wait
> +
> +    AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sed -e 
> 's/n_packets=[[0-9]]*, n_bytes=[[0-9]]*, //' | sort], [0], [dnl
> + cookie=0x1020304050607, hard_timeout=3, 
> priority=1,vlan_tci=0x0000/0x0fff,dl_dst=50:54:00:00:00:06 
> actions=load:0->NXM_OF_VLAN_TCI[[]],load:0->NXM_NX_TUN_ID[[]],output:4
> + in_port=2 actions=NORMAL
> + priority=2 actions=drop
> + priority=2,in_port=1 actions=resubmit(,2)
> + priority=2,in_port=4 actions=resubmit(,2)
> + table=2, 
> actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x1020304050607,NXM_OF_VLAN_TCI[[0..11]],NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],load:0->NXM_OF_VLAN_TCI[[]],load:NXM_NX_TUN_ID[[]]->NXM_NX_TUN_ID[[]],output:OXM_OF_IN_PORT[[]]),output:2
> +NXST_FLOW reply:
> +])
> +done
> +
> +ovs-appctl time/warp 3001
> +ovs-appctl revalidator/wait

IIRC, revalidators are not responsible for timeout handling.
We just need a couple of main loop runs.  So, if the simple
time warp is not enough, it might help replacing it with
a multi-stap warping for the same amount of time, e.g.:

  ovs-appctl time/warp 1000 3001

A comment on what exactly we're waiting for here might aslo
be useful.

> +
> +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sed -e 
> 's/n_packets=[[0-9]]*, n_bytes=[[0-9]]*, //' | sort], [0], [dnl
> + in_port=2 actions=NORMAL
> + priority=2 actions=drop
> + priority=2,in_port=1 actions=resubmit(,2)
> + priority=2,in_port=4 actions=resubmit(,2)
> + table=2, 
> actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x1020304050607,NXM_OF_VLAN_TCI[[0..11]],NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],load:0->NXM_OF_VLAN_TCI[[]],load:NXM_NX_TUN_ID[[]]->NXM_NX_TUN_ID[[]],output:OXM_OF_IN_PORT[[]]),output:2
> +NXST_FLOW reply:
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v6] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule

Reply via email to