Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule

Mike Pattrick Mon, 31 Jul 2023 09:01:39 -0700

On Mon, Jul 31, 2023 at 7:20 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
>
> On 7/31/23 07:34, Mike Pattrick wrote:
> > On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
> >>
> >> On 7/28/23 23:13, Mike Pattrick wrote:
> >>> On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maxim...@ovn.org 
> >>> <mailto:i.maxim...@ovn.org>> wrote:
> >>>
> >>>     On 7/20/23 17:06, Mike Pattrick wrote:
> >>>     > When the a revalidator thread is updating statistics for an XC_LEARN
> >>>     > xcache entry in xlate_push_stats_entry it uses 
> >>> ofproto_flow_mod_learn.
> >>>     > The revalidator will update stats for rules even if they are in a
> >>>     > removed state or marked as invisible. However, 
> >>> ofproto_flow_mod_learn
> >>>     > will detect if a flow has been removed and re-add it in that case. 
> >>> This
> >>>     > can result in an old learn action replacing the new learn action 
> >>> that
> >>>     > had replaced it in the first place.
> >>>     >
> >>>     > This change adds a new parameter to ofproto_flow_mod_learn allowing 
> >>> the
> >>>     > caller to specify an action to take if temp_rule is removed. If it 
> >>> is
> >>>     > removed and the caller is xlate_push_stats_entry, then a new rule 
> >>> will
> >>>     > not be replaced.
> >>>
> >>>
> >>> This is a slightly unrelated point, but while reviewing some code and test
> >>> results I think I found a different issue. rule_expire() considers rule
> >>> modified time for hard_timeout and the man page says hard_timeout "Causes
> >>> the flow to expire after the given number of seconds, regardless of 
> >>> activity",
> >>> however, modified time is updated in ofproto_flow_mod_learn_refresh() 
> >>> even if
> >>> the rule isn't changed. Due to the current behaviour, I don't think 
> >>> hard_timeout
> >>> could actually ever trigger on a learned action if an idle timeout is set 
> >>> or if
> >>> traffic is ongoing.
> >>
> >> The 'modified' time of a learned rule is updated in this function, because
> >> the packet hit the learn() rule.  If the learned rule didn't exist, we
> >> would create it at this moment.  Instead of re-creating, we just update
> >> the 'modified' time.  If packets will stop hitting the learn() rule, but
> >> will continue to flow through the learned one, the learned one will be
> >> removed when the hard timeout fires up.  That is a correct behavior.
> >>
> >
> > I can accept this is intended, but it doesn't seem like it would
> > operate in that way based on the man page alone.
>
> It's a generic characteristic of an OpenFlow rule.  "activity" means traffic
> matching it.  learn() action is triggering a flow modification equal to a
> strict mod-flow, as described in the ovs-actions man page.  And ovs-ofctl page
> mentions that "mod-flows restarts the hard_timeout".
>
> Maybe it can be clarified somehow better, I'm not sure.  Feel free to submit
> a patch.
>
> >
> >>
> >>>
> >>>
> >>>     Hi, Mike.  That's an interesting issue, though I'm not sure that is a
> >>>     correct way to fix it.
> >>>
> >>>     It is important to re-install removed learned flows during stats 
> >>> update.
> >>>     The logic is the following, AFAIU:
> >>>
> >>>     1. Packet hits the OF rule (1) with a learning action.
> >>>     2. Handler translates OF and creates a datapath flow for it, creating
> >>>        a learned rule (2) in the process and filling the translation 
> >>> cache.
> >>>     3. Datapath flow for the rule (1) is installed into the datapath.
> >>>     4. Time goes by and timeout on the learned rule (2) hits. Learned rule
> >>>        is removed now.
> >>>     5. A packet hits the datapath flow for rule (1).
> >>>     6. Revalidator updates flow stats for the rule (1).
> >>>     7. Since the packet hit the learning rule (1), the rule (2) has to be
> >>>        added back.
> >>>
> >>>     IIUC, the step 7 will not happen with the change proposed in this 
> >>> patch.
> >>>     Flow will not be re-translated, so the only way to re-install it is
> >>>     to execute side effects when we see stats update on a parent datapath
> >>>     flow in the revalidator.
> >>>
> >>>     One potential way to avoid breaking this scenario is to check the
> >>>     removal reason, but it's not bulletproof either as we do still need
> >>>     to reinstate the flow even if it was manually deleted or evicted.
> >>>
> >>>
> >>> I added a log to check the removal reason in the XC_LEARN stats update, 
> >>> and the
> >>> reason always seems to be eviction. I wasn't able to catch an idle or 
> >>> hard removal
> >>> in that function, though I'm sure I just wasn't winning the race.
> >>>
> >>> But in this example if we hit 7, and don't re-add the rule, what is the 
> >>> real penalty?
> >>> The next packet will have to upcall and we would process the learn action 
> >>> from scratch,
> >>> but that may be what the user wants as indicated by setting a timeout.
> >>
> >> Not really.  If the learned flow is removed, we need a packet to hit a
> >> learn() rule, which is a different rule, in order for traffic to continue
> >> to flow.  Let's say we implement a firewall that allows connections only
> >> from A to B.  So, A can initiate connection to B, but B can't initiate
> >> connection to A.  Once A sends tcp SYN, we learn a rule that allows traffic
> >> from B to A on these particular ports.  Now B can send traffic back.
> >> Once the learned flow is removed, A will have to establish a new connection
> >> in order to re-learn the B->A rule.  So, not just an upcall.  May not be a
> >> best example, but should work as an illustration.
> >>
> >>> The alternative
> >>> is we re-install a flow that may be different then the flow that the user 
> >>> actually
> >>> wants installed.
> >>
> >> If the flow is modified, it means that there was a traffic on a learn()
> >> action and we have to learn that specific rule.  If it was already removed,
> >> we have to reinstate it.  In general, the logic in the current OVS code
> >> seems correct in this particular place.
> >>
> >>>
> >>> Isn't there a risk of different revalidator threads going out of sync 
> >>> with different
> >>> versions of a learned rule in their xcache?
> >>>
> >>>
> >>>     It also seems like we may hold the reference to this deleted rule for 
> >>> a
> >>>     long time in the cache until the parent datapath flow is deleted.
> >>>
> >>>
> >>>     What is triggering the deletion of the learned rule in your case?
> >>>     Is it deletion or modification of the parent rule?  And does it have
> >>>     'delete_learned' flag?
> >>>
> >>>
> >>> In the case that I was debugging, the parent learn action was not 
> >>> deleted. Instead,
> >>> new traffic from a different port was supposed to cause the learn action 
> >>> to modify
> >>> the existing rule. But instead we see the rule modified with the correct 
> >>> information,
> >>> and then immediately reverted with old stale information.
> >>
> >> The problem here is that we can't define which one is 'correct'.
> >> It's a race where both values are correct, but you like one better
> >> for some reason.  New rules are only learned if the parent datapath
> >> flow that has a learn() action seen packets after the previous stats
> >> check.
> >>
> >> What you're describing is that you have a single rule with a learn
> >> action that is capable of re-writing the same learned rule depending
> >> on what kind of packet actually hits it.  Is that correct?
> >
> > In this case the rule is:
> >
> > learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1
> >
> > And an example of the learned rule is:
> >
> > vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],ou
> > tput:2
> >
> > The scenario is that ingressing packets are supposed to change the
> > route that egress packets take, and that route can change over time.
> >
> > What I see from some additional logging is:
> >
> > 1. traffic begins to ingress on port A
> > 2. learn action installs a rule to route response to A
> > 3. traffic begins to egress on port A
> > 4. port A becomes invalid, no further traffic ingresses from port A
> > 5. traffic ingresses on port B
> > 6. learn action installs a rule to route response to B
> > 7. revalidator replaces (6) with (2)
> >
> > Eventually the egressing traffic will become exausted and the correct
> > rule is installed, but this can take several minutes. The rule
> > regression can even happen many milliseconds after learn installs the
> > correct rule and even after packets have flowed through the new rule.
>
> The rule can be replaced if there was no revalidation between steps
> 4 and 5, i.e. at the point of revalidation there is a stats increase
> on both datapath flows (ingress A and ingress B).
> Revalidators can sleep for up to 500ms by default between revalidation
> cycles, so the replacement can happen many milliseconds after.
>
> Is the traffic on port B continuous, or is it a one-shot packet?


Continuous but less frequent than traffic in the opposite direction.
In a client's environment there were ingress packets every few seconds
and egress packets continued to flow incorrectly for several minutes
afterwards.

If this solution is unacceptable; I could resubmit with an additional
check to only suppress the flow_mod if the learn_modify rule is in a
removed state and there is already a conflicting rule in the
classifier that isn't removed. Or would you prefer a completely
different approach be taken here?


Thanks,
M

> The point is, at least one revalidation cycle should pass after step 4.
> And the traffic on port B should be seen after this revalidation cycle
> is completed.  So, step 5 should be at least 500+ ms apart from step 4.
>
> >
> >
> >> I would consider this kind of setup as fairly fragile and would not
> >> recommend using it in general.
> >>
> >> However, the scenario is likely like this:
> >>
> >> 1. Upcall on a rule with learn() action.
> >>    New datapath flow (1) is installed.  This flow has a match on
> >>    all the fields used in a learn action and ukey contains the
> >>    translation cache with a rule that it produces.
> >>    The new learned rule is installed from this.
> >>
> >> 2. Upcall with a different packet that doesn't match the first
> >>    datapath flow (1), but on the same rule with a learn() action.
> >>    New datapath flow (2) is installed.  This flow has a match on
> >>    all the fields used in a learn action (they are different from
> >>    the first one, since we have an upcall) and ukey contains the
> >>    translation cache with a rule that it produces.  Let's say the
> >>    learned rule has the same match, but different actions.
> >>    This overwrites the previously installed learned rule.
> >>
> >> 3. Both datapath flows (1) and (2) get some traffic.
> >>
> >> 4. Revalidation starts.
> >>
> >> 5. Revalidator A dumps datapath flow (1).
> >>    Revalidator B dumps datapath flow (2).
> >>
> >> 6. Revalidators A and B are trying to revalidate these datapath flows
> >>    at the same time.  Both datapath flows had traffic since the
> >>    last revalidation.  Both revalidators are trying to update the
> >>    learned rule, which is the same for both of them.
> >>
> >> 7. One of them does that first, the other one ends up with a rule in a
> >>    REMOVED state, since it was just replaced by the other revalidator.
> >>
> >>
> >> There is no correct or incorrect rule in this situation.  Both
> >> of them are correct, because both datapath flows have seen packets
> >> since the previous revalidation.  We can't guarantee which one will
> >> end up being written the last and therefore will take effect.
> >> if we refuse to reinstate the REMOVED rule, we still can't say which
> >> one will be active in the end.
> >>
> >> What can resolve the race is the next revalidation cycle.
> >> Once one of the datapath flows will stop receiving traffic, there
> >> should no longer be a race condition and the next revalidation will
> >> set the "correct" rule, i.e. the rule that corresponds to a learning
> >> datapath flow that seen some packets last.  But until both flows have
> >> traffic, there is no "correct" rule and the result will be random,
> >> regardless of what we choose to do.
> >>
> >>
> >> One issue that I see in the current code is that it looks like the table
> >> version is not getting updated on learned flow refresh and hence the next
> >> revalidation may not actually be triggered.  In this case the random
> >> result may stay for much longer than we would like.
> >>
> >> The suspicious code (and the potential fix) is the following:
> >>
> >> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct 
> >> ofproto_flow_mod *ofm)
> >>  }
> >>
> >>  enum ofperr
> >> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm,
> >> -                              struct ofproto *orig_ofproto)
> >> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm)
> >>      OVS_REQUIRES(ofproto_mutex)
> >>  {
> >>      struct rule *rule = rule_collection_rules(&ofm->new_rules)[0];
> >>      enum ofperr error = 0;
> >>
> >> -    /* If learning on a different bridge, must bump its version
> >> -     * number and flush connmgr afterwards. */
> >> -    if (rule->ofproto != orig_ofproto) {
> >> -        ofproto_bump_tables_version(rule->ofproto);
> >> -    }
> >> +    ofproto_bump_tables_version(rule->ofproto);
> >>      error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL);
> >> -    if (rule->ofproto != orig_ofproto) {
> >> -        ofmonitor_flush(rule->ofproto->connmgr);
> >> -    }
> >> +    ofmonitor_flush(rule->ofproto->connmgr);
> >>
> >>      return error;
> >>  }
> >> ---
> >>
> >> We should actually bump the version and request a revalidation in
> >> any case, not only when the change is in a different bridge.
>
> FWIW, the original code in commit 1f4a89336682 was written mostly
> for a case where learning is triggered by a packet-out executed
> within a bundle.  And we can't actually publish changes for a current
> bridge in this case.  So, the fix should be a bit more clever.
>
> >>
> >> Could you try that in your setup?
> >
> > That was actually my initial suspicion when investigating this bug,
> > but this change doesn't improve the buggy behaviour.
> >
> > Thank you,
> > M
> >
> >>
> >>
> >> The issue appears to come from the commit:
> >>   1f4a89336682 ("ofproto: Refactor packet_out handling.")
> >>
> >> There is a slight chance of launching ourselves into an infinite
> >> revlidation loop, I guess.  Especially if both learning datapath
> >> flows will continue to receive the traffic, but there is a minimal
> >> revalidation interval, IIRC.  Not sure if it applies in this case.
> >>
> >> Best regards, Ilya Maximets.
> >>
> >
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule

Reply via email to