On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>
> On 7/28/23 23:13, Mike Pattrick wrote:
> > On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maxim...@ovn.org 
> > <mailto:i.maxim...@ovn.org>> wrote:
> >
> >     On 7/20/23 17:06, Mike Pattrick wrote:
> >     > When the a revalidator thread is updating statistics for an XC_LEARN
> >     > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn.
> >     > The revalidator will update stats for rules even if they are in a
> >     > removed state or marked as invisible. However, ofproto_flow_mod_learn
> >     > will detect if a flow has been removed and re-add it in that case. 
> > This
> >     > can result in an old learn action replacing the new learn action that
> >     > had replaced it in the first place.
> >     >
> >     > This change adds a new parameter to ofproto_flow_mod_learn allowing 
> > the
> >     > caller to specify an action to take if temp_rule is removed. If it is
> >     > removed and the caller is xlate_push_stats_entry, then a new rule will
> >     > not be replaced.
> >
> >
> > This is a slightly unrelated point, but while reviewing some code and test
> > results I think I found a different issue. rule_expire() considers rule
> > modified time for hard_timeout and the man page says hard_timeout "Causes
> > the flow to expire after the given number of seconds, regardless of 
> > activity",
> > however, modified time is updated in ofproto_flow_mod_learn_refresh() even 
> > if
> > the rule isn't changed. Due to the current behaviour, I don't think 
> > hard_timeout
> > could actually ever trigger on a learned action if an idle timeout is set 
> > or if
> > traffic is ongoing.
>
> The 'modified' time of a learned rule is updated in this function, because
> the packet hit the learn() rule.  If the learned rule didn't exist, we
> would create it at this moment.  Instead of re-creating, we just update
> the 'modified' time.  If packets will stop hitting the learn() rule, but
> will continue to flow through the learned one, the learned one will be
> removed when the hard timeout fires up.  That is a correct behavior.
>

I can accept this is intended, but it doesn't seem like it would
operate in that way based on the man page alone.

>
> >
> >
> >     Hi, Mike.  That's an interesting issue, though I'm not sure that is a
> >     correct way to fix it.
> >
> >     It is important to re-install removed learned flows during stats update.
> >     The logic is the following, AFAIU:
> >
> >     1. Packet hits the OF rule (1) with a learning action.
> >     2. Handler translates OF and creates a datapath flow for it, creating
> >        a learned rule (2) in the process and filling the translation cache.
> >     3. Datapath flow for the rule (1) is installed into the datapath.
> >     4. Time goes by and timeout on the learned rule (2) hits. Learned rule
> >        is removed now.
> >     5. A packet hits the datapath flow for rule (1).
> >     6. Revalidator updates flow stats for the rule (1).
> >     7. Since the packet hit the learning rule (1), the rule (2) has to be
> >        added back.
> >
> >     IIUC, the step 7 will not happen with the change proposed in this patch.
> >     Flow will not be re-translated, so the only way to re-install it is
> >     to execute side effects when we see stats update on a parent datapath
> >     flow in the revalidator.
> >
> >     One potential way to avoid breaking this scenario is to check the
> >     removal reason, but it's not bulletproof either as we do still need
> >     to reinstate the flow even if it was manually deleted or evicted.
> >
> >
> > I added a log to check the removal reason in the XC_LEARN stats update, and 
> > the
> > reason always seems to be eviction. I wasn't able to catch an idle or hard 
> > removal
> > in that function, though I'm sure I just wasn't winning the race.
> >
> > But in this example if we hit 7, and don't re-add the rule, what is the 
> > real penalty?
> > The next packet will have to upcall and we would process the learn action 
> > from scratch,
> > but that may be what the user wants as indicated by setting a timeout.
>
> Not really.  If the learned flow is removed, we need a packet to hit a
> learn() rule, which is a different rule, in order for traffic to continue
> to flow.  Let's say we implement a firewall that allows connections only
> from A to B.  So, A can initiate connection to B, but B can't initiate
> connection to A.  Once A sends tcp SYN, we learn a rule that allows traffic
> from B to A on these particular ports.  Now B can send traffic back.
> Once the learned flow is removed, A will have to establish a new connection
> in order to re-learn the B->A rule.  So, not just an upcall.  May not be a
> best example, but should work as an illustration.
>
> > The alternative
> > is we re-install a flow that may be different then the flow that the user 
> > actually
> > wants installed.
>
> If the flow is modified, it means that there was a traffic on a learn()
> action and we have to learn that specific rule.  If it was already removed,
> we have to reinstate it.  In general, the logic in the current OVS code
> seems correct in this particular place.
>
> >
> > Isn't there a risk of different revalidator threads going out of sync with 
> > different
> > versions of a learned rule in their xcache?
> >
> >
> >     It also seems like we may hold the reference to this deleted rule for a
> >     long time in the cache until the parent datapath flow is deleted.
> >
> >
> >     What is triggering the deletion of the learned rule in your case?
> >     Is it deletion or modification of the parent rule?  And does it have
> >     'delete_learned' flag?
> >
> >
> > In the case that I was debugging, the parent learn action was not deleted. 
> > Instead,
> > new traffic from a different port was supposed to cause the learn action to 
> > modify
> > the existing rule. But instead we see the rule modified with the correct 
> > information,
> > and then immediately reverted with old stale information.
>
> The problem here is that we can't define which one is 'correct'.
> It's a race where both values are correct, but you like one better
> for some reason.  New rules are only learned if the parent datapath
> flow that has a learn() action seen packets after the previous stats
> check.
>
> What you're describing is that you have a single rule with a learn
> action that is capable of re-writing the same learned rule depending
> on what kind of packet actually hits it.  Is that correct?

In this case the rule is:

learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1

And an example of the learned rule is:

vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],ou
tput:2

The scenario is that ingressing packets are supposed to change the
route that egress packets take, and that route can change over time.

What I see from some additional logging is:

1. traffic begins to ingress on port A
2. learn action installs a rule to route response to A
3. traffic begins to egress on port A
4. port A becomes invalid, no further traffic ingresses from port A
5. traffic ingresses on port B
6. learn action installs a rule to route response to B
7. revalidator replaces (6) with (2)

Eventually the egressing traffic will become exausted and the correct
rule is installed, but this can take several minutes. The rule
regression can even happen many milliseconds after learn installs the
correct rule and even after packets have flowed through the new rule.


> I would consider this kind of setup as fairly fragile and would not
> recommend using it in general.
>
> However, the scenario is likely like this:
>
> 1. Upcall on a rule with learn() action.
>    New datapath flow (1) is installed.  This flow has a match on
>    all the fields used in a learn action and ukey contains the
>    translation cache with a rule that it produces.
>    The new learned rule is installed from this.
>
> 2. Upcall with a different packet that doesn't match the first
>    datapath flow (1), but on the same rule with a learn() action.
>    New datapath flow (2) is installed.  This flow has a match on
>    all the fields used in a learn action (they are different from
>    the first one, since we have an upcall) and ukey contains the
>    translation cache with a rule that it produces.  Let's say the
>    learned rule has the same match, but different actions.
>    This overwrites the previously installed learned rule.
>
> 3. Both datapath flows (1) and (2) get some traffic.
>
> 4. Revalidation starts.
>
> 5. Revalidator A dumps datapath flow (1).
>    Revalidator B dumps datapath flow (2).
>
> 6. Revalidators A and B are trying to revalidate these datapath flows
>    at the same time.  Both datapath flows had traffic since the
>    last revalidation.  Both revalidators are trying to update the
>    learned rule, which is the same for both of them.
>
> 7. One of them does that first, the other one ends up with a rule in a
>    REMOVED state, since it was just replaced by the other revalidator.
>
>
> There is no correct or incorrect rule in this situation.  Both
> of them are correct, because both datapath flows have seen packets
> since the previous revalidation.  We can't guarantee which one will
> end up being written the last and therefore will take effect.
> if we refuse to reinstate the REMOVED rule, we still can't say which
> one will be active in the end.
>
> What can resolve the race is the next revalidation cycle.
> Once one of the datapath flows will stop receiving traffic, there
> should no longer be a race condition and the next revalidation will
> set the "correct" rule, i.e. the rule that corresponds to a learning
> datapath flow that seen some packets last.  But until both flows have
> traffic, there is no "correct" rule and the result will be random,
> regardless of what we choose to do.
>
>
> One issue that I see in the current code is that it looks like the table
> version is not getting updated on learned flow refresh and hence the next
> revalidation may not actually be triggered.  In this case the random
> result may stay for much longer than we would like.
>
> The suspicious code (and the potential fix) is the following:
>
> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod 
> *ofm)
>  }
>
>  enum ofperr
> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm,
> -                              struct ofproto *orig_ofproto)
> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm)
>      OVS_REQUIRES(ofproto_mutex)
>  {
>      struct rule *rule = rule_collection_rules(&ofm->new_rules)[0];
>      enum ofperr error = 0;
>
> -    /* If learning on a different bridge, must bump its version
> -     * number and flush connmgr afterwards. */
> -    if (rule->ofproto != orig_ofproto) {
> -        ofproto_bump_tables_version(rule->ofproto);
> -    }
> +    ofproto_bump_tables_version(rule->ofproto);
>      error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL);
> -    if (rule->ofproto != orig_ofproto) {
> -        ofmonitor_flush(rule->ofproto->connmgr);
> -    }
> +    ofmonitor_flush(rule->ofproto->connmgr);
>
>      return error;
>  }
> ---
>
> We should actually bump the version and request a revalidation in
> any case, not only when the change is in a different bridge.
>
> Could you try that in your setup?

That was actually my initial suspicion when investigating this bug,
but this change doesn't improve the buggy behaviour.

Thank you,
M

>
>
> The issue appears to come from the commit:
>   1f4a89336682 ("ofproto: Refactor packet_out handling.")
>
> There is a slight chance of launching ourselves into an infinite
> revlidation loop, I guess.  Especially if both learning datapath
> flows will continue to receive the traffic, but there is a minimal
> revalidation interval, IIRC.  Not sure if it applies in this case.
>
> Best regards, Ilya Maximets.
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to