On 7/31/23 07:34, Mike Pattrick wrote:
> On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <[email protected]> wrote:
>>
>> On 7/28/23 23:13, Mike Pattrick wrote:
>>> On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <[email protected]> wrote:
>>>
>>>     On 7/20/23 17:06, Mike Pattrick wrote:
>>>     > When a revalidator thread is updating statistics for an XC_LEARN
>>>     > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn.
>>>     > The revalidator will update stats for rules even if they are in a
>>>     > removed state or marked as invisible. However, ofproto_flow_mod_learn
>>>     > will detect if a flow has been removed and re-add it in that case. This
>>>     > can result in an old learn action replacing the new learn action that
>>>     > had replaced it in the first place.
>>>     >
>>>     > This change adds a new parameter to ofproto_flow_mod_learn allowing the
>>>     > caller to specify an action to take if temp_rule is removed. If it is
>>>     > removed and the caller is xlate_push_stats_entry, then a new rule will
>>>     > not be replaced.
>>>
>>>
>>> This is a slightly unrelated point, but while reviewing some code and test
>>> results I think I found a different issue. rule_expire() considers the rule's
>>> modified time for hard_timeout, and the man page says hard_timeout "Causes
>>> the flow to expire after the given number of seconds, regardless of
>>> activity". However, the modified time is updated in
>>> ofproto_flow_mod_learn_refresh() even if the rule isn't changed. Due to the
>>> current behaviour, I don't think hard_timeout could actually ever trigger on
>>> a learned action if an idle timeout is set or if traffic is ongoing.
>>
>> The 'modified' time of a learned rule is updated in this function because
>> the packet hit the learn() rule.  If the learned rule didn't exist, we
>> would create it at this moment.  Instead of re-creating it, we just update
>> the 'modified' time.  If packets stop hitting the learn() rule but
>> continue to flow through the learned one, the learned one will be
>> removed when the hard timeout fires.  That is the correct behavior.
>>
> 
> I can accept this is intended, but it doesn't seem like it would
> operate in that way based on the man page alone.

It's a generic characteristic of an OpenFlow rule: "activity" means traffic
matching it.  The learn() action triggers a flow modification equivalent to a
strict mod-flow, as described in the ovs-actions man page.  And the ovs-ofctl
man page mentions that "mod-flows restarts the hard_timeout".

Maybe it can be clarified better, I'm not sure.  Feel free to submit
a patch.
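
To spell out the semantics being discussed, here is a minimal sketch of the
expiry logic.  This is an illustrative model only, not the actual rule_expire()
code; the struct and field names are made up:

#include <stdbool.h>

/* Illustrative model of OpenFlow rule expiry.  Times are in seconds.
 * Not the actual ofproto code; names are hypothetical. */
struct example_rule {
    long long modified;   /* Refreshed whenever learn() re-installs the rule,
                           * which acts as a strict mod-flow.                */
    long long used;       /* Last time a packet matched this rule.           */
    int hard_timeout;     /* 0 means "no hard timeout".                      */
    int idle_timeout;     /* 0 means "no idle timeout".                      */
};

static bool
example_rule_expired(const struct example_rule *rule, long long now)
{
    /* The hard timeout counts from 'modified'; every packet hitting the
     * learn() rule refreshes 'modified', so the learned rule keeps living
     * as long as the learn() rule keeps seeing traffic. */
    if (rule->hard_timeout && now >= rule->modified + rule->hard_timeout) {
        return true;
    }
    /* The idle timeout counts from the last packet that matched the
     * learned rule itself. */
    if (rule->idle_timeout && now >= rule->used + rule->idle_timeout) {
        return true;
    }
    return false;
}

With that model, the behaviour observed above is expected: as long as traffic
keeps hitting the learn() rule and refreshing 'modified', the hard timeout of
the learned rule never fires.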

> 
>>
>>>
>>>
>>>     Hi, Mike.  That's an interesting issue, though I'm not sure this is the
>>>     correct way to fix it.
>>>
>>>     It is important to re-install removed learned flows during stats update.
>>>     The logic is the following, AFAIU:
>>>
>>>     1. Packet hits the OF rule (1) with a learning action.
>>>     2. Handler translates OF and creates a datapath flow for it, creating
>>>        a learned rule (2) in the process and filling the translation cache.
>>>     3. Datapath flow for the rule (1) is installed into the datapath.
>>>     4. Time goes by and timeout on the learned rule (2) hits. Learned rule
>>>        is removed now.
>>>     5. A packet hits the datapath flow for rule (1).
>>>     6. Revalidator updates flow stats for the rule (1).
>>>     7. Since the packet hit the learning rule (1), the rule (2) has to be
>>>        added back.
>>>
>>>     IIUC, step 7 will not happen with the change proposed in this patch.
>>>     The flow will not be re-translated, so the only way to re-install it is
>>>     to execute side effects when we see a stats update on a parent datapath
>>>     flow in the revalidator.
>>>
>>>     One potential way to avoid breaking this scenario is to check the
>>>     removal reason, but it's not bulletproof either as we do still need
>>>     to reinstate the flow even if it was manually deleted or evicted.
>>>
>>>
>>> I added a log to check the removal reason in the XC_LEARN stats update, and
>>> the reason always seems to be eviction. I wasn't able to catch an idle or
>>> hard removal in that function, though I'm sure I just wasn't winning the
>>> race.
>>>
>>> But in this example, if we hit 7 and don't re-add the rule, what is the real
>>> penalty? The next packet will have to upcall and we would process the learn
>>> action from scratch, but that may be what the user wants, as indicated by
>>> setting a timeout.
>>
>> Not really.  If the learned flow is removed, we need a packet to hit a
>> learn() rule, which is a different rule, in order for traffic to continue
>> to flow.  Let's say we implement a firewall that allows connections only
>> from A to B.  So, A can initiate connection to B, but B can't initiate
>> connection to A.  Once A sends tcp SYN, we learn a rule that allows traffic
>> from B to A on these particular ports.  Now B can send traffic back.
>> Once the learned flow is removed, A will have to establish a new connection
>> in order to re-learn the B->A rule.  So, not just an upcall.  It may not be
>> the best example, but it should work as an illustration.
>>
>>> The alternative is that we re-install a flow that may be different than the
>>> flow that the user actually wants installed.
>>
>> If the flow is modified, it means that there was traffic on a learn()
>> action and we have to learn that specific rule.  If it was already removed,
>> we have to reinstate it.  In general, the logic in the current OVS code
>> seems correct in this particular place.
>>
>>>
>>> Isn't there a risk of different revalidator threads going out of sync with
>>> different versions of a learned rule in their xcache?
>>>
>>>
>>>     It also seems like we may hold the reference to this deleted rule for a
>>>     long time in the cache until the parent datapath flow is deleted.
>>>
>>>
>>>     What is triggering the deletion of the learned rule in your case?
>>>     Is it deletion or modification of the parent rule?  And does it have
>>>     the 'delete_learned' flag?
>>>
>>>
>>> In the case that I was debugging, the parent learn action was not deleted.
>>> Instead, new traffic from a different port was supposed to cause the learn
>>> action to modify the existing rule. But instead we see the rule modified
>>> with the correct information, and then immediately reverted with old stale
>>> information.
>>
>> The problem here is that we can't define which one is 'correct'.
>> It's a race where both values are correct, but you like one better
>> for some reason.  New rules are only learned if the parent datapath
>> flow that has a learn() action has seen packets since the previous stats
>> check.
>>
>> What you're describing is that you have a single rule with a learn
>> action that is capable of re-writing the same learned rule depending
>> on what kind of packet actually hits it.  Is that correct?
> 
> In this case the rule is:
> 
> learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1
> 
> And an example of the learned rule is:
> 
> vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],output:2
> 
> The scenario is that ingressing packets are supposed to change the
> route that egress packets take, and that route can change over time.
> 
> What I see from some additional logging is:
> 
> 1. traffic begins to ingress on port A
> 2. learn action installs a rule to route response to A
> 3. traffic begins to egress on port A
> 4. port A becomes invalid, no further traffic ingresses from port A
> 5. traffic ingresses on port B
> 6. learn action installs a rule to route response to B
> 7. revalidator replaces (6) with (2)
> 
> Eventually the egressing traffic will become exhausted and the correct
> rule is installed, but this can take several minutes. The rule
> regression can even happen many milliseconds after learn installs the
> correct rule, and even after packets have flowed through the new rule.

The rule can be replaced if there was no revalidation between steps
4 and 5, i.e. at the point of revalidation there is a stats increase
on both datapath flows (ingress A and ingress B).
Revalidators can sleep for up to 500 ms by default between revalidation
cycles, so the replacement can happen many milliseconds later.

Is the traffic on port B continuous, or is it a one-shot packet?
The point is, at least one revalidation cycle should pass after step 4,
and the traffic on port B should only be seen after this revalidation
cycle has completed.  So, step 5 should be at least 500+ ms apart from
step 4.
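
To make the timing argument concrete, here is a tiny model of one revalidation
cycle.  It is only a sketch, not OVS code, and all names are made up; it just
shows that if both learning datapath flows accumulated stats within the same
cycle, whichever update is pushed last wins, and that order is arbitrary:

#include <stdbool.h>
#include <stdio.h>

/* Toy model of the stats push for two learning datapath flows that
 * produce the same learned rule.  Not OVS code; names are hypothetical. */
struct learned_rule_model {
    int out_port;            /* The action the learned rule would install. */
};

static void
push_stats_model(struct learned_rule_model *rule, bool had_traffic,
                 int out_port)
{
    if (had_traffic) {
        /* Equivalent to re-learning: overwrite the learned rule. */
        rule->out_port = out_port;
    }
}

int
main(void)
{
    struct learned_rule_model rule = { .out_port = 0 };

    /* Both ingress-A and ingress-B flows saw packets since the previous
     * revalidation (steps 4 and 5 fell within one ~500 ms cycle), so both
     * push an update; the order between revalidator threads is arbitrary. */
    push_stats_model(&rule, true, 2);   /* Update for port B.              */
    push_stats_model(&rule, true, 1);   /* Stale update for port A "wins". */

    printf("learned rule outputs to port %d\n", rule.out_port);
    return 0;
}

Once one of the flows stops seeing traffic for a full cycle, only one update
is pushed and the outcome becomes deterministic.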

> 
> 
>> I would consider this kind of setup fairly fragile and would not
>> recommend using it in general.
>>
>> However, the scenario is likely like this:
>>
>> 1. Upcall on a rule with learn() action.
>>    New datapath flow (1) is installed.  This flow has a match on
>>    all the fields used in a learn action and ukey contains the
>>    translation cache with a rule that it produces.
>>    The new learned rule is installed from this.
>>
>> 2. Upcall with a different packet that doesn't match the first
>>    datapath flow (1), but on the same rule with a learn() action.
>>    New datapath flow (2) is installed.  This flow has a match on
>>    all the fields used in a learn action (they are different from
>>    the first one, since we have an upcall) and ukey contains the
>>    translation cache with a rule that it produces.  Let's say the
>>    learned rule has the same match, but different actions.
>>    This overwrites the previously installed learned rule.
>>
>> 3. Both datapath flows (1) and (2) get some traffic.
>>
>> 4. Revalidation starts.
>>
>> 5. Revalidator A dumps datapath flow (1).
>>    Revalidator B dumps datapath flow (2).
>>
>> 6. Revalidators A and B are trying to revalidate these datapath flows
>>    at the same time.  Both datapath flows had traffic since the
>>    last revalidation.  Both revalidators are trying to update the
>>    learned rule, which is the same for both of them.
>>
>> 7. One of them does that first, the other one ends up with a rule in a
>>    REMOVED state, since it was just replaced by the other revalidator.
>>
>>
>> There is no correct or incorrect rule in this situation.  Both
>> of them are correct, because both datapath flows have seen packets
>> since the previous revalidation.  We can't guarantee which one will
>> end up being written last and therefore take effect.
>> If we refuse to reinstate the REMOVED rule, we still can't say which
>> one will be active in the end.
>>
>> What can resolve the race is the next revalidation cycle.
>> Once one of the datapath flows stops receiving traffic, there
>> should no longer be a race condition, and the next revalidation will
>> set the "correct" rule, i.e. the rule that corresponds to the learning
>> datapath flow that saw packets last.  But as long as both flows have
>> traffic, there is no "correct" rule and the result will be random,
>> regardless of what we choose to do.
>>
>>
>> One issue that I see in the current code is that it looks like the table
>> version is not getting updated on learned flow refresh and hence the next
>> revalidation may not actually be triggered.  In this case the random
>> result may stay for much longer than we would like.
>>
>> The suspicious code (and the potential fix) is the following:
>>
>> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm)
>>  }
>>
>>  enum ofperr
>> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm,
>> -                              struct ofproto *orig_ofproto)
>> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm)
>>      OVS_REQUIRES(ofproto_mutex)
>>  {
>>      struct rule *rule = rule_collection_rules(&ofm->new_rules)[0];
>>      enum ofperr error = 0;
>>
>> -    /* If learning on a different bridge, must bump its version
>> -     * number and flush connmgr afterwards. */
>> -    if (rule->ofproto != orig_ofproto) {
>> -        ofproto_bump_tables_version(rule->ofproto);
>> -    }
>> +    ofproto_bump_tables_version(rule->ofproto);
>>      error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL);
>> -    if (rule->ofproto != orig_ofproto) {
>> -        ofmonitor_flush(rule->ofproto->connmgr);
>> -    }
>> +    ofmonitor_flush(rule->ofproto->connmgr);
>>
>>      return error;
>>  }
>> ---
>>
>> We should actually bump the version and request a revalidation in
>> any case, not only when the change is in a different bridge.

FWIW, the original code in commit 1f4a89336682 was written mostly
for the case where learning is triggered by a packet-out executed
within a bundle.  And we can't actually publish changes for the current
bridge in this case.  So, the fix should be a bit more clever.
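
As a rough sketch of the direction such a fix could take, building only on the
hunk quoted above, something like the following might work.  The extra
'in_bundle' flag is hypothetical and would need to be plumbed in from the
packet-out/bundle path:

/* Sketch only: publish the change whenever it is safe to do so, i.e.
 * always for a different bridge, and for the current bridge only when
 * the learn is not executed inside a bundle commit.  The 'in_bundle'
 * parameter is hypothetical. */
enum ofperr
ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm,
                              struct ofproto *orig_ofproto,
                              bool in_bundle)
    OVS_REQUIRES(ofproto_mutex)
{
    struct rule *rule = rule_collection_rules(&ofm->new_rules)[0];
    enum ofperr error = 0;

    if (rule->ofproto != orig_ofproto || !in_bundle) {
        ofproto_bump_tables_version(rule->ofproto);
    }
    error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL);
    if (rule->ofproto != orig_ofproto || !in_bundle) {
        ofmonitor_flush(rule->ofproto->connmgr);
    }

    return error;
}

That is, publish immediately whenever it is safe, and keep deferring only in
the same-bridge-inside-a-bundle case that the original code was written for.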

>>
>> Could you try that in your setup?
> 
> That was actually my initial suspicion when investigating this bug,
> but this change doesn't improve the buggy behaviour.
> 
> Thank you,
> M
> 
>>
>>
>> The issue appears to come from the commit:
>>   1f4a89336682 ("ofproto: Refactor packet_out handling.")
>>
>> There is a slight chance of launching ourselves into an infinite
>> revalidation loop, I guess.  Especially if both learning datapath
>> flows continue to receive traffic, but there is a minimal
>> revalidation interval, IIRC.  Not sure if it applies in this case.
>>
>> Best regards, Ilya Maximets.
>>
> 
