On Mon, Jul 31, 2023 at 7:20 AM Ilya Maximets <i.maxim...@ovn.org> wrote: > > On 7/31/23 07:34, Mike Pattrick wrote: > > On Fri, Jul 28, 2023 at 6:31 PM Ilya Maximets <i.maxim...@ovn.org> wrote: > >> > >> On 7/28/23 23:13, Mike Pattrick wrote: > >>> On Fri, Jul 28, 2023 at 7:48 AM Ilya Maximets <i.maxim...@ovn.org > >>> <mailto:i.maxim...@ovn.org>> wrote: > >>> > >>> On 7/20/23 17:06, Mike Pattrick wrote: > >>> > When the a revalidator thread is updating statistics for an XC_LEARN > >>> > xcache entry in xlate_push_stats_entry it uses > >>> ofproto_flow_mod_learn. > >>> > The revalidator will update stats for rules even if they are in a > >>> > removed state or marked as invisible. However, > >>> ofproto_flow_mod_learn > >>> > will detect if a flow has been removed and re-add it in that case. > >>> This > >>> > can result in an old learn action replacing the new learn action > >>> that > >>> > had replaced it in the first place. > >>> > > >>> > This change adds a new parameter to ofproto_flow_mod_learn allowing > >>> the > >>> > caller to specify an action to take if temp_rule is removed. If it > >>> is > >>> > removed and the caller is xlate_push_stats_entry, then a new rule > >>> will > >>> > not be replaced. > >>> > >>> > >>> This is a slightly unrelated point, but while reviewing some code and test > >>> results I think I found a different issue. rule_expire() considers rule > >>> modified time for hard_timeout and the man page says hard_timeout "Causes > >>> the flow to expire after the given number of seconds, regardless of > >>> activity", > >>> however, modified time is updated in ofproto_flow_mod_learn_refresh() > >>> even if > >>> the rule isn't changed. Due to the current behaviour, I don't think > >>> hard_timeout > >>> could actually ever trigger on a learned action if an idle timeout is set > >>> or if > >>> traffic is ongoing. > >> > >> The 'modified' time of a learned rule is updated in this function, because > >> the packet hit the learn() rule. If the learned rule didn't exist, we > >> would create it at this moment. Instead of re-creating, we just update > >> the 'modified' time. If packets will stop hitting the learn() rule, but > >> will continue to flow through the learned one, the learned one will be > >> removed when the hard timeout fires up. That is a correct behavior. > >> > > > > I can accept this is intended, but it doesn't seem like it would > > operate in that way based on the man page alone. > > It's a generic characteristic of an OpenFlow rule. "activity" means traffic > matching it. learn() action is triggering a flow modification equal to a > strict mod-flow, as described in the ovs-actions man page. And ovs-ofctl page > mentions that "mod-flows restarts the hard_timeout". > > Maybe it can be clarified somehow better, I'm not sure. Feel free to submit > a patch. > > > > >> > >>> > >>> > >>> Hi, Mike. That's an interesting issue, though I'm not sure that is a > >>> correct way to fix it. > >>> > >>> It is important to re-install removed learned flows during stats > >>> update. > >>> The logic is the following, AFAIU: > >>> > >>> 1. Packet hits the OF rule (1) with a learning action. > >>> 2. Handler translates OF and creates a datapath flow for it, creating > >>> a learned rule (2) in the process and filling the translation > >>> cache. > >>> 3. Datapath flow for the rule (1) is installed into the datapath. > >>> 4. Time goes by and timeout on the learned rule (2) hits. Learned rule > >>> is removed now. > >>> 5. A packet hits the datapath flow for rule (1). > >>> 6. Revalidator updates flow stats for the rule (1). > >>> 7. Since the packet hit the learning rule (1), the rule (2) has to be > >>> added back. > >>> > >>> IIUC, the step 7 will not happen with the change proposed in this > >>> patch. > >>> Flow will not be re-translated, so the only way to re-install it is > >>> to execute side effects when we see stats update on a parent datapath > >>> flow in the revalidator. > >>> > >>> One potential way to avoid breaking this scenario is to check the > >>> removal reason, but it's not bulletproof either as we do still need > >>> to reinstate the flow even if it was manually deleted or evicted. > >>> > >>> > >>> I added a log to check the removal reason in the XC_LEARN stats update, > >>> and the > >>> reason always seems to be eviction. I wasn't able to catch an idle or > >>> hard removal > >>> in that function, though I'm sure I just wasn't winning the race. > >>> > >>> But in this example if we hit 7, and don't re-add the rule, what is the > >>> real penalty? > >>> The next packet will have to upcall and we would process the learn action > >>> from scratch, > >>> but that may be what the user wants as indicated by setting a timeout. > >> > >> Not really. If the learned flow is removed, we need a packet to hit a > >> learn() rule, which is a different rule, in order for traffic to continue > >> to flow. Let's say we implement a firewall that allows connections only > >> from A to B. So, A can initiate connection to B, but B can't initiate > >> connection to A. Once A sends tcp SYN, we learn a rule that allows traffic > >> from B to A on these particular ports. Now B can send traffic back. > >> Once the learned flow is removed, A will have to establish a new connection > >> in order to re-learn the B->A rule. So, not just an upcall. May not be a > >> best example, but should work as an illustration. > >> > >>> The alternative > >>> is we re-install a flow that may be different then the flow that the user > >>> actually > >>> wants installed. > >> > >> If the flow is modified, it means that there was a traffic on a learn() > >> action and we have to learn that specific rule. If it was already removed, > >> we have to reinstate it. In general, the logic in the current OVS code > >> seems correct in this particular place. > >> > >>> > >>> Isn't there a risk of different revalidator threads going out of sync > >>> with different > >>> versions of a learned rule in their xcache? > >>> > >>> > >>> It also seems like we may hold the reference to this deleted rule for > >>> a > >>> long time in the cache until the parent datapath flow is deleted. > >>> > >>> > >>> What is triggering the deletion of the learned rule in your case? > >>> Is it deletion or modification of the parent rule? And does it have > >>> 'delete_learned' flag? > >>> > >>> > >>> In the case that I was debugging, the parent learn action was not > >>> deleted. Instead, > >>> new traffic from a different port was supposed to cause the learn action > >>> to modify > >>> the existing rule. But instead we see the rule modified with the correct > >>> information, > >>> and then immediately reverted with old stale information. > >> > >> The problem here is that we can't define which one is 'correct'. > >> It's a race where both values are correct, but you like one better > >> for some reason. New rules are only learned if the parent datapath > >> flow that has a learn() action seen packets after the previous stats > >> check. > >> > >> What you're describing is that you have a single rule with a learn > >> action that is capable of re-writing the same learned rule depending > >> on what kind of packet actually hits it. Is that correct? > > > > In this case the rule is: > > > > learn(table=20,hard_timeout=300,priority=1,cookie=0xcb684d24d14b14ad,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1 > > > > And an example of the learned rule is: > > > > vlan_tci=0x0005/0x0fff,dl_dst=fc:16:3e:3e:5d:e0,actions=load:0->NXM_OF_VLAN_TCI[],load:0x3->NXM_NX_TUN_ID[],ou > > tput:2 > > > > The scenario is that ingressing packets are supposed to change the > > route that egress packets take, and that route can change over time. > > > > What I see from some additional logging is: > > > > 1. traffic begins to ingress on port A > > 2. learn action installs a rule to route response to A > > 3. traffic begins to egress on port A > > 4. port A becomes invalid, no further traffic ingresses from port A > > 5. traffic ingresses on port B > > 6. learn action installs a rule to route response to B > > 7. revalidator replaces (6) with (2) > > > > Eventually the egressing traffic will become exausted and the correct > > rule is installed, but this can take several minutes. The rule > > regression can even happen many milliseconds after learn installs the > > correct rule and even after packets have flowed through the new rule. > > The rule can be replaced if there was no revalidation between steps > 4 and 5, i.e. at the point of revalidation there is a stats increase > on both datapath flows (ingress A and ingress B). > Revalidators can sleep for up to 500ms by default between revalidation > cycles, so the replacement can happen many milliseconds after. > > Is the traffic on port B continuous, or is it a one-shot packet?
Continuous but less frequent than traffic in the opposite direction. In a client's environment there were ingress packets every few seconds and egress packets continued to flow incorrectly for several minutes afterwards. If this solution is unacceptable; I could resubmit with an additional check to only suppress the flow_mod if the learn_modify rule is in a removed state and there is already a conflicting rule in the classifier that isn't removed. Or would you prefer a completely different approach be taken here? Thanks, M > The point is, at least one revalidation cycle should pass after step 4. > And the traffic on port B should be seen after this revalidation cycle > is completed. So, step 5 should be at least 500+ ms apart from step 4. > > > > > > >> I would consider this kind of setup as fairly fragile and would not > >> recommend using it in general. > >> > >> However, the scenario is likely like this: > >> > >> 1. Upcall on a rule with learn() action. > >> New datapath flow (1) is installed. This flow has a match on > >> all the fields used in a learn action and ukey contains the > >> translation cache with a rule that it produces. > >> The new learned rule is installed from this. > >> > >> 2. Upcall with a different packet that doesn't match the first > >> datapath flow (1), but on the same rule with a learn() action. > >> New datapath flow (2) is installed. This flow has a match on > >> all the fields used in a learn action (they are different from > >> the first one, since we have an upcall) and ukey contains the > >> translation cache with a rule that it produces. Let's say the > >> learned rule has the same match, but different actions. > >> This overwrites the previously installed learned rule. > >> > >> 3. Both datapath flows (1) and (2) get some traffic. > >> > >> 4. Revalidation starts. > >> > >> 5. Revalidator A dumps datapath flow (1). > >> Revalidator B dumps datapath flow (2). > >> > >> 6. Revalidators A and B are trying to revalidate these datapath flows > >> at the same time. Both datapath flows had traffic since the > >> last revalidation. Both revalidators are trying to update the > >> learned rule, which is the same for both of them. > >> > >> 7. One of them does that first, the other one ends up with a rule in a > >> REMOVED state, since it was just replaced by the other revalidator. > >> > >> > >> There is no correct or incorrect rule in this situation. Both > >> of them are correct, because both datapath flows have seen packets > >> since the previous revalidation. We can't guarantee which one will > >> end up being written the last and therefore will take effect. > >> if we refuse to reinstate the REMOVED rule, we still can't say which > >> one will be active in the end. > >> > >> What can resolve the race is the next revalidation cycle. > >> Once one of the datapath flows will stop receiving traffic, there > >> should no longer be a race condition and the next revalidation will > >> set the "correct" rule, i.e. the rule that corresponds to a learning > >> datapath flow that seen some packets last. But until both flows have > >> traffic, there is no "correct" rule and the result will be random, > >> regardless of what we choose to do. > >> > >> > >> One issue that I see in the current code is that it looks like the table > >> version is not getting updated on learned flow refresh and hence the next > >> revalidation may not actually be triggered. In this case the random > >> result may stay for much longer than we would like. > >> > >> The suspicious code (and the potential fix) is the following: > >> > >> @@ -5548,22 +5558,15 @@ ofproto_flow_mod_learn_revert(struct > >> ofproto_flow_mod *ofm) > >> } > >> > >> enum ofperr > >> -ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm, > >> - struct ofproto *orig_ofproto) > >> +ofproto_flow_mod_learn_finish(struct ofproto_flow_mod *ofm) > >> OVS_REQUIRES(ofproto_mutex) > >> { > >> struct rule *rule = rule_collection_rules(&ofm->new_rules)[0]; > >> enum ofperr error = 0; > >> > >> - /* If learning on a different bridge, must bump its version > >> - * number and flush connmgr afterwards. */ > >> - if (rule->ofproto != orig_ofproto) { > >> - ofproto_bump_tables_version(rule->ofproto); > >> - } > >> + ofproto_bump_tables_version(rule->ofproto); > >> error = ofproto_flow_mod_finish(rule->ofproto, ofm, NULL); > >> - if (rule->ofproto != orig_ofproto) { > >> - ofmonitor_flush(rule->ofproto->connmgr); > >> - } > >> + ofmonitor_flush(rule->ofproto->connmgr); > >> > >> return error; > >> } > >> --- > >> > >> We should actually bump the version and request a revalidation in > >> any case, not only when the change is in a different bridge. > > FWIW, the original code in commit 1f4a89336682 was written mostly > for a case where learning is triggered by a packet-out executed > within a bundle. And we can't actually publish changes for a current > bridge in this case. So, the fix should be a bit more clever. > > >> > >> Could you try that in your setup? > > > > That was actually my initial suspicion when investigating this bug, > > but this change doesn't improve the buggy behaviour. > > > > Thank you, > > M > > > >> > >> > >> The issue appears to come from the commit: > >> 1f4a89336682 ("ofproto: Refactor packet_out handling.") > >> > >> There is a slight chance of launching ourselves into an infinite > >> revlidation loop, I guess. Especially if both learning datapath > >> flows will continue to receive the traffic, but there is a minimal > >> revalidation interval, IIRC. Not sure if it applies in this case. > >> > >> Best regards, Ilya Maximets. > >> > > > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev