On 9/15/20 9:21 PM, Flavio Leitner wrote: > When the number of flows in the datapath reaches twice the > maximum, revalidators will delete all flows as an emergency > action to recover. In that case, log a message with values > and increase a coverage counter. > > Signed-off-by: Flavio Leitner <f...@sysclose.org> > --- > ofproto/ofproto-dpif-upcall.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c > index 05a912f57..12e94287a 100644 > --- a/ofproto/ofproto-dpif-upcall.c > +++ b/ofproto/ofproto-dpif-upcall.c > @@ -57,6 +57,7 @@ COVERAGE_DEFINE(upcall_ukey_contention); > COVERAGE_DEFINE(upcall_ukey_replace); > COVERAGE_DEFINE(revalidate_missed_dp_flow); > COVERAGE_DEFINE(upcall_flow_limit_hit); > +COVERAGE_DEFINE(upcall_flow_limit_kill); > > /* A thread that reads upcalls from dpif, forwards each upcall's packet, > * and possibly sets up a kernel flow as a cache. */ > @@ -2622,6 +2623,7 @@ revalidate(struct revalidator *revalidator) > > long long int max_idle; > long long int now; > + size_t kill_all_limit; > size_t n_dp_flows; > bool kill_them_all; > > @@ -2649,7 +2651,18 @@ revalidate(struct revalidator *revalidator) > COVERAGE_INC(upcall_flow_limit_hit); > } > > - kill_them_all = n_dp_flows > flow_limit * 2; > + kill_them_all = false; > + kill_all_limit = flow_limit * 2; > + if (OVS_UNLIKELY(n_dp_flows > kill_all_limit)) { > + static struct vlog_rate_limit rlem = VLOG_RATE_LIMIT_INIT(1, 5); > + > + VLOG_WARN_RL(&rlem, "Emergency: deleting all flows " > + "(now: %"PRIuSIZE", max: %"PRIuSIZE")",
Almost he same concern here as in the other patch. But this message is even more misleading since it prints doubled dynamically adjusted limit which is OK to print, but how to use this value in practice if you do not know internals of this code? Maybe something like "Number of datapath flows (%PRIuSIZE) twice as high as current dynamic flow limit (%PRIuSIZE). Deleting all flows as an emergency measure."? Another thing is that this message is actually misleading anyway because revalidators recalculates the number of flows for each batch (50) of dumped flows (every 100ms, to be honest), so we will not actually flush all the flows in many cases. We will remove them until situation is normalized. So, probably: "Number of datapath flows (%PRIuSIZE) twice as high as current dynamic flow limit (%PRIuSIZE). Starting to delete flows unconditionally as an emergency measure."? Also, this message should be printed only once per revalidation cycle, I guess, not for every flow batch. > + n_dp_flows, kill_all_limit); > + COVERAGE_INC(upcall_flow_limit_kill); > + kill_them_all = true; > + } > + > max_idle = n_dp_flows > flow_limit ? 100 : ofproto_max_idle; > > udpif->dpif->current_ms = time_msec(); > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev