On 1/3/23 13:26, Eelco Chaudron wrote:
> 
> 
> On 3 Jan 2023, at 13:20, Ilya Maximets wrote:
> 
>> On 1/3/23 12:33, Eelco Chaudron wrote:
>>> Depending on the driver implementation it can take up to 2 seconds before
>>> offloaded flow statistics are updated. This is causing a problem with
>>> min-revalidate-pps, as old statistic values are used during this period.
>>>
>>> This fix will wait for at least 2 seconds before assuming no packets
>>> where received during this period.
>>>
>>> Signed-off-by: Eelco Chaudron <echau...@redhat.com>
>>> ---
>>>
>>> Changes:
>>>  - v2: Use existing ukey->offloaded so the revalidate_missed_dp_flow case is
>>>        also covered.
>>>
>>>  ofproto/ofproto-dpif-upcall.c |   25 +++++++++++++++----------
>>>  1 file changed, 15 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>>> index ad9635496..c395adbc6 100644
>>> --- a/ofproto/ofproto-dpif-upcall.c
>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>> @@ -2094,10 +2094,11 @@ ukey_delete(struct umap *umap, struct udpif_key 
>>> *ukey)
>>>  }
>>>
>>>  static bool
>>> -should_revalidate(const struct udpif *udpif, uint64_t packets,
>>> -                  long long int used)
>>> +should_revalidate(const struct udpif *udpif, struct udpif_key *ukey,
>>> +                  uint64_t packets)
>>>  {
>>>      long long int metric, now, duration;
>>> +    long long int used = ukey->stats.used;
>>>
>>>      if (!used) {
>>>          /* Always revalidate the first time a flow is dumped. */
>>> @@ -2124,8 +2125,12 @@ should_revalidate(const struct udpif *udpif, 
>>> uint64_t packets,
>>>      duration = now - used;
>>>      metric = duration / packets;
>>>
>>> -    if (metric < 1000 / ofproto_min_revalidate_pps) {
>>> -        /* The flow is receiving more than min-revalidate-pps, so keep it. 
>>> */
>>> +    if (metric < 1000 / ofproto_min_revalidate_pps ||
>>> +        (ukey->offloaded && duration < 2000)) {
>>> +        /* The flow is receiving more than min-revalidate-pps, so keep it.
>>> +         * Or it's a hardware offloaded flow that might take up to 2 
>>> seconds
>>> +         * to update its statistics. Until we are sure the statistics had a
>>> +         * chance to be updated, also keep it. */
>>
>> Hmm.  If you know that a flow dump is taking a long time on this setup,
>> shouldn't we just bump the max-revalidator value in the database instead
>> of hardcoding yet another magic constant?
> 
> Don’t think increasing the max-revalidator value is a good solution, as it 
> affects
> the statistics gathering in general.

It affects the judgement on the dump duration and the
time when revalidators will wake up next time.  If your
dump duration is already high, your statistics gathering
is already delayed.  And your hardware doesn't seem to
provide statistics in any timely manner as well.

> It also does not solve the problem, as any update in the flow table/config can
> trigger a revalidation.

It will, but there is a check right above the one that
you're changing and it looks like this:

    if (udpif->dump_duration < ofproto_max_revalidator / 2) {
        /* We are likely to handle full revalidation for the flows. */
        return true;
    }

So, by bumping the max-revalidator value we will avoid
blind deletion and will properly revalidate the flow.
The flow will not be deleted then, unless we hit the max-idle.

Or am I missing something?

> 
> If we do not like the magic static values we can add a configuration option, 
> so it can be adjusted based on the driver implementation.
> 
>> How long the dump actually takes?  And how many flows in it?
> 
> William do you have these values, as I did not capture any of this when 
> debugging your setup.
> 
> //Eelco
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to