On Tue, Aug 24, 2021 at 7:56 AM Anton Ivanov
<anton.iva...@cambridgegreys.com> wrote:
>
>
> On 24/08/2021 12:46, Ilya Maximets wrote:
> > On 8/24/21 1:18 PM, Anton Ivanov wrote:
> >> On 24/08/2021 12:05, Ilya Maximets wrote:
> >>> On 8/24/21 7:36 AM, Anton Ivanov wrote:
> >>>> On 23/08/2021 22:36, Ilya Maximets wrote:
> >>>>> On 8/23/21 10:37 PM, Anton Ivanov wrote:
> >>>>>> On 23/08/2021 21:26, Ilya Maximets wrote:
> >>>>>>> On 8/23/21 10:20 PM, Anton Ivanov wrote:
> >>>>>>>> Should not be the case.
> >>>>>>>>
> >>>>>>>> The map is pre-sized for the size from the previous iterations.
> >>>>>>>>
> >>>>>>>> Line 12861 in my tree which is probably a few commits out of date:
> >>>>>>>>
> >>>>>>>>         fast_hmap_size_for(&lflows, max_seen_lflow_size);
> >>>>>>>>
> >>>>>>>> And immediately after building the lflows:
> >>>>>>>>
> >>>>>>>>         if (hmap_count(&lflows) > max_seen_lflow_size) {
> >>>>>>>>             max_seen_lflow_size = hmap_count(&lflows);
> >>>>>>>>         }
> >>>>>>>>
> >>>>>>>> So the map SHOULD be sized correctly - to the most recent seen lflow 
> >>>>>>>> count.
> >>>>>>> Please, re-read the commit message.  It's a first run of the loop
> >>>>>>> and the 'max_seen_lflow_size' is default 128 at this point.
> >>>>>> Ack,
> >>>>>>
> >>>>>> Not using auto-resizing in single threaded mode is a bug. Thanks for 
> >>>>>> fixing it.
> >>>>>>
> >>>>>>    From that perspective the patch is a straight +1 from me.
> >>>>>>
> >>>>>>    From the perspective of the use case stated in the commit message- 
> >>>>>> I am not sure it addresses it.
> >>>>>>
> >>>>>> If northd is in single-threaded mode and is tackling a GIGANTIC> 
> >>>>>> database, it may never complete the first iteration before the
> >>>>>> expiration of the timers and everyone deciding that northd is AWOL.
> >>>>> Well, how do you suggest to fix that?  Obviously, we can always create
> >>>>> a database that northd will never be able to process in a reasonable
> >>>>> amount of time.  And it doesn't matter if it's single- or 
> >>>>> multi-threaded.
> >>>>>
> >>>>> In this case NbDB is only 9MB in size, which is very reasonable, and
> >>>>> northd on my laptop takes more than 15 minutes to process it (I killed
> >>>>> it at this point).  With the patch applied it took only 11 seconds.
> >>>>> So, for me, this patch pretty much fixes the issue.  11 seconds is not
> >>>>> that bad, e.g. ovn-k8s configures inactivity probes for clients to 180.
> >>>>> It would be great to reduce, but we're not there yet.
> >>>>>
> >>>>>> In that case, if it is multi-threaded from the start, it should 
> >>>>>> probably
> >>>>>> grab the sizing from the lflow table hash in south db. That would be a
> >>>>>> good approximation for the first run.
> >>>>> This will not work for a case where SbDB is empty for any reason while
> >>>>> NbDB is not.  And there is still a case where northd initially connects
> >>>>> to semi-empty databases and after few iterations NbDB receives a big
> >>>>> transaction and generates a big update for northd.
> >>>> A partial fix is to resize to optimum size the hash after lflow 
> >>>> processing.
> >>> I'm not sure I understand what you mean here, because resizing after
> >>> lflow processing will not help.  The lflow processing itself is the
> >>> very slow part that we're trying to make faster here.
> >> That can be the case only with dpgroups. Otherwise lflows are just a 
> >> destination to dump data and the bucket sizing is irrelevant because there 
> >> is never any lookup inside lflows during processing. The lflow map is just 
> >> used to store data. So if it is suboptimal at the exit of build_lflows() 
> >> the resize will fix it before the first lookup shortly thereafter.
> >>
> >> Are you running it with dpgroups enabled? In that case there are lookups 
> >> inside lflows during build which happen under a per-bucket lock. So in 
> >> addition to suboptimal size when searching the contention depends on the 
> >> number of buckets. If they are too few, the system becomes heavily 
> >> contended resulting in ridiculous computation sizes.
> > Oh, I see.  Indeed, without dp-groups there is no lookup during
> > lflow build.  I missed that.  So, yes, I agree that for a case
> > without dp-groups, re-sizing after lflow processing should work.
> > We need that for parallel case.
> >
> > Current patch (use hmap_insert() that resizes if needed) helps
> > for all non-parallel cases.
>
> Indeed. It should go in.
>

Why can't we have hmap_insert() for both parallel and non parallel configs
to start with and switch over to hmap_insert_fast() when ovn-northd
has successfully connected to SB DB and has approximated on the
more accurate hmap size ?

Thanks
Numan

> I will sort out the other cases to the extent possible.


>
> Brgds,
>
> A.
>
> >
> > I'm mostly running dp-groups + non-parallel which is a default
> > case for ovn-heater/ovn-k8s.
> >
> >> For the case of "dpgroups + parallel + first iteration + pre-existing 
> >> large database" there is no cure short of pre-allocating the hash to 
> >> maximum size.
> > Yeah, dp-groups + parallel is a hard case.
> >
> >> I am scale testing that as well as resize (for non-dp-groups cases) at 
> >> present.
> >>
> >> Brgds,
> >>
> >> A.
> >>
> >>>> If the sizing was correct - 99.9% of the case this will be a noop.
> >>>>
> >>>> If the sizing was incorrect, it will be resized so that the DP searches 
> >>>> and all other ops which were recently added for flow reduction will work 
> >>>> optimally.
> >>>>
> >>>> This still does not work for lflow compute with dpgroups + parallel upon 
> >>>> initial connect and without a SB database to use for size guidance. It 
> >>>> will work for all other cases.
> >>>>
> >>>> I will send two separate patches to address the cases which can be 
> >>>> easily addressed and see what can be done with the dp+parallel upon 
> >>>> initial connect to an empty sb database.
> >>>>
> >>>> Brgds,
> >>>>
> >>>> A
> >>>>
> >>>>>> A.
> >>>>>>
> >>>>>>>> A.
> >>>>>>>>
> >>>>>>>> On 23/08/2021 21:02, Ilya Maximets wrote:
> >>>>>>>>> 'lflow_map' is never expanded because northd always uses fast
> >>>>>>>>> insertion.  This leads to the case where we have a hash map
> >>>>>>>>> with only 128 initial buckets and every ovn_lflow_find() ends
> >>>>>>>>> up iterating over n_lflows / 128 entries.  It's thousands of
> >>>>>>>>> logical flows or even more.  For example, it takes forever for
> >>>>>>>>> ovn-northd to start up with the Northbound Db from the 120 node
> >>>>>>>>> density-heavy test from ovn-heater, because every lookup is slower
> >>>>>>>>> than previous one.  I aborted the process after 15 minutes of
> >>>>>>>>> waiting, because there was no sign that it will converge.  With
> >>>>>>>>> this change applied the loop completes in only 11 seconds.
> >>>>>>>>>
> >>>>>>>>> Hash map will be pre-allocated to the maximum seen number of
> >>>>>>>>> logical flows on a second iteration, but this doesn't help for
> >>>>>>>>> the first iteration when northd first time connects to a big
> >>>>>>>>> Northbound database, which is a common case during failover or
> >>>>>>>>> cluster upgrade.  And there is an even trickier case where big
> >>>>>>>>> NbDB transaction that explodes the number of logical flows received
> >>>>>>>>> on not the first run.
> >>>>>>>>>
> >>>>>>>>> We can't expand the hash map in case of parallel build, so this
> >>>>>>>>> should be fixed separately.
> >>>>>>>>>
> >>>>>>>>> CC: Anton Ivanov <anton.iva...@cambridgegreys.com>
> >>>>>>>>> Fixes: 74daa0607c7f ("ovn-northd: Introduce parallel lflow build")
> >>>>>>>>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
> >>>>>>>>> ---
> >>>>>>>>>       northd/ovn-northd.c | 6 +++++-
> >>>>>>>>>       1 file changed, 5 insertions(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> >>>>>>>>> index 3d8e21a4f..40cf957c0 100644
> >>>>>>>>> --- a/northd/ovn-northd.c
> >>>>>>>>> +++ b/northd/ovn-northd.c
> >>>>>>>>> @@ -4387,7 +4387,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, 
> >>>>>>>>> struct ovn_datapath *od,
> >>>>>>>>>                          nullable_xstrdup(ctrl_meter),
> >>>>>>>>>                          ovn_lflow_hint(stage_hint), where);
> >>>>>>>>>           hmapx_add(&lflow->od_group, od);
> >>>>>>>>> -    hmap_insert_fast(lflow_map, &lflow->hmap_node, hash);
> >>>>>>>>> +    if (!use_parallel_build) {
> >>>>>>>>> +        hmap_insert(lflow_map, &lflow->hmap_node, hash);
> >>>>>>>>> +    } else {
> >>>>>>>>> +        hmap_insert_fast(lflow_map, &lflow->hmap_node, hash);
> >>>>>>>>> +    }
> >>>>>>>>>       }
> >>>>>>>>>         /* Adds a row with the specified contents to the 
> >>>>>>>>> Logical_Flow table. */
> >
> --
> Anton R. Ivanov
> Cambridgegreys Limited. Registered in England. Company Number 10273661
> https://www.cambridgegreys.com/
>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to