On Sat, Dec 7, 2019 at 1:36 AM Han Zhou <hz...@ovn.org> wrote:
>
> Thanks Dumitru for the patch. Please see my comments below.
>

Hi Han,

Thanks for the review.

> On Fri, Dec 6, 2019 at 5:38 AM Dumitru Ceara <dce...@redhat.com> wrote:
> >
> > If the last ovn-controller main loop run started a transaction to the
> > Southbound DB and the transaction is still in progress, ovn-controller
> > will not call engine_run(). In case there were changes to the DB,
> > engine_need_run() would return true which would trigger an immediate
> > forced recompute in the next processing loop iteration.
> >
> > However, there are scenarios when updates can be processed incrementally
> > even if no Southbound DB transaction is available. One example, often
> > seen in the field, is when the MAC_Binding table is updated. Currently
> > such an update received while a transaction is still committing would
> > trigger a forced recompute.
> >
> > To minimize the number of forced recomputes, we now call
> > engine_run(false), i.e., try to process updates incrementally without
> > allowing recompute, also when ovnsb_idl_txn == NULL. This is safe
> > because ovnsb_idl_txn is not used by change_handlers and run handlers
> > are not allowed to execute when calling engine_run(false).
> >
> This patch assumes we will never need to do SB updates when handling changes 
> incrementally. There are two problems:
> 1. I don't see a reason why a change_handler (for doing incremental compute) 
> can't update SB DB. For example Numan mentioned about working on incremental 
> processing for port-bindings on local chassis. I believe those change 
> handling would trigger SB updates, e.g. claiming/releasing a port.
> 2. Even if we can keep this assumption and never do incremental processing 
> for changes that require updating SB DB, how do we ensure this constraint is 
> enforced in coding?
>

I think my commit message is not clear enough. I didn't mean that
change handlers will never perform SB updates. What I meant was that
if a change handler needs to perform SB updates it needs to validate
that ovnsb_idl_txn != NULL and if it's NULL return "false" to trigger
a recompute. At this moment we have no change handler that updates the
SB database but as you pointed out we might have them in the future.
Then, the new change handlers will have to make sure that
ovnsb_idl_txn is not NULL.

> For 2), it may be easy. We could hide the interface get_context(), and always 
> pass the context to xxx_run() interface, so that ovnsb_idl_txn can be 
> accessed by recompute but never by change_handlers.
> What I am really concerned is 1). I think the long term solution is to make 
> the OVSDB change tracking available across iterations. Maybe it is not a 
> trivial work.
>

As mentioned above, I don't think we need this as long as the contract
of get_context() is clear and the users know to check for NULL txn
pointers in change handlers.

> If we believe there is a big performance gain here and we are sure we don't 
> need to update SB DB in change handlers (in short term), we may proceed with 
> this approach as long as 2) is addressed, and revert this whenever the long 
> term solution (OVSDB change tracking over iterations) is ready.

There's no real reason to not try to process changes incrementally
even when a SB txn is in progress (ovnsb_idl_txn == NULL). We use the
same reasoning for !ofctrl_can_put() in order to avoid forced
recomputes. We saw it on customer deployments that MAC_Binding updates
received while a SB transaction was still in progress were triggering
unnecessary forced full recomputes that last in the order of tens of
seconds.

What do you think?

>
> > To make sure that other users of ovnsb_idl_txn, like pinctrl_run(),
> > get a chance to run as soon as the transaction is completed, if the
> > engine has successfully run and ovnsb_idl_txn is NULL we trigger an
> > immediate wake and a new iteration of the processing loop.
>
> I think this will cause busy loop until the transaction completes. And I 
> think it is not necessary because when transaction completes there will be a 
> jsonrpc message coming and trigger the main loop, and pinctrl will get a 
> chance to run.

Ah, right, I missed that case. I can just remove this check.

Thanks,
Dumitru

>
> >
> > CC: Han Zhou <hz...@ovn.org>
> > CC: Numan Siddique <num...@ovn.org>
> > Signed-off-by: Dumitru Ceara <dce...@redhat.com>
> > ---
> >  controller/ovn-controller.c | 13 +++++++++++++
> >  lib/inc-proc-eng.h          |  8 +++++++-
> >  2 files changed, 20 insertions(+), 1 deletion(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 5874776..4a27d5f 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -2100,6 +2100,14 @@ main(int argc, char *argv[])
> >                          } else {
> >                              engine_run(true);
> >                          }
> > +                    } else {
> > +                        /* Even if there's no SB DB transaction available,
> > +                         * try to run the engine so that we can handle any
> > +                         * incremental changes that don't require a 
> > recompute.
> > +                         * If a recompute is required, the engine will 
> > abort,
> > +                         * triggerring a full run in the next iteration.
> > +                         */
> > +                        engine_run(false);
> >                      }
> >                      stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> >                                     time_msec());
> > @@ -2168,6 +2176,11 @@ main(int argc, char *argv[])
> >                           "br_int %p, chassis %p", br_int, chassis);
> >                  engine_set_force_recompute(true);
> >                  poll_immediate_wake();
> > +            } else if (!ovnsb_idl_txn) {
> > +                VLOG_DBG("engine ran, no SB DB transaction available, "
> > +                         "trigger an immediate loop run: "
> > +                         "br_int %p, chassis %p", br_int, chassis);
> > +                poll_immediate_wake();
> >              } else {
> >                  engine_set_force_recompute(false);
> >              }
> > diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
> > index 5b92971..2f90b0a 100644
> > --- a/lib/inc-proc-eng.h
> > +++ b/lib/inc-proc-eng.h
> > @@ -189,7 +189,13 @@ void engine_add_input(struct engine_node *node, struct 
> > engine_node *input,
> >   * iteration, and the change can't be tracked across iterations */
> >  void engine_set_force_recompute(bool val);
> >
> > -const struct engine_context * engine_get_context(void);
> > +/* Return the current engine_context. The values in the context can be NULL
> > + * if the engine is run with allow_recompute == false in the current
> > + * iteration.
> > + * Therefore, it is the responsibility of the caller to check the context
> > + * values when called from change_handlers.
> > + */
> > +const struct engine_context *engine_get_context(void);
> >
> >  void engine_set_context(const struct engine_context *);
> >
> > --
> > 1.8.3.1
> >

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to