On Sat, Jul 23, 2016 at 09:06:26AM -0400, Lance Richardson wrote:
> > From: "Ben Pfaff" <b...@ovn.org>
> > To: "Lance Richardson" <lrich...@redhat.com>
> > Cc: dev@openvswitch.org
> > Sent: Saturday, July 23, 2016 1:49:36 AM
> > Subject: Re: [ovs-dev] [PATCH v2] ovn-controller: eliminate stall in ofctrl 
> > state machine
> > 
> > On Fri, Jul 22, 2016 at 11:04:47PM -0400, Lance Richardson wrote:
> > > ----- Original Message -----
> > > > From: "Ben Pfaff" <b...@ovn.org>
> > > > To: "Lance Richardson" <lrich...@redhat.com>
> > > > Cc: dev@openvswitch.org
> > > > Sent: Friday, July 22, 2016 6:47:14 PM
> > > > Subject: Re: [ovs-dev] [PATCH v2] ovn-controller: eliminate stall in
> > > > ofctrl state machine
> > > > 
> > > > On Thu, Jul 07, 2016 at 08:31:08PM -0400, Lance Richardson wrote:
> > > > > The "ovn -- 2 HVs, 3 LRs connected via LS, static routes"
> > > > > test case currently exhibits frequent failures. These failures
> > > > > occur because, at the time that the test packets are sent to
> > > > > verify forwarding, no flows have been installed in the vswitch
> > > > > for one of the hypervisors.
> > > > > 
> > > > > Investigation shows that, in the failing case, the ofctrl state
> > > > > machine has not yet transitioned to the S_UPDATE_FLOWS state.
> > > > > This occurrs when ofctrl_run() is called and:
> > > > >    1) The state is S_TLV_TABLE_MOD_SENT.
> > > > >    2) An OFPTYPE_NXT_TLV_TABLE_REPLY message is queued for reception.
> > > > >    3) No event (other than SB probe timer expiration) is expected
> > > > >       that would unblock poll_block() in the main ovn-controller
> > > > >       loop.
> > > > > 
> > > > > In this scenario, ofctrl_run() will move state to S_CLEAR_FLOWS
> > > > > and return, without having executed run_S_CLEAR_FLOWS() which
> > > > > would have immediately transitioned the state to S_UPDATE_FLOWS
> > > > > which is needed in order for ovn-controller to configure flows
> > > > > in ovs-vswitchd. After a delay of about 5 seconds (the default
> > > > > SB probe timer interval), ofctrl_run() would be called again
> > > > > to make the transition to S_UPDATE_FLOWS, but by this time
> > > > > the test case has already failed.
> > > > > 
> > > > > Fix by expanding the state machine's "while state != old_state"
> > > > > loop to include processing of receive events, with a maximum
> > > > > iteration limit to prevent excessive looping in pathological
> > > > > cases. Without this fix, around 40 failures are seen out of
> > > > > 100 attempts, with this fix no failures have been observed in
> > > > > several hundred attempts.
> > > > > 
> > > > > Signed-off-by: Lance Richardson <lrich...@redhat.com>
> > > > > ---
> > > > >  v2: Added maximum iteration limit to state machine loop.
> > > > 
> > > > Thanks for investigating this.
> > > > 
> > > > I understand why your patch fixes a problem, but I don't yet understand
> > > > the root cause of the problem, and I'd like to know more before I commit
> > > > the patch.  I have two questions:
> > > > 
> > > > First, I don't understand why, if there's a message queued for reception
> > > > as you describe in 2), the main loop would wait for a timer expiration.
> > > > Before poll_block(), the main loop should call ofctrl_wait(), which
> > > > calls rconn_recv_wait(), which should cause poll_block() to wake up if
> > > > there's anything incoming from the OpenFlow connection.
> > > > 
> > > > Second, I don't know why the test "state == old_state" is here in
> > > > ofctrl_run().  I think that we can delete it.  It could be a culprit,
> > > > but on the other hand it seems like it's not a big deal if poll_block()
> > > > properly wakes up when a message is received:
> > > > 
> > > >     for (int i = 0; state == old_state && i < 50; i++) {
> > > >         struct ofpbuf *msg = rconn_recv(swconn);
> > > >         if (!msg) {
> > > >             break;
> > > >         }
> > > > 
> > > > Thanks,
> > > > 
> > > > Ben.
> > > > 
> > > As currently implemented, when ofctrl_run() is called,
> > > the run_*() functions will be called to transition through states that
> > > don't depend on external events until the state no longer changes, then
> > > any pending receive events are processed until the arbitrary limit of 50
> > > events is processed, all receive events have been processed, or the state
> > > changes (which does seem odd).
> > > 
> > > <This is the important bit:>
> > > If the receive event handling happens to transition to a state for which
> > > receive events are not expected, i.e. where the run_*() needs to be called
> > > in order to transition to the next state, then the ofctrl_run() will 
> > > return
> > > and the transition will not be made until an event (timer expiration or
> > > receive) occurs.
> > > 
> > > So the "message queued for reception" I referred to in the description is
> > > the
> > > one that transitions into S_CLEAR_FLOWS state.  ofctrl_run() returns after
> > > making this transition, despite the fact there is no receive event 
> > > expected
> > > which would
> > > cause ofctrl_run() to be called again. The state machine is stuck
> > > in this state until the next event (timer expiration) occurs to unblock
> > > the poll loop and cause ofctrl_run() to be called again.  When 
> > > ofctrl_run()
> > > is called, the run_S_CLEAR_FLOWS() function immediately transitions into
> > > S_UPDATE_FLOWS state.
> > > 
> > > Another way to describe this: if a receive event causes a state 
> > > transition,
> > > the run_*() function for the new state needs to be called in order to
> > > execute the actions for the new state (including possibly transitioning
> > > into another state).  If this doesn't occur, the state machine stalls
> > > until an event occurs to unblock the poll loop.
> > > 
> > > I can't say I'm particularly happy with the solution proposed in my patch,
> > > there's probably a better way. If the above explanation still doesn't 
> > > help,
> > > let me know and I'll make another attempt when I'm a little less tired ;-)
> > 
> > Now that you explained it so clearly, it's obvious ;-)  Sorry I didn't
> > catch on quickly.  Your patience is admirable.
> > 
> > Your version will definitely work.  However, in the furtherance of a
> > long personal tradition of screwing with people's code, here's my
> > version.
> > 
> > --8<--------------------------cut here-------------------------->8--
> > 
> > From: Lance Richardson <lrich...@redhat.com>
> > Date: Thu, 7 Jul 2016 20:31:08 -0400
> > Subject: [PATCH] ovn-controller: eliminate stall in ofctrl state machine
> > 
> > The "ovn -- 2 HVs, 3 LRs connected via LS, static routes"
> > test case currently exhibits frequent failures. These failures
> > occur because, at the time that the test packets are sent to
> > verify forwarding, no flows have been installed in the vswitch
> > for one of the hypervisors.
> > 
> > The state machine implemented by ofctrl_run() is intended to
> > iterate as long as progress is being made, either as long as
> > the state continues to change or as long as packets are being
> > received.  Unfortunately, the code had a bug: if receiving a
> > packet caused the state to change, it didn't call the state's
> > run function again to try to see if it would change the state.
> > This caused a real problem in the following case:
> > 
> >    1) The state is S_TLV_TABLE_MOD_SENT.
> >    2) An OFPTYPE_NXT_TLV_TABLE_REPLY message is received.
> >    3) No event (other than SB probe timer expiration) is expected
> >       that would unblock poll_block() in the main ovn-controller
> >       loop.
> > 
> > In such a case, ofctrl_run() would receive the packet and
> > advance the state, but not call the run function for the new
> > state, and then leave the state machine paused until the next
> > event (e.g. a timer event) occurred.
> > 
> > This commit fixes the problem by continuing to iterate the state
> > machine until the state remains the same and no packet is
> > received in the same iteration.  Without this fix, around 40
> > failures are seen out of 100 attempts, with this fix no failures
> > have been observed in several hundred attempts (using an earlier
> > version of this patch).
> > 
> > Signed-off-by: Lance Richardson <lrich...@redhat.com>
> > Signed-off-by: Ben Pfaff <b...@ovn.org>
> > ---
> >  ovn/controller/ofctrl.c | 56
> >  ++++++++++++++++++++++++++++---------------------
> >  1 file changed, 32 insertions(+), 24 deletions(-)
> > 
> > diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> > index 184e86f..f0451b7 100644
> > --- a/ovn/controller/ofctrl.c
> > +++ b/ovn/controller/ofctrl.c
> > @@ -35,6 +35,7 @@
> >  #include "openvswitch/vlog.h"
> >  #include "ovn-controller.h"
> >  #include "ovn/lib/actions.h"
> > +#include "poll-loop.h"
> >  #include "physical.h"
> >  #include "rconn.h"
> >  #include "socket-util.h"
> > @@ -409,9 +410,10 @@ ofctrl_run(const struct ovsrec_bridge *br_int)
> >          state = S_NEW;
> >      }
> >  
> > -    enum ofctrl_state old_state;
> > -    do {
> > -        old_state = state;
> > +    bool progress = true;
> > +    for (int i = 0; progress && i < 50; i++) {
> > +        /* Allow the state machine to run. */
> > +        enum ofctrl_state old_state = state;
> >          switch (state) {
> >  #define STATE(NAME) case NAME: run_##NAME(); break;
> >              STATES
> > @@ -419,35 +421,41 @@ ofctrl_run(const struct ovsrec_bridge *br_int)
> >          default:
> >              OVS_NOT_REACHED();
> >          }
> > -    } while (state != old_state);
> >  
> > -    for (int i = 0; state == old_state && i < 50; i++) {
> > +        /* Try to process a received packet. */
> >          struct ofpbuf *msg = rconn_recv(swconn);
> > -        if (!msg) {
> > -            break;
> > -        }
> > -
> > -        const struct ofp_header *oh = msg->data;
> > -        enum ofptype type;
> > -        enum ofperr error;
> > +        if (msg) {
> > +            const struct ofp_header *oh = msg->data;
> > +            enum ofptype type;
> > +            enum ofperr error;
> >  
> > -        error = ofptype_decode(&type, oh);
> > -        if (!error) {
> > -            switch (state) {
> > +            error = ofptype_decode(&type, oh);
> > +            if (!error) {
> > +                switch (state) {
> >  #define STATE(NAME) case NAME: recv_##NAME(oh, type); break;
> > -                STATES
> > +                    STATES
> >  #undef STATE
> > -            default:
> > -                OVS_NOT_REACHED();
> > +                default:
> > +                    OVS_NOT_REACHED();
> > +                }
> > +            } else {
> > +                char *s = ofp_to_string(oh, ntohs(oh->length), 1);
> > +                VLOG_WARN("could not decode OpenFlow message (%s): %s",
> > +                          ofperr_to_string(error), s);
> > +                free(s);
> >              }
> > -        } else {
> > -            char *s = ofp_to_string(oh, ntohs(oh->length), 1);
> > -            VLOG_WARN("could not decode OpenFlow message (%s): %s",
> > -                      ofperr_to_string(error), s);
> > -            free(s);
> > +
> > +            ofpbuf_delete(msg);
> >          }
> >  
> > -        ofpbuf_delete(msg);
> > +        /* If we did some work, plan to go around again. */
> > +        progress = old_state != state || msg;
> > +    }
> > +    if (progress) {
> > +        /* We bailed out to limit the amount of work we do in one go, to
> > allow
> > +         * other code a chance to run.  We were still making progress at
> > that
> > +         * point, so ensure that we come back again without waiting. */
> > +        poll_immediate_wake();
> >      }
> >  
> >      return (state == S_CLEAR_FLOWS || state == S_UPDATE_FLOWS
> > --
> 
> I think your version is better, it's cleaner and should be more robust. So:
> 
> Acked-by: Lance Richardson <lrich...@redhat.com>
> 
> Thanks,

Thanks, I applied this.  It was a little weird for you to ack your own
patch, but I added a note about my changes to it to explain.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to