On Tue, Jul 19, 2016 at 04:55:49PM -0500, Ryan Moats wrote:
> 
> 
> > On Tue, Jul 19, 2016 at 02:57:43PM -0500, Ryan Moats wrote:
> > > Ben Pfaff <blp at ovn.org> wrote on 07/19/2016 02:54:36 PM:
> > >
> > > > From: Ben Pfaff <blp at ovn.org>
> > > To: Ryan Moats/Omaha/IBM at IBMUS>
> > > Cc: dev at openvswitch.org
> > > Date: 07/19/2016 02:54 PM
> > > > Subject: Re: [ovs-dev, v23, 2/2] ovn-controller: Add incremental
> > > > processing to lflow_run and physical_run
> > > >
> > > > On Tue, Jul 19, 2016 at 02:18:11PM -0500, Ryan Moats wrote:
> > > > >
> > > > >
> > > > > Ben Pfaff <blp at ovn.org> wrote on 07/19/2016 01:39:43 PM:
> > > > >
> > > > > > From: Ben Pfaff <blp at ovn.org>
> > > > > > To: Ryan Moats/Omaha/IBM at IBMUS
> > > > > > Cc: dev at openvswitch.org
> > > > > > Date: 07/19/2016 01:41 PM
> > > > > > Subject: Re: [ovs-dev, v23, 2/2] ovn-controller: Add incremental
> > > > > > processing to lflow_run and physical_run
> > > > > >
> > > > > > On Tue, Jul 19, 2016 at 08:10:05AM -0500, Ryan Moats wrote:
> > > > > > >
> > > > > > >
> > > > > > > Ben Pfaff <blp at ovn.org> wrote on 07/19/2016 12:57:23 AM:
> > > > > > >
> > > > > > > > From: Ben Pfaff <blp at ovn.org>
> > > > > > > > To: Ryan Moats/Omaha/IBM at IBMUS
> > > > > > > > Cc: dev at openvswitch.org
> > > > > > > > Date: 07/19/2016 12:57 AM
> > > > > > > > Subject: Re: [ovs-dev, v23, 2/2] ovn-controller: Add
> incremental
> > > > > > > > processing to lflow_run and physical_run
> > > > > > > >
> > > > > > > > On Mon, Jul 18, 2016 at 04:21:17PM -0500, Ryan Moats wrote:
> > > > > > > > > This code changes to allow incremental processing of the
> > > > > > > > > logical flow and physical binding tables whenver possible.
> > > > > > > > >
> > > > > > > > > Note: flows created by physical_run for multicast_groups
> are
> > > > > > > > > *NOT* handled incrementally due to to be solved issues
> > > > > > > > > with GWs and local routers.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Ryan Moats <rmoats at us.ibm.com>
> > > > > > > >
> > > > > > > > With this applied, I get a number of apparently consistent
> test
> > > > > failures
> > > > > > > > (literally no passes in 5+ runs) that I don't see, at least
> not
> > > > > > > > consistently, if I revert it:
> > > > > > > >
> > > > > > > >     OVN end-to-end tests
> > > > > > > >
> > > > > > > >     2106: ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS
> FAILED
> > > > > > > (ovn.at:1307)
> > > > > > > >     2107: ovn -- 3 HVs, 1 VIFs/HV, 1 software GW, 1 LS
> FAILED
> > > > > > > (ovn.at:1473)
> > > > > > > >     2108: ovn -- 3 HVs, 3 LS, 3 lports/LS, 1 LR
> FAILED
> > > > > > > (ovn.at:1887)
> > > > > > > >     2110: ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs
> FAILED
> > > > > > > (ovn.at:2416)
> > > > > > > >     2115: ovn -- 2 HVs, 3 LRs connected via LS, static routes
> > > FAILED
> > > > > > > > (ovn.at:3053)
> > > > > > > >
> > > > > > > > Do these pass reasonably often for you?  (Maybe my changes to
> the
> > > > > > > > previous patch screwed something up?)
> > > > > > > >
> > > > > > >
> > > > > > > First, not all of the above cases failed for me even with a
> rebase
> > > of
> > > > > the
> > > > > > > last patch onto what is currently in master - the last two
> passed
> > > for
> > > > > me
> > > > > > > on either the first and second run, so I'm looking at the first
> > > three
> > > > > for
> > > > > > > the rest of this message.
> > > > > > >
> > > > > > > > Double checking by going back to my original first patch
> verified
> > > that
> > > > > all
> > > > > > > of the tests in question passed.  Going through and checking
> each
> > > of
> > > > > the
> > > > > > > changes that you made to the first patch, the following is the
> diff
> > > I
> > > > > > > needed
> > > > > > > to make things work again:
> > > > > > >
> > > > > > > diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> > > > > > > index d10dcc6..e113213 100644
> > > > > > > --- a/ovn/controller/ofctrl.c
> > > > > > > +++ b/ovn/controller/ofctrl.c
> > > > > > > @@ -553,7 +553,10 @@ ofctrl_add_flow(uint8_t table_id, uint16_t
> > > > > priority,
> > > > > > >      ovn_flow_lookup(&match_flow_table, f, &existing);
> > > > > > >      struct ovn_flow *d;
> > > > > > >      LIST_FOR_EACH (d, list_node, &existing) {
> > > > > > > -        if (uuid_equals(&f->uuid, &d->uuid)) {
> > > > > > > +        if (f->table_id == d->table_id && f->priority == d->
> > > priority
> > > > > > > +            && ofpacts_equal(f->ofpacts, f->ofpacts_len,
> > > > > > > +                             d->ofpacts, d->ofpacts_len)
> > > > > > > +            && uuid_equals(&f->uuid, &d->uuid)) {
> > > > > > >              static struct vlog_rate_limit rl =
> > > VLOG_RATE_LIMIT_INIT(5,
> > > > > 1);
> > > > > > >              char *s = ovn_flow_to_string(f);
> > > > > > >              VLOG_WARN_RL(&rl, "found duplicate flow %s for
> parent
> > > > > > > "UUID_FMT,
> > > > > > >
> > > > > > > Hopefully that will get you past the test failures and you can
> push
> > > > > > > both it and the rest of the last incremental processing patch.
> > > > > >
> > > > > > OK...
> > > > > >
> > > > > > Help me understand why that change is correct.
> > > > > >
> > > > > > I deleted the table_id and priority checks because
> ovn_flow_lookup()
> > > > > > only returns flows where those are equal anyhow, so they seemed
> > > > > > completely redundant.  That leaves the actions comparison.  Why
> would
> > > we
> > > > > > report a duplicate only if the actions are the same?  It's
> actually
> > > > > > worse, isn't it, if the actions are different, because then the
> flows
> > > > > > aren't just duplicates, they're inconsistent with each other.
> > > > >
> > > > > Actually, you are correct in that I'm doing something I shouldn't,
> but
> > > > > your solution is also doing something it shouldn't.  Yes, my
> keeping
> > > > > rules that are inconsistent is wrong, but discarding rules where
> > > > > the action changes (which is what you proposed) doesn't work
> either.
> > > > >
> > > > > How about if we split the difference?  First, verify that the uuid
> > > > > is the same, then check the actions.  If they are the same, spawn a
> > > > > warning, and if they aren't replace the old actions with the new
> ones.
> > > > > This appears to pass unit tests as well (and I think it's more
> > > correct),
> > > > > in which case the difference becomes:
> > > > >
> > > > > diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> > > > > index d10dcc6..0b8326e 100644
> > > > > --- a/ovn/controller/ofctrl.c
> > > > > +++ b/ovn/controller/ofctrl.c
> > > > > @@ -554,13 +554,21 @@ ofctrl_add_flow(uint8_t table_id, uint16_t
> > > priority,
> > > > >      struct ovn_flow *d;
> > > > >      LIST_FOR_EACH (d, list_node, &existing) {
> > > > >          if (uuid_equals(&f->uuid, &d->uuid)) {
> > > > > -            static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5,
> > > 1);
> > > > > -            char *s = ovn_flow_to_string(f);
> > > > > -            VLOG_WARN_RL(&rl, "found duplicate flow %s for parent
> > > > > "UUID_FMT,
> > > > > -                         s, UUID_ARGS(&f->uuid));
> > > > > -            ovn_flow_destroy(f);
> > > > > -            free(s);
> > > > > -            return;
> > > > > +            if (ofpacts_equal(f->ofpacts, f->ofpacts_len,
> > > > > +                              d->ofpacts, d->ofpacts_len)) {
> > > > > +                static struct vlog_rate_limit rl =
> > > VLOG_RATE_LIMIT_INIT(5,
> > > > > 1);
> > > > > +                char *s = ovn_flow_to_string(f);
> > > > > +                VLOG_WARN_RL(&rl, "found duplicate flow %s for
> parent
> > > > > "UUID_FMT
> > > > > +                             s, UUID_ARGS(&f->uuid));
> > > > > +                ovn_flow_destroy(f);
> > > > > +                free(s);
> > > > > +                return;
> > > > > +            } else {
> > > > > +                /* Update the action. */
> > > > > +                free(d->ofpacts);
> > > > > +                d->ofpacts = xmemdup(f->ofpacts, f->ofpacts_len);
> > > > > +                d->ofpacts_len = f->ofpacts_len;
> > > > > +            }
> > > > >          }
> > > > >      }
> > > > >
> > > >
> > > > I'm OK with updating the actions, but I'd prefer to warn about it
> anyway
> > > > because I still think it's a bug; if the "new" actions happen to work
> > > > better than the "old" actions, I think that's just a coincidence.
> > >
> > > While I think that's going to be a *lot* of warnings, I'm ok with
> warning
> > > in that case as well for now and we can argue the point later on after
> we
> > > have more operational experience with the patch set.
> >
> > So do we have a lot of code that's using ofctrl_add_flow() to actually
> > replace an existing flow instead of adding one?  Maybe that code should
> > be using ofctrl_set_flow() instead?
> >
> > I'm trying to understand the intent here instead of just the code.
> >
> > > You want to add the additional warning code or should I?
> >
> > I'm happy to write the code once I understand what's going on.
> 
> Sorry for breaking the thread but while I can see your reply in the
> archives,
> something in the server chain isn't it dropping it into my mailbox,
> and I don't want to look like I'm ignoring it.
> 
> Based on looking through the logs when running the unit tests, I believe
> (but I don't know for sure) that the vast majority of warnings
> will come from creating a completely identical flow (and I admit that I
> don't know why *yet* - that was a topic for another day) - the reason I
> can't say for sure is that since I've not had a separate message handling
> the action change case, I've had nothing to compare to.
> 
> Since we are going to warn on both branches, I'd like to have
> the warning message be different so that there is a canary to work
> backwards
> from to figure out what's driving what. Something like the following:
> 
> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> index d10dcc6..7fd6ff8 100644
> --- a/ovn/controller/ofctrl.c
> +++ b/ovn/controller/ofctrl.c
> @@ -554,13 +554,28 @@ ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>      struct ovn_flow *d;
>      LIST_FOR_EACH (d, list_node, &existing) {
>          if (uuid_equals(&f->uuid, &d->uuid)) {
> -            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> -            char *s = ovn_flow_to_string(f);
> -            VLOG_WARN_RL(&rl, "found duplicate flow %s for parent
> "UUID_FMT,
> -                         s, UUID_ARGS(&f->uuid));
> -            ovn_flow_destroy(f);
> -            free(s);
> -            return;
> +            if (ofpacts_equal(f->ofpacts, f->ofpacts_len,
> +                              d->ofpacts, d->ofpacts_len)) {
> +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
> 1);
> +                char *s = ovn_flow_to_string(f);
> +                VLOG_WARN_RL(&rl, "found duplicate flow %s for parent
> "UUID_FMT
> +                             s, UUID_ARGS(&f->uuid));
> +                ovn_flow_destroy(f);
> +                free(s);
> +                return;
> +            } else {
> +                /* Update the action. */
> +                free(d->ofpacts);
> +                d->ofpacts = xmemdup(f->ofpacts, f->ofpacts_len);
> +                d->ofpacts_len = f->ofpacts_len;
> +                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,
> 1);
> +                char *s = ovn_flow_to_string(f);
> +                VLOG_WARN_RL(&rl, "found flow %s with modified action for
> paren
> +                             s, UUID_ARGS(&f->uuid));
> +                ovn_flow_destroy(f);
> +                free(s);
> +                return;
> +            }
>          }
> 
> Does that make sense?

Here's a version for your review:
        http://openvswitch.org/pipermail/dev/2016-July/075709.html
        http://openvswitch.org/pipermail/dev/2016-July/075710.html
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to