On Thu, Jul 13, 2017 at 02:04:53AM +0000, fukaige wrote:
> 
> 
> > -----Original Message-----
> > From: fukaige
> > Sent: Tuesday, July 11, 2017 3:34 PM
> > To: 'Ben Pfaff'
> > Cc: d...@openvswitch.org; Zhaoshenglong
> > Subject: RE: [PATCH v3] tnl-ports: Remove netdevs in netdev_hash when
> > deleted
> > 
> > 
> > > -----Original Message-----
> > > From: Ben Pfaff [mailto:b...@ovn.org]
> > > Sent: Saturday, July 08, 2017 5:54 AM
> > > To: fukaige
> > > Cc: d...@openvswitch.org; Zhaoshenglong
> > > Subject: Re: [PATCH v3] tnl-ports: Remove netdevs in netdev_hash when
> > > deleted
> > >
> > > On Thu, Jun 15, 2017 at 09:58:57AM +0800, fukaige wrote:
> > > > tart a virtual machine with its backend tap device attached to a
> > > > brought up
> > > linux bridge.
> > > > If we delete the linux bridge when vm is still running, we'll get
> > > > the following error when trying to create a ovs bridge with the same 
> > > > name.
> > > >
> > > > The reason is that ovs-router subsystem add the linux bridge into
> > > > netdev_shash, but does not remove it when the bridge is deleted in
> > > > the situation. When the bridge is deleted, ovs will receive a
> > > > RTM_DELLINK msg,
> > > take this chance to remove the bridge in netdev_shash.
> > > >
> > > > ovs-vsctl: Error detected while setting up 'br-eth'.  See
> > > > ovs-vswitchd log for
> > > details.
> > > >
> > > > ovs-vswitchd log:
> > > >
> > 2017-05-11T03:45:25.293Z|00026|ofproto_dpif|INFO|system@ovs-system:
> > > > Datapath supports recirculation
> > > >
> > 2017-05-11T03:45:25.293Z|00027|ofproto_dpif|INFO|system@ovs-system:
> > > > MPLS label stack length probed as 1
> > > >
> > 2017-05-11T03:45:25.293Z|00028|ofproto_dpif|INFO|system@ovs-system:
> > > > Datapath supports unique flow ids
> > > >
> > 2017-05-11T03:45:25.293Z|00029|ofproto_dpif|INFO|system@ovs-system:
> > > > Datapath supports ct_state
> > > >
> > 2017-05-11T03:45:25.293Z|00030|ofproto_dpif|INFO|system@ovs-system:
> > > > Datapath supports ct_zone
> > > >
> > 2017-05-11T03:45:25.293Z|00031|ofproto_dpif|INFO|system@ovs-system:
> > > > Datapath supports ct_mark
> > > >
> > 2017-05-11T03:45:25.293Z|00032|ofproto_dpif|INFO|system@ovs-system:
> > > > Datapath supports ct_label
> > > > 2017-05-11T03:45:25.364Z|00001|ofproto_dpif_upcall(handler226)|INFO|
> > > > re ceived packet on unassociated datapath port 0
> > > > 2017-05-11T03:45:25.368Z|00033|netdev_linux|WARN|ethtool command
> > > > ETHTOOL_GFLAGS on network device br-eth failed: No such device
> > > > 2017-05-11T03:45:25.368Z|00034|dpif|WARN|system@ovs-system: failed
> > > to
> > > > add br-eth as port: No such device
> > > > 2017-05-11T03:45:25.368Z|00035|bridge|INFO|bridge br-eth: using
> > > > datapath ID 00002a51cf9f2841
> > > > 2017-05-11T03:45:25.368Z|00036|connmgr|INFO|br-eth: added service
> > > controller "punix:/var/run/openvswitch/br-eth.mgmt"
> > > >
> > > > Signed-off-by: fukaige <fuka...@huawei.com>
> > >
> > > Thank you for the updated patch.
> > >
> > > This patch raises some difficulties.  First, it uses rtnetlink, which
> > > is Linux specific.  We do not want tnl-ports to be Linux-specific.
> > > The more generic alternative is ifnotifier, but it provides no
> > > information about the change that took place, so it is harder to deal
> > > with.  Second, it will dereference a null pointer if the 'change'
> > > passed in is null, which can happen if the system is busy or devices are
> > changing quickly, as described in rtnetlink.h:
> > >
> > >     /* Function called to report that a netdev has changed.  'change'
> > > describes the
> > >      * specific change.  It may be null if the buffer of change 
> > > information
> > >      * overflowed, in which case the function must assume that every
> > > device may
> > >      * have changed.  'aux' is as specified in the call to
> > >      * rtnetlink_notifier_register().  */
> > >     typedef
> > >     void rtnetlink_notify_func(const struct rtnetlink_change *change,
> > >                                void *aux);
> > >
> > > I am not sure the best way to proceed.  One way would be to revamp
> > > ifnotifier so that it provides information on the device that changed
> > > and the kind of change, and then adapt each of the implementations to pass
> > that along as well.
> > > Other approaches along these lines are also possible.
> > >
> > > Another approach might be to notify the ovs-router subsystem when a
> > > bridge is deleted, so that it can drop all of the references it has
> > > for that bridge.  I haven't looked into how much work this would be.
> > >
> > > What do you think?
> > >
> > > Thanks,
> > >
> > > Ben.
> > 
> > Thanks for your review.
> > 
> > I prefer the second approach you mentioned. May be we can register a 
> > callback
> > in ovs-router level, like route_table_init.
> > When a bridge is deleted, ovs-router subsystem will get a msg, like
> > RTM_DELLINK on linux. So it can drop all of the references it has for that
> > bridge.
> > 
> > I don't know if there is same bug in other OSes, like bsd, and which type 
> > of msg
> > will be sent in the situation.
> 
> I think we can fix the bug through the following way:
>       1. register a callback using rtnetlink_notifier_create in 
> route_table_init(route-table.c);
>       2.when receive the msg RTM_DELLINK, drop all of the references it has 
> for that bridge.
> 
> The reason I think the above method can work is that ovs-router subsystem in 
> bsd(route-table-bsd.c) does not hold the references of that bridge.
> So, there is no this bug in bsd. I just hit it in linux.
> 
> What do you think?

It's promising enough that I'd like to see a patch that implements it.
Would you mind writing one?

Thanks,

Ben.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to