On Thu, Jul 13, 2017 at 02:04:53AM +0000, fukaige wrote: > > > > -----Original Message----- > > From: fukaige > > Sent: Tuesday, July 11, 2017 3:34 PM > > To: 'Ben Pfaff' > > Cc: d...@openvswitch.org; Zhaoshenglong > > Subject: RE: [PATCH v3] tnl-ports: Remove netdevs in netdev_hash when > > deleted > > > > > > > -----Original Message----- > > > From: Ben Pfaff [mailto:b...@ovn.org] > > > Sent: Saturday, July 08, 2017 5:54 AM > > > To: fukaige > > > Cc: d...@openvswitch.org; Zhaoshenglong > > > Subject: Re: [PATCH v3] tnl-ports: Remove netdevs in netdev_hash when > > > deleted > > > > > > On Thu, Jun 15, 2017 at 09:58:57AM +0800, fukaige wrote: > > > > tart a virtual machine with its backend tap device attached to a > > > > brought up > > > linux bridge. > > > > If we delete the linux bridge when vm is still running, we'll get > > > > the following error when trying to create a ovs bridge with the same > > > > name. > > > > > > > > The reason is that ovs-router subsystem add the linux bridge into > > > > netdev_shash, but does not remove it when the bridge is deleted in > > > > the situation. When the bridge is deleted, ovs will receive a > > > > RTM_DELLINK msg, > > > take this chance to remove the bridge in netdev_shash. > > > > > > > > ovs-vsctl: Error detected while setting up 'br-eth'. See > > > > ovs-vswitchd log for > > > details. > > > > > > > > ovs-vswitchd log: > > > > > > 2017-05-11T03:45:25.293Z|00026|ofproto_dpif|INFO|system@ovs-system: > > > > Datapath supports recirculation > > > > > > 2017-05-11T03:45:25.293Z|00027|ofproto_dpif|INFO|system@ovs-system: > > > > MPLS label stack length probed as 1 > > > > > > 2017-05-11T03:45:25.293Z|00028|ofproto_dpif|INFO|system@ovs-system: > > > > Datapath supports unique flow ids > > > > > > 2017-05-11T03:45:25.293Z|00029|ofproto_dpif|INFO|system@ovs-system: > > > > Datapath supports ct_state > > > > > > 2017-05-11T03:45:25.293Z|00030|ofproto_dpif|INFO|system@ovs-system: > > > > Datapath supports ct_zone > > > > > > 2017-05-11T03:45:25.293Z|00031|ofproto_dpif|INFO|system@ovs-system: > > > > Datapath supports ct_mark > > > > > > 2017-05-11T03:45:25.293Z|00032|ofproto_dpif|INFO|system@ovs-system: > > > > Datapath supports ct_label > > > > 2017-05-11T03:45:25.364Z|00001|ofproto_dpif_upcall(handler226)|INFO| > > > > re ceived packet on unassociated datapath port 0 > > > > 2017-05-11T03:45:25.368Z|00033|netdev_linux|WARN|ethtool command > > > > ETHTOOL_GFLAGS on network device br-eth failed: No such device > > > > 2017-05-11T03:45:25.368Z|00034|dpif|WARN|system@ovs-system: failed > > > to > > > > add br-eth as port: No such device > > > > 2017-05-11T03:45:25.368Z|00035|bridge|INFO|bridge br-eth: using > > > > datapath ID 00002a51cf9f2841 > > > > 2017-05-11T03:45:25.368Z|00036|connmgr|INFO|br-eth: added service > > > controller "punix:/var/run/openvswitch/br-eth.mgmt" > > > > > > > > Signed-off-by: fukaige <fuka...@huawei.com> > > > > > > Thank you for the updated patch. > > > > > > This patch raises some difficulties. First, it uses rtnetlink, which > > > is Linux specific. We do not want tnl-ports to be Linux-specific. > > > The more generic alternative is ifnotifier, but it provides no > > > information about the change that took place, so it is harder to deal > > > with. Second, it will dereference a null pointer if the 'change' > > > passed in is null, which can happen if the system is busy or devices are > > changing quickly, as described in rtnetlink.h: > > > > > > /* Function called to report that a netdev has changed. 'change' > > > describes the > > > * specific change. It may be null if the buffer of change > > > information > > > * overflowed, in which case the function must assume that every > > > device may > > > * have changed. 'aux' is as specified in the call to > > > * rtnetlink_notifier_register(). */ > > > typedef > > > void rtnetlink_notify_func(const struct rtnetlink_change *change, > > > void *aux); > > > > > > I am not sure the best way to proceed. One way would be to revamp > > > ifnotifier so that it provides information on the device that changed > > > and the kind of change, and then adapt each of the implementations to pass > > that along as well. > > > Other approaches along these lines are also possible. > > > > > > Another approach might be to notify the ovs-router subsystem when a > > > bridge is deleted, so that it can drop all of the references it has > > > for that bridge. I haven't looked into how much work this would be. > > > > > > What do you think? > > > > > > Thanks, > > > > > > Ben. > > > > Thanks for your review. > > > > I prefer the second approach you mentioned. May be we can register a > > callback > > in ovs-router level, like route_table_init. > > When a bridge is deleted, ovs-router subsystem will get a msg, like > > RTM_DELLINK on linux. So it can drop all of the references it has for that > > bridge. > > > > I don't know if there is same bug in other OSes, like bsd, and which type > > of msg > > will be sent in the situation. > > I think we can fix the bug through the following way: > 1. register a callback using rtnetlink_notifier_create in > route_table_init(route-table.c); > 2.when receive the msg RTM_DELLINK, drop all of the references it has > for that bridge. > > The reason I think the above method can work is that ovs-router subsystem in > bsd(route-table-bsd.c) does not hold the references of that bridge. > So, there is no this bug in bsd. I just hit it in linux. > > What do you think?
It's promising enough that I'd like to see a patch that implements it. Would you mind writing one? Thanks, Ben. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev