Infinite re-addition of failed ports happens if the device in userspace datapath has a linux network interface and it's not able to be configured. For example, if the first reconfiguration fails because of misconfiguration or bad initial device state. In current code victims are afxdp ports and the Mellanox NIC ports opened by the DPDK due to their bifurcated drivers (It's unlikely for usual netdev-linux ports to fail).
The root cause: Every change in the state of the network interface of a linux kernel device generates if-notifier event and if-notifier event triggers the OVS code to re-apply the configuration of ports, i.e. add broken ports back. The most obvious part is that dpif-netdev changes the device flags before trying to configure it: 1. add_port() 2. set_flags() --> if-notifier event 3. reconfigure() --> port removal from the datapath due to misconfiguration or any other issue in the underlying device. 4. setting flags back --> another if-notifier event. 5. There was new if-notifier event? yes --> re-apply all settings. --> goto step 1. Easy way to reproduce is to add afxdp port with n_rxq=N, where N is bigger than device supports. This patch fixes the most obvious case for this issue by moving enabling of a promisc mode later to the place where we already know that device could be added to datapath without errors, i.e. after its first successful reconfiguration. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-September/363038.html Signed-off-by: Ilya Maximets <i.maxim...@ovn.org> --- lib/dpif-netdev.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 1e5493615..a6a52d27a 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1802,7 +1802,6 @@ static int port_create(const char *devname, const char *type, odp_port_t port_no, struct dp_netdev_port **portp) { - struct netdev_saved_flags *sf; struct dp_netdev_port *port; enum netdev_flags flags; struct netdev *netdev; @@ -1824,17 +1823,11 @@ port_create(const char *devname, const char *type, goto out; } - error = netdev_turn_flags_on(netdev, NETDEV_PROMISC, &sf); - if (error) { - VLOG_ERR("%s: cannot set promisc flag", devname); - goto out; - } - port = xzalloc(sizeof *port); port->port_no = port_no; port->netdev = netdev; port->type = xstrdup(type); - port->sf = sf; + port->sf = NULL; port->emc_enabled = true; port->need_reconfigure = true; ovs_mutex_init(&port->txq_used_mutex); @@ -1853,6 +1846,7 @@ do_add_port(struct dp_netdev *dp, const char *devname, const char *type, odp_port_t port_no) OVS_REQUIRES(dp->port_mutex) { + struct netdev_saved_flags *sf; struct dp_netdev_port *port; int error; @@ -1872,7 +1866,24 @@ do_add_port(struct dp_netdev *dp, const char *devname, const char *type, reconfigure_datapath(dp); /* Check that port was successfully configured. */ - return dp_netdev_lookup_port(dp, port_no) ? 0 : EINVAL; + if (!dp_netdev_lookup_port(dp, port_no)) { + return EINVAL; + } + + /* Updating device flags triggers an if_notifier, which triggers a bridge + * reconfiguration and another attempt to add this port, leading to an + * infinite loop if the device is configured incorrectly and cannot be + * added. Setting the promisc mode after a successful reconfiguration, + * since we already know that the device is somehow properly configured. */ + error = netdev_turn_flags_on(port->netdev, NETDEV_PROMISC, &sf); + if (error) { + VLOG_ERR("%s: cannot set promisc flag", devname); + do_del_port(dp, port); + return error; + } + port->sf = sf; + + return 0; } static int -- 2.17.1 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev