2016-12-20 14:08 GMT-08:00 Kevin Traynor <ktray...@redhat.com>: > On 12/15/2016 11:54 AM, Ciara Loftus wrote: >> 'dpdk' ports no longer have naming restrictions. Now, instead of >> specifying the dpdk port ID as part of the name, the PCI address of the >> device must be specified via the 'dpdk-devargs' option. eg. >> >> ovs-vsctl add-port br0 my-port >> ovs-vsctl set Interface my-port type=dpdk >> ovs-vsctl set Interface my-port options:dpdk-devargs=0000:06:00.3 > > I wouldn't encourage people to split up commands like above as they'll > see errors and warnings.
Good point > > If you use the old command (which people surely will), it's not > intuitive that it's now still a valid cmd but incomplete for setting up > the port: > > []# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk > 2016-12-20T19:53:54Z|00051|netdev|WARN|dpdk0: could not set > configuration (Invalid argument) > ovs-vsctl: Error detected while setting up 'dpdk0'. See ovs-vswitchd > log for details. > > It would be nice if this was just a warning and more informative - this > could be an incremental change also. You're right, vsctl errors are pretty obscure in this case. I think a first step is to improve what ovs-vsctl reports to the user. I sent a patch here: https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326542.html The second step would be to allow netdev_dpdk_set_config() to return an error string, that can be printed by ovs-vsctl. I'm fine with doing that later. What do you guys think? > >> >> The user must no longer hotplug attach DPDK ports by issuing the >> specific ovs-appctl netdev-dpdk/attach command. The hotplug is now >> automatically invoked when a valid PCI address is set in the >> dpdk-devargs. The format for ovs-appctl netdev-dpdk/detach command >> has changed in that the user now must specify the relevant PCI address >> as input instead of the port name. >> >> Signed-off-by: Ciara Loftus <ciara.lof...@intel.com> >> Signed-off-by: Daniele Di Proietto <diproiet...@vmware.com> >> Co-authored-by: Daniele Di Proietto <diproiet...@vmware.com> I think netdev_dpdk_set_config() should compare new_devargs with dev->devargs and avoid calling netdev_dpdk_process_devargs() if they're equal and if the port_id is valid. I've noticed that rte_eth_dev_get_port_by_name() sometimes doesn't find an existing device and I think comparing the strings will fix the problem. Also, could you add a log message if rte_eth_dev_attach fails? I've been playing with this for a while and I guess I'm fine with the rest of the series. Thanks, Daniele >> --- >> Changelog: >> * Keep port-detach appctl function - use PCI as input arg >> * Add requires_mutex to devargs processing functions >> * use reconfigure infrastructure for devargs changes >> * process devargs even if valid portid ie. device already configured >> * report err if dpdk-devargs is not specified >> * Add Daniele's incremental patch & Sign-off + Co-author tags >> * Update details of detach command to reflect PCI is needed instead of >> port name >> * Update NEWS to mention that the new naming scheme is not backwards >> compatible >> * Use existing DPDk functions to get port ID from PCI address/devname >> * Merged process_devargs and process_pdevargs functions >> * Removed unnecessary requires_mutex from devargs processing fn >> * Fix case where port is re-attached after detach >> * Add note to documentation that devices won't work until devargs set. >> >> Documentation/intro/install/dpdk-advanced.rst | 5 +- >> Documentation/intro/install/dpdk.rst | 15 ++- >> NEWS | 5 + >> lib/netdev-dpdk.c | 169 >> +++++++++++++++++--------- >> vswitchd/vswitch.xml | 8 ++ >> 5 files changed, 138 insertions(+), 64 deletions(-) >> >> diff --git a/Documentation/intro/install/dpdk-advanced.rst >> b/Documentation/intro/install/dpdk-advanced.rst >> index 0e78142..147fcca 100644 >> --- a/Documentation/intro/install/dpdk-advanced.rst >> +++ b/Documentation/intro/install/dpdk-advanced.rst >> @@ -944,9 +944,8 @@ dpdk_nic_bind.py script: >> >> Then it can be attached to OVS: >> >> - $ ovs-appctl netdev-dpdk/attach 0000:01:00.0 >> - >> -At this point, the user can create a ovs port using the add-port command. >> + $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ >> + options:dpdk-devargs=0000:01:00.0 >> >> It is also possible to detach a port from ovs, the user has to remove the >> port using the del-port command, then it can be detached using: >> diff --git a/Documentation/intro/install/dpdk.rst >> b/Documentation/intro/install/dpdk.rst >> index 7724c8a..2b0aa90 100644 >> --- a/Documentation/intro/install/dpdk.rst >> +++ b/Documentation/intro/install/dpdk.rst >> @@ -245,12 +245,17 @@ Bridges should be created with a >> ``datapath_type=netdev``:: >> >> $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev >> >> -Now you can add DPDK devices. OVS expects DPDK device names to start with >> -``dpdk`` and end with a portid. ovs-vswitchd should print the number of dpdk >> -devices found in the log file:: >> +Now you can add dpdk devices. The PCI address of the device needs to be >> +set using the 'dpdk-devargs' option. DPDK will print the PCI addresses of >> +eligible devices found during initialisation. >> >> - $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk >> - $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk >> + $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \ >> + options:dpdk-devargs=0000:06:00.0 >> + $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk \ >> + options:dpdk-devargs=0000:06:00.1 > > It might be good to not use dpdk0 and dpdk1 here now. They follow the > old naming restrictions, so don't look arbitrary and might cause a bit > of confusion for people transitioning. myportnameone/two, > first/secondportname? portname/anotherportname ? > >> + >> +DPDK devices will not be available for use until a valid dpdk-devargs is >> +specified. >> >> After the DPDK ports get added to switch, a polling thread continuously >> polls >> DPDK devices and consumes 100% of the core, as can be checked from 'top' and >> diff --git a/NEWS b/NEWS >> index b596cf3..8c7f38e 100644 >> --- a/NEWS >> +++ b/NEWS >> @@ -42,6 +42,11 @@ Post-v2.6.0 >> which set the number of rx and tx descriptors to use for the given >> port. >> * Support for DPDK v16.11. >> * Port Hotplug is now supported. >> + * DPDK physical ports can now have arbitrary names. The PCI address of >> + the device must be set using the 'dpdk-devargs' option. Compatibility >> + with the old dpdk<portid> naming scheme is broken, and as such a >> + device will not be available for use until a valid dpdk-devargs is >> + specified. >> - Fedora packaging: >> * A package upgrade does not automatically restart OVS service. >> - ovs-vswitchd/ovs-vsctl: >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >> index 0d57227..aa14181 100644 >> --- a/lib/netdev-dpdk.c >> +++ b/lib/netdev-dpdk.c >> @@ -352,6 +352,9 @@ struct netdev_dpdk { >> /* Identifier used to distinguish vhost devices from each other. */ >> char vhost_id[PATH_MAX]; >> >> + /* Device arguments for dpdk ports */ >> + char *devargs; >> + >> /* In dpdk_list. */ >> struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex); >> >> @@ -813,7 +816,7 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int >> port_no, >> /* If the 'sid' is negative, it means that the kernel fails >> * to obtain the pci numa info. In that situation, always >> * use 'SOCKET0'. */ >> - if (type == DPDK_DEV_ETH) { >> + if (type == DPDK_DEV_ETH && rte_eth_dev_is_valid_port(dev->port_id)) { >> sid = rte_eth_dev_socket_id(port_no); >> } else { >> sid = rte_lcore_to_socket_id(rte_get_master_lcore()); >> @@ -852,9 +855,11 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int >> port_no, >> /* Initialize the flow control to NULL */ >> memset(&dev->fc_conf, 0, sizeof dev->fc_conf); >> if (type == DPDK_DEV_ETH) { >> - err = dpdk_eth_dev_init(dev); >> - if (err) { >> - goto unlock; >> + if (rte_eth_dev_is_valid_port(dev->port_id)) { >> + err = dpdk_eth_dev_init(dev); >> + if (err) { >> + goto unlock; >> + } >> } >> dev->tx_q = netdev_dpdk_alloc_txq(netdev->n_txq); >> } else { >> @@ -950,17 +955,10 @@ netdev_dpdk_vhost_client_construct(struct netdev >> *netdev) >> static int >> netdev_dpdk_construct(struct netdev *netdev) >> { >> - unsigned int port_no; >> int err; >> >> - /* Names always start with "dpdk" */ >> - err = dpdk_dev_parse_name(netdev->name, "dpdk", &port_no); >> - if (err) { >> - return err; >> - } >> - >> ovs_mutex_lock(&dpdk_mutex); >> - err = netdev_dpdk_init(netdev, port_no, DPDK_DEV_ETH); >> + err = netdev_dpdk_init(netdev, -1, DPDK_DEV_ETH); >> ovs_mutex_unlock(&dpdk_mutex); >> return err; >> } >> @@ -974,6 +972,7 @@ netdev_dpdk_destruct(struct netdev *netdev) >> ovs_mutex_lock(&dev->mutex); >> >> rte_eth_dev_stop(dev->port_id); >> + free(dev->devargs); >> free(ovsrcu_get_protected(struct ingress_policer *, >> &dev->ingress_policer)); >> >> @@ -1078,6 +1077,46 @@ netdev_dpdk_get_config(const struct netdev *netdev, >> struct smap *args) >> return 0; >> } >> >> +static struct netdev_dpdk * >> +netdev_dpdk_lookup_by_port_id(int port_id) >> + OVS_REQUIRES(dpdk_mutex) >> +{ >> + struct netdev_dpdk *dev; >> + >> + LIST_FOR_EACH (dev, list_node, &dpdk_list) { >> + if (dev->port_id == port_id) { >> + return dev; >> + } >> + } >> + >> + return NULL; >> +} >> + >> +static int >> +netdev_dpdk_process_devargs(const char *devargs) >> +{ >> + struct rte_pci_addr addr; >> + uint8_t new_port_id = UINT8_MAX; >> + >> + if (!eal_parse_pci_DomBDF(devargs, &addr)) { >> + /* Valid PCI address format detected - configure physical device */ >> + if (rte_eth_dev_get_port_by_name(devargs, &new_port_id) >> + || !rte_eth_dev_is_valid_port(new_port_id) ) { >> + /* PCI device not found in DPDK, attempt to attach it */ >> + if (!rte_eth_dev_attach(devargs, &new_port_id)) { >> + /* Attach successful */ >> + VLOG_INFO("Device "PCI_PRI_FMT" has been attached to DPDK", >> + addr.domain, addr.bus, addr.devid, addr.function); >> + } else { >> + /* Attach unsuccessful */ >> + return -1; >> + } >> + } >> + } >> + >> + return new_port_id; >> +} >> + >> static void >> dpdk_set_rxq_config(struct netdev_dpdk *dev, const struct smap *args) >> OVS_REQUIRES(dev->mutex) >> @@ -1118,7 +1157,10 @@ netdev_dpdk_set_config(struct netdev *netdev, const >> struct smap *args) >> {RTE_FC_NONE, RTE_FC_TX_PAUSE}, >> {RTE_FC_RX_PAUSE, RTE_FC_FULL } >> }; >> + const char *new_devargs; >> + int err = 0; >> >> + ovs_mutex_lock(&dpdk_mutex); >> ovs_mutex_lock(&dev->mutex); >> >> dpdk_set_rxq_config(dev, args); >> @@ -1130,6 +1172,45 @@ netdev_dpdk_set_config(struct netdev *netdev, const >> struct smap *args) >> NIC_PORT_DEFAULT_TXQ_SIZE, >> &dev->requested_txq_size); >> >> + new_devargs = smap_get(args, "dpdk-devargs"); >> + >> + if (dev->devargs && strcmp(new_devargs, dev->devargs)) { >> + /* The user requested a new device. If we return error, the caller >> + * will delete this netdev and try to recreate it. */ >> + err = EAGAIN; >> + goto out; >> + } >> + >> + /* dpdk-devargs is required for device configuration */ >> + err = EINVAL; >> + if (new_devargs && new_devargs[0]) { >> + int new_port_id; >> + >> + new_port_id = netdev_dpdk_process_devargs(new_devargs); >> + if (new_port_id == dev->port_id) { >> + /* Already configured, do not reconfigure again */ >> + err = 0; >> + } else if (rte_eth_dev_is_valid_port(new_port_id)) { >> + struct netdev_dpdk *dup_dev; >> + dup_dev = netdev_dpdk_lookup_by_port_id(new_port_id); >> + if (dup_dev) { >> + VLOG_WARN("'%s' is trying to use device '%s' which is >> already " >> + "in use by '%s'.", netdev_get_name(netdev), >> + new_devargs, netdev_get_name(&dup_dev->up)); >> + err = EADDRINUSE; >> + } else { >> + dev->devargs = xstrdup(new_devargs); >> + dev->port_id = new_port_id; >> + netdev_request_reconfigure(&dev->up); >> + err = 0; >> + } >> + } >> + } >> + >> + if (err) { >> + goto out; >> + } >> + >> rx_fc_en = smap_get_bool(args, "rx-flow-ctrl", false); >> tx_fc_en = smap_get_bool(args, "tx-flow-ctrl", false); >> autoneg = smap_get_bool(args, "flow-ctrl-autoneg", false); >> @@ -1141,9 +1222,11 @@ netdev_dpdk_set_config(struct netdev *netdev, const >> struct smap *args) >> dpdk_eth_flow_ctrl_setup(dev); >> } >> >> +out: >> ovs_mutex_unlock(&dev->mutex); >> + ovs_mutex_unlock(&dpdk_mutex); >> >> - return 0; >> + return err; >> } >> >> static int >> @@ -2308,57 +2391,34 @@ netdev_dpdk_set_admin_state(struct unixctl_conn >> *conn, int argc, >> } >> >> static void >> -netdev_dpdk_attach(struct unixctl_conn *conn, int argc OVS_UNUSED, >> - const char *argv[], void *aux OVS_UNUSED) >> -{ >> - int ret; >> - char *response; >> - uint8_t port_id; >> - >> - ovs_mutex_lock(&dpdk_mutex); >> - >> - ret = rte_eth_dev_attach(argv[1], &port_id); >> - if (ret < 0) { >> - response = xasprintf("Error attaching device '%s'", argv[1]); >> - ovs_mutex_unlock(&dpdk_mutex); >> - unixctl_command_reply_error(conn, response); >> - free(response); >> - return; >> - } >> - >> - response = xasprintf("Device '%s' has been attached as 'dpdk%d'", >> - argv[1], port_id); >> - >> - ovs_mutex_unlock(&dpdk_mutex); >> - unixctl_command_reply(conn, response); >> - free(response); >> -} >> - >> -static void >> netdev_dpdk_detach(struct unixctl_conn *conn, int argc OVS_UNUSED, >> const char *argv[], void *aux OVS_UNUSED) >> { >> int ret; >> char *response; >> - unsigned int parsed_port; >> uint8_t port_id; >> char devname[RTE_ETH_NAME_MAX_LEN]; >> + struct rte_pci_addr addr; >> + struct netdev_dpdk *dev; >> >> ovs_mutex_lock(&dpdk_mutex); >> >> - ret = dpdk_dev_parse_name(argv[1], "dpdk", &parsed_port); >> - if (ret) { >> - response = xasprintf("'%s' is not a valid port", argv[1]); >> + if (eal_parse_pci_DomBDF(argv[1], &addr)) { >> + response = xasprintf("Invalid PCI address '%s'. Cannot detach.", >> + argv[1]); >> goto error; >> } >> >> - port_id = parsed_port; >> + if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) { >> + response = xasprintf("Device '%s' not found in DPDK", argv[1]); >> + goto error; >> + } >> >> - struct netdev *netdev = netdev_from_name(argv[1]); >> - if (netdev) { >> - netdev_close(netdev); >> - response = xasprintf("Port '%s' is being used. Remove it before" >> - "detaching", argv[1]); >> + dev = netdev_dpdk_lookup_by_port_id(port_id); >> + if (dev) { >> + response = xasprintf("Device '%s' is being used by interface '%s'. " >> + "Remove it before detaching", >> + argv[1], netdev_get_name(&dev->up)); >> goto error; >> } >> >> @@ -2366,11 +2426,11 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int >> argc OVS_UNUSED, >> >> ret = rte_eth_dev_detach(port_id, devname); >> if (ret < 0) { >> - response = xasprintf("Port '%s' can not be detached", argv[1]); >> + response = xasprintf("Device '%s' can not be detached", argv[1]); >> goto error; >> } >> >> - response = xasprintf("Port '%s' has been detached", argv[1]); >> + response = xasprintf("Device '%s' has been detached", argv[1]); >> >> ovs_mutex_unlock(&dpdk_mutex); >> unixctl_command_reply(conn, response); >> @@ -2658,11 +2718,8 @@ netdev_dpdk_class_init(void) >> unixctl_command_register("netdev-dpdk/set-admin-state", >> "[netdev] up|down", 1, 2, >> netdev_dpdk_set_admin_state, NULL); >> - unixctl_command_register("netdev-dpdk/attach", >> - "pci address of device", 1, 1, >> - netdev_dpdk_attach, NULL); >> unixctl_command_register("netdev-dpdk/detach", >> - "port", 1, 1, >> + "pci address of device", 1, 1, >> netdev_dpdk_detach, NULL); >> >> ovsthread_once_done(&once); >> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >> index b4af5a5..23ab469 100644 >> --- a/vswitchd/vswitch.xml >> +++ b/vswitchd/vswitch.xml >> @@ -2303,6 +2303,14 @@ >> </p> >> </column> >> >> + <column name="options" key="dpdk-devargs" >> + type='{"type": "string"}'> >> + <p> >> + Specifies the PCI address of a physical dpdk device. >> + Only supported by 'dpdk' devices. >> + </p> >> + </column> >> + >> <column name="other_config" key="pmd-rxq-affinity"> >> <p>Specifies mapping of RX queues of this interface to CPU >> cores.</p> >> <p>Value should be set in the following form:</p> >> > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev