2016-12-20 14:08 GMT-08:00 Kevin Traynor <ktray...@redhat.com>:
> On 12/15/2016 11:54 AM, Ciara Loftus wrote:
>> 'dpdk' ports no longer have naming restrictions. Now, instead of
>> specifying the dpdk port ID as part of the name, the PCI address of the
>> device must be specified via the 'dpdk-devargs' option. eg.
>>
>> ovs-vsctl add-port br0 my-port
>> ovs-vsctl set Interface my-port type=dpdk
>> ovs-vsctl set Interface my-port options:dpdk-devargs=0000:06:00.3
>
> I wouldn't encourage people to split up commands like above as they'll
> see errors and warnings.

Good point

>
> If you use the old command (which people surely will), it's not
> intuitive that it's now still a valid cmd but incomplete for setting up
> the port:
>
> []# ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
> 2016-12-20T19:53:54Z|00051|netdev|WARN|dpdk0: could not set
> configuration (Invalid argument)
> ovs-vsctl: Error detected while setting up 'dpdk0'.  See ovs-vswitchd
> log for details.
>
> It would be nice if this was just a warning and more informative - this
> could be an incremental change also.

You're right, vsctl errors are pretty obscure in this case. I think a first
step is to improve what ovs-vsctl reports to the user. I sent a patch here:

https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326542.html

The second step would be to allow netdev_dpdk_set_config() to return an error
string, that can be printed by ovs-vsctl.  I'm fine with doing that later.
What do you guys think?

>
>>
>> The user must no longer hotplug attach DPDK ports by issuing the
>> specific ovs-appctl netdev-dpdk/attach command. The hotplug is now
>> automatically invoked when a valid PCI address is set in the
>> dpdk-devargs. The format for ovs-appctl netdev-dpdk/detach command
>> has changed in that the user now must specify the relevant PCI address
>> as input instead of the port name.
>>
>> Signed-off-by: Ciara Loftus <ciara.lof...@intel.com>
>> Signed-off-by: Daniele Di Proietto <diproiet...@vmware.com>
>> Co-authored-by: Daniele Di Proietto <diproiet...@vmware.com>

I think netdev_dpdk_set_config() should compare new_devargs with dev->devargs
and avoid calling netdev_dpdk_process_devargs() if they're equal and
if the port_id
is valid.  I've noticed that rte_eth_dev_get_port_by_name() sometimes
doesn't find
an existing device and I think comparing the strings will fix the problem.

Also, could you add a log message if rte_eth_dev_attach fails?

I've been playing with this for a while and I guess I'm fine with the
rest of the series.

Thanks,

Daniele

>> ---
>> Changelog:
>> * Keep port-detach appctl function - use PCI as input arg
>> * Add requires_mutex to devargs processing functions
>> * use reconfigure infrastructure for devargs changes
>> * process devargs even if valid portid ie. device already configured
>> * report err if dpdk-devargs is not specified
>> * Add Daniele's incremental patch & Sign-off + Co-author tags
>> * Update details of detach command to reflect PCI is needed instead of
>>   port name
>> * Update NEWS to mention that the new naming scheme is not backwards
>>   compatible
>> * Use existing DPDk functions to get port ID from PCI address/devname
>> * Merged process_devargs and process_pdevargs functions
>> * Removed unnecessary requires_mutex from devargs processing fn
>> * Fix case where port is re-attached after detach
>> * Add note to documentation that devices won't work until devargs set.
>>
>>  Documentation/intro/install/dpdk-advanced.rst |   5 +-
>>  Documentation/intro/install/dpdk.rst          |  15 ++-
>>  NEWS                                          |   5 +
>>  lib/netdev-dpdk.c                             | 169 
>> +++++++++++++++++---------
>>  vswitchd/vswitch.xml                          |   8 ++
>>  5 files changed, 138 insertions(+), 64 deletions(-)
>>
>> diff --git a/Documentation/intro/install/dpdk-advanced.rst 
>> b/Documentation/intro/install/dpdk-advanced.rst
>> index 0e78142..147fcca 100644
>> --- a/Documentation/intro/install/dpdk-advanced.rst
>> +++ b/Documentation/intro/install/dpdk-advanced.rst
>> @@ -944,9 +944,8 @@ dpdk_nic_bind.py script:
>>
>>  Then it can be attached to OVS:
>>
>> -       $ ovs-appctl netdev-dpdk/attach 0000:01:00.0
>> -
>> -At this point, the user can create a ovs port using the add-port command.
>> +       $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \
>> +           options:dpdk-devargs=0000:01:00.0
>>
>>  It is also possible to detach a port from ovs, the user has to remove the
>>  port using the del-port command, then it can be detached using:
>> diff --git a/Documentation/intro/install/dpdk.rst 
>> b/Documentation/intro/install/dpdk.rst
>> index 7724c8a..2b0aa90 100644
>> --- a/Documentation/intro/install/dpdk.rst
>> +++ b/Documentation/intro/install/dpdk.rst
>> @@ -245,12 +245,17 @@ Bridges should be created with a 
>> ``datapath_type=netdev``::
>>
>>      $ ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
>>
>> -Now you can add DPDK devices. OVS expects DPDK device names to start with
>> -``dpdk`` and end with a portid. ovs-vswitchd should print the number of dpdk
>> -devices found in the log file::
>> +Now you can add dpdk devices. The PCI address of the device needs to be
>> +set using the 'dpdk-devargs' option. DPDK will print the PCI addresses of
>> +eligible devices found during initialisation.
>>
>> -    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
>> -    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk
>> +    $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk \
>> +        options:dpdk-devargs=0000:06:00.0
>> +    $ ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk \
>> +        options:dpdk-devargs=0000:06:00.1
>
> It might be good to not use dpdk0 and dpdk1 here now. They follow the
> old naming restrictions, so don't look arbitrary and might cause a bit
> of confusion for people transitioning. myportnameone/two,
> first/secondportname? portname/anotherportname ?
>
>> +
>> +DPDK devices will not be available for use until a valid dpdk-devargs is
>> +specified.
>>
>>  After the DPDK ports get added to switch, a polling thread continuously 
>> polls
>>  DPDK devices and consumes 100% of the core, as can be checked from 'top' and
>> diff --git a/NEWS b/NEWS
>> index b596cf3..8c7f38e 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -42,6 +42,11 @@ Post-v2.6.0
>>         which set the number of rx and tx descriptors to use for the given 
>> port.
>>       * Support for DPDK v16.11.
>>       * Port Hotplug is now supported.
>> +     * DPDK physical ports can now have arbitrary names. The PCI address of
>> +       the device must be set using the 'dpdk-devargs' option. Compatibility
>> +       with the old dpdk<portid> naming scheme is broken, and as such a
>> +       device will not be available for use until a valid dpdk-devargs is
>> +       specified.
>>     - Fedora packaging:
>>       * A package upgrade does not automatically restart OVS service.
>>     - ovs-vswitchd/ovs-vsctl:
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index 0d57227..aa14181 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -352,6 +352,9 @@ struct netdev_dpdk {
>>      /* Identifier used to distinguish vhost devices from each other. */
>>      char vhost_id[PATH_MAX];
>>
>> +    /* Device arguments for dpdk ports */
>> +    char *devargs;
>> +
>>      /* In dpdk_list. */
>>      struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
>>
>> @@ -813,7 +816,7 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int 
>> port_no,
>>      /* If the 'sid' is negative, it means that the kernel fails
>>       * to obtain the pci numa info.  In that situation, always
>>       * use 'SOCKET0'. */
>> -    if (type == DPDK_DEV_ETH) {
>> +    if (type == DPDK_DEV_ETH && rte_eth_dev_is_valid_port(dev->port_id)) {
>>          sid = rte_eth_dev_socket_id(port_no);
>>      } else {
>>          sid = rte_lcore_to_socket_id(rte_get_master_lcore());
>> @@ -852,9 +855,11 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int 
>> port_no,
>>      /* Initialize the flow control to NULL */
>>      memset(&dev->fc_conf, 0, sizeof dev->fc_conf);
>>      if (type == DPDK_DEV_ETH) {
>> -        err = dpdk_eth_dev_init(dev);
>> -        if (err) {
>> -            goto unlock;
>> +        if (rte_eth_dev_is_valid_port(dev->port_id)) {
>> +            err = dpdk_eth_dev_init(dev);
>> +            if (err) {
>> +                goto unlock;
>> +            }
>>          }
>>          dev->tx_q = netdev_dpdk_alloc_txq(netdev->n_txq);
>>      } else {
>> @@ -950,17 +955,10 @@ netdev_dpdk_vhost_client_construct(struct netdev 
>> *netdev)
>>  static int
>>  netdev_dpdk_construct(struct netdev *netdev)
>>  {
>> -    unsigned int port_no;
>>      int err;
>>
>> -    /* Names always start with "dpdk" */
>> -    err = dpdk_dev_parse_name(netdev->name, "dpdk", &port_no);
>> -    if (err) {
>> -        return err;
>> -    }
>> -
>>      ovs_mutex_lock(&dpdk_mutex);
>> -    err = netdev_dpdk_init(netdev, port_no, DPDK_DEV_ETH);
>> +    err = netdev_dpdk_init(netdev, -1, DPDK_DEV_ETH);
>>      ovs_mutex_unlock(&dpdk_mutex);
>>      return err;
>>  }
>> @@ -974,6 +972,7 @@ netdev_dpdk_destruct(struct netdev *netdev)
>>      ovs_mutex_lock(&dev->mutex);
>>
>>      rte_eth_dev_stop(dev->port_id);
>> +    free(dev->devargs);
>>      free(ovsrcu_get_protected(struct ingress_policer *,
>>                                &dev->ingress_policer));
>>
>> @@ -1078,6 +1077,46 @@ netdev_dpdk_get_config(const struct netdev *netdev, 
>> struct smap *args)
>>      return 0;
>>  }
>>
>> +static struct netdev_dpdk *
>> +netdev_dpdk_lookup_by_port_id(int port_id)
>> +    OVS_REQUIRES(dpdk_mutex)
>> +{
>> +    struct netdev_dpdk *dev;
>> +
>> +    LIST_FOR_EACH (dev, list_node, &dpdk_list) {
>> +        if (dev->port_id == port_id) {
>> +            return dev;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static int
>> +netdev_dpdk_process_devargs(const char *devargs)
>> +{
>> +    struct rte_pci_addr addr;
>> +    uint8_t new_port_id = UINT8_MAX;
>> +
>> +    if (!eal_parse_pci_DomBDF(devargs, &addr)) {
>> +        /* Valid PCI address format detected - configure physical device */
>> +        if (rte_eth_dev_get_port_by_name(devargs, &new_port_id)
>> +                || !rte_eth_dev_is_valid_port(new_port_id) ) {
>> +            /* PCI device not found in DPDK, attempt to attach it */
>> +            if (!rte_eth_dev_attach(devargs, &new_port_id)) {
>> +                /* Attach successful */
>> +                VLOG_INFO("Device "PCI_PRI_FMT" has been attached to DPDK",
>> +                          addr.domain, addr.bus, addr.devid, addr.function);
>> +            } else {
>> +                /* Attach unsuccessful */
>> +                return -1;
>> +            }
>> +        }
>> +    }
>> +
>> +    return new_port_id;
>> +}
>> +
>>  static void
>>  dpdk_set_rxq_config(struct netdev_dpdk *dev, const struct smap *args)
>>      OVS_REQUIRES(dev->mutex)
>> @@ -1118,7 +1157,10 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
>> struct smap *args)
>>          {RTE_FC_NONE,     RTE_FC_TX_PAUSE},
>>          {RTE_FC_RX_PAUSE, RTE_FC_FULL    }
>>      };
>> +    const char *new_devargs;
>> +    int err = 0;
>>
>> +    ovs_mutex_lock(&dpdk_mutex);
>>      ovs_mutex_lock(&dev->mutex);
>>
>>      dpdk_set_rxq_config(dev, args);
>> @@ -1130,6 +1172,45 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
>> struct smap *args)
>>                              NIC_PORT_DEFAULT_TXQ_SIZE,
>>                              &dev->requested_txq_size);
>>
>> +    new_devargs = smap_get(args, "dpdk-devargs");
>> +
>> +    if (dev->devargs && strcmp(new_devargs, dev->devargs)) {
>> +        /* The user requested a new device.  If we return error, the caller
>> +         * will delete this netdev and try to recreate it. */
>> +        err = EAGAIN;
>> +        goto out;
>> +    }
>> +
>> +    /* dpdk-devargs is required for device configuration */
>> +    err = EINVAL;
>> +    if (new_devargs && new_devargs[0]) {
>> +        int new_port_id;
>> +
>> +        new_port_id = netdev_dpdk_process_devargs(new_devargs);
>> +        if (new_port_id == dev->port_id) {
>> +            /* Already configured, do not reconfigure again */
>> +            err = 0;
>> +        } else if (rte_eth_dev_is_valid_port(new_port_id)) {
>> +            struct netdev_dpdk *dup_dev;
>> +            dup_dev = netdev_dpdk_lookup_by_port_id(new_port_id);
>> +            if (dup_dev) {
>> +                VLOG_WARN("'%s' is trying to use device '%s' which is 
>> already "
>> +                          "in use by '%s'.", netdev_get_name(netdev),
>> +                          new_devargs, netdev_get_name(&dup_dev->up));
>> +                err = EADDRINUSE;
>> +            } else {
>> +                dev->devargs = xstrdup(new_devargs);
>> +                dev->port_id = new_port_id;
>> +                netdev_request_reconfigure(&dev->up);
>> +                err = 0;
>> +            }
>> +        }
>> +    }
>> +
>> +    if (err) {
>> +        goto out;
>> +    }
>> +
>>      rx_fc_en = smap_get_bool(args, "rx-flow-ctrl", false);
>>      tx_fc_en = smap_get_bool(args, "tx-flow-ctrl", false);
>>      autoneg = smap_get_bool(args, "flow-ctrl-autoneg", false);
>> @@ -1141,9 +1222,11 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
>> struct smap *args)
>>          dpdk_eth_flow_ctrl_setup(dev);
>>      }
>>
>> +out:
>>      ovs_mutex_unlock(&dev->mutex);
>> +    ovs_mutex_unlock(&dpdk_mutex);
>>
>> -    return 0;
>> +    return err;
>>  }
>>
>>  static int
>> @@ -2308,57 +2391,34 @@ netdev_dpdk_set_admin_state(struct unixctl_conn 
>> *conn, int argc,
>>  }
>>
>>  static void
>> -netdev_dpdk_attach(struct unixctl_conn *conn, int argc OVS_UNUSED,
>> -                   const char *argv[], void *aux OVS_UNUSED)
>> -{
>> -    int ret;
>> -    char *response;
>> -    uint8_t port_id;
>> -
>> -    ovs_mutex_lock(&dpdk_mutex);
>> -
>> -    ret = rte_eth_dev_attach(argv[1], &port_id);
>> -    if (ret < 0) {
>> -        response = xasprintf("Error attaching device '%s'", argv[1]);
>> -        ovs_mutex_unlock(&dpdk_mutex);
>> -        unixctl_command_reply_error(conn, response);
>> -        free(response);
>> -        return;
>> -    }
>> -
>> -    response = xasprintf("Device '%s' has been attached as 'dpdk%d'",
>> -                         argv[1], port_id);
>> -
>> -    ovs_mutex_unlock(&dpdk_mutex);
>> -    unixctl_command_reply(conn, response);
>> -    free(response);
>> -}
>> -
>> -static void
>>  netdev_dpdk_detach(struct unixctl_conn *conn, int argc OVS_UNUSED,
>>                     const char *argv[], void *aux OVS_UNUSED)
>>  {
>>      int ret;
>>      char *response;
>> -    unsigned int parsed_port;
>>      uint8_t port_id;
>>      char devname[RTE_ETH_NAME_MAX_LEN];
>> +    struct rte_pci_addr addr;
>> +    struct netdev_dpdk *dev;
>>
>>      ovs_mutex_lock(&dpdk_mutex);
>>
>> -    ret = dpdk_dev_parse_name(argv[1], "dpdk", &parsed_port);
>> -    if (ret) {
>> -        response = xasprintf("'%s' is not a valid port", argv[1]);
>> +    if (eal_parse_pci_DomBDF(argv[1], &addr)) {
>> +        response = xasprintf("Invalid PCI address '%s'. Cannot detach.",
>> +                             argv[1]);
>>          goto error;
>>      }
>>
>> -    port_id = parsed_port;
>> +    if (rte_eth_dev_get_port_by_name(argv[1], &port_id)) {
>> +        response = xasprintf("Device '%s' not found in DPDK", argv[1]);
>> +        goto error;
>> +    }
>>
>> -    struct netdev *netdev = netdev_from_name(argv[1]);
>> -    if (netdev) {
>> -        netdev_close(netdev);
>> -        response = xasprintf("Port '%s' is being used. Remove it before"
>> -                             "detaching", argv[1]);
>> +    dev = netdev_dpdk_lookup_by_port_id(port_id);
>> +    if (dev) {
>> +        response = xasprintf("Device '%s' is being used by interface '%s'. "
>> +                             "Remove it before detaching",
>> +                             argv[1], netdev_get_name(&dev->up));
>>          goto error;
>>      }
>>
>> @@ -2366,11 +2426,11 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int 
>> argc OVS_UNUSED,
>>
>>      ret = rte_eth_dev_detach(port_id, devname);
>>      if (ret < 0) {
>> -        response = xasprintf("Port '%s' can not be detached", argv[1]);
>> +        response = xasprintf("Device '%s' can not be detached", argv[1]);
>>          goto error;
>>      }
>>
>> -    response = xasprintf("Port '%s' has been detached", argv[1]);
>> +    response = xasprintf("Device '%s' has been detached", argv[1]);
>>
>>      ovs_mutex_unlock(&dpdk_mutex);
>>      unixctl_command_reply(conn, response);
>> @@ -2658,11 +2718,8 @@ netdev_dpdk_class_init(void)
>>          unixctl_command_register("netdev-dpdk/set-admin-state",
>>                                   "[netdev] up|down", 1, 2,
>>                                   netdev_dpdk_set_admin_state, NULL);
>> -        unixctl_command_register("netdev-dpdk/attach",
>> -                                 "pci address of device", 1, 1,
>> -                                 netdev_dpdk_attach, NULL);
>>          unixctl_command_register("netdev-dpdk/detach",
>> -                                 "port", 1, 1,
>> +                                 "pci address of device", 1, 1,
>>                                   netdev_dpdk_detach, NULL);
>>
>>          ovsthread_once_done(&once);
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index b4af5a5..23ab469 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -2303,6 +2303,14 @@
>>          </p>
>>        </column>
>>
>> +      <column name="options" key="dpdk-devargs"
>> +              type='{"type": "string"}'>
>> +        <p>
>> +          Specifies the PCI address of a physical dpdk device.
>> +          Only supported by 'dpdk' devices.
>> +        </p>
>> +      </column>
>> +
>>        <column name="other_config" key="pmd-rxq-affinity">
>>          <p>Specifies mapping of RX queues of this interface to CPU 
>> cores.</p>
>>          <p>Value should be set in the following form:</p>
>>
>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to