On 16/09/2020 16:17, Gaetan Rivet wrote:
> In some cloud topologies, using DPDK VF representors in guest requires
> configuring a VF before it is assigned to the guest.
> 
> A first basic option for such configuration is setting the VF MAC
> address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface
> table.
> 
> This option can be used as such:
> 
>    $ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk \
>       options:dpdk-vf-mac=00:11:22:33:44:55
> 

If I got it right, it doesn't seem like it solves any issues by limiting
to representor, but is just an attempt to limit some of the edge cases
around bifurcated devices to the devices where the requested
functionality is really needed now.

This limitation has some costs...

We will now have, mac, mac_in_use, dpdk-vf-mac, configured_hwaddr,
requested_hwaddr for the user to understand. I think it can be confusing.

'ovs-vsctl show' will show the dpdk-vf-mac, which is the main one we
document, but it's not clear if it was successful or not. (yes there is
a log entry, but it is a separate place).

As it is specific for representors, if we ever want to allow setting of
mac on any non representor DPDK ports, we have an awkward user
interface. We would need to make another 'dpdk-mac' type option, or we
decide then to allow use mac in interface table but that it doesn't work
for representors, or there are two ways to set for representors etc.
None of this seems great.

My feeling is that it is making things complicated for the user with
this many knobs, where we could just have mac and mac_in_use, live with
the edge cases (as Gaetan pointed out, we already do for MTU) and we
know as well if we ever need to extend further, the user interface will
stay simple. Just my 2C, maybe there's a good argument why we can't do
it like this.

Few comments on the code as it is now below.


> Signed-off-by: Gaetan Rivet <gr...@u256.net>
> Suggested-by: Ilya Maximets <i.maxim...@ovn.org>
> Acked-by: Eli Britstein <el...@nvidia.com>
> ---
>  Documentation/topics/dpdk/phy.rst | 22 ++++++++++++
>  NEWS                              |  2 ++
>  lib/netdev-dpdk.c                 | 60 +++++++++++++++++++++++++++++++
>  vswitchd/vswitch.xml              | 13 +++++++
>  4 files changed, 97 insertions(+)
> 
> diff --git a/Documentation/topics/dpdk/phy.rst 
> b/Documentation/topics/dpdk/phy.rst
> index 38e52c8de..73dcb7c28 100644
> --- a/Documentation/topics/dpdk/phy.rst
> +++ b/Documentation/topics/dpdk/phy.rst
> @@ -379,6 +379,28 @@ an eth device whose mac address is 
> ``00:11:22:33:44:55``::
>      $ ovs-vsctl add-port br0 dpdk-mac -- set Interface dpdk-mac type=dpdk \
>         options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
>  
> +Representor specific configuration
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In some topologies, a VF must be configured before being assigned to a
> +guest (VM) machine.  This configuration is done through VF-specific fields
> +in the ``options`` column of the Interface table.
> +
> +.. important::
> +
> +   Some DPDK port use `bifurcated drivers <bifurcated-drivers>`__,
> +   which means that a kernel netdevice remains when Open vSwitch is stopped.
> +
> +   In such case, any configuration applied to a VF would remain set on the
> +   kernel netdevice, and be inherited from it when Open vSwitch is restarted,
> +   even if the options described in this section are unset from Open vSwitch.
> +
> +.. _bifurcated-drivers: 
> http://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#bifurcated-driver
> +
> +- Configure the VF MAC address::
> +
> +    $ ovs-vsctl set Interface dpdk-rep0 options:dpdk-vf-mac=00:11:22:33:44:55
> +

As mentioned above, if for some reason setting the mac fails there will
be warning in the logs, but 'ovs-vsctl show' will still show the
dpdk-vf-mac that was not applied and it's quite prominent. Also, there
are mac, mac_in_use, dpdk-vf-mac, configured_hwaddr, requested_hwaddr.

Suggest to add a small summary here for each one as it relates
representors and make clear to users,
1. how to find the mac the user requested
2. how to find the current mac

>  Jumbo Frames
>  ------------
>  
> diff --git a/NEWS b/NEWS
> index 2f67d5047..5ece29474 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -24,6 +24,8 @@ v2.14.0 - 17 Aug 2020

2.15....hopefully :-)

>         to list and change log levels in DPDK components.
>       * Vhost-user Dequeue zero-copy support is deprecated and will be removed
>         in the next release.
> +     * New 'options:dpdk-vf-mac' field for DPDK interface of VF ports,
> +       that allows configuring the MAC address of a VF representor.
>     - Linux datapath:
>       * Support for kernel versions up to 5.5.x.
>     - AF_XDP:
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 4ef7f78a6..060348369 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -523,6 +523,9 @@ struct netdev_dpdk {
>           * otherwise interrupt mode is used. */
>          bool requested_lsc_interrupt_mode;
>          bool lsc_interrupt_mode;
> +
> +        /* VF configuration. */
> +        struct eth_addr requested_hwaddr;
>      );
>  
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
> @@ -1685,6 +1688,16 @@ out:
>      return ret;
>  }
>  
> +static bool
> +dpdk_port_is_representor(struct netdev_dpdk *dev)
> +    OVS_REQUIRES(dev->mutex)
> +{
> +    struct rte_eth_dev_info dev_info;
> +
> +    rte_eth_dev_info_get(dev->port_id, &dev_info);
> +    return (*dev_info.dev_flags) & RTE_ETH_DEV_REPRESENTOR;
> +}
> +
>  static int
>  netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
>  {
> @@ -1719,6 +1732,13 @@ netdev_dpdk_get_config(const struct netdev *netdev, 
> struct smap *args)
>          }
>          smap_add(args, "lsc_interrupt_mode",
>                   dev->lsc_interrupt_mode ? "true" : "false");
> +
> +        if (dpdk_port_is_representor(dev)) {
> +            smap_add_format(args, "requested_hwaddr", ETH_ADDR_FMT,
> +                            ETH_ADDR_ARGS(dev->requested_hwaddr));
> +            smap_add_format(args, "configured_hwaddr", ETH_ADDR_FMT,
> +                            ETH_ADDR_ARGS(dev->hwaddr));

'*mac*' is used elsewhere (mac, mac_in_use, dpdk-vf-mac), think it's
better to keep consistent

> +        }
>      }
>      ovs_mutex_unlock(&dev->mutex);
>  
> @@ -1898,6 +1918,7 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
> struct smap *args,
>          {RTE_FC_RX_PAUSE, RTE_FC_FULL    }
>      };
>      const char *new_devargs;
> +    const char *vf_mac;
>      int err = 0;
>  
>      ovs_mutex_lock(&dpdk_mutex);
> @@ -1968,6 +1989,36 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
> struct smap *args,
>          goto out;
>      }
>  
> +    vf_mac = smap_get(args, "dpdk-vf-mac");
> +    if (vf_mac) {
> +        struct eth_addr mac;
> +
> +        err = EINVAL;
> +
> +        if (!dpdk_port_is_representor(dev)) {
> +            VLOG_ERR_BUF(errp, "'%s' is trying to set the VF MAC '%s' "
> +                         "but 'options:dpdk-vf-mac' is only supported for "
> +                         "VF representors.",
> +                         netdev_get_name(netdev), vf_mac);

Warnings here seems more in line with the other configs that cannot be
applied.

> +            goto out;
> +        }
> +        if (!eth_addr_from_string(vf_mac, &mac)) {
> +            VLOG_ERR_BUF(errp, "interface '%s': cannot parse VF MAC '%s'",
> +                         netdev_get_name(netdev), vf_mac);

as above

> +            goto out;
> +        }
> +        if (eth_addr_is_multicast(mac)) {
> +            VLOG_ERR_BUF(errp,
> +                         "interface '%s': cannot set VF MAC to multicast "
> +                         "address", netdev_get_name(netdev));

as above

> +            goto out;
> +        }
> +        if (!eth_addr_equals(dev->requested_hwaddr, mac)) {
> +            dev->requested_hwaddr = mac;
> +            netdev_request_reconfigure(netdev);
> +        }
> +    }
> +
>      lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false);
>      if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) {
>          dev->requested_lsc_interrupt_mode = lsc_interrupt_mode;
> @@ -4938,6 +4989,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>          && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode
>          && dev->rxq_size == dev->requested_rxq_size
>          && dev->txq_size == dev->requested_txq_size
> +        && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)
>          && dev->socket_id == dev->requested_socket_id
>          && dev->started && !dev->reset_needed) {
>          /* Reconfiguration is unnecessary */
> @@ -4969,6 +5021,14 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>      dev->txq_size = dev->requested_txq_size;
>  
>      rte_free(dev->tx_q);
> +
> +    if (!eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)) {

Looks like if dpdk-vf-mac is not set, after initialization there will be
a configured hwaddr, and requested_hwaddr will be all zeros.

If this function is called because of some other config change, then
hwaddr !=  requested_hwaddr, and it will attempt to set the mac to the
requested_hwaddr of all zeros. This may also fail causing the
reconfigure to fail for other configs items.

(I admit, I hacked a phy port dev_info to say it was a representor, when
playing with this but the issue seems valid)


> +        err = netdev_dpdk_set_etheraddr__(dev, dev->requested_hwaddr);
> +        if (err) {
> +            goto out;
> +        }
> +    }
> +
>      err = dpdk_eth_dev_init(dev);
>      if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) {
>          netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO;
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 81c84927f..c0bf03c95 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -3280,6 +3280,19 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 
> type=patch options:peer=p1 \
>            descriptors will be used by default.
>          </p>
>        </column>
> +
> +      <column name="options" key="dpdk-vf-mac">
> +        <p>
> +          Ethernet address to set for this VF interface.  If unset then the
> +          default MAC address is used:
> +        </p>
> +        <ul>
> +          <li>For most drivers, the default MAC address assigned by
> +          their hardware.</li><li>For bifurcated drivers, the MAC currently
> +          used by the kernel netdevice.</li>

nit: the formatting is a little different compared to the other
<li></li> usage, newline/indent etc.

thanks,
Kevin.

> +        </ul>
> +        <p>This option may only be used with dpdk VF representors.</p>
> +      </column>
>      </group>
>  
>      <group title="EMC (Exact Match Cache) Configuration">
> 





_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to