Hi Ilya, Kevin, Thanks for the remarks, I will fix them. However I have two questions inline.
On 27/10/20 15:03 +0100, Ilya Maximets wrote: > On 10/20/20 1:15 PM, Kevin Traynor wrote: > > On 16/09/2020 16:17, Gaetan Rivet wrote: > >> In some cloud topologies, using DPDK VF representors in guest requires > >> configuring a VF before it is assigned to the guest. > >> > >> A first basic option for such configuration is setting the VF MAC > >> address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface > >> table. > >> > >> This option can be used as such: > >> > >> $ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk > >> \ > >> options:dpdk-vf-mac=00:11:22:33:44:55 > >> > > > > If I got it right, it doesn't seem like it solves any issues by limiting > > to representor, but is just an attempt to limit some of the edge cases > > around bifurcated devices to the devices where the requested > > functionality is really needed now. > > > > This limitation has some costs... > > > > We will now have, mac, mac_in_use, dpdk-vf-mac, configured_hwaddr, > > requested_hwaddr for the user to understand. I think it can be confusing. > > To be clear, 'configured_hwaddr/requested_hwaddr' type of report should not > exist at least in the output of 'get_config()' function. Same applies > to all existing '{configured,requested}_something' reports. > 'get_config()' intended to report things that could be copied and passed > directly > to database. OTOH, 'get_status()' should report what is an actual device > status, i.e. 'dpdk-vf-mac' should be in 'get_config()' and 'configured_hwaddr' > should be in 'get_status()', but it should be named differently, e.g. just > 'dpdk-vf-mac' and the callback it placed in already says that it's actually > 'configured'. In short: > - 'dpdk-vf-mac' from the 'get_config()' should report dev->requested_hwaddr. > - 'dpdk-vf-mac' from the 'get_status()' should report dev->hwaddr. > Of course, all this should be reported only if device is a representor. > > Indeed I had followed the other '{configured|requested}_' fields. This is not related to this patchset, however I'd like to know how to proceed to fix those other fields? They were exposed to users possibly, should there be a deprecation process to warn of the change, or should they just be fixed directly as the get_config() API was not respected? > > > > 'ovs-vsctl show' will show the dpdk-vf-mac, which is the main one we > > document, but it's not clear if it was successful or not. (yes there is > > a log entry, but it is a separate place). > > > > As it is specific for representors, if we ever want to allow setting of > > mac on any non representor DPDK ports, we have an awkward user > > interface. We would need to make another 'dpdk-mac' type option, or we > > decide then to allow use mac in interface table but that it doesn't work > > for representors, or there are two ways to set for representors etc. > > None of this seems great. > > > > My feeling is that it is making things complicated for the user with > > this many knobs, where we could just have mac and mac_in_use, live with > > the edge cases (as Gaetan pointed out, we already do for MTU) and we > > know as well if we ever need to extend further, the user interface will > > stay simple. Just my 2C, maybe there's a good argument why we can't do > > it like this. > > Setting mac for physical device has lots of split-brain issues and unclear > ways how and when undo changes as we do not know what was original mac or > what to do if there was no mac at all at the start. Moreover, some devices > will not allow setting mac to it's original value (e.g. all zeroes). > One more point is that vf and representor itself are different devices and > these devices could have different mac addresses. For now we agreed that > DPDK api works for representors in a way that it always changes parameters > of a virtual function, not representors', but that is not a very clear thing > for all users (e.g. for DPDK developers), so 'vf' in 'dpdk-vf-mac' > specifically > points to that difference. > > After a long discussion we decided to make this option as specific as > possible, > so users will not try to use it to configure anything else beside mac of a > virtual function. All restrictions and downsides of configuration of vf mac > was/should be documented. > > > > > > Few comments on the code as it is now below. > > > > > >> Signed-off-by: Gaetan Rivet <gr...@u256.net> > >> Suggested-by: Ilya Maximets <i.maxim...@ovn.org> > >> Acked-by: Eli Britstein <el...@nvidia.com> > >> --- > >> Documentation/topics/dpdk/phy.rst | 22 ++++++++++++ > >> NEWS | 2 ++ > >> lib/netdev-dpdk.c | 60 +++++++++++++++++++++++++++++++ > >> vswitchd/vswitch.xml | 13 +++++++ > >> 4 files changed, 97 insertions(+) > >> > >> diff --git a/Documentation/topics/dpdk/phy.rst > >> b/Documentation/topics/dpdk/phy.rst > >> index 38e52c8de..73dcb7c28 100644 > >> --- a/Documentation/topics/dpdk/phy.rst > >> +++ b/Documentation/topics/dpdk/phy.rst > >> @@ -379,6 +379,28 @@ an eth device whose mac address is > >> ``00:11:22:33:44:55``:: > >> $ ovs-vsctl add-port br0 dpdk-mac -- set Interface dpdk-mac type=dpdk > >> \ > >> options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55" > >> > >> +Representor specific configuration > >> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > >> + > >> +In some topologies, a VF must be configured before being assigned to a > >> +guest (VM) machine. This configuration is done through VF-specific fields > >> +in the ``options`` column of the Interface table. > >> + > >> +.. important:: > >> + > >> + Some DPDK port use `bifurcated drivers <bifurcated-drivers>`__, > >> + which means that a kernel netdevice remains when Open vSwitch is > >> stopped. > >> + > >> + In such case, any configuration applied to a VF would remain set on the > >> + kernel netdevice, and be inherited from it when Open vSwitch is > >> restarted, > >> + even if the options described in this section are unset from Open > >> vSwitch. > >> + > >> +.. _bifurcated-drivers: > >> http://doc.dpdk.org/guides/linux_gsg/linux_drivers.html#bifurcated-driver > >> + > >> +- Configure the VF MAC address:: > >> + > >> + $ ovs-vsctl set Interface dpdk-rep0 > >> options:dpdk-vf-mac=00:11:22:33:44:55 > >> + > > > > As mentioned above, if for some reason setting the mac fails there will > > be warning in the logs, but 'ovs-vsctl show' will still show the > > dpdk-vf-mac that was not applied and it's quite prominent. Also, there > > are mac, mac_in_use, dpdk-vf-mac, configured_hwaddr, requested_hwaddr. > > > > Suggest to add a small summary here for each one as it relates > > representors and make clear to users, > > 1. how to find the mac the user requested > > 2. how to find the current mac > > > >> Jumbo Frames > >> ------------ > >> > >> diff --git a/NEWS b/NEWS > >> index 2f67d5047..5ece29474 100644 > >> --- a/NEWS > >> +++ b/NEWS > >> @@ -24,6 +24,8 @@ v2.14.0 - 17 Aug 2020 > > > > 2.15....hopefully :-) > > > >> to list and change log levels in DPDK components. > >> * Vhost-user Dequeue zero-copy support is deprecated and will be > >> removed > >> in the next release. > >> + * New 'options:dpdk-vf-mac' field for DPDK interface of VF ports, > >> + that allows configuring the MAC address of a VF representor. > >> - Linux datapath: > >> * Support for kernel versions up to 5.5.x. > >> - AF_XDP: > >> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > >> index 4ef7f78a6..060348369 100644 > >> --- a/lib/netdev-dpdk.c > >> +++ b/lib/netdev-dpdk.c > >> @@ -523,6 +523,9 @@ struct netdev_dpdk { > >> * otherwise interrupt mode is used. */ > >> bool requested_lsc_interrupt_mode; > >> bool lsc_interrupt_mode; > >> + > >> + /* VF configuration. */ > >> + struct eth_addr requested_hwaddr; > >> ); > >> > >> PADDED_MEMBERS(CACHE_LINE_SIZE, > >> @@ -1685,6 +1688,16 @@ out: > >> return ret; > >> } > >> > >> +static bool > >> +dpdk_port_is_representor(struct netdev_dpdk *dev) > >> + OVS_REQUIRES(dev->mutex) > >> +{ > >> + struct rte_eth_dev_info dev_info; > >> + > >> + rte_eth_dev_info_get(dev->port_id, &dev_info); > >> + return (*dev_info.dev_flags) & RTE_ETH_DEV_REPRESENTOR; > >> +} > >> + > >> static int > >> netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args) > >> { > >> @@ -1719,6 +1732,13 @@ netdev_dpdk_get_config(const struct netdev *netdev, > >> struct smap *args) > >> } > >> smap_add(args, "lsc_interrupt_mode", > >> dev->lsc_interrupt_mode ? "true" : "false"); > >> + > >> + if (dpdk_port_is_representor(dev)) { > >> + smap_add_format(args, "requested_hwaddr", ETH_ADDR_FMT, > >> + ETH_ADDR_ARGS(dev->requested_hwaddr)); > >> + smap_add_format(args, "configured_hwaddr", ETH_ADDR_FMT, > >> + ETH_ADDR_ARGS(dev->hwaddr)); > > > > '*mac*' is used elsewhere (mac, mac_in_use, dpdk-vf-mac), think it's > > better to keep consistent > > Since this is 'config' it should report things in a way they could be passed > to the datbase. So, it should be 'dpdk-vf-mac' and it should report the > value that was requested, i.e. dev->requested_hwaddr. > > > > >> + } > >> } > >> ovs_mutex_unlock(&dev->mutex); > >> > >> @@ -1898,6 +1918,7 @@ netdev_dpdk_set_config(struct netdev *netdev, const > >> struct smap *args, > >> {RTE_FC_RX_PAUSE, RTE_FC_FULL } > >> }; > >> const char *new_devargs; > >> + const char *vf_mac; > >> int err = 0; > >> > >> ovs_mutex_lock(&dpdk_mutex); > >> @@ -1968,6 +1989,36 @@ netdev_dpdk_set_config(struct netdev *netdev, const > >> struct smap *args, > >> goto out; > >> } > >> > >> + vf_mac = smap_get(args, "dpdk-vf-mac"); > >> + if (vf_mac) { > >> + struct eth_addr mac; > >> + > >> + err = EINVAL; > >> + > >> + if (!dpdk_port_is_representor(dev)) { > >> + VLOG_ERR_BUF(errp, "'%s' is trying to set the VF MAC '%s' " > >> + "but 'options:dpdk-vf-mac' is only supported for > >> " > >> + "VF representors.", > >> + netdev_get_name(netdev), vf_mac); > > > > Warnings here seems more in line with the other configs that cannot be > > applied. > > I agree, if it's not a representor we could just issue a warning and continue. > Your phrasing seems to limit the (err -> warn) change to the representor filtering part. Should the MAC sanitizing messages below remain errors? > > > >> + goto out; > >> + } > >> + if (!eth_addr_from_string(vf_mac, &mac)) { > >> + VLOG_ERR_BUF(errp, "interface '%s': cannot parse VF MAC '%s'", > >> + netdev_get_name(netdev), vf_mac); > > > > as above > > > >> + goto out; > >> + } > >> + if (eth_addr_is_multicast(mac)) { > >> + VLOG_ERR_BUF(errp, > >> + "interface '%s': cannot set VF MAC to multicast " > >> + "address", netdev_get_name(netdev)); > > > > as above > > > >> + goto out; > >> + } > >> + if (!eth_addr_equals(dev->requested_hwaddr, mac)) { > >> + dev->requested_hwaddr = mac; > >> + netdev_request_reconfigure(netdev); > >> + } > >> + } > >> + > >> lsc_interrupt_mode = smap_get_bool(args, "dpdk-lsc-interrupt", false); > >> if (dev->requested_lsc_interrupt_mode != lsc_interrupt_mode) { > >> dev->requested_lsc_interrupt_mode = lsc_interrupt_mode; > >> @@ -4938,6 +4989,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > >> && dev->lsc_interrupt_mode == dev->requested_lsc_interrupt_mode > >> && dev->rxq_size == dev->requested_rxq_size > >> && dev->txq_size == dev->requested_txq_size > >> + && eth_addr_equals(dev->hwaddr, dev->requested_hwaddr) > >> && dev->socket_id == dev->requested_socket_id > >> && dev->started && !dev->reset_needed) { > >> /* Reconfiguration is unnecessary */ > >> @@ -4969,6 +5021,14 @@ netdev_dpdk_reconfigure(struct netdev *netdev) > >> dev->txq_size = dev->requested_txq_size; > >> > >> rte_free(dev->tx_q); > >> + > >> + if (!eth_addr_equals(dev->hwaddr, dev->requested_hwaddr)) { > > > > Looks like if dpdk-vf-mac is not set, after initialization there will be > > a configured hwaddr, and requested_hwaddr will be all zeros. > > > > If this function is called because of some other config change, then > > hwaddr != requested_hwaddr, and it will attempt to set the mac to the > > requested_hwaddr of all zeros. This may also fail causing the > > reconfigure to fail for other configs items. > > > > (I admit, I hacked a phy port dev_info to say it was a representor, when > > playing with this but the issue seems valid) > > > > > >> + err = netdev_dpdk_set_etheraddr__(dev, dev->requested_hwaddr); > >> + if (err) { > >> + goto out; > >> + } > >> + } > >> + > >> err = dpdk_eth_dev_init(dev); > >> if (dev->hw_ol_features & NETDEV_TX_TSO_OFFLOAD) { > >> netdev->ol_flags |= NETDEV_TX_OFFLOAD_TCP_TSO; > >> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > >> index 81c84927f..c0bf03c95 100644 > >> --- a/vswitchd/vswitch.xml > >> +++ b/vswitchd/vswitch.xml > >> @@ -3280,6 +3280,19 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > >> type=patch options:peer=p1 \ > >> descriptors will be used by default. > >> </p> > >> </column> > >> + > >> + <column name="options" key="dpdk-vf-mac"> > >> + <p> > >> + Ethernet address to set for this VF interface. If unset then > >> the > >> + default MAC address is used: > >> + </p> > >> + <ul> > >> + <li>For most drivers, the default MAC address assigned by > >> + their hardware.</li><li>For bifurcated drivers, the MAC > >> currently > >> + used by the kernel netdevice.</li> > > > > nit: the formatting is a little different compared to the other > > <li></li> usage, newline/indent etc. > > > > thanks, > > Kevin. > > > >> + </ul> > >> + <p>This option may only be used with dpdk VF representors.</p> > >> + </column> > >> </group> > >> > >> <group title="EMC (Exact Match Cache) Configuration"> > >> > > > > > > > > > > > -- Gaëtan _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev