Hi Ian, On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote: > This commit replaces MTU_TO_FRAME_LEN(mtu) with > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu(). > > When setting an MTU we first check if the requested MTU frame > size will exceed the maximum packet frame size supported in > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU + > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later > stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
Perhaps? dpdk_eth_dev_init() rte_eth_dev_set_mtu() > However when using rte_ethdev_set_mtu(mtu) the calculation used to check > that the MTU does not exceed the max frame length for that device varies > between DPDK device drivers. For example ixgbe driver calculates MTU > frame length as > > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN > > i40e driver calculates it as > > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2 > > em driver calculates it as > > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE > > Currently it is possible to set an MTU for a netdev_dpdk device that exceeds > the upper limit MTU for that devices DPDK driver. This leads to a > segfault. This is because the MTU frame length comparison as is, does > not take into account the addition of the vlan tag overhead expected in > the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the > subsequent dpdk_ethdev_init() will fail before the queues have been > created for the DPDK device. This coupled with assumptions regarding > reconfiguration requirements for the netdev will lead to a segfault when > the rxq is polled for this device. > > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when > validating a requested MTU in netdev_dpdk_set_mtu(). > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following: > > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN) > > It's use accounts for the potential of up to two vlan headers being > included in the overhead required by some devices in MTU frame length at > the netdev_dpdk_set_mtu() stage. This allows OVS to flag an error rather > than the DPDK driver if the MTU exceeds the max DPDK frame length. OVS > can fail gracefully at this point by falling back to a default MTU of 1500 > and continue to configure the port. > > Note: this fix is a work around, a better approach would be if DPDK devices > could report the maximum MTU value that can be requested on a per device > basis. This capability however is not currently available. A downside of > this patch is that the MTU upper limit will be reduced by 8 bytes for > DPDK devices that do not need to account for vlan tags in MTU frame length > driver calculations e.g. ixgbe devices upper MTU limit is reduced from > the OVS point of view from 9710 to 9702. > > CC: Mark Kavanagh <mark.b.kavan...@intel.com> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames") > Signed-off-by: Ian Stokes <ian.sto...@intel.com> > > --- > v1->v2 > * Add note to limitations in DPDK documentation regarding MTU. > * Correct MTU calculation in commit message. > * Flag that we provision 8 bytes (2 x vlan header) by using > MTU_TO_MAX_FRAME_LEN in commit message and code comments. > --- > Documentation/intro/install/dpdk.rst | 12 ++++++++++++ > lib/netdev-dpdk.c | 13 ++++++++++++- > 2 files changed, 24 insertions(+), 1 deletions(-) > > diff --git a/Documentation/intro/install/dpdk.rst > b/Documentation/intro/install/dpdk.rst > index 3fecb5c..5800096 100644 > --- a/Documentation/intro/install/dpdk.rst > +++ b/Documentation/intro/install/dpdk.rst > @@ -585,6 +585,18 @@ Limitations > > .. _DPDK release notes: > http://dpdk.org/doc/guides/rel_notes/release_17_11.html > > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead is > + calculated e.g. i40e driver includes 2 x vlan headers in MTU overhead, > + em driver includes 1 x vlan header, ixgbe driver does not include a vlan > + header in overhead. Currently it is not possible for OVS DPDK to know what > + upper bound MTU value is supported for a given device. As such OVS DPDK > + must provision for the case where the maximum MTU value includes 2 x vlan > + headers. This reduces the upper bound MTU value for devices that do not > + include vlan headers in their overhead by 8 bytes e.g. ixgbe devices upper two spaces > + bound MTU is reduced from 9710 to 9702. This work around is temporary and > + is expected to be removed once a method is provided by DPDK to query the > + maximum MTU for a given device. > + > Reporting Bugs > -------------- > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index e32c7f6..2ac76e3 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu) > { > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > > - if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN > + /* XXX: We need to ensure the requested MTU frame length does not > + * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ > + * in how the MTU frame length is calculated when rte_ethdev_set_mtu(mtu) > + * is called e.g. i40e driver includes 2 x vlan headers in MTU overhead > + * the em driver includes 1 x vlan tag in MTU overhead, the ixgbe driver > + * does not include vlan tags in MTU overhead. As such we should use > + * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x vlan > headers > + * (8 bytes) for comparison. This avoids a failure later with > + * rte_ethdev_set_mtu(). This approach should be used until DPDK provides rte_eth_dev_set_mtu > + * a method to retrieve maximum MTU frame length for a given device. > + */ > + if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN > || mtu < ETHER_MIN_MTU) { > VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu); > return EINVAL; We got a bug report for another related MTU issue because in older OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead, so if the packet length is the maximum allowed, it can't forward packets with VLAN headers included. So, I agree that this needs improvements on the DPDK side to report not only the maximum MTU but also to standardize what means MTU. Today we don't know the overhead each PMD will account. In the hope that the committer can fix the small typos above or if there is another spin: Acked-by: Flavio Leitner <f...@sysclose.org> Thanks Ian! > -- > 1.7.0.7 > > _______________________________________________ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev -- Flavio _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev