On Thu, Nov 02, 2017 at 02:52:16PM +0100, Gaëtan Rivet wrote:
> On Wed, Nov 01, 2017 at 08:12:38PM +0000, Ophir Munk wrote:
> > The failsafe device has vlan stripping configured at startup; however,
> > once a sub device is found to be incapable of vlan stripping, failsafe
> > updates its configuration and removes vlan stripping from it.
> > This update occurs only once at startup. Following a later plugin
> > attempt and in case of vlan stripping mismatch between failsafe
> > configuration and device capability - failsafe cannot recover and the
> > device remains constantly in plug out state.
> > 
> > The sequence of events leading to this situation is described as
> > follows:
> > 1. Start testpmd with failsafe where mlx4 is a sub device (not capable
> > of vlan stripping). Expected printout:
> > PMD: net_failsafe: Disabling VLAN stripping offload
> > 2. Execute:
> > testpmd> port stop all
> > testpmd> port config all max-pkt-len 2048
> > testpmd> port start all
> > 3. Do a plug out (e.g. disable sriov)
> > 4. Do a plug in (e.g. enable sriov)
> > 5. Expected result: failsafe successfully configures and starts its sub
> > devices
> > Actual result: failsafe is continuously failing with these messages:
> > PMD: net_failsafe: VLAN stripping offload requested but not supported by
> > sub_device 0
> > PMD: net_failsafe: device already configured, cannot fix live
> > configuration
> > PMD: net_failsafe: Unable to synchronize sub device state
> > 
> > Root cause analysis: at startup failsafe removes vlan stripping from its
> > configuration. After executing "port config all max-pkt-len 2048"
> > testpmd marks failsafe in need for configuration update.
> > After executing "port start all" testpmd overrides failsafe
> > configuration with its own configuration which includes vlan stripping
> > 
> 
> Have you tried launching testpmd with the option
> 
> "--disable-hw-vlan"
> 
> as your mlx4 port does not support it?
> 

On second thought, I think there is a simple solution:

The fail-safe should stop trying to be clever with port configuration.
On rte_eth_dev_configure, simply apply the user configuration (without
trying to detect support and disabling flags on the fly).

If a PMD has an issue, it should warn the user. If it has an issue but
does not warn, that is a bug in that PMD. This is the case for MLX4:
the PMD may change its behavior or not, as long as users are fine
with it.

So a proper fix would be to remove the checks (fs_port_offload_validate
and fs_port_disable_offload) and depend on the sub-device for proper
configuration vetting.
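
To make the proposal concrete, here is a minimal, self-contained sketch
of the intended behavior. It is not the fail-safe code: struct
port_conf, struct sub_device, sub_configure() and fs_configure() are
hypothetical names made up for illustration, and the sub-device is
assumed to reject an unsupported offload with -ENOTSUP and a warning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these names are DPDK or fail-safe APIs. */
struct port_conf {
	bool hw_vlan_strip; /* user asks for VLAN stripping on Rx */
};

struct sub_device {
	const char *name;
	bool can_vlan_strip;   /* capability of the sub-device PMD */
	struct port_conf conf; /* configuration last accepted by the PMD */
};

/* Assumed PMD behavior: vet the request and warn the user on refusal. */
static int
sub_configure(struct sub_device *sdev, const struct port_conf *conf)
{
	if (conf->hw_vlan_strip && !sdev->can_vlan_strip) {
		fprintf(stderr, "%s: VLAN stripping not supported\n",
			sdev->name);
		return -ENOTSUP;
	}
	sdev->conf = *conf;
	return 0;
}

/*
 * Fail-safe configure as proposed: forward the user configuration
 * unchanged and return the first sub-device error. No capability
 * detection, and the caller's configuration is never edited.
 */
static int
fs_configure(struct sub_device *subs, size_t nb_subs,
	     const struct port_conf *user_conf)
{
	size_t i;
	int ret;

	assert(subs != NULL && user_conf != NULL);
	for (i = 0; i < nb_subs; i++) {
		ret = sub_configure(&subs[i], user_conf);
		if (ret != 0)
			return ret;
	}
	return 0;
}
```

With this shape, a request for VLAN stripping on a sub-device that
cannot do it fails at configure time with the sub-device's own error,
and the user configuration stays exactly as the application set it, so
a later plug-in replays the same configuration instead of a silently
modified one.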

Thoughts?

> > During the plugin attempt failsafe refuses to update its configuration
> > by removing vlan stripping since it has already updated its
> > configuration at startup.
> > 
> > The fix is to remove the limitation of one time configuration at
> > startup and allow it during plugin attempts.
> > 
> > Cc: sta...@dpdk.org
> > Fixes: bbc6a53dda44 ("net/failsafe: support Rx offload capabilities")
> > 
> > Signed-off-by: Ophir Munk <ophi...@mellanox.com>
> > ---
> > The commit message includes bug and fix descriptions
> > ---
> >  drivers/net/failsafe/failsafe_ops.c | 10 ----------
> >  1 file changed, 10 deletions(-)
> > 
> > diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
> > index f460551..953ee65 100644
> > --- a/drivers/net/failsafe/failsafe_ops.c
> > +++ b/drivers/net/failsafe/failsafe_ops.c
> > @@ -187,16 +187,6 @@
> >                     continue;
> >             DEBUG("Checking capabilities for sub_device %d", i);
> >             while ((capa_flag = fs_port_offload_validate(dev, sdev))) {
> > -                   /*
> > -                    * Refuse to change configuration if multiple devices
> > -                    * are present and we already have configured at least
> > -                    * some of them.
> > -                    */
> > -                   if (PRIV(dev)->state >= DEV_ACTIVE &&
> > -                       PRIV(dev)->subs_tail > 1) {
> > -                           ERROR("device already configured, cannot fix live configuration");
> > -                           return -1;
> > -                   }
> >                     ret = fs_port_disable_offload(&dev->data->dev_conf,
> >                                                   capa_flag);
> >                     if (ret) {
> > -- 
> > 1.8.3.1
> > 
> 
> -- 
> Gaëtan Rivet
> 6WIND

-- 
Gaëtan Rivet
6WIND
