This issue was first observed with an ixgbe NIC in the VPP project, a 
software switching application based on DPDK.
A daemon thread runs in the background and keeps polling the hardware link 
status using ixgbe_dev_link_update_share().
Once the flag IXGBE_FLAG_NEED_LINK_CONFIG is set, ixgbe_dev_link_update_share() 
just returns link-down status without actually polling the hardware.

In this issue, IXGBE_FLAG_NEED_LINK_CONFIG stays set and is never cleared, 
so ixgbe_dev_link_update_share() never reads the hardware status and always 
reports link down.

The sequence that leaves IXGBE_FLAG_NEED_LINK_CONFIG permanently set is 
described below.

In the normal case, with ixgbe_dev_link_update_share() polling in the background:
1. Initially, IXGBE_FLAG_NEED_LINK_CONFIG is 0 and the link is down.
2. ixgbe_dev_link_update_share() sets IXGBE_FLAG_NEED_LINK_CONFIG to 1.
3. It then starts the ixgbe_dev_setup_link_thread_handler() thread to configure 
the interface.
4. At the end of the configuration thread, ixgbe_dev_setup_link_thread_handler() 
clears IXGBE_FLAG_NEED_LINK_CONFIG.
5. With the flag cleared, ixgbe_dev_link_update_share() can poll the hardware 
link status in the next round.
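
The five steps above can be sketched as a small state model. This is only an 
illustration of the flag lifecycle as described in this mail, not driver code: 
struct port_model, NEED_LINK_CONFIG, model_link_update() and 
model_setup_thread_finish() are hypothetical names, and the model assumes the 
configuration thread always brings the link up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IXGBE_FLAG_NEED_LINK_CONFIG; the bit value is arbitrary. */
#define NEED_LINK_CONFIG (1u << 0)

/* Hypothetical, simplified stand-in for the per-port driver state. */
struct port_model {
	uint32_t flags;       /* models intr->flags */
	bool hw_link_up;      /* what the hardware would report if polled */
	bool thread_running;  /* models ad->link_thread_running */
};

/* Models one polling round of ixgbe_dev_link_update_share().
 * Returns the link status that would be reported to the caller.
 */
static bool
model_link_update(struct port_model *p)
{
	/* While the flag is set, report link down without polling hardware. */
	if (p->flags & NEED_LINK_CONFIG)
		return false;

	if (!p->hw_link_up) {
		p->flags |= NEED_LINK_CONFIG; /* step 2 */
		p->thread_running = true;     /* step 3 */
		return false;
	}
	return true;
}

/* Models ixgbe_dev_setup_link_thread_handler() running to completion. */
static void
model_setup_thread_finish(struct port_model *p)
{
	p->hw_link_up = true;           /* assumption: configuration succeeds */
	p->flags &= ~NEED_LINK_CONFIG;  /* step 4 */
	p->thread_running = false;
}
```

Starting from a zeroed struct port_model, the first call to model_link_update() 
sets the flag, further calls return link down until 
model_setup_thread_finish() has run, and only then does the next call reach 
the hardware state (step 5).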

However, when the user sets the interface link up or down from the CLI, 
ixgbe_dev_start() or ixgbe_dev_stop() is called. Both functions call 
ixgbe_dev_cancel_link_thread() to interrupt any running configuration thread 
(the one in steps 3 and 4 above) without clearing the flag 
IXGBE_FLAG_NEED_LINK_CONFIG. The flag then stays set, and 
ixgbe_dev_link_update_share() can no longer read the hardware status.
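
A minimal sketch of the stuck state, assuming a simplified model rather than 
the real driver structures. All names here (struct link_state, 
NEED_LINK_CONFIG, on_link_down(), cancel_link_thread(), 
poll_reaches_hardware(), stuck_after_cancel()) are hypothetical. The 
clear_flag argument switches between the behaviour described above and the 
behaviour of the quoted patch, which clears the flag when the thread is 
cancelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IXGBE_FLAG_NEED_LINK_CONFIG; the bit value is arbitrary. */
#define NEED_LINK_CONFIG (1u << 0)

/* Hypothetical, simplified stand-in for the per-port driver state. */
struct link_state {
	uint32_t flags;       /* models intr->flags */
	bool thread_running;  /* models ad->link_thread_running */
};

/* True when a polling round would actually read the hardware link status. */
static bool
poll_reaches_hardware(const struct link_state *s)
{
	return (s->flags & NEED_LINK_CONFIG) == 0;
}

/* A polling round sees link down: set the flag and launch the config thread. */
static void
on_link_down(struct link_state *s)
{
	s->flags |= NEED_LINK_CONFIG;
	s->thread_running = true;
}

/* Models ixgbe_dev_cancel_link_thread().
 * clear_flag == false: behaviour described in this mail (flag left set).
 * clear_flag == true:  behaviour with the quoted patch applied.
 */
static void
cancel_link_thread(struct link_state *s, bool clear_flag)
{
	if (s->thread_running) {
		/* The thread is interrupted before its own flag-clearing step. */
		if (clear_flag)
			s->flags &= ~NEED_LINK_CONFIG;
		s->thread_running = false;
	}
}

/* Returns true if polling is stuck after a start/stop cancels the thread. */
static bool
stuck_after_cancel(bool clear_flag)
{
	struct link_state s = {0, false};

	on_link_down(&s);
	cancel_link_thread(&s, clear_flag);
	return !poll_reaches_hardware(&s);
}
```

In this model stuck_after_cancel(false) is true and stuck_after_cancel(true) 
is false, which matches the symptom and the intent of the fix.
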
Thanks.

> -----Original Message-----
> From: Phil Yang <phil.y...@arm.com>
> Sent: March 19, 2020 14:42
> To: dev@dpdk.org; konstantin.anan...@intel.com; wenzhuo...@intel.com
> Cc: qi.z.zh...@intel.com; Lijian Zhang <lijian.zh...@arm.com>; Gavin Hu
> <gavin...@arm.com>; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>; sta...@dpdk.org
> Subject: [PATCH] net/ixgbe: fix link state timing issue on fiber ports
> 
> With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it is possible
> when a port is started to experience a timing issue which prevents the link
> from ever being fully set up.
> 
> In ixgbe_dev_link_update_share(), if the media type is fiber and the link is
> down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback to
> ixgbe_dev_setup_link_thread_handler() is scheduled which should try to set up
> the link and clear the flag afterwards.
> 
> If the device is started before the flag is cleared, the scheduled callback is
> cancelled. This causes the flag to remain set and subsequent calls to
> ixgbe_dev_link_update_share() return without trying to retrieve the link state
> because the flag is set.
> 
> In ixgbe_dev_cancel_link_thread(), after cancelling the callback, unset the flag
> on the device to avoid this condition.
> 
> Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
> Cc: sta...@dpdk.org
> 
> Bugzilla ID: 388
> 
> Signed-off-by: Phil Yang <phil.y...@arm.com>
> Signed-off-by: Lijian Zhang <lijian.zh...@arm.com>
> Reviewed-by: Gavin Hu <gavin...@arm.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 23b3f5b..2b65750 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -4147,11 +4147,19 @@ static void
>  ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)
>  {
>       struct ixgbe_adapter *ad = dev->data->dev_private;
> +     struct ixgbe_interrupt *intr =
> +             IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
>       void *retval;
> 
>       if (rte_atomic32_read(&ad->link_thread_running)) {
>               pthread_cancel(ad->link_thread_tid);
>               pthread_join(ad->link_thread_tid, &retval);
> +             /* clear this flag once the thread has been
> +              * cancelled, to avoid link status error in
> +              * case unfinished threads cannot clean up
> +              * this flag.
> +              */
> +             intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
>               rte_atomic32_clear(&ad->link_thread_running);
>       }
>  }
> @@ -4262,8 +4270,12 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
> 
>       if (link_up == 0) {
>               if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> -                     intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>                       if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
> +                             /* To avoid race condition between threads, set
> +                              * the IXGBE_FLAG_NEED_LINK_CONFIG flag only
> +                              * when there is no link thread running.
> +                              */
> +                             intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
>                               if (rte_ctrl_thread_create(&ad->link_thread_tid,
>                                       "ixgbe-link-handler",
>                                       NULL,
> --
> 2.7.4
