On Tue, Jun 30, 2026 at 02:16:26PM +0000, Ciara Loftus wrote:
> If interrupt registration fails eg. on FreeBSD with nic_uio, the ice
> interrupt handler is never called and AdminQ messages, including
> unsolicited link state notifications, are never processed.
>
> This gap has always existed but was partially masked by the blocking link
> status poll in the ice dev_start function, which would delay some time
> until the link was up before returning. In this case, the link status was
> typically correct after dev_start. However after that delay was removed in
> 2dd7e99855 ("net/ice: revert fix link up when starting device"), the link
> status was often wrong after dev_start, and there was no mechanism to
> self-correct.
>
> Rather than restoring the blocking wait removed by that commit which
> would re-impose a startup delay on every platform, instead activate a
> periodic alarm that drains the AdminQ on a 50ms interval, replicating the
> role of the interrupt handler on platforms where interrupt delivery is not
> supported.
>
> Fixes: 2dd7e99855 ("net/ice: revert fix link up when starting device")
> Cc: [email protected]
>
> Signed-off-by: Ciara Loftus <[email protected]>
> ---
This solution seems reasonable. Some thoughts and comments inline below.
/Bruce
> drivers/net/intel/ice/ice_ethdev.c | 45 +++++++++++++++++++++++++++++-
> drivers/net/intel/ice/ice_ethdev.h | 1 +
> 2 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/intel/ice/ice_ethdev.c
> b/drivers/net/intel/ice/ice_ethdev.c
> index 022d92a2f6..447742ff8d 100644
> --- a/drivers/net/intel/ice/ice_ethdev.c
> +++ b/drivers/net/intel/ice/ice_ethdev.c
> @@ -3,6 +3,7 @@
> */
>
> #include <rte_string_fns.h>
> +#include <rte_alarm.h>
> #include <ethdev_pci.h>
>
> #include <ctype.h>
> @@ -108,7 +109,11 @@ enum ice_link_state_on_close {
>
> #define ICE_RSS_LUT_GLOBAL_QUEUE_NB 64
>
> +/* Polling interval used when the interrupt path is unavailable */
> +#define ICE_AQ_ALARM_INTERVAL 50000 /* us */
> +
> static int ice_dev_configure(struct rte_eth_dev *dev);
> +static void ice_aq_alarm_handler(void *param);
> static int ice_dev_start(struct rte_eth_dev *dev);
> static int ice_dev_stop(struct rte_eth_dev *dev);
> static int ice_dev_close(struct rte_eth_dev *dev);
> @@ -1473,6 +1478,28 @@ ice_handle_aq_msg(struct rte_eth_dev *dev)
> }
> }
>
> +/*
> + * Alarm-based AdminQ polling handler that drains the AdminQ to process
> + * async events, then re-arms itself.
> + *
> + * @param param
> + * The address of parameter (struct rte_eth_dev *) registered before.
> + *
> + * @return
> + * void
> + */
> +static void
> +ice_aq_alarm_handler(void *param)
> +{
> + struct rte_eth_dev *dev = (struct rte_eth_dev *)param;
> + struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> +
> + if (!pf->adapter_stopped) {
> + ice_handle_aq_msg(dev);
> + rte_eal_alarm_set(ICE_AQ_ALARM_INTERVAL, ice_aq_alarm_handler,
> dev);
> + }
> +}
> +
> /**
> * Interrupt handler triggered by NIC for handling
> * specific interrupt.
> @@ -2913,6 +2940,11 @@ ice_dev_stop(struct rte_eth_dev *dev)
> rte_intr_efd_disable(intr_handle);
> rte_intr_vec_list_free(intr_handle);
>
> + if (pf->use_aq_polling) {
> + rte_eal_alarm_cancel(ice_aq_alarm_handler, dev);
> + pf->use_aq_polling = false;
> + }
> +
Do we actually need to cancel the timer here? Since the callback itself
only rearms on started devices, we can let the timer expire and become a
no-op, rather than needing to explicitly cancel it here.
If so, and based on other suggestions I have below, we may be able to
remove this block from stop function entirely.
> pf->adapter_stopped = true;
> dev->data->dev_started = 0;
>
> @@ -4434,7 +4466,13 @@ ice_dev_start(struct rte_eth_dev *dev)
> if (ice_rxq_intr_setup(dev))
> return -EIO;
>
> - rte_intr_enable(pci_dev->intr_handle);
> + /* Fall back to polling the AdminQ via an alarm on platforms where
> + * interrupt delivery is unavailable.
> + */
> + if (rte_intr_enable(pci_dev->intr_handle) != 0) {
> + pf->use_aq_polling = true;
> + rte_eal_alarm_set(ICE_AQ_ALARM_INTERVAL, ice_aq_alarm_handler,
> dev);
> + }
If alarm set fails, we should at least emit a warning, even if there is
nothing we can really do about it.
Also, I don't think we need to track use_aq_polling at all. Its only use is
to cancel the timer, but all places where we cancel the timer are places
where we mark the port as stopped (with one exception, which I believe is a
separate error in the code), so the timer will self-cancel at next timeout
expiry.
>
> /* Enable receiving broadcast packets and transmitting packets */
> ice_set_bit(ICE_PROMISC_BCAST_RX, pmask);
> @@ -4497,6 +4535,11 @@ ice_dev_start(struct rte_eth_dev *dev)
> for (i = 0; i < nb_txq; i++)
> ice_tx_queue_stop(dev, i);
>
> + if (pf->use_aq_polling) {
> + rte_eal_alarm_cancel(ice_aq_alarm_handler, dev);
> + pf->use_aq_polling = false;
> + }
> +
As explained above, I don't think we need this block either. If we have an
error on start, then the port should be marked as stopped and the timer
won't rearm itself after it expires.
NOTE: there is an unrelated bug in the code here - at line 4550 in start,
we set "adapter_stopped = false", but there is still error handling at line
4564 which causes us to jump to rx_err and stop all the rx and tx queues
again, but neglecting to reset adapter_stopped back to true. I think we
need to either reset it immediately after the rx_err label, or move the
assignment down till after all potential rollback points are passed.
> return -EIO;
> }
>
> diff --git a/drivers/net/intel/ice/ice_ethdev.h
> b/drivers/net/intel/ice/ice_ethdev.h
> index 20e8a13fe9..86b256fc3f 100644
> --- a/drivers/net/intel/ice/ice_ethdev.h
> +++ b/drivers/net/intel/ice/ice_ethdev.h
> @@ -599,6 +599,7 @@ struct ice_pf {
> struct ice_eth_stats internal_stats;
> bool offset_loaded;
> bool adapter_stopped;
> + bool use_aq_polling; /* true when interrupt path unavailable */
> struct ice_flow_list flow_list;
> rte_spinlock_t flow_ops_lock;
> bool init_link_up;
> --
> 2.43.0
>