Hi Slava,

Saturday, May 25, 2019 12:26 PM, Of Viacheslav Ovsiienko:
> Subject: [dpdk-dev] [PATCH] net/mlx5: fix event handler uninstall
>
> When device is being closed and tries to unregister interrupt callback,
> there is a chance the handler is still active (called in context of
> eal_intr_thread_main thread). If so the rte_intr_callback_unregister
> returns -EAGAIN and keeps the handler registered, causing crash when
> underlying resource is gone away.
>
> This race condition may happen if event handling in application takes a
> long time. We should check the return code of unregistering routine and
> try again to unregister the handler. The diagnostic messages are shown
> once a second, while trying to unregister.
>
> Fixes: 028b2a28c3cb ("net/mlx5: update event handler for multiport IB
> devices")
>
> Signed-off-by: Viacheslav Ovsiienko <viachesl...@mellanox.com>
> Acked-by: Yongseok Koh <ys...@mellanox.com>
> ---
[...]
> + */
> +void
> +mlx5_intr_callback_unregister(const struct rte_intr_handle *handle,
> +			      rte_intr_callback_fn cb_fn, void *cb_arg)
> +{
> +	/*
> +	 * Try to reduce timeout management overhead by not calling
> +	 * the timer related routines on the first iteration. If the
> +	 * unregistering succeeds on first call there will be no
> +	 * timer calls at all.
> +	 */
> +	uint64_t twait = 0;
> +	uint64_t start = 0;
> +
> +	do {
> +		int ret;
> +
> +		ret = rte_intr_callback_unregister(handle, cb_fn, cb_arg);
> +		if (ret >= 0)
> +			return;
> +		if (ret != -EAGAIN) {
> +			DRV_LOG(INFO, "failed to unregister interrupt"
> +				      " handler (error: %d)", ret);
> +			assert(false);
> +			return;
> +		}
> +		if (twait) {
> +			struct timespec onems;
> +
> +			/* Wait one millisecond and try again. */
> +			onems.tv_sec = 0;
> +			onems.tv_nsec = NS_PER_S / MS_PER_S;

I get the below when trying to compile on top of Bluefield:

/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1272:20: error: 'NS_PER_S' undeclared (first use in this function); did you mean 'NB_SEGS'?
    onems.tv_nsec = NS_PER_S / MS_PER_S;
                    ^~~~~~~~
                    NB_SEGS
/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1272:20: note: each undeclared identifier is reported only once for each function it appears in
/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1272:31: error: 'MS_PER_S' undeclared (first use in this function); did you mean 'NS_PER_S'?
    onems.tv_nsec = NS_PER_S / MS_PER_S;
                               ^~~~~~~~
                               NS_PER_S
/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1275:9: error: implicit declaration of function 'rte_get_timer_cycles'; did you mean 'rte_get_ptype_name'? [-Werror=implicit-function-declaration]
    if ((rte_get_timer_cycles() - start) <= twait)
         ^~~~~~~~~~~~~~~~~~~~
         rte_get_ptype_name
/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1275:9: error: nested extern declaration of 'rte_get_timer_cycles' [-Werror=nested-externs]
/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1284:12: error: implicit declaration of function 'rte_get_timer_hz'; did you mean 'rte_gettid'? [-Werror=implicit-function-declaration]
    twait = rte_get_timer_hz();
            ^~~~~~~~~~~~~~~~
            rte_gettid
/.autodirect/swgwork/shahafs/workspace/dpdk.org/drivers/net/mlx5/mlx5_ethdev.c:1284:12: error: nested extern declaration of 'rte_get_timer_hz' [-Werror=nested-externs]
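
As far as I can tell, NS_PER_S, MS_PER_S, rte_get_timer_cycles() and rte_get_timer_hz() all come from rte_cycles.h, so a missing include of that header in mlx5_ethdev.c looks like the probable cause. I have not verified that against this tree. For reference, here is a minimal standalone sketch of the retry logic the commit message describes, written so it builds without DPDK. Everything named sketch_* is hypothetical, the macros are local stand-ins for the DPDK ones, clock_gettime() stands in for the DPDK timer calls, and the part of the loop past the elided quote is my assumption based on the "once a second" wording.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() and nanosleep() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Local stand-ins for the MS_PER_S / NS_PER_S macros used by the patch. */
#define SKETCH_MS_PER_S 1000ULL
#define SKETCH_NS_PER_S 1000000000ULL

/* Hypothetical stand-in for a call like rte_intr_callback_unregister(). */
typedef int (*sketch_unregister_fn)(void *arg);

/* Monotonic clock in nanoseconds (assumption: replaces the DPDK timer). */
static uint64_t
sketch_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * SKETCH_NS_PER_S + (uint64_t)ts.tv_nsec;
}

/*
 * Call fn until it returns something other than -EAGAIN and hand that
 * value back. No timer call is made unless the first attempt fails.
 * If retries is not NULL it receives the number of -EAGAIN results.
 */
static int
sketch_unregister_retry(sketch_unregister_fn fn, void *arg,
			unsigned int *retries)
{
	uint64_t twait = 0;
	uint64_t start = 0;

	assert(fn != NULL);
	if (retries != NULL)
		*retries = 0;
	for (;;) {
		int ret = fn(arg);

		if (ret != -EAGAIN)
			return ret;
		if (retries != NULL)
			(*retries)++;
		if (twait) {
			struct timespec onems = {
				.tv_sec = 0,
				.tv_nsec = SKETCH_NS_PER_S / SKETCH_MS_PER_S,
			};

			/* Wait one millisecond and try again. */
			nanosleep(&onems, NULL);
			if (sketch_now_ns() - start <= twait)
				continue;
			/* Roughly one second has passed: report it. */
			fprintf(stderr, "still waiting to unregister\n");
		}
		/* Arm (or re-arm) the one-second diagnostic interval. */
		twait = SKETCH_NS_PER_S;
		start = sketch_now_ns();
	}
}

/*
 * Test stub: returns -EAGAIN while *arg > 0 (decrementing it),
 * -EINVAL if *arg is negative, and 0 once it reaches zero.
 */
static int
sketch_busy_stub(void *arg)
{
	int *busy = arg;

	if (*busy < 0)
		return -EINVAL;
	if (*busy > 0) {
		(*busy)--;
		return -EAGAIN;
	}
	return 0;
}
```

Calling sketch_unregister_retry(sketch_busy_stub, &busy, &retries) with busy set to 3 should come back with 0 after three -EAGAIN rounds. Unlike the patch, the sketch returns the error code to the caller instead of asserting, which is only a choice made to keep it testable.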