Wander reported that performing an SR-IOV setup on PREEMPT_RT can fail
with a timeout.
The reason is that during the setup the VF device performs a reset
(igbvf_reset()) and polls for an ACK (e1000_reset_hw_vf() ->
e1000_check_for_ack_vf()) with bottom halves disabled. The ACK cannot
complete until the igb_msix_other() interrupt handler has run.
The interrupt handler is forced-threaded on PREEMPT_RT and therefore
delayed until after bottom halves are enabled again. This happens only
after e1000_reset_hw_vf() times out. This scenario requires that the
interrupt handler and the reset handler run on the same CPU.
This scenario is not limited to PREEMPT_RT but can also happen without
PREEMPT_RT if the interrupts are forced threaded via `threadirqs'.
Setups without forced threaded interrupts are not affected.

The interrupt handler (igb_msix_other()) does not require bottom halves
to be disabled. It does not call into the network stack which would
mandate it. Requesting the handler explicitly as a threaded interrupt
means bottom halves are not disabled prior to invoking the handler, thus
avoiding the scenario.

Request igb_msix_other as a threaded interrupt handler.

Reported-by: Wander Lairson Costa <wan...@redhat.com>
Closes: https://lore.kernel.org/all/20240920185918.616302-2-wan...@redhat.com/
Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
---

I've been sitting on this one for a while. While this avoids the timeout
on a PREEMPT_RT setup, the !PREEMPT_RT + threadirqs setup remains
affected. The difference is that PREEMPT_RT allows a context switch
within a local_bh_disable() section while !PREEMPT_RT does not.

Allowing e1000_reset_hw_vf() to run/wait/poll without holding
e1000_hw::mbx_lock, which disables BH, should fix both setups.

 drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index c646c71915f03..0827e8dcd9de7 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -912,8 +912,8 @@ static int igb_request_msix(struct igb_adapter *adapter)
        struct net_device *netdev = adapter->netdev;
        int i, err = 0, vector = 0, free_vector = 0;
 
-       err = request_irq(adapter->msix_entries[vector].vector,
-                         igb_msix_other, 0, netdev->name, adapter);
+       err = request_threaded_irq(adapter->msix_entries[vector].vector, NULL,
+                                  igb_msix_other, 0, netdev->name, adapter);
        if (err)
                goto err_out;
 
-- 
2.49.0
