On 05/02/2026 14:51, Sebastian Andrzej Siewior wrote:
On 2026-02-05 11:56:44 [+0000], Vadim Fedorenko wrote:
On 05/02/2026 10:37, Loktionov, Aleksandr wrote:
spin_lock_irqsave(&wq_head->lock, flags);  <- RT mutex can sleep

Hmm... that actually means we have some drivers broken for RT kernels if
they are processing TX timestamps within a single irq vector:
- hisilicon/hns3
- intel/i40e (and ice probably)
- marvell/mvpp2

For igb/igc/i40e it's still OK to process TX timestamps directly in an
MSI-X configuration, as ring processing has a separate vector, right?

The statement made above is not accurate. Each and every driver does
request_irq(), and on PREEMPT_RT the handler is then force-threaded by
default, so you can freely acquire a spinlock_t there.

But !RT looks problematic…

__skb_tstamp_tx() invokes skb_may_tx_timestamp() which should exit early
most of the time due to the passed bool (which is true) or
sysctl_tstamp_allow_data which is true. However, should both be false
then it tries to
        read_lock_bh(&sk->sk_callback_lock);

where lockdep will complain because this lock is now acquired with
disabled interrupts.
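
The decision path described above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: struct model_sock,
model_may_tx_timestamp() and the took_callback_lock flag are
hypothetical stand-ins, not the kernel's definitions, and the
CAP_NET_RAW lookup is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the state the check looks at. */
struct model_sock {
	bool sysctl_tstamp_allow_data; /* models the sysctl of that name */
	bool has_cap_net_raw;          /* models the capability lookup */
	bool took_callback_lock;       /* set when the slow path is reached */
};

static bool model_may_tx_timestamp(struct model_sock *sk, bool tsonly)
{
	assert(sk != NULL);
	sk->took_callback_lock = false;

	/* Fast path: either condition is enough, no lock is taken. */
	if (tsonly || sk->sysctl_tstamp_allow_data)
		return true;

	/*
	 * Slow path: this is the point where the kernel would do
	 * read_lock_bh(&sk->sk_callback_lock), which is the problem
	 * when the caller runs with interrupts disabled.
	 */
	sk->took_callback_lock = true;
	return sk->has_cap_net_raw;
}
```

In this model the lock is only reached when both tsonly and the sysctl
are false, which matches the "should both be false" case above.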

The function will attempt to free the fresh/cloned skb in the error
case via kfree_skb(). Since it is a fresh skb, sk_buff::destructor is
NULL and the warning in skb_release_head_state() won't trigger.

So the only thing that bothers me is the read_lock_bh() in
skb_may_tx_timestamp() which deadlocks if the socket is write-locked on
the same CPU.

Alright. Now you've got me wondering whether we should enforce the
OPT_TSONLY option on sockets that don't have CAP_NET_RAW. Then we can
get rid of this check and, in case the sysctl was flipped off, drop TX
timestamps as is done now?
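
To make the question concrete, here is a rough userspace model of one
possible reading of that idea. Everything in it is an assumption for
illustration: MODEL_OPT_TSONLY is a stand-in flag (not the real UAPI
value of SOF_TIMESTAMPING_OPT_TSONLY), and model_set_tsflags() and
model_may_tx_timestamp_nolock() are hypothetical names, not existing
kernel functions.

```c
#include <stdbool.h>

#define MODEL_OPT_TSONLY (1u << 0) /* stand-in flag, not the UAPI value */

/* Option-setting time: force TSONLY on sockets without CAP_NET_RAW. */
static unsigned int model_set_tsflags(unsigned int requested,
				      bool has_cap_net_raw)
{
	if (!has_cap_net_raw)
		requested |= MODEL_OPT_TSONLY;
	return requested;
}

/* Completion time: no capability lookup, so no sk_callback_lock. */
static bool model_may_tx_timestamp_nolock(unsigned int tsflags,
					  bool sysctl_allow_data)
{
	return (tsflags & MODEL_OPT_TSONLY) || sysctl_allow_data;
}
```

Note that in this model a socket with CAP_NET_RAW that did not ask for
TSONLY gets its TX timestamps dropped once the sysctl is off; whether
that is acceptable is part of the question.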
