Re: [PATCH] xfrm : lock input tasklet skb queue
Hi,

I already sent a patch on 2019-09-09 to this mailing list with a similar
issue [1]. Sadly no replies, although this is a huge bug in the rt kernel.
I fixed it a bit differently, using smaller locked regions.

You also probably have a bug in your patch, because trans->queue.lock is
not initialized by __skb_queue_head_init (in xfrm_input_init).

Jörg

[1] https://lkml.org/lkml/2019/9/9/111

Am 20.10.2019 um 17:46 schrieb Tom Rix:
> On PREEMPT_RT_FULL while running netperf, a corruption
> of the skb queue causes an oops.
>
> This appears to be caused by a race condition here
>         __skb_queue_tail(&trans->queue, skb);
>         tasklet_schedule(&trans->tasklet);
> Where the queue is changed before the tasklet is locked by
> tasklet_schedule.
>
> The fix is to use the skb queue lock.
>
> Signed-off-by: Tom Rix
> ---
>  net/xfrm/xfrm_input.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 9b599ed66d97..226dead86828 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -758,12 +758,16 @@ static void xfrm_trans_reinject(unsigned long data)
>          struct xfrm_trans_tasklet *trans = (void *)data;
>          struct sk_buff_head queue;
>          struct sk_buff *skb;
> +        unsigned long flags;
>
>          __skb_queue_head_init(&queue);
> +        spin_lock_irqsave(&trans->queue.lock, flags);
>          skb_queue_splice_init(&trans->queue, &queue);
>
>          while ((skb = __skb_dequeue(&queue)))
>                  XFRM_TRANS_SKB_CB(skb)->finish(dev_net(skb->dev), NULL, skb);
> +
> +        spin_unlock_irqrestore(&trans->queue.lock, flags);
>  }
>
>  int xfrm_trans_queue(struct sk_buff *skb,
> @@ -771,15 +775,20 @@ int xfrm_trans_queue(struct sk_buff *skb,
>                                     struct sk_buff *))
>  {
>          struct xfrm_trans_tasklet *trans;
> +        unsigned long flags;
>
>          trans = this_cpu_ptr(&xfrm_trans_tasklet);
> +        spin_lock_irqsave(&trans->queue.lock, flags);
>
> -        if (skb_queue_len(&trans->queue) >= netdev_max_backlog)
> +        if (skb_queue_len(&trans->queue) >= netdev_max_backlog) {
> +                spin_unlock_irqrestore(&trans->queue.lock, flags);
>                  return -ENOBUFS;
> +        }
>
>          XFRM_TRANS_SKB_CB(skb)->finish = finish;
>          __skb_queue_tail(&trans->queue, skb);
>          tasklet_schedule(&trans->tasklet);
> +        spin_unlock_irqrestore(&trans->queue.lock, flags);
>
>          return 0;
>  }
>  EXPORT_SYMBOL(xfrm_trans_queue);
Re: [PATCH] xfrm : lock input tasklet skb queue
On Mon, Oct 21, 2019 at 09:31:13AM -0700, Tom Rix wrote:
> When preempt rt is full, softirq and interrupts run in kthreads. So it
> is possible for the tasklet to sleep and for its queue to get modified
> while it sleeps.

This is ridiculous.  The network stack is full of assumptions like
this.  So I think we need to fix preempt rt instead, because you can't
make a major change like this without auditing the entire kernel first
rather than relying on a whack-a-mole approach.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] xfrm : lock input tasklet skb queue
When preempt rt is full, softirq and interrupts run in kthreads. So it
is possible for the tasklet to sleep and for its queue to get modified
while it sleeps.

On Mon, Oct 21, 2019 at 1:37 AM Steffen Klassert wrote:
>
> On Sun, Oct 20, 2019 at 08:46:10AM -0700, Tom Rix wrote:
> > On PREEMPT_RT_FULL while running netperf, a corruption
> > of the skb queue causes an oops.
> >
> > This appears to be caused by a race condition here
> >         __skb_queue_tail(&trans->queue, skb);
> >         tasklet_schedule(&trans->tasklet);
> > Where the queue is changed before the tasklet is locked by
> > tasklet_schedule.
> >
> > The fix is to use the skb queue lock.
> >
> > Signed-off-by: Tom Rix
> > ---
> >  net/xfrm/xfrm_input.c | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> > index 9b599ed66d97..226dead86828 100644
> > --- a/net/xfrm/xfrm_input.c
> > +++ b/net/xfrm/xfrm_input.c
> > @@ -758,12 +758,16 @@ static void xfrm_trans_reinject(unsigned long data)
> >          struct xfrm_trans_tasklet *trans = (void *)data;
> >          struct sk_buff_head queue;
> >          struct sk_buff *skb;
> > +        unsigned long flags;
> >
> >          __skb_queue_head_init(&queue);
> > +        spin_lock_irqsave(&trans->queue.lock, flags);
> >          skb_queue_splice_init(&trans->queue, &queue);
> >
> >          while ((skb = __skb_dequeue(&queue)))
> >                  XFRM_TRANS_SKB_CB(skb)->finish(dev_net(skb->dev), NULL, skb);
> > +
> > +        spin_unlock_irqrestore(&trans->queue.lock, flags);
> >  }
> >
> >  int xfrm_trans_queue(struct sk_buff *skb,
> > @@ -771,15 +775,20 @@ int xfrm_trans_queue(struct sk_buff *skb,
> >                                     struct sk_buff *))
> >  {
> >          struct xfrm_trans_tasklet *trans;
> > +        unsigned long flags;
> >
> >          trans = this_cpu_ptr(&xfrm_trans_tasklet);
> > +        spin_lock_irqsave(&trans->queue.lock, flags);
>
> As you can see above 'trans' is per cpu, so a spinlock
> is not needed here. Also this does not run in hard
> interrupt context, so irqsave is also not needed.
> I don't see how this can fix anything.
>
> Can you please explain that race a bit more detailed?
Re: [PATCH] xfrm : lock input tasklet skb queue
On Sun, Oct 20, 2019 at 08:46:10AM -0700, Tom Rix wrote:
> On PREEMPT_RT_FULL while running netperf, a corruption
> of the skb queue causes an oops.
>
> This appears to be caused by a race condition here
>         __skb_queue_tail(&trans->queue, skb);
>         tasklet_schedule(&trans->tasklet);
> Where the queue is changed before the tasklet is locked by
> tasklet_schedule.
>
> The fix is to use the skb queue lock.
>
> Signed-off-by: Tom Rix
> ---
>  net/xfrm/xfrm_input.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 9b599ed66d97..226dead86828 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -758,12 +758,16 @@ static void xfrm_trans_reinject(unsigned long data)
>          struct xfrm_trans_tasklet *trans = (void *)data;
>          struct sk_buff_head queue;
>          struct sk_buff *skb;
> +        unsigned long flags;
>
>          __skb_queue_head_init(&queue);
> +        spin_lock_irqsave(&trans->queue.lock, flags);
>          skb_queue_splice_init(&trans->queue, &queue);
>
>          while ((skb = __skb_dequeue(&queue)))
>                  XFRM_TRANS_SKB_CB(skb)->finish(dev_net(skb->dev), NULL, skb);
> +
> +        spin_unlock_irqrestore(&trans->queue.lock, flags);
>  }
>
>  int xfrm_trans_queue(struct sk_buff *skb,
> @@ -771,15 +775,20 @@ int xfrm_trans_queue(struct sk_buff *skb,
>                                     struct sk_buff *))
>  {
>          struct xfrm_trans_tasklet *trans;
> +        unsigned long flags;
>
>          trans = this_cpu_ptr(&xfrm_trans_tasklet);
> +        spin_lock_irqsave(&trans->queue.lock, flags);

As you can see above 'trans' is per cpu, so a spinlock
is not needed here. Also this does not run in hard
interrupt context, so irqsave is also not needed.
I don't see how this can fix anything.

Can you please explain that race a bit more detailed?
[PATCH] xfrm : lock input tasklet skb queue
On PREEMPT_RT_FULL while running netperf, a corruption
of the skb queue causes an oops.

This appears to be caused by a race condition here
        __skb_queue_tail(&trans->queue, skb);
        tasklet_schedule(&trans->tasklet);
Where the queue is changed before the tasklet is locked by
tasklet_schedule.

The fix is to use the skb queue lock.

Signed-off-by: Tom Rix
---
 net/xfrm/xfrm_input.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 9b599ed66d97..226dead86828 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -758,12 +758,16 @@ static void xfrm_trans_reinject(unsigned long data)
         struct xfrm_trans_tasklet *trans = (void *)data;
         struct sk_buff_head queue;
         struct sk_buff *skb;
+        unsigned long flags;

         __skb_queue_head_init(&queue);
+        spin_lock_irqsave(&trans->queue.lock, flags);
         skb_queue_splice_init(&trans->queue, &queue);

         while ((skb = __skb_dequeue(&queue)))
                 XFRM_TRANS_SKB_CB(skb)->finish(dev_net(skb->dev), NULL, skb);
+
+        spin_unlock_irqrestore(&trans->queue.lock, flags);
 }

 int xfrm_trans_queue(struct sk_buff *skb,
@@ -771,15 +775,20 @@ int xfrm_trans_queue(struct sk_buff *skb,
                                    struct sk_buff *))
 {
         struct xfrm_trans_tasklet *trans;
+        unsigned long flags;

         trans = this_cpu_ptr(&xfrm_trans_tasklet);
+        spin_lock_irqsave(&trans->queue.lock, flags);

-        if (skb_queue_len(&trans->queue) >= netdev_max_backlog)
+        if (skb_queue_len(&trans->queue) >= netdev_max_backlog) {
+                spin_unlock_irqrestore(&trans->queue.lock, flags);
                 return -ENOBUFS;
+        }

         XFRM_TRANS_SKB_CB(skb)->finish = finish;
         __skb_queue_tail(&trans->queue, skb);
         tasklet_schedule(&trans->tasklet);
+        spin_unlock_irqrestore(&trans->queue.lock, flags);

         return 0;
 }
 EXPORT_SYMBOL(xfrm_trans_queue);
--
2.23.0