Re: auro deadlock

2006-07-10 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> The lockdep fixes are starting to cause us to go in and start adding 
> hard IRQ protection to many socket layer objects and I want this 
> thinking to end quickly :)

In earlier lockdep versions we had many such hacks, but in the current 
upstream kernel we've only got one such case so far: one netlink 
function, where the alternative was to rewrite softmac. I fixed all the 
earlier hacks to be proper annotations - and the plan is to keep things 
clean in the future too :-)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: auro deadlock (was Re: e100 lockdep irq lock inversion.)

2006-07-07 Thread Herbert Xu
Arjan van de Ven <[EMAIL PROTECTED]> wrote:
> 
> Act 1
> 
> Enter the mpi_start_xmit() function, which is airo's xmit function.
> This function takes the aux_lock first, with irq's off, then calls
> skb_queue_tail(). skb_queue_tail takes the sk_receive_queue.lock (with
> irqsave as well).

Nope, make that ai->txq.

> Act 2
> 
> Enter the ipcalc program. This program calls an ioctl, which ends up
> calling udp_ioctl. udp_ioctl does
>   spin_lock_bh(&sk->sk_receive_queue.lock);

Different queue.

So no deadlock.  Better luck next time :)
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: auro deadlock

2006-07-07 Thread Arjan van de Ven
On Fri, 2006-07-07 at 12:09 -0700, David Miller wrote:
> From: Arjan van de Ven <[EMAIL PROTECTED]>
> Date: Fri, 07 Jul 2006 20:13:09 +0200
> 
> > Now a question for netdev: what is the interrupt-or-softirq rules for
> > the sk_receive_queue.lock?
> > 
> > Anyway, the patch below fixes this deadlock; it may or may not be the
> > correct solution depending on the netdev answer, but the deadlock is
> > gone ;)
> 
> The lockdep fixes are starting to cause us to go in and start adding
> hard IRQ protection to many socket layer objects and I want this
> thinking to end quickly :)

that's why I asked the question ;)


> To reiterate, nothing socket or SKB level should be taking anything
> deeper than software IRQ locking.
> 
> If drivers manage local SKB queues in hard IRQ context, they need to
> use a seperate lockdep identifier for that queue's lock.

I'm not so sure that;s the case here, but.. if you have time today I
hope you can take a look at this one with a wider "network view" than I
can..


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: auro deadlock

2006-07-07 Thread David Miller
From: Arjan van de Ven <[EMAIL PROTECTED]>
Date: Fri, 07 Jul 2006 20:13:09 +0200

> Now a question for netdev: what is the interrupt-or-softirq rules for
> the sk_receive_queue.lock?
> 
> Anyway, the patch below fixes this deadlock; it may or may not be the
> correct solution depending on the netdev answer, but the deadlock is
> gone ;)

The lockdep fixes are starting to cause us to go in and start adding
hard IRQ protection to many socket layer objects and I want this
thinking to end quickly :)

The netlink wireless fix is another example of this, but I accept the
temporary fix for that one for the time being.

To reiterate, nothing socket or SKB level should be taking anything
deeper than software IRQ locking.

If drivers manage local SKB queues in hard IRQ context, they need to
use a seperate lockdep identifier for that queue's lock.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


auro deadlock (was Re: e100 lockdep irq lock inversion.)

2006-07-07 Thread Arjan van de Ven
On Fri, 2006-07-07 at 13:19 -0400, Dave Jones wrote:
> Another one triggered by a Fedora-development user..
> 
> e100: eth1: e100_watchdog: link up, 100Mbps, half-duplex
> 
> =
> [ INFO: possible irq lock inversion dependency detected ]
> -
> ipcalc/1671 just changed the state of lock:
>  (&skb_queue_lock_key){-+..}, at: [] udp_ioctl+0x3b/0x6e
> but this lock was taken by another, hard-irq-safe lock in the past:
>  (&ai->aux_lock){+...}
> 
> and interrupts could create inverse lock ordering between them.


ok the situation is this:

the airo driver has a per card aux_lock.

This lock is used in the hardirq handler, and thus all uses need to be
irqsave (and they are)

Act 1

Enter the mpi_start_xmit() function, which is airo's xmit function.
This function takes the aux_lock first, with irq's off, then calls
skb_queue_tail(). skb_queue_tail takes the sk_receive_queue.lock (with
irqsave as well).

Act 2

Enter the ipcalc program. This program calls an ioctl, which ends up
calling udp_ioctl. udp_ioctl does
   spin_lock_bh(&sk->sk_receive_queue.lock);


This is a deadlock

For brevity, lock A is the aux_lock, lock B is the sk_receive_queue.lock

CPU 0   CPU 1
user contextuser context
udp_ioctl takes lock B with _bh
(leaving irqs enabled)

mpi_start_function takes
lock A with _irqsave
mpi_start_function takes
lock B with _irqsave and spins

interrupt happens
the hard irq handler takes lock A
 and spins for cpu 0

eg a classic AB-BA deadlock.


Now a question for netdev: what is the interrupt-or-softirq rules for
the sk_receive_queue.lock?

Anyway, the patch below fixes this deadlock; it may or may not be the
correct solution depending on the netdev answer, but the deadlock is
gone ;)

Signed-off-by: Arjan van de Ven

Index: linux-2.6.18-rc1/net/ipv4/udp.c
===
--- linux-2.6.18-rc1.orig/net/ipv4/udp.c
+++ linux-2.6.18-rc1/net/ipv4/udp.c
@@ -725,6 +725,7 @@ out:
  
 int udp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 {
+   unsigned long flags;
switch(cmd) 
{
case SIOCOUTQ:
@@ -739,7 +740,7 @@ int udp_ioctl(struct sock *sk, int cmd, 
unsigned long amount;
 
amount = 0;
-   spin_lock_bh(&sk->sk_receive_queue.lock);
+   spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
skb = skb_peek(&sk->sk_receive_queue);
if (skb != NULL) {
/*
@@ -749,7 +750,7 @@ int udp_ioctl(struct sock *sk, int cmd, 
 */
amount = skb->len - sizeof(struct udphdr);
}
-   spin_unlock_bh(&sk->sk_receive_queue.lock);
+   spin_unlock_irqrestore(&sk->sk_receive_queue.lock, 
flags);
return put_user(amount, (int __user *)arg);
}
 


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html