On 12/05/2010 10:54 AM, Michal Sojka wrote:
> thanks for the tip. Now, with "Spinlock debugging: sleep-inside-spinlock
> checking" I get the following, which seems a bit more useful:
> 
>     BUG: sleeping function called from invalid context at /home/wsh/projects/can-benchmark/kernel/2.6.36/mm/slab.c:3101
>     in_atomic(): 1, irqs_disabled(): 0, pid: 379, name: cangw
>     Call Trace:
>     [c7abdb50] [c0009c04] show_stack+0xb0/0x1d4 (unreliable)
>     [c7abdba0] [c02ffe54] dump_stack+0x2c/0x44
>     [c7abdbb0] [c002157c] __might_sleep+0xfc/0x124
>     [c7abdbc0] [c00c45bc] kmem_cache_alloc+0x15c/0x180
>     [c7abdbf0] [c02d60d8] can_rx_register+0x78/0x210
>     [c7abdc30] [c02d9ebc] cgw_create_job+0x1cc/0x220
>     [c7abdc50] [c026c208] rtnetlink_rcv_msg+0x21c/0x28c
>     [c7abdc70] [c02762a4] netlink_rcv_skb+0xb8/0x100
>     [c7abdc90] [c026bfd0] rtnetlink_rcv+0x40/0x5c
>     [c7abdcb0] [c0275e9c] netlink_unicast+0x320/0x368
>     [c7abdd00] [c0276a0c] netlink_sendmsg+0x2e0/0x33c
>     [c7abdd50] [c0244cc0] sock_sendmsg+0x9c/0xd4
>     [c7abde20] [c0247194] sys_sendto+0xcc/0x108
>     [c7abdf00] [c0248980] sys_socketcall+0x17c/0x218
>     [c7abdf40] [c0012524] ret_from_syscall+0x0/0x38
>     --- Exception: c01 at 0xff3413c
>         LR = 0x10001f9c
>     BUG: spinlock bad magic on CPU#0, swapper/0
>      lock: c798aabc, .magic: c0000000, .owner: <none>/-1, .owner_cpu: -1069424200
>     Call Trace:
>     [c7ffbd00] [c0009c04] show_stack+0xb0/0x1d4 (unreliable)
>     [c7ffbd50] [c02ffe54] dump_stack+0x2c/0x44
>     [c7ffbd60] [c01afa50] spin_bug+0x84/0xd0
>     [c7ffbd80] [c01afbfc] do_raw_spin_lock+0x3c/0x15c
>     [c7ffbdb0] [c02ff70c] _raw_spin_lock+0x34/0x4c
>     [c7ffbdd0] [c025bbdc] dev_queue_xmit+0xa4/0x428
>     [c7ffbe00] [c02d630c] can_send+0x9c/0x1a0
>     [c7ffbe20] [c02d991c] can_can_gw_rcv+0x108/0x164
>     [c7ffbe50] [c02d53b4] can_rcv_filter+0xf8/0x2e8
>     [c7ffbe70] [c02d566c] can_rcv+0xc8/0x140
>     [c7ffbe90] [c025a0d0] __netif_receive_skb+0x2cc/0x338
>     [c7ffbed0] [c025a314] netif_receive_skb+0x5c/0x98
>     [c7ffbef0] [c0208374] mscan_rx_poll+0x1c0/0x454
>     [c7ffbf50] [c025a644] net_rx_action+0x104/0x230
>     [c7ffbfa0] [c00317a8] __do_softirq+0x118/0x22c
>     [c7ffbff0] [c0011eec] call_do_softirq+0x14/0x24
>     [c042fe60] [c0006d78] do_softirq+0x84/0xa8
>     [c042fe80] [c00314cc] irq_exit+0x88/0xb4
>     [c042fe90] [c0006efc] do_IRQ+0xe0/0x234
>     [c042fec0] [c0012bbc] ret_from_except+0x0/0x14
>     --- Exception: 501 at cpu_idle+0xfc/0x10c
>         LR = cpu_idle+0xfc/0x10c
>     [c042ff80] [c000afb8] cpu_idle+0x68/0x10c (unreliable)
>     [c042ffa0] [c0003ec0] rest_init+0x9c/0xbc
>     [c042ffc0] [c03da91c] start_kernel+0x2c0/0x2d8
>     [c042fff0] [00003438] 0x3438
> 
> I tried to fix the sleeping call with the following patches, but the
> original problem still appears.
> 
> diff --git a/net/can/gw.c b/net/can/gw.c
> index 94ba3f1..7779ca6 100644
> --- a/net/can/gw.c
> +++ b/net/can/gw.c
> @@ -822,11 +822,14 @@ static int cgw_create_job(struct sk_buff *skb,  struct nlmsghdr *nlh,
>         if (gwj->dst.dev->type != ARPHRD_CAN)
>                 goto put_src_dst_out;
>                 
> -       spin_lock(&cgw_list_lock);
>  
>         err = cgw_register_filter(gwj);
> -       if (!err)
> -               hlist_add_head_rcu(&gwj->list, &cgw_list);
> +       if (err)
> +               goto put_src_dst_out;
> +
> +       spin_lock(&cgw_list_lock);
> +
> +       hlist_add_head_rcu(&gwj->list, &cgw_list);
>  
>         spin_unlock(&cgw_list_lock);

The fix looks good!
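
To spell out why the reordering removes the "sleeping function called from
invalid context" report, here is a minimal userspace sketch. It is a toy
model, not kernel code: every toy_* name is hypothetical, the "spinlock" is
only a counter, and the sleep check is a rough stand-in for what
__might_sleep() does. It assumes, as the first trace suggests, that the
filter registration ends up in an allocation that may sleep.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy model (hypothetical names): number of "spinlocks" currently held. */
static int toy_atomic_depth;
/* How often a possibly-sleeping call was made while a lock was held. */
static int toy_sleep_violations;

static void toy_spin_lock(void)   { toy_atomic_depth++; }
static void toy_spin_unlock(void) { toy_atomic_depth--; }

/* Stand-in for an allocation that is allowed to sleep (GFP_KERNEL-like). */
static void *toy_alloc_may_sleep(size_t size)
{
    if (toy_atomic_depth > 0)
        toy_sleep_violations++;   /* the kernel would print the BUG here */
    return malloc(size);
}

struct toy_job {
    struct toy_job *next;
    void *filter;
};

static struct toy_job *toy_job_list;

/* Stand-in for cgw_register_filter(): it allocates, so it may sleep. */
static int toy_register_filter(struct toy_job *job)
{
    job->filter = toy_alloc_may_sleep(16);
    return job->filter ? 0 : -1;
}

/* Ordering before the patch: the filter is registered with the lock held. */
static int toy_create_job_locked(struct toy_job *job)
{
    int err;

    toy_spin_lock();
    err = toy_register_filter(job);
    if (!err) {
        job->next = toy_job_list;
        toy_job_list = job;
    }
    toy_spin_unlock();
    return err;
}

/* Ordering after the patch: register first, lock only for the list insert. */
static int toy_create_job_reordered(struct toy_job *job)
{
    int err = toy_register_filter(job);

    if (err)
        return err;

    toy_spin_lock();
    job->next = toy_job_list;
    toy_job_list = job;
    toy_spin_unlock();
    return 0;
}
```

Calling toy_create_job_locked() bumps toy_sleep_violations, while
toy_create_job_reordered() leaves it untouched, because the allocation now
happens before the lock is taken.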

> My second attempt was:
> 
> diff --git a/net/can/af_can.c b/net/can/af_can.c
> index 702be5a..b046ff0 100644
> --- a/net/can/af_can.c
> +++ b/net/can/af_can.c
> @@ -418,7 +418,7 @@ int can_rx_register(struct net_device *dev, canid_t can_id, canid_t mask,
>         if (dev && dev->type != ARPHRD_CAN)
>                 return -ENODEV;
>  
> -       r = kmem_cache_alloc(rcv_cache, GFP_KERNEL);
> +       r = kmem_cache_alloc(rcv_cache, GFP_ATOMIC);
>         if (!r)
>                 return -ENOMEM;
>  
> With both patches I still get the original panic (now preceded by
> spinlock bad magic):
>                 
>     BUG: spinlock bad magic on CPU#0, swapper/0
>      lock: c7986abc, .magic: c0000000, .owner: <none>/-1, .owner_cpu: -1069424200
                               ^^^^^^^^
BTW:
0xc0000000 is the boundary between the userspace and kernel mappings. In a
standard 3G/1G split, userspace runs from 0x00000000 to 0xbfffffff and the
kernel from 0xc0000000 to 0xffffffff.
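
As a quick illustration of that split, here is a small C sketch that
classifies 32-bit values from the report. The toy_* names are hypothetical,
and the base address is simply the 3G/1G boundary assumed above, not read
from any kernel configuration. Reinterpreting the signed .owner_cpu value as
unsigned is plain arithmetic; what it means for the lock contents is
left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed start of the kernel mapping for a 3G/1G split on 32-bit. */
#define TOY_KERNEL_BASE UINT32_C(0xc0000000)

/* True if addr lies in the kernel part of the assumed 3G/1G split. */
static bool toy_is_kernel_addr(uint32_t addr)
{
    return addr >= TOY_KERNEL_BASE;
}

/*
 * Reinterpret a signed 32-bit field as its raw unsigned bit pattern.
 * The conversion is well defined in C (reduction modulo 2^32).
 */
static uint32_t toy_as_u32(int32_t value)
{
    return (uint32_t)value;
}
```

With these helpers, toy_as_u32(-1069424200) yields 0xc041e1b8, and
toy_is_kernel_addr() reports that value, the .magic value 0xc0000000 and the
lock address 0xc7986abc as kernel-range addresses, while 0xbfffffff is the
last userspace address.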

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

