Re: Request for net guru help: waitqueue oops
On Wed, 4 Oct 2000, Petko Manolov wrote:

> > The timer routines (there are 4) are used to switch hardware states and
> > must therefore be mutually exclusive with respect to the interrupt handler.
> > There are no bottom halves used in this driver. Andrew Morton suggested
> > that the problem could be in my use of the skb pointers, which seems
> > a likely candidate. I'll check that.
>
> It might be, but it might not. Be careful about locking and calling
> procedures which can sleep from interrupt context.
>
> Sorry if I am not specific enough, I haven't seen the code ;-)

I have found another driver in the standard kernel that also causes this
oops and have posted to linux-net (as this appears to be networking
related).

Thanks
-- Hans.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: Request for net guru help: waitqueue oops
Hans Grobler wrote:

> Ok. I originally had them outside locks as they appeared to be atomic. I
> moved them in in case they were the cause of the problem.

Don't bother about them - see include/linux/netdevice.h to be sure.

> The timer routines (there are 4) are used to switch hardware states and
> must therefore be mutually exclusive with respect to the interrupt handler.
> There are no bottom halves used in this driver. Andrew Morton suggested
> that the problem could be in my use of the skb pointers, which seems
> a likely candidate. I'll check that.

It might be, but it might not. Be careful about locking and calling
procedures which can sleep from interrupt context.

Sorry if I am not specific enough, I haven't seen the code ;-)

best,
Petkan
Re: Request for net guru help: waitqueue oops
On Tue, 3 Oct 2000, Petko Manolov wrote:

> None of these can sleep. netif_*_queue routines are quite simple.
> They are all atomic so there is no need to protect them with locks.

Ok. I originally had them outside locks as they appeared to be atomic. I
moved them in in case they were the cause of the problem.

> It is not clear from the example above if it is needed to lock in
> the timer routine and what is locked inside. Anyway be careful
> about locking regions shared between interrupts/bottom halves and
> user context as this happens often.

The timer routines (there are 4) are used to switch hardware states and
must therefore be mutually exclusive with respect to the interrupt handler.
There are no bottom halves used in this driver. Andrew Morton suggested
that the problem could be in my use of the skb pointers, which seems
a likely candidate. I'll check that.

Thanks
-- Hans
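The skb-pointer suspicion mentioned above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it assumes the buffer is reference counted and that a destructor callback runs when the last reference is dropped, which would be consistent with __kfree_skb appearing directly above __wake_up in the posted traces. The names model_buf, model_buf_put and so on are hypothetical and are not kernel APIs. The model is single threaded and keeps a "freed" flag purely so the mistake can be detected; a real allocator gives no such safety net.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a reference-counted packet buffer. */
struct model_buf {
    int users;                               /* number of owners */
    int freed;                               /* set once the last owner lets go */
    int destructor_calls;                    /* how often the destructor ran */
    void (*destructor)(struct model_buf *);  /* runs when users reaches zero */
};

/* Example destructor: just records that it was invoked. */
static void model_count_destructor(struct model_buf *b)
{
    b->destructor_calls++;
}

/* A fresh buffer has exactly one owner: whoever allocated it. */
static void model_buf_init(struct model_buf *b)
{
    assert(b != NULL);
    b->users = 1;
    b->freed = 0;
    b->destructor_calls = 0;
    b->destructor = model_count_destructor;
}

/*
 * Drop one reference. Returns 0 on success and -1 when the caller drops a
 * reference it no longer owns (the "freed twice" or "used after hand-off"
 * mistake). In the model the second drop is refused; without the check the
 * destructor would run again on stale data.
 */
static int model_buf_put(struct model_buf *b)
{
    if (b->freed || b->users <= 0)
        return -1;
    if (--b->users == 0) {
        if (b->destructor)
            b->destructor(b);
        b->freed = 1;
    }
    return 0;
}
```

In this model a driver that frees a buffer, or hands it to the stack, and then frees it again gets -1 from the second model_buf_put. The point is that the destructor must run once and only once per buffer.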
Re: Request for net guru help: waitqueue oops
Hans Grobler wrote:

> On Tue, 3 Oct 2000, Petko Manolov wrote:
>
> > It seems you're trying to sleep without process context (most likely in
> > net_tx_action). It would be more clear if you send that part of the
> > code.
>
> Since I don't explicitly sleep anywhere, I'm not sure which code fragment
> would be useful... (net_tx_action is part of the networking layers). Which
> network functions can sleep (netif_rx, netif_stop_queue, netif_wake_queue,
> ...) ?

None of these can sleep. netif_*_queue routines are quite simple.
They are all atomic so there is no need to protect them with locks.

> After reading the softnet HOWTO, and some of the network drivers, I
> was unsure about the netif_stop_queue and netif_wake_queue functions. The
> howto indicated that these two should be protected from concurrent
> execution by a private lock. Not all the drivers seem to do this. In my
> case (although I'm running UP at the moment), I've used a driver global
> spinlock, for example:
>
>   spinlock_t driver_lock = SPIN_LOCK_UNLOCKED;
>
>   int scc72_hard_xmit (struct sk_buff *skb, struct net_device *dev)
>   {
>           unsigned long flags;
>
>           /* ... */
>
>           spin_lock_irqsave (&driver_lock, flags);
>           netif_stop_queue (dev);
>           spin_unlock_irqrestore (&driver_lock, flags);
>
>           /* ... */
>   }
>
>   /* Example timer callback, to wake the queue */
>   void scc72_interframewait (unsigned long channel)
>   {
>           unsigned long flags;
>           struct scc72_channel *scc = (struct scc72_channel *) channel;
>
>           /* ... */
>
>           spin_lock_irqsave (&driver_lock, flags);
>
>           /* ... */
>
>           if (netif_queue_stopped (scc->dev))
>                   netif_wake_queue (scc->dev);
>
>           spin_unlock_irqrestore (&driver_lock, flags);
>   }

It is not clear from the example above if it is needed to lock in
the timer routine and what is locked inside. Anyway be careful
about locking regions shared between interrupts/bottom halves and
user context as this happens often.

best,
Petkan
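To make the "they are all atomic" remark concrete, here is a minimal user-space model. It assumes the queue helpers boil down to atomic bit operations on a per-device state word; that is an assumption for illustration, not a copy of the kernel header. All names (model_dev, MODEL_STATE_XOFF, model_stop_queue and so on) are hypothetical, and C11 atomics stand in for the kernel's bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "queue stopped" bit in a device state word. */
#define MODEL_STATE_XOFF (1u << 0)

struct model_dev {
    atomic_uint state;  /* models the per-device state word */
    atomic_uint kicks;  /* counts how often a wake would reschedule transmit */
};

static void model_dev_init(struct model_dev *dev)
{
    assert(dev != NULL);
    atomic_init(&dev->state, 0u);
    atomic_init(&dev->kicks, 0u);
}

/* Models stopping the queue: atomically set the XOFF bit. */
static void model_stop_queue(struct model_dev *dev)
{
    atomic_fetch_or(&dev->state, MODEL_STATE_XOFF);
}

/* Models the "is it stopped?" query: atomically read the XOFF bit. */
static bool model_queue_stopped(struct model_dev *dev)
{
    return (atomic_load(&dev->state) & MODEL_STATE_XOFF) != 0;
}

/*
 * Models waking the queue: atomically clear XOFF. Only the caller that
 * actually cleared the bit schedules a transmit, so two racing wakes
 * produce exactly one kick without any extra lock.
 */
static void model_wake_queue(struct model_dev *dev)
{
    unsigned int old = atomic_fetch_and(&dev->state, ~MODEL_STATE_XOFF);

    if (old & MODEL_STATE_XOFF)
        atomic_fetch_add(&dev->kicks, 1u);
}
```

Under that assumption, the separate netif_queue_stopped test before netif_wake_queue in the quoted timer callback is harmless but redundant in the model, since the wake already does a test-and-clear. The driver lock is still needed for whatever hardware state the timer and the interrupt handler share.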
Re: Request for net guru help: waitqueue oops
Hi Petkan,

Thanks for your comment.

On Tue, 3 Oct 2000, Petko Manolov wrote:

> > A driver I'm working on seems to be doing/triggering something related
> > to waitqueues. This causes a perfectly reproducible oops (small mercies!).
> > Since the oops is not happening in my driver, I'm having a hard time
> > figuring out what's going wrong. I suspect a networking guru will take
> > one look and know what I'm doing wrong. Any suggestions please?
>
> It seems you're trying to sleep without process context (most likely in
> net_tx_action). It would be more clear if you send that part of the
> code.

Since I don't explicitly sleep anywhere, I'm not sure which code fragment
would be useful... (net_tx_action is part of the networking layers). Which
network functions can sleep (netif_rx, netif_stop_queue, netif_wake_queue,
...) ?

After reading the softnet HOWTO, and some of the network drivers, I
was unsure about the netif_stop_queue and netif_wake_queue functions. The
howto indicated that these two should be protected from concurrent
execution by a private lock. Not all the drivers seem to do this. In my
case (although I'm running UP at the moment), I've used a driver global
spinlock, for example:

  spinlock_t driver_lock = SPIN_LOCK_UNLOCKED;

  int scc72_hard_xmit (struct sk_buff *skb, struct net_device *dev)
  {
          unsigned long flags;

          /* ... */

          spin_lock_irqsave (&driver_lock, flags);
          netif_stop_queue (dev);
          spin_unlock_irqrestore (&driver_lock, flags);

          /* ... */
  }

  /* Example timer callback, to wake the queue */
  void scc72_interframewait (unsigned long channel)
  {
          unsigned long flags;
          struct scc72_channel *scc = (struct scc72_channel *) channel;

          /* ... */

          spin_lock_irqsave (&driver_lock, flags);

          /* ... */

          if (netif_queue_stopped (scc->dev))
                  netif_wake_queue (scc->dev);

          spin_unlock_irqrestore (&driver_lock, flags);
  }

I've just checked my driver, and below is the list of all the external
functions called. Any idea which of these could be trying to sleep?

  dev_kfree_skb_any    (called from both hard IRQ and non IRQ context)
  dev_alloc_skb        (called from both hard IRQ and non IRQ context)
  del_timer            (called from both hard IRQ and non IRQ context)
  add_timer            (called from both hard IRQ and non IRQ context)
  netif_rx             (called from IRQ context)
  netif_start_queue    (called from non hard IRQ context, ex: dev_open)
  netif_stop_queue     (called from non hard IRQ context, ex: hard_start_xmit)
  netif_wake_queue     (called from non hard IRQ context, ex: timer callbacks)
  netif_queue_stopped  (called from non hard IRQ context, ex: timer callbacks)
  skb_queue_tail       (called from non hard IRQ context, ex: hard_start_xmit)
  skb_dequeue          (called from both hard IRQ and non IRQ context)
  skb_queue_head_init  (called from non hard IRQ context, ex: dev_open)

and the standard functions dev_init_buffers, register_netdevice,
copy_from_user, unregister_netdev, etc. called in the standard places.

skb_queue_tail, skb_dequeue and skb_queue_head_init are used to manage an
internal queue of outgoing skb's.

Thanks.
-- Hans
Re: Request for net guru help: waitqueue oops
Hans Grobler wrote:

> Hi all,
>
> A driver I'm working on seems to be doing/triggering something related
> to waitqueues. This causes a perfectly reproducible oops (small mercies!).
> Since the oops is not happening in my driver, I'm having a hard time
> figuring out what's going wrong. I suspect a networking guru will take
> one look and know what I'm doing wrong. Any suggestions please?

It seems you're trying to sleep without process context (most likely in
net_tx_action). It would be more clear if you send that part of the
code.

best,
Petkan
Request for net guru help: waitqueue oops
Hi all,

A driver I'm working on seems to be doing/triggering something related
to waitqueues. This causes a perfectly reproducible oops (small mercies!).
Since the oops is not happening in my driver, I'm having a hard time
figuring out what's going wrong. I suspect a networking guru will take
one look and know what I'm doing wrong. Any suggestions please?

Initially, I was getting the first oops below. After browsing the
waitqueue code, I found and enabled the WAITQUEUE_DEBUG define. Now I'm
getting the second oops. The values 8729, 8731 in eax ebx ecx (first oops)
and in the magic & creator fields (second oops) look very weird...
something incrementing...

In my driver I have all pointers protected by magic numbers. These are
validated before every use (will do a BUG() on invalid pointer).

TIA
-- Hans.

---[ OOPS1 ]--

ksymoops 2.3.4 on i686 2.4.0-test9.  Options used
     -v /usr/src/linux/vmlinux (specified)
     -k ./ksyms (specified)
     -l ./modules (specified)
     -o /lib/modules/2.4.0-test9 (specified)
     -m /usr/src/linux/System.map (specified)

Unable to handle kernel paging request at virtual address 8731
c0113a70
*pde =
Oops:
CPU:0
EIP:0010:[<c0113a70>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010003
eax: 8729   ebx: 8731   ecx: 8731   edx: 0021
esi:        edi: 000d   ebp: c0231f40   esp: c0231f1c
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=c0231000)
Stack: c3fc59a0 c3fa8800 000d 0110 8731 c17aec6c 0246 0001
       0021 c0231fa4 c01a5155 c3fc59a0 c01a4a53 c3fc59a0 c01a55d0 c3fa8800
       000d c010a00d c01a7129 c3fa8800 0001 c0269c08 000d
Call Trace: [<c01a5155>] [<c01a4a53>] [<c01a55d0>] [<c010a00d>] [<c01a7129>]
       [<c01192ee>] [<c010a1a8>] [<c0107160>] [<c0107160>] [<c0108df0>]
       [<c0107160>] [<c0107160>] [<c0100018>] [<c0107183>] [<c01071e4>]
       [<c0105000>] [<c0100192>]
Code: 8b 1b 89 5d ec 8b 48 04 8b 11 89 d0 24 df 85 45 fc 0f 84 79

>>EIP; c0113a70 <__wake_up+50/144>   <=====
Trace; c01a5155 <sock_def_write_space+2d/74>
Trace; c01a4a53 <sock_wfree+17/30>
Trace; c01a55d0 <__kfree_skb+7c/11c>
Trace; c010a00d <handle_IRQ_event+31/5c>
Trace; c01a7129 <net_tx_action+45/a0>
Trace; c01192ee <do_softirq+4e/74>
Trace; c010a1a8 <do_IRQ+9c/ac>
Trace; c0107160 <default_idle+0/28>
Trace; c0107160 <default_idle+0/28>
Trace; c0108df0 <ret_from_intr+0/20>
Trace; c0107160 <default_idle+0/28>
Trace; c0107160 <default_idle+0/28>
Trace; c0100018 <startup_32+18/13a>
Trace; c0107183 <default_idle+23/28>
Trace; c01071e4 <cpu_idle+3c/50>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0100192 <L6+0/2>
Code;  c0113a70 <__wake_up+50/144>
<_EIP>:
Code;  c0113a70 <__wake_up+50/144>   <=====
   0:   8b 1b                     mov    (%ebx),%ebx   <=====
Code;  c0113a72 <__wake_up+52/144>
   2:   89 5d ec                  mov    %ebx,0xffec(%ebp)
Code;  c0113a75 <__wake_up+55/144>
   5:   8b 48 04                  mov    0x4(%eax),%ecx
Code;  c0113a78 <__wake_up+58/144>
   8:   8b 11                     mov    (%ecx),%edx
Code;  c0113a7a <__wake_up+5a/144>
   a:   89 d0                     mov    %edx,%eax
Code;  c0113a7c <__wake_up+5c/144>
   c:   24 df                     and    $0xdf,%al
Code;  c0113a7e <__wake_up+5e/144>
   e:   85 45 fc                  test   %eax,0xfffc(%ebp)
Code;  c0113a81 <__wake_up+61/144>
  11:   0f 84 79 00 00 00         je     90 <_EIP+0x90> c0113b00 <__wake_up+e0/144>

Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!

---[ OOPS2 ]--

ksymoops 2.3.4 on i686 2.4.0-test9.  Options used
     -v /usr/src/linux/vmlinux (specified)
     -k ./ksyms (specified)
     -l ./modules (specified)
     -o /lib/modules/2.4.0-test9 (specified)
     -m /usr/src/linux/System.map (specified)

bad magic 8722 (should be c2dfbbd4, creator 8723), wq bug, forcing oops.
kernel BUG at /usr/src/linux/include/linux/wait.h:155!
invalid operand:
CPU:0
EIP:0010:[<c01b3715>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010296
eax: 0037   ebx: c2dfbbc8   ecx: c0240b48   edx:
esi: c3bbe060   edi: 000d   ebp: c0253fa4   esp: c0253f34
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=c0253000)
Stack: c02291e4 c02291c0 009b c3bbe060 c3f87260 c01b2ea7 c3bbe060
       c01b3bc0 c3f87260 000d c01b582a c3f87260 0001 c028bc08 000d
       c0253fa4 c011b1ae c028bc08 00a0 c02839a0 0005 c010a4a5
Call Trace: [<c02291e4>] [<c02291c0>] [<c01b2ea7>] [<c01b3bc0>] [<c01b582a>]
       [<c011b1ae>] [<c010a4a5>] [<c0107160>] [<c0107160>] [<c010902c>]
       [<c0107160>] [<c0107160>] [<c0100018>] [<c0107183>] [<c01071e4>]
       [<c0105000>] [<c0100192>]
Code: 0f 0b 83 c4 0c 8d b6 00 00 00 00 8d 43 04 39 43 04 74 0d 8b

>>EIP; c01b3715 <sock_def_write_space+5d/c4>   <=====
Trace; c02291e4 <RCSid+6ee4/9360>
Trace; c02291c0 <RCSid+6ec0/9360>
Trace; c01b2ea7
Trace; c01b3bc0 <__kfree_skb+7c/11c>
Trace; c01b582a
Trace; c011b1ae
Trace; c010a4a5
Trace; c0107160
Trace; c0107160
Trace; c010902c
Trace; c0107160
Trace; c0107160
Trace; c0100018
Trace; c0107183
Trace; c01071e4
Trace; c0105000
Trace; c0100192
Code;  c01b3715 <sock_def_write_space+5d/c4>
<_EIP>:
Code;  c01b3715 <sock_def_write_space+5d/c4>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01b3717
   2:   83 c4 0c                  add    $0xc,%esp
Code;  c01b371a
   5:   8d b6 00 00 00 00         lea    0x0(%esi),%esi
Code;  c01b3720
   b:   8d 43 04                  lea    0x4(%ebx),%eax
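The "bad magic ... (should be c2dfbbd4 ...)" line is easier to read with a model of how such a check might work. The sketch below assumes a scheme in which the magic field must hold its own address. That assumption fits the message above, where the expected value is a kernel address just past ebx (c2dfbbc8), but it is not taken from the kernel source. The names model_wq_head, model_wq_init and model_wq_valid are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a wait queue head guarded by a self-referential magic. */
struct model_wq_head {
    void *first_waiter;  /* placeholder for the real list of sleepers */
    uintptr_t magic;     /* valid only when equal to the address of this field */
};

/* Initialise the head and stamp it with the address of its own magic field. */
static void model_wq_init(struct model_wq_head *q)
{
    assert(q != NULL);
    q->first_waiter = NULL;
    q->magic = (uintptr_t)&q->magic;
}

/*
 * A head is trusted only if the magic still points at itself. Memory that
 * was never a wait queue head, or that has since been overwritten or
 * reused, fails this test almost certainly.
 */
static bool model_wq_valid(const struct model_wq_head *q)
{
    return q->magic == (uintptr_t)&q->magic;
}
```

Read this way, a magic of 8722 with a creator of 8723 suggests that the pointer handed to the wake-up code does not refer to an initialised wait queue head at all, but to memory holding small, incrementing values. A side effect of the self-referential scheme is that a head copied by value to a new address also fails the check.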