Re: kernel BUG at mm/rmap.c:631!
On Sun, Feb 17, 2008 at 09:16:36PM +0100, thus spake Rafael J. Wysocki:
> On Sunday, 17 of February 2008, Ignacy Gawedzki wrote:
> > Hi,
>
> Hi,
>
> > I was printing on the parallel port and suddenly the "parallel" CUPS backend
> > went 50% CPU (obviously endless-looping), while the other 50% were eaten by
> > ghostscript (strace didn't show anything, so this might be an "internal"
> > loop). When I eventually killed the latter, I got this:
>
> Which kernel is this?

As is shown in the dmesg, it is 2.6.24.1.

> Is it a regression?

Can't really say for sure. At least it already happened with 2.6.23.9.

> If so, what's the last known
> working kernel?

This is really difficult to determine, since the event is pretty hard to
reproduce. I'll try to investigate more, then. :/

This one happened pretty much right after a reboot due to a completely frozen
machine (no Oops or Eeek whatsoever) apparently due to intensive writing to
the parallel port (the kernel complained that "FIFO write timed out" twice
before locking up).

Of course I do suspect a hardware problem, but since last time I had similarly
strange things it ended up being due to misconfiguration, I still hope someone
will tell me this is also the case here.

--
:wq!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
kernel BUG at mm/rmap.c:631!
Hi,

I was printing on the parallel port and suddenly the "parallel" CUPS backend
went 50% CPU (obviously endless-looping), while the other 50% were eaten by
ghostscript (strace didn't show anything, so this might be an "internal"
loop). When I eventually killed the latter, I got this:

Eeek! page_mapcount(page) went negative! (-1)
  page pfn = 3
  page->flags = 80014
  page->count = 0
  page->mapping =
  vma->vm_ops = _stext+0x3feff000/0x14
[ cut here ]
kernel BUG at mm/rmap.c:631!
invalid opcode: [#1]
Modules linked in: cls_fw sch_prio sch_htb iptable_nat xt_limit xt_state ipt_REJECT xt_tcpudp ipt_LOG xt_DSCP xt_dscp xt_mark nf_conntrack_ipv4 xt_CONNMARK xt_MARK iptable_mangle iptable_filter ip_tables x_tables aes_i586 geode_aes aes_generic ieee80211_crypt_ccmp lirc_dev nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack ipv6 evdev hostap_pci hostap ieee80211_crypt i2c_viapro via686a ide_cd
Pid: 5098, comm: gs Tainted: GF (2.6.24.1 #9)
EIP: 0060:[c01443f1] EFLAGS: 00010246 CPU: 0
EIP is at page_remove_rmap+0xe4/0x111
EAX: EBX: c160 ECX: 0046 EDX: 5a52
ESI: e90d3f44 EDI: ea61b720 EBP: b700 ESP: eefc7e00
 DS: 007b ES: 007b FS: GS: SS: 0068
Process gs (pid: 5098, ti=eefc6000 task=ea6fd570 task.ti=eefc6000)
Stack: c0398eb6 c160 b6dc8000 c013f6c8 326b e90d3f44 eefc7e74 0001 ef371b6c ef1073a0 c0454f98 ffa0 ef371b6c 000eaeb1 b70bc000 eefc7e74 e90d3860 ef1073a0 eefc7f10
Call Trace:
 [c013f6c8] unmap_vmas+0x23e/0x403
 [c0141c3e] exit_mmap+0x5f/0xc9
 [c0116a82] mmput+0x1b/0x5e
 [c011a8f8] do_exit+0x1ad/0x5ae
 [c011ad4a] sys_exit_group+0x0/0xd
 [c0120962] get_signal_to_deliver+0x370/0x380
 [c02cf89e] net_rx_action+0x70/0x144
 [c026d8bb] intr_handler+0x9c/0xcf
 [c0111f67] do_page_fault+0x0/0x52d
 [c0103326] do_notify_resume+0x81/0x5c0
 [c013ff5d] handle_mm_fault+0x70/0x49d
 [c010455b] common_interrupt+0x23/0x28
 [c01120f3] do_page_fault+0x18c/0x52d
 [c0336dc2] schedule+0x1f3/0x20d
 [c0111f67] do_page_fault+0x0/0x52d
 [c0103c4e] work_notifysig+0x13/0x19
 [c033] rpc_info_open+0x17/0x6a
 ===
Code: 8b 46 40 8b 50 08 b8 05 8f 39 c0 e8 08 df fe ff 8b 46 48 85 c0 74 14 8b 40 10 85 c0 74 0d 8b 50 2c b8 23 8f 39 c0 e8 ed de fe ff <0f> 0b eb fe 8b 53 10 8b 03 83 e2 01 f7 da c1 e8 1e 83 c2 04 69
EIP: [c01443f1] page_remove_rmap+0xe4/0x111 SS:ESP 0068:eefc7e00
---[ end trace 42d12388f65d0f6f ]---
Fixing recursive fault but reboot is needed!

Apparently this happened to me in the near past, but I didn't have any
netconsole facility enabled at that time to capture the message.

Anybody has any idea where this might have come from?

--
Sex on TV doesn't hurt unless you fall off.
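For anyone who wants to compare traces like the one above across reboots, here is a minimal sketch of splitting a "Call Trace:" entry into its parts. The regular expression and the function name parse_trace_line are only illustrative assumptions, not anything the kernel ships; the sketch accepts an empty "[]" address as well, since some copies of this trace lost theirs.

```python
import re

# One call-trace entry per line, e.g.
#   "[c013f6c8] unmap_vmas+0x23e/0x403"
#   "[f090ffca] hostap_rx_tasklet+0x11f/0x145 [hostap_pci]"
# The bracketed address may be empty, and the module suffix is optional.
ENTRY = re.compile(
    r"^\s*(?:\[<?(?P<addr>[0-9a-f]*)>?\]\s*)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def parse_trace_line(line):
    """Return a dict describing one trace entry, or None if the line is not one."""
    m = ENTRY.match(line)
    if m is None:
        return None
    return {
        "addr": m.group("addr") or None,       # None when the address is missing
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),  # offset into the function
        "size": int(m.group("size"), 16),      # total size of the function
        "module": m.group("module"),           # None for built-in symbols
    }


if __name__ == "__main__":
    for text in ("[c013f6c8] unmap_vmas+0x23e/0x403", "[] mmput+0x1b/0x5e"):
        print(parse_trace_line(text))
```

The symbol+offset/size form is the useful part for comparison: the absolute addresses change with every rebuild, while the offsets into a function usually stay the same for a given kernel configuration.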
Re: Oops with hostap_pci (?)
On Mon, Feb 11, 2008 at 04:19:35AM +0100, thus spake Ignacy Gawedzki:
> Hi,
>
> A few days back I started having strange lockups on a gateway machine so I
> started looking at things. Then I compiled the 2.6.24.1 kernel and started
> having oopses not long after upping the wlan0 (hostap_pci) interface.
>
> So I enabled netconsole and got a few logs. Now the sad point is that I'm
> getting an oops even with my older kernel which used to be fine (2.6.23.9). I
> also checked with 2.6.24 and the effects are the same: I boot, I up the wlan0
> interface and a few seconds or minutes later, boom! Sometimes only rmmod'ing
> hostap_pci triggers the oops. I'm suspecting some hardware problem and have
> already checked the ram with memtest86+ and tested with only one memory module
> out of two plugged: same thing.
>
> If anybody could take a look at these and shed some light on that issue...

Okay, false alarm... it's all my fault. :/

The cause of the problem was my previous tampering with udev rules. The udev
rules as such (on Ubuntu Gutsy) were bad for hostapd, since persistent rules
were written for the wlan0ap interface name created by hostapd. So I changed
a few things that had the unexpected effect of renaming the initial
hostap_pci's wifi0 into wlan0ap. This in turn made hostap_pci oops in many
cases.

Anyway, I've modified my udev rules again and hopefully this will be it. =)

--
"The whole problem with the world is that fools and fanatics are always so
certain of themselves, and wiser people so full of doubts."
- Bertrand Russell
Oops with hostap_pci (?)
Hi,

A few days back I started having strange lockups on a gateway machine so I
started looking at things. Then I compiled the 2.6.24.1 kernel and started
having oopses not long after upping the wlan0 (hostap_pci) interface.

So I enabled netconsole and got a few logs. Now the sad point is that I'm
getting an oops even with my older kernel which used to be fine (2.6.23.9). I
also checked with 2.6.24 and the effects are the same: I boot, I up the wlan0
interface and a few seconds or minutes later, boom! Sometimes only rmmod'ing
hostap_pci triggers the oops. I'm suspecting some hardware problem and have
already checked the ram with memtest86+ and tested with only one memory module
out of two plugged: same thing.

If anybody could take a look at these and shed some light on that issue...

Thanks a lot,

Ignacy

--
Save the whales. Feed the hungry. Free the mallocs.

With kernel 2.6.24.1

BUG: unable to handle kernel NULL pointer dereference at virtual address
printing eip: f08f50c2 *pde =
Oops: [#1]
Modules linked in: lirc_serial(F) lirc_dev cls_fw sch_prio sch_htb iptable_nat xt_limit xt_state ipt_REJECT xt_tcpudp ipt_LOG xt_DSCP xt_dscp xt_mark nf_conntrack_ipv4 xt_CONNMARK xt_MARK iptable_mangle iptable_filter ip_tables x_tables nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack ipv6 evdev hostap_pci i2c_viapro hostap via686a ieee80211_crypt ide_cd
Pid: 0, comm: swapper Tainted: GF (2.6.24.1 #5)
EIP: 0060:[f08f50c2] EFLAGS: 00010297 CPU: 0
EIP is at hostap_80211_rx+0x41d/0xecf [hostap]
EAX: eec28460 EBX: ECX: eec28444 EDX:
ESI: efbb8434 EDI: EBP: efbb843e ESP: c0419e74
 DS: 007b ES: 007b FS: GS: SS: 0068
Process swapper (pid: 0, ti=c0418000 task=c03e4300 task.ti=c0418000)
Stack: 0080 004c 0001 c0419f2c c0419f30 ef3ab760 0018 0100 eec28444 1148 0040 c9c0 0001 ef8d3370 2a40 04b1cd93 000a1e00 1148 013a1148 685b0900 ef8d3000 1f714b23 685b0900
Call Trace:
 [f090ffca] hostap_rx_tasklet+0x11f/0x145 [hostap_pci]
 [c011e399] run_timer_softirq+0x11/0x12f
 [c011bbbc] tasklet_action+0x32/0x52
 [c011bb24] __do_softirq+0x35/0x75
 [c011bb86] do_softirq+0x22/0x26
 [c011bdb3] irq_exit+0x29/0x58
 [c0105bc0] do_IRQ+0x58/0x6b
 [c010455b] common_interrupt+0x23/0x28
 [c013007b] mod_sysfs_init+0x17/0x6d
 [c011007b] arch_setup_additional_pages+0x121/0x13a
 [c023f4a0] acpi_processor_idle+0x244/0x3c4
 [c01024fc] cpu_idle+0x43/0x5d
 [c041a9ac] start_kernel+0x237/0x23c
 [c041a303] unknown_bootoption+0x0/0x195
 ===
Code: 0a 8b 4c 24 24 8b 59 1c eb 21 83 bb d8 00 00 00 04 75 16 8d 83 dc 00 00 00 b9 06 00 00 00 89 ea e8 0b d1 91 cf 85 c0 74 18 89 fb <8b> 3b 0f 18 07 90 8b 44 24 24 83 c0 1c 39 c3 75 ce e9 44 0a 00
EIP: [f08f50c2] hostap_80211_rx+0x41d/0xecf [hostap] SS:ESP 0068:c0419e74
Kernel panic - not syncing: Fatal exception in interrupt
wlan0ap: SW TICK stuck? bits=0x0 EvStat=8001 IntEn=e018

With kernel 2.6.24.1

BUG: unable to handle kernel paging request at virtual address abdb24ce
printing eip: f08ea0c2 *pde =
Oops: [#1]
Modules linked in: cls_fw sch_prio sch_htb iptable_nat xt_limit xt_state ipt_REJECT xt_tcpudp ipt_LOG xt_DSCP xt_dscp xt_mark nf_conntrack_ipv4 xt_CONNMARK xt_MARK iptable_mangle iptable_filter ip_tables x_tables nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack ipv6 evdev hostap_pci i2c_viapro via686a hostap ieee80211_crypt ide_cd
Pid: 0, comm: swapper Not tainted (2.6.24.1 #5)
EIP: 0060:[f08ea0c2] EFLAGS: 00010202 CPU: 0
EIP is at hostap_80211_rx+0x41d/0xecf [hostap]
EAX: efa68460 EBX: abdb24ce ECX: efa68444 EDX:
ESI: ef1e1034 EDI: abdb24ce EBP: ef1e103e ESP: c0419e74
 DS: 007b ES: 007b FS: GS: SS: 0068
Process swapper (pid: 0, ti=c0418000 task=c03e4300 task.ti=c0418000)
Stack: 0080 c045358c 0001 c0453570 c0419f30 eec598e0 0018 0100 efa68444 1148 0040 1f90 0001 eec29370 5a40 0080ce43 000a1e00 1148 013a1148 685b0900 eec29000 1f714b23 685b0900
Call Trace:
 [f0904fca] hostap_rx_tasklet+0x11f/0x145 [hostap_pci]
 [c011bbbc] tasklet_action+0x32/0x52
 [c011bb24] __do_softirq+0x35/0x75
 [c011bb86] do_softirq+0x22/0x26
 [c011bdb3] irq_exit+0x29/0x58
 [c0105bc0] do_IRQ+0x58/0x6b
 [c010455b] common_interrupt+0x23/0x28
 [c013007b] mod_sysfs_init+0x17/0x6d
 [c011007b] arch_setup_additional_pages+0x121/0x13a
 [c023f4a0] acpi_processor_idle+0x244/0x3c4
 [c01024fc] cpu_idle+0x43/0x5d
 [c041a9ac] start_kernel+0x237/0x23c
 [c041a303] unknown_bootoption+0x0/0x195
 ===
Code: 0a 8b 4c 24 24 8b 59 1c eb 21 83 bb d8 00 00 00 04 75 16 8d 83 dc 00 00 00 b9 06 00 00 00 89 ea e8 0b 81 92 cf 85 c0 74 18 89 fb <8b> 3b 0f 18 07 90 8b 44 24 24 83 c0 1c 39 c3 75 ce e9 44 0a 00
EIP: [f08ea0c2] hostap_80211_rx+0x41d/0xecf [hostap] SS:ESP 0068:c0419e74
Kernel panic - not syncing: Fatal exception in interrupt
wlan0ap: SW TICK stuck? bits=0x0 EvStat=8001 IntEn=e018

With kernel 2.6.24

BUG: unable to handle kernel paging request at virtual address 630e0021
printing eip: f08f20c2 *pde =
Oops: [#1]
Modules linked in: cls_fw sch_prio sch_htb iptable_nat xt_limit xt_state ipt_REJECT
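In the "Code:" line of an oops, the byte wrapped in angle brackets is the first byte of the instruction at EIP. The following is a minimal sketch for locating it; the function name faulting_byte is made up for illustration. If I read the encoding right, "8b 3b" is a 32-bit load through EBX, which would fit the second trace, where EBX holds the faulting address abdb24ce.

```python
def faulting_byte(code_line):
    """Return (index, value) of the byte marked <xx> in an oops 'Code:' line.

    Returns None when no byte is marked (or the line has no 'Code:' prefix).
    """
    # Everything after the first colon is a list of hex bytes.
    _, _, rest = code_line.partition(":")
    for index, token in enumerate(rest.split()):
        if token.startswith("<") and token.endswith(">"):
            return index, int(token[1:-1], 16)
    return None


if __name__ == "__main__":
    sample = "Code: 85 c0 74 18 89 fb <8b> 3b 0f 18 07 90"
    print(faulting_byte(sample))
```

The bytes before the marker are the instructions that ran just before the fault, so feeding the whole line to a disassembler (the kernel tree has scripts/decodecode for that) shows how the bad pointer got into the register.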
Re: Hot (un)plugging of a SATA drive with sata_nv (CK8S)?
On Mon, Jan 28, 2008 at 05:35:58PM -0600, thus spake Robert Hancock:
> Any ideas guys? When the drive is plugged in, a stream of this shows up. It
> would seem like the controller is throwing hotplug interrupts but we never
> seem to get a "SATA link up". This is on nForce3, btw.

I just happened to upgrade to kernel 2.6.24 and the problem is gone. I just
have a few SError messages that appear to be harmless:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0xa frozen
ata2: SError: { PHYRdyChg CommWake }
ata2: hard resetting link
ata2: SATA link down (SStatus 0 SControl 300)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x1d action 0xa frozen
ata2: SError: { PHYRdyChg CommWake 10B8B Dispar }
ata2: hard resetting link
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-8: ST3500320AS, SD15, max UDMA/133
ata2.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata2.00: configured for UDMA/133
ata2: EH complete

and then the usual SCSI messages about the newly seen drive.

The scsiadd -r command works every time and does stop the disk indeed:

sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Stopping disk
ata2.00: disabled

and then when I switch the drive off:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x199 action 0xa frozen
ata2: SError: { PHYRdyChg 10B8B Dispar LinkSeq TrStaTrns }
ata2: hard resetting link
ata2: SATA link down (SStatus 0 SControl 300)
ata2: EH complete

So thanks for the help and sorry for the bother. =)

--
Everything is more fun naked except cooking with grease.
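The names inside the "SError: { ... }" braces are a bit-by-bit decoding of the SErr register value, so the two lines of each pair carry the same information. Below is a minimal sketch of that decoding, using the bit positions and short names as I understand them from the libata sources (the SERR_* definitions). The SErr values as they appear above (0x5, 0x1d, 0x199) look shortened; the sketch assumes the full register values were 0x50000, 0x1d0000 and 0x1990000, which is what reproduces the printed names. The function name decode_serror is only illustrative.

```python
# (bit position, short name) pairs, in the order libata prints them.
# Bits 0-11 are the error field, bits 16-26 the diagnostics field.
SERR_BITS = [
    (0, "RecovData"), (1, "RecovComm"), (8, "UnrecovData"), (9, "Persist"),
    (10, "Proto"), (11, "HostInt"), (16, "PHYRdyChg"), (17, "PHYInt"),
    (18, "CommWake"), (19, "10B8B"), (20, "Dispar"), (21, "BadCRC"),
    (22, "Handshk"), (23, "LinkSeq"), (24, "TrStaTrns"), (25, "UnrecFIS"),
    (26, "DevExch"),
]


def decode_serror(value):
    """Return the list of flag names set in a 32-bit SError value."""
    return [name for bit, name in SERR_BITS if value & (1 << bit)]


if __name__ == "__main__":
    # Assumed full values for the three exceptions logged above.
    for serr in (0x50000, 0x1D0000, 0x1990000):
        print(hex(serr), "{ " + " ".join(decode_serror(serr)) + " }")
```

All the flags seen here sit in the diagnostics half of the register (PHY ready change, COMWAKE, decode and disparity errors), which is the sort of link noise one would expect while a drive is being plugged in or powered off.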
Re: Hot (un)plugging of a SATA drive with sata_nv (CK8S)?
On Fri, Jan 25, 2008 at 09:03:02PM -0600, thus spake Robert Hancock: Ignacy Gawedzki wrote: Hi everyone, I'm having trouble to determine the cause of the following behavior. I'm not even sure that I'm supposed to hot plug and unplug a SATA drive from a nForce3 Ultra (apparently CK8S, on a Gigabyte K8NS Ultra 939 mobo) SATA interface, to begin with. The information is hard to find given that the sata_nv driver supports a range of different hardware. I've recently acquired an external drive with (among others) an eSATA interface, so I also bought a eSATA-SATA bracket and intend to use that drive (Lacie d2 quadra 500G) through eSATA. BTW, eSATA cannot technically be converted properly to SATA with a simple connector adapter. eSATA is supposed to use higher signalling voltages and so using such an adapter is not guaranteed to work. Yeah, apparently this shortens the max cable length to 1 meter. In this case I've got a 1 meter external cable and approx. 30 cm internal (heavily shielded though) cable from the bracket to the SATA port. Anyway, the drive works perfectly if plugged at boot time. The thing is that if I boot the machine with the drive plugged and turned on, it is properly detected and usable. If, at some point, I want to remove the drive, I unmount any partitions on it and issue the proper scsiadd -r command (usually scsiadd -r 1 0 0 0, since this is the second SATA drive) and everything is fine (I turn the drive off and unplug it), so far. Next, when I want to use the drive again, it's still detected alright (although appears as sdc and not sdb anymore), but the SCSI layer issues scsi 1:0:0:0: rejecting I/O to dead device from time to time. Then any scsiadd -r 1 0 0 0 command fails with No such device or address, although it appears in the output of scsiadd -p or even scsiadd -s (always as 1 0 0 0). 
If I ignore that detail and switch the drive off, then the kernel eventually notices that the drive is gone and the SCSI layer attempts to stop the device and fails ([sdc] START_STOP FAILED). From that moment on, any attempt to plug the drive again fails. The kernel issues ata2: hard resetting port and ata2: port is slow to respond, please be patient (Status 0x80) periodically, until I switch the drive off. If the drive is not present at boot, then hot plugging it fails. The kernel first soft resets the port, then issues the please be patient (Status 0x80) message, complains that SRST failed (errno=-16) and goes on hard resetting the port, issuing please be patient (Status 0x80) and complaining that COMRESET failed (errno=-16), periodically, until the drive is switched off. Full dmesg output would be useful.. I repeated the experiments and dumped as much dmesg as I could. The dmesg outputs of both experiments are attached and commented. It seems that in the case the drive is pluggin at boot time, it remains hot pluggable later (be it with some strange error messages) after all (or is there another factor that I did not reproduce?). Thank you for any help. =) -- NO CARRIER ### First experiment, the drive is plugged and turned on at boot time. ### The initial full dmesg dump follows. Linux version 2.6.23.14 ([EMAIL PROTECTED]) (gcc version 4.1.3 20070929 (prerelease) (U buntu 4.1.2-16ubuntu2)) #1 PREEMPT Thu Jan 24 22:07:54 CET 2008 Command line: root=UUID=84d4c1b4-5602-4364-a583-7913d518b4ab ro quiet splash BIOS-provided physical RAM map: BIOS-e820: - 0009f400 (usable) BIOS-e820: 0009f800 - 000a (reserved) BIOS-e820: 000f - 0010 (reserved) BIOS-e820: 0010 - 7fff (usable) BIOS-e820: 7fff - 7fff3000 (ACPI NVS) BIOS-e820: 7fff3000 - 8000 (ACPI data) BIOS-e820: fec0 - 0001 (reserved) Entering add_active_range(0, 0, 159) 0 entries of 256 used Entering add_active_range(0, 256, 524272) 1 entries of 256 used end_pfn_map = 1048576 DMI 2.3 present. 
ACPI: RSDP 000F6C90, 0014 (r0 Nvidia)
ACPI: RSDT 7FFF3000, 002C (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
ACPI: FACP 7FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
ACPI: DSDT 7FFF30C0, 4AC4 (r1 NVIDIA AWRDACPI 1000 MSFT 10C)
ACPI: FACS 7FFF, 0040
ACPI: APIC 7FFF7BC0, 007C (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
Entering add_active_range(0, 0, 159) 0 entries of 256 used
Entering add_active_range(0, 256, 524272) 1 entries of 256 used
Zone PFN ranges:
  DMA 0 - 4096
  DMA32 4096 - 1048576
  Normal 1048576 - 1048576
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
  0: 0 - 159
  0: 256 - 524272
On node 0 totalpages: 524175
  DMA zone: 56 pages used for memmap
  DMA zone: 1312 pages reserved
  DMA zone: 2631 pages, LIFO batch:0
  DMA32 zone: 7111 pages used for memmap
  DMA32 zone: 513065 pages, LIFO batch:31
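For anyone following the removal step in the message above: the "1 0 0 0" given to scsiadd is a host/channel/id/lun tuple, and the kernel log prints the same address as "1:0:0:0". A minimal sketch, assuming a 2.6 kernel with sysfs mounted at /sys, of how that tuple maps onto the sysfs delete and scan attributes. The helper names are hypothetical, the code only builds paths and never writes to them, and it is not a description of what scsiadd does internally.

```python
from typing import NamedTuple, Tuple


class ScsiAddress(NamedTuple):
    """Host/channel/id/lun tuple, as in 'scsiadd -r 1 0 0 0'."""
    host: int
    channel: int
    target: int
    lun: int


def parse_hctl(text: str) -> ScsiAddress:
    """Parse 'H C I L' (scsiadd style) or 'H:C:I:L' (kernel log style)."""
    parts = text.replace(":", " ").split()
    if len(parts) != 4:
        raise ValueError(f"expected four numbers, got {text!r}")
    return ScsiAddress(*(int(p) for p in parts))


def sysfs_delete_path(addr: ScsiAddress, sysfs_root: str = "/sys") -> str:
    """Attribute that removes the device when '1' is written to it."""
    hctl = f"{addr.host}:{addr.channel}:{addr.target}:{addr.lun}"
    return f"{sysfs_root}/class/scsi_device/{hctl}/device/delete"


def sysfs_scan_request(addr: ScsiAddress,
                       sysfs_root: str = "/sys") -> Tuple[str, str]:
    """Host scan attribute and the 'channel id lun' string to write to it."""
    path = f"{sysfs_root}/class/scsi_host/host{addr.host}/scan"
    return path, f"{addr.channel} {addr.target} {addr.lun}"


if __name__ == "__main__":
    addr = parse_hctl("1 0 0 0")
    print(sysfs_delete_path(addr))
    print(sysfs_scan_request(addr))
```

Writing to either attribute requires root, and whether a rescan actually brings the drive back depends on the controller and driver, which is exactly the open question in this thread.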
Hot (un)plugging of a SATA drive with sata_nv (CK8S) ?
Hi everyone,

I'm having trouble determining the cause of the following behavior. I'm not even sure that I'm supposed to hot plug and unplug a SATA drive from an nForce3 Ultra (apparently CK8S, on a Gigabyte K8NS Ultra 939 mobo) SATA interface, to begin with. The information is hard to find given that the sata_nv driver supports a range of different hardware.

I've recently acquired an external drive with (among others) an eSATA interface, so I also bought an eSATA->SATA bracket and intend to use that drive (Lacie d2 quadra 500G) through eSATA.

The thing is that if I boot the machine with the drive plugged in and turned on, it is properly detected and usable. If, at some point, I want to remove the drive, I unmount any partitions on it and issue the proper scsiadd -r command (usually scsiadd -r 1 0 0 0, since this is the second SATA drive) and everything is fine (I turn the drive off and unplug it), so far.

Next, when I want to use the drive again, it's still detected all right (although it appears as sdc and not sdb anymore), but the SCSI layer issues "scsi 1:0:0:0: rejecting I/O to dead device" from time to time. Then any scsiadd -r 1 0 0 0 command fails with "No such device or address", although the device appears in the output of scsiadd -p or even scsiadd -s (always as 1 0 0 0).

If I ignore that detail and switch the drive off, then the kernel eventually notices that the drive is gone and the SCSI layer attempts to stop the device and fails ([sdc] START_STOP FAILED). From that moment on, any attempt to plug the drive in again fails. The kernel issues "ata2: hard resetting port" and "ata2: port is slow to respond, please be patient (Status 0x80)" periodically, until I switch the drive off.

If the drive is not present at boot, then hot plugging it fails. The kernel first soft resets the port, then issues the "please be patient (Status 0x80)" message, complains that SRST failed (errno=-16) and goes on hard resetting the port, issuing "please be patient (Status 0x80)" and complaining that COMRESET failed (errno=-16), periodically, until the drive is switched off.

If somebody could tell me whether hot-plugging is supposed to work with my SATA interface, it would be nice. =)

The motherboard happens to offer another SATA interface (Sil3512A) which is well supported and appears to support hot-plugging as well, but it conflicts nastily with my PCTV Pro (bttv) card (such cards are apparently known to conflict with the Sil SATA interfaces).

Thanks for any help.

Ignacy

--
I used to have a sig, but I've stopped smoking now.
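As a reading aid for the two numbers quoted in the messages above: "Status 0x80" is the ATA status register with only the busy bit set, and errno=-16 is EBUSY on Linux. A small sketch that decodes both; the function names are hypothetical, and the bit table lists only the commonly named status bits rather than every bit in the register.

```python
import errno

# Commonly named ATA status register bits (mask, mnemonic).
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x08, "DRQ"),   # data request
    (0x01, "ERR"),   # error
]


def decode_ata_status(status: int) -> list:
    """Return the mnemonics of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


def errno_name(code: int) -> str:
    """Map a kernel-style negative errno (e.g. -16) to its symbolic name."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


if __name__ == "__main__":
    print(decode_ata_status(0x80))  # ['BSY']
    print(errno_name(-16))          # EBUSY
```

Read this way, both messages say the same thing: the port still reports busy when each reset attempt times out.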