malloc(M_WAITOK) of g_bio, forcing M_NOWAIT with non-sleepable locks held:
Hi,

I found this in the logs of a 6.1 box that I admin this morning. The machine kept running afterwards.

-- 
wilx

+malloc(M_WAITOK) of g_bio, forcing M_NOWAIT with the following non-sleepable locks held:
+exclusive sleep mutex inp (tcpinp) r = 0 (0xc50c5d38) locked @ /usr/src/sys/netinet/tcp_usrreq.c:1029
+KDB: stack backtrace:
+kdb_backtrace(c08eef84,e78be89c,1,c45752c0,c1035380) at kdb_backtrace+0x2f
+witness_warn(5,0,c0819ce0,c07fc905,c05eaca1) at witness_warn+0x1ac
+uma_zalloc_arg(c1035380,0,102,e78be8e4,c07562a1) at uma_zalloc_arg+0x3d
+g_alloc_bio(8,c0819777,c4575400,c45752c0,d367aac8) at g_alloc_bio+0x23
+swapgeom_strategy(d367aac8,c45752c0,c0819777,271) at swapgeom_strategy+0x3a
+swp_pager_strategy(d367aac8,0,c0819777,437,c0760ee7) at swp_pager_strategy+0x88
+swap_pager_getpages(c728e948,e78be9ec,1,0,e78be9b0) at swap_pager_getpages+0x382
+vm_fault(c4f9b128,806b000,1,0,c4c0b300) at vm_fault+0xb13
+trap_pfault(e78beaa8,0,806b2c0,c08a9c60,806b2c0) at trap_pfault+0xf4
+trap(c088,28,c4c00028,e78beb28,806b2c0) at trap+0x33e
+calltrap() at calltrap+0x5
+--- trap 0xc, eip = 0xc07b9b16, esp = 0xe78beae8, ebp = 0xe78beb08 ---
+generic_copyin(e78bec84,e78beb28,4,4,c50c5ca8) at generic_copyin+0x32
+tcp_ctloutput(c4d1dc84,e78bec84,0,c589b400,e78bec68) at tcp_ctloutput+0x182
+sosetopt(c4d1dc84,e78bec84,e78bec80,c08eef80,c4bd15a0) at sosetopt+0x38
+kern_setsockopt(c4c0b300,7,6,1,806b2c0) at kern_setsockopt+0xd6
+setsockopt(c4c0b300,e78bed04,14,28279000,5) at setsockopt+0x3e
+syscall(3b,2808003b,bfbf003b,1,806b2c0) at syscall+0x295
+Xint0x80_syscall() at Xint0x80_syscall+0x1f
+--- syscall (105, FreeBSD ELF32, setsockopt), eip = 0x2827959b, esp = 0xbf9fea3c, ebp = 0xbf9fea68 ---
+Sleeping on swread with the following non-sleepable locks held:
+exclusive sleep mutex inp (tcpinp) r = 0 (0xc50c5d38) locked @ /usr/src/sys/netinet/tcp_usrreq.c:1029
+KDB: stack backtrace:
+kdb_backtrace(c08eef84,e78be8e0,1,1,0) at kdb_backtrace+0x2f
+witness_warn(5,c08fe2a0,c0802eb1,c0819833,c08fe2a0) at witness_warn+0x1ac
+msleep(c1bb6fe8,c08fe2a0,40,c0819833,4e20) at msleep+0x58
+swap_pager_getpages(c728e948,e78be9ec,1,0,e78be9b0) at swap_pager_getpages+0x400
+vm_fault(c4f9b128,806b000,1,0,c4c0b300) at vm_fault+0xb13
+trap_pfault(e78beaa8,0,806b2c0,c08a9c60,806b2c0) at trap_pfault+0xf4
+trap(c088,28,c4c00028,e78beb28,806b2c0) at trap+0x33e
+calltrap() at calltrap+0x5
+--- trap 0xc, eip = 0xc07b9b16, esp = 0xe78beae8, ebp = 0xe78beb08 ---
+generic_copyin(e78bec84,e78beb28,4,4,c50c5ca8) at generic_copyin+0x32
+tcp_ctloutput(c4d1dc84,e78bec84,0,c589b400,e78bec68) at tcp_ctloutput+0x182
+sosetopt(c4d1dc84,e78bec84,e78bec80,c08eef80,c4bd15a0) at sosetopt+0x38
+kern_setsockopt(c4c0b300,7,6,1,806b2c0) at kern_setsockopt+0xd6
+setsockopt(c4c0b300,e78bed04,14,28279000,5) at setsockopt+0x3e
+syscall(3b,2808003b,bfbf003b,1,806b2c0) at syscall+0x295
+Xint0x80_syscall() at Xint0x80_syscall+0x1f
+--- syscall (105, FreeBSD ELF32, setsockopt), eip = 0x2827959b, esp = 0xbf9fea3c, ebp = 0xbf9fea68 ---
Suspend freezes system on my Thinkpad X41
Subject says it all. Attempting Fn+F4 (suspend), acpiconf -s 3, or acpiconf -s 4 causes the system to halt.

Specs:
Thinkpad X41
6.1-RELEASE GENERIC
Atheros 5212
Broadcom BCM5751M
SATA (ICH6 SATA150)

Freezes are non-recoverable. Even hardware functions (such as LCD dimming) cease. It seems like the system has half suspended, but the LCD is still on and displaying. The sleep light doesn't blink to indicate sleep.

Any help on this would be greatly appreciated :)

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
gjournal questions
Pavel,

I'm running 6.1-STABLE, with kernel and world rebuilt as of 8/28 @ 2p CST with these patches:

gjournal6_20060808.patch
vfs_subr.c.3.patch

The backend RAID presents 4 LUNs; this is how we configured it:

da1 - 8G
da2 - ~897G
da3 - 8G
da4 - ~897G

da2 and da4 have been partitioned in FreeBSD. Then we did the following:

gjournal label -v /dev/da2 /dev/da1
gjournal label -v /dev/da4 /dev/da3
newfs -U -L scr09 /dev/da2.journal
newfs -U -L scr10 /dev/da4.journal

So there is one 8G journal for each data device.

Now that the server is under load, I'm seeing "NFS not responding" messages on my clients. The messages correspond to the gjournal suspend/copy operation, which causes my clients to hang or report "No such file or directory". We copied 137G to /scr10 and it just finished; could this be some remains of writes from the journal?

Here is the time correlation:

Aug 31 13:55:24 donkey kernel: GEOM_JOURNAL[1]: Starting copy of journal.
Aug 31 13:55:24 donkey kernel: GEOM_JOURNAL[1]: Switch time of da4: 0.002798s
Aug 31 13:55:24 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 14.030198s
Aug 31 13:55:24 donkey kernel: GEOM_JOURNAL[1]: Data has been copied.
Aug 31 13:55:33 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 0.13s
Aug 31 13:55:44 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 0.13s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Msync time of /scr09: 0.10s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Sync time of /scr09: 0.09s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Suspend time of /scr09: 0.07s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Starting copy of journal.
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Switch time of da2: 0.002302s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Data has been copied.
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Msync time of /scr10: 0.029769s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Sync time of /scr10: 0.035259s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Suspend time of /scr10: 10.109732s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Starting copy of journal.
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Switch time of da4: 0.002756s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 10.182759s
Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Data has been copied.
Aug 31 13:56:14 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 0.12s
Aug 31 13:56:24 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 0.11s
Aug 31 13:56:46 donkey kernel: GEOM_JOURNAL[1]: Msync time of /scr09: 0.10s
Aug 31 13:56:46 donkey kernel: GEOM_JOURNAL[1]: Sync time of /scr09: 0.09s
Aug 31 13:56:46 donkey kernel: GEOM_JOURNAL[1]: Suspend time of /scr09: 0.07s
Aug 31 13:56:46 donkey kernel: GEOM_JOURNAL[1]: Starting copy of journal.
Aug 31 13:56:46 donkey kernel: GEOM_JOURNAL[1]: Switch time of da2: 0.002364s
Aug 31 13:56:46 donkey kernel: GEOM_JOURNAL[1]: Data has been copied.

From the syslog server:

Aug 31 13:55:23 user.notice bowltest4 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:23 user.notice bowltest4 kernel: nfs: server donkey OK
Aug 31 13:55:23 user.notice laybox32 kernel: nfs: server donkey OK
Aug 31 13:55:29 user.notice b-115-4 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:29 user.notice b-115-4 kernel: nfs: server donkey OK
Aug 31 13:55:56 user.notice b-116-16 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:56 user.notice b-204-40 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:57 user.notice b-116-16 kernel: nfs: server donkey OK
Aug 31 13:55:57 user.notice lic2 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:57 user.notice b-204-40 kernel: nfs: server donkey OK
Aug 31 13:55:57 user.notice lic2 kernel: nfs: server donkey OK
Aug 31 13:55:57 user.notice laybox29 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:57 user.notice laybox26 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:58 user.notice laybox19 kernel: nfs: server donkey not responding, still trying
Aug 31 13:55:58 user.notice laybox37 kernel: nfs: server donkey not responding, still trying
Aug 31 13:56:00 user.notice laybox19 kernel: nfs: server donkey OK
Aug 31 13:56:00 user.notice laybox26 kernel: nfs: server donkey OK
Aug 31 13:56:00 user.notice laybox37 kernel: nfs: server donkey OK
Aug 31 13:56:00 user.notice laybox29 kernel: nfs: server donkey OK
Aug 31 13:56:05 daemon.info ws-119-8 amd[2640]: file server donkey20.centtech.com, type nfs, state not responding
Aug 31 13:56:05 daemon.info ws-119-8 amd[2640]: file server donkey20.centtech.com, type nfs, state ok
Aug 31 13:56:36 user.notice b-116-17 kernel: nfs: server donkey not responding, still trying
Aug 31 13:56:36 user.notice b-116-17 kernel: nfs: server donkey OK
Aug 31 13:56:40 user.notice b-210-17 kernel: nfs: server donkey not responding, still trying
Aug 31 13:56:41
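
For anyone who wants to line up the journal stalls with the client messages, here is a minimal sketch in Python that pulls out the GEOM_JOURNAL timing lines above a threshold. It is only an illustration: the function name slow_events, the 1-second threshold, and the year 2006 (syslog lines carry no year) are my assumptions, and the regular expression only covers the message shapes quoted in this post. The start time it prints further assumes that each timing line is logged when the operation finishes.

```python
import re
from datetime import datetime, timedelta

# Matches timing lines such as:
#   Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Suspend time of /scr10: 10.109732s
# Only the message shapes quoted in this thread are covered.
TIMING = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"GEOM_JOURNAL\[\d+\]: (?P<what>.+ time(?: of \S+)?): (?P<secs>\d+\.\d+)s$"
)


def slow_events(lines, threshold=1.0, year=2006):
    """Return (logged_at, description, seconds) for timings >= threshold.

    The year is an assumption, since syslog timestamps omit it.
    """
    events = []
    for line in lines:
        m = TIMING.match(line.strip())
        if m is None:
            continue  # not a timing line, e.g. "Data has been copied."
        secs = float(m.group("secs"))
        if secs >= threshold:
            logged_at = datetime.strptime(
                f"{year} {m.group('stamp')}", "%Y %b %d %H:%M:%S"
            )
            events.append((logged_at, m.group("what"), secs))
    return events


# A few lines copied from the log above.
SAMPLE = [
    "Aug 31 13:55:24 donkey kernel: GEOM_JOURNAL[1]: Switch time of da4: 0.002798s",
    "Aug 31 13:55:24 donkey kernel: GEOM_JOURNAL[1]: Entire switch time: 14.030198s",
    "Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Suspend time of /scr10: 10.109732s",
    "Aug 31 13:56:04 donkey kernel: GEOM_JOURNAL[1]: Data has been copied.",
]

if __name__ == "__main__":
    for logged_at, what, secs in slow_events(SAMPLE):
        # Assumption: the line is logged when the operation ends, so the
        # stall would have begun roughly `secs` seconds earlier.
        started = logged_at - timedelta(seconds=secs)
        print(f"{what}: {secs:.3f}s, logged {logged_at:%H:%M:%S}, "
              f"began about {started:%H:%M:%S}")
```

The resulting windows can then be compared by eye (or with a second pass over the syslog server output) against the "not responding" timestamps from the clients.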
Re: suggestions for SATA RAID cards
On Thu, Aug 24, 2006 at 01:39:43AM -0500, Nikolas Britton wrote:
> Their new hardware (coming soon) will have a 800MHz XScale with
> DDR2-533 cache.

According to http://www.areca.com.tw/products/html/pcietosata1280.htm, it's already here.

-- 
albert chin ([EMAIL PROTECTED])
Re: Suspend freezes system on my Thinkpad X41
Jordan Sissel wrote:
> Subject says it all. Attempting FN+F4 (suspend) or acpiconf -s 3 or -s 4
> all cause the system to halt.
>
> Specs:
> Thinkpad X41
> 6.1-RELEASE GENERIC
> Atheros 5212
> Broadcom BCM5751M
> SATA (ICH6 SATA150)
>
> Freezes are non-recoverable. Even hardware functions (such as lcd dimming)
> cease. Seems like the system has half suspended, but the LCD is still on
> and displaying. Sleep light doesn't blink indicating sleepiness.
>
> Any help on this would be greatly appreciated :)

Atheros 5212 is too imprecise; please give mac+phy rev's from dmesg|grep ath. Otherwise, try taking ath out of your kernel config to see if you can suspend+resume. I've got an outstanding issue with ath in how suspend+resume is handled--for cardbus cards at least (never seen it with minipci).

Sam
Re: NFS locking: lockf freezes (rpc.lockd problem?)
On Tue, Aug 29, 2006 at 05:05:26PM +, Michael Abbott wrote:
> [I wrote]
> > An alternative would be to update to RELENG_6 (or at least RELENG_6_1)
> > and then try again.
>
> So. I have done this. And I can't reproduce the problem.
>
> # uname -a
> FreeBSD venus.araneidae.co.uk 6.1-STABLE FreeBSD 6.1-STABLE #1: Mon Aug 28 18:32:17 UTC 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC i386
>
> Hmm. Hopefully this is a *good* thing, ie, the problem really has been
> fixed, rather than just going into hiding.
>
> So, as far as I can tell, lockf works properly in this release.

Just as an interesting side note, I just experienced rpc.lockd crashing. The server is not running RELENG_6, but RELENG_5 (FreeBSD 5.5-STABLE #15: Thu Aug 24 18:47:20 CEST 2006). Due to user error, someone ended up with over 1000 processes trying to lock the same NFS-mounted file at the same time. The result was over 1000 "Cannot allocate memory" errors followed by rpc.lockd crashing.

I guess the server is telling me it wants an update...

-- 
greg byshenk - [EMAIL PROTECTED] - Leiden, NL
Re: Suspend freezes system on my Thinkpad X41
ath_hal: 0.9.16.16 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
ath0: Atheros 5212 mem 0xa020-0xa020 irq 21 at device 2.0 on pci4
ath0: Ethernet address: 00:16:ce:40:7e:41
ath0: mac 5.9 phy 4.3 radio 3.6

It's minipci.

As for the video: the LCD still shows text. If I am in X, it switches to a vty before attempting sleep. The backlight stays on and the display does not turn off.

I think when I was testing earlier, I had neither X running nor if_ath loaded. I'll post more later tonight (tomorrow?) when I get a chance.

To further clarify: the laptop does *NOT* successfully go to sleep. It returns to the vty and locks hard. This is not a resume problem, this is a suspend problem. Hitting ctrl+alt+del to reboot does nothing.

-Jordan

On 8/31/06, Sam Leffler [EMAIL PROTECTED] wrote:
> Jordan Sissel wrote:
> > Subject says it all. Attempting FN+F4 (suspend) or acpiconf -s 3 or -s 4
> > all cause the system to halt.
> >
> > Specs:
> > Thinkpad X41
> > 6.1-RELEASE GENERIC
> > Atheros 5212
> > Broadcom BCM5751M
> > SATA (ICH6 SATA150)
> >
> > Freezes are non-recoverable. Even hardware functions (such as lcd dimming)
> > cease. Seems like the system has half suspended, but the LCD is still on
> > and displaying. Sleep light doesn't blink indicating sleepiness.
> >
> > Any help on this would be greatly appreciated :)
>
> Atheros 5212 is too imprecise; please give mac+phy rev's from dmesg|grep
> ath. Otherwise, try taking ath out of your kernel config to see if you
> can suspend+resume. I've got an outstanding issue with ath in how
> suspend+resume is handled--for cardbus cards at least (never seen it
> with minipci).
>
> Sam