On Fri, 19 Oct 2007, Ray Lee wrote: > On 10/19/07, Jeff Garzik <[EMAIL PROTECTED]> wrote: > > On my main devel box, vanilla 2.6.23 on x86-64/Fedora-7, I'm seeing a > > certain behavior at least once a day. I'll start a kernel build (make > > -sj5 on this box), and it will "hang" in the following way: > > > > > 31003 ? S 0:04 sshd: [EMAIL PROTECTED]/0 > > > 31004 pts/0 Ss 0:02 \_ -bash > > > 8280 pts/0 S+ 0:00 \_ make ARCH=i386 -sj4 > > > 8690 pts/0 Z+ 0:00 \_ [rm] <defunct> > > > 8691 pts/0 S+ 0:00 \_ /bin/sh -c cat > > > include/config/kernel.release 2> /dev/null > > > 8692 pts/0 R+ 6:12 \_ cat > > > include/config/kernel.release > > > > Specifically, the symptom is a process, often a simple one like cat(1) > > or rm(1) or somewhere in check-headers, will stay in the running state, > > accumulating CPU time. > > > > If I Ctrl-C the build, and start over, the build will normally -not- get > > stuck at the same point, but proceed to chew through one of a bazillion > > allmodconfig builds. > > I *think* I'm seeing this with firefox under 2.6.23-rc6. I tried a > `killall -SIGSTOP firefox; killall -SIGCONT firefox` and when I looked > back it was back to life again, but that may have been a fluke. > Regardless, try that the next time it happens?
Don't know if that's the same problem as above, but a few minutes ago my mail-server locked down completely. First pine froze, then more processes started freezing, then the system became unusable, ssh logins got stuck, USB- and ps/2 keyboards. I managed to get a trace with the "w" sysrq: SysRq : Show Blocked State task PC stack pid father syslogd D c01275a3 0 2818 1 e8a1fe4c 00000086 e8a1fe4c c01275a3 f7e9a200 00000282 e8a1fe5c 00b5ce60 c1b05cc0 e8a1fe7c c030c8e7 e8a1ff30 c8d553c4 c0407348 c1b58b18 00b5ce60 c01277d0 c1b1ba90 c0407040 000000da d5f806c0 e8a1fe84 c030c974 e8a1feb4 Call Trace: [<c030c8e7>] schedule_timeout+0x47/0xc0 [<c030c974>] schedule_timeout_uninterruptible+0x14/0x20 [<c01b94cb>] journal_stop+0xcb/0x270 [<c01ba1fd>] journal_force_commit+0x1d/0x30 [<c01b2265>] ext3_force_commit+0x25/0x30 [<c01ac7dc>] ext3_write_inode+0x2c/0x40 [<c0183f8b>] __writeback_single_inode+0x30b/0x3e0 [<c01849b4>] sync_inode+0x24/0x60 [<c01a8d42>] ext3_sync_file+0xc2/0xd0 [<c0187340>] do_fsync+0x60/0xa0 [<c01873a8>] __do_fsync+0x28/0x40 [<c01873ed>] sys_fsync+0xd/0x10 [<c010424e>] sysenter_past_esp+0x5f/0x85 ======================= pine D c01275a3 0 6910 6243 d1f55e4c 00200082 c03c0e40 c01275a3 f7e42c80 00200282 d1f55e5c 00b5ce60 c1b05cc0 d1f55e7c c030c8e7 d1f55f30 c8d553d8 c1b58b18 c90a9e5c 00b5ce60 c01277d0 d8ae8a90 c0407040 000000c3 d5f806c0 d1f55e84 c030c974 d1f55eb4 Call Trace: [<c030c8e7>] schedule_timeout+0x47/0xc0 [<c030c974>] schedule_timeout_uninterruptible+0x14/0x20 [<c01b94cb>] journal_stop+0xcb/0x270 [<c01ba1fd>] journal_force_commit+0x1d/0x30 [<c01b2265>] ext3_force_commit+0x25/0x30 [<c01ac7dc>] ext3_write_inode+0x2c/0x40 [<c0183f8b>] __writeback_single_inode+0x30b/0x3e0 [<c01849b4>] sync_inode+0x24/0x60 [<c01a8d42>] ext3_sync_file+0xc2/0xd0 [<c0187340>] do_fsync+0x60/0xa0 [<c01873a8>] __do_fsync+0x28/0x40 [<c01873ed>] sys_fsync+0xd/0x10 [<c010424e>] sysenter_past_esp+0x5f/0x85 ======================= sendmail D c01275a3 0 7448 7446 c90a9e4c 00000082 c03c0e40 c01275a3 f7e00580 00000282 c90a9e5c 00b5ce60 c1b05cc0 c90a9e7c c030c8e7 c90a9f30 c8d553ec d1f55e5c c0407348 00b5ce60 c01277d0 d8ae8030 c0407040 0000004b d5f806c0 c90a9e84 c030c974 c90a9eb4 Call Trace: [<c030c8e7>] schedule_timeout+0x47/0xc0 [<c030c974>] schedule_timeout_uninterruptible+0x14/0x20 [<c01b94cb>] journal_stop+0xcb/0x270 [<c01ba1fd>] journal_force_commit+0x1d/0x30 [<c01b2265>] ext3_force_commit+0x25/0x30 [<c01ac7dc>] ext3_write_inode+0x2c/0x40 [<c0183f8b>] __writeback_single_inode+0x30b/0x3e0 [<c01849b4>] sync_inode+0x24/0x60 [<c01a8d42>] ext3_sync_file+0xc2/0xd0 [<c0187340>] do_fsync+0x60/0xa0 [<c01873a8>] __do_fsync+0x28/0x40 [<c01873ed>] sys_fsync+0xd/0x10 [<c010424e>] sysenter_past_esp+0x5f/0x85 ======================= now you see why I wrote "ext3 deadlock." It's y VIA C7 system, running 2.6.23-rc9-g804b3f9a, no problems since 5 October, when the kernel has been built. No Oops / warnings in dmesg. Or has this been fixed since 23-rc9? Thanks Guennadi --- Guennadi Liakhovetski - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/