On Tuesday 05 February 2008 01:49, Mike Galbraith wrote: > On Tue, 2008-01-22 at 06:47 +0100, Mike Galbraith wrote: > > On Tue, 2008-01-22 at 16:25 +1100, Nick Piggin wrote: > > > On Tuesday 22 January 2008 16:03, Mike Galbraith wrote: > > > > I've hit same twice recently (not pan, and not repeatable). > > > > > > Nasty. The attached patch is something really simple that can sometimes > > > help. sysrq+p is also an option, if you're on a UP system. > > > > SMP (P4/HT imitating real cores) > > > > > Any luck getting traces? > > > > We'll see. Armed. > > Hm. ld just went loopy (but killable) in v2.6.24-6928-g9135f19. During > kbuild, modpost segfaulted, restart build, ld goes gaga. Third attempt, > build finished. Not what I hit before, but mentionable. > > > [ 674.589134] modpost[18588]: segfault at 3e8dc42c ip 0804a96d sp af982920 > error 5 in modpost[8048000+9000] [ 674.589211] mm/memory.c:115: bad pgd > 3e081163. > [ 674.589214] mm/memory.c:115: bad pgd 3e0d2163. > [ 674.589217] mm/memory.c:115: bad pgd 3eb01163.
Hmm, this _could_ be bad memory. Or if it is very easy to reproduce with a particular kernel version, then it is probably a memory scribble from another part of the kernel :( First thing I guess would be easy and helpful to run memtest86 for a while if you have time. If that's clean, then I don't have another good option except to bisect the problem. Turning on DEBUG_VM, DEBUG_SLAB, DEBUG_LIST, DEBUG_PAGEALLOC, DEBUG_STACKOVERFLOW, DEBUG_RODATA might help catch it sooner... SLAB and PAGEALLOC could slow you down quite a bit though. And if the problem is quite reproduceable, then obviously don't touch your config ;) Thanks, Nick > > [ 1407.322144] ======================= > [ 1407.322144] ld R running 0 21963 21962 > [ 1407.322144] db9d7f1c 00200086 c75f9020 b1814300 b0428300 b0428300 > b0428300 c75f9280 [ 1407.322144] b1814300 00000001 db9d7000 00000000 > d08c2f90 dba4f300 00000002 00000000 [ 1407.322144] b1810120 dba4f334 > 00200046 ffffffff db9d7000 c75f9020 db9d7f30 b02f333f [ 1407.322144] Call > Trace: > [ 1407.322144] [<b02f333f>] preempt_schedule_irq+0x45/0x5b > [ 1407.322144] [<b0117a10>] ? do_page_fault+0x0/0x470 > [ 1407.322144] [<b0104d6e>] need_resched+0x1f/0x21 > [ 1407.322144] [<b0117a10>] ? do_page_fault+0x0/0x470 > [ 1407.322144] [<b0117a5c>] ? do_page_fault+0x4c/0x470 > [ 1407.322144] [<b0117a10>] ? do_page_fault+0x0/0x470 > [ 1407.322144] [<b02f4a3a>] ? error_code+0x72/0x78 > [ 1407.322144] [<b02f0000>] ? init_transmeta+0xcf/0x22f <== zzt P4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/