Re: [prepatch] removal of oops->printk deadlocks
On Thu, 16 Nov 2000, Andrew Morton wrote: > OK, so the options are now: > > nmi_watchdog=0 > nmi_watchdog=1 > nmi_watchdog=1,verbose looks good for me! Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops->printk deadlocks
Ingo Molnar wrote: > > > So... In this updated patch the console_silent() call remains > > as you designed it, but the nmi_watchdog kernel boot parameter has > > been overloaded so that > > > > nmi_watchdog=2 > > > > will now cause _all_ NMI oops messages to be displayed on the console. > > > > Is this OK? > > this collides with the UP-IOAPIC path's use of nmi_watchdog == 2 ... > > i'd rather suggest a cleaner, "nmi_watchdog=2,verbose" (default: silent) > type of boot parameter, it looks like the NMI watchdog needs more > parameters. OK, so the options are now: nmi_watchdog=0 nmi_watchdog=1 nmi_watchdog=1,verbose Third time lucky? --- linux-2.4.0-test11-pre5/include/linux/kernel.h Sun Oct 15 01:27:46 2000 +++ linux-akpm/include/linux/kernel.h Wed Nov 15 23:25:48 2000 @@ -62,6 +62,7 @@ asmlinkage int printk(const char * fmt, ...) __attribute__ ((format (printf, 1, 2))); +extern int oops_in_progress; #if DEBUG #define pr_debug(fmt,arg...) \ --- linux-2.4.0-test11-pre5/include/asm-i386/page.h Sat Nov 4 16:22:49 2000 +++ linux-akpm/include/asm-i386/page.h Wed Nov 15 23:27:32 2000 @@ -88,8 +88,9 @@ * Tell the user there is some problem. Beep too, so we can * see^H^H^Hhear bugs in early bootup as well! */ +extern void do_BUG(const char *file, int line); #define BUG() do { \ - printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + do_BUG(__FILE__, __LINE__); \ __asm__ __volatile__(".byte 0x0f,0x0b"); \ } while (0) --- linux-2.4.0-test11-pre5/kernel/printk.c Sat Nov 4 16:22:49 2000 +++ linux-akpm/kernel/printk.c Wed Nov 15 23:25:48 2000 @@ -51,6 +51,7 @@ static unsigned long logged_chars; struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES]; static int preferred_console = -1; +int oops_in_progress; /* * Setup a list of consoles. Called from init/main.c @@ -251,6 +252,9 @@ return do_syslog(type, buf, len); } +/* + * printk("") may be used to wake up klogd + */ asmlinkage int printk(const char *fmt, ...) { va_list args; @@ -260,6 +264,8 @@ static signed char msg_level = -1; long flags; + if (oops_in_progress) + spin_lock_init(_lock); spin_lock_irqsave(_lock, flags); va_start(args, fmt); i = vsprintf(buf + 3, fmt, args); /* hopefully i < sizeof(buf)-4 */ @@ -308,7 +314,8 @@ msg_level = -1; } spin_unlock_irqrestore(_lock, flags); - wake_up_interruptible(_wait); + if (!oops_in_progress) + wake_up_interruptible(_wait); return i; } --- linux-2.4.0-test11-pre5/arch/i386/mm/fault.cWed Nov 15 23:00:16 2000 +++ linux-akpm/arch/i386/mm/fault.c Thu Nov 16 00:14:37 2000 @@ -77,17 +77,37 @@ return 0; } -extern spinlock_t console_lock, timerlist_lock; +#ifdef CONFIG_SMP +extern unsigned volatile int global_irq_lock; +#endif /* * Unlock any spinlocks which will prevent us from getting the - * message out (timerlist_lock is aquired through the - * console unblank code) + * message out and tell the printk/console paths that an emergency + * message is coming through */ -void bust_spinlocks(void) +void bust_spinlocks(int yes) +{ + if (yes) { + oops_in_progress = 1; +#ifdef CONFIG_SMP + global_irq_lock = 0;/* Many serial drivers do __global_cli() */ +#endif + } else { + oops_in_progress = 0; + /* +* OK, the message is on the console. Now we call printk() +* without oops_in_progress set so that printk will give klogd +* a poke. Hold onto your hats... +*/ + printk(""); + } +} + +void do_BUG(const char *file, int line) { - spin_lock_init(_lock); - spin_lock_init(_lock); + bust_spinlocks(1); + printk("kernel BUG at %s:%d!\n", file, line); } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); @@ -264,8 +284,7 @@ * terminate things with extreme prejudice. */ - bust_spinlocks(); - + bust_spinlocks(1); if (address < PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); else @@ -283,6 +302,7 @@ printk(KERN_ALERT "*pte = %08lx\n", page); } die("Oops", regs, error_code); + bust_spinlocks(0); do_exit(SIGKILL); /* --- linux-2.4.0-test11-pre5/arch/i386/kernel/traps.cWed Nov 15 23:00:16 2000 +++ linux-akpm/arch/i386/kernel/traps.c Wed Nov 15 23:49:00 2000 @@ -63,7 +63,7 @@ struct desc_struct idt_table[256] __attribute__((__section__(".data.idt"))) = { {0, 0}, }; extern int console_loglevel; -extern void bust_spinlocks(void); +extern void bust_spinlocks(int yes); static inline void console_silent(void) { @@ -209,9 +209,10 @@ { console_verbose(); spin_lock_irq(_lock); +
Re: [prepatch] removal of oops->printk deadlocks
"Pawe³ Kot" wrote: > > In attachment. But don't beat me. I think I found the oops reason. > /dev/shm was not mounted. After mounting it I couldn't get an oops yet. > Could this be a reason? No. >>EIP; c010b5ee <__read_lock_failed+6/18> <= Trace; c02026f7 Trace; c0120404 ^^ This was fixed in test11-pre5. Please retest with that kernel. Thanks again. - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops->printk deadlocks
On Wed, 15 Nov 2000, Andrew Morton wrote: > "Pawe Kot" wrote: > > > > Hi, > > > > The whole oops produced by the 2.4.0-test11pre4 with this patch is in the > > attachment. Other system info is tehere as well. > > > > Oops is produced when runnning big-tables test from mysql sql-bench. > > It's reproducable. The machine is running only mysqld. > > Thank you. Quad 700 meg Xeons? Nice machine, that. Yes. 54 seconds for compiling kernel is quite good result :-) > The NMI oops changes are working well. We can see three CPUs are > stuck in different places spinning on the same lock with interrupts > disabled, but unfortunately something went wrong with your `ksymoops' > run. > > Could you please rerun ksymoops, and make sure that you use the > correct symbol file? Take the `System.map' from the directory > where you built the kernel and run > > ksymoops -m System.map < log_file In attachment. But don't beat me. I think I found the oops reason. /dev/shm was not mounted. After mounting it I couldn't get an oops yet. Could this be a reason? regards and thanks for help pkot -- mailto:[EMAIL PROTECTED] http://urtica.linuxnews.pl/~pkot/ http://newsreader.linuxnews.pl/ ksymoops 2.3.4 on i686 2.4.0-test11. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.0-test11/ (default) -m /usr/src/linux/System.map (specified) Error (regular_file): read_ksyms stat /proc/ksyms failed No modules in ksyms, skipping objects No ksyms, skipping lsmod CPU:0 EIP:0010:[] Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 0097 eax: c0285420 ebx: f6f08000 ecx: edx: bf1fff2c esi: bf1fff2c edi: f6f09fa8 ebp: f6f09fa8 esp: f6f09f90 ds: 0018 es: 0018 ss: 0018 Process mysqld (pid: 2386, stackpage=f6f09000) Stack: c02026f7 f6f08000 bf1fff24 fff2 f6f09fbc c0120404 0952 bf1fff28 bf1ffe34 c010c66b 0952 bf1fff28 bf1fff24 bf1ffe34 009c 002b 002b 009c 081313e4 Call Trace: [] [] [] Code: 78 fb f0 ff 08 78 f3 c3 90 90 90 90 90 90 90 90 90 90 55 57 >>EIP; c010b5ee <__read_lock_failed+6/18> <= Trace; c02026f7 Trace; c0120404 Trace; c010c66b Code; c010b5ee <__read_lock_failed+6/18> <_EIP>: Code; c010b5ee <__read_lock_failed+6/18> <= 0: 78 fb js fffd <_EIP+0xfffd> c010b5eb <__read_lock_failed+3/18> <= Code; c010b5f0 <__read_lock_failed+8/18> 2: f0 ff 08 lock decl (%eax) Code; c010b5f3 <__read_lock_failed+b/18> 5: 78 f3 js fffa <_EIP+0xfffa> c010b5e8 <__read_lock_failed+0/18> Code; c010b5f5 <__read_lock_failed+d/18> 7: c3ret Code; c010b5f6 <__read_lock_failed+e/18> 8: 90nop Code; c010b5f7 <__read_lock_failed+f/18> 9: 90nop Code; c010b5f8 <__read_lock_failed+10/18> a: 90nop Code; c010b5f9 <__read_lock_failed+11/18> b: 90nop Code; c010b5fa <__read_lock_failed+12/18> c: 90nop Code; c010b5fb <__read_lock_failed+13/18> d: 90nop Code; c010b5fc <__read_lock_failed+14/18> e: 90nop Code; c010b5fd <__read_lock_failed+15/18> f: 90nop Code; c010b5fe <__read_lock_failed+16/18> 10: 90nop Code; c010b5ff <__read_lock_failed+17/18> 11: 90nop Code; c010b600 12: 55push %ebp Code; c010b601 13: 57push %edi NMI Watchdog detected LOCKUP on CPU1, registers: CPU:1 EIP:0010:[] EFLAGS: 0082 eax: c028e7ec ebx: 0282 ecx: c4df edx: c028e7ec esi: c011f550 edi: ebp: c4d59ef4 esp: c4d59ef0 ds: 0018 es: 0018 ss: 0018 Process swapper (pid: 0, stackpage=c4d59000) Stack: c4df c0285c40 c0128592 c4df 0020 c0285c40 f70082a0 f7c06800 0086 c0124de5 c02a1220 0020 c0124cc7 0001 c0286040 0020 000e c0124b6c c0286040 c02a1580 c0283800 Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] Code: 80 3d 00 54 28 c0 00 f3 90 7e f5 e9 ba cf f1 ff 80 bb 74 06 >>EIP; c0202590<= Trace; c0128592 Trace; c0124de5 Trace; c0124cc7 Trace; c0124b6c Trace; c010e225 Trace; c010a810 Trace; c010a810 Trace; c010c72c Trace; c010a810 Trace; c010a810 Trace; c0100018 Trace; c010a83d Trace; c010a8a2 Trace; c0116c84 Trace; c0124b6c Trace; c010e225 Code; c0202590 <_EIP>: Code; c0202590<= 0: 80 3d 00 54 28 c0 00 cmpb $0x0,0xc0285400 <= Code;
Re: [prepatch] removal of oops->printk deadlocks
"Pawe³ Kot" wrote: > > Hi, > > The whole oops produced by the 2.4.0-test11pre4 with this patch is in the > attachment. Other system info is tehere as well. > > Oops is produced when runnning big-tables test from mysql sql-bench. > It's reproducable. The machine is running only mysqld. Thank you. Quad 700 meg Xeons? Nice machine, that. The NMI oops changes are working well. We can see three CPUs are stuck in different places spinning on the same lock with interrupts disabled, but unfortunately something went wrong with your `ksymoops' run. Could you please rerun ksymoops, and make sure that you use the correct symbol file? Take the `System.map' from the directory where you built the kernel and run ksymoops -m System.map < log_file - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops->printk deadlocks
On Wed, 15 Nov 2000, Andrew Morton wrote: > [...] Problem is, we're getting some reported lockups in which we need > to know what the other CPUs are doing (the other 1%). If a critical > spinlock is stuck on, we don't get to type `dmesg'. i know - but we cannot please everyone, so i went for the 99% :-) > So... In this updated patch the console_silent() call remains > as you designed it, but the nmi_watchdog kernel boot parameter has > been overloaded so that > > nmi_watchdog=2 > > will now cause _all_ NMI oops messages to be displayed on the console. > > Is this OK? this collides with the UP-IOAPIC path's use of nmi_watchdog == 2 ... i'd rather suggest a cleaner, "nmi_watchdog=2,verbose" (default: silent) type of boot parameter, it looks like the NMI watchdog needs more parameters. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops->printk deadlocks
Ingo Molnar wrote: > > On Tue, 14 Nov 2000, Andrew Morton wrote: > > > It also changes the x86 NMI oopser so that it no longer shuts the > > console up after the first NMI oops. Instead, each CPU is allowed > > to print out NMI diagnostics a single time per reboot. > > this is not how the NMI oopser works. It does *not* shut down the console > - you can still see everything in 'dmesg'. If you want to see it on the > console again, you can do 'dmesg -n 8'. Adding a 'once per reboot' > restriction is unreasonable - there are user-space applications that can > be terminated via the NMI-oopser safely. The patch addresses that - it still calls do_exit() for every NMI timeout. > In 99% of the cases it's the > first oops that counts, and most people do not have serial console set up, > so writing the first oops down from screen is important. Good point. Problem is, we're getting some reported lockups in which we need to know what the other CPUs are doing (the other 1%). If a critical spinlock is stuck on, we don't get to type `dmesg'. So... In this updated patch the console_silent() call remains as you designed it, but the nmi_watchdog kernel boot parameter has been overloaded so that nmi_watchdog=2 will now cause _all_ NMI oops messages to be displayed on the console. Is this OK? --- linux-2.4.0-test11-pre4/include/linux/kernel.h Sun Oct 15 01:27:46 2000 +++ linux-akpm/include/linux/kernel.h Wed Nov 15 20:31:29 2000 @@ -62,6 +62,7 @@ asmlinkage int printk(const char * fmt, ...) __attribute__ ((format (printf, 1, 2))); +extern int oops_in_progress; #if DEBUG #define pr_debug(fmt,arg...) \ --- linux-2.4.0-test11-pre4/include/asm-i386/page.h Sat Nov 4 16:22:49 2000 +++ linux-akpm/include/asm-i386/page.h Wed Nov 15 20:31:29 2000 @@ -88,8 +88,9 @@ * Tell the user there is some problem. Beep too, so we can * see^H^H^Hhear bugs in early bootup as well! */ +extern void do_BUG(const char *file, int line); #define BUG() do { \ - printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + do_BUG(__FILE__, __LINE__); \ __asm__ __volatile__(".byte 0x0f,0x0b"); \ } while (0) --- linux-2.4.0-test11-pre4/kernel/printk.c Sat Nov 4 16:22:49 2000 +++ linux-akpm/kernel/printk.c Wed Nov 15 20:31:29 2000 @@ -51,6 +51,7 @@ static unsigned long logged_chars; struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES]; static int preferred_console = -1; +int oops_in_progress; /* * Setup a list of consoles. Called from init/main.c @@ -260,6 +261,8 @@ static signed char msg_level = -1; long flags; + if (oops_in_progress) + spin_lock_init(_lock); spin_lock_irqsave(_lock, flags); va_start(args, fmt); i = vsprintf(buf + 3, fmt, args); /* hopefully i < sizeof(buf)-4 */ @@ -308,7 +311,8 @@ msg_level = -1; } spin_unlock_irqrestore(_lock, flags); - wake_up_interruptible(_wait); + if (!oops_in_progress) + wake_up_interruptible(_wait); return i; } --- linux-2.4.0-test11-pre4/arch/i386/mm/fault.cMon Nov 13 18:23:49 2000 +++ linux-akpm/arch/i386/mm/fault.c Wed Nov 15 20:31:29 2000 @@ -77,17 +77,40 @@ return 0; } -extern spinlock_t console_lock, timerlist_lock; +#ifdef CONFIG_SMP +extern unsigned volatile int global_irq_lock; +#endif /* * Unlock any spinlocks which will prevent us from getting the - * message out (timerlist_lock is aquired through the - * console unblank code) + * message out and tell the printk/console paths that an emergency + * message is coming through + */ +void bust_spinlocks(int yes) +{ + if (yes) { + oops_in_progress = 1; +#ifdef CONFIG_SMP + global_irq_lock = 0;/* Many serial drivers do __global_cli() */ +#endif + } else { + oops_in_progress = 0; + /* +* OK, the message is on the console. Now we call printk() +* without oops_in_progress set so that printk will give syslogd +* a poke. Hold onto your hats... +*/ + printk(""); + } +} + +/* + * Called from the BUG() macro. */ -void bust_spinlocks(void) +void do_BUG(const char *file, int line) { - spin_lock_init(_lock); - spin_lock_init(_lock); + bust_spinlocks(1); + printk("kernel BUG at %s:%d!\n", file, line); } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); @@ -264,8 +287,7 @@ * terminate things with extreme prejudice. */ - bust_spinlocks(); - + bust_spinlocks(1); if (address < PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); else @@ -283,6 +305,7 @@ printk(KERN_ALERT "*pte = %08lx\n", page); } die("Oops", regs, error_code); + bust_spinlocks(0); do_exit(SIGKILL); /*
Re: [prepatch] removal of oops->printk deadlocks
On Tue, 14 Nov 2000, Andrew Morton wrote: > It also changes the x86 NMI oopser so that it no longer shuts the > console up after the first NMI oops. Instead, each CPU is allowed > to print out NMI diagnostics a single time per reboot. this is not how the NMI oopser works. It does *not* shut down the console - you can still see everything in 'dmesg'. If you want to see it on the console again, you can do 'dmesg -n 8'. Adding a 'once per reboot' restriction is unreasonable - there are user-space applications that can be terminated via the NMI-oopser safely. In 99% of the cases it's the first oops that counts, and most people do not have serial console set up, so writing the first oops down from screen is important. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops-printk deadlocks
On Tue, 14 Nov 2000, Andrew Morton wrote: It also changes the x86 NMI oopser so that it no longer shuts the console up after the first NMI oops. Instead, each CPU is allowed to print out NMI diagnostics a single time per reboot. this is not how the NMI oopser works. It does *not* shut down the console - you can still see everything in 'dmesg'. If you want to see it on the console again, you can do 'dmesg -n 8'. Adding a 'once per reboot' restriction is unreasonable - there are user-space applications that can be terminated via the NMI-oopser safely. In 99% of the cases it's the first oops that counts, and most people do not have serial console set up, so writing the first oops down from screen is important. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops-printk deadlocks
Ingo Molnar wrote: On Tue, 14 Nov 2000, Andrew Morton wrote: It also changes the x86 NMI oopser so that it no longer shuts the console up after the first NMI oops. Instead, each CPU is allowed to print out NMI diagnostics a single time per reboot. this is not how the NMI oopser works. It does *not* shut down the console - you can still see everything in 'dmesg'. If you want to see it on the console again, you can do 'dmesg -n 8'. Adding a 'once per reboot' restriction is unreasonable - there are user-space applications that can be terminated via the NMI-oopser safely. The patch addresses that - it still calls do_exit() for every NMI timeout. In 99% of the cases it's the first oops that counts, and most people do not have serial console set up, so writing the first oops down from screen is important. Good point. Problem is, we're getting some reported lockups in which we need to know what the other CPUs are doing (the other 1%). If a critical spinlock is stuck on, we don't get to type `dmesg'. So... In this updated patch the console_silent() call remains as you designed it, but the nmi_watchdog kernel boot parameter has been overloaded so that nmi_watchdog=2 will now cause _all_ NMI oops messages to be displayed on the console. Is this OK? --- linux-2.4.0-test11-pre4/include/linux/kernel.h Sun Oct 15 01:27:46 2000 +++ linux-akpm/include/linux/kernel.h Wed Nov 15 20:31:29 2000 @@ -62,6 +62,7 @@ asmlinkage int printk(const char * fmt, ...) __attribute__ ((format (printf, 1, 2))); +extern int oops_in_progress; #if DEBUG #define pr_debug(fmt,arg...) \ --- linux-2.4.0-test11-pre4/include/asm-i386/page.h Sat Nov 4 16:22:49 2000 +++ linux-akpm/include/asm-i386/page.h Wed Nov 15 20:31:29 2000 @@ -88,8 +88,9 @@ * Tell the user there is some problem. Beep too, so we can * see^H^H^Hhear bugs in early bootup as well! */ +extern void do_BUG(const char *file, int line); #define BUG() do { \ - printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + do_BUG(__FILE__, __LINE__); \ __asm__ __volatile__(".byte 0x0f,0x0b"); \ } while (0) --- linux-2.4.0-test11-pre4/kernel/printk.c Sat Nov 4 16:22:49 2000 +++ linux-akpm/kernel/printk.c Wed Nov 15 20:31:29 2000 @@ -51,6 +51,7 @@ static unsigned long logged_chars; struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES]; static int preferred_console = -1; +int oops_in_progress; /* * Setup a list of consoles. Called from init/main.c @@ -260,6 +261,8 @@ static signed char msg_level = -1; long flags; + if (oops_in_progress) + spin_lock_init(console_lock); spin_lock_irqsave(console_lock, flags); va_start(args, fmt); i = vsprintf(buf + 3, fmt, args); /* hopefully i sizeof(buf)-4 */ @@ -308,7 +311,8 @@ msg_level = -1; } spin_unlock_irqrestore(console_lock, flags); - wake_up_interruptible(log_wait); + if (!oops_in_progress) + wake_up_interruptible(log_wait); return i; } --- linux-2.4.0-test11-pre4/arch/i386/mm/fault.cMon Nov 13 18:23:49 2000 +++ linux-akpm/arch/i386/mm/fault.c Wed Nov 15 20:31:29 2000 @@ -77,17 +77,40 @@ return 0; } -extern spinlock_t console_lock, timerlist_lock; +#ifdef CONFIG_SMP +extern unsigned volatile int global_irq_lock; +#endif /* * Unlock any spinlocks which will prevent us from getting the - * message out (timerlist_lock is aquired through the - * console unblank code) + * message out and tell the printk/console paths that an emergency + * message is coming through + */ +void bust_spinlocks(int yes) +{ + if (yes) { + oops_in_progress = 1; +#ifdef CONFIG_SMP + global_irq_lock = 0;/* Many serial drivers do __global_cli() */ +#endif + } else { + oops_in_progress = 0; + /* +* OK, the message is on the console. Now we call printk() +* without oops_in_progress set so that printk will give syslogd +* a poke. Hold onto your hats... +*/ + printk(""); + } +} + +/* + * Called from the BUG() macro. */ -void bust_spinlocks(void) +void do_BUG(const char *file, int line) { - spin_lock_init(console_lock); - spin_lock_init(timerlist_lock); + bust_spinlocks(1); + printk("kernel BUG at %s:%d!\n", file, line); } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); @@ -264,8 +287,7 @@ * terminate things with extreme prejudice. */ - bust_spinlocks(); - + bust_spinlocks(1); if (address PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); else @@ -283,6 +305,7 @@ printk(KERN_ALERT "*pte = %08lx\n", page); } die("Oops", regs, error_code); + bust_spinlocks(0);
Re: [prepatch] removal of oops-printk deadlocks
On Wed, 15 Nov 2000, Andrew Morton wrote: [...] Problem is, we're getting some reported lockups in which we need to know what the other CPUs are doing (the other 1%). If a critical spinlock is stuck on, we don't get to type `dmesg'. i know - but we cannot please everyone, so i went for the 99% :-) So... In this updated patch the console_silent() call remains as you designed it, but the nmi_watchdog kernel boot parameter has been overloaded so that nmi_watchdog=2 will now cause _all_ NMI oops messages to be displayed on the console. Is this OK? this collides with the UP-IOAPIC path's use of nmi_watchdog == 2 ... i'd rather suggest a cleaner, "nmi_watchdog=2,verbose" (default: silent) type of boot parameter, it looks like the NMI watchdog needs more parameters. Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops-printk deadlocks
"Pawe³ Kot" wrote: Hi, The whole oops produced by the 2.4.0-test11pre4 with this patch is in the attachment. Other system info is tehere as well. Oops is produced when runnning big-tables test from mysql sql-bench. It's reproducable. The machine is running only mysqld. Thank you. Quad 700 meg Xeons? Nice machine, that. The NMI oops changes are working well. We can see three CPUs are stuck in different places spinning on the same lock with interrupts disabled, but unfortunately something went wrong with your `ksymoops' run. Could you please rerun ksymoops, and make sure that you use the correct symbol file? Take the `System.map' from the directory where you built the kernel and run ksymoops -m System.map log_file - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops-printk deadlocks
On Wed, 15 Nov 2000, Andrew Morton wrote: "Pawe Kot" wrote: Hi, The whole oops produced by the 2.4.0-test11pre4 with this patch is in the attachment. Other system info is tehere as well. Oops is produced when runnning big-tables test from mysql sql-bench. It's reproducable. The machine is running only mysqld. Thank you. Quad 700 meg Xeons? Nice machine, that. Yes. 54 seconds for compiling kernel is quite good result :-) The NMI oops changes are working well. We can see three CPUs are stuck in different places spinning on the same lock with interrupts disabled, but unfortunately something went wrong with your `ksymoops' run. Could you please rerun ksymoops, and make sure that you use the correct symbol file? Take the `System.map' from the directory where you built the kernel and run ksymoops -m System.map log_file In attachment. But don't beat me. I think I found the oops reason. /dev/shm was not mounted. After mounting it I couldn't get an oops yet. Could this be a reason? regards and thanks for help pkot -- mailto:[EMAIL PROTECTED] http://urtica.linuxnews.pl/~pkot/ http://newsreader.linuxnews.pl/ ksymoops 2.3.4 on i686 2.4.0-test11. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.0-test11/ (default) -m /usr/src/linux/System.map (specified) Error (regular_file): read_ksyms stat /proc/ksyms failed No modules in ksyms, skipping objects No ksyms, skipping lsmod CPU:0 EIP:0010:[c010b5ee] Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 0097 eax: c0285420 ebx: f6f08000 ecx: edx: bf1fff2c esi: bf1fff2c edi: f6f09fa8 ebp: f6f09fa8 esp: f6f09f90 ds: 0018 es: 0018 ss: 0018 Process mysqld (pid: 2386, stackpage=f6f09000) Stack: c02026f7 f6f08000 bf1fff24 fff2 f6f09fbc c0120404 0952 bf1fff28 bf1ffe34 c010c66b 0952 bf1fff28 bf1fff24 bf1ffe34 009c 002b 002b 009c 081313e4 Call Trace: [c02026f7] [c0120404] [c010c66b] Code: 78 fb f0 ff 08 78 f3 c3 90 90 90 90 90 90 90 90 90 90 55 57 EIP; c010b5ee __read_lock_failed+6/18 = Trace; c02026f7 stext_lock+86b/8c14 Trace; c0120404 sys_sched_setscheduler+14/18 Trace; c010c66b system_call+33/38 Code; c010b5ee __read_lock_failed+6/18 _EIP: Code; c010b5ee __read_lock_failed+6/18 = 0: 78 fb js fffd _EIP+0xfffd c010b5eb __read_lock_failed+3/18 = Code; c010b5f0 __read_lock_failed+8/18 2: f0 ff 08 lock decl (%eax) Code; c010b5f3 __read_lock_failed+b/18 5: 78 f3 js fffa _EIP+0xfffa c010b5e8 __read_lock_failed+0/18 Code; c010b5f5 __read_lock_failed+d/18 7: c3ret Code; c010b5f6 __read_lock_failed+e/18 8: 90nop Code; c010b5f7 __read_lock_failed+f/18 9: 90nop Code; c010b5f8 __read_lock_failed+10/18 a: 90nop Code; c010b5f9 __read_lock_failed+11/18 b: 90nop Code; c010b5fa __read_lock_failed+12/18 c: 90nop Code; c010b5fb __read_lock_failed+13/18 d: 90nop Code; c010b5fc __read_lock_failed+14/18 e: 90nop Code; c010b5fd __read_lock_failed+15/18 f: 90nop Code; c010b5fe __read_lock_failed+16/18 10: 90nop Code; c010b5ff __read_lock_failed+17/18 11: 90nop Code; c010b600 copy_siginfo_to_user+0/a8 12: 55push %ebp Code; c010b601 copy_siginfo_to_user+1/a8 13: 57push %edi NMI Watchdog detected LOCKUP on CPU1, registers: CPU:1 EIP:0010:[c0202590] EFLAGS: 0082 eax: c028e7ec ebx: 0282 ecx: c4df edx: c028e7ec esi: c011f550 edi: ebp: c4d59ef4 esp: c4d59ef0 ds: 0018 es: 0018 ss: 0018 Process swapper (pid: 0, stackpage=c4d59000) Stack: c4df c0285c40 c0128592 c4df 0020 c0285c40 f70082a0 f7c06800 0086 c0124de5 c02a1220 0020 c0124cc7 0001 c0286040 0020 000e c0124b6c c0286040 c02a1580 c0283800 Call Trace: [c0128592] [c0124de5] [c0124cc7] [c0124b6c] [c010e225] [c010a810] [c010a810] [c010c72c] [c010a810] [c010a810] [c0100018] [c010a83d] [c010a8a2] [c0116c84] [c0124b6c] [c010e225] Code: 80 3d 00 54 28 c0 00 f3 90 7e f5 e9 ba cf f1 ff 80 bb 74 06 EIP; c0202590 stext_lock+704/8c14 = Trace; c0128592 timer_bh+256/2b0 Trace; c0124de5 bh_action+4d/b0 Trace; c0124cc7 tasklet_hi_action+4f/7c Trace; c0124b6c do_softirq+5c/8c Trace; c010e225 do_IRQ+e5/f4 Trace; c010a810 default_idle+0/34
Re: [prepatch] removal of oops-printk deadlocks
"Pawe³ Kot" wrote: In attachment. But don't beat me. I think I found the oops reason. /dev/shm was not mounted. After mounting it I couldn't get an oops yet. Could this be a reason? No. EIP; c010b5ee __read_lock_failed+6/18 = Trace; c02026f7 stext_lock+86b/8c14 Trace; c0120404 sys_sched_setscheduler+14/18 ^^ This was fixed in test11-pre5. Please retest with that kernel. Thanks again. - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops-printk deadlocks
Ingo Molnar wrote: So... In this updated patch the console_silent() call remains as you designed it, but the nmi_watchdog kernel boot parameter has been overloaded so that nmi_watchdog=2 will now cause _all_ NMI oops messages to be displayed on the console. Is this OK? this collides with the UP-IOAPIC path's use of nmi_watchdog == 2 ... i'd rather suggest a cleaner, "nmi_watchdog=2,verbose" (default: silent) type of boot parameter, it looks like the NMI watchdog needs more parameters. OK, so the options are now: nmi_watchdog=0 nmi_watchdog=1 nmi_watchdog=1,verbose Third time lucky? --- linux-2.4.0-test11-pre5/include/linux/kernel.h Sun Oct 15 01:27:46 2000 +++ linux-akpm/include/linux/kernel.h Wed Nov 15 23:25:48 2000 @@ -62,6 +62,7 @@ asmlinkage int printk(const char * fmt, ...) __attribute__ ((format (printf, 1, 2))); +extern int oops_in_progress; #if DEBUG #define pr_debug(fmt,arg...) \ --- linux-2.4.0-test11-pre5/include/asm-i386/page.h Sat Nov 4 16:22:49 2000 +++ linux-akpm/include/asm-i386/page.h Wed Nov 15 23:27:32 2000 @@ -88,8 +88,9 @@ * Tell the user there is some problem. Beep too, so we can * see^H^H^Hhear bugs in early bootup as well! */ +extern void do_BUG(const char *file, int line); #define BUG() do { \ - printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + do_BUG(__FILE__, __LINE__); \ __asm__ __volatile__(".byte 0x0f,0x0b"); \ } while (0) --- linux-2.4.0-test11-pre5/kernel/printk.c Sat Nov 4 16:22:49 2000 +++ linux-akpm/kernel/printk.c Wed Nov 15 23:25:48 2000 @@ -51,6 +51,7 @@ static unsigned long logged_chars; struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES]; static int preferred_console = -1; +int oops_in_progress; /* * Setup a list of consoles. Called from init/main.c @@ -251,6 +252,9 @@ return do_syslog(type, buf, len); } +/* + * printk("") may be used to wake up klogd + */ asmlinkage int printk(const char *fmt, ...) { va_list args; @@ -260,6 +264,8 @@ static signed char msg_level = -1; long flags; + if (oops_in_progress) + spin_lock_init(console_lock); spin_lock_irqsave(console_lock, flags); va_start(args, fmt); i = vsprintf(buf + 3, fmt, args); /* hopefully i sizeof(buf)-4 */ @@ -308,7 +314,8 @@ msg_level = -1; } spin_unlock_irqrestore(console_lock, flags); - wake_up_interruptible(log_wait); + if (!oops_in_progress) + wake_up_interruptible(log_wait); return i; } --- linux-2.4.0-test11-pre5/arch/i386/mm/fault.cWed Nov 15 23:00:16 2000 +++ linux-akpm/arch/i386/mm/fault.c Thu Nov 16 00:14:37 2000 @@ -77,17 +77,37 @@ return 0; } -extern spinlock_t console_lock, timerlist_lock; +#ifdef CONFIG_SMP +extern unsigned volatile int global_irq_lock; +#endif /* * Unlock any spinlocks which will prevent us from getting the - * message out (timerlist_lock is aquired through the - * console unblank code) + * message out and tell the printk/console paths that an emergency + * message is coming through */ -void bust_spinlocks(void) +void bust_spinlocks(int yes) +{ + if (yes) { + oops_in_progress = 1; +#ifdef CONFIG_SMP + global_irq_lock = 0;/* Many serial drivers do __global_cli() */ +#endif + } else { + oops_in_progress = 0; + /* +* OK, the message is on the console. Now we call printk() +* without oops_in_progress set so that printk will give klogd +* a poke. Hold onto your hats... +*/ + printk(""); + } +} + +void do_BUG(const char *file, int line) { - spin_lock_init(console_lock); - spin_lock_init(timerlist_lock); + bust_spinlocks(1); + printk("kernel BUG at %s:%d!\n", file, line); } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); @@ -264,8 +284,7 @@ * terminate things with extreme prejudice. */ - bust_spinlocks(); - + bust_spinlocks(1); if (address PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); else @@ -283,6 +302,7 @@ printk(KERN_ALERT "*pte = %08lx\n", page); } die("Oops", regs, error_code); + bust_spinlocks(0); do_exit(SIGKILL); /* --- linux-2.4.0-test11-pre5/arch/i386/kernel/traps.cWed Nov 15 23:00:16 2000 +++ linux-akpm/arch/i386/kernel/traps.c Wed Nov 15 23:49:00 2000 @@ -63,7 +63,7 @@ struct desc_struct idt_table[256] __attribute__((__section__(".data.idt"))) = { {0, 0}, }; extern int console_loglevel; -extern void bust_spinlocks(void); +extern void bust_spinlocks(int yes); static inline void console_silent(void) { @@ -209,9 +209,10 @@ { console_verbose();
Re: [prepatch] removal of oops-printk deadlocks
On Thu, 16 Nov 2000, Andrew Morton wrote: OK, so the options are now: nmi_watchdog=0 nmi_watchdog=1 nmi_watchdog=1,verbose looks good for me! Ingo - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [prepatch] removal of oops->printk deadlocks
Hi, The whole oops produced by the 2.4.0-test11pre4 with this patch is in the attachment. Other system info is tehere as well. Oops is produced when runnning big-tables test from mysql sql-bench. It's reproducable. The machine is running only mysqld. regards pkot -- mailto:[EMAIL PROTECTED] http://urtica.linuxnews.pl/~pkot/ http://newsreader.linuxnews.pl/ ksymoops 2.3.4 on i686 2.4.0-test11. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.0-test11/ (default) -m /boot/System.map-2.4.0-test11 (default) Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution. If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options. Error (regular_file): read_ksyms stat /proc/ksyms failed No modules in ksyms, skipping objects No ksyms, skipping lsmod CPU:0 EIP:0010:[] Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 0097 eax: c0285420 ebx: f6f08000 ecx: edx: bf1fff2c esi: bf1fff2c edi: f6f09fa8 ebp: f6f09fa8 esp: f6f09f90 ds: 0018 es: 0018 ss: 0018 Process mysqld (pid: 2386, stackpage=f6f09000) Stack: c02026f7 f6f08000 bf1fff24 fff2 f6f09fbc c0120404 0952 bf1fff28 bf1ffe34 c010c66b 0952 bf1fff28 bf1fff24 bf1ffe34 009c 002b 002b 009c 081313e4 Call Trace: [] [] [] Code: 78 fb f0 ff 08 78 f3 c3 90 90 90 90 90 90 90 90 90 90 55 57 >>EIP; c010b5ee<= Trace; c02026f7 Trace; c0120404 Trace; c010c66b <__global_cli+b/124> Code; c010b5ee <_EIP>: Code; c010b5ee<= 0: 78 fb js fffd <_EIP+0xfffd> c010b5eb <= Code; c010b5f0 2: f0 ff 08 lock decl (%eax) Code; c010b5f3 5: 78 f3 js fffa <_EIP+0xfffa> c010b5e8 Code; c010b5f5 7: c3ret Code; c010b5f6 8: 90nop Code; c010b5f7 9: 90nop Code; c010b5f8 a: 90nop Code; c010b5f9 b: 90nop Code; c010b5fa c: 90nop Code; c010b5fb d: 90nop Code; c010b5fc e: 90nop Code; c010b5fd f: 90nop Code; c010b5fe 10: 90nop Code; c010b5ff 11: 90nop Code; c010b600 12: 55push %ebp Code; c010b601 13: 57push %edi NMI Watchdog detected LOCKUP on CPU1, registers: CPU:1 EIP:0010:[] EFLAGS: 0082 eax: c028e7ec ebx: 0282 ecx: c4df edx: c028e7ec esi: c011f550 edi: ebp: c4d59ef4 esp: c4d59ef0 ds: 0018 es: 0018 ss: 0018 Process swapper (pid: 0, stackpage=c4d59000) Stack: c4df c0285c40 c0128592 c4df 0020 c0285c40 f70082a0 f7c06800 0086 c0124de5 c02a1220 0020 c0124cc7 0001 c0286040 0020 000e c0124b6c c0286040 c02a1580 c0283800 Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] Code: 80 3d 00 54 28 c0 00 f3 90 7e f5 e9 ba cf f1 ff 80 bb 74 06 >>EIP; c0202590<= Trace; c0128592 Trace; c0124de5 Trace; c0124cc7 Trace; c0124b6c Trace; c010e225 Trace; c010a810 Trace; c010a810 Trace; c010c72c <__global_cli+cc/124> Trace; c010a810 Trace; c010a810 Trace; c0100018 Trace; c010a83d Trace; c010a8a2 Trace; c0116c84 Trace; c0124b6c Trace; c010e225 Code; c0202590 <_EIP>: Code; c0202590<= 0: 80 3d 00 54 28 c0 00 cmpb $0x0,0xc0285400 <= Code; c0202597 7: f3 90 repz nop Code; c0202599 9: 7e f5 jle0 <_EIP> Code; c020259b b: e9 ba cf f1 ffjmpfff1cfca <_EIP+0xfff1cfca> c011f55a Code; c02025a0 10: 80 bb 74 06 00 00 00 cmpb $0x0,0x674(%ebx) NMI Watchdog detected LOCKUP on CPU2, registers: CPU:2 EIP:0010:[] EFLAGS: 0082 eax: ebx: c0285480 ecx: 0040 edx: 0002 esi: c4d7a000 edi: c010a810 ebp: c4d7bfac esp: c4d7bf78 ds: 0018 es: 0018 ss: 0018 Process swapper (pid: 0, stackpage=c4d7b000) Stack: c010a810 c4d7a000 c010a810 c4d7a000 c4d7a000 c010a810 c4d70018 c0100018 0002 c4d7a000 c0285480 c010a8ae 0003 c0116c84 c0289373 c012293f
[prepatch] removal of oops->printk deadlocks
This patch is designed to remove the things which can cause loss of oops information due to the SMP deadlocks which can occur during the oops processing itself. It's fairly x86-specific, but the infrastructure is in place for other SMP-capable architectures to use. It also changes the x86 NMI oopser so that it no longer shuts the console up after the first NMI oops. Instead, each CPU is allowed to print out NMI diagnostics a single time per reboot. This patch only addresses console-on-vgacon. If the console is on a serial device then there may still be deadlocks. The serial drivers can individually test `oops_in_progress' to avoid this. The call to poke_blanked_console() is resurrected, but it's no longer deadlocky. This is fairly important - without this change a machine could be wedged with oops diagnostics on its screen, but the screen would remain blanked. The patch addresses oopses, die(), NMI oopses and BUG(). The changes to BUG() result in 3 kbytes less .rodata. There's a tiny kernel module at http://www.uow.edu.au/~andrewm/linux/ddeadlock.c which may be used to test the NMI oopser changes. Comments are welcome! --- linux-2.4.0-test11-pre4/include/linux/kernel.h Sun Oct 15 01:27:46 2000 +++ linux-akpm/include/linux/kernel.h Tue Nov 14 21:02:18 2000 @@ -62,6 +62,7 @@ asmlinkage int printk(const char * fmt, ...) __attribute__ ((format (printf, 1, 2))); +extern int oops_in_progress; #if DEBUG #define pr_debug(fmt,arg...) \ --- linux-2.4.0-test11-pre4/include/asm-i386/page.h Sat Nov 4 16:22:49 2000 +++ linux-akpm/include/asm-i386/page.h Tue Nov 14 21:02:18 2000 @@ -88,8 +88,9 @@ * Tell the user there is some problem. Beep too, so we can * see^H^H^Hhear bugs in early bootup as well! */ +extern void do_BUG(const char *file, int line); #define BUG() do { \ - printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + do_BUG(__FILE__, __LINE__); \ __asm__ __volatile__(".byte 0x0f,0x0b"); \ } while (0) --- linux-2.4.0-test11-pre4/kernel/printk.c Sat Nov 4 16:22:49 2000 +++ linux-akpm/kernel/printk.c Tue Nov 14 21:02:18 2000 @@ -51,6 +51,7 @@ static unsigned long logged_chars; struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES]; static int preferred_console = -1; +int oops_in_progress; /* * Setup a list of consoles. Called from init/main.c @@ -260,6 +261,8 @@ static signed char msg_level = -1; long flags; + if (oops_in_progress) + spin_lock_init(_lock); spin_lock_irqsave(_lock, flags); va_start(args, fmt); i = vsprintf(buf + 3, fmt, args); /* hopefully i < sizeof(buf)-4 */ @@ -308,7 +311,8 @@ msg_level = -1; } spin_unlock_irqrestore(_lock, flags); - wake_up_interruptible(_wait); + if (!oops_in_progress) + wake_up_interruptible(_wait); return i; } --- linux-2.4.0-test11-pre4/arch/i386/mm/fault.cMon Nov 13 18:23:49 2000 +++ linux-akpm/arch/i386/mm/fault.c Tue Nov 14 21:02:18 2000 @@ -77,17 +77,40 @@ return 0; } -extern spinlock_t console_lock, timerlist_lock; +#ifdef CONFIG_SMP +extern unsigned volatile int global_irq_lock; +#endif /* * Unlock any spinlocks which will prevent us from getting the - * message out (timerlist_lock is aquired through the - * console unblank code) + * message out and tell the printk/console paths that an emergency + * message is coming through + */ +void bust_spinlocks(int yes) +{ + if (yes) { + oops_in_progress = 1; +#ifdef CONFIG_SMP + global_irq_lock = 0;/* Many serial drivers do __global_cli() */ +#endif + } else { + oops_in_progress = 0; + /* +* OK, the message is on the console. Now we call printk() +* without oops_in_progress set so that printk will give syslogd +* a poke. Hold onto your hats... +*/ + printk(""); + } +} + +/* + * Called from the BUG() macro. */ -void bust_spinlocks(void) +void do_BUG(const char *file, int line) { - spin_lock_init(_lock); - spin_lock_init(_lock); + bust_spinlocks(1); + printk("kernel BUG at %s:%d!\n", file, line); } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); @@ -264,8 +287,7 @@ * terminate things with extreme prejudice. */ - bust_spinlocks(); - + bust_spinlocks(1); if (address < PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); else @@ -283,6 +305,7 @@ printk(KERN_ALERT "*pte = %08lx\n", page); } die("Oops", regs, error_code); + bust_spinlocks(0); do_exit(SIGKILL); /* --- linux-2.4.0-test11-pre4/arch/i386/kernel/traps.cMon Nov 13 18:23:49 2000 +++ linux-akpm/arch/i386/kernel/traps.c Tue Nov 14 21:11:10 2000 @@ -63,7 +63,7 @@
[prepatch] removal of oops-printk deadlocks
This patch is designed to remove the things which can cause loss of oops information due to the SMP deadlocks which can occur during the oops processing itself. It's fairly x86-specific, but the infrastructure is in place for other SMP-capable architectures to use. It also changes the x86 NMI oopser so that it no longer shuts the console up after the first NMI oops. Instead, each CPU is allowed to print out NMI diagnostics a single time per reboot. This patch only addresses console-on-vgacon. If the console is on a serial device then there may still be deadlocks. The serial drivers can individually test `oops_in_progress' to avoid this. The call to poke_blanked_console() is resurrected, but it's no longer deadlocky. This is fairly important - without this change a machine could be wedged with oops diagnostics on its screen, but the screen would remain blanked. The patch addresses oopses, die(), NMI oopses and BUG(). The changes to BUG() result in 3 kbytes less .rodata. There's a tiny kernel module at http://www.uow.edu.au/~andrewm/linux/ddeadlock.c which may be used to test the NMI oopser changes. Comments are welcome! --- linux-2.4.0-test11-pre4/include/linux/kernel.h Sun Oct 15 01:27:46 2000 +++ linux-akpm/include/linux/kernel.h Tue Nov 14 21:02:18 2000 @@ -62,6 +62,7 @@ asmlinkage int printk(const char * fmt, ...) __attribute__ ((format (printf, 1, 2))); +extern int oops_in_progress; #if DEBUG #define pr_debug(fmt,arg...) \ --- linux-2.4.0-test11-pre4/include/asm-i386/page.h Sat Nov 4 16:22:49 2000 +++ linux-akpm/include/asm-i386/page.h Tue Nov 14 21:02:18 2000 @@ -88,8 +88,9 @@ * Tell the user there is some problem. Beep too, so we can * see^H^H^Hhear bugs in early bootup as well! */ +extern void do_BUG(const char *file, int line); #define BUG() do { \ - printk("kernel BUG at %s:%d!\n", __FILE__, __LINE__); \ + do_BUG(__FILE__, __LINE__); \ __asm__ __volatile__(".byte 0x0f,0x0b"); \ } while (0) --- linux-2.4.0-test11-pre4/kernel/printk.c Sat Nov 4 16:22:49 2000 +++ linux-akpm/kernel/printk.c Tue Nov 14 21:02:18 2000 @@ -51,6 +51,7 @@ static unsigned long logged_chars; struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES]; static int preferred_console = -1; +int oops_in_progress; /* * Setup a list of consoles. Called from init/main.c @@ -260,6 +261,8 @@ static signed char msg_level = -1; long flags; + if (oops_in_progress) + spin_lock_init(console_lock); spin_lock_irqsave(console_lock, flags); va_start(args, fmt); i = vsprintf(buf + 3, fmt, args); /* hopefully i sizeof(buf)-4 */ @@ -308,7 +311,8 @@ msg_level = -1; } spin_unlock_irqrestore(console_lock, flags); - wake_up_interruptible(log_wait); + if (!oops_in_progress) + wake_up_interruptible(log_wait); return i; } --- linux-2.4.0-test11-pre4/arch/i386/mm/fault.cMon Nov 13 18:23:49 2000 +++ linux-akpm/arch/i386/mm/fault.c Tue Nov 14 21:02:18 2000 @@ -77,17 +77,40 @@ return 0; } -extern spinlock_t console_lock, timerlist_lock; +#ifdef CONFIG_SMP +extern unsigned volatile int global_irq_lock; +#endif /* * Unlock any spinlocks which will prevent us from getting the - * message out (timerlist_lock is aquired through the - * console unblank code) + * message out and tell the printk/console paths that an emergency + * message is coming through + */ +void bust_spinlocks(int yes) +{ + if (yes) { + oops_in_progress = 1; +#ifdef CONFIG_SMP + global_irq_lock = 0;/* Many serial drivers do __global_cli() */ +#endif + } else { + oops_in_progress = 0; + /* +* OK, the message is on the console. Now we call printk() +* without oops_in_progress set so that printk will give syslogd +* a poke. Hold onto your hats... +*/ + printk(""); + } +} + +/* + * Called from the BUG() macro. */ -void bust_spinlocks(void) +void do_BUG(const char *file, int line) { - spin_lock_init(console_lock); - spin_lock_init(timerlist_lock); + bust_spinlocks(1); + printk("kernel BUG at %s:%d!\n", file, line); } asmlinkage void do_invalid_op(struct pt_regs *, unsigned long); @@ -264,8 +287,7 @@ * terminate things with extreme prejudice. */ - bust_spinlocks(); - + bust_spinlocks(1); if (address PAGE_SIZE) printk(KERN_ALERT "Unable to handle kernel NULL pointer dereference"); else @@ -283,6 +305,7 @@ printk(KERN_ALERT "*pte = %08lx\n", page); } die("Oops", regs, error_code); + bust_spinlocks(0); do_exit(SIGKILL); /* --- linux-2.4.0-test11-pre4/arch/i386/kernel/traps.cMon Nov 13 18:23:49 2000 +++ linux-akpm/arch/i386/kernel/traps.c Tue Nov
Re: [prepatch] removal of oops-printk deadlocks
Hi, The whole oops produced by the 2.4.0-test11pre4 with this patch is in the attachment. Other system info is tehere as well. Oops is produced when runnning big-tables test from mysql sql-bench. It's reproducable. The machine is running only mysqld. regards pkot -- mailto:[EMAIL PROTECTED] http://urtica.linuxnews.pl/~pkot/ http://newsreader.linuxnews.pl/ ksymoops 2.3.4 on i686 2.4.0-test11. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.0-test11/ (default) -m /boot/System.map-2.4.0-test11 (default) Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution. If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options. Error (regular_file): read_ksyms stat /proc/ksyms failed No modules in ksyms, skipping objects No ksyms, skipping lsmod CPU:0 EIP:0010:[c010b5ee] Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 0097 eax: c0285420 ebx: f6f08000 ecx: edx: bf1fff2c esi: bf1fff2c edi: f6f09fa8 ebp: f6f09fa8 esp: f6f09f90 ds: 0018 es: 0018 ss: 0018 Process mysqld (pid: 2386, stackpage=f6f09000) Stack: c02026f7 f6f08000 bf1fff24 fff2 f6f09fbc c0120404 0952 bf1fff28 bf1ffe34 c010c66b 0952 bf1fff28 bf1fff24 bf1ffe34 009c 002b 002b 009c 081313e4 Call Trace: [c02026f7] [c0120404] [c010c66b] Code: 78 fb f0 ff 08 78 f3 c3 90 90 90 90 90 90 90 90 90 90 55 57 EIP; c010b5ee show_registers+12e/13c = Trace; c02026f7 stext_lock+474b/8d34 Trace; c0120404 sys_nanosleep+16c/178 Trace; c010c66b __global_cli+b/124 Code; c010b5ee show_registers+12e/13c _EIP: Code; c010b5ee show_registers+12e/13c = 0: 78 fb js fffd _EIP+0xfffd c010b5eb show_registers+12b/13c = Code; c010b5f0 show_registers+130/13c 2: f0 ff 08 lock decl (%eax) Code; c010b5f3 show_registers+133/13c 5: 78 f3 js fffa _EIP+0xfffa c010b5e8 show_registers+128/13c Code; c010b5f5 show_registers+135/13c 7: c3ret Code; c010b5f6 show_registers+136/13c 8: 90nop Code; c010b5f7 show_registers+137/13c 9: 90nop Code; c010b5f8 show_registers+138/13c a: 90nop Code; c010b5f9 show_registers+139/13c b: 90nop Code; c010b5fa show_registers+13a/13c c: 90nop Code; c010b5fb show_registers+13b/13c d: 90nop Code; c010b5fc die+0/58 e: 90nop Code; c010b5fd die+1/58 f: 90nop Code; c010b5fe die+2/58 10: 90nop Code; c010b5ff die+3/58 11: 90nop Code; c010b600 die+4/58 12: 55push %ebp Code; c010b601 die+5/58 13: 57push %edi NMI Watchdog detected LOCKUP on CPU1, registers: CPU:1 EIP:0010:[c0202590] EFLAGS: 0082 eax: c028e7ec ebx: 0282 ecx: c4df edx: c028e7ec esi: c011f550 edi: ebp: c4d59ef4 esp: c4d59ef0 ds: 0018 es: 0018 ss: 0018 Process swapper (pid: 0, stackpage=c4d59000) Stack: c4df c0285c40 c0128592 c4df 0020 c0285c40 f70082a0 f7c06800 0086 c0124de5 c02a1220 0020 c0124cc7 0001 c0286040 0020 000e c0124b6c c0286040 c02a1580 c0283800 Call Trace: [c0128592] [c0124de5] [c0124cc7] [c0124b6c] [c010e225] [c010a810] [c010a810] [c010c72c] [c010a810] [c010a810] [c0100018] [c010a83d] [c010a8a2] [c0116c84] [c0124b6c] [c010e225] Code: 80 3d 00 54 28 c0 00 f3 90 7e f5 e9 ba cf f1 ff 80 bb 74 06 EIP; c0202590 stext_lock+45e4/8d34 = Trace; c0128592 vmfree_area_pages+122/174 Trace; c0124de5 do_generic_file_read+279/544 Trace; c0124cc7 do_generic_file_read+15b/544 Trace; c0124b6c do_generic_file_read+0/544 Trace; c010e225 do_vm86_irq_handling+22d/2a8 Trace; c010a810 setup_sigcontext+50/128 Trace; c010a810 setup_sigcontext+50/128 Trace; c010c72c __global_cli+cc/124 Trace; c010a810 setup_sigcontext+50/128 Trace; c010a810 setup_sigcontext+50/128 Trace; c0100018 startup_32+18/cc Trace; c010a83d setup_sigcontext+7d/128 Trace; c010a8a2 setup_sigcontext+e2/128 Trace; c0116c84 getrusage+168/1ec Trace; c0124b6c do_generic_file_read+0/544 Trace; c010e225 do_vm86_irq_handling+22d/2a8 Code; c0202590