Hello folks,

Here's what I get when trying to debug with kgdb on a powerpc target (Sandpoint)
with the latest patches:

BUG: soft lockup detected on CPU#0!
Call Trace:
[C028BCA0] [C0008428] show_stack+0x3c/0x194 (unreliable)
[C028BCD0] [C00400E0] softlockup_tick+0x94/0xc4
[C028BCF0] [C002858C] run_local_timers+0x18/0x28
[C028BD00] [C00285CC] update_process_times+0x30/0x7c
[C028BD10] [C000D0F0] timer_interrupt+0xd4/0x508
[C028BD90] [C0010648] ret_from_except+0x0/0x14
--- Exception: 901 at breakpoint+0xb4/0xcc
    LR = kgdb8250_interrupt+0x74/0xa8
[C028BE50] [C0022FA4] irq_exit+0x48/0x58 (unreliable)
[C028BE60] [C0131500] kgdb8250_interrupt+0x74/0xa8
[C028BE70] [C004041C] handle_IRQ_event+0x64/0xb8
[C028BE90] [C0041D94] handle_level_irq+0x9c/0x13c
[C028BEB0] [C0018008] sandpoint_8259_cascade+0x7c/0xb0
[C028BED0] [C0005F00] do_IRQ+0x9c/0xc0
[C028BEE0] [C0010648] ret_from_except+0x0/0x14
--- Exception: 501 at cpu_idle+0x48/0xe8
    LR = cpu_idle+0xd8/0xe8
[C028BFB0] [C0003F8C] rest_init+0x28/0x38
[C028BFC0] [C02336B0] start_kernel+0x1b8/0x220
[C028BFF0] [00003860] 0x3860

I've poked around a bit, and the problem seems to have been introduced by
core.patch. The old (working) version of core.patch did the following:

 void do_timer(struct pt_regs *regs)
 {
+       int this_cpu = smp_processor_id();
        jiffies_64++;
        /* prevent loading jiffies before storing new jiffies_64 value. */
        barrier();
        update_times();
-       softlockup_tick(regs);
+
+#ifdef CONFIG_KGDB
+       if(!atomic_read(&kgdb_sync_softlockup[this_cpu]))
+#endif
+               softlockup_tick(regs);
+
 }

whereas the new one does the following:

 void do_timer(struct pt_regs *regs)
 {
+       int this_cpu = smp_processor_id();
        jiffies_64++;
        /* prevent loading jiffies before storing new jiffies_64 value. */
        barrier();
        update_times();
+
+#ifdef CONFIG_KGDB
+       if(!atomic_read(&kgdb_sync_softlockup[this_cpu]))
+#endif
+       softlockup_tick();
 }

which of course isn't equivalent.

AFAICS, softlockup_tick() has moved to run_local_timers(), so the current
core.patch introduces a second (wrong) softlockup_tick() call in do_timer() and
leaves the first one, in run_local_timers(), unguarded, so it still runs when
kgdb_sync_softlockup is set. That unguarded call is what was triggering the
soft lockup message I've been getting (see run_local_timers in the trace above).
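
To make the difference concrete, here is a rough user-space model of one timer
tick. It is only a sketch, not kernel code: kgdb_sync_flag, the *_current and
*_fixed function names and calls_per_tick() are made up for illustration, and
it assumes do_timer() and run_local_timers() each run once per tick.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is kernel code. */
static bool kgdb_sync_flag;   /* models kgdb_sync_softlockup[this_cpu] != 0 */
static int softlockup_calls;  /* counts softlockup_tick() invocations */

static void softlockup_tick(void)
{
    softlockup_calls++;
}

/* Layout with the current core.patch: the guard sits in do_timer(),
 * while the call in run_local_timers() stays unguarded. */
static void run_local_timers_current(void)
{
    softlockup_tick();
}

static void do_timer_current(void)
{
    if (!kgdb_sync_flag)
        softlockup_tick();
}

/* Layout with the proposed fix: a single, guarded call site. */
static void run_local_timers_fixed(void)
{
    if (!kgdb_sync_flag)
        softlockup_tick();
}

static void do_timer_fixed(void)
{
    /* no softlockup_tick() call here any more */
}

/* Simulate one tick and return how many times softlockup_tick() ran.
 * Assumption: both functions are invoked exactly once per tick. */
int calls_per_tick(bool fixed, bool kgdb_active)
{
    kgdb_sync_flag = kgdb_active;
    softlockup_calls = 0;
    if (fixed) {
        do_timer_fixed();
        run_local_timers_fixed();
    } else {
        do_timer_current();
        run_local_timers_current();
    }
    return softlockup_calls;
}
```

In this model calls_per_tick(false, true) returns 1 (the check still runs while
kgdb holds the CPU) and calls_per_tick(false, false) returns 2 (the check runs
twice per tick), whereas the fixed layout gives 0 and 1 respectively.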

So please find my proposed fix for core.patch inlined below. Any comments are
welcome.

Vitaly

 kernel/timer.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Signed-off-by: Vitaly Wool <[EMAIL PROTECTED]>

diff -u linux-2.6.git/kernel/timer.c powerpc.git/kernel/timer.c
--- linux-2.6.git/kernel/timer.c        2006-08-11 19:55:59.000000000 +0400
+++ powerpc.git/kernel/timer.c
@@ -1258,7 +1258,11 @@
  */
 void run_local_timers(void)
 {
+       int this_cpu = smp_processor_id();
        raise_softirq(TIMER_SOFTIRQ);
+#ifdef CONFIG_KGDB
+       if(!atomic_read(&kgdb_sync_softlockup[this_cpu]))
+#endif
        softlockup_tick();
 }
 
@@ -1284,16 +1288,10 @@
 
 void do_timer(struct pt_regs *regs)
 {
-       int this_cpu = smp_processor_id();
        jiffies_64++;
        /* prevent loading jiffies before storing new jiffies_64 value. */
        barrier();
        update_times();
-
-#ifdef CONFIG_KGDB
-       if(!atomic_read(&kgdb_sync_softlockup[this_cpu]))
-#endif
-       softlockup_tick();
 }
 
 #ifdef __ARCH_WANT_SYS_ALARM

_______________________________________________
Kgdb-bugreport mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport