On Fri, Oct 3, 2014 at 9:37 PM, Gilles Chanteperdrix <[email protected]> wrote: > On 10/03/2014 12:58 PM, GP Orcullo wrote: >> On Fri, Oct 3, 2014 at 4:57 PM, Gilles Chanteperdrix >> <[email protected]> wrote: >>> On Fri, Oct 03, 2014 at 04:45:53PM +0800, GP Orcullo wrote: >>>> On Fri, Oct 3, 2014 at 3:20 PM, Gilles Chanteperdrix >>>> <[email protected]> wrote: >>>>> On Fri, Oct 03, 2014 at 11:35:33AM +0800, GP Orcullo wrote: >>>>>> The system managed to dump some error messages. >>>>>> >>>>> >>>>>> [ 407.688809] INFO: rcu_preempt detected stalls on CPUs/tasks: { 3} >>>>>> (detected by 0, t=12002 jiffies, g=22447, c=22446, q=113448) >>>>>> [ 407.694560] Task dump for CPU 3: >>>>>> [ 407.697763] switchtest R running 0 3979 3640 0x00000002 >>>>>> [ 407.704111] [<c0453ddc>] (__schedule+0x1fc/0x5f8) from [<00000010>] >>>>>> (0x10) >>>>>> [ 587.714458] INFO: rcu_preempt detected stalls on CPUs/tasks: { 3} >>>>>> (detected by 2, t=48007 jiffies, g=22447, c=22446, q=443504) >>>>>> [ 587.720205] Task dump for CPU 3: >>>>>> [ 587.723407] switchtest R running 0 3979 3640 0x00000002 >>>>>> [ 587.729755] [<c0453ddc>] (__schedule+0x1fc/0x5f8) from [<00000010>] >>>>>> (0x10) >>>>>> [ 767.740128] INFO: rcu_preempt detected stalls on CPUs/tasks: { 3} >>>>>> (detected by 2, t=84012 jiffies, g=22447, c=22446, q=1136198) >>>>>> [ 767.745964] Task dump for CPU 3: >>>>>> [ 767.749167] switchtest R running 0 3979 3640 0x00000002 >>>>>> [ 767.755515] [<c0453ddc>] (__schedule+0x1fc/0x5f8) from [<00000010>] >>>>>> (0x10) >>>>>> [ 782.763027] ------------[ cut here ]------------ >>>>>> [ 782.763099] WARNING: at include/linux/ipipe_domain.h:212 >>>>>> __ipipe_dispatch_irq+0x35c/0x5a0() >>>>>> [ 782.770380] Modules linked in: nfsv3 nfsv4 nfsd exportfs nfs_acl >>>>>> auth_rpcgss nfs lockd sunrpc vfat fat smsc95xx usbnet mii gpio_keys >>>>>> [ 782.782218] [<c0015624>] (unwind_backtrace+0x0/0xf8) from >>>>>> [<c0026118>] (warn_slowpath_common+0x4c/0x64) >>>>>> [ 782.791606] [<c0026118>] (warn_slowpath_common+0x4c/0x64) from >>>>>> [<c002614c>] (warn_slowpath_null+0x1c/0x24) >>>>>> [ 782.801232] [<c002614c>] (warn_slowpath_null+0x1c/0x24) from >>>>>> [<c00886ac>] (__ipipe_dispatch_irq+0x35c/0x5a0) >>>>>> [ 782.811035] [<c00886ac>] (__ipipe_dispatch_irq+0x35c/0x5a0) from >>>>>> [<c00200c8>] (combiner_handle_cascade_irq+0x80/0xf4) >>>>>> [ 782.821620] [<c00200c8>] (combiner_handle_cascade_irq+0x80/0xf4) from >>>>>> [<c0019110>] (__ipipe_ack_irq+0xc/0x10) >>>>> >>>>> This is bad, can I see the code of the "combiner_handle_cascade_irq" >>>>> function? >>>>> >>>> >>>> Here's the original code: >>>> >>>> static DEFINE_SPINLOCK(irq_controller_lock); >>>> >>>> ... >>>> >>>> static void combiner_handle_cascade_irq(unsigned int irq, struct irq_desc >>>> *desc) >>>> { >>>> struct combiner_chip_data *chip_data = irq_get_handler_data(irq); >>>> struct irq_chip *chip = irq_get_chip(irq); >>>> unsigned int cascade_irq, combiner_irq; >>>> unsigned long status; >>>> >>>> chained_irq_enter(chip, desc); >>>> >>>> spin_lock(&irq_controller_lock); >>>> status = __raw_readl(chip_data->base + COMBINER_INT_STATUS); >>>> spin_unlock(&irq_controller_lock); >>>> status &= chip_data->irq_mask; >>>> >>>> if (status == 0) >>>> goto out; >>>> >>>> combiner_irq = __ffs(status); >>>> >>>> cascade_irq = combiner_irq + (chip_data->irq_offset & ~31); >>>> if (unlikely(cascade_irq >= NR_IRQS)) >>>> do_bad_IRQ(cascade_irq, desc); >>>> else >>>> ipipe_handle_demuxed_irq(cascade_irq); >>>> >>>> out: >>>> chained_irq_exit(chip, desc); >>>> } >>>> >>>> >>>> I've changed the DEFINE_SPINLOCK to IPIPE_DEFINE_SPINLOCK and I got >>>> the same results. >>> >>> Same results meaning you still have the warning in ipipe_domain ? Or >>> still have the lockup? >>> >> >> Lockup. It was just luck that that error message came up. > > The error message only happens with I-pipe debugging enabled. And the > error was probably due to the spinlock. > >> >>> >>> One of these functions uses local_irq_save/local_irq_restore, one >>> way to find which is to enable the I-pipe tracer, enable panic >>> backtrace, and put a BUG_ON(hard_irqs_enabled()) right before the >>> call to ipipe_handle_demuxed_irq. You probably can not call >>> do_bad_IRQ from here either, but I do not think this is your >>> problem, this checks looks a bit redundant: in order to check that >>> cascade_irq is always less than NR_IRQS, it is sufficient to check >>> that chip_data->irq_offset & ~31 is less than NR_IRQS - 32, which >>> can be done only once, and not repeated stupidly for every irq. >>> >> >> Thanks, will let you know how it goes. > > I do not think that is necessary, if the error was due to the spinlock, > now that you replaced with an I-pipe spinlock, you will no longer have > it. It will not help debugging the spinlock, which may be something > different. But you can test that anyway, to be sure that the irqs are > not enabled at this point. > > In any case, the backtrace you sent explains why the cpu appear to be > stalled: because root is unstalled while a GPIO irq happens, > smp_call_functions ends up invoking callbacks on cpus where Linux has > been preempted by Xeonmai, then invokes schedule, which probably goes > south because either Linux data are not in a consistent state or more > probably because it preempted a Xenomai kernel thread (switchtest runs a > lot of them), and tries to use current, which can not work. > > -- > Gilles.
I've added BUG_ON(!hard_irqs_disabled()) to the code and got the kernel to oops at startup. Where shall I start looking for the offending code? -- GP Orcullo -------------- next part -------------- [ 15.603409] Unable to handle kernel NULL pointer dereference at virtual address 0000000c [ 15.606348] pgd = e6004000, hw pgd = e6004000 [ 15.610624] [0000000c] *pgd=00000000 [ 15.613969] Internal error: Oops: 5 [#1] PREEMPT SMP ARM [ 15.616703] s5p-mixer s5p-mixer: start latency exceeded, new value 333 ns [ 15.616747] s5p-mixer s5p-mixer: state restore latency exceeded, new value 36542 ns [ 15.617000] s5p-mixer s5p-mixer: stop latency exceeded, new value 291 ns [ 15.617024] s5p-mixer s5p-mixer: state save latency exceeded, new value 11750 ns [ 15.617029] s5p-mixer s5p-mixer: stop latency exceeded, new value 292 ns [ 15.654205] Modules linked in: mii(+) [ 15.657830] CPU: 1 Not tainted (3.8.13.11-xen #25) [ 15.662966] PC is at load_module+0x1908/0x1e18 [ 15.667382] LR is at ipipe_root_only+0x58/0x160 [ 15.671890] pc : [<c007935c>] lr : [<c008f614>] psr: a0000053 [ 15.671890] sp : e6bddeb0 ip : bf000c40 fp : c04c4c68 [ 15.683355] r10: 00000000 r9 : bf000b34 r8 : 00000000 [ 15.688542] r7 : fffffff8 r6 : bf000aec r5 : bf000af8 r4 : e6bddf58 [ 15.695050] r3 : e7135800 r2 : e6bddea8 r1 : c08c7ac0 r0 : 00000000 [ 15.701555] Flags: NzCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user [ 15.708756] Control: 10c5387d Table: 6600404a DAC: 00000015 [ 15.714476] Process modprobe (pid: 1728, stack limit = 0xe6bdc240) [ 15.720634] Stack: (0xe6bddeb0 to 0xe6bde000) [ 15.724968] dea0: bf000af8 00007fff c0074e48 00001302 [ 15.733133] dec0: 00000000 f01d1000 b6ec1d50 e6bdc000 bf000c40 e6bddef4 e6bdddb0 00000000 [ 15.741285] dee0: c001aa44 c000e860 f01f2000 b6da3000 00000c8f bf000980 00000008 00000000 [ 15.749437] df00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 15.757589] df20: 00000000 00000000 00000000 00000000 000000d2 00021d0f b6d82000 b6ec1d50 [ 15.765742] df40: 00000080 c000ef70 e6bdc000 00000000 00000000 c007994c f01d1000 00021d0f [ 15.773894] df60: f01eb2c0 f01eb171 f01f29d0 00000c3c 00000edc 00000000 00000000 00000000 [ 15.782046] df80: 0000001f 00000020 0000000d 00000000 0000000a 00000000 00000000 b6f10290 [ 15.790209] dfa0: b6f12910 c000ed40 00000000 b6f10290 b6d82000 00021d0f b6ec1d50 00000002 [ 15.798361] dfc0: 00000000 b6f10290 b6f12910 00000080 00000000 b6ec1d50 00021d0f 00000000 [ 15.800249] s5p-mixer s5p-mixer: start latency exceeded, new value 542 ns [ 15.800535] s5p-mixer s5p-mixer: stop latency exceeded, new value 333 ns [ 15.819951] dfe0: 00060000 be9ce914 b6ebbb07 b6e3f684 80000050 b6d82000 00000000 00000000 [ 15.828113] [<c007935c>] (load_module+0x1908/0x1e18) from [<c007994c>] (sys_init_module+0xe0/0xf4) [ 15.837040] [<c007994c>] (sys_init_module+0xe0/0xf4) from [<c000ed40>] (ret_fast_syscall+0x0/0x34) [ 15.845971] Code: e59dc020 e15c0007 e2477008 0a000009 (e5973014) [ 15.852033] I-pipe tracer log (100 points): [ 15.856183] +func 0 ipipe_trace_panic_freeze+0x8 (oops_enter+0x14) [ 15.864412] +func -1 oops_enter+0x8 (die+0x20) [ 15.870830] +func -1 die+0xc (__do_kernel_fault.part.8+0x5c) [ 15.878462] | #func -2 ipipe_root_only+0x8 (ipipe_unstall_root+0x14) [ 15.886615] #func -3 ipipe_unstall_root+0x8 (vprintk_emit+0x1cc) [ 15.894593] #func -3 wake_up_klogd+0x8 (vprintk_emit+0x1bc) [ 15.902138] #func -4 ipipe_test_root+0x8 (preempt_schedule+0x24) [ 15.910118] #func -4 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 15.918184] #func -5 sub_preempt_count+0xc (_raw_spin_unlock_irqrestore+0x28) [ 15.927289] #func -5 _raw_spin_unlock_irqrestore+0x8 (console_unlock+0x1e0) [ 15.936222] #func -6 __ipipe_spin_unlock_debug+0x8 (console_unlock+0x1d4) [ 15.944981] #func -6 ipipe_root_only+0x8 (add_preempt_count+0x18) [ 15.953047] #func -7 add_preempt_count+0xc (_raw_spin_lock+0x18) [ 15.961026] #func -7 _raw_spin_lock+0x8 (console_unlock+0x1c0) [ 15.968831] #func -8 ipipe_test_root+0x8 (preempt_schedule+0x24) [ 15.976810] #func -8 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 15.984876] #func -9 sub_preempt_count+0xc (_raw_spin_unlock_irqrestore+0x28) [ 15.993982] #func -9 _raw_spin_unlock_irqrestore+0x8 (console_unlock+0x1b8) [ 16.002915] #func -10 __ipipe_spin_unlock_debug+0x8 (up+0x3c) [ 16.010547] #func -10 ipipe_root_only+0x8 (add_preempt_count+0x18) [ 16.018612] #func -11 add_preempt_count+0xc (_raw_spin_lock_irqsave+0x20) [ 16.027285] #func -11 ipipe_root_only+0x8 (ipipe_test_and_stall_root+0x10) [ 16.036045] #func -12 ipipe_test_and_stall_root+0x8 (_raw_spin_lock_irqsave+0x14) [ 16.045411] #func -12 _raw_spin_lock_irqsave+0x8 (up+0x14) [ 16.052783] #func -13 up+0x8 (console_unlock+0x1b8) [ 16.059548] #func -13 ipipe_test_root+0x8 (preempt_schedule+0x24) [ 16.067527] #func -14 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 16.075592] #func -14 sub_preempt_count+0xc (_raw_spin_unlock+0x18) [ 16.083745] #func -15 _raw_spin_unlock+0x8 (console_unlock+0x1b0) [ 16.091724] #func -16 ipipe_root_only+0x8 (add_preempt_count+0x18) [ 16.099789] #func -16 add_preempt_count+0xc (_raw_spin_lock_irqsave+0x20) [ 16.108462] #func -17 ipipe_root_only+0x8 (ipipe_test_and_stall_root+0x10) [ 16.117222] #func -17 ipipe_test_and_stall_root+0x8 (_raw_spin_lock_irqsave+0x14) [ 16.126588] #func -17 _raw_spin_lock_irqsave+0x8 (console_unlock+0x8c) [ 16.135000] #func -18 ipipe_test_root+0x8 (preempt_schedule+0x24) [ 16.142979] #func -19 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 16.151045] #func -19 sub_preempt_count+0xc (_raw_spin_unlock+0x18) [ 16.159197] #func -19 _raw_spin_unlock+0x8 (call_console_drivers.constprop.15+0xf8) [ 16.168738] #func -20 __rcu_read_unlock+0x8 (__atomic_notifier_call_chain+0x48) [ 16.177931] #func -20 notifier_call_chain+0x8 (__atomic_notifier_call_chain+0x40) [ 16.187297] #func -21 __rcu_read_lock+0x4 (__atomic_notifier_call_chain+0x24) [ 16.196317] #func -21 __atomic_notifier_call_chain+0xc (atomic_notifier_call_chain+0x20) [ 16.206290] #func -22 atomic_notifier_call_chain+0xc (notify_update+0x30) [ 16.214963] #func -22 notify_update+0xc (vt_console_print+0x1cc) [ 16.222855] #func -23 dummycon_dummy+0x4 (set_cursor+0x90) [ 16.230227] #func -23 add_softcursor+0x8 (set_cursor+0x6c) [ 16.237599] #func -24 set_cursor+0x8 (vt_console_print+0x1c4) [ 16.245231] #func -24 __rcu_read_unlock+0x8 (__atomic_notifier_call_chain+0x48) [ 16.254424] #func -25 notifier_call_chain+0x8 (__atomic_notifier_call_chain+0x40) [ 16.263791] #func -25 __rcu_read_lock+0x4 (__atomic_notifier_call_chain+0x24) [ 16.272810] #func -26 __atomic_notifier_call_chain+0xc (atomic_notifier_call_chain+0x20) [ 16.282784] #func -26 atomic_notifier_call_chain+0xc (notify_write+0x28) [ 16.291370] #func -26 notify_write+0xc (vt_console_print+0x260) [ 16.299175] #func -27 __rcu_read_unlock+0x8 (__atomic_notifier_call_chain+0x48) [ 16.308368] #func -27 notifier_call_chain+0x8 (__atomic_notifier_call_chain+0x40) [ 16.317735] #func -28 __rcu_read_lock+0x4 (__atomic_notifier_call_chain+0x24) [ 16.326754] #func -28 __atomic_notifier_call_chain+0xc (atomic_notifier_call_chain+0x20) [ 16.336728] #func -29 atomic_notifier_call_chain+0xc (notify_write+0x28) [ 16.345314] #func -29 notify_write+0xc (vt_console_print+0x230) [ 16.353120] #func -33 dummycon_dummy+0x4 (scrup+0xe8) [ 16.360058] #func -33 scrup+0xc (lf+0x6c) [ 16.365955] #func -34 lf+0x8 (vt_console_print+0x230) [ 16.372893] #func -34 dummycon_dummy+0x4 (hide_cursor+0x38) [ 16.380352] #func -35 hide_cursor+0x8 (vt_console_print+0x2c0) [ 16.388071] #func -36 ipipe_root_only+0x8 (add_preempt_count+0x18) [ 16.396136] #func -36 add_preempt_count+0xc (_raw_spin_trylock+0x18) [ 16.404376] #func -37 _raw_spin_trylock+0x8 (vt_console_print+0x48) [ 16.412528] #func -37 vt_console_print+0xc (call_console_drivers.constprop.15+0xf8) [ 16.422068] #func -38 ipipe_test_root+0x8 (debug_smp_processor_id+0x7c) [ 16.430567] #func -39 s3c24xx_serial_console_putchar+0x8 (uart_console_write+0x5c) [ 16.440020] #func -41 s3c24xx_serial_console_putchar+0x8 (uart_console_write+0x50) [ 16.449474] #func -41 uart_console_write+0x8 (call_console_drivers.constprop.15+0xf8) [ 16.459187] #func -42 s3c24xx_serial_console_write+0x4 (call_console_drivers.constprop.15+0xf8) [ 16.469768] #func -42 ipipe_test_root+0x8 (debug_smp_processor_id+0x7c) [ 16.478267] #func -43 call_console_drivers.constprop.15+0xc (console_unlock+0x288) [ 16.487720] #func -43 ipipe_test_root+0x8 (preempt_schedule+0x24) [ 16.495699] #func -44 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 16.503765] #func -44 sub_preempt_count+0xc (_raw_spin_unlock+0x18) [ 16.511917] #func -45 _raw_spin_unlock+0x8 (console_unlock+0x27c) [ 16.519896] #func -46 ipipe_root_only+0x8 (add_preempt_count+0x18) [ 16.527962] #func -46 add_preempt_count+0xc (_raw_spin_lock_irqsave+0x20) [ 16.536635] #func -47 ipipe_root_only+0x8 (ipipe_test_and_stall_root+0x10) [ 16.545394] #func -47 ipipe_test_and_stall_root+0x8 (_raw_spin_lock_irqsave+0x14) [ 16.554761] #func -48 _raw_spin_lock_irqsave+0x8 (console_unlock+0x34) [ 16.563173] #func -48 console_unlock+0xc (vprintk_emit+0x1bc) [ 16.570805] #func -48 ipipe_test_root+0x8 (preempt_schedule+0x24) [ 16.578784] #func -49 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 16.586850] #func -49 sub_preempt_count+0xc (_raw_spin_unlock+0x18) [ 16.595002] #func -50 _raw_spin_unlock+0x8 (vprintk_emit+0x1b8) [ 16.602807] #func -51 ipipe_root_only+0x8 (sub_preempt_count+0x18) [ 16.610873] #func -51 sub_preempt_count+0xc (_raw_spin_unlock_irqrestore+0x28) [ 16.619979] #func -52 _raw_spin_unlock_irqrestore+0x8 (down_trylock+0x38) [ 16.628652] #func -52 __ipipe_spin_unlock_debug+0x8 (down_trylock+0x2c) [ 16.637151] #func -53 ipipe_root_only+0x8 (add_preempt_count+0x18) [ 16.645217] #func -53 add_preempt_count+0xc (_raw_spin_lock_irqsave+0x20) [ 16.653890] #func -54 ipipe_root_only+0x8 (ipipe_test_and_stall_root+0x10) [ 16.662650] #func -54 ipipe_test_and_stall_root+0x8 (_raw_spin_lock_irqsave+0x14) [ 16.672016] #func -55 _raw_spin_lock_irqsave+0x8 (down_trylock+0x14) [ 16.680255] #func -55 down_trylock+0x8 (console_trylock+0x14) [ 16.687887] #func -56 console_trylock+0x8 (vprintk_emit+0x16c) [ 16.695700] ---[ end trace c1425b6856ac7c36 ]--- _______________________________________________ Xenomai mailing list [email protected] http://www.xenomai.org/mailman/listinfo/xenomai
