Oops with PREEMPT-RT on 2.6.25.4
Hi

I get the following oops when trying to boot an arch/powerpc kernel with preempt-rt installed (v2.6.25.4-rt1).
The board is using a Freescale 8280 as the main CPU and a Silicon Image SII3124 SATA controller.
The oops seems to happen on file access right after init starts.

I need ideas on what to look for.

Freeing unused kernel memory: 128k init
INIT: version 2.85 booting
Activating all swap files/partitions... [ OK ]
Mounting proc file system...[ OK ]
path=/bin:/usr/bin:/sbin:/usr/sbin
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c0249618 LR: c02495ec CTR:
REGS: ef29d550 TRAP: 0700 Not tainted (2.6.25.4-rt1)
MSR: 00021032 ME,IR,DR CR: 24044482 XER:
TASK = ef26d070[50] 'ldconfig' THREAD: ef29c000
GPR00: 0001 ef29d600 ef26d070 ef29d64c 008c
GPR08: ef29d628 ef29d630 ef29c000 100b5eec 100b
GPR16: c00c3304 ef29dc48 000c 0014 0001 ef3818c0
GPR24: ef8a4000 0011 c01bc3c4 ef29d698 ef381904 9032 c0354700
NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
Call Trace:
[ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
[ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
[ef29d690] [c00285e8] del_timer+0x2c/0x78
[ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
[ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
[ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
[ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
[ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
[ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
[ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
[ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
[ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
[ef29d810] [c0006448] do_IRQ+0x68/0xa8
[ef29d820] [c0010388] ret_from_except+0x0/0x14
--- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
    LR = __spin_unlock_irqrestore+0x20/0x4c
[ef29d8f0] [c0249600] rt_spin_lock_slowlock+0x84/0x200
[ef29d960] [c00277d0] lock_timer_base+0x2c/0x64
[ef29d980] [c002790c] __mod_timer+0x34/0xdc
[ef29d9b0] [c0154650] blk_plug_device+0x58/0x68
[ef29d9c0] [c0154db0] __make_request+0x2dc/0x34c
[ef29da00] [c0153aa4] generic_make_request+0x20c/0x238
[ef29da40] [c01550a8] submit_bio+0x124/0x138
[ef29da80] [c009a490] submit_bh+0x13c/0x174
[ef29daa0] [c009df9c] __bread+0xa4/0x100
[ef29dab0] [c00c2478] ext3_get_branch+0x78/0xfc
[ef29dae0] [c00c27dc] ext3_get_blocks_handle+0x94/0x9cc
[ef29dba0] [c00c3398] ext3_get_block+0x94/0xdc
[ef29dbd0] [c00a458c] do_mpage_readpage+0x1a4/0x63c
[ef29dc40] [c00a4b58] mpage_readpages+0xc4/0x11c
[ef29dd20] [c00c269c] ext3_readpages+0x24/0x34
[ef29dd30] [c00572e4] __do_page_cache_readahead+0x1a0/0x258
[ef29dd70] [c00512bc] filemap_fault+0x198/0x41c
[ef29ddb0] [c005d934] __do_fault+0x6c/0x804
[ef29de10] [c0012420] do_page_fault+0x338/0x4b4
[ef29df40] [c0010120] handle_page_fault+0xc/0x80
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
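A note on reading the register header of this oops: TRAP: 0700 is the PowerPC program-check vector, and on classic (non-Book-E) cores the MSR value printed in an oops is the saved SRR1, which also carries the cause of the program check. The sketch below decodes a few of those bits. It assumes the classic 32-bit layout (EE = 0x8000, PR = 0x4000, ME = 0x1000, IR = 0x20, DR = 0x10, trap cause = 0x20000); decode_msr() and SRR1_REASON_TRAP are names made up for this illustration, not kernel symbols.

```c
#include <string.h>

/* Assumed bit values for classic 32-bit PowerPC (6xx/82xx class cores).
 * Book-E and 64-bit parts use different layouts. */
#define MSR_EE           0x08000ul /* external interrupts enabled */
#define MSR_PR           0x04000ul /* problem (user) state */
#define MSR_ME           0x01000ul /* machine check enabled */
#define MSR_IR           0x00020ul /* instruction address translation on */
#define MSR_DR           0x00010ul /* data address translation on */
#define SRR1_REASON_TRAP 0x20000ul /* hypothetical name: program check caused by a trap */

/* Write a comma-separated list of the known flags set in msr into buf.
 * len must be at least 1. Returns buf. */
char *decode_msr(unsigned long msr, char *buf, size_t len)
{
    static const struct {
        unsigned long bit;
        const char *name;
    } flags[] = {
        { SRR1_REASON_TRAP, "TRAP" },
        { MSR_EE, "EE" },
        { MSR_PR, "PR" },
        { MSR_ME, "ME" },
        { MSR_IR, "IR" },
        { MSR_DR, "DR" },
    };
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        if (!(msr & flags[i].bit))
            continue;
        if (buf[0] != '\0')
            strncat(buf, ",", len - strlen(buf) - 1);
        strncat(buf, flags[i].name, len - strlen(buf) - 1);
    }
    return buf;
}
```

For the value in this report, decode_msr(0x00021032, ...) gives TRAP,ME,IR,DR: a trap instruction (the usual way BUG() and BUG_ON() are implemented on powerpc) taken with external interrupts disabled, which would be consistent with the failure happening inside the interrupt handler rather than in the interrupted task.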
Re: Oops with PREEMPT-RT on 2.6.25.4
Rune Torgersen wrote:
> Hi
> I get the following oops when trying to boot an arch/powerpc kernel with preempt-rt installed (v2.6.25.4-rt1).
> The board is using a Freescale 8280 as the main CPU and a Silicon Image SII3124 SATA controller.
> The oops seems to happen on file access right after init starts.
[snip]
> NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
> LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
> Call Trace:
> [ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
> [ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
> [ef29d690] [c00285e8] del_timer+0x2c/0x78
> [ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
> [ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
> [ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
> [ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
> [ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
> [ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
> [ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
> [ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
> [ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
> [ef29d810] [c0006448] do_IRQ+0x68/0xa8
> [ef29d820] [c0010388] ret_from_except+0x0/0x14
> --- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
>     LR = __spin_unlock_irqrestore+0x20/0x4c
> [ef29d8f0] [c0249600] rt_spin_lock_slowlock+0x84/0x200
> [ef29d960] [c00277d0] lock_timer_base+0x2c/0x64

You're recursively entering lock_timer_base, which does a spin_lock_irqsave(). Either interrupts are enabled when they should not be, or an interrupt was supposed to be threaded that isn't.

-Scott
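The recursion Scott describes can be illustrated with a toy user-space model. This is not kernel code and every name in it is hypothetical. It assumes that under PREEMPT_RT a converted spinlock tracks the task that holds or is acquiring it, and that a handler running in hard-IRQ context borrows the identity of the task it interrupted, whereas a threaded handler runs as its own task. The model is deliberately simplified: it treats "inside the lock path" and "owner" as the same thing.

```c
#include <stddef.h>

enum toy_result {
    TOY_OK,          /* lock taken */
    TOY_WOULD_BLOCK, /* held by another task: legal, the caller would sleep */
    TOY_RECURSION    /* the same task is already in the lock: a bug */
};

/* Toy stand-in for an -rt lock that remembers which task is inside it. */
struct toy_lock {
    const void *owner;
};

enum toy_result toy_lock_acquire(struct toy_lock *lock, const void *task)
{
    if (lock->owner == task)
        return TOY_RECURSION;
    if (lock->owner != NULL)
        return TOY_WOULD_BLOCK;
    lock->owner = task;
    return TOY_OK;
}

/* A task enters the timer-base lock, then the disk interrupt fires and its
 * handler wants the same lock. Returns what the handler's attempt yields. */
enum toy_result toy_irq_during_lock(int handler_is_threaded)
{
    struct toy_lock timer_base = { NULL };
    int interrupted_task = 0; /* only the addresses matter: they act as task IDs */
    int irq_thread = 0;
    const void *handler_ctx;

    toy_lock_acquire(&timer_base, &interrupted_task);

    /* Hard-IRQ context runs on top of the interrupted task; a threaded
     * handler is a separate task. */
    handler_ctx = handler_is_threaded ? (const void *)&irq_thread
                                      : (const void *)&interrupted_task;
    return toy_lock_acquire(&timer_base, handler_ctx);
}
```

In this model toy_irq_during_lock(0) reports TOY_RECURSION, the situation in the trace (lock_timer_base appears both below and above the exception frame), while toy_irq_during_lock(1) reports TOY_WOULD_BLOCK, which is ordinary contention that a threaded handler can sleep through.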
RE: Oops with PREEMPT-RT on 2.6.25.4
Scott Wood wrote:
> Rune Torgersen wrote:
> > Hi
> > I get the following oops when trying to boot an arch/powerpc kernel with preempt-rt installed (v2.6.25.4-rt1).
> > The board is using a Freescale 8280 as the main CPU and a Silicon Image SII3124 SATA controller.
> > The oops seems to happen on file access right after init starts.
> [snip]
> > NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
> > LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
> > Call Trace:
> > [ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
> > [ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
> > [ef29d690] [c00285e8] del_timer+0x2c/0x78
> > [ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
> > [ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
> > [ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
> > [ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
> > [ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
> > [ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
> > [ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
> > [ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
> > [ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
> > [ef29d810] [c0006448] do_IRQ+0x68/0xa8
> > [ef29d820] [c0010388] ret_from_except+0x0/0x14
> > --- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
> >     LR = __spin_unlock_irqrestore+0x20/0x4c
> > [ef29d8f0] [c0249600] rt_spin_lock_slowlock+0x84/0x200
> > [ef29d960] [c00277d0] lock_timer_base+0x2c/0x64
>
> You're recursively entering lock_timer_base, which does a spin_lock_irqsave(). Either interrupts are enabled when they should not be, or an interrupt was supposed to be threaded that isn't.

Sort of figured. How do I figure out which one, and how to fix it?

I've never gotten any -rt patchsets to work on this CPU, and it always seems to be related to the disk driver.
I've tried since 2.6.16 (2.6.16 and 2.6.18 on arch/ppc, 2.6.24 and 2.6.25 on arch/powerpc).

Even though this is a custom board, I'm pretty sure I can get it to fail on a pq2fads board with the same disk controller.
Re: Oops with PREEMPT-RT on 2.6.25.4
Rune Torgersen wrote:
> Scott Wood wrote:
> > You're recursively entering lock_timer_base, which does a spin_lock_irqsave(). Either interrupts are enabled when they should not be, or an interrupt was supposed to be threaded that isn't.
>
> Sort of figured. How do I figure out which one, and how to fix it?
>
> I've never gotten any -rt patchsets to work on this CPU, and it always seems to be related to the disk driver.
> I've tried since 2.6.16 ppc (2.6.16, 2.6.18 on ppc, 2.6.24 and 25 on powerpc)
>
> Even though this is a custom board, I'm pretty sure I can get it to fail on a pq2fads board with the same disk controller.

I'm not sure if LOCKDEP is available for that architecture. Have you tried it? It's pretty good at flushing these kinds of issues out (assuming it's available).

-Greg
Re: Oops with PREEMPT-RT on 2.6.25.4
Rune Torgersen wrote:
> Scott Wood wrote:
> > You're recursively entering lock_timer_base, which does a spin_lock_irqsave(). Either interrupts are enabled when they should not be, or an interrupt was supposed to be threaded that isn't.
>
> Sort of figured. How do I figure out which one, and how to fix it?

Almost certainly the latter. Is the disk interrupt shared with any other interrupts that are marked IRQF_NODELAY? The -rt patch doesn't seem to handle mixing the two well.

Oh, and just to be sure: you do have CONFIG_PREEMPT_RT turned on, and not just CONFIG_PREEMPT, right? The non-preempt-rt versions in the -rt patch don't look like they disable interrupts, though I may just be getting lost in a sea of underscores and ifdefs.

-Scott
RE: Oops with PREEMPT-RT on 2.6.25.4
Scott Wood wrote:
> Almost certainly the latter. Is the disk interrupt shared with any other interrupts that are marked IRQF_NODELAY? The -rt patch doesn't seem to handle mixing the two well.

Disk is on a muxed PCI interrupt. None of the other interrupts on the mux are firing at the time.
Is it possible that the demuxer is not set up right? It is based loosely on pq2-pci-pic.c

> Oh, and just to be sure: you do have CONFIG_PREEMPT_RT turned on, and not just CONFIG_PREEMPT, right? The non-preempt-rt versions in the -rt patch don't look like they disable interrupts, though I may just be getting lost in a sea of underscores and ifdefs.

Full CONFIG_PREEMPT_RT. I was actually going to try CONFIG_PREEMPT to see if it helped.
Re: Oops with PREEMPT-RT on 2.6.25.4
Rune Torgersen wrote:
> Scott Wood wrote:
> > Almost certainly the latter. Is the disk interrupt shared with any other interrupts that are marked IRQF_NODELAY? The -rt patch doesn't seem to handle mixing the two well.
>
> Disk is on a muxed PCI interrupt. None of the other interrupts on the mux are firing at the time.

Regardless of whether they're firing, any request_irq() with IRQF_NODELAY will turn off threading for all handlers sharing that IRQ.

> Is it possible that the demuxer is not set up right? It is based loosely on pq2-pci-pic.c

Try calling set_irq_chip_and_handler() with handle_level_irq, rather than set_irq_chip(). The -rt patch doesn't seem to have threadified the __do_IRQ() path.

-Scott
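The point about IRQF_NODELAY on a shared line can be pictured with a small user-space model. It is a sketch of the rule as stated in this message, not of the -rt implementation: the struct, the flag value and the function name are all invented for illustration, and the driver names are just labels.

```c
#include <stddef.h>

/* Stand-in for the -rt IRQF_NODELAY flag; the value here is arbitrary. */
#define TOY_IRQF_NODELAY 0x1u

/* One registered handler on an interrupt line (hypothetical layout). */
struct toy_irqaction {
    const char *name;
    unsigned int flags;
};

/* Returns 1 if the handlers on this line would run in a thread, or 0 if a
 * single NODELAY registration keeps the whole line in hard-IRQ context.
 * Whether the NODELAY device ever fires does not enter into it. */
int toy_line_is_threaded(const struct toy_irqaction *actions, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (actions[i].flags & TOY_IRQF_NODELAY)
            return 0;
    }
    return 1;
}
```

With a line that carries only the disk handler the model returns 1; adding any second handler flagged TOY_IRQF_NODELAY flips the result to 0 for the disk handler as well, which is why the question is about how the interrupts were requested rather than about which ones are firing.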
RE: Oops with PREEMPT-RT on 2.6.25.4
Scott Wood wrote:
> Try calling set_irq_chip_and_handler() with handle_level_irq, rather than set_irq_chip(). The -rt patch doesn't seem to have threadified the __do_IRQ() path.

The demuxer is setting itself up with set_irq_chained_handler(). Any pointers on how to change to set_irq_chip_and_handler()?
Re: Oops with PREEMPT-RT on 2.6.25.4
Rune Torgersen wrote:
> Scott Wood wrote:
> > Try calling set_irq_chip_and_handler() with handle_level_irq, rather than set_irq_chip(). The -rt patch doesn't seem to have threadified the __do_IRQ() path.
>
> The demuxer is setting itself up with set_irq_chained_handler(). Any pointers on how to change to set_irq_chip_and_handler()?

No, I mean the call to set_irq_chip() in pci_pic_host_map() where it sets up the IRQs it manages, not the cascade IRQ itself.

-Scott
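Why the choice of setup call matters can be sketched with a user-space model of the dispatch decision. It assumes that in kernels of this era generic_handle_irq() calls the descriptor's flow handler when one is installed and otherwise falls back to __do_IRQ(), which matches the __do_IRQ frame under apmax_int_irq_demux in the trace. All toy_* names are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <stddef.h>

enum toy_path {
    TOY_PATH_FLOW_HANDLER,  /* e.g. handle_level_irq, which -rt can thread */
    TOY_PATH_LEGACY_DO_IRQ  /* __do_IRQ(), not threadified per this thread */
};

struct toy_irq_desc;
typedef void (*toy_flow_handler_t)(struct toy_irq_desc *desc);

/* Toy stand-in for an IRQ descriptor. */
struct toy_irq_desc {
    const char *chip_name;
    toy_flow_handler_t handle_irq; /* stays NULL if only the chip was set */
    int flow_calls;
};

/* Stand-in for a level-triggered flow handler. */
void toy_handle_level_irq(struct toy_irq_desc *desc)
{
    desc->flow_calls++;
}

/* Models set_irq_chip(): installs the chip, leaves the flow handler alone. */
void toy_set_irq_chip(struct toy_irq_desc *desc, const char *chip)
{
    desc->chip_name = chip;
}

/* Models set_irq_chip_and_handler(): installs chip and flow handler. */
void toy_set_irq_chip_and_handler(struct toy_irq_desc *desc, const char *chip,
                                  toy_flow_handler_t handler)
{
    desc->chip_name = chip;
    desc->handle_irq = handler;
}

/* Models the dispatch done for each demuxed interrupt. */
enum toy_path toy_generic_handle_irq(struct toy_irq_desc *desc)
{
    if (desc->handle_irq != NULL) {
        desc->handle_irq(desc);
        return TOY_PATH_FLOW_HANDLER;
    }
    return TOY_PATH_LEGACY_DO_IRQ;
}
```

A descriptor prepared with toy_set_irq_chip() alone dispatches through TOY_PATH_LEGACY_DO_IRQ, while one prepared with toy_set_irq_chip_and_handler(..., toy_handle_level_irq) takes TOY_PATH_FLOW_HANDLER. In the real driver the corresponding change would be made in the host map callback for the demuxed IRQs, not on the cascade IRQ.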
RE: Oops with PREEMPT-RT on 2.6.25.4
Scott Wood wrote:
> Rune Torgersen wrote:
> > Scott Wood wrote:
> > > Try calling set_irq_chip_and_handler() with handle_level_irq, rather than set_irq_chip(). The -rt patch doesn't seem to have threadified the __do_IRQ() path.
> >
> > The demuxer is setting itself up with set_irq_chained_handler(). Any pointers on how to change to set_irq_chip_and_handler()?
>
> No, I mean the call to set_irq_chip() in pci_pic_host_map() where it sets up the IRQs it manages, not the cascade IRQ itself.

Thanks!!! That fixed that particular problem.

Of course I then ran headfirst into another one.
This one seems to happen when I attempt to read flash through an mtd driver.

Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c005e780 LR: c005e758 CTR:
REGS: ef2c1d20 TRAP: 0700 Not tainted (2.6.25.4-rt1)
MSR: 00029032 EE,ME,IR,DR CR: 48222482 XER: 2000
TASK = ef2a1a90[98] 'S21initenv' THREAD: ef2c
GPR00: 3ffcf581 ef2c1dd0 ef2a1a90 c02cae8c 0002 0001 3ffcf580
GPR08: c037 c037000c 28222484 1009ecc0 100a5838
GPR16: 1009 1009 1009 ef2a7100 3ee38385 100955ac
GPR24: ef2a4ef0 c27dc700 c27dcb80 0003 ef359040 c0334f88
NIP [c005e780] do_wp_page+0x650/0xc2c
LR [c005e758] do_wp_page+0x628/0xc2c
Call Trace:
[ef2c1dd0] [c005e758] do_wp_page+0x628/0xc2c (unreliable)
[ef2c1e10] [c0012420] do_page_fault+0x338/0x4b4
[ef2c1f40] [c0010120] handle_page_fault+0xc/0x80
--- Exception: 301 at 0x100322e0
    LR = 0x100322dc
Instruction dump:
409e0594 4810f2e1 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035 3d40c035 8129b480
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c005db0c LR: c005dae4 CTR:
REGS: ef29fd00 TRAP: 0700 Tainted: G D (2.6.25.4-rt1)
MSR: 00029032 EE,ME,IR,DR CR: 48002482 XER: 2000
TASK = ef26e070[102] 'sed' THREAD: ef29e000
GPR00: 3ee2d581 ef29fdb0 ef26e070 c02cae8c 0001 3ee2d580
GPR08: c037 c037000c 0722 1009ecc0 4802dfa4 4802d878
GPR16: 0014f73c 0003 4802cce0 0001 0200
GPR24: ef3592e0 0ffece1c ef3592e0 ef2a50fc ef2a4df4 0003 c27dcfe0 c27ff4c0
NIP [c005db0c] __do_fault+0x1e0/0x804
LR [c005dae4] __do_fault+0x1b8/0x804
Call Trace:
[ef29fdb0] [c005dae4] __do_fault+0x1b8/0x804 (unreliable)
[ef29fe10] [c0012420] do_page_fault+0x338/0x4b4
[ef29ff40] [c0010120] handle_page_fault+0xc/0x80
--- Exception: 301 at 0x48017b10
    LR = 0x48007ac8
Instruction dump:
409e060c 4810ff55 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035 3d40c035 8129b480
RE: Oops with PREEMPT-RT on 2.6.25.4
Rune Torgersen wrote:
> Scott Wood wrote:
> Of course I then ran headfirst into another one.
> This one seems to happen when I attempt to read flash through an mtd driver.

Both of these are hitting a BUG_ON in kmap_atomic (include/asm-powerpc/highmem.h)

> Oops: Exception in kernel mode, sig: 5 [#1]
> PREEMPT Innovative Systems ApMax
> Modules linked in:
> NIP: c005e780 LR: c005e758 CTR:
> REGS: ef2c1d20 TRAP: 0700 Not tainted (2.6.25.4-rt1)
> MSR: 00029032 EE,ME,IR,DR CR: 48222482 XER: 2000
> TASK = ef2a1a90[98] 'S21initenv' THREAD: ef2c
> GPR00: 3ffcf581 ef2c1dd0 ef2a1a90 c02cae8c 0002 0001 3ffcf580
> GPR08: c037 c037000c 28222484 1009ecc0 100a5838
> GPR16: 1009 1009 1009 ef2a7100 3ee38385 100955ac
> GPR24: ef2a4ef0 c27dc700 c27dcb80 0003 ef359040 c0334f88
> NIP [c005e780] do_wp_page+0x650/0xc2c
> LR [c005e758] do_wp_page+0x628/0xc2c
> Call Trace:
> [ef2c1dd0] [c005e758] do_wp_page+0x628/0xc2c (unreliable)
> [ef2c1e10] [c0012420] do_page_fault+0x338/0x4b4
> [ef2c1f40] [c0010120] handle_page_fault+0xc/0x80
> --- Exception: 301 at 0x100322e0
>     LR = 0x100322dc
> Instruction dump:
> 409e0594 4810f2e1 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
> 7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035 3d40c035 8129b480
> Oops: Exception in kernel mode, sig: 5 [#2]
> PREEMPT Innovative Systems ApMax
> Modules linked in:
> NIP: c005db0c LR: c005dae4 CTR:
> REGS: ef29fd00 TRAP: 0700 Tainted: G D (2.6.25.4-rt1)
> MSR: 00029032 EE,ME,IR,DR CR: 48002482 XER: 2000
> TASK = ef26e070[102] 'sed' THREAD: ef29e000
> GPR00: 3ee2d581 ef29fdb0 ef26e070 c02cae8c 0001 3ee2d580
> GPR08: c037 c037000c 0722 1009ecc0 4802dfa4 4802d878
> GPR16: 0014f73c 0003 4802cce0 0001 0200
> GPR24: ef3592e0 0ffece1c ef3592e0 ef2a50fc ef2a4df4 0003 c27dcfe0 c27ff4c0
> NIP [c005db0c] __do_fault+0x1e0/0x804
> LR [c005dae4] __do_fault+0x1b8/0x804
> Call Trace:
> [ef29fdb0] [c005dae4] __do_fault+0x1b8/0x804 (unreliable)
> [ef29fe10] [c0012420] do_page_fault+0x338/0x4b4
> [ef29ff40] [c0010120] handle_page_fault+0xc/0x80
> --- Exception: 301 at 0x48017b10
>     LR = 0x48007ac8
> Instruction dump:
> 409e060c 4810ff55 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
> 7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035 3d40c035 8129b480