Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Rune Torgersen
Hi,
I get the following oops when trying to boot an arch/powerpc kernel with
preempt-rt installed (v2.6.25.4-rt1).
The board uses a Freescale 8280 as the main CPU and a Silicon Image
SII3124 SATA controller. The oops seems to happen on file access right
after init starts.

I need ideas on what to look for.

Freeing unused kernel memory: 128k init
INIT: version 2.85 booting
Activating all swap files/partitions... [  OK  ]
Mounting proc file system...[  OK  ]
path=/bin:/usr/bin:/sbin:/usr/sbin
Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c0249618 LR: c02495ec CTR: 
REGS: ef29d550 TRAP: 0700   Not tainted  (2.6.25.4-rt1)
MSR: 00021032 ME,IR,DR  CR: 24044482  XER: 
TASK = ef26d070[50] 'ldconfig' THREAD: ef29c000
GPR00: 0001 ef29d600 ef26d070    ef29d64c 008c
GPR08: ef29d628  ef29d630 ef29c000  100b5eec  100b
GPR16: c00c3304 ef29dc48 000c  0014  0001 ef3818c0
GPR24: ef8a4000 0011  c01bc3c4 ef29d698 ef381904 9032 c0354700
NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
Call Trace:
[ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
[ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
[ef29d690] [c00285e8] del_timer+0x2c/0x78
[ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
[ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
[ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
[ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
[ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
[ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
[ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
[ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
[ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
[ef29d810] [c0006448] do_IRQ+0x68/0xa8
[ef29d820] [c0010388] ret_from_except+0x0/0x14
--- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
LR = __spin_unlock_irqrestore+0x20/0x4c
[ef29d8f0] [c0249600] rt_spin_lock_slowlock+0x84/0x200
[ef29d960] [c00277d0] lock_timer_base+0x2c/0x64
[ef29d980] [c002790c] __mod_timer+0x34/0xdc
[ef29d9b0] [c0154650] blk_plug_device+0x58/0x68
[ef29d9c0] [c0154db0] __make_request+0x2dc/0x34c
[ef29da00] [c0153aa4] generic_make_request+0x20c/0x238
[ef29da40] [c01550a8] submit_bio+0x124/0x138
[ef29da80] [c009a490] submit_bh+0x13c/0x174
[ef29daa0] [c009df9c] __bread+0xa4/0x100
[ef29dab0] [c00c2478] ext3_get_branch+0x78/0xfc
[ef29dae0] [c00c27dc] ext3_get_blocks_handle+0x94/0x9cc
[ef29dba0] [c00c3398] ext3_get_block+0x94/0xdc
[ef29dbd0] [c00a458c] do_mpage_readpage+0x1a4/0x63c
[ef29dc40] [c00a4b58] mpage_readpages+0xc4/0x11c
[ef29dd20] [c00c269c] ext3_readpages+0x24/0x34
[ef29dd30] [c00572e4] __do_page_cache_readahead+0x1a0/0x258
[ef29dd70] [c00512bc] filemap_fault+0x198/0x41c
[ef29ddb0] [c005d934] __do_fault+0x6c/0x804
[ef29de10] [c0012420] do_page_fault+0x338/0x4b4
[ef29df40] [c0010120] handle_page_fault+0xc/0x80
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Scott Wood

Rune Torgersen wrote:
> Hi
> I get the following oops when trying to boot an arch/powerpc kernel with
> preempt-rt installed (v2.6.25.4-rt1)
> The board is using a Freescale 8280 as the main CPU and a Silicon Image
> SII3124 SATA controller. The oops seems to happen on file access right
> after init starts.

[snip]

> NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
> LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
> Call Trace:
> [ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
> [ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
> [ef29d690] [c00285e8] del_timer+0x2c/0x78
> [ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
> [ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
> [ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
> [ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
> [ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
> [ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
> [ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
> [ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
> [ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
> [ef29d810] [c0006448] do_IRQ+0x68/0xa8
> [ef29d820] [c0010388] ret_from_except+0x0/0x14
> --- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
> LR = __spin_unlock_irqrestore+0x20/0x4c
> [ef29d8f0] [c0249600] rt_spin_lock_slowlock+0x84/0x200
> [ef29d960] [c00277d0] lock_timer_base+0x2c/0x64


You're recursively entering lock_timer_base, which does a 
spin_lock_irqsave().  Either interrupts are enabled when they should not 
be, or an interrupt was supposed to be threaded that isn't.


-Scott
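
The recursion described above can be illustrated with a minimal userspace sketch. This is not kernel code: it assumes a toy lock that records its owner, and every toy_* name, the task ids, and the result codes are hypothetical. It only shows why a handler that runs on top of the interrupted task looks like the lock owner asking for its own lock, while a threaded handler is a separate task that could simply wait.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical, not a kernel symbol. */
enum toy_result { TOY_ACQUIRED, TOY_WOULD_BLOCK, TOY_SELF_DEADLOCK };

struct toy_lock {
    int owner;  /* id of the task holding the lock, 0 when free */
};

/* Non-recursive lock with an owner check. */
static enum toy_result toy_lock_acquire(struct toy_lock *lock, int task)
{
    assert(lock != NULL && task != 0);

    if (lock->owner == 0) {
        lock->owner = task;
        return TOY_ACQUIRED;
    }
    /* Held by another task: a sleeping lock would wait for the release. */
    if (lock->owner != task)
        return TOY_WOULD_BLOCK;
    /* Held by the caller itself: waiting can never succeed. */
    return TOY_SELF_DEADLOCK;
}

/*
 * The interrupted task holds the timer base lock when the disk interrupt
 * fires. Returns what the handler's lock attempt sees.
 */
static enum toy_result toy_irq_during_lock(int handler_is_threaded)
{
    struct toy_lock timer_base = { 0 };
    const int interrupted_task = 50;  /* arbitrary id for the lock holder */
    const int irq_thread = 99;        /* arbitrary id for an IRQ thread */

    /* Models the __mod_timer() path: the task takes the lock first. */
    (void)toy_lock_acquire(&timer_base, interrupted_task);

    /* A non-threaded handler runs in the interrupted task's context, so it
     * appears as the same owner; a threaded handler is a separate task. */
    return toy_lock_acquire(&timer_base,
                            handler_is_threaded ? irq_thread
                                                : interrupted_task);
}
```

In this model, toy_irq_during_lock(0) stands for the non-threaded handler and returns TOY_SELF_DEADLOCK, while toy_irq_during_lock(1) stands for a threaded handler and returns TOY_WOULD_BLOCK.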


RE: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Rune Torgersen
Scott Wood wrote:
> Rune Torgersen wrote:
> > Hi
> > I get the following oops when trying to boot an arch/powerpc kernel
> > with preempt-rt installed (v2.6.25.4-rt1)
> > The board is using a Freescale 8280 as the main CPU and a Silicon
> > Image SII3124 SATA controller. The oops seems to happen on
> > file access right after init starts.
>
> [snip]
>
> > NIP [c0249618] rt_spin_lock_slowlock+0x9c/0x200
> > LR [c02495ec] rt_spin_lock_slowlock+0x70/0x200
> > Call Trace:
> > [ef29d600] [c02495ec] rt_spin_lock_slowlock+0x70/0x200 (unreliable)
> > [ef29d670] [c00277d0] lock_timer_base+0x2c/0x64
> > [ef29d690] [c00285e8] del_timer+0x2c/0x78
> > [ef29d6b0] [c019d108] scsi_delete_timer+0x1c/0x3c
> > [ef29d6d0] [c01992d0] scsi_done+0x18/0x4c
> > [ef29d6f0] [c01b19dc] ata_scsi_qc_complete+0x364/0x380
> > [ef29d720] [c01a8708] __ata_qc_complete+0xd8/0xec
> > [ef29d740] [c01b011c] ata_qc_complete_multiple+0xc4/0xec
> > [ef29d760] [c01bcaf4] sil24_interrupt+0x46c/0x52c
> > [ef29d7a0] [c0048954] handle_IRQ_event+0x64/0x100
> > [ef29d7d0] [c0048b30] __do_IRQ+0x140/0x1bc
> > [ef29d7f0] [c00166c4] apmax_int_irq_demux+0x8c/0xb0
> > [ef29d810] [c0006448] do_IRQ+0x68/0xa8
> > [ef29d820] [c0010388] ret_from_except+0x0/0x14
> > --- Exception: 501 at __spin_unlock_irqrestore+0x28/0x4c
> > LR = __spin_unlock_irqrestore+0x20/0x4c
> > [ef29d8f0] [c0249600] rt_spin_lock_slowlock+0x84/0x200
> > [ef29d960] [c00277d0] lock_timer_base+0x2c/0x64
>
> You're recursively entering lock_timer_base, which does a
> spin_lock_irqsave().  Either interrupts are enabled when they should
> not be, or an interrupt was supposed to be threaded that isn't.

Sort of figured. How do I figure out which one, and how to fix it?
I've never gotten any -rt patchset to work on this CPU, and it always
seems to be related to the disk driver.
I've tried since 2.6.16 (2.6.16 and 2.6.18 on arch/ppc, 2.6.24 and
2.6.25 on arch/powerpc).

Even though this is a custom board, I'm pretty sure I can get it to fail
on a pq2fads board with the same disk controller.



Re: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Gregory Haskins

Rune Torgersen wrote:
> Scott Wood wrote:
> > You're recursively entering lock_timer_base, which does a
> > spin_lock_irqsave().  Either interrupts are enabled when they should
> > not be, or an interrupt was supposed to be threaded that isn't.
>
> Sort of figured. How do I figure out which one, and how to fix it?
> I've never gotten any -rt patchsets to work on this CPU, and it always
> seems to be related to the disk driver.
> I've tried since 2.6.16 ppc (2.6.16, 2.6.18 on ppc, 2.6.24 and 25 on
> powerpc)
>
> Even though this is a custom board, I'm pretty sure I can get it to fail
> on a pq2fads board with the same disk controller.


I'm not sure if LOCKDEP is available for that architecture.  Have you
tried it?  It's pretty good at flushing these kinds of issues out
(assuming it's available).


-Greg


Re: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Scott Wood

Rune Torgersen wrote:
> Scott Wood wrote:
> > You're recursively entering lock_timer_base, which does a
> > spin_lock_irqsave().  Either interrupts are enabled when they should
> > not be, or an interrupt was supposed to be threaded that isn't.
>
> Sort of figured. How do I figure out which one, and how to fix it?


Almost certainly the latter.  Is the disk interrupt shared with any
other interrupts that are marked IRQF_NODELAY?  The -rt patch doesn't
seem to handle mixing the two well.


Oh, and just to be sure: you do have CONFIG_PREEMPT_RT turned on, and 
not just CONFIG_PREEMPT, right?  The non-preempt-rt versions in the -rt 
patch don't look like they disable interrupts, though I may just be 
getting lost in a sea of underscores and ifdefs.


-Scott



RE: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Rune Torgersen
Scott Wood wrote:
> Almost certainly the latter.  Is the disk interrupt shared with any
> other interrupts, that are marked IRQF_NODELAY?  The -rt
> patch doesn't seem to handle mixing the two well.

The disk is on a muxed PCI interrupt. None of the other interrupts on the
mux is firing at the time.
Is it possible that the demuxer is not set up right? It is based loosely
on pq2-pci-pic.c.

> Oh, and just to be sure: you do have CONFIG_PREEMPT_RT turned on, and
> not just CONFIG_PREEMPT, right?  The non-preempt-rt versions in the
> -rt patch don't look like they disable interrupts, though I may just be
> getting lost in a sea of underscores and ifdefs.

Full CONFIG_PREEMPT_RT. I was actually going to try CONFIG_PREEMPT to
see if anything helped.



Re: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Scott Wood

Rune Torgersen wrote:
> Scott Wood wrote:
> > Almost certainly the latter.  Is the disk interrupt shared with any
> > other interrupts, that are marked IRQF_NODELAY?  The -rt
> > patch doesn't seem to handle mixing the two well.
>
> Disk is on a muxed PCI interrupt. None of the other interrupts on the
> mux is firing at the time.


Regardless of whether they're firing, any request_irq with IRQF_NODELAY 
will turn off threading for all handlers.



> Is it possible that the demuxer is not set up right? It is based loosely
> on pq2-pci-pic.c


Try calling set_irq_chip_and_handler() with handle_level_irq, rather
than set_irq_chip().  The -rt patch doesn't seem to have threadified the
__do_IRQ() path.


-Scott


RE: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Rune Torgersen
Scott Wood wrote:
> Try calling set_irq_chip_and_handler() with handle_level_irq, rather
> than set_irq_chip().  The -rt patch doesn't seem to have threadified
> the __do_IRQ() path.

The demuxer is setting itself up with set_irq_chained_handler(). Any
pointers on how to change to set_irq_chip_and_handler()?


Re: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Scott Wood

Rune Torgersen wrote:
> Scott Wood wrote:
> > Try calling set_irq_chip_and_handler() with handle_level_irq, rather
> > than set_irq_chip().  The -rt patch doesn't seem to have threadified
> > the __do_IRQ() path.
>
> The demuxer is setting itself up with set_irq_chained_handler(), any
> pointers on how to change to set_irq_chip_and_handler()?


No, I mean the call to set_irq_chip() in pci_pic_host_map() where it 
sets up the IRQs it manages, not the cascade IRQ itself.


-Scott
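
The difference between the two registration calls can be sketched with a small userspace model. This is an illustration under stated assumptions, not the kernel's implementation: the toy_* names are hypothetical, and the fallback rule (no flow handler installed means the legacy __do_IRQ() path is taken) is inferred from the call trace in the first oops and from the remark above about __do_IRQ().

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical, not a kernel symbol. */
enum toy_path { TOY_LEGACY_DO_IRQ, TOY_FLOW_HANDLER };

struct toy_desc;
typedef enum toy_path (*toy_flow_handler_t)(struct toy_desc *desc);

struct toy_desc {
    const char *chip_name;      /* which controller owns this IRQ */
    toy_flow_handler_t handle;  /* NULL: no flow handler installed */
};

/* Stand-in for handle_level_irq. */
static enum toy_path toy_handle_level_irq(struct toy_desc *desc)
{
    (void)desc;
    return TOY_FLOW_HANDLER;
}

/* Stand-in for set_irq_chip(): records the chip, leaves the handler alone. */
static void toy_set_chip(struct toy_desc *desc, const char *chip_name)
{
    assert(desc != NULL);
    desc->chip_name = chip_name;
}

/* Stand-in for set_irq_chip_and_handler(): records chip and flow handler. */
static void toy_set_chip_and_handler(struct toy_desc *desc,
                                     const char *chip_name,
                                     toy_flow_handler_t handle)
{
    assert(desc != NULL);
    desc->chip_name = chip_name;
    desc->handle = handle;
}

/* Dispatch a demuxed IRQ: use the flow handler when one is installed,
 * otherwise fall back to the legacy path (assumed rule, see above). */
static enum toy_path toy_dispatch(struct toy_desc *desc)
{
    assert(desc != NULL);
    return desc->handle ? desc->handle(desc) : TOY_LEGACY_DO_IRQ;
}
```

In this model, a descriptor set up with toy_set_chip() alone dispatches to TOY_LEGACY_DO_IRQ, and the same descriptor set up with toy_set_chip_and_handler(..., toy_handle_level_irq) dispatches to TOY_FLOW_HANDLER. In the real driver the change would be made in the host map callback that registers each demuxed IRQ, not in the cascade setup.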


RE: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Rune Torgersen
Scott Wood wrote:
> Rune Torgersen wrote:
> > Scott Wood wrote:
> > > Try calling set_irq_chip_and_handler() with handle_level_irq, rather
> > > than set_irq_chip().  The -rt patch doesn't seem to have threadified
> > > the __do_IRQ() path.
> >
> > The demuxer is setting itself up with set_irq_chained_handler(), any
> > pointers on how to change to set_irq_chip_and_handler()?
>
> No, I mean the call to set_irq_chip() in pci_pic_host_map() where it
> sets up the IRQs it manages, not the cascade IRQ itself.

Thanks!!! That fixed that particular problem.

Of course, I then ran headfirst into another one.
This one seems to happen when I attempt to read flash through an MTD
driver.

Oops: Exception in kernel mode, sig: 5 [#1]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c005e780 LR: c005e758 CTR: 
REGS: ef2c1d20 TRAP: 0700   Not tainted  (2.6.25.4-rt1)
MSR: 00029032 EE,ME,IR,DR  CR: 48222482  XER: 2000
TASK = ef2a1a90[98] 'S21initenv' THREAD: ef2c
GPR00: 3ffcf581 ef2c1dd0 ef2a1a90  c02cae8c 0002 0001 3ffcf580
GPR08:  c037  c037000c 28222484 1009ecc0  100a5838
GPR16: 1009 1009   1009 ef2a7100 3ee38385 100955ac
GPR24:  ef2a4ef0  c27dc700 c27dcb80 0003 ef359040 c0334f88
NIP [c005e780] do_wp_page+0x650/0xc2c
LR [c005e758] do_wp_page+0x628/0xc2c
Call Trace:
[ef2c1dd0] [c005e758] do_wp_page+0x628/0xc2c (unreliable)
[ef2c1e10] [c0012420] do_page_fault+0x338/0x4b4
[ef2c1f40] [c0010120] handle_page_fault+0xc/0x80
--- Exception: 301 at 0x100322e0
LR = 0x100322dc
Instruction dump:
409e0594 4810f2e1 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035 3d40c035 8129b480
Oops: Exception in kernel mode, sig: 5 [#2]
PREEMPT Innovative Systems ApMax
Modules linked in:
NIP: c005db0c LR: c005dae4 CTR: 
REGS: ef29fd00 TRAP: 0700   Tainted: G  D   (2.6.25.4-rt1)
MSR: 00029032 EE,ME,IR,DR  CR: 48002482  XER: 2000
TASK = ef26e070[102] 'sed' THREAD: ef29e000
GPR00: 3ee2d581 ef29fdb0 ef26e070  c02cae8c  0001 3ee2d580
GPR08:  c037  c037000c 0722 1009ecc0 4802dfa4 4802d878
GPR16: 0014f73c   0003 4802cce0  0001 0200
GPR24: ef3592e0 0ffece1c ef3592e0 ef2a50fc ef2a4df4 0003 c27dcfe0 c27ff4c0
NIP [c005db0c] __do_fault+0x1e0/0x804
LR [c005dae4] __do_fault+0x1b8/0x804
Call Trace:
[ef29fdb0] [c005dae4] __do_fault+0x1b8/0x804 (unreliable)
[ef29fe10] [c0012420] do_page_fault+0x338/0x4b4
[ef29ff40] [c0010120] handle_page_fault+0xc/0x80
--- Exception: 301 at 0x48017b10
LR = 0x48007ac8
Instruction dump:
409e060c 4810ff55 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a 7c0b482e
7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035 3d40c035 8129b480


RE: Oops with PREEMPT-RT on 2.6.25.4

2008-05-19 Thread Rune Torgersen
Rune Torgersen wrote:
> Of course I then ran headfirst into another one
> This one seems to happen when I attempt to read flash through an mtd
> driver.

Both of these are hitting a BUG_ON in kmap_atomic
(include/asm-powerpc/highmem.h).

 
 Oops: Exception in kernel mode, sig: 5 [#1]
 PREEMPT Innovative Systems ApMax
 Modules linked in:
 NIP: c005e780 LR: c005e758 CTR: 
 REGS: ef2c1d20 TRAP: 0700   Not tainted  (2.6.25.4-rt1)
 MSR: 00029032 EE,ME,IR,DR  CR: 48222482  XER: 2000
 TASK = ef2a1a90[98] 'S21initenv' THREAD: ef2c
 GPR00: 3ffcf581 ef2c1dd0 ef2a1a90  c02cae8c 0002 0001
 3ffcf580 GPR08:  c037  c037000c 28222484 1009ecc0
  100a5838 GPR16: 1009 1009   1009
 ef2a7100 3ee38385 100955ac GPR24:  ef2a4ef0  c27dc700
 c27dcb80 0003 ef359040 c0334f88 NIP [c005e780]
 do_wp_page+0x650/0xc2c 
 LR [c005e758] do_wp_page+0x628/0xc2c
 Call Trace:
 [ef2c1dd0] [c005e758] do_wp_page+0x628/0xc2c (unreliable)
 [ef2c1e10] [c0012420] do_page_fault+0x338/0x4b4
 [ef2c1f40] [c0010120] handle_page_fault+0xc/0x80
 --- Exception: 301 at 0x100322e0
 LR = 0x100322dc
 Instruction dump:
 409e0594 4810f2e1 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a
 7c0b482e 7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035
 3d40c035 8129b480 Oops: Exception in kernel mode, sig: 5 [#2]
 PREEMPT Innovative Systems ApMax
 Modules linked in:
 NIP: c005db0c LR: c005dae4 CTR: 
 REGS: ef29fd00 TRAP: 0700   Tainted: G  D   (2.6.25.4-rt1)
 MSR: 00029032 EE,ME,IR,DR  CR: 48002482  XER: 2000
 TASK = ef26e070[102] 'sed' THREAD: ef29e000
 GPR00: 3ee2d581 ef29fdb0 ef26e070  c02cae8c  0001
 3ee2d580 GPR08:  c037  c037000c 0722 1009ecc0
 4802dfa4 4802d878 GPR16: 0014f73c   0003 4802cce0
  0001 0200 GPR24: ef3592e0 0ffece1c ef3592e0 ef2a50fc
 ef2a4df4 0003 c27dcfe0 c27ff4c0 NIP [c005db0c]
 __do_fault+0x1e0/0x804 
 LR [c005dae4] __do_fault+0x1b8/0x804
 Call Trace:
 [ef29fdb0] [c005dae4] __do_fault+0x1b8/0x804 (unreliable)
 [ef29fe10] [c0012420] do_page_fault+0x338/0x4b4
 [ef29ff40] [c0010120] handle_page_fault+0xc/0x80
 --- Exception: 301 at 0x48017b10
 LR = 0x48007ac8
 Instruction dump:
 409e060c 4810ff55 3d20c035 1fa3000f 8129b140 3bbd0003 57ab103a
 7c0b482e 7d6b4a14 540007fa 30e0 7cc70110 0f06 3d20c035
 3d40c035 8129b480 ___
 Linuxppc-dev mailing list
 Linuxppc-dev@ozlabs.org
 https://ozlabs.org/mailman/listinfo/linuxppc-dev
