Steven Rostedt wrote:
On Wed, 20 Aug 2008, Eran Liberty wrote:

Steven Rostedt wrote:
On Wed, 20 Aug 2008, Steven Rostedt wrote:

On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:

Found the problem (or at least -a- problem), it's a gcc bug.

Well, first I must say the code generated by -pg is just plain
horrible :-)

Appart from that, look at the exit of, for example, __d_lookup, as
generated by gcc when ftrace is enabled:

c00c0498:       38 60 00 00     li      r3,0
c00c049c:       81 61 00 00     lwz     r11,0(r1)
c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
c00c04a4:       7d 61 5b 78     mr      r1,r11
c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
c00c04ac:       7c 08 03 a6     mtlr    r0
c00c04b0:       4e 80 00 20     blr

As you can see, it restores r1 -before- it pops r24..r31 off
the stack ! I let you imagine what happens if an interrupt happens
just in between those two instructions (mr and lmw). We don't do
redzones on our ABI, so basically, the registers end up corrupted
by the interrupt.
Ouch!  You've disassembled this without -pg too, and it does not have this
bug? What version of gcc do you have?

I have:
 gcc (Debian 4.3.1-2) 4.3.1

c00c64c8:       81 61 00 00     lwz     r11,0(r1)
c00c64cc:       7f 83 e3 78     mr      r3,r28
c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
c00c64d8:       7d 61 5b 78     mr      r1,r11
c00c64dc:       7c 08 03 a6     mtlr    r0
c00c64e0:       4e 80 00 20     blr


My version looks fine.  I'm thinking that this is a separate issue than what
Eran is seeing.

Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup, and
print out the end of the function dump.

Thanks,

-- Steve



powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head -n 100

vmlinux:     file format elf32-powerpc

Disassembly of section .text:

c00bb584 <__d_lookup>:

[...]

c00bb670:       41 9e 00 50     beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
c00bb674:       83 de 00 00     lwz     r30,0(r30)
c00bb678:       2f 9e 00 00     cmpwi   cr7,r30,0
c00bb67c:       40 9e ff 98     bne+    cr7,c00bb614 <__d_lookup+0x90>
c00bb680:       38 60 00 00     li      r3,0
c00bb684:       81 61 00 00     lwz     r11,0(r1)
c00bb688:       80 0b 00 04     lwz     r0,4(r11)
c00bb68c:       7d 61 5b 78     mr      r1,r11

[ BUG HERE IF INTERRUPT HAPPENS ]

c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
c00bb694:       7c 08 03 a6     mtlr    r0
c00bb698:       4e 80 00 20     blr

Yep, you have the same bug in your compiler.

-- Steve
Hmm... so whats now?

Is there a way to prove this scenario is indeed the one that caused the opps?

-- Liberty
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Reply via email to