Hello Mike, hello Linus,

Some minutes ago, I wrote:
> I think I have found the reason for our bugs. It seems GCC really
> miscompiles buffer.c:bdflush_init without frame pointers. I'll try harder
> now to understand what exactly is going on, but it seems it is smashing
> its local stack space by decrementing its stack pointer too early, then
> calling an assembler function (__down_failed). It might be that GCC is
> confused by this.

[...]

> Any comments on this? I'll now try to split up the stack space operation in
> two parts, the first after call kernel_thread: addl $12, %esp (as in the
> first call), and an additional addl $64, %esp just before leaving (before
> popl %ebx). And I'll report what happened, later - but I have a good
> feeling that I have caught the bug.

... and my good feeling was right. Changing the bogus assembly code as
described above made the bug go away. I'll try to prepare a simpler testcase
for the GCC maintainers tomorrow. In short, this is what happens: GCC frees
the stack frame holding the local variables far too early. It then calls
__down_failed(), which pushes some things onto the stack, thereby overwriting
the semaphore pointer! So __down() works on a random memory location instead
of the semaphore, which is guaranteed to fail badly.
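
To make the mechanism concrete, here is a toy model in C. It is only a
sketch and not the kernel code or GCC's actual output: the stack is a plain
array, and every name in it (toy_stack, toy_helper, pointer_survives) is
hypothetical. toy_helper stands in for an assembler routine such as
__down_failed() that pushes scratch values; the frame size of 4 words is
an arbitrary assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 32

/* Toy descending stack: sp indexes the lowest live word in mem. */
struct toy_stack {
    uintptr_t mem[STACK_WORDS];
    size_t sp;
};

/* Push one word, like "pushl": decrement sp, then store. */
static void toy_push(struct toy_stack *s, uintptr_t v)
{
    s->mem[--s->sp] = v;
}

/* Reserve n words for locals, like "subl $4*n, %esp". */
static void toy_alloc(struct toy_stack *s, size_t n)
{
    s->sp -= n;
}

/* Release n words, like "addl $4*n, %esp". */
static void toy_free(struct toy_stack *s, size_t n)
{
    s->sp += n;
}

/* Hypothetical stand-in for a helper that pushes two scratch values
 * and pops them again before returning. */
static void toy_helper(struct toy_stack *s)
{
    toy_push(s, 0xdeadbeefu);
    toy_push(s, 0xcafef00du);
    toy_free(s, 2);
}

/* Returns 1 if the saved semaphore pointer is intact after the helper
 * call, 0 if it was overwritten. release_early != 0 models freeing the
 * locals before the call instead of after it. */
int pointer_survives(int release_early)
{
    struct toy_stack s = { {0}, STACK_WORDS };
    int sem = 0;                     /* stands in for the semaphore */
    size_t slot;

    toy_alloc(&s, 4);                /* frame with 4 local words */
    slot = s.sp + 3;                 /* highest local slot */
    s.mem[slot] = (uintptr_t)&sem;   /* local copy of the pointer */

    if (release_early)
        toy_free(&s, 4);             /* frame released too early */

    toy_helper(&s);                  /* pushes land at or below sp */

    if (!release_early)
        toy_free(&s, 4);             /* correct: release after the call */

    return s.mem[slot] == (uintptr_t)&sem;
}
```

With release_early set to 0 the helper's pushes land below the locals and
pointer_survives() returns 1. With release_early set to 1 the first push
reuses the slot that still holds the pointer, and it returns 0.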

I've added linux-kernel as CC again, so everybody can now hear that this is 
definitely a GCC bug, and not a kernel issue.

Greetings,
Andreas

-- 
->>>----------------------- Andreas Franck --------<<<-
---<<<---- [EMAIL PROTECTED] --->>>---
->>>---- Keep smiling! ----------------------------<<<-