Re: segfault on x86_64 (Fedora Core 2) 2.6.x

Juergen Kreileder Thu, 10 Jun 2004 12:29:27 -0700

Marc Heckmann <[EMAIL PROTECTED]> writes:

> On Wed, Jun 09, 2004 at 11:58:32PM -0700, Hui Huang wrote:
>> The segfault messages look benign - they are the implicit NULL
>> checks within JVM.
>
> But look at the following discussion: Andi Kleen, one of the x86_64
> kernel developers, says that the kernel has been fixed not to log the
> messages if it is not a real segfault:
>
> http://bugme.osdl.org/show_bug.cgi?id=2839
>
> But maybe they fixed 2.4 and forgot to do so for 2.6.x?


It looks like it's supposed to be fixed in 2.6 too.  But a quick test
shows that it actually works the other way round, at least in 2.6.6:
only caught segfaults get logged!

Here's a patch which fixed the problem for me:

--- arch/x86_64/mm/fault.c.orig 2004-06-10 19:51:45.000000000 +0200
+++ arch/x86_64/mm/fault.c      2004-06-10 20:38:38.000000000 +0200
@@ -210,11 +210,11 @@ static int is_errata93(struct pt_regs *r
 
 int unhandled_signal(struct task_struct *tsk, int sig)
 {
-       /* Warn for strace, but not for gdb */
-       if ((tsk->ptrace & (PT_PTRACED|PT_TRACESYSGOOD)) == PT_PTRACED)
-               return 0;
-       return (tsk->sighand->action[sig-1].sa.sa_handler == SIG_IGN) ||
-               (tsk->sighand->action[sig-1].sa.sa_handler == SIG_DFL);
+       /* I'm not sure about PT_TRACESYSGOOD. Is gdb supposed
+          to use PTRACE_O_TRACESYSGOOD?  Mine doesn't. */
+       return !(tsk->ptrace & PT_PTRACED) &&
+               ((tsk->sighand->action[sig-1].sa.sa_handler == SIG_IGN) ||
+                (tsk->sighand->action[sig-1].sa.sa_handler == SIG_DFL));
 }
 
 int page_fault_trace; 
@@ -374,7 +374,7 @@ bad_area_nosemaphore:
                    (address >> 32))
                        return;
 
-               if (exception_trace && !unhandled_signal(tsk, SIGSEGV)) { 
+               if (exception_trace && unhandled_signal(tsk, SIGSEGV)) { 
                printk(KERN_INFO 
                       "%s[%d]: segfault at %016lx rip %016lx rsp %016lx error %lx\n",
                                        tsk->comm, tsk->pid, address, regs->rip,


> It also seems to me that the Sun 1.4.2 JVM did not function
> correctly (1.5.0-beta for x86_64 did actually dump core.) in that
> the webapp was not functioning correctly.

32-bit VMs should work OK (unless you use noexec32=all,on; only
Blackdown 1.4.2 works fine with that).
As for 64-bit VMs, please wait a few days: 1.4.2-fcs will fix quite a few
x86_64-specific bugs.


        Juergen

-- 
Juergen Kreileder, Blackdown Java-Linux Team
http://www.blackdown.org/java-linux/java2-status/


