Re: ptrace crash on PREEMPT 2.6.18-128.7.1.el5 kernel
Hi,

We are experiencing the same issue. It is related to this fix:
linux-2.6-misc-ptrace-fix-exec-report.patch
and the related bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=455060

We patched ptrace.c with this:

*** /root/rpmbuild_r5/BUILD/kernel-2.6.18/linux-2.6.18.i386/kernel/ptrace.c 2010-02-17 18:02:49.0 +0100
--- ptrace.c 2010-02-17 19:23:05.0 +0100
***************
*** 1976,1981 ****
--- 1976,1982 ----
       * The difference is in where the real stop takes place and
       * what ptrace can do with tsk->exit_code there.
       */
+     preempt_enable_no_resched();
      send_sig(SIGTRAP, tsk, 0);
      return UTRACE_ACTION_RESUME;
  }

And it seems to be fine for now: gdb and strace are now working properly.

Regards,
Régis

Roland McGrath wrote:
> > But I'm happy to have tracked it down to the utrace-based ptrace
> > emulation, and was mostly just interested in knowing if preempt and
> > utrace are fundamentally incompatible on x86_64, or something like
> > that. I'll fight through the 2.6.18-164 issues instead, since the
> > ptrace problem doesn't seem to be happening on that version.
>
> It is probably the case that the RHEL5 utrace code cannot easily be
> made to work reliably with CONFIG_PREEMPT.
>
> Thanks,
> Roland

-- 
Régis ODEYE
Kontron Modular Computers SA
150, rue M. Berthelot / ZI Toulon Est / BP 244 / Fr 83078 TOULON Cedex 9
Phone: (33) 4 98 16 34 86
Fax: (33) 4 98 16 34 01
E-mail: regis.od...@kontron.com
Web : www.kontron.com
ptrace crash on PREEMPT 2.6.18-128.7.1.el5 kernel
I'm not sure if this is the place for this, but:

I have an x86_64 machine that gets an immediate SIGSEGV when ptracing anything:

[r...@dl360g6gs1 kernel-2.6.18]# strace true
execve(/bin/true, [true], [/* 28 vars */]) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

I have recompiled the kernel (2.6.18-128.7.1.el5), but the only significant change I can think of making is that I enabled preemption.

I also have an x86_64 VM under VirtualBox using a slightly different kernel. It was initially working, but when I installed an updated kernel RPM, it started crashing as well -- even before rebooting into the new kernel! However, it is crashing differently. It gives me a kernel stack trace (pasted below). It looks like some sort of locking issue.

Is this problem fixed in later patched kernels? I know kernel-2.6.18-164.11.1.el5.x86_64.rpm is available, but the last time I tried that particular one it caused me some unrelated problems so I'm hesitant to go there.

I can post the kernel config if it would be helpful.
BUG: warning at kernel/ptrace.c:1674/ptrace_report() (Not tainted)
Call Trace:
 [800c44d4] ptrace_report+0xeb/0x120
 [800c3f52] utrace_report_syscall+0x74/0x227
 [80070f41] syscall_trace_leave+0x5e/0x87
 [80060312] int_very_careful+0x35/0x3f

BUG: scheduling while atomic: true/0x0001/2526
Call Trace:
 [800655dd] __sched_text_start+0x7d/0xc22
 [800a0484] kernel_text_address+0x1a/0x26
 [8006f17b] dump_trace+0x214/0x23d
 [800c2212] utrace_quiescent+0xde/0x261
 [800c4095] utrace_report_syscall+0x1b7/0x227
 [80070f41] syscall_trace_leave+0x5e/0x87
 [80060312] int_very_careful+0x35/0x3f

BUG: warning at kernel/ptrace.c:1674/ptrace_report() (Not tainted)
Call Trace:
 [800c44d4] ptrace_report+0xeb/0x120
 [80060312] int_very_careful+0x35/0x3f
 [800c45ea] ptrace_report_signal+0x4c/0x5c
 [800c1955] report_signal+0x7f/0x179
 [800c3282] utrace_get_signal+0x3e3/0x62b
 [8006e73c] __switch_to+0x2e/0x22d
 [8002c12a] get_signal_to_deliver+0x177/0x461
 [8005d4ed] do_notify_resume+0x9c/0x7b0
 [800c4095] utrace_report_syscall+0x1b7/0x227
 [8006032e] int_signal+0x12/0x17

BUG: scheduling while atomic: true/0x0001/2526
Call Trace:
 [800655dd] __sched_text_start+0x7d/0xc22
 [800c44ec] ptrace_report+0x103/0x120
 [80060312] int_very_careful+0x35/0x3f
 [800c45ea] ptrace_report_signal+0x4c/0x5c
 [800c1955] report_signal+0x7f/0x179
 [800c2212] utrace_quiescent+0xde/0x261
 [800c3450] utrace_get_signal+0x5b1/0x62b
 [8006e73c] __switch_to+0x2e/0x22d
 [8002c12a] get_signal_to_deliver+0x177/0x461
 [8005d4ed] do_notify_resume+0x9c/0x7b0
 [800c4095] utrace_report_syscall+0x1b7/0x227
 [8006032e] int_signal+0x12/0x17

Call Trace:
 [800655dd] __sched_text_start+0x7d/0xc22
 [800c44ec] ptrace_report+0x103/0x120
 [80060312] int_very_careful+0x35/0x3f
 [800c45ea] ptrace_report_signal+0x4c/0x5c
 [800c1955] report_signal+0x7f/0x179
 [800c2212] utrace_quiescent+0xde/0x261
 [800c3450] utrace_get_signal+0x5b1/0x62b
 [8006e73c] __switch_to+0x2e/0x22d
 [8002c12a] get_signal_to_deliver+0x177/0x461
 [8005d4ed] do_notify_resume+0x9c/0x7b0
 [800c4095] utrace_report_syscall+0x1b7/0x227
 [8006032e] int_signal+0x12/0x17

BUG: warning at kernel/ptrace.c:1674/ptrace_report() (Not tainted)
Call Trace:
 [800c44d4] ptrace_report+0xeb/0x120
 [800c45ea] ptrace_report_signal+0x4c/0x5c
 [800c1955] report_signal+0x7f/0x179
 [800c3282] utrace_get_signal+0x3e3/0x62b
 [8002c12a] get_signal_to_deliver+0x177/0x461
 [8005d4ed] do_notify_resume+0x9c/0x7b0
 [8009af62] specific_send_sig_info+0xa1/0xac
 [800683c9] _spin_unlock_irqrestore+0x16/0x31
 [8009b243] force_sig_info+0xae/0xb9
 [8006a707] do_page_fault+0x81e/0x830
 [800c4095] utrace_report_syscall+0x1b7/0x227
 [800606e0] retint_signal+0x3d/0x79

BUG: scheduling while atomic: true/0x0001/2526
Call Trace:
 [800655dd] __sched_text_start+0x7d/0xc22
 [800c44ec] ptrace_report+0x103/0x120
 [800c45ea] ptrace_report_signal+0x4c/0x5c
 [800c1955] report_signal+0x7f/0x179
 [800c2212] utrace_quiescent+0xde/0x261
 [800c3450] utrace_get_signal+0x5b1/0x62b
 [8002c12a] get_signal_to_deliver+0x177/0x461
 [8005d4ed] do_notify_resume+0x9c/0x7b0
 [8009af62] specific_send_sig_info+0xa1/0xac
 [800683c9] _spin_unlock_irqrestore+0x16/0x31
 [8009b243] force_sig_info+0xae/0xb9
 [8006a707] do_page_fault+0x81e/0x830
 [800c4095] utrace_report_syscall+0x1b7/0x227
 [800606e0] retint_signal+0x3d/0x79

BUG: warning at
Re: ptrace crash on PREEMPT 2.6.18-128.7.1.el5 kernel
On Fri, Feb 26, 2010 at 12:33 PM, Roland McGrath rol...@redhat.com wrote:
> If you are using actual RHEL5, you can go through your normal support
> channels for help on that. I don't know off hand of anybody who wants
> to help you with support for using RHEL5 kernel source built with a
> set of options different from what RHEL5's own builds use. For kernel
> developers, that is a really ancient kernel now. For enterprise
> support folks, changing big important config options for rebuilding
> from the stable old kernel's source is outside the scope of stable
> and support.

Fair enough. Thanks for the quick response.

For what I'm working on, I really do need the preemptive kernel (I'm generating a few thousand different live video streams, so latency=glitches, and preempt measurably helps.) Which, as you say, pretty much pushes me outside of the supportable envelope unless we track the bleeding edge, which is not a good idea for our setup.

But I'm happy to have tracked it down to the utrace-based ptrace emulation, and was mostly just interested in knowing if preempt and utrace are fundamentally incompatible on x86_64, or something like that. I'll fight through the 2.6.18-164 issues instead, since the ptrace problem doesn't seem to be happening on that version.

Thanks,
Steve
Re: ptrace crash on PREEMPT 2.6.18-128.7.1.el5 kernel
> But I'm happy to have tracked it down to the utrace-based ptrace
> emulation, and was mostly just interested in knowing if preempt and
> utrace are fundamentally incompatible on x86_64, or something like
> that. I'll fight through the 2.6.18-164 issues instead, since the
> ptrace problem doesn't seem to be happening on that version.

It is probably the case that the RHEL5 utrace code cannot easily be made to work reliably with CONFIG_PREEMPT.

Thanks,
Roland
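
Note for readers of this thread: the "scheduling while atomic: true/0x0001/2526" lines in the trace appear to be the task name, the preempt count, and the pid, so the scheduler seems to have been entered with a preempt count of 1. That would be consistent with the one-line patch in this thread, which adds a preempt_enable_no_resched() before send_sig(). The following is only a toy userspace model of that bookkeeping, not kernel code. The struct and the toy_* names are made up for illustration, and where the real code path disables preemption is an assumption that the thread does not confirm.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Toy userspace model, NOT kernel code. Every name here is a hypothetical
 * stand-in; only preempt_enable_no_resched() is named in the thread.
 */
struct toy_task {
    int preempt_count;  /* nonzero means "atomic": sleeping is a bug */
    const char *comm;   /* task name, e.g. "true" */
    int pid;
};

static void toy_preempt_disable(struct toy_task *t)
{
    t->preempt_count++;
}

static void toy_preempt_enable_no_resched(struct toy_task *t)
{
    assert(t->preempt_count > 0);  /* an enable must pair with a disable */
    t->preempt_count--;
}

/*
 * Loosely mimics the scheduler's sanity check: returns 1 and prints a
 * complaint if entered with a nonzero preempt count, otherwise returns 0.
 */
static int toy_schedule(const struct toy_task *t)
{
    if (t->preempt_count != 0) {
        printf("BUG: scheduling while atomic: %s/0x%08x/%d\n",
               t->comm, (unsigned int)t->preempt_count, t->pid);
        return 1;
    }
    return 0;
}

/*
 * Assumed shape of the problem: some earlier step disabled preemption and
 * the task then has to stop, which means sleeping. With apply_fix set, the
 * count is dropped first, as the patch in the thread does.
 */
static int toy_report_path(struct toy_task *t, int apply_fix)
{
    toy_preempt_disable(t);
    if (apply_fix)
        toy_preempt_enable_no_resched(t);
    return toy_schedule(t);
}
```

Calling toy_report_path() on a task initialised as { 0, "true", 2526 } with apply_fix set to 0 prints the complaint and returns 1; with apply_fix set to 1 the count is back at zero before the scheduler check and the call returns 0. Whether balancing the count at that one call site is enough on a real CONFIG_PREEMPT build is exactly what Roland's reply leaves open.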