Re: odd utrace testing results on s390x
> Could you please re-test with this patch applied?

It turned out that this patch did not make much difference: step-simple still fails with the patch applied. The failure could be reproduced a few times after a fresh reboot. The test exited with 1 here,

    /* Known bug in 2.6.28-rc7 + utrace patch:
     * child was left to run freely, and exited
     * Deterministic (happens even with NUM_SINGLESTEPS = 1) */
    if (WIFEXITED (status))
      {
        VERBOSE ("PTRACE_SINGLESTEP did not stop (step #%d)\n", i+1);
        assert (WEXITSTATUS (status) == 42);
        exit (1);
      }

Here is the strace output from a failing run.

    # strace ./step-simple
    execve("./step-simple", ["./step-simple"], [/* 28 vars */]) = 0
    brk(0) = 0x80003000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2027000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=44711, ...}) = 0
    mmap(NULL, 44711, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2028000
    close(3) = 0
    open("/lib64/libc.so.6", O_RDONLY) = 3
    read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\26\0\0\0\1\0\0\0\0\0\2\10\364\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=2703224, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2033000
    mmap(NULL, 1729920, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x2034000
    mmap(0x21d1000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19c000) = 0x21d1000
    mmap(0x21d6000, 17792, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x21d6000
    close(3) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x21db000
    mprotect(0x21d1000, 16384, PROT_READ) = 0
    mprotect(0x2022000, 4096, PROT_READ) = 0
    munmap(0x2028000, 44711) = 0
    rt_sigaction(SIGABRT, {0x8e7c, [ABRT], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
    rt_sigaction(SIGINT, {0x8e7c, [INT], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
    rt_sigaction(SIGALRM, {0x8e7c, [ALRM], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
    alarm(5) = 0
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x21db7c0) = 2026
    wait4(2026, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGSTOP}], 0, NULL) = 2026
    --- SIGCHLD (Child exited) @ 0 (0) ---
    ptrace(PTRACE_SINGLESTEP, 2026, 0, SIG_0) = 0
    wait4(2026, [{WIFEXITED(s) && WEXITSTATUS(s) == 42}], 0, NULL) = 2026
    --- SIGCHLD (Child exited) @ 0 (0) ---
    kill(2026, SIGKILL) = -1 ESRCH (No such process)
    wait4(-1, NULL, __WALL, NULL) = -1 ECHILD (No child processes)
    exit_group(1)

Also, I could not reproduce the problem on kernels without utrace.

Thanks,
CAI Qian
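The failure in the trace above comes down to which kind of status wait4() returns after PTRACE_SINGLESTEP: a SIGTRAP stop is what the test expects, while here the child exited with code 42. A minimal sketch of that classification follows. The enum and function names are hypothetical and are not taken from the ptrace testsuite.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical names for this sketch; not from the ptrace testsuite. */
enum step_result {
    STEP_STOPPED_SIGTRAP,   /* expected: tracee stopped after one step */
    STEP_CHILD_EXITED,      /* the failure mode reported in this thread */
    STEP_OTHER              /* killed, or stopped by some other signal */
};

/* Classify the status that wait4()/waitpid() stored after PTRACE_SINGLESTEP. */
static enum step_result classify_step_status(int status)
{
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
        return STEP_STOPPED_SIGTRAP;
    if (WIFEXITED(status))
        return STEP_CHILD_EXITED;
    return STEP_OTHER;
}

/* Render a one-line description into buf, e.g. for a test log. */
static const char *describe_step_status(int status, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    switch (classify_step_status(status)) {
    case STEP_STOPPED_SIGTRAP:
        snprintf(buf, len, "stopped with SIGTRAP");
        break;
    case STEP_CHILD_EXITED:
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
        break;
    default:
        snprintf(buf, len, "unexpected status 0x%x", (unsigned)status);
        break;
    }
    return buf;
}
```

In the failing run, the second wait4() status would fall into the STEP_CHILD_EXITED case, which is the branch where step-simple calls exit (1).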
Re: odd utrace testing results on s390x
On 12/22, caiq...@redhat.com wrote:
>
> The following are testing results on s390x kernels build from the source,
>
> http://kojipkgs.fedoraproject.org/packages/kernel/2.6.32.2/14.fc13/src/kernel-2.6.32.2-14.fc13.src.rpm
>
> without and with CONFIG_UTRACE using the latest ptrace-utrace git tree.
>
> ptrace testsuite:
> looks like step-simple is starting to fail,

Damn, my fault. I forgot to cc you when I sent the fix for s390 (attached below), and I forgot to remind you about this fix when we discussed the testing on s390.

Could you please re-test with this patch applied?

> and a different syscall number when syscall-from-clone failed.

This is not clear to me, will take a look.

Thanks!

Oleg.

------------------------------------------------------------

Untested, but hopefully trivial enough and shouldn't change the compiled code.

Nobody except ptrace itself should use task->ptrace or PT_PTRACED directly, change arch/s390/kernel/traps.c to use the helper.

Signed-off-by: Oleg Nesterov <o...@redhat.com>
Acked-by: Roland McGrath <rol...@redhat.com>
---

 arch/s390/kernel/traps.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- V1/arch/s390/kernel/traps.c~S390_DONT_ABUSE_PT_PTRACED 2009-04-06 00:03:36.0 +0200
+++ V1/arch/s390/kernel/traps.c 2009-12-09 20:31:49.0 +0100
@@ -18,7 +18,7 @@
 #include <linux/kernel.h>
 #include <linux/string.h>
 #include <linux/errno.h>
-#include <linux/ptrace.h>
+#include <linux/tracehook.h>
 #include <linux/timer.h>
 #include <linux/mm.h>
 #include <linux/smp.h>
@@ -382,7 +382,7 @@ void __kprobes do_single_step(struct pt_
 					SIGTRAP) == NOTIFY_STOP){
 		return;
 	}
-	if ((current->ptrace & PT_PTRACED) != 0)
+	if (tracehook_consider_fatal_signal(current, SIGTRAP))
 		force_sig(SIGTRAP, current);
 }
 
@@ -483,7 +483,7 @@ static void illegal_op(struct pt_regs * 
 	if (get_user(*((__u16 *) opcode), (__u16 __user *) location))
 		return;
 	if (*((__u16 *) opcode) == S390_BREAKPOINT_U16) {
-		if (current->ptrace & PT_PTRACED)
+		if (tracehook_consider_fatal_signal(current, SIGTRAP))
 			force_sig(SIGTRAP, current);
 		else
 			signal = SIGILL;
Re: odd utrace testing results on s390x
On 12/22, Oleg Nesterov wrote:
>
> On 12/22, caiq...@redhat.com wrote:
> >
> > and a different syscall number when syscall-from-clone failed.
>
> This is not clear to me, will take a look.

Should I say I know nothing about s390 and can't read its asm?

First of all, I think syscall-from-clone is wrong on s390 and needs a fix,

    #elif defined __s390__
    # define REGISTER_CALLNO	offsetof (struct user_regs_struct, orig_gpr2)
    # define REGISTER_RETVAL	offsetof (struct user_regs_struct, gprs[2])

This doesn't look right. I did a simple test-case (below), and with or without utrace it prints

    syscall=1234 ret=6		<- __NR_close
    syscall=1234 ret=-9		<- -EBADF, this is correct

Apparently, 1234 is the argument for close(1234).

So: it is not clear to me how to get callno on SYSCALL_EXIT, and I don't know whether it is OK or not that syscall-from-clone sees different ->orig_gpr2 after return from fork() on s390

    -unexpected syscall 0x5ee9		without utrace
    +unexpected syscall 0		with utrace

syscall_get_nr() returns regs->svcnr, but there is no svcnr in user_regs_struct on s390.

Still investigating...

Oleg.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <assert.h>
#include <sys/user.h>
#include <asm/ptrace.h>

/* Offsets passed to PTRACE_PEEKUSER to read the "syscall number" and "return value" */
#ifdef __s390__
#define REGISTER_CALLNO	offsetof(struct user_regs_struct, orig_gpr2)
#define REGISTER_RETVAL	offsetof(struct user_regs_struct, gprs[2])
#else
#define REGISTER_CALLNO	offsetof(struct user_regs_struct, orig_rax)
#define REGISTER_RETVAL	offsetof(struct user_regs_struct, rax)
#endif

int main(void)
{
	int pid, status, i;
	long scn, ret;

	pid = fork();
	if (!pid) {
		/* child: become a tracee, stop, then make one failing syscall */
		assert(ptrace(PTRACE_TRACEME, 0, 0, 0) == 0);
		kill(getpid(), SIGUSR1);
		close(1234);
		return 0x66;
	}

	assert(pid == wait(&status));
	assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGUSR1);
	assert(ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACESYSGOOD) == 0);

	/* two stops: syscall entry, then syscall exit */
	for (i = 0; i < 2; ++i) {
		ptrace(PTRACE_SYSCALL, pid, 0, 0);
		assert(pid == wait(&status));
		assert(status == 0x857f);

		scn = ptrace(PTRACE_PEEKUSER, pid, REGISTER_CALLNO, NULL);
		ret = ptrace(PTRACE_PEEKUSER, pid, REGISTER_RETVAL, NULL);
		printf("syscall=%ld ret=%ld\n", scn, ret);
	}

	kill(pid, SIGKILL);
	return 0;
}
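The test-case asserts that every wait status inside the loop equals 0x857f. A short sketch of why that value is expected: with PTRACE_O_TRACESYSGOOD set, a syscall-entry or syscall-exit stop is reported as a stop with signal SIGTRAP | 0x80, and on Linux SIGTRAP is 5. The function names below are hypothetical helpers, not part of the test-case.

```c
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/*
 * Return nonzero if status describes a syscall stop as reported under
 * PTRACE_O_TRACESYSGOOD, i.e. a stop whose signal is (SIGTRAP | 0x80).
 * An ordinary SIGTRAP stop (for example from single-stepping) has the
 * 0x80 bit clear and is rejected.
 */
static int is_syscall_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}

/* Sanity check: the literal used by the test-case is exactly this encoding. */
static void check_testcase_literal(void)
{
    assert(is_syscall_stop(0x857f));
}
```

Comparing against the raw literal, as the test-case does, is stricter than this helper, but the helper makes the intent of the constant visible.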
Re: odd utrace testing results on s390x
On 12/22, Oleg Nesterov wrote:
>
> and I don't know whether it is OK or not that syscall-from-clone sees different ->orig_gpr2 after return from fork() on s390
>
>     -unexpected syscall 0x5ee9		without utrace
>     +unexpected syscall 0		with utrace

AARGH. I misread the diff you reported, the difference above has nothing to do with syscall-from-clone! This is the output from the next test-case, step-from-clone.

I think everything is (almost) clear now, I'll try to summarize.

Both syscall-from-clone and step-from-clone fail, with or without utrace. I think they both need the fix, REGISTER_CALLNO and REGISTER_RETVAL are wrong on s390. (I also tested my trivial test-case on rhel5 and it prints the same).

Now, the difference above _can_ be explained because the utrace kernel lacks ca633fd006486ed2c2d3b542283067aab61e6dc8 (Cai, I attached this patch in the first reply), or we have issues with utrace on s390.

without utrace:

    -unexpected syscall 0x5ee9

this is in fact the grandchild's pid == fork's retval because REGISTER_CALLNO is wrong

with utrace:

    +unexpected syscall 0

well, we are reading orig_gpr2 which is wrong anyway, but I am worried because "cat /proc/child/syscall" reports

    -1 0x3ff21d8 0x20f1e52

_Perhaps_ this all will be fixed by ca633fd006486ed2c2d3b542283067aab61e6dc8, but I am not sure. This trap was (I think) generated by ptrace_report_signal(), it may happen that s390 doesn't preserve some registers when we dequeue SIGTRAP after do_notify_resume()->utrace_stop() and call utrace_stop() again.

Oh. I am still trying to parse arch/s390/kernel/entry.S to understand how we can fix these test-cases. I think I need help, will continue tomorrow.

Oleg.
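For reading the /proc/child/syscall line quoted above: per proc(5), when a task is blocked but not inside a system call, the file shows -1 in place of the syscall number, followed only by the stack pointer and the program counter. A minimal parser for that short form is sketched below. The struct and function names are hypothetical, and the longer in-syscall form (number, six arguments, sp, pc) is deliberately rejected rather than parsed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for this sketch. */
struct proc_syscall {
    long nr;            /* -1: blocked, but not inside a system call */
    unsigned long sp;   /* stack pointer */
    unsigned long pc;   /* program counter */
};

/*
 * Parse the short form of /proc/<pid>/syscall: "-1 <sp> <pc>".
 * Returns 0 on success, -1 if the line does not have that shape
 * (for example "running", or the in-syscall form with arguments).
 */
static int parse_blocked_syscall_line(const char *line, struct proc_syscall *out)
{
    char extra;
    int n;

    assert(line != NULL && out != NULL);
    /* The trailing %c only matches if a fourth field follows. */
    n = sscanf(line, "%ld %lx %lx %c", &out->nr, &out->sp, &out->pc, &extra);
    if (n != 3 || out->nr != -1)
        return -1;
    return 0;
}
```

Applied to the line from the message, this yields nr = -1, sp = 0x3ff21d8 and pc = 0x20f1e52, that is, the kernel did not consider the child to be inside a system call at that stop.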
Re: odd utrace testing results on s390x
> Damn, my fault. I forgot to cc you when I sent the fix for s390 (attached below), and I forgot to remind you about this fix when we discussed the testing on s390.

That change is upstream for 2.6.33 now. I'll cherry-pick it into the 2.6.32/tracehook branch so it will be in the backport patchset too.

Thanks,
Roland
Re: odd utrace testing results on s390x
> Oh. I am still trying to parse arch/s390/kernel/entry.S to understand how can we fix these test-cases. I think I need the help, will continue tomorrow.

Martin Schwidefsky <schwidef...@de.ibm.com> is the s390 arch maintainer. He is friendly and helpful. You can ask him for help both with understanding the intended s390 behavior before, and with understanding the code paths. He won't expect any of us to grok s390 assembly. :-)

Thanks,
Roland