Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)
On 12/08, Ananth N Mavinakayanahalli wrote: On Mon, Dec 07, 2009 at 07:05:40PM +0100, Oleg Nesterov wrote: On 12/07, Oleg Nesterov wrote: On 12/07, Jan Kratochvil wrote: On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote: But. raise_sigusr2 is not equal to the actual address of raise_sigusr2(), this value points to the thunk (I do not know the correct English term) ppc64 calls it function descriptor (GDB ppc64_linux_convert_from_func_ptr_addr): For PPC64, a function descriptor is a TOC entry, Thanks Jan. in a data section, Yes! Now I can't understand how this test-case could ever work on ppc. step-jump-cont does: regs-nip = raise_sigusr2; --- points to data section ptrace(PTRACE_CONT); of course, the tracee gets SIGSEGV, this section is not executable. Hmm. Looks like, powerpc means a lot of different hardware, and _PAGE_EXEC may be 0. I didn't notice this when I quickly grepped arch/powerpc/ IOW, perhaps on some machines r implies x ? Is yes, this can explain why the results differ on different machines. Well, powerpc 32-bit adheres to the SVR4 ABI, while powerpc 64-bit uses the PPC64-ELF ABI (http://refspecs.linuxfoundation.org/ELF/ppc64/). The 64bit ABI uses function descriptors and the 'func_name' is the data address, Cai, Ananth, thank you. So. I think we can forget about the possible kernel problems (and in any case we can rule out utrace). The test-case just wrong and should be fixed. The tracee can't execute the function descriptor in data section, that is why it gets SIGSEGV. while the '.func_name' is the text address. tried to change the code to REGS_ACCESS (regs, nip) = (unsigned long) .raise_sigusr2 but gcc doesn't like this ;) (See handle_rt_signal64 in arch/powerpc/kernel/signal_64.c and kprobe_lookup_name in arch/powerpc/include/asm/kprobes.h. Thanks... looking at handle_rt_signal64(), looks like we should also set regs-gpr[2] = funct_desc_ptr-toc if we change regs-nip I hope someone who understand powerpc could fix the test-case ;) Oleg.
Re: Tests Failures on PPC64
On 12/08, caiq...@redhat.com wrote: This is seen with and without CONFIG_UTRACE. Good, at least we shouldn't worry about utrace. FAIL: watchpoint ppc-dabr-race: ./../tests/ppc-dabr-race.c:141: handler_fail: Assertion `0' failed. /bin/sh: line 5: 31750 Aborted ${dir}$tst FAIL: ppc-dabr-race Are those known issues? No, it is not. However I do not not what this test-case does, and I know nothing about data watchpoints. Hmm. it is obvioulsy racy, static volatile unsigned started is not atomic and thus the main thread can hang doing while (started THREADS); not that I think this explains the failure though. Cai, I tried to reproduce the failure on your machine but it doesn't fail? Oleg.
s390: stepping PT_PTRACED (Was: utrace failed to compile for s390x)
(add cc's) On 12/03, Roland McGrath wrote: Not sure what should we do right now, s390 needs more attention. Also, it uses PT_PTRACED bit, I am afraid this is not what we wan with utrace. This looks simple. Its one use can just be tracehook_consider_fatal_signal(SIGTRAP) instead. I expect you can just send that change to the s390 maintainers and they will take it. Done. But... I tried to understand these check and failed. Why do we need them? They look unneeded to me, but of course I know nothing about s390. For example, do_single_step(). If notify_die(DIE_SSTEP) does not return NOTIFY_STOP, it is not kprobe. This means the caller is sysc_singlestep: in entry.S, who else except ptrace could set TIF_SINGLESTEP ? Oleg.
Re: s390: stepping PT_PTRACED (Was: utrace failed to compile for s390x)
On Wed, Dec 09, 2009 at 08:54:28PM +0100, Oleg Nesterov wrote: (add cc's) On 12/03, Roland McGrath wrote: Not sure what should we do right now, s390 needs more attention. Also, it uses PT_PTRACED bit, I am afraid this is not what we wan with utrace. This looks simple. Its one use can just be tracehook_consider_fatal_signal(SIGTRAP) instead. I expect you can just send that change to the s390 maintainers and they will take it. Done. But... I tried to understand these check and failed. Why do we need them? They look unneeded to me, but of course I know nothing about s390. Could be. I tried to figure out when and why this was added. But looks like this is 10 year old code.
Re: s390: stepping PT_PTRACED (Was: utrace failed to compile for s390x)
But... I tried to understand these check and failed. Why do we need them? They look unneeded to me, but of course I know nothing about s390. It's not specific to s390. Other arch's have equivalent logic. As with all things ptrace, I strongly suspect that they just blindly copied the logic from some old i386 code and never contemplated the actual intent. AFAIK this is just part of some old defensive programming or partial mitigation of the general problem that we overload the user signal queue as a queue of debugger-induced hardware traps. This is some very incomplete mitigation--the general problem is a known issue we want to address in the future--but it's the status quo, so better not to tweak it now in case it might be closing an observable hole today in some usage pattern that someone might actually experience. The general problem has many corners and races and we still have those with or without this check. But consider e.g. the scenario where the debugger PTRACE_SINGLESTEP'd into a long-blocking syscall (read on a dangling pipe, etc.), then the debugger suddenly exits without doing a proper PTRACE_DETACH. That is an entirely non-racey case where TIF_SINGLE_STEP was left set but -ptrace is clear, so without the check you could get the spurious SIGTRAP killing the tracee (the once-was-tracee-and-no-longer, that is). Perhaps nowadays exit_ptrace() leads to ptrace_disable() - user_disable_single_step() and so the TIF_* bit is clear before resuming and getting here (or at least with utrace, it leads to UTRACE_DETACH and eventually to user_disable_single_step()). But at least before that (and it looks like with the 2.6.32 ptrace code still), it makes a difference in this scenario. So it is still an open can of worms I don't want us to dive into this week, but at least this one worm-escape hole should stay as plugged as it has been these last 10+ years. (For extra credit, write up that case in ptrace-tests and make it a KFAIL.) Thanks, Roland
Re: [RFC] [PATCH] In-kernel gdbstub based on utrace Infrastructure.
* Frank Ch. Eigler f...@redhat.com wrote: [...] If the in-kernel gdb stub replaced kgdb functionally you'd hear no complaints from me. Let's leave it as an idea for the future. We came a full circle - that's the argument. We say overlap, duplication and incomplete implementation in this area is a problem. Since the speed of development in this area is truly glacial at the moment and the practical advantages that i can experience personally (directly as a Linux user and indirectly as a maintainer) are miniscule so far, caution is warranted IMO. You say there is no problem. We'll have to agree to disagree i guess. Ingo