Re: powerpc: step-jump-cont failure (Was: [PATCH] utrace: don't set -ops = utrace_detached_ops lockless)

2009-12-09 Thread Oleg Nesterov
On 12/08, Ananth N Mavinakayanahalli wrote:

 On Mon, Dec 07, 2009 at 07:05:40PM +0100, Oleg Nesterov wrote:
  On 12/07, Oleg Nesterov wrote:
  
   On 12/07, Jan Kratochvil wrote:
   
On Mon, 07 Dec 2009 15:24:51 +0100, Oleg Nesterov wrote:
 But. raise_sigusr2 is not equal to the actual address of 
 raise_sigusr2(),
 this value points to the thunk (I do not know the correct English 
 term)
   
ppc64 calls it function descriptor (GDB
ppc64_linux_convert_from_func_ptr_addr):
   For PPC64, a function descriptor is a TOC entry,
  
   Thanks Jan.
  
in a data section,
  
   Yes!
  
   Now I can't understand how this test-case could ever work on ppc.
   step-jump-cont does:
  
 regs-nip = raise_sigusr2;  --- points to data section
 ptrace(PTRACE_CONT);
  
   of course, the tracee gets SIGSEGV, this section is not executable.
 
  Hmm. Looks like, powerpc means a lot of different hardware, and
  _PAGE_EXEC may be 0. I didn't notice this when I quickly grepped
  arch/powerpc/
 
  IOW, perhaps on some machines r implies x ?
 
  Is yes, this can explain why the results differ on different
  machines.

 Well, powerpc 32-bit adheres to the SVR4 ABI, while powerpc 64-bit uses
 the PPC64-ELF ABI (http://refspecs.linuxfoundation.org/ELF/ppc64/). The
 64bit ABI uses function descriptors and the 'func_name' is the data
 address,

Cai, Ananth, thank you.

So. I think we can forget about the possible kernel problems (and
in any case we can rule out utrace).

The test-case just wrong and should be fixed. The tracee can't execute
the function descriptor in data section, that is why it gets SIGSEGV.

 while the '.func_name' is the text address.

tried to change the code to

REGS_ACCESS (regs, nip) = (unsigned long) .raise_sigusr2

but gcc doesn't like this ;)

 (See
 handle_rt_signal64 in arch/powerpc/kernel/signal_64.c and
 kprobe_lookup_name in arch/powerpc/include/asm/kprobes.h.

Thanks... looking at handle_rt_signal64(), looks like we should
also set regs-gpr[2] = funct_desc_ptr-toc if we change regs-nip


I hope someone who understand powerpc could fix the test-case ;)

Oleg.



Re: Tests Failures on PPC64

2009-12-09 Thread Oleg Nesterov
On 12/08, caiq...@redhat.com wrote:

 This is seen with and without CONFIG_UTRACE.

Good, at least we shouldn't worry about utrace.

 FAIL: watchpoint

 ppc-dabr-race: ./../tests/ppc-dabr-race.c:141: handler_fail: Assertion `0' 
 failed.
 /bin/sh: line 5: 31750 Aborted   ${dir}$tst
 FAIL: ppc-dabr-race

 Are those known issues?

No, it is not. However I do not not what this test-case does,
and I know nothing about data watchpoints.

Hmm. it is obvioulsy racy, static volatile unsigned started
is not atomic and thus the main thread can hang doing

while (started  THREADS);

not that I think this explains the failure though.


Cai, I tried to reproduce the failure on your machine but it
doesn't fail?

Oleg.



s390: stepping PT_PTRACED (Was: utrace failed to compile for s390x)

2009-12-09 Thread Oleg Nesterov
(add cc's)

On 12/03, Roland McGrath wrote:

  Not sure what should we do right now, s390 needs more attention.
  Also, it uses PT_PTRACED bit, I am afraid this is not what we wan
  with utrace.

 This looks simple.  Its one use can just be
 tracehook_consider_fatal_signal(SIGTRAP) instead.  I expect you can just
 send that change to the s390 maintainers and they will take it.

Done.

But... I tried to understand these check and failed. Why do we need them?
They look unneeded to me, but of course I know nothing about s390.

For example, do_single_step(). If notify_die(DIE_SSTEP) does not return
NOTIFY_STOP, it is not kprobe. This means the caller is sysc_singlestep:
in entry.S, who else except ptrace could set TIF_SINGLESTEP ?

Oleg.



Re: s390: stepping PT_PTRACED (Was: utrace failed to compile for s390x)

2009-12-09 Thread Heiko Carstens
On Wed, Dec 09, 2009 at 08:54:28PM +0100, Oleg Nesterov wrote:
 (add cc's)
 
 On 12/03, Roland McGrath wrote:
 
   Not sure what should we do right now, s390 needs more attention.
   Also, it uses PT_PTRACED bit, I am afraid this is not what we wan
   with utrace.
 
  This looks simple.  Its one use can just be
  tracehook_consider_fatal_signal(SIGTRAP) instead.  I expect you can just
  send that change to the s390 maintainers and they will take it.
 
 Done.
 
 But... I tried to understand these check and failed. Why do we need them?
 They look unneeded to me, but of course I know nothing about s390.

Could be. I tried to figure out when and why this was added. But looks
like this is 10 year old code.



Re: s390: stepping PT_PTRACED (Was: utrace failed to compile for s390x)

2009-12-09 Thread Roland McGrath
 But... I tried to understand these check and failed. Why do we need them?
 They look unneeded to me, but of course I know nothing about s390.

It's not specific to s390.  Other arch's have equivalent logic.  As
with all things ptrace, I strongly suspect that they just blindly
copied the logic from some old i386 code and never contemplated the
actual intent.

AFAIK this is just part of some old defensive programming or partial
mitigation of the general problem that we overload the user signal
queue as a queue of debugger-induced hardware traps.  This is some
very incomplete mitigation--the general problem is a known issue we
want to address in the future--but it's the status quo, so better
not to tweak it now in case it might be closing an observable hole
today in some usage pattern that someone might actually experience.

The general problem has many corners and races and we still have
those with or without this check.  But consider e.g. the scenario
where the debugger PTRACE_SINGLESTEP'd into a long-blocking syscall
(read on a dangling pipe, etc.), then the debugger suddenly exits
without doing a proper PTRACE_DETACH.  That is an entirely non-racey
case where TIF_SINGLE_STEP was left set but -ptrace is clear, so
without the check you could get the spurious SIGTRAP killing the
tracee (the once-was-tracee-and-no-longer, that is).

Perhaps nowadays exit_ptrace() leads to ptrace_disable() -
user_disable_single_step() and so the TIF_* bit is clear before
resuming and getting here (or at least with utrace, it leads to
UTRACE_DETACH and eventually to user_disable_single_step()).  But at
least before that (and it looks like with the 2.6.32 ptrace code
still), it makes a difference in this scenario.  So it is still an
open can of worms I don't want us to dive into this week, but at
least this one worm-escape hole should stay as plugged as it has
been these last 10+ years.  (For extra credit, write up that case in
ptrace-tests and make it a KFAIL.)


Thanks,
Roland



Re: [RFC] [PATCH] In-kernel gdbstub based on utrace Infrastructure.

2009-12-09 Thread Ingo Molnar

* Frank Ch. Eigler f...@redhat.com wrote:

  [...]
 
  If the in-kernel gdb stub replaced kgdb functionally you'd hear no 
  complaints from me.
 
 Let's leave it as an idea for the future.

We came a full circle - that's the argument. We say overlap, duplication 
and incomplete implementation in this area is a problem. Since the speed 
of development in this area is truly glacial at the moment and the 
practical advantages that i can experience personally (directly as a 
Linux user and indirectly as a maintainer) are miniscule so far, caution 
is warranted IMO. You say there is no problem. We'll have to agree to 
disagree i guess.

Ingo