Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-26 Thread Martin Schwidefsky
On Thu, 21 Jan 2010 21:51:13 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/07, Roland McGrath wrote:
 
   I am confused as well. Yes, I thought about regs-psw.mask change too,
   but I don't understand why it helps..
  [...]
   But. Acoording to the testing I did (unless I did something wrong
   again) this patch doesn't make any difference in this particular
   case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.
 
  Those results are quite mysterious to me.
  I think we'll have to get Martin to sort it out definitively.

Finally nailed that one. Grrmpf.. the special case in the program check
handler for single stepped svcs clobbers the argument registers. With our
test case this affects the clone() system call. Funny things happen when
the clone_flags argument is more or less random ..
The following patch fixes the problem for me.

--
Subject: [PATCH] fix single stepped svcs with TRACE_IRQFLAGS=y

From: Martin Schwidefsky schwidef...@de.ibm.com

If irq flags tracing is enabled the TRACE_IRQS_ON macros expands to
a function call which clobbers registers %r0-%r5. The macro is used
in the code path for single stepped system calls. The argument
registers %r2-%r6 need to be restored from the stack before the system
call function is called.

Cc: sta...@kernel.org
Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
---

 arch/s390/kernel/entry.S   |1 +
 arch/s390/kernel/entry64.S |1 +
 2 files changed, 2 insertions(+)

diff -urpN linux-2.6/arch/s390/kernel/entry64.S 
linux-2.6-patched/arch/s390/kernel/entry64.S
--- linux-2.6/arch/s390/kernel/entry64.S2009-12-03 04:51:21.0 
+0100
+++ linux-2.6-patched/arch/s390/kernel/entry64.S2010-01-26 
14:04:58.0 +0100
@@ -549,6 +549,7 @@ pgm_svcper:
mvc __THREAD_per+__PER_access_id(1,%r8),__LC_PER_ACCESS_ID
oi  __TI_flags+7(%r9),_TIF_SINGLE_STEP # set TIF_SINGLE_STEP
TRACE_IRQS_ON
+   lmg %r2,%r6,SP_R2(%r15) # load svc arguments
stosm   __SF_EMPTY(%r15),0x03   # reenable interrupts
j   sysc_do_svc
 
diff -urpN linux-2.6/arch/s390/kernel/entry.S 
linux-2.6-patched/arch/s390/kernel/entry.S
--- linux-2.6/arch/s390/kernel/entry.S  2009-12-03 04:51:21.0 +0100
+++ linux-2.6-patched/arch/s390/kernel/entry.S  2010-01-26 14:04:58.0 
+0100
@@ -571,6 +571,7 @@ pgm_svcper:
mvc __THREAD_per+__PER_access_id(1,%r8),__LC_PER_ACCESS_ID
oi  __TI_flags+3(%r9),_TIF_SINGLE_STEP # set TIF_SINGLE_STEP
TRACE_IRQS_ON
+   lm  %r2,%r6,SP_R2(%r15) # load svc arguments
stosm   __SF_EMPTY(%r15),0x03   # reenable interrupts
b   BASED(sysc_do_svc)
 
-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-21 Thread Oleg Nesterov
On 01/07, Roland McGrath wrote:

  I am confused as well. Yes, I thought about regs-psw.mask change too,
  but I don't understand why it helps..
 [...]
  But. Acoording to the testing I did (unless I did something wrong
  again) this patch doesn't make any difference in this particular
  case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

 Those results are quite mysterious to me.
 I think we'll have to get Martin to sort it out definitively.

I did the testing again with 2.6.32-5.el6 + Martin's
c3311c13adc1021e986fef12609ceb395ffc5014
f8d5faf718c9ff2c04eb8484585d4963c4111cd7
patches.

the same test-case:

#include stdio.h
#include unistd.h
#include signal.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

if (!(pid = fork())) {
assert(ptrace(PTRACE_TRACEME) == 0);
kill(getpid(), SIGSTOP);

if (!fork())
return 43;

wait(status);
return WEXITSTATUS(status);
}


for (;;) {
assert(pid == wait(status));
if (WIFEXITED(status))
break;
assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
}

assert(WEXITSTATUS(status) == 43);
return 0;
}

with the simple debugging patch below I did

# perl -e 'syscall 172, 666, 0,0'; ./xxx
# perl -e 'syscall 172, 666, 0,1'; ./xxx
# perl -e 'syscall 172, 666, 1,0'; ./xxx
# perl -e 'syscall 172, 666, 1,1'; ./xxx

and dmesg reports:

XXX disable_step=0, clear_flag=0
XXX: xxx/1868 0
[... 799 times ...]
XXX disable_step=0, clear_flag=1
XXX: xxx/1905 0
[... 799 times ...]
XXX disable_step=1, clear_flag=0
XXX disable_step=1, clear_flag=1

Just in case, I did the testing with and without CONFIG_UTRACE,
result is the same.

IOW, copy_thread()-clear_tsk_thread_flag(TIF_SINGLE_STEP) doesn't
make any difference, copy_process()-user_disable_single_step() does.

Although I need to re-read Martin's explanations about psw magic,
perhaps this was already explained...

Oleg.

--- K/kernel/sys.c~ 2010-01-21 14:16:15.366639654 -0500
+++ K/kernel/sys.c  2010-01-21 14:30:35.131591879 -0500
@@ -1453,6 +1453,8 @@ SYSCALL_DEFINE1(umask, int, mask)
return mask;
 }
 
+int xxx_disable_step, xxx_clear_flag;
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
unsigned long, arg4, unsigned long, arg5)
 {
@@ -1466,6 +1468,13 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 
error = 0;
switch (option) {
+   case 666:
+   xxx_disable_step = arg2;
+   xxx_clear_flag = arg3;
+   printk(KERN_INFO XXX disable_step=%d, clear_flag=%d\n,
+   xxx_disable_step, xxx_clear_flag);
+   break;
+
case PR_SET_PDEATHSIG:
if (!valid_signal(arg2)) {
error = -EINVAL;
--- K/kernel/fork.c~2010-01-18 09:35:16.823811008 -0500
+++ K/kernel/fork.c 2010-01-21 14:29:39.131624971 -0500
@@ -964,6 +964,8 @@ static void posix_cpu_timers_init(struct
INIT_LIST_HEAD(tsk-cpu_timers[2]);
 }
 
+extern int xxx_disable_step;
+
 /*
  * This creates a new process as a copy of the old one,
  * but does not actually start it yet.
@@ -1207,7 +1209,8 @@ static struct task_struct *copy_process(
 * Syscall tracing and stepping should be turned off in the
 * child regardless of CLONE_PTRACE.
 */
-   user_disable_single_step(p);
+   if (xxx_disable_step)
+   user_disable_single_step(p);
clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
 #ifdef TIF_SYSCALL_EMU
clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
--- K/arch/s390/kernel/process.c~   2010-01-21 14:32:38.541609793 -0500
+++ K/arch/s390/kernel/process.c2010-01-21 14:34:10.461584130 -0500
@@ -161,6 +161,8 @@ void release_thread(struct task_struct *
 {
 }
 
+extern int xxx_clear_flag;
+
 int copy_thread(unsigned long clone_flags, unsigned long new_stackp,
unsigned long unused,
struct task_struct *p, struct pt_regs *regs)
@@ -217,7 +219,8 @@ int copy_thread(unsigned long clone_flag
p-thread.mm_segment = get_fs();
/* Don't copy debug registers */
memset(p-thread.per_info, 0, sizeof(p-thread.per_info));
-   clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
+   if (xxx_clear_flag)
+   clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
/* Initialize per thread user and system timer values */
ti = 

Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-08 Thread Martin Schwidefsky
On Thu,  7 Jan 2010 13:46:42 -0800 (PST)
Roland McGrath rol...@redhat.com wrote:

  Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
  not auto-attached by the tracer it is wrong to delivere SIGTRAP to
  the new process.
 
 The change is right, but this log entry is confusing.  auto-attached has
 nothing to do with it, nor does anything about tracing the new process or
 not.  The new process has not experienced a PER trap of its own, so it is
 wrong to deliver a SIGTRAP that is meant for its creator.

Ok, I changed the wording slightly:

Clear the TIF_SINGLE_STEP bit in copy_thread. The new process did not get
a PER event of its own. It is wrong deliver a SIGTRAP that was meant for
the parent process.

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-08 Thread Roland McGrath
 Ok, I changed the wording slightly:
 
 Clear the TIF_SINGLE_STEP bit in copy_thread. The new process did not get
 a PER event of its own. It is wrong deliver a SIGTRAP that was meant for
 the parent process.

Very good!

Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Oleg Nesterov
Martin, sorry for delay,

On 01/07, Martin Schwidefsky wrote:

 On Wed,  6 Jan 2010 13:13:29 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

   However, with or without CONFIG_UTRACE, 
   6580807da14c423f0d0a708108e6df6ebc8bc83d
   is needed on s390 too, otherwise the child gets unnecessary traps.
 
  This confuses me.  user_disable_single_step on non-current doesn't do
  anything not already done by the memset in copy_thread.  Ooh, except
  perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
  the only thing I can think of.  Maybe Martin can make sense of it.

I am confused as well. Yes, I thought about regs-psw.mask change too,
but I don't understand why it helps..

 The additional traps should not happen anymore with this patch:
 --
 Subject: [PATCH] clear TIF_SINGLE_STEP for new process.

 From: Martin Schwidefsky schwidef...@de.ibm.com

 Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
 not auto-attached by the tracer it is wrong to delivere SIGTRAP to
 the new process.

 Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
 ---

  arch/s390/kernel/process.c |1 +
  1 file changed, 1 insertion(+)

 diff -urpN linux-2.6/arch/s390/kernel/process.c 
 linux-2.6-patched/arch/s390/kernel/process.c
 --- linux-2.6/arch/s390/kernel/process.c  2009-12-03 04:51:21.0 
 +0100
 +++ linux-2.6-patched/arch/s390/kernel/process.c  2010-01-07 
 09:25:53.0 +0100
 @@ -217,6 +217,7 @@ int copy_thread(unsigned long clone_flag
   p-thread.mm_segment = get_fs();
   /* Don't copy debug registers */
   memset(p-thread.per_info, 0, sizeof(p-thread.per_info));
 + clear_tsk_thread_flag(p, TIF_SINGLE_STEP);

Even if I don't understand s390, I think this patch makes sense
anyway. Or, user_disable_single_step() can clear this bit.


But. Acoording to the testing I did (unless I did something wrong
again) this patch doesn't make any difference in this particular
case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

And. Please note that the test-case triggers 799 false step, but
TIF_SINGLE_STEP is surely cleared (by the caller) after the first
invocation of do_single_step().

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Oleg Nesterov
On 01/07, Martin Schwidefsky wrote:

 On Wed,  6 Jan 2010 13:08:12 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

  That's what tracehook_signal_handler is for.  You're both doing it yourself
  in the arch code (by setting TIF_SINGLE_STEP), and then telling the generic
  code to do it (by passing stepping=1 to tracehook_signal_handler).

 Ok, so with the full utrace the semantics of tracehook_signal_handler
 is more than just causing a SIGTRAP. It is an indication for a signal
 AND a SIGTRAP if single-stepping is active. To make both cases work we
 should stop setting TIF_SINGLE_STEP in do_signal and pass
 current-thread.per_info.single_step to tracehook_signal_handler
 instead of test_thread_flag(TIF_SINGLE_STEP).

Can't understand why do we need TIF_SINGLE_STEP at all.

Just pass current-thread.per_info.single_step to
tracehook_signal_handler() ?

Oleg.

--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -504,14 +504,8 @@ void do_signal(struct pt_regs *regs)
 * for a normal instruction, act like we took
 * one for the handler setup.
 */
-   if (current-thread.per_info.single_step)
-   set_thread_flag(TIF_SINGLE_STEP);
-
-   /*
-* Let tracing know that we've done the handler setup.
-*/
tracehook_signal_handler(signr, info, ka, regs,
-test_thread_flag(TIF_SINGLE_STEP));
+current-thread.per_info.single_step);
}
return;
}



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
  Right.  That means we should leave the status quo of not clearing
  TIF_SINGLE_STEP in user_disable_single_step.
 
 Ok, although it seems a bit strange not to do it. Perhaps I should add a
 comment about it.

It doesn't seem strange to me, but then I've just been through all this.
user_*_step is about what the task will do next.  TIF_SINGLE_STEP is about
what the task has done recently.  Of course more good comments always help.
I might be inclined to change the name of TIF_SINGLE_STEP so that its true
purpose is more obvious.  

AFAICT, in fact it is not even about single-step per se.  It means some PER
trap happened and should produce SIGTRAP.  Don't you get the same thing if
you haven't used single-step, but instead used PTRACE_POKEUSR to set up
per_struct with bits that say to trigger for some other reason?

How about calling it TIF_PER_PENDING?

 So after everthing has been converted to utrace we always will load the
 control registers in FixPerRegisters.

Right.  (This could well still change in the future.  But that's how it is
in utrace now.  And regardless of possible future implementation changes it
will always be the case that sometimes it will be called on current.)

 We could use the same compare of the control registers as the code in
 __switch_to. See below.

Yes, sounds good.

 The PSW_MASK_PER is the global PER enablement switch, the PER_EM_MASK
 bits enable the different PER events. We check for the PER_EM_MASK bits
 because it is easier to access in __switch_to. The return PSW is stored
 in the pt_regs structure, we would have to get a pointer to it (what
 regs = task_pt_regs(taks) does in FixPerRegisters). In
 FixPerRegisters we can as well use the PSW_MASK_PER bit.

I see.  Thanks for the explanation.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
 Hmm, command for tracehook_signal_handler say this for stepping:
 @stepping:   nonzero if debugger single-step or block-step in
 use

Are you saying you would like me to clarify that wording somehow?  It's
meant to be implicit that the arch code is not doing any special fakery
about single-step for signal handlers, only processing real single-step
traps (and faking them for a syscall instruction if the arch requires
that).  No other arch does it, so it didn't occur to me that s390 would.
Before tracehook some had ptrace_notify calls there, and the call to
tracehook_signal_handler replaced that call.

  In ptrace (including utrace-based ptrace), this winds up with sending a
  SIGTRAP.  So when we finally do get out of do_signal and TIF_SINGLE_STEP
  causes a second SIGTRAP, it's already pending and the second one makes no
  difference.
 
 So we have been lucky so far.

Actually, Oleg rightly points out:

 Confused again, perhaps I just misunderstood what you mean...
 
 Without utrace, tracehook_signal_handler() doesn't send SIGTRAP, it
 merely does ptrace_notify(SIGTRAP), this means that
[...]
 even without utrace we can have unexpected SIGTRAP.

That is quite true, and I just misremembered when writing that paragraph.

So indeed we have been lucky, but it's not the luck of the problem not
happening on s390, but the luck of nobody ever caring. :-)

 Ok, so with the full utrace the semantics of tracehook_signal_handler
 is more than just causing a SIGTRAP. It is an indication for a signal
 AND a SIGTRAP if single-stepping is active. 

In short, it is the indication of a signal handler having been set up, just
like its kerneldoc description says.  Whatever that should mean to tracing
(SIGTRAP or otherwise) is in the purview of the generic tracing layer, not
the arch layer.

 To make both cases work we
 should stop setting TIF_SINGLE_STEP in do_signal and pass
 current-thread.per_info.single_step to tracehook_signal_handler
 instead of test_thread_flag(TIF_SINGLE_STEP).

Correct.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
 Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
 not auto-attached by the tracer it is wrong to delivere SIGTRAP to
 the new process.

The change is right, but this log entry is confusing.  auto-attached has
nothing to do with it, nor does anything about tracing the new process or
not.  The new process has not experienced a PER trap of its own, so it is
wrong to deliver a SIGTRAP that is meant for its creator.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
 I am confused as well. Yes, I thought about regs-psw.mask change too,
 but I don't understand why it helps..
[...]
 But. Acoording to the testing I did (unless I did something wrong
 again) this patch doesn't make any difference in this particular
 case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

Those results are quite mysterious to me.  
I think we'll have to get Martin to sort it out definitively.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Heiko Carstens
On Tue, Jan 05, 2010 at 08:58:18PM +0100, Oleg Nesterov wrote:
 I decided to re-test this all with vanilla 2.6.33-rc2. It is really
 amazing how long it takes to recompile/install the kernel!

Then either you have a lot of steal time or an old machine (pre z10)?
You could also try to define more cpus if you have a virtual machine:
#cp define cpu from-to
...that is if your admin gave you permissions to do that.

 I spent
 the rest of this day fighting with this rhts machine. Result - it
 doesn't boot:
 
   00: zIPL v1.8.2-5.el6 interactive boot menu
   00:
   00:  0. default (2.6.33-rc2)
   00:
   00:  1. 2.6.33-rc2
   00:  2. 2.6.32.2-14.s390x.el6.s390x
   00:  3. 2.6.32.2-14.el6.s390x
   00:  4. linux
   00:
   00: Note: VM users please use '#cp vi vmsg input'
   00:
   00: Please choose (default will boot in 15 seconds):
   00: Booting default (2.6.33-rc2)...
Ý0011c4fc¨ sysc_return+0x0/0x8
Ý003cc0c6¨ selinux_sb_copy_data+0x17e/0x238
   (Ý003cbf94¨ selinux_sb_copy_data+0x4c/0x238)
Ý003b62a6¨ security_sb_copy_data+0x4e/0x60
Ý00280338¨ vfs_kern_mount+0x19c/0x1f4
Ý002803de¨ kern_mount_data+0x4e/0x5c
Ý00ae1908¨ devtmpfs_init+0x60/0xbc
Ý00ae1818¨ driver_init+0x28/0x6c
Ý00abe582¨ kernel_init+0x32a/0x3d8
Ý0010b8c2¨ kernel_thread_starter+0x6/0xc
Ý0010b8bc¨ kernel_thread_starter+0x0/0xc
   INFO: lockdep is turned off.
   00: HCPGIR450W CP entered; disabled wait PSW 00020001 8000  
 00114BD4
 
 Cai, any chance you can help? Using /usr/bin/console I can't choose another
 kernel at the 00: Please choose (default will boot in 15 seconds): stage,
 how can I do this???

You did enter something like
#cp vi vmsg 2
and not just '2'?



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread caiqian

 Cai, any chance you can help? Using /usr/bin/console I can't choose
 anotherkernel at the 00: Please choose (default will boot in 15 seconds):
 stage, how can I do this???

As Heiko mentioned, I did manage to enter,

#cp vi vmsg 2

during the prompt to get to the second kernel to boot...

Thanks,
CAI Qian



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Oleg Nesterov
On 01/06, caiq...@redhat.com wrote:

  Cai, any chance you can help? Using /usr/bin/console I can't choose
  anotherkernel at the 00: Please choose (default will boot in 15 seconds):
  stage, how can I do this???

 As Heiko mentioned, I did manage to enter,

 #cp vi vmsg 2

if only I new about this magic ;)

Thanks Cai and Heiko!

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Oleg Nesterov
On 01/05, Oleg Nesterov wrote:

 On 01/05, Oleg Nesterov wrote:
 
  On 01/05, Oleg Nesterov wrote:
  
   I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.
 
  Hmm. This patch
 
  --- kernel/fork.c~  2009-12-22 10:41:53.188084961 -0500
  +++ kernel/fork.c   2010-01-05 11:42:58.370636323 -0500
  @@ -1206,6 +1206,8 @@ static struct task_struct *copy_process(
   * of CLONE_PTRACE.
   */
  clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
  +   clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
  +   user_disable_single_step(p);
   #ifdef TIF_SYSCALL_EMU
  clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
   #endif
 
  doesn't help, I still see the same XXX's in dmesg...

 Oh, now I am totally confused.

 I reverted your fix from
 https://www.redhat.com/archives/utrace-devel/2010-January/msg6.html
 and now there is nothing in dmesg.

I take this back. I re-tested this all under 2.6.32.2 + utrace, and I
see nothing in dmesg. I don't know what I did wrong, most probably I
forgot to do zipl or something like this...

I'll try to summarize. Martin's patch
from https://www.redhat.com/archives/utrace-devel/2010-January/msg6.html
fixes the problems with utrace.


However, with or without CONFIG_UTRACE, 6580807da14c423f0d0a708108e6df6ebc8bc83d
is needed on s390 too, otherwise the child gets unnecessary traps.

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Oleg Nesterov
On 01/05, Martin Schwidefsky wrote:

 On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

   This probably means that copy_process()-user_disable_single_step()
   is not enough to clear the this task wants single-stepping copied
   from parent.
 
  I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
  single-step trap occurred.  This is what causes do_single_step to be
  called before returning to user mode, rather than the machine trap doing it
  directly as is done in the other arch implementations.

 Just my thinking as well.

  If I'm right, then this task wants single-stepping is not the problem,
  and that really is fully cleared.  In fact, looking at s390's copy_thread
  (arch/s390/kernel/process.c) it clears out all the state that is actually
  touched by user_enable_single_step and user_disable_single_step.  So for
  s390 the new fork.c call is actually superfluous AFAICT.

 /* Don't copy debug registers */
 memset(p-thread.per_info, 0, sizeof(p-thread.per_info));

 Yep, the call from fork.c is indeed superfluous.

I can't explain this, but if I remove copy_process()-user_disable_single_step()
the test-case below triggers XXX printk's from do_single_step() with or
without CONFIG_UTRACE. the patch is

--- arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
+++ arch/s390/kernel/traps.c2010-01-05 11:03:55.006487697 -0500
@@ -384,6 +384,9 @@ void __kprobes do_single_step(struct pt_
}
if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
+   else
+   printk(XXX: %s/%d %d\n, current-comm, current-pid,
+   test_thread_flag(TIF_SINGLE_STEP));
 }
 
 static void default_trap_handler(struct pt_regs * regs, long 
interruption_code)

Oleg.

#include stdio.h
#include unistd.h
#include signal.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

if (!(pid = fork())) {
assert(ptrace(PTRACE_TRACEME) == 0);
kill(getpid(), SIGSTOP);

if (!fork())
return 43;

wait(status);
return WEXITSTATUS(status);
}


for (;;) {
assert(pid == wait(status));
if (WIFEXITED(status))
break;
assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
}

assert(WEXITSTATUS(status) == 43);
return 0;
}




Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Roland McGrath
 However, with or without CONFIG_UTRACE, 
 6580807da14c423f0d0a708108e6df6ebc8bc83d
 is needed on s390 too, otherwise the child gets unnecessary traps.

This confuses me.  user_disable_single_step on non-current doesn't do
anything not already done by the memset in copy_thread.  Ooh, except
perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
the only thing I can think of.  Maybe Martin can make sense of it.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Martin Schwidefsky
On Mon, 4 Jan 2010 19:14:12 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/04, Martin Schwidefsky wrote:
  Subject: [PATCH] fix loading of PER control registers for utrace.
 
  From: Martin Schwidefsky schwidef...@de.ibm.com
 
  If the current task enables / disables PER tracing for itself the
  PER control registers need to be loaded in FixPerRegisters.
 
  Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
  ---
   arch/s390/kernel/ptrace.c |3 +++
   1 file changed, 3 insertions(+)
 
  --- a/arch/s390/kernel/ptrace.c
  +++ b/arch/s390/kernel/ptrace.c
  @@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
  per_info-control_regs.bits.storage_alt_space_ctl = 1;
  else
  per_info-control_regs.bits.storage_alt_space_ctl = 0;
  +
  +   if (task == current)
  +   __ctl_load(per_info-control_regs.words, 9, 11);
   }
 
 Yes it does fix the problem! Thanks a lot Martin.

Ok, I will add that patch to the git390 queue.

 However. Could you please look at 6580807da14c423f0d0a708108e6df6ebc8bc83d ?
 I am worried, perhaps this commit is not enough for s390. OK, do_single_step()
 tracehook_consider_fatal_signal(), this means the forked thread will not
 be killed by SIGTRAP if it is not auto-attached, but still this may be
 wrong.
 
 IOW. I think this problem is minor and probably can be ignored, but if
 I remove tracehook_consider_fatal_signal() check from do_single_step(),
 
 --- a/arch/s390/kernel/traps.c
 +++ b/arch/s390/kernel/traps.c
 @@ -382,8 +382,7 @@ void __kprobes do_single_step(struct pt_
   SIGTRAP) == NOTIFY_STOP){
   return;
   }
 - if (tracehook_consider_fatal_signal(current, SIGTRAP))
 - force_sig(SIGTRAP, current);
 + force_sig(SIGTRAP, current);
  }
 
  static void default_trap_handler(struct pt_regs * regs, long 
 interruption_code)
 ---
 
 then the test-case from 6580807da14c423f0d0a708108e6df6ebc8bc83d
 fails. This probably means that copy_process()-user_disable_single_step()
 is not enough to clear the this task wants single-stepping copied
 from parent.

user_disable_single_step() does not remove the TIF_SINGLE_STEP bit from the
forked task. Perhaps we should just clear the bit in the function.

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Oleg Nesterov
On 01/05, Martin Schwidefsky wrote:

 On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

   This probably means that copy_process()-user_disable_single_step()
   is not enough to clear the this task wants single-stepping copied
   from parent.
 
  I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
  single-step trap occurred.  This is what causes do_single_step to be
  called before returning to user mode, rather than the machine trap doing it
  directly as is done in the other arch implementations.

 Just my thinking as well.

Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
absolutely.

For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do

--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
clear_thread_flag(TIF_RESTORE_SIGMASK);
 
/*
-* If we would have taken a single-step trap
-* for a normal instruction, act like we took
-* one for the handler setup.
-*/
-   if (current-thread.per_info.single_step)
-   set_thread_flag(TIF_SINGLE_STEP);
-
-   /*
 * Let tracing know that we've done the handler 
setup.
 */
tracehook_signal_handler(signr, info, ka, 
regs,
-
test_thread_flag(TIF_SINGLE_STEP));
+   
current-thread.per_info.single_step);
}
return;
}

?

Apart from arch/s390/signal.c, TIF_SINGLE_STEP is used by entry.S
but I don't understand this asm at all.

Anyway. I modified the debugging patch a bit:

--- K/arch/s390/kernel/traps.c~ 2009-12-22 10:41:52.909174198 -0500
+++ K/arch/s390/kernel/traps.c  2010-01-05 09:49:19.541792379 -0500
@@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
}
if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
+   else
+   printk(XXX: %d %d\n, current-pid, 
test_thread_flag(TIF_SINGLE_STEP));
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)
---

Now, when I run this test-case

#include stdio.h
#include unistd.h
#include signal.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

if (!(pid = fork())) {
assert(ptrace(PTRACE_TRACEME) == 0);
kill(getpid(), SIGSTOP);

if (!fork())
return 43;

wait(status);
return WEXITSTATUS(status);
}


for (;;) {
assert(pid == wait(status));
if (WIFEXITED(status))
break;
assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
}

assert(WEXITSTATUS(status) == 43);
return 0;
}

dmesg shows 799 lines of

XXX: 2389 0


The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Martin Schwidefsky
On Tue, 5 Jan 2010 16:36:33 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/05, Martin Schwidefsky wrote:
 
  On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
  Roland McGrath rol...@redhat.com wrote:
 
This probably means that copy_process()-user_disable_single_step()
is not enough to clear the this task wants single-stepping copied
from parent.
  
   I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
   single-step trap occurred.  This is what causes do_single_step to be
   called before returning to user mode, rather than the machine trap doing 
   it
   directly as is done in the other arch implementations.
 
  Just my thinking as well.
 
 Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
 absolutely.
 
 For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
 
   --- a/arch/s390/kernel/signal.c
   +++ b/arch/s390/kernel/signal.c
   @@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
   clear_thread_flag(TIF_RESTORE_SIGMASK);

   /*
   -* If we would have taken a single-step trap
   -* for a normal instruction, act like we took
   -* one for the handler setup.
   -*/
   -   if (current-thread.per_info.single_step)
   -   set_thread_flag(TIF_SINGLE_STEP);
   -
   -   /*
* Let tracing know that we've done the handler 
 setup.
*/
   tracehook_signal_handler(signr, info, ka, 
 regs,
   -
 test_thread_flag(TIF_SINGLE_STEP));
   +   
 current-thread.per_info.single_step);
   }
   return;
   }
 
 ?

The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
want to be able to stop the debugged program before the first
instruction of the signal handler has been executed. The PER single
step causes a trap after an instruction has been executed. That first
instruction can do bad things to the arguments of the signal handler..

 Apart from arch/s390/signal.c, TIF_SINGLE_STEP is used by entry.S
 but I don't understand this asm at all.
 
 Anyway. I modified the debugging patch a bit:
 
 --- K/arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
 +++ K/arch/s390/kernel/traps.c2010-01-05 09:49:19.541792379 -0500
 @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
   }
   if (tracehook_consider_fatal_signal(current, SIGTRAP))
   force_sig(SIGTRAP, current);
 + else
 + printk(XXX: %d %d\n, current-pid, 
 test_thread_flag(TIF_SINGLE_STEP));
  }
 
  static void default_trap_handler(struct pt_regs * regs, long 
 interruption_code)
 ---
 
 Now, when I run this test-case
 
   #include stdio.h
   #include unistd.h
   #include signal.h
   #include sys/ptrace.h
   #include sys/wait.h
   #include assert.h
 
   int main(void)
   {
   int pid, status;
 
   if (!(pid = fork())) {
   assert(ptrace(PTRACE_TRACEME) == 0);
   kill(getpid(), SIGSTOP);
 
   if (!fork())
   return 43;
 
   wait(status);
   return WEXITSTATUS(status);
   }
 
 
   for (;;) {
   assert(pid == wait(status));
   if (WIFEXITED(status))
   break;
   assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
   }
 
   assert(WEXITSTATUS(status) == 43);
   return 0;
   }
 
 dmesg shows 799 lines of
 
   XXX: 2389 0
 
 
 The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

With or without my bug fix ?

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Oleg Nesterov
On 01/05, Oleg Nesterov wrote:

 Anyway. I modified the debugging patch a bit:

 --- K/arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
 +++ K/arch/s390/kernel/traps.c2010-01-05 09:49:19.541792379 -0500
 @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
   }
   if (tracehook_consider_fatal_signal(current, SIGTRAP))
   force_sig(SIGTRAP, current);
 + else
 + printk(XXX: %d %d\n, current-pid, 
 test_thread_flag(TIF_SINGLE_STEP));
  }

  static void default_trap_handler(struct pt_regs * regs, long 
 interruption_code)
 ---

Ah, please ignore. I guess TIF_SINGLE_STEP was already cleared by the caller
in entry.S

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Martin Schwidefsky
On Tue, 5 Jan 2010 16:47:25 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/05, Oleg Nesterov wrote:
 
  Anyway. I modified the debugging patch a bit:
 
  --- K/arch/s390/kernel/traps.c~ 2009-12-22 10:41:52.909174198 -0500
  +++ K/arch/s390/kernel/traps.c  2010-01-05 09:49:19.541792379 -0500
  @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
  }
  if (tracehook_consider_fatal_signal(current, SIGTRAP))
  force_sig(SIGTRAP, current);
  +   else
  +   printk(XXX: %d %d\n, current-pid, 
  test_thread_flag(TIF_SINGLE_STEP));
   }
 
   static void default_trap_handler(struct pt_regs * regs, long 
  interruption_code)
  ---
 
 Ah, please ignore. I guess TIF_SINGLE_STEP was already cleared by the caller
 in entry.S

Yes, TIF_SINGLE_STEP is checked in entry.S and cleared before do_signal
is called. That is the ni instruction at sysc_singlestep and
sysc_sigpending.

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Oleg Nesterov
On 01/05, Martin Schwidefsky wrote:

 On Tue, 5 Jan 2010 16:36:33 +0100
 Oleg Nesterov o...@redhat.com wrote:

  For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
  
  --- a/arch/s390/kernel/signal.c
  +++ b/arch/s390/kernel/signal.c
  @@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
  clear_thread_flag(TIF_RESTORE_SIGMASK);
   
  /*
  -* If we would have taken a single-step trap
  -* for a normal instruction, act like we took
  -* one for the handler setup.
  -*/
  -   if (current-thread.per_info.single_step)
  -   set_thread_flag(TIF_SINGLE_STEP);
  -
  -   /*
   * Let tracing know that we've done the handler 
  setup.
   */
  tracehook_signal_handler(signr, info, ka, 
  regs,
  -
  test_thread_flag(TIF_SINGLE_STEP));
  +   
  current-thread.per_info.single_step);
  }
  return;
  }
  
  ?

 The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
 want to be able to stop the debugged program before the first
 instruction of the signal handler has been executed. The PER single
 step causes a trap after an instruction has been executed. That first
 instruction can do bad things to the arguments of the signal handler..

Yes, but afaics all we need is to pass the correct int stepping arg
to tracehook_signal_handler(). If it is true, the tracee does
ptrace_notify() before it returns to user-mode.

  dmesg shows 799 lines of
 
  XXX: 2389 0
 
 
  The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

 With or without my bug fix ?

With, but please see another email.


I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Martin Schwidefsky
On Mon, 4 Jan 2010 16:52:25 +0100
Oleg Nesterov o...@redhat.com wrote:

 Hi!
 
 We have some strange problems with utrace on s390, and so far this _looks_
 like a s390 problem.
 
 Looks like, on any CPU user_enable_single_step() does not work until at
 least one thread with per_info.single_step = 1 does the context switch.
 
 This doesn't matter with the old ptrace implementation, but with utrace
 the tracee itself does user_enable_single_step(current) and returns to
 user-mode. Until it does at least one context switch the single-stepping
 doesn't work, after that everything works fine till the next reboot.

The PER control registers only get reloaded on task switch. Can you test
if this patch fixes your problem?

--
Subject: [PATCH] fix loading of PER control registers for utrace.

From: Martin Schwidefsky schwidef...@de.ibm.com

If the current task enables / disables PER tracing for itself the
PER control registers need to be loaded in FixPerRegisters.

Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
---
 arch/s390/kernel/ptrace.c |3 +++
 1 file changed, 3 insertions(+)

--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
per_info-control_regs.bits.storage_alt_space_ctl = 1;
else
per_info-control_regs.bits.storage_alt_space_ctl = 0;
+
+   if (task == current)
+   __ctl_load(per_info-control_regs.words, 9, 11);
 }
 
 void user_enable_single_step(struct task_struct *task)

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Oleg Nesterov
On 01/04, Oleg Nesterov wrote:

 IOW. I think this problem is minor and probably can be ignored,

Or may be not...

Even if the child is not killed by SIGTRAP, it can get a lot of
unnecessary traps.

To verify, I did another trivial patch (below), and the test
case from 6580807da14c423f0d0a708108e6df6ebc8bc83d does trigger
a lot of false step printks.

Hmm. And sometimes there is nothing in dmesg, but the test-case
needs a lot of time to complete. taskset -c seems to always
trigger printk's. Magic.

Oleg.

--- arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
+++ arch/s390/kernel/traps.c2010-01-04 13:19:51.038187586 -0500
@@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
}
if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
+   else
+   printk(false step\n);
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Roland McGrath
 The PER control registers only get reloaded on task switch. Can you test
 if this patch fixes your problem?

Long ago when I first worked with David Wilder on s390 arch code,
I remember we made this change.  It seems to have been forgotten
in the later rounds of reworking and merging.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Roland McGrath
 This probably means that copy_process()-user_disable_single_step()
 is not enough to clear the this task wants single-stepping copied
 from parent.

I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
single-step trap occurred.  This is what causes do_single_step to be
called before returning to user mode, rather than the machine trap doing it
directly as is done in the other arch implementations.

If I'm right, then this task wants single-stepping is not the problem,
and that really is fully cleared.  In fact, looking at s390's copy_thread
(arch/s390/kernel/process.c) it clears out all the state that is actually
touched by user_enable_single_step and user_disable_single_step.  So for
s390 the new fork.c call is actually superfluous AFAICT.

The problem is that the copied parent state includes the this task has a
pending single-step to report flag.  IMHO it clearly makes sense for
s390's copy_thread to clear this flag in a new task, which it does not do now.

An alternative to that would be just to make its user_disable_single_step
clear the flag.  That could in theory also have an effect on e.g. the
(authentic) pending step report of a tracee that was stopped with
TIF_SINGLE_STEP set when its tracer detached.  This might be considered a
good thing, but since every other arch posts the SIGTRAP immediately they
all have the equivalent issue and s390 doesn't need to be any better than
they are before we have a generic resolution to the whole subject of
tracer-induced signals (which we won't get into now).  I'm not even sure
from my insufficient reading of the s390 assembly code whether this path is
even possible, i.e. do_signal called before do_single_step.

Martin, I suggest having copy_thread clear TIF_SINGLE_STEP.
That bit is always task-private state that should not be copied.

Btw, given the complexity of FixPerRegisters (and its new additional cost
on task==current), you might want to make user_*_single_step bail out if
per_info.single_step is already set/clear on entry.


Thanks,
Roland



Re: odd utrace testing results on s390x

2009-12-24 Thread caiqian

 Could you please re-test with this patch applied?

It turned out that this patch did not make much difference. the step-simple is 
still failing with the patch applied. It could be reproduced a few times after 
a fresh reboot. The test exited with 1 here,

  /* Known bug in 2.6.28-rc7 + utrace patch:
   * child was left to run freely, and exited
   * Deterministic (happens even with NUM_SINGLESTEPS = 1)
   */
  if (WIFEXITED (status))
{
  VERBOSE(PTRACE_SINGLESTEP did not stop (step #%d)\n, i+1);
  assert (WEXITSTATUS (status) == 42);
  exit (1);
}

Here was the strace output when failure.

# strace ./step-simple
execve(./step-simple, [./step-simple], [/* 28 vars */]) = 0
brk(0)  = 0x80003000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x2027000
access(/etc/ld.so.preload, R_OK)  = -1 ENOENT (No such file or directory)
open(/etc/ld.so.cache, O_RDONLY)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=44711, ...}) = 0
mmap(NULL, 44711, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2028000
close(3)= 0
open(/lib64/libc.so.6, O_RDONLY)  = 3
read(3, 
\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\26\0\0\0\1\0\0\0\0\0\2\10\364\0..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2703224, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x2033000
mmap(NULL, 1729920, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x2034000
mmap(0x21d1000, 20480, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19c000) = 0x21d1000
mmap(0x21d6000, 17792, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x21d6000
close(3)= 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x21db000
mprotect(0x21d1000, 16384, PROT_READ) = 0
mprotect(0x2022000, 4096, PROT_READ) = 0
munmap(0x2028000, 44711)= 0
rt_sigaction(SIGABRT, {0x8e7c, [ABRT], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGINT, {0x8e7c, [INT], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGALRM, {0x8e7c, [ALRM], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
alarm(5)= 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x21db7c0) = 2026
wait4(2026, [{WIFSTOPPED(s)  WSTOPSIG(s) == SIGSTOP}], 0, NULL) = 2026
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_SINGLESTEP, 2026, 0, SIG_0) = 0
wait4(2026, [{WIFEXITED(s)  WEXITSTATUS(s) == 42}], 0, NULL) = 2026
--- SIGCHLD (Child exited) @ 0 (0) ---
kill(2026, SIGKILL) = -1 ESRCH (No such process)
wait4(-1, NULL, __WALL, NULL)   = -1 ECHILD (No child processes)
exit_group(1)

Also, I could not reproduce the problem in kernels without utrace.
Thanks,
CAI Qian



Re: odd utrace testing results on s390x

2009-12-22 Thread Oleg Nesterov
On 12/22, caiq...@redhat.com wrote:

 The following are testing results on s390x kernels build from the source,

 http://kojipkgs.fedoraproject.org/packages/kernel/2.6.32.2/14.fc13/src/kernel-2.6.32.2-14.fc13.src.rpm

 without and with CONFIG_UTRACE using the latest ptrace-utrace git tree.

 ptrace testsuite: looks like step-simple is starting to fail,

Damn, my fault. I forgot to cc you when I sent the fix for s390 (attached
below), and I forgot to remind you about this fix when we discussed the
testing on s390.

Could you please re-test with this patch applied?

 and a different syscall number when syscall-from-clone failed.

This is not clear to me, will take a look.

Thanks!

Oleg.

-
Untested, but hopefully trivial enough and should't change the
compiled code.

Nobody except ptrace itself should use task-ptrace or PT_PTRACED
directly, change arch/s390/kernel/traps.c to use the helper.

Signed-off-by: Oleg Nesterov o...@redhat.com
Acked-by: Roland McGrath rol...@redhat.com
---

 arch/s390/kernel/traps.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- V1/arch/s390/kernel/traps.c~S390_DONT_ABUSE_PT_PTRACED  2009-04-06 
00:03:36.0 +0200
+++ V1/arch/s390/kernel/traps.c 2009-12-09 20:31:49.0 +0100
@@ -18,7 +18,7 @@
 #include linux/kernel.h
 #include linux/string.h
 #include linux/errno.h
-#include linux/ptrace.h
+#include linux/tracehook.h
 #include linux/timer.h
 #include linux/mm.h
 #include linux/smp.h
@@ -382,7 +382,7 @@ void __kprobes do_single_step(struct pt_
SIGTRAP) == NOTIFY_STOP){
return;
}
-   if ((current-ptrace  PT_PTRACED) != 0)
+   if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
 }
 
@@ -483,7 +483,7 @@ static void illegal_op(struct pt_regs * 
if (get_user(*((__u16 *) opcode), (__u16 __user *) location))
return;
if (*((__u16 *) opcode) == S390_BREAKPOINT_U16) {
-   if (current-ptrace  PT_PTRACED)
+   if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
else
signal = SIGILL;



Re: odd utrace testing results on s390x

2009-12-22 Thread Oleg Nesterov
On 12/22, Oleg Nesterov wrote:

 On 12/22, caiq...@redhat.com wrote:
 
  and a different syscall number when syscall-from-clone failed.

 This is not clear to me, will take a look.

Should I say I know nothing about s390 and can't read its asm?

First of all, I think syscall-from-clone is wrong on s390 and
needs a fix,

#elif defined __s390__
# define REGISTER_CALLNOoffsetof (struct user_regs_struct, 
orig_gpr2)
# define REGISTER_RETVALoffsetof (struct user_regs_struct, 
gprs[2])

This doesn't look right. I did a simple test-case (below), and
with or without utrace it prints

syscall=1234 ret=6  - __NR_close
syscall=1234 ret=-9 - -EBADF, this is correct

Apparently, 1234 is the argument for close(1234).

So: it is not clear to me how to get callno on SYSCALL_EXIT,
and I don't know whether it is OK or not that syscall-from-clone
sees different -orig_gpr2 after return from fork() on s390

-unexpected syscall 0x5ee9   without utrace
+unexpected syscall 0    with utrace

syscall_get_nr() returns regs-svcnr, but there is no svcnr in
user_regs_struct on s390.

Still investigating...

Oleg.

#include stdio.h
#include unistd.h
#include stdlib.h
#include signal.h
#include stddef.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h
#include sys/user.h
#include asm/ptrace.h


#ifdef __s390__
#define REGISTER_CALLNO offsetof(struct user_regs_struct, orig_gpr2)
#define REGISTER_RETVAL offsetof(struct user_regs_struct, gprs[2])
#else
#define REGISTER_CALLNO offsetof(struct user_regs_struct, orig_rax)
#define REGISTER_RETVAL offsetof(struct user_regs_struct, rax)
#endif

int main(void)
{
int pid, status, i;
long scn, ret;

pid = fork();
if (!pid) {
assert(ptrace(PTRACE_TRACEME, 0, 0, 0) == 0);
kill(getpid(), SIGUSR1);

close(1234);

return 0x66;
}

assert(pid == wait(status));
assert(WIFSTOPPED(status)  WSTOPSIG(status) == SIGUSR1);
assert(ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACESYSGOOD) == 0);

for (i = 0; i  2; ++i) {
ptrace(PTRACE_SYSCALL, pid, 0,0);
assert(pid == wait(status));
assert(status == 0x857f);

scn = ptrace(PTRACE_PEEKUSER, pid, REGISTER_CALLNO, NULL);
ret = ptrace(PTRACE_PEEKUSER, pid, REGISTER_RETVAL, NULL);

printf(syscall=%ld ret=%ld\n, scn, ret);

}

kill(pid, SIGKILL);

return 0;
}



Re: odd utrace testing results on s390x

2009-12-22 Thread Oleg Nesterov
On 12/22, Oleg Nesterov wrote:

 On 12/22, Oleg Nesterov wrote:
 
 and I don't know whether it is OK or not that syscall-from-clone
 sees different -orig_gpr2 after return from fork() on s390

   -unexpected syscall 0x5ee9   without utrace
   +unexpected syscall 0    with utrace

AARGH. I misread the diff you reported, the difference above
has nothing to do with syscall-from-clone! This is the output
form the next test-case, step-from-clone.

I think, everything is (almost) clear now, I'll try to summarize.
Both syscall-from-clone and step-from-clone fail, with or without
utrace. I think they both need the fix, REGISTER_CALLNO and
REGISTER_RETVAL are wrong on s390. (I also tested my trivial
test-case on rhel5 and it prints the same).

Now, the difference above _can_ be explained because utrace
kernel lacks ca633fd006486ed2c2d3b542283067aab61e6dc8 (Cai,
I attached this patch in the first reply), or we have issues
with utrace on s390.

without utrace:

-unexpected syscall 0x5ee9

this is in fact grandchild's pid == fork's retval because
REGISTER_CALLNO is wrong

with utrace:

+unexpected syscall 0

well, we are reading orig_gpr2 which is wrong anyway, but I am
worried because cat /proc/child/syscall reports

-1 0x3ff21d8 0x20f1e52

_Perhaps_ this all will be fixed by ca633fd006486ed2c2d3b542283067aab61e6dc8,
but I am not sure. This trap was (I think) generated by
ptrace_report_signal(), it may happen that so s390 doesn't preserve
some registers when we dequeue SIGTRAP after do_notify_resume()-utrace_stop()
and call utrace_stop() again.



Oh. I am still trying to parse arch/s390/kernel/entry.S to understand
how can we fix these test-cases. I think I need the help, will continue
tomorrow.

Oleg.



Re: odd utrace testing results on s390x

2009-12-22 Thread Roland McGrath
 Damn, my fault. I forgot to cc you when I sent the fix for s390 (attached
 below), and I forgot to remind you about this fix when we discussed the
 testing on s390.

That change is upstream for 2.6.33 now.  I'll cherry-picked it into 
the 2.6.32/tracehook branch so it will be in the backport patchset too.


Thanks,
Roland



Re: odd utrace testing results on s390x

2009-12-22 Thread Roland McGrath
 Oh. I am still trying to parse arch/s390/kernel/entry.S to understand
 how can we fix these test-cases. I think I need the help, will continue
 tomorrow.

Martin Schwidefsky schwidef...@de.ibm.com is the s390 arch maintainer.
He is friendly and helpful.  You can ask him for help both with
understanding the intended s390 behavior before, and with understanding the
code paths.  He won't expect any of us to grok s390 assembly. :-)


Thanks,
Roland