Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-26 Thread Martin Schwidefsky
On Thu, 21 Jan 2010 21:51:13 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/07, Roland McGrath wrote:
 
   I am confused as well. Yes, I thought about regs-psw.mask change too,
   but I don't understand why it helps..
  [...]
   But. Acoording to the testing I did (unless I did something wrong
   again) this patch doesn't make any difference in this particular
   case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.
 
  Those results are quite mysterious to me.
  I think we'll have to get Martin to sort it out definitively.

Finally nailed that one. Grrmpf.. the special case in the program check
handler for single stepped svcs clobbers the argument registers. With our
test case this affects the clone() system call. Funny things happen when
the clone_flags argument is more or less random ..
The following patch fixes the problem for me.

--
Subject: [PATCH] fix single stepped svcs with TRACE_IRQFLAGS=y

From: Martin Schwidefsky schwidef...@de.ibm.com

If irq flags tracing is enabled the TRACE_IRQS_ON macros expands to
a function call which clobbers registers %r0-%r5. The macro is used
in the code path for single stepped system calls. The argument
registers %r2-%r6 need to be restored from the stack before the system
call function is called.

Cc: sta...@kernel.org
Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
---

 arch/s390/kernel/entry.S   |1 +
 arch/s390/kernel/entry64.S |1 +
 2 files changed, 2 insertions(+)

diff -urpN linux-2.6/arch/s390/kernel/entry64.S 
linux-2.6-patched/arch/s390/kernel/entry64.S
--- linux-2.6/arch/s390/kernel/entry64.S2009-12-03 04:51:21.0 
+0100
+++ linux-2.6-patched/arch/s390/kernel/entry64.S2010-01-26 
14:04:58.0 +0100
@@ -549,6 +549,7 @@ pgm_svcper:
mvc __THREAD_per+__PER_access_id(1,%r8),__LC_PER_ACCESS_ID
oi  __TI_flags+7(%r9),_TIF_SINGLE_STEP # set TIF_SINGLE_STEP
TRACE_IRQS_ON
+   lmg %r2,%r6,SP_R2(%r15) # load svc arguments
stosm   __SF_EMPTY(%r15),0x03   # reenable interrupts
j   sysc_do_svc
 
diff -urpN linux-2.6/arch/s390/kernel/entry.S 
linux-2.6-patched/arch/s390/kernel/entry.S
--- linux-2.6/arch/s390/kernel/entry.S  2009-12-03 04:51:21.0 +0100
+++ linux-2.6-patched/arch/s390/kernel/entry.S  2010-01-26 14:04:58.0 
+0100
@@ -571,6 +571,7 @@ pgm_svcper:
mvc __THREAD_per+__PER_access_id(1,%r8),__LC_PER_ACCESS_ID
oi  __TI_flags+3(%r9),_TIF_SINGLE_STEP # set TIF_SINGLE_STEP
TRACE_IRQS_ON
+   lm  %r2,%r6,SP_R2(%r15) # load svc arguments
stosm   __SF_EMPTY(%r15),0x03   # reenable interrupts
b   BASED(sysc_do_svc)
 
-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-21 Thread Oleg Nesterov
On 01/07, Roland McGrath wrote:

  I am confused as well. Yes, I thought about regs-psw.mask change too,
  but I don't understand why it helps..
 [...]
  But. Acoording to the testing I did (unless I did something wrong
  again) this patch doesn't make any difference in this particular
  case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

 Those results are quite mysterious to me.
 I think we'll have to get Martin to sort it out definitively.

I did the testing again with 2.6.32-5.el6 + Martin's
c3311c13adc1021e986fef12609ceb395ffc5014
f8d5faf718c9ff2c04eb8484585d4963c4111cd7
patches.

the same test-case:

#include stdio.h
#include unistd.h
#include signal.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

if (!(pid = fork())) {
assert(ptrace(PTRACE_TRACEME) == 0);
kill(getpid(), SIGSTOP);

if (!fork())
return 43;

wait(status);
return WEXITSTATUS(status);
}


for (;;) {
assert(pid == wait(status));
if (WIFEXITED(status))
break;
assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
}

assert(WEXITSTATUS(status) == 43);
return 0;
}

with the simple debugging patch below I did

# perl -e 'syscall 172, 666, 0,0'; ./xxx
# perl -e 'syscall 172, 666, 0,1'; ./xxx
# perl -e 'syscall 172, 666, 1,0'; ./xxx
# perl -e 'syscall 172, 666, 1,1'; ./xxx

and dmesg reports:

XXX disable_step=0, clear_flag=0
XXX: xxx/1868 0
[... 799 times ...]
XXX disable_step=0, clear_flag=1
XXX: xxx/1905 0
[... 799 times ...]
XXX disable_step=1, clear_flag=0
XXX disable_step=1, clear_flag=1

Just in case, I did the testing with and without CONFIG_UTRACE,
result is the same.

IOW, copy_thread()-clear_tsk_thread_flag(TIF_SINGLE_STEP) doesn't
make any difference, copy_process()-user_disable_single_step() does.

Although I need to re-read Martin's explanations about psw magic,
perhaps this was already explained...

Oleg.

--- K/kernel/sys.c~ 2010-01-21 14:16:15.366639654 -0500
+++ K/kernel/sys.c  2010-01-21 14:30:35.131591879 -0500
@@ -1453,6 +1453,8 @@ SYSCALL_DEFINE1(umask, int, mask)
return mask;
 }
 
+int xxx_disable_step, xxx_clear_flag;
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
unsigned long, arg4, unsigned long, arg5)
 {
@@ -1466,6 +1468,13 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 
error = 0;
switch (option) {
+   case 666:
+   xxx_disable_step = arg2;
+   xxx_clear_flag = arg3;
+   printk(KERN_INFO XXX disable_step=%d, clear_flag=%d\n,
+   xxx_disable_step, xxx_clear_flag);
+   break;
+
case PR_SET_PDEATHSIG:
if (!valid_signal(arg2)) {
error = -EINVAL;
--- K/kernel/fork.c~2010-01-18 09:35:16.823811008 -0500
+++ K/kernel/fork.c 2010-01-21 14:29:39.131624971 -0500
@@ -964,6 +964,8 @@ static void posix_cpu_timers_init(struct
INIT_LIST_HEAD(tsk-cpu_timers[2]);
 }
 
+extern int xxx_disable_step;
+
 /*
  * This creates a new process as a copy of the old one,
  * but does not actually start it yet.
@@ -1207,7 +1209,8 @@ static struct task_struct *copy_process(
 * Syscall tracing and stepping should be turned off in the
 * child regardless of CLONE_PTRACE.
 */
-   user_disable_single_step(p);
+   if (xxx_disable_step)
+   user_disable_single_step(p);
clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
 #ifdef TIF_SYSCALL_EMU
clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
--- K/arch/s390/kernel/process.c~   2010-01-21 14:32:38.541609793 -0500
+++ K/arch/s390/kernel/process.c2010-01-21 14:34:10.461584130 -0500
@@ -161,6 +161,8 @@ void release_thread(struct task_struct *
 {
 }
 
+extern int xxx_clear_flag;
+
 int copy_thread(unsigned long clone_flags, unsigned long new_stackp,
unsigned long unused,
struct task_struct *p, struct pt_regs *regs)
@@ -217,7 +219,8 @@ int copy_thread(unsigned long clone_flag
p-thread.mm_segment = get_fs();
/* Don't copy debug registers */
memset(p-thread.per_info, 0, sizeof(p-thread.per_info));
-   clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
+   if (xxx_clear_flag)
+   clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
/* Initialize per thread user and system timer values */
ti = 

Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-08 Thread Martin Schwidefsky
On Thu,  7 Jan 2010 13:46:42 -0800 (PST)
Roland McGrath rol...@redhat.com wrote:

  Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
  not auto-attached by the tracer it is wrong to delivere SIGTRAP to
  the new process.
 
 The change is right, but this log entry is confusing.  auto-attached has
 nothing to do with it, nor does anything about tracing the new process or
 not.  The new process has not experienced a PER trap of its own, so it is
 wrong to deliver a SIGTRAP that is meant for its creator.

Ok, I changed the wording slightly:

Clear the TIF_SINGLE_STEP bit in copy_thread. The new process did not get
a PER event of its own. It is wrong deliver a SIGTRAP that was meant for
the parent process.

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-08 Thread Roland McGrath
 Ok, I changed the wording slightly:
 
 Clear the TIF_SINGLE_STEP bit in copy_thread. The new process did not get
 a PER event of its own. It is wrong deliver a SIGTRAP that was meant for
 the parent process.

Very good!

Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Oleg Nesterov
Martin, sorry for delay,

On 01/07, Martin Schwidefsky wrote:

 On Wed,  6 Jan 2010 13:13:29 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

   However, with or without CONFIG_UTRACE, 
   6580807da14c423f0d0a708108e6df6ebc8bc83d
   is needed on s390 too, otherwise the child gets unnecessary traps.
 
  This confuses me.  user_disable_single_step on non-current doesn't do
  anything not already done by the memset in copy_thread.  Ooh, except
  perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
  the only thing I can think of.  Maybe Martin can make sense of it.

I am confused as well. Yes, I thought about regs-psw.mask change too,
but I don't understand why it helps..

 The additional traps should not happen anymore with this patch:
 --
 Subject: [PATCH] clear TIF_SINGLE_STEP for new process.

 From: Martin Schwidefsky schwidef...@de.ibm.com

 Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
 not auto-attached by the tracer it is wrong to delivere SIGTRAP to
 the new process.

 Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
 ---

  arch/s390/kernel/process.c |1 +
  1 file changed, 1 insertion(+)

 diff -urpN linux-2.6/arch/s390/kernel/process.c 
 linux-2.6-patched/arch/s390/kernel/process.c
 --- linux-2.6/arch/s390/kernel/process.c  2009-12-03 04:51:21.0 
 +0100
 +++ linux-2.6-patched/arch/s390/kernel/process.c  2010-01-07 
 09:25:53.0 +0100
 @@ -217,6 +217,7 @@ int copy_thread(unsigned long clone_flag
   p-thread.mm_segment = get_fs();
   /* Don't copy debug registers */
   memset(p-thread.per_info, 0, sizeof(p-thread.per_info));
 + clear_tsk_thread_flag(p, TIF_SINGLE_STEP);

Even if I don't understand s390, I think this patch makes sense
anyway. Or, user_disable_single_step() can clear this bit.


But. Acoording to the testing I did (unless I did something wrong
again) this patch doesn't make any difference in this particular
case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

And. Please note that the test-case triggers 799 false step, but
TIF_SINGLE_STEP is surely cleared (by the caller) after the first
invocation of do_single_step().

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Oleg Nesterov
On 01/07, Martin Schwidefsky wrote:

 On Wed,  6 Jan 2010 13:08:12 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

  That's what tracehook_signal_handler is for.  You're both doing it yourself
  in the arch code (by setting TIF_SINGLE_STEP), and then telling the generic
  code to do it (by passing stepping=1 to tracehook_signal_handler).

 Ok, so with the full utrace the semantics of tracehook_signal_handler
 is more than just causing a SIGTRAP. It is an indication for a signal
 AND a SIGTRAP if single-stepping is active. To make both cases work we
 should stop setting TIF_SINGLE_STEP in do_signal and pass
 current-thread.per_info.single_step to tracehook_signal_handler
 instead of test_thread_flag(TIF_SINGLE_STEP).

Can't understand why do we need TIF_SINGLE_STEP at all.

Just pass current-thread.per_info.single_step to
tracehook_signal_handler() ?

Oleg.

--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -504,14 +504,8 @@ void do_signal(struct pt_regs *regs)
 * for a normal instruction, act like we took
 * one for the handler setup.
 */
-   if (current-thread.per_info.single_step)
-   set_thread_flag(TIF_SINGLE_STEP);
-
-   /*
-* Let tracing know that we've done the handler setup.
-*/
tracehook_signal_handler(signr, info, ka, regs,
-test_thread_flag(TIF_SINGLE_STEP));
+current-thread.per_info.single_step);
}
return;
}



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
  Right.  That means we should leave the status quo of not clearing
  TIF_SINGLE_STEP in user_disable_single_step.
 
 Ok, although it seems a bit strange not to do it. Perhaps I should add a
 comment about it.

It doesn't seem strange to me, but then I've just been through all this.
user_*_step is about what the task will do next.  TIF_SINGLE_STEP is about
what the task has done recently.  Of course more good comments always help.
I might be inclined to change the name of TIF_SINGLE_STEP so that its true
purpose is more obvious.  

AFAICT, in fact it is not even about single-step per se.  It means some PER
trap happened and should produce SIGTRAP.  Don't you get the same thing if
you haven't used single-step, but instead used PTRACE_POKEUSR to set up
per_struct with bits that say to trigger for some other reason?

How about calling it TIF_PER_PENDING?

 So after everthing has been converted to utrace we always will load the
 control registers in FixPerRegisters.

Right.  (This could well still change in the future.  But that's how it is
in utrace now.  And regardless of possible future implementation changes it
will always be the case that sometimes it will be called on current.)

 We could use the same compare of the control registers as the code in
 __switch_to. See below.

Yes, sounds good.

 The PSW_MASK_PER is the global PER enablement switch, the PER_EM_MASK
 bits enable the different PER events. We check for the PER_EM_MASK bits
 because it is easier to access in __switch_to. The return PSW is stored
 in the pt_regs structure, we would have to get a pointer to it (what
 regs = task_pt_regs(taks) does in FixPerRegisters). In
 FixPerRegisters we can as well use the PSW_MASK_PER bit.

I see.  Thanks for the explanation.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
 Hmm, command for tracehook_signal_handler say this for stepping:
 @stepping:   nonzero if debugger single-step or block-step in
 use

Are you saying you would like me to clarify that wording somehow?  It's
meant to be implicit that the arch code is not doing any special fakery
about single-step for signal handlers, only processing real single-step
traps (and faking them for a syscall instruction if the arch requires
that).  No other arch does it, so it didn't occur to me that s390 would.
Before tracehook some had ptrace_notify calls there, and the call to
tracehook_signal_handler replaced that call.

  In ptrace (including utrace-based ptrace), this winds up with sending a
  SIGTRAP.  So when we finally do get out of do_signal and TIF_SINGLE_STEP
  causes a second SIGTRAP, it's already pending and the second one makes no
  difference.
 
 So we have been lucky so far.

Actually, Oleg rightly points out:

 Confused again, perhaps I just misunderstood what you mean...
 
 Without utrace, tracehook_signal_handler() doesn't send SIGTRAP, it
 merely does ptrace_notify(SIGTRAP), this means that
[...]
 even without utrace we can have unexpected SIGTRAP.

That is quite true, and I just misremembered when writing that paragraph.

So indeed we have been lucky, but it's not the luck of the problem not
happening on s390, but the luck of nobody ever caring. :-)

 Ok, so with the full utrace the semantics of tracehook_signal_handler
 is more than just causing a SIGTRAP. It is an indication for a signal
 AND a SIGTRAP if single-stepping is active. 

In short, it is the indication of a signal handler having been set up, just
like its kerneldoc description says.  Whatever that should mean to tracing
(SIGTRAP or otherwise) is in the purview of the generic tracing layer, not
the arch layer.

 To make both cases work we
 should stop setting TIF_SINGLE_STEP in do_signal and pass
 current-thread.per_info.single_step to tracehook_signal_handler
 instead of test_thread_flag(TIF_SINGLE_STEP).

Correct.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
 Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
 not auto-attached by the tracer it is wrong to delivere SIGTRAP to
 the new process.

The change is right, but this log entry is confusing.  auto-attached has
nothing to do with it, nor does anything about tracing the new process or
not.  The new process has not experienced a PER trap of its own, so it is
wrong to deliver a SIGTRAP that is meant for its creator.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-07 Thread Roland McGrath
 I am confused as well. Yes, I thought about regs-psw.mask change too,
 but I don't understand why it helps..
[...]
 But. Acoording to the testing I did (unless I did something wrong
 again) this patch doesn't make any difference in this particular
 case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

Those results are quite mysterious to me.  
I think we'll have to get Martin to sort it out definitively.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Heiko Carstens
On Tue, Jan 05, 2010 at 08:58:18PM +0100, Oleg Nesterov wrote:
 I decided to re-test this all with vanilla 2.6.33-rc2. It is really
 amazing how long it takes to recompile/install the kernel!

Then either you have a lot of steal time or an old machine (pre z10)?
You could also try to define more cpus if you have a virtual machine:
#cp define cpu from-to
...that is if your admin gave you permissions to do that.

 I spent
 the rest of this day fighting with this rhts machine. Result - it
 doesn't boot:
 
   00: zIPL v1.8.2-5.el6 interactive boot menu
   00:
   00:  0. default (2.6.33-rc2)
   00:
   00:  1. 2.6.33-rc2
   00:  2. 2.6.32.2-14.s390x.el6.s390x
   00:  3. 2.6.32.2-14.el6.s390x
   00:  4. linux
   00:
   00: Note: VM users please use '#cp vi vmsg input'
   00:
   00: Please choose (default will boot in 15 seconds):
   00: Booting default (2.6.33-rc2)...
Ý0011c4fc¨ sysc_return+0x0/0x8
Ý003cc0c6¨ selinux_sb_copy_data+0x17e/0x238
   (Ý003cbf94¨ selinux_sb_copy_data+0x4c/0x238)
Ý003b62a6¨ security_sb_copy_data+0x4e/0x60
Ý00280338¨ vfs_kern_mount+0x19c/0x1f4
Ý002803de¨ kern_mount_data+0x4e/0x5c
Ý00ae1908¨ devtmpfs_init+0x60/0xbc
Ý00ae1818¨ driver_init+0x28/0x6c
Ý00abe582¨ kernel_init+0x32a/0x3d8
Ý0010b8c2¨ kernel_thread_starter+0x6/0xc
Ý0010b8bc¨ kernel_thread_starter+0x0/0xc
   INFO: lockdep is turned off.
   00: HCPGIR450W CP entered; disabled wait PSW 00020001 8000  
 00114BD4
 
 Cai, any chance you can help? Using /usr/bin/console I can't choose another
 kernel at the 00: Please choose (default will boot in 15 seconds): stage,
 how can I do this???

You did enter something like
#cp vi vmsg 2
and not just '2'?



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread caiqian

 Cai, any chance you can help? Using /usr/bin/console I can't choose
 anotherkernel at the 00: Please choose (default will boot in 15 seconds):
 stage, how can I do this???

As Heiko mentioned, I did manage to enter,

#cp vi vmsg 2

during the prompt to get to the second kernel to boot...

Thanks,
CAI Qian



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Oleg Nesterov
On 01/06, caiq...@redhat.com wrote:

  Cai, any chance you can help? Using /usr/bin/console I can't choose
  anotherkernel at the 00: Please choose (default will boot in 15 seconds):
  stage, how can I do this???

 As Heiko mentioned, I did manage to enter,

 #cp vi vmsg 2

if only I new about this magic ;)

Thanks Cai and Heiko!

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Oleg Nesterov
On 01/05, Oleg Nesterov wrote:

 On 01/05, Oleg Nesterov wrote:
 
  On 01/05, Oleg Nesterov wrote:
  
   I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.
 
  Hmm. This patch
 
  --- kernel/fork.c~  2009-12-22 10:41:53.188084961 -0500
  +++ kernel/fork.c   2010-01-05 11:42:58.370636323 -0500
  @@ -1206,6 +1206,8 @@ static struct task_struct *copy_process(
   * of CLONE_PTRACE.
   */
  clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
  +   clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
  +   user_disable_single_step(p);
   #ifdef TIF_SYSCALL_EMU
  clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
   #endif
 
  doesn't help, I still see the same XXX's in dmesg...

 Oh, now I am totally confused.

 I reverted your fix from
 https://www.redhat.com/archives/utrace-devel/2010-January/msg6.html
 and now there is nothing in dmesg.

I take this back. I re-tested this all under 2.6.32.2 + utrace, and I
see nothing in dmesg. I don't know what I did wrong, most probably I
forgot to do zipl or something like this...

I'll try to summarize. Martin's patch
from https://www.redhat.com/archives/utrace-devel/2010-January/msg6.html
fixes the problems with utrace.


However, with or without CONFIG_UTRACE, 6580807da14c423f0d0a708108e6df6ebc8bc83d
is needed on s390 too, otherwise the child gets unnecessary traps.

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Oleg Nesterov
On 01/05, Martin Schwidefsky wrote:

 On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

   This probably means that copy_process()-user_disable_single_step()
   is not enough to clear the this task wants single-stepping copied
   from parent.
 
  I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
  single-step trap occurred.  This is what causes do_single_step to be
  called before returning to user mode, rather than the machine trap doing it
  directly as is done in the other arch implementations.

 Just my thinking as well.

  If I'm right, then this task wants single-stepping is not the problem,
  and that really is fully cleared.  In fact, looking at s390's copy_thread
  (arch/s390/kernel/process.c) it clears out all the state that is actually
  touched by user_enable_single_step and user_disable_single_step.  So for
  s390 the new fork.c call is actually superfluous AFAICT.

 /* Don't copy debug registers */
 memset(p-thread.per_info, 0, sizeof(p-thread.per_info));

 Yep, the call from fork.c is indeed superfluous.

I can't explain this, but if I remove copy_process()-user_disable_single_step()
the test-case below triggers XXX printk's from do_single_step() with or
without CONFIG_UTRACE. the patch is

--- arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
+++ arch/s390/kernel/traps.c2010-01-05 11:03:55.006487697 -0500
@@ -384,6 +384,9 @@ void __kprobes do_single_step(struct pt_
}
if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
+   else
+   printk(XXX: %s/%d %d\n, current-comm, current-pid,
+   test_thread_flag(TIF_SINGLE_STEP));
 }
 
 static void default_trap_handler(struct pt_regs * regs, long 
interruption_code)

Oleg.

#include stdio.h
#include unistd.h
#include signal.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

if (!(pid = fork())) {
assert(ptrace(PTRACE_TRACEME) == 0);
kill(getpid(), SIGSTOP);

if (!fork())
return 43;

wait(status);
return WEXITSTATUS(status);
}


for (;;) {
assert(pid == wait(status));
if (WIFEXITED(status))
break;
assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
}

assert(WEXITSTATUS(status) == 43);
return 0;
}




Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-06 Thread Roland McGrath
 However, with or without CONFIG_UTRACE, 
 6580807da14c423f0d0a708108e6df6ebc8bc83d
 is needed on s390 too, otherwise the child gets unnecessary traps.

This confuses me.  user_disable_single_step on non-current doesn't do
anything not already done by the memset in copy_thread.  Ooh, except
perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
the only thing I can think of.  Maybe Martin can make sense of it.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Martin Schwidefsky
On Mon, 4 Jan 2010 19:14:12 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/04, Martin Schwidefsky wrote:
  Subject: [PATCH] fix loading of PER control registers for utrace.
 
  From: Martin Schwidefsky schwidef...@de.ibm.com
 
  If the current task enables / disables PER tracing for itself the
  PER control registers need to be loaded in FixPerRegisters.
 
  Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
  ---
   arch/s390/kernel/ptrace.c |3 +++
   1 file changed, 3 insertions(+)
 
  --- a/arch/s390/kernel/ptrace.c
  +++ b/arch/s390/kernel/ptrace.c
  @@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
  per_info-control_regs.bits.storage_alt_space_ctl = 1;
  else
  per_info-control_regs.bits.storage_alt_space_ctl = 0;
  +
  +   if (task == current)
  +   __ctl_load(per_info-control_regs.words, 9, 11);
   }
 
 Yes it does fix the problem! Thanks a lot Martin.

Ok, I will add that patch to the git390 queue.

 However. Could you please look at 6580807da14c423f0d0a708108e6df6ebc8bc83d ?
 I am worried, perhaps this commit is not enough for s390. OK, do_single_step()
 tracehook_consider_fatal_signal(), this means the forked thread will not
 be killed by SIGTRAP if it is not auto-attached, but still this may be
 wrong.
 
 IOW. I think this problem is minor and probably can be ignored, but if
 I remove tracehook_consider_fatal_signal() check from do_single_step(),
 
 --- a/arch/s390/kernel/traps.c
 +++ b/arch/s390/kernel/traps.c
 @@ -382,8 +382,7 @@ void __kprobes do_single_step(struct pt_
   SIGTRAP) == NOTIFY_STOP){
   return;
   }
 - if (tracehook_consider_fatal_signal(current, SIGTRAP))
 - force_sig(SIGTRAP, current);
 + force_sig(SIGTRAP, current);
  }
 
  static void default_trap_handler(struct pt_regs * regs, long 
 interruption_code)
 ---
 
 then the test-case from 6580807da14c423f0d0a708108e6df6ebc8bc83d
 fails. This probably means that copy_process()-user_disable_single_step()
 is not enough to clear the this task wants single-stepping copied
 from parent.

user_disable_single_step() does not remove the TIF_SINGLE_STEP bit from the
forked task. Perhaps we should just clear the bit in the function.

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Oleg Nesterov
On 01/05, Martin Schwidefsky wrote:

 On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
 Roland McGrath rol...@redhat.com wrote:

   This probably means that copy_process()-user_disable_single_step()
   is not enough to clear the this task wants single-stepping copied
   from parent.
 
  I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
  single-step trap occurred.  This is what causes do_single_step to be
  called before returning to user mode, rather than the machine trap doing it
  directly as is done in the other arch implementations.

 Just my thinking as well.

Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
absolutely.

For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do

--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
clear_thread_flag(TIF_RESTORE_SIGMASK);
 
/*
-* If we would have taken a single-step trap
-* for a normal instruction, act like we took
-* one for the handler setup.
-*/
-   if (current-thread.per_info.single_step)
-   set_thread_flag(TIF_SINGLE_STEP);
-
-   /*
 * Let tracing know that we've done the handler 
setup.
 */
tracehook_signal_handler(signr, info, ka, 
regs,
-
test_thread_flag(TIF_SINGLE_STEP));
+   
current-thread.per_info.single_step);
}
return;
}

?

Apart from arch/s390/signal.c, TIF_SINGLE_STEP is used by entry.S
but I don't understand this asm at all.

Anyway. I modified the debugging patch a bit:

--- K/arch/s390/kernel/traps.c~ 2009-12-22 10:41:52.909174198 -0500
+++ K/arch/s390/kernel/traps.c  2010-01-05 09:49:19.541792379 -0500
@@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
}
if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
+   else
+   printk(XXX: %d %d\n, current-pid, 
test_thread_flag(TIF_SINGLE_STEP));
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)
---

Now, when I run this test-case

#include stdio.h
#include unistd.h
#include signal.h
#include sys/ptrace.h
#include sys/wait.h
#include assert.h

int main(void)
{
int pid, status;

if (!(pid = fork())) {
assert(ptrace(PTRACE_TRACEME) == 0);
kill(getpid(), SIGSTOP);

if (!fork())
return 43;

wait(status);
return WEXITSTATUS(status);
}


for (;;) {
assert(pid == wait(status));
if (WIFEXITED(status))
break;
assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
}

assert(WEXITSTATUS(status) == 43);
return 0;
}

dmesg shows 799 lines of

XXX: 2389 0


The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Martin Schwidefsky
On Tue, 5 Jan 2010 16:36:33 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/05, Martin Schwidefsky wrote:
 
  On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
  Roland McGrath rol...@redhat.com wrote:
 
This probably means that copy_process()-user_disable_single_step()
is not enough to clear the this task wants single-stepping copied
from parent.
  
   I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
   single-step trap occurred.  This is what causes do_single_step to be
   called before returning to user mode, rather than the machine trap doing 
   it
   directly as is done in the other arch implementations.
 
  Just my thinking as well.
 
 Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
 absolutely.
 
 For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
 
   --- a/arch/s390/kernel/signal.c
   +++ b/arch/s390/kernel/signal.c
   @@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
   clear_thread_flag(TIF_RESTORE_SIGMASK);

   /*
   -* If we would have taken a single-step trap
   -* for a normal instruction, act like we took
   -* one for the handler setup.
   -*/
   -   if (current-thread.per_info.single_step)
   -   set_thread_flag(TIF_SINGLE_STEP);
   -
   -   /*
* Let tracing know that we've done the handler 
 setup.
*/
   tracehook_signal_handler(signr, info, ka, 
 regs,
   -
 test_thread_flag(TIF_SINGLE_STEP));
   +   
 current-thread.per_info.single_step);
   }
   return;
   }
 
 ?

The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
want to be able to stop the debugged program before the first
instruction of the signal handler has been executed. The PER single
step causes a trap after an instruction has been executed. That first
instruction can do bad things to the arguments of the signal handler..

 Apart from arch/s390/signal.c, TIF_SINGLE_STEP is used by entry.S
 but I don't understand this asm at all.
 
 Anyway. I modified the debugging patch a bit:
 
 --- K/arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
 +++ K/arch/s390/kernel/traps.c2010-01-05 09:49:19.541792379 -0500
 @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
   }
   if (tracehook_consider_fatal_signal(current, SIGTRAP))
   force_sig(SIGTRAP, current);
 + else
 + printk(XXX: %d %d\n, current-pid, 
 test_thread_flag(TIF_SINGLE_STEP));
  }
 
  static void default_trap_handler(struct pt_regs * regs, long 
 interruption_code)
 ---
 
 Now, when I run this test-case
 
   #include stdio.h
   #include unistd.h
   #include signal.h
   #include sys/ptrace.h
   #include sys/wait.h
   #include assert.h
 
   int main(void)
   {
   int pid, status;
 
   if (!(pid = fork())) {
   assert(ptrace(PTRACE_TRACEME) == 0);
   kill(getpid(), SIGSTOP);
 
   if (!fork())
   return 43;
 
   wait(status);
   return WEXITSTATUS(status);
   }
 
 
   for (;;) {
   assert(pid == wait(status));
   if (WIFEXITED(status))
   break;
   assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
   }
 
   assert(WEXITSTATUS(status) == 43);
   return 0;
   }
 
 dmesg shows 799 lines of
 
   XXX: 2389 0
 
 
 The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

With or without my bug fix ?

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Oleg Nesterov
On 01/05, Oleg Nesterov wrote:

 Anyway. I modified the debugging patch a bit:

 --- K/arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
 +++ K/arch/s390/kernel/traps.c2010-01-05 09:49:19.541792379 -0500
 @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
   }
   if (tracehook_consider_fatal_signal(current, SIGTRAP))
   force_sig(SIGTRAP, current);
 + else
 + printk(XXX: %d %d\n, current-pid, 
 test_thread_flag(TIF_SINGLE_STEP));
  }

  static void default_trap_handler(struct pt_regs * regs, long 
 interruption_code)
 ---

Ah, please ignore. I guess TIF_SINGLE_STEP was already cleared by the caller
in entry.S

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Martin Schwidefsky
On Tue, 5 Jan 2010 16:47:25 +0100
Oleg Nesterov o...@redhat.com wrote:

 On 01/05, Oleg Nesterov wrote:
 
  Anyway. I modified the debugging patch a bit:
 
  --- K/arch/s390/kernel/traps.c~ 2009-12-22 10:41:52.909174198 -0500
  +++ K/arch/s390/kernel/traps.c  2010-01-05 09:49:19.541792379 -0500
  @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
  }
  if (tracehook_consider_fatal_signal(current, SIGTRAP))
  force_sig(SIGTRAP, current);
  +   else
  +   printk(XXX: %d %d\n, current-pid, 
  test_thread_flag(TIF_SINGLE_STEP));
   }
 
   static void default_trap_handler(struct pt_regs * regs, long 
  interruption_code)
  ---
 
 Ah, please ignore. I guess TIF_SINGLE_STEP was already cleared by the caller
 in entry.S

Yes, TIF_SINGLE_STEP is checked in entry.S and cleared before do_signal
is called. That is the ni instruction at sysc_singlestep and
sysc_sigpending.

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-05 Thread Oleg Nesterov
On 01/05, Martin Schwidefsky wrote:

 On Tue, 5 Jan 2010 16:36:33 +0100
 Oleg Nesterov o...@redhat.com wrote:

  For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
  
  --- a/arch/s390/kernel/signal.c
  +++ b/arch/s390/kernel/signal.c
  @@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
  clear_thread_flag(TIF_RESTORE_SIGMASK);
   
  /*
  -* If we would have taken a single-step trap
  -* for a normal instruction, act like we took
  -* one for the handler setup.
  -*/
  -   if (current-thread.per_info.single_step)
  -   set_thread_flag(TIF_SINGLE_STEP);
  -
  -   /*
   * Let tracing know that we've done the handler 
  setup.
   */
  tracehook_signal_handler(signr, info, ka, 
  regs,
  -
  test_thread_flag(TIF_SINGLE_STEP));
  +   
  current-thread.per_info.single_step);
  }
  return;
  }
  
  ?

 The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
 want to be able to stop the debugged program before the first
 instruction of the signal handler has been executed. The PER single
 step causes a trap after an instruction has been executed. That first
 instruction can do bad things to the arguments of the signal handler..

Yes, but afaics all we need is to pass the correct int stepping arg
to tracehook_signal_handler(). If it is true, the tracee does
ptrace_notify() before it returns to user-mode.

  dmesg shows 799 lines of
 
  XXX: 2389 0
 
 
  The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

 With or without my bug fix ?

With, but please see another email.


I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.

Oleg.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Martin Schwidefsky
On Mon, 4 Jan 2010 16:52:25 +0100
Oleg Nesterov o...@redhat.com wrote:

 Hi!
 
 We have some strange problems with utrace on s390, and so far this _looks_
 like a s390 problem.
 
 Looks like, on any CPU user_enable_single_step() does not work until at
 least one thread with per_info.single_step = 1 does the context switch.
 
 This doesn't matter with the old ptrace implementation, but with utrace
 the tracee itself does user_enable_single_step(current) and returns to
 user-mode. Until it does at least one context switch the single-stepping
 doesn't work, after that everything works fine till the next reboot.

The PER control registers only get reloaded on task switch. Can you test
if this patch fixes your problem?

--
Subject: [PATCH] fix loading of PER control registers for utrace.

From: Martin Schwidefsky schwidef...@de.ibm.com

If the current task enables / disables PER tracing for itself the
PER control registers need to be loaded in FixPerRegisters.

Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
---
 arch/s390/kernel/ptrace.c |3 +++
 1 file changed, 3 insertions(+)

--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
per_info-control_regs.bits.storage_alt_space_ctl = 1;
else
per_info-control_regs.bits.storage_alt_space_ctl = 0;
+
+   if (task == current)
+   __ctl_load(per_info-control_regs.words, 9, 11);
 }
 
 void user_enable_single_step(struct task_struct *task)

-- 
blue skies,
   Martin.

Reality continues to ruin my life. - Calvin.



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Oleg Nesterov
On 01/04, Oleg Nesterov wrote:

 IOW. I think this problem is minor and probably can be ignored,

Or may be not...

Even if the child is not killed by SIGTRAP, it can get a lot of
unnecessary traps.

To verify, I did another trivial patch (below), and the test
case from 6580807da14c423f0d0a708108e6df6ebc8bc83d does trigger
a lot of false step printks.

Hmm. And sometimes there is nothing in dmesg, but the test-case
needs a lot of time to complete. taskset -c seems to always
trigger printk's. Magic.

Oleg.

--- arch/s390/kernel/traps.c~   2009-12-22 10:41:52.909174198 -0500
+++ arch/s390/kernel/traps.c2010-01-04 13:19:51.038187586 -0500
@@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
}
if (tracehook_consider_fatal_signal(current, SIGTRAP))
force_sig(SIGTRAP, current);
+   else
+   printk(false step\n);
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Roland McGrath
 The PER control registers only get reloaded on task switch. Can you test
 if this patch fixes your problem?

Long ago when I first worked with David Wilder on s390 arch code,
I remember we made this change.  It seems to have been forgotten
in the later rounds of reworking and merging.


Thanks,
Roland



Re: s390 user_enable_single_step() (Was: odd utrace testing results on s390x)

2010-01-04 Thread Roland McGrath
 This probably means that copy_process()-user_disable_single_step()
 is not enough to clear the this task wants single-stepping copied
 from parent.

I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means a
single-step trap occurred.  This is what causes do_single_step to be
called before returning to user mode, rather than the machine trap doing it
directly as is done in the other arch implementations.

If I'm right, then this task wants single-stepping is not the problem,
and that really is fully cleared.  In fact, looking at s390's copy_thread
(arch/s390/kernel/process.c) it clears out all the state that is actually
touched by user_enable_single_step and user_disable_single_step.  So for
s390 the new fork.c call is actually superfluous AFAICT.

The problem is that the copied parent state includes the this task has a
pending single-step to report flag.  IMHO it clearly makes sense for
s390's copy_thread to clear this flag in a new task, which it does not do now.

An alternative to that would be just to make its user_disable_single_step
clear the flag.  That could in theory also have an effect on e.g. the
(authentic) pending step report of a tracee that was stopped with
TIF_SINGLE_STEP set when its tracer detached.  This might be considered a
good thing, but since every other arch posts the SIGTRAP immediately they
all have the equivalent issue and s390 doesn't need to be any better than
they are before we have a generic resolution to the whole subject of
tracer-induced signals (which we won't get into now).  I'm not even sure
from my insufficient reading of the s390 assembly code whether this path is
even possible, i.e. do_signal called before do_single_step.

Martin, I suggest having copy_thread clear TIF_SINGLE_STEP.
That bit is always task-private state that should not be copied.

Btw, given the complexity of FixPerRegisters (and its new additional cost
on task==current), you might want to make user_*_single_step bail out if
per_info.single_step is already set/clear on entry.


Thanks,
Roland