Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-20 Thread Roland McGrath
  The former is e.g. PTRACE_SINGLESTEP while an unrelated engine uses
  UTRACE_EVENT(SYSCALL_ENTRY).  The ptrace engine's report_quiesce returns
  UTRACE_SINGLESTEP.  finish_resume_report() calls user_enable_single_step().
  That seems fine.  Right?
 
 In this case ptrace_report_quiesce(event) returns UTRACE_RESUME, note
 that it does
 
   return event ? UTRACE_RESUME : ctx-resume;
 
 and event == SYSCALL_ENTRY. This looks correct.

With the utrace layer changes we're discussing, we need it to return
UTRACE_SINGLESTEP (i.e. ctx-resume) here.  AFAICT that never hurts
even with today's utrace layer.

  I see.  finish_resume_report() will only do utrace_stop() and then we'll
  go directly into the syscall unless someone used UTRACE_REPORT.
 
 Yes,
 
  utrace_stop() will only set TIF_NOTIFY_RESUME and utrace-resume.  In
  the real resume-to-user cases, that's fine because we'll handle that in
  utrace_resume() or utrace_get_signal() before doing anything else.
 
 Not sure I understand. utrace_stop() will not set TIF_NOTIFY_RESUME?
 
 We are talking about the case when the tracee stops in SYSCALL_ENTER,
 and then the tracer does utrace_control(UTRACE_SINGLESTEP) to resume.
 In this case utrace_control() sets -resume/TIF_NOTIFY_RESUME, yes.

I think you do understand fine.  Yes, that's what it will do.  I was just
recognizing that this isn't enough in the syscall-entry case.

  this
  seems like the right idea for how to get tracehook_report_syscall_exit
  called in the cases where it is in old ptrace.
 
 Afaics yes.
 
 But, what if another engine does utrace_control(UTRACE_INTERRUPT) ?

Hmm.  Yes, I think this is the one case that is unlike all the others.
That is, UTRACE_INTERRUPT at syscall-entry.  In all other cases,
nothing would care about utrace-resume until we get to
get_signal_to_deliver anyway, where the UTRACE_SIGNAL_REPORT pass will
trump any pending resume action anyway so you don't care that you lost
track of it when UTRACE_INTERRUPT came in.

Hmm.  So what does UTRACE_INTERRUPT mean here anyway?  It means that
we should report soon and that signal_pending() should be true until
that report.  So today that means you can get into the syscall with
signal_pending() and depending on the particular call that might cause
a normal blocking path not to block and/or a -EINTR/-ERESTART* return,
but some syscalls will just complete normally.

How about if we say that UTRACE_INTERRUPT at syscall-entry means that
the syscall really doesn't happen at all?  That is, instead, you force
an -ERESTARTNOHAND return and the normal roll-back such that you get
your report_signal callback before the syscall is entered.  That even
seems like a useful utrace API feature, as well as conveniently
smoothing over this quirk in the resume action handling.

My idea here is that this makes it OK to lose UTRACE_SINGLESTEP here
because from the user-mode-centric perspective no instruction has
happened yet.  The original guy who did UTRACE_SINGLESTEP (and perhaps
never cared about syscall tracing) will get a generic report_quiesce
or report_signal at which it needs to reassert UTRACE_SINGLESTEP.

This merits more thought.  For now, let's just leave an XXX comment
about a UTRACE_INTERRUPT effectively swallowing the UTRACE_SINGLESTEP
that should cause utrace_report_syscall_exit to be called.  I think we
can revisit this corner after we have merged up all the rest of it.

 Argh. I meant, from the caller pov, utrace_control(UTRACE_SINGLESTEP)
 does enable_step() immediately, like it did before utrace-cleanup
 changes.

I see.  From the utrace API perspective, I don't think anything changes
here.  In today's code, the resume action can take effect without a
subsequent utrace callback.  So that's the same as it ever was, and where
this actually happens is not something that a utrace caller should know or
can tell.

 IOW, suppose the tracee is stopped, and ptrace does UTRACE_SINGLESTEP.
 The tracee resumes, processes -resume, and does enable_step().
 From the ptrace pov, it looks as if utrace_control() does enable_step()
 before utrace_wakeup().

Sure.

 I meant, every time the tracee stops, it should process -resume
 after wakeup. Looks like, the patch you sent in another thread
 (which adds apply_resume_action()) does something like this, right?

Right.


Thanks,
Roland



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-19 Thread Oleg Nesterov
On 11/18, Roland McGrath wrote:

  Hmm. not sure I understand how this can work. I mean, this won't
  enable stepping (in this particular case we are discussing).

 I might be confused.  I'm thinking of two cases: the report_syscall_entry
 pass yields UTRACE_SINGLESTEP, or that it yields UTRACE_STOP and then
 utrace_control() sets utrace-resume to UTRACE_SINGLESTEP.

 The former is e.g. PTRACE_SINGLESTEP while an unrelated engine uses
 UTRACE_EVENT(SYSCALL_ENTRY).  The ptrace engine's report_quiesce returns
 UTRACE_SINGLESTEP.  finish_resume_report() calls user_enable_single_step().
 That seems fine.  Right?

In this case ptrace_report_quiesce(event) returns UTRACE_RESUME, note
that it does

return event ? UTRACE_RESUME : ctx-resume;

and event == SYSCALL_ENTRY. This looks correct.

 The latter is e.g. PTRACE_SINGLESTEP after ptrace stop at syscall entry.

Yes, this is the case I was talking about.

 I see.  finish_resume_report() will only do utrace_stop() and then we'll
 go directly into the syscall unless someone used UTRACE_REPORT.

Yes,

 utrace_stop() will only set TIF_NOTIFY_RESUME and utrace-resume.  In
 the real resume-to-user cases, that's fine because we'll handle that in
 utrace_resume() or utrace_get_signal() before doing anything else.

Not sure I understand. utrace_stop() will not set TIF_NOTIFY_RESUME?

We are talking about the case when the tracee stops in SYSCALL_ENTER,
and then the tracer does utrace_control(UTRACE_SINGLESTEP) to resume.
In this case utrace_control() sets -resume/TIF_NOTIFY_RESUME, yes.

 @@ -1794,11 +1794,13 @@ static void finish_resume_report(struct
  {
   finish_report_reset(task, utrace, report);

 - switch (report-action) {
 - case UTRACE_STOP:
 - utrace_stop(task, utrace, report-resume_action);
 - break;
 + if (report-action == UTRACE_STOP) {
 + utrace_stop(task, utrace, report-resume_action, true);
 + report-action = start_report(utrace);
 + WARN_ON(report-action == UTRACE_STOP);
 + }

 + switch (report-action) {
   case UTRACE_INTERRUPT:
   if (!signal_pending(task))
   set_tsk_thread_flag(task, TIF_SIGPENDING);

Yes, this means the tracee gets TIF_SINGLESTEP right after resume,
and

 this
 seems like the right idea for how to get tracehook_report_syscall_exit
 called in the cases where it is in old ptrace.

Afaics yes.

But, what if another engine does utrace_control(UTRACE_INTERRUPT) ?

  Instead of fixing utrace_report_syscall_entry(), perhaps we should
  change utrace_stop() path to enable the stepping if needed, but this
  means we return to synchronous utrace_control(UTRACE_XXXSTEP).

 I don't really follow what you mean by synchronous.

Argh. I meant, from the caller pov, utrace_control(UTRACE_SINGLESTEP)
does enable_step() immediately, like it did before utrace-cleanup
changes.

IOW, suppose the tracee is stopped, and ptrace does UTRACE_SINGLESTEP.
The tracee resumes, processes -resume, and does enable_step().
From the ptrace pov, it looks as if utrace_control() does enable_step()
before utrace_wakeup().

 Your suggestion is
 equivalent to making utrace_finish_vfork call finish_resume_report instead
 of utrace_stop, isn't it?

I meant, every time the tracee stops, it should process -resume
after wakeup. Looks like, the patch you sent in another thread
(which adds apply_resume_action()) does something like this, right?

I'll try to read it carefully now.

Oleg.



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-18 Thread Oleg Nesterov
On 11/18, Roland McGrath wrote:

  Once again. The tracee sleeps in SYSCALL_ENTER. The tracer resumes
  the tracee via utrace_control(UTRACE_SINGLESTEP).

 Right.  This is another strange corner.  It doesn't necessarily make
 especially good sense intrinsically for single-step to have this meaning
 from this place.  But it's the status quo in the ptrace interface.

Yes.

  So, I think we should do something like
 
  tracehook_report_syscall_exit(step)
 
  // do not use step at all
  if (task_utrace_flags() != 0)
  utrace_report_syscall_exit();

 The trouble here is that the arch code looks like this:

   step = test_thread_flag(TIF_SINGLESTEP);
   if (step || test_thread_flag(TIF_SYSCALL_TRACE))
   tracehook_report_syscall_exit(regs, step);

 i.e. unless syscall tracing is enabled, we won't even get there at all.

You are right, I missed this.

  utrace_report_syscall_exit()
 
  if (UTRACE_EVENT(SYSCALL_EXIT))
  REPORT(report_syscall_exit);
 
  if (utrace-resume == UTRACE_SINGLESTEP ||
  utrace-resume == UTRACE_BLOCKSTEP)
  send_sigtrap();
 
  What do you think?

 This can't work in the REPORT case.  There start_report() will have reset
 utrace-resume,

Yes, yes, the pseudo-code above was just for illustration.

 (This would go with un-reverting f19442c and reverting 3bd4be9.)

 --- a/kernel/utrace.c
 +++ b/kernel/utrace.c
 @@ -1584,20 +1584,25 @@ static inline u32 do_report_syscall_entr
UTRACE_EVENT(SYSCALL_ENTRY), report_syscall_entry,
resume_report | report-result | report-action,
engine, current, regs);
 - finish_report(task, utrace, report, false);
 +
 + /*
 +  * Now we'll stop if asked.  If we're proceeding with the syscall,
 +  * now we'll first enable single-step if asked.  That leaves the
 +  * task in stepping state so tracehook_report_syscall_exit() and
 +  * the arch code that calls it will know.
 +  */
 + finish_resume_report(task, utrace, report);

Hmm. not sure I understand how this can work. I mean, this won't
enable stepping (in this particular case we are discussing).

 - } else if (utrace-resume == UTRACE_REPORT) {
 + } else if (utrace-resume = UTRACE_REPORT) {

perhaps, you meant = UTRACE_REPORT ? In this case yes, UTRACE_SINGLESTEP
will be noticed after resume, and ptrace or another engine can reassert
it during the next reporting loop, and then finish_resume_report() will
enable the stepping. But we can't trust =, another engine can
do utrace_control(INTERRUPT) and change -resume before the tracee
resumes.

OK, probably I missed the implemenation details, but I think I got
the idea, and this should fix the step-after-syscall_enter problem.

But. We have the same problem with utrace_finish_vfork(), no?


Instead of fixing utrace_report_syscall_entry(), perhaps we should
change utrace_stop() path to enable the stepping if needed, but this
means we return to synchronous utrace_control(UTRACE_XXXSTEP).

Oleg.



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-18 Thread Oleg Nesterov
I'll reply tomorrow, it is to late for me.

But I noticed the funny detail in your email,

On 11/18, Roland McGrath wrote:

 Yes.  But, hmm.

   utrace_stop(task, utrace, UTRACE_RESUME); /* XXX */

 This XXX was there about passing UTRACE_RESUME, because that's not really
 right.

I didn't notice this XXX marker, but I guess I just sent the patch
which makes this case correct. (of course this patch doesn't fix
the problem we are dicussing in this thread).

Oleg.



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-17 Thread Oleg Nesterov
On 11/16, Roland McGrath wrote:

 Whatever temporary hacks are fine by me one way or the other.
 They will just be coming out later along with assorted other cleanup.
 We certainly do want to get this right in the utrace layer.

Yes. But imho it is always good to test/review the patches against
the working kernel. The patch I sent is very simple, and can be
easily reverted once we improve utrace.

IOW, I am asking you to apply my patch for now and revert your
change to have the working tree, then discuss the right fix.

 The change we talked about before seems simple enough and should cover this
 without new kludges in the ptrace layer.  I did this (commit f19442c).

I don't think this can work.

tracehook_report_syscall_exit(step)

if (step || UTRACE_EVENT(SYSCALL_EXIT))
utrace_report_syscall_exit(step);

and,
utrace_report_syscall_exit(step)

if (step)
send_sigtrap();

The problems is: we can trust bool step, and in fact we do
not need it at all.

Once again. The tracee sleeps in SYSCALL_ENTER. The tracer resumes
the tracee via utrace_control(UTRACE_SINGLESTEP).

By the time the resumed tracee passes tracehook_report_syscall_exit()
step == F, utrace_control() does not set TIF_SINGLESTEP.

So, I think we should do something like

tracehook_report_syscall_exit(step)

// do not use step at all
if (task_utrace_flags() != 0)
utrace_report_syscall_exit();

// this code below is only for old ptrace

if (step  (task_ptrace(current)  PT_PTRACED))
send_sigtrap();
ptrace_report_syscall();

and,
utrace_report_syscall_exit()

if (UTRACE_EVENT(SYSCALL_EXIT))
REPORT(report_syscall_exit);

if (utrace-resume == UTRACE_SINGLESTEP ||
utrace-resume == UTRACE_BLOCKSTEP)
send_sigtrap();

What do you think?

Oleg.



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-16 Thread Oleg Nesterov
Just in case, forgot to clarify...

On 11/16, Oleg Nesterov wrote:

 With this patch make check passes all tests again (to clarify, except
 some tests which upstream doesn't pass too), including this one:

with this patch +
[HACK] utrace: fix utrace_resume()-finish_resume_report() logic

And I didn't check make xcheck, I guess it still crashes the kernel.

Oleg.



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-16 Thread Oleg Nesterov
On 11/16, Oleg Nesterov wrote:

 And I didn't check make xcheck, I guess it still crashes the kernel.

Yes it does. I am almost sure the bug should be trivial, but
somehow can't find find it. Just fyi, to ensure this is connected
to utrace-indirect I applied the hack below and the bug goes away.

OK. Tomorrow I will just read utrace.c trying to understand the
details of the new code, perhaps I will notice something.

Oleg.

--- UTRACE-PTRACE/kernel/fork.c~XXX_CRASH   2009-11-16 20:26:23.0 
+0100
+++ UTRACE-PTRACE/kernel/fork.c 2009-11-16 21:18:57.0 +0100
@@ -970,6 +970,8 @@ static void posix_cpu_timers_init(struct
  * parts of the process environment (as per the clone
  * flags). The actual kick-off is left to the caller.
  */
+bool utrace_task_alloc(struct task_struct *task);
+
 static struct task_struct *copy_process(unsigned long clone_flags,
unsigned long stack_start,
struct pt_regs *regs,
@@ -1227,6 +1229,10 @@ static struct task_struct *copy_process(
cgroup_fork_callbacks(p);
cgroup_callbacks_done = 1;
 
+   p-utrace = NULL;
+   if (!utrace_task_alloc(p))
+   WARN_ON(1);
+
/* Need tasklist lock for parent etc handling! */
write_lock_irq(tasklist_lock);
 
--- UTRACE-PTRACE/kernel/utrace.c~XXX_CRASH 2009-11-16 20:31:38.0 
+0100
+++ UTRACE-PTRACE/kernel/utrace.c   2009-11-16 21:19:26.0 +0100
@@ -87,9 +87,9 @@ module_init(utrace_init);
  *
  * This returns false only in case of a memory allocation failure.
  */
-static bool utrace_task_alloc(struct task_struct *task)
+bool utrace_task_alloc(struct task_struct *task)
 {
-   struct utrace *utrace = kmem_cache_zalloc(utrace_cachep, GFP_KERNEL);
+   struct utrace *utrace = kzalloc(sizeof(struct utrace), GFP_KERNEL);
if (unlikely(!utrace))
return false;
spin_lock_init(utrace-lock);
@@ -108,7 +108,7 @@ static bool utrace_task_alloc(struct tas
}
task_unlock(task);
if (unlikely(task-utrace != utrace))
-   kmem_cache_free(utrace_cachep, utrace);
+   kfree(utrace);
return true;
 }
 
@@ -118,7 +118,7 @@ static bool utrace_task_alloc(struct tas
  */
 void utrace_free_task(struct task_struct *task)
 {
-   kmem_cache_free(utrace_cachep, task-utrace);
+   kfree(task-utrace);
 }
 
 /*
@@ -286,6 +286,8 @@ struct utrace_engine *utrace_attach_task
struct utrace_engine *engine;
int ret;
 
+   WARN_ON(!utrace);
+
if (!(flags  UTRACE_ATTACH_CREATE)) {
if (unlikely(!utrace))
return ERR_PTR(-ENOENT);
--- UTRACE-PTRACE/include/linux/utrace.h~XXX_CRASH  2009-11-16 
01:06:34.0 +0100
+++ UTRACE-PTRACE/include/linux/utrace.h2009-11-16 20:50:38.0 
+0100
@@ -163,7 +163,7 @@ static inline struct utrace *task_utrace
 static inline void utrace_init_task(struct task_struct *task)
 {
task-utrace_flags = 0;
-   task-utrace = NULL;
+   WARN_ON(!task-utrace);
 }
 
 void utrace_release_task(struct task_struct *);



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-16 Thread Roland McGrath
Whatever temporary hacks are fine by me one way or the other.
They will just be coming out later along with assorted other cleanup.
We certainly do want to get this right in the utrace layer.

The change we talked about before seems simple enough and should cover this
without new kludges in the ptrace layer.  I did this (commit f19442c).
I think that should cover what you need without this ptrace change.
(All untested, of course!)


Thanks,
Roland



Re: [PATCH 133] stepping, accommodate to utrace-cleanup changes

2009-11-16 Thread Roland McGrath
 Yes it does. I am almost sure the bug should be trivial, but
 somehow can't find find it. Just fyi, to ensure this is connected
 to utrace-indirect I applied the hack below and the bug goes away.

Does s/kmem_cache_zalloc/kzalloc/ really have anything to do with it?
Isn't it just the allocation synchronization logic?

 OK. Tomorrow I will just read utrace.c trying to understand the
 details of the new code, perhaps I will notice something.

Ok.  I hope you'll find something not too strange.  But if it takes long,
then probably we should just change utrace_init_task() to eagerly call
utrace_task_alloc() and let you work through other issues first.


Thanks,
Roland