Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-13 Thread Oleg Nesterov
On 11/12, Oleg Nesterov wrote:

 On 11/12, Roland McGrath wrote:
 
  I did some tweaks that I think address the several things you've raised.
  But I didn't try to reply point by point.  I've merged everything up now,
  so the utrace-cleanup branch is gone.  Please review the current code and
  post about anything we still need to fix.

 Great, thanks.

 I will definitely read the changes in utrace.c, if nothing else this is
 interesting to me. But for now, until utrace-ptrace is updated/finished,
 I will assume it is correct and doesn't need any further changes.

(except the SIGTRAP changes in utrace_report_syscall_exit() we are discussing
 in another thread)

Roland, I pulled the last changes in your tree (utrace-cleanup merged
in utrace-ptrace) and did some testing.

In short: utrace-ptrace does not work at all.

It fails a lot (if not most) of tests, in particular
sa-resethand-on-cont-signal hangs and clone-multi-ptrace silently
crashes the kernel.

The quick investigation didn't help, utrace.c has too much changes,
I will read the code tomorrow from the very beginning.

Today I am going to pretend utrace works and do the discussed changes
in utrace-ptrace, then try to fix utrace.

Oleg.



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-12 Thread Roland McGrath
I did some tweaks that I think address the several things you've raised.
But I didn't try to reply point by point.  I've merged everything up now,
so the utrace-cleanup branch is gone.  Please review the current code and
post about anything we still need to fix.

I merged into utrace-ptrace and did the trivial hack to handle
UTRACE_SYSCALL_RESUMED.  Later we probably want to change this to some
smarter handling so that ptrace doesn't report until other engines have
finished their fiddling and let the thread resume.  That way we can have
e.g. strace vs kmview show something consistent.  But we can refine this
later.  I think what I did is sufficient to maintain the status quo in your
code.  You can fix it up as needed.


Thanks,
Roland



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-12 Thread Oleg Nesterov
On 11/12, Roland McGrath wrote:

 I did some tweaks that I think address the several things you've raised.
 But I didn't try to reply point by point.  I've merged everything up now,
 so the utrace-cleanup branch is gone.  Please review the current code and
 post about anything we still need to fix.

Great, thanks.

I will definitely read the changes in utrace.c, if nothing else this is
interesting to me. But for now, until utrace-ptrace is updated/finished,
I will assume it is correct and doesn't need any further changes.

Oleg.



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-04 Thread Roland McGrath
 3. Now that we have utrace-resume, can't we kill report-resume_action ?

I thought this initially when making the change and then decided against
it.  I don't recall exactly what was in my mind at the time.  It would
take some more thought now to be sure whether there is a semantic
problem.  But at any rate, keeping it reduces the number of cases in
which we take the utrace lock in finish_callback_report, which seems
like a good thing.  Still, if you propose a patch to remove it with
thorough rationale on why it's good or better, then that will be fine.


Thanks,
Roland



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-03 Thread Roland McGrath
 8. Completely off-topic, but utrace_control() has a very strange comment
under case UTRACE_INTERRUPT,
 
* When it's stopped, we know it's always going
* through utrace_get_signal() and will recalculate.
 
can't recall if it were ever true, but surely it is not now?

I think what's true now is what's always been true.  The comment is a bit
confusing, though.  The two kinds of stopped are inside do_signal_stop()
and inside utrace_stop().  In the former, it is inside the signal-checking
loop and will check again.  In the latter, utrace_stop() ends with a call
to recalc_sigpending().


Thanks,
Roland



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-03 Thread Roland McGrath
 4. One of the changes in utrace_get_signal() doesn't look exactly right.
 
   if (utrace-resume  UTRACE_RESUME || utrace-signal_handler) {
   ...
   if (resume  UTRACE_REPORT) {
   report.action = resume;
   finish_resume_report(report);
   }
 
Yes, we need finish_resume_report() here, but since report-takers
is not set, finish_report_reset() will always call utrace_reset().
 
OTOH, we can't set -takers before finish_resume_report(), we can
miss UTRACE_DETACH request. utrace_control(DETACH)-utrace_do_stop()
does not change utrace-resume != UTRACE_RESUME.
 
And since utrace_do_stop() never upgrades utrace-resume, we have
another problem: UTRACE_STOP request can be lost here.

Hmm.  Maybe this?

diff --git a/kernel/utrace.c b/kernel/utrace.c
index f46854b..000 100644  
--- a/kernel/utrace.c
+++ b/kernel/utrace.c
@@ -1740,8 +1740,6 @@ static void finish_resume_report(struct 
 struct task_struct *task,
 struct utrace *utrace)
 {
-   finish_report_reset(task, utrace, report);
-
switch (report-action) {
case UTRACE_STOP:
utrace_stop(task, utrace, report-resume_action);
@@ -1829,6 +1827,8 @@ void utrace_resume(struct task_struct *t
report.action = UTRACE_RESUME;
list_for_each_entry(engine, utrace-attached, entry)
start_callback(utrace, report, engine, task, 0);
+
+   finish_report_reset(task, utrace, report);
}
 
/*
@@ -2147,6 +2147,7 @@ int utrace_get_signal(struct task_struct
 * as in utrace_resume(), above.  After we've dealt with that,
 * our caller will relock and come back through here.
 */
+   finish_report_reset(task, utrace, report);
finish_resume_report(report, task, utrace);
 
if (unlikely(fatal_signal_pending(task))) {


 5. utrace_resume() has the same problems. If report-action != REPORT
we do not call callbacks and finish_resume_report() is called with
-takers == F.

 6. But! I think utrace_resume() was always wrong here. Because it
calls start_callback(events = 0) and thus we nevet set -takers.

I think that change covers these too.  What do you think?


Thanks,
Roland



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-03 Thread Roland McGrath
 7. utrace_attach_task() has an implicit wmb() between -utrace = new_utrace
and -utrace_flags = REAP, this is good.
 
But, for example, tracehook_force_sigpending() does not have rmb(),
this means utrace_interrupt_pending() can OOPS in theory.

Ok.  Please send a patch.  Off hand it seems to me it isn't worth using a
barrier because the winner of that race doesn't matter.  So:

struct utrace *utrace = task_utrace_struct(current);
return likely(utrace)  utrace-resume == UTRACE_INTERRUPT;

is fine, right?  (With comment changes to explain the need for the check.)


Thanks,
Roland



Re: [PATCH 0/1] Was: utrace-cleanup branch

2009-11-03 Thread Roland McGrath
 2. Cosmetic, but the usage of utrace_task_alloc() looks a bit strange.
Why it returns bool, not struct utrace * ?

The pointer it would return is always target-utrace or it's NULL.
So the bool just says which of those it would be instead.  Either
way I imagine it should be inlined so the check on the return value
from kmem_cache_zalloc is the only test/branch.  I have no special
opinion about the cosmetic choice.  Whatever you and/or upstream
prefer is fine.  Send a patch if you would like it to be different.


Thanks,
Roland



[PATCH 0/1] Was: utrace-cleanup branch

2009-11-02 Thread Oleg Nesterov
On 10/28, Roland McGrath wrote:

 I've made a new branch, utrace-cleanup.
 This forks from utrace-indirect and has:

 26fefca utrace: sticky resume action
 28b2774 utrace: remove -stopped field

I am not sure I understand the new code in details - too much changes.
Anyway, I can never understand the code after the first reading. At
first glance, imho this change is right in general but:

1. This breaks the current implementation of utrace-ptrace.

  I guess I misunderstood you when we discussed this before, I thought
  that the tracee should handle UTRACE_SINGLESTEP right after resume
  and call user_enable_single_step(). However, enable_step() is called
  at utrace_resume/utrace_get_signal time, this is too late for ptrace.

  I guess this branch is the code which will be sent to lkml, right?

  In this case utrace-cleanup should be merged into utrace-ptrace right
  now, then I need to fix ptrace. Basically, ptrace_report_quiesce() and
  ptrace_report_interrupt() should detect the case when the tracee was
  resumed by PTRACE_XXXSTEP and then it passed syscall_trace_leave()
  without TIF_SINGLESTEP, and synthesize a trap.

  The main problem is that this is not consistent across the different
  arch'es :[


2. Cosmetic, but the usage of utrace_task_alloc() looks a bit strange.
   Why it returns bool, not struct utrace * ?


3. Now that we have utrace-resume, can't we kill report-resume_action ?


4. One of the changes in utrace_get_signal() doesn't look exactly right.

if (utrace-resume  UTRACE_RESUME || utrace-signal_handler) {
...
if (resume  UTRACE_REPORT) {
report.action = resume;
finish_resume_report(report);
}

   Yes, we need finish_resume_report() here, but since report-takers
   is not set, finish_report_reset() will always call utrace_reset().

   OTOH, we can't set -takers before finish_resume_report(), we can
   miss UTRACE_DETACH request. utrace_control(DETACH)-utrace_do_stop()
   does not change utrace-resume != UTRACE_RESUME.

   And since utrace_do_stop() never upgrades utrace-resume, we have
   another problem: UTRACE_STOP request can be lost here.


5. utrace_resume() has the same problems. If report-action != REPORT
   we do not call callbacks and finish_resume_report() is called with
   -takers == F.

6. But! I think utrace_resume() was always wrong here. Because it
   calls start_callback(events = 0) and thus we nevet set -takers.


7. utrace_attach_task() has an implicit wmb() between -utrace = new_utrace
   and -utrace_flags = REAP, this is good.

   But, for example, tracehook_force_sigpending() does not have rmb(),
   this means utrace_interrupt_pending() can OOPS in theory.


8. Completely off-topic, but utrace_control() has a very strange comment
   under case UTRACE_INTERRUPT,

 * When it's stopped, we know it's always going
 * through utrace_get_signal() and will recalculate.

   can't recall if it were ever true, but surely it is not now?

Oleg.