Re: Q: utrace->stopped && utrace_report_jctl()

2009-03-12 Thread Roland McGrath
> I'd like to ask you to clarify what utrace->stopped means...

I'm very glad you are looking into this area!

> My understanding is: if we see ->stopped == true under utrace->lock, then
> the target can do nothing "interesting" from the utrace's pov. The target
> should take utrace->lock at least once. Either in finish_utrace_stop(), or,
> if ->stopped was set by do_signal_stop() path, the target will call
> tracehook_get_signal()->utrace_get_signal(). So we can assume the target
> is "quiescent" and we can do, for example, UTRACE_DETACH safely.

Correct.

> But utrace_report_jctl() doesn't look right to me,
> 
>   spin_lock(&utrace->lock);
>   utrace->stopped = 0;
>   utrace->report = 0;
>   spin_unlock(&utrace->lock);
> 
> I must admit, I dont't understand the comment above, but obviously this is
> right, we should clear ->stopped. If nothing else, REPORT()->start_report()
> won't be happy if ->stopped.

The comment mentions "utrace being removed", which is a bit of old text
referring to an indirect struct utrace.  Aside from that, please tell me
what is not clear about that comment.

> But ->stopped can be restored right after we clear it! Yes, utrace_do_stop()
> and utrace_set_events() set ->stopped == 1 only if ->utrace_flags has no JCTL,
> and since we are here we must have JCTL.

That's indeed the logic intended to prevent ->stopped being set again here.

> But, if we enter utrace_report_jctl() with ->stopped == 1, JCTL can be
> already removed from ->utrace_flags, exactly because ->stopped was true.

I don't follow this.  JCTL is never "removed" from ->utrace_flags, except
as all event bits are, by utrace_reset().

> This leads to another minor question, how it is possible to enter enter
> utrace_report_jctl() with ->stopped == 1 ? I think the only possibility
> it was previously set by another call to utrace_report_jctl(), see below.

There are two ways to enter utrace_report_jctl with ->stopped set.

1. utrace_report_jctl was called when entering TASK_STOPPED, and set it then.
   Now utrace_report_jctl is called for the CLD_CONTINUED case, and
   ->stopped remains set.

2. utrace_set_events(target, UTRACE_EVENT(JCTL)) was called when target was
   already in TASK_STOPPED (and really stopped, or at least got past
   tracehook_notify_jctl before JCTL was set).  It sets ->stopped before
   adding JCTL to ->utrace_flags, so that utrace_control() will consider
   the target stopped.

> SIGNAL_STOP_STOPPED is not reliable, it is possible that the group stop in
> progress and it is not finished yet. 

SIGNAL_STOP_STOPPED should be reliable, as far as it goes.  It will only be
set if the group stop is complete.  If then a SIGCONT+stop signal come,
SIGCONT will clear SIGNAL_STOP_STOPPED before the stop signal starts
another group stop.  (We have no bad old PTRACE_CONT implementation to
conflict with here.)

> But ->group_stop_count is not reliable too. It it possible that we
> recieved SIGCONT and then another SIGSTOP. If another thread has already
> dequeued this SIGSTOP and initiated the new group stop, we can't just set
> TASK_STOPPED, we must participate in the ->group_stop_count accounting.

It's worse than that!  If we came out of TASK_STOPPED, we did it implicitly
and without holding the siglock.  

We participated in group_stop_count accounting for the first stop before we
got here.  If we stayed in TASK_STOPPED throughout the callbacks, then that
bookkeeping is still correct.

If the initiation of the new group stop happened while we were in
TASK_STOPPED, we were omitted from the count but we should stop again.  
In that case we should stop either if SIGNAL_STOP_STOPPED is set or if
group_stop_count > 0.  Since we weren't counted, if group_stop_count==0
then SIGNAL_STOP_STOPPED will be set (again).

If that initiation happened while a callback (e.g.) blocked in kmalloc or
after (i.e. we were not in TASK_STOPPED), we were included in that count.
In that case we need to decrement group_stop_count and stop again, but
possibly also need to call do_notify_parent_cldstop again if it was 1.  For
that we'd do the right thing just by returning in TASK_RUNNING.  We'll just
come right back around in get_signal_to_deliver and handle group_stop_count
normally.

The trouble is that we have no way to distinguish these two cases, i.e. to
know whether or not we were counted in group_stop_count.  Am I missing a
way?  (The one piece of information we are not using is the @notify
argument: it tells us whether we were the thread responsible for setting
SIGNAL_STOP_STOPPED just before we got here.  But I don't see how that helps.)

I think the bottom line is that we can't ever allow any transition to or
from TASK_STOPPED when we don't hold the siglock.  Every such transition
must hold that lock to manage group_stop_count and SIGNAL_STOP_STOPPED.

That suggests we must preemptively go back to TASK_RUNNING before making
the callbacks, just in case they would do the transition.  We'd take the
siglock and

[PATCH 1/2] UTRACE_STOP race condition (updated)

2009-03-12 Thread Renzo Davoli
Dear Roland, dear utrace developers,

I have updated my patch #1 (it solves the race condition on utrace_stop but
not the nesting issue) for the latest version of utrace.

I am trying to get the patches updated downloading, compiling and testing
the fixes every week or so... 
Things would be easier if these patch could be merged in the mainstream ;-)

renzo

diff -Naur linux-2.6.29-rc7-git5-utrace/kernel/utrace.c 
linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c
--- linux-2.6.29-rc7-git5-utrace/kernel/utrace.c2009-03-12 
11:00:09.0 +0100
+++ linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c 2009-03-12 
11:05:50.0 +0100
@@ -376,6 +376,13 @@
return killed;
 }
 
+static void mark_engine_wants_stop(struct utrace_engine *engine);
+static void clear_engine_wants_stop(struct utrace_engine *engine);
+static bool engine_wants_stop(struct utrace_engine *engine);
+static void mark_engine_wants_resume(struct utrace_engine *engine);
+static void clear_engine_wants_resume(struct utrace_engine *engine);
+static bool engine_wants_resume(struct utrace_engine *engine);
+
 /*
  * Perform %UTRACE_STOP, i.e. block in TASK_TRACED until woken up.
  * @task == current, @utrace == current->utrace, which is not locked.
@@ -385,6 +392,7 @@
 static bool utrace_stop(struct task_struct *task, struct utrace *utrace)
 {
bool killed;
+   struct utrace_engine *engine, *next;
 
/*
 * @utrace->stopped is the flag that says we are safely
@@ -406,7 +414,23 @@
return true;
}
 
-   utrace->stopped = 1;
+   /* final check: it is really needed to stop? */
+   list_for_each_entry_safe(engine, next, &utrace->attached, entry) {
+   if ((engine->ops != &utrace_detached_ops) && 
engine_wants_stop(engine)) {
+   if (engine_wants_resume(engine)) {
+   clear_engine_wants_stop(engine);
+   clear_engine_wants_resume(engine);
+   }
+   else
+   utrace->stopped = 1;
+   }
+   }
+   if (unlikely(!utrace->stopped)) {
+   spin_unlock_irq(&task->sighand->siglock);
+   spin_unlock(&utrace->lock);
+   return false;
+   }
+
__set_current_state(TASK_TRACED);
 
/*
@@ -632,6 +656,7 @@
  * to record whether the engine is keeping the target thread stopped.
  */
 #define ENGINE_STOP(1UL << _UTRACE_NEVENTS)
+#define ENGINE_RESUME  (1UL << (_UTRACE_NEVENTS+1))
 
 static void mark_engine_wants_stop(struct utrace_engine *engine)
 {
@@ -648,6 +673,21 @@
return (engine->flags & ENGINE_STOP) != 0;
 }
 
+static void mark_engine_wants_resume(struct utrace_engine *engine)
+{
+   engine->flags |= ENGINE_RESUME;
+}
+
+static void clear_engine_wants_resume(struct utrace_engine *engine)
+{
+   engine->flags &= ~ENGINE_RESUME;
+}
+
+static bool engine_wants_resume(struct utrace_engine *engine)
+{
+   return (engine->flags & ENGINE_RESUME) != 0;
+}
+
 /**
  * utrace_set_events - choose which event reports a tracing engine gets
  * @target:thread to affect
@@ -906,6 +946,10 @@
list_move(&engine->entry, &detached);
} else {
flags |= engine->flags | UTRACE_EVENT(REAP);
+   if (engine_wants_resume(engine)) {
+   clear_engine_wants_stop(engine);
+   clear_engine_wants_resume(engine);
+   }
wake = wake && !engine_wants_stop(engine);
}
}
@@ -1133,6 +1177,7 @@
 * There might not be another report before it just
 * resumes, so make sure single-step is not left set.
 */
+   mark_engine_wants_resume(engine);
if (likely(resume))
user_disable_single_step(target);
break;



[PATCH 2/2] UTRACE_STOP: nesting engine management (updated)

2009-03-12 Thread Renzo Davoli
Dear Roland, dear utrace developers,

I have update also the second patch. Please note that now this patch
must be applied after the first one.
This patch implements a consistent nesting model for utrace machines.
(There is a full description in the messages I sent on Feb. 14 and Mar. 6)

renzo
---
diff -Naur linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c 
linux-2.6.29-rc7-git5-utrace-p2/kernel/utrace.c
--- linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c 2009-03-12 
11:05:50.0 +0100
+++ linux-2.6.29-rc7-git5-utrace-p2/kernel/utrace.c 2009-03-12 
13:37:27.0 +0100
@@ -1405,6 +1405,7 @@
 static bool finish_callback(struct utrace *utrace,
struct utrace_report *report,
struct utrace_engine *engine,
+   struct task_struct *task,
u32 ret)
 {
enum utrace_resume_action action = utrace_resume_action(ret);
@@ -1426,6 +1427,7 @@
spin_lock(&utrace->lock);
mark_engine_wants_stop(engine);
spin_unlock(&utrace->lock);
+   utrace_stop(task, utrace);
}
} else if (engine_wants_stop(engine)) {
spin_lock(&utrace->lock);
@@ -1492,7 +1494,7 @@
ops = engine->ops;
 
if (want & UTRACE_EVENT(QUIESCE)) {
-   if (finish_callback(utrace, report, engine,
+   if (finish_callback(utrace, report, engine, task,
(*ops->report_quiesce)(report->action,
   engine, task,
   event)))
@@ -1526,24 +1528,24 @@
  * @callback is the name of the member in the ops vector, and remaining
  * args are the extras it takes after the standard three args.
  */
-#define REPORT(task, utrace, report, event, callback, ...)   \
+#define REPORT(reverse, task, utrace, report, event, callback, ...)
  \
do {  \
start_report(utrace); \
-   REPORT_CALLBACKS(task, utrace, report, event, callback,   \
+   REPORT_CALLBACKS(reverse, task, utrace, report, event, 
callback,  \
 (report)->action, engine, current,   \
 ## __VA_ARGS__); \
finish_report(report, task, utrace);  \
} while (0)
-#define REPORT_CALLBACKS(task, utrace, report, event, callback, ...) \
+#define REPORT_CALLBACKS(reverse, task, utrace, report, event, callback, ...)  
  \
do {  \
struct utrace_engine *engine; \
const struct utrace_engine_ops *ops;  \
-   list_for_each_entry(engine, &utrace->attached, entry) {   \
+   list_for_each_entry ## reverse(engine, &utrace->attached, 
entry) {\
ops = start_callback(utrace, report, engine, task,\
 event);  \
if (!ops) \
continue; \
-   finish_callback(utrace, report, engine,   \
+   finish_callback(utrace, report, engine, task,   
  \
(*ops->callback)(__VA_ARGS__));   \
} \
} while (0)
@@ -1558,7 +1560,7 @@
struct utrace *utrace = task_utrace_struct(task);
INIT_REPORT(report);
 
-   REPORT(task, utrace, &report, UTRACE_EVENT(EXEC),
+   REPORT(, task, utrace, &report, UTRACE_EVENT(EXEC),
   report_exec, fmt, bprm, regs);
 }
 
@@ -1573,7 +1575,7 @@
INIT_REPORT(report);
 
start_report(utrace);
-   REPORT_CALLBACKS(task, utrace, &report, UTRACE_EVENT(SYSCALL_ENTRY),
+   REPORT_CALLBACKS(_reverse, task, utrace, &report, 
UTRACE_EVENT(SYSCALL_ENTRY),
 report_syscall_entry, report.result | report.action,
 engine, current, regs);
finish_report(&report, task, utrace);
@@ -1615,7 +1617,7 @@
struct utrace *utrace = task_utrace_struct(task);
INIT_REPORT(report);
 
-   REPORT(task, utrace, &report, UTRACE_EVENT(SYSCALL_EXIT),
+   REPORT(, task, utrace, &report, UTRACE_EVENT(SYSCALL_EXIT),
   report_syscall_exit, regs);
 }
 
@@ -1640,7 +1642,7 @@
start_re

Re: [PATCH 1/2] UTRACE_STOP race condition (updated)

2009-03-12 Thread Oleg Nesterov
Hi Renzo,

This patch needs Roland's review, but I'd like to participate...

On 03/12, Renzo Davoli wrote:
>
> I have updated my patch #1 (it solves the race condition on utrace_stop but
> not the nesting issue) for the latest version of utrace.
>
> I am trying to get the patches updated downloading, compiling and testing
> the fixes every week or so...
> Things would be easier if these patch could be merged in the mainstream ;-)

I think it would be better if you describe the problem in the changelog.
It is not convenient to dig the archives to understand which problem
this patch fixes.

Can't really comment this change because I don't understand what is the
supposed behaviour of utrace_control(UTRACE_RESUME). Perhaps the caller
should wait until the target is stopped? The comment says:

 case UTRACE_RESUME:
* This and all other cases imply resuming if stopped.

it doesn't explain what should we do if it is not stopped yet.

>  static bool utrace_stop(struct task_struct *task, struct utrace *utrace)
>  {
>   bool killed;
> + struct utrace_engine *engine, *next;
>
>   /*
>* @utrace->stopped is the flag that says we are safely
> @@ -406,7 +414,23 @@
>   return true;
>   }
>
> - utrace->stopped = 1;
> + /* final check: it is really needed to stop? */
> + list_for_each_entry_safe(engine, next, &utrace->attached, entry) {

I think we can do this earlier, before taking ->siglock

> + if ((engine->ops != &utrace_detached_ops) && 
> engine_wants_stop(engine)) {

Do we need "!= &utrace_detached_ops" check? mark_engine_detached() removes
ENGINE_STOP from ->flags.

> +   if (engine_wants_resume(engine)) {
> +   clear_engine_wants_stop(engine);
> +   clear_engine_wants_resume(engine);
> +   }

I'm afraid _wants_resume() adds another problem. Let's suppose we do

utrace_control(UTRACE_RESUME);
utrace_control(UTRACE_STOP);

UTRACE_STOP doesn't do clear_engine_wants_resume(), so it can be lost.

And. Let's suppose we call utrace_control(UTRACE_RESUME), and later
report_xxx() returns UTRACE_STOP. Again, this stop request can be lost.
This doesn't look consistent.


Do we really need _wants_resume()? Note that utrace_control(UTRACE_RESUME)
does clear_engine_wants_stop(). Yes, we can race with finish_callback()
in case when ->report_xxx() returns UTRACE_STOP. But, perhaps, in that
case the caller of utrace_control(UTRACE_RESUME) should take care about
the synchronization with its own callbacks? Something like:

make_sure_my_callback_wont_return_UTRACE_STOP();
utrace_barrier();
utrace_control(UTRACE_RESUME);

This way utrace_stop() can just check engine_wants_stop().

Oleg.



Re: [PATCH 2/2] UTRACE_STOP: nesting engine management (updated)

2009-03-12 Thread Oleg Nesterov
On 03/12, Renzo Davoli wrote:
>
> I have update also the second patch. Please note that now this patch
> must be applied after the first one.
> This patch implements a consistent nesting model for utrace machines.
> (There is a full description in the messages I sent on Feb. 14 and Mar. 6)

This patch does 2 completely different things. I think you should
make separate patches.

Again, we need Roland's opinion, but could you explain why it would
be better to use _reverse in utrace_report_syscall_entry() ?

As for another change,

> --- linux-2.6.29-rc7-git5-utrace-p1/kernel/utrace.c   2009-03-12 
> 11:05:50.0 +0100
> +++ linux-2.6.29-rc7-git5-utrace-p2/kernel/utrace.c   2009-03-12 
> 13:37:27.0 +0100
> @@ -1405,6 +1405,7 @@
>  static bool finish_callback(struct utrace *utrace,
>   struct utrace_report *report,
>   struct utrace_engine *engine,
> + struct task_struct *task,
>   u32 ret)
>  {
>   enum utrace_resume_action action = utrace_resume_action(ret);
> @@ -1426,6 +1427,7 @@
>   spin_lock(&utrace->lock);
>   mark_engine_wants_stop(engine);
>   spin_unlock(&utrace->lock);
> + utrace_stop(task, utrace);

I don't think this is safe. If we do utrace_stop() here, the next engine
can be detached before we return (UTRACE_DETACH assumes it it safe to
unlink the engine when the target is stopped). This means we can't
continue list_for_each_entry(engine, &utrace->attached, entry) after
return from finish_callback().

Oleg.



Re: Q: utrace->stopped && utrace_report_jctl()

2009-03-12 Thread Oleg Nesterov
Roland, I left some parts of your message unanswered because I need to think
more about them...

On 03/12, Roland McGrath wrote:
>
> > But, if we enter utrace_report_jctl() with ->stopped == 1, JCTL can be
> > already removed from ->utrace_flags, exactly because ->stopped was true.
>
> I don't follow this.  JCTL is never "removed" from ->utrace_flags, except
> as all event bits are, by utrace_reset().

Yep. And utrace_reset() can be called because ->stopped == 1.

Let me explain. Again, let's suppose D attaches engine E to the target T.

T enters utrace_report_jctl() with ->stopped == 1.

D calls utrace_set_events(events => 0), this removes JCTL from E->flags.

D calls, say, utrace_control(UTRACE_RESUME). Since ->stopped == 1, this
calls utrace_reset() and removes JCTL from T->utrace_flags.

T takes utrace->lock, clears ->stopped, and drops the lock.

D does utrace_control(UTRACE_STOP). This calls utrace_do_stop() which
sees task_is_stopped() && !JCTL, so it sets ->stopped = true.

T calls REPORT() and start_report() hits the (correct) BUG_ON(stopped).

No?

> > This leads to another minor question, how it is possible to enter enter
> > utrace_report_jctl() with ->stopped == 1 ? I think the only possibility
> > it was previously set by another call to utrace_report_jctl(), see below.
>
> There are two ways to enter utrace_report_jctl with ->stopped set.
>
> 1. utrace_report_jctl was called when entering TASK_STOPPED, and set it then.
>Now utrace_report_jctl is called for the CLD_CONTINUED case, and
>->stopped remains set.

this is covered by my guess above,

> 2. utrace_set_events(target, UTRACE_EVENT(JCTL)) was called when target was
>already in TASK_STOPPED (and really stopped, or at least got past
>tracehook_notify_jctl before JCTL was set).  It sets ->stopped before
>adding JCTL to ->utrace_flags,

Yes, thanks. I missed this.

> > SIGNAL_STOP_STOPPED is not reliable, it is possible that the group stop in
> > progress and it is not finished yet.
>
> SIGNAL_STOP_STOPPED should be reliable, as far as it goes.  It will only be
> set if the group stop is complete.

Yes sure. I wasn't clear. I meant, what if SIGNAL_STOP_STOPPED is not set?
This doesn't mean we don't need  __set_current_state(TASK_STOPPED), it is
possible that the group-stop is in progress and ->group_stop_count != 0.

> > But! can't we miss utrace_wakeup() ?
>
> I think you've found something (though not quite the scenario you describe).
>
> > Let's suppose the debugger D attaches the single engine E to the target T.
> >
> > D does utrace_set_events(JCTL).
> >
> > T calls do_signal_stop(), tracehook_notify_jctl() sees JCTL and starts
> > utrace_report_jctl().
> >
> > D does utrace_set_events(events => 0), this clears E->flags.
> >
> > T calls REPORT(), nobody needs needs JCTL, finish_report() sees 
> > !->takers
> > and calls utrace_reset(). It sets ->utrace_flags = 0.
>
> Nope:
>
>   flags |= engine->flags | UTRACE_EVENT(REAP);

Ah, thanks. Can't understand how I didn't notice this, I checked the
code several times ;)

But as you pointed out,

> So, change your scenario to:
>
>   D does utrace_control(UTRACE_DETACH).
>
> and then this will happen.

Yes.

Oleg.



Q: utrace_reset() && UTRACE_EVENT(REAP)

2009-03-12 Thread Oleg Nesterov
On 03/12, Roland McGrath wrote:
>
> > T calls REPORT(), nobody needs needs JCTL, finish_report() sees 
> > !->takers
> > and calls utrace_reset(). It sets ->utrace_flags = 0.
>
> Nope:
>
>   flags |= engine->flags | UTRACE_EVENT(REAP);

Hmm. But this leads to another question: why does utrace_reset() set
UTRACE_EVENT(REAP) ?

This looks as: make sure ->utrace_flags is never 0 unless we detach
all engines. Perhaps because sometimes, say tracehook_notify_resume(),
we just check task_utrace_flags() != 0 ?

Imho, this needs a comment. Or I missed something obvious.


Oh. utrace_attach_task()->utrace_add_engine() sets ->report + TIF_NOTIFY_RESUME.
But tracehook_notify_resume() does nothing because ->utrace_flags == 0 ?
Confused.

Oleg.



Re: Q: utrace_reset() && UTRACE_EVENT(REAP)

2009-03-12 Thread Oleg Nesterov
I'm afraid I wasn't clear again,

On 03/12, Oleg Nesterov wrote:
>
> Oh. utrace_attach_task()->utrace_add_engine() sets ->report + 
> TIF_NOTIFY_RESUME.
> But tracehook_notify_resume() does nothing because ->utrace_flags == 0 ?
> Confused.

Perhaps this is not problem per se. But let's suppose we call, say,
utrace_control(UTRACE_STOP) later. utrace_do_stop() sees ->report == 1
and doesn't call set_notify_resume(). But TIF_NOTIFY_RESUME was already
cleared by do_notify_resume().

And again, utrace_control(UTRACE_STOP) does not set ->utrace_flags != 0
itself. But even if we called utrace_set_events(XXX) before, without
set_notify_resume() we have to wait for that XXX event, this doesn't
look right.

Oleg.



Re: Q: ->attaching && REPORT_CALLBACKS()

2009-03-12 Thread Oleg Nesterov
On 03/11, Roland McGrath wrote:
>
> > But not vise versa. I misunderstood the comment as if the new engine
> > should not be notified if it is attached by another task while target
> > is inside callback.
>
> That is indeed what happens in that case.  But that one is not a
> specific "should not", it's just what happens to be true given what we
> say about the "asynchronous" attach case in general.  That is, that an
> "asynchronous" attach + set_events makes no guarantees about how
> instantly you start to get event reports.

Yes, yes, I understand. In short: I greatly misinterpreted the comment.

> > Still can't understand... If (say) ->report_exec() attaches the new
> > engine to the same task and does utrace_set_events(EXEC), then it looks
> > logical the new engine gets the notification too. But OK, I agree, either
> > way is correct, and perhaps the current behaviour is more intuitive.
>
> As you can see in the cited thread, that's what I thought too.  Jim
> convinced me that the (new) current behavior is more useful.
> ...
> Jim's use seems fairly representative of situations where this might
> come up.  He's concerned with the EXEC event as the "old address space
> is gone, new one is here" event.  It's also the "my name changed"
> event that may be triggering a new tracing setup.

Aha, thanks!

> > But this means that "When target == current it would be safe just to call
> > splice_attaching() right here" part of the comment is not right, no?
> > Except for report_reap() target == current.
>
> It would be "safe", meaning it doesn't have race problems like the
> target != current case does for touching ->attached here.

Yes, I see. Again, I confused "safe" with "not what the user wants".

> > Yes, yes, I see. But I meant another case. Suppose that the debugger D
> > attaches to T and does
> >
> > engine = utrace_attach_task(T, ...);
> > utrace_set_events(T, engine, XXX);
> >
> > It is possible that ->report_xxx() is called before utrace_set_events()
> > completes. But afaics currently this is not a problem.
>
> As far as the API guarantees are concerned, there is no "completes".
> When you call utrace_set_events, it becomes possible your callbacks
> get made.

Yes sure.

Thanks!

Oleg.



Re: Q: utrace->stopped && utrace_report_jctl()

2009-03-12 Thread Roland McGrath
> Yep. And utrace_reset() can be called because ->stopped == 1.

Right.

> Let me explain. Again, let's suppose D attaches engine E to the target T.
> 
> T enters utrace_report_jctl() with ->stopped == 1.
> 
> D calls utrace_set_events(events => 0), this removes JCTL from E->flags.
> 
> D calls, say, utrace_control(UTRACE_RESUME). Since ->stopped == 1, this
> calls utrace_reset() and removes JCTL from T->utrace_flags.

Right.  In the utrace-indirect code this would have reset the utrace
pointer too.

> T takes utrace->lock, clears ->stopped, and drops the lock.

In the utrace-indirect code, this part would have been harmless even in the
race case where it happened (the more likely case being that task->utrace
was cleared already before utrace_report_jctl looked at it).  (That code
just had the dangling utrace pointer issue I noticed yesterday, at the end
of the function.)

But, yes, this is a problem.  I think this ought to cover it:

@@ -1659,6 +1659,16 @@ void utrace_report_jctl(int notify, int what)
 * longer considered stopped while we run callbacks.
 */
spin_lock(&utrace->lock);
+   /*
+* Now that we have the lock, check in case utrace_reset() has
+* just now cleared UTRACE_EVENT(JCTL) while it considered us
+* safely stopped.  In that case, we should not touch ->stopped
+* and have nothing else to do.
+*/
+   if (unlikely(!(task->utrace_flags & UTRACE_EVENT(JCTL {
+   spin_unlock(&utrace->lock);
+   return;
+   }
utrace->stopped = 0;
utrace->report = 0;
spin_unlock(&utrace->lock);

> > 2. utrace_set_events(target, UTRACE_EVENT(JCTL)) was called when target was
> >already in TASK_STOPPED (and really stopped, or at least got past
> >tracehook_notify_jctl before JCTL was set).  It sets ->stopped before
> >adding JCTL to ->utrace_flags,
> 
> Yes, thanks. I missed this.

I feel I should also point out the case where exit_signals() calls
tracehook_notify_jctl, because I just noticed it.  I don't think that path
existed the last time I thought seriously about the utrace_report_jctl
logic.  (This is not a #3 in that list, but in general is another path we
need to keep in mind here.)

> Yes sure. I wasn't clear. I meant, what if SIGNAL_STOP_STOPPED is not set?
> This doesn't mean we don't need  __set_current_state(TASK_STOPPED), it is
> possible that the group-stop is in progress and ->group_stop_count != 0.

Right.


Thanks,
Roland



Re: Q: utrace_reset() && UTRACE_EVENT(REAP)

2009-03-12 Thread Roland McGrath
> Hmm. But this leads to another question: why does utrace_reset() set
> UTRACE_EVENT(REAP) ?
> 
> This looks as: make sure ->utrace_flags is never 0 unless we detach
> all engines. Perhaps because sometimes, say tracehook_notify_resume(),
> we just check task_utrace_flags() != 0 ?

Right, it's an invariant that utrace_flags != 0 if there is any utrace
stuff to do.  It just fits logically too.  The utrace_flags bits mean "need
to call into utrace", so UTRACE_EVENT(REAP) means that we need to call
utrace_release_task.

> Imho, this needs a comment. Or I missed something obvious.

Sure, better comments are always good.  How's this?

@@ -899,6 +899,10 @@ static void utrace_reset(struct task_struct *task, struct 
utrace *utrace,
 * of the interests of the remaining tracing engines.
 * For any engine marked detached, remove it from the list.
 * We'll collect them on the detached list.
+*
+* Any engine that's not detached implies tracking the REAP event,
+* whether or not that engine wants a report_reap callback.  Any
+* engine requires attention from utrace_release_task().
 */
list_for_each_entry_safe(engine, next, &utrace->attached, entry) {
if (engine->ops == &utrace_detached_ops) {

> Oh. utrace_attach_task()->utrace_add_engine() sets ->report + 
> TIF_NOTIFY_RESUME.
> But tracehook_notify_resume() does nothing because ->utrace_flags == 0 ?

The logic (in the utrace_add_engine comment) is to have ->report just to
make sure splice_attaching() precedes the next reporting pass (start_report).
It doesn't actually care about TIF_NOTIFY_RESUME (i.e. how soon the report
happens), but just wants to keep the invariant that ->report matches
TIF_NOTIFY_RESUME.  But as you point out, this invariant will be violated
later if tracehook_notify_resume() sees ->utrace_flags == 0.

> Perhaps this is not problem per se. But let's suppose we call, say,
> utrace_control(UTRACE_STOP) later. utrace_do_stop() sees ->report == 1
> and doesn't call set_notify_resume(). But TIF_NOTIFY_RESUME was already
> cleared by do_notify_resume().

Right.

So I think we need this:

@@ -181,7 +181,13 @@ static int utrace_add_engine(struct task_struct *target,
 * also set.  Otherwise utrace_control() or utrace_do_stop()
 * might skip setting TIF_NOTIFY_RESUME upon seeing ->report
 * already set, and we'd miss a necessary callback.
+*
+* In case we had no engines before, make sure that
+* utrace_flags is not zero when tracehook_notify_resume()
+* checks.  That would bypass utrace reporting clearing
+* TIF_NOTIFY_RESUME, and thus violate the same invariant.
 */
+   target->utrace_flags |= UTRACE_EVENT(REAP);
list_add_tail(&engine->entry, &utrace->attaching);
utrace->report = 1;
set_notify_resume(target);

Does that need a barrier pair here and in tracehook_notify_resume()?


Thanks,
Roland



Re: [PATCH 2/2] UTRACE_STOP: nesting engine management (updated)

2009-03-12 Thread Renzo Davoli
> Again, we need Roland's opinion, but could you explain why it would
> be better to use _reverse in utrace_report_syscall_entry() ?

I refer to this posting:
http://www.mail-archive.com/utrace-devel@redhat.com/msg00579.html

Item #4 explains why it is *needed* to reverse the order in 
utrace_report_syscall_entry
to have a consistent implementation of nested virtualization.

> I don't think this is safe. If we do utrace_stop() here, the next engine
> can be detached before we return (UTRACE_DETACH assumes it it safe to
> unlink the engine when the target is stopped). This means we can't
> continue list_for_each_entry(engine, &utrace->attached, entry) after
> return from finish_callback().

Maybe this is not the best patch, maybe we can solve the problem in a
better way.
The point is explained in #3 in the same posting cited above.

When a report function of an engine returns UTRACE_STOP, it means (may mean)
that it wants to change the status of the process before resuming it.
VM monitors often change the status, sometimes debugger users want to set
some variables too.

IMHO, utrace should stop it *before* calling the report function of the 
next engine, otherwise we need to set up another structure to synchronize
the engines (that may even be unknown one to the other).
If there is a tracer/debugger among the engines, it is not even possible to know
which snapshot it gets, after or before the modification created by the VM
monitor?

With these patches it is possible to run nested virtual machines based
on utrace, it is also possbile to strace (use ptrace) on processes running
inside a VM.

renzo