Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-09-10 Thread Roland McGrath
 I misunderstood the UTRACE_INTERRUPT without UTRACE_EVENT(QUIESCE) above
 as if you mean it is possible to get UTRACE_SIGNAL_REPORT without QUIESCE.

No, I meant the opposite: since you won't get UTRACE_SIGNAL_REPORT without
UTRACE_EVENT(QUIESCE), it might seem that UTRACE_INTERRUPT should not be
permitted, as UTRACE_REPORT is not.  But because that is not it's *only*
effect, we don't do permit it even though you won't get UTRACE_SIGNAL_REPORT.

 OK. This means ugdb should set QUIESCE, just to ensure its -report_signal()
 will be called.

If you ever want UTRACE_SIGNAL_REPORT calls, yes.

 OK, please forget. I must admit, this still looks a bit, well, unnatural
 to me. May be it makes sense to add
 
   _UTRACE_EVENT_SIGNAL_REPORT,
   _UTRACE_EVENT_SIGNAL_HANDLER,
 
 into utrace_events. Then ugdb could ask for UTRACE_SIGNAL_REPORT and
 avoid the unnecessary -report_quiesce() calls.

If anything, I think UTRACE_EVENT(RESUME) would make sense, perhaps with a
separate -report_resume callback.  That would request only the cases that
now get report_quiesce(0).  But, it really does not cost much to have 

if (event) return UTRACE_RESUME;

or similar in your report_quiesce callback.  It's beyond refinement of the
utrace API (that we're not doing now), it's just micro-optimization.

 Roland, could you also comment this patch?
 
   https://www.redhat.com/archives/utrace-devel/2010-August/msg00108.html
 
 Again, this looks like a bug to me, but I won't insist if it is not.

Sorry, that is still on my queue.  I haven't forgotten it.  
I just haven't gotten to reviewing it yet because of other
work and it needing a lot of hard thinking.


Thanks,
Roland



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-09-06 Thread Oleg Nesterov
On 09/03, Roland McGrath wrote:

   So, it is technically kosher enough to use UTRACE_INTERRUPT without
   UTRACE_EVENT(QUIESCE) set, if you really know the situation and are sure
   about all those caveats above.
 
  How? see below.

 I mentioned the examples.

I misunderstood the UTRACE_INTERRUPT without UTRACE_EVENT(QUIESCE) above
as if you mean it is possible to get UTRACE_SIGNAL_REPORT without QUIESCE.

   But, it's only UTRACE_EVENT(QUIESCE) that
   gives you any expectation of an extra callback for no event from 
   either
   UTRACE_REPORT or UTRACE_INTERRUPT.
 
  Yes, this is clear too.

 Apparently not. ;-)

Perhaps ;) At least I certainly missed your point.

 QUIESCE is the only event type for no event.  If you only asked for
 signal events, then you only get a callback when there is an actual signal
 event.  If you want an extra callback before dequeuing any signal, then
 that is what QUIESCE is for.

OK. This means ugdb should set QUIESCE, just to ensure its -report_signal()
will be called.

  Once again, I am not asking to change utrace now, but could you explain
  what is wrong with the patch below?

 That gives an extra unrequested report_signal callback to every engine that
 is only interested in some signal event.  Consider a crash-catcher
 engine.  It would only track UTRACE_EVENT(SIGNAL_CORE) (and perhaps CLONE
 for its bookkeeping).  This engine never asked for a callback before each
 and every attempt to dequeue any signal,

I see your point (I hope).

 which is what you would give it
 with your change.

No ;) My change was wrong. event == 0 in this case, and then later
utrace_get_signal() checks

if ((want  (event | UTRACE_EVENT(QUIESCE))) == 0)
continue;



OK, please forget. I must admit, this still looks a bit, well, unnatural
to me. May be it makes sense to add

_UTRACE_EVENT_SIGNAL_REPORT,
_UTRACE_EVENT_SIGNAL_HANDLER,

into utrace_events. Then ugdb could ask for UTRACE_SIGNAL_REPORT and
avoid the unnecessary -report_quiesce() calls.

But at least I can see the logic now, thanks.



Roland, could you also comment this patch?

https://www.redhat.com/archives/utrace-devel/2010-August/msg00108.html

Again, this looks like a bug to me, but I won't insist if it is not.

Oleg.



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-09-03 Thread Roland McGrath
 Hmm. I am not sure I understand. If -report_signal != NULL, then
 you can't expect -report_quiesce() after utrace_control(INTERRUPT) ?

Sorry, I used report_quiesce as shorthand for either report_quiesce, or
report_signal with UTRACE_SIGNAL_REPORT.

  So, it is technically kosher enough to use UTRACE_INTERRUPT without
  UTRACE_EVENT(QUIESCE) set, if you really know the situation and are sure
  about all those caveats above.
 
 How? see below.

I mentioned the examples.

  But, it's only UTRACE_EVENT(QUIESCE) that
  gives you any expectation of an extra callback for no event from either
  UTRACE_REPORT or UTRACE_INTERRUPT.
 
 Yes, this is clear too.

Apparently not. ;-)

 So. Now I am even more confused. First of all, ugdb (at least currently)
 does not need report_quiesce() and UTRACE_EVENT(QUIESCE) at all. OK, this
 is not 100% true due to multitracing/etc, lets ignore this. Anyway it
 makes sense to clear QUIESCE after $c to avoid the unnecessary callbacks.
 
 All it needs is -report_signal(). But, the problem is, it is not called
 after utrace_control(INTERRUPT) ! You need QUIESCE to ensure report_signal()
 will be called, and this looks at least strange to me.

QUIESCE is the only event type for no event.  If you only asked for
signal events, then you only get a callback when there is an actual signal
event.  If you want an extra callback before dequeuing any signal, then
that is what QUIESCE is for.

 Once again, I am not asking to change utrace now, but could you explain
 what is wrong with the patch below?

That gives an extra unrequested report_signal callback to every engine that
is only interested in some signal event.  Consider a crash-catcher
engine.  It would only track UTRACE_EVENT(SIGNAL_CORE) (and perhaps CLONE
for its bookkeeping).  This engine never asked for a callback before each
and every attempt to dequeue any signal, which is what you would give it
with your change.


Thanks,
Roland



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-09-02 Thread Roland McGrath
 But probably we need utrace_detached_ops-report_reap() anyway.
 -report_death can return UTRACE_DETACH, and after that utrace_report_death()
 can skip utrace_reset() and do -report_reap().

Yeah, that's a good point.  There's no good reason to do a special case
check for detached_ops there rather than just having the no-op report_reap
hook.

  If an engine used UTRACE_INTERRUPT without having first set QUIESCE,
  then it has no rightful expectation of getting any callback.
 
 Hmm. This is certainly new to me. Could you confirm?

Well, I didn't say it precisely correctly.  UTRACE_INTERRUPT serves two
purposes.  First, it just interrupts things like a signal would.  You could
use that on its own without even tracing any events at all, just do achieve
fault injection or similar.  Second, it's like UTRACE_REPORT.  

If you do have some other event bits set, then you can expect/rely on
getting those normal events soon if they would otherwise have
happened--i.e., if you know it's blocked in a syscall, and you have
UTRACE_EVENT(SYSCALL_EXIT) set, then you can expect to get a
report_syscall_exit soon.  (But, it's always possible to race with the
syscall finishing on its own, or being interrupted by an outside signal, so
that exit might have come before your utrace_control call if you were not
otherwise synchronizing with your established report_syscall_exit callback.)

As with UTRACE_REPORT, after UTRACE_INTERRUPT you can expect to get some
report_quiesce callback even if there is no other event you were tracing
that happens.  For this to actually happen, you need to have
UTRACE_EVENT(QUIESCE) set.  

So, it is technically kosher enough to use UTRACE_INTERRUPT without
UTRACE_EVENT(QUIESCE) set, if you really know the situation and are sure
about all those caveats above.  But, it's only UTRACE_EVENT(QUIESCE) that
gives you any expectation of an extra callback for no event from either
UTRACE_REPORT or UTRACE_INTERRUPT.  (And since that is the sole possible
purpose of UTRACE_REPORT, we diagnose a utrace_control call like that.)


Thanks,
Roland



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-09-02 Thread Oleg Nesterov
I was going to ping you ;) This is connected to the problem I hit today.
Despite the fact I already complained about utrace_get_signal()  QUIESCE,
I forgot about this and figured out it doesn't work in practice.

On 09/02, Roland McGrath wrote:

   If an engine used UTRACE_INTERRUPT without having first set QUIESCE,
   then it has no rightful expectation of getting any callback.
 
  Hmm. This is certainly new to me. Could you confirm?

 Well, I didn't say it precisely correctly.  UTRACE_INTERRUPT serves two
 purposes.  First, it just interrupts things like a signal would.  You could
 use that on its own without even tracing any events at all, just do achieve
 fault injection or similar.  Second, it's like UTRACE_REPORT.

Yes, this is clear.

 As with UTRACE_REPORT, after UTRACE_INTERRUPT you can expect to get some
 report_quiesce callback even if there is no other event you were tracing
 that happens.  For this to actually happen, you need to have
 UTRACE_EVENT(QUIESCE) set.

Hmm. I am not sure I understand. If -report_signal != NULL, then
you can't expect -report_quiesce() after utrace_control(INTERRUPT) ?

 So, it is technically kosher enough to use UTRACE_INTERRUPT without
 UTRACE_EVENT(QUIESCE) set, if you really know the situation and are sure
 about all those caveats above.

How? see below.

 But, it's only UTRACE_EVENT(QUIESCE) that
 gives you any expectation of an extra callback for no event from either
 UTRACE_REPORT or UTRACE_INTERRUPT.

Yes, this is clear too.


So. Now I am even more confused. First of all, ugdb (at least currently)
does not need report_quiesce() and UTRACE_EVENT(QUIESCE) at all. OK, this
is not 100% true due to multitracing/etc, lets ignore this. Anyway it
makes sense to clear QUIESCE after $c to avoid the unnecessary callbacks.

All it needs is -report_signal(). But, the problem is, it is not called
after utrace_control(INTERRUPT) ! You need QUIESCE to ensure report_signal()
will be called, and this looks at least strange to me.

Once again, I am not asking to change utrace now, but could you explain
what is wrong with the patch below?

Oleg.

--- x/kernel/utrace.c
+++ x/kernel/utrace.c
@@ -2029,7 +2029,8 @@ int utrace_get_signal(struct task_struct
report.spurious = false;
finish_resume_report(task, utrace, report);
return -1;
-   } else if (!(task-utrace_flags  UTRACE_EVENT(QUIESCE))) {
+   } else if (!(task-utrace_flags 
+   (UTRACE_EVENT(QUIESCE) | 
UTRACE_EVENT_SIGNAL_ALL))) {
/*
 * We only got here to clear utrace-signal_handler.
 */



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-08-26 Thread Oleg Nesterov
On 08/19, Roland McGrath wrote:

  OK, instead of filling utrace_detached_ops{} we can change start_callback()
 
  - if (want  UTRACE_EVENT(QUIESCE)) {
  + if ((want  UTRACE_EVENT(QUIESCE)) || ops == detached_ops) {

 If we're going to have a special-case check for utrace_detached_ops, then
 it can just short-circuit entirely and there is no need for any callbacks
 in utrace_detached_ops.  (Then we might even use NULL or (void *)-1l
 instead of utrace_detached_ops.)  All it needs to do is:

   report-detaches = true;
   utrace-reporting = NULL;
   return NULL;

Yes, I thought about this too, see the changelog.

But probably we need utrace_detached_ops-report_reap() anyway.
-report_death can return UTRACE_DETACH, and after that utrace_report_death()
can skip utrace_reset() and do -report_reap().

  And, in particular, I don't understand this code in utrace_get_signal:
 
  } else if (!(task-utrace_flags  UTRACE_EVENT(QUIESCE))) {
  /*
   * We only got here to clear utrace-signal_handler.
   */
  return -1;
  }
 
  How so? What if we were called because of utrace_control(INTERRUPT)
  and QUIESCE is not set?

 If an engine used UTRACE_INTERRUPT without having first set QUIESCE,
 then it has no rightful expectation of getting any callback.

Hmm. This is certainly new to me. Could you confirm?

If yes, shouldn't we change utrace_control()

- if (action = UTRACE_REPORT  action  UTRACE_RESUME 
+ if (action = UTRACE_INTERRUPT  action  UTRACE_RESUME 
  unlikely(!(engine-flags  UTRACE_EVENT(QUIESCE {

then?

I must admit, I do not understand why INTERRUPT needs QUIESCE.
utrace_get_signal() should respect QUIESCE, this is clear. But why
it is needed?

Oleg.



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-08-17 Thread Oleg Nesterov
On 08/17, Oleg Nesterov wrote:

 Suppose that, say, engine-flags == UTRACE_EVENT(EXEC) (no QUIESCE).

   1. start_callback() reads want = engine-flags (== EXEC)

   2. mark_engine_detached() sets engine-ops = utrace_detached_ops

   3. start_callback() gets ops = utrace_detached_ops

 After that start_callback() skips if (want  UTRACE_EVENT(QUIESCE)) block
 and returns utrace_detached_ops, utrace_detached_ops-report_exec == NULL
 will be called.

OK, instead of filling utrace_detached_ops{} we can change start_callback()

- if (want  UTRACE_EVENT(QUIESCE)) {
+ if ((want  UTRACE_EVENT(QUIESCE)) || ops == detached_ops) {


 But this is minor. Roland, I can't explain this, but I have the strong
 gut feeling there is something else we should worry about. I am still
 reading this code...

Or not... I'll recheck once again tomorrow and try to make the patch.

In particular, I'd like to think if we can simplify this somehow.
Say, avoid the barriers somewhere, or avoid changing engine-flags in
mark_engine_detached().

I am worried that mark_engine_detached() sets engine-flags, but it
doesn't add QUIESCE to utrace_flags. I can't convince myself that
utrace_do_stop() closes all possible races.


And, in particular, I don't understand this code in utrace_get_signal:

} else if (!(task-utrace_flags  UTRACE_EVENT(QUIESCE))) {
/*
 * We only got here to clear utrace-signal_handler.
 */
return -1;
}

How so? What if we were called because of utrace_control(INTERRUPT)
and QUIESCE is not set?

Oleg.



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-08-16 Thread Oleg Nesterov
On 08/16, Oleg Nesterov wrote:

 Well, I tried to test these changes, seems to work.

 But... From 2/3 changelog:

   This means that
   mark_engine_detached() can race with REPORT_CALLBACKS() but there
   is nothing new.

 Yes, there is nothing new. But, Roland, this is WRONG!

 With or without these fixes utrace_control()-mark_engine_detached()
 was always wrong if it races with REPORT/REPORT_CALLBACKS. Not sure
 what we were thinking about :/

 Look. Suppose that utrace_control(DETACH) is called right after
 start_callback() returns ops != NULL. After that
 finish_callback(...,  ops-callback(...)) can hit -callback == NULL!

Cough. I guess I spent too much time with utrace and testing today ;)

Let me try again. mark_engine_detached() does

engine-ops = utrace_detached_ops;
smp_wmb();
engine-flags = UTRACE_EVENT(QUIESCE);

Whatever we do, start_callback() can see the old engine-flags but
the new -ops = utrace_detached_ops. Just suppose that the caller
of UTRACE_DETACH is interrupted right after setting engine-ops.

 The obvious and simplest solution: utrace_detached_ops{} should
 have the dummy handler for every callback. But I'd like to think
 a bit more.

Yes. Perhaps (unlikely) we can reverse the order, but I can no longer
think properly today.

Or do you think I miss something and this is false alarm?

Oleg.



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-08-16 Thread Roland McGrath
 Whatever we do, start_callback() can see the old engine-flags but
 the new -ops = utrace_detached_ops. Just suppose that the caller
 of UTRACE_DETACH is interrupted right after setting engine-ops.

 * it can check the old flags before using the old ops, or check the old
 * flags before using the new ops, or check the new flags before using the
 * new ops, but can never check the new flags before using the old ops.

 Or do you think I miss something and this is false alarm?

 * Hence, utrace_detached_ops might be used with any old flags in place.
 * It has report_quiesce() and report_reap() callbacks to handle all cases.

report_reap is covered with the utrace_detached_reap() stub.  Every other
callback uses start_callback(), i.e. calls report_quiesce first.
utrace_detached_quiesce() returns UTRACE_DETACH, so start_callback() will
return NULL.

 [...] but I can no longer think properly today.

Sleep well. :-)


Thanks,
Roland