> Yeah. I think one concern was that when a breakpoint is hit, a utrace
> client other than uprobes might see the SIGTRAP first and take some
> unwarranted action because it thinks that the task is going to die.

There are two components of this issue. The underlying issue is that
tracing activities induce signals that would not ordinarily be raised
in the program. Inserted breakpoints do this; so does single-step; so
do hw breakpoint features; etc.

The first component is the problem of interaction with signals
generally. This is already a problem in vanilla ptrace. Induced
signals interact with the normal functioning of the system, not just
with competing tracers.

A straightforward example is when the debugger suddenly dies. If it
had been using single-step or suchlike, there may be a pending SIGTRAP
that was posted before the debugger exited and detached, but had not
yet been processed by the debugger. This is easy to achieve with a
wedged debugger, for example: it hasn't called wait4 to eat the signal
yet, and then it gets killed and never does. Now the thread that was
formerly traced resumes running, the SIGTRAP is delivered, and the
thread dumps core.

Some kernel behavior tweaks to avoid that particular scenario are easy
to come up with. But that is just the simplest example of the issue.
More subtle bad effects range from perturbing the signal queue
resource accounting to interfering with the program's own use of
blocked and pending signals. (Today it's effectively impossible to do
some normal debugging activities with a program that itself catches
and generates SIGTRAP.)

In summary, the first component is distinguishing tracing events from
real signals. The second component is engine interaction per se.

My view is that the crux of the former problem is the whole notion of
overloading the signals mechanism for reporting tracing-induced
events. Signals have one essential characteristic we need: they can be
queued/posted quickly and safely at interrupt level, to take effect
only at the safe, unentangled place just before returning to user
mode. My thinking is to have a mechanism with this property separate
from signals. The "extension events" item on the TODO list is such a
mechanism.

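To make the "post quickly at interrupt level, act only at the safe
point" property concrete, here is a minimal user-space sketch in C11.
It is only an illustration under stated assumptions, not the design of
the extension events feature: the names trace_event, struct task,
post_event and take_events are all hypothetical, and a one-bit-per-kind
mask is just the simplest representation that needs no locking or
allocation when posting.

```c
#include <stdatomic.h>

/* Hypothetical event kinds; not part of any real utrace interface. */
enum trace_event { EV_SINGLE_STEP = 0, EV_BREAKPOINT_INSN = 1 };

/* Hypothetical per-thread state: one pending bit per event kind. */
struct task {
    atomic_uint pending_events;
};

/*
 * Post an event. This is a single atomic OR with no locks and no
 * allocation, which is the property that makes it usable from an
 * interrupt-like context. Posting the same kind twice before it is
 * taken collapses into one pending bit.
 */
static void post_event(struct task *t, enum trace_event ev)
{
    atomic_fetch_or(&t->pending_events, 1u << ev);
}

/*
 * Take all pending events. Meant to be called only at the safe point
 * just before returning to user mode. The exchange atomically fetches
 * and clears the mask, so an event posted concurrently is either
 * returned now or left pending for the next call, never lost.
 */
static unsigned take_events(struct task *t)
{
    return atomic_exchange(&t->pending_events, 0u);
}
```

Nothing here touches the signal queue, so the program's own blocked
and pending signals, and the queue accounting, are left alone.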
That feature idea is intended to serve a variety of needs. Among them
is the idea to use these for machine-level events that are being
traced. So rather than catching signals, an engine inducing traps
would register its interest in the "single-stepped" event or the
"breakpoint insn" event.

In cases like the trap for a breakpoint instruction, there is nothing
at the lowest level to distinguish a tracing-induced breakpoint from a
user's other use of those instructions (intended to generate a
SIGTRAP). So this event would be reported to interested tracing
engines, and if no engine consumes it, then it can become a signal.

I think the interface can be devised such that a detaching engine
always gets a callback for any last events, so it's harder to write an
engine that forgets to swallow the last event it induced and leaves
someone to die by SIGTRAP.

The mechanism to support that is a little way off. But this is how I
see that going.

In that context, engine interaction seems a lot simpler. Engines that
induce low-level events have to sort those out among each other
(probably easy from PC values and such). Engines interested in fatal
signals only ever see "natural" signals. Those engines have to work
out among themselves who eats the signal last. The report_resume
interface and simple priorities for callback order seem like they
might suffice.

Thanks,
Roland
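
P.S. A rough sketch of the "report to interested engines, else it
becomes a signal" idea, in plain C. Everything here is an assumption
made for illustration: struct engine, its on_event callback, the
bp_addr field, and dispatch_event are hypothetical names, not an
existing utrace interface. Using a numeric priority to order callbacks
for low-level events is likewise an assumption borrowed from the
callback-order remark above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical machine-level event kinds. */
enum machine_event { MEV_SINGLE_STEP, MEV_BREAKPOINT_INSN };

enum disposition { EVENT_CONSUMED, EVENT_BECOMES_SIGNAL };

/* Hypothetical engine: a priority, a callback, and one breakpoint. */
struct engine {
    int priority;              /* assumption: higher is called first */
    bool (*on_event)(struct engine *self, enum machine_event ev,
                     unsigned long pc);
    unsigned long bp_addr;     /* address of the breakpoint it inserted */
};

/* qsort comparator: descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct engine *ea = *(struct engine *const *)a;
    const struct engine *eb = *(struct engine *const *)b;
    return (eb->priority > ea->priority) - (eb->priority < ea->priority);
}

/*
 * Offer the event to each engine in priority order. The first engine
 * that claims it consumes it; if nobody does, the caller should turn
 * it into an ordinary signal (e.g. SIGTRAP).
 */
static enum disposition dispatch_event(struct engine **engines, size_t n,
                                       enum machine_event ev,
                                       unsigned long pc)
{
    qsort(engines, n, sizeof *engines, by_priority_desc);
    for (size_t i = 0; i < n; i++)
        if (engines[i]->on_event(engines[i], ev, pc))
            return EVENT_CONSUMED;
    return EVENT_BECOMES_SIGNAL;
}

/* Example callback: claim only a breakpoint trap at our own address. */
static bool claim_own_breakpoint(struct engine *self,
                                 enum machine_event ev, unsigned long pc)
{
    return ev == MEV_BREAKPOINT_INSN && pc == self->bp_addr;
}
```

With two engines holding breakpoints at different addresses, a trap at
either address is consumed by its owner, and a trap at any other
address falls through as EVENT_BECOMES_SIGNAL, which is the case of a
program using the breakpoint instruction for its own purposes.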