On Mon, 2009-08-10 at 12:42 -0400, Frank Ch. Eigler wrote:
> Hi -
>
> On Mon, Aug 03, 2009 at 11:10:00AM -0700, Jim Keniston wrote:
> > [...]
> > > So as per my analysis, gdb_utrace_report_signal was called, followed by
> > > uprobe_report_signal. Since gdb_utrace_report_signal requested for
> > > UTRACE_STOP as resume action for SIGTRAP, the thread got stopped.
> > > [...]
>
> A variant of this problem still exists, despite switching to a
> deferred uprobes-unregister operation (now done at gdb-disconnect
> time), but even then sometimes the process is stopped mid-singlestep.
> It seems to me that a more robust solution for
> registering/unregistering these probes is necessary - one that
> tolerates the threads being in stopped or running or whatever state.

Srikar and I discussed this some. He may have some new ideas to try.
Here's one idea that we didn't discuss. (Srikar may be the only one who
can follow this....)

Until recently, uprobes couldn't remove a probepoint while any
probepoint hits were being processed. uproc->rwsem was read-locked
basically from the time uprobe_report_signal() was called due to the
breakpoint trap until after it had fixed up the IP and/or stack after
the single-step trap -- hence PR #9828. So the first thing
uprobe_report_quiesce() does is check to see if we're in the middle of
single-stepping, and re-assert UTRACE_SINGLESTEP (and skip any other
quiesce-time work) if so. At the end of uprobe_report_signal(), the
call to utask_fake_quiesce() handles any delayed quiesce-time work.

But Srikar fixed up uprobes to drop uproc->rwsem while single-stepping,
and to handle the case where the active probepoint gets unregistered at
that point. See, e.g., adjust_ip_active_ppt().

So it seems to me that uprobe_report_quiesce() should check
utask->quiescing before checking (utask->state == UPTASK_SSTEP). If
utask->quiescing is set, go ahead and do the quiesce-time work. If
utask->active_probe gets deleted by the quiesce-time work, then you may
have to do some cleanup associated with that probepoint hit that you'd
normally do after single-stepping (uprobe_run_def_regs(),
uprobe_inject_delayed_signals()). On the other hand, if
utask->active_probe is still around after the quiesce-time work, then
go ahead and assert UTRACE_SINGLESTEP as before.

All this depends on uprobe_report_quiesce() actually getting called as
a result of gdbstub's report_signal callback asserting UTRACE_STOP.
Does it?

Also, is uprobe_delay_signal() getting called much during these hangs?
See PR #9826.

>
> Another problem: under some conditions, it seems we can have a race
> between uprobes and gdbstub w.r.t. the handling of the SSOL
> instructions.
> Sometimes by the time the gdbstub's uprobe-handler code
> is called, and some time later, gdb starts asking for register
> contents, the user-regset PC has been updated to point at the SSOL
> memory page. GDB can't make heads or tails of that. I don't know if
> this is a symptom of a deeper utrace issue.

Uprobes knows which breakpoints it inserted. It refuses to register a
probepoint where there's already a non-uprobes breakpoint, and it
ignores breakpoint traps that aren't at uprobes probepoints.

Can gdb/gdbstub be trained not to try to single-step at uprobes
probepoints? If they don't keep track of their breakpoints, perhaps
they could ask uprobes: uprobes could export a wrapper for
uprobe_find_probepoint() to answer this question.

>
> I pushed my current version of utrace-gdbstub-uprobes for your joint
> pain-sharing.
>
> - FChE

Jim
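
P.S. To make the proposed ordering in uprobe_report_quiesce() concrete,
here is a rough user-space model of the control flow only. It is a
sketch, not the real uprobes code: every model_* type, field, and
function below is a hypothetical stand-in, and the choice to return a
plain "resume" in the non-single-step cases is my assumption. The real
callback would use the actual utrace resume actions and locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real utrace/uprobes definitions. */
enum model_action { MODEL_RESUME, MODEL_SINGLESTEP };
enum model_state  { MODEL_RUNNING, MODEL_SSTEP };

struct model_probept { int id; };

struct model_utask {
    enum model_state state;             /* models utask->state */
    bool quiescing;                     /* models utask->quiescing */
    struct model_probept *active_probe; /* models utask->active_probe */
    bool unregister_active;  /* assumed: pending work removes active probe */
    int cleanups;            /* counts early post-hit cleanups (model only) */
};

/* Models the quiesce-time work, which may unregister the active probepoint. */
static void model_quiesce_work(struct model_utask *t)
{
    if (t->unregister_active)
        t->active_probe = NULL;
    t->unregister_active = false;
    t->quiescing = false;
}

/*
 * Models the cleanup normally done after the single-step trap
 * (the role played by uprobe_run_def_regs() and
 * uprobe_inject_delayed_signals() in the text above).
 */
static void model_post_hit_cleanup(struct model_utask *t)
{
    t->cleanups++;
    t->state = MODEL_RUNNING;
}

/* Proposed order: check quiescing first, then the single-step state. */
enum model_action model_report_quiesce(struct model_utask *t)
{
    bool stepping = (t->state == MODEL_SSTEP);

    /* In this model, a single-stepping task always has an active probe. */
    assert(!stepping || t->active_probe != NULL);

    if (t->quiescing)
        model_quiesce_work(t);

    if (!stepping)
        return MODEL_RESUME;        /* assumption: nothing else to do */

    if (t->active_probe == NULL) {
        /* The probepoint went away mid-step: finish the hit now. */
        model_post_hit_cleanup(t);
        return MODEL_RESUME;        /* assumption: no further stepping */
    }

    /* The probepoint survived: keep single-stepping as before. */
    return MODEL_SINGLESTEP;
}
```

The point of the model is just that the quiesce-time work is no longer
skipped while single-stepping; whether the task keeps stepping is
decided afterward, based on whether the active probepoint survived.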