Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Thu, 2010-01-14 at 14:49 -0800, Jim Keniston wrote: On Thu, 2010-01-14 at 12:09 +0100, Peter Zijlstra wrote: On Mon, 2010-01-11 at 17:55 +0530, Srikar Dronamraju wrote: Uprobes Infrastructure enables user to dynamically establish probepoints in user applications and collect information by executing a handler functions when the probepoints are hit. Please refer Documentation/uprobes.txt for more details. This patch provides the core implementation of uprobes. This patch builds on utrace infrastructure. You need to follow this up with the uprobes patch for your architecture. So all this is basically some glue between what you call ubp (the real userspace breakpoint stuff) and utrace? Or does it do more? My reply in http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/02483.html addresses this. Right, so all that need be done is add the multiple probe stuff to UBP and its a sane interface to use on its own, at which point I'd be inclined to call that uprobes (UBP really is an crap name). Then we can ditch the whole utrace muck as I see no reason to want to use that, whereas the ubp (given a sane name) looks interesting.
Re: [RFC] [PATCH 4/7] Uprobes Implementation
Peter Zijlstra pet...@infradead.org writes: [...] Right, so all that need be done is add the multiple probe stuff to UBP and its a sane interface to use on its own, at which point I'd be inclined to call that uprobes (UBP really is an crap name). At one point ubp+uprobes were one piece. They were separated on the suspicion that lkml would like them that way. Then we can ditch the whole utrace muck as I see no reason to want to use that, whereas the ubp (given a sane name) looks interesting. Assuming you meant what you write, perhaps you misunderstand the layering relationship of these pieces. utrace underlies uprobes and other process manipulation functionality, present and future. - FChE
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 04:26 -0500, Frank Ch. Eigler wrote: Peter Zijlstra pet...@infradead.org writes: [...] Right, so all that need be done is add the multiple probe stuff to UBP and its a sane interface to use on its own, at which point I'd be inclined to call that uprobes (UBP really is an crap name). At one point ubp+uprobes were one piece. They were separated on the suspicion that lkml would like them that way. Right, good thinking, that way we can use ubp without having to use utrace ;-) Then we can ditch the whole utrace muck as I see no reason to want to use that, whereas the ubp (given a sane name) looks interesting. Assuming you meant what you write, perhaps you misunderstand the layering relationship of these pieces. utrace underlies uprobes and other process manipulation functionality, present and future. Why, utrace doesn't at all look to bring a fundamental contribution to all that. If there's a proper kernel interface to install probes on userspace code (ubp seems to be mostly that) then I can use perf/ftrace to do the rest of the state management, no need to use utrace there. You can hardly force me to use utrace there, can you?
Re: [RFC] [PATCH 4/7] Uprobes Implementation
Hi Peter, My reply in http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/02483.html addresses this. Right, so all that need be done is add the multiple probe stuff to UBP and its a sane interface to use on its own, at which point I'd be inclined to call that uprobes (UBP really is an crap name). I am fine with renaming ubp to a suggested name. The reason for splitting uprobes into two layers was to allow others (currently none) to reuse the current ubp layer. It was felt that there could be multiple clients for ubp who could co-exist. However, ubp alone is not enough to provide userspace tracing. Currently it wouldn't understand synchronization between different threads of a process, process lifetime issues, or the context in which the handler has to be run. As pointed out by Jim earlier, we have segregated the layer which takes care of the above issues into the uprobes layer. For example, while inserting a breakpoint, one of the threads of a process could be running at the same place where we are trying to place a breakpoint. Or there could be two threads that could be racing to insert/delete a breakpoint. These synchronization issues are all handled by the Uprobes layer. Uprobes layer would need to be notified of process life-time events like fork/clone/exec/exit. It also needs to know - when a breakpoint is hit - stop and resume a thread. The Uprobes layer uses utrace to be notified of the process lifetime events and for the signal handling part. -- Thanks and Regards Srikar Then we can ditch the whole utrace muck as I see no reason to want to use that, whereas the ubp (given a sane name) looks interesting.
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 15:56 +0530, Srikar Dronamraju wrote: Hi Peter, Or there could be two threads that could be racing to insert/delete a breakpoint. These synchronization issues are all handled by the Uprobes layer. Shouldn't be hard to put that in the ubp layer, right? Uprobes layer would need to be notified of process life-time events like fork/clone/exec/exit. Not so much the process lifetimes as the vma life times are interesting, placing a hook in the vm code to track that isn't too hard. It also needs to know - when a breakpoint is hit - stop and resume a thread. A simple hook in the trap code is done quickly enough, and no reason to stop the thread, it's not going anywhere when it traps.
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, Jan 15, 2010 at 11:33:27AM +0100, Peter Zijlstra wrote: On Fri, 2010-01-15 at 15:56 +0530, Srikar Dronamraju wrote: Hi Peter, Or there could be two threads that could be racing to insert/delete a breakpoint. These synchronization issues are all handled by the Uprobes layer. Shouldn't be hard to put that in the ubp layer, right? Uprobes layer would need to be notified of process life-time events like fork/clone/exec/exit. No so much the process lifetimes as the vma life times are interesting, placing a hook in the vm code to track that isn't too hard, I think similar hooks were given thumbs down in the previous incarnation of uprobes (which was implemented without utrace). http://lkml.indiana.edu/hypermail/linux/kernel/0603.2/1254.html Thanks Maneesh -- Maneesh Soni Linux Technology Center IBM India Systems and Technology Lab, Bangalore, India.
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 16:35 +0530, Maneesh Soni wrote: On Fri, Jan 15, 2010 at 11:33:27AM +0100, Peter Zijlstra wrote: On Fri, 2010-01-15 at 15:56 +0530, Srikar Dronamraju wrote: Hi Peter, Or there could be two threads that could be racing to insert/delete a breakpoint. These synchronization issues are all handled by the Uprobes layer. Shouldn't be hard to put that in the ubp layer, right? Uprobes layer would need to be notified of process life-time events like fork/clone/exec/exit. No so much the process lifetimes as the vma life times are interesting, placing a hook in the vm code to track that isn't too hard, I think similar hooks were given thumbs down in the previous incarnation of uprobes (which was implemented without utrace). http://lkml.indiana.edu/hypermail/linux/kernel/0603.2/1254.html I wasn't at all proposing to mess with a_ops, nor do you really need to, I was more thinking of adding a callback like perf_event_mmap() and a corresponding unmap(), that way you can track mapping life-times and add/remove probes accordingly. Adding the probe uses the fact that (most) executable mappings are MAP_PRIVATE and CoWs a private copy of the page with the modified ins, right? What does it do for MAP_SHARED|MAP_EXECUTABLE sections -- simply fail to add the probe?
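The mapping-lifetime callbacks suggested above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `probe_model`, `probe_set_model`, `probes_on_mmap()` and `probes_on_munmap()` are hypothetical names, not kernel interfaces, and the real hooks would sit next to something like perf_event_mmap() in the vm code. The idea shown is that a probe is described by its file offset, and is installed or removed as mappings covering that offset come and go.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MAX_PROBES 8

struct probe_model {
	unsigned long file_offset; /* offset of the probed instruction in the file */
	unsigned long vaddr;       /* address in the current mapping, 0 if none */
	int installed;
};

struct probe_set_model {
	struct probe_model probes[MAX_PROBES];
	size_t count;
};

/* Model of an mmap callback: file range [pgoff, pgoff + len) is now
 * mapped at 'start'. Returns the number of probes newly installed. */
int probes_on_mmap(struct probe_set_model *ps, unsigned long start,
		   unsigned long len, unsigned long pgoff)
{
	int n = 0;

	assert(ps != NULL);
	for (size_t i = 0; i < ps->count; i++) {
		struct probe_model *p = &ps->probes[i];

		if (!p->installed && p->file_offset >= pgoff &&
		    p->file_offset - pgoff < len) {
			p->vaddr = start + (p->file_offset - pgoff);
			p->installed = 1;
			n++;
		}
	}
	return n;
}

/* Model of the corresponding unmap callback: [start, start + len) is
 * going away. Returns the number of probes removed. */
int probes_on_munmap(struct probe_set_model *ps, unsigned long start,
		     unsigned long len)
{
	int n = 0;

	assert(ps != NULL);
	for (size_t i = 0; i < ps->count; i++) {
		struct probe_model *p = &ps->probes[i];

		if (p->installed && p->vaddr >= start && p->vaddr - start < len) {
			p->vaddr = 0;
			p->installed = 0;
			n++;
		}
	}
	return n;
}
```

In this model nothing needs to know about fork, exec or exit directly; the probe state follows the mappings, which is the point being argued in the message above.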
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 12:12 +0100, Peter Zijlstra wrote: On Fri, 2010-01-15 at 16:35 +0530, Maneesh Soni wrote: On Fri, Jan 15, 2010 at 11:33:27AM +0100, Peter Zijlstra wrote: On Fri, 2010-01-15 at 15:56 +0530, Srikar Dronamraju wrote: Hi Peter, Or there could be two threads that could be racing to insert/delete a breakpoint. These synchronization issues are all handled by the Uprobes layer. Shouldn't be hard to put that in the ubp layer, right? Uprobes layer would need to be notified of process life-time events like fork/clone/exec/exit. No so much the process lifetimes as the vma life times are interesting, placing a hook in the vm code to track that isn't too hard, I think similar hooks were given thumbs down in the previous incarnation of uprobes (which was implemented without utrace). http://lkml.indiana.edu/hypermail/linux/kernel/0603.2/1254.html I wasn't at all proposing to mess with a_ops, nor do you really need to, I was more thinking of adding a callback like perf_event_mmap() and a corresponding unmap(), that way you can track mapping life-times and add/remove probes accordingly. Adding the probe uses the fact that (most) executable mappings are MAP_PRIVATE and CoWs a private copy of the page with the modified ins, right? Does it clean up the CoW'ed page on removing the probe? Does that account for userspace having made other changes in between installing and removing the probe (for PROT_WRITE mappings obviously)?
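The MAP_PRIVATE copy-on-write behaviour referred to above can be observed from plain userspace. The sketch below is not how uprobes inserts a breakpoint; it only demonstrates the mmap semantics the question relies on: a write into a MAP_PRIVATE file mapping lands in a private copy of the page and never reaches the file. The function name `private_patch_demo()` and the temporary path are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Store 'orig' as the first byte of a temporary file, map the file
 * MAP_PRIVATE, overwrite the byte with 'patch' through the mapping,
 * then report what the mapping and the file each contain.
 * Returns 0 on success, -1 if any system call fails.
 */
int private_patch_demo(unsigned char orig, unsigned char patch,
		       unsigned char *in_mapping, unsigned char *in_file)
{
	char path[] = "/tmp/cow_demo_XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *map;
	int fd, ok;

	assert(in_mapping != NULL && in_file != NULL);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the file stays alive until fd is closed */

	if (write(fd, &orig, 1) != 1) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	map[0] = patch; /* first write gives this mapping its own page copy */
	*in_mapping = map[0];

	/* Read the byte back through the file descriptor, not the mapping. */
	ok = (pread(fd, in_file, 1, 0) == 1);

	munmap(map, (size_t)page);
	close(fd);
	return ok ? 0 : -1;
}
```

Calling it with `orig = 0x55` and `patch = 0xCC` (0xCC is the x86 int3 opcode) leaves the patched byte visible only through the mapping, while the file still holds 0x55. A MAP_SHARED mapping would behave differently, which is why the question about shared executable mappings matters.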
Re: [RFC] [PATCH 4/7] Uprobes Implementation
* Peter Zijlstra pet...@infradead.org [2010-01-15 11:33:27]: Uprobes layer would need to be notified of process life-time events like fork/clone/exec/exit. Not so much the process lifetimes as the vma life times are interesting, placing a hook in the vm code to track that isn't too hard. It also needs to know - when a breakpoint is hit - stop and resume a thread. A simple hook in the trap code is done quickly enough, and no reason to stop the thread, it's not going anywhere when it traps. Some of the threads could be executing in the vicinity of the breakpoint when it is getting inserted or deleted. Won't we need to stop/quiesce those threads?
Re: [RFC] [PATCH 4/7] Uprobes Implementation
Hi - Then we can ditch the whole utrace muck as I see no reason to want to use that, whereas the ubp (given a sane name) looks interesting. Assuming you meant what you write, perhaps you misunderstand the layering relationship of these pieces. utrace underlies uprobes and other process manipulation functionality, present and future. Why, utrace doesn't at all look to bring a fundamental contribution to all that. If there's a proper kernel interface to install probes on userspace code (ubp seems to be mostly that) then I can use perf/ftrace to do the rest of the state management, no need to use utrace there. You can hardly force me to use utrace there, can you? utrace is not a form of punishment inflicted upon the undeserving. It is a service layer that uprobes et alii are built upon. You as a potential uprobes client need not also talk directly to it, if you wish to reimplement task-finder-like services some other way. - FChE
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 19:50 +0530, Srikar Dronamraju wrote: Srikar seemed to suggest it needed stop/resume. If process traps, We don't need to stop/resume other threads. uprobes needs threads to quiesce when inserting/deleting the breakpoint. Right, I don't think we need to at all. See the CoW thing from previous emails.
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 09:22 -0500, Frank Ch. Eigler wrote: Hi - Well, I'm not in a position to argue line by line about the necessity or the cost of utrace low level guts, but this may represent the most practical engineering balance between functionality / modularity / undesirably intrusive modifications. How intrusive and non-modular is installing a DIE_INT3 notifier? I'm not sure about all the reasons pro/con, but it looks like installing such a systemwide hook would force every userspace breakpoint or kprobe event machine wide to pass through the hypothetical uprobes layer, whether or not applicable to a current task. Well, we'll have to pass through the global die notifier anyway, but a quick per task filter sounds like a good idea, we can do that by keeping a per-task count of the number of uprobes in use. Then the uprobe code can avoid the lookup if there are no task users and no global users. The advantage of this construct is that it easily allows for global users, whereas a utrace based one doesn't.
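The per-task filter described above can be sketched as follows. This is a userspace model under stated assumptions, not kernel code: `task_model`, `probe_table_model`, `model_trap_notifier()` and the `MODEL_NOTIFY_*` values are hypothetical names chosen for the example, loosely mirroring the shape of a die notifier. It shows only the early-exit test: skip the probe lookup unless the trapping task has probes or there are global users.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return values, modelled on "not mine" / "handled". */
enum { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_STOP = 1 };

/* Hypothetical per-task state: how many uprobes this task has in use. */
struct task_model {
	int uprobe_count;
};

/* Hypothetical probe table; 'lookups' counts how often it is searched. */
struct probe_table_model {
	const unsigned long *addrs;
	size_t n;
	unsigned long lookups;
};

/*
 * Model of a breakpoint-trap notifier. Returns MODEL_NOTIFY_STOP if the
 * trap address belongs to a registered probe, MODEL_NOTIFY_DONE otherwise.
 */
int model_trap_notifier(const struct task_model *task, int global_users,
			struct probe_table_model *tbl, unsigned long trap_addr)
{
	assert(task != NULL && tbl != NULL);

	/* Quick filter: no task users and no global users, so no lookup. */
	if (task->uprobe_count == 0 && global_users == 0)
		return MODEL_NOTIFY_DONE;

	tbl->lookups++;
	for (size_t i = 0; i < tbl->n; i++) {
		if (tbl->addrs[i] == trap_addr)
			return MODEL_NOTIFY_STOP;
	}
	return MODEL_NOTIFY_DONE;
}
```

A trap in a task with no probes and no global users costs one comparison in this model and leaves `lookups` untouched, while a single global user makes every task's traps go through the lookup.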
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 12:18 +0100, Peter Zijlstra wrote: On Fri, 2010-01-15 at 12:12 +0100, Peter Zijlstra wrote: ... Adding the probe uses the fact that (most) executable mappings are MAP_PRIVATE and CoWs a private copy of the page with the modified ins, right? We've just used access_process_vm() to insert the breakpoint instruction. (If there are situations where that's not appropriate, please advise.) Does it clean up the CoW'ed page on removing the probe? If I understand your question, the answer is no. We make no attempt to reclaim COW'ed pages, even after all the probes have been removed. In fact, once the first probe is hit and the XOL vma is created, the XOL vma hangs around for the life of the process. Does that account for userspace having made other changes in between installing and removing the probe (for PROT_WRITE mappings obviously)? We don't attempt the aforementioned cleanup, so I think the answer is N/A. Jim
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 19:50 +0530, Srikar Dronamraju wrote: Furthermore it requires stopping and resuming tasks and nonsense like that, that's unwanted in many cases, just run stuff from the trap site and you're done. I don't know what you mean exactly. A trap already stopped task. utrace merely allows various clients to inspect/manipulate the state of the task at that moment. It does not add any context switches or spurious stop/resume operations. Srikar seemed to suggest it needed stop/resume. If process traps, We don't need to stop/resume other threads. uprobes needs threads to quiesce when inserting/deleting the breakpoint.

Years ago, we had pre-utrace versions of uprobes where the uprobes breakpoint-handler code was dispatched from the die_notifier, before the int3 turned into a SIGTRAP. I believe that's what Peter is recommending. On my old Pentium M...

- a pre-utrace uprobe hit cost about 1 usec;
- a utrace-based uprobe hit cost about 3 usec;
- and an unboosted kprobe hit cost 0.57 usec.

So yeah, learning about the int3 via utrace after the SIGTRAP gets created adds some overhead to uprobes. But as previously discussed in this thread -- e.g., http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/02969.html -- there are ways to avoid the 2nd (single-step) trap, which should cut overhead in half.

Jim
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Fri, 2010-01-15 at 12:12 +0100, Peter Zijlstra wrote: ... Adding the probe uses the fact that (most) executable mappings are MAP_PRIVATE and CoWs a private copy of the page with the modified ins, right? What does it do for MAP_SHARED|MAP_EXECUTABLE sections -- simply fail to add the probe? If the vma containing the instruction to be probed has the VM_EXEC flag set (and it's not in the XOL area) we go ahead and try to probe it. I'm not familiar with the implications of MAP_SHARED|MAP_EXECUTABLE -- how you would get such a combo, or what access_process_vm() would do with it. Jim
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Mon, 2010-01-11 at 17:55 +0530, Srikar Dronamraju wrote: Uprobes Infrastructure enables user to dynamically establish probepoints in user applications and collect information by executing a handler functions when the probepoints are hit. Please refer Documentation/uprobes.txt for more details. This patch provides the core implementation of uprobes. This patch builds on utrace infrastructure. You need to follow this up with the uprobes patch for your architecture. So all this is basically some glue between what you call ubp (the real userspace breakpoint stuff) and utrace? Or does it do more?
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Tue, 2010-01-12 at 13:44 +0530, Ananth N Mavinakayanahalli wrote: Well, I wonder if perf can ride on utrace's callbacks for the hook_task() for the clone/fork cases? Well it could, but using all of utrace to simply get simple callbacks from copy_process() is just daft so we're not going to do that.
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Thu, 2010-01-14 at 12:09 +0100, Peter Zijlstra wrote: On Mon, 2010-01-11 at 17:55 +0530, Srikar Dronamraju wrote: Uprobes Infrastructure enables user to dynamically establish probepoints in user applications and collect information by executing a handler functions when the probepoints are hit. Please refer Documentation/uprobes.txt for more details. This patch provides the core implementation of uprobes. This patch builds on utrace infrastructure. You need to follow this up with the uprobes patch for your architecture. So all this is basically some glue between what you call ubp (the real userspace breakpoint stuff) and utrace? Or does it do more? My reply in http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/02483.html addresses this. Jim
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Tue, Jan 12, 2010 at 06:36:00AM +0100, Frederic Weisbecker wrote: On Mon, Jan 11, 2010 at 05:55:53PM +0530, Srikar Dronamraju wrote:

+static const struct utrace_engine_ops uprobe_utrace_ops = {
+	.report_quiesce = uprobe_report_quiesce,
+	.report_signal = uprobe_report_signal,
+	.report_exit = uprobe_report_exit,
+	.report_clone = uprobe_report_clone,
+	.report_exec = uprobe_report_exec
+};

So, as stated before, uprobe seems to handle too much standalone policies such as freeing on exec, always inherit on clone and never on fork. Such rules should be decided from uprobe clients not from uprobe itself and that makes it not enough flexible to be usable for now.

The freeing on exec is only housekeeping of uprobe data structures. And probepoints are inherited only on CLONE_THREAD and not otherwise, simply since the existing probes can be hit in the new thread's context. Not sure what other policy you are hinting at.

For example if we want it to be usable by perf, we have two ways:

- a trace event. Unfortunately, like I explained in a previous mail, this doesn't seem to be a suitable interface for this particular case.
- a performance monitoring unit, with the existing unified interface struct pmu, usable by perf.

Typically, to use it with perf toward a pmu, perf tools need to create a uprobe on perf process and activate its hook on the next exec. Thereafter, it's up to perf to decide if we inherit through clone and fork.

As mentioned above, the inheritance is only for threads. It should be fairly easy to inherit probes on fork, and that can be made a perf based policy decision.

Here I fear utrace and perf are going to collide.

Utrace does not mandate any of the above concerns you've mentioned. Utrace just provides callbacks at the said events and uprobes can be tweaked to accommodate perf's requirements as possible, as feasible.
See how could be the final struct pmu (we need to extend it to support utrace):

struct pmu {
	enable() - called we schedule in a context where we want a uprobe to be active. Called very often
	disable() - the above opposite

	/* Not yet existing callbacks */
	hook_task() - called when a process is created which we want to activate our hook
	              would be typically called once on exec if we have set enable_on_exec and also on clone()/fork() if we want to inherit.
}

The above hook_task (could be divided in more precise callback events like hook_on_exec, hook_on_clone, etc...) would be needed by perf to drive correctly utrace and this is going to collide with utrace callbacks that notify execs and forks. Probably utrace can be kept for all the utrace breakpoint signal handling an so. But I guess the rest can be implemented on top of a struct pmu and driven by perf like we did with hardware breakpoints re-implementation. Just an idea.

Well, I wonder if perf can ride on utrace's callbacks for the hook_task() for the clone/fork cases?

Ananth
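The hook_task() idea above can be sketched in compilable C to show where the inheritance policy would live. Everything here is an assumption made for illustration: `probe_pmu_model` is not the kernel's struct pmu, and the event names and both policy functions are hypothetical. The sketch shows only that the client supplies the decision, so "threads only" (the behaviour described for the current patch) and "inherit everything" (what perf might choose) become two callbacks rather than a rule inside uprobes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task-creation events a client could be asked about. */
enum task_event_model {
	TASK_EVENT_EXEC,
	TASK_EVENT_CLONE_THREAD,
	TASK_EVENT_FORK
};

/* Hypothetical model, not the kernel's struct pmu. */
struct probe_pmu_model {
	void (*enable)(void);   /* context with an active probe scheduled in */
	void (*disable)(void);  /* the opposite */
	bool (*hook_task)(enum task_event_model ev); /* attach to new task? */
};

/* Policy 1: probes follow new threads only. */
bool policy_threads_only(enum task_event_model ev)
{
	return ev == TASK_EVENT_CLONE_THREAD;
}

/* Policy 2: probes follow exec, thread creation and fork alike. */
bool policy_inherit_all(enum task_event_model ev)
{
	(void)ev;
	return true;
}

/* The core asks the client; with no callback set, nothing is inherited. */
bool model_should_attach(const struct probe_pmu_model *pmu,
			 enum task_event_model ev)
{
	assert(pmu != NULL);
	if (pmu->hook_task == NULL)
		return false;
	return pmu->hook_task(ev);
}
```

Whether such a callback would be driven from utrace's clone/exec reports or from hooks in copy_process() is exactly the open question in this exchange; the sketch takes no position on that.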
Re: [RFC] [PATCH 4/7] Uprobes Implementation
On Tue, 2010-01-12 at 13:44 +0530, Ananth N Mavinakayanahalli wrote: On Tue, Jan 12, 2010 at 06:36:00AM +0100, Frederic Weisbecker wrote: ... So, as stated before, uprobe seems to handle too much standalone policies such as freeing on exec, always inherit on clone and never on fork. Such rules should be decided from uprobe clients not from uprobe itself and that makes it not enough flexible to be usable for now. The freeing on exec is only housekeeping of uprobe data structures. And probepoints are inherited only on CLONE_THREAD and not otherwise, simply since the existing probes can be hit in the new thread's context. Not sure what other policy you are hinting at. ... Typically, to use it with perf toward a pmu, perf tools need to create a uprobe on perf process and activate its hook on the next exec. Thereafter, it's up to perf to decide if we inherit through clone and fork. As mentioned above, the inheritance is only for threads. It should be fairly easy to inherit probes on fork, and that can be made a perf based policy decision. One reason we don't currently support inheritance (or cloning) of uprobes across fork is that a uprobe object is (a) per-process (and I think we want to keep it that way); and (b) owned by the uprobes client. That is, the client creates and populates that uprobe object, and passes a pointer to it to both register_uprobe() and unregister_uprobe(). We could clone this object on fork, but then how would the client refer to the cloned uprobes in the new process -- e.g., to unregister them? I guess each cloned uprobe could remember its patriarch uprobe -- its ultimate ancestor, the one created by the client; and we could add an unregister_uprobe_clone function that takes both the address of the patriarch uprobe and the pid of the (clone) uprobe to be unregistered. It has also been suggested that it might be more user-friendly to let the client discard (or reuse) the uprobe object as soon as register_uprobe() returns. 
register_uprobe() would presumably copy everything it needs from the uprobe to the uprobe_kimg, and pass back a handle (e.g., the address of the uprobe_kimg) that the client can later pass to unregister_uprobe() -- or unregister_uprobe_clone(). (In this case, only the uprobe_kimg would be cloned.) It might be good to consider both these enhancement requests together. Anyway, as previously described, the clone-on-fork feature can be (and has been) implemented by a utrace-based task-finder that notices forks, and creates and registers a whole new set of uprobes for the new process. Jim
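The two enhancement requests above (a handle returned by registration, and clones that remember their patriarch) can be sketched together. This is a userspace model under stated assumptions: the `*_model` types and functions are hypothetical, the real uprobe and uprobe_kimg structures carry much more state, and error handling is reduced to NULL returns. It shows that once registration copies the client's object, the client may reuse it, and that every clone can point back to the ultimate ancestor.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the client-owned uprobe object. */
struct uprobe_model {
	int pid;
	unsigned long vaddr;
};

/* Hypothetical stand-in for the kernel-side image of a probe. */
struct uprobe_kimg_model {
	struct uprobe_model copy;            /* private copy of client data */
	struct uprobe_kimg_model *patriarch; /* ultimate ancestor, or self */
};

/* Copy what is needed and hand back a handle; the caller's object can
 * be discarded or reused as soon as this returns. */
struct uprobe_kimg_model *model_register_uprobe(const struct uprobe_model *u)
{
	struct uprobe_kimg_model *k;

	assert(u != NULL);
	k = malloc(sizeof(*k));
	if (k == NULL)
		return NULL;
	k->copy = *u;
	k->patriarch = k;
	return k;
}

/* Clone a probe image for a forked child; only the image is cloned, and
 * the clone inherits the parent's patriarch rather than the parent. */
struct uprobe_kimg_model *
model_clone_on_fork(const struct uprobe_kimg_model *parent, int child_pid)
{
	struct uprobe_kimg_model *c;

	assert(parent != NULL);
	c = malloc(sizeof(*c));
	if (c == NULL)
		return NULL;
	c->copy = parent->copy;
	c->copy.pid = child_pid;
	c->patriarch = parent->patriarch;
	return c;
}

/* Release a probe image obtained from either function above. */
void model_unregister_uprobe(struct uprobe_kimg_model *k)
{
	free(k);
}
```

An unregister_uprobe_clone(patriarch, pid) as suggested above would additionally need a registry to find the clone for a given pid, and would have to decide what happens to clones when the patriarch itself is unregistered; this sketch leaves both out.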