Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Mon, Jan 18, 2010 at 02:15:51PM +0100, Peter Zijlstra wrote: On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote: On 01/18/2010 02:14 PM, Peter Zijlstra wrote: Well, the alternatives are very unappealing. Emulation and single-stepping are going to be very slow compared to a couple of jumps. With CPL2 or RPL on user segments the protection issue seems to be manageable for running the instructions from kernel space. CPL2 gives unrestricted access to the kernel address space; and RPL does not affect page level protection. Segment limits don't work on x86-64. But perhaps I missed something - these things are tricky. So setting RPL to 3 on the user segments allows access to kernel pages just fine? How useful.. :/ It should be possible to translate the instruction into an address space check, followed by the action, but that's still slower due to privilege level switches. Well, if you manage to do the address validation you don't need the priv level switch anymore, right? It also starts becoming very x86-centric though, doesn't it? It might kick other ports later. What is there at the moment is storing the copied instructions in a VMA. The most unpalatable part of that to me is that it's visible to userspace, probably via /proc/ and I didn't check, but I hope an munmap() from userspace cannot delete it. What the VMA has going for it is that it *appears* to be easier to port to other architectures than the alternatives, certainly easier to handle than instruction emulation. Are the ins encodings sane enough to recognize mem parameters without needing to know the actual ins? How about using a hw-breakpoint to close the gap for the inline single step? You could even re-insert the int3 lazily when you need the hw-breakpoint again. It would consume one hw-breakpoint register for each task/cpu that has probes though.. This feels very racy. 
Along with that, making these sorts of changes was considered a risky venture on x86 and needed strong verification from elsewhere (http://lkml.org/lkml/2010/1/12/300). There are probably similar concerns on other architectures that would make a reliable port difficult.

Right now the approach is with VMAs. The alternatives are

1. reserved XOL page (similar disadvantages to the VMA)

2. emulated instructions
   This is an emulation bug waiting to happen in my opinion and makes porting uprobes a significantly more difficult undertaking than either the XOL-VMA or XOL-page approach

3. XOL page in kernel space available at a different CPL
   This assumes all target architectures have a usable privilege ring which may be the case. However, I would guess that it is going to perform worse than the current approach because of the change in privilege level. No idea what the cost of a privilege level change is, but I doubt it's free

4. Boosted probes (arch-specific, apparently only x86 does this for kprobes)

As unpalatable as the VMA is, I am failing to see why it's not a reasonable starting point with an understanding that 2 or 3 would be implemented in the future after the other architecture ports are in place and the reliability of the options as well as the performance can be measured.

There would appear to be two classes of application that might suffer from the VMA. The first needs absolutely every single ounce of address space. The second introspects itself via /proc/self/maps and makes decisions based on that. The first is unfortunate but should be a limited number of use cases. The second could be fudged by simply not exporting the information via /proc.

I'm of the opinion it would be reasonable to let the VMA go ahead, look at the ports for the other architectures and revisit options 2 and 3 above to see if the VMA can really be removed without performance or reliability penalty.
-- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab
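To make the XOL-VMA idea discussed in this message more concrete, here is a minimal toy model in Python of an execute-out-of-line area: a fixed pool of slots into which a probed instruction is copied while one thread single-steps it. This is only a sketch of the bookkeeping, not the patchset's code. The class name `XolArea`, the 4096-byte area, the 64-byte slot size and the "return None when full" policy are all assumptions made for illustration.

```python
class XolArea:
    """Toy model of an XOL area: a pool of fixed-size instruction slots.

    Hypothetical names and sizes; this does not mirror any kernel structure.
    """

    def __init__(self, area_size=4096, slot_size=64):
        self.slot_size = slot_size
        self.slots = [None] * (area_size // slot_size)  # None means free
        self.owner = {}  # thread id -> index of the slot it holds

    def acquire(self, tid, insn_bytes):
        """Copy the probed instruction into a free slot for thread `tid`.

        Returns the slot's byte offset inside the area, or None if every
        slot is busy (a real implementation would have to wait or fall back).
        """
        if tid in self.owner:
            raise RuntimeError("thread is already stepping a probed instruction")
        if len(insn_bytes) > self.slot_size:
            raise ValueError("instruction does not fit in one slot")
        for idx, content in enumerate(self.slots):
            if content is None:
                self.slots[idx] = bytes(insn_bytes)
                self.owner[tid] = idx
                return idx * self.slot_size
        return None

    def release(self, tid):
        """Free the slot once the single-step for `tid` has completed."""
        idx = self.owner.pop(tid)
        self.slots[idx] = None
```

Because a thread can only be stepping one probed instruction at a time, one slot per concurrently trapped thread is enough in this model; the open question in the thread is where those slots should live, not how many are needed.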
Re: [RFC] [PATCH 0/7] UBP, XOL and Uprobes [ Summary of Comments and actions to be taken ]
On Fri, 2010-01-22 at 12:54 +0530, Ananth N Mavinakayanahalli wrote: On Fri, Jan 22, 2010 at 12:32:32PM +0530, Srikar Dronamraju wrote: Here is a summary of the Comments and actions that need to be taken for the current uprobes patchset. Please let me know if I missed or misunderstood any of your comments. 1. Uprobes depends on trap signal. Uprobes depends on trap signal rather than hooking to the global die notifier. It was suggested that we hook to the global die notifier. In the next version of patches, Uprobes will use the global die notifier and look at the per-task count of the probes in use to see if it has to be consumed. However this would reduce the ability of uprobe handlers to sleep. Since we are dealing with userspace, sleeping in handlers would have been a good feature. We are looking at ways to get around this limitation. We could set a TIF_ flag in the notifier to indicate a breakpoint hit and process it in task context before the task heads into userspace. Make that optional, not everybody might want that. Either provide a simple trampoline or use a flag to indicate the callback be called from process context on registration.
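The deferral described above (set a flag in the notifier, run the handler in task context before returning to userspace) can be sketched as a small Python model. This is an illustration of the control flow only, under stated assumptions: `Task`, `die_notifier`, `return_to_user` and the `breakpoint_pending` attribute are hypothetical names standing in for the TIF_ flag mechanism, and the `probes` dictionary stands in for whatever per-task probe accounting the real patches use.

```python
class Task:
    """Hypothetical stand-in for a task with one TIF_-style flag."""

    def __init__(self, name):
        self.name = name
        self.breakpoint_pending = False  # models the proposed TIF_ flag
        self.pending_addr = None
        self.log = []


def die_notifier(task, addr, probes):
    """Models the atomic-context notifier: it must not sleep.

    It only decides whether the trap belongs to a probe and, if so,
    records that fact on the task. The handler is NOT run here.
    """
    if addr not in probes:
        return False  # not our breakpoint, let other consumers see it
    task.breakpoint_pending = True
    task.pending_addr = addr
    return True


def return_to_user(task, probes):
    """Models the task-context hook just before userspace resumes.

    Handlers run here, where sleeping would be allowed.
    """
    if task.breakpoint_pending:
        task.breakpoint_pending = False
        handler = probes[task.pending_addr]
        handler(task, task.pending_addr)
```

Making the deferral optional, as suggested in the reply, would amount to a per-probe choice between calling the handler directly in `die_notifier` and queueing it as shown here.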
Re: PTRACE_SYSCALL_ENTRY/EXIT
Roland McGrath wrote: We don't have any particular plans to extend the ptrace interface. I strongly doubt we would even try to do anything like that until the utrace-based ptrace interface is merged into Linux and the old ptrace implementation gone. In general, we are not looking for extensions to the ptrace interface. It is an ugly hairball already and we are more interested in having the utrace API layer available inside the kernel and then embarking on new and sane userland interfaces instead of shoehorning more into ptrace. I respect that. That said, some particular kinds of simple enhancements to ptrace are really quite trivial to implement in the new utrace-based implementation. The particular area you suggest is one of these. What I would expect is not new variants of the one-shot interface like PTRACE_SYSCALL. Rather, I would envision new PTRACE_O_* options to enable syscall entry and exit tracing analogous to the PTRACE_EVENT_* events you can now enable. This means that you make one PTRACE_SETOPTIONS call to enable the set of events you want, and then use plain PTRACE_CONT (or whatever). If you really want exactly the one-shot behavior instead, then we could consider that. But, like I said, we are not looking to add much in the way of new wrinkles to the dismal ptrace userland interface. The one-shot behaviour is what I want because adding a PTRACE_O_* option won't solve my problem if I understood correctly. I'm writing a tool that audits system calls and *only* denied system calls need to be stopped at the exit of the system call to set return value and errno. System calls are checked at entry; if they're safe, another PTRACE_SYSCALL_ENTRY will be issued to continue to the next system call. If, however, the system call needs to be denied, PTRACE_SYSCALL_EXIT will be issued after changing the system call number to something invalid so that return value and errno can be set. I think this will be useful for every program that audits system calls.
Thanks, Roland -- Regards, Ali Polatel
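The auditing scheme proposed in this message can be illustrated with a small pure-Python model of the tracer's decisions. It does not call ptrace; it only counts how many tracer stops each system call would cost if separate entry and exit requests existed. The function name `audit`, the use of `EPERM` as the injected error and the string `"executed"` are assumptions for the sketch. PTRACE_SYSCALL_ENTRY and PTRACE_SYSCALL_EXIT are the requests proposed in the thread, not existing ptrace requests.

```python
import errno


def audit(attempted, denied):
    """Model the tracer's decision at each syscall-entry stop.

    attempted: syscall names the traced task tries, in order.
    denied:    set of names the (hypothetical) policy rejects.
    Returns a list of (name, result, tracer_stops) tuples.
    """
    trace = []
    for name in attempted:
        stops = 1  # every syscall is inspected once, at entry
        if name in denied:
            # The tracer would overwrite the syscall number with an invalid
            # one, then request an exit stop so it can set the return value
            # and errno seen by the traced task.
            stops += 1
            result = -errno.EPERM  # assumed error code for the sketch
        else:
            # Safe call: resume straight to the next entry stop.
            result = "executed"
        trace.append((name, result, stops))
    return trace
```

With plain PTRACE_SYSCALL the tracee stops at both entry and exit of every call, so the saving in this model is one stop per allowed system call.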
Re: linux-next: add utrace tree
On 01/21, Linus Torvalds wrote: On Thu, 21 Jan 2010, Andrew Morton wrote: ptrace is a nasty, complex part of the kernel which has a long history of problems, but it's all been pretty quiet in there for the past few years. More importantly, we're not ever going to get rid of it. Unfortunately, you are right. The current ptrace (as it is visible from user-space) should stay forever. Quite frankly, judging by all past history we have ever seen in kernel interfaces, new and non-portable interfaces simply are never used. The whole question whether they are nicer or not is entirely immaterial. I have to admit this point looks very reasonable to me. Except, can't resist, ptrace itself is hardly portable. I'm personally very dubious that there are any merits to utrace that outweigh the very clear disadvantages: just another layer that adds a new level of abstraction to the only interface that people actually _use_, namely ptrace. Of course they can't use other interfaces, we don't have them. And without the new abstraction layer we will never have, I think. Oleg.
Re: [RFC] [PATCH 0/7] UBP, XOL and Uprobes [ Summary of Comments and actions to be taken ]
Peter Zijlstra wrote: On Fri, 2010-01-22 at 12:32 +0530, Srikar Dronamraju wrote: 2. XOL vma vs Emulation vs Single Stepping Inline vs using Protection Rings. XOL VMA is an additional process address vma. There is opposition to adding an additional vma without the user actually requesting it. XOL vma and single stepping inline are the two architecture independent implementations, while other implementations are more architecture specific. Single stepping inline wouldn't go well with multithreaded processes. Even though XOL vma has its own issues, we will go with it since other implementations seem to have more complications. We would look forward to implementing boosters later. Later on, if we come across other techniques with fewer side-effects than the XOL vma, we would switch to using them. How about modifying glibc to reserve like 64 bytes on the TLS structure it has and storing the ins and possible boost jmp there? Since each thread can only have a single trap at any one time that should be enough. Hmm, it is a good idea. Well, we'll have a copy of original insn in kernel, but it could be simpler than managing XOL vma. :-) Thank you, -- Masami Hiramatsu Software Engineer Hitachi Computer Products (America), Inc. Software Solutions Division e-mail: mhira...@redhat.com
Re: linux-next: add utrace tree
Hi - oleg wrote: [...] I'm personally very dubious that there are any merits to utrace that outweigh the very clear disadvantages: just another layer that adds a new level of abstraction to the only interface that people actually _use_, namely ptrace. Of course they can't use other interfaces, we don't have them. And without the new abstraction layer we will never have, I think. This is one of the reasons we built, up on request of lkml people, the utrace-gdbstub prototype (http://lkml.org/lkml/2009/11/30/173). It presents a standard userspace debugging interface -- actually, more standard than ptrace! It has the potential to be more powerful feature-wise and perhaps even perform faster than ptrace. And yet that RFC didn't receive any on-topic review, only wishes for unspecified blue-sky integration with kernel debugging. So then there's uprobes, which is another potential utrace killer app, if it weren't so tainted by some peoples' disdain for its current user, when other users are already being seriously discussed. So a working prototype, which demonstrates both the utility of utrace itself and the end-user value of user-space probing, is disregarded. And there are several smaller utrace clients in the works, each of them merge candidates in the future. Yes, most of them may be rewritten with special-purpose hook after hook as people reinvent the utrace wheel piece by piece, but how long will that take? How is the opportunity cost of missing features valued? Finally, I don't know how to address the logic of if a feature requires utrace, that's a bad argument for utrace and at the same time you need to show a killer app for utrace. What could possibly satisfy both of those constraints? Please advise. - FChE
Re: linux-next: add utrace tree
On Fri, 2010-01-22 at 15:01 -0500, Frank Ch. Eigler wrote: So then there's uprobes, which is another potential utrace killer app That's bollocks, uprobes is an utter and total mis-match for utrace. Probing userspace is primarily about DSOs which is files and vma's, not tasks. You might maybe want a utrace interface to that, but that is largely non-interesting. IOW, we don't need utrace to make sensible use of uprobes. (And when I speak of uprobes I mean the thing formerly called UBP)
Re: linux-next: add utrace tree
On 01/21, Linus Torvalds wrote: I realize that my argument is very anti-thetical to the normal CS teaching of general-purpose is good. I often feel that very specific code with very clearly defined (and limited) applicability is a good thing - I'd rather have just a very specific ptrace layer that does nothing but ptrace, than a generic plugin layer that can be layered under ptrace and other things. I am repeating the same (and probably poor) arguments, but we don't have a clearly defined ptrace layer. The current code is just the set of precedents, I mean, this code does this because we always did this for unknown reason. And we can't fix it without breaking things. Even the obvious bugs which could be fixed by the very simple patch should be preserved sometimes. In fact, afaics the current state is: if it can't crash the kernel - it is not the bug. Otoh, ptrace is very limited, yes. Imho - too limited. And, as a user-space api, it is just horrible. However: we're not ever going to get rid of it. Yes, sure. But I am afraid this all is almost off-topic. Afaik, utrace was not created to solve the problems with ptrace, at least I am sure this wasn't the only goal. Unfortunately, I didn't participate in other projects which use utrace. Even if I did, I don't know how could I prove they are important enough to have a generic layer to make other things possible. Oleg.
Re: linux-next: add utrace tree
Hi - On Fri, Jan 22, 2010 at 09:16:16PM +0100, Peter Zijlstra wrote: [...] So then there's uprobes, which is another potential utrace killer app That's bollocks, uprobes is an utter and total mis-match for utrace. Probing userspace is primarily about DSOs which is files and vma's, not tasks. [...] Your experience with user-space probing apparently differs from ours. In fact there exists plenty of interest and utility in probing given processes only, if for no other reason than to avoid disrupting others running on the machine. Nearly always, it is better to build a multiprocess probing widget from multiply-applied single-process ones, rather than to build single-process probing from grossly-filtered systemwide/VMA ones. (If the lower level infrastructure provides both options, groovy.) - FChE
Re: linux-next: add utrace tree
Hi - On Fri, Jan 22, 2010 at 01:59:11PM -0800, Linus Torvalds wrote: [...] Finally, I don't know how to address the logic of if a feature requires utrace, that's a bad argument for utrace and at the same time you need to show a killer app for utrace. What could possibly satisfy both of those constraints? Please advise. The point is, the feature needs to be a killer feature. And I have yet to hear _any_ such killer feature, especially from a kernel maintenance standpoint. The better ptrace than ptrace is irrelevant. Sure, we all know ptrace isn't a wonderful feature. But it's there, and a debugger is going to have support for it anyway, so what's the _advantage_ of a better ptrace interface? There is absolutely _zero_ advantage, there's just yet another interface. We can't get rid of the old one _anyway_. The point is that the intermediate api will allow (and, as the part you clipped out about utrace-gdbstub said, *already has allowed*) alternative plausible interfaces that coexist just fine. And the seccomp replacement just sounds horrible. Using some tracing interface to implement security models sounds like the worst idea ever. So all this is about *naming* utrace? It was never built for tracing, but for (efficient/multiplexed) *control*. That wasn't even its original name -- one of your lieutenants asked roland to change it to utrace. And like it or not, over the last almost-decade, _not_ having to have to work with system tap has been a feature, not a problem, for the kernel community. I don't have a problem with that. We have apprx. never imposed anything on developers who didn't want to use it. There are plenty who have and will. - FChE
Re: [RFC] [PATCH 0/7] UBP, XOL and Uprobes [ Summary of Comments and actions to be taken ]
On Fri, 2010-01-22 at 19:06 +0100, Peter Zijlstra wrote: On Fri, 2010-01-22 at 12:32 +0530, Srikar Dronamraju wrote: 2. XOL vma vs Emulation vs Single Stepping Inline vs using Protection Rings. XOL VMA is an additional process address vma. There is opposition to adding an additional vma without the user actually requesting it. XOL vma and single stepping inline are the two architecture independent implementations, while other implementations are more architecture specific. Single stepping inline wouldn't go well with multithreaded processes. Even though XOL vma has its own issues, we will go with it since other implementations seem to have more complications. We would look forward to implementing boosters later. Later on, if we come across other techniques with fewer side-effects than the XOL vma, we would switch to using them. How about modifying glibc to reserve like 64 bytes on the TLS structure it has and storing the ins and possible boost jmp there? Since each thread can only have a single trap at any one time that should be enough. We once implemented something similar, but using an area just beyond the top of the stack instead of TLS. We figured it would never pass muster because we have to temporarily map the page executable (and undo it after the single-step), and this felt like a big security hole. I'd think we'd have the same concern with TLS. Jim
Re: linux-next: add utrace tree
On Fri, 22 Jan 2010, Frank Ch. Eigler wrote: The point is that the intermediate api will allow (and, as the part you clipped out about utrace-gdbstub said, *already has allowed*) alternative plausible interfaces that coexist just fine. And my point is that multiple interfaces are BAD. There is one interface we _have_ to have: the traditional ptrace one. That one we can't get away from. Multiple interfaces on its own is just confusion with no upside. You need a _reason_ to have other interfaces. They need to have that killer feature. Just being different is not a feature at all. So all this is about *naming* utrace? It was never built for tracing, but for (efficient/multiplexed) *control*. That wasn't even its original name -- one of your lieutenants asked roland to change it to utrace. No. It's not about naming. It's about the downside of having amorphous interfaces that apparently don't even have rules, and are then used to implement random crap. Yes, the SNL skit about It's a dessert topping _and_ a floor wax was funny, but it was funny exactly because it was crazy. The fact that you can do crazy things is not a good thing. You need to find the goodness somewhere else, and that's what I'm trying to tell you. You just seem to have trouble listening. Linus
Re: linux-next: add utrace tree
On Fri, 22 Jan 2010, Linus Torvalds wrote: No. It's not about naming. It's about the downside of having amorphous interfaces that apparently don't even have rules, and are then used to implement random crap. Yes, the SNL skit about It's a dessert topping _and_ a floor wax was funny, but it was funny exactly because it was crazy. Put yet another way: I'd _much_ rather have two totally separate pieces that don't depend on each other, and do different things. So to take a very practical example: I'd much rather have 'seccomp' and 'ptrace' that have _nothing_ what-so-ever to do with each other, than have some intermediate layer that then needs to make both of those happy, and that both have to interact with. There are cases where we really _want_ to have common code. We want to have a common VFS interface because we want to show _one_ interface to user space across a gazillion different filesystems. We want to have a common driver layer (as far as possible) because - again - we expose a metric shitload of drivers, and we want to have one unified interface to them. But going the other way: trying to share code when the interfaces are fundamentally _different_ is generally not at all such a great idea. It ends up tying two conceptually totally separate things together, and suddenly people who work on feature X need to modify infrastructure that affects feature Y, and it turns out that details A, B and C are all totally different for the two features and the middle layer has two conflicting things it needs to work with. This is why when somebody brought up you could do a seccomp-like thing on top of utrace that my reaction was and is just totally negative. It shows all the wrong kinds of tying things together. Linus
Re: linux-next: add utrace tree
On Fri, Jan 22, 2010 at 19:22, Linus Torvalds torva...@linux-foundation.org wrote: There are cases where we really _want_ to have common code. We want to have a common VFS interface because we want to show _one_ interface to user space across a gazillion different filesystems. We want to have a common driver layer (as far as possible) because - again - we expose a metric shitload of drivers, and we want to have one unified interface to them. So... Everybody agrees that ptrace() is horrible and a royal pain to use, let alone use correctly and without bugs. Everybody also agrees that ptrace() needs to stay around for a long time to avoid breaking all the existing users. Now how do we get from here to a moderately portable API for interrogating, controlling, and intercepting process state? Essentially it would need to support all of the things that a powerful debugger would want to do, including modifying registers and memory, substituting syscall return values, etc. I believe that utrace is the kernel side of that API. The killer app for this will be the ability to delete thousands of lines of code from GDB, strace, and all the various other tools that have to painfully work around the major interface gotchas of ptrace(), while at the same time making their handling of complex processes much more robust. The *second* killer app for this is to make it much easier for people to write new userspace debugging tools. I love the various crash-catching tools that different distributions or applications provide, but they all basically have to trap the SIGSEGV and hope they're still sensible enough to fork() and exec() a gdb process. Furthermore, I would love to be able to write debugging tools for scripting languages that allow me to step across Perl, C, PHP, assembly code, etc, all within the same process. In theory that's all possible today, but given how much of a *pain* ptrace() is to use correctly, nobody bothers. 
Now, with all that said, utrace does not provide any of the userspace side APIs today... but I think it is a necessary refactoring if we want to provide a new ideal process-introspection interface without breaking all the ptrace() users. Think of the utrace interface as very much like the LSM interface. Just like with LSMs, there is a lot of active research in debugging and tracing tools, and nobody can even remotely agree what the hell they want out of the hooks. In theory you could add one hook for every place each security module needs one... but then your fast-path is littered with always-false test-and-jump statements. What utrace provides is the one single test in each fast path that then searches for and executes the appropriate slow path(s) for that process. I personally would be very happy to see utrace merged. Cheers, Kyle Moffett
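The "one single test in each fast path" argument can be illustrated with a toy dispatcher in Python. This is a sketch of the general hook-multiplexing pattern the message describes, not utrace's actual API: `Process`, `engines`, `traced` and `syscall_entry` are hypothetical names, and real engines would be kernel callbacks rather than Python callables.

```python
class Process:
    """Hypothetical process object with a list of attached clients."""

    def __init__(self):
        self.engines = []  # callbacks attached by tracing/debugging clients

    @property
    def traced(self):
        # The single cheap test the fast path performs.
        return bool(self.engines)


def syscall_entry(proc, nr):
    """Model of one instrumented fast path (here: syscall entry)."""
    if not proc.traced:
        # Fast path: nothing is attached, so no per-client checks run.
        return []
    # Slow path: hand the event to every attached engine in order.
    return [engine(nr) for engine in proc.engines]
```

The contrast drawn in the message is with one dedicated hook per client at every site, where an untraced process would pay for several always-false checks instead of one.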