Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On 01/16/2010 02:58 AM, Jim Keniston wrote:

> I hear (er, read) you. Emulation may turn out to be the answer for some architectures. But here are some things to keep in mind about the various approaches:
>
> 1. Single-stepping inline is easiest: you need to know very little about the instruction set you're probing. But it's inadequate for multithreaded apps.
>
> 2. Single-stepping out of line solves the multithreading issue (as do #3 and #4), but requires more knowledge of the instruction set. (In particular, calls, jumps, and returns need special care; as do rip-relative instructions in x86_64.) I count 9 architectures that support kprobes. I think most of these do SSOL.
>
> 3. Boosted probes (where an appended jump instruction removes the need for the single-step trap on many instructions) require even more knowledge of the instruction set, and like SSOL, require XOL slots. Right now, as far as I know, x86 is the only architecture with boosted kprobes.
>
> 4. Emulation removes the need for the XOL area, but requires pretty much total knowledge of the instruction set. It's also a performance win for architectures that can't do #3. I see kvm implemented on 4 architectures (ia64, powerpc, s390, x86). Coincidentally, those are the architectures to which uprobes (old uprobes, with ubp and xol bundled in) has already been ported (though Intel hasn't been maintaining their ia64 port).
>
> So it sort of comes down to how objectionable the XOL vma (or page) really is.

The kvm emulator emulates only a subset of the x86 instruction set (basically mmio instructions and commonly-used page-table manipulation instructions, as well as some privileged instructions). It would take a lot of work to expand it to be completely generic; and even then it will fail if userspace uses an instruction set extension the kernel is not aware of.

To me, boosted probes with a fallback to single-stepping seems to be the better option by far.

--
error compiling committee.c: too many arguments to function
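To make the "special care" for rip-relative instructions in point 2 concrete: when an instruction is copied to an XOL slot, its displacement is still relative to wherever it executes, so it has to be adjusted to keep pointing at the original operand. The following is only a rough sketch of that arithmetic, not code from the patch set. The function names are hypothetical, and it assumes the simplest possible fix-up (rewriting the 32-bit displacement), which only works when the slot is within reach of the operand.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def rip_relative_target(insn_addr, insn_len, disp):
    """Address a rip-relative operand refers to: next-instruction address + disp."""
    return insn_addr + insn_len + disp


def relocate_disp(orig_addr, xol_addr, disp):
    """Displacement to use when the same instruction runs from xol_addr.

    Returns None if the adjusted value does not fit in a signed 32-bit
    displacement; in that case this simple fix-up cannot be used and some
    other strategy (not shown here) would be needed.
    """
    new_disp = disp + (orig_addr - xol_addr)
    if not INT32_MIN <= new_disp <= INT32_MAX:
        return None
    return new_disp


if __name__ == "__main__":
    # Hypothetical addresses, chosen only for illustration.
    orig, slot, length, disp = 0x400000, 0x70000000, 7, 0x1000
    new_disp = relocate_disp(orig, slot, disp)
    print(hex(rip_relative_target(orig, length, disp)))
    print(hex(rip_relative_target(slot, length, new_disp)))
```

Both printed addresses should be the same, since the adjusted displacement cancels the move to the slot. A slot placed far from the probed code makes relocate_disp() return None, which is one reason the location of the XOL area matters.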
Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Sun, 2010-01-17 at 16:56 +0200, Avi Kivity wrote:

> On 01/17/2010 04:52 PM, Peter Zijlstra wrote:
>
>> Also, if it's fixed size you're imposing artificial limits on the number of possible probes.
>
> Obviously we'll need a limit, a uprobe will also take kernel memory, we can't allow people to exhaust it.

Only if it's unprivileged; kernel and root should be able to place as many probes as they like until the machine keels over.
Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Sun, 2010-01-17 at 16:59 +0200, Avi Kivity wrote:

> On 01/17/2010 04:52 PM, Peter Zijlstra wrote:
>
>> On Sun, 2010-01-17 at 16:39 +0200, Avi Kivity wrote:
>>
>>> On 01/15/2010 11:50 AM, Peter Zijlstra wrote:
>>>
>>>> As previously stated, I think poking at a process's address space is an utter no-go.
>>>
>>> Why not reserve an address space range for this, somewhere near the top of memory? It doesn't have to be populated if it isn't used.
>>
>> Because I think poking at a process's address space like that is gross. Also, if it's fixed size you're imposing artificial limits on the number of possible probes.
>
> btw, an alternative is to require the caller to provide the address space for this. If the caller is in another process, we need to allow it to play with the target's address space (i.e. mmap_process()). I don't think uprobes justifies this by itself, but mmap_process() can be very useful for sandboxing with seccomp.

mmap_process() sounds utterly gross, one process playing with another process's address space.. yuck!
Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On 01/17/2010 05:03 PM, Peter Zijlstra wrote:

>> btw, an alternative is to require the caller to provide the address space for this. If the caller is in another process, we need to allow it to play with the target's address space (i.e. mmap_process()). I don't think uprobes justifies this by itself, but mmap_process() can be very useful for sandboxing with seccomp.
>
> mmap_process() sounds utterly gross, one process playing with another process's address space.. yuck!

This is debugging. We're playing with registers, we're playing with the cpu, we're playing with memory contents. Why not the address space as well?

For seccomp, this really should be generalized. Run a system call on behalf of another process, but don't let that process do anything to affect it. I think Google is doing something clever with one thread in seccomp mode and another unconstrained, but that's very hacky - you have to stop the constrained thread so it can't interfere with the live one.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Sat, 2010-01-16 at 18:48 -0500, Jim Keniston wrote:

> As you may have noted before, I think FP would be a special problem for your approach. I'm not sure how folks would react to the idea of executing FP instructions in kernel space. But emulating them is also tough. There's an IEEE FP emulation package somewhere in one of the Linux arch directories, but I'm not sure how precise it is, and dropping even 1 bit of precision is unacceptable for many applications, since such errors tend to grow in complex computations employing many FP instructions.

Well, we have kernel space using FP/MMX/SSE-like things; it's not hard if you really need it. But in this case I think it's easier than normal, because we'll just allow it to change the userspace state, which is exactly what we want it to do.
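As a rough illustration of the concern quoted above about dropped precision growing over many FP instructions, the sketch below compares an ordinary double-precision computation with one where a hypothetical "emulator" clears the lowest mantissa bit of every result. Nothing here comes from the patch set or from any kernel FP emulation code. The names are made up, and the iteration (a logistic map with r = 3.9) is deliberately sensitive to small errors, so it shows a bad case rather than typical application behaviour.

```python
import struct


def drop_lowest_bit(x):
    """Clear the least significant mantissa bit of a double (1 bit of precision lost)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return struct.unpack("<d", struct.pack("<Q", bits & ~1))[0]


def orbit(x, steps, fixup=lambda v: v, r=3.9):
    """Iterate x -> r*x*(1-x), passing every result through fixup()."""
    out = []
    for _ in range(steps):
        x = fixup(r * x * (1.0 - x))
        out.append(x)
    return out


def max_divergence(x0, steps):
    """Largest gap between the full-precision run and the run that loses one bit per step."""
    exact = orbit(x0, steps)
    lossy = orbit(x0, steps, fixup=drop_lowest_bit)
    return max(abs(a - b) for a, b in zip(exact, lossy))


if __name__ == "__main__":
    for steps in (5, 50, 300):
        print(steps, max_divergence(0.3, steps))
```

After a handful of steps the two runs still agree closely; over a few hundred steps they should drift far apart, even though each individual result lost only its last bit.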
Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Sat, 2010-01-16 at 19:12 -0500, Bryan Donlan wrote:

> On Fri, Jan 15, 2010 at 7:58 PM, Jim Keniston jkeni...@us.ibm.com wrote:
>
>> 4. Emulation removes the need for the XOL area, but requires pretty much total knowledge of the instruction set. It's also a performance win for architectures that can't do #3. I see kvm implemented on 4 architectures (ia64, powerpc, s390, x86). Coincidentally, those are the architectures to which uprobes (old uprobes, with ubp and xol bundled in) has already been ported (though Intel hasn't been maintaining their ia64 port).
>>
>> So it sort of comes down to how objectionable the XOL vma (or page) really is.
>
> On x86 at least, wouldn't one option be to run the instruction to be emulated in CPL ('ring') 2, from a XOL page above the user-kernel split, not accessible to userspace at CPL 3? Linux hasn't traditionally used anything other than CPL 0 and CPL 3 (plus CPL 1 on Xen), but it would seem to avoid many of the problems here - it's invisible to normal userspace code and so doesn't pollute userspace memory maps with kernel-private stuff, but since it's running at a higher CPL than the kernel, we can still protect kernel memory and protect against privileged instructions.

Another option is to go play games with the RPL of the user data segments when we load them. But yeah, something like this seems to nicely deal with the protection issues.
Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)
On Sun, 2010-01-17 at 21:33 +0200, Avi Kivity wrote:

> On 01/17/2010 05:03 PM, Peter Zijlstra wrote:
>
>>> btw, an alternative is to require the caller to provide the address space for this. If the caller is in another process, we need to allow it to play with the target's address space (i.e. mmap_process()). I don't think uprobes justifies this by itself, but mmap_process() can be very useful for sandboxing with seccomp.
>>
>> mmap_process() sounds utterly gross, one process playing with another process's address space.. yuck!
>
> This is debugging. We're playing with registers, we're playing with the cpu, we're playing with memory contents. Why not the address space as well?

Because you want things to be as transparent as possible in order to avoid heisenbugs. Sure, we cannot avoid everything, but we should avoid everything we possibly can.

Also, aside from the VDSO, we simply do not force-map things into address spaces (and as I said before, I think the VDSO stinks for doing that), and I think we don't want to create (more) precedents in this case.