On 01/27/2010 12:23 PM, Ingo Molnar wrote:
* Avi Kivity a...@redhat.com wrote:
(back from vacation)
If so then you ignore the obvious solution to _that_ problem: don't use
INT3 at all, but rebuild (or re-JIT) your program with explicit callbacks.
It's _MUCH_ faster than _any_ breakpoint
On 01/27/2010 10:24 AM, Ingo Molnar wrote:
Not to mention that that process could wreck the trace data rendering it
utterly unreliable.
It could, but it also might not. Are we going to deny high performance
tracing to users just because it doesn't work in all cases?
Tracing
* Avi Kivity a...@redhat.com wrote:
On 01/27/2010 10:24 AM, Ingo Molnar wrote:
Not to mention that that process could wreck the trace data rendering it
utterly unreliable.
It could, but it also might not. Are we going to deny high performance
tracing to users just because it doesn't
* Avi Kivity a...@redhat.com wrote:
If so then you ignore the obvious solution to _that_ problem: don't use
INT3 at all, but rebuild (or re-JIT) your program with explicit callbacks.
It's _MUCH_ faster than _any_ breakpoint based solution - literally just
the cost of a function call
On Sun 2010-01-17 16:01:46, Peter Zijlstra wrote:
On Sun, 2010-01-17 at 16:56 +0200, Avi Kivity wrote:
On 01/17/2010 04:52 PM, Peter Zijlstra wrote:
Also, if it's fixed size you're imposing artificial limits on the number
of possible probes.
Obviously we'll need a limit, a
On Mon, Jan 18, 2010 at 02:15:51PM +0100, Peter Zijlstra wrote:
On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
Well, the alternatives are very unappealing. Emulation and
single-stepping are going to be very slow compared to a couple
On 01/19/2010 07:47 PM, Jim Keniston wrote:
This is still with a kernel entry, yes?
Yes, this involves setting a breakpoint and trapping into the kernel
when it's hit. The 6-7x figure is with the current 2-trap approach
(breakpoint, single-step). Boosting could presumably make that
On Wed, 2010-01-20 at 11:43 +0200, Avi Kivity wrote:
1. Write a trace entry into shared memory, trap into the kernel on overflow.
2. Trap if a condition is satisfied (fast watchpoint implementation).
So now you want to consume more of a process' address space to store
trace data as well? Not to
On Wed, Jan 20, 2010 at 12:06:20PM +0530, Srikar Dronamraju wrote:
* Frederic Weisbecker fweis...@gmail.com [2010-01-19 19:06:12]:
On Tue, Jan 19, 2010 at 09:47:45AM -0800, Jim Keniston wrote:
What does the code in the jumped-to vma do? Is the instrumentation code
that corresponds
On 01/20/2010 11:57 AM, Peter Zijlstra wrote:
On Wed, 2010-01-20 at 11:43 +0200, Avi Kivity wrote:
1. Write a trace entry into shared memory, trap into the kernel on overflow.
2. Trap if a condition is satisfied (fast watchpoint implementation).
So now you want to consume more of a
On 01/20/2010 12:45 PM, Srikar Dronamraju wrote:
What does the code in the jumped-to vma do?
1. Write a trace entry into shared memory, trap into the kernel on overflow.
2. Trap if a condition is satisfied (fast watchpoint implementation).
That looks to be a nice idea. We should
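A minimal sketch of option 1 from the list quoted above (write a trace entry into a shared buffer, leave the fast path only on overflow). Everything here is hypothetical: `trace_buf`, `TRACE_SLOTS`, `trace_write()` and `flush_to_kernel()` are illustrative names, not part of the uprobes patches, and the "trap" is modelled as an ordinary function call that counts flushes and resets the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity of the shared trace buffer, in entries. */
#define TRACE_SLOTS 4

/* Hypothetical layout of a buffer shared between probe code and a consumer. */
struct trace_buf {
    size_t   head;                  /* next free slot */
    uint64_t entries[TRACE_SLOTS];  /* recorded trace values */
    int      flushes;               /* how often the slow path was taken */
};

/* Stand-in for the trap into the kernel: drain the buffer and reset it. */
static void flush_to_kernel(struct trace_buf *b)
{
    b->flushes++;
    b->head = 0;
}

/* Fast path: store one entry; take the slow path only when the buffer is full. */
static void trace_write(struct trace_buf *b, uint64_t value)
{
    if (b->head == TRACE_SLOTS)
        flush_to_kernel(b);
    assert(b->head < TRACE_SLOTS);
    b->entries[b->head++] = value;
}
```

With four slots, ten consecutive writes take the slow path twice, so the cost of the trap is spread over several probe hits rather than paid on each one.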
Peter Zijlstra pet...@infradead.org writes:
With CPL2 or RPL on user segments the protection issue seems to be
manageable for running the instructions from kernel space.
Nope -- it doesn't work on 64bit and even on 32bit can have large
costs on some CPUs.
Also designing 32bit only features
Frederic Weisbecker wrote:
On Tue, Jan 19, 2010 at 09:47:45AM -0800, Jim Keniston wrote:
Do you have plans for a variant
that's completely in userspace?
I don't know of any such plans, but I'd be interested to read more of
your thoughts here. As I understand it, you've suggested replacing
On 01/19/2010 12:15 AM, Jim Keniston wrote:
I don't like the idea but if the performance benefits are real (are
they?),
Based on what seems to be the closest thing to an apples-to-apples
comparison -- counting the number of calls to a specified function --
uprobes is 6-7 times faster
On Tue, 2010-01-19 at 10:07 +0200, Avi Kivity wrote:
On 01/19/2010 12:15 AM, Jim Keniston wrote:
I don't like the idea but if the performance benefits are real (are
they?),
Based on what seems to be the closest thing to an apples-to-apples
comparison -- counting the number of
On Tue, Jan 19, 2010 at 09:47:45AM -0800, Jim Keniston wrote:
Do you have plans for a variant
that's completely in userspace?
I don't know of any such plans, but I'd be interested to read more of
your thoughts here. As I understand it, you've suggested replacing the
probed instruction
* Frederic Weisbecker fweis...@gmail.com [2010-01-19 19:06:12]:
On Tue, Jan 19, 2010 at 09:47:45AM -0800, Jim Keniston wrote:
What does the code in the jumped-to vma do? Is the instrumentation code
that corresponds to the uprobe handlers encoded in an ad hoc .so?
Once the
On 01/18/2010 09:45 AM, Peter Zijlstra wrote:
This is debugging. We're playing with registers, we're playing with the
cpu, we're playing with memory contents. Why not the address space as well?
Because you want things to be as transparent as possible in order to
avoid heisenbugs.
On Mon, 2010-01-18 at 13:01 +0200, Avi Kivity wrote:
You've made it clear that you don't like it, but not why.
The kernel already manages the user's address space (except for
MAP_FIXED which is unreliable unless you've already reserved the address
space). I don't see why adding a vma
On Mon, 2010-01-18 at 13:01 +0200, Avi Kivity wrote:
If we reserve some address space, you don't add any heisenbugs (at
least, not any additional ones over emulation). Even if we don't,
address space layout randomization means we're not keeping the address
space layout constant between
On 01/18/2010 01:44 PM, Peter Zijlstra wrote:
On Mon, 2010-01-18 at 13:01 +0200, Avi Kivity wrote:
You've made it clear that you don't like it, but not why.
The kernel already manages the user's address space (except for
MAP_FIXED which is unreliable unless you've already reserved the
On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
Maybe you place no value on uprobes. But people who debug userspace
likely will see a reason.
I do see value in uprobes, I just don't like it mucking about with the
address space. Nor does it appear required.
On 01/18/2010 02:06 PM, Peter Zijlstra wrote:
On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
Maybe you place no value on uprobes. But people who debug userspace
likely will see a reason.
I do see value in uprobes, I just don't like it mucking about with the
address space. Nor
Hi Avi,
On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
Maybe you place no value on uprobes. But people who debug userspace
likely will see a reason.
On 01/18/2010 02:06 PM, Peter Zijlstra wrote:
I do see value in uprobes, I just don't like it mucking about with the
address space. Nor
On 01/18/2010 02:13 PM, Pekka Enberg wrote:
So how big chunks of the address space are we talking here for uprobes?
That's for the authors to answer, but at a guess, 32 bytes per probe
(largest x86 instruction is 15 bytes), so 32 MB will give you a million
probes. That's a piece of cake
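The arithmetic behind that guess, as a small sketch. The 32-byte slot size is the figure guessed in the message above, not a value taken from the patches, and `XOL_SLOT_BYTES` and `xol_slots()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Guessed slot size: 32 bytes, enough for the largest x86 instruction (15 bytes). */
#define XOL_SLOT_BYTES 32u

/* Number of whole probe slots that fit in an area of the given size. */
static size_t xol_slots(size_t area_bytes)
{
    return area_bytes / XOL_SLOT_BYTES;
}
```

Under this assumption a 32 MB area holds 2^25 / 2^5 = 2^20 (about one million) slots, and a single 4096-byte page holds 128.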
On Mon, 2010-01-18 at 14:17 +0200, Avi Kivity wrote:
On 01/18/2010 02:13 PM, Pekka Enberg wrote:
So how big chunks of the address space are we talking here for uprobes?
That's for the authors to answer, but at a guess, 32 bytes per probe
(largest x86 instruction is 15 bytes), so 32
* Avi Kivity a...@redhat.com [2010-01-18 14:17:10]:
On 01/18/2010 02:13 PM, Pekka Enberg wrote:
So how big chunks of the address space are we talking here for uprobes?
That's for the authors to answer, but at a guess, 32 bytes per probe
(largest x86 instruction is 15 bytes), so 32 MB will
On Mon, Jan 18, 2010 at 2:44 PM, Srikar Dronamraju
sri...@linux.vnet.ibm.com wrote:
* Avi Kivity a...@redhat.com [2010-01-18 14:17:10]:
On 01/18/2010 02:13 PM, Pekka Enberg wrote:
So how big chunks of the address space are we talking here for uprobes?
That's for the authors to answer, but at
On 01/18/2010 02:51 PM, Pekka Enberg wrote:
And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?
I don't think a user will ever come close to a million, but we can
expect some inflation from inlined
On 01/18/2010 02:51 PM, Pekka Enberg wrote:
And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?
Avi Kivity kirjoitti:
I don't think a user will ever come close to a million, but we can
expect some inflation
On 01/18/2010 02:57 PM, Pekka Enberg wrote:
On 01/18/2010 02:51 PM, Pekka Enberg wrote:
And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?
Avi Kivity kirjoitti:
I don't think a user will ever come close to a
On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
Well, the alternatives are very unappealing. Emulation and
single-stepping are going to be very slow compared to a couple of jumps.
With CPL2 or RPL on user segments the protection
On 01/18/2010 03:15 PM, Peter Zijlstra wrote:
On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
Well, the alternatives are very unappealing. Emulation and
single-stepping are going to be very slow compared to a couple of
On Mon, 2010-01-18 at 14:53 +0200, Avi Kivity wrote:
On 01/18/2010 02:51 PM, Pekka Enberg wrote:
And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?
I don't think a user will ever come close to a
On Mon, Jan 18, 2010 at 02:15:51PM +0100, Peter Zijlstra wrote:
On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
Well, the alternatives are very unappealing. Emulation and
single-stepping are going to be very slow compared to a couple
On Mon, Jan 18, 2010 at 02:13:25PM +0200, Pekka Enberg wrote:
Hi Avi,
On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
Maybe you place no value on uprobes. But people who debug userspace
likely will see a reason.
On 01/18/2010 02:06 PM, Peter Zijlstra wrote:
I do see value in
Jim Keniston wrote:
Not really. For #3 (boosting), you need to know everything for #2,
plus be able to compute the length of each instruction -- which we can
now do for x86. To emulate an instruction (#4), you need to replicate
what it does, side-effects and all. The x86 instruction
On 01/18/2010 05:43 PM, Ananth N Mavinakayanahalli wrote:
Well, the alternatives are very unappealing. Emulation and single-stepping
are going to be very slow compared to a couple of jumps.
So how big chunks of the address space are we talking here for uprobes?
As Srikar
On Mon, Jan 18, 2010 at 06:52:32PM +0200, Avi Kivity wrote:
On 01/18/2010 05:43 PM, Ananth N Mavinakayanahalli wrote:
Well, the alternatives are very unappealing. Emulation and single-stepping
are going to be very slow compared to a couple of jumps.
So how big chunks of the address
On Mon, 2010-01-18 at 10:58 -0500, Masami Hiramatsu wrote:
Jim Keniston wrote:
Not really. For #3 (boosting), you need to know everything for #2,
plus be able to compute the length of each instruction -- which we can
now do for x86. To emulate an instruction (#4), you need to
On Mon, 2010-01-18 at 14:34 +0100, Mark Wielaard wrote:
On Mon, 2010-01-18 at 14:53 +0200, Avi Kivity wrote:
On 01/18/2010 02:51 PM, Pekka Enberg wrote:
And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than
Jim Keniston wrote:
On Mon, 2010-01-18 at 10:58 -0500, Masami Hiramatsu wrote:
Jim Keniston wrote:
Not really. For #3 (boosting), you need to know everything for #2,
plus be able to compute the length of each instruction -- which we can
now do for x86. To emulate an instruction (#4),
On 01/16/2010 02:58 AM, Jim Keniston wrote:
I hear (er, read) you. Emulation may turn out to be the answer for some
architectures. But here are some things to keep in mind about the
various approaches:
1. Single-stepping inline is easiest: you need to know very little about
the instruction
On Sun, 2010-01-17 at 16:56 +0200, Avi Kivity wrote:
On 01/17/2010 04:52 PM, Peter Zijlstra wrote:
Also, if it's fixed size you're imposing artificial limits on the number
of possible probes.
Obviously we'll need a limit, a uprobe will also take kernel memory, we
can't allow people
On Sun, 2010-01-17 at 16:59 +0200, Avi Kivity wrote:
On 01/17/2010 04:52 PM, Peter Zijlstra wrote:
On Sun, 2010-01-17 at 16:39 +0200, Avi Kivity wrote:
On 01/15/2010 11:50 AM, Peter Zijlstra wrote:
As previously stated, I think poking at a process's address space is an
utter
On 01/17/2010 05:03 PM, Peter Zijlstra wrote:
btw, an alternative is to require the caller to provide the address
space for this. If the caller is in another process, we need to allow
it to play with the target's address space (i.e. mmap_process()). I
don't think uprobes justifies this by
On Sat, 2010-01-16 at 18:48 -0500, Jim Keniston wrote:
As you may have noted before, I think FP would be a special problem
for your approach. I'm not sure how folks would react to the idea of
executing FP instructions in kernel space. But emulating them is also
tough. There's an IEEE
On Sat, 2010-01-16 at 19:12 -0500, Bryan Donlan wrote:
On Fri, Jan 15, 2010 at 7:58 PM, Jim Keniston jkeni...@us.ibm.com wrote:
4. Emulation removes the need for the XOL area, but requires pretty much
total knowledge of the instruction set. It's also a performance win for
architectures
On Sun, 2010-01-17 at 21:33 +0200, Avi Kivity wrote:
On 01/17/2010 05:03 PM, Peter Zijlstra wrote:
btw, an alternative is to require the caller to provide the address
space for this. If the caller is in another process, we need to allow
it to play with the target's address space (i.e.
On Fri, Jan 15, 2010 at 7:58 PM, Jim Keniston jkeni...@us.ibm.com wrote:
4. Emulation removes the need for the XOL area, but requires pretty much
total knowledge of the instruction set. It's also a performance win for
architectures that can't do #3. I see kvm implemented on 4
architectures
On Thu, 2010-01-14 at 11:46 -0800, Jim Keniston wrote:
discussed elsewhere.
Thanks for the pointer...
On Fri, Jan 15, 2010 at 10:03:48AM +0100, Peter Zijlstra wrote:
On Thu, 2010-01-14 at 11:46 -0800, Jim Keniston wrote:
discussed elsewhere.
Thanks for the pointer...
:-)
Peter,
I think Jim was referring to
http://sources.redhat.com/ml/systemtap/2007-q1/msg00571.html
Ananth
On Fri, 2010-01-15 at 15:08 +0530, Ananth N Mavinakayanahalli wrote:
On Fri, Jan 15, 2010 at 10:03:48AM +0100, Peter Zijlstra wrote:
On Thu, 2010-01-14 at 11:46 -0800, Jim Keniston wrote:
discussed elsewhere.
Thanks for the pointer...
:-)
Peter,
I think Jim was referring to
On Fri, 2010-01-15 at 15:40 +0530, Ananth N Mavinakayanahalli wrote:
Ideas?
emulate the one instruction?
On Fri, Jan 15, 2010 at 11:13:32AM +0100, Peter Zijlstra wrote:
On Fri, 2010-01-15 at 15:40 +0530, Ananth N Mavinakayanahalli wrote:
Ideas?
emulate the one instruction?
In kernel? Generically? Don't think it's that easy for userspace --
you have the full gamut of instructions to emulate
On Fri, 2010-01-15 at 15:52 +0530, Ananth N Mavinakayanahalli wrote:
On Fri, Jan 15, 2010 at 11:13:32AM +0100, Peter Zijlstra wrote:
On Fri, 2010-01-15 at 15:40 +0530, Ananth N Mavinakayanahalli wrote:
Ideas?
emulate the one instruction?
In kernel? Generically? Don't think it's
On Fri, 2010-01-15 at 10:02 +0100, Peter Zijlstra wrote:
On Thu, 2010-01-14 at 11:46 -0800, Jim Keniston wrote:
+Instruction copies to be single-stepped are stored in a per-process
+single-step out of line (XOL) area, which is a little VM area
+created by Uprobes in each probed
On Fri, 2010-01-15 at 13:07 -0800, Jim Keniston wrote:
On Fri, 2010-01-15 at 10:02 +0100, Peter Zijlstra wrote:
On Thu, 2010-01-14 at 11:46 -0800, Jim Keniston wrote:
+Instruction copies to be single-stepped are stored in a per-process
+single-step out of line (XOL) area, which is a
On Mon, 2010-01-11 at 17:55 +0530, Srikar Dronamraju wrote:
User Space Breakpoint Assistance Layer (UBP)
User space breakpointing Infrastructure provides kernel subsystems
with architecture independent interface to establish breakpoints in
user applications. This patch provides core
On Thu, 2010-01-14 at 12:08 +0100, Peter Zijlstra wrote:
On Mon, 2010-01-11 at 17:55 +0530, Srikar Dronamraju wrote:
User Space Breakpoint Assistance Layer (UBP)
User space breakpointing Infrastructure provides kernel subsystems
with architecture independent interface to establish