Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 09:45 AM, Peter Zijlstra wrote:



This is debugging.  We're playing with registers, we're playing with the
cpu, we're playing with memory contents.  Why not the address space as well?
 

Because you want things to be as transparent as possible in order to
avoid heisenbugs. Sure we cannot avoid everything, but we should avoid
everything we possibly can.
   


If we reserve some address space, you don't add any heisenbugs (at 
least, not any additional ones over emulation).  Even if we don't, 
address space layout randomization means we're not keeping the address 
space layout constant between runs anyway.



Also, aside from the VDSO, we simply do not force-map things into address
spaces (and as I said before, I think the VDSO stinks for doing that)
and I think we don't want to create (more) precedents in this case.
   


You've made it clear that you don't like it, but not why.

The kernel already manages the user's address space (except for 
MAP_FIXED which is unreliable unless you've already reserved the address 
space).  I don't see why adding a vma for debugging is so horrible.


--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Peter Zijlstra
On Mon, 2010-01-18 at 13:01 +0200, Avi Kivity wrote:
 
 You've made it clear that you don't like it, but not why.
 
 The kernel already manages the user's address space (except for 
 MAP_FIXED which is unreliable unless you've already reserved the address 
 space).  I don't see why adding a vma for debugging is so horrible. 

Well, the kernel only does what the user (and loader) tell it through
mmap(). Other than that we never (except this VDSO thing) inject vmas,
and I see no reason to start doing that now.



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Peter Zijlstra
On Mon, 2010-01-18 at 13:01 +0200, Avi Kivity wrote:
 If we reserve some address space, you don't add any heisenbugs (at 
 least, not any additional ones over emulation).  Even if we don't, 
 address space layout randomization means we're not keeping the address 
 space layout constant between runs anyway. 

Well, it still limits the number of probes to the reserved area. If you
want more you need to grow the area.. which then changes the state.



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 01:44 PM, Peter Zijlstra wrote:

On Mon, 2010-01-18 at 13:01 +0200, Avi Kivity wrote:
   

You've made it clear that you don't like it, but not why.

The kernel already manages the user's address space (except for
MAP_FIXED which is unreliable unless you've already reserved the address
space).  I don't see why adding a vma for debugging is so horrible.
 

Well, the kernel only does what the user (and loader) tell it through
mmap().


What I meant was that the kernel chooses the addresses (unless you go 
the MAP_FIXED way).  From the user's point of view, there is no change 
in behaviour: the kernel picks an address.  If the constraints have 
changed (because we reserve a range), that doesn't affect the user.



Other than that we never (except this VDSO thing) inject vmas,
and I see no reason to start doing that now.
   


Maybe you place no value on uprobes.  But people who debug userspace 
likely will see a reason.


--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Peter Zijlstra
On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
 
 Maybe you place no value on uprobes.  But people who debug userspace 
 likely will see a reason.

I do see value in uprobes, I just don't like it mucking about with the
address space. Nor does it appear required. 



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 02:06 PM, Peter Zijlstra wrote:

On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
   

Maybe you place no value on uprobes.  But people who debug userspace
likely will see a reason.
 

I do see value in uprobes, I just don't like it mucking about with the
address space. Nor does it appear required.
   


Well, the alternatives are very unappealing.  Emulation and 
single-stepping are going to be very slow compared to a couple of jumps.


--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Pekka Enberg
Hi Avi,

On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
 Maybe you place no value on uprobes.  But people who debug userspace
 likely will see a reason.

On 01/18/2010 02:06 PM, Peter Zijlstra wrote:
 I do see value in uprobes, I just don't like it mucking about with the
 address space. Nor does it appear required.

On Mon, Jan 18, 2010 at 2:09 PM, Avi Kivity a...@redhat.com wrote:
 Well, the alternatives are very unappealing.  Emulation and single-stepping
 are going to be very slow compared to a couple of jumps.

So how big chunks of the address space are we talking here for uprobes?



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 02:13 PM, Pekka Enberg wrote:

So how big chunks of the address space are we talking here for uprobes?
   


That's for the authors to answer, but at a guess, 32 bytes per probe 
(largest x86 instruction is 15 bytes), so 32 MB will give you a million 
probes.  That's a piece of cake for x86-64, probably harder to justify 
for i386.
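
As a rough check of that estimate, here is a minimal sketch of the
arithmetic. The helper name probes_in_region is made up for illustration;
the 32-byte slot and the 32 MB region are the guesses quoted above, not
measured values.

```python
# Hypothetical helper: how many fixed-size probe slots fit in a
# reserved region of the address space.
def probes_in_region(region_bytes: int, slot_bytes: int) -> int:
    return region_bytes // slot_bytes


MB = 1024 * 1024

if __name__ == "__main__":
    # 32 MB of reserved address space, 32 bytes per probe (the guess above).
    print(probes_in_region(32 * MB, 32))  # 1048576, i.e. about a million
```

Halving the slot size doubles the count for the same region, so the slot
size matters more on i386 than on x86-64.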


--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Peter Zijlstra
On Mon, 2010-01-18 at 14:17 +0200, Avi Kivity wrote:
 On 01/18/2010 02:13 PM, Pekka Enberg wrote:
  So how big chunks of the address space are we talking here for uprobes?
 
 
 That's for the authors to answer, but at a guess, 32 bytes per probe 
 (largest x86 instruction is 15 bytes), so 32 MB will give you a million 
 probes.  That's a piece of cake for x86-64, probably harder to justify 
 for i386.

Yeah, I'm aware of people turning off address space randomization to
gain more virtual space on i386, I'm pretty sure those folks aren't
going to be happy if we shrink it.

Let alone them trying to probe their app.



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Srikar Dronamraju
* Avi Kivity a...@redhat.com [2010-01-18 14:17:10]:

 On 01/18/2010 02:13 PM, Pekka Enberg wrote:
 So how big chunks of the address space are we talking here for uprobes?
 
 That's for the authors to answer, but at a guess, 32 bytes per probe
 (largest x86 instruction is 15 bytes), so 32 MB will give you a
 million probes.  That's a piece of cake for x86-64, probably harder
 to justify for i386.


On x86, each probe takes 16 bytes.
In the current implementation of XOL, the first hit of a breakpoint
requires us to allocate a page. If that page fills up with active
breakpoints, we expand the area by adding a page. A bitmap tracks
whether a previously used breakpoint has been removed, so that its
slot can be reused.  By active breakpoints, I mean those that have been
inserted and trapped at least once but not yet removed.
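
The allocation scheme described above can be modelled roughly as follows.
This is only a toy sketch in Python, not the XOL code from the patch: the
class name XolArea, the 4096-byte page size and the first-fit search are
all assumptions made for illustration. Only the 16-byte slot, the
page-at-a-time growth and the reuse bitmap come from the description.

```python
PAGE_SIZE = 4096                         # assumed page size
SLOT_SIZE = 16                           # bytes per probe slot on x86
SLOTS_PER_PAGE = PAGE_SIZE // SLOT_SIZE  # 256 slots per page


class XolArea:
    """Toy model: one flag per slot, pages are added only on demand."""

    def __init__(self):
        self.bitmap = []  # True = slot holds an active breakpoint

    def alloc_slot(self) -> int:
        # Reuse a slot whose breakpoint was removed, if there is one.
        for i, used in enumerate(self.bitmap):
            if not used:
                self.bitmap[i] = True
                return i
        # First hit ever, or every slot is busy: add one more page.
        first_new = len(self.bitmap)
        self.bitmap.extend([False] * SLOTS_PER_PAGE)
        self.bitmap[first_new] = True
        return first_new

    def free_slot(self, slot: int) -> None:
        self.bitmap[slot] = False

    def pages(self) -> int:
        return len(self.bitmap) // SLOTS_PER_PAGE

    def slot_offset(self, slot: int) -> int:
        # Byte offset of the slot from the start of the area.
        return slot * SLOT_SIZE
```

In this model no page exists until the first breakpoint is hit, and the
area grows only once all 256 slots of the existing pages are in use.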

Jim did try a few other allocation techniques, but those that involved
slot stealing ended up needing locking. People who looked at that
code advised us to reduce the locking and keep the allocation simple
(at least for the first cut).

--
Thanks and Regards
Srikar

 
 



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Pekka Enberg
On Mon, Jan 18, 2010 at 2:44 PM, Srikar Dronamraju
sri...@linux.vnet.ibm.com wrote:
 * Avi Kivity a...@redhat.com [2010-01-18 14:17:10]:

 On 01/18/2010 02:13 PM, Pekka Enberg wrote:
 So how big chunks of the address space are we talking here for uprobes?

 That's for the authors to answer, but at a guess, 32 bytes per probe
 (largest x86 instruction is 15 bytes), so 32 MB will give you a
 million probes.  That's a piece of cake for x86-64, probably harder
 to justify for i386.

 On x86, each probe takes 16 bytes.

And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 02:51 PM, Pekka Enberg wrote:


And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?
   


I don't think a user will ever come close to a million, but we can 
expect some inflation from inlined functions (I don't know if uprobes 
replicates such probes, but if it doesn't, it should).


--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Pekka Enberg

On 01/18/2010 02:51 PM, Pekka Enberg wrote:

And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?


Avi Kivity wrote:
I don't think a user will ever come close to a million, but we can 
expect some inflation from inlined functions (I don't know if uprobes 
replicates such probes, but if it doesn't, it should).


Right. I guess we're looking at a few megabytes of the address space for 
normal scenarios, which doesn't seem too excessive.


However, as Peter pointed out, the bigger problem is that now we're 
opening the door for other features to steal chunks of the address 
space. And I think it's a legitimate worry that it's going to cause 
problems for 32-bit in the future.


I don't like the idea but if the performance benefits are real (are 
they?), maybe it's a worthwhile trade-off. Dunno.


Pekka



Re: [RFC] [PATCH 7/7] Ftrace plugin for Uprobes

2010-01-18 Thread Frederic Weisbecker
On Thu, Jan 14, 2010 at 01:29:09PM +0100, Peter Zijlstra wrote:
 On Thu, 2010-01-14 at 13:23 +0100, Frederic Weisbecker wrote:
  
  I see, so what you suggest is to have the probe set up
  as generic first. Then the process that activates it
  becomes a consumer, right?
 
 Right, so either we have it always on, for things like ftrace, 
 
   in which case the creation traverses rmap and installs the probes in
   all existing mmap()s, and a mmap() hook will install it on all new
   ones.
 
 Or they're strictly consumer driven, like perf, in which case the act of
 enabling the event will install the probe (if it's not there yet).
 


Looks like a good plan.



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 02:57 PM, Pekka Enberg wrote:

On 01/18/2010 02:51 PM, Pekka Enberg wrote:

And how many probes do we expect to be live at the same time in
real-world scenarios? I guess Avi's one million is more than enough?


Avi Kivity wrote:
I don't think a user will ever come close to a million, but we can 
expect some inflation from inlined functions (I don't know if uprobes 
replicates such probes, but if it doesn't, it should).


Right. I guess we're looking at a few megabytes of the address space for 
normal scenarios, which doesn't seem too excessive.


However, as Peter pointed out, the bigger problem is that now we're 
opening the door for other features to steal chunks of the address 
space. And I think it's a legitimate worry that it's going to cause 
problems for 32-bit in the future.


I don't like the idea but if the performance benefits are real (are 
they?), maybe it's a worthwhile trade-off. Dunno.


If uprobes can trace to buffer memory in the process address space, I 
think the win can be dramatic.  Incidentally it will require injecting 
even more vmas into a process.


Basically it means very low cost tracing, like the kernel tracers.

--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Peter Zijlstra
On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
 On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
 
  Well, the alternatives are very unappealing.  Emulation and
  single-stepping are going to be very slow compared to a couple of jumps.
   
  With CPL2 or RPL on user segments the protection issue seems to be
  manageable for running the instructions from kernel space.
 
 
 CPL2 gives unrestricted access to the kernel address space; and RPL does 
 not affect page level protection.  Segment limits don't work on x86-64.  
 But perhaps I missed something - these things are tricky.

So setting RPL to 3 on the user segments allows access to kernel pages
just fine? How useful.. :/

 It should be possible to translate the instruction into an address space 
 check, followed by the action, but that's still slower due to privilege 
 level switches.

Well, if you manage to do the address validation you don't need the priv
level switch anymore, right?

Are the ins encodings sane enough to recognize mem parameters without
needing to know the actual ins?

How about using a hw-breakpoint to close the gap for the inline single
step? You could even re-insert the int3 lazily when you need the
hw-breakpoint again. It would consume one hw-breakpoint register for
each task/cpu that has probes though..



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 03:15 PM, Peter Zijlstra wrote:

On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
   

On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
 
   

Well, the alternatives are very unappealing.  Emulation and
single-stepping are going to be very slow compared to a couple of jumps.

 

With CPL2 or RPL on user segments the protection issue seems to be
manageable for running the instructions from kernel space.

   

CPL2 gives unrestricted access to the kernel address space; and RPL does
not affect page level protection.  Segment limits don't work on x86-64.
But perhaps I missed something - these things are tricky.
 

So setting RPL to 3 on the user segments allows access to kernel pages
just fine? How useful.. :/
   


The further we stay away from segmentation, the better.  Thankfully AMD 
removed hardware task switching from x86-64 so we can't even think about 
that.



It should be possible to translate the instruction into an address space
check, followed by the action, but that's still slower due to privilege
level switches.
 

Well, if you manage to do the address validation you don't need the priv
level switch anymore, right?
   


Right.


Are the ins encodings sane enough to recognize mem parameters without
needing to know the actual ins?
   


No.  You need to know whether the instruction accesses memory or not.

Look at the tables at the beginning of arch/x86/kvm/emulate.c.  Opcodes 
marked with ModRM, BitOp, MemAbs, String, Stack are all different styles 
of memory instructions.  You need to know the operand size for the edge 
cases.  And there are probably a few special cases in the code.
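
To illustrate why the opcode alone has to be known, here is a minimal
sketch of such a table-driven check. The flag names are borrowed from the
list above, but the table is a tiny hand-picked sample and the function
touches_memory is hypothetical; it is not the contents or the API of
arch/x86/kvm/emulate.c, and it ignores prefixes, two-byte opcodes and
special cases.

```python
# Flag names borrowed from the message above; values are arbitrary bits.
MODRM, MEMABS, STRING, STACK = 1, 2, 4, 8

# Hand-picked one-byte opcodes, for illustration only.
OPCODE_FLAGS = {
    0x50: STACK,   # push of a register: implicit write to the stack
    0x89: MODRM,   # mov between register and r/m: depends on ModRM
    0xA0: MEMABS,  # mov between %al and an absolute memory offset
    0xA4: STRING,  # movsb: implicit memory operands via %esi/%edi
    0x90: 0,       # nop: no memory access
}


def touches_memory(insn: bytes) -> bool:
    """insn: a single instruction with a one-byte opcode and no prefixes."""
    flags = OPCODE_FLAGS[insn[0]]
    if flags & (MEMABS | STRING | STACK):
        return True
    if flags & MODRM:
        mod = insn[1] >> 6  # top two bits of the ModRM byte
        return mod != 3     # mod == 3 selects a register operand
    return False
```

For example, touches_memory(bytes([0x89, 0xE5])) is False for a
register-to-register mov, while the same opcode with ModRM byte 0x45
addresses memory through %ebp.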



How about using a hw-breakpoint to close the gap for the inline single
step? You could even re-insert the int3 lazily when you need the
hw-breakpoint again. It would consume one hw-breakpoint register for
each task/cpu that has probes though..
   


If you have more than four threads, it breaks, no?  And you need an IPI 
each time you hit the breakpoint.


Ultimately I'd like to see the breakpoint avoided as well, use a jump to 
the XOL area and trace in ~20 cycles instead of ~1000.


--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Mark Wielaard
On Mon, 2010-01-18 at 14:53 +0200, Avi Kivity wrote:
 On 01/18/2010 02:51 PM, Pekka Enberg wrote:
 
  And how many probes do we expect to be live at the same time in
  real-world scenarios? I guess Avi's one million is more than enough?
 
 I don't think a user will ever come close to a million, but we can 
 expect some inflation from inlined functions (I don't know if uprobes 
 replicates such probes, but if it doesn't, it should).

SystemTap by default places probes on all instances of an inlined
function. It is still hard to get to a million probes though.
$ stap -v -l 'process(/usr/bin/emacs).function(*)'
[...]
Pass 2: analyzed script: 4359 probe(s)

You can try probing all statements (for every function, in every file,
on every line of source code), but even that only adds up to tens of
thousands of probes:
$ stap -v -l 'process(/usr/bin/emacs).statement(*...@*:*)'
[...]
Pass 2: analyzed script: 39603 probe(s)

So a million is pretty far out, even if you add larger programs and all
the shared libraries they are using.

As Srikar said the current allocation technique is the simplest you can
do, one xol slot for each uprobe. But there are other techniques that
you can use. Theoretically you only need a xol slot for each thread of a
process that simultaneously hits a uprobe instance. That requires a bit
more bookkeeping. The variant of uprobes that systemtap uses at the
moment does that. But the locking in that case is pretty tricky, so it
seemed easier to first get the code with the simplest xol allocation
technique upstream. But if you do that then you can use a very small xol
area to support millions of uprobes and only have to expand it when
there are hundreds of threads in a process all hitting the probes
simultaneously.

Cheers,

Mark



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread K.Prasad
On Mon, Jan 18, 2010 at 02:15:51PM +0100, Peter Zijlstra wrote:
 On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
  On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
  
   Well, the alternatives are very unappealing.  Emulation and
   single-stepping are going to be very slow compared to a couple of jumps.

   With CPL2 or RPL on user segments the protection issue seems to be
   manageable for running the instructions from kernel space.
  
  
  CPL2 gives unrestricted access to the kernel address space; and RPL does 
  not affect page level protection.  Segment limits don't work on x86-64.  
  But perhaps I missed something - these things are tricky.
 
 So setting RPL to 3 on the user segments allows access to kernel pages
 just fine? How useful.. :/
 
  It should be possible to translate the instruction into an address space 
  check, followed by the action, but that's still slower due to privilege 
  level switches.
 
 Well, if you manage to do the address validation you don't need the priv
 level switch anymore, right?
 
 Are the ins encodings sane enough to recognize mem parameters without
 needing to know the actual ins?
 
 How about using a hw-breakpoint to close the gap for the inline single
 step? You could even re-insert the int3 lazily when you need the
 hw-breakpoint again. It would consume one hw-breakpoint register for
 each task/cpu that has probes though..


It is a very scarce resource. Sometimes all that we have is just one
hw-breakpoint register (like older PPC64 with 1 IABR) in the system. If
one process/thread consumes it, then all other contenders (from both
kernel and user-space) are prevented from acquiring it.

Not to mention the existence of processors with no support for
instruction breakpoints.

Thanks,
K.Prasad



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Ananth N Mavinakayanahalli
On Mon, Jan 18, 2010 at 02:13:25PM +0200, Pekka Enberg wrote:
 Hi Avi,
 
 On Mon, 2010-01-18 at 14:01 +0200, Avi Kivity wrote:
  Maybe you place no value on uprobes.  But people who debug userspace
  likely will see a reason.
 
 On 01/18/2010 02:06 PM, Peter Zijlstra wrote:
  I do see value in uprobes, I just don't like it mucking about with the
  address space. Nor does it appear required.
 
 On Mon, Jan 18, 2010 at 2:09 PM, Avi Kivity a...@redhat.com wrote:
  Well, the alternatives are very unappealing.  Emulation and single-stepping
  are going to be very slow compared to a couple of jumps.
 
 So how big chunks of the address space are we talking here for uprobes?

As Srikar mentioned, the least we start with is 1 page. Though you can
have as many probes as you want, there are certain optimizations we can
do, depending on the most common usecases.

For example, if you consider the start of a routine to be the most
commonly traced location, most routines in a binary would generally
start with the same instruction (say push %ebp), and we can refcount a
slot with that instruction to be used for all probes of the same
instruction.
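
A rough model of that refcounting idea, as a sketch only: the class
SharedSlots and its get/put methods are invented for illustration and do
not correspond to anything in the posted patches. The byte 0x55 is the
one-byte encoding of push %ebp.

```python
class SharedSlots:
    """Toy model: probes on identical instruction bytes share one slot."""

    def __init__(self):
        self.slots = {}     # instruction bytes -> [slot index, refcount]
        self.next_slot = 0  # next never-used slot index

    def get(self, insn: bytes) -> int:
        """Take a reference on the slot holding insn, creating it if needed."""
        entry = self.slots.get(insn)
        if entry is None:
            entry = [self.next_slot, 0]
            self.slots[insn] = entry
            self.next_slot += 1
        entry[1] += 1
        return entry[0]

    def put(self, insn: bytes) -> None:
        """Drop a reference; forget the slot when the last probe goes away."""
        entry = self.slots[insn]
        entry[1] -= 1
        if entry[1] == 0:
            del self.slots[insn]  # a real allocator could recycle the index


if __name__ == "__main__":
    s = SharedSlots()
    # Two probes at function entries that both start with push %ebp.
    print(s.get(b"\x55") == s.get(b"\x55"))  # True: one slot, refcount 2
```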

Ananth



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Masami Hiramatsu
Jim Keniston wrote:
 Not really.  For #3 (boosting), you need to know everything for #2,  
 plus be able to compute the length of each instruction -- which we can  
 now do for x86.  To emulate an instruction (#4), you need to replicate  
 what it does, side-effects and all.  The x86 instruction set seems to  
 be adding new floating-point instructions all the time, and I bet even  
 Masami doesn't know what they all do, but so far, they all seem to  
 adhere to the instruction-length rules encoded in Masami's instruction  
 decoder.

Actually, the current x86 decoder doesn't support FP (x87) instructions
(even though it already supports AVX). But I think it wouldn't be hard to add.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhira...@redhat.com



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Avi Kivity

On 01/18/2010 05:43 PM, Ananth N Mavinakayanahalli wrote:



Well, the alternatives are very unappealing.  Emulation and single-stepping
are going to be very slow compared to a couple of jumps.
   

So how big chunks of the address space are we talking here for uprobes?
 

As Srikar mentioned, the least we start with is 1 page. Though you can
have as many probes as you want, there are certain optimizations we can
do, depending on the most common usecases.

For example, if you consider the start of a routine to be the most
commonly traced location, most routines in a binary would generally
start with the same instruction (say push %ebp), and we can refcount a
slot with that instruction to be used for all probes of the same
instruction.
   


But then you can't follow the instruction with a jump back to the code...

--
error compiling committee.c: too many arguments to function



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Ananth N Mavinakayanahalli
On Mon, Jan 18, 2010 at 06:52:32PM +0200, Avi Kivity wrote:
 On 01/18/2010 05:43 PM, Ananth N Mavinakayanahalli wrote:

 Well, the alternatives are very unappealing.  Emulation and single-stepping
 are going to be very slow compared to a couple of jumps.

 So how big chunks of the address space are we talking here for uprobes?
  
 As Srikar mentioned, the least we start with is 1 page. Though you can
 have as many probes as you want, there are certain optimizations we can
 do, depending on the most common usecases.

 For example, if you consider the start of a routine to be the most
 commonly traced location, most routines in a binary would generally
 start with the same instruction (say push %ebp), and we can refcount a
 slot with that instruction to be used for all probes of the same
 instruction.


 But then you can't follow the instruction with a jump back to the code...

Right. This will work only for the non-boosted case where single-stepping
is mandatory. I guess the tradeoff is vma space and speed.

Ananth



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Jim Keniston
On Mon, 2010-01-18 at 10:58 -0500, Masami Hiramatsu wrote:
 Jim Keniston wrote:
  Not really.  For #3 (boosting), you need to know everything for #2,  
  plus be able to compute the length of each instruction -- which we can  
  now do for x86.  To emulate an instruction (#4), you need to replicate  
  what it does, side-effects and all.  The x86 instruction set seems to  
  be adding new floating-point instructions all the time, and I bet even  
  Masami doesn't know what they all do, but so far, they all seem to  
  adhere to the instruction-length rules encoded in Masami's instruction  
  decoder.
 
 Actually, the current x86 decoder doesn't support FP (x87) instructions
 (even though it already supports AVX). But I think it wouldn't be hard to add.
 

At one point I verified that it worked for all the x87 instructions in
libm:
https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
I'm pretty sure I tested mmx instructions as well.  But I guess this was
before you rearranged the opcode tables.

Yeah, it wouldn't be hard to add back in, at least for purposes of
computing instruction lengths.

Jim



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Jim Keniston
On Mon, 2010-01-18 at 14:34 +0100, Mark Wielaard wrote:
 On Mon, 2010-01-18 at 14:53 +0200, Avi Kivity wrote:
  On 01/18/2010 02:51 PM, Pekka Enberg wrote:
  
   And how many probes do we expect to be live at the same time in
   real-world scenarios? I guess Avi's one million is more than enough?
  
  I don't think a user will ever come close to a million, but we can 
  expect some inflation from inlined functions (I don't know if uprobes 
  replicates such probes, but if it doesn't, it should).
 
 SystemTap by default places probes on all instances of an inlined
 function. It is still hard to get to a million probes though.
 $ stap -v -l 'process(/usr/bin/emacs).function(*)'
 [...]
 Pass 2: analyzed script: 4359 probe(s)
 
 You can try probing all statements (for every function, in every file,
 on every line of source code), but even that only adds up to tens of
 thousands of probes:
 $ stap -v -l 'process(/usr/bin/emacs).statement(*...@*:*)'
 [...]
 Pass 2: analyzed script: 39603 probe(s)
 
 So a million is pretty far out, even if you add larger programs and all
 the shared libraries they are using.

Thanks, Mark.  One correction, below.

 
 As Srikar said the current allocation technique is the simplest you can
 do, one xol slot for each uprobe. But there are other techniques that
 you can use. Theoretically you only need a xol slot for each thread of a
 process that simultaneously hits a uprobe instance. That requires a bit
 more bookkeeping. The variant of uprobes that systemtap uses at the
 moment does that.

Actually, it's per-probepoint, with a fixed number of slots.  If the
probepoint you just hit doesn't have a slot, and none are free, you
steal a slot from another probepoint.  Yeah, it's messy.

We considered allocating slots per-thread, hoping to make it basically
lockless, but that way there's more likely to be constant scribbling on
the XOL area, as a thread with n slots cycles through n+m probepoints.
And of course, it gets dicey as the process clones more threads.

I guess the point is, there are a lot of ways to allocate slots, and we
haven't found the perfect algorithm yet, even if you accept the
existence of (and need for) the XOL area.  Keep the ideas coming.
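
To make the per-probepoint scheme with a fixed number of slots concrete,
here is a toy sketch. The class StealingSlots is hypothetical, and the
choice of victim (the least recently hit probepoint) is an assumption;
the description above does not say which probepoint gets robbed. Locking,
which is the hard part, is left out entirely.

```python
from collections import OrderedDict


class StealingSlots:
    """Toy model: a fixed pool of slots, each owned by one probepoint."""

    def __init__(self, nslots: int):
        self.nslots = nslots
        # probepoint address -> slot index, least recently hit first
        self.owner = OrderedDict()

    def slot_for(self, probepoint: int) -> int:
        if probepoint in self.owner:
            # Already has a slot: just mark it as recently hit.
            self.owner.move_to_end(probepoint)
            return self.owner[probepoint]
        if len(self.owner) < self.nslots:
            # A slot is still free; indices 0..len-1 are the ones in use.
            slot = len(self.owner)
        else:
            # None free: steal from the least recently hit probepoint
            # (assumed policy). The slot's contents get overwritten.
            _, slot = self.owner.popitem(last=False)
        self.owner[probepoint] = slot
        return slot
```

With nslots slots and more than nslots probepoints being hit in rotation,
every hit in this model ends in a steal, which is the constant scribbling
on the XOL area mentioned above.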

 But the locking in that case is pretty tricky, so it
 seemed easier to first get the code with the simplest xol allocation
 technique upstream. But if you do that then you can use a very small xol
 area to support millions of uprobes and only have to expand it when
 there are hundreds of threads in a process all hitting the probes
 simultaneously.
 
 Cheers,
 
 Mark
 

Jim



Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-18 Thread Masami Hiramatsu
Jim Keniston wrote:
 On Mon, 2010-01-18 at 10:58 -0500, Masami Hiramatsu wrote:
 Jim Keniston wrote:
 Not really.  For #3 (boosting), you need to know everything for #2,  
 plus be able to compute the length of each instruction -- which we can  
 now do for x86.  To emulate an instruction (#4), you need to replicate  
 what it does, side-effects and all.  The x86 instruction set seems to  
 be adding new floating-point instructions all the time, and I bet even  
 Masami doesn't know what they all do, but so far, they all seem to  
 adhere to the instruction-length rules encoded in Masami's instruction  
 decoder.

 Actually, the current x86 decoder doesn't support FP (x87) instructions
 (even though it already supports AVX). But I think it wouldn't be hard to add.

 
 At one point I verified that it worked for all the x87 instructions in
 libm:
 https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
 I'm pretty sure I tested mmx instructions as well.  But I guess this was
 before you rearranged the opcode tables.
 
 Yeah, it wouldn't be hard to add back in, at least for purposes of
 computing instruction lengths.

objdump -d /lib/libm.so.6  | awk -f arch/x86/tools/distill.awk | ./test_get_len 
Succeed: decoded and checked 37198 instructions

Hmm, yeah, that's already supported :-D.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhira...@redhat.com



Re: PTRACE_SYSCALL_ENTRY/EXIT

2010-01-18 Thread Roland McGrath
We don't have any particular plans to extend the ptrace interface.  
I strongly doubt we would even try to do anything like that until the
utrace-based ptrace interface is merged into Linux and the old ptrace
implementation gone.

In general, we are not looking for extensions to the ptrace interface.
It is an ugly hairball already and we are more interested in having 
the utrace API layer available inside the kernel and then embarking on
new and sane userland interfaces instead of shoehorning more into ptrace.

That said, some particular kinds of simple enhancements to ptrace are
really quite trivial to implement in the new utrace-based implementation.
The particular area you suggest is one of these.

What I would expect is not new variants of the one-shot interface like
PTRACE_SYSCALL.  Rather, I would envision new PTRACE_O_* options to enable
syscall entry and exit tracing analogous to the PTRACE_EVENT_* events you
can now enable.  This means that you make one PTRACE_SETOPTIONS call to
enable the set of events you want, and then use plain PTRACE_CONT (or
whatever).

If you really want exactly the one-shot behavior instead, then we could
consider that.  But, like I said, we are not looking to add much in the
way of new wrinkles to the dismal ptrace userland interface.


Thanks,
Roland