Re: [RFC] [PATCH 1/7] User Space Breakpoint Assistance Layer (UBP)

2010-01-22 Thread Mel Gorman
On Mon, Jan 18, 2010 at 02:15:51PM +0100, Peter Zijlstra wrote:
 On Mon, 2010-01-18 at 14:37 +0200, Avi Kivity wrote:
  On 01/18/2010 02:14 PM, Peter Zijlstra wrote:
  
   Well, the alternatives are very unappealing.  Emulation and
   single-stepping are going to be very slow compared to a couple of jumps.

   With CPL2 or RPL on user segments the protection issue seems to be
   manageable for running the instructions from kernel space.
  
  
  CPL2 gives unrestricted access to the kernel address space; and RPL does 
  not affect page level protection.  Segment limits don't work on x86-64.  
  But perhaps I missed something - these things are tricky.
 
 So setting RPL to 3 on the user segments allows access to kernel pages
 just fine? How useful.. :/
 
  It should be possible to translate the instruction into an address space 
  check, followed by the action, but that's still slower due to privilege 
  level switches.
 
 Well, if you manage to do the address validation you don't need the priv
 level switch anymore, right?
 

It also starts becoming very x86-centric though, doesn't it? It might
kick other ports later.

What is there at the moment is storing the copied instructions in a VMA.
The most unpalatable part of that to me is that it's visible to
userspace, probably via /proc/ and I didn't check, but I hope an
munmap() from userspace cannot delete it.

What the VMA has going for it is that it *appears* to be easier to port to
other architectures than the alternatives, certainly easier to handle than
instruction emulation.

 Are the ins encodings sane enough to recognize mem parameters without
 needing to know the actual ins?
 
 How about using a hw-breakpoint to close the gap for the inline single
 step? You could even re-insert the int3 lazily when you need the
 hw-breakpoint again. It would consume one hw-breakpoint register for
 each task/cpu that has probes though..
 

This feels very racy. Along with that, making this sort of change
was considered a risky venture on x86 and needed strong verification from
elsewhere (http://lkml.org/lkml/2010/1/12/300). There are probably similar
concerns on other architectures that would make a reliable port difficult.

Right now the approach is with VMAs. The alternatives are

  1. reserved XOL page (similar disadvantages to the VMA)
  2. emulated instructions
     This is an emulation bug waiting to happen in my opinion and makes
     porting uprobes a significantly more difficult undertaking than
     either the XOL-VMA or XOL-page approach
  3. XOL page in kernel space available at a different CPL
     This assumes all target architectures have a usable privilege
     ring which may be the case. However, I would guess that it
     is going to perform worse than the current approach because
     of the change in privilege level. No idea what the cost of
     a privilege level change is, but I doubt it's free
  4. Boosted probes (arch-specific, apparently only x86 does this for
     kprobes)

As unpalatable as the VMA is, I am failing to see why it's not a
reasonable starting point with an understanding that 2 or 3 would be
implemented in the future after the other architecture ports are in
place and the reliability of the options as well as the performance can
be measured.

There would appear to be two classes of application that might suffer
from the VMA. The first needs absolutely every single ounce of address
space. The second introspects itself via /proc/self/maps and makes
decisions based on that. The first is unfortunate but should be a limited
number of use cases. The second could be fudged by simply not exporting the
information via /proc.
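
To make the second class concrete, here is a rough sketch of the kind of
parsing such an application might do on each line of /proc/self/maps. The
struct and function names are made up for illustration, and the guess that an
XOL area would show up as an unnamed executable mapping is an assumption, not
something the patchset states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/<pid>/maps (illustrative type, not a kernel one). */
struct mapping {
    unsigned long start;
    unsigned long end;
    char perms[5];      /* e.g. "r-xp" */
    char name[256];     /* empty for anonymous mappings */
};

/* Parse a single maps line. Returns 0 on success, -1 if it is malformed. */
int parse_maps_line(const char *line, struct mapping *m)
{
    int fields;

    assert(line != NULL && m != NULL);
    m->name[0] = '\0';
    /* Layout: start-end perms offset dev inode [name].
     * Offset, device and inode are matched but not stored. */
    fields = sscanf(line, "%lx-%lx %4s %*lx %*x:%*x %*lu %255[^\n]",
                    &m->start, &m->end, m->perms, m->name);
    return fields >= 3 ? 0 : -1;
}

/* Assumption: an XOL area would appear as an unnamed, executable mapping. */
bool looks_like_anon_exec(const struct mapping *m)
{
    return m->name[0] == '\0' && strchr(m->perms, 'x') != NULL;
}
```

A program that walks its own maps this way and acts on unexpected executable
regions is the sort of user that would notice the extra VMA.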

I'm of the opinion it would be reasonable to let the VMA go ahead, look
at the ports for the other architectures and revisit options 2 and 3 above
to see if the VMA can really be removed without a performance or reliability
penalty.

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab



Re: [RFC] [PATCH 0/7] UBP, XOL and Uprobes [ Summary of Comments and actions to be taken ]

2010-01-22 Thread Peter Zijlstra
On Fri, 2010-01-22 at 12:54 +0530, Ananth N Mavinakayanahalli wrote:
 On Fri, Jan 22, 2010 at 12:32:32PM +0530, Srikar Dronamraju wrote:
  Here is a summary of the Comments and actions that need to be taken for
  the current uprobes patchset. Please let me know if I missed or
  misunderstood any of your comments.  
  
  1. Uprobes depends on trap signal.
  Uprobes depends on trap signal rather than hooking to the global
  die notifier. It was suggested that we hook to the global die notifier.
  
  In the next version of patches, Uprobes will use the global die
  notifier and look at the per-task count of the probes in use to
  see if it has to be consumed.
  
  However this would reduce the ability of uprobe handlers to
  sleep. Since we are dealing with userspace, sleeping in handlers
  would have been a good feature. We are looking at ways to get
  around this limitation.
 
 We could set a TIF_ flag in the notifier to indicate a breakpoint hit
 and process it in task context before the task heads into userspace.

Make that optional, not everybody might want that. Either provide a
simple trampoline or use a flag to indicate the callback be called from
process context on registration.
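
A rough user-space model of the two suggestions above, assuming a
per-registration flag: the notifier either runs the handler immediately or
marks the task so the handler runs later on the way back to userspace. All
names here (PROBE_DEFER_TO_TASK, struct probe, struct task and the two hook
functions) are hypothetical and only stand in for the TIF_ flag and the
notifier; this is not the uprobes code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical registration flag: run the handler later, in task context. */
#define PROBE_DEFER_TO_TASK 0x1u

struct probe {
    unsigned int flags;
    void (*handler)(struct probe *p);
    int hits;
};

struct task {
    bool breakpoint_pending;        /* stands in for a TIF_ flag */
    struct probe *pending_probe;
};

/* Simulated die notifier: a handler called from here must not sleep. */
void notifier_breakpoint_hit(struct task *t, struct probe *p)
{
    assert(t != NULL && p != NULL && p->handler != NULL);
    if (p->flags & PROBE_DEFER_TO_TASK) {
        /* Only record the hit; the handler runs in return_to_user(). */
        t->breakpoint_pending = true;
        t->pending_probe = p;
        return;
    }
    p->handler(p);
}

/* Simulated return-to-user path, where a handler would be allowed to sleep. */
void return_to_user(struct task *t)
{
    if (!t->breakpoint_pending)
        return;
    t->breakpoint_pending = false;
    t->pending_probe->handler(t->pending_probe);
    t->pending_probe = NULL;
}

/* Example handler that just counts invocations. */
void count_hit(struct probe *p)
{
    p->hits++;
}
```

Users who do not set the flag keep the cheap notifier-context callback, and
only those who ask for it pay for the deferral.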



Re: PTRACE_SYSCALL_ENTRY/EXIT

2010-01-22 Thread Ali Polatel
Roland McGrath wrote:
 We don't have any particular plans to extend the ptrace interface.  
 I strongly doubt we would even try to do anything like that until the
 utrace-based ptrace interface is merged into Linux and the old ptrace
 implementation gone.
 
 In general, we are not looking for extensions to the ptrace interface.
 It is an ugly hairball already and we are more interested in having 
 the utrace API layer available inside the kernel and then embarking on
 new and sane userland interfaces instead of shoehorning more into ptrace.
 

I respect that.

 That said, some particular kinds of simple enhancements to ptrace are
 really quite trivial to implement in the new utrace-based implementation.
 The particular area you suggest is one of these.
 
 What I would expect is not new variants of the one-shot interface like
 PTRACE_SYSCALL.  Rather, I would envision new PTRACE_O_* options to enable
 syscall entry and exit tracing analogous to the PTRACE_EVENT_* events you
 can now enable.  This means that you make one PTRACE_SETOPTIONS call to
 enable the set of events you want, and then use plain PTRACE_CONT (or
 whatever).
 
 If you really want exactly the one-shot behavior instead, then we could
 consider that.  But, like I said, we are not looking to add much in the
 way of new wrinkles to the dismal ptrace userland interface.

The one-shot behaviour is what I want because adding a PTRACE_O_* option
won't solve my problem if I understood correctly. I'm writing a tool
that audits system calls and *only* denied system calls need to be
stopped at the exit of the system call to set return value and errno.
System calls are checked at entry, if they're safe another
PTRACE_SYSCALL_ENTRY will be issued to continue to the next system call.
If, however, the system call needs to be denied, PTRACE_SYSCALL_EXIT
will be issued after changing the system call number to something invalid so
that return value and errno can be set.

I think this will be useful for every program that audits system calls.
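
The flow described above can be modelled roughly as follows. Note that
PTRACE_SYSCALL_ENTRY and PTRACE_SYSCALL_EXIT are only proposed in this thread
and are not part of the existing ptrace interface, so the sketch models them
as a plain enum instead of calling ptrace(). The policy callback and the
helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The two proposed one-shot requests, modelled as an enum (not real ptrace). */
enum request { REQ_SYSCALL_ENTRY, REQ_SYSCALL_EXIT };

/* Hypothetical policy callback: returns true if the call may proceed. */
typedef bool (*policy_fn)(long sysno);

/* Decide which request to issue after the tracee stopped at a syscall entry. */
enum request after_entry_stop(long sysno, policy_fn is_safe)
{
    return is_safe(sysno) ? REQ_SYSCALL_ENTRY : REQ_SYSCALL_EXIT;
}

/* Count tracer stops for a sequence of syscalls under the proposed scheme. */
size_t count_stops_one_shot(const long *calls, size_t n, policy_fn is_safe)
{
    size_t stops = 0;

    assert(calls != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        stops++;                    /* every call stops at entry */
        if (after_entry_stop(calls[i], is_safe) == REQ_SYSCALL_EXIT)
            stops++;                /* only denied calls stop at exit too */
    }
    return stops;
}

/* With plain PTRACE_SYSCALL every call stops at both entry and exit. */
size_t count_stops_ptrace_syscall(size_t n)
{
    return 2 * n;
}

/* Example policy (an assumption): everything is safe except number 42. */
bool safe_unless_42(long sysno)
{
    return sysno != 42;
}
```

For four calls of which one is denied, this gives five stops under the
proposed scheme against eight with PTRACE_SYSCALL, which is the saving being
asked for.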

 
 Thanks,
 Roland

-- 
Regards,
Ali Polatel




Re: linux-next: add utrace tree

2010-01-22 Thread Oleg Nesterov
On 01/21, Linus Torvalds wrote:

 On Thu, 21 Jan 2010, Andrew Morton wrote:
 
  ptrace is a nasty, complex part of the kernel which has a long history
  of problems, but it's all been pretty quiet in there for the past few
  years.

 More importantly, we're not ever going to get rid of it.

Unfortunately, you are right. The current ptrace (as it is visible from
user-space) should stay forever.

 Quite frankly, judging by all past history we have ever seen in kernel
 interfaces, new and non-portable interfaces simply are never used. The
 whole question whether they are nicer or not is entirely immaterial.

I have to admit this point looks very reasonable to me. Except, can't
resist, ptrace itself is hardly portable.

 I'm personally very dubious that there are any merits to utrace that
 outweigh the very clear disadvantages: just another layer that adds a new
 level of abstraction to the only interface that people actually _use_,
 namely ptrace.

Of course they can't use other interfaces, we don't have them. And
without the new abstraction layer we will never have them, I think.

Oleg.



Re: [RFC] [PATCH 0/7] UBP, XOL and Uprobes [ Summary of Comments and actions to be taken ]

2010-01-22 Thread Masami Hiramatsu
Peter Zijlstra wrote:
 On Fri, 2010-01-22 at 12:32 +0530, Srikar Dronamraju wrote:
 
 2. XOL vma vs Emulation vs Single Stepping Inline vs using Protection
 Rings.
  XOL VMA is an additional process address vma.  There is
  opposition to adding an additional vma without the user actually
  requesting it.

  XOL vma and single stepping inline are the two architecture
  independent implementations, while other implementations are
  more architecture specific. Single stepping inline wouldn't go
  well with multithreaded processes.

  Even though XOL vma has its own issues, we will go with it since
  other implementations seem to have more complications.

  We look forward to implementing boosters later.
  Later on, if we come across other techniques with fewer
  side-effects than the XOL vma, we would switch to using them.
 
 How about modifying glibc to reserve like 64 bytes on the TLS structure
 it has and storing the ins and possible boost jmp there? Since each
 thread can only have a single trap at any one time that should be
 enough.

Hmm, it is a good idea. Well, we'll have a copy of original insn
in kernel, but it could be simpler than managing XOL vma. :-)
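
As a rough illustration of what such a 64-byte slot might hold on x86: a copy
of the probed instruction followed by a jmp rel32 (opcode 0xE9 plus a 32-bit
little-endian displacement relative to the end of the jmp) back to the
instruction after the probe point. The function name and layout are
assumptions, and the sketch ignores instructions whose behaviour depends on
their own address, which could not simply be copied like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE     64    /* size suggested in the thread */
#define JMP_REL32_LEN 5     /* x86: opcode 0xE9 + 32-bit displacement */

/*
 * Fill a slot with the copied instruction and a jmp back to probe_addr +
 * insn_len. slot_addr is the address the slot has in the traced process.
 * Returns the number of bytes used, or -1 if the instruction does not fit
 * or the jump target is out of rel32 range.
 */
int fill_xol_slot(uint8_t slot[SLOT_SIZE], uint64_t slot_addr,
                  const uint8_t *insn, size_t insn_len, uint64_t probe_addr)
{
    assert(slot != NULL && insn != NULL);
    if (insn_len == 0 || insn_len + JMP_REL32_LEN > SLOT_SIZE)
        return -1;

    uint64_t jmp_end = slot_addr + insn_len + JMP_REL32_LEN;
    uint64_t target = probe_addr + insn_len;
    int64_t disp = (int64_t)(target - jmp_end);
    if (disp > INT32_MAX || disp < INT32_MIN)
        return -1;

    memcpy(slot, insn, insn_len);
    slot[insn_len] = 0xE9;
    uint32_t d = (uint32_t)(int32_t)disp;
    for (int i = 0; i < 4; i++)     /* little-endian displacement */
        slot[insn_len + 1 + i] = (uint8_t)(d >> (8 * i));

    return (int)(insn_len + JMP_REL32_LEN);
}
```

The range check matters on x86-64: if the slot sits more than 2GB away from
the probed code, a rel32 jmp cannot reach back and the boost would not apply.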

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhira...@redhat.com



Re: linux-next: add utrace tree

2010-01-22 Thread Frank Ch. Eigler
Hi -

oleg wrote:

 [...]
 I'm personally very dubious that there are any merits to utrace that
 outweigh the very clear disadvantages: just another layer that adds a new
 level of abstraction to the only interface that people actually _use_,
 namely ptrace.

 Of course they can't use other interfaces, we don't have them. And
 without the new abstraction layer we will never have them, I think.

This is one of the reasons we built, at the request of lkml people, the
utrace-gdbstub prototype (http://lkml.org/lkml/2009/11/30/173).  It
presents a standard userspace debugging interface -- actually, more
standard than ptrace!  It has the potential to be more powerful
feature-wise and perhaps even perform faster than ptrace.  And yet
that RFC didn't receive any on-topic review, only wishes for
unspecified blue-sky integration with kernel debugging.

So then there's uprobes, which is another potential utrace killer
app, if it weren't so tainted by some peoples' disdain for its
current user, when other users are already being seriously discussed.
So a working prototype, which demonstrates both the utility of utrace
itself and the end-user value of user-space probing, is disregarded.

And there are several smaller utrace clients in the works, each of
them merge candidates in the future.  Yes, most of them may be
rewritten with special-purpose hook after hook as people reinvent the
utrace wheel piece by piece, but how long will that take?  How is the
opportunity cost of missing features valued?

Finally, I don't know how to address the logic of "if a feature
requires utrace, that's a bad argument for utrace" and at the same
time "you need to show a killer app for utrace".  What could possibly
satisfy both of those constraints?  Please advise.


- FChE



Re: linux-next: add utrace tree

2010-01-22 Thread Peter Zijlstra
On Fri, 2010-01-22 at 15:01 -0500, Frank Ch. Eigler wrote:
 So then there's uprobes, which is another potential utrace killer
 app

That's bollocks, uprobes is an utter and total mis-match for utrace.
Probing userspace is primarily about DSOs which is files and vma's, not
tasks.

You might maybe want a utrace interface to that, but that is largely
non-interesting.

IOW, we don't need utrace to make sensible use of uprobes.

(And when I speak of uprobes I mean the thing formerly called UBP)



Re: linux-next: add utrace tree

2010-01-22 Thread Oleg Nesterov
On 01/21, Linus Torvalds wrote:

 I realize that my argument is very anti-thetical to the normal CS teaching
 of "general-purpose is good". I often feel that very specific code with
 very clearly defined (and limited) applicability is a good thing - I'd
 rather have just a very specific ptrace layer that does nothing but
 ptrace, than a generic plugin layer that can be layered under ptrace and
 other things.

I am repeating the same (and probably poor) arguments, but we don't have a
clearly defined ptrace layer. The current code is just a set of precedents,
I mean, this code does this because we always did this, for unknown reasons.
And we can't fix it without breaking things. Even obvious bugs which
could be fixed by a very simple patch sometimes have to be preserved.
In fact, afaics the current state is: if it can't crash the kernel, it is
not a bug.

Otoh, ptrace is very limited, yes. Imho - too limited. And, as a user-space
api, it is just horrible.

However: we're not ever going to get rid of it. Yes, sure.


But I am afraid this all is almost off-topic. Afaik, utrace was not created
to solve the problems with ptrace, at least I am sure this wasn't the only
goal.

Unfortunately, I didn't participate in other projects which use utrace.
Even if I did, I don't know how I could prove they are important enough
to have a generic layer to make other things possible.

Oleg.



Re: linux-next: add utrace tree

2010-01-22 Thread Frank Ch. Eigler
Hi -

On Fri, Jan 22, 2010 at 09:16:16PM +0100, Peter Zijlstra wrote:
 [...]
  So then there's uprobes, which is another potential utrace killer
  app

 That's bollocks, uprobes is an utter and total mis-match for utrace.
 Probing userspace is primarily about DSOs which is files and vma's,
 not tasks. [...]

Your experience with user-space probing apparently differs from ours.
In fact there exists plenty of interest and utility in probing given
processes only, if for no other reason than to avoid disrupting others
running on the machine.

Nearly always, it is better to build a multiprocess probing widget
from multiply-applied single-process ones, rather than to build
single-process probing from grossly-filtered systemwide/VMA ones.
(If the lower level infrastructure provides both options, groovy.)

- FChE



Re: linux-next: add utrace tree

2010-01-22 Thread Frank Ch. Eigler
Hi -

On Fri, Jan 22, 2010 at 01:59:11PM -0800, Linus Torvalds wrote:
 [...]
  Finally, I don't know how to address the logic of "if a feature
  requires utrace, that's a bad argument for utrace" and at the same
  time "you need to show a killer app for utrace".  What could possibly
  satisfy both of those constraints?  Please advise.
 
 The point is, the feature needs to be a killer feature. And I have yet to 
 hear _any_ such killer feature, especially from a kernel maintenance 
 standpoint.


 The "better ptrace than ptrace" is irrelevant. Sure, we all know ptrace 
 isn't a wonderful feature. But it's there, and a debugger is going to have 
 support for it anyway, so what's the _advantage_ of a better ptrace 
 interface? There is absolutely _zero_ advantage, there's just yet 
 another interface. We can't get rid of the old one _anyway_.

The point is that the intermediate api will allow (and, as the part
you clipped out about utrace-gdbstub said, *already has allowed*)
alternative plausible interfaces that coexist just fine.


 And the seccomp replacement just sounds horrible. Using some tracing 
 interface to implement security models sounds like the worst idea ever.

So all this is about *naming* utrace?  It was never built for
tracing, but for (efficient/multiplexed) *control*.  That wasn't even
its original name -- one of your lieutenants asked roland to change it
to utrace.


 And like it or not, over the last almost-decade, _not_ having to
 have to work with system tap has been a feature, not a problem, for
 the kernel community.

I don't have a problem with that.  We have apprx. never imposed
anything on developers who didn't want to use it.  There are plenty
who have and will.


- FChE



Re: [RFC] [PATCH 0/7] UBP, XOL and Uprobes [ Summary of Comments and actions to be taken ]

2010-01-22 Thread Jim Keniston

On Fri, 2010-01-22 at 19:06 +0100, Peter Zijlstra wrote:
 On Fri, 2010-01-22 at 12:32 +0530, Srikar Dronamraju wrote:
 
  2. XOL vma vs Emulation vs Single Stepping Inline vs using Protection
  Rings.
  XOL VMA is an additional process address vma.  There is
  opposition to adding an additional vma without the user actually
  requesting it.
  
  XOL vma and single stepping inline are the two architecture
  independent implementations, while other implementations are
  more architecture specific. Single stepping inline wouldn't go
  well with multithreaded processes.
  
  Even though XOL vma has its own issues, we will go with it since
  other implementations seem to have more complications.
  
  We look forward to implementing boosters later.
  Later on, if we come across other techniques with fewer
  side-effects than the XOL vma, we would switch to using them.
 
 How about modifying glibc to reserve like 64 bytes on the TLS structure
 it has and storing the ins and possible boost jmp there? Since each
 thread can only have a single trap at any one time that should be
 enough.

We once implemented something similar, but using an area just beyond the
top of the stack instead of TLS.  We figured it would never pass muster
because we have to temporarily map the page executable (and undo it
after the single-step), and this felt like a big security hole.  I'd
think we'd have the same concern with TLS.

Jim



Re: linux-next: add utrace tree

2010-01-22 Thread Linus Torvalds


On Fri, 22 Jan 2010, Frank Ch. Eigler wrote:
 
 The point is that the intermediate api will allow (and, as the part
 you clipped out about utrace-gdbstub said, *already has allowed*)
 alternative plausible interfaces that coexist just fine.

And my point is that multiple interfaces are BAD. 

There is one interface we _have_ to have: the traditional ptrace one. That 
one we can't get away from.

Multiple interfaces on its own is just confusion with no upside. 

You need a _reason_ to have other interfaces. They need to have that 
killer feature. Just being different is not a feature at all.

 So all this is about *naming* utrace?  It was never built for
 tracing, but for (efficient/multiplexed) *control*.  That wasn't even
 its original name -- one of your lieutenants asked roland to change it
 to utrace.

No. It's not about naming. It's about the downside of having amorphous 
interfaces that apparently don't even have rules, and are then used to 
implement random crap.

Yes, the SNL skit about "It's a dessert topping _and_ a floor wax" was 
funny, but it was funny exactly because it was crazy.

The fact that you can do crazy things is not a good thing. You need to 
find the goodness somewhere else, and that's what I'm trying to tell 
you.

You just seem to have trouble listening. 

Linus



Re: linux-next: add utrace tree

2010-01-22 Thread Linus Torvalds


On Fri, 22 Jan 2010, Linus Torvalds wrote:

 No. It's not about naming. It's about the downside of having amorphous 
 interfaces that apparently don't even have rules, and are then used to 
 implement random crap.
 
 Yes, the SNL skit about "It's a dessert topping _and_ a floor wax" was 
 funny, but it was funny exactly because it was crazy.

Put yet another way: I'd _much_ rather have two totally separate pieces 
that don't depend on each other, and do different things.

So to take a very practical example: I'd much rather have 'seccomp' and 
'ptrace' that have _nothing_ what-so-ever to do with each other, than have 
some intermediate layer that then needs to make both of those happy, and 
that both have to interact with.

There are cases where we really _want_ to have common code. We want to 
have a common VFS interface because we want to show _one_ interface to 
user space across a gazillion different filesystems. We want to have a 
common driver layer (as far as possible) because - again - we expose a 
metric shitload of drivers, and we want to have one unified interface to 
them.

But going the other way: trying to share code when the interfaces are 
fundamentally _different_ is generally not at all such a great idea. It 
ends up tying two conceptually totally separate things together, and 
suddenly people who work on feature X need to modify infrastructure that 
affects feature Y, and it turns out that details A, B and C are all totally 
different for the two features and the middle layer has two conflicting 
things it needs to work with.

This is why when somebody brought up "you could do a seccomp-like thing on 
top of utrace" that my reaction was and is just totally negative. It shows 
all the wrong kinds of tying things together.

Linus



Re: linux-next: add utrace tree

2010-01-22 Thread Kyle Moffett
On Fri, Jan 22, 2010 at 19:22, Linus Torvalds
torva...@linux-foundation.org wrote:
 There are cases where we really _want_ to have common code. We want to
 have a common VFS interface because we want to show _one_ interface to
 user space across a gazillion different filesystems. We want to have a
 common driver layer (as far as possible) because - again - we expose a
 metric shitload of drivers, and we want to have one unified interface to
 them.

So... Everybody agrees that ptrace() is horrible and a royal pain to
use, let alone use correctly and without bugs.  Everybody also agrees
that ptrace() needs to stay around for a long time to avoid breaking
all the existing users.

Now how do we get from here to a moderately portable API for
interrogating, controlling, and intercepting process state?
Essentially it would need to support all of the things that a powerful
debugger would want to do, including modifying registers and memory,
substituting syscall return values, etc.  I believe that utrace is
the kernel side of that API.

The killer app for this will be the ability to delete thousands of
lines of code from GDB, strace, and all the various other tools that
have to painfully work around the major interface gotchas of ptrace(),
while at the same time making their handling of complex processes much
more robust.

The *second* killer app for this is to make it much easier for people
to write new userspace debugging tools.  I love the various
crash-catching tools that different distributions or applications
provide, but they all basically have to trap the SIGSEGV and hope
they're still sensible enough to fork() and exec() a gdb process.

Furthermore, I would love to be able to write debugging tools for
scripting languages that allow me to step across Perl, C, PHP,
assembly code, etc, all within the same process.  In theory that's all
possible today, but given how much of a *pain* ptrace() is to use
correctly, nobody bothers.

Now, with all that said, utrace does not provide any of the
userspace side APIs today... but I think it is a necessary refactoring
if we want to provide a new ideal process-introspection interface
without breaking all the ptrace() users.

Think of the utrace interface as very much like the LSM interface.
Just like with LSMs, there is a lot of active research in debugging
and tracing tools, and nobody can even remotely agree what the hell
they want out of the hooks.  In theory you could add one hook for
every place each security module needs one... but then your fast-path
is littered with always-false test-and-jump statements.  What utrace
provides is the one single test in each fast path that then searches
for and executes the appropriate slow path(s) for that process.
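
A rough user-space sketch of that "single test in the fast path" idea, under
the assumption of a per-task event mask and a list of attached engines. The
names (struct engine, attach_engine, syscall_entry_hook and the EV_ constants)
are made up for illustration and are not the actual utrace API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event bits. */
#define EV_SYSCALL_ENTRY 0x1u
#define EV_SIGNAL        0x2u

struct engine {
    unsigned int events;        /* events this engine wants to see */
    void (*callback)(struct engine *e, unsigned int event);
    int calls;
    struct engine *next;
};

struct task {
    unsigned int event_mask;    /* OR of all attached engines' events */
    struct engine *engines;
};

/* Attach an engine and fold its events into the task-wide mask. */
void attach_engine(struct task *t, struct engine *e)
{
    assert(t != NULL && e != NULL && e->callback != NULL);
    e->next = t->engines;
    t->engines = e;
    t->event_mask |= e->events;
}

/* Slow path: find and run every engine interested in this event. */
static void report_event(struct task *t, unsigned int event)
{
    for (struct engine *e = t->engines; e != NULL; e = e->next)
        if (e->events & event)
            e->callback(e, event);
}

/* Fast path: one test, regardless of how many engines or tools exist. */
void syscall_entry_hook(struct task *t)
{
    if (t->event_mask & EV_SYSCALL_ENTRY)
        report_event(t, EV_SYSCALL_ENTRY);
}

/* Example callback that just counts invocations. */
void count_call(struct engine *e, unsigned int event)
{
    (void)event;
    e->calls++;
}
```

An untraced task pays for a single mask test per hook, and the cost of
walking the engine list is only paid by tasks that have something attached.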

I personally would be very happy to see utrace merged.

Cheers,
Kyle Moffett