Re: linux-next: add utrace tree

2010-02-08 Thread H. Peter Anvin

On 02/07/2010 10:54 PM, Pavel Machek wrote:


>> No, it has nothing to do with ring.  It has to do with modifying code
>> that another CPU could be executing at the same time, and with modifying
>> code on the same processor through another virtual alias (they are
>> different issues.)  The same issues apply regardless of the CPL of the
>> processor.
>
> ...but these are always 'there could be cpu bugs around' issues,
> right? Like amd k6. AFAICT x86 always supported self-modifying code
> without any extra barriers needed...



*Self*-modifying code, yes.  *Cross*-modifying code, no.

-hpa



Re: linux-next: add utrace tree

2010-01-27 Thread H. Peter Anvin
On 01/27/2010 02:43 AM, Linus Torvalds wrote:
 
 
> On Wed, 27 Jan 2010, Peter Zijlstra wrote:
>>
>> Right, so you're going to love uprobes, which does exactly that. The
>> current proposal is overwriting the target instruction with an INT3 and
>> injecting an extra vma into the target process's address space
>> containing the original instruction(s) and possible jumps back to the
>> old code stream.
>
> Just out of interest, how does it handle the threading issue?
>
> Last I saw, at least some CPU people were _very_ nervous about overwriting
> instructions if another CPU might be just about to execute them.
>
> Even the "overwrite only the first byte with 'int3'" made them go "umm, I
> need to talk to some core CPU people to see if that's ok." They mumble
> about possible CPU errata, I$ coherency, instruction retry etc.
 

We actually went through a review of that here at Intel.  We do not yet
have an *official* answer (in order for us to have that we have to have
it approved by the architecture committee and published in the SDM), but
to the best of our current knowledge (and I'm allowed to say this) the
int3 method followed by global IPIs should be safe for modifying *one
(atomic) instruction*.  This is a specific case of a more general rule,
but I don't want to disclose the whole rule until it has been officially
approved.
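
A minimal sketch of how "int3 followed by global IPIs" could be ordered when replacing one instruction. This is a user-space simulation on an ordinary byte buffer, not kernel code and not the official rule mentioned above. The step ordering is an assumption, and the names patch_one_insn(), sync_all_cpus() and sync_calls are hypothetical; sync_all_cpus() merely stands in for sending an IPI to every other CPU and waiting for each to serialize. The breakpoint handler that a CPU would land in while the int3 is in place is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE 0xCC  /* single-byte x86 breakpoint instruction */

/* Hypothetical stand-in for "global IPIs". A real kernel would interrupt
 * every other CPU and have it run a serializing instruction; here we only
 * count calls so the ordering can be checked. */
static int sync_calls;

static void sync_all_cpus(void)
{
    sync_calls++;
}

/* Replace the len-byte instruction at text with new_insn.
 * Simulation only: text is plain data, nothing is executed. */
static void patch_one_insn(uint8_t *text, const uint8_t *new_insn, size_t len)
{
    assert(len >= 1);

    /* 1. Breakpoint on the first byte. This is a one-byte store, so a
     *    concurrent CPU fetches either the old instruction or int3. */
    text[0] = INT3_OPCODE;
    sync_all_cpus();

    /* 2. Rewrite the remaining bytes while the int3 guards the entry. */
    if (len > 1) {
        memcpy(text + 1, new_insn + 1, len - 1);
        sync_all_cpus();
    }

    /* 3. Replace the int3 with the first byte of the new instruction. */
    text[0] = new_insn[0];
    sync_all_cpus();
}
```

With a 5-byte instruction the sketch performs three synchronization rounds, and with a 1-byte instruction it performs two, since there are no tail bytes to rewrite.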

> I realize kprobes does this very thing, but kprobes is esoteric stuff and
> doesn't have much choice. In user space, you _could_ do the modification
> on a different physical page and then just switch the page table entry
> instead, and not get into the whole D$/I$ coherency thing at all.

On the more general rule of interpretation: I'm really concerned about
having a bunch of partially-capable x86 interpreters all over the
kernel.  x86 is *hard* to emulate, and it will only get harder as the
architecture evolves.

-hpa