Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Andy Lutomirski


> On Mar 3, 2021, at 10:11 AM, Daniel Xu  wrote:
> 
> On Tue, Mar 02, 2021 at 06:18:23PM -0800, Alexei Starovoitov wrote:
>>> On Tue, Mar 2, 2021 at 5:46 PM Andy Lutomirski  wrote:
>>> 
>>> 
>>>> On Mar 2, 2021, at 5:22 PM, Alexei Starovoitov wrote:
>>>> 
>>>> On Tue, Mar 2, 2021 at 1:02 PM Andy Lutomirski wrote:
>>>>> 
>>>>> 
>>>>>>> On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov wrote:
>>>>>> 
>>>>>> On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
>>>>>>> 
>>>>>>> Is there something like a uprobe test suite?  How maintained /
>>>>>>> actively used is uprobe?
>>>>>> 
>>>>>> uprobe+bpf is heavily used in production.
>>>>>> selftests/bpf has only one test for it though.
>>>>>> 
>>>>>> Why are you asking?
>>>>> 
>>>>> Because the integration with the x86 entry code is a mess, and I want to 
>>>>> know whether to mark it BROKEN or how to make sure that any cleanups 
>>>>> actually work.
>>>> 
>>>> Any test case to repro the issue you found?
>>>> Is it a bug or just messy code?
>>> 
>>> Just messy code.
>>> 
>>>> Nowadays a good chunk of popular applications (python, mysql, etc) has
>>>> USDTs in them.
>>>> Issues reported with bcc:
>>>> https://github.com/iovisor/bcc/issues?q=is%3Aissue+USDT
>>>> Similar thing with bpftrace.
>>>> Both standard USDT and semaphore based are used in the wild.
>>>> uprobe for containers has been a long standing feature request.
>>>> If you can improve uprobe performance that would be awesome.
>>>> That's another thing that people report often. We optimized it a bit.
>>>> More can be done.
>>> 
>>> 
>>> Wait... USDT is much easier to implement well.  Are we talking just USDT or 
>>> are we talking about general uprobes in which almost any instruction can 
>>> get probed?  If the only users that care about uprobes are doing USDT, we 
>>> could vastly simplify the implementation and probably make it faster, too.
>> 
>> USDTs are driving the majority of uprobe usage.
> 
> I'd say 50/50 in my experience. Larger userspace applications using bpf
> for production monitoring tend to use USDT for stability and ABI reasons
> (hard for bpf to read C++ classes). Bare uprobes (ie not USDT) are used
> quite often for ad-hoc production debugging.
> 
>> If they can get faster it will increase their adoption even more.
>> There are certainly cases of normal uprobes.
>> They are at the start of the function 99% of the time.
>> Like the following:
>> "uprobe:/lib64/libc.so:malloc(u64 size):size:size,_ret",
>> "uprobe:/lib64/libc.so:free(void *ptr)::ptr",
>> is common despite its overhead.
>> 
>> Here is the most interesting and practical usage of uprobes:
>> https://github.com/iovisor/bcc/blob/master/tools/sslsniff.py
>> and the manpage for the tool:
>> https://github.com/iovisor/bcc/blob/master/tools/sslsniff_example.txt
>> 
>> uprobe in the middle of the function is very rare.
>> If the kernel starts rejecting uprobes on some weird instructions
>> I suspect no one will complain.
> 
> I think it would be great if the kernel could reject mid-instruction
> uprobes. Unlike with kprobes, you can place uprobes on immediate
> operands which can cause silent data corruption. See
> https://github.com/iovisor/bpftrace/pull/803#issuecomment-507693933
> for a funny example.

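This can’t be done in general on x86. Instructions are variable-length, so
one cannot look at arbitrary code bytes and find the instruction boundaries
without already knowing where some instruction starts.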
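
A toy illustration of the problem, assuming a three-entry opcode table that
is nowhere near a real x86 decoder (the table, decode_from() and the sample
bytes are made up for this sketch): the same byte string decodes cleanly
from two different starting offsets, so the bytes alone do not say which
offsets are instruction starts.

```python
# Toy opcode table: opcode byte -> (mnemonic, total instruction length).
# Only three encodings are covered; this is an illustration, not a decoder.
TOY_INSNS = {
    0x90: ("nop", 1),
    0x55: ("push %rbp", 1),
    0xB8: ("mov $imm32, %eax", 5),  # opcode byte plus 4 immediate bytes
}


def decode_from(code, start):
    """Decode linearly from `start`; return [(offset, mnemonic), ...]."""
    decoded = []
    offset = start
    while offset < len(code):
        mnemonic, length = TOY_INSNS[code[offset]]
        decoded.append((offset, mnemonic))
        offset += length
    return decoded


# Intended program: mov $0x90909055, %eax ; nop
code = bytes([0xB8, 0x55, 0x90, 0x90, 0x90, 0x90])

from_entry = decode_from(code, 0)   # the stream that was actually emitted
from_middle = decode_from(code, 1)  # starts inside the mov's immediate
```

Both results look like plausible instruction streams. A breakpoint written
at offset 1 would overwrite a byte of the mov's immediate rather than an
instruction.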

> 
> To prevent accidental (and silent) data corruption, bpftrace uses a
> disassembler to ensure uprobes are placed on instruction boundaries.
> 
> <...>
> 
> Daniel


Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Daniel Xu
On Tue, Mar 02, 2021 at 06:18:23PM -0800, Alexei Starovoitov wrote:
> On Tue, Mar 2, 2021 at 5:46 PM Andy Lutomirski  wrote:
> >
> >
> > > On Mar 2, 2021, at 5:22 PM, Alexei Starovoitov 
> > >  wrote:
> > >
> > > On Tue, Mar 2, 2021 at 1:02 PM Andy Lutomirski  
> > > wrote:
> > >>
> > >>
> > >>>> On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov wrote:
> > >>>
> > >>> On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski wrote:
> > >>>>
> > >>>> Is there something like a uprobe test suite?  How maintained /
> > >>>> actively used is uprobe?
> > >>>
> > >>> uprobe+bpf is heavily used in production.
> > >>> selftests/bpf has only one test for it though.
> > >>>
> > >>> Why are you asking?
> > >>
> > >> Because the integration with the x86 entry code is a mess, and I want to 
> > >> know whether to mark it BROKEN or how to make sure that any cleanups 
> > >> actually work.
> > >
> > > Any test case to repro the issue you found?
> > > Is it a bug or just messy code?
> >
> > Just messy code.
> >
> > > Nowadays a good chunk of popular applications (python, mysql, etc) has
> > > USDTs in them.
> > > Issues reported with bcc:
> > > https://github.com/iovisor/bcc/issues?q=is%3Aissue+USDT
> > > Similar thing with bpftrace.
> > > Both standard USDT and semaphore based are used in the wild.
> > > uprobe for containers has been a long standing feature request.
> > > If you can improve uprobe performance that would be awesome.
> > > That's another thing that people report often. We optimized it a bit.
> > > More can be done.
> >
> >
> > Wait... USDT is much easier to implement well.  Are we talking just USDT or 
> > are we talking about general uprobes in which almost any instruction can 
> > get probed?  If the only users that care about uprobes are doing USDT, we 
> > could vastly simplify the implementation and probably make it faster, too.
> 
> USDTs are driving the majority of uprobe usage.

I'd say 50/50 in my experience. Larger userspace applications using bpf
for production monitoring tend to use USDT for stability and ABI reasons
(hard for bpf to read C++ classes). Bare uprobes (i.e. not USDT) are used
quite often for ad-hoc production debugging.

> If they can get faster it will increase their adoption even more.
> There are certainly cases of normal uprobes.
> They are at the start of the function 99% of the time.
> Like the following:
> "uprobe:/lib64/libc.so:malloc(u64 size):size:size,_ret",
> "uprobe:/lib64/libc.so:free(void *ptr)::ptr",
> is common despite its overhead.
> 
> Here is the most interesting and practical usage of uprobes:
> https://github.com/iovisor/bcc/blob/master/tools/sslsniff.py
> and the manpage for the tool:
> https://github.com/iovisor/bcc/blob/master/tools/sslsniff_example.txt
> 
> uprobe in the middle of the function is very rare.
> If the kernel starts rejecting uprobes on some weird instructions
> I suspect no one will complain.

I think it would be great if the kernel could reject mid-instruction
uprobes. Unlike with kprobes, you can place uprobes on immediate
operands which can cause silent data corruption. See
https://github.com/iovisor/bpftrace/pull/803#issuecomment-507693933
for a funny example.

To prevent accidental (and silent) data corruption, bpftrace uses a
disassembler to ensure uprobes are placed on instruction boundaries.
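
A rough sketch of that kind of check, not bpftrace's actual code: sweep from
a known-good start (the function's symbol address) and accept a probe offset
only if the sweep lands on it. toy_length(), insn_boundaries() and
is_on_boundary() are names invented here, and the length function stands in
for a real disassembler.

```python
def toy_length(code, offset):
    """Stand-in for a disassembler: length of the instruction at `offset`.

    Covers only nop (0x90), push %rbp (0x55) and mov $imm32, %eax (0xB8).
    """
    lengths = {0x90: 1, 0x55: 1, 0xB8: 5}
    return lengths[code[offset]]


def insn_boundaries(code, length_of=toy_length):
    """Offsets at which an instruction starts, sweeping from offset 0."""
    boundaries = set()
    offset = 0
    while offset < len(code):
        boundaries.add(offset)
        offset += length_of(code, offset)
    return boundaries


def is_on_boundary(code, probe_offset, length_of=toy_length):
    """True if `probe_offset` is the first byte of an instruction."""
    return probe_offset in insn_boundaries(code, length_of)


# push %rbp ; mov $0x90909055, %eax ; nop
func = bytes([0x55, 0xB8, 0x55, 0x90, 0x90, 0x90, 0x90])
```

The sweep is only as good as its starting point: it assumes offset 0 is a
real instruction start and that the function body is not interleaved with
data.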

<...>

Daniel


Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Oleg Nesterov
On 03/02, Alexei Starovoitov wrote:
>
> Especially if such tightening will come with performance boost for
> uprobe on a nop and uprobe at the start (which is typically push or
> alu on %sp).
> That would be a great step forward.

Just in case, nop and push are emulated without additional overhead.
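
For illustration only, a toy model of what emulating these two instructions
amounts to (this is not the kernel code; the register dict, the flat `mem`
bytearray and the function names are assumptions of the sketch): the handler
updates the saved registers directly, so no out-of-line copy and no
single-step are needed.

```python
import struct


def emulate_nop(regs, insn_len=1):
    """A nop has no effect beyond moving to the next instruction."""
    regs["ip"] += insn_len


def emulate_push(regs, mem, src_reg, insn_len=1):
    """Emulate `push %src_reg` for a 64-bit task: grow the stack, store, advance."""
    regs["sp"] -= 8
    struct.pack_into("<Q", mem, regs["sp"], regs[src_reg])
    regs["ip"] += insn_len


# Toy task state: 64 bytes of "user stack" addressed from 0, sp at the top.
regs = {"ip": 0x10, "sp": 64, "rbp": 0x1234}
mem = bytearray(64)

emulate_push(regs, mem, "rbp")  # probe sat on `push %rbp` at function entry
emulate_nop(regs)               # probe sat on a following nop
```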

Oleg.



Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Alexei Starovoitov
On Tue, Mar 2, 2021 at 5:46 PM Andy Lutomirski  wrote:
>
>
> > On Mar 2, 2021, at 5:22 PM, Alexei Starovoitov 
> >  wrote:
> >
> > On Tue, Mar 2, 2021 at 1:02 PM Andy Lutomirski  wrote:
> >>
> >>
> >>>> On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov wrote:
> >>>
> >>> On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
> >>>>
> >>>> Is there something like a uprobe test suite?  How maintained /
> >>>> actively used is uprobe?
> >>>
> >>> uprobe+bpf is heavily used in production.
> >>> selftests/bpf has only one test for it though.
> >>>
> >>> Why are you asking?
> >>
> >> Because the integration with the x86 entry code is a mess, and I want to 
> >> know whether to mark it BROKEN or how to make sure that any cleanups 
> >> actually work.
> >
> > Any test case to repro the issue you found?
> > Is it a bug or just messy code?
>
> Just messy code.
>
> > Nowadays a good chunk of popular applications (python, mysql, etc) has
> > USDTs in them.
> > Issues reported with bcc:
> > https://github.com/iovisor/bcc/issues?q=is%3Aissue+USDT
> > Similar thing with bpftrace.
> > Both standard USDT and semaphore based are used in the wild.
> > uprobe for containers has been a long standing feature request.
> > If you can improve uprobe performance that would be awesome.
> > That's another thing that people report often. We optimized it a bit.
> > More can be done.
>
>
> Wait... USDT is much easier to implement well.  Are we talking just USDT or 
> are we talking about general uprobes in which almost any instruction can get 
> probed?  If the only users that care about uprobes are doing USDT, we could 
> vastly simplify the implementation and probably make it faster, too.

USDTs are driving the majority of uprobe usage.
If they can get faster it will increase their adoption even more.
There are certainly cases of normal uprobes.
They are at the start of the function 99% of the time.
Like the following:
"uprobe:/lib64/libc.so:malloc(u64 size):size:size,_ret",
"uprobe:/lib64/libc.so:free(void *ptr)::ptr",
is common despite its overhead.

Here is the most interesting and practical usage of uprobes:
https://github.com/iovisor/bcc/blob/master/tools/sslsniff.py
and the manpage for the tool:
https://github.com/iovisor/bcc/blob/master/tools/sslsniff_example.txt

uprobe in the middle of the function is very rare.
If the kernel starts rejecting uprobes on some weird instructions
I suspect no one will complain.
Especially if such tightening will come with performance boost for
uprobe on a nop and uprobe at the start (which is typically push or
alu on %sp).
That would be a great step forward.


Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Andy Lutomirski


> On Mar 2, 2021, at 5:22 PM, Alexei Starovoitov  
> wrote:
> 
> On Tue, Mar 2, 2021 at 1:02 PM Andy Lutomirski  wrote:
>> 
>> 
>>>> On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov wrote:
>>> 
>>> On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
>>>> 
>>>> Is there something like a uprobe test suite?  How maintained /
>>>> actively used is uprobe?
>>> 
>>> uprobe+bpf is heavily used in production.
>>> selftests/bpf has only one test for it though.
>>> 
>>> Why are you asking?
>> 
>> Because the integration with the x86 entry code is a mess, and I want to 
>> know whether to mark it BROKEN or how to make sure that any cleanups 
>> actually work.
> 
> Any test case to repro the issue you found?
> Is it a bug or just messy code?

Just messy code.

> Nowadays a good chunk of popular applications (python, mysql, etc) has
> USDTs in them.
> Issues reported with bcc:
> https://github.com/iovisor/bcc/issues?q=is%3Aissue+USDT
> Similar thing with bpftrace.
> Both standard USDT and semaphore based are used in the wild.
> uprobe for containers has been a long standing feature request.
> If you can improve uprobe performance that would be awesome.
> That's another thing that people report often. We optimized it a bit.
> More can be done.


Wait... USDT is much easier to implement well.  Are we talking just USDT or are 
we talking about general uprobes in which almost any instruction can get 
probed?  If the only users that care about uprobes are doing USDT, we could 
vastly simplify the implementation and probably make it faster, too.

Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Alexei Starovoitov
On Tue, Mar 2, 2021 at 1:02 PM Andy Lutomirski  wrote:
>
>
> > On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov 
> >  wrote:
> >
> > On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
> >>
> >> Is there something like a uprobe test suite?  How maintained /
> >> actively used is uprobe?
> >
> > uprobe+bpf is heavily used in production.
> > selftests/bpf has only one test for it though.
> >
> > Why are you asking?
>
> Because the integration with the x86 entry code is a mess, and I want to know 
> whether to mark it BROKEN or how to make sure that any cleanups actually work.

Any test case to repro the issue you found?
Is it a bug or just messy code?
Nowadays a good chunk of popular applications (python, mysql, etc) has
USDTs in them.
Issues reported with bcc:
https://github.com/iovisor/bcc/issues?q=is%3Aissue+USDT
Similar thing with bpftrace.
Both standard USDT and semaphore based are used in the wild.
uprobe for containers has been a long standing feature request.
If you can improve uprobe performance that would be awesome.
That's another thing that people report often. We optimized it a bit.
More can be done.


Re: Why do kprobes and uprobes singlestep?

2021-03-02 Thread Andy Lutomirski


> On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov 
>  wrote:
> 
> On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
>> 
>> Is there something like a uprobe test suite?  How maintained /
>> actively used is uprobe?
> 
> uprobe+bpf is heavily used in production.
> selftests/bpf has only one test for it though.
> 
> Why are you asking?

Because the integration with the x86 entry code is a mess, and I want to know 
whether to mark it BROKEN or how to make sure that any cleanups actually work.

Re: Why do kprobes and uprobes singlestep?

2021-03-02 Thread Andy Lutomirski



> On Mar 2, 2021, at 12:25 PM, Oleg Nesterov  wrote:
> 
> On 03/01, Andy Lutomirski wrote:
>> 
>>> On Mon, Mar 1, 2021 at 8:51 AM Oleg Nesterov  wrote:
>>> 
>>> But I guess this has nothing to do with uprobes, they do not single-step
>>> in kernel mode, right?
>> 
>> They single-step user code, though, and the code that makes this work
>> is quite ugly.  Single-stepping on x86 is a mess.
> 
> But this doesn't really differ from, say, gdb doing si ? OK, except uprobes
> have to hook DIE_DEBUG. Nevermind...

Also, gdb doing so isn’t great either.  Single stepping over a pushf 
instruction, signal delivery, or a syscall on x86 is a mess.

> 
>>>> Uprobes seem to single-step user code for no discernable reason.
>>>> (They want to trap after executing an out of line instruction, AFAICT.
>>>> Surely INT3 or even CALL after the out-of-line insn would work as well
>>>> or better.)
>>> 
>>> Uprobes use single-step from the very beginning, probably because this
>>> is the most simple and "standard" way to implement xol.
>>> 
>>> And please note that CALL/JMP/etc emulation was added much later to fix the
>>> problems with non-canonical addresses, and this emulation is still
>>> incomplete.
>> 
>> Is there something like a uprobe test suite?
> 
> Afaik, no.
> 
>> How maintained /
> 
> Add Srikar who sent the initial implementation. I can only say that I am
> glad that ./scripts/get_maintainer.pl no longer mentions me ;) I did some
> changes (including emulation) but a) this was long ago and b) only because
> I was forced^W asked to fix the numerous bugs in this code.
> 
>> actively used is uprobe?
> 
> I have no idea, sorry ;)
> 
> Oleg.
> 


Re: Why do kprobes and uprobes singlestep?

2021-03-02 Thread Oleg Nesterov
On 03/02, Masami Hiramatsu wrote:
>
> > Not sure I understand you correctly, I know almost nothing about low-level
> > x86  magic.
>
> x86 has normal interrupt and NMI. When an NMI occurs the CPU masks NMI
> (the mask itself is hidden status) and IRET releases the mask. The problem
> is that if an INT3 is hit in the NMI handler and does a single-stepping,
> it has to use IRET for atomically setting TF and return.

Ah, thanks a lot,

Oleg.



Re: Why do kprobes and uprobes singlestep?

2021-03-02 Thread Oleg Nesterov
forgot to add Srikar, sorry for resend...

On 03/01, Andy Lutomirski wrote:
>
> On Mon, Mar 1, 2021 at 8:51 AM Oleg Nesterov  wrote:
> >
> > But I guess this has nothing to do with uprobes, they do not single-step
> > in kernel mode, right?
>
> They single-step user code, though, and the code that makes this work
> is quite ugly.  Single-stepping on x86 is a mess.

But this doesn't really differ from, say, gdb doing si ? OK, except uprobes
have to hook DIE_DEBUG. Nevermind...

> > > Uprobes seem to single-step user code for no discernable reason.
> > > (They want to trap after executing an out of line instruction, AFAICT.
> > > Surely INT3 or even CALL after the out-of-line insn would work as well
> > > or better.)
> >
> > Uprobes use single-step from the very beginning, probably because this
> > is the most simple and "standard" way to implement xol.
> >
> > And please note that CALL/JMP/etc emulation was added much later to fix the
> > problems with non-canonical addresses, and this emulation is still
> > incomplete.
>
> Is there something like a uprobe test suite?

Afaik, no.

> How maintained /

Add Srikar who sent the initial implementation. I can only say that I am
glad that ./scripts/get_maintainer.pl no longer mentions me ;) I did some
changes (including emulation) but a) this was long ago and b) only because
I was forced^W asked to fix the numerous bugs in this code.

> actively used is uprobe?

I have no idea, sorry ;)

Oleg.



Re: Why do kprobes and uprobes singlestep?

2021-03-02 Thread Oleg Nesterov
On 03/01, Andy Lutomirski wrote:
>
> On Mon, Mar 1, 2021 at 8:51 AM Oleg Nesterov  wrote:
> >
> > But I guess this has nothing to do with uprobes, they do not single-step
> > in kernel mode, right?
>
> They single-step user code, though, and the code that makes this work
> is quite ugly.  Single-stepping on x86 is a mess.

But this doesn't really differ from, say, gdb doing si ? OK, except uprobes
have to hook DIE_DEBUG. Nevermind...

> > > Uprobes seem to single-step user code for no discernable reason.
> > > (They want to trap after executing an out of line instruction, AFAICT.
> > > Surely INT3 or even CALL after the out-of-line insn would work as well
> > > or better.)
> >
> > Uprobes use single-step from the very beginning, probably because this
> > is the most simple and "standard" way to implement xol.
> >
> > And please note that CALL/JMP/etc emulation was added much later to fix the
> > problems with non-canonical addresses, and this emulation is still
> > incomplete.
>
> Is there something like a uprobe test suite?

Afaik, no.

> How maintained /

Add Srikar who sent the initial implementation. I can only say that I am
glad that ./scripts/get_maintainer.pl no longer mentions me ;) I did some
changes (including emulation) but a) this was long ago and b) only because
I was forced^W asked to fix the numerous bugs in this code.

> actively used is uprobe?

I have no idea, sorry ;)

Oleg.



Re: Why do kprobes and uprobes singlestep?

2021-03-02 Thread Alexei Starovoitov
On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
>
> Is there something like a uprobe test suite?  How maintained /
> actively used is uprobe?

uprobe+bpf is heavily used in production.
selftests/bpf has only one test for it though.

Why are you asking?


Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Andy Lutomirski
On Mon, Mar 1, 2021 at 6:22 PM Masami Hiramatsu  wrote:
>
> Hi Oleg and Andy,
>
> On Mon, 1 Mar 2021 17:51:31 +0100
> Oleg Nesterov  wrote:
>
> > Hi Andy,
> >
> > sorry for delay.
> >
> > On 02/23, Andy Lutomirski wrote:
> > >
> > > A while back, I let myself be convinced that kprobes genuinely need to
> > > single-step the kernel on occasion, and I decided that this sucked but
> > > I could live with it.  it would, however, be Really Really Nice (tm)
> > > if we could have a rule that anyone running x86 Linux who single-steps
> > > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > > when the system falls apart around them.  Specifically, if we don't
> > > allow kernel single-stepping and if we suitably limit kernel
> > > instruction breakpoints (the latter isn't actually a major problem),
> > > then we don't really really need to use IRET to return to the kernel,
> > > and that means we can avoid some massive NMI nastiness.
> >
> > Not sure I understand you correctly, I know almost nothing about low-level
> > x86  magic.
>
> x86 has normal interrupt and NMI. When an NMI occurs the CPU masks NMI
> (the mask itself is hidden status) and IRET releases the mask. The problem
> is that if an INT3 is hit in the NMI handler and does a single-stepping,
> it has to use IRET for atomically setting TF and return.
>
> >
> > But I guess this has nothing to do with uprobes, they do not single-step
> > in kernel mode, right?
>
> Agreed, if the problematic case is IRET from NMI handler, uprobes doesn't
> hit because it only invoked from user-space.
> Andy, what would you think?

Indeed, this isn't a problem for uprobes.  The problem for uprobes is
that all the notifiers from #DB are kind of messy, and I would like to
get rid of them if possible.


Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Masami Hiramatsu
Hi Oleg and Andy,

On Mon, 1 Mar 2021 17:51:31 +0100
Oleg Nesterov  wrote:

> Hi Andy,
> 
> sorry for delay.
> 
> On 02/23, Andy Lutomirski wrote:
> >
> > A while back, I let myself be convinced that kprobes genuinely need to
> > single-step the kernel on occasion, and I decided that this sucked but
> > I could live with it.  it would, however, be Really Really Nice (tm)
> > if we could have a rule that anyone running x86 Linux who single-steps
> > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > when the system falls apart around them.  Specifically, if we don't
> > allow kernel single-stepping and if we suitably limit kernel
> > instruction breakpoints (the latter isn't actually a major problem),
> > then we don't really really need to use IRET to return to the kernel,
> > and that means we can avoid some massive NMI nastiness.
> 
> Not sure I understand you correctly, I know almost nothing about low-level
> x86  magic.

x86 has normal interrupts and NMIs. When an NMI occurs, the CPU masks further
NMIs (the mask itself is hidden state) and IRET releases the mask. The problem
is that if an INT3 is hit in the NMI handler and single-steps, it has to use
IRET to atomically set TF and return.
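
To make the masking concrete, here is a toy state machine (a sketch only;
ToyCPU and its methods are invented for illustration and model nothing but
the NMI-blocked bit):

```python
class ToyCPU:
    """Models only the hidden 'NMIs blocked' state described above."""

    def __init__(self):
        self.nmi_blocked = False
        self.nmi_depth = 0  # how many NMI handlers have been entered

    def deliver_nmi(self):
        """Try to deliver an NMI; return True if the handler was entered."""
        if self.nmi_blocked:
            return False  # held pending until NMIs are unblocked
        self.nmi_blocked = True
        self.nmi_depth += 1
        return True

    def iret(self):
        """IRET unblocks NMIs, whatever it is returning from."""
        self.nmi_blocked = False

    def popf_ret(self):
        """POPF; RET restores flags and returns without touching the block."""
        pass


def nested_nmi_after_int3(return_with):
    """Hit an INT3 inside an NMI handler, return from the INT3 handler using
    `return_with` ('iret' or 'popf_ret'), then let a second NMI arrive.
    Returns True if the second NMI nests inside the first handler."""
    cpu = ToyCPU()
    cpu.deliver_nmi()             # first NMI enters its handler
    getattr(cpu, return_with)()   # INT3 handler returns into the NMI handler
    cpu.deliver_nmi()             # second NMI arrives before the first finishes
    return cpu.nmi_depth == 2
```

In this model nested_nmi_after_int3("iret") reports a nested NMI and
nested_nmi_after_int3("popf_ret") does not. The catch is that POPF; RET
cannot set TF atomically with the return, which is why single-stepping is
tied to IRET.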

> 
> But I guess this has nothing to do with uprobes, they do not single-step
> in kernel mode, right?

Agreed. If the problematic case is IRET from the NMI handler, uprobes don't
hit it because they are only invoked from user space.
Andy, what would you think?

> > Uprobes seem to single-step user code for no discernable reason.
> > (They want to trap after executing an out of line instruction, AFAICT.
> > Surely INT3 or even CALL after the out-of-line insn would work as well
> > or better.)
> 
> Uprobes use single-step from the very beginning, probably because this
> is the most simple and "standard" way to implement xol.
> 
> And please note that CALL/JMP/etc emulation was added much later to fix the
> problems with non-canonical addresses, and this emulation is still incomplete.

Yeah, I found another implementation of the emulation afterwards. Of course,
since uprobes only handle user space, it may need more care.

Thank you,

-- 
Masami Hiramatsu 


Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Andy Lutomirski
On Mon, Mar 1, 2021 at 8:51 AM Oleg Nesterov  wrote:
>
> Hi Andy,
>
> sorry for delay.
>
> On 02/23, Andy Lutomirski wrote:
> >
> > A while back, I let myself be convinced that kprobes genuinely need to
> > single-step the kernel on occasion, and I decided that this sucked but
> > I could live with it.  it would, however, be Really Really Nice (tm)
> > if we could have a rule that anyone running x86 Linux who single-steps
> > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > when the system falls apart around them.  Specifically, if we don't
> > allow kernel single-stepping and if we suitably limit kernel
> > instruction breakpoints (the latter isn't actually a major problem),
> > then we don't really really need to use IRET to return to the kernel,
> > and that means we can avoid some massive NMI nastiness.
>
> Not sure I understand you correctly, I know almost nothing about low-level
> x86  magic.
>
> But I guess this has nothing to do with uprobes, they do not single-step
> in kernel mode, right?

They single-step user code, though, and the code that makes this work
is quite ugly.  Single-stepping on x86 is a mess.

>
> > Uprobes seem to single-step user code for no discernable reason.
> > (They want to trap after executing an out of line instruction, AFAICT.
> > Surely INT3 or even CALL after the out-of-line insn would work as well
> > or better.)
>
> Uprobes use single-step from the very beginning, probably because this
> is the most simple and "standard" way to implement xol.
>
> And please note that CALL/JMP/etc emulation was added much later to fix the
> problems with non-canonical addresses, and this emulation is still incomplete.

Is there something like a uprobe test suite?  How maintained /
actively used is uprobe?

--Andy


Re: Why do kprobes and uprobes singlestep?

2021-03-01 Thread Oleg Nesterov
Hi Andy,

sorry for delay.

On 02/23, Andy Lutomirski wrote:
>
> A while back, I let myself be convinced that kprobes genuinely need to
> single-step the kernel on occasion, and I decided that this sucked but
> I could live with it.  it would, however, be Really Really Nice (tm)
> if we could have a rule that anyone running x86 Linux who single-steps
> the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> when the system falls apart around them.  Specifically, if we don't
> allow kernel single-stepping and if we suitably limit kernel
> instruction breakpoints (the latter isn't actually a major problem),
> then we don't really really need to use IRET to return to the kernel,
> and that means we can avoid some massive NMI nastiness.

Not sure I understand you correctly, I know almost nothing about low-level
x86  magic.

But I guess this has nothing to do with uprobes, they do not single-step
in kernel mode, right?

> Uprobes seem to single-step user code for no discernable reason.
> (They want to trap after executing an out of line instruction, AFAICT.
> Surely INT3 or even CALL after the out-of-line insn would work as well
> or better.)

Uprobes use single-step from the very beginning, probably because this
is the most simple and "standard" way to implement xol.

And please note that CALL/JMP/etc emulation was added much later to fix the
problems with non-canonical addresses, and this emulation is still incomplete.

Oleg.



Re: Why do kprobes and uprobes singlestep?

2021-02-25 Thread Peter Zijlstra
On Wed, Feb 24, 2021 at 11:45:10AM -0800, Andy Lutomirski wrote:
> I guess I see the point for CALL, JMP and RET, but it seems like we
> could emulate those cases instead fairly easily.

Today, yes. CALL emulation was 'recently' made possible by having #BP
have a stack gap. We have emulation for all 3 of those instructions
implemented in asm/text-patching.h, see int3_emulate_$insn().
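
As a rough user-space model of what such emulation does (not the kernel
helpers themselves; the dict-based registers, the flat `mem` stack and the
convention that `ip` holds the address of the probed instruction are all
assumptions of this sketch):

```python
import struct

CALL_INSN_SIZE = 5  # near CALL rel32: one opcode byte plus a 4-byte displacement


def emulate_jmp(regs, target):
    """JMP only changes the instruction pointer."""
    regs["ip"] = target


def emulate_push(regs, mem, value):
    """Grow the stack by 8 bytes and store `value` there."""
    regs["sp"] -= 8
    struct.pack_into("<Q", mem, regs["sp"], value)


def emulate_pop(regs, mem):
    """Load the 8-byte value at the stack pointer and shrink the stack."""
    (value,) = struct.unpack_from("<Q", mem, regs["sp"])
    regs["sp"] += 8
    return value


def emulate_call(regs, mem, target):
    """CALL pushes the address of the next instruction, then jumps."""
    emulate_push(regs, mem, regs["ip"] + CALL_INSN_SIZE)
    emulate_jmp(regs, target)


def emulate_ret(regs, mem):
    """RET pops the return address and jumps to it."""
    emulate_jmp(regs, emulate_pop(regs, mem))


regs = {"ip": 0x100, "sp": 64}
mem = bytearray(64)

emulate_call(regs, mem, 0x200)   # probed insn was `call 0x200` at 0x100
after_call = dict(regs)
emulate_ret(regs, mem)           # probed insn was the callee's `ret`
```

The push in emulate_call() is where the stack gap comes in: the emulated
CALL writes just below the interrupted stack pointer, so that slot has to be
free for it.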


Re: Why do kprobes and uprobes singlestep?

2021-02-25 Thread Masami Hiramatsu
On Wed, 24 Feb 2021 22:03:12 -0800
Andy Lutomirski  wrote:

> On Wed, Feb 24, 2021 at 6:22 PM Masami Hiramatsu  wrote:
> >
> > On Wed, 24 Feb 2021 11:45:10 -0800
> > Andy Lutomirski  wrote:
> >
> > > On Tue, Feb 23, 2021 at 5:18 PM Masami Hiramatsu  
> > > wrote:
> > > >
> > > > On Tue, 23 Feb 2021 15:24:19 -0800
> > > > Andy Lutomirski  wrote:
> > > >
> > > > > A while back, I let myself be convinced that kprobes genuinely need to
> > > > > single-step the kernel on occasion, and I decided that this sucked but
> > > > > I could live with it.  it would, however, be Really Really Nice (tm)
> > > > > if we could have a rule that anyone running x86 Linux who single-steps
> > > > > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > > > > when the system falls apart around them.  Specifically, if we don't
> > > > > allow kernel single-stepping and if we suitably limit kernel
> > > > > instruction breakpoints (the latter isn't actually a major problem),
> > > > > then we don't really really need to use IRET to return to the kernel,
> > > > > and that means we can avoid some massive NMI nastiness.
> > > >
> > > > Would you mean using "pop regs + popf + ret" instead of IRET after
> > > > int3 handled for avoiding IRET releasing the NMI mask? Yeah, it is
> > > > possible. I don't complain about that.
> > >
> > > Yes, more or less.
> > >
> > > >
> > > > However, what is the relationship between the IRET and single-stepping?
> > > > I think we can do same thing in do_debug...
> > >
> > > Because there is no way to single-step without using IRET.  POPF; RET
> > > will trap after RET and you won't make forward progress.
> >
> > Ah, indeed. "POPF; RET" does not execute atomically.
> >
> > > > > But I was contemplating the code, and I'm no longer convinced.
> > > > > Uprobes seem to single-step user code for no discernable reason.
> > > > > (They want to trap after executing an out of line instruction, AFAICT.
> > > > > Surely INT3 or even CALL after the out-of-line insn would work as well
> > > > > or better.)  Why does kprobe single-step?  I spend a while staring at
> > > > > the code, and it was entirely unclear to me what the purpose of the
> > > > > single-step is.
> > > >
> > > > For kprobes, there are 2 major reasons for (still relying on)
> > > > single-stepping.
> > > > One is to provide post_handler, another is executing the original code,
> > > > which is replaced by int3, without modifying code nor emulation.
> > >
> > > I don't follow.  Suppose we execute out of line.  If we originally have:
> > >
> > > INSN
> > >
> > > we replace it with:
> > >
> > > INT3
> > >
> > > and we have, out of line:
> > >
> > > INSN [but with displacement modified if it's RIP-relative]
> > >
> > > right now, we single-step the out of line copy.  But couldn't we instead 
> > > do:
> > >
> > > INSN [but with displacement modified if it's RIP-relative]
> > > INT3
> >
> > If the INSN is "jmp +127", it will skip the INT3. So those instructions
> > must be identified and emulated. We did it already in the arm64 (see commit
> > 7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing
> >  instructions out-of-line")), because arm64 already emulated the branch
> > instructions. I have to check x86 insns can be emulated without 
> > side-effects.
> 
> Off the top of my head:
> 
> JMP changes RIP but has no other side effects.  Jcc is the same except
> that the condition needs checking, which would be a bit tedious.
> 
> CALL changes RIP and does a push but has no other side effects.  We
> have special infrastructure to emulate CALL from int3 context:
> int3_emulate_call().

Yeah, I remember that a gap was introduced for int3_emulate_call().
This helps me to implement emulation.
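
For the Jcc case, the condition check mentioned above could look roughly
like this. It is a sketch for short (two-byte, rel8) Jcc only; the function
names and the dict-based registers are made up here, while the flag bit
positions and condition-code numbering are the architectural ones.

```python
# EFLAGS bits used by the condition codes.
CF, PF, ZF, SF, OF = 1 << 0, 1 << 2, 1 << 6, 1 << 7, 1 << 11


def condition_holds(cc, flags):
    """Evaluate x86 condition code `cc` (low nibble of the Jcc opcode)."""
    cf, pf, zf = bool(flags & CF), bool(flags & PF), bool(flags & ZF)
    sf, of = bool(flags & SF), bool(flags & OF)
    base = [
        of,                # 0x0: O
        cf,                # 0x2: B
        zf,                # 0x4: E/Z
        cf or zf,          # 0x6: BE
        sf,                # 0x8: S
        pf,                # 0xA: P
        sf != of,          # 0xC: L
        zf or (sf != of),  # 0xE: LE
    ][cc >> 1]
    # Odd condition codes are the negation of the even one below them.
    return (not base) if (cc & 1) else base


def emulate_short_jcc(regs, opcode, rel8):
    """Emulate Jcc rel8 (opcodes 0x70..0x7F); `rel8` is already sign-extended."""
    next_ip = regs["ip"] + 2  # opcode byte + displacement byte
    if condition_holds(opcode & 0x0F, regs["flags"]):
        next_ip += rel8
    regs["ip"] = next_ip


regs = {"ip": 0x100, "flags": ZF}
emulate_short_jcc(regs, 0x74, 0x10)  # je +0x10 with ZF set: branch taken
```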

> 
> RET pops and changes RIP.  No other side effects.
> 
> RET imm is rare.  I don't think it occurs in the kernel at all.
> 
> LRET is rare.  I don't think kprobe needs to support it.
> 
> JMP FAR and CALL FAR are rare.  I see no reason to support them.

I see those are rare, but supporting those is not hard.

> 
> IRET is rare, and trying to kprobe it seems likely to cause a
> disaster, although it's within the realm of possibility that the IRET
> in sync_core() could work.

Agreed. IRET should not be probed.


> > > or even
> > >
> > > INSN [but with displacement modified if it's RIP-relative]
> > > JMP kprobe_post_handler
> >
> > This needs a sequence of push-regs etc. ;)
> >
> > >
> > > and avoid single-stepping?
> > >
> > > I guess I see the point for CALL, JMP and RET, but it seems like we
> > > could emulate those cases instead fairly easily.
> >
> > OK, let's try to do it. I think it should be possible because even in the
> > current code, resume fixup code (adjust IP register) works only for a few
> > groups of instructions.
> 
> I suspect that emulating them would give a nice performance boost,
> too.  Single-stepping is very slow on x86.

Yeah, that's same on arm64. Jean reported eliminating single-step
gained 

Re: Why do kprobes and uprobes singlestep?

2021-02-24 Thread Andy Lutomirski
On Wed, Feb 24, 2021 at 6:22 PM Masami Hiramatsu  wrote:
>
> On Wed, 24 Feb 2021 11:45:10 -0800
> Andy Lutomirski  wrote:
>
> > On Tue, Feb 23, 2021 at 5:18 PM Masami Hiramatsu  
> > wrote:
> > >
> > > On Tue, 23 Feb 2021 15:24:19 -0800
> > > Andy Lutomirski  wrote:
> > >
> > > > A while back, I let myself be convinced that kprobes genuinely need to
> > > > single-step the kernel on occasion, and I decided that this sucked but
> > > > I could live with it.  it would, however, be Really Really Nice (tm)
> > > > if we could have a rule that anyone running x86 Linux who single-steps
> > > > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > > > when the system falls apart around them.  Specifically, if we don't
> > > > allow kernel single-stepping and if we suitably limit kernel
> > > > instruction breakpoints (the latter isn't actually a major problem),
> > > > then we don't really really need to use IRET to return to the kernel,
> > > > and that means we can avoid some massive NMI nastiness.
> > >
> > > Would you mean using "pop regs + popf + ret" instead of IRET after
> > > int3 handled for avoiding IRET releasing the NMI mask? Yeah, it is
> > > possible. I don't complain about that.
> >
> > Yes, more or less.
> >
> > >
> > > However, what is the relationship between the IRET and single-stepping?
> > > I think we can do same thing in do_debug...
> >
> > Because there is no way to single-step without using IRET.  POPF; RET
> > will trap after RET and you won't make forward progress.
>
> Ah, indeed. "POPF; RET" is not executed atomically.
>
> > > > But I was contemplating the code, and I'm no longer convinced.
> > > > Uprobes seem to single-step user code for no discernable reason.
> > > > (They want to trap after executing an out of line instruction, AFAICT.
> > > > Surely INT3 or even CALL after the out-of-line insn would work as well
> > > > or better.)  Why does kprobe single-step?  I spent a while staring at
> > > > the code, and it was entirely unclear to me what the purpose of the
> > > > single-step is.
> > >
> > > For kprobes, there are 2 major reasons for (still relying on) single
> > > stepping.
> > > One is to provide post_handler, another is executing the original code,
> > > which is replaced by int3, without modifying code nor emulation.
> >
> > I don't follow.  Suppose we execute out of line.  If we originally have:
> >
> > INSN
> >
> > we replace it with:
> >
> > INT3
> >
> > and we have, out of line:
> >
> > INSN [but with displacement modified if it's RIP-relative]
> >
> > right now, we single-step the out of line copy.  But couldn't we instead do:
> >
> > INSN [but with displacement modified if it's RIP-relative]
> > INT3
>
> If the INSN is "jmp +127", it will skip the INT3. So those instructions
> must be identified and emulated. We did it already in the arm64 (see commit
> 7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing
>  instructions out-of-line")), because arm64 already emulated the branch
> instructions. I have to check x86 insns can be emulated without side-effects.

Off the top of my head:

JMP changes RIP but has no other side effects.  Jcc is the same except
that the condition needs checking, which would be a bit tedious.

CALL changes RIP and does a push but has no other side effects.  We
have special infrastructure to emulate CALL from int3 context:
int3_emulate_call().

RET pops and changes RIP.  No other side effects.

RET imm is rare.  I don't think it occurs in the kernel at all.

LRET is rare.  I don't think kprobe needs to support it.

IRET is rare, and trying to kprobe it seems likely to cause a
disaster, although it's within the realm of possibility that the IRET
in sync_core() could work.

JMP FAR and CALL FAR are rare.  I see no reason to support them.
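
To make the "a bit tedious" part of Jcc concrete, here is a minimal user-space sketch of the condition check. The helper name jcc_taken() and the FLAG_* macros are made up for illustration and are not kernel API; only the EFLAGS bit positions and the condition-code encoding (low nibble of opcodes 0x70..0x7f, or of the second byte of 0x0f 0x80..0x8f) are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define FLAG_CF (1u << 0)
#define FLAG_PF (1u << 2)
#define FLAG_ZF (1u << 6)
#define FLAG_SF (1u << 7)
#define FLAG_OF (1u << 11)

/*
 * Hypothetical helper: would a Jcc with condition code cc be taken for
 * the given EFLAGS value?  cc is the low nibble of the Jcc opcode.
 */
static bool jcc_taken(uint8_t cc, uint32_t eflags)
{
	bool sf = eflags & FLAG_SF;
	bool of = eflags & FLAG_OF;
	bool taken;

	assert(cc < 16);

	/* Each even/odd pair tests the same flags; odd codes are negated. */
	switch (cc >> 1) {
	case 0:  taken = eflags & FLAG_OF; break;                /* JO  */
	case 1:  taken = eflags & FLAG_CF; break;                /* JB  */
	case 2:  taken = eflags & FLAG_ZF; break;                /* JE  */
	case 3:  taken = eflags & (FLAG_CF | FLAG_ZF); break;    /* JBE */
	case 4:  taken = eflags & FLAG_SF; break;                /* JS  */
	case 5:  taken = eflags & FLAG_PF; break;                /* JP  */
	case 6:  taken = sf != of; break;                        /* JL  */
	default: taken = (eflags & FLAG_ZF) || sf != of; break;  /* JLE */
	}

	return (cc & 1) ? !taken : taken;
}
```

If the condition holds, the emulation would add the displacement to RIP exactly as for JMP; otherwise RIP just moves past the instruction.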

>
> >
> > or even
> >
> > INSN [but with displacement modified if it's RIP-relative]
> > JMP kprobe_post_handler
>
> This needs a sequence of push-regs etc. ;)
>
> >
> > and avoid single-stepping?
> >
> > I guess I see the point for CALL, JMP and RET, but it seems like we
> > could emulate those cases instead fairly easily.
>
> OK, let's try to do it. I think it should be possible because even in the
> current code, resume fixup code (adjust IP register) works only for a few
> groups of instructions.

I suspect that emulating them would give a nice performance boost,
too.  Single-stepping is very slow on x86.

I should let you know, though, that I might have found a sneaky
alternative solution to handling NMIs, so this is a bit lower priority
from my perspective than I thought it was.  I'm not quite 100%
convinced my idea works, but I'll play with it.

--Andy

>
> Thank you,
>
> --
> Masami Hiramatsu 


Re: Why do kprobes and uprobes singlestep?

2021-02-24 Thread Masami Hiramatsu
On Wed, 24 Feb 2021 11:45:10 -0800
Andy Lutomirski  wrote:

> On Tue, Feb 23, 2021 at 5:18 PM Masami Hiramatsu  wrote:
> >
> > On Tue, 23 Feb 2021 15:24:19 -0800
> > Andy Lutomirski  wrote:
> >
> > > A while back, I let myself be convinced that kprobes genuinely need to
> > > single-step the kernel on occasion, and I decided that this sucked but
> > > I could live with it.  it would, however, be Really Really Nice (tm)
> > > if we could have a rule that anyone running x86 Linux who single-steps
> > > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > > when the system falls apart around them.  Specifically, if we don't
> > > allow kernel single-stepping and if we suitably limit kernel
> > > instruction breakpoints (the latter isn't actually a major problem),
> > > then we don't really really need to use IRET to return to the kernel,
> > > and that means we can avoid some massive NMI nastiness.
> >
> > Would you mean using "pop regs + popf + ret" instead of IRET after
> > int3 handled for avoiding IRET releasing the NMI mask? Yeah, it is
> > possible. I don't complain about that.
> 
> Yes, more or less.
> 
> >
> > However, what is the relationship between the IRET and single-stepping?
> > I think we can do same thing in do_debug...
> 
> Because there is no way to single-step without using IRET.  POPF; RET
> will trap after RET and you won't make forward progress.

Ah, indeed. "POPF; RET" is not executed atomically.

> > > But I was contemplating the code, and I'm no longer convinced.
> > > Uprobes seem to single-step user code for no discernable reason.
> > > (They want to trap after executing an out of line instruction, AFAICT.
> > > Surely INT3 or even CALL after the out-of-line insn would work as well
> > > or better.)  Why does kprobe single-step?  I spent a while staring at
> > > the code, and it was entirely unclear to me what the purpose of the
> > > single-step is.
> >
> > For kprobes, there are 2 major reasons for (still relying on) single
> > stepping.
> > One is to provide post_handler, another is executing the original code,
> > which is replaced by int3, without modifying code nor emulation.
> 
> I don't follow.  Suppose we execute out of line.  If we originally have:
> 
> INSN
> 
> we replace it with:
> 
> INT3
> 
> and we have, out of line:
> 
> INSN [but with displacement modified if it's RIP-relative]
> 
> right now, we single-step the out of line copy.  But couldn't we instead do:
> 
> INSN [but with displacement modified if it's RIP-relative]
> INT3

If the INSN is "jmp +127", it will skip the INT3. So those instructions
must be identified and emulated. We already did this on arm64 (see commit
7ee31a3aa8f4 ("arm64: kprobes: Use BRK instead of single-step when executing
 instructions out-of-line")), because arm64 already emulated the branch
instructions. I have to check whether x86 insns can be emulated without
side effects.
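
As a rough illustration of the "must be identified" step, a user-space sketch that flags near control-transfer opcodes by their first byte(s). The function name needs_emulation() is hypothetical, and the sketch deliberately ignores prefixes, indirect JMP/CALL (0xff group), LOOP/JCXZ and the far/IRET forms; a real implementation would use the kernel's instruction decoder rather than raw byte checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical classifier: true if the instruction at insn[] is a near
 * branch/call/return, i.e. its out-of-line copy would not fall through
 * to a trailing INT3 and so needs emulation instead.
 */
static bool needs_emulation(const uint8_t *insn, size_t len)
{
	uint8_t op;

	assert(insn != NULL);
	if (len == 0)
		return false;

	op = insn[0];
	if (op >= 0x70 && op <= 0x7f)	/* Jcc rel8 */
		return true;

	switch (op) {
	case 0xe8:	/* CALL rel32 */
	case 0xe9:	/* JMP rel32 */
	case 0xeb:	/* JMP rel8 */
	case 0xc3:	/* RET */
	case 0xc2:	/* RET imm16 */
		return true;
	case 0x0f:	/* two-byte opcode: Jcc rel32 is 0f 80..8f */
		return len >= 2 && insn[1] >= 0x80 && insn[1] <= 0x8f;
	default:
		return false;
	}
}
```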

> 
> or even
> 
> INSN [but with displacement modified if it's RIP-relative]
> JMP kprobe_post_handler

This needs a sequence of push-regs etc. ;)

> 
> and avoid single-stepping?
> 
> I guess I see the point for CALL, JMP and RET, but it seems like we
> could emulate those cases instead fairly easily.

OK, let's try to do it. I think it should be possible because even in the
current code, resume fixup code (adjust IP register) works only for a few
groups of instructions.

Thank you,

-- 
Masami Hiramatsu 


Re: Why do kprobes and uprobes singlestep?

2021-02-24 Thread Andy Lutomirski
On Tue, Feb 23, 2021 at 5:18 PM Masami Hiramatsu  wrote:
>
> On Tue, 23 Feb 2021 15:24:19 -0800
> Andy Lutomirski  wrote:
>
> > A while back, I let myself be convinced that kprobes genuinely need to
> > single-step the kernel on occasion, and I decided that this sucked but
> > I could live with it.  it would, however, be Really Really Nice (tm)
> > if we could have a rule that anyone running x86 Linux who single-steps
> > the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> > when the system falls apart around them.  Specifically, if we don't
> > allow kernel single-stepping and if we suitably limit kernel
> > instruction breakpoints (the latter isn't actually a major problem),
> > then we don't really really need to use IRET to return to the kernel,
> > and that means we can avoid some massive NMI nastiness.
>
> Would you mean using "pop regs + popf + ret" instead of IRET after
> int3 handled for avoiding IRET releasing the NMI mask? Yeah, it is
> possible. I don't complain about that.

Yes, more or less.

>
> However, what is the relationship between the IRET and single-stepping?
> I think we can do same thing in do_debug...

Because there is no way to single-step without using IRET.  POPF; RET
will trap after RET and you won't make forward progress.

>
> > But I was contemplating the code, and I'm no longer convinced.
> > Uprobes seem to single-step user code for no discernable reason.
> > (They want to trap after executing an out of line instruction, AFAICT.
> > Surely INT3 or even CALL after the out-of-line insn would work as well
> > or better.)  Why does kprobe single-step?  I spent a while staring at
> > the code, and it was entirely unclear to me what the purpose of the
> > single-step is.
>
> For kprobes, there are 2 major reasons for (still relying on) single
> stepping.
> One is to provide post_handler, another is executing the original code,
> which is replaced by int3, without modifying code nor emulation.

I don't follow.  Suppose we execute out of line.  If we originally have:

INSN

we replace it with:

INT3

and we have, out of line:

INSN [but with displacement modified if it's RIP-relative]

right now, we single-step the out of line copy.  But couldn't we instead do:

INSN [but with displacement modified if it's RIP-relative]
INT3

or even

INSN [but with displacement modified if it's RIP-relative]
JMP kprobe_post_handler

and avoid single-stepping?

I guess I see the point for CALL, JMP and RET, but it seems like we
could emulate those cases instead fairly easily.
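
To show what "emulate those cases" could amount to, here is a toy user-space model. struct toy_regs and the toy_* helpers are invented for this sketch and are not the kernel's struct pt_regs or its int3_emulate_*() helpers; the stack is a small byte array and ip is assumed to already point at the instruction following the probed one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy machine state; field names are illustrative only. */
struct toy_regs {
	uint64_t ip;		/* address of the insn after the probed one */
	uint64_t sp;		/* byte offset into stack[], grows downward */
	uint8_t stack[256];
};

static void toy_push(struct toy_regs *regs, uint64_t val)
{
	assert(regs->sp >= sizeof(val));
	regs->sp -= sizeof(val);
	memcpy(&regs->stack[regs->sp], &val, sizeof(val));
}

static uint64_t toy_pop(struct toy_regs *regs)
{
	uint64_t val;

	assert(regs->sp + sizeof(val) <= sizeof(regs->stack));
	memcpy(&val, &regs->stack[regs->sp], sizeof(val));
	regs->sp += sizeof(val);
	return val;
}

/* JMP rel: only the instruction pointer changes. */
static void toy_emulate_jmp(struct toy_regs *regs, int32_t rel)
{
	regs->ip += (int64_t)rel;
}

/* CALL rel: push the return address, then jump. */
static void toy_emulate_call(struct toy_regs *regs, int32_t rel)
{
	toy_push(regs, regs->ip);
	regs->ip += (int64_t)rel;
}

/* RET: pop the return address into the instruction pointer. */
static void toy_emulate_ret(struct toy_regs *regs)
{
	regs->ip = toy_pop(regs);
}
```

None of the three touches flags or any other register, which is the reason emulating them from the INT3 handler looks tractable.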


Re: Why do kprobes and uprobes singlestep?

2021-02-23 Thread Masami Hiramatsu
On Tue, 23 Feb 2021 15:24:19 -0800
Andy Lutomirski  wrote:

> A while back, I let myself be convinced that kprobes genuinely need to
> single-step the kernel on occasion, and I decided that this sucked but
> I could live with it.  it would, however, be Really Really Nice (tm)
> if we could have a rule that anyone running x86 Linux who single-steps
> the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
> when the system falls apart around them.  Specifically, if we don't
> allow kernel single-stepping and if we suitably limit kernel
> instruction breakpoints (the latter isn't actually a major problem),
> then we don't really really need to use IRET to return to the kernel,
> and that means we can avoid some massive NMI nastiness.

Do you mean using "pop regs + popf + ret" instead of IRET after the
int3 is handled, to avoid IRET releasing the NMI mask? Yeah, it is
possible. I have no complaint about that.

However, what is the relationship between the IRET and single-stepping?
I think we can do the same thing in do_debug...

> But I was contemplating the code, and I'm no longer convinced.
> Uprobes seem to single-step user code for no discernable reason.
> (They want to trap after executing an out of line instruction, AFAICT.
> Surely INT3 or even CALL after the out-of-line insn would work as well
> or better.)  Why does kprobe single-step?  I spent a while staring at
> the code, and it was entirely unclear to me what the purpose of the
> single-step is.

For kprobes, there are 2 major reasons for (still relying on) single stepping.
One is to provide post_handler, another is executing the original code,
which is replaced by int3, without modifying the code or emulating it.
Indeed, most instructions do not actually depend on the ip register;
in that case (and if the user doesn't set a post_handler), kprobes already
skips single stepping (a.k.a. kprobe booster: jump back to the kernel code
after executing the out-of-line instruction).
However, since some instructions, e.g. jump, call and ret, change the ip
register (and stack), we have to do a fixup afterwards.

But yes, it is possible to emulate them, the same as arm/arm64 does. I am
just concerned about side effects of the emulation; it needs to be
implemented carefully.
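
For readers unfamiliar with the booster idea, a minimal user-space sketch of building such an out-of-line slot: the copied instruction followed by a "JMP rel32" back to the next original instruction. build_boosted_slot() and its parameters are hypothetical names for this sketch, not the kernel's implementation; it assumes a little-endian host and an instruction that is not RIP-relative (no displacement fixup is done).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP32_OPCODE	0xe9
#define JMP32_SIZE	5

/*
 * Hypothetical helper: fill slot[] with a copy of the probed instruction
 * followed by "JMP rel32" to resume_addr (the instruction after the
 * probe point).  slot_addr is the address slot[0] will execute at.
 * Returns the number of bytes written, or 0 if the slot is too small
 * or the target is out of rel32 range.
 */
static size_t build_boosted_slot(uint8_t *slot, size_t slot_size,
				 uint64_t slot_addr,
				 const uint8_t *insn, size_t insn_len,
				 uint64_t resume_addr)
{
	int64_t rel;
	int32_t rel32;

	assert(slot != NULL && insn != NULL);
	if (insn_len + JMP32_SIZE > slot_size)
		return 0;

	/* rel32 is measured from the end of the JMP instruction. */
	rel = (int64_t)(resume_addr - (slot_addr + insn_len + JMP32_SIZE));
	if (rel < INT32_MIN || rel > INT32_MAX)
		return 0;
	rel32 = (int32_t)rel;

	memcpy(slot, insn, insn_len);
	slot[insn_len] = JMP32_OPCODE;
	memcpy(&slot[insn_len + 1], &rel32, sizeof(rel32));
	return insn_len + JMP32_SIZE;
}
```

With a slot like this there is no trap after the copied instruction at all, which is why it only applies when no post_handler is set and the instruction does not itself change the ip register.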

Thank you,

-- 
Masami Hiramatsu 


Why do kprobes and uprobes singlestep?

2021-02-23 Thread Andy Lutomirski
A while back, I let myself be convinced that kprobes genuinely need to
single-step the kernel on occasion, and I decided that this sucked but
I could live with it.  It would, however, be Really Really Nice (tm)
if we could have a rule that anyone running x86 Linux who single-steps
the kernel (e.g. kgdb and nothing else) gets to keep all the pieces
when the system falls apart around them.  Specifically, if we don't
allow kernel single-stepping and if we suitably limit kernel
instruction breakpoints (the latter isn't actually a major problem),
then we don't really really need to use IRET to return to the kernel,
and that means we can avoid some massive NMI nastiness.

But I was contemplating the code, and I'm no longer convinced.
Uprobes seem to single-step user code for no discernable reason.
(They want to trap after executing an out of line instruction, AFAICT.
Surely INT3 or even CALL after the out-of-line insn would work as well
or better.)  Why does kprobe single-step?  I spent a while staring at
the code, and it was entirely unclear to me what the purpose of the
single-step is.

--Andy