Re: [PATCH] x86/kprobe: unbalanced preempt counter with CONFIG_PREEMPT=y

Masami Hiramatsu Fri, 02 Mar 2018 17:21:30 -0800

On Fri, 2 Mar 2018 09:36:06 +0100 (CET)
Thomas Gleixner <[email protected]> wrote:


> On Fri, 2 Mar 2018, Masami Hiramatsu wrote:
> > On Thu, 1 Mar 2018 22:12:56 +0100 (CET)
> > Thomas Gleixner <[email protected]> wrote:
> > > > For each kprobe hook, preempt_disable() is called twice, but
> > > > preempt_enable() is called once, when CONFIG_PREEMPT=y. No
> > > > matter kprobe uses ftrace or int3. Then the preempt counter
> > > > overflows, the process might be preempted and migrate to another
> > > > cpu. It causes the kprobe int3 being treated as not kprobe's int3,
> > > > kernel dies.
> > > > 
> > > > The backtrace looks like this:
> > > > 
> > > > int3: 0000 [#1] PREEMPT SMP
> > > > Modules linkedd in: ...
> > > > CPU: 1 PID: 2618 COMM: ...
> > > > Hardware: ...
> > > > task: ...
> > > > RIP: 0010[<ffffffff81457cd5>] [<ffffffff81457cd5>] 
> > > > jprobe_return_end+0x0/0xb
> > > > ...
> > > > RIP [<ffffffff81457cd5>] jprobe_return_end+0x0/0xb
> > > > RSP ...
> > > > 
> > > > Signed-off-by: Yang Bo <[email protected]>
> > > > ---
> > > >  arch/x86/kernel/kprobes/core.c | 1 +
> > > >  1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/arch/x86/kernel/kprobes/core.c 
> > > > b/arch/x86/kernel/kprobes/core.c
> > > > index bd36f3c33cd0..1ea5c9caa8f1 100644
> > > > --- a/arch/x86/kernel/kprobes/core.c
> > > > +++ b/arch/x86/kernel/kprobes/core.c
> > > > @@ -611,6 +611,7 @@ static void setup_singlestep(struct kprobe *p, 
> > > > struct pt_regs *regs,
> > > >                 regs->ip = (unsigned long)p->addr;
> > > >         else
> > > >                 regs->ip = (unsigned long)p->ainsn.insn;
> > > > +       preempt_enable_no_resched();
> > > 
> > > Good catch, but to be honest the proper fix is to move the 
> > > preempt_enable()
> > > invocation to the callsite. This kind of asymetric handling is error prone
> > > and leads exactly to that type of bugs. Just adding more preempt_enable()
> > > invocations is proliferating the problem.
> > 
> > Oops, I missed to send my NACK not to ML...
> > 
> > ----
> > > NACK. While single-stepping we have to disable preemption, this
> > > is balanced right after debug-exception is handled (which you
> > > removed next patch).
> > > 
> > > kprobes does not use only int3, but also debug (single step) exception.
> > > So, basically its sequence is;
> > > 1) disable preemption at int3 handler and return
> > > 2) do single-step on returned instruction
> > > 3) enable preemption at debug exception handler
> 
> Well, but that does not make the crash Yang is seeing magically go
> away. There must be a bug in that logic somewhere and that needs to be
> addressed.

As Yang said, this bug had been fixed by 

commit 5bb4fc2d8641 ("Disable preemption in ftrace-based jprobes")


> 
> > > Looking deeper into kprobe_int3_handler() there are more places which do
> > > this completely unomprehensible conditional preempt count handling via
> > > skip_singlestep() and setup_detour_execution(). What an unholy mess.
> > > 
> > > Masami?
> > 
> > As I pointed above, since kprobe has to disable preempt while 
> > singlestepping,
> > preempt_disable()/enable() must be separated in int3 handler and debug 
> > handler
> > by default. There are some exception paths for skipping singlestep to speed 
> > up
> > the process. Maybe I can make it better to handle it in 
> > kprobe_int3_handler()
> >
> > E.g. check singlestep setup in the end of kprobe_int3_handler() and decide
> >  to enable preempt again or keep disabled.
> > 
> > Does it match to your thought?
> 
> Yes. Doing it at the callsite and documenting the rules proper is much
> preferred. Right now the code looks like:
> 
> 
>       preempt_disable();
>       if (foo) {
>               if (baz())
>                       return;
>               ....
>               return;
>       } else {
>               more of this....
>       }
>       preempt_enable();
> 
> And the preempt_enable() is conditionally inside of baz() and other
> functions, which is completely non-obvious.
> 
>       preempt_disable();
>       if (foo) {
>               if (baz()) {
>                       preempt_enable();
>                       return;
>               }
>               ....
>               /* Keep preemption disabled across the probe operation */
>               return;
>       } else {
>               more of this....
>       }
>       preempt_enable();
> 
> That's a bit more understandable, but still confusing.
> 
> Preemption needs to be disabled when the probe hits and reenabled when the
> probe operation finishes, right?
> 
> So I'd rather have an indicator for probe in progress and use that to
> control preemption:
> 
> handler()
> {
>       if (!current->probe_active) {
>               if (is_probe()) {
>                       do_stuff();
>                       current->probe_active = true;
>                       /* Keep preemption disabled across the probe operation 
> */
>                       preempt_disable();
>                       return 1;
>               }
>               return 0;
>       }
> 
>       if (probe_done()) {
>               do_other_stuff();
>               current->probe_active = false;
>               preempt_enable();
>               return 1;
>       }
>       do_more_stuff();
>       return 1;
> }
> 
> See how that automatically documents the mode of operation and the
> preemption semantics? No need for all that conditional preempt_enable()
> stuff all over the place. It's in reality probably not that simple, but you
> get the idea.

Ok, my idea will be a bit simpler.

handler()
{
        preempt_disable();
        ret = do_kprobe();
        if (!prepared_for_singlestep())
                preempt_enable();
        return ret;
}

This shows in what case we keep preempt disabled, and give a hint
where it is enabled again :)

Thanks,

-- 
Masami Hiramatsu <[email protected]>

Re: [PATCH] x86/kprobe: unbalanced preempt counter with CONFIG_PREEMPT=y

Reply via email to