On Sat, May 30, 2020 at 10:21:19AM +0800, Wangshaobo (bobo) wrote:
> 1) When a user-mode task has just forked and is executing ret_from_fork()
> up to schedule_tail(), unwind_next_frame() finds that orc->sp_reg is
> ORC_REG_UNDEFINED but orc->end is nonzero. arch_stack_walk_reliable()
> then terminates its backtracing loop because unwind_done() returns true.
> The 'if (!(task->flags & (PF_KTHREAD | PF_IDLE)))' check in
> arch_stack_walk_reliable() is then true, and it returns -EINVAL.
> 
> * The stack trace looks like this:
>
> ret_from_fork
>       -=> UNWIND_HINT_EMPTY
>       -=> schedule_tail             /* schedule out */
>       ...
>       -=> UNWIND_HINT_REGS      /*  UNDO */

Yes, makes sense.
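
For anyone following along, the exit path described in case 1 can be modelled roughly as below. This is a simplified user-space sketch, not the kernel source: the MODEL_* flag values, struct model_unwind_result, and model_walk_reliable() are hypothetical names invented for illustration, and the "ended at user-mode regs" success path is my assumption about how a user task is normally accepted.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PF_IDLE / PF_KTHREAD; values are illustrative. */
#define MODEL_PF_IDLE    0x1u
#define MODEL_PF_KTHREAD 0x2u

/* Hypothetical, reduced view of the unwinder state once the loop stops. */
struct model_unwind_result {
	bool error;        /* models unwind_error() returning true */
	bool reached_regs; /* assumption: walk ended at user-mode pt_regs */
};

/*
 * Models the checks made after the backtracing loop terminates.
 * Returns 0 if the trace would be treated as reliable, -EINVAL otherwise.
 */
int model_walk_reliable(unsigned int task_flags, struct model_unwind_result r)
{
	/* The unwinder gave up: never reliable. */
	if (r.error)
		return -EINVAL;

	/* A user task already ended at its user-mode regs: success. */
	if (r.reached_regs)
		return 0;

	/* Otherwise only kthreads and idle tasks may end without regs. */
	if (!(task_flags & (MODEL_PF_KTHREAD | MODEL_PF_IDLE)))
		return -EINVAL;

	return 0;
}
```

In this model, case 1 is a user task (no KTHREAD or IDLE flag) whose walk finishes without an error but also without reaching user-mode regs, so it falls through to the flags check and gets -EINVAL.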

> 2) When call_usermodehelper_exec_async() is used to create a user-mode
> task, ret_from_fork() has not yet executed although the task has been
> scheduled in __schedule(). At this point orc->sp_reg is
> ORC_REG_UNDEFINED but orc->end is zero, so unwind_error() returns true
> and also terminates arch_stack_walk_reliable()'s backtracing loop, which
> ends up returning from the 'if (unwind_error())' branch.
> 
> * The stack trace looks like this:
>
> -=> call_usermodehelper_exec
>                  -=> do_exec
>                            -=> search_binary_handler
>                                       -=> load_elf_binary
>                                                 -=> elf_map
>                                                          -=> vm_mmap_pgoff
> -=> down_write_killable
> -=> _cond_resched
>              -=> __schedule           /* scheduled to work */
> -=> ret_from_fork       /* UNDO */

I don't quite follow the stacktrace, but it sounds like the issue is the
same as the first one you originally reported:

> 1) The task was not actually scheduled to execute. At this time
> UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint; its
> sp_reg and end fields remain at their default values and end up throwing
> an error in unwind_next_frame() when called by arch_stack_walk_reliable();

Or am I misunderstanding?

And to reiterate, these are not "livepatch failures", right?  Livepatch
doesn't fail when stack_trace_save_tsk_reliable() returns an error.  It
recovers gracefully and tries again later.
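
To illustrate what "tries again later" means, here is a minimal user-space sketch of a retrying transition. It is not the livepatch implementation: struct model_task, model_stack_check(), model_try_complete(), and model_transition() are all hypothetical names, and the "unreliable for N polls" counter is only a stand-in for a task that is briefly caught in a state like the ones described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task state for the model. */
struct model_task {
	bool patched;         /* task has been switched to the new code */
	int unreliable_polls; /* stack check fails this many more times */
};

/* Stand-in for a reliable stack check: nonzero means "not reliable now". */
static int model_stack_check(struct model_task *t)
{
	if (t->unreliable_polls > 0) {
		t->unreliable_polls--;
		return -1;
	}
	return 0;
}

/* One pass over all tasks; returns true once every task is patched. */
bool model_try_complete(struct model_task *tasks, size_t n)
{
	bool complete = true;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].patched)
			continue;
		if (model_stack_check(&tasks[i]) != 0) {
			/* Not a failure: leave the task for a later pass. */
			complete = false;
			continue;
		}
		tasks[i].patched = true;
	}
	return complete;
}

/* Retry up to max_passes; returns passes used, or -1 if still incomplete. */
int model_transition(struct model_task *tasks, size_t n, int max_passes)
{
	for (int pass = 1; pass <= max_passes; pass++) {
		if (model_try_complete(tasks, n))
			return pass;
	}
	return -1;
}
```

The point of the sketch is that an error from the stack check only postpones that one task; the transition as a whole finishes once a later pass finds every stack reliable.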

-- 
Josh
