Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
On Thu, Mar 18, 2021 at 05:12:07PM +, Mark Rutland wrote: > On Thu, Mar 18, 2021 at 04:17:24PM +, Catalin Marinas wrote: > > On Wed, Mar 17, 2021 at 07:34:16PM +, Mark Rutland wrote: > > > On Wed, Mar 17, 2021 at 06:36:36PM +, Catalin Marinas wrote: > > > > On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: > > > > > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same > > > > > stack: > > > > > stack_trace_save+0x4c/0x78 > > > > > register_early_stack+0x34/0x70 > > > > > init_page_owner+0x34/0x230 > > > > > page_ext_init+0x1bc/0x1dc > > > > > > > > > > The reason is that: > > > > > check_recursive_alloc always return 1 because that > > > > > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). > > > > > > > > > > The root cause is that: > > > > > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > > > > make the save_trace save 2 more entries. > > > > > > > > > > Add skip in arch_stack_walk when task == current. > > > > > > > > > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > > > > Signed-off-by: Chen Jun > > > > > --- > > > > > arch/arm64/kernel/stacktrace.c | 5 +++-- > > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/arch/arm64/kernel/stacktrace.c > > > > > b/arch/arm64/kernel/stacktrace.c > > > > > index ad20981..c26b0ac 100644 > > > > > --- a/arch/arm64/kernel/stacktrace.c > > > > > +++ b/arch/arm64/kernel/stacktrace.c > > > > > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn > > > > > consume_entry, void *cookie, > > > > > > > > > > if (regs) > > > > > start_backtrace(, regs->regs[29], regs->pc); > > > > > - else if (task == current) > > > > > + else if (task == current) { > > > > > + ((struct stacktrace_cookie *)cookie)->skip += 2; > > > > > start_backtrace(, > > > > > (unsigned > > > > > long)__builtin_frame_address(0), > > > > > (unsigned long)arch_stack_walk); > > > > > - else > > > > > + } else > > > > > start_backtrace(, thread_saved_fp(task), > > > > > thread_saved_pc(task)); > > > > > > > > I don't like abusing the cookie here. It's void * as it's meant to be an > > > > opaque type. I'd rather skip the first two frames in walk_stackframe() > > > > instead before invoking fn(). > > > > > > I agree that we shouldn't touch cookie here. > > > > > > I don't think that it's right to bodge this inside walk_stackframe(), > > > since that'll add bogus skipping for the case starting with regs in the > > > current task. If we need a bodge, it has to live in arch_stack_walk() > > > where we set up the initial unwinding state. > > > > Good point. However, instead of relying on __builtin_frame_address(1), > > can we add a 'skip' value to struct stackframe via arch_stack_walk() -> > > start_backtrace() that is consumed by walk_stackframe()? > > We could, but I'd strongly prefer to use __builtin_frame_address(1) if > we can, as it's much simpler to read and keeps the logic constrained to > the starting function. I'd already hacked that up at: > > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/unwind=5811a76c1be1dcea7104a9a771fc2604bc2a90ef > > ... and I'm fairly confident that this works on arm64. If it works with both clang and gcc (and various versions), it's cleaner this way. > If __builtin_frame_address(1) is truly unreliable, then we could just > manually unwind one step within arch_stack_walk() when unwinding > current, which I think is cleaner than spreading this within > walk_stackframe(). > > I can clean up the commit message and post that as a real patch, if you > like? Yes, please. Either variant is fine by me, with a preference for __builtin_frame_address(1) (if we know it works). > > > In another thread, we came to the conclusion that arch_stack_walk() > > > should start at its parent, and its parent should add any skipping it > > > requires. > > > > This makes sense. > > > > > Currently, arch_stack_walk() is off-by-one, and we can bodge that by > > > using __builtin_frame_address(1), though I'm waiting for some compiler > > > folk to confirm that's sound. Otherwise we need to add an assembly > > > trampoline to snapshot the FP, which is unfortunastely convoluted. > > > > > > This report suggests that a caller of arch_stack_walk() is off-by-one > > > too, which suggests a larger cross-architecture semantic issue. I'll try > > > to take a look tomorrow. > > > > I don't think the caller is off by one, at least not by the final skip > > value. __set_page_owner() wants the trace to start at its caller. The > > callee save_stack() in the same file adds a skip of 2. > > save_stack_trace() increments the skip before invoking > > arch_stack_walk(). So far, this assumes that arch_stack_walk() starts at > > its parent, i.e.
Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
On Thu, Mar 18, 2021 at 04:17:24PM +, Catalin Marinas wrote: > On Wed, Mar 17, 2021 at 07:34:16PM +, Mark Rutland wrote: > > On Wed, Mar 17, 2021 at 06:36:36PM +, Catalin Marinas wrote: > > > On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: > > > > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same > > > > stack: > > > > stack_trace_save+0x4c/0x78 > > > > register_early_stack+0x34/0x70 > > > > init_page_owner+0x34/0x230 > > > > page_ext_init+0x1bc/0x1dc > > > > > > > > The reason is that: > > > > check_recursive_alloc always return 1 because that > > > > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). > > > > > > > > The root cause is that: > > > > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > > > make the save_trace save 2 more entries. > > > > > > > > Add skip in arch_stack_walk when task == current. > > > > > > > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > > > Signed-off-by: Chen Jun > > > > --- > > > > arch/arm64/kernel/stacktrace.c | 5 +++-- > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/arm64/kernel/stacktrace.c > > > > b/arch/arm64/kernel/stacktrace.c > > > > index ad20981..c26b0ac 100644 > > > > --- a/arch/arm64/kernel/stacktrace.c > > > > +++ b/arch/arm64/kernel/stacktrace.c > > > > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn > > > > consume_entry, void *cookie, > > > > > > > > if (regs) > > > > start_backtrace(, regs->regs[29], regs->pc); > > > > - else if (task == current) > > > > + else if (task == current) { > > > > + ((struct stacktrace_cookie *)cookie)->skip += 2; > > > > start_backtrace(, > > > > (unsigned > > > > long)__builtin_frame_address(0), > > > > (unsigned long)arch_stack_walk); > > > > - else > > > > + } else > > > > start_backtrace(, thread_saved_fp(task), > > > > thread_saved_pc(task)); > > > > > > I don't like abusing the cookie here. It's void * as it's meant to be an > > > opaque type. I'd rather skip the first two frames in walk_stackframe() > > > instead before invoking fn(). > > > > I agree that we shouldn't touch cookie here. > > > > I don't think that it's right to bodge this inside walk_stackframe(), > > since that'll add bogus skipping for the case starting with regs in the > > current task. If we need a bodge, it has to live in arch_stack_walk() > > where we set up the initial unwinding state. > > Good point. However, instead of relying on __builtin_frame_address(1), > can we add a 'skip' value to struct stackframe via arch_stack_walk() -> > start_backtrace() that is consumed by walk_stackframe()? We could, but I'd strongly prefer to use __builtin_frame_address(1) if we can, as it's much simpler to read and keeps the logic constrained to the starting function. I'd already hacked that up at: https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/unwind=5811a76c1be1dcea7104a9a771fc2604bc2a90ef ... and I'm fairly confident that this works on arm64. If __builtin_frame_address(1) is truly unreliable, then we could just manually unwind one step within arch_stack_walk() when unwinding current, which I think is cleaner than spreading this within walk_stackframe(). I can clean up the commit message and post that as a real patch, if you like? > > In another thread, we came to the conclusion that arch_stack_walk() > > should start at its parent, and its parent should add any skipping it > > requires. > > This makes sense. > > > Currently, arch_stack_walk() is off-by-one, and we can bodge that by > > using __builtin_frame_address(1), though I'm waiting for some compiler > > folk to confirm that's sound. Otherwise we need to add an assembly > > trampoline to snapshot the FP, which is unfortunastely convoluted. > > > > This report suggests that a caller of arch_stack_walk() is off-by-one > > too, which suggests a larger cross-architecture semantic issue. I'll try > > to take a look tomorrow. > > I don't think the caller is off by one, at least not by the final skip > value. __set_page_owner() wants the trace to start at its caller. The > callee save_stack() in the same file adds a skip of 2. > save_stack_trace() increments the skip before invoking > arch_stack_walk(). So far, this assumes that arch_stack_walk() starts at > its parent, i.e. save_stack_trace(). FWIW, I had only assumed the caller was also off-by-one because the commit message for this patch said the conversion to ARCH_STACKWALK added two entries. Have I misunderstood, or is that incorrect? So if this is only off-by-one, I agree it's the same problem. Thanks, Mark. > So save_stack_trace() only need to skip 1 and I think that's in line > with the original report where the entries[0] is
Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
On Wed, Mar 17, 2021 at 07:34:16PM +, Mark Rutland wrote: > On Wed, Mar 17, 2021 at 06:36:36PM +, Catalin Marinas wrote: > > On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: > > > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same > > > stack: > > > stack_trace_save+0x4c/0x78 > > > register_early_stack+0x34/0x70 > > > init_page_owner+0x34/0x230 > > > page_ext_init+0x1bc/0x1dc > > > > > > The reason is that: > > > check_recursive_alloc always return 1 because that > > > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). > > > > > > The root cause is that: > > > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > > make the save_trace save 2 more entries. > > > > > > Add skip in arch_stack_walk when task == current. > > > > > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > > Signed-off-by: Chen Jun > > > --- > > > arch/arm64/kernel/stacktrace.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/arm64/kernel/stacktrace.c > > > b/arch/arm64/kernel/stacktrace.c > > > index ad20981..c26b0ac 100644 > > > --- a/arch/arm64/kernel/stacktrace.c > > > +++ b/arch/arm64/kernel/stacktrace.c > > > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn > > > consume_entry, void *cookie, > > > > > > if (regs) > > > start_backtrace(, regs->regs[29], regs->pc); > > > - else if (task == current) > > > + else if (task == current) { > > > + ((struct stacktrace_cookie *)cookie)->skip += 2; > > > start_backtrace(, > > > (unsigned long)__builtin_frame_address(0), > > > (unsigned long)arch_stack_walk); > > > - else > > > + } else > > > start_backtrace(, thread_saved_fp(task), > > > thread_saved_pc(task)); > > > > I don't like abusing the cookie here. It's void * as it's meant to be an > > opaque type. I'd rather skip the first two frames in walk_stackframe() > > instead before invoking fn(). > > I agree that we shouldn't touch cookie here. > > I don't think that it's right to bodge this inside walk_stackframe(), > since that'll add bogus skipping for the case starting with regs in the > current task. If we need a bodge, it has to live in arch_stack_walk() > where we set up the initial unwinding state. Good point. However, instead of relying on __builtin_frame_address(1), can we add a 'skip' value to struct stackframe via arch_stack_walk() -> start_backtrace() that is consumed by walk_stackframe()? > In another thread, we came to the conclusion that arch_stack_walk() > should start at its parent, and its parent should add any skipping it > requires. This makes sense. > Currently, arch_stack_walk() is off-by-one, and we can bodge that by > using __builtin_frame_address(1), though I'm waiting for some compiler > folk to confirm that's sound. Otherwise we need to add an assembly > trampoline to snapshot the FP, which is unfortunastely convoluted. > > This report suggests that a caller of arch_stack_walk() is off-by-one > too, which suggests a larger cross-architecture semantic issue. I'll try > to take a look tomorrow. I don't think the caller is off by one, at least not by the final skip value. __set_page_owner() wants the trace to start at its caller. The callee save_stack() in the same file adds a skip of 2. save_stack_trace() increments the skip before invoking arch_stack_walk(). So far, this assumes that arch_stack_walk() starts at its parent, i.e. save_stack_trace(). So save_stack_trace() only need to skip 1 and I think that's in line with the original report where the entries[0] is __set_page_owner(). We only need to skip one. Another untested quick hack (we should probably add the skip argument to start_backtrace()): diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h index eb29b1fe8255..0d32d932ac89 100644 --- a/arch/arm64/include/asm/stacktrace.h +++ b/arch/arm64/include/asm/stacktrace.h @@ -56,6 +56,7 @@ struct stackframe { DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES); unsigned long prev_fp; enum stack_type prev_type; + int skip; #ifdef CONFIG_FUNCTION_GRAPH_TRACER int graph; #endif @@ -153,6 +154,7 @@ static inline void start_backtrace(struct stackframe *frame, { frame->fp = fp; frame->pc = pc; + frame->skip = 0; #ifdef CONFIG_FUNCTION_GRAPH_TRACER frame->graph = 0; #endif diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index ad20981dfda4..a89b2ecbf3de 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -118,7 +118,9 @@ void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame, while (1) { int ret; - if (!fn(data, frame->pc)) + if (frame->skip > 0) + frame->skip--; +
Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
在 2021/3/18 11:31, chenjun (AM) 写道: > 在 2021/3/18 3:34, Mark Rutland 写道: >> On Wed, Mar 17, 2021 at 06:36:36PM +, Catalin Marinas wrote: >>> On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same stack: stack_trace_save+0x4c/0x78 register_early_stack+0x34/0x70 init_page_owner+0x34/0x230 page_ext_init+0x1bc/0x1dc The reason is that: check_recursive_alloc always return 1 because that entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). The root cause is that: commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") make the save_trace save 2 more entries. Add skip in arch_stack_walk when task == current. Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") Signed-off-by: Chen Jun --- arch/arm64/kernel/stacktrace.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index ad20981..c26b0ac 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie, if (regs) start_backtrace(, regs->regs[29], regs->pc); - else if (task == current) + else if (task == current) { + ((struct stacktrace_cookie *)cookie)->skip += 2; start_backtrace(, (unsigned long)__builtin_frame_address(0), (unsigned long)arch_stack_walk); - else + } else start_backtrace(, thread_saved_fp(task), thread_saved_pc(task)); >>> >>> I don't like abusing the cookie here. It's void * as it's meant to be an >>> opaque type. I'd rather skip the first two frames in walk_stackframe() >>> instead before invoking fn(). >> >> I agree that we shouldn't touch cookie here. >> >> I don't think that it's right to bodge this inside walk_stackframe(), >> since that'll add bogus skipping for the case starting with regs in the >> current task. If we need a bodge, it has to live in arch_stack_walk() >> where we set up the initial unwinding state. >> >> In another thread, we came to the conclusion that arch_stack_walk() >> should start at its parent, and its parent should add any skipping it >> requires. >> >> Currently, arch_stack_walk() is off-by-one, and we can bodge that by >> using __builtin_frame_address(1), though I'm waiting for some compiler >> folk to confirm that's sound. Otherwise we need to add an assembly >> trampoline to snapshot the FP, which is unfortunastely convoluted. >> >> This report suggests that a caller of arch_stack_walk() is off-by-one >> too, which suggests a larger cross-architecture semantic issue. I'll try >> to take a look tomorrow. >> >> Thanks, >> Mark. >> >>> >>> Prior to the conversion to ARCH_STACKWALK, we were indeed skipping two >>> more entries in __save_stack_trace() if tsk == current. Something like >>> below, completely untested: >>> >>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c >>> index ad20981dfda4..2a9f759aa41a 100644 >>> --- a/arch/arm64/kernel/stacktrace.c >>> +++ b/arch/arm64/kernel/stacktrace.c >>> @@ -115,10 +115,15 @@ NOKPROBE_SYMBOL(unwind_frame); >>>void notrace walk_stackframe(struct task_struct *tsk, struct stackframe >>> *frame, >>> bool (*fn)(void *, unsigned long), void *data) >>>{ >>> + /* for the current task, we don't want this function nor its caller */ >>> + int skip = tsk == current ? 2 : 0; >>> + >>> while (1) { >>> int ret; >>> >>> - if (!fn(data, frame->pc)) >>> + if (skip) >>> + skip--; >>> + else if (!fn(data, frame->pc)) >>> break; >>> ret = unwind_frame(tsk, frame); >>> if (ret < 0) >>> >>> >>> -- >>> Catalin >> > > This change will make kmemleak broken. > Maybe the reason is what Mark pointed out. I will try to check out. > I make a mistake. kmemleak seems to work good. I will do more tests. -- Regards Chen Jun
Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
在 2021/3/18 3:34, Mark Rutland 写道: > On Wed, Mar 17, 2021 at 06:36:36PM +, Catalin Marinas wrote: >> On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: >>> On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same >>> stack: >>> stack_trace_save+0x4c/0x78 >>> register_early_stack+0x34/0x70 >>> init_page_owner+0x34/0x230 >>> page_ext_init+0x1bc/0x1dc >>> >>> The reason is that: >>> check_recursive_alloc always return 1 because that >>> entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). >>> >>> The root cause is that: >>> commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") >>> make the save_trace save 2 more entries. >>> >>> Add skip in arch_stack_walk when task == current. >>> >>> Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") >>> Signed-off-by: Chen Jun >>> --- >>> arch/arm64/kernel/stacktrace.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c >>> index ad20981..c26b0ac 100644 >>> --- a/arch/arm64/kernel/stacktrace.c >>> +++ b/arch/arm64/kernel/stacktrace.c >>> @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn >>> consume_entry, void *cookie, >>> >>> if (regs) >>> start_backtrace(, regs->regs[29], regs->pc); >>> - else if (task == current) >>> + else if (task == current) { >>> + ((struct stacktrace_cookie *)cookie)->skip += 2; >>> start_backtrace(, >>> (unsigned long)__builtin_frame_address(0), >>> (unsigned long)arch_stack_walk); >>> - else >>> + } else >>> start_backtrace(, thread_saved_fp(task), >>> thread_saved_pc(task)); >> >> I don't like abusing the cookie here. It's void * as it's meant to be an >> opaque type. I'd rather skip the first two frames in walk_stackframe() >> instead before invoking fn(). > > I agree that we shouldn't touch cookie here. > > I don't think that it's right to bodge this inside walk_stackframe(), > since that'll add bogus skipping for the case starting with regs in the > current task. If we need a bodge, it has to live in arch_stack_walk() > where we set up the initial unwinding state. > > In another thread, we came to the conclusion that arch_stack_walk() > should start at its parent, and its parent should add any skipping it > requires. > > Currently, arch_stack_walk() is off-by-one, and we can bodge that by > using __builtin_frame_address(1), though I'm waiting for some compiler > folk to confirm that's sound. Otherwise we need to add an assembly > trampoline to snapshot the FP, which is unfortunastely convoluted. > > This report suggests that a caller of arch_stack_walk() is off-by-one > too, which suggests a larger cross-architecture semantic issue. I'll try > to take a look tomorrow. > > Thanks, > Mark. > >> >> Prior to the conversion to ARCH_STACKWALK, we were indeed skipping two >> more entries in __save_stack_trace() if tsk == current. Something like >> below, completely untested: >> >> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c >> index ad20981dfda4..2a9f759aa41a 100644 >> --- a/arch/arm64/kernel/stacktrace.c >> +++ b/arch/arm64/kernel/stacktrace.c >> @@ -115,10 +115,15 @@ NOKPROBE_SYMBOL(unwind_frame); >> void notrace walk_stackframe(struct task_struct *tsk, struct stackframe >> *frame, >> bool (*fn)(void *, unsigned long), void *data) >> { >> +/* for the current task, we don't want this function nor its caller */ >> +int skip = tsk == current ? 2 : 0; >> + >> while (1) { >> int ret; >> >> -if (!fn(data, frame->pc)) >> +if (skip) >> +skip--; >> +else if (!fn(data, frame->pc)) >> break; >> ret = unwind_frame(tsk, frame); >> if (ret < 0) >> >> >> -- >> Catalin > This change will make kmemleak broken. Maybe the reason is what Mark pointed out. I will try to check out. -- Regards Chen Jun
Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
On Wed, Mar 17, 2021 at 06:36:36PM +, Catalin Marinas wrote: > On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: > > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same > > stack: > > stack_trace_save+0x4c/0x78 > > register_early_stack+0x34/0x70 > > init_page_owner+0x34/0x230 > > page_ext_init+0x1bc/0x1dc > > > > The reason is that: > > check_recursive_alloc always return 1 because that > > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). > > > > The root cause is that: > > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > make the save_trace save 2 more entries. > > > > Add skip in arch_stack_walk when task == current. > > > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > > Signed-off-by: Chen Jun > > --- > > arch/arm64/kernel/stacktrace.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > > index ad20981..c26b0ac 100644 > > --- a/arch/arm64/kernel/stacktrace.c > > +++ b/arch/arm64/kernel/stacktrace.c > > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn > > consume_entry, void *cookie, > > > > if (regs) > > start_backtrace(, regs->regs[29], regs->pc); > > - else if (task == current) > > + else if (task == current) { > > + ((struct stacktrace_cookie *)cookie)->skip += 2; > > start_backtrace(, > > (unsigned long)__builtin_frame_address(0), > > (unsigned long)arch_stack_walk); > > - else > > + } else > > start_backtrace(, thread_saved_fp(task), > > thread_saved_pc(task)); > > I don't like abusing the cookie here. It's void * as it's meant to be an > opaque type. I'd rather skip the first two frames in walk_stackframe() > instead before invoking fn(). I agree that we shouldn't touch cookie here. I don't think that it's right to bodge this inside walk_stackframe(), since that'll add bogus skipping for the case starting with regs in the current task. If we need a bodge, it has to live in arch_stack_walk() where we set up the initial unwinding state. In another thread, we came to the conclusion that arch_stack_walk() should start at its parent, and its parent should add any skipping it requires. Currently, arch_stack_walk() is off-by-one, and we can bodge that by using __builtin_frame_address(1), though I'm waiting for some compiler folk to confirm that's sound. Otherwise we need to add an assembly trampoline to snapshot the FP, which is unfortunastely convoluted. This report suggests that a caller of arch_stack_walk() is off-by-one too, which suggests a larger cross-architecture semantic issue. I'll try to take a look tomorrow. Thanks, Mark. > > Prior to the conversion to ARCH_STACKWALK, we were indeed skipping two > more entries in __save_stack_trace() if tsk == current. Something like > below, completely untested: > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index ad20981dfda4..2a9f759aa41a 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -115,10 +115,15 @@ NOKPROBE_SYMBOL(unwind_frame); > void notrace walk_stackframe(struct task_struct *tsk, struct stackframe > *frame, >bool (*fn)(void *, unsigned long), void *data) > { > + /* for the current task, we don't want this function nor its caller */ > + int skip = tsk == current ? 2 : 0; > + > while (1) { > int ret; > > - if (!fn(data, frame->pc)) > + if (skip) > + skip--; > + else if (!fn(data, frame->pc)) > break; > ret = unwind_frame(tsk, frame); > if (ret < 0) > > > -- > Catalin
Re: [PATCH 2/2] arm64: stacktrace: Add skip when task == current
On Wed, Mar 17, 2021 at 02:20:50PM +, Chen Jun wrote: > On ARM64, cat /sys/kernel/debug/page_owner, all pages return the same > stack: > stack_trace_save+0x4c/0x78 > register_early_stack+0x34/0x70 > init_page_owner+0x34/0x230 > page_ext_init+0x1bc/0x1dc > > The reason is that: > check_recursive_alloc always return 1 because that > entries[0] is always equal to ip (__set_page_owner+0x3c/0x60). > > The root cause is that: > commit 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > make the save_trace save 2 more entries. > > Add skip in arch_stack_walk when task == current. > > Fixes: 5fc57df2f6fd ("arm64: stacktrace: Convert to ARCH_STACKWALK") > Signed-off-by: Chen Jun > --- > arch/arm64/kernel/stacktrace.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c > index ad20981..c26b0ac 100644 > --- a/arch/arm64/kernel/stacktrace.c > +++ b/arch/arm64/kernel/stacktrace.c > @@ -201,11 +201,12 @@ void arch_stack_walk(stack_trace_consume_fn > consume_entry, void *cookie, > > if (regs) > start_backtrace(, regs->regs[29], regs->pc); > - else if (task == current) > + else if (task == current) { > + ((struct stacktrace_cookie *)cookie)->skip += 2; > start_backtrace(, > (unsigned long)__builtin_frame_address(0), > (unsigned long)arch_stack_walk); > - else > + } else > start_backtrace(, thread_saved_fp(task), > thread_saved_pc(task)); I don't like abusing the cookie here. It's void * as it's meant to be an opaque type. I'd rather skip the first two frames in walk_stackframe() instead before invoking fn(). Prior to the conversion to ARCH_STACKWALK, we were indeed skipping two more entries in __save_stack_trace() if tsk == current. Something like below, completely untested: diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c index ad20981dfda4..2a9f759aa41a 100644 --- a/arch/arm64/kernel/stacktrace.c +++ b/arch/arm64/kernel/stacktrace.c @@ -115,10 +115,15 @@ NOKPROBE_SYMBOL(unwind_frame); void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame, bool (*fn)(void *, unsigned long), void *data) { + /* for the current task, we don't want this function nor its caller */ + int skip = tsk == current ? 2 : 0; + while (1) { int ret; - if (!fn(data, frame->pc)) + if (skip) + skip--; + else if (!fn(data, frame->pc)) break; ret = unwind_frame(tsk, frame); if (ret < 0) -- Catalin