[ARM64] Printing IRQ stack usage information
Hi All, I have a requirement in printing irq stack usage information for each cpu. I am using the following: Linux Kernel: 4.9.x Board: hikey620 (arm64, armv8 with 8 cores) Platform: Ubuntu 14 debian As a vague idea, I tried to implement like this: - static int dump_irq_stack_info(void) { int cpu, actual; unsigned long irq_stack_ptr; //unsigned long sp, sp1; unsigned long stack_start; unsigned long used; actual = IRQ_STACK_SIZE; used = 0; pr_info("CPU UNUSED-STACK ACTUAL-STACK\n"); //seq_printf(m, "CPU UNUSED-STACK ACTUAL-STACK\n"); for_each_present_cpu(cpu) { irq_stack_ptr = IRQ_STACK_PTR(cpu); //sp1 = current_stack_pointer; //sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr); stack_start = (unsigned long)per_cpu(irq_stack, cpu); used = irq_stack_ptr - stack_start; pr_info("%2d %10lu %10d\n", cpu, used, actual); //seq_printf(m, "%2d %10lu %10d\n", cpu, used, actual); } return 0; } - Currently, when I tested this (as a proc interface), I got the below output: CPUUNUSED-STACKACTUAL-STACK 0 16368 16384 1 16368 16384 2 16368 16384 3 16368 16384 4 16368 16384 5 16368 16384 6 16368 16384 7 16368 16384 - But, I have some doubts: 1) I am not sure if my implementation is in right direction, or the logic is totally wrong. 2) Is there better way to perform the similar objective? 3) How should I test it to get the different usage values for unused stack ? Can I get these values by implementing a sample interrupt handler, and printing information from there? If this works, then I want to use it as part of dump_backtrace() function. If anybody have done a similar thing in past, or have some ideas to share, please help me. Note: This is required only for our internal debugging purpose. Regards, Pintu ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Thu, 15 Nov 2018 18:52:39 +0530, Pintu Agarwal said: > Currently, when I tested this (as a proc interface), I got the below output: > CPUUNUSED-STACKACTUAL-STACK > 0 16368 16384 > 3) How should I test it to get the different usage values for unused stack ? > Can I get these values by implementing a sample interrupt handler, > and printing information from there? Hint 1: If you're in a state where seq_printf() is legal, how many IRQ's are on this processor's IRQ stack? Hint 2: What are the chances that some other CPU is currently in an IRQ? (run 'top' and look for what percent of time that's happening) Hint 3: what are the chances that the value of irq_stack_ptr is already stale by the time seq_printf() finishes running? Hint 4: what happens to the validity of your output if you get rescheduled in the middle of that for_each loop? (In other words, this code is terribly racy and is probably not going to answer whatever debugging question you were working on.. If your question is "Did one of the CPUs blow out its IRQ stack (or come close to doing so)?" there's better approaches. pgpgnND87yFdu.pgp Description: PGP signature ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Thu, 15 Nov 2018, 10:19 pm > On Thu, 15 Nov 2018 18:52:39 +0530, Pintu Agarwal said: > > > Currently, when I tested this (as a proc interface), I got the below output: > > CPUUNUSED-STACKACTUAL-STACK > > 0 16368 16384 > > > 3) How should I test it to get the different usage values for unused stack ? > > Can I get these values by implementing a sample interrupt handler, > > and printing information from there? > > Hint 1: If you're in a state where seq_printf() is legal, how many IRQ's are > on this processor's IRQ stack? > > Hint 2: What are the chances that some other CPU is currently in an IRQ? > (run 'top' and look for what percent of time that's happening) > > Hint 3: what are the chances that the value of irq_stack_ptr is already stale > by the time seq_printf() finishes running? > > Hint 4: what happens to the validity of your output if you get rescheduled > in the middle of that for_each loop? > > (In other words, this code is terribly racy and is probably not going to > answer > whatever debugging question you were working on.. Okay. Thanks so much for your hints. Yes, I understand that this code is horribly ugly and bad. But this is only to understand if the below logic is fine to get the irq stack usage: {{{ for_each_present_cpu(cpu) { irq_stack_ptr = IRQ_STACK_PTR(cpu); //unsigned long sp = current_stack_pointer; stack_start = (unsigned long)per_cpu(irq_stack, cpu); free_stack = irq_stack_ptr - stack_start; seq_printf(m, "%2d %10lu %10d\n", cpu, free_stack, actual); } }}} Of course, final plan is not the proc entry, but to find a relevant place to invoke it, probably during boot time, or during backtrace. > If your question is "Did one > of the CPUs blow out its IRQ stack (or come close to doing so)?" there's > better > approaches. > Yes, exactly, this is what the main intention. If you have any better idea about this approach, please refer me. It will be of great help. Thank You! > ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Fri, 16 Nov 2018 11:44:36 +0530, Pintu Agarwal said: > > If your question is "Did one > > of the CPUs blow out its IRQ stack (or come close to doing so)?" there's > > better > > approaches. > > > Yes, exactly, this is what the main intention. > If you have any better idea about this approach, please refer me. > It will be of great help. Look at the code controlled by '#ifdef CONFIG_DEBUG_STACK_USAGE' which does the same thing for process stacks, or CONFIG_SCHED_STACK_END_CHECK or the use of guard pages for detecting stack overrun ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Fri, Nov 16, 2018 at 5:03 PM wrote: > > On Fri, 16 Nov 2018 11:44:36 +0530, Pintu Agarwal said: > > > > If your question is "Did one > > > of the CPUs blow out its IRQ stack (or come close to doing so)?" there's > > > better > > > approaches. > > > > > Yes, exactly, this is what the main intention. > > If you have any better idea about this approach, please refer me. > > It will be of great help. > > Look at the code controlled by '#ifdef CONFIG_DEBUG_STACK_USAGE' > which does the same thing for process stacks, or CONFIG_SCHED_STACK_END_CHECK > or the use of guard pages for detecting stack overrun Hi, Thank you so much for your reference. Yes, I have already gone through the process stack usage, which I found slightly different. However, I will go through it in more detail, and see if I can gain some ideas from there. I found a similar irq_stack_usage implementation in parisc architecture: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/parisc/kernel/irq.c?h=v4.19.1 I have also gone through the unwind_frame() part in arch/arm64/stacktrace.c: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/kernel/stacktrace.c?h=v4.9.137 By referring to these, I tried to make a similar approach for arm64: I created a new function: dump_irq_stack_info() [arch/arm64/kernel/traps.c], and called it as part of show_stack(). This is the experimental patch I created. Note: This is just for my experiment purpose. I know this is ugly and in very bad shape right now. It is only to get some idea about irq stack usage. diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index 11e5eae..6ac855d 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -214,9 +214,39 @@ static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk) } } +void dump_irq_stack_info(void) +{ +int cpu, actual; +unsigned long irq_stack_ptr; +unsigned long stack_start; +unsigned long free_stack; + +actual = IRQ_STACK_SIZE; +free_stack = 0; +pr_info("CPU UNUSED-STACK ACTUAL-STACK\n"); + + for_each_present_cpu(cpu) { + unsigned long sp; + irq_stack_ptr = IRQ_STACK_PTR(cpu); + sp = current_stack_pointer; + //sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr); + stack_start = (unsigned long)per_cpu(irq_stack, cpu); + if (on_irq_stack(sp, cpu)) { + pr_info("cpu:%d : sp: on irq_stack\n", cpu); + free_stack = sp - stack_start; + } else { + free_stack = irq_stack_ptr - stack_start; + } + pr_info("%2d %10lu %10d\n", cpu, free_stack, actual); + } +} + void show_stack(struct task_struct *tsk, unsigned long *sp) { dump_backtrace(NULL, tsk); + dump_irq_stack_info(); barrier(); } Then, I developed a sample kernel module for timer handler (timerirq.c) and called the dump_stack() function from inside my timer interrupt handler. The dump_stack() will internally call show_stack(), which will then call our function: dump_irq_stack_info(). /* From interrupt context */ static void my_timer_irq_handler(unsigned long ptr) { int i; unsigned long flags; if (in_interrupt()) { pr_info("[timerirq]: %s: in interrupt context, count: %d\n", __func__, count); spin_lock_irqsave(&mylock, flags); + dump_stack(); spin_unlock_irqrestore(&mylock, flags); } else { /* This is not needed here*/ } tasklet_schedule(&my_tasklet); } OUTPUT: With this, I got the below output as part of dump_stack() and backtrace: [ 43.267923] CPU UNUSED-STACK ACTUAL-STACK [ 43.271925] 0 16368 16384 [ 43.275493] 1 16368 16384 [ 43.279061] 2 16368 16384 [ 43.282628] cpu:3 : sp: on irq_stack [ 43.286195] 3 15616 16384 [ 43.289762] 4 16368 16384 [ 43.293330] 5 16368 16384 [ 43.296898] 6 16368 16384 [ 43.300465] 7 16368 16384 So, I observed that my interrupt handler was executed by cpu3, and it's irq_stack usage is shown: 3 15616 16384 With this information, I can know that which interrupt handler is using how much irq_stack ? Is this approach valid ? Or still there is much better way to dump the information ? For example: is it possible to keep storing the irq_stack_usage (for each cpu in a variable) information from boot time, and then use this variable to dump the irq_stack information, after the system booted, may be from proc entry ? Thanks, Pintu ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Fri, 16 Nov 2018 20:10:28 +0530, Pintu Agarwal said: > > Look at the code controlled by '#ifdef CONFIG_DEBUG_STACK_USAGE' > For example: is it possible to keep storing the irq_stack_usage (for > each cpu in a variable) information from boot time, and then use this > variable to dump the irq_stack information, after the system booted, > may be from proc entry ? Congrats. You just re-invented DEBUG_STACK_USAGE, which just keeps a high-water mark for stack usage. pgpWrHR9Ets3M.pgp Description: PGP signature ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Fri, Nov 16, 2018 at 10:16 PM wrote: > > On Fri, 16 Nov 2018 20:10:28 +0530, Pintu Agarwal said: > > > > Look at the code controlled by '#ifdef CONFIG_DEBUG_STACK_USAGE' > > > For example: is it possible to keep storing the irq_stack_usage (for > > each cpu in a variable) information from boot time, and then use this > > variable to dump the irq_stack information, after the system booted, > > may be from proc entry ? > > Congrats. You just re-invented DEBUG_STACK_USAGE, which just keeps a > high-water mark > for stack usage. So, you mean to say, my implementation is good enough to get the irq_stack usage, from the interrupt handler ? But my concern is that if I dump it from irq handler, I will get information only for the current cpu. How do I store and get the information for all the cpu from the boot time ? >From where do I call my dump_irq_stack_info() [some where during the entry/exit part of the irq handler], so that I could dump information for all the handler at boot time itself ? Like I would to capture these information: - What was the name of the handler ? - Which cpu was executing it ? - How much irq stack (max value, same like high water mark) were used at that time ? Where can I collect these information from the boot time ? Thanks, Pintu ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Fri, 16 Nov 2018 23:13:48 +0530, Pintu Agarwal said: > On Fri, Nov 16, 2018 at 10:16 PM wrote: > > Congrats. You just re-invented DEBUG_STACK_USAGE, which just keeps a > > high-water mark > > for stack usage. > > So, you mean to say, my implementation is good enough to get the > irq_stack usage, from the interrupt handler ? No - your code doesn't keep a high-water mark (which should probably be hooked into the IRQ exit code. > But my concern is that if I dump it from irq handler, I will get > information only for the current cpu. > How do I store and get the information for all the cpu from the boot time ? Make the high-water mark a per-cpu variable. > From where do I call my dump_irq_stack_info() [some where during the > entry/exit part of the irq handler], so that I could dump information > for all the handler at boot time itself ? No, you don't do a dump-stack during entry/exit. You just maintain a high-water value in the exit, and then you create a /proc/something or similar that when read does a 'foreach CPU do print_high_water_irq'. > Like I would to capture these information: > - What was the name of the handler ? > - Which cpu was executing it ? > - How much irq stack (max value, same like high water mark) were used > at that time ? First, do the easy part and find out if you even *care* once you see actual numbers. If your IRQ stack is 8K but you never use more than 2500 bytes, do you *really* care about the name of the handler anymore? Also, see the code for /proc/interrupts to see how it keeps track of the interrupts per CPU - maybe all you need to do is change each entry from a 'count' to 'count, highwater'. pgp_W8ryM0fu5.pgp Description: PGP signature ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Sat, Nov 17, 2018 at 12:02 AM wrote: > > On Fri, 16 Nov 2018 23:13:48 +0530, Pintu Agarwal said: > > On Fri, Nov 16, 2018 at 10:16 PM wrote: > > > > Congrats. You just re-invented DEBUG_STACK_USAGE, which just keeps a > > > high-water mark > > > for stack usage. > > > > So, you mean to say, my implementation is good enough to get the > > irq_stack usage, from the interrupt handler ? > > No - your code doesn't keep a high-water mark (which should probably be > hooked into the IRQ exit code. > > > But my concern is that if I dump it from irq handler, I will get > > information only for the current cpu. > > How do I store and get the information for all the cpu from the boot time ? > > Make the high-water mark a per-cpu variable. > > > From where do I call my dump_irq_stack_info() [some where during the > > entry/exit part of the irq handler], so that I could dump information > > for all the handler at boot time itself ? > > No, you don't do a dump-stack during entry/exit. You just maintain a > high-water > value in the exit, Which is the right place to keep track of this high-water-irq-stack-usage (per_cpu) in arch/arm64/* ? > and then you create a /proc/something or similar that when > read does a 'foreach CPU do print_high_water_irq'. > Ok got it. > > Like I would to capture these information: > > - What was the name of the handler ? > > - Which cpu was executing it ? > > - How much irq stack (max value, same like high water mark) were used > > at that time ? > > First, do the easy part and find out if you even *care* once you see actual > numbers. If your IRQ stack is 8K but you never use more than 2500 bytes, > do you *really* care about the name of the handler anymore? > Hmm, yes, getting the name of the handler is not so important in the first run. > Also, see the code for /proc/interrupts to see how it keeps track of the > interrupts per CPU - maybe all you need to do is change each entry from > a 'count' to 'count, highwater'. Ok thanks, thats a good pointer. ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Sat, Nov 17, 2018 at 6:36 PM Pintu Agarwal wrote: > > On Sat, Nov 17, 2018 at 12:02 AM wrote: > > > > > But my concern is that if I dump it from irq handler, I will get > > > information only for the current cpu. > > > How do I store and get the information for all the cpu from the boot time > > > ? > > > > Make the high-water mark a per-cpu variable. > > > > > From where do I call my dump_irq_stack_info() [some where during the > > > entry/exit part of the irq handler], so that I could dump information > > > for all the handler at boot time itself ? > > > > No, you don't do a dump-stack during entry/exit. You just maintain a > > high-water > > value in the exit, > > Which is the right place to keep track of this > high-water-irq-stack-usage (per_cpu) > in arch/arm64/* ? > I tried to create a per-cpu irq_stack_usage variable like this in : arch/arm64/include/asm/hardirq.h +DECLARE_PER_CPU(unsigned int, irq_stack_usage); But, I could not figure out, from where to fill these variable for irq stack usage. + sp = current_stack_pointer; + if (on_irq_stack(sp, cpu)) { + stack_start = (unsigned long)per_cpu(irq_stack, cpu); + last_usage = per_cpu(irq_stack_usage, cpu); + curr_usage = sp - stack_start; + pr_info("cpu:%d : sp: %lu, stack_start: %lu, usage: %lu\n", cpu, sp, stack_start, (sp - stack_start)); + if (curr_usage > last_usage) + per_cpu(irq_stack_usage, cpu) = curr_usage; + } Which is the best place to invoke this ? I have the following option: 1. kernel/softirq.c => __do_softirq() 2. arch/arm64/kernel/smp.c => handle_IPI() 3. kernel/softirq.c => irq_exit() 4. ??? Please let me know. Thank You! Pintu ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: [ARM64] Printing IRQ stack usage information
On Tue, 20 Nov 2018 18:21:33 +0530, Pintu Agarwal said: > + sp = current_stack_pointer; > + if (on_irq_stack(sp, cpu)) { > + stack_start = (unsigned long)per_cpu(irq_stack, cpu); > + last_usage = per_cpu(irq_stack_usage, cpu); > + curr_usage = sp - stack_start; > + pr_info("cpu:%d : sp: %lu, stack_start: %lu, usage: %lu\n", cpu, > sp, stack_start, (sp - stack_start)); > + if (curr_usage > last_usage) > + per_cpu(irq_stack_usage, cpu) = curr_usage; > + } This code only works if called from the function that uses the deepest amount of stack, because curr_usage depends on the current stack pointer. If some other routine went 2,934 bytes further into the stack and then returned, you'll never know. What you want instead is a way to see what the deepest point reached during the current call, and compare that to the previous low. How to detect the used stack? Go look at how the code currently detects how deep the kernel stacks reached. (Hint 1: look at check_stack_usage() in kernel/exit.c and stack_not_used(). Hint 2: what other support code is implied by the implementation of stack_not used()? > Which is the best place to invoke this ? > I have the following option: > 1. kernel/softirq.c => __do_softirq() > 2. arch/arm64/kernel/smp.c => handle_IPI() > 3. kernel/softirq.c => irq_exit() > 4. ??? Pondering the above, what other implementations might work? Hint: if you have a similar irq_stack_not_used() function, do you need any additional code in *any* of those 4 places? (And yes, I could just splat out the needed code - but you'd learn nowhere near as much that way.. :) pgpF7C47lQyPl.pgp Description: PGP signature ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies