Hi,

On Mon, Feb 4, 2019 at 5:12 AM Mark Rutland <mark.rutl...@arm.com> wrote:
>
> On Fri, Feb 01, 2019 at 01:38:05PM -0800, Doug Anderson wrote:
> > Hi,
>
> Hi Doug,
>
> > I was wondering if anyone out there has given any thought to
> > annotating the ARM64 IRQ handling in such a way that we could stack
> > crawl past el1_irq() when in gdb.
> >
> > I spent a bit of time on this a few months ago and documented all my
> > findings in:
> >
> > https://bugs.chromium.org/p/chromium/issues/detail?id=908721
>
> There, the error from GDB is:
>
>     Backtrace stopped: previous frame identical to this frame (corrupt
>     stack?)
>
> ... is that misleading?
>
> ... or do we have some duplicate stack frame that we somehow skip in
> the kernel unwinder?

If I had to guess, I'd say that when gdb doesn't see a frame it
recognizes, it just returns the previous one, which causes it to
stop.  I don't think gdb falls back to just looking at the link
register, because it needs more than that to unwind a frame.
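
To make that guess concrete, here is a toy model of it in Python.  This
is only a sketch of my hypothesis, not gdb's actual code: the frame
representation, the KNOWN_CALLERS table, and the function names are all
made up for illustration.  The only thing it shows is that an unwinder
which gets the same frame back for a pc it can't describe would stop
with exactly this kind of message.

```python
from typing import Dict, List, Tuple

# A frame is modeled as (pc, sp). Everything here is hypothetical.
Frame = Tuple[int, int]

# Frames for which the toy unwinder "has unwind info": frame -> caller.
KNOWN_CALLERS: Dict[Frame, Frame] = {
    (0x1000, 0x100): (0x2000, 0x140),  # C handler -> asm entry stub
}


def caller_of(frame: Frame) -> Frame:
    # The guess being modeled: with no usable unwind info for this pc,
    # the "caller" comes back as the very same frame.
    return KNOWN_CALLERS.get(frame, frame)


def unwind(start: Frame, limit: int = 64) -> Tuple[List[Frame], str]:
    """Walk callers from start until the walk stops making progress."""
    frames = [start]
    reason = "frame limit reached"
    while len(frames) < limit:
        this = frames[-1]
        prev = caller_of(this)
        if prev == this:
            # No progress: report it the way the backtrace above does.
            reason = "previous frame identical to this frame"
            break
        frames.append(prev)
    return frames, reason


frames, reason = unwind((0x1000, 0x100))
for depth, (pc, sp) in enumerate(frames):
    print(f"#{depth} pc={pc:#x} sp={sp:#x}")
print("Backtrace stopped:", reason)
```

In this model the walk gets from the handler into the entry stub and
then stops there, because the stub has no entry in the table.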


> > I can copy and paste all the discussion from that bug here, but since
> > it's public hopefully folks can read the discussion / investigation
> > there.  To put it briefly, though: I can stack crawl past "el1_irq"
> > with the normal linux stack crawl (which is what kdb uses) but I can't
> > crawl past "el1_irq" in gdb.  After talking to some of our tools
> > guys here I'm fairly certain that we could solve this with the right
> > CFI directives, but when I poked at it I wasn't able to figure out the
> > magic.
>
> AFAICT, we don't know why GDB is terminating early. Could we please
> figure that out first? e.g. by looking for the above message in the GDB
> sources.
>
> If we do need CFI annotations, I'd rather move that entry code to C
> first, to minimize how painful that is. I have an ongoing project [1] to
> do just that...
>
> Thanks,
> Mark.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/entry-deasm

OK, I tried this.  It _changes_ the behavior but doesn't magically get
me a full crawl.  If something like this is likely to merge to
mainline before too long, then it makes sense to spend the time
debugging it instead of the old code...

---

Vanilla v5.0-rc6 on kevin:

#13 0xffffff801013e08c in generic_handle_irq_desc
    (desc=0x1)
    at .../include/linux/irqdesc.h:154
#14 generic_handle_irq
    (irq=<optimized out>)
    at .../kernel/irq/irqdesc.c:628
#15 0xffffff801013e110 in __handle_domain_irq
    (domain=0xffffffc000211880, hwirq=<optimized out>,
     lookup=<optimized out>, regs=0xffffff8011003ce0)
    at .../kernel/irq/irqdesc.c:665
#16 0xffffff8010081124 in handle_domain_irq
    (domain=0x1, hwirq=<optimized out>, regs=<optimized out>)
    at .../include/linux/irqdesc.h:172
#17 gic_handle_irq (regs=0xffffff8011003ce0)
    at .../drivers/irqchip/irq-gic-v3.c:367
#18 0xffffff8010082bf4 in el1_irq ()
    at .../arch/arm64/kernel/entry.S:609
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

---

Vanilla v5.0-rc6 + your patches on kevin:

#13 0xffffff801013e3cc in generic_handle_irq_desc
    (desc=0x1)
    at .../include/linux/irqdesc.h:154
#14 generic_handle_irq
    (irq=<optimized out>)
    at .../kernel/irq/irqdesc.c:628
#15 0xffffff801013e450 in __handle_domain_irq
    (domain=0xffffffc000211880, hwirq=<optimized out>,
     lookup=<optimized out>, regs=0xffffff8011003ce0)
    at .../kernel/irq/irqdesc.c:665
#16 0xffffff80100810c4 in handle_domain_irq
    (domain=0x1, hwirq=<optimized out>, regs=<optimized out>)
    at .../include/linux/irqdesc.h:172
#17 gic_handle_irq
    (regs=0xffffff8011003ce0)
    at .../drivers/irqchip/irq-gic-v3.c:367
#18 0xffffff8010084fd0 in call_on_stack
    ()
    at .../arch/arm64/kernel/entry.S:718
Backtrace stopped: Cannot access memory at address 0xffffff8010004008


-Doug

