Re: Broken stack traces on crashed programs

2020-11-18 Thread Ludovic Courtès
Hi,

Samuel Thibault wrote:

> Ludovic Courtès, on Tue. 17 Nov. 2020 14:55:32 +0100, wrote:
>> Samuel Thibault wrote:
>> 
>> > Ludovic Courtès, on Tue. 17 Nov. 2020 10:57:43 +0100, wrote:
>> >> I’ve noticed that I’d always get “broken” stack traces in GDB when (1)
>> >> attaching to a program suspended by /servers/crash-suspend, (2)
>> >> examining a core dump, or (3) spawning a program in GDB and examining it
>> >> after it’s received an unhandled signal like SIGILL.
>
> Ah, in all cases you mean when receiving an unhandled signal?

Yes.

> I see that happen indeed.

Ah ha!

With a native GDB build¹ I observe the same phenomenon.

>> I get pretty back traces until the program gets an unhandled signal,
>> AFAICT.  This makes me think it could have something to do with how GDB
>> obtains thread state info for suspended threads.
>
> Probably, yes.

I looked around in gdb/*gnu-nat.c though I’m not quite sure what to look
for.

‘set debug gnu-nat on’ gives this right before the inferior, started
from GDB, gets an unhandled SIGILL:

--8<---cut here---start->8---
../../gdb-9.2/gdb/gnu-nat.c:953: {inf 8543 0x88062a0}: running...
../../gdb-9.2/gdb/gnu-nat.c:688: {inf 8543 0x88062a0}: clearing wait
../../gdb-9.2/gdb/gnu-nat.c:1535: {inf 8543 0x88062a0}: waiting for an event...
receiving objects   1% [  
]../../gdb-9.2/gdb/gnu-nat.c:1027: {inf 8543 0x88062a0}: fetching threads
../../gdb-9.2/gdb/gnu-nat.c:927: {inf 8543 0x88062a0}: updating suspend counts
../../gdb-9.2/gdb/gnu-nat.c:273: {proc 8543/-1 0x8806b10}: sc: 0 --> 1
../../gdb-9.2/gdb/gnu-nat.c:312: {proc 8543/-1 0x8806b10}: is suspended
../../gdb-9.2/gdb/gnu-nat.c:312: {proc 8543/4 0x88083d0}: is running
../../gdb-9.2/gdb/gnu-nat.c:312: {proc 8543/5 0x8da2db0}: is running
../../gdb-9.2/gdb/gnu-nat.c:312: {proc 8543/6 0x8d9d9d0}: is running
../../gdb-9.2/gdb/gnu-nat.c:312: {proc 8543/7 0x8ece5b0}: is running
../../gdb-9.2/gdb/gnu-nat.c:953: {inf 8543 0x88062a0}: not running...
../../gdb-9.2/gdb/gnu-nat.c:1564: {inf 8543 0x88062a0}: event: msgid = 24120
../../gdb-9.2/gdb/gnu-nat.c:1827: {inf 8543 0x88062a0}: err = 0, pid = 8543, status = 0x47f, sigcode = 0
../../gdb-9.2/gdb/gnu-nat.c:1842: {inf 8543 0x88062a0}: waits pending now: 0
../../gdb-9.2/gdb/gnu-nat.c:1863: {inf 8543 0x88062a0}: process has stopped itself
../../gdb-9.2/gdb/gnu-nat.c:1027: {inf 8543 0x88062a0}: fetching threads
../../gdb-9.2/gdb/gnu-nat.c:1653: {inf 8543 0x88062a0}: returning ptid = Thread 8543.4, status->kind = stopped, signal = GDB_SIGNAL_ILL
../../gdb-9.2/gdb/gnu-nat.c:374: {proc 8543/4 0x88083d0}: updating state info
../../gdb-9.2/gdb/gnu-nat.c:356: {proc 8543/4 0x88083d0}: aborted
../../gdb-9.2/gdb/gnu-nat.c:389: {proc 8543/4 0x88083d0}: getting thread state
../../gdb-9.2/gdb/i386-gnu-nat.c:149: {proc 8543/4 0x88083d0}: fetching register eip

Thread 4 received signal SIGILL, Illegal instruction.
../../gdb-9.2/gdb/gnu-nat.c:2525: {inf 8543 0x88062a0}: writing 0x11270[1] <-- 0x881d5d8
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x810be88[1] --> 0x280479f
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x810be88[1] --> 0x280479b
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x810be80[64] --> 0x8f0f710
0x0810be88 in ?? ()
(gdb) bt
#0  0x0810be88 in ?? ()
../../gdb-9.2/gdb/gnu-nat.c:374: {proc 8543/4 0x88083d0}: updating state info
../../gdb-9.2/gdb/i386-gnu-nat.c:149: {proc 8543/4 0x88083d0}: fetching register ebp
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x813b480[64] --> 0x95328f0
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x0[64] --> 0x8f60410
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x0[1] --> 0x280478c
#1  0x in ?? ()
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x813b4c0[64] --> 0x8f60410
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x0[64] --> 0x960e1b0
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x1[1] --> 0x28047cc
(gdb) thread 5
[Switching to thread 5 (Thread 8543.5)]
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x123582c[1] --> 0x280468f
#0  0x0123582c in mach_msg_trap () at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach_msg_trap.S:2
2   /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach_msg_trap.S: No such file or directory.
(gdb) bt
#0  0x0123582c in mach_msg_trap () at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach_msg_trap.S:2
../../gdb-9.2/gdb/gnu-nat.c:374: {proc 8543/5 0x8da2db0}: updating state info
../../gdb-9.2/gdb/i386-gnu-nat.c:149: {proc 8543/5 0x8da2db0}: fetching register esp
../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 0x88062a0}: reading 0x2802e80[64] --> 0x8f60410
#1  0x01235f2a in __GI___mach_msg (../../gdb-9.2/gdb/gnu-nat.c:2532: {inf 8543 
0x88062a0}: reading 0x2802ec0[64] --> 

Re: Broken stack traces on crashed programs

2020-11-18 Thread Samuel Thibault
Ludovic Courtès, on Tue. 17 Nov. 2020 14:55:32 +0100, wrote:
> Samuel Thibault wrote:
> 
> > Ludovic Courtès, on Tue. 17 Nov. 2020 10:57:43 +0100, wrote:
> >> I’ve noticed that I’d always get “broken” stack traces in GDB when (1)
> >> attaching to a program suspended by /servers/crash-suspend, (2)
> >> examining a core dump, or (3) spawning a program in GDB and examining it
> >> after it’s received an unhandled signal like SIGILL.

Ah, in all cases you mean when receiving an unhandled signal?

I see that happen indeed.

> I get pretty back traces until the program gets an unhandled signal,
> AFAICT.  This makes me think it could have something to do with how GDB
> obtains thread state info for suspended threads.

Probably, yes.

Samuel



Re: Broken stack traces on crashed programs

2020-11-17 Thread Ludovic Courtès
Hi!

Samuel Thibault wrote:

> Ludovic Courtès, on Tue. 17 Nov. 2020 10:57:43 +0100, wrote:
>> I’ve noticed that I’d always get “broken” stack traces in GDB when (1)
>> attaching to a program suspended by /servers/crash-suspend, (2)
>> examining a core dump, or (3) spawning a program in GDB and examining it
>> after it’s received an unhandled signal like SIGILL.
>> 
>> At best I can see the backtrace of the msg thread, the other ones are
>> all question-marky:
>
> Silly question, but still important to ask: did you build with -g?

Yes.

I get pretty back traces until the program gets an unhandled signal,
AFAICT.  This makes me think it could have something to do with how GDB
obtains thread state info for suspended threads.

> (meaning: no, I don't have this kind of issue with GDB 9.2 in Debian)

(Same with GDB 10.1.)

It could be that we’re missing a libc patch that Debian has, or (more
likely) that we’re miscompiling something on the way (this is all
cross-compiled from x86_64-linux-gnu).

To be continued…

Ludo’.



Re: Broken stack traces on crashed programs

2020-11-17 Thread Samuel Thibault
Ludovic Courtès, on Tue. 17 Nov. 2020 10:57:43 +0100, wrote:
> I’ve noticed that I’d always get “broken” stack traces in GDB when (1)
> attaching to a program suspended by /servers/crash-suspend, (2)
> examining a core dump, or (3) spawning a program in GDB and examining it
> after it’s received an unhandled signal like SIGILL.
> 
> At best I can see the backtrace of the msg thread, the other ones are
> all question-marky:

Silly question, but still important to ask: did you build with -g?

(meaning: no, I don't have this kind of issue with GDB 9.2 in Debian)

Samuel



Broken stack traces on crashed programs

2020-11-17 Thread Ludovic Courtès
Hello!

I’ve noticed that I’d always get “broken” stack traces in GDB when (1)
attaching to a program suspended by /servers/crash-suspend, (2)
examining a core dump, or (3) spawning a program in GDB and examining it
after it’s received an unhandled signal like SIGILL.

At best I can see the backtrace of the msg thread, the other ones are
all question-marky:

--8<---cut here---start->8---
(gdb) thread 1
[Switching to thread 1 (process 310)]
#0  0x080f08c0 in ?? ()
(gdb) bt
#0  0x080f08c0 in ?? ()
#1  0x in ?? ()
(gdb) thread 2
[Switching to thread 2 (process 1)]
#0  0x0159282c in mach_msg_trap () at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach_msg_trap.S:2
2   /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach_msg_trap.S: No such file or directory.
(gdb) bt
#0  0x0159282c in mach_msg_trap () at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach_msg_trap.S:2
#1  0x01592f2a in __GI___mach_msg (msg=0x2802aa0, option=3, send_size=96, rcv_size=32, rcv_name=109, timeout=0, notify=0) at msg.c:111
#2  0x017dc8ab in __crash_dump_task (crashserver=132, task=1, file=133, signo=11, sigcode=2, sigerror=2, exc=1, code=2, subcode=210986494, cttyid_port=102, cttyid_portPoly=19)
    at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/hurd/RPC_crash_dump_task.c:254
#3  0x015b248c in write_corefile (detail=, signo=) at hurdsig.c:296
#4  post_signal (untraced=) at hurdsig.c:947
#5  0x015b274b in _hurd_internal_post_signal (ss=0x1800808, signo=11, detail=0x2802e5c, reply_port=0, reply_port_type=17, untraced=0) at hurdsig.c:1235
#6  0x015b3fc1 in _S_catch_exception_raise (port=96, thread=39, task=1, exception=1, code=2, subcode=210986494) at catch-exc.c:88
#7  0x017c09b4 in _Xexception_raise (InHeadP=0x2802f20, OutHeadP=0x2803f30) at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach/exc_server.c:155
#8  0x017c0a52 in _S_exc_server (InHeadP=0x2802f20, OutHeadP=0x2803f30) at /tmp/guix-build-glibc-cross-i586-pc-gnu-2.31.drv-0/build/mach/mach/exc_server.c:208
#9  0x015a7a09 in msgport_server (inp=0x2802f20, outp=0x2803f30) at msgportdemux.c:49
#10 0x015934c3 in __mach_msg_server_timeout (demux=0x15a79b0 , max_size=4096, rcv_name=96, option=0, timeout=0) at msgserver.c:108
#11 0x01593607 in __mach_msg_server (demux=0x15a79b0 , max_size=4096, rcv_name=96) at msgserver.c:195
#12 0x015a7a86 in _hurd_msgport_receive () at msgportdemux.c:67
#13 0x011eda50 in entry_point (self=0x804ac20, start_routine=0x15a7a30 <_hurd_msgport_receive>, arg=0x0) at pt-create.c:62
#14 0x in ?? ()
--8<---cut here---end--->8---

(This is on Guix System with GDB 9.2.)

Does that ring a bell?

Thanks,
Ludo’.