Re: addr2line [ Was: better stackdumps ]

2008-03-20 Thread Brian Dessent
Brian Dessent wrote:

> I think I see what's going on here though, the Cygwin fault handler took
> the first chance exception and wrote the stackdump file, and only then
> passed it on to the debugger, so that by the time gdb got notice of the
> fault the stack was all fubar.  This could be the reason why dumper is
> not working too.  I thought there was an IsBeingDebugged() check in the

Silly me, this is good old "set cygwin-exceptions" defaulting to off...
of course gdb was ignoring the fault and letting Cygwin handle it.  With
it set to on everything works as expected, and the issue of why the
process state that dumper records is so trashed is unrelated.

Brian


Re: addr2line [ Was: better stackdumps ]

2008-03-20 Thread Christopher Faylor
On Thu, Mar 20, 2008 at 11:23:05AM -0700, Brian Dessent wrote:
>Corinna Vinschen wrote:
>
>> Is it a big problem to fix addr2line to deal with .dbg files?
>> 
>> I like your idea to add names to the stackdump especially because of
>> addr2line's brokenness.  But, actually, if addr2line would work with
>> .dbg files, there would be no reason to add this to the stackdump file.
>
>I absolutely agree that addr2line and/or dumper and/or gdb should be
>fixed, regardless of this patch.  I never meant to imply an either/or
>situation, and in fact I have debugged addr2line and here are the
>reasons it's broken:
>
>Firstly, it's got nothing to do with the .gnu_debuglink separate debug file;
>that part works just fine.  Secondly, addr2line only loads the debug
>information for the module that you supply with -e, meaning that if you
>give "-e a.exe" it will look at symbols for a.exe, but it doesn't know
>that a.exe is dynamically linked to cygwin1.dll and it won't try to load
>symbols for cygwin1.dll.  This means to use it you need to know
>beforehand which module the address is in, which right there makes it
>kind of a pain to use for DLLs, and to me it rather dilutes the argument
>that you can just postprocess a stackdump file with it since you need
>more information than what's there.
>
>The next problem is that addr2line first tries to read STABS, and if
>that fails it falls back to DWARF-2.  I always build Cygwin and most
>other things with DWARF-2 debug symbols, mainly to make sure they work
>but really aren't we eventually hoping to get rid of STABS?  Anyway,
>this exposed another problem in that even if you build all of Newlib and
>Cygwin with -gdwarf-2 or -ggdb3, you still get a handful of STABS symbols
>which are hardcoded in various assembler files:
>
>mktemp.cc:20:  asm (".stabs \"" msg "\",30,0,0,0\n\t" \
>mktemp.cc:21:  ".stabs \"_" #symbol "\",1,0,0,0\n");
>
>This is used to insert a linktime warning for using mktemp().
>
>sigfe.s:3:  .stabs  "_sigfe:F(0,1)",36,0,0,__sigfe
>sigfe.s:44: .stabs  "_sigbe:F(0,1)",36,0,0,__sigbe
>sigfe.s:70: .stabs  "sigreturn:F(0,1)",36,0,0,_sigreturn
>sigfe.s:108:.stabs  "sigdelayed:F(0,1)",36,0,0,_sigdelayed
>
>This becomes a problem in that when bfd tries to find an address in the
>debug data it sees these minimal STABS and considers them a match --
>even though they are mostly irrelevant, they are present and since it's
>only got an address to go by it doesn't know that there is a much better
>match in the DWARF-2 data.  It just sees that it has gotten a (bad)
>match, so it doesn't bother looking in the DWARF-2 data.  And since
>those hand-coded .stabs above only give symbol name locations, not line
>number information, that means that regardless of what you ask addr2line
>it's going to return nothing because it only cares about line number
>info.
>
>I see two potential fixes here, the first being that Cygwin could be
>adapted to not hardcode .stabs but rather detect whether it's being
>built with DWARF-2 or STABS and use the appropriate kind.  The other fix
>is to teach BFD to try DWARF-2 first before STABS.  The attached patch
>does this, for the purposes of illustration -- I don't really claim this
>is correct.
>
>Once that is applied, here is the result of running the patched
>addr2line on the addresses in the stackdump of this testcase:
>
>$ for F in 610F74B1 610FDD3B 6110A310 610AA4A8 61006094; do
>    /build/combined/binutils/.libs/addr2line.exe -e /bin/cygwin1.dll -f 0x$F
>  done
>??
>??:0
>_vfprintf_r
>/usr/src/sourceware/newlib/libc/stdio/vfprintf.c:1197
>printf
>/usr/src/sourceware/newlib/libc/stdio/printf.c:55
>??
>??:0
>_Z10dll_crt0_1Pv
>/usr/src/sourceware/winsup/cygwin/dcrt0.cc:930
>
>It now gets 3 out of 5 correct.  It got tripped up on _sigbe because
>again addr2line only cares about line number info, not general address
>information, and while there is information for the location of _sigbe,
>it doesn't contain line number info:
>
>(gdb) i ad _sigbe
>Symbol "_sigbe" is at 0x610aa4a8 in a file compiled without debugging.
>
>For the top frame (strlen), addr2line could not print anything because
>while there is location information, there is no line number
>information:
>
>(gdb) i li *0x610F74B1
>No line number information available for address 0x610f74b1 
>
>This is due to the fact that strlen is implemented in newlib as
>libc/machine/i386/strlen.S which is a straight assembler version, and
>hence no line number debug records.
>
>
>
>*** To summarize thus far:
>
>1. addr2line can be made to work again by one of a) dictating the use of
>STABS (boo!), b) modifying Cygwin to not emit hardcoded .stabs
>directives directly, c) modifying BFD to prefer DWARF-2 to STABS when
>reading COFF files.
>
>2. addr2line requires the user to know beforehand which DLL a symbol is
>in, because it can't resolve runtime dependencies.
>
>3. addr2line only cares about line number debug records, which means it
>will be incapable of representing many symbols.
>
>4. As an im

addr2line [ Was: better stackdumps ]

2008-03-20 Thread Brian Dessent
Corinna Vinschen wrote:

> Is it a big problem to fix addr2line to deal with .dbg files?
> 
> I like your idea to add names to the stackdump especially because of
> addr2line's brokenness.  But, actually, if addr2line would work with
> .dbg files, there would be no reason to add this to the stackdump file.

I absolutely agree that addr2line and/or dumper and/or gdb should be
fixed, regardless of this patch.  I never meant to imply an either/or
situation, and in fact I have debugged addr2line and here are the
reasons it's broken:

Firstly, it's got nothing to do with the .gnu_debuglink separate debug file;
that part works just fine.  Secondly, addr2line only loads the debug
information for the module that you supply with -e, meaning that if you
give "-e a.exe" it will look at symbols for a.exe, but it doesn't know
that a.exe is dynamically linked to cygwin1.dll and it won't try to load
symbols for cygwin1.dll.  This means to use it you need to know
beforehand which module the address is in, which right there makes it
kind of a pain to use for DLLs, and to me it rather dilutes the argument
that you can just postprocess a stackdump file with it since you need
more information than what's there.
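
To make the "which module is this address in" step concrete, here is a
minimal sketch of the lookup a postprocessing script would have to do
before it could even pick a -e argument.  The load map below (base
addresses and sizes) is entirely hypothetical; the stackdump does not
record it, which is exactly the missing information.

```python
from bisect import bisect_right

# Hypothetical load map: (base address, image size, path).
# These numbers are illustrative assumptions, not taken from a real process.
MODULES = [
    (0x00400000, 0x00010000, "a.exe"),
    (0x61000000, 0x00200000, "/bin/cygwin1.dll"),
]

def module_for_address(addr, modules=MODULES):
    """Return the path of the module whose image contains addr, or None."""
    ordered = sorted(modules)
    bases = [base for base, _size, _path in ordered]
    # Find the last module whose base is <= addr.
    i = bisect_right(bases, addr) - 1
    if i < 0:
        return None
    base, size, path = ordered[i]
    # The address must also fall before the end of that image.
    return path if addr < base + size else None
```

Only once the module is known can its path be handed to addr2line -e.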

The next problem is that addr2line first tries to read STABS, and if
that fails it falls back to DWARF-2.  I always build Cygwin and most
other things with DWARF-2 debug symbols, mainly to make sure they work
but really aren't we eventually hoping to get rid of STABS?  Anyway,
this exposed another problem in that even if you build all of Newlib and
Cygwin with -gdwarf-2 or -ggdb3, you still get a handful of STABS symbols
which are hardcoded in various assembler files:

mktemp.cc:20:  asm (".stabs \"" msg "\",30,0,0,0\n\t" \
mktemp.cc:21:  ".stabs \"_" #symbol "\",1,0,0,0\n");

This is used to insert a linktime warning for using mktemp().

sigfe.s:3:  .stabs  "_sigfe:F(0,1)",36,0,0,__sigfe
sigfe.s:44: .stabs  "_sigbe:F(0,1)",36,0,0,__sigbe
sigfe.s:70: .stabs  "sigreturn:F(0,1)",36,0,0,_sigreturn
sigfe.s:108:.stabs  "sigdelayed:F(0,1)",36,0,0,_sigdelayed

This becomes a problem in that when bfd tries to find an address in the
debug data it sees these minimal STABS and considers them a match --
even though they are mostly irrelevant, they are present and since it's
only got an address to go by it doesn't know that there is a much better
match in the DWARF-2 data.  It just sees that it has gotten a (bad)
match, so it doesn't bother looking in the DWARF-2 data.  And since
those hand-coded .stabs above only give symbol name locations, not line
number information, that means that regardless of what you ask addr2line
it's going to return nothing because it only cares about line number
info.
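
As a toy model of that shadowing (this is not BFD's code; the reader
functions, the tables and the "nearest preceding symbol" rule are all
assumptions made up for illustration), consider a lookup that asks each
debug format in turn and stops at the first one that claims the address:

```python
# Toy tables.  The _sigbe address is the one gdb reports in this mail;
# the DWARF entry mirrors one line of the addr2line output in this mail.
STABS_FUNCS = {0x610AA4A8: "_sigbe"}
DWARF_LINES = {0x610FDD3B: ("_vfprintf_r", "vfprintf.c", 1197)}

def stabs_lookup(addr):
    """Claim any address at or above a known function stab, without line info."""
    below = [a for a in STABS_FUNCS if a <= addr]
    if not below:
        return None
    return (STABS_FUNCS[max(below)], None, None)

def dwarf_lookup(addr):
    """Return (function, file, line) if the toy DWARF table has the address."""
    return DWARF_LINES.get(addr)

def lookup_address(addr, readers):
    """Ask each reader in order; the first non-None answer wins."""
    for reader in readers:
        hit = reader(addr)
        if hit is not None:
            return hit
    return None
```

With the readers ordered [stabs_lookup, dwarf_lookup], the address
0x610FDD3B comes back as a line-less STABS hit and the DWARF table is
never consulted; swapping the order yields the vfprintf.c line.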

I see two potential fixes here, the first being that Cygwin could be
adapted to not hardcode .stabs but rather detect whether it's being
built with DWARF-2 or STABS and use the appropriate kind.  The other fix
is to teach BFD to try DWARF-2 first before STABS.  The attached patch
does this, for the purposes of illustration -- I don't really claim this
is correct.

Once that is applied, here is the result of running the patched
addr2line on the addresses in the stackdump of this testcase:

$ for F in 610F74B1 610FDD3B 6110A310 610AA4A8 61006094; do
    /build/combined/binutils/.libs/addr2line.exe -e /bin/cygwin1.dll -f 0x$F
  done
??
??:0
_vfprintf_r
/usr/src/sourceware/newlib/libc/stdio/vfprintf.c:1197
printf
/usr/src/sourceware/newlib/libc/stdio/printf.c:55
??
??:0
_Z10dll_crt0_1Pv
/usr/src/sourceware/winsup/cygwin/dcrt0.cc:930

It now gets 3 out of 5 correct.  It got tripped up on _sigbe because
again addr2line only cares about line number info, not general address
information, and while there is information for the location of _sigbe,
it doesn't contain line number info:

(gdb) i ad _sigbe
Symbol "_sigbe" is at 0x610aa4a8 in a file compiled without debugging.

For the top frame (strlen), addr2line could not print anything because
while there is location information, there is no line number
information:

(gdb) i li *0x610F74B1
No line number information available for address 0x610f74b1 

This is due to the fact that strlen is implemented in newlib as
libc/machine/i386/strlen.S which is a straight assembler version, and
hence has no line number debug records.
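
For anyone scripting this, the "addr2line -f" output shown earlier in this
mail comes as pairs of lines (function name, then file:line), so it is
easy to tally which frames resolved.  A rough sketch follows; the helper
name parse_addr2line is made up here, and the sample text is simply the
output pasted from this mail:

```python
def parse_addr2line(output):
    """Pair up 'function' / 'file:line' lines from `addr2line -f` output.

    Returns a list of (function, path, line, resolved) tuples.
    """
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    frames = []
    for func, loc in zip(lines[0::2], lines[1::2]):
        # Split on the last colon so paths containing colons still work.
        path, _, lineno = loc.rpartition(":")
        line = int(lineno) if lineno.isdigit() else 0
        resolved = func != "??" and path != "??" and line > 0
        frames.append((func, path, line, resolved))
    return frames

SAMPLE = """\
??
??:0
_vfprintf_r
/usr/src/sourceware/newlib/libc/stdio/vfprintf.c:1197
printf
/usr/src/sourceware/newlib/libc/stdio/printf.c:55
??
??:0
_Z10dll_crt0_1Pv
/usr/src/sourceware/winsup/cygwin/dcrt0.cc:930
"""

frames = parse_addr2line(SAMPLE)
resolved_count = sum(1 for f in frames if f[3])
```

Frames that come back as "??" / "??:0" are the ones with no line number
records, such as the strlen and _sigbe frames discussed above.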



*** To summarize thus far:

1. addr2line can be made to work again by one of a) dictating the use of
STABS (boo!), b) modifying Cygwin to not emit hardcoded .stabs
directives directly, c) modifying BFD to prefer DWARF-2 to STABS when
reading COFF files.

2. addr2line requires the user to know beforehand which DLL a symbol is
in, because it can't resolve runtime dependencies.

3. addr2line only cares about line number debug records, which means it
will be incapable of representing many symbols.

4. As an implication of 3), addr2line is totally useless on DLLs/EXEs
without debug information available.



I think point number 4 is worth repeating: we as developers take for
granted