Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-28 Thread Mark Wielaard
On Tue, 2012-11-27 at 19:49 +0100, Ulrich Weigand wrote:
> Mark Wielaard wrote:
> 
> > Which other unwinders are out there, that might rely on the current
> > numbering?
> 
> Well, runtime unwinders using .eh_frame should be fine, since this
> uses (and has always used) consistently the GCC numbering.  I don't
> know if there are other unwinders using .dwarf_frame ...

The reason systemtap hits this is that it can do unwinding of both user
and kernel space. The linux kernel doesn't include eh_frames, so we have
to fall back to .debug_frame.

> The change will most likely be to consistently use GCC numbering in
> .dwarf_frame as well, which changes only the encoding of the condition
> code register.  Since you're not using that at all in systemtap, you
> shouldn't be affected.

Yeah, we only use the unwinder currently to produce backtraces, which
are unlikely to rely on the condition code register.

> As far as Linux goes, yes, ppc was the only architecture with a
> different encoding between .eh_frame and .dwarf_frame.

In that case your option 3 seems ideal.

Thanks,

Mark



Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-27 Thread Mark Kettenis
> Date: Tue, 27 Nov 2012 19:43:40 +0100 (CET)
> From: "Ulrich Weigand" 
> 
> Mark Kettenis wrote:
> > > Date: Mon, 26 Nov 2012 20:10:06 +0100 (CET)
> > > From: "Ulrich Weigand" 
> > > 
> > > Hello,
> > > 
> > > > I noticed what appears to be a long-standing bug in generating .dwarf_frame
> > > > sections with GCC on Linux on PowerPC.
> > > 
> > > ...
> > > 
> > > So I'm wondering where to go from here.  I guess we could:
> > > 
> > > > 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
> > > >    by removing the #undef DBX_REGISTER_NUMBER and changing gas's
> > > >    md_reg_eh_frame_to_debug_frame to the original implementation from
> > > >    Jakub's patch.  That would make GDB work well on new files, but
> > > >    there are a large number of binaries out there where we continue
> > > >    to have the same behaviour as today ...
> > > > 
> > > > 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
> > > >    .dwarf_frame, except for the condition code register.  This would
> > > >    break debugging of files built with GCC 4.0 and 4.1 unless we
> > > >    want to add a special hack for that.
> > > > 
> > > > 3. Like 2., but remove the condition code hack: simply use identical
> > > >    numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
> > > >    like other Linux platforms in that respect.
> > > 
> > > Thoughts?
> > 
> > What do other compilers (in particular XLC) do?  From a GDB standpoint
> > it would be a major PITA if different compilers would use different
> > encodings for .dwarf_frame.
> 
> In my tests XLC (version 12.1 on Linux) seems to consistently use the
> GCC register numbering in both .eh_frame and .dwarf_frame.  LLVM also
> consistently uses the GCC register numbering.  Looks like this would
> be another argument for variant 3 ...

Probably.  Certainly the most practical solution.  Although I'd say
that the fact that people have been able to live with the non-matching
register numbering schemes for so many years means that variant 1
wouldn't hurt people too badly.  It's a bit of a shame that on one of
the few architectures that bothered to provide a definition of the
DWARF register numbers we wouldn't use it :(.


Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-27 Thread Ulrich Weigand
David Edelsohn wrote:
> On Mon, Nov 26, 2012 at 2:10 PM, Ulrich Weigand wrote:
> 
> > So I'm wondering where to go from here.  I guess we could:
> >
> > 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
> >    by removing the #undef DBX_REGISTER_NUMBER and changing gas's
> >    md_reg_eh_frame_to_debug_frame to the original implementation from
> >    Jakub's patch.  That would make GDB work well on new files, but
> >    there are a large number of binaries out there where we continue
> >    to have the same behaviour as today ...
> >
> > 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
> >    .dwarf_frame, except for the condition code register.  This would
> >    break debugging of files built with GCC 4.0 and 4.1 unless we
> >    want to add a special hack for that.
> >
> > 3. Like 2., but remove the condition code hack: simply use identical
> >    numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
> >    like other Linux platforms in that respect.
> >
> > Thoughts?
> 
> I vote for (3).

I'd agree, in particular given that XLC and LLVM seem to match this
behaviour as well.

Looking into this further, it turns out that on Linux not only .debug_frame
is affected, but also .debug_info and all the other .debug_... sections.
DBX_REGISTER_NUMBER is used for register numbers in those sections too ...

This again doesn't match what GDB is expecting:  For regular debug info
(not frame info), GDB only distinguishes between stabs and DWARF, and
assumes GCC numbering for stabs, and DWARF numbering for DWARF.  This
holds for any PowerPC operating system.

However, looking at GCC behaviour, we have instead GCC numbering used
in either stabs or DWARF on Linux, but DWARF numbering apparently used
in either stabs or DWARF on AIX/BSD/Darwin.
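The mismatch described in the two paragraphs above can be tabulated with a
small sketch.  It only restates what is said there for regular debug info
(not frame info): GDB picks the numbering by debug format, while GCC
apparently picks it by operating system.  The function and constant names
are illustrative and do not come from GDB or GCC.

```python
# Numbering schemes discussed in this thread (labels are illustrative).
GCC, DWARF = "gcc", "dwarf"

OSES = ("linux", "aix", "bsd", "darwin")
FORMATS = ("stabs", "dwarf")


def gdb_expects(os_name, debug_format):
    """GDB keys only on the debug format, on any PowerPC operating system."""
    return GCC if debug_format == "stabs" else DWARF


def gcc_emits(os_name, debug_format):
    """GCC (apparently) keys only on the operating system."""
    return GCC if os_name == "linux" else DWARF


def mismatches():
    """Return (os, format) pairs where GDB's expectation differs from GCC's output."""
    return [
        (os_name, fmt)
        for os_name in OSES
        for fmt in FORMATS
        if gdb_expects(os_name, fmt) != gcc_emits(os_name, fmt)
    ]


if __name__ == "__main__":
    for os_name, fmt in mismatches():
        print(f"{os_name}/{fmt}: GDB expects {gdb_expects(os_name, fmt)}, "
              f"GCC emits {gcc_emits(os_name, fmt)}")
```

Under these assumptions the two sides disagree for DWARF on Linux and for
stabs on AIX/BSD/Darwin, and agree everywhere else.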

Here, comparison with other compilers is less clear.  I wasn't able to
get XLC on Linux to generate any .debug_info containing a register
number for non-GPR/FPR registers (it would always put such variables
on the stack).  The XLC on AIX I have access to is quite old and only
generates stabs; again I wasn't able to see any non-GPR register
assignments.  LLVM consistently uses the GCC numbering on all operating
systems it supports (I think that's Linux, Darwin, and FreeBSD).

As far as Linux is concerned, leaving the compilers as-is and changing
GDB to expect GCC numbering might be the best option.  Not sure about
other operating systems ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-27 Thread Ulrich Weigand
Mark Wielaard wrote:

> Which other unwinders are out there, that might rely on the current
> numbering?

Well, runtime unwinders using .eh_frame should be fine, since this
uses (and has always used) consistently the GCC numbering.  I don't
know if there are other unwinders using .dwarf_frame ...

> The Systemtap runtime unwinder (*) currently is incomplete
> (and in one case wrong since the numbering overlaps), so it doesn't
> really matter much which solution you pick (we will just have to watch
> out and fix things to be as consistent as possible when your change goes
> through). If you do change the numbering it would be ideal if there was
> a way to detect which one was in place (although it is probably hopeless
> because depending on which GCC version is in use there can already be
> different numberings).

The change will most likely be to consistently use GCC numbering in
.dwarf_frame as well, which changes only the encoding of the condition
code register.  Since you're not using that at all in systemtap, you
shouldn't be affected.
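
As a rough sketch of why only one register is affected, the current
.eh_frame to .dwarf_frame translation can be compared with the proposed one.
The value 64 for the condition code register comes from this thread; the
value 70 for its number in .eh_frame is an assumption that the thread does
not state, and the function names are made up for illustration.

```python
# ASSUMPTION: 70 is the number the condition code register has in .eh_frame
# (GCC numbering); this thread does not state it, so check it before relying
# on it.  64 is the DWARF number for the condition code register, as stated
# in the thread.
CR_EH_FRAME = 70
CR_DWARF = 64


def debug_frame_regnum_current(eh_regnum):
    """Current behaviour: GCC numbering except for the condition code register."""
    return CR_DWARF if eh_regnum == CR_EH_FRAME else eh_regnum


def debug_frame_regnum_option3(eh_regnum):
    """Option 3: identical numbers in .eh_frame and .dwarf_frame."""
    return eh_regnum


def changed_registers(limit=128):
    """List the .eh_frame numbers whose .dwarf_frame encoding would change."""
    return [
        regnum
        for regnum in range(limit)
        if debug_frame_regnum_current(regnum) != debug_frame_regnum_option3(regnum)
    ]
```

With these assumed values, changed_registers() contains a single entry, the
condition code register, so a consumer that never looks at that register
sees no difference.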

> BTW. The reason the systemtap runtime unwinder is
> a little wrong here is because on all other architectures we assume
> eh_frame and debug_frame DWARF register numberings are equal, is ppc
> really the only architecture for which that isn't true, or were we just
> lucky?

As far as Linux goes, yes, ppc was the only architecture with a
different encoding between .eh_frame and .dwarf_frame.  The only
other such platforms I'm aware of are Darwin on 32-bit i386, and
some other operating systems on ppc (AIX, Darwin, BSD).

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-27 Thread Ulrich Weigand
Mark Kettenis wrote:
> > Date: Mon, 26 Nov 2012 20:10:06 +0100 (CET)
> > From: "Ulrich Weigand" 
> > 
> > Hello,
> > 
> > I noticed what appears to be a long-standing bug in generating .dwarf_frame
> > sections with GCC on Linux on PowerPC.
> > 
> > ...
> > 
> > So I'm wondering where to go from here.  I guess we could:
> > 
> > > 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
> > >    by removing the #undef DBX_REGISTER_NUMBER and changing gas's
> > >    md_reg_eh_frame_to_debug_frame to the original implementation from
> > >    Jakub's patch.  That would make GDB work well on new files, but
> > >    there are a large number of binaries out there where we continue
> > >    to have the same behaviour as today ...
> > > 
> > > 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
> > >    .dwarf_frame, except for the condition code register.  This would
> > >    break debugging of files built with GCC 4.0 and 4.1 unless we
> > >    want to add a special hack for that.
> > > 
> > > 3. Like 2., but remove the condition code hack: simply use identical
> > >    numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
> > >    like other Linux platforms in that respect.
> > 
> > Thoughts?
> 
> What do other compilers (in particular XLC) do?  From a GDB standpoint
> it would be a major PITA if different compilers would use different
> encodings for .dwarf_frame.

In my tests XLC (version 12.1 on Linux) seems to consistently use the
GCC register numbering in both .eh_frame and .dwarf_frame.  LLVM also
consistently uses the GCC register numbering.  Looks like this would
be another argument for variant 3 ...

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com



Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-27 Thread Mark Wielaard
On Mon, 2012-11-26 at 20:10 +0100, Ulrich Weigand wrote:
> I noticed what appears to be a long-standing bug in generating .dwarf_frame
> sections with GCC on Linux on PowerPC.
> 
> It had been my understanding that .dwarf_frame is supposed to differ from
> .eh_frame on PowerPC w.r.t. register numbers: .eh_frame should use GCC
> internal numbers, while .dwarf_frame should use the DWARF register numbers
> documented in the PowerPC ELF ABI.  However, in actual fact, .dwarf_frame
> does not use the DWARF numbers; and it does not use the GCC numbers either,
> but a weird mixture: it uses GCC numbers for everything except for the
> condition code register, for which it uses the DWARF number (64).
> [...]
> Unfortunately, "use a newer version of
> GCC" isn't really quite right any more: the only versions of GCC that
> ever did it correctly were 4.0 and 4.1, it would appear.

Aha. Thanks for the investigation. I remember being very confused when
hacking on the Systemtap unwinder for ppc64. This explains it (RHEL5
derived distributions use GCC 4.1, but most others use something much
older or much newer).

> So I'm wondering where to go from here.  I guess we could:
> 
> 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
>    by removing the #undef DBX_REGISTER_NUMBER and changing gas's
>    md_reg_eh_frame_to_debug_frame to the original implementation from
>    Jakub's patch.  That would make GDB work well on new files, but
>    there are a large number of binaries out there where we continue
>    to have the same behaviour as today ...
> 
> 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
>    .dwarf_frame, except for the condition code register.  This would
>    break debugging of files built with GCC 4.0 and 4.1 unless we
>    want to add a special hack for that.
> 
> 3. Like 2., but remove the condition code hack: simply use identical
>    numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
>    like other Linux platforms in that respect.
> 
> Thoughts?

Which other unwinders are out there, that might rely on the current
numbering? The Systemtap runtime unwinder (*) currently is incomplete
(and in one case wrong since the numbering overlaps), so it doesn't
really matter much which solution you pick (we will just have to watch
out and fix things to be as consistent as possible when your change goes
through). If you do change the numbering it would be ideal if there was
a way to detect which one was in place (although it is probably hopeless
because depending on which GCC version is in use there can already be
different numberings). BTW. The reason the systemtap runtime unwinder is
a little wrong here is because on all other architectures we assume
eh_frame and debug_frame DWARF register numberings are equal, is ppc
really the only architecture for which that isn't true, or were we just
lucky?

Thanks,

Mark

(*) The Systemtap runtime unwinder has this comment that explains (or
maybe confuses things even more...):

/* These are slightly strange since they don't really use dwarf register
   mappings, but gcc internal register numbers. There is some confusion about
   the numbering see http://gcc.gnu.org/ml/gcc/2004-01/msg00025.html
   We just handle the 32 fixed point registers, mq, count and link and
   ignore status registers, floating point, vectors and special registers
   (most of which aren't available in pt_regs anyway). Also we placed nip
   last since we use that as UNW_PC register and it needs to be filled in.
   Note that we handle both the .eh_frame and .debug_frame numbering at
   the same time. There is potential overlap though. 64 maps to cr in one
   and mq in the other...
   Everything else is mapped to an invalid register number . */
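
The overlap mentioned in that comment can be illustrated with a short
sketch.  It is not the Systemtap code: the function names are hypothetical,
and only the 32 fixed point registers and the number 64 are modelled, since
those are the cases the comment and this thread spell out.

```python
def reg_name(regnum, section):
    """Name a register number read from '.eh_frame' or '.debug_frame'.

    Only the cases spelled out in the thread are modelled; anything else
    returns None, standing in for "invalid register number".
    """
    if section not in (".eh_frame", ".debug_frame"):
        raise ValueError(f"unknown section: {section}")
    if 0 <= regnum <= 31:
        # The 32 fixed point registers are numbered the same way in both.
        return f"r{regnum}"
    if regnum == 64:
        # GCC numbering in .eh_frame gives mq; .debug_frame uses the DWARF
        # number 64 for the condition code register.
        return "mq" if section == ".eh_frame" else "cr"
    return None


def ambiguous_without_section(regnum):
    """True when the same number names different registers in the two sections."""
    return reg_name(regnum, ".eh_frame") != reg_name(regnum, ".debug_frame")
```

A single lookup table that ignores the section therefore has to pick one
meaning for 64, which is the "potential overlap" the comment warns about.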




Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-26 Thread Mark Kettenis
> Date: Mon, 26 Nov 2012 20:10:06 +0100 (CET)
> From: "Ulrich Weigand" 
> 
> Hello,
> 
> I noticed what appears to be a long-standing bug in generating .dwarf_frame
> sections with GCC on Linux on PowerPC.
> 
> ...
> 
> So I'm wondering where to go from here.  I guess we could:
> 
> 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
>    by removing the #undef DBX_REGISTER_NUMBER and changing gas's
>    md_reg_eh_frame_to_debug_frame to the original implementation from
>    Jakub's patch.  That would make GDB work well on new files, but
>    there are a large number of binaries out there where we continue
>    to have the same behaviour as today ...
> 
> 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
>    .dwarf_frame, except for the condition code register.  This would
>    break debugging of files built with GCC 4.0 and 4.1 unless we
>    want to add a special hack for that.
> 
> 3. Like 2., but remove the condition code hack: simply use identical
>    numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
>    like other Linux platforms in that respect.
> 
> Thoughts?

What do other compilers (in particular XLC) do?  From a GDB standpoint
it would be a major PITA if different compilers would use different
encodings for .dwarf_frame.


Re: [RFC] Wrong register numbers in .dwarf_frame on Linux/PowerPC

2012-11-26 Thread David Edelsohn
On Mon, Nov 26, 2012 at 2:10 PM, Ulrich Weigand wrote:

> So I'm wondering where to go from here.  I guess we could:
>
> 1. Bring GCC (and gas) behaviour in compliance with the documented ABI
>    by removing the #undef DBX_REGISTER_NUMBER and changing gas's
>    md_reg_eh_frame_to_debug_frame to the original implementation from
>    Jakub's patch.  That would make GDB work well on new files, but
>    there are a large number of binaries out there where we continue
>    to have the same behaviour as today ...
>
> 2. Leave GCC and gas as-is and modify GDB to expect GCC numbering in
>    .dwarf_frame, except for the condition code register.  This would
>    break debugging of files built with GCC 4.0 and 4.1 unless we
>    want to add a special hack for that.
>
> 3. Like 2., but remove the condition code hack: simply use identical
>    numbers in .eh_frame and .dwarf_frame.  This would make PowerPC
>    like other Linux platforms in that respect.
>
> Thoughts?

I vote for (3).

Thanks, David