Re: Type representation in CTF and DWARF

Richard Biener Wed, 09 Oct 2019 00:41:58 -0700

On Wed, Oct 9, 2019 at 7:26 AM Indu Bhagat <indu.bha...@oracle.com> wrote:
>
>
>
> On 10/08/2019 08:37 AM, Pedro Alves wrote:
> > On 10/4/19 8:23 PM, Indu Bhagat wrote:
> >> Hello,
> >>
> >> At GNU Tools Cauldron this year, some folks were curious to know more on how
> >> the "type representation" in CTF compares vis-a-vis DWARF.
> > I was one of those, and I brought this up to Jose, after your
> > presentation.  Glad to see the follow up!  Thanks much for this.
> >
> > In your Cauldron presentation we saw CTF compared to full blown DWARF
> > as justification for CTF,
>
> Hmm. And I thought I had made the effort required to clarify my position:
> comparing full-blown DWARF sizes to type-only CTF section sizes is not
> appropriate, let alone usable as a justification for CTF. My intention in
> showing those numbers was only to give some perspective to users curious about
> the sizes of CTF debug info (as generated by dwarf2ctf), because these sections
> will ideally not be stripped out of shipped binaries.
>
> The justification for CTF is, and will remain, that it is a compact, faster
> debug format for type information, with support for some online debugging
> use cases (like backtraces) in the future.
>
> > but I was more interested in a comparison between
> > CTF and a DWARF subset containing exactly only what you have available in
> > CTF.  Because if DWARF with everything-you-don't-need stripped out
> > is in the same ballpark, then I am puzzled on why add/maintain a new
> > Debug format, with all the duplication of effort that entails going
> > forward.
>
> I shared some numbers on this in the previous emails in this thread. I thought
> comparing DWARF's de-duplication-amenable offering (using
> -fdebug-types-section) would be useful in this context.
>
> For binaries compiled with -fdebug-types-section -gdwarf-4, here is some data.
> The CTF sections are generated with dwarf2ctf because CTF link-time de-dup is
> still being worked on. The end result of link-time CTF de-dup is expected
> to be on par with these .ctf section sizes.
>
> The .ctf section sizes below include the CTF string table (.debug_str is
> excluded from the calculations however):
>
> (coreutils-0.22)
>        | .debug_info(D1) | .debug_abbrev(D2) | .debug_str | .debug_types(D3) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D3))
> ls     | 109806          | 18876             | 22042      | 12413            | 26240               | 0.18
> pwd    | 27902           | 7914              | 10851      | 5753             | 13929               | 0.33
> groups | 26920           | 8173              | 10674      | 5070             | 13378               | 0.33
>
> (emacs-26.3)
>        | .debug_info(D1) | .debug_abbrev(D2) | .debug_str | .debug_types(D3) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D3))
> emacs  | 3755083         | 202926            | 431926     | 143462           | 273910              | 0.06
>
>
> It is not easy to get an estimate of 'DWARF with everything-you-don't-need
> stripped out'. At this time, I don't know of an easy way to make this
> comparison more meaningful. Any suggestions?
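
The ratio column in the tables above can be reproduced with a minimal sketch like the following. It is written in Python; the dictionary layout and the name `ctf_ratio` are illustrative assumptions, and only the section sizes are taken from the quoted data. As in the quoted calculation, .debug_str is left out of the denominator.

```python
# Section sizes copied from the quoted tables.
# The structure and names below are illustrative only.
SIZES = {
    "ls":     {"debug_info": 109806,  "debug_abbrev": 18876,  "debug_types": 12413,  "ctf": 26240},
    "pwd":    {"debug_info": 27902,   "debug_abbrev": 7914,   "debug_types": 5753,   "ctf": 13929},
    "groups": {"debug_info": 26920,   "debug_abbrev": 8173,   "debug_types": 5070,   "ctf": 13378},
    "emacs":  {"debug_info": 3755083, "debug_abbrev": 202926, "debug_types": 143462, "ctf": 273910},
}


def ctf_ratio(sizes):
    """Return .ctf / (D1 + D2 + D3); .debug_str is deliberately excluded."""
    dwarf = sizes["debug_info"] + sizes["debug_abbrev"] + sizes["debug_types"]
    return sizes["ctf"] / dwarf


for name, sizes in SIZES.items():
    print(f"{name:7s} {ctf_ratio(sizes):.3f}")
```

Computed to three decimals, the values suggest that the quoted two-decimal ratios were truncated rather than rounded (for example, ls comes out at roughly 0.186 and emacs at roughly 0.067).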


There's a mechanism to get type info (and decl info; I suppose CTF also contains
debug info for function declarations, not only their types?) as part of early
debug generation. The attached "hack" simply mangles dwarf2out to output this
early info as the only debug info (only verified on a small .c file). We still
have things like file, line and column numbers for entities (not sure if CTF
has those).

It should be possible to "hide" the hack behind a -gdwarf-like-ctf or similar.
I guess -g0.5 isn't desirable and we've taken both -g0 and -g1 already...
(and -g1 doesn't include types but just decls).

Richard.

> > Also, it's my understanding that the current CTF format doesn't yet
> > support C++, Vector registers, etc., maybe other things, so if DWARF
> > was sufficient for your needs, then in the long run it sounds like
> > a better option to me, as then you wouldn't have to extend CTF _and_
> > DWARF whenever some feature is needed.
>
> Yes, CTF does not support C++ at this time. To cover all of C (including
> GNU C extensions), we need to add representations for things like vector
> types, non-IEEE floats, etc. (somewhat infrequently occurring constructs).
>
> The issue is not that DWARF cannot represent the required type information.
> First, DWARF is voluminous; second, the current workflow to get to CTF from
> source programs without direct toolchain support is tiresome and lengthy.
>
> For current and future users of CTF, having the support for the format in the
> toolchain is the best way to promote adoption and enhance community experience.
>
> > Maybe it would make sense to work on integrating CTF into the DWARF
> > standard itself, not sure?
> >
> > I was also curious on your plans for adding unwinding support to CTF,
> > while the kernel (the main CTF user, IIUC), already has plans to
> > use its own unwinding format (ORC)?
>
> The kernel's unwinding format (ORC) helps generate backtraces with function
> identifiers. For some (ORCL) internal customers, the requirement is to go
> beyond that and support input arg values. The requirement there is to generate
> backtraces quickly, without relying on DWARF.
>
> > So with all those questions, I came out of the presentation
> > thinking that I could not really justify CTF if I were asked to.
>
> Thanks for discussing this openly. I believe there are other GCC
> maintainers who are undecided as well :)
>
> I hope I have answered some of your concerns.
>
> > (Side note: the Cauldron page is missing slides for your
> > presentation, so I couldn't go and recheck some things
> > mentioned above.)
> >
> > Thanks,
> > Pedro Alves
> >
> I mailed the organizers my slides. They should be online soon.
>
> Thanks
>

