>
> IIRC this will add linkage name into early debug, so with LTO this pre-dates
> any mangling done by WPA. Did you check how it behaves with LTO?
Auto-profile is read at compile time (before LTO). It
needs to apply a profile measured on the optimized program to an almost
unoptimized one. We have get_original_name, which knows which suffixes
are added after auto-profile (isra, constprop, lto_priv, part, cold).
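
To illustrate the idea of mapping a clone name back to its source name,
here is a minimal sketch. It is not the actual get_original_name;
strip_clone_suffixes is a hypothetical helper, and it assumes clone names
of the form name.suffix or name.suffix.N:

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper for illustration only; GCC's real logic lives in
// get_original_name and may differ in detail.
std::string strip_clone_suffixes(std::string_view name)
{
  // Suffixes listed in the mail as being added after auto-profile.
  static constexpr std::array<std::string_view, 5> suffixes = {
    ".isra", ".constprop", ".lto_priv", ".part", ".cold"};

  bool changed = true;
  while (changed)
    {
      changed = false;
      std::string_view candidate = name;

      // Tentatively drop an optional trailing ".<digits>" clone counter.
      std::size_t end = candidate.size();
      while (end > 0
             && std::isdigit(static_cast<unsigned char>(candidate[end - 1])))
        --end;
      if (end < candidate.size() && end > 0 && candidate[end - 1] == '.')
        candidate = candidate.substr(0, end - 1);

      // Only commit the change if a known suffix follows the base name.
      for (std::string_view s : suffixes)
        if (candidate.size() > s.size()
            && candidate.substr(candidate.size() - s.size()) == s)
          {
            name = candidate.substr(0, candidate.size() - s.size());
            changed = true;
            break;
          }
    }
  return std::string(name);
}
```

With this sketch, "foo.isra.0" and "bar.constprop.1.cold" map back to
"foo" and "bar", while a name without a known suffix is left untouched.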
>
> linkage names add quite some bloat - the patch indicates we never added
> them for local visibility functions? As for the name/TU pair, the TU should
> be discoverable by walking the DIE parents.
Yes, DWARF contains them only for public symbols (including optimized-out
comdats). I am not sure what they are used for; I see it was
originally a MIPS extension that became part of DWARF. If a function is not
inlined, AFDO tools will use the symbol name, but if it is inlined there is
no info. LLVM always adds it.
>
> -gdebug-for-auto-profile is a bit verbose, I wonder (unless you plan more
> changes here) whether a -g[no-]linkage-names or
> -glinkage-names={default,full,none}
> might be more universally useful?
I am not sure what kind of command-line interface we want to arrive at,
but I would like to have one option for the user to specify that they
intend to perf the program and use auto-profile later.
This is relatively complicated on the clang side. The manual
https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers
suggests:

  clang++ -O2 -gline-tables-only \
    -fdebug-info-for-profiling -funique-internal-linkage-names \
    code.cc -o code

for the training run.
-gline-tables-only enables debug info without any debug info for
types (so less than our -O1). I think it may make sense to have a
similar switch, since the debug info is then much smaller. It is not
that useful for debugging then, but it is still useful for profiling by
perf etc.
-fdebug-info-for-profiling enables
discriminators, so I thought it may be beneficial to have a similar name
(which I got wrong in the email). There is no documentation of that
option, but grepping clang sources seems to show only this difference.
I think we may have two things:
 1) -gdebug-info-for-profiling, which enables the necessary bits of debug
    info. I do not know why they went for -f.
 2) -fauto-profile-friendly (or a better name), which will set up debug info
    but also disable optimizations that do not work well with auto-profile.
    One example is ICF, which leads to a completely misplaced
    profile.
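
To make the ICF problem concrete, here is a minimal sketch. The function
names are made up for illustration, and whether the two functions really
get folded depends on the compiler and options in use (for GCC, -fipa-icf):

```cpp
// Two distinct source functions whose bodies are identical.
// An ICF pass may keep a single copy of the code for both of them.
static int scale_width(int v)
{
  return v * 3 + 1;
}

static int scale_height(int v)
{
  return v * 3 + 1;
}

// Both callees are used, but in the training binary every sample taken
// in the shared body is reported under one name only. When the profile
// is applied to the unfolded functions at compile time, one of them may
// look hot and the other may look as if it was never executed.
int area_hint(int w, int h)
{
  return scale_width(w) + scale_height(h);
}
```

The program behaves the same either way; only the attribution of samples
to source-level functions is lost.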
-funique-internal-linkage-names adds the .uniq suffixes
to all internal symbols, which I would like to avoid. It breaks asm
statements and causes bloat of symbol tables. LLVM folks say they are
replacing it anyway.
I am also fine with -glinkage-names= and -gdiscriminators,
but we probably want some umbrella switch for those planning to use
auto-fdo.
I am not thrilled with the way clang handles the tooling of auto-fdo.
While solving individual problems, however, I try to look at what they do
and avoid unnecessary reinvention of the wheel. I think the suggested
command line above is way too verbose and sensitive to the actual setup.
For example, on Windows one needs to use

  clang-cl /O2 -gdwarf -gline-tables-only ^
    /clang:-fdebug-info-for-profiling /clang:-funique-internal-linkage-names ^
    code.cc /Fe:code /fuse-ld=lld /link /debug:dwarf

which looks awful.
LLVM also has a completely different way to do instrumentation using
-fpseudo-probe-for-profiling, which is an alternative to
-fdebug-info-for-profiling. Its implementation is described in this
review thread: https://reviews.llvm.org/D86193
Pseudo-probes are internally implemented as a kind of debug statement.
They are builtin calls that fake incrementing some counter, which gets
through the compilation and ends up in a separate section.
Neither -fdebug-info-for-profiling nor -fpseudo-probe-for-profiling is
documented in the clang manual. -fdebug-info-for-profiling is
As discussed, I think we can try to follow the path of improving debug
info, especially since I think most of the necessary changes are actually
useful for real debugging. In addition to a way to get unique names
for internal linkage functions, I also need to solve the problem of
multiple stack traces per instruction, which would indeed be useful if you
want to implement single-stepping over optimized-out code. Jakub
mentioned that this is being worked on on the GDB side.
Honza
>
>
> > gcc/ChangeLog:
> >
> > * dwarf2out.cc (add_linkage_name): Store linkage names
> > of private function symbols to help auto-fdo.
> >
> > diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
> > index 0bd8474bc37..95d9f219cf0 100644
> > --- a/gcc/dwarf2out.cc
> > +++ b/gcc/dwarf2out.cc
> > @@ -22235,7 +22235,10 @@ add_linkage_name (dw_die_ref die, tree decl)
> > {
> > if (debug_info_level > DINFO_LEVEL_NONE
> > && VAR_OR_FUNCTION_DECL_P (decl)
> > - && TREE_PUBLIC (decl)
> > + && (TREE_PUBLIC (decl)
> > + /* Linkage names of internal function symbols
> > + are used by auto-fdo. */
> > + || TREE_CODE (decl) == FUNCTION_DECL)
> > && !(VAR_P (decl) && DECL_REGISTER (decl))
> > && die->die_tag != DW_TAG_member)
> > add_linkage_name_raw (die, decl);