On Jun 3, 2008, at 9:45 AM, Diego Novillo wrote:
We've started working on the driver and WPA components for whopr.
These are some of our initial thoughts and implementation strategy.  I
have linked these to the WHOPR page as well.  I'm hoping we can
discuss these at the Summit BoF, so I'm posting them now to start the
discussion.

This is a very interesting design, and a very nice evolution from the previous proposal. I'm not completely clear on the difference between LTO and whopr here. Is LTO the mode that "normal people" will use, and whopr the mode that "people with huge clusters" will use? Will LTO/whopr support useful optimization on common multicore machines?

Some thoughts:

== Repackaging ==
Under this proposal, WPA repackages its input files.  Each output file
consists of the contents of a primary input file plus additional
DECL's and functions required for inlining.  ELF data is output
directly so that functions don't need to be deserialized.  LTRANS
reads each output file without reference to other files.  Initially,
only inlining will be supported.  Because inlining decisions can also
be made at the LTRANS phase, IPA serialization may be deferred to
phase 2.  This is roughly equivalent to the
many-to-1/many-to-many/1-to-many mappings approach described in the
WHOPR design document.
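
The repackaging step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct func`, `repackage`, and `MAX_FUNCS` are hypothetical names that do not come from the proposal, and following inlining candidates transitively is an assumption about what "required for inlining" means.

```c
#include <stddef.h>

#define MAX_FUNCS 64   /* hypothetical bound, keeps the sketch allocation-free */

/* Hypothetical per-function summary as WPA might see it. */
struct func {
    int home_file;     /* index of the input file that defines the function */
    int n_callees;     /* number of valid entries in callees[] */
    int callees[4];    /* indices of functions that may be inlined into it */
};

/* Fill needed[0..n-1] with 1 for every function that the output file for
   primary input `file` must carry: the functions defined in `file`, plus
   (assumption) every inlining candidate reachable from them.
   Returns the number of functions selected, or 0 if n exceeds MAX_FUNCS. */
size_t repackage(const struct func *funcs, size_t n, int file,
                 unsigned char *needed)
{
    size_t stack[MAX_FUNCS];
    size_t top = 0, count = 0;

    if (n > MAX_FUNCS)
        return 0;

    /* Seed the worklist with the functions of the primary input file. */
    for (size_t i = 0; i < n; i++) {
        needed[i] = (funcs[i].home_file == file);
        if (needed[i])
            stack[top++] = i;
    }

    /* Pull in inlining candidates; each function is pushed at most once. */
    while (top > 0) {
        size_t f = stack[--top];
        count++;
        for (int k = 0; k < funcs[f].n_callees; k++) {
            size_t c = (size_t)funcs[f].callees[k];
            if (!needed[c]) {
                needed[c] = 1;
                stack[top++] = c;
            }
        }
    }
    return count;
}
```

Calling `repackage` once per primary input file yields one self-contained set per output file, which is what lets LTRANS read each file without reference to the others.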

Are you focusing on inlining here as a specific example, or is this the only planned IPA optimization that can use summaries? It seems unfortunate to design a system where inlining is the only real IPO transformation you can do. Does adding new interprocedural optimizations require adding whole new phases?

= WHOPR Driver Design =

This document proposes a driver design for WHOPR based on
the linker.  Although this document focuses on gold, a similar
approach can also be implemented in GNU ld.

I'm glad you guys finally came around to this design; it is far more sane.

== Design Philosophy ==
* The implementation provides complete transparency. Developers
should be able to take advantage of LTO without having to modify
existing build systems and/or Makefiles; all that's needed is to add
an LTO option (-flto).

Ok. How do you handle merging of optimization info? If I build one .o file with -Os and one with -O3, who wins, and what does that mean? If I build one with -ffast-math and one without, does the right thing happen?

Also, where does debug info (i.e. DWARF for -g) get stored? I'm not talking about people debugging the compiler, I'm talking about people who want to build an executable with debug info.

* Transparency is achieved through tight integration with the linker.
Ideally, the linker communicates with LTO via a shared library
(plugin), eliminating any dependencies between the source bases of
linker and LTO, but other callback methods are also possible.

Excellent. Is the FSF/RMS ok with the linker having plugins? It seems strange (but great!) to allow plugins for the linker but not the compiler.

=== Why in the Linker? ===
As of this writing, the pre-ld driver collect2 performs the LTO file
identification. However, this is sub-optimal. The benefits of driving
LTO from the linker are:

* The linker performs full symbol resolution. Therefore, it will only
bring in objects that are necessary.  This can greatly reduce build
and library extraction times.

Yes, as we discussed before, doing this outside the linker is tantamount to re-implementing system linkers and all their associated craziness. Driving it from the linker is a much better approach.

The linker performs regular symbol resolution. For each object file it
touches, it calls a specific function in the plugin (int
ldplugin_claim_file(const char *fname, size_t offset)). This
function returns 1 if it intends to claim a file (e.g. it contains
IR), and 0 if it doesn't.   The offset is used in the case of an
archive file. This way the plugin doesn't need to understand archives.
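
A minimal sketch of what the plugin side of this hook might look like is below. Only the signature and the 1/0 return convention come from the proposal. The `IR_MAGIC` marker and the idea of recognizing IR by its leading bytes are assumptions made for illustration; the proposal does not say how a plugin detects IR.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical marker assumed to sit at the start of an IR-bearing object.
   The proposal does not define how IR is recognized. */
static const char IR_MAGIC[] = "LTOIR001";

/* Signature as given in the proposal.  Returns 1 to claim the file and 0
   to leave it to the linker.  `offset` is where the object starts inside
   `fname`: 0 for a plain object, the member offset for an archive. */
int ldplugin_claim_file(const char *fname, size_t offset)
{
    char buf[sizeof IR_MAGIC - 1];   /* magic length without the NUL */
    int claimed = 0;
    FILE *fp = fopen(fname, "rb");

    if (fp == NULL)
        return 0;

    /* Seek straight to the member, so no archive parsing is needed here. */
    if (fseek(fp, (long)offset, SEEK_SET) == 0
        && fread(buf, 1, sizeof buf, fp) == sizeof buf
        && memcmp(buf, IR_MAGIC, sizeof buf) == 0)
        claimed = 1;

    fclose(fp);
    return claimed;
}
```

Because the linker supplies the offset, the same check works for standalone objects and archive members alike, which is the point of passing the offset instead of teaching the plugin about archives.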

Is there a specific reason you don't use the LLVM LTO interface? It seems to be roughly the same as your proposed interface:

a) it has a simple C interface like your proposed one
b) it is already implemented in one system linker (Apple's), so GCC would just provide its own linker plugin and it would work on Apple platforms
c) it is richer than your interface
d) it is battle tested, and exists today
e) it is completely independent of llvm (by design)
f) it is fully documented: http://llvm.org/docs/LinkTimeOptimization.html

Is there something specific you don't like about the LLVM interface?

-Chris
