Hi, all. David Ungar and Graydon Hoare have been doing compiler performance 
investigations, and in particular are interested in the cases where 
WMO-with-Onone builds are faster than the multi-file -Onone builds. By now this 
is out there as a hackaround for build time on targets with many files because 
the compiler spends too much time doing repeated work, and whatever work is 
saved by -incremental isn't worth it. -incremental is still very conservative 
and we could probably do better there, but meanwhile there's something to this: 
could we save on work in non-WMO mode by processing multiple "primary files" in 
one frontend invocation?

(This email is a bit of a brain dump on some of the past discussions on this, 
because David's fairly new to the Apple side of the compiler project, and even 
Graydon hasn't been here for all of it. I also realized partway through writing 
it that I already said a lot of this in docs/Driver.md 
<https://github.com/apple/swift/blob/master/docs/Driver.md>, but that's more 
for outside people. docs/DependencyAnalysis.rst 
<https://github.com/apple/swift/blob/master/docs/DependencyAnalysis.rst> may 
also be relevant.)

First, some background:

WMO mode
- one frontend process
- parses, resolves imports, and type-checks all files
- generates SIL for all files in a single SILModule
- usually generates N LLVM modules, which turn into N .o files in parallel
- does not support incremental builds, even under -Onone (one reason: there's 
no guarantee of which content ends up in which .o)

(There's also a fully single-threaded WMO mode which outputs a single .o file. 
This allows slightly more LLVM optimization at the cost of, well, being 
single-threaded. For historical reasons this is the default WMO mode on the 
command line, but isn't used by Xcode or SwiftPM.)
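To make the two WMO flavors concrete, here's roughly what the invocations look 
like. (This is a sketch: file names are placeholders, and real builds pass many 
more flags.)

```shell
# Single-threaded WMO (the command-line default): one .o for the whole module.
swiftc -c -wmo -Onone A.swift B.swift C.swift -o Module.o

# Multi-threaded WMO (what Xcode and SwiftPM use): still one frontend
# process, but it emits N LLVM modules in parallel, one .o per input file.
swiftc -c -wmo -Onone -num-threads 4 A.swift B.swift C.swift
```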

Non-WMO mode
- a.k.a. "multi-file mode", "multi-frontend mode", "primary file mode", or 
"incremental mode" – there isn't really a standard name for this
- parses and resolves imports in all files, type-checks just the "primary file" 
(specified on command line)
- …but type-checking one file can sometimes require partially type-checking 
others (e.g. because you inherit from a class in another file). You pretty much 
always get to skip function bodies, though.
- generates SIL for the primary file in a SILModule
- generates 1 LLVM module <http://llvm.org/docs/LangRef.html#module-structure>, 
which results in one .o file
- very conservative incremental builds (as in, "err on the side of rebuilding")
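For comparison, the driver expands a non-WMO build into one frontend job per 
primary file. The sketch below is simplified (real invocations carry many more 
flags), but it shows how every job re-reads all of the other files:

```shell
# "swiftc -c -Onone A.swift B.swift" becomes, roughly:
swift -frontend -c -primary-file A.swift B.swift -module-name M -o A.o
swift -frontend -c -primary-file B.swift A.swift -module-name M -o B.o
```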


So of course there's repeated work in the non-WMO mode. We don't usually 
consider parsing all files expensive, since it's a relatively small part of the 
total compile time (though I can't remember whether we're smart enough to delay 
parsing of function bodies in non-primary files). However, it turns out even 
the cost of opening the files can matter. Beyond that, there are a number of 
other pieces of potentially repeated work:

- anything brought in through the Clang importer
- anything deserialized from a Swift module (usually the standard library)
- searching for modules on disk (both Clang modules and Swift modules)
- the "partial type-checking" I alluded to above
- synthesizing helper functions for things imported from Clang (for example, 
memberwise initializers for structs)

Today David suggested the idea of a "multiple primary file" mode to speed up 
compiles (prompting this email), and it isn't a bad idea. However, we definitely 
want to be sure we don't break incremental builds. That means a few 
restrictions:

- We still want to generate one object file per source file.
- This object file has to be self-contained, in case we don't rebuild it next 
time, or do rebuild something it depends on.
- The easiest way to guarantee one self-contained object file is to have one 
LLVM module.
- But we probably also want one SIL module per source file, since SILModule has 
a field that controls what scope we can assume perfect information for, and it 
wouldn't be self-contained to have that cross source file boundaries. (I don't 
trust us to only use this during optimization, either.)
- The Swift AST and ASTContext are not thread-safe, so we can't process 
multiple SILModules in different threads today.
- When we do synthesize helper functions that aren't associated with a source 
file, we probably need to put them in every object file.

So, the least intrusive approach here that still supports incremental builds 
would be to type-check multiple files, but then serially process a separate 
SILModule for each "primary file". (We might want to call them something more 
like "active files".) This is (probably) what David was getting at.
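As a hypothetical sketch (this mode doesn't exist today, and the repeated 
-primary-file/-o syntax here is invented for illustration), such a job might 
look like:

```shell
# One frontend process: parse and type-check once, then serially build a
# separate SILModule and LLVM module for each primary, so we still get one
# self-contained .o per source file.
swift -frontend -c -primary-file A.swift -primary-file B.swift C.swift \
  -module-name M -o A.o -o B.o
```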

However…

…there's something more interesting.

Pipelining

Currently, the way we incrementally build a project is by passing all input 
files to the Swift driver, which decides which files need to be rebuilt and 
then spawns a bunch of "frontend jobs", one per primary file that needs to be 
rebuilt. (It's actually a little worse than this, because we may discover that 
a file needs to be rebuilt 
<https://github.com/apple/swift/blob/master/docs/Driver.md#incremental-builds> 
only after doing some of the work.) The "least intrusive" approach I described 
above would partition up the files we know we need to rebuild ahead of time, 
but it would still end up spawning new jobs for those "late-discovered" files. 
It could also end up accidentally putting five long files on one job and five 
short files on another, which isn't a great use of resources.

An alternate approach would be what Daniel Dunbar has called pipelining: when a 
particular frontend job is done, the parent process should be able to send it 
another file to work on, starting from the type-checking phase.

There are some downsides to pipelining:
- Even less deterministic than a normal parallel build (if "less deterministic" 
can be considered a thing)
- Harder to test – we'd have to build the "multiple primary file" mode anyway 
just to be able to reproduce a given sequence of files deterministically.
- Unbounded growth – we'd probably want to start from scratch at some point if 
memory usage exceeded some threshold.

But overall I think this fits our model better, and if we can figure out a good 
way to communicate between the frontend and driver this is the way to go.

(Since David recently pushed through a nice refactoring of 
CompilerInstance::performSema, we know there's an additional complication 
around targets that have a main.swift—today we type-check that file piecemeal 
instead of parsing it all at once. We'd probably either exclude main.swift from 
this optimization, or figure out whether we can drop the weird special-case 
type-checking.)

Faster Optimized Builds?

Also important! But since we expect most optimized builds these days are WMO 
builds, this sort of approach is less important than it might otherwise be.

Incremental Optimized Builds?

Unlikely for WMO, although there's already a small optimization where if a 
particular LLVM module's contents are exactly the same as the last build, we 
won't bother re-optimizing at the LLVM level. For non-WMO, naively extending 
the current model would require that we were sure that the current dependency 
tracking mechanism included all information that the optimizer uses, which I'm 
not confident about. However, we've talked (idly) about doing incremental 
processing up to the "mandatory optimizations" in SIL, i.e. up until we've 
produced canonical SIL and emitted flow-sensitive diagnostics. We would then 
write this to disk (we call these "SIB" files, for "Swift Intermediate Binary 
file") and then do a non-incremental stage turning this into object files. We 
could even do cross-file WMO-like optimization this way, either by using a 
single process that merged in all the SIB information, or by loading all N 
files into multiple processes, each responsible for one or more object files.
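A sketch of what that two-stage pipeline might look like (hypothetical: 
-emit-sib exists as a frontend option today, but this flow as a whole is not a 
supported build mode, and the exact command lines are invented):

```shell
# Stage 1 (incremental): parse, type-check, run the mandatory passes, and
# write canonical SIL to disk, one SIB file per source file.
swift -frontend -emit-sib -primary-file A.swift B.swift -module-name M -o A.sib
swift -frontend -emit-sib -primary-file B.swift A.swift -module-name M -o B.sib

# Stage 2 (non-incremental): load all the SIB, optimize across files, and
# emit object code.
swift -frontend -c A.sib B.sib -module-name M -O -o M.o
```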

But of course, in most optimized builds, type-checking and SILGen are at most a 
third of the build time, usually less. So it's unclear whether this 
would be a net win given all the extra overhead.


That's probably all the relevant information I have right now, but it seems 
like quite enough.

Jordan
_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev
