On Wed, Aug 21, 2013 at 8:53 PM, Bill Myers <[email protected]> wrote:
> Have you considered the following "non-specific" quick fixes?
>
> 1. Build on a ramfs/ramdisk
>
IO, and especially disk IO, accounts for almost none of the compilation time. All files in a crate are read at once, then compilation happens.

> 2. Distribute compilations and tests across a cluster of machines (like
> distcc)
>

Compilation is 99% serial (the only things that happen in parallel are rustpkg and rustdoc etc. at the end, and they are almost nothing), though tests could be distributed (and Graydon is working on doing that, afaik).

> 3. If non-parallelizable code is still the bottleneck, use the fastest CPU
> possible (i.e. an overclocked Core i7 4770K, overclocked Core i7
> 4960X/3960X/4930K/3930K, dual Xeon E5 2687W or quad Xeon E5 4650 depending
> on whether you need 4, 6, 16 or 32 cores)
>
> 4. Read metadata only once, in one of these ways:
> 4a. Pass all files to a single compiler invocation (per machine or core)

This already happens: crates are compiled all at once, unlike C/C++'s per-file-then-link compilation model.

> 4b. Have a long-lived rustc "daemon" (per machine or core) that keeps crate
> metadata in memory and gets passed files to compile by fd

This wouldn't be much of a quick fix: doing it would need better-structured metadata that doesn't suffer from the same problems the current metadata does (though this could still be an optimization later).

> 4c. Use CRIU suspend/restore (or the unexec from Emacs or whatever) to
> suspend a rustc process after metadata is read and restore that image for
> each file instead of spawning a new one

This is an interesting idea; pursuing it might be warranted.

> 4d. Allocate metadata using a special allocator that allocates it from a
> block at a fixed memory address, then just dump the block into a file, and
> read metadata with a single mmap system call at that same fixed address
> (this is a security hole in general, so it needs to be optional and off by
> default)

Also an interesting idea, though a fair bit of work. Brian is working on not compressing metadata, which would be a win.

Here is my current understanding of the problems with metadata (from scouring some profiles and the code ~3 months ago):

- Metadata is large. It is multiple megabytes uncompressed (749K compressed, as of now) for libstd. I'm not sure whether we are encoding too much data or exactly what we need, but this is a very large constant that every inefficiency gets scaled by.

- Metadata is stored as EBML, always. I'm sure EBML is fine most of the time, but it really hurts us here. Part of the problem is that it is a big-endian format: the byte swapping shows up very high (top 5) in a profile. Additionally, *every time we query the stored metadata for something, we decode the EBML*. This makes EBML a multiplicative slowdown, rather than the additive one it would be if we decoded the entire structure up front and used the native Rust representation (see the sketch at the end of this mail).

- Metadata is stored in a single blob. To access any part of it, we need to load all of it, and there is no index. If we could load only the metadata we need, we'd reduce memory usage and do less wasted work.

- Metadata is scary. It's a hairy part of the codebase that I sure don't understand. I know its purpose, more or less, but not the specifics of how or why things are encoded. Michael Sullivan could speak more to this; he is the last one to have touched it. The compiler can't help you when you make a mistake, either.

I think the solution requires a much more systemic change than your proposed quick fixes.
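To make the multiplicative-vs-additive point concrete, here is a minimal sketch. The record format and names (encode, query_by_redecoding, decode_once) are made-up stand-ins for illustration, not the actual rustc metadata code, and it's written in present-day Rust:

    use std::collections::HashMap;

    // Pretend "metadata": a sequence of (4-byte big-endian id,
    // 4-byte big-endian value) records, standing in for an
    // EBML-ish blob.
    fn encode(items: &[(u32, u32)]) -> Vec<u8> {
        let mut blob = Vec::new();
        for &(id, val) in items {
            blob.extend_from_slice(&id.to_be_bytes());
            blob.extend_from_slice(&val.to_be_bytes());
        }
        blob
    }

    // The multiplicative pattern: re-walk (and byte-swap) the blob
    // on *every* query, so decoding cost is paid once per lookup.
    fn query_by_redecoding(blob: &[u8], wanted: u32) -> Option<u32> {
        for rec in blob.chunks_exact(8) {
            let id = u32::from_be_bytes(rec[0..4].try_into().unwrap());
            if id == wanted {
                return Some(u32::from_be_bytes(rec[4..8].try_into().unwrap()));
            }
        }
        None
    }

    // The additive alternative: decode the whole blob once into a
    // native representation, then answer queries from that.
    fn decode_once(blob: &[u8]) -> HashMap<u32, u32> {
        blob.chunks_exact(8)
            .map(|rec| {
                (
                    u32::from_be_bytes(rec[0..4].try_into().unwrap()),
                    u32::from_be_bytes(rec[4..8].try_into().unwrap()),
                )
            })
            .collect()
    }

    fn main() {
        let blob = encode(&[(1, 10), (2, 20), (3, 30)]);

        // n queries => n full decodes: O(n * blob size).
        assert_eq!(query_by_redecoding(&blob, 2), Some(20));

        // One decode, then cheap lookups: O(blob size + n).
        let table = decode_once(&blob);
        assert_eq!(table.get(&2), Some(&20));
    }

An index (say, an offset table at the front of the blob) would go further still, letting us decode only the records a query actually touches, which would address the single-blob problem as well.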
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev
