On 8/21/13 7:47 PM, Corey Richardson wrote:
IO, and especially disk IO, accounts for almost none of the compilation time. All files in a crate are read at once, then compilation happens.
I don't believe this is true, as disk IO from metadata reading hurts.
- Metadata is large. It is multiple megabytes of data for libstd when uncompressed (compressed, it is currently 749K). I'm not sure whether we are encoding too much data or if it's exactly what we need, but this is a very large constant that every inefficiency gets scaled by.
We could probably do a bit better here, but we do have to serialize ASTs for generics. libstd is already 2.3MB of Rust code, so I would expect the serialized ASTs from the generics to be on that order.
- Metadata is stored as EBML, always. I'm sure EBML is fine most of the time, but it really hurts us here. Part of the problem is that it is a big endian format. The byte swapping shows up very high (top 5) in a profile.
If I had to do it all over again I'd use atom trees instead of EBML, but I doubt that getting rid of vuints will help that much. I rewrote that routine in optimized asm once and it didn't help. The reason vuint_at shows up so high in the profile is mostly because, algorithmically, we read metadata too much, and reading integers is most of what reading metadata is.
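As a rough illustration of why integer reads dominate, here is a minimal sketch of decoding one EBML-style variable-length unsigned integer. It assumes the usual EBML layout (the leading zero bits of the first byte give the number of extra bytes, and the payload is big endian) and only handles 1 to 4 byte encodings. The function below is a standalone example written for this note, not the compiler's actual routine.

```rust
/// Decode one EBML-style variable-length unsigned integer starting at `pos`.
/// Returns the value and the position just past it, or `None` if the data is
/// truncated or uses an encoding longer than this sketch supports.
/// Assumption: the standard EBML length-marker layout, limited to 4 bytes.
fn vuint_at(data: &[u8], pos: usize) -> Option<(u32, usize)> {
    let first = *data.get(pos)?;
    // The count of leading zero bits in the first byte is the number of
    // extra bytes that follow it.
    let extra = first.leading_zeros() as usize;
    if extra > 3 {
        return None;
    }
    // Drop the length-marker bit, keeping only the payload bits.
    let mut value = (first & (0x7f >> extra)) as u32;
    // Remaining bytes are appended most significant first (big endian).
    for i in 1..=extra {
        let byte = *data.get(pos + i)?;
        value = (value << 8) | byte as u32;
    }
    Some((value, pos + extra + 1))
}
```

Each call is cheap on its own; the cost comes from how many times it runs when the same metadata is decoded again on every query.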
Additionally, *every time we query the stored metadata for something, we decode the EBML*. This turns EBML into a multiplicative slowdown, rather than the merely additive one we would pay if we decoded the entire structure up front and used the native Rust representation.
If we decoded the entire structure up front and used native Rust we would lose the index and would suffer a slowdown. I believe Niko tried this and saw massive performance losses.
- Metadata is stored in a single blob. To access any part of metadata, we need to load it all, and there is no index. If we could only load the metadata we need, we'd reduce memory usage and do less wasted work.
This is untrue; there is an index. It's just that not every part of the compiler uses it yet. I have a patch that I will try to land tomorrow that converts resolve to be lazy and consult the index and reduces its time on hello world by 10x. Method tables in coherence will require a bit more work to be lazy, but could also reduce its time.
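The general shape of an index-driven lazy lookup can be sketched as follows. This is only an illustration under stated assumptions: the `Metadata` type, its fields, and the use of plain strings as "decoded items" are all hypothetical and are not the compiler's data structures. It shows a query consulting the index, decoding only the entry it needs, and keeping the result so repeated queries do not decode again.

```rust
use std::collections::HashMap;

/// Hypothetical metadata blob with an index from item id to (offset, length).
struct Metadata {
    index: HashMap<u32, (usize, usize)>,
    blob: Vec<u8>,
    cache: HashMap<u32, String>, // entries decoded so far
    decodes: usize,              // how many decodes were actually performed
}

impl Metadata {
    /// Build a blob and its index from (id, payload) pairs.
    fn new(items: &[(u32, &str)]) -> Metadata {
        let mut index = HashMap::new();
        let mut blob = Vec::new();
        for &(id, text) in items {
            index.insert(id, (blob.len(), text.len()));
            blob.extend_from_slice(text.as_bytes());
        }
        Metadata { index, blob, cache: HashMap::new(), decodes: 0 }
    }

    /// Look up one item: consult the index, decode only that entry,
    /// and reuse the decoded value on later queries.
    fn lookup(&mut self, id: u32) -> Option<&str> {
        if !self.cache.contains_key(&id) {
            let &(off, len) = self.index.get(&id)?;
            let text = String::from_utf8(self.blob[off..off + len].to_vec()).ok()?;
            self.decodes += 1;
            self.cache.insert(id, text);
        }
        self.cache.get(&id).map(|s| s.as_str())
    }
}
```

With this shape, a pass that touches a handful of items pays for a handful of decodes, rather than for the whole blob or for one decode per query.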
Patrick
