I have a proposal for improving V8 build times, which I think are a big issue for many who want to contribute to V8. For example, if I make a change to src/objects/objects.h it takes 18 minutes to recompile V8 on a laptop. This is with gn, ninja, and ccache, and it covers building all of V8 including d8, cctest, and unittests.
The only solution I know of is to compile .cc files in batches (see below for why this helps). I have some changes to the gni files that add a compilation flag, v8_enable_cluster_build, so that this happens automatically. It's giving me a 3x improvement in compile times on both big desktops and smaller laptops.

There are some .cc files that have global name clashes with each other. I have a set of CLs (linked off the bug at https://issues.chromium.org/issues/483903200) that fix these name clashes, for example https://chromium-review.googlesource.com/c/v8/v8/+/7562658.

An alternative approach, which is in https://chromium-review.googlesource.com/c/v8/v8/+/7585474, is to have exclusion lists of such problematic .cc files and just build those files in the ordinary way. This reduces the source changes to a minimum. Even with this approach I'm getting close to a 3x speedup on my workstation, but I would personally prefer to fix the .cc files.

I don't envision having a CI bot for the cluster build. Those of us who benefit from it would maintain it by either updating the exclusion lists or fixing name clashes in the .cc files. As such it would not be much of a burden for Google if they choose not to use it.

With the change we call out to Python from the gni files (at gn gen time). This only happens if the v8_enable_cluster_build flag is activated, which I don't expect it to be by default. It costs a few hundred milliseconds at gn gen time, but this is well worth it since it only affects those using the cluster build option.

Why compiling .cc files together works: the root of the problem is that a large number of .h files get pulled into each .cc file. This means, for example, that each auto-generated .cc file that is made from a .tq file takes 20 seconds of CPU to compile, even on a fast modern core. This is true even if the .cc file is less than 100 lines long. (Clang is single-threaded.)
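To make the batching mechanism concrete, here is a minimal Python sketch of the kind of grouping a gn-gen-time script could do. The function name, the batch size, the exclusion-list handling, and the generated file names are all my assumptions for illustration, not the actual code in the CLs:

```python
def make_clusters(sources, excluded, batch_size=20):
    """Group .cc files into generated cluster files.

    Files on the exclusion list fall through and are compiled in
    the ordinary way (one translation unit each).  All names here
    are hypothetical; the real CLs define the actual mechanism.
    """
    clustered = [s for s in sources if s not in excluded]
    passthrough = [s for s in sources if s in excluded]
    clusters = []
    for i in range(0, len(clustered), batch_size):
        batch = clustered[i:i + batch_size]
        # Each cluster is a generated .cc that textually includes
        # its batch, so the shared headers are parsed once per
        # batch instead of once per source file.
        body = "".join('#include "%s"\n' % s for s in batch)
        clusters.append(("cluster_%d.cc" % (i // batch_size), body))
    return clusters, passthrough
```

The point of the grouping is that the cost of parsing the common headers is paid once per cluster rather than once per .cc file, which is where a roughly batch-size-proportional speedup (capped by link time and the excluded files) comes from.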
Over the years the number of .cc files has increased dramatically. For example, the regexp engine used to be a handful of arch-independent .cc files, plus one or two arch-dependent files. There are now 24 arch-independent .cc files for the regexp engine.

I hear there have been some attempts to improve the situation with .h files, so that a smaller number of them get pulled into a given .cc compilation. This is complicated by the fact that some optimization decisions are based on the compiler seeing a .h (or -inl.h) file that may not be necessary for correctness. So reducing the number of .h files in a compilation causes performance regressions that cannot be entirely fixed with PGO. I'm still in favour of fixes to the .h files to improve compile times, but given that people have been trying to do this for some years, I don't think that should be a blocker for a different approach that actually works now.

Let me know what you think.

-- Erik Corry

--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev
