This patch series adds a new flag, "-fparallel-jobs=", to control whether the compiler should try to compile the current file in parallel.
Three modes are supported for now:

1. -fparallel-jobs=<N>: Try to compile the file using a maximum of N jobs.

2. -fparallel-jobs=jobserver: Check if there is a running GNU Make Jobserver. If there is, communicate with it in order to launch jobs. If the jobserver is not found, alert the user, since using it requires modifications to the project Makefile.

3. -fparallel-jobs=auto: Same as 2., but quietly fall back to a maximum of 2 jobs if the jobserver is not found.

The parallelization works by using a modified LTO engine in which no IR is dumped to disk, and a new partitioner is employed to find symbols that must be partitioned together.

The parallelism feature is implemented as follows:

1. The driver passes a hidden -fsplit-outputs=<filename> to cc1*.

2. After IPA, cc1* searches for symbols that must be partitioned together. If the user allows GCC to automatically promote symbols to globals through "--param=promote-statics=1" for better parallel compilation performance, that is also done. However, if cc1* decides that partitioning is a bad idea, it continues with the default serial compilation, and the additional <filename> is not created. It avoids compiling in parallel if and only if one of the following holds:
   * The file size does not exceed the minimum partition size specified by the LTO default --param=lto-min-partition.
   * The partitioner is unable to find any point of partitioning in the file.

3. cc1* forks itself, one fork per partition. Each child process applies the partition mask generated for it by the partitioner and writes the name of its new assembler file to the <filename> specified by the driver.

4. The driver opens each file and partially links them together into a single .o file if -c was requested, or into a binary otherwise.

-S and -E are unsupported for now and will probably remain so.

Speedups ranged from 0.95x to 1.9x on a Quad-Core Intel Core-i7 8565U when testing with two files in GCC, as shown in the table below.
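Step 3 above (one fork per partition, each child reporting the assembler file it produced) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patches: fork_partitions, compile_partition and count_lines are hypothetical names, the children do no real compilation, and the idea that every child appends one file name to a shared list file is an assumption made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for compiling one partition.  The child only
   appends the name of the assembler file it would have produced to the
   list file (assumed role of the -fsplit-outputs=<filename> file).  */
static int
compile_partition (int part, const char *list_path)
{
  FILE *list = fopen (list_path, "a");
  if (!list)
    return 1;
  fprintf (list, "partition%d.s\n", part);
  return fclose (list) != 0;
}

/* Fork one child per partition and wait for all of them.  Returns the
   number of children that exited successfully.  */
int
fork_partitions (int n_partitions, const char *list_path)
{
  assert (n_partitions > 0);

  int launched = 0;
  for (int part = 0; part < n_partitions; part++)
    {
      pid_t pid = fork ();
      if (pid < 0)
        break;                  /* Could not fork; stop launching.  */
      if (pid == 0)
        _exit (compile_partition (part, list_path));
      launched++;
    }

  int ok = 0;
  for (int i = 0; i < launched; i++)
    {
      int status;
      if (wait (&status) > 0 && WIFEXITED (status)
          && WEXITSTATUS (status) == 0)
        ok++;
    }
  return ok;
}

/* Count the entries (lines) in the list file, as a driver-like caller
   might do before assembling and linking.  Returns -1 on error.  */
int
count_lines (const char *path)
{
  FILE *f = fopen (path, "r");
  if (!f)
    return -1;
  int lines = 0;
  for (int c; (c = fgetc (f)) != EOF;)
    if (c == '\n')
      lines++;
  fclose (f);
  return lines;
}
```

A caller playing the driver's role would call fork_partitions (n, path) and then read path to learn which assembler files to assemble and partially link. The order of the entries is not deterministic, since the children run concurrently.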
Each timing is the result of a single execution, preceded by a warm-up execution. The GCC under test was built with checking enabled, so a release build might show better timings in both sequential and parallel modes, although the speedup may remain the same.

|                |            | Without Static | With Static | Max     |
| File           | Sequential | Promotion      | Promotion   | Speedup |
|----------------|------------|----------------|-------------|---------|
| gimple-match.c | 60s        | 63s            | 34s         | 1.7x    |
| insn-emit.c    | 37s        | 19s            | 20s         | 1.9x    |

Note that there is a slowdown in some cases when the feature is enabled, which is why it is enabled through a flag for now.

Bootstrapped and regtested on Linux x86_64.

Giuliano Belinassi (6):
  Modify gcc driver for parallel compilation
  Implement a new partitioner for parallel compilation
  Implement fork-based parallelism engine
  Add `+' for Jobserver Integration
  Add invoke documentation
  New tests for parallel compilation feature

 gcc/Makefile.in                          |    6 +-
 gcc/cgraph.c                             |   16 +
 gcc/cgraph.h                             |   13 +
 gcc/cgraphunit.c                         |  198 ++-
 gcc/common.opt                           |    4 +
 gcc/doc/invoke.texi                      |   32 +-
 gcc/gcc.c                                | 1219 +++++++++++++----
 gcc/ipa-fnsummary.c                      |    2 +-
 gcc/ipa-icf.c                            |    3 +-
 gcc/ipa-visibility.c                     |    3 +-
 gcc/ipa.c                                |    4 +-
 gcc/jobserver.cc                         |  168 +++
 gcc/jobserver.h                          |   33 +
 gcc/lto-cgraph.c                         |  172 +++
 gcc/{lto => }/lto-partition.c            |  463 ++++++-
 gcc/{lto => }/lto-partition.h            |    4 +-
 gcc/lto-streamer.h                       |    4 +
 gcc/lto/Make-lang.in                     |    4 +-
 gcc/lto/lto.c                            |    2 +-
 gcc/params.opt                           |    8 +
 gcc/symtab.c                             |   46 +-
 gcc/testsuite/driver/a.c                 |    6 +
 gcc/testsuite/driver/b.c                 |    6 +
 gcc/testsuite/driver/driver.exp          |   80 ++
 gcc/testsuite/driver/empty.c             |    0
 gcc/testsuite/driver/foo.c               |    7 +
 .../gcc.dg/parallel-early-constant.c     |   22 +
 gcc/testsuite/gcc.dg/parallel-static-1.c |   21 +
 gcc/testsuite/gcc.dg/parallel-static-2.c |   21 +
 .../gcc.dg/parallel-static-clash-1.c     |   23 +
 .../gcc.dg/parallel-static-clash-aux.c   |   14 +
 gcc/toplev.c                             |   58 +-
 gcc/toplev.h                             |    3 +
 gcc/tree.c                               |   23 +-
 gcc/varasm.c                             |   26 +-
 intl/Makefile.in                         |    2 +-
 libbacktrace/Makefile.in                 |    2 +-
 libcpp/Makefile.in                       |    2 +-
 libdecnumber/Makefile.in                 |    2 +-
 libiberty/Makefile.in                    |  212 +--
 zlib/Makefile.in                         |   64 +-
 41 files changed, 2539 insertions(+), 459 deletions(-)
 create mode 100644 gcc/jobserver.cc
 create mode 100644 gcc/jobserver.h
 rename gcc/{lto => }/lto-partition.c (78%)
 rename gcc/{lto => }/lto-partition.h (89%)
 create mode 100644 gcc/testsuite/driver/a.c
 create mode 100644 gcc/testsuite/driver/b.c
 create mode 100644 gcc/testsuite/driver/driver.exp
 create mode 100644 gcc/testsuite/driver/empty.c
 create mode 100644 gcc/testsuite/driver/foo.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-early-constant.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-1.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-2.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-clash-1.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-clash-aux.c

-- 
2.28.0