This patch series adds a new flag "-fparallel-jobs=" to control whether the
compiler should try to compile the current file in parallel.

Three modes are currently supported:

1. -fparallel-jobs=<N>: Try to compile the file using a maximum of N
jobs.

2. -fparallel-jobs=jobserver: Check if there is a running GNU Make
Jobserver. If there is, communicate with it in order to launch jobs;
otherwise, alert the user that the jobserver was not found, since using
it requires modifications in the project Makefile.

3. -fparallel-jobs=auto: Same as 2., but quietly fall back to a maximum
of 2 jobs if the jobserver was not found.
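As a rough illustration of the three modes, the sketch below classifies the text after "-fparallel-jobs=" and looks for the descriptors that GNU Make advertises in MAKEFLAGS. It is not taken from the patch: every identifier (pj_mode, pj_parse, pj_find_jobserver) is hypothetical, and the upper bound of 1024 jobs is an arbitrary assumption. The "--jobserver-auth=R,W" and older "--jobserver-fds=R,W" spellings are the ones GNU Make itself uses; the newer "fifo:" form is not handled here and is simply reported as "not found".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; none of these identifiers come from the patch.  */
enum pj_mode { PJ_INVALID, PJ_FIXED, PJ_JOBSERVER, PJ_AUTO };

struct pj_request
{
  enum pj_mode mode;
  int max_jobs;   /* Used by PJ_FIXED, and as the fallback for PJ_AUTO.  */
};

/* Classify the text that follows "-fparallel-jobs=".  */
struct pj_request
pj_parse (const char *arg)
{
  struct pj_request req = { PJ_INVALID, 0 };
  char *end;
  long n;

  assert (arg != NULL);
  if (strcmp (arg, "jobserver") == 0)
    req.mode = PJ_JOBSERVER;
  else if (strcmp (arg, "auto") == 0)
    {
      req.mode = PJ_AUTO;
      req.max_jobs = 2;   /* Quiet fallback described for mode 3.  */
    }
  else
    {
      n = strtol (arg, &end, 10);
      /* The 1024 limit is an assumption made only for this sketch.  */
      if (end != arg && *end == '\0' && n > 0 && n <= 1024)
        {
          req.mode = PJ_FIXED;
          req.max_jobs = (int) n;
        }
    }
  return req;
}

/* Look for the jobserver descriptors in a MAKEFLAGS string.
   Returns 1 and fills RFD/WFD on success, 0 if no jobserver is found.  */
int
pj_find_jobserver (const char *makeflags, int *rfd, int *wfd)
{
  const char *p;

  if (makeflags == NULL)
    return 0;
  p = strstr (makeflags, "--jobserver-auth=");
  if (p != NULL)
    p += strlen ("--jobserver-auth=");
  else
    {
      p = strstr (makeflags, "--jobserver-fds=");   /* Older GNU Make.  */
      if (p == NULL)
        return 0;
      p += strlen ("--jobserver-fds=");
    }
  return sscanf (p, "%d,%d", rfd, wfd) == 2;
}
```

A caller would typically pass getenv ("MAKEFLAGS") to pj_find_jobserver; in "jobserver" mode a zero result leads to the warning, while in "auto" mode it leads to the two-job fallback.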

The parallelization works by using a modified LTO engine in which no IR
is dumped to disk, and a new partitioner is employed to find symbols
which must be partitioned together.
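One way to picture "symbols which must be partitioned together" is as connected groups: whenever two symbols cannot be separated, their groups are merged. The sketch below uses a disjoint-set (union-find) structure for that. It is only an illustration under that assumption, not the partitioner from this series; symbols are stood in for by small integers, and all names and the MAX_SYMS bound are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYMS 64   /* Arbitrary bound for the sketch.  */

/* Each symbol index points to a parent; a root is its own parent.  */
struct sym_groups
{
  int parent[MAX_SYMS];
  int count;
};

/* Start with every symbol in a group of its own.  */
void
groups_init (struct sym_groups *g, int count)
{
  int i;

  assert (g != NULL && count > 0 && count <= MAX_SYMS);
  g->count = count;
  for (i = 0; i < count; i++)
    g->parent[i] = i;
}

/* Return the representative of SYM's group.  */
int
groups_find (struct sym_groups *g, int sym)
{
  while (g->parent[sym] != sym)
    {
      g->parent[sym] = g->parent[g->parent[sym]];   /* Path halving.  */
      sym = g->parent[sym];
    }
  return sym;
}

/* Record that symbols A and B must end up in the same partition.  */
void
groups_union (struct sym_groups *g, int a, int b)
{
  int ra = groups_find (g, a);
  int rb = groups_find (g, b);

  if (ra != rb)
    g->parent[rb] = ra;
}

/* Count the distinct groups, an upper bound on usable partitions.  */
int
groups_count (struct sym_groups *g)
{
  int i, n = 0;

  for (i = 0; i < g->count; i++)
    if (groups_find (g, i) == i)
      n++;
  return n;
}
```

In this picture, a result of one group corresponds to a file that offers no point of partitioning, so compilation would stay serial.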

The parallelism feature is implemented as follows:

1. The driver will pass a hidden -fsplit-outputs=<filename> to cc1*.

2. After IPA, cc1* will search for symbols which must be partitioned
together.  If the user allows GCC to automatically promote symbols to
globals through "--param=promote-statics=1" for better parallel
compilation performance, that promotion will also be done.  However, if
cc1* decides that partitioning is a bad idea, it will continue with the
default serial compilation, and the additional <filename> will not be
created.  It will avoid compiling in parallel if and only if:

  * File size is below the minimum size specified by the LTO default
  --param=lto-min-partition.

  * The partitioner is unable to find any point of partitioning in the
  file.

3. cc1* will fork itself, creating one child process per partition.  Each
child process will apply the partition mask generated for it by the
partitioner and write the name of its new assembler file to the
<filename> specified by the driver.

4. The driver will open each file and partially link them together into
a single .o file if -c was requested, or into a binary otherwise.  -S and
-E are unsupported for now and will probably remain so.
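Steps 3 and 4 can be pictured with the following POSIX sketch: the parent forks one child per partition, each child appends the name of the assembler file it would produce to a shared list file, and the parent then reads the list back. This is an assumption-laden illustration, not the engine from the series: the children do no compilation here, the function names and the "partitionN.s" file names are hypothetical, and the parent is assumed to have no other child processes.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one child per partition.  Each child appends the (hypothetical)
   name of its assembler file to LIST_PATH.  Returns the number of
   children that exited successfully, or -1 if a fork failed.  */
int
compile_partitions (const char *list_path, int n_partitions)
{
  int i, status, started = 0, ok = 0;

  assert (list_path != NULL && n_partitions > 0);
  fflush (NULL);   /* Avoid duplicating buffered output in children.  */
  for (i = 0; i < n_partitions; i++)
    {
      pid_t pid = fork ();

      if (pid < 0)
        break;
      if (pid == 0)
        {
          /* Child: a real implementation would compile partition I
             here; the sketch only records an output name.  */
          FILE *f = fopen (list_path, "a");

          if (f == NULL)
            _exit (1);
          fprintf (f, "partition%d.s\n", i);
          _exit (fclose (f) == 0 ? 0 : 1);
        }
      started++;
    }

  /* Parent: wait for every child that was actually started.  */
  for (i = 0; i < started; i++)
    if (wait (&status) > 0 && WIFEXITED (status)
        && WEXITSTATUS (status) == 0)
      ok++;
  return started == n_partitions ? ok : -1;
}

/* Driver side: count the file names listed in LIST_PATH, one per
   line.  Returns -1 if the list cannot be opened.  */
int
count_listed_files (const char *list_path)
{
  char line[256];
  int n = 0;
  FILE *f = fopen (list_path, "r");

  if (f == NULL)
    return -1;
  while (fgets (line, sizeof line, f) != NULL)
    n++;
  fclose (f);
  return n;
}
```

Opening the list in append mode lets each child add its short line with a single write when the stream is closed, so the order of the lines depends on scheduling but no entry is lost.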


Speedups ranged from 0.95x to 1.9x on a quad-core Intel Core i7-8565U
when testing with two files from GCC, as shown in the following table.
Each timing is the result of a single execution preceded by a warm-up
execution.  The compiled GCC had checking enabled, so a release build
might have better timings in both sequential and parallel mode, though
the speedup may remain about the same.

|                |            | Without Static | With Static |   Max   |
| File           | Sequential |    Promotion   |  Promotion  | Speedup |
|----------------|------------|----------------|-------------|---------|
| gimple-match.c |     60s    |       63s      |     34s     |   1.7x  |
| insn-emit.c    |     37s    |       19s      |     20s     |   1.9x  |

Notice that enabling the feature causes a slowdown in some cases, which
is why it is only enabled through a flag for now.
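The speedup figures quoted above follow from dividing the sequential time by the parallel time. The helper below is a trivial sketch of that arithmetic; the function name is hypothetical.

```c
#include <assert.h>

/* Speedup of a parallel run relative to the sequential run; values
   below 1.0 mean the parallel run was slower.  */
double
speedup (double sequential_s, double parallel_s)
{
  assert (sequential_s > 0.0 && parallel_s > 0.0);
  return sequential_s / parallel_s;
}
```

Applied to the table: gimple-match.c gives 60/63, about 0.95x, without static promotion and 60/34, about 1.76x, with it; insn-emit.c gives 37/19, about 1.95x, without static promotion and 37/20 = 1.85x with it. The "Max Speedup" column appears to truncate these values to one decimal place.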

Bootstrapped and Regtested on Linux x86_64.

Giuliano Belinassi (6):
  Modify gcc driver for parallel compilation
  Implement a new partitioner for parallel compilation
  Implement fork-based parallelism engine
  Add `+' for Jobserver Integration
  Add invoke documentation
  New tests for parallel compilation feature

 gcc/Makefile.in                               |    6 +-
 gcc/cgraph.c                                  |   16 +
 gcc/cgraph.h                                  |   13 +
 gcc/cgraphunit.c                              |  198 ++-
 gcc/common.opt                                |    4 +
 gcc/doc/invoke.texi                           |   32 +-
 gcc/gcc.c                                     | 1219 +++++++++++++----
 gcc/ipa-fnsummary.c                           |    2 +-
 gcc/ipa-icf.c                                 |    3 +-
 gcc/ipa-visibility.c                          |    3 +-
 gcc/ipa.c                                     |    4 +-
 gcc/jobserver.cc                              |  168 +++
 gcc/jobserver.h                               |   33 +
 gcc/lto-cgraph.c                              |  172 +++
 gcc/{lto => }/lto-partition.c                 |  463 ++++++-
 gcc/{lto => }/lto-partition.h                 |    4 +-
 gcc/lto-streamer.h                            |    4 +
 gcc/lto/Make-lang.in                          |    4 +-
 gcc/lto/lto.c                                 |    2 +-
 gcc/params.opt                                |    8 +
 gcc/symtab.c                                  |   46 +-
 gcc/testsuite/driver/a.c                      |    6 +
 gcc/testsuite/driver/b.c                      |    6 +
 gcc/testsuite/driver/driver.exp               |   80 ++
 gcc/testsuite/driver/empty.c                  |    0
 gcc/testsuite/driver/foo.c                    |    7 +
 .../gcc.dg/parallel-early-constant.c          |   22 +
 gcc/testsuite/gcc.dg/parallel-static-1.c      |   21 +
 gcc/testsuite/gcc.dg/parallel-static-2.c      |   21 +
 .../gcc.dg/parallel-static-clash-1.c          |   23 +
 .../gcc.dg/parallel-static-clash-aux.c        |   14 +
 gcc/toplev.c                                  |   58 +-
 gcc/toplev.h                                  |    3 +
 gcc/tree.c                                    |   23 +-
 gcc/varasm.c                                  |   26 +-
 intl/Makefile.in                              |    2 +-
 libbacktrace/Makefile.in                      |    2 +-
 libcpp/Makefile.in                            |    2 +-
 libdecnumber/Makefile.in                      |    2 +-
 libiberty/Makefile.in                         |  212 +--
 zlib/Makefile.in                              |   64 +-
 41 files changed, 2539 insertions(+), 459 deletions(-)
 create mode 100644 gcc/jobserver.cc
 create mode 100644 gcc/jobserver.h
 rename gcc/{lto => }/lto-partition.c (78%)
 rename gcc/{lto => }/lto-partition.h (89%)
 create mode 100644 gcc/testsuite/driver/a.c
 create mode 100644 gcc/testsuite/driver/b.c
 create mode 100644 gcc/testsuite/driver/driver.exp
 create mode 100644 gcc/testsuite/driver/empty.c
 create mode 100644 gcc/testsuite/driver/foo.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-early-constant.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-1.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-2.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-clash-1.c
 create mode 100644 gcc/testsuite/gcc.dg/parallel-static-clash-aux.c

-- 
2.28.0
