Hi Thomas,
thanks for your suggestion, and thanks for the PR number. I've tried
the quick way (mk/build.mk) and benchmarked GHC while compiling
ghc-cabal manually; here are the results:
-j1: 45s
-j2: 28s
-j3: 26s
-j4: 24s
-j5: 24s
-j6: 25s
-j6 -A32m: 23s
-j6 -A64m: 21s
-j6 -A128m: 23s
Real (wall-clock) time is reported; GHC compiles to i386 code on Solaris 11.
The GHC tree is located in /tmp, hence basically in RAM. The CPU is a
6-core/12-thread E5-2620.
So the result is not that bad, but on the other hand not that good either.
Anyway, unfortunately this will probably not help me on my Niagara,
since I guess `--make -jX` is a recent addition that is probably not
present in 7.6.x, right? If so, then I'm afraid it won't help me, since
on the Niagara I'm using a patched 7.6.x with a fixed SPARC NCG, and
that single-threaded compiler will probably still be faster than a
multithreaded 7.10.1 building unregisterised (hence via the C
compiler...). Anyway, I'll try to benchmark this tomorrow and will keep
you posted.
Thanks!
Karel
On 04/ 1/15 12:34 PM, Thomas Miedema wrote:
Hi Karel,
could you try adding `-j8` to `SRC_HC_OPTS` for the build flavor you're
using in `mk/build.mk`, and running `gmake -j8` instead of
`gmake -j64`? A graph like the one you attached will likely look even
worse, but the walltime of your build should hopefully improve.
The build system currently seems to rely entirely on `make` for
parallelism. It doesn't exploit GHC's own parallel `--make` at all,
unless you explicitly add `-jn` to `SRC_HC_OPTS`, with n > 1 (which
also sets the number of capabilities for the runtime system, so adding
`+RTS -Nn` as well is not needed).
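Concretely, that would mean something like this in `mk/build.mk` (a
sketch; pick `n` to suit your machine):

```make
# mk/build.mk (sketch): make every ghc --make invocation 8-way parallel.
# -j8 also sets the number of RTS capabilities, so no extra +RTS -N8
# is needed.
SRC_HC_OPTS += -j8
```

and then build with `gmake -j8`.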
Case study: one of the first things the build system does is build
ghc-cabal and Cabal using the stage 0 compiler, through a single
invocation of `ghc --make`. All the later make targets depend on that
step completing first. Because that `ghc --make` is not instructed to
build in parallel, using `make -j1` or `make -j100000` makes no
difference for that step. I think your graph shows that there are many
more such bottlenecks.
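For illustration, the parallelism is per invocation; with a
hypothetical source file name:

```shell
# One ghc invocation, compiling modules sequentially: no amount of
# make-level -j can speed up this single step.
ghc --make ghc-cabal.hs

# Same invocation, but ghc itself compiles up to 4 modules at once:
ghc --make -j4 ghc-cabal.hs
```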
You would have to find out empirically how to best divide your number of
threads (32) between `make` and `ghc --make`. From reading this comment
<https://ghc.haskell.org/trac/ghc/ticket/9221#comment:12> by Simon in
#9221 I understand it's better not to call `ghc --make -jn` with `n`
higher than the number of physical cores of your machine (8 in your
case). Once you get some better parallelism, other flags like `-A` might
also have an effect on walltime (see that ticket).
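As a hypothetical starting point on your 32-thread T2000 (something to
benchmark, not a recommendation):

```make
# mk/build.mk (sketch): 8 make jobs x 4-way ghc --make gives up to 32
# concurrent compilations, while keeping each ghc's -j at or below the
# 8 physical cores, per Simon's comment in #9221.
SRC_HC_OPTS += -j4
```

then run `gmake -j8`.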
-Thomas
On Sat, Mar 7, 2015 at 11:49 AM, Karel Gardas <karel.gar...@centrum.cz> wrote:
Folks,
first of all, I remember someone already mentioned the issue of
decreased parallelism of the GHC build recently somewhere, but I
can't find it now. Sorry for that; if it was on this mailing list, I
would otherwise have used that thread.
Anyway, while working on the SPARC NCG I'm using a T2000, which
provides an 8-core/32-thread UltraSPARC T1 CPU. The property of this
machine is that it's really slow on single-threaded work; to squeeze
some performance from it, one really needs to push 32 threads of work
onto it. Now, it really hurts my nerves to see it lazily
building/running just one or two ghc processes. To verify this, I've
created a simple script that collects the number of ghc processes over
time and plots it as a graph. The result is in the attached picture.
The graph is the result of running:
gmake -j64
Anyway, the average number of running ghc processes is 4.4 and the
median is 2. IMHO such a low number not only hurts build times on
something like a CMT SPARC machine, but also on, let's say, a cluster
of ARM machines using NFS, and also on common engineering
workstations, which these days provide (IMHO!) around 8-16 cores (and
twice that many threads).
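The sampling script mentioned above might look roughly like this (a
hypothetical reconstruction; the original script wasn't posted):

```shell
#!/bin/sh
# Sketch of a sampler: print "second count" pairs once per second,
# where count is the number of running ghc processes, for plotting.

# count_ghc: print the current number of running ghc processes.
count_ghc() {
    ps -e -o comm= | awk '$1 == "ghc"' | wc -l | tr -d ' '
}

# sample_loop SECONDS: emit one "t count" line per second.
sample_loop() {
    t=0
    while [ "$t" -lt "$1" ]; do
        printf '%s %s\n' "$t" "$(count_ghc)"
        t=$((t + 1))
        sleep 1
    done
}

# Example: sample for the duration of the build, then plot the file:
# sample_loop 3600 > ghc-procs.dat
```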
My naive ideas for fixing this issue are (I'm assuming here that no
Haskell file has unused imports, though perhaps this should also be
investigated):
1) provide explicit dependencies which guide make to build in a more
optimal way
2) hack GHC's `make depend` to compute the explicit dependencies
from (1) in an optimal way automatically
3) someone already mentioned using Shake for building GHC. I don't
know Shake, but perhaps this is the right direction?
4) hack GHC to compile the needed .hi file directly in memory if the
file is not (yet!) available (the issue here is how to get the
compilation options right). Also, I don't know .hi file semantics yet,
so bear with me on this.
Is there anything else which may be done to fix this issue? Is
someone already working on any of these (I mean the reasonable ones
from the list)?
Thanks!
Karel
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs