Hi Jakub,

> On Fri, Oct 21, 2016 at 04:01:48PM +0200, Rainer Orth wrote:
>> I happened to notice that the gnat.dg testsuite run is slow even on a
>> reasonably fast SPARC machine (3.6 GHz SPARC T5) and together with the
>> libgomp testsuite (PR libgomp/66005) dominates bootstrap time: within a
>> make -j96 -k check, it takes 1h 18m 37s.  For unknown reasons,
>> check-gnat isn't parallelized though it is trivial to do and buys quite
>> a bit:
>
> check-gnat dominates anything?  That just really weird,
> it has only
> # of expected passes            2544
> # of unexpected failures        2
> # of expected failures          24
> # of unsupported tests          3
>
> compared to the 100000+ tests in gcc/g++ or 40000+ in gfortran testsuites
> it is just nothing.

That's comparing apples and oranges: the gnat.dg (and acats) tests are
all compile or even run tests, while within the gcc or g++ testsuites
you're also counting dg-error, dg-warning and the like, which are much
cheaper.

What ultimately matters is wall clock time, though (from a make -j96 run
before my patch):

         start      end         #tests  #partitions
acats    12:28:51   13:24:25      2320          19
g++      12:28:57   14:00:13    210885          48
gcc      12:28:57   14:10:41    197266          90
gfortran 12:28:57   13:47:40     86959          32
gnat     12:28:52   14:17:16      5100           1
go       12:28:57   13:17:02     14636          11
obj-c++  12:28:53   12:48:47      3074           1
objc     12:28:57   13:14:38      5742           6

Here you can see what I mean by dominate: even on this relatively fast
system (3.6 GHz SPARC T5), the gnat testsuite runs for more than six
minutes beyond everything else in gcc/testsuite, thus determining the end
of the bootstrap.  The effect becomes much more pronounced on slower boxes
(UltraSPARC T2 for example) where the machine is almost idle, running
just a single instance of runtest for half an hour or more.
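
For reference, the wall clock durations implied by the start/end columns
of the table above can be recomputed with a few lines of Python.  This is
only a sketch: the times are copied from the table, and the names RUNS and
duration_seconds are made up for illustration.

```python
from datetime import datetime

# (start, end) wall clock times copied from the table above.
RUNS = {
    "acats":    ("12:28:51", "13:24:25"),
    "g++":      ("12:28:57", "14:00:13"),
    "gcc":      ("12:28:57", "14:10:41"),
    "gfortran": ("12:28:57", "13:47:40"),
    "gnat":     ("12:28:52", "14:17:16"),
    "go":       ("12:28:57", "13:17:02"),
    "obj-c++":  ("12:28:53", "12:48:47"),
    "objc":     ("12:28:57", "13:14:38"),
}


def duration_seconds(start, end):
    """Return the elapsed seconds between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


durations = {name: duration_seconds(*times) for name, times in RUNS.items()}

# The testsuite that finishes last determines when make check ends.
# Zero-padded HH:MM:SS strings compare correctly as plain strings.
last = max(RUNS, key=lambda name: RUNS[name][1])

for name, secs in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:9s} {secs // 60:4d}m {secs % 60:02d}s")
print("last to finish:", last)
```

Sorting by duration rather than by test count makes it visible that the
single-partition gnat run is the longest one in the list.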

> libgomp is a know problem, sure, the problem with parallelizing it is that
> many tests just use all available cores/threads.  Perhaps we should do some

Right, the same holds for the Cilk+ tests as well: I'm including my
libcilkrts-on-sparc patch in my bootstraps and often see one or two
tests failing because they time out, grabbing all 96 strands within a
make -j96 check...

> small (at most 2 or 3 concurrent libgomp tests) parallelization of the
> libgomp testsuite unless disallowed through some env var option, but in that
> case bound OMP_NUM_THREADS if `getconf _NPROCESSORS_ONLN` > 32 to
> `getconf _NPROCESSORS_ONLN` / 2 or something similar.

That would certainly be a start, even though _NPROCESSORS_ONLN/2 can
still be a bit much on larger systems, especially if they are already
running make -j_NPROCESSORS_ONLN check (or with even more parallelism).
Besides, there's no reason to limit the number of parallel compile tests
in this way.
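
The bound suggested in the quoted text can be sketched as follows.  This
is an illustration only, written in Python rather than in the DejaGnu/Tcl
the testsuite actually uses; the function names and the threshold
parameter are hypothetical, and nothing like this exists in libgomp's
testsuite as far as this thread shows.

```python
import os


def bounded_omp_threads(nproc, threshold=32):
    """Halve the thread count on machines with more than `threshold` CPUs.

    Mirrors the quoted suggestion: above 32 online processors use
    nproc / 2, otherwise leave the count alone.
    """
    if nproc > threshold:
        return nproc // 2
    return nproc


def online_cpus():
    """Best-effort equivalent of `getconf _NPROCESSORS_ONLN`."""
    try:
        n = os.sysconf("SC_NPROCESSORS_ONLN")
    except (ValueError, OSError, AttributeError):
        n = -1
    if n < 1:
        # Fall back to Python's own count when sysconf is unavailable.
        n = os.cpu_count() or 1
    return n


if __name__ == "__main__":
    # Export the bound so that child processes (the tests) inherit it.
    os.environ["OMP_NUM_THREADS"] = str(bounded_omp_threads(online_cpus()))
    print("OMP_NUM_THREADS =", os.environ["OMP_NUM_THREADS"])
```

On a machine with 96 strands this yields 48 threads per test, which is
the case described above as still being a bit much next to a make -j96
check.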

But certainly, every single bit helps: the libgomp testsuite right now
is what really dominates make check time; check-gnat was just
low-hanging fruit.

> I'm not strongly against your patch, I'm just very surprised it is really
> needed (acats is much larger, check-gnat is small).

Not really: on that SPARC T5 system, I have (sequential gnat.dg
vs. acats with 19 partitions), all within a -j96 bootstrap:

                wall clock              #tests

gnat.dg         6505s = 108m 25s        5100
acats           3334s =  55m 34s        2320

compared to (one week later) parallel gnat.dg (5 partitions):

gnat.dg         2458s =  40m 58s        5104

Right now, gnat.dg is larger since it's run for all multilibs (two in
this case) while acats is for the default multilib only (until I finish
my `convert acats to dg' patch).
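
As a rough check of what the 5 partitions buy, the speedup can be worked
out from the two gnat.dg figures above.  This is a sketch only: the two
runs were a week apart and the test counts differ slightly (5100 vs.
5104), so the result is approximate; the variable names are made up.

```python
# Wall clock seconds for gnat.dg, taken from the figures above.
SEQUENTIAL_SECS = 6505   # single partition
PARALLEL_SECS = 2458     # 5 partitions, one week later
PARTITIONS = 5

# Speedup is the ratio of the sequential to the parallel wall clock time.
speedup = SEQUENTIAL_SECS / PARALLEL_SECS

# Efficiency relates the speedup to the number of partitions used.
efficiency = speedup / PARTITIONS

print(f"speedup:    {speedup:.2f}x")
print(f"efficiency: {efficiency:.0%}")
```

This gives a speedup of about 2.65x, i.e. roughly half of the ideal 5x.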

        Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University
