ms-sysv FAILs

daniel.santos at pobox dot com Thu, 01 Jun 2017 01:42:55 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80759


--- Comment #21 from Daniel Santos <daniel.santos at pobox dot com> ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #20)
> > failures, but if you call dg-runtest, you are using gcc's hack-daptation of
> > parallelization.  However, your patch doesn't remove *my* hack-daptation of
> > parallelization, so we end up with two different parallelization schemes
> > that step on each other's toes.
> >
> > Another problem with the already present parallelization is that it bunches
> > tests into groups of 10 per job which will perform very poorly for these
> > tests
> > (https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/lib/gcc-defs.exp#L170).
> > I presume this is to reduce disk I/O and it makes sense from that
> > standpoint (I don't want to know what it would take to get a ramdisk/tmpfs
> > in a platform-neutral fashion.)
> 
> My basic point still stands: running your ms-sysv tests sequentially
> takes just a few minutes even on an old and (by today's standards) slow
> CPU, so there's absolutely no point investing lots of effort and
> complexity to parallelize what already runs adequately fast sequentially!

There is plenty of point!  It may be fast without --enable-checking=rtl, but
it's very slow with it.  A very large portion of my development lifecycle was
spent waiting for tests to run.  Using --enable-checking=rtl caught SO many
errors that didn't cause (or might not have caused) an ICE without it.

Now at the time, all I had was a Phenom, and when I had more than 64 tests run
in a single function it was very slow (I presume due to thrashing the data
cache) which is the reason the generator splits the tests out into multiple
functions that run 64 tests each.  I have a nice quad i7 now, so it's going
faster.  But one thing I hadn't gotten back to yet was adding more extensive
tests using features and optimizations that affect the stack (-fsplit-stack,
-pg, etc.).


> > However, I'm learning a little more about how the test harness works, and it
> > MAY be possible to call gcc_parallel_test_enable 0 at the start of
> > ms-sysv.exp and be able to use all of the built-in dg-runtest, et al.
> > functions!  If I can get this to work (and not break something else in the
> > process), then we may be on to a pathway to clean up ms-sysv.exp a little
> > bit -- that is except for a few outstanding (possibly surmountable) issues:
> >
> > 1.) Can the default time-out of 5 minutes be changed?  I need 20 minutes for
> > the slowest processors and a whole HOUR when full tests are enabled.
> 
> Sure it can: for one there are dg-timeout (and preferably
> dg-timeout-factor) per testcase.  I still wonder why you'd need that, though: if
> all your tests together take no more than a few minutes, why would you
> need to increase the timeout at all?  Which processor would this be that
> takes 20 minutes or even an hour to run the tests *and complete all
> other tests well within the five minute timeout*?

Thanks for that!  I *may* have a better solution (described below).  But the 5
minute timeout even happens on my new i7 when --enable-checking=rtl is on.

> In fact, every test that takes more than about a minute on a reasonably
> current CPU is frowned upon because under parallel testing/load such
> tests tend to run into the timeout.

The time is taken during compilation and also during linking when -flto is used
(again, with rtl checking).

> If your tests regularly exceed the timeout, there's something wrong with
> them: you need to split them so individual tests complete within the
> minute just mentioned.

Hah! That is actually the direction I had decided to go and have been testing
it the last few days. :)  However, they can still run longer than 1 minute
(with rtl checking).  I can split them apart even further with a few changes to
the code generator.

> If really necessary in a setup, it is possible to set
> board_info(unix,gcc,timeout) in a global site.exp file, e.g. to deal
> with really slow/ancient systems.  This would be necessary without your
> test anyway.
> 
> > 2.) The test description should include the generator flags and not just the
> > CFLAGS.  Is that possible from dg-runtest, et al.?  I suppose it's always
> > possible to add them to CFLAGS with -DGEN_FLAGS="-p0-12" as a hack.
> 
> That's a requirement actually: the summary lines for different runs of a
> test must differ so you can tell them apart if one of them fails.  How
> this is done in the end is primarily a cosmetic issue, though.

I seem to have worked this out, although I have to do a regsub to replace
spaces in the generator_args with an escaped space and it prints a little ugly,
but at least it works.

    set escaped_generator_args [regsub -all " " $generator_args "\\ "]
    set cflags "$cflags\"-DGEN_ARGS=$escaped_generator_args\""
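
To illustrate what the substitution produces, here is a minimal standalone
sketch.  The generator_args value and the initial cflags value are made up for
the example and are not taken from ms-sysv.exp.

```tcl
# Illustrative values only; not taken from ms-sysv.exp.
set generator_args "-p0-12 -example"
set cflags "-O2 "

# Put a backslash in front of every space so that the whole argument
# string travels as a single -D option.
set escaped_generator_args [regsub -all " " $generator_args "\\ "]
set cflags "$cflags\"-DGEN_ARGS=$escaped_generator_args\""

# Show the resulting flags string.
puts $cflags
```

The backslash before each space is what makes the summary line look a little
ugly, but it keeps the generator arguments together as one option.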

> > I guess you can see why I said that I was "semi-content" to leave it like it
> > is. :)  But I'm also glad to better understand how the test harness
> > parallelization works.  Maybe it's possible to make a small modification to
> > gcc-defs.exp to get it to divvy out a single test per job instead of 10.
> 
> As I said: first get the sequential execution right and then, if really
> really necessary (and I want proof for that) look at parallelization.
> It may well be that the initial solution is to restrict the number and
> size of tests run by everyone.  After all, this is just for a niche
> feature of a single architecture/OS and there's no point in everyone
> testing on x86 having to pay a massive penalty for that.
> 
>       Rainer

This is a niche feature and only for x86_64, but it targets every *nix OS where
Wine can run including Mac, GNU/Linux, Solaris and I think BSD too.  (btw, Wine
actually runs on ARM now!)  We don't need this on Windows.  But to my
knowledge, Wine would be the only project to benefit from it (so "niche" is
accurate).

Splitting the tests out solves a few problems: it avoids the timeouts and the
header file is much smaller.  It even ends up distributing the tests a tiny bit
better (it uses 2 CPUs :)

Separate from that, I have a change set that provides a mechanism to tune
parallelization.  Each .exp file discloses in advance how many tests it plans
to run, and the tests-at-a-time value is calculated from that.  One
non-pretty aspect of it is how I grab the number of jobs being run:

    global gcc_runtest_parallelize_njobs

    if { [info exists env(MAKEFLAGS)] } {
        set njobs [regsub "^.*?(?: -)?j(\\d+).*?$" $env(MAKEFLAGS) "\\1" ]
        if [regexp "^\\d+$" $njobs match] then {
            set gcc_runtest_parallelize_njobs $njobs
        }
    }
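
For illustration only (this is not the code from the change set), the
tests-at-a-time calculation could look roughly like the sketch below.  The proc
name tests_per_job, the plain integer division, and the cap of 10 (borrowed
from the existing batch size) are all assumptions.

```tcl
# Hypothetical helper, not part of gcc-defs.exp: choose how many tests to
# hand to each job, given the test count a .exp file announced in advance
# and the number of parallel jobs.
proc tests_per_job { ntests njobs } {
    # Upper bound assumed from the existing batches of 10.
    set max_batch 10
    if { $njobs < 1 } { set njobs 1 }
    # Integer division, so the tests are spread over all of the jobs.
    set batch [expr { $ntests / $njobs }]
    # Clamp the result to the range 1..max_batch.
    if { $batch < 1 } { set batch 1 }
    if { $batch > $max_batch } { set batch $max_batch }
    return $batch
}

puts [tests_per_job 12 8]
puts [tests_per_job 40 8]
puts [tests_per_job 400 8]
```

With few tests and many jobs this falls back to one test per job, which is the
case that matters for long-running tests; large test counts keep the batching
that limits disk I/O.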

Anyway, I'm re-working this into a logical progression of patches:

1.) First fixing broken things (there is an additional test error with "-p7"
    in the generator arguments),
2.) Cleaning up some bad code formatting, inconsistencies and other cosmetics,
3.) Adding an option to the generator to control how tests are split
    across functions,
4.) Adding an extensive set of tests using torture flags and that are
    triggered by GCC_TEST_RUN_EXPENSIVE (if you think I should use a
    different environment variable, please let me know),
5.) And then my proposed parallelization tweaks.

This way we can at least get the broken things fixed immediately and punt
parallelization in the (likely) event that my initial proposal isn't yet ready
for prime time.

Thanks,
Daniel

[Bug testsuite/80759] gcc.target/x86_64/abi/ms-sysv FAILs
