Mark,

On 6/15/15 8:02 AM, Mark Thomas wrote:
> I have been experimenting with the free Azure credits that come with the
> MSDN subscription Microsoft kindly offers to all Apache committers to
> use for their ASF work.
> 
> I have been looking at options for making the unit tests run faster.
> 
> All the figures below are for running the trunk unit tests on a fully
> updated Ubuntu 14.04 LTS instance.
> 
> 
> A2 Basic 233:53 tests on hdd, with code coverage, 1 thread
> D2       120:57 tests on hdd, with code coverage, 1 thread
> D2       119:53 tests on ssd, with code coverage, 1 thread
> D2        32:16 tests on hdd, no code coverage,   2 threads
> D2        23:24 tests on hdd, no code coverage,   4 threads
> 
> (Both A2 and D2 boxes have 2 cores. D2 have 60% faster processors).
> 
> I'll be testing larger instance with more cores later.
> 
> So far, I think it is safe to draw the following conclusions:
> - code coverage is expensive
> - code coverage (as currently configured) requires single thread
>   execution (more on this below)
> - 1 test thread per core definitely gives better performance
> - 2 test threads per core gives even better performance

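Obviously, code coverage and CPU power (more likely access to the CPU,
not the CPU speed itself) are the bigger factors in the equation here.
Multi-threading is nice, but its gain (about 27% going from 2 to 4
threads) is marginal compared to the other factors: moving from A2 to
D2 roughly halves the run time, and the gap between the coverage and
non-coverage runs is larger still (though that comparison also changes
the thread count).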
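
For reference, here is a rough sketch of the arithmetic behind those
comparisons, in Python. It uses only the mm:ss figures quoted above;
the helper names (to_seconds, speedup) are mine and purely
illustrative.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '233:53' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(slow: str, fast: str) -> float:
    """Return how many times faster the 'fast' run is than the 'slow' run."""
    return to_seconds(slow) / to_seconds(fast)


# Timings copied from the quoted table (all on Ubuntu 14.04 LTS).
comparisons = {
    "A2 -> D2 (hdd, coverage, 1 thread)": ("233:53", "120:57"),
    "hdd -> ssd (D2, coverage, 1 thread)": ("120:57", "119:53"),
    "2 -> 4 threads (D2, hdd, no coverage)": ("32:16", "23:24"),
}

for label, (slow, fast) in comparisons.items():
    print(f"{label}: {speedup(slow, fast):.2f}x")
```

The coverage-only effect cannot be computed this way, because no pair
of runs in the table differs by coverage alone.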

One more data point would have been good to have:

D2    ???:?? tests on hdd, no code coverage, 1 thread

> Where the limit is for threads per core is TBD.
> 
> I've already fixed the unit tests (I think) so parallel running is
> possible. I'll be adding a threads option to build.xml shortly. It will
> default to 1 and I'll add a comment to build.properties.default not to
> increase it above 1 if code coverage is enabled (I might try and detect
> and handle that case). Once I have data on threads vs cores I'll add
> that too.
> 
> The reason code coverage doesn't work with the junit threads option is
> that cobertura serialises the coverage data between tests. If we
> partitioned the tests (e.g. by name) and configured separated coverage
> data files for each partition (merging them at the end) then cobertura
> would be OK. Sensibly partitioning the tests is more effort than I have
> time for at the moment so I am going with the simple option.

If doubling the number of threads delivers a similar ~30% improvement
with code coverage enabled (I'm just extrapolating from the results
for merely running the tests), then perhaps a heavy-handed split of
the tests into two arbitrarily-selected sets, each with its own
Cobertura data file, would make a good trial without too much effort.
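
A minimal sketch of what such an arbitrary split could look like,
written in Python purely for illustration (the real change would live
in the build files). The test class names below are made up, and
partition is a hypothetical helper, not anything that exists in the
build.

```python
def partition(test_names, parts=2):
    """Split test names into `parts` groups, round-robin over sorted names.

    Sorting first makes the split deterministic, so every run puts a
    given test in the same group.
    """
    ordered = sorted(test_names)
    return [ordered[i::parts] for i in range(parts)]


# Hypothetical test class names, for illustration only.
tests = ["TestAlpha", "TestBravo", "TestCharlie", "TestDelta", "TestEcho"]

for index, group in enumerate(partition(tests), start=1):
    # Assumption: each group would write to its own coverage data file,
    # and the files would be merged once all groups have finished.
    print(f"group {index}: {group}")
```

Round-robin keeps the groups the same size, but not necessarily the
same duration; balancing by run time would need per-test timings.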

What do you think?

-chris