On Wed, Nov 23, 2011 at 5:05 PM, Robert Collins <[email protected]> wrote:
> On Thu, Nov 24, 2011 at 10:59 AM, Benji York <[email protected]> wrote:
>>> Our test distribution per layer is not very even - I highly doubt that
>>> we'd be able to meet a reduction to 15% of the current time splitting
>>> per layer.
>>
>> Let's look at the test distribution: The last buildbot run took 360
>> minutes. There were 4 layers that took longer than 11 minutes to run:
>> 55, 56, 65, and 99 minutes. All the other layers add up to about
>> 60 minutes.
>
> So the shortest run -j could give is 99 minutes, or 27% runtime. I
> don't see how you can bisect a layer, unless you mean 'create a fake
> layer extending it and manually allocate 50% of the tests to it'. That
> seems like a non-starter to me - way too much maintenance overhead.
>
>> If we bisect the four largest layers (to make it so the test runner's
>> blind layer scheduling can't bite us too hard) and assume that running 4
>> layers simultaneously imposes no more than a 50% overhead, then we would
>> be right at 40% of the current running time.
>>
>> Reasoning sidebar: 99 is the length in minutes of the longest layer; it
>> was bisected, but even then its other half is still the longest
>> remaining layer so for pessimism's sake we assume they get run one after
>> another. All the other layers would be finished by then, so that gives
>> us 99*1.50/360 = .41.
>>
>> Even if we assume no parallelization overhead, per-test scheduling (as
>> opposed to per-layer as above) and four-way parallelization, we'll still
>> be at 25% of the original time, so I'm interested in ideas as to how we
>> might achieve a reduction to 15% of the original time.
>
> If local parallelisation will work, testr run --parallel will load
> balance all the tests optimally based on previous performance - a
> single run from e.g. ec2 can tell us which tests are slow and let it
> decide from there.
Cool. I wasn't aware it had that functionality.

>>> The other issue of shared global state that will bite us,
>>> will also be a significant issue with -j, unless a remoting facility
>>> is brought in (and at that point it seems to be reinventing
>>> subunit.... :P).
>>
>> This is the real catch. If the tests haven't been written to be
>> parallelizable (which LP's certainly have not), then global state
>> collisions accumulated over years of assuming non-parallel tests could
>> be hard to fix. On the other hand, if fixing them turns out to be easy,
>> then using the test runner's built-in parallelization (-j) would be the
>> most bang for the buck.
>
> bin/test --parallel already exists and does better splitting than -j,
> so I disagree that -j would be the best approach, *if* the collisions
> etc are easy to fix :).

Indeed. I had forgotten about --parallel. If it hasn't already been
added to zope.testrunner, it sounds like a good candidate to replace -j.
-- 
Benji York

-- 
Mailing list: https://launchpad.net/~yellow
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yellow
More help   : https://help.launchpad.net/ListHelp
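
For anyone who wants to re-check the percentages quoted in this thread, here is a rough sketch of the arithmetic in Python. The input figures (360, 99, the 50% overhead, four workers) come from the messages above; the variable and function names are made up for illustration and are not part of zope.testrunner or testr.

```python
# Figures taken from the thread; names are illustrative only.
TOTAL_MINUTES = 360          # length of the last buildbot run
LONGEST_LAYER_MINUTES = 99   # the slowest single layer
OVERHEAD = 0.50              # assumed worst-case cost of running 4 layers at once
WORKERS = 4                  # assumed degree of parallelism


def fraction_of_original(minutes, total=TOTAL_MINUTES):
    """Return a run time as a fraction of the current serial run time."""
    return minutes / total


# Per-layer splitting cannot finish before the longest layer does.
per_layer_floor = fraction_of_original(LONGEST_LAYER_MINUTES)

# Pessimistic case: both halves of the bisected 99-minute layer run back
# to back on one worker, slowed down by the assumed 50% overhead.
pessimistic = fraction_of_original(LONGEST_LAYER_MINUTES * (1 + OVERHEAD))

# Ideal case: per-test scheduling, perfect balance, no overhead.
ideal = 1 / WORKERS

print(f"per-layer floor:        {per_layer_floor:.1%}")
print(f"bisected + 50% overhead: {pessimistic:.1%}")
print(f"ideal four-way split:    {ideal:.1%}")
```

Under these assumptions none of the three cases gets near 15%, which is why the question of how to reach it is still open.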

