I did a bisect based on the runtime of `./gradlew :sdks:python:test-suites:tox:py2:testPy2Gcp` across the commits between 9/1 and 9/15 to see if I could find the source of the spike that happened around 9/6. It looks like it was due to PR #9283 [1]. I thought this search might reveal some misguided configuration change, but as far as I can tell #9283 just added a well-tested feature. I don't think there's anything to learn from that; I just wanted to circle back in case others are curious about that spike.
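For anyone who wants to repeat this kind of search, here is a minimal sketch of a runtime-based bisect. It is not the exact procedure I ran: the helper names (`time_task_at`, `first_slow_commit`) and the threshold are made up for illustration, and it assumes the runtime steps up once and stays up, with the oldest commit known to be fast and the newest known to be slow.

```python
import subprocess
import time
from typing import Callable, Sequence

# The Gradle task whose runtime was bisected above.
GRADLE_TASK = ":sdks:python:test-suites:tox:py2:testPy2Gcp"


def time_task_at(commit: str) -> float:
    """Check out `commit` and return the wall-clock seconds the task takes.

    Hypothetical helper: assumes it runs from the root of a clean Beam checkout.
    """
    subprocess.run(["git", "checkout", "--quiet", commit], check=True)
    start = time.monotonic()
    # check=False: a failing test run still yields a usable duration.
    subprocess.run(["./gradlew", GRADLE_TASK], check=False)
    return time.monotonic() - start


def first_slow_commit(
    commits: Sequence[str],
    measure: Callable[[str], float],
    threshold_s: float,
) -> str:
    """Binary-search for the first commit slower than `threshold_s`.

    `commits` is ordered oldest to newest. Assumes commits[0] is fast,
    commits[-1] is slow, and the runtime crosses the threshold only once.
    """
    lo, hi = 0, len(commits) - 1  # lo: known fast, hi: known slow
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if measure(commits[mid]) > threshold_s:
            hi = mid  # slowdown is at mid or earlier
        else:
            lo = mid  # slowdown is after mid
    return commits[hi]
```

Calling `first_slow_commit(commits, time_task_at, threshold_s)` with the commit hashes from the date range would run the task on roughly log2(n) commits rather than all of them. Single timings are noisy, so in practice it may be worth measuring each commit more than once.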
I'm +1 on bumping some FnApiRunner configurations.

Brian

[1] https://github.com/apache/beam/pull/9283

On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada <[email protected]> wrote:

> I think it makes sense to remove some of the extra FnApiRunner
> configurations. Perhaps some of the multiworkers and some of the grpc
> versions?
> Best
> -P.
>
> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw <[email protected]>
> wrote:
>
>> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+
>> minutes on each version of Python. This test consists of a base class
>> that is basically a validates runner suite, and is then run in several
>> configurations, many more of which (including some expensive ones)
>> have been added lately.
>>
>> class FnApiRunnerTest(unittest.TestCase):
>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
>> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
>>
>> I'm not convinced we need to run all of these permutations, or at
>> least not all tests in all permutations.
>>
>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev
>> <[email protected]> wrote:
>> >
>> > I took another look at this and precommit ITs are already running in parallel, albeit in the same suite. However it appears Python precommits became slower, especially Python 2 precommits [35 min per suite x 3 suites], see [1]. Not sure yet what caused the increase, but precommits used to be faster. Perhaps we have added a slow test or a lot of new tests.
>> >
>> > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
>> >
>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay <[email protected]> wrote:
>> >>
>> >> Ack. Separating precommit ITs to a different suite sounds good. Anyone is interested in doing that?
>> >>
>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev <[email protected]> wrote:
>> >>>
>> >>> This should not increase the queue time substantially, since precommit ITs are running sequentially with precommit tests, unlike multiple precommit tests which run in parallel to each other.
>> >>>
>> >>> The precommit ITs we run are batch and streaming wordcount tests on Py2 and one Py3 version, so it's not a lot of tests.
>> >>>
>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay <[email protected]> wrote:
>> >>>>
>> >>>> +1 to separating ITs from precommit. Downside would be, when Chad tried to do something similar [1] it was noted that the total time to run all precommit tests would increase and also potentially increase the queue time.
>> >>>>
>> >>>> Another alternative, we could run a smaller set of IT tests in precommits and run the whole suite as part of post commit tests.
>> >>>>
>> >>>> [1] https://github.com/apache/beam/pull/9642
>> >>>>
>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev <[email protected]> wrote:
>> >>>>>
>> >>>>> One improvement could be move to Precommit IT tests into a separate suite from precommit tests, and run it in parallel.
>> >>>>>
>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette <[email protected]> wrote:
>> >>>>>>
>> >>>>>> Python Precommits are taking quite a while now [1]. Just visually it looks like the average length is 1.5h or so, but it spikes up to 2h. I've had several precommit runs get aborted due to the 2 hour limit.
>> >>>>>>
>> >>>>>> It looks like there was a spike up above 1h back on 9/6 and the duration has been steadily rising since then. Is there anything we can do about this?
>> >>>>>>
>> >>>>>> Brian
>> >>>>>>
>> >>>>>> [1] http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
>> >
