Re: Python Precommit duration pushing 2 hours

Ahmet Altay Mon, 04 Nov 2019 12:06:41 -0800

Python precommits are still timing out on #9925. I am guessing that means
this change would not be enough.


I am proposing cutting down the number of test variants we run in
precommits. Currently for each version we ran the following variants
serially:
- base: Runs all unit tests with tox
- Cython: Installs cython and runs all unit tests as base version. The
original purpose was to ensure that tests pass with or without cython.
There is probably a huge overlap with base. (IIRC only a few coders have
different slow vs fast tests.)
- GCP: Installs GCP dependencies and tests all base + additional gcp
specific tests. The original purpose was to ensure that GCP is an optional
component and all non-GCP tests still works without GCP components.

We can reduce the list to cython + GCP tests only. This will cover the same
group of tests and will check that tests pass with or without cython or GCP
dependencies. This could reduce the precommit time by ~30 minutes.

What do you think?

Ahmet


On Tue, Oct 29, 2019 at 11:15 AM Robert Bradshaw <[email protected]>
wrote:

> https://github.com/apache/beam/pull/9925
>
> On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri <[email protected]> wrote:
> >
> > I don't have the bandwidth right now to tackle this. Feel free to take
> it.
> >
> > On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw <[email protected]>
> wrote:
> >>
> >> The Python SDK does as well. These calls are coming from
> >> to_runner_api, is_stateful_dofn, and validate_stateful_dofn which are
> >> invoked once per pipene or bundle. They are, however, surprisingly
> >> expensive. Even memoizing across those three calls should save a
> >> significant amount of time. Udi, did you want to tackle this?
> >>
> >> Looking at the profile, Pipeline.to_runner_api() is being called 30
> >> times in this test, and [Applied]PTransform.to_fn_api being called
> >> 3111 times, so that in itself might be interesting to investigate.
> >>
> >> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke <[email protected]>
> wrote:
> >> >
> >> > As does the Go SDK. Invokers are memoized and when possible code is
> generated to avoid reflection.
> >> >
> >> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles <[email protected]> wrote:
> >> >>
> >> >> Noting for the benefit of the thread archive in case someone goes
> digging and wonders if this affects other SDKs: the Java SDK memoizes
> DoFnSignatures and generated DoFnInvoker classes.
> >> >>
> >> >> Kenn
> >> >>
> >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri <[email protected]> wrote:
> >> >>>
> >> >>> Re: #9283 slowing down tests, ideas for slowness:
> >> >>> 1. I added a lot of test cases, some with locally run pipelines.
> >> >>> 2. The PR somehow changed how coders are selected, and now we're
> using less efficient ones.
> >> >>> 3. New dependency funcsigs is slowing things down? (py2 only)
> >> >>>
> >> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7
> and got these cool graphs (attached).
> >> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time
> (IIUC), gets called ~230k times
> >> >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times
> >> >>>
> >> >>> After memoization of get_function_args_defaults:
> >> >>> 2.7: core:294:get_function_arguments 20.02%
> >> >>> 3.7: core:294:get_function_arguments 8.11%
> >> >>>
> >> >>>
> >> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada <[email protected]>
> wrote:
> >> >>>>
> >> >>>> *not deciles, but 9-percentiles : )
> >> >>>>
> >> >>>> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada <[email protected]>
> wrote:
> >> >>>>>
> >> >>>>> I've ran the tests in Python 2 (without cython), and used a
> utility to track runtime for each test method. I found some of the
> following things:
> >> >>>>> - Total test methods run: 2665
> >> >>>>> - Total test runtime: 990 seconds
> >> >>>>> - Deciles of time spent:
> >> >>>>>   - 1949 tests run in the first 9% of time
> >> >>>>>   - 173 in the 9-18% rang3e
> >> >>>>>   - 130 in the 18-27% range
> >> >>>>>   - 95 in the 27-36% range
> >> >>>>>   - 77
> >> >>>>>   - 66
> >> >>>>>   - 55
> >> >>>>>   - 46
> >> >>>>>   - 37
> >> >>>>>   - 24
> >> >>>>>   - 13 tests run in the last 9% of time. This represents about 1
> minute and a half.
> >> >>>>>
> >> >>>>> We may be able to look at the slowest X tests, and get gradual
> improvements from there. Although it seems .. not dramatic ones : )
> >> >>>>>
> >> >>>>> FWIW I uploaded the results here:
> https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json
> >> >>>>>
> >> >>>>> The slowest 13 tests were:
> >> >>>>>
> >> >>>>>
> [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic',
> >> >>>>>   5.253582000732422),
> >> >>>>>
> ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount',
> >> >>>>>   7.907713890075684),
> >> >>>>>
> ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids',
> >> >>>>>   5.237942934036255),
> >> >>>>>
> ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample',
> >> >>>>>   5.563946008682251),
> >> >>>>>
> ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values',
> >> >>>>>   5.680700063705444),
> >> >>>>>
> ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group',
> >> >>>>>   6.111238956451416),
> >> >>>>>
> ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler',
> >> >>>>>   6.007534980773926),
> >> >>>>>
> ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic',
> >> >>>>>   13.993916988372803),
> >> >>>>>
> ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion',
> >> >>>>>   6.3383049964904785),
> >> >>>>>
> ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count',
> >> >>>>>   9.157485008239746),
> >> >>>>>
> ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs',
> >> >>>>>   5.191173076629639),
> >> >>>>>
> ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large',
> >> >>>>>   6.2221620082855225),
> >> >>>>>
> ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing',
> >> >>>>>   7.7187910079956055)]
> >> >>>>>
> >> >>>>> On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada <[email protected]>
> wrote:
> >> >>>>>>
> >> >>>>>> I have written https://github.com/apache/beam/pull/9910 to
> reduce FnApiRunnerTest variations.
> >> >>>>>> I'm not in a rush to merge, but rather happy to start a
> discussion.
> >> >>>>>> I'll also try to figure out if there are other tests slowing
> down the suite significantly.
> >> >>>>>> Best
> >> >>>>>> -P.
> >> >>>>>>
> >> >>>>>> On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev <
> [email protected]> wrote:
> >> >>>>>>>
> >> >>>>>>> Thanks, Brian.
> >> >>>>>>> +Udi Meiri
> >> >>>>>>> As next step, it would be good to know whether slowdown is
> caused by tests in this PR, or its effect on other tests, and to confirm
> that only Python 2 codepaths were affected.
> >> >>>>>>>
> >> >>>>>>> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette <
> [email protected]> wrote:
> >> >>>>>>>>
> >> >>>>>>>> I did a bisect based on the runtime of `./gradlew
> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1
> and 9/15 to see if I could find the source of the spike that happened
> around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this
> search would reveal some mis-guided configuration change, but as far as I
> can tell 9283 just added a well-tested feature. I don't think there's
> anything to learn from that... I just wanted to circle back about it in
> case others are curious about that spike.
> >> >>>>>>>>
> >> >>>>>>>> I'm +1 on bumping some FnApiRunner configurations.
> >> >>>>>>>>
> >> >>>>>>>> Brian
> >> >>>>>>>>
> >> >>>>>>>> [1] https://github.com/apache/beam/pull/9283
> >> >>>>>>>>
> >> >>>>>>>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada <
> [email protected]> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>> I think it makes sense to remove some of the extra
> FnApiRunner configurations. Perhaps some of the multiworkers and some of
> the grpc versions?
> >> >>>>>>>>> Best
> >> >>>>>>>>> -P.
> >> >>>>>>>>>
> >> >>>>>>>>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw <
> [email protected]> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>> It looks like fn_api_runner_test.py is quite expensive,
> taking 10-15+
> >> >>>>>>>>>> minutes on each version of Python. This test consists of a
> base class
> >> >>>>>>>>>> that is basically a validates runner suite, and is then run
> in several
> >> >>>>>>>>>> configurations, many more of which (including some expensive
> ones)
> >> >>>>>>>>>> have been added lately.
> >> >>>>>>>>>>
> >> >>>>>>>>>> class FnApiRunnerTest(unittest.TestCase):
> >> >>>>>>>>>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
> >> >>>>>>>>>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
> >> >>>>>>>>>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
> >> >>>>>>>>>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
> >> >>>>>>>>>> class
> FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
> >> >>>>>>>>>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
> >> >>>>>>>>>> class
> FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
> >> >>>>>>>>>>
> >> >>>>>>>>>> I'm not convinced we need to run all of these permutations,
> or at
> >> >>>>>>>>>> least not all tests in all permutations.
> >> >>>>>>>>>>
> >> >>>>>>>>>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev
> >> >>>>>>>>>> <[email protected]> wrote:
> >> >>>>>>>>>> >
> >> >>>>>>>>>> > I took another look at this and precommit ITs are already
> running in parallel, albeit in the same suite. However it appears Python
> precommits became slower, especially Python 2 precommits [35 min per suite
> x 3 suites], see [1]. Not sure yet what caused the increase, but precommits
> used to be faster. Perhaps we have added a slow test or a lot of new tests.
> >> >>>>>>>>>> >
> >> >>>>>>>>>> > [1]
> https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
> >> >>>>>>>>>> >
> >> >>>>>>>>>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay <
> [email protected]> wrote:
> >> >>>>>>>>>> >>
> >> >>>>>>>>>> >> Ack. Separating precommit ITs to a different suite sounds
> good. Anyone is interested in doing that?
> >> >>>>>>>>>> >>
> >> >>>>>>>>>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev <
> [email protected]> wrote:
> >> >>>>>>>>>> >>>
> >> >>>>>>>>>> >>> This should not increase the queue time substantially,
> since precommit ITs are running sequentially with precommit tests, unlike
> multiple precommit tests which run in parallel to each other.
> >> >>>>>>>>>> >>>
> >> >>>>>>>>>> >>> The precommit ITs we run are batch and streaming
> wordcount tests on Py2 and one Py3 version, so it's not a lot of tests.
> >> >>>>>>>>>> >>>
> >> >>>>>>>>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay <
> [email protected]> wrote:
> >> >>>>>>>>>> >>>>
> >> >>>>>>>>>> >>>> +1 to separating ITs from precommit. Downside would be,
> when Chad tried to do something similar [1] it was noted that the total
> time to run all precommit tests would increase and also potentially
> increase the queue time.
> >> >>>>>>>>>> >>>>
> >> >>>>>>>>>> >>>> Another alternative, we could run a smaller set of IT
> tests in precommits and run the whole suite as part of post commit tests.
> >> >>>>>>>>>> >>>>
> >> >>>>>>>>>> >>>> [1] https://github.com/apache/beam/pull/9642
> >> >>>>>>>>>> >>>>
> >> >>>>>>>>>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev <
> [email protected]> wrote:
> >> >>>>>>>>>> >>>>>
> >> >>>>>>>>>> >>>>> One improvement could be move to Precommit IT tests
> into a separate suite from precommit tests, and run it in parallel.
> >> >>>>>>>>>> >>>>>
> >> >>>>>>>>>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette <
> [email protected]> wrote:
> >> >>>>>>>>>> >>>>>>
> >> >>>>>>>>>> >>>>>> Python Precommits are taking quite a while now [1].
> Just visually it looks like the average length is 1.5h or so, but it spikes
> up to 2h. I've had several precommit runs get aborted due to the 2 hour
> limit.
> >> >>>>>>>>>> >>>>>>
> >> >>>>>>>>>> >>>>>> It looks like there was a spike up above 1h back on
> 9/6 and the duration has been steadily rising since then. Is there anything
> we can do about this?
> >> >>>>>>>>>> >>>>>>
> >> >>>>>>>>>> >>>>>> Brian
> >> >>>>>>>>>> >>>>>>
> >> >>>>>>>>>> >>>>>> [1]
> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
>

Re: Python Precommit duration pushing 2 hours

Reply via email to