+1 to moving forward with this

Could we move GCP tests outside the core? Then only code changes
touching or affecting GCP would cause them to run in precommit. Could still run
them in postcommit in their own suite. If the core has reasonably stable
abstractions that the connectors are built on, this should not change
coverage much.

Kenn

On Mon, Nov 4, 2019 at 1:55 PM Ahmet Altay <al...@google.com> wrote:

> PR for the proposed change: https://github.com/apache/beam/pull/9985
>
> On Mon, Nov 4, 2019 at 1:35 PM Udi Meiri <eh...@google.com> wrote:
>
>> +1
>>
>> On Mon, Nov 4, 2019 at 12:09 PM Robert Bradshaw <rober...@google.com>
>> wrote:
>>
>>> +1, this seems like a good step with a clear win.
>>>
>>> On Mon, Nov 4, 2019 at 12:06 PM Ahmet Altay <al...@google.com> wrote:
>>> >
>>> > Python precommits are still timing out on #9925. I am guessing that
>>> means this change would not be enough.
>>> >
>>> > I am proposing cutting down the number of test variants we run in
>>> precommits. Currently, for each version, we run the following variants
>>> serially:
>>> > - base: Runs all unit tests with tox
>>> > - Cython: Installs cython and runs the same unit tests as the base version. The
>>> original purpose was to ensure that tests pass with or without cython.
>>> There is probably a huge overlap with base. (IIRC only a few coders have
>>> different slow vs fast tests.)
>>> > - GCP: Installs GCP dependencies and runs all base tests plus additional
>>> GCP-specific tests. The original purpose was to ensure that GCP is an optional
>>> component and that all non-GCP tests still work without GCP components.
>>> >
>>> > We can reduce the list to cython + GCP tests only. This will cover the
>>> same group of tests and will check that tests pass with or without cython
>>> or GCP dependencies. This could reduce the precommit time by ~30 minutes.
>>> >
>>> > What do you think?
>>> >
>>> > Ahmet
>>> >
>>> >
>>> > On Tue, Oct 29, 2019 at 11:15 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>> >>
>>> >> https://github.com/apache/beam/pull/9925
>>> >>
>>> >> On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri <eh...@google.com> wrote:
>>> >> >
>>> >> > I don't have the bandwidth right now to tackle this. Feel free to
>>> take it.
>>> >> >
>>> >> > On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> >> >>
>>> >> >> The Python SDK does as well. These calls are coming from
>>> >> >> to_runner_api, is_stateful_dofn, and validate_stateful_dofn, which are
>>> >> >> invoked once per pipeline or bundle. They are, however, surprisingly
>>> >> >> expensive. Even memoizing across those three calls should save a
>>> >> >> significant amount of time. Udi, did you want to tackle this?
>>> >> >>
>>> >> >> Looking at the profile, Pipeline.to_runner_api() is being called 30
>>> >> >> times in this test, and [Applied]PTransform.to_fn_api being called
>>> >> >> 3111 times, so that in itself might be interesting to investigate.
>>> >> >>
>>> >> >> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke <rob...@frantil.com>
>>> wrote:
>>> >> >> >
>>> >> >> > As does the Go SDK. Invokers are memoized and when possible code
>>> is generated to avoid reflection.
>>> >> >> >
>>> >> >> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles <k...@google.com>
>>> wrote:
>>> >> >> >>
>>> >> >> >> Noting for the benefit of the thread archive in case someone
>>> goes digging and wonders if this affects other SDKs: the Java SDK memoizes
>>> DoFnSignatures and generated DoFnInvoker classes.
>>> >> >> >>
>>> >> >> >> Kenn
>>> >> >> >>
>>> >> >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri <eh...@google.com>
>>> wrote:
>>> >> >> >>>
>>> >> >> >>> Re: #9283 slowing down tests, ideas for slowness:
>>> >> >> >>> 1. I added a lot of test cases, some with locally run
>>> pipelines.
>>> >> >> >>> 2. The PR somehow changed how coders are selected, and now
>>> we're using less efficient ones.
>>> >> >> >>> 3. New dependency funcsigs is slowing things down? (py2 only)
>>> >> >> >>>
>>> >> >> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7
>>> and 3.7 and got these cool graphs (attached).
>>> >> >> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time
>>> (IIUC), gets called ~230k times
>>> >> >> >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k
>>> times
>>> >> >> >>>
>>> >> >> >>> After memoization of get_function_args_defaults:
>>> >> >> >>> 2.7: core:294:get_function_arguments 20.02%
>>> >> >> >>> 3.7: core:294:get_function_arguments 8.11%
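
A minimal sketch of the kind of memoization described here (illustrative only, not the actual Beam change; the helper name matches the one profiled above, but the body and the use of functools.lru_cache, which is Python 3 only, are assumptions):

    import functools
    import inspect

    @functools.lru_cache(maxsize=None)
    def get_function_args_defaults(f):
      # Compute argument names and defaults once per function object and
      # reuse the cached result instead of re-running reflection on every
      # to_runner_api()/validation call.
      spec = inspect.getfullargspec(f)
      return spec.args, list(spec.defaults or ())
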
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada <
>>> pabl...@google.com> wrote:
>>> >> >> >>>>
>>> >> >> >>>> *not deciles, but 9-percentiles : )
>>> >> >> >>>>
>>> >> >> >>>> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada <
>>> pabl...@google.com> wrote:
>>> >> >> >>>>>
>>> >> >> >>>>> I've run the tests in Python 2 (without cython), and used a
>>> utility to track runtime for each test method. I found some of the
>>> following things:
>>> >> >> >>>>> - Total test methods run: 2665
>>> >> >> >>>>> - Total test runtime: 990 seconds
>>> >> >> >>>>> - Deciles of time spent:
>>> >> >> >>>>>   - 1949 tests run in the first 9% of time
>>> >> >> >>>>>   - 173 in the 9-18% range
>>> >> >> >>>>>   - 130 in the 18-27% range
>>> >> >> >>>>>   - 95 in the 27-36% range
>>> >> >> >>>>>   - 77
>>> >> >> >>>>>   - 66
>>> >> >> >>>>>   - 55
>>> >> >> >>>>>   - 46
>>> >> >> >>>>>   - 37
>>> >> >> >>>>>   - 24
>>> >> >> >>>>>   - 13 tests run in the last 9% of time. This represents
>>> about 1 minute and a half.
>>> >> >> >>>>>
>>> >> >> >>>>> We may be able to look at the slowest X tests, and get
>>> gradual improvements from there. Although it seems .. not dramatic ones : )
>>> >> >> >>>>>
>>> >> >> >>>>> FWIW I uploaded the results here:
>>> https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json
>>> >> >> >>>>>
>>> >> >> >>>>> The slowest 13 tests were:
>>> >> >> >>>>>
>>> >> >> >>>>>
>>> >> >> >>>>> [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', 5.253582000732422),
>>> >> >> >>>>>  ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', 7.907713890075684),
>>> >> >> >>>>>  ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', 5.237942934036255),
>>> >> >> >>>>>  ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', 5.563946008682251),
>>> >> >> >>>>>  ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', 5.680700063705444),
>>> >> >> >>>>>  ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', 6.111238956451416),
>>> >> >> >>>>>  ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', 6.007534980773926),
>>> >> >> >>>>>  ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', 13.993916988372803),
>>> >> >> >>>>>  ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', 6.3383049964904785),
>>> >> >> >>>>>  ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', 9.157485008239746),
>>> >> >> >>>>>  ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', 5.191173076629639),
>>> >> >> >>>>>  ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', 6.2221620082855225),
>>> >> >> >>>>>  ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', 7.7187910079956055)]
>>> >> >> >>>>>
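
For reference, per-test timings like the ones above can also be collected with a small pytest plugin; this is only a rough sketch under that assumption (hook-based, writing to a hypothetical test_times.json), not the utility actually used to produce nosetimes.json:

    # conftest.py
    import json

    _durations = {}

    def pytest_runtest_logreport(report):
      # Record only the 'call' phase, excluding setup/teardown time.
      if report.when == 'call':
        _durations[report.nodeid] = report.duration

    def pytest_sessionfinish(session, exitstatus):
      with open('test_times.json', 'w') as f:
        json.dump(_durations, f, indent=2)
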
>>> >> >> >>>>> On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada <
>>> pabl...@google.com> wrote:
>>> >> >> >>>>>>
>>> >> >> >>>>>> I have written https://github.com/apache/beam/pull/9910 to
>>> reduce FnApiRunnerTest variations.
>>> >> >> >>>>>> I'm not in a rush to merge, but rather happy to start a
>>> discussion.
>>> >> >> >>>>>> I'll also try to figure out if there are other tests
>>> slowing down the suite significantly.
>>> >> >> >>>>>> Best
>>> >> >> >>>>>> -P.
>>> >> >> >>>>>>
>>> >> >> >>>>>> On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>> >> >> >>>>>>>
>>> >> >> >>>>>>> Thanks, Brian.
>>> >> >> >>>>>>> +Udi Meiri
>>> >> >> >>>>>>> As a next step, it would be good to know whether the slowdown is
>>> caused by tests in this PR or by its effect on other tests, and to confirm
>>> that only Python 2 codepaths were affected.
>>> >> >> >>>>>>>
>>> >> >> >>>>>>> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette <
>>> bhule...@google.com> wrote:
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> I did a bisect based on the runtime of `./gradlew
>>> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1
>>> and 9/15 to see if I could find the source of the spike that happened
>>> around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this
>>> search would reveal some misguided configuration change, but as far as I
>>> can tell 9283 just added a well-tested feature. I don't think there's
>>> anything to learn from that... I just wanted to circle back about it in
>>> case others are curious about that spike.
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> I'm +1 on bumping some FnApiRunner configurations.
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> Brian
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> [1] https://github.com/apache/beam/pull/9283
>>> >> >> >>>>>>>>
>>> >> >> >>>>>>>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada <
>>> pabl...@google.com> wrote:
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>> I think it makes sense to remove some of the extra
>>> FnApiRunner configurations. Perhaps some of the multi-worker ones and some of
>>> the gRPC versions?
>>> >> >> >>>>>>>>> Best
>>> >> >> >>>>>>>>> -P.
>>> >> >> >>>>>>>>>
>>> >> >> >>>>>>>>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> It looks like fn_api_runner_test.py is quite expensive,
>>> taking 10-15+
>>> >> >> >>>>>>>>>> minutes on each version of Python. This test consists
>>> of a base class
>>> >> >> >>>>>>>>>> that is basically a ValidatesRunner suite, and is then
>>> run in several
>>> >> >> >>>>>>>>>> configurations, many more of which (including some
>>> expensive ones)
>>> >> >> >>>>>>>>>> have been added lately.
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> class FnApiRunnerTest(unittest.TestCase):
>>> >> >> >>>>>>>>>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
>>> >> >> >>>>>>>>>> class
>>> FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
>>> >> >> >>>>>>>>>> class
>>> FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
>>> >> >> >>>>>>>>>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
>>> >> >> >>>>>>>>>> class
>>> FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>>> >> >> >>>>>>>>>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
>>> >> >> >>>>>>>>>> class
>>> FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
>>> >> >> >>>>>>>>>>
>>> >> >> >>>>>>>>>> I'm not convinced we need to run all of these
>>> permutations, or at
>>> >> >> >>>>>>>>>> least not all tests in all permutations.
>>> >> >> >>>>>>>>>>
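
One possible way to trim the matrix without dropping a variant outright is to skip individual inherited tests in the expensive subclasses. A rough sketch, assuming the FnApiRunnerTest base class listed above (the overridden test name is hypothetical, and this is not necessarily what PR 9910 does):

    import unittest

    class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):

      def test_pardo_side_inputs(self):
        # Hypothetical: behavior already covered by the cheaper in-process,
        # single-worker variant, so skip it in this expensive configuration.
        raise unittest.SkipTest('covered by FnApiRunnerTest')
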
>>> >> >> >>>>>>>>>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev
>>> >> >> >>>>>>>>>> <valen...@google.com> wrote:
>>> >> >> >>>>>>>>>> >
>>> >> >> >>>>>>>>>> > I took another look at this and precommit ITs are
>>> already running in parallel, albeit in the same suite. However, it appears
>>> Python precommits became slower, especially Python 2 precommits [35 min per
>>> suite x 3 suites], see [1]. Not sure yet what caused the increase, but
>>> precommits used to be faster. Perhaps we have added a slow test or a lot of
>>> new tests.
>>> >> >> >>>>>>>>>> >
>>> >> >> >>>>>>>>>> > [1]
>>> https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
>>> >> >> >>>>>>>>>> >
>>> >> >> >>>>>>>>>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay <
>>> al...@google.com> wrote:
>>> >> >> >>>>>>>>>> >>
>>> >> >> >>>>>>>>>> >> Ack. Separating precommit ITs to a different suite
>>> sounds good. Is anyone interested in doing that?
>>> >> >> >>>>>>>>>> >>
>>> >> >> >>>>>>>>>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>> >> >> >>>>>>>>>> >>>
>>> >> >> >>>>>>>>>> >>> This should not increase the queue time
>>> substantially, since precommit ITs are running sequentially with precommit
>>> tests, unlike multiple precommit tests, which run in parallel with each other.
>>> >> >> >>>>>>>>>> >>>
>>> >> >> >>>>>>>>>> >>> The precommit ITs we run are batch and streaming
>>> wordcount tests on Py2 and one Py3 version, so it's not a lot of tests.
>>> >> >> >>>>>>>>>> >>>
>>> >> >> >>>>>>>>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay <
>>> al...@google.com> wrote:
>>> >> >> >>>>>>>>>> >>>>
>>> >> >> >>>>>>>>>> >>>> +1 to separating ITs from precommit. The downside
>>> would be that, when Chad tried to do something similar [1], it was noted that the
>>> total time to run all precommit tests would increase and potentially also
>>> increase the queue time.
>>> >> >> >>>>>>>>>> >>>>
>>> >> >> >>>>>>>>>> >>>> Another alternative: we could run a smaller set of
>>> IT tests in precommits and run the whole suite as part of postcommit tests.
>>> >> >> >>>>>>>>>> >>>>
>>> >> >> >>>>>>>>>> >>>> [1] https://github.com/apache/beam/pull/9642
>>> >> >> >>>>>>>>>> >>>>
>>> >> >> >>>>>>>>>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn
>>> Tymofieiev <valen...@google.com> wrote:
>>> >> >> >>>>>>>>>> >>>>>
>>> >> >> >>>>>>>>>> >>>>> One improvement could be to move precommit IT
>>> tests into a separate suite from precommit tests, and run it in parallel.
>>> >> >> >>>>>>>>>> >>>>>
>>> >> >> >>>>>>>>>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette <
>>> bhule...@google.com> wrote:
>>> >> >> >>>>>>>>>> >>>>>>
>>> >> >> >>>>>>>>>> >>>>>> Python Precommits are taking quite a while now
>>> [1]. Just visually it looks like the average length is 1.5h or so, but it
>>> spikes up to 2h. I've had several precommit runs get aborted due to the 2
>>> hour limit.
>>> >> >> >>>>>>>>>> >>>>>>
>>> >> >> >>>>>>>>>> >>>>>> It looks like there was a spike up above 1h back
>>> on 9/6 and the duration has been steadily rising since then. Is there
>>> anything we can do about this?
>>> >> >> >>>>>>>>>> >>>>>>
>>> >> >> >>>>>>>>>> >>>>>> Brian
>>> >> >> >>>>>>>>>> >>>>>>
>>> >> >> >>>>>>>>>> >>>>>> [1]
>>> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
>>>
>>
