+1, this seems like a good step with a clear win.
On Mon, Nov 4, 2019 at 12:06 PM Ahmet Altay <al...@google.com> wrote:
>
> Python precommits are still timing out on #9925. I am guessing that means
> this change would not be enough.
>
> I am proposing cutting down the number of test variants we run in
> precommits. Currently, for each version we run the following variants
> serially:
> - base: Runs all unit tests with tox.
> - Cython: Installs cython and runs all unit tests, same as the base
>   version. The original purpose was to ensure that tests pass with or
>   without cython. There is probably a huge overlap with base. (IIRC only
>   a few coders have different slow vs. fast tests.)
> - GCP: Installs GCP dependencies and runs all base tests plus additional
>   GCP-specific tests. The original purpose was to ensure that GCP is an
>   optional component and that all non-GCP tests still work without the
>   GCP components.
>
> We can reduce the list to the Cython + GCP variants only. This will cover
> the same group of tests and will still check that tests pass with or
> without cython or GCP dependencies. This could reduce the precommit time
> by ~30 minutes.
>
> What do you think?
>
> Ahmet
>
>
> On Tue, Oct 29, 2019 at 11:15 AM Robert Bradshaw <rober...@google.com> wrote:
>>
>> https://github.com/apache/beam/pull/9925
>>
>> On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri <eh...@google.com> wrote:
>> >
>> > I don't have the bandwidth right now to tackle this. Feel free to take it.
>> >
>> > On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw <rober...@google.com> wrote:
>> >>
>> >> The Python SDK does as well. These calls are coming from to_runner_api,
>> >> is_stateful_dofn, and validate_stateful_dofn, which are invoked once per
>> >> pipeline or bundle. They are, however, surprisingly expensive. Even
>> >> memoizing across those three calls should save a significant amount of
>> >> time. Udi, did you want to tackle this?
>> >>
>> >> Looking at the profile, Pipeline.to_runner_api() is being called 30
>> >> times in this test, and [Applied]PTransform.to_fn_api is being called
>> >> 3111 times, so that in itself might be interesting to investigate.
>> >>
>> >> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke <rob...@frantil.com> wrote:
>> >> >
>> >> > As does the Go SDK. Invokers are memoized and, when possible, code is
>> >> > generated to avoid reflection.
>> >> >
>> >> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles <k...@google.com> wrote:
>> >> >>
>> >> >> Noting for the benefit of the thread archive, in case someone goes
>> >> >> digging and wonders if this affects other SDKs: the Java SDK memoizes
>> >> >> DoFnSignatures and generated DoFnInvoker classes.
>> >> >>
>> >> >> Kenn
>> >> >>
>> >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri <eh...@google.com> wrote:
>> >> >>>
>> >> >>> Re: #9283 slowing down tests, ideas for the slowness:
>> >> >>> 1. I added a lot of test cases, some with locally run pipelines.
>> >> >>> 2. The PR somehow changed how coders are selected, and now we're
>> >> >>> using less efficient ones.
>> >> >>> 3. The new dependency funcsigs is slowing things down? (py2 only)
>> >> >>>
>> >> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7
>> >> >>> and got these cool graphs (attached).
>> >> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC)
>> >> >>> and gets called ~230k times.
>> >> >>> 3.7: core:294:get_function_arguments takes 30.88% and gets called
>> >> >>> ~200k times.
>> >> >>>
>> >> >>> After memoization of get_function_args_defaults:
>> >> >>> 2.7: core:294:get_function_arguments 20.02%
>> >> >>> 3.7: core:294:get_function_arguments 8.11%
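Noting for later readers of the archive: the memoization being discussed above can be as simple as caching the introspection result per function object. Below is a rough sketch only, assuming Python 3's inspect.signature and functools.lru_cache; the helper name echoes the one mentioned above, but this is not the actual Beam implementation.

import functools
import inspect


@functools.lru_cache(maxsize=None)
def get_function_args_defaults(f):
  # Introspect once per function object and cache the result; the win comes
  # from avoiding repeated signature parsing per pipeline/bundle.
  signature = inspect.signature(f)
  args = [
      name for name, param in signature.parameters.items()
      if param.kind in (param.POSITIONAL_ONLY, param.POSITIONAL_OR_KEYWORD)
  ]
  defaults = [
      param.default for param in signature.parameters.values()
      if param.default is not param.empty
  ]
  return args, defaults

On Python 2, where funcsigs backfills inspect.signature, a plain dict keyed by the function object would serve the same purpose.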
>> >> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada <pabl...@google.com> wrote:
>> >> >>>>
>> >> >>>> *not deciles, but 9-percentiles : )
>> >> >>>>
>> >> >>>> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada <pabl...@google.com> wrote:
>> >> >>>>>
>> >> >>>>> I've run the tests in Python 2 (without cython) and used a utility
>> >> >>>>> to track the runtime of each test method. I found the following:
>> >> >>>>> - Total test methods run: 2665
>> >> >>>>> - Total test runtime: 990 seconds
>> >> >>>>> - Deciles of time spent:
>> >> >>>>>   - 1949 tests run in the first 9% of the time
>> >> >>>>>   - 173 in the 9-18% range
>> >> >>>>>   - 130 in the 18-27% range
>> >> >>>>>   - 95 in the 27-36% range
>> >> >>>>>   - 77
>> >> >>>>>   - 66
>> >> >>>>>   - 55
>> >> >>>>>   - 46
>> >> >>>>>   - 37
>> >> >>>>>   - 24
>> >> >>>>>   - 13 tests run in the last 9% of the time. This represents about
>> >> >>>>>     a minute and a half.
>> >> >>>>>
>> >> >>>>> We may be able to look at the slowest X tests and get gradual
>> >> >>>>> improvements from there, although it seems... not dramatic ones : )
>> >> >>>>>
>> >> >>>>> FWIW, I uploaded the results here:
>> >> >>>>> https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json
>> >> >>>>>
>> >> >>>>> The slowest 13 tests were:
>> >> >>>>>
>> >> >>>>> [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic',
>> >> >>>>>   5.253582000732422),
>> >> >>>>>  ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount',
>> >> >>>>>   7.907713890075684),
>> >> >>>>>  ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids',
>> >> >>>>>   5.237942934036255),
>> >> >>>>>  ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample',
>> >> >>>>>   5.563946008682251),
>> >> >>>>>  ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values',
>> >> >>>>>   5.680700063705444),
>> >> >>>>>  ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group',
>> >> >>>>>   6.111238956451416),
>> >> >>>>>  ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler',
>> >> >>>>>   6.007534980773926),
>> >> >>>>>  ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic',
>> >> >>>>>   13.993916988372803),
>> >> >>>>>  ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion',
>> >> >>>>>   6.3383049964904785),
>> >> >>>>>  ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count',
>> >> >>>>>   9.157485008239746),
>> >> >>>>>  ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs',
>> >> >>>>>   5.191173076629639),
>> >> >>>>>  ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large',
>> >> >>>>>   6.2221620082855225),
>> >> >>>>>  ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing',
>> >> >>>>>   7.7187910079956055)]
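As an aside for anyone reproducing these numbers: pytest can emit a similar slowest-tests report natively via `--durations` (e.g. `pytest --durations=20`), and given a dump like the JSON linked above, the same list is a few lines to recompute. A small sketch, assuming the file is a simple test-name-to-seconds mapping, which may not match the actual layout of that file:

import json

with open('nosetimes.json') as f:
  times = json.load(f)  # assumed layout: {test name: seconds}

# Print the 13 slowest tests, slowest last, mirroring the list above.
for name, seconds in sorted(times.items(), key=lambda kv: kv[1])[-13:]:
  print('%8.2fs  %s' % (seconds, name))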
>> >> >>>>>
>> >> >>>>> On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada <pabl...@google.com> wrote:
>> >> >>>>>>
>> >> >>>>>> I have written https://github.com/apache/beam/pull/9910 to reduce
>> >> >>>>>> the FnApiRunnerTest variations.
>> >> >>>>>> I'm not in a rush to merge, but rather happy to start a discussion.
>> >> >>>>>> I'll also try to figure out if there are other tests slowing down
>> >> >>>>>> the suite significantly.
>> >> >>>>>> Best
>> >> >>>>>> -P.
>> >> >>>>>>
>> >> >>>>>> On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev
>> >> >>>>>> <valen...@google.com> wrote:
>> >> >>>>>>>
>> >> >>>>>>> Thanks, Brian.
>> >> >>>>>>> +Udi Meiri
>> >> >>>>>>> As a next step, it would be good to know whether the slowdown is
>> >> >>>>>>> caused by the tests in this PR or by its effect on other tests,
>> >> >>>>>>> and to confirm that only Python 2 codepaths were affected.
>> >> >>>>>>>
>> >> >>>>>>> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette
>> >> >>>>>>> <bhule...@google.com> wrote:
>> >> >>>>>>>>
>> >> >>>>>>>> I did a bisect based on the runtime of `./gradlew
>> >> >>>>>>>> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits
>> >> >>>>>>>> between 9/1 and 9/15 to see if I could find the source of the
>> >> >>>>>>>> spike that happened around 9/6. It looks like it was due to
>> >> >>>>>>>> PR#9283 [1]. I thought maybe this search would reveal some
>> >> >>>>>>>> misguided configuration change, but as far as I can tell 9283
>> >> >>>>>>>> just added a well-tested feature. I don't think there's anything
>> >> >>>>>>>> to learn from that... I just wanted to circle back about it in
>> >> >>>>>>>> case others are curious about that spike.
>> >> >>>>>>>>
>> >> >>>>>>>> I'm +1 on trimming some FnApiRunner configurations.
>> >> >>>>>>>>
>> >> >>>>>>>> Brian
>> >> >>>>>>>>
>> >> >>>>>>>> [1] https://github.com/apache/beam/pull/9283
>> >> >>>>>>>>
>> >> >>>>>>>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada
>> >> >>>>>>>> <pabl...@google.com> wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>> I think it makes sense to remove some of the extra FnApiRunner
>> >> >>>>>>>>> configurations. Perhaps some of the multi-worker ones and some
>> >> >>>>>>>>> of the gRPC versions?
>> >> >>>>>>>>> Best
>> >> >>>>>>>>> -P.
>> >> >>>>>>>>>
>> >> >>>>>>>>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw
>> >> >>>>>>>>> <rober...@google.com> wrote:
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> It looks like fn_api_runner_test.py is quite expensive, taking
>> >> >>>>>>>>>> 10-15+ minutes on each version of Python. This test consists
>> >> >>>>>>>>>> of a base class that is basically a validates-runner suite,
>> >> >>>>>>>>>> and is then run in several configurations, many more of which
>> >> >>>>>>>>>> (including some expensive ones) have been added lately.
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> class FnApiRunnerTest(unittest.TestCase):
>> >> >>>>>>>>>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
>> >> >>>>>>>>>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
>> >> >>>>>>>>>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
>> >> >>>>>>>>>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
>> >> >>>>>>>>>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>> >> >>>>>>>>>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
>> >> >>>>>>>>>> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> I'm not convinced we need to run all of these permutations, or
>> >> >>>>>>>>>> at least not all tests in all permutations.
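One low-risk way to thin that matrix out, rather than deleting subclasses outright, would be to gate the expensive permutations behind an opt-in so a postcommit job can still exercise them. A rough sketch only, not what the PR above actually does; the environment variable and the stub base class here are made up for illustration:

import os
import unittest


class FnApiRunnerTest(unittest.TestCase):
  # Stand-in for the real validates-runner suite quoted above.
  def test_something(self):
    self.assertTrue(True)


@unittest.skipUnless(
    os.environ.get('RUN_EXPENSIVE_FN_API_PERMUTATIONS'),
    'expensive permutation; run in postcommit only')
class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
  pass


if __name__ == '__main__':
  unittest.main()

The same effect could be had by keeping the expensive subclasses in a separate module that only the postcommit suite collects.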
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev
>> >> >>>>>>>>>> <valen...@google.com> wrote:
>> >> >>>>>>>>>> >
>> >> >>>>>>>>>> > I took another look at this, and the precommit ITs are
>> >> >>>>>>>>>> > already running in parallel, albeit in the same suite.
>> >> >>>>>>>>>> > However, it appears Python precommits became slower,
>> >> >>>>>>>>>> > especially the Python 2 precommits [35 min per suite x 3
>> >> >>>>>>>>>> > suites], see [1]. Not sure yet what caused the increase, but
>> >> >>>>>>>>>> > precommits used to be faster. Perhaps we have added a slow
>> >> >>>>>>>>>> > test or a lot of new tests.
>> >> >>>>>>>>>> >
>> >> >>>>>>>>>> > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
>> >> >>>>>>>>>> >
>> >> >>>>>>>>>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay
>> >> >>>>>>>>>> > <al...@google.com> wrote:
>> >> >>>>>>>>>> >>
>> >> >>>>>>>>>> >> Ack. Separating the precommit ITs into a different suite
>> >> >>>>>>>>>> >> sounds good. Is anyone interested in doing that?
>> >> >>>>>>>>>> >>
>> >> >>>>>>>>>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev
>> >> >>>>>>>>>> >> <valen...@google.com> wrote:
>> >> >>>>>>>>>> >>>
>> >> >>>>>>>>>> >>> This should not increase the queue time substantially,
>> >> >>>>>>>>>> >>> since the precommit ITs run sequentially with the
>> >> >>>>>>>>>> >>> precommit tests, unlike the multiple precommit tests,
>> >> >>>>>>>>>> >>> which run in parallel to each other.
>> >> >>>>>>>>>> >>>
>> >> >>>>>>>>>> >>> The precommit ITs we run are batch and streaming wordcount
>> >> >>>>>>>>>> >>> tests on Py2 and one Py3 version, so it's not a lot of
>> >> >>>>>>>>>> >>> tests.
>> >> >>>>>>>>>> >>>
>> >> >>>>>>>>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay
>> >> >>>>>>>>>> >>> <al...@google.com> wrote:
>> >> >>>>>>>>>> >>>>
>> >> >>>>>>>>>> >>>> +1 to separating ITs from precommits. The downside would
>> >> >>>>>>>>>> >>>> be that, when Chad tried to do something similar [1], it
>> >> >>>>>>>>>> >>>> was noted that the total time to run all precommit tests
>> >> >>>>>>>>>> >>>> would increase and also potentially increase the queue
>> >> >>>>>>>>>> >>>> time.
>> >> >>>>>>>>>> >>>>
>> >> >>>>>>>>>> >>>> Another alternative: we could run a smaller set of IT
>> >> >>>>>>>>>> >>>> tests in precommits and run the whole suite as part of
>> >> >>>>>>>>>> >>>> the postcommit tests.
>> >> >>>>>>>>>> >>>>
>> >> >>>>>>>>>> >>>> [1] https://github.com/apache/beam/pull/9642
>> >> >>>>>>>>>> >>>>
>> >> >>>>>>>>>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev
>> >> >>>>>>>>>> >>>> <valen...@google.com> wrote:
>> >> >>>>>>>>>> >>>>>
>> >> >>>>>>>>>> >>>>> One improvement could be to move the precommit IT tests
>> >> >>>>>>>>>> >>>>> into a separate suite from the precommit tests, and run
>> >> >>>>>>>>>> >>>>> it in parallel.
>> >> >>>>>>>>>> >>>>>
>> >> >>>>>>>>>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette
>> >> >>>>>>>>>> >>>>> <bhule...@google.com> wrote:
>> >> >>>>>>>>>> >>>>>>
>> >> >>>>>>>>>> >>>>>> Python precommits are taking quite a while now [1].
>> >> >>>>>>>>>> >>>>>> Just visually, it looks like the average length is 1.5h
>> >> >>>>>>>>>> >>>>>> or so, but it spikes up to 2h. I've had several
>> >> >>>>>>>>>> >>>>>> precommit runs get aborted due to the 2 hour limit.
>> >> >>>>>>>>>> >>>>>>
>> >> >>>>>>>>>> >>>>>> It looks like there was a spike up above 1h back on 9/6,
>> >> >>>>>>>>>> >>>>>> and the duration has been steadily rising since then.
>> >> >>>>>>>>>> >>>>>> Is there anything we can do about this?
>> >> >>>>>>>>>> >>>>>>
>> >> >>>>>>>>>> >>>>>> Brian
>> >> >>>>>>>>>> >>>>>>
>> >> >>>>>>>>>> >>>>>> [1] http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4