https://github.com/apache/beam/pull/9925
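The linked PR appears to be the follow-up to the memoization discussion below. As a rough sketch of the general technique, and not the PR's actual code: the signature-inspection helper can be cached per function object. This sketch assumes Python 3's inspect module; as noted downthread, Python 2 went through the funcsigs backport instead. The helper body here is illustrative only.

    import functools
    import inspect

    @functools.lru_cache(maxsize=None)
    def get_function_args_defaults(fn):
        # inspect.signature() is surprisingly expensive when invoked hundreds
        # of thousands of times; caching per function object amortizes the cost.
        signature = inspect.signature(fn)
        args = [name for name, p in signature.parameters.items()
                if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
        defaults = [p.default for p in signature.parameters.values()
                    if p.default is not p.empty]
        return args, defaults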
On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri <eh...@google.com> wrote:
>
> I don't have the bandwidth right now to tackle this. Feel free to take it.
>
> On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw <rober...@google.com> wrote:
>>
>> The Python SDK does as well. These calls are coming from to_runner_api, is_stateful_dofn, and validate_stateful_dofn, which are invoked once per pipeline or bundle. They are, however, surprisingly expensive. Even memoizing across those three calls should save a significant amount of time. Udi, did you want to tackle this?
>>
>> Looking at the profile, Pipeline.to_runner_api() is being called 30 times in this test, and [Applied]PTransform.to_fn_api is being called 3111 times, so that in itself might be interesting to investigate.
>>
>> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke <rob...@frantil.com> wrote:
>> >
>> > As does the Go SDK. Invokers are memoized, and when possible code is generated to avoid reflection.
>> >
>> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles <k...@google.com> wrote:
>> >>
>> >> Noting for the benefit of the thread archive in case someone goes digging and wonders if this affects other SDKs: the Java SDK memoizes DoFnSignatures and generated DoFnInvoker classes.
>> >>
>> >> Kenn
>> >>
>> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri <eh...@google.com> wrote:
>> >>>
>> >>> Re: #9283 slowing down tests, ideas for the slowness:
>> >>> 1. I added a lot of test cases, some with locally run pipelines.
>> >>> 2. The PR somehow changed how coders are selected, and now we're using less efficient ones.
>> >>> 3. The new dependency funcsigs is slowing things down? (py2 only)
>> >>>
>> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 and got these cool graphs (attached).
>> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), gets called ~230k times
>> >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times
>> >>>
>> >>> After memoization of get_function_args_defaults:
>> >>> 2.7: core:294:get_function_arguments 20.02%
>> >>> 3.7: core:294:get_function_arguments 8.11%
>> >>>
>> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada <pabl...@google.com> wrote:
>> >>>>
>> >>>> *not deciles, but 9-percentiles : )
>> >>>>
>> >>>> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada <pabl...@google.com> wrote:
>> >>>>>
>> >>>>> I've run the tests in Python 2 (without cython) and used a utility to track the runtime of each test method. I found the following:
>> >>>>> - Total test methods run: 2665
>> >>>>> - Total test runtime: 990 seconds
>> >>>>> - Deciles of time spent:
>> >>>>>   - 1949 tests run in the first 9% of time
>> >>>>>   - 173 in the 9-18% range
>> >>>>>   - 130 in the 18-27% range
>> >>>>>   - 95 in the 27-36% range
>> >>>>>   - 77
>> >>>>>   - 66
>> >>>>>   - 55
>> >>>>>   - 46
>> >>>>>   - 37
>> >>>>>   - 24
>> >>>>>   - 13 tests run in the last 9% of time. This represents about a minute and a half.
>> >>>>>
>> >>>>> We may be able to look at the slowest X tests and get gradual improvements from there, although it seems not dramatic ones : )
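Pablo's bucket breakdown can be reproduced from the uploaded timing data with a short script along these lines. This is only a sketch: it assumes nosetimes.json maps test ids to runtimes in seconds, which the thread does not confirm.

    import json

    # Load per-test runtimes (assumed shape: {'test.id': seconds, ...}).
    with open('nosetimes.json') as f:
        times = json.load(f)

    durations = sorted(times.values())
    total = sum(durations)

    # Count how many tests fall into each of eleven ~9% slices of
    # cumulative runtime, fastest tests first.
    buckets = [0] * 11
    elapsed = 0.0
    for d in durations:
        elapsed += d
        idx = min(int(elapsed / total * 11), 10)
        buckets[idx] += 1
    print(buckets)  # roughly [1949, 173, 130, ...]: many fast tests, a few slow ones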
>> >>>>> FWIW I uploaded the results here:
>> >>>>> https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json
>> >>>>>
>> >>>>> The slowest 13 tests were:
>> >>>>>
>> >>>>> [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', 5.253582000732422),
>> >>>>> ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', 7.907713890075684),
>> >>>>> ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', 5.237942934036255),
>> >>>>> ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', 5.563946008682251),
>> >>>>> ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', 5.680700063705444),
>> >>>>> ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', 6.111238956451416),
>> >>>>> ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', 6.007534980773926),
>> >>>>> ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', 13.993916988372803),
>> >>>>> ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', 6.3383049964904785),
>> >>>>> ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', 9.157485008239746),
>> >>>>> ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', 5.191173076629639),
>> >>>>> ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', 6.2221620082855225),
>> >>>>> ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', 7.7187910079956055)]
>> >>>>>
>> >>>>> On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada <pabl...@google.com> wrote:
>> >>>>>>
>> >>>>>> I have written https://github.com/apache/beam/pull/9910 to reduce FnApiRunnerTest variations.
>> >>>>>> I'm not in a rush to merge, but rather happy to start a discussion.
>> >>>>>> I'll also try to figure out if there are other tests slowing down the suite significantly.
>> >>>>>> Best
>> >>>>>> -P.
>> >>>>>>
>> >>>>>> On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>> >>>>>>>
>> >>>>>>> Thanks, Brian.
>> >>>>>>> +Udi Meiri
>> >>>>>>> As a next step, it would be good to know whether the slowdown is caused by the tests added in this PR or by its effect on other tests, and to confirm that only Python 2 codepaths were affected.
>> >>>>>>>
>> >>>>>>> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette <bhule...@google.com> wrote:
>> >>>>>>>>
>> >>>>>>>> I did a bisect based on the runtime of `./gradlew :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1 and 9/15 to see if I could find the source of the spike that happened around 9/6. It looks like it was due to PR #9283 [1]. I thought maybe this search would reveal some misguided configuration change, but as far as I can tell #9283 just added a well-tested feature. I don't think there's anything to learn from that... I just wanted to circle back about it in case others are curious about that spike.
>> >>>>>>>>
>> >>>>>>>> I'm +1 on dropping some FnApiRunner configurations.
>> >>>>>>>>
>> >>>>>>>> Brian
>> >>>>>>>>
>> >>>>>>>> [1] https://github.com/apache/beam/pull/9283
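Brian's runtime-based bisect lends itself to automation with `git bisect run`. Below is a sketch of a predicate script; the script name, the threshold, and the exact invocation are illustrative assumptions, not what Brian actually ran.

    #!/usr/bin/env python
    # Hypothetical predicate for `git bisect run python check_suite_time.py`:
    # exit 0 ("good") if the suite finishes under a threshold, 1 ("bad") otherwise.
    import subprocess
    import sys
    import time

    THRESHOLD_SECONDS = 45 * 60  # a guess; pick a value between "fast" and "slow" runs

    start = time.time()
    returncode = subprocess.call(
        ['./gradlew', ':sdks:python:test-suites:tox:py2:testPy2Gcp'])
    elapsed = time.time() - start

    if returncode != 0:
        sys.exit(125)  # special exit code telling git bisect to skip this commit
    sys.exit(0 if elapsed < THRESHOLD_SECONDS else 1)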
>> >>>>>>>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada <pabl...@google.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> I think it makes sense to remove some of the extra FnApiRunner configurations. Perhaps some of the multi-worker ones and some of the gRPC versions?
>> >>>>>>>>> Best
>> >>>>>>>>> -P.
>> >>>>>>>>>
>> >>>>>>>>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw <rober...@google.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ minutes on each version of Python. This test consists of a base class that is basically a validates-runner suite, and it is then run in several configurations, many more of which (including some expensive ones) have been added lately.
>> >>>>>>>>>>
>> >>>>>>>>>> class FnApiRunnerTest(unittest.TestCase):
>> >>>>>>>>>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
>> >>>>>>>>>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
>> >>>>>>>>>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
>> >>>>>>>>>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
>> >>>>>>>>>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>> >>>>>>>>>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
>> >>>>>>>>>> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
>> >>>>>>>>>>
>> >>>>>>>>>> I'm not convinced we need to run all of these permutations, or at least not all tests in all permutations.
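Since each subclass inherits every test method of the base class, one lightweight way to get "not all tests in all permutations" is to skip selected cases in the derived configurations. A sketch of the pattern, with a hypothetical test name:

    import unittest

    class FnApiRunnerTest(unittest.TestCase):
        def test_expensive_pipeline(self):
            pass  # hypothetical slow test, exercised fully in the base configuration

    class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
        # Inherits the whole suite, but opts out of cases whose coverage
        # does not depend on the transport or worker configuration.
        def test_expensive_pipeline(self):
            raise unittest.SkipTest('covered by the base FnApiRunnerTest configuration')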
>> >>>>>>>>>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev <valen...@google.com> wrote:
>> >>>>>>>>>> >
>> >>>>>>>>>> > I took another look at this, and precommit ITs are already running in parallel, albeit in the same suite. However, it appears Python precommits became slower, especially the Python 2 precommits [35 min per suite x 3 suites], see [1]. Not sure yet what caused the increase, but precommits used to be faster. Perhaps we have added a slow test or a lot of new tests.
>> >>>>>>>>>> >
>> >>>>>>>>>> > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
>> >>>>>>>>>> >
>> >>>>>>>>>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay <al...@google.com> wrote:
>> >>>>>>>>>> >>
>> >>>>>>>>>> >> Ack. Separating precommit ITs into a different suite sounds good. Is anyone interested in doing that?
>> >>>>>>>>>> >>
>> >>>>>>>>>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>> >>>>>>>>>> >>>
>> >>>>>>>>>> >>> This should not increase the queue time substantially, since precommit ITs run sequentially with precommit tests, unlike multiple precommit tests, which run in parallel to each other.
>> >>>>>>>>>> >>>
>> >>>>>>>>>> >>> The precommit ITs we run are batch and streaming wordcount tests on Py2 and one Py3 version, so it's not a lot of tests.
>> >>>>>>>>>> >>>
>> >>>>>>>>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay <al...@google.com> wrote:
>> >>>>>>>>>> >>>>
>> >>>>>>>>>> >>>> +1 to separating ITs from precommit. The downside would be that, when Chad tried to do something similar [1], it was noted that the total time to run all precommit tests would increase, and queue time could potentially increase as well.
>> >>>>>>>>>> >>>>
>> >>>>>>>>>> >>>> Another alternative: we could run a smaller set of IT tests in precommits and run the whole suite as part of postcommit tests.
>> >>>>>>>>>> >>>>
>> >>>>>>>>>> >>>> [1] https://github.com/apache/beam/pull/9642
>> >>>>>>>>>> >>>>
>> >>>>>>>>>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev <valen...@google.com> wrote:
>> >>>>>>>>>> >>>>>
>> >>>>>>>>>> >>>>> One improvement could be to move the precommit IT tests into a separate suite from the precommit tests, and run it in parallel.
>> >>>>>>>>>> >>>>>
>> >>>>>>>>>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette <bhule...@google.com> wrote:
>> >>>>>>>>>> >>>>>>
>> >>>>>>>>>> >>>>>> Python precommits are taking quite a while now [1]. Just visually, it looks like the average length is 1.5h or so, but it spikes up to 2h. I've had several precommit runs get aborted due to the 2-hour limit.
>> >>>>>>>>>> >>>>>>
>> >>>>>>>>>> >>>>>> It looks like there was a spike up above 1h back on 9/6, and the duration has been steadily rising since then. Is there anything we can do about this?
>> >>>>>>>>>> >>>>>>
>> >>>>>>>>>> >>>>>> Brian
>> >>>>>>>>>> >>>>>>
>> >>>>>>>>>> >>>>>> [1] http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
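Ahmet's idea of a smaller IT set in precommits maps naturally onto test attributes. At the time, Beam's Python suites ran under nose, whose attrib plugin supports this kind of tagging; the attribute names and tests below are illustrative assumptions, not Beam's actual markers.

    import unittest

    from nose.plugins.attrib import attr

    class WordCountIT(unittest.TestCase):

        @attr('IT', 'IT_precommit')  # also tagged for the smaller precommit subset
        def test_wordcount_it_batch(self):
            pass  # hypothetical batch wordcount integration test

        @attr('IT')  # postcommit only
        def test_wordcount_it_streaming(self):
            pass  # hypothetical streaming wordcount integration test

    # Precommit could then select only the smaller subset:
    #   nosetests -a IT_precommit
    # while postcommit runs everything tagged IT:
    #   nosetests -a IT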