The Python SDK does as well. These calls are coming from
to_runner_api, is_stateful_dofn, and validate_stateful_dofn, which are
invoked once per pipeline or bundle. They are, however, surprisingly
expensive. Even memoizing across those three calls should save a
significant amount of time. Udi, did you want to tackle this?

Looking at the profile, Pipeline.to_runner_api() is being called 30
times in this test, and [Applied]PTransform.to_fn_api is being called
3111 times, so that in itself might be interesting to investigate.
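
To make that concrete, here is a rough sketch of the kind of memoization
I have in mind -- caching the signature inspection per callable. (The
body below is illustrative only, not the actual core.py implementation;
on py2 the cache would have to be hand-rolled and use funcsigs, since
functools.lru_cache is py3-only.)

import functools
import inspect

@functools.lru_cache(maxsize=None)
def get_function_arguments(fn):
  # Signature inspection is surprisingly expensive, so do it once per
  # callable and reuse the result on every to_runner_api /
  # is_stateful_dofn / validate_stateful_dofn call.
  signature = inspect.signature(fn)
  args = [name for name, param in signature.parameters.items()
          if param.kind in (param.POSITIONAL_ONLY,
                            param.POSITIONAL_OR_KEYWORD)]
  defaults = [param.default for param in signature.parameters.values()
              if param.default is not param.empty]
  return args, defaults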

On Tue, Oct 29, 2019 at 8:26 AM Robert Burke <rob...@frantil.com> wrote:
>
> As does the Go SDK. Invokers are memoized and when possible code is generated 
> to avoid reflection.
>
> On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles <k...@google.com> wrote:
>>
>> Noting for the benefit of the thread archive in case someone goes digging 
>> and wonders if this affects other SDKs: the Java SDK memoizes DoFnSignatures 
>> and generated DoFnInvoker classes.
>>
>> Kenn
>>
>> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri <eh...@google.com> wrote:
>>>
>>> Re: #9283 slowing down tests, ideas for slowness:
>>> 1. I added a lot of test cases, some with locally run pipelines.
>>> 2. The PR somehow changed how coders are selected, and now we're using less 
>>> efficient ones.
>>> 3. New dependency funcsigs is slowing things down? (py2 only)
>>>
>>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 and got 
>>> these cool graphs (attached).
>>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), gets 
>>> called ~230k times
>>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times
>>>
>>> After memoization of get_function_args_defaults:
>>> 2.7: core:294:get_function_arguments 20.02%
>>> 3.7: core:294:get_function_arguments 8.11%
>>>
>>>
>>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada <pabl...@google.com> wrote:
>>>>
>>>> *not deciles, but 9-percentiles : )
>>>>
>>>> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada <pabl...@google.com> wrote:
>>>>>
>>>>> I've run the tests in Python 2 (without cython), and used a utility to
>>>>> track runtime for each test method. I found the following:
>>>>> - Total test methods run: 2665
>>>>> - Total test runtime: 990 seconds
>>>>> - Deciles of time spent:
>>>>>   - 1949 tests run in the first 9% of time
>>>>>   - 173 in the 9-18% range
>>>>>   - 130 in the 18-27% range
>>>>>   - 95 in the 27-36% range
>>>>>   - 77 in the 36-45% range
>>>>>   - 66 in the 45-54% range
>>>>>   - 55 in the 54-63% range
>>>>>   - 46 in the 63-72% range
>>>>>   - 37 in the 72-81% range
>>>>>   - 24 in the 81-90% range
>>>>>   - 13 tests run in the last 9% of time. This represents about a minute
>>>>> and a half.
>>>>>
>>>>> We may be able to look at the slowest X tests, and get gradual 
>>>>> improvements from there. Although it seems .. not dramatic ones : )
>>>>>
>>>>> FWIW I uploaded the results here: 
>>>>> https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json
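>>>>>
>>>>> (If anyone wants to reproduce these numbers without my utility, something
>>>>> like pytest's --durations=0 flag or the nose-timer plugin should give a
>>>>> similar per-test breakdown, though the output format will differ.)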
>>>>>
>>>>> The slowest 13 tests were:
>>>>>
>>>>> [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', 5.253582000732422),
>>>>>  ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', 7.907713890075684),
>>>>>  ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', 5.237942934036255),
>>>>>  ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', 5.563946008682251),
>>>>>  ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', 5.680700063705444),
>>>>>  ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', 6.111238956451416),
>>>>>  ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', 6.007534980773926),
>>>>>  ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', 13.993916988372803),
>>>>>  ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', 6.3383049964904785),
>>>>>  ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', 9.157485008239746),
>>>>>  ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', 5.191173076629639),
>>>>>  ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', 6.2221620082855225),
>>>>>  ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', 7.7187910079956055)]
>>>>>
>>>>> On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada <pabl...@google.com> wrote:
>>>>>>
>>>>>> I have written https://github.com/apache/beam/pull/9910 to reduce 
>>>>>> FnApiRunnerTest variations.
>>>>>> I'm not in a rush to merge, but rather happy to start a discussion.
>>>>>> I'll also try to figure out if there are other tests slowing down the 
>>>>>> suite significantly.
>>>>>> Best
>>>>>> -P.
>>>>>>
>>>>>> On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev 
>>>>>> <valen...@google.com> wrote:
>>>>>>>
>>>>>>> Thanks, Brian.
>>>>>>> +Udi Meiri
>>>>>>> As a next step, it would be good to know whether the slowdown is caused
>>>>>>> by the tests in this PR or by its effect on other tests, and to confirm
>>>>>>> that only Python 2 codepaths were affected.
>>>>>>>
>>>>>>> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette <bhule...@google.com> 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I did a bisect based on the runtime of `./gradlew 
>>>>>>>> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits 
>>>>>>>> between 9/1 and 9/15 to see if I could find the source of the spike 
>>>>>>>> that happened around 9/6. It looks like it was due to PR#9283 [1]. I 
>>>>>>>> thought maybe this search would reveal some misguided configuration
>>>>>>>> change, but as far as I can tell 9283 just added a well-tested 
>>>>>>>> feature. I don't think there's anything to learn from that... I just 
>>>>>>>> wanted to circle back about it in case others are curious about that 
>>>>>>>> spike.
>>>>>>>>
>>>>>>>> I'm +1 on bumping some FnApiRunner configurations.
>>>>>>>>
>>>>>>>> Brian
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/beam/pull/9283
>>>>>>>>
>>>>>>>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada <pabl...@google.com> 
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> I think it makes sense to remove some of the extra FnApiRunner
>>>>>>>>> configurations. Perhaps some of the multi-worker and some of the gRPC
>>>>>>>>> versions?
>>>>>>>>> Best
>>>>>>>>> -P.
>>>>>>>>>
>>>>>>>>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw 
>>>>>>>>> <rober...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+
>>>>>>>>>> minutes on each version of Python. This test file consists of a base
>>>>>>>>>> class that is basically a ValidatesRunner suite, and it is then run in
>>>>>>>>>> several configurations, many more of which (including some expensive
>>>>>>>>>> ones) have been added lately.
>>>>>>>>>>
>>>>>>>>>> class FnApiRunnerTest(unittest.TestCase):
>>>>>>>>>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
>>>>>>>>>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
>>>>>>>>>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
>>>>>>>>>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
>>>>>>>>>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>>>>>>>>>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
>>>>>>>>>> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):
>>>>>>>>>>
>>>>>>>>>> I'm not convinced we need to run all of these permutations, or at
>>>>>>>>>> least not all tests in all permutations.
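>>>>>>>>>>
>>>>>>>>>> One way to cut this down (just a sketch of the idea, with made-up test
>>>>>>>>>> names -- not necessarily what a PR should do) would be to keep the full
>>>>>>>>>> set only on the base class and restrict the expensive variants to a
>>>>>>>>>> small smoke set, e.g.:
>>>>>>>>>>
>>>>>>>>>> # Hypothetical allow-list of representative tests.
>>>>>>>>>> SMOKE_TESTS = {'test_pardo', 'test_group_by_key', 'test_windowing'}
>>>>>>>>>>
>>>>>>>>>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
>>>>>>>>>>   def setUp(self):
>>>>>>>>>>     super(FnApiRunnerTestWithGrpcAndMultiWorkers, self).setUp()
>>>>>>>>>>     if self._testMethodName not in SMOKE_TESTS:
>>>>>>>>>>       self.skipTest('Only smoke tests run in this expensive variant.')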
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev
>>>>>>>>>> <valen...@google.com> wrote:
>>>>>>>>>> >
>>>>>>>>>> > I took another look at this, and precommit ITs are already running
>>>>>>>>>> > in parallel, albeit in the same suite. However, it appears Python
>>>>>>>>>> > precommits have become slower, especially Python 2 precommits [35 min
>>>>>>>>>> > per suite x 3 suites], see [1]. Not sure yet what caused the
>>>>>>>>>> > increase, but precommits used to be faster. Perhaps we have added
>>>>>>>>>> > a slow test or a lot of new tests.
>>>>>>>>>> >
>>>>>>>>>> > [1] 
>>>>>>>>>> > https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524
>>>>>>>>>> >
>>>>>>>>>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay <al...@google.com> 
>>>>>>>>>> > wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> Ack. Separating precommit ITs into a different suite sounds good.
>>>>>>>>>> >> Is anyone interested in doing that?
>>>>>>>>>> >>
>>>>>>>>>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev 
>>>>>>>>>> >> <valen...@google.com> wrote:
>>>>>>>>>> >>>
>>>>>>>>>> >>> This should not increase the queue time substantially, since
>>>>>>>>>> >>> precommit ITs run sequentially with the precommit tests, unlike the
>>>>>>>>>> >>> multiple precommit tests, which run in parallel with each other.
>>>>>>>>>> >>>
>>>>>>>>>> >>> The precommit ITs we run are batch and streaming wordcount tests 
>>>>>>>>>> >>> on Py2 and one Py3 version, so it's not a lot of tests.
>>>>>>>>>> >>>
>>>>>>>>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay <al...@google.com> 
>>>>>>>>>> >>> wrote:
>>>>>>>>>> >>>>
>>>>>>>>>> >>>> +1 to separating ITs from precommit. The downside: when Chad tried
>>>>>>>>>> >>>> to do something similar [1], it was noted that the total time to run
>>>>>>>>>> >>>> all precommit tests would increase, and the queue time could also
>>>>>>>>>> >>>> potentially increase.
>>>>>>>>>> >>>>
>>>>>>>>>> >>>> Another alternative: we could run a smaller set of IT tests in
>>>>>>>>>> >>>> precommits and run the whole suite as part of postcommit tests.
>>>>>>>>>> >>>>
>>>>>>>>>> >>>> [1] https://github.com/apache/beam/pull/9642
>>>>>>>>>> >>>>
>>>>>>>>>> >>>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev 
>>>>>>>>>> >>>> <valen...@google.com> wrote:
>>>>>>>>>> >>>>>
>>>>>>>>>> >>>>> One improvement could be to move the precommit IT tests into a
>>>>>>>>>> >>>>> separate suite from the precommit tests, and run it in parallel.
>>>>>>>>>> >>>>>
>>>>>>>>>> >>>>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette 
>>>>>>>>>> >>>>> <bhule...@google.com> wrote:
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>> Python precommits are taking quite a while now [1]. Just visually,
>>>>>>>>>> >>>>>> it looks like the average length is 1.5h or so, but it spikes up to
>>>>>>>>>> >>>>>> 2h. I've had several precommit runs get aborted due to the 2-hour
>>>>>>>>>> >>>>>> limit.
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>> It looks like there was a spike up above 1h back on 9/6 and 
>>>>>>>>>> >>>>>> the duration has been steadily rising since then. Is there 
>>>>>>>>>> >>>>>> anything we can do about this?
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>> Brian
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>> [1] 
>>>>>>>>>> >>>>>> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
