Re: Possible Python SDK performance regression

Ahmet Altay Fri, 06 Sep 2019 18:24:02 -0700

On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise <[email protected]> wrote:

>
>
> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev <[email protected]>
> wrote:
>
>> +Mark Liu <[email protected]> has added some benchmarks running across
>> multiple Python versions. Specifically, we run a 1 GB wordcount job on
>> the Dataflow runner on Python 2.7 and 3.5-3.7. The benchmarks do not have
>> alerting configured and, to my knowledge, are not actively monitored yet.
>>
>
> Are there any benchmarks for streaming? Streaming and batch are quite
> different runtime paths. And some of the issues can only be identified
> with longer running processes through metrics. It would be good to verify
> utilization of memory, cpu etc.
>
> I additionally discovered that our 2.16 upgrade exhibits a memory leak in
> the Python worker (Py 2.7).
>


Do you have more details on this one?
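
For reference, one way to collect such details is to sample the worker's peak memory over time. The following is a minimal sketch, not the Beam worker's own instrumentation: `peak_rss_mib` and `looks_like_leak` are hypothetical helper names, the 1.5x growth threshold is an arbitrary assumption, and the standard-library `resource` module it relies on is only available on Unix-like systems. It avoids `tracemalloc` because that module does not exist on Python 2.7.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def looks_like_leak(samples_mib, min_growth_ratio=1.5):
    """Heuristic: True if the last sample is at least min_growth_ratio
    times the first one. The threshold is an assumption, not a Beam value."""
    if len(samples_mib) < 2 or samples_mib[0] <= 0:
        return False
    return float(samples_mib[-1]) / float(samples_mib[0]) >= min_growth_ratio
```

Calling `peak_rss_mib()` at a fixed interval inside a long-running worker and passing the collected samples to `looks_like_leak` gives a rough signal; steady growth under a constant load would be the interesting case to report.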


>
>
>> Thomas, is it possible for you to do the bisection using SDK code from
>> master at various commits to narrow down the regression on your end?
>>
>
> I don't know how soon I will get to it. It's of course possible, but
> expensive due to having to rebase the fork, build and deploy an entire
> stack of stuff for each iteration. The pipeline itself is super simple. We
> need this testbed as part of Beam. It would be nice to be able to pick an
> update and have more confidence that the baseline has not slipped.
>
>
>>
>> [1]
>> https://apache-beam-testing.appspot.com/explore?dashboard=5691127080419328
>> [2]
>> https://drive.google.com/file/d/1ERlnN8bA2fKCUPBHTnid1l__81qpQe2W/view
>> [3]
>> https://github.com/apache/beam/commit/2d5e493abf39ee6fc89831bb0b7ec9fee592b9c5
>>
>>
>>
>> On Fri, Sep 6, 2019 at 8:38 AM Ahmet Altay <[email protected]> wrote:
>>
>>> +Valentyn Tymofieiev <[email protected]> do we have benchmarks in
>>> different Python versions? Was there a recent change that is specific to
>>> Python 3.x?
>>>
>>> On Fri, Sep 6, 2019 at 8:36 AM Thomas Weise <[email protected]> wrote:
>>>
>>>> The issue is only visible with Python 3.6, not 2.7.
>>>>
>>>> If there is a framework in place to add a streaming test, that would be
>>>> great. We would use what we have internally as starting point.
>>>>
>>>> On Thu, Sep 5, 2019 at 5:00 PM Ahmet Altay <[email protected]> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise <[email protected]> wrote:
>>>>>
>>>>>> The workload is quite different. What I have is streaming with state
>>>>>> and timers.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 5, 2019 at 3:47 PM Pablo Estrada <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> We only recently started running Chicago Taxi Example. +Michał
>>>>>>> Walenia <[email protected]> I don't see it in the
>>>>>>> dashboards. Do you know if it's possible to see any trends in the data?
>>>>>>>
>>>>>>> We have a few tests running now:
>>>>>>> - Combine tests:
>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>> - GBK tests:
>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>>
>>>>>>> They don't seem to show a very drastic jump either, but they aren't
>>>>>>> very old.
>>>>>>>
>>>>>>> There is also work ongoing to add alerting for this sort of
>>>>>>> regression by Kasia and Kamil (added). The work is not there yet (it's
>>>>>>> in progress).
>>>>>>> Best
>>>>>>> -P.
>>>>>>>
>>>>>>> On Thu, Sep 5, 2019 at 3:35 PM Thomas Weise <[email protected]> wrote:
>>>>>>>
>>>>>>>> It probably won't be practical to do a bisect due to the high cost
>>>>>>>> of each iteration with our fork/deploy setup.
>>>>>>>>
>>>>>>>> Perhaps it is time to set up something with the synthetic source
>>>>>>>> that works with just Beam as a dependency.
>>>>>>>>
>>>>>>>
>>>>> I agree with this.
>>>>>
>>>>> Pablo, Kasia, Kamil, do the new benchmarks give us an easy-to-use
>>>>> framework for using the synthetic source in benchmarks?
>>>>>
>>>>>
>>>>>>
>>>>>>>> On Thu, Sep 5, 2019 at 3:23 PM Ahmet Altay <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> There are a few in this dashboard [1], but they are not very useful
>>>>>>>>> in this case because they do not go back more than a month and are
>>>>>>>>> not very comprehensive. I do not see a jump there. Thomas, would it
>>>>>>>>> be possible to bisect to find what commit caused the regression?
>>>>>>>>>
>>>>>>>>> +Pablo Estrada <[email protected]> do we have any python on
>>>>>>>>> flink benchmarks for chicago example?
>>>>>>>>> +Alan Myrvold <[email protected]> +Yifan Zou
>>>>>>>>> <[email protected]> It would be good to have alerts on
>>>>>>>>> benchmarks. Do we have such an ability today?
>>>>>>>>>
>>>>>>>>> [1] https://apache-beam-testing.appspot.com/dashboard-admin
>>>>>>>>>
>>>>>>>>> On Thu, Sep 5, 2019 at 3:15 PM Thomas Weise <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Are there any performance tests run for the Python SDK as part of
>>>>>>>>>> release verification (or otherwise as well)?
>>>>>>>>>>
>>>>>>>>>> I see what appears to be a regression in master (compared to
>>>>>>>>>> 2.14) with our in-house application (~25% jump in CPU utilization
>>>>>>>>>> and a corresponding drop in throughput).
>>>>>>>>>>
>>>>>>>>>> I wanted to see if there is anything available to verify that
>>>>>>>>>> within Beam.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Thomas
>>>>>>>>>>
>>>>>>>>>>
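
On the bisection cost raised in this thread: a binary search over an ordered commit range needs roughly log2(n) build-and-deploy iterations rather than n. A minimal sketch follows, assuming the oldest commit in the range is known good and the newest is known bad. `first_bad_commit` and `is_bad` are hypothetical names; `is_bad` stands in for whatever builds, deploys, and measures a given commit (for example, a check for the ~25% CPU jump reported above).

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True.

    commits must be ordered oldest to newest, with commits[0] known good
    and commits[-1] known bad. is_bad is only called on commits in between.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

For a range of 8 commits this calls `is_bad` 3 times, which is the number of full build-and-deploy cycles that would be needed.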
