Re: Possible Python SDK performance regression

Valentyn Tymofieiev Fri, 06 Sep 2019 19:11:56 -0700

Sounds like these regressions need to be investigated ahead of the 2.16.0
release.


On Fri, Sep 6, 2019 at 6:44 PM Thomas Weise <t...@apache.org> wrote:

>
>
> On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise <t...@apache.org> wrote:
>>
>>>
>>>
>>> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev <valen...@google.com>
>>> wrote:
>>>
>>>> +Mark Liu <mark...@google.com> has added some benchmarks running
>>>> across multiple Python versions. Specifically, we run a 1 GB wordcount job
>>>> on the Dataflow runner on Python 2.7 and 3.5-3.7. The benchmarks do not have
>>>> alerting configured and, to my knowledge, are not actively monitored yet.
>>>>
>>>
>>> Are there any benchmarks for streaming? Streaming and batch are quite
>>> different runtime paths. And some of the issues can only be identified
>>> with longer-running processes through metrics. It would be good to verify
>>> utilization of memory, CPU, etc.
>>>
>>> I additionally discovered that our 2.16 upgrade exhibits a memory leak
>>> in the Python worker (Py 2.7).
>>>
>>
>> Do you have more details on this one?
>>
>
> Unfortunately only that at the moment. The workers eat up all memory and
> eventually crash. Reverted back to 2.14 / Py 3.6 and the issue is gone.
>
>
>>
>>
>>>
>>>
>>>> Thomas, is it possible for you to do the bisection using SDK code from
>>>> master at various commits to narrow down the regression on your end?
>>>>
>>>
>>> I don't know how soon I will get to it. It's of course possible, but
>>> expensive due to having to rebase the fork, build and deploy an entire
>>> stack of stuff for each iteration. The pipeline itself is super simple. We
>>> need this testbed as part of Beam. It would be nice to be able to pick an
>>> update and have more confidence that the baseline has not slipped.
>>>
>>>
>>>>
>>>> [1]
>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5691127080419328
>>>> [2]
>>>> https://drive.google.com/file/d/1ERlnN8bA2fKCUPBHTnid1l__81qpQe2W/view
>>>> [3]
>>>> https://github.com/apache/beam/commit/2d5e493abf39ee6fc89831bb0b7ec9fee592b9c5
>>>>
>>>>
>>>>
>>>> On Fri, Sep 6, 2019 at 8:38 AM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> +Valentyn Tymofieiev <valen...@google.com> do we have benchmarks in
>>>>> different Python versions? Was there a recent change that is specific to
>>>>> Python 3.x?
>>>>>
>>>>> On Fri, Sep 6, 2019 at 8:36 AM Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> The issue is only visible with Python 3.6, not 2.7.
>>>>>>
>>>>>> If there is a framework in place to add a streaming test, that would
>>>>>> be great. We would use what we have internally as a starting point.
>>>>>>
>>>>>> On Thu, Sep 5, 2019 at 5:00 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise <t...@apache.org> wrote:
>>>>>>>
>>>>>>>> The workload is quite different. What I have is streaming with
>>>>>>>> state and timers.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Sep 5, 2019 at 3:47 PM Pablo Estrada <pabl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> We only recently started running the Chicago Taxi Example. +Michał
>>>>>>>>> Walenia <michal.wale...@polidea.com> I don't see it in the
>>>>>>>>> dashboards. Do you know if it's possible to see any trends in the
>>>>>>>>> data?
>>>>>>>>>
>>>>>>>>> We have a few tests running now:
>>>>>>>>> - Combine tests:
>>>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>>>> - GBK tests:
>>>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>>>>
>>>>>>>>> They don't seem to show a very drastic jump either, but they
>>>>>>>>> aren't very old.
>>>>>>>>>
>>>>>>>>> There is also work ongoing to add alerting for this sort of
>>>>>>>>> regression by Kasia and Kamil (added). The work is not there yet
>>>>>>>>> (it's in progress).
>>>>>>>>> Best
>>>>>>>>> -P.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 5, 2019 at 3:35 PM Thomas Weise <t...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> It probably won't be practical to do a bisect due to the high
>>>>>>>>>> cost of each iteration with our fork/deploy setup.
>>>>>>>>>>
>>>>>>>>>> Perhaps it is time to set up something with the synthetic source
>>>>>>>>>> that works with just Beam as a dependency.
>>>>>>>>>>
>>>>>>>>>
>>>>>>> I agree with this.
>>>>>>>
>>>>>>> Pablo, Kasia, Kamil, do the new benchmarks give us an easy-to-use
>>>>>>> framework for using the synthetic source in benchmarks?
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>> On Thu, Sep 5, 2019 at 3:23 PM Ahmet Altay <al...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> There are a few in this dashboard [1], but they are not very useful
>>>>>>>>>>> in this case because they do not go back more than a month and are
>>>>>>>>>>> not very comprehensive. I do not see a jump there. Thomas, would it
>>>>>>>>>>> be possible to bisect to find which commit caused the regression?
>>>>>>>>>>>
>>>>>>>>>>> +Pablo Estrada <pabl...@google.com> do we have any Python on
>>>>>>>>>>> Flink benchmarks for the Chicago example?
>>>>>>>>>>> +Alan Myrvold <amyrv...@google.com> +Yifan Zou
>>>>>>>>>>> <yifan...@google.com> It would be good to have alerts on
>>>>>>>>>>> benchmarks. Do we have such an ability today?
>>>>>>>>>>>
>>>>>>>>>>> [1] https://apache-beam-testing.appspot.com/dashboard-admin
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 5, 2019 at 3:15 PM Thomas Weise <t...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Are there any performance tests run for the Python SDK as part
>>>>>>>>>>>> of release verification (or otherwise as well)?
>>>>>>>>>>>>
>>>>>>>>>>>> I see what appears to be a regression in master (compared to
>>>>>>>>>>>> 2.14) with our in-house application (~25% jump in CPU utilization
>>>>>>>>>>>> and a corresponding drop in throughput).
>>>>>>>>>>>>
>>>>>>>>>>>> I wanted to see if there is anything available to verify that
>>>>>>>>>>>> within Beam.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Thomas
>>>>>>>>>>>>
>>>>>>>>>>>>
