Sounds like these regressions need to be investigated ahead of the 2.16.0 release.
On Fri, Sep 6, 2019 at 6:44 PM Thomas Weise <t...@apache.org> wrote:
>
> On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay <al...@google.com> wrote:
>
>>
>> On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise <t...@apache.org> wrote:
>>
>>>
>>> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev <valen...@google.com>
>>> wrote:
>>>
>>>> +Mark Liu <mark...@google.com> has added some benchmarks running
>>>> across multiple Python versions. Specifically we run 1 GB wordcount job on
>>>> Dataflow runner on Python 2.7, 3.5-3.7. The benchmarks do not have
>>>> configured alerting and to my knowledge are not actively monitored yet.
>>>>
>>>
>>> Are there any benchmarks for streaming? Streaming and batch are quite
>>> different runtime paths. And some of the issues can only be identified
>>> with longer running processes through metrics. It would be good to verify
>>> utilization of memory, cpu etc.
>>>
>>> I additionally discovered that our 2.16 upgrade exhibits a memory leak
>>> in the Python worker (Py 2.7).
>>>
>>
>> Do you have more details on this one?
>>
>
> Unfortunately only that at the moment. The workers eat up all memory and
> eventually crash. Reverted back to 2.14 / Py 3.6 and the issue is gone.
>
>>
>>>
>>>> Thomas, is it possible for you to do the bisection using SDK code from
>>>> master at various commits to narrow down the regression on your end?
>>>>
>>>
>>> I don't know how soon I will get to it. It's of course possible, but
>>> expensive due to having to rebase the fork, build and deploy an entire
>>> stack of stuff for each iteration. The pipeline itself is super simple. We
>>> need this testbed as part of Beam. It would be nice to be able to pick an
>>> update and have more confidence that the baseline has not slipped.
>>>
>>>>
>>>> [1]
>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5691127080419328
>>>> [2]
>>>> https://drive.google.com/file/d/1ERlnN8bA2fKCUPBHTnid1l__81qpQe2W/view
>>>> [3]
>>>> https://github.com/apache/beam/commit/2d5e493abf39ee6fc89831bb0b7ec9fee592b9c5
>>>>
>>>> On Fri, Sep 6, 2019 at 8:38 AM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>> +Valentyn Tymofieiev <valen...@google.com> do we have benchmarks in
>>>>> different python versions? Was there a recent change that is specific to
>>>>> python 3.x ?
>>>>>
>>>>> On Fri, Sep 6, 2019 at 8:36 AM Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> The issue is only visible with Python 3.6, not 2.7.
>>>>>>
>>>>>> If there is a framework in place to add a streaming test, that would
>>>>>> be great. We would use what we have internally as starting point.
>>>>>>
>>>>>> On Thu, Sep 5, 2019 at 5:00 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise <t...@apache.org> wrote:
>>>>>>>
>>>>>>>> The workload is quite different. What I have is streaming with
>>>>>>>> state and timers.
>>>>>>>>
>>>>>>>> On Thu, Sep 5, 2019 at 3:47 PM Pablo Estrada <pabl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> We only recently started running Chicago Taxi Example. +Michał
>>>>>>>>> Walenia <michal.wale...@polidea.com> I don't see it in the
>>>>>>>>> dashboards. Do you know if it's possible to see any trends in the
>>>>>>>>> data?
>>>>>>>>>
>>>>>>>>> We have a few tests running now:
>>>>>>>>> - Combine tests:
>>>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>>>> - GBK tests:
>>>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>>>>
>>>>>>>>> They don't seem to show a very drastic jump either, but they
>>>>>>>>> aren't very old.
>>>>>>>>>
>>>>>>>>> There is also work ongoing to add alerting for this sort of
>>>>>>>>> regression by Kasia and Kamil (added). The work is not there yet
>>>>>>>>> (it's in progress).
>>>>>>>>> Best
>>>>>>>>> -P.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 5, 2019 at 3:35 PM Thomas Weise <t...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> It probably won't be practical to do a bisect due to the high
>>>>>>>>>> cost of each iteration with our fork/deploy setup.
>>>>>>>>>>
>>>>>>>>>> Perhaps it is time to set up something with the synthetic source
>>>>>>>>>> that works just with Beam as a dependency.
>>>>>>>>>>
>>>>>>>>>
>>>>>>> I agree with this.
>>>>>>>
>>>>>>> Pablo, Kasia, Kamil, do the new benchmarks give us an easy-to-use
>>>>>>> framework for using synthetic source in benchmarks?
>>>>>>>
>>>>>>>>
>>>>>>>>>> On Thu, Sep 5, 2019 at 3:23 PM Ahmet Altay <al...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> There are a few in this dashboard [1], but they are not very useful
>>>>>>>>>>> in this case because they do not go back more than a month and are
>>>>>>>>>>> not very comprehensive. I do not see a jump there. Thomas, would it
>>>>>>>>>>> be possible to bisect to find what commit caused the regression?
>>>>>>>>>>>
>>>>>>>>>>> +Pablo Estrada <pabl...@google.com> do we have any python on
>>>>>>>>>>> flink benchmarks for chicago example?
>>>>>>>>>>> +Alan Myrvold <amyrv...@google.com> +Yifan Zou
>>>>>>>>>>> <yifan...@google.com> It would be good to have alerts on
>>>>>>>>>>> benchmarks. Do we have such an ability today?
>>>>>>>>>>>
>>>>>>>>>>> [1] https://apache-beam-testing.appspot.com/dashboard-admin
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 5, 2019 at 3:15 PM Thomas Weise <t...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Are there any performance tests run for the Python SDK as part
>>>>>>>>>>>> of release verification (or otherwise as well)?
>>>>>>>>>>>>
>>>>>>>>>>>> I see what appears to be a regression in master (compared to
>>>>>>>>>>>> 2.14) with our in-house application (~ 25% jump in cpu utilization
>>>>>>>>>>>> and corresponding drop in throughput).
>>>>>>>>>>>>
>>>>>>>>>>>> I wanted to see if there is anything available to verify that
>>>>>>>>>>>> within Beam.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Thomas
>>>>>>>>>>>>