On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise <t...@apache.org> wrote:

>
> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> +Mark Liu <mark...@google.com> has added some benchmarks running across
>> multiple Python versions. Specifically we run 1 GB wordcount job on
>> Dataflow runner on Python 2.7, 3.5-3.7. The benchmarks do not have
>> configured alerting and to my knowledge are not actively monitored yet.
>>
>
> Are there any benchmarks for streaming? Streaming and batch are quite
> different runtime paths. And some of the issues can only be identified
> with longer running processes through metrics. It would be good to verify
> utilization of memory, cpu etc.
>
> I additionally discovered that our 2.16 upgrade exhibits a memory leak in
> the Python worker (Py 2.7).
>
Do you have more details on this one?

>
>> Thomas, is it possible for you to do the bisection using SDK code from
>> master at various commits to narrow down the regression on your end?
>>
>
> I don't know how soon I will get to it. It's of course possible, but
> expensive due to having to rebase the fork, build and deploy an entire
> stack of stuff for each iteration. The pipeline itself is super simple. We
> need this testbed as part of Beam. It would be nice to be able to pick an
> update and have more confidence that the baseline has not slipped.
>
>>
>> [1] https://apache-beam-testing.appspot.com/explore?dashboard=5691127080419328
>> [2] https://drive.google.com/file/d/1ERlnN8bA2fKCUPBHTnid1l__81qpQe2W/view
>> [3] https://github.com/apache/beam/commit/2d5e493abf39ee6fc89831bb0b7ec9fee592b9c5
>>
>> On Fri, Sep 6, 2019 at 8:38 AM Ahmet Altay <al...@google.com> wrote:
>>
>>> +Valentyn Tymofieiev <valen...@google.com> do we have benchmarks in
>>> different python versions? Was there a recent change that is specific to
>>> python 3.x ?
>>>
>>> On Fri, Sep 6, 2019 at 8:36 AM Thomas Weise <t...@apache.org> wrote:
>>>
>>>> The issue is only visible with Python 3.6, not 2.7.
>>>>
>>>> If there is a framework in place to add a streaming test, that would be
>>>> great. We would use what we have internally as starting point.
>>>>
>>>> On Thu, Sep 5, 2019 at 5:00 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>>
>>>>> On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> The workload is quite different. What I have is streaming with state
>>>>>> and timers.
>>>>>>
>>>>>> On Thu, Sep 5, 2019 at 3:47 PM Pablo Estrada <pabl...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> We only recently started running Chicago Taxi Example. +Michał
>>>>>>> Walenia <michal.wale...@polidea.com> I don't see it in the
>>>>>>> dashboards. Do you know if it's possible to see any trends in the data?
>>>>>>>
>>>>>>> We have a few tests running now:
>>>>>>> - Combine tests:
>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>> - GBK tests:
>>>>>>> https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=201943890&container=1334074373
>>>>>>>
>>>>>>> They don't seem to show a very drastic jump either, but they aren't
>>>>>>> very old.
>>>>>>>
>>>>>>> There is also work ongoing to add alerting for this sort of
>>>>>>> regression by Kasia and Kamil (added). The work is not there yet
>>>>>>> (it's in progress).
>>>>>>> Best
>>>>>>> -P.
>>>>>>>
>>>>>>> On Thu, Sep 5, 2019 at 3:35 PM Thomas Weise <t...@apache.org> wrote:
>>>>>>>
>>>>>>>> It probably won't be practical to do a bisect due to the high cost
>>>>>>>> of each iteration with our fork/deploy setup.
>>>>>>>>
>>>>>>>> Perhaps it is time to set up something with the synthetic source
>>>>>>>> that works just with Beam as dependency.
>>>>>>>>
>>>>>>>
>>>>> I agree with this.
>>>>>
>>>>> Pablo, Kasia, Kamil, do the new benchmarks give us an easy-to-use
>>>>> framework for using synthetic source in benchmarks?
>>>>>
>>>>>>
>>>>>>>> On Thu, Sep 5, 2019 at 3:23 PM Ahmet Altay <al...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> There are a few in this dashboard [1], but they are not very useful
>>>>>>>>> in this case because they do not go back more than a month and are
>>>>>>>>> not very comprehensive. I do not see a jump there. Thomas, would it
>>>>>>>>> be possible to bisect to find what commit caused the regression?
>>>>>>>>>
>>>>>>>>> +Pablo Estrada <pabl...@google.com> do we have any python on
>>>>>>>>> flink benchmarks for chicago example?
>>>>>>>>> +Alan Myrvold <amyrv...@google.com> +Yifan Zou
>>>>>>>>> <yifan...@google.com> It would be good to have alerts on
>>>>>>>>> benchmarks. Do we have such an ability today?
>>>>>>>>>
>>>>>>>>> [1] https://apache-beam-testing.appspot.com/dashboard-admin
>>>>>>>>>
>>>>>>>>> On Thu, Sep 5, 2019 at 3:15 PM Thomas Weise <t...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Are there any performance tests run for the Python SDK as part of
>>>>>>>>>> release verification (or otherwise as well)?
>>>>>>>>>>
>>>>>>>>>> I see what appears to be a regression in master (compared to
>>>>>>>>>> 2.14) with our in-house application (~25% jump in CPU utilization
>>>>>>>>>> and corresponding drop in throughput).
>>>>>>>>>>
>>>>>>>>>> I wanted to see if there is anything available to verify that
>>>>>>>>>> within Beam.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Thomas
>>>>>>>>>>