Re: Possible Python SDK performance regression

2019-09-25 Thread Lukasz Cwik
My environment has gotten all the dependencies installed/setup/maintained organically over time as the project has evolved. On Wed, Sep 25, 2019 at 9:56 AM Thomas Weise wrote: > The issue was related to how we build our custom packages. > > However, what might help users is documentation about

Re: Possible Python SDK performance regression

2019-09-25 Thread Thomas Weise
The issue was related to how we build our custom packages. However, what might help users is documentation about the Cython setup, which is currently missing from the Python SDK docs. I'm also wondering how folks set up their environment for releases. Is it manual? Or is there a container that has
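
For anyone who wants to check whether a custom build actually picked up the Cython-compiled modules, here is a minimal sketch. It relies only on the standard library and looks at the file a module was loaded from. The helper names `is_compiled_extension` and `report` are made up for illustration, and `apache_beam.coders.coder_impl` is only an assumed example of a module that is normally compiled; substitute whatever modules your build is expected to compile.

```python
import importlib
from importlib.machinery import EXTENSION_SUFFIXES


def is_compiled_extension(module):
    """Return True if the module was loaded from a compiled extension file."""
    path = getattr(module, "__file__", None)
    if path is None:
        # Built-in modules (for example sys) have no file on disk.
        return False
    return any(path.endswith(suffix) for suffix in EXTENSION_SUFFIXES)


def report(module_names):
    """Map each module name to 'compiled', 'pure Python', or 'not importable'."""
    result = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            result[name] = "not importable"
            continue
        result[name] = "compiled" if is_compiled_extension(module) else "pure Python"
    return result


if __name__ == "__main__":
    # The second name is an assumption; use the modules your build should compile.
    print(report(["json", "apache_beam.coders.coder_impl"]))
```

A module reported as "pure Python" where "compiled" was expected would be one hint that the package was built without Cython, although this check alone says nothing about why.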

Re: Possible Python SDK performance regression

2019-09-25 Thread Valentyn Tymofieiev
Thank you. In case there are details that would be relevant for others in the community to avoid similar regressions, feel free to share them. We also have Cython experts here who may be able to advise. On Wed, Sep 25, 2019 at 6:58 AM Thomas Weise wrote: > After running through the entire bise

Re: Possible Python SDK performance regression

2019-09-25 Thread Thomas Weise
After running through the entire bisect based on the 2.16 release branch, I found that the regression was caused by our own Cython setup. So green light for the 2.16.0 release. Thomas On Tue, Sep 17, 2019 at 1:21 PM Thomas Weise wrote: > Hi Valentyn, > > Thanks for the reminder. The bisect is on

Re: Possible Python SDK performance regression

2019-09-17 Thread Thomas Weise
Hi Valentyn, Thanks for the reminder. The bisect is on my TODO list. Hopefully this week. I saw the discussion about declaring 2.16 LTS. We probably need to sort these performance concerns out prior to doing so. Thomas On Tue, Sep 17, 2019 at 12:02 PM Valentyn Tymofieiev wrote: > Hi Thomas,

Re: Possible Python SDK performance regression

2019-09-17 Thread Valentyn Tymofieiev
Hi Thomas, Just a reminder that 2.16.0 was cut and voting may start soon, so to keep the regression you reported from blocking the vote, it would be great to start investigating it, if it is reproducible. Thanks, Valentyn On Tue, Sep 10, 2019 at 1:53 PM Valentyn Tymofieiev wrote: > Thomas, d

Re: Possible Python SDK performance regression

2019-09-10 Thread Valentyn Tymofieiev
Thomas, did you have a chance to open a Jira for the streaming regression you observed? If not, could you please do so and cc +Ankur Goenka ? I talked with Ankur offline and he is also interested in this regression. I opened: - https://issues.apache.org/jira/browse/BEAM-8198 for batch regression.

Re: Possible Python SDK performance regression

2019-09-09 Thread Mark Liu
> +Alan Myrvold +Yifan Zou It would be good to have alerts on benchmarks. Do we have such an ability today?

As for regression detection, we have a Jenkins job beam_PerformanceTests_Analysis which

Re: Possible Python SDK performance regression

2019-09-06 Thread Ahmet Altay
I agree, let's investigate. Thomas, could you file JIRAs once you have additional information? Valentyn, I think the performance regression could be investigated now, by running whatever benchmarks are available against 2.14, 2.15, and head to see whether the same regression can be reproduced. On

Re: Possible Python SDK performance regression

2019-09-06 Thread Valentyn Tymofieiev
Sounds like these regressions need to be investigated ahead of the 2.16.0 release. On Fri, Sep 6, 2019 at 6:44 PM Thomas Weise wrote: > > > On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay wrote: > >> >> >> On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: >> >>> >>> >>> On Fri, Sep 6, 2019 at 2:24 PM

Re: Possible Python SDK performance regression

2019-09-06 Thread Thomas Weise
On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay wrote: > > > On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: > >> >> >> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev >> wrote: >> >>> +Mark Liu has added some benchmarks running across >>> multiple Python versions. Specifically we run 1 GB wo

Re: Possible Python SDK performance regression

2019-09-06 Thread Valentyn Tymofieiev
On Fri, Sep 6, 2019 at 6:23 PM Ahmet Altay wrote: > > > On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: > >> >> >> On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev >> wrote: >> >>> +Mark Liu has added some benchmarks running across >>> multiple Python versions. Specifically we run 1 GB wo

Re: Possible Python SDK performance regression

2019-09-06 Thread Ahmet Altay
On Fri, Sep 6, 2019 at 6:17 PM Thomas Weise wrote: > > > On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev > wrote: > >> +Mark Liu has added some benchmarks running across >> multiple Python versions. Specifically we run 1 GB wordcount job on >> Dataflow runner on Python 2.7, 3.5-3.7. The benc

Re: Possible Python SDK performance regression

2019-09-06 Thread Thomas Weise
On Fri, Sep 6, 2019 at 2:24 PM Valentyn Tymofieiev wrote: > +Mark Liu has added some benchmarks running across > multiple Python versions. Specifically we run 1 GB wordcount job on > Dataflow runner on Python 2.7, 3.5-3.7. The benchmarks do not have > configured alerting and to my knowledge are

Re: Possible Python SDK performance regression

2019-09-06 Thread Valentyn Tymofieiev
+Mark Liu has added some benchmarks running across multiple Python versions. Specifically, we run a 1 GB wordcount job on the Dataflow runner on Python 2.7, 3.5-3.7. The benchmarks do not have configured alerting and to my knowledge are not actively monitored yet. The zoom buttons on the dashboard [1] s

Re: Possible Python SDK performance regression

2019-09-06 Thread Ahmet Altay
+Valentyn Tymofieiev do we have benchmarks in different Python versions? Was there a recent change that is specific to Python 3.x? On Fri, Sep 6, 2019 at 8:36 AM Thomas Weise wrote: > The issue is only visible with Python 3.6, not 2.7. > > If there is a framework in place to add a streaming te

Re: Possible Python SDK performance regression

2019-09-06 Thread Thomas Weise
The issue is only visible with Python 3.6, not 2.7. If there is a framework in place to add a streaming test, that would be great. We would use what we have internally as a starting point. On Thu, Sep 5, 2019 at 5:00 PM Ahmet Altay wrote: > > > On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise wrote:

Re: Possible Python SDK performance regression

2019-09-05 Thread Ahmet Altay
On Thu, Sep 5, 2019 at 4:15 PM Thomas Weise wrote: > The workload is quite different. What I have is streaming with state and > timers. > > > > On Thu, Sep 5, 2019 at 3:47 PM Pablo Estrada wrote: > >> We only recently started running Chicago Taxi Example. +Michał Walenia >> I don't see it in th

Re: Possible Python SDK performance regression

2019-09-05 Thread Thomas Weise
The workload is quite different. What I have is streaming with state and timers. On Thu, Sep 5, 2019 at 3:47 PM Pablo Estrada wrote: > We only recently started running Chicago Taxi Example. +Michał Walenia > I don't see it in the dashboards. Do you > know if it's possible to see any trends in

Re: Possible Python SDK performance regression

2019-09-05 Thread Pablo Estrada
We only recently started running the Chicago Taxi Example. +Michał Walenia I don't see it in the dashboards. Do you know if it's possible to see any trends in the data? We have a few tests running now: - Combine tests: https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792&widget=

Re: Possible Python SDK performance regression

2019-09-05 Thread Thomas Weise
It probably won't be practical to do a bisect due to the high cost of each iteration with our fork/deploy setup. Perhaps it is time to set up something with the synthetic source that works with just Beam as a dependency. On Thu, Sep 5, 2019 at 3:23 PM Ahmet Altay wrote: > There are a few in this d

Re: Possible Python SDK performance regression

2019-09-05 Thread Ahmet Altay
There are a few in this dashboard [1], but they are not very useful in this case because they do not go back more than a month and are not very comprehensive. I do not see a jump there. Thomas, would it be possible to bisect to find what commit caused the regression? +Pablo Estrada do we have any python on fl