The Jenkins jobs for the Flink load tests:
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy
The documentation for the test describes how to run it on each runner:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/load_tests/pardo_test.py#L17

I assume that standing up the Flink cluster should be done separately
(a rough example invocation is sketched below, after the quoted messages).
LMK if that helps, Robert.

-P.

On Fri, Dec 20, 2019 at 9:59 AM Robert Bradshaw <[email protected]> wrote:
> Yes, it is possible that this had an influence--Reads are now all
> implemented as SDFs and Creates involve a reshuffle to better
> redistribute data. This much of a change is quite surprising. Where is
> the pipeline for, say, "Python | ParDo | 2GB, 100 byte records, 10
> iterations | Batch" and how does one run it?
>
> On Fri, Dec 20, 2019 at 6:50 AM Kamil Wasilewski
> <[email protected]> wrote:
> >
> > Hi all,
> >
> > We have a couple of Python load tests running on Flink in which we are
> > testing the performance of ParDo, GroupByKey, CoGroupByKey and Combine
> > operations.
> >
> > Recently, I've discovered that the runtime of all those tests rose
> > significantly. It happened between the 6th and 7th of December (the
> > tests are running daily). Here are the dashboards where you can see
> > the results:
> >
> > https://apache-beam-testing.appspot.com/explore?dashboard=5649695233802240
> > https://apache-beam-testing.appspot.com/explore?dashboard=5763764733345792
> > https://apache-beam-testing.appspot.com/explore?dashboard=5698549949923328
> > https://apache-beam-testing.appspot.com/explore?dashboard=5678187241537536
> >
> > I've seen that in that period we submitted some changes to the core,
> > including the Read transform. Do you think this might have influenced
> > the results?
> >
> > Thanks,
> > Kamil
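For reference, an invocation along these lines should exercise the
"2GB, 100 byte records, 10 iterations" case against a portable Flink job
server. This is only a sketch: the job endpoint, environment type, and
input_options values are placeholders, and the docstring linked above is the
authoritative source for the exact flags and how the Jenkins jobs invoke them.

    # Sketch only: assumes a Flink cluster and a Beam job server are already
    # running (here, the job server is assumed to listen on localhost:8099).
    # 20,000,000 records of 10-byte keys + 90-byte values ~= 2GB of 100-byte records.
    python -m apache_beam.testing.load_tests.pardo_test \
        --test-pipeline-options="
            --runner=PortableRunner
            --job_endpoint=localhost:8099
            --environment_type=DOCKER
            --publish_to_big_query=false
            --input_options='{\"num_records\": 20000000, \"key_size\": 10, \"value_size\": 90}'
            --iterations=10
        "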
