Re: Python Precommit duration pushing 2 hours
On Thu, Nov 14, 2019 at 2:58 PM Ahmet Altay wrote: > > On Thu, Nov 14, 2019 at 2:55 PM Mikhail Gryzykhin wrote: >> >> Hi Everyone, >> >> The Python precommit phase times out at 2 hours for (roughly) 80% of the jobs. >> This also blocks release branch validation. I suggest bumping the timeout to >> 3 hours while we work on a proper solution. This way many people can >> get unblocked. >> >> I believe the change can be rather small: >> https://github.com/apache/beam/pull/10121 > > +1 to unblock the current state. +1 from me as well. Who (if anyone) is looking into the root cause? I don't want to bump this up and forget it--2 hours is painful enough as is.
Re: Python Precommit duration pushing 2 hours
On Thu, Nov 14, 2019 at 2:55 PM Mikhail Gryzykhin wrote: > Hi Everyone, > > The Python precommit phase times out at 2 hours for (roughly) 80% of the jobs. > This also blocks release branch validation. I suggest bumping the timeout > to 3 hours while we work on a proper solution. This way many people > can get unblocked. > > I believe the change can be rather small: > https://github.com/apache/beam/pull/10121 +1 to unblock the current state. > > --Mikhail
Re: Python Precommit duration pushing 2 hours
Hi Everyone, The Python precommit phase times out at 2 hours for (roughly) 80% of the jobs. This also blocks release branch validation. I suggest bumping the timeout to 3 hours while we work on a proper solution. This way many people can get unblocked. I believe the change can be rather small: https://github.com/apache/beam/pull/10121 --Mikhail
Re: Python Precommit duration pushing 2 hours
I'm removing the additional interactive test env + suite and adding the [interactive] dependencies as extra dependencies in tests_require: https://github.com/apache/beam/pull/10068
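For context on what that change amounts to mechanically: folding an extras group into the test requirements is a small setup.py tweak. The sketch below only illustrates the general pattern; the package names, version pins, and the exact shape of the [interactive] dependency list are placeholders, not the contents of the actual PR.

    # Illustrative setup.py fragment (hypothetical dependency lists, not Beam's real ones).
    # The 'interactive' extra stays available to users via `pip install pkg[interactive]`,
    # while tests_require pulls the same packages in, so the base test suite also
    # exercises the interactive code paths without a dedicated tox environment.
    from setuptools import setup, find_packages

    INTERACTIVE_REQUIREMENTS = [
        'ipython>=5.8.0,<6',          # placeholder pin
        'facets-overview>=1.0.0,<2',  # placeholder pin
    ]

    setup(
        name='example-sdk',
        version='0.1.0',
        packages=find_packages(),
        extras_require={
            'interactive': INTERACTIVE_REQUIREMENTS,
        },
        tests_require=['pytest>=4.4.0'] + INTERACTIVE_REQUIREMENTS,
    )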
Re: Python Precommit duration pushing 2 hours
On Fri, Nov 8, 2019 at 5:45 PM Ahmet Altay wrote: > > I suggest breaking up docs and interactive into 2 separate suites of their > own. docs is actually faster than interactive; just separating that out to a > new suite might help. > > Interactive was recently added (https://github.com/apache/beam/pull/9741). > +Ning Kang could you separate interactive into a new suite? I would ask why interactive is a separate tox configuration at all; I don't think there's a need to run every test again with a couple of extra dependencies (adding ~30 minutes to every presubmit). I think it would be much more valuable to run the (presumably relatively small) set of interactive tests in all modes. (The other suites are there to guarantee the tests specifically run *without* installing gcp and *without* compiling with Cython.)
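To make that parenthetical concrete: the usual way an "installed without gcp" suite stays meaningful is an import guard plus a skip decorator, so the base environment exercises the ImportError path while the gcp environment runs the real test. A generic sketch of the pattern (the test class and assertions are invented for illustration, not taken from the Beam codebase):

    # Generic optional-dependency test pattern (illustrative, not a real Beam test).
    import unittest

    try:
        from google.cloud import bigquery  # present only when the gcp extra is installed
    except ImportError:
        bigquery = None

    class ExampleBigQueryIOTest(unittest.TestCase):

        @unittest.skipIf(bigquery is None, 'GCP dependencies are not installed')
        def test_table_reference_parsing(self):
            # Runs only in the gcp tox environment; the plain environment verifies
            # that everything else still imports and passes without these packages.
            ref = bigquery.TableReference.from_string('my-project.my_dataset.my_table')
            self.assertEqual(ref.table_id, 'my_table')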
Re: Python Precommit duration pushing 2 hours
I looked at the log but I could not figure out what is causing the timeout because the gradle scan links are missing. I sampled a few of the successful jobs; it seems like Python 3.7 and Python 2 are running 3 test suites in serial, {interactive, py37cython, py37gcp} and {docs, py27cython, py27gcp} respectively. These two versions are pushing up the total time because the other Python versions now only run the {cython, gcp} suites. I suggest breaking up docs and interactive into 2 separate suites of their own. docs is actually faster than interactive; just separating that out to a new suite might help. Interactive was recently added (https://github.com/apache/beam/pull/9741). +Ning Kang could you separate interactive into a new suite? Ahmet
Re: Python Precommit duration pushing 2 hours
Just saw another 2-hour timeout: https://builds.apache.org/job/beam_PreCommit_Python_Commit/9440/ , so perhaps we're not out of the woods yet (though in general things have been a lot better).
Re: Python Precommit duration pushing 2 hours
GCP tests are already in separate locations. IO-related tests are under /sdks/python/apache_beam/io/gcp and Dataflow-related tests are under sdks/python/apache_beam/runners/dataflow. It should be a matter of changing the gradle files to run either the base tests or the GCP tests depending on the types of changes. I do not expect this to have any material impact on the precommit times, because these two test suites take approximately the same time to complete. #9985 is merged now. Precommit times on the master branch dropped to ~1h 20m for the last 5 runs.
Re: Python Precommit duration pushing 2 hours
+1 to moving the GCP tests outside of core. If there are issues that only show up on GCP tests but not in core, it might be an indication that there needs to be another test in core covering that, but I think that should be pretty rare.
Re: Python Precommit duration pushing 2 hours
+1 to moving forward with this. Could we move the GCP tests outside the core? Then only code changes touching/affecting GCP would cause them to run in precommit. We could still run them in postcommit in their own suite. If the core has reasonably stable abstractions that the connectors are built on, this should not change coverage much. Kenn
Re: Python Precommit duration pushing 2 hours
PR for the proposed change: https://github.com/apache/beam/pull/9985
Re: Python Precommit duration pushing 2 hours
+1 On Mon, Nov 4, 2019 at 12:09 PM Robert Bradshaw wrote: > +1, this seems like a good step with a clear win.
Re: Python Precommit duration pushing 2 hours
+1, this seems like a good step with a clear win. On Mon, Nov 4, 2019 at 12:06 PM Ahmet Altay wrote: > > Python precommits are still timing out on #9925. I am guessing that means > this change would not be enough. > > I am proposing cutting down the number of test variants we run in precommits. > Currently for each version we ran the following variants serially: > - base: Runs all unit tests with tox > - Cython: Installs cython and runs all unit tests as base version. The > original purpose was to ensure that tests pass with or without cython. There > is probably a huge overlap with base. (IIRC only a few coders have different > slow vs fast tests.) > - GCP: Installs GCP dependencies and tests all base + additional gcp specific > tests. The original purpose was to ensure that GCP is an optional component > and all non-GCP tests still works without GCP components. > > We can reduce the list to cython + GCP tests only. This will cover the same > group of tests and will check that tests pass with or without cython or GCP > dependencies. This could reduce the precommit time by ~30 minutes. > > What do you think? > > Ahmet > > > On Tue, Oct 29, 2019 at 11:15 AM Robert Bradshaw wrote: >> >> https://github.com/apache/beam/pull/9925 >> >> On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri wrote: >> > >> > I don't have the bandwidth right now to tackle this. Feel free to take it. >> > >> > On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw >> > wrote: >> >> >> >> The Python SDK does as well. These calls are coming from >> >> to_runner_api, is_stateful_dofn, and validate_stateful_dofn which are >> >> invoked once per pipene or bundle. They are, however, surprisingly >> >> expensive. Even memoizing across those three calls should save a >> >> significant amount of time. Udi, did you want to tackle this? >> >> >> >> Looking at the profile, Pipeline.to_runner_api() is being called 30 >> >> times in this test, and [Applied]PTransform.to_fn_api being called >> >> 3111 times, so that in itself might be interesting to investigate. >> >> >> >> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke wrote: >> >> > >> >> > As does the Go SDK. Invokers are memoized and when possible code is >> >> > generated to avoid reflection. >> >> > >> >> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles wrote: >> >> >> >> >> >> Noting for the benefit of the thread archive in case someone goes >> >> >> digging and wonders if this affects other SDKs: the Java SDK memoizes >> >> >> DoFnSignatures and generated DoFnInvoker classes. >> >> >> >> >> >> Kenn >> >> >> >> >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: >> >> >>> >> >> >>> Re: #9283 slowing down tests, ideas for slowness: >> >> >>> 1. I added a lot of test cases, some with locally run pipelines. >> >> >>> 2. The PR somehow changed how coders are selected, and now we're >> >> >>> using less efficient ones. >> >> >>> 3. New dependency funcsigs is slowing things down? (py2 only) >> >> >>> >> >> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 >> >> >>> and got these cool graphs (attached). 
>> >> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), >> >> >>> gets called ~230k times >> >> >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times >> >> >>> >> >> >>> After memoization of get_function_args_defaults: >> >> >>> 2.7: core:294:get_function_arguments 20.02% >> >> >>> 3.7: core:294:get_function_arguments 8.11% >> >> >>> >> >> >>> >> >> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada >> >> >>> wrote: >> >> >> >> *not deciles, but 9-percentiles : ) >> >> >> >> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada >> >> wrote: >> >> > >> >> > I've ran the tests in Python 2 (without cython), and used a utility >> >> > to track runtime for each test method. I found some of the >> >> > following things: >> >> > - Total test methods run: 2665 >> >> > - Total test runtime: 990 seconds >> >> > - Deciles of time spent: >> >> > - 1949 tests run in the first 9% of time >> >> > - 173 in the 9-18% rang3e >> >> > - 130 in the 18-27% range >> >> > - 95 in the 27-36% range >> >> > - 77 >> >> > - 66 >> >> > - 55 >> >> > - 46 >> >> > - 37 >> >> > - 24 >> >> > - 13 tests run in the last 9% of time. This represents about 1 >> >> > minute and a half. >> >> > >> >> > We may be able to look at the slowest X tests, and get gradual >> >> > improvements from there. Although it seems .. not dramatic ones : ) >> >> > >> >> > FWIW I uploaded the results here: >> >> > https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json >> >> > >> >> > The slowest 13 tests were: >> >> > >> >> > [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', >> >> > 5.253582000732422), >> >> > >> >> > ('apache_beam.r
Re: Python Precommit duration pushing 2 hours
Python precommits are still timing out on #9925. I am guessing that means this change would not be enough. I am proposing cutting down the number of test variants we run in precommits. Currently for each version we ran the following variants serially: - base: Runs all unit tests with tox - Cython: Installs cython and runs all unit tests as base version. The original purpose was to ensure that tests pass with or without cython. There is probably a huge overlap with base. (IIRC only a few coders have different slow vs fast tests.) - GCP: Installs GCP dependencies and tests all base + additional gcp specific tests. The original purpose was to ensure that GCP is an optional component and all non-GCP tests still works without GCP components. We can reduce the list to cython + GCP tests only. This will cover the same group of tests and will check that tests pass with or without cython or GCP dependencies. This could reduce the precommit time by ~30 minutes. What do you think? Ahmet On Tue, Oct 29, 2019 at 11:15 AM Robert Bradshaw wrote: > https://github.com/apache/beam/pull/9925 > > On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri wrote: > > > > I don't have the bandwidth right now to tackle this. Feel free to take > it. > > > > On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw > wrote: > >> > >> The Python SDK does as well. These calls are coming from > >> to_runner_api, is_stateful_dofn, and validate_stateful_dofn which are > >> invoked once per pipene or bundle. They are, however, surprisingly > >> expensive. Even memoizing across those three calls should save a > >> significant amount of time. Udi, did you want to tackle this? > >> > >> Looking at the profile, Pipeline.to_runner_api() is being called 30 > >> times in this test, and [Applied]PTransform.to_fn_api being called > >> 3111 times, so that in itself might be interesting to investigate. > >> > >> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke > wrote: > >> > > >> > As does the Go SDK. Invokers are memoized and when possible code is > generated to avoid reflection. > >> > > >> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles wrote: > >> >> > >> >> Noting for the benefit of the thread archive in case someone goes > digging and wonders if this affects other SDKs: the Java SDK memoizes > DoFnSignatures and generated DoFnInvoker classes. > >> >> > >> >> Kenn > >> >> > >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: > >> >>> > >> >>> Re: #9283 slowing down tests, ideas for slowness: > >> >>> 1. I added a lot of test cases, some with locally run pipelines. > >> >>> 2. The PR somehow changed how coders are selected, and now we're > using less efficient ones. > >> >>> 3. New dependency funcsigs is slowing things down? (py2 only) > >> >>> > >> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 > and got these cool graphs (attached). > >> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time > (IIUC), gets called ~230k times > >> >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times > >> >>> > >> >>> After memoization of get_function_args_defaults: > >> >>> 2.7: core:294:get_function_arguments 20.02% > >> >>> 3.7: core:294:get_function_arguments 8.11% > >> >>> > >> >>> > >> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada > wrote: > >> > >> *not deciles, but 9-percentiles : ) > >> > >> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada > wrote: > >> > > >> > I've ran the tests in Python 2 (without cython), and used a > utility to track runtime for each test method. 
I found some of the > following things: > >> > - Total test methods run: 2665 > >> > - Total test runtime: 990 seconds > >> > - Deciles of time spent: > >> > - 1949 tests run in the first 9% of time > >> > - 173 in the 9-18% rang3e > >> > - 130 in the 18-27% range > >> > - 95 in the 27-36% range > >> > - 77 > >> > - 66 > >> > - 55 > >> > - 46 > >> > - 37 > >> > - 24 > >> > - 13 tests run in the last 9% of time. This represents about 1 > minute and a half. > >> > > >> > We may be able to look at the slowest X tests, and get gradual > improvements from there. Although it seems .. not dramatic ones : ) > >> > > >> > FWIW I uploaded the results here: > https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json > >> > > >> > The slowest 13 tests were: > >> > > >> > > [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', > >> > 5.253582000732422), > >> > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', > >> > 7.907713890075684), > >> > > ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', > >> > 5.237942934036255), > >> > > ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', > >> > 5.56
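For readers skimming the proposal above: the reason a separate GCP variant exists at all is that GCP is an optional extra, and the usual convention in the test code is a guarded import plus a skip marker so the same test file passes in both the base and gcp environments. A minimal sketch of that pattern (simplified and illustrative only, not the actual Beam test code; the google.cloud import here is just a stand-in optional dependency):

import unittest

try:
  # Hypothetical optional dependency used for illustration; Beam guards its
  # google-cloud-* imports in a similar try/except fashion.
  from google.cloud import bigquery  # noqa: F401
  GCP_INSTALLED = True
except ImportError:
  GCP_INSTALLED = False


class OptionalGcpTest(unittest.TestCase):

  @unittest.skipIf(not GCP_INSTALLED, 'GCP dependencies are not installed')
  def test_something_that_needs_gcp(self):
    # Only exercised in the gcp variant, where the extra deps are installed.
    self.assertTrue(GCP_INSTALLED)

  def test_core_behaviour_without_gcp(self):
    # Exercised in every variant; must not touch any google.cloud import.
    self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])


if __name__ == '__main__':
  unittest.main()

Dropping the base variant, as proposed, still exercises both sides of this guard: the cython variant runs without the extras installed (the skips fire), and the GCP variant runs with them.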
Re: Python Precommit duration pushing 2 hours
https://github.com/apache/beam/pull/9925 On Tue, Oct 29, 2019 at 10:24 AM Udi Meiri wrote: > > I don't have the bandwidth right now to tackle this. Feel free to take it. > > On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw wrote: >> >> The Python SDK does as well. These calls are coming from >> to_runner_api, is_stateful_dofn, and validate_stateful_dofn which are >> invoked once per pipene or bundle. They are, however, surprisingly >> expensive. Even memoizing across those three calls should save a >> significant amount of time. Udi, did you want to tackle this? >> >> Looking at the profile, Pipeline.to_runner_api() is being called 30 >> times in this test, and [Applied]PTransform.to_fn_api being called >> 3111 times, so that in itself might be interesting to investigate. >> >> On Tue, Oct 29, 2019 at 8:26 AM Robert Burke wrote: >> > >> > As does the Go SDK. Invokers are memoized and when possible code is >> > generated to avoid reflection. >> > >> > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles wrote: >> >> >> >> Noting for the benefit of the thread archive in case someone goes digging >> >> and wonders if this affects other SDKs: the Java SDK memoizes >> >> DoFnSignatures and generated DoFnInvoker classes. >> >> >> >> Kenn >> >> >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: >> >>> >> >>> Re: #9283 slowing down tests, ideas for slowness: >> >>> 1. I added a lot of test cases, some with locally run pipelines. >> >>> 2. The PR somehow changed how coders are selected, and now we're using >> >>> less efficient ones. >> >>> 3. New dependency funcsigs is slowing things down? (py2 only) >> >>> >> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 and >> >>> got these cool graphs (attached). >> >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), >> >>> gets called ~230k times >> >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times >> >>> >> >>> After memoization of get_function_args_defaults: >> >>> 2.7: core:294:get_function_arguments 20.02% >> >>> 3.7: core:294:get_function_arguments 8.11% >> >>> >> >>> >> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada wrote: >> >> *not deciles, but 9-percentiles : ) >> >> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada >> wrote: >> > >> > I've ran the tests in Python 2 (without cython), and used a utility to >> > track runtime for each test method. I found some of the following >> > things: >> > - Total test methods run: 2665 >> > - Total test runtime: 990 seconds >> > - Deciles of time spent: >> > - 1949 tests run in the first 9% of time >> > - 173 in the 9-18% rang3e >> > - 130 in the 18-27% range >> > - 95 in the 27-36% range >> > - 77 >> > - 66 >> > - 55 >> > - 46 >> > - 37 >> > - 24 >> > - 13 tests run in the last 9% of time. This represents about 1 >> > minute and a half. >> > >> > We may be able to look at the slowest X tests, and get gradual >> > improvements from there. Although it seems .. 
not dramatic ones : ) >> > >> > FWIW I uploaded the results here: >> > https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json >> > >> > The slowest 13 tests were: >> > >> > [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', >> > 5.253582000732422), >> > >> > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', >> > 7.907713890075684), >> > >> > ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', >> > 5.237942934036255), >> > >> > ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', >> > 5.563946008682251), >> > >> > ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', >> > 5.680700063705444), >> > >> > ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', >> > 6.111238956451416), >> > >> > ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', >> > 6.007534980773926), >> > >> > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', >> > 13.993916988372803), >> > >> > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', >> > 6.3383049964904785), >> > >> > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', >> > 9.157485008239746), >> > >> > ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', >> > 5.191173076629639), >> > >> > ('apache_beam.io.vcfio_test.VcfSour
Re: Python Precommit duration pushing 2 hours
I don't have the bandwidth right now to tackle this. Feel free to take it. On Tue, Oct 29, 2019 at 10:16 AM Robert Bradshaw wrote: > The Python SDK does as well. These calls are coming from > to_runner_api, is_stateful_dofn, and validate_stateful_dofn which are > invoked once per pipene or bundle. They are, however, surprisingly > expensive. Even memoizing across those three calls should save a > significant amount of time. Udi, did you want to tackle this? > > Looking at the profile, Pipeline.to_runner_api() is being called 30 > times in this test, and [Applied]PTransform.to_fn_api being called > 3111 times, so that in itself might be interesting to investigate. > > On Tue, Oct 29, 2019 at 8:26 AM Robert Burke wrote: > > > > As does the Go SDK. Invokers are memoized and when possible code is > generated to avoid reflection. > > > > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles wrote: > >> > >> Noting for the benefit of the thread archive in case someone goes > digging and wonders if this affects other SDKs: the Java SDK memoizes > DoFnSignatures and generated DoFnInvoker classes. > >> > >> Kenn > >> > >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: > >>> > >>> Re: #9283 slowing down tests, ideas for slowness: > >>> 1. I added a lot of test cases, some with locally run pipelines. > >>> 2. The PR somehow changed how coders are selected, and now we're using > less efficient ones. > >>> 3. New dependency funcsigs is slowing things down? (py2 only) > >>> > >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 > and got these cool graphs (attached). > >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), > gets called ~230k times > >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times > >>> > >>> After memoization of get_function_args_defaults: > >>> 2.7: core:294:get_function_arguments 20.02% > >>> 3.7: core:294:get_function_arguments 8.11% > >>> > >>> > >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada > wrote: > > *not deciles, but 9-percentiles : ) > > On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada > wrote: > > > > I've ran the tests in Python 2 (without cython), and used a utility > to track runtime for each test method. I found some of the following things: > > - Total test methods run: 2665 > > - Total test runtime: 990 seconds > > - Deciles of time spent: > > - 1949 tests run in the first 9% of time > > - 173 in the 9-18% rang3e > > - 130 in the 18-27% range > > - 95 in the 27-36% range > > - 77 > > - 66 > > - 55 > > - 46 > > - 37 > > - 24 > > - 13 tests run in the last 9% of time. This represents about 1 > minute and a half. > > > > We may be able to look at the slowest X tests, and get gradual > improvements from there. Although it seems .. 
not dramatic ones : ) > > > > FWIW I uploaded the results here: > https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json > > > > The slowest 13 tests were: > > > > > [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', > > 5.253582000732422), > > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', > > 7.907713890075684), > > > ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', > > 5.237942934036255), > > > ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', > > 5.563946008682251), > > > ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', > > 5.680700063705444), > > > ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', > > 6.111238956451416), > > > ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', > > 6.007534980773926), > > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', > > 13.993916988372803), > > > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', > > 6.3383049964904785), > > > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', > > 9.157485008239746), > > > ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', > > 5.191173076629639), > > > ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', > > 6.2221620082855225), > > > ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', > > 7.7187910079956055)] > > > > On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada > wrote: > >> > >> I have written https://github.com/apache/beam/pull/9910 to reduce > FnApiRunnerT
Re: Python Precommit duration pushing 2 hours
The Python SDK does as well. These calls are coming from to_runner_api, is_stateful_dofn, and validate_stateful_dofn which are invoked once per pipeline or bundle. They are, however, surprisingly expensive. Even memoizing across those three calls should save a significant amount of time. Udi, did you want to tackle this? Looking at the profile, Pipeline.to_runner_api() is being called 30 times in this test, and [Applied]PTransform.to_fn_api being called 3111 times, so that in itself might be interesting to investigate. On Tue, Oct 29, 2019 at 8:26 AM Robert Burke wrote: > > As does the Go SDK. Invokers are memoized and when possible code is generated > to avoid reflection. > > On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles wrote: >> >> Noting for the benefit of the thread archive in case someone goes digging >> and wonders if this affects other SDKs: the Java SDK memoizes DoFnSignatures >> and generated DoFnInvoker classes. >> >> Kenn >> >> On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: >>> >>> Re: #9283 slowing down tests, ideas for slowness: >>> 1. I added a lot of test cases, some with locally run pipelines. >>> 2. The PR somehow changed how coders are selected, and now we're using less >>> efficient ones. >>> 3. New dependency funcsigs is slowing things down? (py2 only) >>> >>> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 and got >>> these cool graphs (attached). >>> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), gets >>> called ~230k times >>> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times >>> >>> After memoization of get_function_args_defaults: >>> 2.7: core:294:get_function_arguments 20.02% >>> 3.7: core:294:get_function_arguments 8.11% >>> >>> >>> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada wrote: *not deciles, but 9-percentiles : ) On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada wrote: > > I've run the tests in Python 2 (without cython), and used a utility to > track runtime for each test method. I found some of the following things: > - Total test methods run: 2665 > - Total test runtime: 990 seconds > - Deciles of time spent: > - 1949 tests run in the first 9% of time > - 173 in the 9-18% range > - 130 in the 18-27% range > - 95 in the 27-36% range > - 77 > - 66 > - 55 > - 46 > - 37 > - 24 > - 13 tests run in the last 9% of time. This represents about 1 minute > and a half. > > We may be able to look at the slowest X tests, and get gradual > improvements from there. Although it seems ..
not dramatic ones : ) > > FWIW I uploaded the results here: > https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json > > The slowest 13 tests were: > > [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', > 5.253582000732422), > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', > 7.907713890075684), > > ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', > 5.237942934036255), > ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', > 5.563946008682251), > > ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', > 5.680700063705444), > > ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', > 6.111238956451416), > > ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', > 6.007534980773926), > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', > 13.993916988372803), > > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', > 6.3383049964904785), > > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', > 9.157485008239746), > > ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', > 5.191173076629639), > > ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', > 6.2221620082855225), > > ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', > 7.7187910079956055)] > > On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada wrote: >> >> I have written https://github.com/apache/beam/pull/9910 to reduce >> FnApiRunnerTest variations. >> I'm not in a rush to merge, but rather happy to start a discussion. >> I'll also try to figure out if there are other tests slowing down the >> suite significantly. >> Best >> -P. >>>
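The memoization being suggested here is straightforward in shape; a minimal sketch, under the assumption that the expensive part is the per-callable signature inspection (illustrative only, not the actual Beam change; the helper name simply mirrors the get_function_args_defaults mentioned above):

import inspect

_ARG_DEFAULTS_CACHE = {}


def get_function_args_defaults_cached(fn):
  """Returns (arg names, defaults) for fn, computing them at most once."""
  try:
    return _ARG_DEFAULTS_CACHE[fn]
  except KeyError:
    pass
  signature = inspect.signature(fn)  # funcsigs.signature on Python 2
  args = list(signature.parameters)
  defaults = [p.default for p in signature.parameters.values()
              if p.default is not inspect.Parameter.empty]
  _ARG_DEFAULTS_CACHE[fn] = (args, defaults)
  return _ARG_DEFAULTS_CACHE[fn]

On Python 3, decorating such a helper with functools.lru_cache(maxsize=None) achieves the same effect; the explicit dict just keeps the sketch Python 2 friendly, since the thread notes funcsigs is in play on py2.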
Re: Python Precommit duration pushing 2 hours
As does the Go SDK. Invokers are memoized and when possible code is generated to avoid reflection. On Tue, Oct 29, 2019, 6:46 AM Kenneth Knowles wrote: > Noting for the benefit of the thread archive in case someone goes digging > and wonders if this affects other SDKs: the Java SDK > memoizes DoFnSignatures and generated DoFnInvoker classes. > > Kenn > > On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: > >> Re: #9283 slowing down tests, ideas for slowness: >> 1. I added a lot of test cases, some with locally run pipelines. >> 2. The PR somehow changed how coders are selected, and now we're using >> less efficient ones. >> 3. New dependency funcsigs is slowing things down? (py2 only) >> >> I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 and >> got these cool graphs (attached). >> 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), >> gets called ~230k times >> 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times >> >> After memoization of get_function_args_defaults: >> 2.7: core:294:get_function_arguments 20.02% >> 3.7: core:294:get_function_arguments 8.11% >> >> >> On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada wrote: >> >>> *not deciles, but 9-percentiles : ) >>> >>> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada >>> wrote: >>> I've ran the tests in Python 2 (without cython), and used a utility to track runtime for each test method. I found some of the following things: - Total test methods run: 2665 - Total test runtime: 990 seconds - Deciles of time spent: - 1949 tests run in the first 9% of time - 173 in the 9-18% rang3e - 130 in the 18-27% range - 95 in the 27-36% range - 77 - 66 - 55 - 46 - 37 - 24 - 13 tests run in the last 9% of time. This represents about 1 minute and a half. We may be able to look at the slowest X tests, and get gradual improvements from there. Although it seems .. 
not dramatic ones : ) FWIW I uploaded the results here: https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json The slowest 13 tests were: [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', 5.253582000732422), ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', 7.907713890075684), ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', 5.237942934036255), ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', 5.563946008682251), ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', 5.680700063705444), ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', 6.111238956451416), ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', 6.007534980773926), ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', 13.993916988372803), ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', 6.3383049964904785), ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', 9.157485008239746), ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', 5.191173076629639), ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', 6.2221620082855225), ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', 7.7187910079956055)] On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada wrote: > I have written https://github.com/apache/beam/pull/9910 to reduce > FnApiRunnerTest variations. > I'm not in a rush to merge, but rather happy to start a discussion. > I'll also try to figure out if there are other tests slowing down the > suite significantly. > Best > -P. > > On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev < > valen...@google.com> wrote: > >> Thanks, Brian. >> +Udi Meiri >> As next step, it would be good to know whether slowdown is caused by >> tests in this PR, or its effect on other tests, and to confirm that only >> Python 2 codepaths were affected. >> >> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette >> wrote: >> >>> I did a bisect based on the runtime of `./gradlew >>> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between >>> 9/1 >>> and 9/15 to see if I could find the source of the spi
Re: Python Precommit duration pushing 2 hours
Noting for the benefit of the thread archive in case someone goes digging and wonders if this affects other SDKs: the Java SDK memoizes DoFnSignatures and generated DoFnInvoker classes. Kenn On Mon, Oct 28, 2019 at 6:59 PM Udi Meiri wrote: > Re: #9283 slowing down tests, ideas for slowness: > 1. I added a lot of test cases, some with locally run pipelines. > 2. The PR somehow changed how coders are selected, and now we're using > less efficient ones. > 3. New dependency funcsigs is slowing things down? (py2 only) > > I ran "pytest -k PipelineAnalyzerTest --profile-svg" on 2.7 and 3.7 and > got these cool graphs (attached). > 2.7: core:294:get_function_arguments takes 56.66% of CPU time (IIUC), gets > called ~230k times > 3.7: core:294:get_function_arguments 30.88%, gets called ~200k times > > After memoization of get_function_args_defaults: > 2.7: core:294:get_function_arguments 20.02% > 3.7: core:294:get_function_arguments 8.11% > > > On Mon, Oct 28, 2019 at 5:38 PM Pablo Estrada wrote: > >> *not deciles, but 9-percentiles : ) >> >> On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada wrote: >> >>> I've ran the tests in Python 2 (without cython), and used a utility to >>> track runtime for each test method. I found some of the following things: >>> - Total test methods run: 2665 >>> - Total test runtime: 990 seconds >>> - Deciles of time spent: >>> - 1949 tests run in the first 9% of time >>> - 173 in the 9-18% rang3e >>> - 130 in the 18-27% range >>> - 95 in the 27-36% range >>> - 77 >>> - 66 >>> - 55 >>> - 46 >>> - 37 >>> - 24 >>> - 13 tests run in the last 9% of time. This represents about 1 minute >>> and a half. >>> >>> We may be able to look at the slowest X tests, and get gradual >>> improvements from there. Although it seems .. not dramatic ones : ) >>> >>> FWIW I uploaded the results here: >>> https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json >>> >>> The slowest 13 tests were: >>> >>> >>> [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', >>> 5.253582000732422), >>> >>> >>> ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', >>> 7.907713890075684), >>> >>> >>> ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', >>> 5.237942934036255), >>> ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', >>> 5.563946008682251), >>> >>> >>> ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', >>> 5.680700063705444), >>> >>> >>> ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', >>> 6.111238956451416), >>> >>> >>> ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', >>> 6.007534980773926), >>> >>> >>> ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', >>> 13.993916988372803), >>> >>> >>> ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', >>> 6.3383049964904785), >>> >>> >>> ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', >>> 9.157485008239746), >>> >>> >>> ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', >>> 5.191173076629639), >>> >>> >>> ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', >>> 6.2221620082855225), >>> >>> 
('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', >>> 7.7187910079956055)] >>> >>> On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada >>> wrote: >>> I have written https://github.com/apache/beam/pull/9910 to reduce FnApiRunnerTest variations. I'm not in a rush to merge, but rather happy to start a discussion. I'll also try to figure out if there are other tests slowing down the suite significantly. Best -P. On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev < valen...@google.com> wrote: > Thanks, Brian. > +Udi Meiri > As next step, it would be good to know whether slowdown is caused by > tests in this PR, or its effect on other tests, and to confirm that only > Python 2 codepaths were affected. > > On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette > wrote: > >> I did a bisect based on the runtime of `./gradlew >> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between >> 9/1 >> and 9/15 to see if I could find the source of the spike that happened >> around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this >> search would reveal some mis-guided configuration change, but as far as I >> can tell 9283 just added a well-tested feature. I don't think there's >> anything to learn from that... I just wanted
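The --profile-svg flag quoted above comes from the pytest-profiling plugin; the same per-function breakdown can also be produced with the standard library alone, which is handy when chasing a slowdown on a machine without extra plugins installed. A rough sketch (the discovery path is only an example, not a recommendation):

import cProfile
import pstats
import unittest


def profile_tests(start_dir, pattern='*_test.py', top=20):
  """Runs a unittest discovery pass under cProfile and prints the hot spots."""
  suite = unittest.TestLoader().discover(start_dir, pattern=pattern)
  profiler = cProfile.Profile()
  profiler.enable()
  unittest.TextTestRunner(verbosity=0).run(suite)
  profiler.disable()
  # Sort by cumulative time so expensive helpers like signature inspection
  # surface near the top, as in the numbers quoted above.
  pstats.Stats(profiler).sort_stats('cumulative').print_stats(top)


if __name__ == '__main__':
  profile_tests('apache_beam/runners/interactive')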
Re: Python Precommit duration pushing 2 hours
*not deciles, but 9-percentiles : ) On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada wrote: > I've ran the tests in Python 2 (without cython), and used a utility to > track runtime for each test method. I found some of the following things: > - Total test methods run: 2665 > - Total test runtime: 990 seconds > - Deciles of time spent: > - 1949 tests run in the first 9% of time > - 173 in the 9-18% rang3e > - 130 in the 18-27% range > - 95 in the 27-36% range > - 77 > - 66 > - 55 > - 46 > - 37 > - 24 > - 13 tests run in the last 9% of time. This represents about 1 minute > and a half. > > We may be able to look at the slowest X tests, and get gradual > improvements from there. Although it seems .. not dramatic ones : ) > > FWIW I uploaded the results here: > https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json > > The slowest 13 tests were: > > > [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', > 5.253582000732422), > > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', > 7.907713890075684), > > > ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', > 5.237942934036255), > ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', > 5.563946008682251), > > > ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', > 5.680700063705444), > > > ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', > 6.111238956451416), > > > ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', > 6.007534980773926), > > > ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', > 13.993916988372803), > > > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', > 6.3383049964904785), > > > ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', > 9.157485008239746), > > > ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', > 5.191173076629639), > > > ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', > 6.2221620082855225), > > ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', > 7.7187910079956055)] > > On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada wrote: > >> I have written https://github.com/apache/beam/pull/9910 to reduce >> FnApiRunnerTest variations. >> I'm not in a rush to merge, but rather happy to start a discussion. >> I'll also try to figure out if there are other tests slowing down the >> suite significantly. >> Best >> -P. >> >> On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev >> wrote: >> >>> Thanks, Brian. >>> +Udi Meiri >>> As next step, it would be good to know whether slowdown is caused by >>> tests in this PR, or its effect on other tests, and to confirm that only >>> Python 2 codepaths were affected. >>> >>> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette >>> wrote: >>> I did a bisect based on the runtime of `./gradlew :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1 and 9/15 to see if I could find the source of the spike that happened around 9/6. It looks like it was due to PR#9283 [1]. 
I thought maybe this search would reveal some mis-guided configuration change, but as far as I can tell 9283 just added a well-tested feature. I don't think there's anything to learn from that... I just wanted to circle back about it in case others are curious about that spike. I'm +1 on bumping some FnApiRunner configurations. Brian [1] https://github.com/apache/beam/pull/9283 On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada wrote: > I think it makes sense to remove some of the extra FnApiRunner > configurations. Perhaps some of the multiworkers and some of the grpc > versions? > Best > -P. > > On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw > wrote: > >> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ >> minutes on each version of Python. This test consists of a base class >> that is basically a validates runner suite, and is then run in several >> configurations, many more of which (including some expensive ones) >> have been added lately. >> >> class FnApiRunnerTest(unittest.TestCase): >> class FnApiRunnerTestWithGrpc(FnApiRunnerTest): >> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest): >> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest): >> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest): >> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
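For anyone wanting to reproduce the 9-percentile breakdown from the uploaded JSON, the computation is just a cumulative-time walk over the sorted durations. A sketch, assuming nosetimes.json maps test name to seconds (the exact schema of the uploaded file isn't shown in the thread):

import json


def cumulative_time_buckets(times, bucket_pct=9):
  """Counts tests per 9%-of-total-runtime slice, fastest tests first."""
  durations = sorted(times.values())
  total = float(sum(durations))
  counts = {}
  elapsed = 0.0
  for duration in durations:
    elapsed += duration
    # Index of the cumulative-time slice this test finishes in.
    index = min(int(elapsed / total * 100) // bucket_pct, 100 // bucket_pct)
    counts[index] = counts.get(index, 0) + 1
  return [counts.get(i, 0) for i in range(max(counts) + 1)]


if __name__ == '__main__':
  with open('nosetimes.json') as f:
    print(cumulative_time_buckets(json.load(f)))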
Re: Python Precommit duration pushing 2 hours
I've ran the tests in Python 2 (without cython), and used a utility to track runtime for each test method. I found some of the following things: - Total test methods run: 2665 - Total test runtime: 990 seconds - Deciles of time spent: - 1949 tests run in the first 9% of time - 173 in the 9-18% rang3e - 130 in the 18-27% range - 95 in the 27-36% range - 77 - 66 - 55 - 46 - 37 - 24 - 13 tests run in the last 9% of time. This represents about 1 minute and a half. We may be able to look at the slowest X tests, and get gradual improvements from there. Although it seems .. not dramatic ones : ) FWIW I uploaded the results here: https://storage.googleapis.com/apache-beam-website-pull-requests/python-tests/nosetimes.json The slowest 13 tests were: [('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_basic', 5.253582000732422), ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_wordcount', 7.907713890075684), ('apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest.test_failure_has_same_insert_ids', 5.237942934036255), ('apache_beam.transforms.combiners_test.CombineTest.test_global_sample', 5.563946008682251), ('apache_beam.runners.worker.sideinputs_test.EmulatedCollectionsTest.test_large_iterable_values', 5.680700063705444), ('apache_beam.io.parquetio_test.TestParquet.test_sink_transform_multiple_row_group', 6.111238956451416), ('apache_beam.runners.worker.statesampler_test.StateSamplerTest.test_basic_sampler', 6.007534980773926), ('apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest.test_basic', 13.993916988372803), ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_read_cache_expansion', 6.3383049964904785), ('apache_beam.runners.interactive.pipeline_analyzer_test.PipelineAnalyzerTest.test_word_count', 9.157485008239746), ('apache_beam.runners.portability.portable_runner_test.PortableRunnerTestWithSubprocesses.test_pardo_side_and_main_outputs', 5.191173076629639), ('apache_beam.io.vcfio_test.VcfSourceTest.test_pipeline_read_file_pattern_large', 6.2221620082855225), ('apache_beam.io.fileio_test.WriteFilesTest.test_streaming_complex_timing', 7.7187910079956055)] On Mon, Oct 28, 2019 at 3:10 PM Pablo Estrada wrote: > I have written https://github.com/apache/beam/pull/9910 to reduce > FnApiRunnerTest variations. > I'm not in a rush to merge, but rather happy to start a discussion. > I'll also try to figure out if there are other tests slowing down the > suite significantly. > Best > -P. > > On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev > wrote: > >> Thanks, Brian. >> +Udi Meiri >> As next step, it would be good to know whether slowdown is caused by >> tests in this PR, or its effect on other tests, and to confirm that only >> Python 2 codepaths were affected. >> >> On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette >> wrote: >> >>> I did a bisect based on the runtime of `./gradlew >>> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1 >>> and 9/15 to see if I could find the source of the spike that happened >>> around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this >>> search would reveal some mis-guided configuration change, but as far as I >>> can tell 9283 just added a well-tested feature. I don't think there's >>> anything to learn from that... I just wanted to circle back about it in >>> case others are curious about that spike. >>> >>> I'm +1 on bumping some FnApiRunner configurations. 
>>> >>> Brian >>> >>> [1] https://github.com/apache/beam/pull/9283 >>> >>> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada >>> wrote: >>> I think it makes sense to remove some of the extra FnApiRunner configurations. Perhaps some of the multiworkers and some of the grpc versions? Best -P. On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw wrote: > It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ > minutes on each version of Python. This test consists of a base class > that is basically a validates runner suite, and is then run in several > configurations, many more of which (including some expensive ones) > have been added lately. > > class FnApiRunnerTest(unittest.TestCase): > class FnApiRunnerTestWithGrpc(FnApiRunnerTest): > class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest): > class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest): > class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest): > class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest): > class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest): > class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest): > > I'm not convinced we need to run all of these permutations, or at > least not all tests in all permutations. > > On Fri, Oct 25, 2019 at 10:
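The timing utility itself isn't shown in the thread; with pytest, an equivalent per-test-method dump can be produced from a couple of hooks in a conftest.py (a sketch; the output filename is arbitrary, and pytest's built-in --durations=N flag is a lighter-weight way to eyeball the slowest tests):

import json

_durations = {}


def pytest_runtest_logreport(report):
  # Accumulate setup + call + teardown time per test node id.
  _durations[report.nodeid] = _durations.get(report.nodeid, 0.0) + report.duration


def pytest_sessionfinish(session, exitstatus):
  # Write a {test name: seconds} mapping, comparable to the uploaded JSON.
  with open('test_times.json', 'w') as f:
    json.dump(_durations, f, indent=2, sort_keys=True)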
Re: Python Precommit duration pushing 2 hours
I have written https://github.com/apache/beam/pull/9910 to reduce FnApiRunnerTest variations. I'm not in a rush to merge, but rather happy to start a discussion. I'll also try to figure out if there are other tests slowing down the suite significantly. Best -P. On Fri, Oct 25, 2019 at 7:41 PM Valentyn Tymofieiev wrote: > Thanks, Brian. > +Udi Meiri > As next step, it would be good to know whether slowdown is caused by tests > in this PR, or its effect on other tests, and to confirm that only Python 2 > codepaths were affected. > > On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette wrote: > >> I did a bisect based on the runtime of `./gradlew >> :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1 >> and 9/15 to see if I could find the source of the spike that happened >> around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this >> search would reveal some mis-guided configuration change, but as far as I >> can tell 9283 just added a well-tested feature. I don't think there's >> anything to learn from that... I just wanted to circle back about it in >> case others are curious about that spike. >> >> I'm +1 on bumping some FnApiRunner configurations. >> >> Brian >> >> [1] https://github.com/apache/beam/pull/9283 >> >> On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada wrote: >> >>> I think it makes sense to remove some of the extra FnApiRunner >>> configurations. Perhaps some of the multiworkers and some of the grpc >>> versions? >>> Best >>> -P. >>> >>> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw >>> wrote: >>> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ minutes on each version of Python. This test consists of a base class that is basically a validates runner suite, and is then run in several configurations, many more of which (including some expensive ones) have been added lately. class FnApiRunnerTest(unittest.TestCase): class FnApiRunnerTestWithGrpc(FnApiRunnerTest): class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest): class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest): class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest): class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest): class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest): class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest): I'm not convinced we need to run all of these permutations, or at least not all tests in all permutations. On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev wrote: > > I took another look at this and precommit ITs are already running in parallel, albeit in the same suite. However it appears Python precommits became slower, especially Python 2 precommits [35 min per suite x 3 suites], see [1]. Not sure yet what caused the increase, but precommits used to be faster. Perhaps we have added a slow test or a lot of new tests. > > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524 > > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay wrote: >> >> Ack. Separating precommit ITs to a different suite sounds good. Anyone is interested in doing that? >> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev < valen...@google.com> wrote: >>> >>> This should not increase the queue time substantially, since precommit ITs are running sequentially with precommit tests, unlike multiple precommit tests which run in parallel to each other. >>> >>> The precommit ITs we run are batch and streaming wordcount tests on Py2 and one Py3 version, so it's not a lot of tests. 
>>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: +1 to separating ITs from precommit. Downside would be, when Chad tried to do something similar [1] it was noted that the total time to run all precommit tests would increase and also potentially increase the queue time. Another alternative, we could run a smaller set of IT tests in precommits and run the whole suite as part of post commit tests. [1] https://github.com/apache/beam/pull/9642 On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev < valen...@google.com> wrote: > > One improvement could be move to Precommit IT tests into a separate suite from precommit tests, and run it in parallel. > > On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette < bhule...@google.com> wrote: >> >> Python Precommits are taking quite a while now [1]. Just visually it looks like the average length is 1.5h or so, but it spikes up to 2h. I've had several precommit runs get aborted due to the 2 hour limit. >> >> It looks like there was a spike up above 1h back on 9/6 and the duration has been
Re: Python Precommit duration pushing 2 hours
Thanks, Brian. +Udi Meiri As next step, it would be good to know whether slowdown is caused by tests in this PR, or its effect on other tests, and to confirm that only Python 2 codepaths were affected. On Fri, Oct 25, 2019 at 6:35 PM Brian Hulette wrote: > I did a bisect based on the runtime of `./gradlew > :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1 > and 9/15 to see if I could find the source of the spike that happened > around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this > search would reveal some mis-guided configuration change, but as far as I > can tell 9283 just added a well-tested feature. I don't think there's > anything to learn from that... I just wanted to circle back about it in > case others are curious about that spike. > > I'm +1 on bumping some FnApiRunner configurations. > > Brian > > [1] https://github.com/apache/beam/pull/9283 > > On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada wrote: > >> I think it makes sense to remove some of the extra FnApiRunner >> configurations. Perhaps some of the multiworkers and some of the grpc >> versions? >> Best >> -P. >> >> On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw >> wrote: >> >>> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ >>> minutes on each version of Python. This test consists of a base class >>> that is basically a validates runner suite, and is then run in several >>> configurations, many more of which (including some expensive ones) >>> have been added lately. >>> >>> class FnApiRunnerTest(unittest.TestCase): >>> class FnApiRunnerTestWithGrpc(FnApiRunnerTest): >>> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest): >>> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest): >>> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest): >>> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest): >>> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest): >>> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest): >>> >>> I'm not convinced we need to run all of these permutations, or at >>> least not all tests in all permutations. >>> >>> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev >>> wrote: >>> > >>> > I took another look at this and precommit ITs are already running in >>> parallel, albeit in the same suite. However it appears Python precommits >>> became slower, especially Python 2 precommits [35 min per suite x 3 >>> suites], see [1]. Not sure yet what caused the increase, but precommits >>> used to be faster. Perhaps we have added a slow test or a lot of new tests. >>> > >>> > [1] >>> https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524 >>> > >>> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay wrote: >>> >> >>> >> Ack. Separating precommit ITs to a different suite sounds good. >>> Anyone is interested in doing that? >>> >> >>> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev < >>> valen...@google.com> wrote: >>> >>> >>> >>> This should not increase the queue time substantially, since >>> precommit ITs are running sequentially with precommit tests, unlike >>> multiple precommit tests which run in parallel to each other. >>> >>> >>> >>> The precommit ITs we run are batch and streaming wordcount tests on >>> Py2 and one Py3 version, so it's not a lot of tests. >>> >>> >>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay >>> wrote: >>> >>> +1 to separating ITs from precommit. 
Downside would be, when Chad >>> tried to do something similar [1] it was noted that the total time to run >>> all precommit tests would increase and also potentially increase the queue >>> time. >>> >>> Another alternative, we could run a smaller set of IT tests in >>> precommits and run the whole suite as part of post commit tests. >>> >>> [1] https://github.com/apache/beam/pull/9642 >>> >>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev < >>> valen...@google.com> wrote: >>> > >>> > One improvement could be move to Precommit IT tests into a >>> separate suite from precommit tests, and run it in parallel. >>> > >>> > On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette < >>> bhule...@google.com> wrote: >>> >> >>> >> Python Precommits are taking quite a while now [1]. Just visually >>> it looks like the average length is 1.5h or so, but it spikes up to 2h. >>> I've had several precommit runs get aborted due to the 2 hour limit. >>> >> >>> >> It looks like there was a spike up above 1h back on 9/6 and the >>> duration has been steadily rising since then. Is there anything we can do >>> about this? >>> >> >>> >> Brian >>> >> >>> >> [1] >>> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >>> >>
Re: Python Precommit duration pushing 2 hours
I did a bisect based on the runtime of `./gradlew :sdks:python:test-suites:tox:py2:testPy2Gcp` around the commits between 9/1 and 9/15 to see if I could find the source of the spike that happened around 9/6. It looks like it was due to PR#9283 [1]. I thought maybe this search would reveal some mis-guided configuration change, but as far as I can tell 9283 just added a well-tested feature. I don't think there's anything to learn from that... I just wanted to circle back about it in case others are curious about that spike. I'm +1 on bumping some FnApiRunner configurations. Brian [1] https://github.com/apache/beam/pull/9283 On Fri, Oct 25, 2019 at 4:49 PM Pablo Estrada wrote: > I think it makes sense to remove some of the extra FnApiRunner > configurations. Perhaps some of the multiworkers and some of the grpc > versions? > Best > -P. > > On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw > wrote: > >> It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ >> minutes on each version of Python. This test consists of a base class >> that is basically a validates runner suite, and is then run in several >> configurations, many more of which (including some expensive ones) >> have been added lately. >> >> class FnApiRunnerTest(unittest.TestCase): >> class FnApiRunnerTestWithGrpc(FnApiRunnerTest): >> class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest): >> class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest): >> class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest): >> class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest): >> class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest): >> class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest): >> >> I'm not convinced we need to run all of these permutations, or at >> least not all tests in all permutations. >> >> On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev >> wrote: >> > >> > I took another look at this and precommit ITs are already running in >> parallel, albeit in the same suite. However it appears Python precommits >> became slower, especially Python 2 precommits [35 min per suite x 3 >> suites], see [1]. Not sure yet what caused the increase, but precommits >> used to be faster. Perhaps we have added a slow test or a lot of new tests. >> > >> > [1] >> https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524 >> > >> > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay wrote: >> >> >> >> Ack. Separating precommit ITs to a different suite sounds good. Anyone >> is interested in doing that? >> >> >> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev < >> valen...@google.com> wrote: >> >>> >> >>> This should not increase the queue time substantially, since >> precommit ITs are running sequentially with precommit tests, unlike >> multiple precommit tests which run in parallel to each other. >> >>> >> >>> The precommit ITs we run are batch and streaming wordcount tests on >> Py2 and one Py3 version, so it's not a lot of tests. >> >>> >> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: >> >> +1 to separating ITs from precommit. Downside would be, when Chad >> tried to do something similar [1] it was noted that the total time to run >> all precommit tests would increase and also potentially increase the queue >> time. >> >> Another alternative, we could run a smaller set of IT tests in >> precommits and run the whole suite as part of post commit tests. 
>> >> [1] https://github.com/apache/beam/pull/9642 >> >> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev < >> valen...@google.com> wrote: >> > >> > One improvement could be move to Precommit IT tests into a separate >> suite from precommit tests, and run it in parallel. >> > >> > On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette >> wrote: >> >> >> >> Python Precommits are taking quite a while now [1]. Just visually >> it looks like the average length is 1.5h or so, but it spikes up to 2h. >> I've had several precommit runs get aborted due to the 2 hour limit. >> >> >> >> It looks like there was a spike up above 1h back on 9/6 and the >> duration has been steadily rising since then. Is there anything we can do >> about this? >> >> >> >> Brian >> >> >> >> [1] >> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >> >
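The bisect procedure isn't spelled out above; one way to automate it is a predicate script driven by git bisect run that re-times the gradle task at each commit and fails when it exceeds a chosen cutoff. A sketch with an arbitrary 45-minute threshold (not necessarily how Brian did it):

import subprocess
import sys
import time

THRESHOLD_SECONDS = 45 * 60  # arbitrary example cutoff
TASK = './gradlew :sdks:python:test-suites:tox:py2:testPy2Gcp'

start = time.time()
result = subprocess.run(TASK.split(), check=False)
elapsed = time.time() - start

if result.returncode != 0:
  # 125 tells `git bisect run` to skip commits that fail outright.
  sys.exit(125)
print('task took %.0f seconds' % elapsed)
sys.exit(0 if elapsed < THRESHOLD_SECONDS else 1)

Invoked roughly as: git bisect start <bad-sha> <good-sha>, then git bisect run python check_precommit_runtime.py (script name hypothetical).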
Re: Python Precommit duration pushing 2 hours
I think it makes sense to remove some of the extra FnApiRunner configurations. Perhaps some of the multiworkers and some of the grpc versions? Best -P. On Fri, Oct 25, 2019 at 12:27 PM Robert Bradshaw wrote: > It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ > minutes on each version of Python. This test consists of a base class > that is basically a validates runner suite, and is then run in several > configurations, many more of which (including some expensive ones) > have been added lately. > > class FnApiRunnerTest(unittest.TestCase): > class FnApiRunnerTestWithGrpc(FnApiRunnerTest): > class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest): > class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest): > class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest): > class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest): > class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest): > class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest): > > I'm not convinced we need to run all of these permutations, or at > least not all tests in all permutations. > > On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev > wrote: > > > > I took another look at this and precommit ITs are already running in > parallel, albeit in the same suite. However it appears Python precommits > became slower, especially Python 2 precommits [35 min per suite x 3 > suites], see [1]. Not sure yet what caused the increase, but precommits > used to be faster. Perhaps we have added a slow test or a lot of new tests. > > > > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524 > > > > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay wrote: > >> > >> Ack. Separating precommit ITs to a different suite sounds good. Anyone > is interested in doing that? > >> > >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev < > valen...@google.com> wrote: > >>> > >>> This should not increase the queue time substantially, since precommit > ITs are running sequentially with precommit tests, unlike multiple > precommit tests which run in parallel to each other. > >>> > >>> The precommit ITs we run are batch and streaming wordcount tests on > Py2 and one Py3 version, so it's not a lot of tests. > >>> > >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: > > +1 to separating ITs from precommit. Downside would be, when Chad > tried to do something similar [1] it was noted that the total time to run > all precommit tests would increase and also potentially increase the queue > time. > > Another alternative, we could run a smaller set of IT tests in > precommits and run the whole suite as part of post commit tests. > > [1] https://github.com/apache/beam/pull/9642 > > On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev < > valen...@google.com> wrote: > > > > One improvement could be move to Precommit IT tests into a separate > suite from precommit tests, and run it in parallel. > > > > On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette > wrote: > >> > >> Python Precommits are taking quite a while now [1]. Just visually > it looks like the average length is 1.5h or so, but it spikes up to 2h. > I've had several precommit runs get aborted due to the 2 hour limit. > >> > >> It looks like there was a spike up above 1h back on 9/6 and the > duration has been steadily rising since then. Is there anything we can do > about this? > >> > >> Brian > >> > >> [1] > http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >
Re: Python Precommit duration pushing 2 hours
It looks like fn_api_runner_test.py is quite expensive, taking 10-15+ minutes on each version of Python. This test consists of a base class that is basically a validates runner suite, and is then run in several configurations, many more of which (including some expensive ones) have been added lately.

class FnApiRunnerTest(unittest.TestCase):
class FnApiRunnerTestWithGrpc(FnApiRunnerTest):
class FnApiRunnerTestWithGrpcMultiThreaded(FnApiRunnerTest):
class FnApiRunnerTestWithDisabledCaching(FnApiRunnerTest):
class FnApiRunnerTestWithMultiWorkers(FnApiRunnerTest):
class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
class FnApiRunnerTestWithBundleRepeat(FnApiRunnerTest):
class FnApiRunnerTestWithBundleRepeatAndMultiWorkers(FnApiRunnerTest):

I'm not convinced we need to run all of these permutations, or at least not all tests in all permutations. On Fri, Oct 25, 2019 at 10:57 AM Valentyn Tymofieiev wrote: > > I took another look at this and precommit ITs are already running in > parallel, albeit in the same suite. However it appears Python precommits > became slower, especially Python 2 precommits [35 min per suite x 3 suites], > see [1]. Not sure yet what caused the increase, but precommits used to be > faster. Perhaps we have added a slow test or a lot of new tests. > > [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524 > > On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay wrote: >> >> Ack. Separating precommit ITs to a different suite sounds good. Anyone is >> interested in doing that? >> >> On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev >> wrote: >>> >>> This should not increase the queue time substantially, since precommit ITs >>> are running sequentially with precommit tests, unlike multiple precommit >>> tests which run in parallel to each other. >>> >>> The precommit ITs we run are batch and streaming wordcount tests on Py2 and >>> one Py3 version, so it's not a lot of tests. >>> >>> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: +1 to separating ITs from precommit. Downside would be, when Chad tried to do something similar [1] it was noted that the total time to run all precommit tests would increase and also potentially increase the queue time. Another alternative, we could run a smaller set of IT tests in precommits and run the whole suite as part of post commit tests. [1] https://github.com/apache/beam/pull/9642 On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev wrote: > > One improvement could be move to Precommit IT tests into a separate suite > from precommit tests, and run it in parallel. > > On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette > wrote: >> >> Python Precommits are taking quite a while now [1]. Just visually it >> looks like the average length is 1.5h or so, but it spikes up to 2h. >> I've had several precommit runs get aborted due to the 2 hour limit. >> >> It looks like there was a spike up above 1h back on 9/6 and the duration >> has been steadily rising since then. Is there anything we can do about >> this? >> >> Brian >> >> [1] >> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4
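To make the "not all tests in all permutations" idea concrete, here is a small sketch of how a derived permutation could opt out of individual expensive cases while still re-running the cheap ones. The class and test names are placeholders for this example, not the actual contents of fn_api_runner_test.py:

import unittest


class FnApiRunnerTest(unittest.TestCase):
  """Stand-in for the full validates-runner suite."""

  def test_cheap_path(self):
    self.assertEqual(2 + 2, 4)

  def test_expensive_path(self):
    # Imagine a pipeline with many bundles / large side inputs here.
    self.assertTrue(True)


class FnApiRunnerTestWithGrpcAndMultiWorkers(FnApiRunnerTest):
  """Permutation that keeps the cheap coverage but skips cases already
  exercised by the base run."""

  def test_expensive_path(self):
    raise unittest.SkipTest('covered by the plain FnApiRunnerTest run')


if __name__ == '__main__':
  unittest.main()

Skipping at the method level like this keeps each permutation's coverage of the cheap paths without paying for a full re-run of the expensive ones.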
Re: Python Precommit duration pushing 2 hours
I took another look at this and precommit ITs are already running in parallel, albeit in the same suite. However it appears Python precommits became slower, especially Python 2 precommits [35 min per suite x 3 suites], see [1]. Not sure yet what caused the increase, but precommits used to be faster. Perhaps we have added a slow test or a lot of new tests. [1] https://scans.gradle.com/s/jvcw5fpqfc64k/timeline?task=ancsbov425524 On Thu, Oct 24, 2019 at 4:53 PM Ahmet Altay wrote: > Ack. Separating precommit ITs to a different suite sounds good. Anyone is > interested in doing that? > > On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev > wrote: > >> This should not increase the queue time substantially, since precommit >> ITs are running *sequentially* with precommit tests, unlike multiple >> precommit tests which run in parallel to each other. >> >> The precommit ITs we run are batch and streaming wordcount tests on Py2 >> and one Py3 version, so it's not a lot of tests. >> >> On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: >> >>> +1 to separating ITs from precommit. Downside would be, when Chad tried >>> to do something similar [1] it was noted that the total time to run all >>> precommit tests would increase and also potentially increase the queue time. >>> >>> Another alternative, we could run a smaller set of IT tests in >>> precommits and run the whole suite as part of post commit tests. >>> >>> [1] https://github.com/apache/beam/pull/9642 >>> >>> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev < >>> valen...@google.com> wrote: >>> One improvement could be move to Precommit IT tests into a separate suite from precommit tests, and run it in parallel. On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette wrote: > Python Precommits are taking quite a while now [1]. Just visually it > looks like the average length is 1.5h or so, but it spikes up to 2h. I've > had several precommit runs get aborted due to the 2 hour limit. > > It looks like there was a spike up above 1h back on 9/6 and the > duration has been steadily rising since then. Is there anything we can do > about this? > > Brian > > [1] > http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >
Re: Python Precommit duration pushing 2 hours
Ack. Separating precommit ITs into a different suite sounds good. Is anyone interested in doing that? On Thu, Oct 24, 2019 at 2:41 PM Valentyn Tymofieiev wrote: > This should not increase the queue time substantially, since precommit ITs > are running *sequentially* with precommit tests, unlike multiple > precommit tests which run in parallel to each other. > > The precommit ITs we run are batch and streaming wordcount tests on Py2 > and one Py3 version, so it's not a lot of tests. > > On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: > >> +1 to separating ITs from precommit. Downside would be, when Chad tried >> to do something similar [1] it was noted that the total time to run all >> precommit tests would increase and also potentially increase the queue time. >> >> Another alternative, we could run a smaller set of IT tests in precommits >> and run the whole suite as part of post commit tests. >> >> [1] https://github.com/apache/beam/pull/9642 >> >> On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev >> wrote: >> >>> One improvement could be move to Precommit IT tests into a separate >>> suite from precommit tests, and run it in parallel. >>> >>> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette >>> wrote: >>> Python Precommits are taking quite a while now [1]. Just visually it looks like the average length is 1.5h or so, but it spikes up to 2h. I've had several precommit runs get aborted due to the 2 hour limit. It looks like there was a spike up above 1h back on 9/6 and the duration has been steadily rising since then. Is there anything we can do about this? Brian [1] http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >>>
Re: Python Precommit duration pushing 2 hours
This should not increase the queue time substantially, since precommit ITs are running *sequentially* with precommit tests, unlike multiple precommit tests which run in parallel to each other. The precommit ITs we run are batch and streaming wordcount tests on Py2 and one Py3 version, so it's not a lot of tests. On Thu, Oct 24, 2019 at 1:07 PM Ahmet Altay wrote: > +1 to separating ITs from precommit. Downside would be, when Chad tried to > do something similar [1] it was noted that the total time to run all > precommit tests would increase and also potentially increase the queue time. > > Another alternative, we could run a smaller set of IT tests in precommits > and run the whole suite as part of post commit tests. > > [1] https://github.com/apache/beam/pull/9642 > > On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev > wrote: > >> One improvement could be move to Precommit IT tests into a separate suite >> from precommit tests, and run it in parallel. >> >> On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette >> wrote: >> >>> Python Precommits are taking quite a while now [1]. Just visually it >>> looks like the average length is 1.5h or so, but it spikes up to 2h. I've >>> had several precommit runs get aborted due to the 2 hour limit. >>> >>> It looks like there was a spike up above 1h back on 9/6 and the duration >>> has been steadily rising since then. Is there anything we can do about this? >>> >>> Brian >>> >>> [1] >>> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >>> >>
Re: Python Precommit duration pushing 2 hours
+1 to separating ITs from precommit. The downside is that when Chad tried to do something similar [1], it was noted that the total time to run all precommit tests would increase, and it could also increase the queue time. Another alternative: we could run a smaller set of IT tests in precommits and run the whole suite as part of the postcommit tests. [1] https://github.com/apache/beam/pull/9642 On Thu, Oct 24, 2019 at 12:15 PM Valentyn Tymofieiev wrote: > One improvement could be move to Precommit IT tests into a separate suite > from precommit tests, and run it in parallel. > > On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette > wrote: > >> Python Precommits are taking quite a while now [1]. Just visually it >> looks like the average length is 1.5h or so, but it spikes up to 2h. I've >> had several precommit runs get aborted due to the 2 hour limit. >> >> It looks like there was a spike up above 1h back on 9/6 and the duration >> has been steadily rising since then. Is there anything we can do about this? >> >> Brian >> >> [1] >> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >> >
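As a rough sketch of the "smaller set of ITs in precommit" idea, assuming a pytest-style marker were used for selection (the precommit_it marker and the test names are made up for this example, not Beam's actual test configuration):

import pytest

# 'precommit_it' is a hypothetical marker for this sketch; in a real setup
# it would need to be registered, e.g. in pytest.ini or setup.cfg:
#   markers =
#       precommit_it: small, fast IT subset run on every precommit


@pytest.mark.precommit_it
def test_wordcount_batch_it():
  # Cheap, high-signal integration test kept in the precommit subset.
  assert True


def test_large_io_it():
  # Expensive IT, exercised only by the full postcommit sweep.
  assert True

The precommit job would then select the subset with something like pytest -m precommit_it, while the postcommit job runs the same directory without the filter and gets the whole suite.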
Re: Python Precommit duration pushing 2 hours
One improvement could be to move the precommit IT tests into a separate suite from the precommit tests, and run it in parallel. On Thu, Oct 24, 2019 at 11:41 AM Brian Hulette wrote: > Python Precommits are taking quite a while now [1]. Just visually it looks > like the average length is 1.5h or so, but it spikes up to 2h. I've had > several precommit runs get aborted due to the 2 hour limit. > > It looks like there was a spike up above 1h back on 9/6 and the duration > has been steadily rising since then. Is there anything we can do about this? > > Brian > > [1] > http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-90d&to=now&fullscreen&panelId=4 >
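Purely as a conceptual illustration of why running the two suites in parallel helps, here is a small sketch that overlaps them with a thread pool. The commands are placeholders reusing the hypothetical precommit_it marker from above; in Beam the real orchestration would live in the Gradle/Jenkins setup rather than in a script like this:

# Conceptual sketch only: the commands below are placeholders, not real
# Beam tasks, and assume pytest is installed.
import subprocess
from concurrent.futures import ThreadPoolExecutor

SUITES = [
    ['pytest', '-m', 'not precommit_it'],  # regular precommit tests
    ['pytest', '-m', 'precommit_it'],      # small precommit IT suite
]


def run_suite(cmd):
  # Returns the suite's exit code; 0 means it passed.
  return subprocess.run(cmd).returncode


if __name__ == '__main__':
  with ThreadPoolExecutor(max_workers=len(SUITES)) as pool:
    # Wall-clock time is bounded by the slower suite, not the sum of both.
    exit_codes = list(pool.map(run_suite, SUITES))
  print('PASSED' if all(code == 0 for code in exit_codes) else 'FAILED')

With the suites overlapped this way, the wall-clock cost is roughly the duration of the slower suite rather than the sum of the two.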