Re: Parallelizing test runs

2018-08-06 Thread Mikhail Gryzykhin
I don't see a difference at first glance, and no difference is expected. We never utilized concurrent jobs originally, because the job took ~1 hour and was triggered once every 6 hours. At some point, I added triggering the job when a new commit is available, and this started triggering jobs in parallel for ea

Re: Parallelizing test runs

2018-08-06 Thread Lukasz Cwik
How much slower did the post-commits become after removing concurrency? On Thu, Aug 2, 2018 at 2:32 PM Mikhail Gryzykhin wrote: > I've disabled concurrency for the auto-triggered post-commits job. That should > reduce job scheduling considerably. > > I believe that this change should resolve the quota i

Re: Parallelizing test runs

2018-08-02 Thread Mikhail Gryzykhin
I've disabled concurrency for the auto-triggered post-commits job. That should reduce job scheduling considerably. I believe that this change should resolve the quota issue we have seen this time. I'll monitor whether the problem reappears. --Mikhail On Wed, Aug 1, 20

Re: Parallelizing test runs

2018-08-01 Thread Pablo Estrada
It feels to me like a peak of 60 jobs per minute is pretty high. If I understand correctly, we run up to 20 Dataflow jobs in parallel per test suite? Or what's the number here? It is also true that most of our tests are simple NeedsRunner tests that test a couple of elements, so the whole pipeline over
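
As a rough illustration of the numbers in this message: the thread gives the CreateRequestsPerMinutePerUser quota as 60 requests per user by default. The sketch below is a client-side sliding-window check that refuses a new job submission once the last 60 seconds already contain the allowed number of requests. The class name `PerMinuteLimiter` and its parameters are hypothetical, invented for this example; nothing like it is described in the thread or taken from Beam or Dataflow.

```python
import collections
import time


class PerMinuteLimiter:
    """Hypothetical sliding-window limiter: at most max_per_minute acquisitions per 60 s."""

    def __init__(self, max_per_minute=60, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock  # injectable so the window can be tested without waiting
        self.stamps = collections.deque()  # timestamps of accepted requests

    def try_acquire(self):
        """Return True and record the request if the window has room, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()
        if len(self.stamps) < self.max_per_minute:
            self.stamps.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each job creation and wait or queue the test when it returns False. With 20 jobs in parallel per suite, three suites starting in the same minute would already fill a window of 60.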

Re: Parallelizing test runs

2018-08-01 Thread Andrew Pilloud
I like 1 and 2. How do credentials get into Jenkins? Could we create a user per Jenkins host? On Tue, Jul 31, 2018 at 4:33 PM Reuven Lax wrote: > There was also a proposal to lump multiple tests into a single Dataflow > job instead of spinning up a separate Dataflow job for each test. > > On Tue

Re: Parallelizing test runs

2018-07-31 Thread Reuven Lax
There was also a proposal to lump multiple tests into a single Dataflow job instead of spinning up a separate Dataflow job for each test. On Tue, Jul 31, 2018 at 4:26 PM Mikhail Gryzykhin wrote: > I synced with Rafael. Below is a summary of the discussion. > > This quota is CreateRequestsPerMinutePerU

Re: Parallelizing test runs

2018-07-31 Thread Mikhail Gryzykhin
I synced with Rafael. Below is a summary of the discussion. This quota is CreateRequestsPerMinutePerUser and it is 60 requests per user by default. I've created Jira [BEAM-5053](https://issues.apache.org/jira/browse/BEAM-5053) for this. I see the following options we can utilize: 1. Add retry logic. Alt
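
The first option above, retry logic, could look something like the following minimal sketch. It assumes the job-submission call raises a distinguishable error when the per-minute quota is exhausted. `QuotaExceededError` and `submit_with_retry` are hypothetical names used only for illustration, not part of Beam or the Dataflow client, and exponential backoff with jitter is one common choice rather than what the thread settled on.

```python
import random
import time


class QuotaExceededError(Exception):
    """Hypothetical stand-in for a 'quota exceeded' response from the service."""


def submit_with_retry(submit, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call submit(); on a quota error, back off exponentially and try again.

    The delay doubles each attempt, is capped at max_delay, and is scaled by a
    random factor in [0.5, 1.0) so parallel test forks do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return submit()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the quota error to the caller
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * (0.5 + rng() / 2))
```

Since the quota is counted per minute, a `max_delay` of about 60 seconds lets the last retries land in a fresh quota window.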

Re: Parallelizing test runs

2018-07-31 Thread Mikhail Gryzykhin
Hi Everyone, It seems that we hit the quota issue again: https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/553/consoleFull Can someone share information on how this was triaged last time or guide me on possible follow-up actions? Regards, --Mikhail

Re: Parallelizing test runs

2018-07-03 Thread Rafael Fernandez
Summary for all folks following this story -- and many thanks for explaining configs to me and pointing me to files and such. - Scott made changes to the config and we can now run 3 ValidatesRunner.Dataflow in parallel (each run is about 2 hours) - With the latest quota changes, we peaked at ~70%

Re: Parallelizing test runs

2018-07-02 Thread Rafael Fernandez
Done! On Mon, Jul 2, 2018 at 4:10 PM Scott Wegner wrote: > Hey Rafael, looks like we need more 'INSTANCE_TEMPLATES' quota [1]. Can > you take a look? I've filed [BEAM-4722]: > https://issues.apache.org/jira/browse/BEAM-4722 > > [1] https://github.com/apache/beam/pull/5861#issuecomment-401963630

Re: Parallelizing test runs

2018-07-02 Thread Scott Wegner
Hey Rafael, looks like we need more 'INSTANCE_TEMPLATES' quota [1]. Can you take a look? I've filed [BEAM-4722]: https://issues.apache.org/jira/browse/BEAM-4722 [1] https://github.com/apache/beam/pull/5861#issuecomment-401963630 On Mon, Jul 2, 2018 at 11:33 AM Rafael Fernandez wrote: > OK, Scot

Re: Parallelizing test runs

2018-07-02 Thread Rafael Fernandez
OK, Scott just sent https://github.com/apache/beam/pull/5860 . Quotas should not be a problem; if they are, please file a JIRA under gcp-quota. Cheers, r On Mon, Jul 2, 2018 at 10:06 AM Kenneth Knowles wrote: > One thing that is nice when you do this is to be able to share your > results. Thoug

Re: Parallelizing test runs

2018-07-02 Thread Kenneth Knowles
One thing that is nice when you do this is to be able to share your results. Though if all you are sharing is "they passed" then I guess we don't have to insist on evidence. Kenn On Mon, Jul 2, 2018 at 9:25 AM Scott Wegner wrote: > A few thoughts: > > * The Jenkins job getting backed up > is be

Re: Parallelizing test runs

2018-07-02 Thread Scott Wegner
A few thoughts: * The Jenkins job getting backed up is beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle_PR [1]. Since Mikhail refactored Jenkins jobs, this only runs when explicitly requested via "Run Dataflow ValidatesRunner", and only has 8 total runs. So this job is idle more often than bac

Re: Parallelizing test runs

2018-07-02 Thread Lukasz Cwik
The validates runner test parallelism is controlled here and is currently set to be "unlimited": https://github.com/apache/beam/blob/fbfe6ceaea9d99cb1c8964087aafaa2bc2297a03/runners/google-cloud-dataflow-java/build.gradle#L115 Each test fork is run on a different gradle worker, so the number of pa

Re: Parallelizing test runs

2018-06-30 Thread Rafael Fernandez
- How many resources do ValidatesRunner tests use? - Where are those settings? On Sat, Jun 30, 2018 at 9:50 AM Reuven Lax wrote: > The specific issue only affects Dataflow ValidatesRunner tests. We > currently allow only one of these to run at a time, to control usage of > Dataflow and of GCE qu

Re: Parallelizing test runs

2018-06-30 Thread Reuven Lax
The specific issue only affects Dataflow ValidatesRunner tests. We currently allow only one of these to run at a time, to control usage of Dataflow and of GCE quota. Other types of tests do not suffer from this issue. I would like to see if it's possible to increase Dataflow quota so we can run mo

Parallelizing test runs

2018-06-30 Thread Rafael Fernandez
+Reuven Lax told me yesterday that he was waiting for some test to be scheduled and run, and it took 6 hours or so. I would like to help reduce these wait times by increasing parallelism. I need help understanding the continuous minimum of what we use. It seems the following is true: - There