Hello everybody,

Thanks everyone. I didn't receive any more feedback on the design proposal
document [1] and I believe we've reached consensus. I've added implementation
tasks in JIRA (BEAM-4559 [2]) and will start coding soon.

As a recap, the high-level plan is:
- Split existing post-commit test jobs into automatically and manually triggered ones
- Track failing test jobs with JIRA bugs
- Create a document describing post-commit failure handling policies
- Add a test status badge to the PR template
- Create a dashboard for post-commit tests
- Detect and fix flaky Java tests (if any)

[1] https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
[2] https://issues.apache.org/jira/browse/BEAM-4559

--Mikhail

On Wed, Jun 6, 2018 at 1:12 PM Mikhail Gryzykhin <mig...@google.com> wrote:

> Hello everyone,
>
> Most of the comments on my last draft addressed technical details of
> automating specific proposed processes. No major process changes were
> suggested.
>
> If you have not yet, please review this document.
>
> Highlights from the last change:
> * Bumped splitting test jobs after Kenneth's comment.
> * No-commit in case of too many open JIRA tickets (metric was there,
>   action was missing)
> * No-commit in case of too old JIRA ticket (metric was there, action was
>   missing)
> * Closed comments that are addressed in the document.
>
> This document already has two LGTMs from Scott Wegner and Thomas Weise.
> If no major comments come in, I'll treat this document as complete and
> start implementing the work items defined in it.
>
> Thank you,
> --Mikhail
>
>
> On Tue, Jun 5, 2018 at 7:38 PM Thomas Weise <t...@apache.org> wrote:
>
>> Thanks for taking this initiative. As the number of contributors grows,
>> so does the cost of broken builds. I'm also in favor of locking master
>> merges until related issues are fixed (short term pain for long term
>> gain). It would penalize a few for the benefit of many.
>>
>> On that note, recently we also had a fair share of pre-commit build
>> issues, with a few making their way to master. These include instances
>> unrelated to build tooling, such as compile errors or packaging.
>> I don't think we should run PR merges over the red light and suggest it
>> is necessary to step up the gatekeeper responsibility committers have.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Tue, Jun 5, 2018 at 10:56 AM, Scott Wegner <sweg...@google.com> wrote:
>>
>>> I've taken another pass over the doc, and it looks good to me. Thanks
>>> for driving this effort!
>>>
>>> On Mon, Jun 4, 2018 at 9:08 AM Mikhail Gryzykhin <mig...@google.com>
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> I have addressed comments on the proposal doc and updated it
>>>> accordingly. I have also added a section on the metrics that we want
>>>> to track for pre-commit tests and the contents of the dashboard.
>>>>
>>>> Please take a second look at the document.
>>>>
>>>> Highlights:
>>>> * Sections that I feel require more discussion are marked with
>>>>   [More opinions wanted]
>>>>   * I've kept the original comments open for this iteration. Please
>>>>     close them if you feel they are resolved, or elaborate more on
>>>>     the topic.
>>>> * Added information on metrics to track
>>>> * Moved “Split test jobs into automatically and manually triggered” to
>>>>   “Other ideas to consider”
>>>> * Prioritized automated JIRA ticket creation over manual
>>>> * Prioritized roll-back first policy
>>>> * Added a process for enforcing the proposed policies.
>>>>
>>>> --Mikhail
>>>>
>>>> Have feedback <http://go/migryz-feedback>?
>>>>
>>>>
>>>> On Tue, May 22, 2018 at 10:11 AM Scott Wegner <sweg...@google.com>
>>>> wrote:
>>>>
>>>>> Thanks for the thoughtful proposal Mikhail. I've left some comments in
>>>>> the doc.
>>>>>
>>>>> I encourage others to take a look: the proposal adds some strong
>>>>> policies about dealing with post-commit failures (rollback policy,
>>>>> locking master). Currently our post-commits are frequently red, and
>>>>> we're missing out on a valuable quality signal. I'm in favor of such
>>>>> policies to help get the test signals back to a healthy state.
>>>>>
>>>>> On Mon, May 21, 2018 at 2:48 PM Mikhail Gryzykhin <mig...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Everyone,
>>>>>>
>>>>>> I've updated the design doc according to the comments.
>>>>>>
>>>>>> https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
>>>>>>
>>>>>> In general, the ideas proposed seem to be appreciated. Still, some
>>>>>> of the sections require more discussion.
>>>>>>
>>>>>> Highlights of the changes:
>>>>>> * Added roll-back first policy to best practices. This includes a
>>>>>>   process on how to handle roll-back.
>>>>>> * Marked topics that I'd like to have more input on. [cyan color]
>>>>>>
>>>>>> --Mikhail
>>>>>>
>>>>>> Have feedback <http://go/migryz-feedback>?
>>>>>>
>>>>>>
>>>>>> On Fri, May 18, 2018 at 10:56 AM Andrew Pilloud <apill...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Blocking commits to master on test flaps seems critical here. The
>>>>>>> test flaps won't get the attention they deserve as long as people
>>>>>>> are just spamming their PRs with 'Run Java Precommit' until they
>>>>>>> turn green. I'm guilty of this behavior and I know it masks new
>>>>>>> flaky tests.
>>>>>>>
>>>>>>> I added a comment to your doc about detecting flaky tests. This can
>>>>>>> easily be done by rerunning the postcommits during times when
>>>>>>> Jenkins would otherwise be idle. You'll easily get a few dozen runs
>>>>>>> every weekend; you just need a process to triage all the flakes and
>>>>>>> ensure there are bugs. I worked on a project that did this along
>>>>>>> with blocking master on any post-commit failure. It was painful for
>>>>>>> the first few weeks, but things got significantly better once most
>>>>>>> of the bugs were fixed.
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>> On Fri, May 18, 2018 at 10:39 AM Kenneth Knowles <k...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Love it. I would pull out from the doc also the key point: make the
>>>>>>>> postcommit status constantly visible to everyone.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Fri, May 18, 2018 at 10:17 AM Mikhail Gryzykhin <
>>>>>>>> mig...@google.com> wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> I'm Mikhail and I started working on Google Dataflow several
>>>>>>>>> months ago. I'm really excited to work with the Beam open source
>>>>>>>>> community.
>>>>>>>>>
>>>>>>>>> I have a proposal to improve contributor experience by keeping
>>>>>>>>> post-commit tests green.
>>>>>>>>>
>>>>>>>>> I'm looking to get community consensus and approval about the
>>>>>>>>> process for keeping post-commit tests green and addressing
>>>>>>>>> post-commit test failures.
>>>>>>>>>
>>>>>>>>> Find the full list of ideas brought in for discussion in this
>>>>>>>>> document:
>>>>>>>>>
>>>>>>>>> https://docs.google.com/document/d/1sczGwnCvdHiboVajGVdnZL0rfnr7ViXXAebBAf_uQME
>>>>>>>>>
>>>>>>>>> Key points are:
>>>>>>>>> 1. Add explicit tracking of failures via JIRA
>>>>>>>>> 2. No-Commit policy when post-commit tests are red
>>>>>>>>>
>>>>>>>>> --Mikhail