Hey Yi, Thanks for pushing this forward! I'm +1 on doing all 3 action items, along with making the github action a required check once auto-merge is implemented.
Thanks, Danny On Thu, Jul 7, 2022 at 12:08 PM Yi Hu via dev <dev@beam.apache.org> wrote: > Hi all, > > I have drafted a document summarizing the ideas suggested in this thread. > > > https://docs.google.com/document/d/10FlXOo_hL2QYTPhwS8uHSyJbQCzwC3K3C12tFccANA8/edit# > > where I extracted the following action items: > > short-term (easy) items > - Split the linting task (e.g. java checkStyle) out from the test (java > percommit) step > > mid-to-long-term items > - A single github action that indicates the PR is ready to merge > - Merging the PR through action > > I am happy to volunteer to take on this task. Once we have decided the > direction(s) to proceed I am going to detail the design. > > Regards, > Yi > > > > On Tue, Jun 28, 2022 at 10:40 PM Ahmet Altay <al...@google.com> wrote: > >> I am worried about adding our own custom tooling. They require >> maintenance, they introduce new flakiness, and so far we have not been >> great about maintaining custom infra. >> >> On Tue, Jun 28, 2022 at 1:36 PM Kenneth Knowles <k...@apache.org> wrote: >> >>> Thanks for all the super useful info. I can add some experience from the >>> early days of Beam: Because the GitHub "merge" button did not exist, we >>> rolled our own "merge bot". The pros/cons below are not general to the >>> concept, but to our experience with it: >>> >>> PROs: >>> - It single-threaded commits to the repo so there weren't race >>> conditions on test results. I think this will be OK for our scale for the >>> foreseeable future. >>> - It re-ran tests after merge but before pushing to master, which is >>> the second half of eliminating that race condition. >>> >>> CONs: >>> - It was very flaky and often just didn't stay up. We danced for joy >>> when it was gone. >>> - It ran a kind of arbitrary set of tests that didn't match the PR >>> statuses. It did not have any filter on which tests were run during merge. >>> We were small enough that mostly just "run the tests" was specific enough. >>> - It always squashed the commits and then pushed the squashed commit >>> with a comment "this closes #<pr>". This messes up PRs that have multiple >>> commits that should remain separate. But more importantly it made it >>> impossible to easily distinguish PRs that were merged versus those that >>> were closed without merge. And of course it is way harder to navigate >>> history when the commits on master have different hashes than the commits >>> that were authored. >>> - Getting reasonable logs with error messages when a merge failed for >>> whatever reason was hard or impossible IIRC. >>> >>> So I think in what you have said there is an option to get the best of >>> all, something like: >>> >>> - Do merges through an action. Merge commit or squash-and-merge would >>> be separate labels. Or not bother with squash and merge, instead using >>> heuristics to block PRs with bad commit histories. >>> - Have the workflow check that all PR statuses are green before >>> continuing to merge. >>> >> >> This sounds like a reasonable process. I assume the only addition in this >> case would be another github action. >> >> >>> >>> Kenn >>> >>> On Tue, Jun 28, 2022 at 8:03 AM Danny McCormick < >>> dannymccorm...@google.com> wrote: >>> >>>> After looking into this a little bit more, I need to revise my opinion >>>> on how we would do this; I don't think it's practical to have all required >>>> status checks via the .asf.yml file because those required checks can't be >>>> filtered by path (for example, if we want to require Python precommit on >>>> Python PRs, it would need to be required on *all *PRs). That's a GitHub >>>> limitation >>>> <https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>, >>>> not an ASF one. >>>> >>>> One option is to write an action that makes sure no checks are failing >>>> (except maybe codecov?) and put a single required check on that. That would >>>> also make it easy to build in logic to override required checks like Robert >>>> suggested ("specific wording would have to be in the last comment"). We >>>> already have logic in the PR bot that does some of this >>>> <https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts> >>>> . >>>> >>>> The downside to that approach is that it's not clear what the best way >>>> to trigger that workflow is since it has to run after all other checks have >>>> completed. We could have it trigger on some label (e.g. "ready to merge") >>>> and then automatically merge the PR when it's done or comment and remove >>>> the label if checks are failing/incomplete. This changes the workflow for >>>> committers from "click the merge button" to "add a label", but doesn't >>>> require significantly more action or oversight and is pretty similar to how >>>> Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and >>>> some other large repos run things. >>>> >>>> Another option would be to trigger that check at the end of every >>>> Actions run and use the check_suite trigger for external runs >>>> (unfortunately actions doesn't trigger that). That would require a lot of >>>> boilerplate on every Actions workflow though. There are other similar >>>> options which may be cleaner that also require modifying every Actions >>>> workflow, but I don't love that option. >>>> >>>> I'm still in favor of doing this via the "ready to merge" label option >>>> I just described, but this has dampened my enthusiasm a little bit since >>>> we'd need to build out/maintain our own tooling. >>>> >>>> On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick < >>>> dannymccorm...@google.com> wrote: >>>> >>>>> Regarding code cov - it does check overall % coverage as well as the >>>>> coverage of the diff, but I don't think it's designed to be a blocking >>>>> metric. A good example of where this is not helpful is this pr to >>>>> remove dead code <https://github.com/apache/beam/pull/22019>. Because >>>>> it removes tested code, the pr lowers our coverage percentage and fails >>>>> the >>>>> check, but it's a totally reasonable PR that doesn't need additional >>>>> testing (because it only removes functionality). Other classes of changes >>>>> that are problematic include things like improving hard to cover error >>>>> messages or tough to test proto changes which can only be covered in an >>>>> integration test that doesn't make it into codecov (these are both common >>>>> at least in the Go SDK). >>>>> >>>>> I'd be opposed to including that as a required check, though I support >>>>> adding pretty much every other suite. >>>>> >>>>> > I assume we would have a way to override it in case the repo gets >>>>> into a bad state (for example needing to modify the workflow definitions >>>>> in >>>>> order to get them to pass). >>>>> >>>>> The way we make a check required is by adding it in the .asf.yml file >>>>> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>. >>>>> So we could always update that if needed to push an emergency fix as long >>>>> as we don't accidentally include any required checks on the .asf.yml >>>>> itself. If we mess that up we would need to bring in Infra to help. >>>>> >>>>> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <rober...@google.com> >>>>>> wrote: >>>>>> >>>>>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <k...@apache.org> >>>>>>> wrote: >>>>>>> >>>>>>>> I am in favor of requiring precommit green before merge. To make >>>>>>>> this better, we should: >>>>>>>> >>>>>>>> (1) run fewer irrelevant tests, for example almost every >>>>>>>> language-based presubmit tests vastly more code than could possibly be >>>>>>>> affected. >>>>>>>> (2) run more relevant tests, for example when an IO is touched >>>>>>>> running slightly more heavyweight integration tests before merge is >>>>>>>> reasonable. >>>>>>>> >>>>>>> >>>>>>> Does it have to be all or nothing? E.g. we could start >>>>>>> with requiring a smaller subset of tests that must pass, with the goal >>>>>>> of >>>>>>> eventually requiring (almost?) everything. >>>>>>> >>>>>>> >>>>>>>> I worry that requiring total line coverage in unit tests / >>>>>>>> presubmits will encourage mock-based tests that don't demonstrate any >>>>>>>> meaningful correctness. But I am OK to try it. >>>>>>>> >>>>>>> >>>>>> IIRC, codecov test is checking that % coverage is not reduced with >>>>>> the new PR and it is not checking for total line coverage. I agree with >>>>>> you >>>>>> that total line coverage is probably not meaningful in most cases. >>>>>> >>>>>> >>>>>>> >>>>>>> +1. I think there are a class of presubmits (coverage among them) >>>>>>> that are advisory rather than binding. >>>>>>> >>>>>> >>>>>> +1 - especially if we can make the tools distinguish between required >>>>>> and advisory presubmits. >>>>>> >>>>>> >>>>>>> >>>>>>>> I assume we would have a way to override it in case the repo gets >>>>>>>> into a bad state (for example needing to modify the workflow >>>>>>>> definitions in >>>>>>>> order to get them to pass). >>>>>>>> >>>>>>> >>>>>>> +1. But making this explicit (e.g. specific wording would have to be >>>>>>> in the last comment) would be a good barrier to have even if it's not >>>>>>> absolute. >>>>>>> >>>>>>> >>>>>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Just an opinion: If we have a precommit (including codecov) we >>>>>>>>> should be willing to enforce those passing before merging a PR. >>>>>>>>> Selectively >>>>>>>>> applying when/which precommits would be blockers eventually leads to >>>>>>>>> more >>>>>>>>> opinions. I think codecov is actually providing value by checking that >>>>>>>>> delta coverage is not dropped. In my experience, for PRs that fail the >>>>>>>>> codecov precommit there tends to be an opportunity for adding unit >>>>>>>>> tests. >>>>>>>>> >>>>>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick < >>>>>>>>> dannymccorm...@google.com> wrote: >>>>>>>>> >>>>>>>>>> Brian, you keep beating my emails by an instant :) I think >>>>>>>>>> codecov provides value independent of blocking a PR - it is a good >>>>>>>>>> sanity >>>>>>>>>> check (e.g. if codecov drops from 50% to 5%) and it gives us a nice >>>>>>>>>> metric >>>>>>>>>> to track over time. >>>>>>>>>> >>>>>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick < >>>>>>>>>> dannymccorm...@google.com> wrote: >>>>>>>>>> >>>>>>>>>>> Yeah, codecov is the one exception I had in mind, there may be >>>>>>>>>>> others that are reasonable to exclude as well >>>>>>>>>>> >>>>>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey < >>>>>>>>>>> jrmcclus...@google.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> +1 for enforcing some precommit checks as required, but making >>>>>>>>>>>> them all required would be a bit of a nightmare with how Codecov >>>>>>>>>>>> works. Any >>>>>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes >>>>>>>>>>>> aren't >>>>>>>>>>>> covered at or above the current overall coverage level at master >>>>>>>>>>>> (around >>>>>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every >>>>>>>>>>>> change lends >>>>>>>>>>>> itself to that level of unit testing. But I would be in favor of >>>>>>>>>>>> things >>>>>>>>>>>> like the various ValidatesRunner suites being required. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Jack McCluskey >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick < >>>>>>>>>>>> dannymccorm...@google.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Brian beat me by a few seconds, but I actually agree that >>>>>>>>>>>>> there is a problem here - when Damon's PR was submitted there was >>>>>>>>>>>>> basically >>>>>>>>>>>>> no way of knowing that checks would break without having detailed >>>>>>>>>>>>> knowledge >>>>>>>>>>>>> of how the build system works (so I'm on board with your >>>>>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with >>>>>>>>>>>>> the >>>>>>>>>>>>> recommended path forward - our automation failed us here and I >>>>>>>>>>>>> don't think >>>>>>>>>>>>> the correct response is to make our manual validation better. >>>>>>>>>>>>> That shifts >>>>>>>>>>>>> the burden of verifying correctness from robots (easy to validate, >>>>>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to >>>>>>>>>>>>> validate, >>>>>>>>>>>>> especially for newer contributors). Instead, I think we should >>>>>>>>>>>>> fix the >>>>>>>>>>>>> underlying technical problems, which are IMO: >>>>>>>>>>>>> >>>>>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail >>>>>>>>>>>>> >>>>>>>>>>>> >> I missed this part earlier. I would encourage all committers who are >> blocked on flaky tests to file a new github issue with a link to the flaked >> test before re-running the flaking test. Re-running tests until they pass >> also makes it easier to introduce new flaky tests. >> >> >>> 2) There were no guardrails in place to avoid merging a PR that wasn't >>>>>>>>>>>>> fully validated. >>>>>>>>>>>>> >>>>>>>>>>>>> Things that would've made this better which I think we should >>>>>>>>>>>>> focus our attention on: >>>>>>>>>>>>> >>>>>>>>>>>>> 1) (easy) Split the linting task out from the test step. >>>>>>>>>>>>> 2) This one is more controversial, but I would vote we start >>>>>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't >>>>>>>>>>>>> really blame >>>>>>>>>>>>> anyone for merging this PR since it had repeatedly run into >>>>>>>>>>>>> unrelated >>>>>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that >>>>>>>>>>>>> pain to >>>>>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed >>>>>>>>>>>>> precommit >>>>>>>>>>>>> suites would force us to deal with these kinds of flakes and >>>>>>>>>>>>> incentivize a >>>>>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests). >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Danny >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette < >>>>>>>>>>>>> bhule...@google.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Damon, >>>>>>>>>>>>>> I think our CI system was working as intended for your PR. We >>>>>>>>>>>>>> select which PreCommits to run based on the files that are >>>>>>>>>>>>>> edited, and the >>>>>>>>>>>>>> author/reviewer should collaborate to get those green before >>>>>>>>>>>>>> merging the >>>>>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins >>>>>>>>>>>>>> to find the >>>>>>>>>>>>>> failing gradle tasks and run them locally to reproduce. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It looks like what broke down in this case was that the PR >>>>>>>>>>>>>> was accidentally merged despite the failing Java PreCommit run. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Brian >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas < >>>>>>>>>>>>>> damondoug...@google.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I write this to prevent a recent experience I encountered >>>>>>>>>>>>>>> contributing to the Apache Beam repository. To address a >>>>>>>>>>>>>>> system issue, I >>>>>>>>>>>>>>> write in SBAR >>>>>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf> >>>>>>>>>>>>>>> format for clarity. If there are no objections to the >>>>>>>>>>>>>>> recommendation, I'm >>>>>>>>>>>>>>> happy to fulfill the first recommendation i.e. create GitHub >>>>>>>>>>>>>>> issues. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Subject* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953> >>>>>>>>>>>>>>> that was subsequently reverted >>>>>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Background* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Despite passing the following gradle tasks, >>>>>>>>>>>>>>> the PreCommit_Java_Commit failed with a test unrelated to >>>>>>>>>>>>>>> the code in the PR >>>>>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>. >>>>>>>>>>>>>>> The PR was merged and then reverted due to discovered vendor >>>>>>>>>>>>>>> dependency >>>>>>>>>>>>>>> issues and failed style checks. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ./gradlew rat >>>>>>>>>>>>>>> ./gradlew spotlessCheck >>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Assessment* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Output from the following gradle commands did not surface >>>>>>>>>>>>>>> prior to the PR merge. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain >>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test >>>>>>>>>>>>>>> and a lack of knowledge contributed to the root cause. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Recommendation / Proposal* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I propose updating PR template description >>>>>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md> >>>>>>>>>>>>>>> that >>>>>>>>>>>>>>> prescribes gradle tasks to be successfully executed prior to >>>>>>>>>>>>>>> submitting a >>>>>>>>>>>>>>> PR. Toward this aim: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. Four separate GitHub issues in >>>>>>>>>>>>>>> https://github.com/apache/beam representative of *common*, >>>>>>>>>>>>>>> and language specific: *Java*, *Python*, and *Go*, >>>>>>>>>>>>>>> prescribed gradle tasks >>>>>>>>>>>>>>> 2. Don't reply to this email with specifics on how; >>>>>>>>>>>>>>> delegation of comments/discussion within the relevant context >>>>>>>>>>>>>>> of the GitHub >>>>>>>>>>>>>>> issues in 1 >>>>>>>>>>>>>>> 3. PRs that fix the GitHub issues in 1 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> *Damon Douglas* >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> damondoug...@google.com >>>>>>>>>>>>>>> >>>>>>>>>>>>>>