Hey Yi,

Thanks for pushing this forward! I'm +1 on doing all 3 action items, along
with making the GitHub Action a required check once auto-merge is
implemented.

Thanks,
Danny

On Thu, Jul 7, 2022 at 12:08 PM Yi Hu via dev <dev@beam.apache.org> wrote:

> Hi all,
>
> I have drafted a document summarizing the ideas suggested in this thread.
>
>
> https://docs.google.com/document/d/10FlXOo_hL2QYTPhwS8uHSyJbQCzwC3K3C12tFccANA8/edit#
>
> where I extracted the following action items:
>
> short-term (easy) items
> - Split the linting task (e.g. Java checkstyle) out from the test (Java
> precommit) step
>
> mid-to-long-term items
> - A single github action that indicates the PR is ready to merge
> - Merging the PR through action
>
> I am happy to volunteer to take on this task. Once we have decided which
> direction(s) to proceed with, I will detail the design.
>
> Regards,
> Yi
>
>
>
> On Tue, Jun 28, 2022 at 10:40 PM Ahmet Altay <al...@google.com> wrote:
>
>> I am worried about adding our own custom tooling. It requires
>> maintenance, it introduces new flakiness, and so far we have not been
>> great about maintaining custom infra.
>>
>> On Tue, Jun 28, 2022 at 1:36 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> Thanks for all the super useful info. I can add some experience from the
>>> early days of Beam: Because the GitHub "merge" button did not exist, we
>>> rolled our own "merge bot". The pros/cons below are not general to the
>>> concept, but to our experience with it:
>>>
>>> PROs:
>>>  - It single-threaded commits to the repo so there weren't race
>>> conditions on test results. I think this will be OK for our scale for the
>>> foreseeable future.
>>>  - It re-ran tests after merge but before pushing to master, which is
>>> the second half of eliminating that race condition.
>>>
>>> CONs:
>>>  - It was very flaky and often just didn't stay up. We danced for joy
>>> when it was gone.
>>>  - It ran a kind of arbitrary set of tests that didn't match the PR
>>> statuses. It did not have any filter on which tests were run during merge.
>>> We were small enough that mostly just "run the tests" was specific enough.
>>>  - It always squashed the commits and then pushed the squashed commit
>>> with a comment "this closes #<pr>". This messes up PRs that have multiple
>>> commits that should remain separate. But more importantly it made it
>>> impossible to easily distinguish PRs that were merged versus those that
>>> were closed without merge. And of course it is way harder to navigate
>>> history when the commits on master have different hashes than the commits
>>> that were authored.
>>>  - Getting reasonable logs with error messages when a merge failed for
>>> whatever reason was hard or impossible IIRC.
>>>
>>> So I think in what you have said there is an option to get the best of
>>> all, something like:
>>>
>>>  - Do merges through an action. Merge commit or squash-and-merge would
>>> be separate labels. Or not bother with squash and merge, instead using
>>> heuristics to block PRs with bad commit histories.
>>>  - Have the workflow check that all PR statuses are green before
>>> continuing to merge (rough sketch below).
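>>>
>>> Purely to illustrate that second bullet (this is a sketch, not real
>>> code, and the helper name is made up), the "everything is green" gate
>>> could be roughly:
>>>
>>> import { Octokit } from "@octokit/rest";
>>>
>>> // Sketch: the PR's head commit is "green" when the combined commit
>>> // status (how Jenkins reports our precommits) is successful.
>>> // "pending" or "failure" means the workflow should stop before merging.
>>> async function headIsGreen(
>>>   octokit: Octokit, owner: string, repo: string, sha: string
>>> ): Promise<boolean> {
>>>   const combined = await octokit.rest.repos.getCombinedStatusForRef({
>>>     owner, repo, ref: sha,
>>>   });
>>>   return combined.data.state === "success";
>>> }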
>>>
>>
>> This sounds like a reasonable process. I assume the only addition in this
>> case would be another GitHub Action.
>>
>>
>>>
>>> Kenn
>>>
>>> On Tue, Jun 28, 2022 at 8:03 AM Danny McCormick <
>>> dannymccorm...@google.com> wrote:
>>>
>>>> After looking into this a little bit more, I need to revise my opinion
>>>> on how we would do this; I don't think it's practical to have all required
>>>> status checks via the .asf.yaml file because those required checks can't be
>>>> filtered by path (for example, if we want to require Python precommit on
>>>> Python PRs, it would need to be required on *all* PRs). That's a GitHub
>>>> limitation
>>>> <https://docs.github.com/en/enterprise-cloud@latest/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks>,
>>>> not an ASF one.
>>>>
>>>> One option is to write an action that makes sure no checks are failing
>>>> (except maybe codecov?) and put a single required check on that. That would
>>>> also make it easy to build in logic to override required checks like Robert
>>>> suggested ("specific wording would have to be in the last comment"). We
>>>> already have logic in the PR bot that does some of this
>>>> <https://github.com/apache/beam/blob/master/scripts/ci/pr-bot/shared/checks.ts>
>>>> .
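>>>>
>>>> To make that a bit more concrete, here is a rough, untested sketch of
>>>> what that single gating check could look like (the helper name is made
>>>> up, codecov is treated as the only advisory check, and pagination is
>>>> elided):
>>>>
>>>> import { Octokit } from "@octokit/rest";
>>>>
>>>> // True only when every check run on the head SHA, other than codecov
>>>> // (treated as advisory here), has completed successfully.
>>>> // (A real version would probably also allow "skipped"/"neutral"
>>>> // conclusions and look at Jenkins-reported commit statuses too.)
>>>> async function noBlockingFailures(
>>>>   octokit: Octokit, owner: string, repo: string, sha: string
>>>> ): Promise<boolean> {
>>>>   const { data } = await octokit.rest.checks.listForRef({
>>>>     owner, repo, ref: sha,
>>>>   });
>>>>   return data.check_runs
>>>>     .filter((run) => !run.name.toLowerCase().includes("codecov"))
>>>>     .every(
>>>>       (run) => run.status === "completed" && run.conclusion === "success"
>>>>     );
>>>> }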
>>>>
>>>> The downside to that approach is that it's not clear what the best way
>>>> to trigger that workflow is since it has to run after all other checks have
>>>> completed. We could have it trigger on some label (e.g. "ready to merge")
>>>> and then automatically merge the PR when it's done or comment and remove
>>>> the label if checks are failing/incomplete. This changes the workflow for
>>>> committers from "click the merge button" to "add a label", but doesn't
>>>> require significantly more action or oversight and is pretty similar to how
>>>> Kubernetes <https://github.com/kubernetes/kubernetes/pull/110812> and
>>>> some other large repos run things.
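>>>>
>>>> The merge step itself would be small. A hand-wavy sketch of the
>>>> label-triggered flow, reusing the noBlockingFailures helper above (the
>>>> label name and comment text are just placeholders):
>>>>
>>>> // Sketch of a handler for a workflow triggered by the "ready to merge"
>>>> // label.
>>>> async function mergeIfReady(
>>>>   octokit: Octokit, owner: string, repo: string, pullNumber: number
>>>> ) {
>>>>   const pr = await octokit.rest.pulls.get({
>>>>     owner, repo, pull_number: pullNumber,
>>>>   });
>>>>   if (await noBlockingFailures(octokit, owner, repo, pr.data.head.sha)) {
>>>>     // Everything required is green: merge on the committer's behalf.
>>>>     await octokit.rest.pulls.merge({
>>>>       owner, repo, pull_number: pullNumber, merge_method: "merge",
>>>>     });
>>>>   } else {
>>>>     // Otherwise explain why and hand the PR back by removing the label.
>>>>     await octokit.rest.issues.createComment({
>>>>       owner, repo, issue_number: pullNumber,
>>>>       body: "Checks are failing or still running; removing the label.",
>>>>     });
>>>>     await octokit.rest.issues.removeLabel({
>>>>       owner, repo, issue_number: pullNumber, name: "ready to merge",
>>>>     });
>>>>   }
>>>> }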
>>>>
>>>> Another option would be to trigger that check at the end of every
>>>> Actions run and use the check_suite trigger for external runs
>>>> (unfortunately Actions doesn't trigger that). That would require a lot of
>>>> boilerplate in every Actions workflow though. There are other similar
>>>> options which may be cleaner, but they also require modifying every
>>>> Actions workflow, and I don't love that approach either.
>>>>
>>>> I'm still in favor of doing this via the "ready to merge" label option
>>>> I just described, but this has dampened my enthusiasm a little bit since
>>>> we'd need to build out/maintain our own tooling.
>>>>
>>>> On Tue, Jun 28, 2022 at 8:49 AM Danny McCormick <
>>>> dannymccorm...@google.com> wrote:
>>>>
>>>>> Regarding codecov - it does check overall % coverage as well as the
>>>>> coverage of the diff, but I don't think it's designed to be a blocking
>>>>> metric. A good example of where this is not helpful is this PR to
>>>>> remove dead code <https://github.com/apache/beam/pull/22019>. Because
>>>>> it removes tested code, the PR lowers our coverage percentage and fails
>>>>> the check, but it's a totally reasonable PR that doesn't need additional
>>>>> testing (because it only removes functionality). Other classes of changes
>>>>> that are problematic include things like improving hard-to-cover error
>>>>> messages or tough-to-test proto changes which can only be covered in an
>>>>> integration test that doesn't make it into codecov (these are both common
>>>>> at least in the Go SDK).
>>>>>
>>>>> I'd be opposed to including that as a required check, though I support
>>>>> adding pretty much every other suite.
>>>>>
>>>>> > I assume we would have a way to override it in case the repo gets
>>>>> into a bad state (for example needing to modify the workflow definitions 
>>>>> in
>>>>> order to get them to pass).
>>>>>
>>>>> The way we make a check required is by adding it in the .asf.yaml file
>>>>> <https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection>.
>>>>> So we could always update that if needed to push an emergency fix as long
>>>>> as we don't accidentally include any required checks on the .asf.yaml
>>>>> itself. If we mess that up we would need to bring in Infra to help.
>>>>>
>>>>> On Mon, Jun 27, 2022 at 7:59 PM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 27, 2022 at 4:55 PM Robert Bradshaw <rober...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Jun 27, 2022 at 3:06 PM Kenneth Knowles <k...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I am in favor of requiring precommit green before merge. To make
>>>>>>>> this better, we should:
>>>>>>>>
>>>>>>>> (1) run fewer irrelevant tests, for example almost every
>>>>>>>> language-based presubmit tests vastly more code than could possibly be
>>>>>>>> affected.
>>>>>>>> (2) run more relevant tests, for example when an IO is touched
>>>>>>>> running slightly more heavyweight integration tests before merge is
>>>>>>>> reasonable.
>>>>>>>>
>>>>>>>
>>>>>>> Does it have to be all or nothing? E.g. we could start
>>>>>>> with requiring a smaller subset of tests that must pass, with the goal 
>>>>>>> of
>>>>>>> eventually requiring (almost?) everything.
>>>>>>>
>>>>>>>
>>>>>>>> I worry that requiring total line coverage in unit tests /
>>>>>>>> presubmits will encourage mock-based tests that don't demonstrate any
>>>>>>>> meaningful correctness. But I am OK to try it.
>>>>>>>>
>>>>>>>
>>>>>> IIRC, the codecov test checks that % coverage is not reduced by the
>>>>>> new PR; it does not check for total line coverage. I agree with you
>>>>>> that total line coverage is probably not meaningful in most cases.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> +1. I think there is a class of presubmits (coverage among them)
>>>>>>> that are advisory rather than binding.
>>>>>>>
>>>>>>
>>>>>> +1 - especially if we can make the tools distinguish between required
>>>>>> and advisory presubmits.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I assume we would have a way to override it in case the repo gets
>>>>>>>> into a bad state (for example needing to modify the workflow 
>>>>>>>> definitions in
>>>>>>>> order to get them to pass).
>>>>>>>>
>>>>>>>
>>>>>>> +1. But making this explicit (e.g. specific wording would have to be
>>>>>>> in the last comment) would be a good barrier to have even if it's not
>>>>>>> absolute.
>>>>>>>
>>>>>>>
>>>>>>>> On Mon, Jun 27, 2022 at 2:58 PM Ahmet Altay <al...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Just an opinion: If we have a precommit (including codecov) we
>>>>>>>>> should be willing to enforce those passing before merging a PR. 
>>>>>>>>> Selectively
>>>>>>>>> applying when/which precommits would be blockers eventually leads to 
>>>>>>>>> more
>>>>>>>>> opinions. I think codecov is actually providing value by checking that
>>>>>>>>> delta coverage is not dropped. In my experience, for PRs that fail the
>>>>>>>>> codecov precommit there tends to be an opportunity for adding unit 
>>>>>>>>> tests.
>>>>>>>>>
>>>>>>>>> On Mon, Jun 27, 2022 at 2:14 PM Danny McCormick <
>>>>>>>>> dannymccorm...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Brian, you keep beating my emails by an instant :) I think
>>>>>>>>>> codecov provides value independent of blocking a PR - it is a good 
>>>>>>>>>> sanity
>>>>>>>>>> check (e.g. if codecov drops from 50% to 5%) and it gives us a nice 
>>>>>>>>>> metric
>>>>>>>>>> to track over time.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 27, 2022 at 5:12 PM Danny McCormick <
>>>>>>>>>> dannymccorm...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yeah, codecov is the one exception I had in mind, there may be
>>>>>>>>>>> others that are reasonable to exclude as well
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:59 PM Jack McCluskey <
>>>>>>>>>>> jrmcclus...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 for enforcing some precommit checks as required, but making
>>>>>>>>>>>> them all required would be a bit of a nightmare with how Codecov 
>>>>>>>>>>>> works. Any
>>>>>>>>>>>> Python or Go patch gets flagged as failing codecov if its changes 
>>>>>>>>>>>> aren't
>>>>>>>>>>>> covered at or above the current overall coverage level at master 
>>>>>>>>>>>> (around
>>>>>>>>>>>> 75% right now.) This is a nice goal, but ultimately not every 
>>>>>>>>>>>> change lends
>>>>>>>>>>>> itself to that level of unit testing. But I would be in favor of 
>>>>>>>>>>>> things
>>>>>>>>>>>> like the various ValidatesRunner suites being required.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Jack McCluskey
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:51 PM Danny McCormick <
>>>>>>>>>>>> dannymccorm...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Brian beat me by a few seconds, but I actually agree that
>>>>>>>>>>>>> there is a problem here - when Damon's PR was submitted there was 
>>>>>>>>>>>>> basically
>>>>>>>>>>>>> no way of knowing that checks would break without having detailed 
>>>>>>>>>>>>> knowledge
>>>>>>>>>>>>> of how the build system works (so I'm on board with your
>>>>>>>>>>>>> Subject/Background/Assessment). With that said, I disagree with 
>>>>>>>>>>>>> the
>>>>>>>>>>>>> recommended path forward - our automation failed us here and I 
>>>>>>>>>>>>> don't think
>>>>>>>>>>>>> the correct response is to make our manual validation better. 
>>>>>>>>>>>>> That shifts
>>>>>>>>>>>>> the burden of verifying correctness from robots (easy to validate,
>>>>>>>>>>>>> basically free) to humans (expensive, forgetful, and hard to 
>>>>>>>>>>>>> validate,
>>>>>>>>>>>>> especially for newer contributors). Instead, I think we should 
>>>>>>>>>>>>> fix the
>>>>>>>>>>>>> underlying technical problems, which are IMO:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) Flaky tests unrelated to the PR caused it to fail
>>>>>>>>>>>>>
>>>>>>>>>>>>
>> I missed this part earlier. I would encourage all committers who are
>> blocked on flaky tests to file a new GitHub issue with a link to the flaky
>> test before re-running it. Re-running tests until they pass also makes it
>> easier to introduce new flaky tests.
>>
>>
>>>>>>>>>>>>> 2) There were no guardrails in place to avoid merging a PR that wasn't
>>>>>>>>>>>>> fully validated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Things that would've made this better which I think we should
>>>>>>>>>>>>> focus our attention on:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) (easy) Split the linting task out from the test step.
>>>>>>>>>>>>> 2) This one is more controversial, but I would vote we start
>>>>>>>>>>>>> enforcing some (or all?) precommit checks as required. I don't 
>>>>>>>>>>>>> really blame
>>>>>>>>>>>>> anyone for merging this PR since it had repeatedly run into 
>>>>>>>>>>>>> unrelated
>>>>>>>>>>>>> flakes, but that's kinda the point - we need to feel some of that 
>>>>>>>>>>>>> pain to
>>>>>>>>>>>>> avoid accidentally merging bad code. Blocking PRs on failed 
>>>>>>>>>>>>> precommit
>>>>>>>>>>>>> suites would force us to deal with these kinds of flakes and 
>>>>>>>>>>>>> incentivize a
>>>>>>>>>>>>> lot of good behavior in the repo (and disincentivize bad tests).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Danny
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 27, 2022 at 4:48 PM Brian Hulette <
>>>>>>>>>>>>> bhule...@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Damon,
>>>>>>>>>>>>>> I think our CI system was working as intended for your PR. We
>>>>>>>>>>>>>> select which PreCommits to run based on the files that are 
>>>>>>>>>>>>>> edited, and the
>>>>>>>>>>>>>> author/reviewer should collaborate to get those green before 
>>>>>>>>>>>>>> merging the
>>>>>>>>>>>>>> PR. Sometimes that will involve inspecting the logs in Jenkins 
>>>>>>>>>>>>>> to find the
>>>>>>>>>>>>>> failing gradle tasks and running them locally to reproduce.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like what broke down in this case was that the PR
>>>>>>>>>>>>>> was accidentally merged despite the failing Java PreCommit run.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Brian
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Jun 27, 2022 at 12:02 PM Damon Douglas <
>>>>>>>>>>>>>> damondoug...@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I write this to prevent a repeat of a recent experience I had while
>>>>>>>>>>>>>>> contributing to the Apache Beam repository.  To address a system
>>>>>>>>>>>>>>> issue, I write in SBAR
>>>>>>>>>>>>>>> <https://www.hopkinsmedicine.org/antimicrobial-stewardship/nursing-toolkit/_docs/sbar-communication-tool.pdf>
>>>>>>>>>>>>>>> format for clarity.  If there are no objections to the
>>>>>>>>>>>>>>> recommendation, I'm happy to carry out the first item, i.e. create
>>>>>>>>>>>>>>> the GitHub issues.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Subject*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I submitted a PR <https://github.com/apache/beam/pull/21953>
>>>>>>>>>>>>>>> that was subsequently reverted
>>>>>>>>>>>>>>> <https://github.com/apache/beam/pull/22044>.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Background*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Despite the following gradle tasks passing, the
>>>>>>>>>>>>>>> PreCommit_Java_Commit job failed with a test unrelated to
>>>>>>>>>>>>>>> the code in the PR
>>>>>>>>>>>>>>> <https://ci-beam.apache.org/job/beam_PreCommit_Java_Commit/23066/testReport/junit/org.apache.beam.sdk.io.gcp.bigquery/BigQueryServicesImplTest/testInsertWithinRowCountLimits/>.
>>>>>>>>>>>>>>> The PR was merged and then reverted due to vendor dependency
>>>>>>>>>>>>>>> issues and failed style checks that were discovered afterwards.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ./gradlew rat
>>>>>>>>>>>>>>> ./gradlew spotlessCheck
>>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:test
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Assessment*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Output from the following gradle commands did not surface
>>>>>>>>>>>>>>> prior to the PR merge.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
>>>>>>>>>>>>>>> ./gradlew sdks:java:io:google-cloud-platform:checkStyleTest
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Both the aforementioned failing PreCommit_Java_Commit test
>>>>>>>>>>>>>>> and a lack of knowledge contributed to the root cause.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Recommendation / Proposal*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I propose updating the PR template description
>>>>>>>>>>>>>>> <https://github.com/apache/beam/blob/master/.github/PULL_REQUEST_TEMPLATE.md>
>>>>>>>>>>>>>>> to prescribe gradle tasks that should be executed successfully
>>>>>>>>>>>>>>> prior to submitting a PR.  Toward this aim:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1.  Four separate GitHub issues in
>>>>>>>>>>>>>>> https://github.com/apache/beam, one for the *common* prescribed
>>>>>>>>>>>>>>> gradle tasks and one each for the language-specific *Java*,
>>>>>>>>>>>>>>> *Python*, and *Go* tasks
>>>>>>>>>>>>>>> 2.  Don't reply to this email with specifics on how; instead,
>>>>>>>>>>>>>>> delegate comments/discussion to the relevant GitHub issues from 1
>>>>>>>>>>>>>>> 3.  PRs that fix the GitHub issues in 1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Damon Douglas*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Strategic Cloud Engineer, Data & Analytics, Google Cloud
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> damondoug...@google.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
