> One easy and standard way to make it more resilient would be to make it idempotent instead of counting on uptime or receiving any particular event.
Yep, agreed that this wouldn't be super hard if someone wants to take it on. Basically it would just be updating the tool to run on a schedule and look for issues that have been closed as completed in the last N days (more or less this query - https://github.com/apache/beam/issues?q=is%3Aissue+is%3Aclosed+reason%3Acompleted+created%3A%3E2023-01-01+).

I have seen some milestones intentionally removed from issues after the bot adds them (probably because it's non-obvious that you can mark an issue as not planned instead), so we'd probably want to account for that and no-op if a milestone was removed post-close.

One downside of this approach is that it significantly increases the chances of an issue being assigned to the wrong milestone if it comes in around the release cut; you'd either need to account for this by checking out the repo to get the version at the time the issue was closed (expensive/non-trivial) or live with the occasional misassignment. It's probably an acceptable downside to live with.

You could also do a hybrid approach where you run on issue close and then run a scheduled or manual pre-release step to clean up any stragglers. This would be the most robust option.
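A rough sketch of what that scheduled backfill could look like. This is illustrative, not existing Beam tooling: the milestone number, the lookback window, the simplified "demilestoned" check, and the omission of search pagination are all assumptions.

```python
# Hypothetical backfill: attach the upcoming milestone to issues closed as
# completed in the last N days that have no milestone. Requires GITHUB_TOKEN.
import os
from datetime import datetime, timedelta, timezone

import requests

API = "https://api.github.com"
REPO = "apache/beam"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}


def milestone_was_removed(issue_number: int) -> bool:
    """No-op guard: True if someone deliberately removed a milestone."""
    events = requests.get(
        f"{API}/repos/{REPO}/issues/{issue_number}/events", headers=HEADERS
    ).json()
    # Simplification: doesn't verify the removal happened post-close.
    return any(e["event"] == "demilestoned" for e in events)


def backfill(milestone_number: int, window_days: int = 7) -> None:
    cutoff = (datetime.now(timezone.utc) - timedelta(days=window_days)).date()
    query = (
        f"repo:{REPO} is:issue is:closed reason:completed "
        f"no:milestone closed:>={cutoff}"
    )
    resp = requests.get(
        f"{API}/search/issues", headers=HEADERS,
        params={"q": query, "per_page": 100},
    )
    resp.raise_for_status()
    for issue in resp.json()["items"]:
        if milestone_was_removed(issue["number"]):
            continue  # respect intentional removals
        requests.patch(
            f"{API}/repos/{REPO}/issues/{issue['number']}",
            headers=HEADERS,
            json={"milestone": milestone_number},
        ).raise_for_status()
```

Run on a cron schedule, this is idempotent in the sense Kenn describes: reruns only touch issues that still lack a milestone, regardless of whether any individual close event was observed.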
On Wed, Oct 25, 2023 at 7:43 AM Kenneth Knowles <k...@apache.org> wrote:

> Agree. As long as we are getting enough of them, then our records, as well as any automation depending on them, are fine. One easy and standard way to make it more resilient would be to make it idempotent instead of counting on uptime or receiving any particular event.
>
> Kenn
>
> On Tue, Oct 24, 2023 at 2:58 PM Danny McCormick <dannymccorm...@google.com> wrote:
>
>> Looks like for some reason the workflow didn't trigger. This is running on GitHub's hosted runners, so my best guess is an outage.
>>
>> Looking at a more refined query, this year there have been 14 issues missed by the automation (3 had their milestone manually removed) - https://github.com/apache/beam/issues?q=is%3Aissue+no%3Amilestone+is%3Aclosed+reason%3Acompleted+created%3A%3E2023-01-01 - out of 605 total - https://github.com/apache/beam/issues?q=is%3Aissue+is%3Aclosed+reason%3Acompleted+created%3A%3E2023-01-01+. As best I can tell, there were a small number of workflow flakes, and GHA didn't correctly trigger a few.
>>
>> If we wanted, we could set up some recurring automation to go through and try to pick up the ones without milestones (or modify our existing automation to be more tolerant of failures), but it doesn't seem super urgent to me (feel free to disagree). I don't think this piece needs to be perfect.
>>
>> On Tue, Oct 24, 2023 at 2:40 PM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> Just grabbing one at random as an example: https://github.com/apache/beam/issues/28635 seems like it was closed as completed but not tagged.
>>>
>>> I'm happy to see that the bot reads the version from the repo to find the appropriate milestone, rather than using the nearest open one. Just recording that for the thread, since I first read the description as the latter.
>>>
>>> Kenn
>>>
>>> On Tue, Oct 24, 2023 at 2:34 PM Danny McCormick via dev <dev@beam.apache.org> wrote:
>>>
>>>> We do tag issues to milestones when the issue is marked as "completed" (as opposed to "not planned") - https://github.com/apache/beam/blob/master/.github/workflows/assign_milestone.yml. So I think using issues is probably about as accurate as using commits.
>>>>
>>>> > It looks like we have 820 with no milestone https://github.com/apache/beam/issues?q=is%3Aissue+no%3Amilestone+is%3Aclosed
>>>>
>>>> Most predate the automation, though maybe not all? Some of those may have been closed as "not planned".
>>>>
>>>> > This could (should) be automatically discoverable. A (closed) issue is associated with commits, which are associated with a release.
>>>>
>>>> Today, we just tag issues to the upcoming milestone when they're closed. In theory you could do something more sophisticated using linked commits, but in practice people aren't clean enough about linking commits to issues. Again, this is fixable by automation/enforcement, but I don't think it actually gives us much value beyond what we have today.
>>>>
>>>> On Tue, Oct 24, 2023 at 1:54 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>>
>>>>> On Tue, Oct 24, 2023 at 10:35 AM Kenneth Knowles <k...@apache.org> wrote:
>>>>>
>>>>>> Tangentially related:
>>>>>>
>>>>>> Long ago, attaching an issue to a release was a mandatory step as part of closing. Now I think it is not. Is it happening automatically? It looks like we have 820 with no milestone: https://github.com/apache/beam/issues?q=is%3Aissue+no%3Amilestone+is%3Aclosed
>>>>>
>>>>> This could (should) be automatically discoverable. A (closed) issue is associated with commits, which are associated with a release.
>>>>>
>>>>>> On Tue, Oct 24, 2023 at 1:25 PM Chamikara Jayalath via dev <dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> +1 for going by the commits, since that is what matters at the end of the day. Also, many issues may not get tagged correctly for a given release, either because the contributor didn't tag the issue or because commits for the issue span multiple Beam releases.
>>>>>>>
>>>>>>> For example, for all commits in a given release RC:
>>>>>>> * If we find a GitHub issue for the commit: add a notice to the GitHub issue.
>>>>>>> * Else: add the notice to a generic issue for the release, including tags for the commit ID, PR author, and the committer who merged the PR.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cham
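A minimal sketch of the per-commit pass Cham outlines above, under a few assumptions that are not existing Beam tooling: issues are referenced in commit messages as "#<number>", a generic tracking issue for the release already exists, and the commit range fits in a single compare response.

```python
# Hypothetical per-commit notifier for a release candidate.
import os
import re

import requests

API = "https://api.github.com"
REPO = "apache/beam"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}
ISSUE_REF = re.compile(r"#(\d+)")


def comment(issue_number, body: str) -> None:
    requests.post(
        f"{API}/repos/{REPO}/issues/{issue_number}/comments",
        headers=HEADERS,
        json={"body": body},
    ).raise_for_status()


def notify(prev_tag: str, rc_tag: str, generic_issue_number: int) -> None:
    # All commits between the previous release tag and the RC tag.
    cmp = requests.get(
        f"{API}/repos/{REPO}/compare/{prev_tag}...{rc_tag}", headers=HEADERS
    ).json()
    notice = f"This change is included in {rc_tag}; please help validate it."
    for c in cmp["commits"]:
        refs = ISSUE_REF.findall(c["commit"]["message"])
        if refs:
            for num in refs:  # add the notice to each referenced issue
                comment(num, notice)
        else:
            # No linked issue: note it on the generic release issue,
            # tagging the commit ID and author.
            author = (c.get("author") or {}).get("login", "unknown")
            comment(generic_issue_number, f"{c['sha']} by @{author}: {notice}")
```

Note that the compare API caps the number of commits returned per response, so a real version would need to paginate (or shell out to `git log prev_tag..rc_tag`).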
>>>>>>> On Mon, Oct 23, 2023 at 11:49 AM Danny McCormick via dev <dev@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> I'd probably vote to include both the issue filer and the contributor. It's about equally straightforward - one way to do this would be to take all issues related to that release's milestone and extract the issue author and the issue closer.
>>>>>>>>
>>>>>>>> This does leave out the (unfortunately sizable) set of contributions that don't have an associated issue; if we're worried about that, we could always fall back to anyone with a commit in the last release who doesn't have an associated issue (aka what I thought we were initially proposing, and what I think Airflow does today).
>>>>>>>>
>>>>>>>> I'm pretty much +1 on any sort of automation here, and it can certainly come in stages :)
>>>>>>>>
>>>>>>>> On Mon, Oct 23, 2023 at 1:50 PM Johanna Öjeling via dev <dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> Yes, that's a good point to include also those who created the issue.
>>>>>>>>>
>>>>>>>>> On Mon, Oct 23, 2023, 19:18 Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, Oct 23, 2023 at 7:26 AM Danny McCormick via dev <dev@beam.apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> So to summarize, I think there's broad consensus (or at least lazy consensus) around the following:
>>>>>>>>>>>
>>>>>>>>>>> - (1) Updating our release email/guidelines to be more specific about what we mean by release validation and how to be helpful during this process. This includes both encouraging validation within each user's own code base and encouraging people to document/share their process of validation and link it in the release spreadsheet.
>>>>>>>>>>> - (2) Doing something like what Airflow does (#29424 <https://github.com/apache/airflow/issues/29424>) and creating an issue asking people who have contributed to the current release to help validate their changes.
>>>>>>>>>>>
>>>>>>>>>>> I'm also +1 on doing both of these. The first bit (updating our guidelines) is relatively easy - it should just require updating https://github.com/apache/beam/blob/master/contributor-docs/release-guide.md#vote-and-validate-the-release-candidate.
>>>>>>>>>>>
>>>>>>>>>>> I took a look at the second piece (copying what Airflow does) to see if we could just copy their automation, but it looks like it's tied to airflow breeze <https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/provider_issue_TEMPLATE.md.jinja2> (their repo-specific automation tooling), so we'd probably need to build the automation ourselves. It shouldn't be terrible: basically, we'd want a GitHub Action that compares the current release tag with the last release tag, grabs all the commits in between, parses them to get the authors, and creates an issue with that data. But it does represent more effort than just updating a markdown file. There might even be an existing Action that can help with this; I haven't looked too hard.
>>>>>>>>>>
>>>>>>>>>> I was thinking along the lines of a script that would scrape the issues resolved in a given release and add a comment to them noting that the change is in release N, explaining (with clear instructions) how it can be validated. Creating a "validate this release" issue with all "contributing" participants could be an interesting way to do this as well. (I think it'd be valuable to get those who filed the issue, not just those who fixed it, to validate.)
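A sketch of the kind of script Robert describes, under a few assumptions (none of this is existing Beam automation): the release has a milestone whose number we know, issue authors stand in for filers, assignees stand in for fixers, and pagination is glossed over.

```python
# Hypothetical release-validation nudge: comment on each issue in the
# release milestone, then open one tracking issue mentioning everyone.
import os

import requests

API = "https://api.github.com"
REPO = "apache/beam"
HEADERS = {
    "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}


def ping_release(milestone_number: int, release: str) -> None:
    participants = set()
    issues = requests.get(
        f"{API}/repos/{REPO}/issues",
        headers=HEADERS,
        params={"milestone": milestone_number, "state": "closed",
                "per_page": 100},
    ).json()
    for issue in issues:
        if "pull_request" in issue:
            continue  # this endpoint also returns PRs; skip them
        participants.add(issue["user"]["login"])  # issue filer
        participants.update(a["login"] for a in issue["assignees"])  # fixers
        requests.post(
            f"{API}/repos/{REPO}/issues/{issue['number']}/comments",
            headers=HEADERS,
            json={"body": f"This is resolved in the {release} RC; please "
                          "help validate it (see the release guide)."},
        )
    # One "validate this release" issue mentioning everyone involved.
    mentions = " ".join(f"@{login}" for login in sorted(participants))
    requests.post(
        f"{API}/repos/{REPO}/issues",
        headers=HEADERS,
        json={"title": f"Validate release {release}",
              "body": f"Please help validate your changes in {release}: "
                      f"{mentions}"},
    )
```

Assignees are only a proxy for who fixed an issue; a more faithful version could read the actor off the "closed" event in each issue's timeline.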
>>>>>>>>>>> As our next release manager, I'm happy to review PRs for either of these if anyone wants to volunteer to help out. If not, I'm happy to update the guidelines, but I probably won't have time to add the commit inspection tooling (I'm planning on throwing any extra time towards continuing to automate release candidate creation, which is currently a more impactful problem IMO). I would very much like it if both of these things happened, though :)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Danny
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 23, 2023 at 10:05 AM XQ Hu <x...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1. This is a great idea to try. @Danny McCormick <dannymccorm...@google.com> FYI as our next release manager.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Oct 18, 2023 at 2:30 PM Johanna Öjeling via dev <dev@beam.apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> When I have contributed to Apache Airflow, they have tagged all contributors concerned in a GitHub issue when the RC is available and asked us to validate it. Example: #29424 <https://github.com/apache/airflow/issues/29424>.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I found that to be an effective way to notify contributors of the RC and nudge them to help out. In the issue description there is a reference to the guidelines on how to test the RC, and a note that people are encouraged to vote on the mailing list (which could admittedly be highlighted more, because I did not pay attention to it until now and was unaware that contributors had a vote).
>>>>>>>>>>>>>
>>>>>>>>>>>>> It might be an idea to consider something similar here to increase participation?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Oct 17, 2023 at 7:01 PM Jack McCluskey via dev <dev@beam.apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm +1 on helping explain what we mean by "validate the RC", since we're really just asking users to see if their existing use cases work, along with our typical slate of tests. I don't know if offloading that work to our active validators is the right approach, though; documentation or a screen share of their specific workflow is definitely less useful than a more general outline of how to install the RC and things to look out for when testing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Oct 17, 2023 at 12:55 PM Austin Bennett <aus...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Great effort. I'm also interested in streamlining releases -- so if there are a lot of manual tests that could be automated, it would be great to discover them and then look to address that.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Oct 17, 2023 at 8:47 AM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would also strongly suggest that people try out the release against their own codebases. This has the benefit of ensuring the release won't break your own code when it goes out, and stress-tests the new code against real-world pipelines.
>>>>>>>>>>>>>>>> (Ideally our own tests are all passing, and this validation is automated as much as possible (though ensuring it matches our documentation and works in a clean environment still has value), but there's a lot of code and uses out there that we don't have access to during normal Beam development.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Oct 17, 2023 at 8:21 AM Svetak Sundhar via dev <dev@beam.apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I’ve participated in RC testing for a few releases and have observed a bit of a knowledge gap in how releases can be tested. Given that Beam encourages contributors to vote on RCs regardless of tenure, and that voting on an RC is a relatively low-effort, high-leverage way to influence the release of the library, I propose the following:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> During the vote for the next release, voters can document the process they followed in a separate document and add the link in column G here <https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=437054928>. One step further could be a screencast of running the test, with a link to that attached as well.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We can keep repeating this through releases until we have documentation for many of the different tests. We can then add these docs to the repo.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I’m proposing this because I’ve gathered the following feedback from colleagues who are tangentially involved with Beam: they are interested in participating in release validation, but don’t know how to get started. Happy to hear other suggestions too, if there are any, to address the above.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Svetak Sundhar
>>>>>>>>>>>>>>>>> Data Engineer
>>>>>>>>>>>>>>>>> svetaksund...@google.com