I asked Gavin McDonald on the-asf slack for an update on the Apache org GitHub Actions usage stats in this thread: https://the-asf.slack.com/archives/CBX4TSBQ8/p1662464113873539?thread_ts=1661512133.913279&cid=CBX4TSBQ8 .
I hope we get this issue resolved since it delays PR processing a lot.

-Lari

On 2022/09/06 11:16:07 Lari Hotari wrote:
> Pulsar CI continues to be congested, and the build queue [1] is very long at
> the moment. There are 147 build jobs in the queue and 16 jobs in progress.
>
> I would strongly advise everyone to use "personal CI" to mitigate the issue
> of the long delay of CI feedback. You can simply open a PR to your own
> personal fork of apache/pulsar to run the builds in your "personal CI".
> There are more details in the previous emails in this thread.
>
> -Lari
>
> [1] - build queue: https://github.com/apache/pulsar/actions?query=is%3Aqueued
>
> On 2022/08/30 12:39:19 Lari Hotari wrote:
> > Pulsar CI continues to be congested, and the build queue is long.
> >
> > I would strongly advise everyone to use "personal CI" to mitigate the issue
> > of the long delay of CI feedback. You can simply open a PR to your own
> > personal fork of apache/pulsar to run the builds in your "personal CI".
> > There are more details in the previous email in this thread.
> >
> > Some updates:
> >
> > There has been a discussion with Gavin McDonald from ASF infra on the-asf
> > slack about getting usage reports from GitHub to support the investigation.
> > The Slack thread is the same one mentioned in the previous email,
> > https://the-asf.slack.com/archives/CBX4TSBQ8/p1661512133913279 . Gavin
> > already requested the usage report in the GitHub UI, but it produced invalid
> > results.
> >
> > I made a change to mitigate a source of additional GitHub Actions overhead.
> > In the past, each cherry-picked commit to a maintenance branch of Pulsar
> > has triggered a lot of workflow runs.
> >
> > The solution for cancelling duplicate builds automatically is to add this
> > definition to the workflow definition:
> > concurrency:
> >   group: ${{ github.workflow }}-${{ github.ref }}
> >   cancel-in-progress: true
> >
> > I added this to all maintenance branch GitHub Actions workflows:
> >
> > branch-2.10 change:
> > https://github.com/apache/pulsar/commit/5d2c9851f4f4d70bfe74b1e683a41c5a040a6ca7
> > branch-2.9 change:
> > https://github.com/apache/pulsar/commit/3ea124924fecf636cc105de75c62b3a99050847b
> > branch-2.8 change:
> > https://github.com/apache/pulsar/commit/48187bb5d95e581f8322a019b61d986e18a31e54
> > branch-2.7 change:
> > https://github.com/apache/pulsar/commit/744b62c99344724eacdbe97c881311869d67f630
> >
> > branch-2.11 already contains the necessary config for cancelling duplicate
> > builds.
> >
> > The benefit of the above change is that when multiple commits are
> > cherry-picked to a branch at once, only the build of the last commit will
> > eventually run. The builds for the intermediate commits will get
> > cancelled. Obviously there's a tradeoff here: we don't find out whether
> > one of the earlier commits breaks the build. It's the cost that we need
> > to pay. Nevertheless, our build is so flaky that it's hard to determine
> > whether a failed build result is caused only by a flaky test or whether
> > it's an actual failure. Because of this, we don't lose anything by
> > cancelling builds. It's more important to save build resources. In the
> > maintenance branches for 2.10 and older, the average total build time
> > consumed is around 20 hours, which is a lot.
> >
> > At this time, the overhead of maintenance branch builds doesn't seem to be
> > the source of the problems. There must be some other issue, which is
> > possibly related to exceeding a usage quota. Hopefully we get the CI
> > slowness issue solved ASAP.
> >
> > BR,
> >
> > Lari
> >
> >
> > On 2022/08/26 12:00:20 Lari Hotari wrote:
> > > Hi,
> > >
> > > GitHub Actions builds have been piling up in the build queue in the last
> > > few days.
> > > I posted on bui...@apache.org
> > > https://lists.apache.org/thread/6lbqr0f6mqt9s8ggollp5kj2nv7rlo9s and
> > > created INFRA ticket https://issues.apache.org/jira/browse/INFRA-23633
> > > about this issue.
> > > There's also a thread on the-asf slack,
> > > https://the-asf.slack.com/archives/CBX4TSBQ8/p1661512133913279 .
> > >
> > > It seems that our build queue is finally getting picked up, but it would
> > > be great to see whether we hit a quota and whether that is the cause of
> > > the pauses.
> > >
> > > Another issue is that the master branch broke after merging 2 conflicting
> > > PRs.
> > > The fix is in https://github.com/apache/pulsar/pull/17300 .
> > >
> > > Merging PRs will be slow until we have these 2 problems solved and
> > > existing PRs rebased over the changes. Let's prioritize merging #17300
> > > before pushing more changes.
> > >
> > > I'd like to point out that a good way to get build feedback before
> > > sending a PR is to run builds on your personal GitHub Actions CI. The
> > > benefit of this is that it doesn't consume the shared quota and builds
> > > usually start instantly.
> > > There are instructions in the contributors guide about this.
> > > https://pulsar.apache.org/contributing/#ci-testing-in-your-fork
> > > You simply open PRs to your own fork of apache/pulsar to run builds on
> > > your personal GitHub Actions CI.
> > >
> > > BR,
> > >
> > > Lari
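
For anyone who wants to see where the concurrency block from the 2022/08/30 message sits in a workflow file, here is a minimal sketch. Only the three concurrency lines come from this thread. The workflow name, the trigger branch, and the build job are placeholders I made up for illustration; they are not Pulsar's actual workflow definitions.

```yaml
# Hypothetical example workflow, not copied from apache/pulsar.
name: Example CI            # placeholder name

on:
  push:
    branches:
      - branch-2.10         # placeholder: any maintenance branch

# From the thread: one concurrency group per workflow and ref.
# When a new run starts in the same group, GitHub cancels the run
# that is still queued or in progress for that branch.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build:                    # placeholder job
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: echo "build steps go here"
```

Since github.ref is part of the group name, pushes to different branches do not cancel each other; only successive pushes to the same branch do.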