Re: Superset Consumption of ASF Shared GitHub-hosted Runners

Bob Thomson Mon, 22 Jun 2026 06:45:57 -0700

Hi Evan,

That's 7th overall, in terms of runner minutes used over the 7-day period. We 
can break that down by workflow, but I'm afraid we don't have data down to the 
level of PRs.


We can drill down into the workflow split, but even then no single workflow 
consumes a lot: superset-e2e (3.4%), superset-frontend (2.5%), docker (2.3%) 
and superset-python-integration-test (2%) are the largest consumers of the 
time Superset uses; the rest are under 2%, and most are much less.

Kind regards,
-Bob Thomson.

On 2026/06/15 21:14:04 Evan Rusackas wrote:
> Hi Bob,
> 
> Just trying to disambiguate something… are we at #7 for overall use, or on a 
> per-PR basis?
> 
> It’s a little hard to separate the problem from a per-run basis (optimizing) 
> vs total use (i.e. lots of PRs flying around causing a larger overall load).
> 
> I’m hoping things have gotten significantly better per-PR, but the repo has 
> been a hive of activity lately, keeping us in a top overall spot.
> 
> Still trying to think of places to cut that aren’t detrimental to stability 
> or devex. We’ll keep whittling, and anyone reading this is welcome to provide 
> ideas/suggestions :)
> 
> Thanks,
> 
> -e-
> 
> Evan Rusackas
> Preset | preset.io
> Apache Superset PMC
> On Jun 11, 2026 at 1:55 AM -0700, Bob Thomson <[email protected]>, wrote:
> > Hi,
> >
> > Reviewing the picture today, overall utilisation of the GitHub-hosted 
> > runners is still maxing out daily. With respect to Superset, we can see 
> > that over the last 7 days the project has dropped out of the top 5, to 
> > 7th, which is good progress. Your continued work and vigilance in this 
> > area helps all projects and is appreciated.
> >
> > Please note that there is now a page for sharing tips and best practice for 
> > optimising workflows with respect to utilisation:
> >
> > https://cwiki.apache.org/confluence/display/INFRA/GitHub+Actions+Recommended+Practices
> >
> > There is also a Slack channel for community discussion and sharing ideas: 
> > project-workflow-optimisations
> >
> > Thanks.
> >
> > Kind regards,
> > -Bob Thomson,
> > ASF Infrastructure
> >
> >
> > On 2026/06/05 08:09:51 Robert Thomson wrote:
> > > Thanks Evan, all sounds like great work, hopefully will make a dent in the
> > > jobs in use.
> > >
> > > Kind regards,
> > > -Bob Thomson,
> > > ASF Infrastructure
> > >
> > >
> > > On Thu, Jun 4, 2026 at 9:25 PM Evan Rusackas <[email protected]> wrote:
> > >
> > > > Thanks for the tips.
> > > >
> > > > > For your first suggestion, we took a different route to the same
> > > > > goal, using a change-detector action and job-level gating, which has
> > > > > advantages for our setup, but we do use “paths:” in several workflows.
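> > > > >
> > > > > A minimal sketch of job-level gating, for anyone following along. It
> > > > > assumes the dorny/paths-filter action as the change detector, and the
> > > > > workflow and job names are made up; it is not the actual Superset
> > > > > workflow.
> > > > >
> > > > > ```yaml
> > > > > # Hypothetical workflow: the names, the paths, and the choice of
> > > > > # dorny/paths-filter are assumptions, not Superset's real config.
> > > > > name: frontend-checks
> > > > > on:
> > > > >   pull_request:
> > > > >
> > > > > jobs:
> > > > >   changes:
> > > > >     runs-on: ubuntu-latest
> > > > >     permissions:
> > > > >       pull-requests: read
> > > > >     outputs:
> > > > >       frontend: ${{ steps.filter.outputs.frontend }}
> > > > >     steps:
> > > > >       # On pull_request events the changed files are read via the API.
> > > > >       - uses: dorny/paths-filter@v3
> > > > >         id: filter
> > > > >         with:
> > > > >           filters: |
> > > > >             frontend:
> > > > >               - 'superset-frontend/**'
> > > > >
> > > > >   frontend-tests:
> > > > >     needs: changes
> > > > >     # Skipped (not failed) when no frontend file changed.
> > > > >     if: needs.changes.outputs.frontend == 'true'
> > > > >     runs-on: ubuntu-latest
> > > > >     steps:
> > > > >       - uses: actions/checkout@v4
> > > > >       - run: echo "run frontend tests here"  # placeholder step
> > > > > ```
> > > > >
> > > > > One commonly cited advantage over a workflow-level “paths:” filter is
> > > > > that a job skipped by “if:” still reports a status, so a required check
> > > > > is not left pending. Whether that is the advantage meant here is a
> > > > > guess.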
> > > >
> > > > > For the second one, we are using concurrency & cancel-in-progress, so
> > > > > all set there. However, we’re using “github.run_id” rather than
> > > > > “github.ref” there, since on push events, run_id lets every commit to
> > > > > master get fully validated, whereas ref would cancel in-progress master
> > > > > validations when commits land back-to-back (happening an awful lot
> > > > > right now).
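> > > > >
> > > > > Roughly, that looks like the sketch below. It assumes the group key
> > > > > otherwise mirrors the snippet suggested earlier in this thread; the
> > > > > exact expression in the Superset workflows may differ.
> > > > >
> > > > > ```yaml
> > > > > concurrency:
> > > > >   # PRs share one group per PR number, so a new push cancels the
> > > > >   # older run. Push events (e.g. to master) have no PR number and
> > > > >   # fall back to the unique run_id, so each commit gets its own
> > > > >   # group and is never cancelled by the next one.
> > > > >   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
> > > > >   cancel-in-progress: true
> > > > > ```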
> > > >
> > > > > All the important PRs mentioned in my last email have landed, and we’re
> > > > > just doing touch-ups now. Hopefully the situation has drastically
> > > > > improved, though ironically, a ton of PRs need rebasing now, so pardon
> > > > > the CI churn while we do so with the current backlog.
> > > >
> > > > Thanks again,
> > > >
> > > > -e-
> > > >
> > > > *Evan Rusackas*
> > > > Preset | preset.io
> > > > On Jun 4, 2026 at 1:09 AM -0700, Bob Thomson <[email protected]>, 
> > > > wrote:
> > > >
> > > > I have been experimenting with pointing Gemini at public repos and
> > > > prompting:
> > > >
> > > > "Analyse the GitHub Actions workflows in this repo
> > > > https://github.com/apache/PROJECT/tree/master/.github and report on
> > > > possible causes of long run time/high number of runs of GitHub Actions"
> > > >
> > > > One output here was:
> > > >
> > > > The Problem: Changes to frontend UI files (.ts, .tsx, .less) frequently
> > > > trigger backend Python unit test runs, and vice versa. Unless paths are
> > > > explicitly managed on every configuration entry, the entire testing 
> > > > suite
> > > > runs for micro-commits affecting only one side of the stack.
> > > > The Fix: Workflows must feature distinct path-routing restrictions:
> > > >
> > > > And the suggested change was:
> > > >
> > > > # For frontend workflows
> > > > on:
> > > >   pull_request:
> > > >     paths:
> > > >       - 'superset-frontend/**'
> > > >
> > > > I am no expert on Actions or this project, but thought I'd pass it on in
> > > > case it is helpful.
> > > >
> > > > A second one was:
> > > >
> > > > concurrency:
> > > >   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
> > > >   cancel-in-progress: true
> > > >
> > > > This is said to ensure that when a PR is open with workflows running
> > > > for it and a new commit is pushed to the same PR, the runs from the
> > > > earlier commit are cancelled. Otherwise, an open PR that gets 3 more
> > > > commits pushed results in 3 lots of workflows running for the one PR,
> > > > 2 of which are redundant.
> > > >
> > > > Hope these are useful, or at least food for thought on other possible
> > > > streamlining improvements.
> > > >
> > > > Kind regards,
> > > > -Bob Thomson
> > > >
> > > >
> > > > On 2026/06/03 18:22:55 Evan Rusackas wrote:
> > > >
> > > > Hi Bob (and all)
> > > >
> > > > Thanks for the heads up on this. I just opened a swath of PRs that 
> > > > should
> > > > cut this down significantly. I’m working with PMC members to
> > > > assess/touch-up/review/merge:
> > > >
> > > >
> > > > 1. This PR takes us from 6 Cypress runners down to 5, and takes
> > > > the /app/prefix smoke test (only running on master now) down from 2 
> > > > runners
> > > > to 1. https://github.com/apache/superset/pull/40717
> > > > 2. Cypress runners were all spinning up BEFORE they checked to see if 
> > > > they
> > > > were needed. This should fix that:
> > > > https://github.com/apache/superset/pull/40718
> > > > 3. Gating E2E behind pre-commit. That's such a common failure that we
> > > > probably needn't test E2E until it passes. See the caveats here, there 
> > > > are
> > > > some visibility and fork-based PR caveats:
> > > > https://github.com/apache/superset/pull/40719
> > > > 4. Run unit/integration tests on the CURRENT Python version on PRs, and
> > > > the full version matrix (3.10-3.12) on master:
> > > > https://github.com/apache/superset/pull/40722
> > > > 5. Don't run CodeQL checks on docs-only changes:
> > > > https://github.com/apache/superset/pull/40724
> > > > 6. Cancel-in-progress on a few things that churn needlessly on every
> > > > commit: https://github.com/apache/superset/pull/40725
> > > > 7. Only build docker on docker-relevant changes:
> > > > https://github.com/apache/superset/pull/40723
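> > > >
> > > > For item 4, a minimal sketch of one way to switch the matrix by event.
> > > > “3.11” is an assumed stand-in for CURRENT, and the job name and test
> > > > command are placeholders; the linked PR may well do this differently.
> > > >
> > > > ```yaml
> > > > jobs:
> > > >   unit-tests:
> > > >     runs-on: ubuntu-latest
> > > >     strategy:
> > > >       matrix:
> > > >         # PRs: one version only ("3.11" is an assumption).
> > > >         # Other events (e.g. push to master): the full 3.10-3.12 range.
> > > >         python-version: ${{ fromJSON(github.event_name == 'pull_request' && '["3.11"]' || '["3.10", "3.11", "3.12"]') }}
> > > >     steps:
> > > >       - uses: actions/checkout@v4
> > > >       - uses: actions/setup-python@v5
> > > >         with:
> > > >           python-version: ${{ matrix.python-version }}
> > > >       - run: python --version  # placeholder for the real test command
> > > > ```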
> > > >
> > > > There’s an alternate (radical) solution of just NOT running E2E tests on
> > > > PRs, but only running them on master. Sure would “nip it in the bud” 
> > > > cost
> > > > wise, but has potential repercussions if we don’t keep a close eye on 
> > > > CI on
> > > > `master`
> > > >
> > > > TL;DR: We’re whittling, and will ask for fresh reports (in private ASF
> > > > Slack channels, probably) for impact results.
> > > >
> > > >
> > > > -e-
> > > >
> > > > Evan Rusackas
> > > > Preset | preset.io
> > > > On Jun 3, 2026 at 10:29 AM -0700, Bob Thomson <[email protected]>,
> > > > wrote:
> > > >
> > > > Fewer parallel runs is essential, yes. We are at 900/900 GitHub-hosted
> > > > runner jobs/slots just now, and looking at Superset Actions we can see
> > > > nearly 500 completed Superset repo action runs in the last hour, some
> > > > of them up to 25 minutes in execution time. Anything that can be done
> > > > to reduce the share of runner jobs used by Superset is urgent, now that
> > > > we are at max jobs on runners on a daily basis.
> > > >
> > > > Thanks.
> > > >
> > > > Kind regards,
> > > > -Bob Thomson,
> > > > ASF Infrastructure
> > > >
> > > > On 2026/05/22 19:54:16 Evan Rusackas wrote:
> > > >
> > > > Hi Bob (and everyone here),
> > > >
> > > > Thanks for the alert. The unfortunate thing is that this will only get
> > > > worse as we create/fix more things (security, dependabot, etc). Things 
> > > > only
> > > > seem to be ramping up.
> > > >
> > > > So, agreed, we must whittle. Cypress is the obvious killer (about half 
> > > > the
> > > > consumption). We’ll try to find ways to whittle away at this (we’re
> > > > migrating to Playwright, but it takes time). We might also be able to 
> > > > spend
> > > > less compute and more time by optimizing (or removing) some 
> > > > parallelization
> > > > here.
> > > >
> > > > We’re also looking at moving from dependabot for all dependency bumps (a
> > > > LOT of PRs) to `renovate` - which might optimize things a bit (bumping
> > > > dependencies in groups) but we will need to also leave dependabot in 
> > > > place
> > > > for security-driven fixes as well.
> > > >
> > > > As for Cypress tests, we have some “matrixification” happening, that I
> > > > think we can optimize. For the Superset folks reading this, I think we 
> > > > can
> > > > split out the “app_root” tests to JUST run on merges to `master` rather
> > > > than every PR. That’ll save ~50% right there, we just have to keep a 
> > > > better
> > > > eye on CI on `master` (which we haven’t been great about historically, 
> > > > but
> > > > we’re getting better).
> > > >
> > > > Here’s the app_root PR https://github.com/apache/superset/pull/40385
> > > >
> > > > We can also reduce the E2E parallelization shards from 6 to… I dunno… 3 
> > > > or
> > > > 4. That’ll save a fair bit of setup time spinning up Superset instances.
> > > > Tests will run a bit longer, but consume less overall. Seems like a fair
> > > > tradeoff.
> > > >
> > > > Open to other ideas… maybe running fewer GHA workflows in parallel, and
> > > > having things more sequentially to fail faster (like nothing runs until
> > > > pre-commit passes, for example).
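> > > >
> > > > A minimal sketch of that kind of sequencing, assuming both jobs live in
> > > > the same workflow file (“needs:” only works within one workflow). Job
> > > > names and commands are placeholders, not the real Superset jobs.
> > > >
> > > > ```yaml
> > > > jobs:
> > > >   pre-commit:
> > > >     runs-on: ubuntu-latest
> > > >     steps:
> > > >       - uses: actions/checkout@v4
> > > >       - run: echo "run pre-commit here"  # placeholder
> > > >
> > > >   e2e:
> > > >     # Starts only after pre-commit succeeds. If pre-commit fails, this
> > > >     # job is skipped and no E2E runner is used.
> > > >     needs: pre-commit
> > > >     runs-on: ubuntu-latest
> > > >     steps:
> > > >       - uses: actions/checkout@v4
> > > >       - run: echo "run E2E tests here"  # placeholder
> > > > ```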
> > > >
> > > > Also, least importantly, we don’t have the access to see how we stack up
> > > > against other projects, but I sure am curious.
> > > >
> > > > Anyone's thoughts/PRs welcomed.
> > > >
> > > > Evan Rusackas
> > > > Preset | preset.io
> > > > On May 22, 2026 at 4:46 AM -0700, Robert Thomson <[email protected]>,
> > > > wrote:
> > > >
> > > > Hello, Superset PMC.
> > > >
> > > > In 2024, the ASF introduced the policy for GitHub Actions usage
> > > > across the foundation[1]. The ASF GitHub shared pool of
> > > > GitHub-hosted runners has been at, or very close to, the limit of
> > > > 900 jobs most of the time in the past few weeks and this is the
> > > > case again today.
> > > >
> > > > Your project has been identified as being among the top 5 consumers of
> > > > build time over the past 7 days and we request that you bring your
> > > > usage down by stream-lining long-running builds. Contact Infra for
> > > > a consultation if you are unable to streamline your builds further.
> > > >
> > > > You can use the infra reporting tool[2] to monitor your GHA usage as you
> > > > work on stream-lining, as well as locate any bottlenecks in the 
> > > > workflows.
> > > >
> > > > Infra will allow you two weeks' time (till the 8th of June, 2026) to
> > > > progress this, but should you still be above the limits by then,
> > > > without a viable path forward, we will be limiting your GHA usage.
> > > >
> > > > Kind regards,
> > > > Bob Thomson, on behalf of ASF Infrastructure.
> > > >
> > > >
> > > > [1] https://infra.apache.org/github-actions-policy.html
> > > > [2] https://infra-reports.apache.org/#ghactions&project=superset&hours=24&limit=15&group=name
> > > >
> > > >
> > > >
> > > >
> > >
>
