andygrove commented on issue #4389:
URL:
https://github.com/apache/datafusion-comet/issues/4389#issuecomment-4512172121
Proposing a minimal first step toward tiered CI: split the existing
`spark_sql_test.yml` matrix so PRs run a 2-version subset and the GitHub merge
queue covers the remaining versions. Push to `main` and `workflow_dispatch`
continue to run all four versions.
### Proposed split
| Trigger | Spark versions | Jobs (modules x Spark versions) |
|---|---|---|
| `pull_request` | 3.5.8, 4.1.1 | 7 x 2 = 14 |
| `merge_group` | 3.4.3, 4.0.2 | 7 x 2 = 14 |
| `push` to `main`, `workflow_dispatch` | 3.4.3, 3.5.8, 4.0.2, 4.1.1 | 7 x 4 = 28 (unchanged) |
Net effect per PR run: 14 fewer `spark-sql-test/*` jobs from this workflow
(28 down to 14). The merge queue adds 14 jobs, but those run only once per PR,
when it enters the queue, rather than on every push to the PR branch.
### Mechanism
1. Add `merge_group:` to the workflow `on:` triggers (no `paths:` filter so
queue runs always execute).
2. Replace the static `config:` matrix with an event-conditional `${{ ... }}`
expression that selects the Spark version subset based on
`github.event_name`.
3. Add a single rollup job `spark-sql-test-status` (`needs: spark-sql-test`,
`if: always()`) that becomes the sole required status check. This decouples
branch protection from matrix shape so future reshapes do not require
re-editing required checks.
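
The three steps above could look roughly like the following workflow fragment. This is a minimal sketch, not the contents of the proposed branch: the job id `spark-sql-test` is taken from the `needs:` reference above, while the shape of `config` (plain version strings here), the runner, and the placeholder test step are assumptions. The real workflow's 7-module dimension and its existing `pull_request` filters are omitted.

```yaml
# Sketch of .github/workflows/spark_sql_test.yml (illustrative only)
on:
  pull_request:        # keep the existing types/paths filters here
  push:
    branches: [main]
  merge_group:         # step 1: no paths filter, so queue runs always execute
  workflow_dispatch:

jobs:
  spark-sql-test:
    runs-on: ubuntu-latest   # assumed runner
    strategy:
      fail-fast: false
      matrix:
        # Step 2: choose the Spark subset from the triggering event.
        # `a && b || c` acts as a conditional because the JSON strings are truthy;
        # any other event (push, workflow_dispatch) falls through to all four.
        config: ${{ fromJSON(github.event_name == 'pull_request' && '["3.5.8","4.1.1"]' || github.event_name == 'merge_group' && '["3.4.3","4.0.2"]' || '["3.4.3","3.5.8","4.0.2","4.1.1"]') }}
    steps:
      # Placeholder for the real build and test steps.
      - run: echo "would run Spark SQL tests against Spark ${{ matrix.config }}"

  # Step 3: single rollup job intended to be the only required status check.
  spark-sql-test-status:
    name: Spark SQL Tests Status
    needs: spark-sql-test
    if: always()             # run even when a matrix job failed or was cancelled
    runs-on: ubuntu-latest
    steps:
      - name: Fail unless every matrix job succeeded
        if: needs.spark-sql-test.result != 'success'
        run: exit 1
```

The `if: always()` on the rollup job matters because a job whose dependency failed is otherwise skipped, and GitHub reports a skipped job as passing for required-check purposes. With `always()`, the rollup runs and fails explicitly whenever the matrix result is anything other than `success`.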
### Branch protection coordination
After this change, PRs will no longer produce the per-version status checks
(`spark-sql-{module}/spark-{full}-jdk{java}`) for Spark 3.4.3 and 4.0.2, so any
of those left as required would keep PRs waiting on a check that never reports.
Branch protection must be updated to require `Spark SQL Tests Status` instead,
with the per-version requirements removed before the workflow change merges.
Suggested sequence: remove per-version required checks, merge the workflow
change, verify a `merge_group` run, then add the rollup as the sole required
check.
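
If branch protection for this repository is managed through the ASF `.asf.yaml` file (an assumption; it may instead be handled in repository settings or by Infra), the final step of that sequence could be expressed roughly as below. The branch name `main` and the check name come from the proposal; the `strict` value is illustrative only, and any other settings already in the file are omitted.

```yaml
# .asf.yaml (sketch; existing unrelated settings omitted)
github:
  protected_branches:
    main:
      required_status_checks:
        # Illustrative value: whether a branch must be up to date before merging.
        strict: false
        contexts:
          # The rollup job's display name becomes the only required check.
          - Spark SQL Tests Status
```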
### Scope
In scope: one file only, `.github/workflows/spark_sql_test.yml`.
Out of scope (future follow-ups): `pr_build_linux.yml` (Comet's own tests
still run all five Spark profiles on PR), `iceberg_spark_test.yml`, macOS,
benchmarks, fuzz, nightly cron tier. Each of these can be tackled as its own
follow-up.
### Branch
The proposed change is on
[andygrove:ci/tiered-spark-sql-test-matrix](https://github.com/andygrove/datafusion-comet/tree/ci/tiered-spark-sql-test-matrix).
Not opening a PR yet, pending feedback on the approach and the
branch-protection coordination plan above.