zhengruifeng commented on PR #56110: URL: https://github.com/apache/spark/pull/56110#issuecomment-4543467601
### CI performance: before vs after

Comparing per-job wall time on real CI runs:

| Job | Before avg (n=2) | After (n=1) | Savings |
|---|---:|---:|---:|
| Precompile Spark | 16m34s | 16m13s | -- (same) |
| **Run Docker integration tests** | **90m48s** | **74m12s** | **~16m36s (~18%)** |
| Run Spark on Kubernetes Integration test | 66m56s | 65m48s | ~1m08s (~2%) |

Samples:

- BEFORE-1: [zhengruifeng/spark run 26072669641](https://github.com/zhengruifeng/spark/actions/runs/26072669641) (2026-05-19, on SPARK-56943's PR branch -- precompile already produced an artifact, but docker/k8s didn't consume it).
- BEFORE-2: [zhengruifeng/spark run 25551778074](https://github.com/zhengruifeng/spark/actions/runs/25551778074) (2026-05-08, earlier push on the same PR).
- AFTER: [zhengruifeng/spark run 26438104273](https://github.com/zhengruifeng/spark/actions/runs/26438104273) (2026-05-26, this PR).

### Reading the result

- **Docker is a clean win** -- ~17m saved per run, ~18% of job wall time, same payoff shape as the pyspark sharing in [SPARK-56768](https://issues.apache.org/jira/browse/SPARK-56768). Docker tests are compile-heavy relative to their other work.
- **K8s barely moves** (~1m). The savings on the Spark-side SBT compile are real, but they're absorbed by the parts of the K8s job that don't change: Minikube startup, Spark Docker image build, the `kubernetes-integration-tests` module's own compile (which isn't in the precompile because it needs `-Pkubernetes-integration-tests`), and the actual K8s integration test execution. Wall time is dominated by these.
- I'd still keep the K8s wiring -- the change is small, the fallback is silent if precompile fails, and even small per-run savings add up.

A possible follow-up that could shave 5-10m off the K8s job is to fold `-Pkubernetes-integration-tests` (and `-Psparkr`) into the precompile invocation so SBT doesn't recompile those modules at test time. Happy to do that in a separate PR if reviewers want.
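
For anyone who wants to re-check the Savings column, here is a minimal sketch that recomputes it from the wall times reported in the table. The helper names (`parse_duration`, `format_duration`, `savings`) and the `JOB_TIMES` dict are illustrative only and not part of this PR; the durations are copied from the table as-is, with the "before" values being the reported two-run averages.

```python
import re

# Wall times copied from the table: (before average, after).
JOB_TIMES = {
    "Precompile Spark": ("16m34s", "16m13s"),
    "Run Docker integration tests": ("90m48s", "74m12s"),
    "Run Spark on Kubernetes Integration test": ("66m56s", "65m48s"),
}


def parse_duration(text: str) -> int:
    """Convert a string such as '90m48s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


def format_duration(total_seconds: int) -> str:
    """Render a number of seconds back into the 'XmYYs' form used in the table."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m{seconds:02d}s"


def savings(before: str, after: str) -> tuple[int, float]:
    """Return (seconds saved, percent of the 'before' wall time saved)."""
    before_s = parse_duration(before)
    saved = before_s - parse_duration(after)
    return saved, 100.0 * saved / before_s


if __name__ == "__main__":
    for job, (before, after) in JOB_TIMES.items():
        saved, pct = savings(before, after)
        print(f"{job}: {format_duration(saved)} saved ({pct:.1f}%)")
```

With only two "before" samples and one "after" sample, these are point estimates; the sketch says nothing about run-to-run variance.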
