zhengruifeng commented on PR #55762: URL: https://github.com/apache/spark/pull/55762#issuecomment-4487328381
### Worked example: `Build modules: hive - slow tests`

Apples-to-apples on a single matrix entry, comparing this PR's head against an apache/spark master run on the same day:

- **This PR** ([fork run 26072669641, job 76658806177](https://github.com/zhengruifeng/spark/actions/runs/26072669641/job/76658806177)): **43:27**
- **master baseline** ([apache run 26076954167, job 76669980347](https://github.com/apache/spark/actions/runs/26076954167/job/76669980347)): **52:24**
- Δ: **−8:57** (~17% faster)

#### Step-by-step diff

Step durations are in decimal minutes; the Total row is in mm:ss.

| Step | master (no artifact reuse) | This PR | Δ |
|---|---:|---:|---:|
| Free up disk space | 3.93 m | 2.58 m | −1.35 (noise) |
| **Download precompiled artifact** | n/a | 2.12 m | +2.12 |
| **Extract precompiled artifact** | n/a | 0.33 m | +0.33 |
| **Run tests** | **46.58 m** | **36.38 m** | **−10.20** |
| (other steps) | ~1.9 m | ~1.9 m | ~0 |
| Total | 52:24 | 43:27 | **−8:57** |

The `Run tests` step drops by 10.2 min, partially offset by about 2.5 min of artifact download and extract (2.12 + 0.33 = 2.45 min). The net per-job saving is therefore about 8 min (10.20 − 2.45 = 7.75 min); the remaining 1.35 min difference in "Free up disk space" is runner variance, not a saving from this PR.

#### Why `Run tests` is shorter

`dev/run-tests.py` makes three SBT calls inside this step. On this PR, `SKIP_SCALA_BUILD=true` is exported (gated on the extract step's success) and short-circuits the first two:

1. ~~`build_spark_sbt`: `Test/package`, kinesis-asl-assembly, connect-assembly~~ (skipped)
2. ~~`build_spark_assembly_sbt`: `assembly/package`~~ (skipped)
3. `run_scala_tests_sbt`: `hive/test` (still runs; Zinc sees up-to-date classes from the extracted artifact and only compiles test sources)

#### Are tests running faster too? No - and that's the point

Verified against the JUnit reports uploaded by both jobs (`test-results-hive-- slow tests-...`):

| | This PR | master | Δ |
|---|---:|---:|---:|
| Suites executed | 133 | 133 | 0 |
| Tests executed | 2,220 | 2,220 | 0 |
| Sum of per-test `time` | 1,833.3 s | 1,816.6 s | +16.7 s (this PR slightly slower) |

The aggregate +16.7 s is dominated by one network-dependent outlier (`HiveClientSuites.success sanity check`: 65.2 s vs 32.4 s, Maven download latency). Excluding it, the two runs are within ~0.1% of each other. Per-test deltas (the `run sql directly on files` family, for example) are in the tens of milliseconds, which is JVM warmup / JIT noise.

So the job-level saving comes entirely from skipping the SBT build phase, not from tests getting faster. This is the expected and correct behavior: the precompiled artifact is byte-equivalent to what `build_spark_sbt` + `build_spark_assembly_sbt` would have produced, so the test phase runs against identical classes.
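
For anyone who wants to reproduce the suite/test counts and the summed per-test `time` from the uploaded JUnit reports, a minimal sketch follows. It assumes the reports have been downloaded and unpacked into a local directory and follow the usual JUnit XML layout (`<testsuite>` elements containing `<testcase time="...">` children); the helper name `summarize_junit` is hypothetical and not part of the Spark repo.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def summarize_junit(report_dir):
    """Return (suites, tests, total_seconds) over all *.xml reports in report_dir.

    Assumes standard JUnit XML: each <testsuite> holds <testcase> children
    whose `time` attribute is the per-test duration in seconds.
    """
    suites = 0
    tests = 0
    total_seconds = 0.0
    for path in sorted(Path(report_dir).rglob("*.xml")):
        root = ET.parse(path).getroot()
        # iter() also yields the root itself, so this handles both a bare
        # <testsuite> root and a <testsuites> wrapper.
        for suite in root.iter("testsuite"):
            suites += 1
            # Only direct children, so nested suites are not double-counted.
            for case in suite.findall("testcase"):
                tests += 1
                total_seconds += float(case.get("time") or 0.0)
    return suites, tests, total_seconds
```

Running it once per job's report directory and subtracting the results gives the Δ column of the table above; filtering out a single `<testcase>` by name before summing is one way to check the effect of the network-dependent outlier.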

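
The short-circuit described under "Why `Run tests` is shorter" can be sketched as below. This is an illustration only, not the actual code in `dev/run-tests.py`: the function `planned_sbt_phases` is hypothetical, and the exact way the script reads `SKIP_SCALA_BUILD` (here, a case-insensitive comparison with `"true"`) is an assumption. Only the three phase names and the variable name come from the description above.

```python
import os


def planned_sbt_phases(env=None):
    """Return the SBT phases that would run, in order, for a given environment.

    Hypothetical sketch: the real script calls the build functions directly;
    here we only list their names to show which ones the flag skips.
    """
    env = os.environ if env is None else env
    # Assumption: the flag counts as set only when its value is "true".
    skip_build = env.get("SKIP_SCALA_BUILD", "").lower() == "true"

    phases = []
    if not skip_build:
        # Both build phases are skipped together when the artifact was extracted.
        phases.append("build_spark_sbt")
        phases.append("build_spark_assembly_sbt")
    # The test phase always runs.
    phases.append("run_scala_tests_sbt")
    return phases
```

With `{"SKIP_SCALA_BUILD": "true"}` only `run_scala_tests_sbt` remains, which matches the observation that the saving comes from the build phases rather than from the tests themselves.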