Shekharrajak opened a new pull request, #3549:
URL: https://github.com/apache/datafusion-comet/pull/3549
## Which issue does this PR close?
Adds a GitHub CI workflow that runs TPC-H benchmarks on a Kind Kubernetes cluster
and validates that Comet achieves a ≥1.1x speedup over the Spark baseline.
Closes #3537
## Rationale for this change
- Run Spark baseline benchmark
- Run Comet benchmark
- **Validate speedup ≥ 1.1x (10% improvement)**
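
The speedup check above can be sketched roughly as follows. This is a minimal illustration, not the PR's `compare-results.py`: the function names `speedup` and `meets_threshold` are hypothetical, and it assumes each run has already been reduced to a single elapsed time in seconds (the actual result JSON layout is not shown in this description).

```python
def speedup(spark_seconds: float, comet_seconds: float) -> float:
    """Return how many times faster the Comet run was than the Spark baseline.

    Hypothetical helper: assumes both arguments are wall-clock times in seconds.
    """
    if spark_seconds <= 0 or comet_seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return spark_seconds / comet_seconds


def meets_threshold(spark_seconds: float, comet_seconds: float,
                    min_speedup: float = 1.1) -> bool:
    """Return True when the measured speedup reaches the minimum threshold."""
    return speedup(spark_seconds, comet_seconds) >= min_speedup


if __name__ == "__main__":
    # Illustrative numbers only, not measured benchmark results.
    spark_s, comet_s = 12.0, 10.0
    ratio = speedup(spark_s, comet_s)
    status = "PASS" if meets_threshold(spark_s, comet_s) else "FAIL"
    print(f"speedup = {ratio:.2f}x -> {status}")
```

Under this definition, a 1.1x speedup means the Comet run takes at most 1/1.1 (about 91%) of the Spark baseline's time.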
## What changes are included in this PR?
The workflow triggers on PRs modifying:
- `native/**/*.rs`
- `spark/**/*.scala`
- `spark/**/*.java`
## How are these changes tested?
## Local Testing
```bash
# Setup cluster
./hack/k8s-benchmark-setup.sh
# Run benchmarks
./benchmarks/scripts/run-k8s-benchmark.sh spark q1
./benchmarks/scripts/run-k8s-benchmark.sh comet q1
# Compare results
python3 benchmarks/scripts/compare-results.py \
--spark /tmp/comet-bench-results/spark_q1_result.json \
--comet /tmp/comet-bench-results/comet_q1_result.json \
--min-speedup 1.1
# Cleanup
./hack/k8s-benchmark-setup.sh --delete
```