andygrove opened a new pull request, #1688:
URL: https://github.com/apache/datafusion-ballista/pull/1688

   # Which issue does this PR close?
   
   Closes #.
   
   # Rationale for this change
   
   We currently have no CI coverage that exercises a real (multi-process) 
Ballista cluster end-to-end on every PR. The existing 
`verify-benchmark-results` job runs the benchmark crate's unit-style tests, and 
`dev/integration-tests.sh` is docker-based and not wired into CI. As a result, 
regressions in the scheduler/executor gRPC path, Arrow Flight shuffle, or 
distributed query planning can land on `main` without being caught until 
someone manually runs the integration script.
   
   # What changes are included in this PR?
   
   Adds `.github/workflows/tpch.yml`, a new workflow that runs on every push, 
on every pull request, and on manual dispatch. The job:
   
   - Builds `ballista-scheduler`, `ballista-executor`, and 
`ballista-benchmarks` with `--profile release-nonlto`.
   - Installs `tpchgen-cli` and generates TPC-H SF10 Parquet (16 partitions) 
into the runner's temp dir.
   - Starts a real cluster as background processes on `ubuntu-latest`: one 
scheduler and one executor (`--concurrent-tasks 4 --memory-pool-size 2GB`), 
with a `trap` that tails both logs and kills both PIDs on exit.
   - Polls TCP ports `50050` and `50051` for readiness (30s timeout).
   - Runs all 21 supported TPC-H queries (1..22 minus q16, matching 
`benchmarks/run.sh`) once each via `tpch benchmark ballista`, with `set -euo 
pipefail` so the first non-zero exit fails the job.
   - Uploads scheduler and executor logs as an artifact on failure.
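
For reviewers who want a picture of the cluster lifecycle steps above, here is a minimal sketch of the exit `trap` and the TCP readiness poll. It is not copied from `tpch.yml`: the names `LOG_DIR`, `SCHEDULER_PID`, `EXECUTOR_PID`, `cleanup`, and `wait_for_port`, the log file names, and the 50-line tail are assumptions made for illustration. The connection check relies on bash's `/dev/tcp` redirection, so it assumes the step runs under bash.

```shell
# Sketch only: LOG_DIR, SCHEDULER_PID and EXECUTOR_PID are hypothetical names,
# not taken from the workflow file.
LOG_DIR="${LOG_DIR:-${TMPDIR:-/tmp}}"

# On exit, print the end of each log (if present) and stop both processes.
cleanup() {
  for log in "$LOG_DIR/scheduler.log" "$LOG_DIR/executor.log"; do
    if [ -f "$log" ]; then
      echo "=== tail of $log ==="
      tail -n 50 "$log"
    fi
  done
  for pid in "${SCHEDULER_PID:-}" "${EXECUTOR_PID:-}"; do
    if [ -n "$pid" ]; then
      # The process may already be gone; do not fail the cleanup on that.
      kill "$pid" 2>/dev/null || true
    fi
  done
}
trap cleanup EXIT

# Wait until host:port accepts a TCP connection, retrying once per second.
# Returns 0 on success and 1 once the given number of attempts is used up.
wait_for_port() {
  host=$1
  port=$2
  timeout=$3
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Opening /dev/tcp/HOST/PORT in a subshell is a bash feature; the
    # subshell exits non-zero if the connection cannot be made.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "timed out waiting for $host:$port" >&2
  return 1
}
```

After starting the scheduler and executor in the background and recording their PIDs, a step could call `wait_for_port localhost 50050 30` and `wait_for_port localhost 50051 30` before running any queries.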
   
   No Docker, no reuse of `dev/integration-tests.sh`. The workflow file is 
self-contained.
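
The query step described in the list above (queries 1..22 minus q16, each run once, stopping at the first failure) could look roughly like the sketch below. The function names `tpch_queries` and `run_all_queries` are hypothetical, and the benchmark command is passed in by the caller because the exact `tpch benchmark ballista` arguments are not reproduced here. The explicit `|| return` makes the loop stop on the first failure even without `set -e`; the workflow itself relies on `set -euo pipefail` for that.

```shell
# Print the query numbers to run: 1 through 22, skipping 16.
tpch_queries() {
  q=1
  while [ "$q" -le 22 ]; do
    if [ "$q" -ne 16 ]; then
      echo "$q"
    fi
    q=$((q + 1))
  done
}

# Run the given command once per query, passing the query number as the
# last argument. Stops and returns the failing status on the first error.
run_all_queries() {
  for q in $(tpch_queries); do
    echo "running q$q"
    "$@" "$q" || return $?
  done
}
```

For a dry run that only prints what would be executed, `run_all_queries echo query` is enough; in CI the arguments would be the real benchmark invocation ending in its query-number flag.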
   
   # Are there any user-facing changes?
   
   No. CI-only change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

