gabotechs opened a new pull request, #1611: URL: https://github.com/apache/datafusion-python/pull/1611
## Quick Try

Run two workers in separate terminals:

```bash
python examples/distributed-localhost-worker.py 50051
python examples/distributed-localhost-worker.py 50052
```

Then run the query or print the distributed plan:

```bash
curl -LO https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2021-01.parquet
WORKERS=50051,50052 python examples/distributed-run.py --plan yellow_tripdata_2021-01.parquet
WORKERS=50051,50052 python examples/distributed-run.py yellow_tripdata_2021-01.parquet
```

# Which issue does this PR close?

N/A.

# Rationale for this change

This lets Python users wire `datafusion-distributed` into `datafusion-python`: discover workers from Python, spawn worker servers, and inspect distributed plans.

One integration wrinkle: upstream examples usually build from `SessionStateBuilder`, while this package exposes `SessionConfig`/`SessionContext` as the main public path. This PR installs the distributed planner when a `SessionContext` is built from a distributed config.

# What changes are included in this PR?

- Adds Python-facing `WorkerResolver`, `Worker`, and `WorkerQueryContext` support.
- Adds distributed planner wiring and distributed worker server bindings.
- Adds `ExecutionPlan.display_distributed(metrics_format=...)` using upstream ASCII plan rendering.
- Adds localhost worker/run examples, including `--plan`.
- Adds focused tests for the new API surface.

# Are there any user-facing changes?

Yes. New distributed APIs are exposed from Python, plus new examples. No intended breaking changes.
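For readers wondering how the `WORKERS=50051,50052` variable from the Quick Try commands might map to worker addresses, here is a minimal sketch. It is not taken from the PR: the helper name `worker_urls_from_env`, the `http://localhost:<port>` URL shape, and the validation rules are all assumptions made for illustration, and the sketch deliberately avoids calling any of the new distributed APIs, whose signatures are not shown in this description.

```python
import os


def worker_urls_from_env(host="localhost", env=None):
    """Turn a comma-separated WORKERS port list into a list of URLs.

    Hypothetical helper: the name, the default host, and the http://
    scheme are assumptions, not part of the PR.
    """
    env = os.environ if env is None else env
    raw = env.get("WORKERS", "")

    urls = []
    for token in raw.split(","):
        token = token.strip()
        if not token:
            # Tolerate empty entries such as a trailing comma.
            continue
        port = int(token)  # raises ValueError on non-numeric input
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        urls.append(f"http://{host}:{port}")
    return urls


if __name__ == "__main__":
    # Pass an explicit mapping so the demo does not depend on the shell.
    for url in worker_urls_from_env(env={"WORKERS": "50051,50052"}):
        print(url)
```

A list like this is the kind of value a worker-discovery callback could plausibly hand back, but the actual `WorkerResolver` interface should be taken from the PR's examples and tests rather than from this sketch.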
