akshayjadiyanv commented on PR #38701: URL: https://github.com/apache/beam/pull/38701#issuecomment-4557933117
Thanks Danny. Pushed `6f8849169b2`: `vllm.dockerfile.old` is bumped to `apache-beam[gcp]==2.71.0` with the matching `COPY` source, and `ai-dynamo[vllm]` and `etcd v3.5.13` are added.

The Dynamo IT in `common.gradle` is uncommented; it now inherits `n1-standard-4` from `argMap` (like the existing vLLM ITs) and uses `nvidia-tesla-t4` per your suggestion. No separate GPU pool: the GPU is requested per job via the `worker_accelerator` experiment.

Validated end-to-end on Dataflow against an image built from the updated dockerfile, with `--sdk_container_image=<my-image>`. Job `2026-05-27_10_43_15-18102219387511428172` finished `JOB_STATE_DONE` with 5 coherent Qwen3-0.6B completions in GCS. The `nvext.timing` field on every `Completion` confirms the Dynamo frontend served the request (vanilla vLLM's OpenAI server doesn't emit it). Worker logs confirm a Tesla T4 was attached.

**Ask**: when convenient, please rebuild `us.gcr.io/apache-beam-testing/python-postcommit-it/vllm:latest` from the updated dockerfile so the new Dynamo IT in `common.gradle` (which reuses that tag) can actually run in `apache-beam-testing`. Happy to coordinate on a staged tag first if you'd prefer.
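
For reference, a minimal sketch of how the per-job GPU request described above might be assembled as pipeline flags. The helper name `build_dataflow_args` is hypothetical, and the exact experiment string used in `common.gradle` is not shown in this comment; the `worker_accelerator=type:...;count:...;install-nvidia-driver` form below follows the documented Dataflow GPU syntax and is an assumption about what the IT passes.

```python
def build_dataflow_args(
    sdk_container_image,
    machine_type="n1-standard-4",
    accelerator_type="nvidia-tesla-t4",
    accelerator_count=1,
):
    """Build Dataflow flags that request a GPU per job (hypothetical helper).

    No dedicated GPU pool is involved: the accelerator is attached to the
    workers of this one job through the worker_accelerator experiment.
    """
    # Assumed experiment string; the value in common.gradle may differ.
    accelerator = (
        f"worker_accelerator=type:{accelerator_type};"
        f"count:{accelerator_count};install-nvidia-driver"
    )
    return [
        "--runner=DataflowRunner",
        f"--machine_type={machine_type}",
        f"--sdk_container_image={sdk_container_image}",
        f"--experiments={accelerator}",
    ]


# "<my-image>" is the same placeholder used in the comment, not a real image.
args = build_dataflow_args("<my-image>")
for flag in args:
    print(flag)
```

A list like this could be handed to `apache_beam.options.pipeline_options.PipelineOptions(args)`; project, region, and staging flags are omitted here and would still be required for a real run.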
