Hi Beam community,

I’m drafting a GSoC 2026 proposal based on the existing Beam idea "A
learning path to using accelerators with Beam" (mentor: Pablo
Estrada). I’d love to share my implementation draft and ask for any
technical feedback from the community.

Draft doc:
https://docs.google.com/document/d/1XvgS9k3ErjdrXdID-aCpDG28g4ylFEb4/edit

Summary (aligned with the idea):
A progressive set of examples that starts from a local CPU baseline,
moves to GPU speedups on Dataflow, then to accelerator-backed training
(GPU/TPU), and finally to orchestrating parallel training runs (e.g.,
hyperparameter sweeps), plus a short guide and lightweight CI/smoke
tests to keep the examples up to date.

A few specific prompts to get discussion started:
Q1: Does the staged progression (CPU -> GPU -> TPU training ->
parallel sweeps) feel like the right "learning path" for Beam ML
users?
Q2: Any concerns with using --worker_accelerator for provisioning
while using resource_hints to annotate accelerator-benefiting
transforms?
Q3: Is the proposed "continuous freshness" approach (nightly mocks +
periodic Dataflow smoke runs) reasonable for examples in the Beam
repo?

Thanks in advance for any thoughts or pointers to existing
patterns/docs I should align with.

Best regards,
Elia
