Re: [Proposal] Beam ML containers

Robert Bradshaw Tue, 01 Jul 2025 11:31:57 -0700

On Tue, Jul 1, 2025 at 10:32 AM Kenneth Knowles <k...@apache.org> wrote:
>
> Obligatory question: can we automate this? Specifically: can we publish the 
> ML-specific containers and then use them as appropriate without making it a 
> user-facing knob?

+1

Transforms can declare their own environments. The only problem with
this is that distinct environments prohibit fusion--we need a way to
say that a given environment is a superset of another. (We can do this
with dependencies, but not with arbitrary docker images.) (One could
possibly get away with the "AnyOf" environment as the base environment
as well, if we define (and enforce) a preference order.)
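
To make the superset idea concrete, here is a rough sketch of the check a
fuser would need. It is a toy model, not Beam's actual API: `Env` and
`fused_env` are hypothetical names, and an environment is reduced to an
image name plus a set of declared dependencies.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Env:
    """Hypothetical stand-in for an environment: an image plus declared deps."""
    image: str
    deps: FrozenSet[str] = frozenset()


def fused_env(a: Env, b: Env) -> Optional[Env]:
    """Return an environment both stages could share, or None if they cannot fuse."""
    if a.image != b.image:
        # Arbitrary docker images are opaque: we cannot tell whether one
        # contains everything the other does, so we refuse to fuse.
        return None
    if a.deps >= b.deps:
        return a  # a is a superset of b
    if b.deps >= a.deps:
        return b  # b is a superset of a
    return None  # neither dependency set contains the other


base = Env("beam-python")
ml = Env("beam-python", frozenset({"torch"}))
custom = Env("my-custom-image")

print(fused_env(base, ml))      # the ML environment can host both stages
print(fused_env(base, custom))  # None: distinct images, no known ordering
```

With declared dependencies the superset relation is a simple set comparison;
with opaque images there is nothing to compare, which is why some explicit
ordering would be needed.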

This being the messy world of ML, would these images be
machine/accelerator agnostic?

> Kenn
>
> On Mon, Jun 30, 2025 at 12:07 PM Danny McCormick via dev 
> <dev@beam.apache.org> wrote:
>>
>> Hey everyone, I'd like to propose publishing some ML-specific Beam 
>> containers alongside our normal base containers. The end result would be 
>> allowing users to specify `--sdk_container_image=ml` or 
>> `--sdk_container_image=gpu` so that their jobs run in containers which work 
>> well with ML/GPU jobs.
>>
>> I put together a tiny design, please take a look and let me know what you 
>> think.
>>
>> https://docs.google.com/document/d/1JcVFJsPbVvtvaYdGi-DzWy9PIIYJhL7LwWGEXt2NZMk/edit?usp=sharing
>>
>> Thanks,
>> Danny
