Hi Luke - Thank you so much for the response. To describe my use case: I have a Spark cluster running locally with Docker Compose, and the Spark docker image has Beam installed. I am planning on using Beam Python. My concern with running a docker image inside a ParDo is the "docker-in-docker" problem, as you can tell from the setup I have. Given this, do you recommend switching from Python to Java and using ProcessBuilder with a "docker run" command? Or is there a ProcessBuilder equivalent in Beam Python? Thank you for your time and patience in answering my questions.
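For context, this is roughly what I have in mind on the Python side - a DoFn that shells out to "docker run" using the subprocess module (the DoFn name, image name, and arguments below are just placeholders for illustration):

    import subprocess

    import apache_beam as beam


    class RunDockerImage(beam.DoFn):
        """Placeholder DoFn that invokes "docker run" once per element."""

        def process(self, element):
            # "my-tool:latest" and the element-derived argument are placeholders.
            result = subprocess.run(
                ["docker", "run", "--rm", "my-tool:latest", str(element)],
                capture_output=True,
                text=True,
                check=True,  # raise if the container exits non-zero
            )
            yield result.stdout


    with beam.Pipeline() as pipeline:
        (
            pipeline
            | beam.Create(["input-1", "input-2"])
            | beam.ParDo(RunDockerImage())
        )

(The subprocess module is just my guess at the Python counterpart to ProcessBuilder.)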
Sincerely,
Mahesh

On Thu, Jan 6, 2022 at 2:45 PM Luke Cwik <lc...@google.com> wrote:

> Some runners use docker containers to encapsulate the "worker". With these
> runners you might run into the docker in docker problems and it may be
> simpler to compose the docker container with the Apache Beam docker
> container instead.
>
> For other runners that execute without docker then as long as docker is
> installed and available on all the worker nodes then it should work. This
> wouldn't be much different than launching an application and managing its
> lifetime from within the transform using standard ways to launch
> applications such as Java's ProcessBuilder[1].
>
> 1: https://docs.oracle.com/javase/7/docs/api/java/lang/ProcessBuilder.html
>
> On Tue, Jan 4, 2022 at 8:13 PM Mahesh Vangala <vangalamahe...@gmail.com>
> wrote:
>
>> Hello Beam community -
>>
>> Is it possible to run docker image as part of a transformation step?
>> Could you point me to any examples with using "docker run" as a step in
>> the pipeline?
>> Let me know.
>>
>> Thank you,
>> Mahesh
>>