Option 4) We are also thinking about adding a process-based SDK harness. This would avoid the docker-in-docker scenario. A process-based SDK harness also has other applications and might be desirable in some production use cases.

On Mon, Aug 20, 2018 at 11:49 AM Henning Rohde wrote:

> Option 3) would be to map in the docker binary and socket to allow the
> containerized Flink job server to start "sibling" containers on the host.
> That avoids both docker-in-docker (which is indeed undesirable) and
> extra requirements for each SDK to spin up containers -- notably, if the
> runner supports auto-scaling or similar non-trivial configurations, that
> would be difficult to manage from the SDK side.
>
> Henning
>
> On Mon, Aug 20, 2018 at 8:31 AM Maximilian Michels wrote:
>
>> Hi everyone,
>>
>> I wanted to get your opinion on the Job-Server startup [1], which is part
>> of the portability story.
>>
>> I've created a docker container to bring up Beam's Job Server, which is
>> the entry point for pipeline execution. Generally, this works fine when
>> the backend (Flink in this case) runs externally and the Job Server
>> connects to it.
>>
>> For tests or pipeline development we may want the backend to run
>> embedded (inside the Job Server), which is rather problematic because
>> portability requires spinning up the SDK harness in a Docker container as
>> well. This would happen at runtime inside the Docker container.
>>
>> Since Docker inside Docker is not desirable, I'm thinking about other
>> options:
>>
>> Option 1) Instead of a Docker container, we start a bundled Job-Server
>> binary (or jar) when we run the pipeline. The bundle also contains an
>> embedded variant of the backend. For Flink, this is basically the output
>> of `:beam-runners-flink_2.11-job-server:shadowJar` but it is started
>> during pipeline execution.
>>
>> Option 2) In addition to the Job Server, we let the SDK spin up another
>> Docker container with the backend. This may be the most applicable to all
>> types of backends, since not all backends offer an embedded execution mode.
>>
>> Keep in mind that this is only a problem for local/test execution, but it
>> is an important aspect of Beam's usability.
>>
>> What do you think? I'm leaning towards option 2. Maybe you have other
>> options in mind.
>>
>> Cheers,
>> Max
>>
>> [1] https://issues.apache.org/jira/browse/BEAM-4130
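
For readers unfamiliar with the "sibling container" approach in Option 3, here is a minimal sketch of what mapping in the docker binary and socket could look like. It is not taken from the Beam code base: the function name, the image name, and the binary path /usr/bin/docker are assumptions for illustration; only the `docker run -v` bind-mount syntax and the default Linux socket path /var/run/docker.sock are standard Docker behaviour.

```python
import shlex

# Default location of the Docker daemon socket on Linux hosts.
DOCKER_SOCKET = "/var/run/docker.sock"


def sibling_container_command(image, docker_binary="/usr/bin/docker"):
    """Build a `docker run` argv that shares the host's Docker daemon.

    `image` and `docker_binary` are hypothetical placeholders; adjust both
    to the actual job server image and the host's docker client location.
    """
    return [
        "docker", "run", "--rm",
        # Share the daemon socket so containers started from inside this
        # container are created by the host daemon, as siblings.
        "-v", f"{DOCKER_SOCKET}:{DOCKER_SOCKET}",
        # Map the host's docker client binary into the container, read-only.
        "-v", f"{docker_binary}:{docker_binary}:ro",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name, used only to show the resulting command line.
    print(shlex.join(sibling_container_command("example/flink-job-server:latest")))
```

Because the container talks to the host daemon, any SDK harness container it starts runs next to the job server container rather than nested inside it, which is what avoids docker-in-docker.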
