This is awesome! Any chance you could roadmap the PR for us with some links into the most interesting bits?

On Fri, Jan 20, 2017 at 12:19 PM, Robert Bradshaw <[email protected]> wrote:

> Also, note that we can still support the "simple" case. For example,
> if the user supplies us with a jar file (as they do now) a runner
> could launch it as a subprocess and communicate with it via this
> same Fn API or install it in a fixed container itself--the user
> doesn't *need* to know about docker or manually manage containers (and
> indeed the Fn API could be used in-process, cross-process,
> cross-container, and even cross-machine).
>
> However docker provides a nice cross-language way of specifying the
> environment including all dependencies (especially for languages like
> Python or C where the equivalent of a cross-platform, self-contained
> jar isn't as easy to produce) and is strictly more powerful and
> flexible (specifically it isolates the runtime environment and one can
> even use it for local testing).
>
> Slicing a worker up like this without sacrificing performance is an
> ambitious goal, but essential to the story of being able to mix and
> match runners and SDKs arbitrarily, and I think this is a great start.
>
>
> On Fri, Jan 20, 2017 at 9:39 AM, Lukasz Cwik <[email protected]> wrote:
>
> > You're correct, a docker container is created that contains the execution
> > environment the user wants or the user re-uses an existing one (allowing
> > for a user to embed all their code/dependencies or use a container that can
> > deploy code/dependencies on demand).
> > A user creates a pipeline saying which docker container they want to use
> > (this starts to allow for multiple container definitions within a single
> > pipeline to support multiple languages, versioning, ...).
> > A runner would then be responsible for launching one or more of these
> > containers in a cluster manager of their choice (scaling up or down the
> > number of instances depending on demand/load/...).
> > A runner then interacts with the docker containers over the gRPC service
> > definitions to delegate processing to them.
> >
> >
> > On Fri, Jan 20, 2017 at 4:56 AM, Jean-Baptiste Onofré <[email protected]>
> > wrote:
> >
> >> Hi Luke,
> >>
> >> that's really great and very promising!
> >>
> >> It's really ambitious but I like the idea. Just to clarify: the purpose of
> >> using gRPC is once the docker container is running, then we can "interact"
> >> with the container to spread and delegate processing to the docker
> >> container, correct?
> >> The users/devops have to set up the docker containers as a prerequisite.
> >> Then, the "location" of the containers (kind of container registry) is set
> >> via the pipeline options and used by gRPC?
> >>
> >> Thanks Luke!
> >>
> >> Regards
> >> JB
> >>
> >>
> >> On 01/19/2017 03:56 PM, Lukasz Cwik wrote:
> >>
> >>> I have been prototyping several components towards the Beam technical
> >>> vision of being able to execute an arbitrary language using an arbitrary
> >>> runner.
> >>>
> >>> I would like to share this overview [1] of what I have been working
> >>> towards. I also share this PR [2] with a proposed API, service definitions
> >>> and partial implementation.
> >>>
> >>> 1: https://s.apache.org/beam-fn-api
> >>> 2: https://github.com/apache/beam/pull/1801
> >>>
> >>> Please comment on the overview within this thread, and any specific code
> >>> comments on the PR directly.
> >>>
> >>> Luke
> >>>
> >>>
> >> --
> >> Jean-Baptiste Onofré
> >> [email protected]
> >> http://blog.nanthrax.net
> >> Talend - http://www.talend.com
> >>
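
To make the flow discussed in this thread concrete (a pipeline names a container per stage, a runner starts one harness per distinct container and delegates processing to it), here is a rough Python sketch. It is not taken from the overview or the PR: every name in it (Environment, Stage, InProcessHarness, plan_environments, run) and both image names are hypothetical, no docker container is actually launched, and a plain in-process method call stands in for the gRPC service definitions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names below are hypothetical illustrations, not part of the Beam API or the PR.


@dataclass(frozen=True)
class Environment:
    """Identifies the docker image holding a stage's SDK, code and dependencies."""
    container_image: str


@dataclass(frozen=True)
class Stage:
    name: str
    environment: Environment
    fn: Callable[[int], int]  # stand-in for user code that would live in the container


class InProcessHarness:
    """Stand-in for an SDK harness. A real one would run in a container behind gRPC."""

    def __init__(self, environment: Environment, stages: List[Stage]) -> None:
        self.environment = environment
        self._fns = {stage.name: stage.fn for stage in stages}

    def process_bundle(self, stage_name: str, elements: List[int]) -> List[int]:
        # Apply the named stage's function to every element of the bundle.
        fn = self._fns[stage_name]
        return [fn(element) for element in elements]


def plan_environments(stages: List[Stage]) -> Dict[Environment, InProcessHarness]:
    """Group stages by environment and create one harness per distinct image."""
    grouped: Dict[Environment, List[Stage]] = {}
    for stage in stages:
        grouped.setdefault(stage.environment, []).append(stage)
    return {env: InProcessHarness(env, members) for env, members in grouped.items()}


def run(stages: List[Stage], elements: List[int]) -> List[int]:
    """Runner side: delegate each stage, in order, to the harness for its environment."""
    harnesses = plan_environments(stages)
    for stage in stages:
        elements = harnesses[stage.environment].process_bundle(stage.name, elements)
    return elements


# Two made-up images, to show a single pipeline mixing environments.
java_env = Environment("example/java-sdk-harness")
python_env = Environment("example/python-sdk-harness")

example_stages = [
    Stage("double", java_env, lambda x: x * 2),
    Stage("increment", python_env, lambda x: x + 1),
    Stage("square", java_env, lambda x: x * x),
]
```

Calling run(example_stages, [1, 2, 3]) pushes the elements through all three stages while only two harnesses are created, one per distinct image. Replacing InProcessHarness with a client for a containerized harness would be the cross-process or cross-container variant Robert mentions; scaling the number of instances with load is left out of the sketch.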
