Re: Graal instead of docker?

Robert Bradshaw Sat, 05 May 2018 15:59:15 -0700

Portability, at its core, is providing a spec for any runner to talk to any
SDK. I personally think it's done a great job in cleaning up the model by
forcing us to define a clean boundary (as specified at
https://github.com/apache/beam/tree/master/model ) between these two
components (even if the implementations of one or the other are
(temporarily, I hope for the most part) complicated). The pipeline (on the
runner submission side) and work execution (on what has traditionally been
called the fn api side) have concrete platform-independent descriptions,
rather than being a set of Java classes.


Currently, the portion that lives on the "runner" side of this boundary is
often shared among Java runners (via libraries like runners core), but it
is all still part of each runner, and because of this it removes the
requirement for the runner to be Java, just as it removes the requirement
for the SDK to speak Java. (For example, I think a Python Dask runner makes
a lot of sense, Dataflow may decide to implement larger portions of its
runner in Go or C++ or even behind a service, and I've used the Python
ULRunner to run the Java SDK over the Fn API for testing and development
purposes.)

There is also the question of "why docker?" I actually don't see docker as
all that intrinsic to the protocol; one only needs to be able to define and
bring up workers that communicate on specified ports. Docker happens to be
a fairly well supported way to package up an arbitrary chunk of code (in
any language), together with its nearly arbitrarily specified
dependencies/environment, in a way that's well specified and easy to start
up.

I would welcome changes to
https://github.com/apache/beam/blob/v2.4.0/model/pipeline/src/main/proto/beam_runner_api.proto#L730
that would provide alternatives to docker (one of which comes to mind is "I
already brought up a worker(s) for you (which could be the same process
that handled pipeline construction in testing scenarios), here's how to
connect to it/them.") Another option, which would seem to appeal to you in
particular, would be "the worker code is linked into the runner's binary,
use this process as the worker" (though note even for java-on-java, it can
be advantageous to shield the worker and runner code from each other's
environments, dependencies, and version requirements.) This latter should
still likely use the FnApi to talk to itself (either over gRPC on local
ports, or possibly better via direct function calls eliminating the RPC
overhead altogether--this is how the fast local runner in Python works).
There may be runner environments well controlled enough that "start up the
workers" could be specified as "run this command line." We should make this
environment message extensible to other alternatives than "docker container
url," though of course we don't want the set of options to grow too large
or we lose the promise of portability unless every runner supports every
protocol.
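
To make the idea of an extensible environment message more concrete, here is
a minimal Python sketch of the alternatives listed above. It is only an
illustration: DockerEnv, ExternalEnv, ProcessEnv, EmbeddedEnv and
describe_startup are hypothetical names and do not come from
beam_runner_api.proto.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class DockerEnv:
    """Worker packaged as a container image (the option that exists today)."""
    container_url: str


@dataclass
class ExternalEnv:
    """Workers were already brought up; the runner only connects to them."""
    endpoints: List[str]


@dataclass
class ProcessEnv:
    """The runner starts workers by running a command line."""
    command: List[str]
    env_vars: Dict[str, str] = field(default_factory=dict)


@dataclass
class EmbeddedEnv:
    """Worker code is linked into the runner; use this process as the worker."""


# Hypothetical closed set of alternatives a runner would have to support.
Environment = Union[DockerEnv, ExternalEnv, ProcessEnv, EmbeddedEnv]


def describe_startup(env: Environment) -> str:
    """Return a description of how a runner would obtain workers for env."""
    if isinstance(env, DockerEnv):
        return f"start container {env.container_url}"
    if isinstance(env, ExternalEnv):
        return "connect to " + ", ".join(env.endpoints)
    if isinstance(env, ProcessEnv):
        return "run " + " ".join(env.command)
    if isinstance(env, EmbeddedEnv):
        return "use in-process worker"
    # A runner that does not know an alternative has to reject the pipeline.
    raise ValueError(f"unsupported environment: {env!r}")
```

Every branch in describe_startup is something each runner would need to
implement, which is why the set of alternatives has to stay small.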

Of course, the runner is always free to execute any Fn for which it
completely understands the URN and the environment any way it pleases, e.g.
directly in process, or even via a lighter-weight mechanism like Jython or
Graal, rather than asking an external process to do it. But we need a
lowest common denominator for executing arbitrary URNs runners are not
expected to understand.
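
As a rough sketch of that fallback, assuming a hypothetical URN and a
hypothetical `external` callback standing in for whatever mechanism hands
work to an SDK worker: the runner runs a Fn in process when it recognizes
the URN, and otherwise delegates. The environment check is left out for
brevity.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical registry of URNs this runner fully understands.
# The URN string is made up for illustration; it is not from the Beam model.
KNOWN_FNS: Dict[str, Callable[[int], int]] = {
    "example:fn:double:v1": lambda x: 2 * x,
}


def execute(
    urn: str,
    elements: Iterable[int],
    external: Callable[[str, List[int]], List[int]],
) -> List[int]:
    """Run the Fn in process if the URN is known, else delegate to a worker."""
    fn = KNOWN_FNS.get(urn)
    if fn is not None:
        # The runner understands this URN and may execute it any way it likes.
        return [fn(x) for x in elements]
    # Lowest common denominator: ask an external worker to execute it.
    return external(urn, list(elements))
```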

As an aside, there are also technical limitations in implementing
Portability by simply requiring all runners to be Java and the portable
layer simply being wrappers of UserFnInLanguageX in an equivalent
UserFnObjectInJava,
executing everything as if it were pure Java. In particular the overheads
of unnecessarily crossing the language boundaries many times in a single
fused graph are often prohibitive.
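
A back-of-the-envelope model of that overhead, under simplifying assumptions
that are mine: each call across the language boundary costs one crossing in
and one out, and a fused stage executed entirely on the SDK side is handed
elements in bundles. Both function names are hypothetical.

```python
def crossings_per_fn_wrapping(num_elements: int, num_fused_fns: int) -> int:
    """Each user Fn is wrapped in its own Java object, so every element
    crosses into the other language and back once per Fn in the fused chain."""
    return 2 * num_elements * num_fused_fns


def crossings_fused_stage(num_elements: int, bundle_size: int) -> int:
    """The whole fused chain runs on the SDK side, so the boundary is
    crossed once in and once out per bundle of elements."""
    num_bundles = -(-num_elements // bundle_size)  # ceiling division
    return 2 * num_bundles
```

With 1000 elements and five fused Fns, the first model gives 10000
crossings; handing the same elements to a fused stage in bundles of 100
gives 20.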

Sorry for the long email, but hopefully this helps shed some light on (at
least how I see) the portability effort (at the core of the Beam vision
statement) as well as concrete actions we can take to decouple it from
specific technologies.

- Robert


On Sat, May 5, 2018 at 2:06 PM Romain Manni-Bucau <[email protected]>
wrote:

> All are good points.

> The only "?" I keep is: why doesn't Beam use its visitor API to make
portability transversal to all runners, "mutating" the user model before
translation? Technically it sounds easy and avoids hacking all the impls.
Was it tested and failed?

> On May 5, 2018 at 22:50, "Thomas Weise" <[email protected]> wrote:

>> Docker isn't a silver bullet and may not be the best choice for all
environments (I'm also looking at potentially launching SDK workers in a
different way), but AFAIK there has not been any alternative proposal for
default SDK execution that can handle all of Python, Go and Java.

>> Regardless of the default implementation, we should strive to keep the
implementation modular so users can plug in their own replacement as
needed. Looking at the prototype implementation, Docker comes downstream of
FlinkExecutableStageFunction, and it will be possible to supply a custom
implementation by making the translator pluggable (which I intend to work
on once backporting to master is complete), and possibly
"SDKHarnessManager" itself can also be swapped out.

>> I would also prefer that for Flink and other Java based runners we
retain the option to inline executable stages that are in Java. I would
expect a good number of use cases to benefit from direct execution in the
task manager, and it may be good to offer the user that optimization.

>> Thanks,
>> Thomas



>> On Sat, May 5, 2018 at 12:54 PM, Eugene Kirpichov <[email protected]>
wrote:

>>> To add on that: Romain, if you are really excited about Graal as a
project, here are some constructive suggestions as to what you can do on a
reasonably short timeframe:
>>> - Propose/prototype a design for writing UDFs in Beam SQL using Graal
>>> - Go through the portability-related design documents, come up with a
more precise assessment of what parts are actually dependent on Docker's
container format and/or on Docker itself, and propose a plan for untangling
this dependency and opening the door to other mechanisms of cross-language
execution

>>> On Sat, May 5, 2018 at 12:50 PM Eugene Kirpichov <[email protected]>
wrote:

>>>> Graal is a very young project, currently nowhere near the level of
maturity or completeness as to be sufficient for Beam to fully bet its
portability vision on it:
>>>> - Graal currently only claims to support Java and Javascript, with
Ruby and R in the status of "some applications may run", Python support
"just beginning", and Go lacking altogether.
>>>> - Regarding existing production usage, the Graal FAQ says it is "a
project with new innovative technology in its early stages."

>>>> That said, as Graal matures, I think it would be reasonable to keep an
eye on it as a potential future lightweight alternative to containers for
pipelines where Graal's level of support is sufficient for this particular
pipeline.

>>>> Please also keep in mind that execution of user code is only a small
part of the overall portability picture, and dependency on Docker is an
even smaller part of that (there is only 1 mention of the word "Docker" in
all of Beam's portability protos, and the mention is in an out-of-date TODO
comment). I hope this addresses your concerns.

>>>> On Sat, May 5, 2018 at 11:49 AM Romain Manni-Bucau <
[email protected]> wrote:

>>>>> Agree

>>>>> The jvm is still mainstream for big data and it is trivial to have a
remote facade to support natives but no point to have it in runners, it is
some particular transforms or even dofn and sources only...


>>>>> On May 5, 2018 at 19:03, "Andrew Pilloud" <[email protected]> wrote:

>>>>>> Thanks for the examples earlier, I think Hazelcast is a great
example of something portability might make more difficult. I'm not working
on portability, but my understanding is that the data sent to the runner is
a blob of code and the name of the container to run it in. A runner with a
native language (java on Hazelcast for example) could run the code directly
without the container if it is in a language it supports. So when Hazelcast
sees a known java container specified, it just loads the java blob and runs
it. When it sees another container it rejects the pipeline. You could use
Graal in the Hazelcast runner to do this for a number of languages. I would
expect that this could also be done in the direct runner, which similarly
provides a native java environment, so portable Java pipelines can be
tested without docker?

>>>>>> For another way to frame this: if Beam was originally written in Go,
we would be having a different discussion. A pipeline written entirely in
java wouldn't be possible, so instead to enable Hazelcast, we would have to
be able to run the java from portability without running the container.

>>>>>> Andrew

>>>>>> On Sat, May 5, 2018 at 1:48 AM Romain Manni-Bucau <
[email protected]> wrote:



>>>>>>> 2018-05-05 9:27 GMT+02:00 Ismaël Mejía <[email protected]>:

>>>>>>>> Graal would not be a viable solution for the reasons Henning and
Andrew
>>>>>>>> mentioned, or put in other words, when users choose a programming
language
>>>>>>>> they don’t choose only a ‘friendly’ syntax or programming model,
they
>>>>>>>> choose also the ecosystem that comes with it, and the libraries
that make
>>>>>>>> their life easier. However isolating these user
libraries/dependencies is a
>>>>>>>> hard problem and so far the standard solution to this problem is
to use
>>>>>>>> operating systems containers via docker.


>>>>>>> Graal solves that, Ismaël. Same kind of experience as running npm
libs on Nashorn, but with a more unified API to run software in any language.



>>>>>>>> The Beam vision from day zero is to run pipelines written in
multiple
>>>>>>>> languages in runners in multiple systems, and so far we are not
doing this
>>>>>>>> in particular in the Apache runners. The portability work is the
cleanest
>>>>>>>> way to achieve this vision given the constraints.


>>>>>>> Hmm, did I read it wrong and we don't have specific integration of
the portable API in runners? This is what is messing up the runners and
limiting beam adoption on existing runners.
>>>>>>> Portable API is a feature buildable on top of runner, not in
runners.
>>>>>>> Same as a runner implementing the 5-6 primitives can run anything,
the portable API should just rely on that and not require more integration.
>>>>>>> It doesn't prevent more deep integrations as for some higher level
primitives existing in runners but it is not the case today for runners so
shouldn't exist IMHO.



>>>>>>>> I agree however that for the Java SDK to Java runner case this can
>>>>>>>> represent additional pain, docker ideally should not be a
requirement for
>>>>>>>> Java users with the Direct runner and debugging a pipeline should
be as
>>>>>>>> easy as it is today. I think the Universal Local Runner exists to
cover
>>>>>>>> the Portable case, but after looking at this JIRA I am not sure if
>>>>>>>> unification is coming (and by consequence if docker would be
mandatory).
>>>>>>>> https://issues.apache.org/jira/browse/BEAM-4239

>>>>>>>> I suppose for the distributed runners that they must implement the
full
>>>>>>>> Portability APIs to be considered Beam multi language compliant
but they
>>>>>>>> can prefer for performance reasons to translate without the
portability
>>>>>>>> APIs the Java to Java case.



>>>>>>> This is my issue, language portability must NOT impact runners at
all, it is just a way to forward primitives to a runner.
>>>>>>> See it as a layer rewriting the pipeline and submitting it. No need
to modify any runner.


>>>>>>>> On Sat, May 5, 2018 at 9:11 AM Reuven Lax <[email protected]> wrote:

>>>>>>>> > A beam cluster with the spark runner would include a spark
cluster, plus
>>>>>>>> what's needed for portability, plus the beam sdk.

>>>>>>>> > On Fri, May 4, 2018, 11:55 PM Romain Manni-Bucau <
[email protected]>
>>>>>>>> wrote:



>>>>>>>> >> On May 5, 2018 at 08:43, "Reuven Lax" <[email protected]> wrote:

>>>>>>>> >> I don't believe we enforce docker anywhere. In fact if someone
wanted to
>>>>>>>> run an all-windows beam cluster, they would probably not use
docker for
>>>>>>>> their runner (docker runs on Windows, but not efficiently).



>>>>>>>> >> Or doesn't run sometimes - a colleague hit that yesterday :(.

>>>>>>>> >> What is a "beam cluster" - as opposed to a Spark or Flink cluster?
How
>>>>>>>> would it work on windows servers?


>>>>>>>> >> On Fri, May 4, 2018, 11:19 PM Romain Manni-Bucau <
[email protected]>
>>>>>>>> wrote:



>>>>>>>> >>> 2018-05-05 2:33 GMT+02:00 Andrew Pilloud <[email protected]>:

>>>>>>>> >>>> What docker really buys is a package format and runtime
environment
>>>>>>>> that is language and operating system agnostic. The docker
packaging and
>>>>>>>> runtime format is the de facto standard for portable applications
such as
>>>>>>>> this, and there is a group trying to turn it into an actual
standard.

>>>>>>>> >>>> I would agree with you that dockerd has become bloated but
there are
>>>>>>>> projects that solve that. There is no longer lock-in to dockerd,
there are
>>>>>>>> package format compatible docker replacements that eliminate the
>>>>>>>> performance issues and overhead associated with docker. CRI-O (
>>>>>>>> https://github.com/kubernetes-incubator/cri-o) is a really cool
RedHat
>>>>>>>> project which is a minimalist replacement for docker. I was
recently
>>>>>>>> working at a startup where I migrated our "data mover" appliance
from
>>>>>>>> Docker to CRI-O. Our application was able to get direct access to
the
>>>>>>>> ethernet driver and block devices which enabled a huge performance
boost
>>>>>>>> but we were also able to run containers produced by docker without
>>>>>>>> modification.

>>>>>>>> >>>> You mention that docker is "detail of one runner+vendor
corrupting all
>>>>>>>> the project and adding complexity and work to everyone". It sounds
like you
>>>>>>>> have a specific example you'd like to share? Is there a runner
that is
>>>>>>>> unable to move to portability because of docker?


>>>>>>>> >>> IBM one for instance, some custom ones like an hazelcast based
one,
>>>>>>>> etc... More generally any runner developed outside beam itself -
even if
>>>>>>>> we take a snapshot today, most of beam's ones have the same pitfall.

>>>>>>>> >>> Note: i never said docker was a bad techno or so. Let me try
to clarify.

>>>>>>>> >>> Main issue is that you enforce docker usage which is still
trendy. It
>>>>>>>> is like Scala, which was promising to kill Java; check what it does
today...
>>>>>>>> >>> It starts to be tooled but it is also very impacting on the
deployment
>>>>>>>> side and for a good number of beam users who deploy it outside the
cloud it
>>>>>>>> is an issue.
>>>>>>>> >>> Keep in mind beam is embeddable by design, it is not a runner
>>>>>>>> environment and with the docker choice it imposes some environment
which is
>>>>>>>> inconsistent with beam design itself and this is where this choice
blocks.



>>>>>>>> >>>> Andrew

>>>>>>>> >>>> On Fri, May 4, 2018 at 4:32 PM Henning Rohde <
[email protected]>
>>>>>>>> wrote:

>>>>>>>> >>>>> Romain,

>>>>>>>> >>>>> Docker, unlike selinux, solves a great number of tangible
problems
>>>>>>>> for us with IMO a relatively small tax. It does not have to be the
only
>>>>>>>> way. Some of the concerns you bring up along with possibilities
were also
>>>>>>>> discussed here:
>>>>>>>> https://s.apache.org/beam-fn-api-container-contract. I
>>>>>>>> encourage you to take a look.

>>>>>>>> >>>>> Thanks,
>>>>>>>> >>>>>   Henning


>>>>>>>> >>>>> On Fri, May 4, 2018 at 3:18 PM Romain Manni-Bucau <
>>>>>>>> [email protected]> wrote:



>>>>>>>> >>>>>> On May 4, 2018 at 21:31, "Henning Rohde" <[email protected]> wrote:

>>>>>>>> >>>>>> I disagree with the characterization of docker and the
implications
>>>>>>>> made towards portability. Graal looks like a neat project (and I
never
>>>>>>>> thought I would live to see the phrase "Practical Partial
Evaluation" ..),
>>>>>>>> but it doesn't address the needs of portability. In addition to
Luke's
>>>>>>>> examples, Go and most other languages don't work on it either.
Docker
>>>>>>>> containers also address packaging, OS dependencies, conflicting
versions
>>>>>>>> and distribution aspects in addition to truly universal language
support.


>>>>>>>> >>>>>> This is wrong, docker also has its conflicts, is not
universal
>>>>>>>> (fails on windows and mac easily - as host or not, cloud vendors
put layers
>>>>>>>> limiting or corrupting it, and it is an infra constraint imposed
and a
>>>>>>>> vendor locking not welcomed in beam IMHO).

>>>>>>>> >>>>>> This is my main concern. All the work done looks like an
>>>>>>>> implementation detail of one runner+vendor corrupting all the
project and
>>>>>>>> adding complexity and work to everyone instead of keeping it
localised
>>>>>>>> (technically it is possible).

>>>>>>>> >>>>>> Would you accept it if I forced you to use selinux? Using docker
is the
>>>>>>>> same kind of constraint.


>>>>>>>> >>>>>> That said, it's entirely fine for some runners to use
Jython, Graal,
>>>>>>>> etc to provide a specialized offering similar to the direct
runners, but it
>>>>>>>> would be disjoint from portability IMO.

>>>>>>>> >>>>>> On Fri, May 4, 2018 at 10:14 AM Romain Manni-Bucau <
>>>>>>>> [email protected]> wrote:



>>>>>>>> >>>>>>> On May 4, 2018 at 17:55, "Lukasz Cwik" <[email protected]> wrote:

>>>>>>>> >>>>>>> I did take a look at Graal a while back when thinking
about how
>>>>>>>> execution environments could be defined, my concerns were related
to it not
>>>>>>>> supporting all of the features of a language.
>>>>>>>> >>>>>>> For example, it's typical for Python to load and call native
>>>>>>>> libraries and Graal can only execute C/C++ code that has been
compiled to
>>>>>>>> LLVM.
>>>>>>>> >>>>>>> Also, a good amount of people interested in using ML
libraries will
>>>>>>>> want access to GPUs to improve performance which I believe that
Graal can't
>>>>>>>> support.

>>>>>>>> >>>>>>> It can be a very useful way to run simple lambda functions
written
>>>>>>>> in some language directly without needing to use a docker
environment but
>>>>>>>> you could probably use something even lighter weight than Graal
that is
>>>>>>>> language specific like Jython.



>>>>>>>> >>>>>>> Right, the jsr223 impl works very well but you can also
have a perf
>>>>>>>> boost using native (like v8 java binding for js for instance). It
is way
>>>>>>>> more efficient than docker most of the time and not code intrusive
at all
>>>>>>>> in runners so likely more adoption-able and maintainable. That
said all is
>>>>>>>> doable behind the jsr223 so maybe not a big deal in terms of api.
We just
>>>>>>>> need to ensure the portability work stays clean and actually portable
and doesn't
>>>>>>>> impact runners as the PoCs done until today did.

>>>>>>>> >>>>>>> Works for me.


>>>>>>>> >>>>>>> On Thu, May 3, 2018 at 10:05 PM Romain Manni-Bucau <
>>>>>>>> [email protected]> wrote:

>>>>>>>> >>>>>>>> Hi guys

>>>>>>>> >>>>>>>> For some time there have been efforts toward language-
portable
>>>>>>>> support in Beam, but I can't really find a case where it "works" while
based on
>>>>>>>> docker, except for some vendor-specific infra.

>>>>>>>> >>>>>>>> Current solution:

>>>>>>>> >>>>>>>> 1. Is runner intrusive (which is bad for beam and prevents
>>>>>>>> adoption of big data vendors)
>>>>>>>> >>>>>>>> 2. Based on docker (which assumes a runtime environment
and is
>>>>>>>> very ops/infra intrusive and likely too $$ quite often for what it
brings)

>>>>>>>> >>>>>>>> Has anyone had a look at Graal, which seems a way to make
the
>>>>>>>> feature doable in a lighter manner and optimized compared to
default jsr223
>>>>>>>> impls?
