I want to reiterate one point: the init-container achieves a clear
separation between preparing an application and actually running it.
Kubernetes guarantees that the main containers in a pod are not started
until every init-container has run to completion successfully, so if the
init-container fails, the main container won't be run. I think this is
definitely a positive thing to have. In the case of a Spark application,
the application code and the driver/executor code won't even be run if the
init-container fails to localize any of the dependencies. As a result,
it's much easier for users to figure out what's wrong when their
applications fail to run: they can tell whether the pods made it past
initialization, and if not, simply check the status and logs of the
init-container.
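To make this concrete, here is a rough sketch of the kind of driver pod
spec the init-container approach produces (using the Kubernetes Python
client purely for illustration; the container names, images, paths and
arguments below are made up, not what the actual submission code generates):

    from kubernetes import client

    # Shared scratch volume into which the init-container localizes remote dependencies.
    dep_volume = client.V1Volume(
        name="spark-deps",
        empty_dir=client.V1EmptyDirVolumeSource(),
    )
    dep_mount = client.V1VolumeMount(name="spark-deps", mount_path="/var/spark-data")

    # Init-container: its only job is downloading jars/files; no Spark code runs here.
    init_container = client.V1Container(
        name="spark-init",
        image="spark-init:latest",  # hypothetical image name
        args=["--download", "hdfs://ns/path/app.jar", "/var/spark-data"],
        volume_mounts=[dep_mount],
    )

    # Main container: runs the driver and assumes the dependencies are already local.
    driver_container = client.V1Container(
        name="spark-driver",
        image="spark-driver:latest",  # hypothetical image name
        volume_mounts=[dep_mount],
    )

    driver_pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="my-spark-app-driver"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            init_containers=[init_container],  # must succeed before the driver starts
            containers=[driver_container],
            volumes=[dep_volume],
        ),
    )

If the download fails, the pod shows an Init:Error status and
"kubectl logs <pod> -c spark-init" surfaces exactly what went wrong,
without a single line of Spark code having run.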
Another argument I want to make is that the init-container can easily be
given exclusive use of credentials that are needed to download dependencies
but that are not appropriate to be visible in the main containers and
therefore should not be shared with them. This is not achievable with the
canonical Spark way of handling dependencies. K8s also has built-in support
for dynamically injecting containers into pods through the admission
control process. One use case would be for cluster operators to inject an
init-container (e.g., through an admission webhook) that downloads certain
dependencies requiring access-restricted credentials.
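As a sketch of what I mean (again with the Python client, and with a
hypothetical secret name and key), the credential can be wired into the
init-container only, so the main containers never see it:

    from kubernetes import client

    dep_mount = client.V1VolumeMount(name="spark-deps", mount_path="/var/spark-data")

    # Credential for the artifact store, exposed ONLY to the init-container.
    download_cred = client.V1EnvVar(
        name="ARTIFACT_STORE_TOKEN",
        value_from=client.V1EnvVarSource(
            secret_key_ref=client.V1SecretKeySelector(
                name="artifact-store-secret",  # hypothetical secret
                key="token",
            )
        ),
    )

    init_container = client.V1Container(
        name="spark-init",
        image="spark-init:latest",
        env=[download_cred],        # the init-container can authenticate and download...
        volume_mounts=[dep_mount],
    )

    driver_container = client.V1Container(
        name="spark-driver",
        image="spark-driver:latest",
        volume_mounts=[dep_mount],  # ...while the driver only ever sees the downloaded files
    )

An injecting webhook could patch something like the init_container above
(plus the secret reference) into the pod spec at admission time, which is
not something a driver- or executor-side download path can offer.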

Note that we are not blindly opposed to getting rid of the init-container;
it's just that there are still valid reasons to keep it for now,
particularly given that we don't have a solid story around client mode yet.
Also, given that we have been using it in our fork for over a year, we are
definitely more confident in the current way of handling remote
dependencies, as it has been tested more thoroughly. Since getting rid of
the init-container is such a significant change, I would suggest that we
defer the decision on whether to remove it until 2.4, so that we have a
more thorough understanding of the pros and cons.

On Wed, Jan 10, 2018 at 1:48 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

> On Wed, Jan 10, 2018 at 1:47 PM, Matt Cheah <mch...@palantir.com> wrote:
> >> With a config value set by the submission code, like what I'm doing to
> prevent client mode submission in my p.o.c.?
> >
> > The contract for what determines the appropriate scheduler backend to
> instantiate is then going to be different in Kubernetes versus the other
> cluster managers.
>
> There is no contract for how to pick the appropriate scheduler. That's
> a decision that is completely internal to the cluster manager code
>
> --
> Marcelo
>
