I want to reiterate one point: the init-container achieves a clear separation between preparing an application and actually running it. Kubernetes admission control and scheduling guarantee that if the init-container fails, the main container won't be run. I think this is definitely a positive. In the case of a Spark application, the application code and driver/executor code won't even run if the init-container fails to localize any of the dependencies. The result is that it's much easier for users to figure out what's wrong when their applications fail to run: they can tell whether the pods were initialized, and if not, simply check the status/logs of the init-container.

Another argument I want to make is that we can easily allow the init-container to exclusively use credentials for downloading dependencies that are not appropriate to be visible in the main containers and therefore should not be shared with them. This is not achievable the canonical Spark way. Kubernetes has built-in support for dynamically injecting containers into pods through the admission control process. One use case would be for cluster operators to inject an init-container (e.g., through an admission webhook) for downloading dependencies that require access-restricted credentials.
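As a sketch of the credential-isolation point (all names below are hypothetical, not from the actual Spark-on-K8s implementation): a secret can be mounted only into the init-container's volumeMounts, so the main container shares the localized jars but never sees the credentials, and the kubelet won't start the main container until the init-container exits successfully:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: spark-driver              # hypothetical pod name
spec:
  initContainers:
  - name: spark-init
    image: spark-init:latest      # hypothetical init image
    command: ["/opt/download-deps.sh"]
    volumeMounts:
    - name: download-creds        # secret mounted ONLY here
      mountPath: /etc/creds
      readOnly: true
    - name: spark-jars            # shared scratch volume for localized deps
      mountPath: /var/spark-jars
  containers:
  - name: driver
    image: spark-driver:latest    # hypothetical driver image
    volumeMounts:
    - name: spark-jars            # main container sees the jars, not the creds
      mountPath: /var/spark-jars
  volumes:
  - name: download-creds
    secret:
      secretName: artifactory-creds   # hypothetical secret name
  - name: spark-jars
    emptyDir: {}
```

An admission webhook injecting an init-container would effectively patch a spec like this into the pod at admission time, which is how an operator could add restricted-credential downloads without the submission client knowing about them.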
Note that we are not blindly opposed to getting rid of the init-container; it's just that there are still valid reasons to keep it for now, particularly given that we don't have a solid story around client mode yet. Also, given that we have been using it in our fork for over a year, we are definitely more confident in the current way of handling remote dependencies, as it's been tested more thoroughly. Since getting rid of the init-container is such a significant change, I would suggest that we defer the decision on whether to remove it to 2.4, so we have a more thorough understanding of the pros and cons.

On Wed, Jan 10, 2018 at 1:48 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
> On Wed, Jan 10, 2018 at 1:47 PM, Matt Cheah <mch...@palantir.com> wrote:
>> With a config value set by the submission code, like what I'm doing to
>> prevent client mode submission in my p.o.c.?
>>
>> The contract for what determines the appropriate scheduler backend to
>> instantiate is then going to be different in Kubernetes versus the other
>> cluster managers.
>
> There is no contract for how to pick the appropriate scheduler. That's
> a decision that is completely internal to the cluster manager code
>
> --
> Marcelo