With regard to separation of concerns, there’s a fringe use case here: if a pod has more than one main container, none of them will run if the init-containers fail. A user can have a Pod Preset that attaches additional sidecar containers to the driver and/or executors. In that case, those sidecars may perform side effects that are undesirable if the main Spark application fails because its dependencies weren’t available. Using the init-container to localize the dependencies prevents any of these sidecars from executing at all if the dependencies can’t be fetched.
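To make the gating behavior concrete, here is a minimal sketch in Python. It is a toy model, not Kubernetes or Spark code: the names `Container`, `start_pod`, `fetch-deps`, and `preset-sidecar` are made up for illustration, and the model deliberately ignores restart policies and retries. It only shows that when an init-container fails, neither the main container nor any sidecar gets started.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Container:
    """Hypothetical stand-in for a container; run() returns True on success."""
    name: str
    run: Callable[[], bool]


def start_pod(init_containers: List[Container],
              app_containers: List[Container]) -> List[str]:
    """Toy model of pod startup (assumption: no restart policy, no retries).

    Init-containers run one at a time, in order. If any of them fails,
    no app container (main or sidecar) is started.
    Returns the names of the app containers that were started.
    """
    for init in init_containers:
        if not init.run():
            return []  # init failed: nothing else in the pod runs
    started = []
    for container in app_containers:
        container.run()
        started.append(container.name)
    return started


# Dependency fetch fails, so the driver and the preset sidecar never start.
driver = Container("spark-driver", lambda: True)
sidecar = Container("preset-sidecar", lambda: True)
failed_fetch = Container("fetch-deps", lambda: False)
print(start_pod([failed_fetch], [driver, sidecar]))  # []
```

If the dependencies were instead fetched inside the main container, the sidecar in this model would already be running by the time the fetch failed, which is the side-effect concern described above.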
It’s definitely a niche use case - I’m not sure how often pod presets are used in practice - but it’s an example to illustrate why the separation of concerns can be beneficial.

-Matt Cheah

On 1/10/18, 2:36 PM, "Marcelo Vanzin" <van...@cloudera.com> wrote:

    On Wed, Jan 10, 2018 at 2:30 PM, Yinan Li <liyinan...@gmail.com> wrote:
    > 1. Retries of init-containers are automatically supported by k8s through pod
    > restart policies. For this point, sorry I'm not sure how spark-submit
    > achieves this.

    Great, add that feature to spark-submit, everybody benefits, not just k8s.

    > 2. The ability to use credentials that are not shared with the main
    > containers.

    Not sure what that achieves.

    > 3. Not only the user code, but Spark internal code like Executor won't be
    > run if the init-container fails.

    Not sure what that achieves. Executor will fail if dependency download
    fails, Spark driver will recover (and start a new executor if needed).

    > 4. Easier to build tooling around k8s events/status of the init-container in
    > case of failures as it's doing exactly one thing: downloading dependencies.

    Again, I don't see what is all this hoopla about fine grained control
    of dependency downloads. Spark solved this years ago for Spark
    applications. Don't reinvent the wheel.

    --
    Marcelo