We do support running on Apache Mesos via Docker images, so this would not
be restricted to k8s. But unlike Mesos support, which has other modes of
running, I believe k8s support depends more heavily on the availability of
Docker images.
Regards,
Mridul

On Wed, Nov 29, 2017 at 8:56 AM, Sean Owen <so...@cloudera.com> wrote:
> Would it be logical to provide Docker-based distributions of other
> pieces of Spark, or is this specific to K8S? The problem is that we
> generally wouldn't also provide such a distribution of Spark, for the
> reasons you give: if we did that, then why not RPMs, and so on?
>
> On Wed, Nov 29, 2017 at 10:41 AM Anirudh Ramanathan
> <ramanath...@google.com> wrote:
>>
>> In this context, I think the Docker images are similar to the binaries
>> rather than an extension. It's packaging the compiled distribution to
>> save people the effort of building one themselves, akin to the
>> binaries or the Python package.
>>
>> For reference, this is the base Dockerfile for the main image that we
>> intend to publish. It's not particularly complicated. The driver and
>> executor images are based on that base image and only customize the
>> CMD (any file/directory inclusions are extraneous and will be
>> removed).
>>
>> Is there only one way to build it? That's a bit harder to reason
>> about. The base image, I'd argue, is likely always going to be built
>> that way. For the driver and executor images, there may be cases
>> where people want to customize them (like putting all dependencies
>> into them, for example). In those cases, as long as our images are
>> bare-bones, they can use the spark-driver/spark-executor images we
>> publish as the base and build their customization as a layer on top.
>>
>> I think the composability of Docker images makes this a bit different
>> from, say, Debian packages. We can publish canonical images that
>> serve as both a complete image for most Spark applications and a
>> stable substrate to build customization upon.
>>
>> On Wed, Nov 29, 2017 at 7:38 AM, Mark Hamstra <m...@clearstorydata.com>
>> wrote:
>>>
>>> It's probably also worth considering whether there is only one,
>>> well-defined, correct way to create such an image, or whether this
>>> is a reasonable avenue for customization. Part of why we don't do
>>> something like maintain and publish canonical Debian packages for
>>> Spark is that different organizations doing packaging and
>>> distribution of infrastructures or operating systems can reasonably
>>> want to do this in a custom (or non-customary) way. If there is
>>> really only one reasonable way to do a Docker image, then my bias
>>> starts to tend more toward the Spark PMC taking on the
>>> responsibility to maintain and publish that image. If there is more
>>> than one way to do it, and publishing a particular image is more
>>> just a convenience, then my bias tends more away from maintaining
>>> and publishing it.
>>>
>>> On Wed, Nov 29, 2017 at 5:14 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>> Source code is the primary release; compiled binary releases are
>>>> conveniences that are also released. A Docker image sounds fairly
>>>> different, though. To the extent that it's the standard delivery
>>>> mechanism for some artifact (think: PySpark on PyPI as well), that
>>>> makes sense, but is that the situation? If it's more of an
>>>> extension or alternate presentation of Spark components, that
>>>> typically wouldn't be part of a Spark release. The ones the PMC
>>>> takes responsibility for maintaining ought to be the core,
>>>> critical means of distribution alone.
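To make the layering described in Anirudh's message concrete, here is a
minimal sketch of what the base and driver images could look like. The
linked Dockerfile is not reproduced here; the image contents, names,
tags, paths, and the driver command below are illustrative assumptions,
not the actual published files.

    # --- spark-base (hypothetical) ---
    # Packages a pre-built Spark distribution on top of a small JRE
    # image. Assumes the build context is an unpacked Spark binary
    # distribution.
    FROM openjdk:8-jre-alpine
    ENV SPARK_HOME /opt/spark
    COPY jars ${SPARK_HOME}/jars
    COPY bin  ${SPARK_HOME}/bin
    COPY sbin ${SPARK_HOME}/sbin
    WORKDIR ${SPARK_HOME}

    # --- spark-driver (hypothetical) ---
    # Reuses every layer of the base image and customizes only the
    # command run on container start; the executor image would differ
    # only in its CMD as well. The main class below is a placeholder,
    # not the real entrypoint.
    FROM spark-base:2.3.0
    CMD ["bin/spark-class", "<driver-main-class>"]

Built with a plain "docker build", the driver image adds only a small
metadata layer on top of the base, which is what makes a bare-bones
published image cheap to use as a substrate for customization.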
>>>>
>>>> On Wed, Nov 29, 2017 at 2:52 AM Anirudh Ramanathan
>>>> <ramanath...@google.com.invalid> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> We're all working towards the Kubernetes scheduler backend (full
>>>>> steam ahead!) that's targeted at Spark 2.3. One of the questions
>>>>> that comes up often is Docker images.
>>>>>
>>>>> While we're making Dockerfiles available so that people can build
>>>>> their own Docker images from source, ideally we'd want to publish
>>>>> official Docker images as part of the release process.
>>>>>
>>>>> I understand that the ASF has procedures around this, and we would
>>>>> want to get that started to help us get these artifacts published
>>>>> by 2.3. I'd love to start a discussion around this and hear the
>>>>> community's thoughts.
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Anirudh Ramanathan
>>
>> --
>> Anirudh Ramanathan
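A postscript on the customization path discussed upthread: deriving a
custom image from a published one is only a few lines on the user's
side. A sketch, assuming a published spark-driver image and an
application jar, both of whose names are illustrative:

    # Hypothetical user-side Dockerfile: extend the published driver
    # image and bake application dependencies in as an extra layer.
    FROM spark-driver:2.3.0
    COPY my-app.jar /opt/spark/jars/my-app.jar

Because layers are shared and content-addressed, such derived images
stay cheap to build and distribute, which is the composability argument
made above.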