@Ankit

Thanks for your support! Your questions are very valuable, but this SPIP is
just a starting point to cover the existing apache/spark image features
first. We will also set up a build/test/publish image workflow (to ensure
image quality) and some helper scripts to help developers extend custom
images more easily in the future.

> How do we support deployments of spark-standalone clusters in case the
users want to use the same image for spark-standalone clusters ? Since that
is also widely used.
Yes, it's possible. It can be done by exposing some ports, but we still
need to validate this and then document it in the standalone mode docs.
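For illustration, here is a minimal sketch of a standalone master and
worker running from the image (assuming the apache/spark image name, the
default standalone ports 7077 for the master RPC and 8080 for the web UI,
and a user-created bridge network; not validated yet):

    # shared network so the worker can reach the master by name
    docker network create spark-net

    # standalone master, exposing the RPC port (7077) and web UI (8080)
    docker run -d --name spark-master --network spark-net \
      -p 7077:7077 -p 8080:8080 apache/spark \
      /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master

    # standalone worker, registering with the master over the shared network
    docker run -d --name spark-worker --network spark-net apache/spark \
      /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
      spark://spark-master:7077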

> 2. I am not sure about the End of Support of Hadoop 2 with spark, but if
that is not planned sooner, shouldn't we be making it configurable to be
able to use spark prebuilt with hadoop 2?
DOI requires a static Dockerfile, so it can't be made configurable at
runtime. Of course, all published Spark releases can also be supported as
separate images in principle. For supporting more distributions, we also
plan to add some scripts to help generate the Dockerfiles.
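To make the "separate image per distribution" idea concrete, a rough
sketch of what such scripts could drive (the Dockerfile.* file names and
tags below are hypothetical, not part of this SPIP):

    # one static Dockerfile per Spark distribution, published as its own tag
    docker build -f Dockerfile.hadoop3 -t apache/spark:3.3.0-hadoop3 .
    docker build -f Dockerfile.hadoop2 -t apache/spark:3.3.0-hadoop2 .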

> 3. Also, don't we want to make it feasible for the users to be able to
customize the base linux flavour?
This is also a good point, but it is out of scope for this SPIP.
Currently, we start with Ubuntu (the Debian family, using the apt package
manager). We might also consider supporting more OSes after this SPIP,
such as the RHEL/CentOS/Rocky/openEuler family with the yum/dnf package
manager. But as you know, different OSes have different package versions
and upgrade policies, so maintaining them may not be easy work, though I
think it's possible.
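Just to illustrate why this adds maintenance work, the install steps in
the Dockerfile differ per OS family (a hedged example; exact package names
vary by distro and version):

    # Debian/Ubuntu family (current plan)
    apt-get update && apt-get install -y curl
    # RHEL/CentOS/Rocky/openEuler family (possible future)
    dnf install -y curl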

Regards,
Yikun


On Thu, Sep 22, 2022 at 3:43 AM Ankit Gupta <info.ank...@gmail.com> wrote:

> Hi Yikun
>
> Thanks for all your efforts! This is very much needed. But I have the
> below three questions:
> 1. How do we support deployments of spark-standalone clusters in case the
> users wants to use the same image for spark-standalone clusters ? Since
> that is also widely used.
> 2. I am not sure about the End of Support of Hadoop 2 with spark, but if
> that is not planned sooner, shouldn't we be making it configurable to be
> able to use spark prebuilt with hadoop 2?
> 3. Also, don't we want to make it feasible for the users to be able to
> customise the base linux flavour?
>
> Thanks and Regards.
>
> Ankit Prakash Gupta
>
>
> On Wed, Sep 21, 2022 at 9:19 PM Xiao Li <gatorsm...@gmail.com> wrote:
>
>> +1
>>
>> Yikun Jiang <yikunk...@gmail.com> wrote on Wed, Sep 21, 2022 at 07:22:
>>
>>> Thanks for all your inputs! BTW, I also create a JIRA to track related
>>> work: https://issues.apache.org/jira/browse/SPARK-40513
>>>
>>> > can I be involved in this work?
>>>
>>> @qian Of course! Thanks!
>>>
>>> Regards,
>>> Yikun
>>>
>>> On Wed, Sep 21, 2022 at 7:31 PM Xinrong Meng <xinrong.apa...@gmail.com>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> On Tue, Sep 20, 2022 at 11:08 PM Qian SUN <qian.sun2...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1.
>>>>> It's valuable, can I be involved in this work?
>>>>>
>>>>> Yikun Jiang <yikunk...@gmail.com> wrote on Mon, Sep 19, 2022 at 08:15:
>>>>>
>>>>>> Hi, all
>>>>>>
>>>>>> I would like to start the discussion for supporting Docker Official
>>>>>> Image for Spark.
>>>>>>
>>>>>> This SPIP is proposed to add Docker Official Image(DOI)
>>>>>> <https://github.com/docker-library/official-images> to ensure the
>>>>>> Spark Docker images meet the quality standards for Docker images, to
>>>>>> provide these Docker images for users who want to use Apache Spark via
>>>>>> Docker image.
>>>>>>
>>>>>> There are also several Apache projects that release the Docker
>>>>>> Official Images
>>>>>> <https://hub.docker.com/search?q=apache&image_filter=official>, such
>>>>>> as: flink <https://hub.docker.com/_/flink>, storm
>>>>>> <https://hub.docker.com/_/storm>, solr
>>>>>> <https://hub.docker.com/_/solr>, zookeeper
>>>>>> <https://hub.docker.com/_/zookeeper>, httpd
>>>>>> <https://hub.docker.com/_/httpd> (with 50M+ to 1B+ download for
>>>>>> each). From the huge download statistics, we can see the real demands of
>>>>>> users, and from the support of other apache projects, we should also be
>>>>>> able to do it.
>>>>>>
>>>>>> After support:
>>>>>>
>>>>>>    - The Dockerfile will still be maintained by the Apache Spark
>>>>>>    community and reviewed by Docker.
>>>>>>    - The images will be maintained by the Docker community to ensure
>>>>>>    the quality standards for Docker images of the Docker community.
>>>>>>
>>>>>>
>>>>>> It will also reduce the extra docker images maintenance effort (such
>>>>>> as frequently rebuilding, image security update) of the Apache Spark
>>>>>> community.
>>>>>>
>>>>>> See more in SPIP DOC:
>>>>>> https://docs.google.com/document/d/1nN-pKuvt-amUcrkTvYAQ-bJBgtsWb9nAkNoVNRM2S2o
>>>>>>
>>>>>> cc: Ruifeng (co-author) and Hyukjin (shepherd)
>>>>>>
>>>>>> Regards,
>>>>>> Yikun
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best!
>>>>> Qian SUN
>>>>>
>>>>
