Thanks, Yikun, for the explanation, and thanks, Hyukjin. So, as I understand
it, this establishes the structure so that we can start building the DOIs now
and then steadily add more images over time.

Regards,

Ankit Prakash Gupta

On Thu, Sep 22, 2022 at 7:02 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:

> Given that support, I will start the vote officially.
>
> On Thu, 22 Sept 2022 at 08:40, Yikun Jiang <yikunk...@gmail.com> wrote:
>
>> @Ankit
>>
>> Thanks for your support! Your questions are very valuable, but this SPIP
>> is just a starting point to cover the existing apache/spark image features
>> first. We will also set up a build/test/publish image workflow (to ensure
>> image quality) and some helper scripts to help developers extend custom
>> images more easily in the future.
>>
>> > How do we support deployments of spark-standalone clusters in case the
>> users want to use the same image for spark-standalone clusters? Since that
>> is also widely used.
>> Yes, it's possible; it can be done by exposing some ports, but we still
>> need to validate this and then document it in the standalone mode docs.
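>> As a hypothetical sketch (not part of the SPIP), reusing the same image
>> for a standalone cluster could look roughly like the compose file below;
>> the image name/tag, entry scripts, and ports are assumptions based on
>> Spark's standalone defaults, not something the SPIP specifies:

```yaml
# Hypothetical sketch only: image name/tag and scripts are assumptions.
version: "2"
services:
  spark-master:
    image: apache/spark:latest
    command: /opt/spark/sbin/start-master.sh
    environment:
      - SPARK_NO_DAEMONIZE=true   # keep the process in the foreground
    ports:
      - "7077:7077"               # standalone master RPC port (default)
      - "8080:8080"               # master web UI (default)
  spark-worker:
    image: apache/spark:latest
    command: /opt/spark/sbin/start-worker.sh spark://spark-master:7077
    environment:
      - SPARK_NO_DAEMONIZE=true
    depends_on:
      - spark-master
```

>> The master's RPC port (7077 by default) and web UI (8080) would be the
>> main ones to expose and validate for the standalone mode docs.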
>>
>> > 2. I am not sure about the End of Support of Hadoop 2 with spark, but
>> if that is not planned sooner, shouldn't we be making it configurable to be
>> able to use spark prebuilt with hadoop 2?
>> DOI requires a static Dockerfile, so it can't be configured at runtime. Of
>> course, all published Spark releases can also be supported as separate
>> images in principle. As for supporting more distributions, we also plan to
>> add some scripts to help generate the Dockerfiles.
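>> To illustrate what "static" means here: a DOI Dockerfile pins everything
>> in place, so a Hadoop 2 variant would be a separate generated file rather
>> than a runtime option. The base image, versions, and download URL in this
>> sketch are assumptions for illustration only, not the SPIP's actual file:

```dockerfile
# Hypothetical sketch of a static DOI Dockerfile; all values are pinned.
FROM ubuntu:22.04

ENV SPARK_HOME=/opt/spark

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        openjdk-11-jre-headless wget ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# A generated Hadoop 2 variant would differ only in this pinned URL;
# there is no build-time or runtime switch in a DOI Dockerfile.
RUN wget -qO- https://archive.apache.org/dist/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3.tgz \
        | tar -xz -C /opt && \
    mv /opt/spark-3.3.0-bin-hadoop3 ${SPARK_HOME}

WORKDIR ${SPARK_HOME}
```

>> A helper script would emit one such file per supported distribution.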
>>
>> > 3. Also, don't we want to make it feasible for the users to be able to
>> customize the base linux flavour?
>> This is also a good point, but it is out of scope for this SPIP. Currently,
>> we start with Ubuntu (Debian family, apt package manager). We might also
>> consider supporting more OSes after this SPIP, such as the
>> RHEL/CentOS/Rocky/openEuler series (yum/dnf package managers). But as you
>> know, different OSes have various package versions and upgrade policies, so
>> the maintenance may not be easy, but I think it's possible.
>>
>> Regards,
>> Yikun
>>
>>
>> On Thu, Sep 22, 2022 at 3:43 AM Ankit Gupta <info.ank...@gmail.com>
>> wrote:
>>
>>> Hi Yikun
>>>
>>> Thanks for all your efforts! This is very much needed. But I have the
>>> below three questions:
>>> 1. How do we support deployments of spark-standalone clusters in case
>>> the users want to use the same image for spark-standalone clusters? Since
>>> that is also widely used.
>>> 2. I am not sure about the End of Support of Hadoop 2 with spark, but if
>>> that is not planned sooner, shouldn't we be making it configurable to be
>>> able to use spark prebuilt with hadoop 2?
>>> 3. Also, don't we want to make it feasible for the users to be able to
>>> customise the base linux flavour?
>>>
>>> Thanks and Regards.
>>>
>>> Ankit Prakash Gupta
>>>
>>>
>>> On Wed, Sep 21, 2022 at 9:19 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> Yikun Jiang <yikunk...@gmail.com> wrote on Wed, Sep 21, 2022, at 07:22:
>>>>
>>>>> Thanks for all your inputs! BTW, I also create a JIRA to track related
>>>>> work: https://issues.apache.org/jira/browse/SPARK-40513
>>>>>
>>>>> > can I be involved in this work?
>>>>>
>>>>> @qian Of course! Thanks!
>>>>>
>>>>> Regards,
>>>>> Yikun
>>>>>
>>>>> On Wed, Sep 21, 2022 at 7:31 PM Xinrong Meng <xinrong.apa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> On Tue, Sep 20, 2022 at 11:08 PM Qian SUN <qian.sun2...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> +1.
>>>>>>> It's valuable, can I be involved in this work?
>>>>>>>
>>>>>>> Yikun Jiang <yikunk...@gmail.com> wrote on Mon, Sep 19, 2022, at 08:15:
>>>>>>>
>>>>>>>> Hi, all
>>>>>>>>
>>>>>>>> I would like to start the discussion on supporting a Docker Official
>>>>>>>> Image for Spark.
>>>>>>>>
>>>>>>>> This SPIP proposes adding a Docker Official Image (DOI)
>>>>>>>> <https://github.com/docker-library/official-images> to ensure that
>>>>>>>> the Spark Docker images meet the quality standards for Docker images,
>>>>>>>> and to provide these images for users who want to use Apache Spark
>>>>>>>> via a Docker image.
>>>>>>>>
>>>>>>>> Several other Apache projects already release Docker Official Images
>>>>>>>> <https://hub.docker.com/search?q=apache&image_filter=official>,
>>>>>>>> such as flink <https://hub.docker.com/_/flink>, storm
>>>>>>>> <https://hub.docker.com/_/storm>, solr
>>>>>>>> <https://hub.docker.com/_/solr>, zookeeper
>>>>>>>> <https://hub.docker.com/_/zookeeper>, and httpd
>>>>>>>> <https://hub.docker.com/_/httpd> (with 50M+ to 1B+ downloads each).
>>>>>>>> The huge download statistics show the real demand from users, and the
>>>>>>>> support in other Apache projects shows that we should be able to do
>>>>>>>> it as well.
>>>>>>>>
>>>>>>>> After support:
>>>>>>>>
>>>>>>>> - The Dockerfile will still be maintained by the Apache Spark
>>>>>>>>   community and reviewed by Docker.
>>>>>>>> - The images will be maintained by the Docker community to ensure
>>>>>>>>   they meet the Docker community's quality standards for Docker
>>>>>>>>   images.
>>>>>>>>
>>>>>>>> It will also reduce the Apache Spark community's extra Docker image
>>>>>>>> maintenance effort (such as frequent rebuilds and image security
>>>>>>>> updates).
>>>>>>>>
>>>>>>>> See more in SPIP DOC:
>>>>>>>> https://docs.google.com/document/d/1nN-pKuvt-amUcrkTvYAQ-bJBgtsWb9nAkNoVNRM2S2o
>>>>>>>>
>>>>>>>> cc: Ruifeng (co-author) and Hyukjin (shepherd)
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Yikun
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best!
>>>>>>> Qian SUN
>>>>>>>
>>>>>>
