May I ask why you think that sentence in the SPIP, "might need to deprecate ...",
decided anything at that time?

From my perspective,
- `might need to` suggested only a possible necessity at some point in the
future.
- `deprecation` means no breaking change.


Dongjoon



On Tue, May 9, 2023 at 12:01 AM Yikun Jiang <yi...@apache.org> wrote:

> > It seems that your reply (the following) didn't reach the mailing list
> correctly.
>
> Thanks! I'm not sure what happened before; thanks for forwarding it.
>
> > Let me add my opinion. IIUC, the whole content of SPIP (Support Docker
> Official Image for Spark) aims to add (1) newly, not to corrupt or destroy
> the existing (2).
>
> - There was some description of how we should handle the apache/spark
> image after DOI support in the doc:
> "Considering that already had the apache/spark image, might need to
> deprecate: spark/spark-py/spark-r `v3.3.0`, `v3.1.3`, `v3.2.1`, `v3.2.2`
> tags, and *unified apache/spark image tags to docker official images tags
> rule*, and also still keep apache/spark images and update apache/spark
> images when released."
> - I also posted a mail
> https://lists.apache.org/thread/zp550lt4f098zfpxgpc9bn360bwcfhs4 in Nov.
> 2022; it is about the Apache Spark official image, not the Docker official
> image.
>
> So, it is not only for the Docker official image (spark) but also for the
> Apache Spark official image (apache/spark).
> Anyway, I am very sorry if anything was misleading, and many thanks for
> your feedback and review.
>
> On Tue, May 9, 2023 at 12:37 PM Dongjoon Hyun <dongj...@apache.org> wrote:
>
>> To Yikun,
>>
>> It seems that your reply (the following) didn't reach the mailing list
>> correctly.
>>
>> > Just FYI, we also had a discussion about tag policy (latest/3.4.0) and
>> also rough size estimation [1] in "SPIP: Support Docker Official Image for
>> Spark".
>> >
>> https://docs.google.com/document/d/1nN-pKuvt-amUcrkTvYAQ-bJBgtsWb9nAkNoVNRM2S2o/edit?disco=AAAAf2TyFr0
>>
>> Let me add my opinion. IIUC, the whole content of SPIP (Support Docker
>> Official Image for Spark) aims to add (1) newly, not to corrupt or destroy
>> the existing (2).
>>
>> (1) https://hub.docker.com/_/spark
>> (2) https://hub.docker.com/r/apache/spark/tags
>>
>> The reference model repos were also documented as follows.
>>
>> https://hub.docker.com/_/flink
>> https://hub.docker.com/_/storm
>> https://hub.docker.com/_/solr
>> https://hub.docker.com/_/zookeeper
>>
>> In short, according to the SPIP's `Docker Official Image` definition, new
>> images should go to (1) only in order to achieve `Support Docker Official
>> Image for Spark`, shouldn't they?
>>
>> Dongjoon.
>>
>> On Mon, May 8, 2023 at 6:22 PM Yikun Jiang <yikunk...@gmail.com> wrote:
>>
>>> > 1. The size regression: the `apache/spark:3.4.0` tag is claimed to be
>>> a replacement of the existing `apache/spark:v3.4.0`. However, 3.4.0 is
>>> 500MB while the original v3.4.0 is 405MB. A roughly 25% increase is huge
>>> in terms of size.
>>>
>>> > 2. Accidental overwrite: `apache/spark:latest` was accidentally
>>> overwritten by the `apache/spark:python3` image, which is bigger due to
>>> the additional python binary. This is a breaking change that forces
>>> downstream users to switch to something like `apache/spark:scala`.
>>>
>>> Just FYI, we also had a discussion about tag policy (latest/3.4.0) and
>>> also rough size estimation [1] in "SPIP: Support Docker Official Image for
>>> Spark".
>>>
>>> [1]
>>> https://docs.google.com/document/d/1nN-pKuvt-amUcrkTvYAQ-bJBgtsWb9nAkNoVNRM2S2o/edit?disco=AAAAf2TyFr0
>>>
>>> Regards,
>>> Yikun
>>>
>>>
>>> On Tue, May 9, 2023 at 5:03 AM Dongjoon Hyun <dongj...@apache.org>
>>> wrote:
>>>
>>>> Thank you for initiating the discussion in the community. Yes, we need
>>>> to give more context in the dev mailing list.
>>>>
>>>> The root cause is not SPARK-40941 or SPARK-40513. Technically,
>>>> this situation started 16 days ago with SPARK-43148, which made some
>>>> breaking changes.
>>>>
>>>> https://github.com/apache/spark-docker/pull/33
>>>> SPARK-43148 Add Apache Spark 3.4.0 Dockerfiles
>>>>
>>>> 1. The size regression: the `apache/spark:3.4.0` tag is claimed to be
>>>> a replacement of the existing `apache/spark:v3.4.0`. However, 3.4.0 is
>>>> 500MB while the original v3.4.0 is 405MB. A roughly 25% increase is huge
>>>> in terms of size.
>>>>
>>>> 2. Accidental overwrite: `apache/spark:latest` was accidentally
>>>> overwritten by the `apache/spark:python3` image, which is bigger due to
>>>> the additional python binary. This is a breaking change that forces
>>>> downstream users to switch to something like `apache/spark:scala`.
>>>>
>>>> I believe (1) and (2) were our mistakes. We had better recover them
>>>> ASAP.
>>>> For Java questions, I prefer to be consistent with Apache Spark repo's
>>>> default.
>>>>
>>>> Dongjoon.
>>>>
>>>> On 2023/05/08 08:56:26 Yikun Jiang wrote:
>>>> > This is a call for discussion on how we can unify the Apache Spark
>>>> > Docker image tags smoothly.
>>>> >
>>>> > As you might know, there is an apache/spark-docker
>>>> > <https://github.com/apache/spark-docker> repo that stores the dockerfiles
>>>> > and helps publish the docker images; it is also intended to replace the
>>>> > original manual publish workflow.
>>>> >
>>>> > The scope of the new images is to cover the previous image cases (K8s /
>>>> > docker run) and also the base image, standalone, and Docker Official Image
>>>> > cases.
>>>> >
>>>> > - (Previous) apache/spark:v3.4.0, apache/spark-py:v3.4.0,
>>>> > apache/spark-r:v3.4.0
>>>> >
>>>> >     * The images are built from the apache/spark Spark on K8s dockerfiles
>>>> > <https://github.com/apache/spark/tree/branch-3.4/resource-managers/kubernetes/docker/src/main/dockerfiles/spark>
>>>> >
>>>> >     * Java version: Java 17 (it was Java 11 before v3.4.0, e.g. in
>>>> > v3.3.0/v3.3.1/v3.3.2); Java 17 was set as the default in SPARK-40941
>>>> > <https://github.com/apache/spark/pull/38417>.
>>>> >
>>>> >     * Support: K8s / docker run
>>>> >
>>>> >     * See also: Time to start publishing Spark Docker Images
>>>> > <https://lists.apache.org/thread/h729bxrf1o803l4wz7g8bngkjd56y6x8>
>>>> >
>>>> >     * Link: https://hub.docker.com/r/apache/spark-py,
>>>> > https://hub.docker.com/r/apache/spark-r,
>>>> > https://hub.docker.com/r/apache/spark
>>>> >
>>>> > - (New) apache/spark:3.4.0-python3 (3.4.0/latest), apache/spark:3.4.0-r,
>>>> > apache/spark:3.4.0-scala, and also an all-in-one image:
>>>> > apache/spark:3.4.0-scala2.12-java11-python3-r-ubuntu
>>>> >
>>>> >     * The images are built from the apache/spark-docker dockerfiles
>>>> > <https://github.com/apache/spark-docker/tree/master/3.4.0>
>>>> >
>>>> >     * Java version: Java 11; Java 17 support is added by SPARK-40513
>>>> > <https://github.com/apache/spark-docker/pull/35> (under review)
>>>> >
>>>> >     * Support: K8s / docker run / base image / standalone / Docker
>>>> > Official Image
>>>> >
>>>> >     * See details in: Support Docker Official Image for Spark
>>>> > <https://issues.apache.org/jira/browse/SPARK-40513>
>>>> >
>>>> >     * About dropping prefix `v`:
>>>> > https://github.com/docker-library/official-images/issues/14506
>>>> >
>>>> >     * Link: https://hub.docker.com/r/apache/spark
>>>> >
>>>> > We had some initial discussion on spark-website#458
>>>> > <https://github.com/apache/spark-website/pull/458#issuecomment-1522426236>;
>>>> > the discussion is mainly around the version tag and default Java version
>>>> > behavior changes, so we’d like to hear your ideas here on the questions
>>>> > below:
>>>> >
>>>> > *#1. Which Java version should be used by default (latest tag)? Java 8,
>>>> > Java 11, Java 17, or any?*
>>>> >
>>>> > *#2. Which tag should be used in apache/spark? v3.4.0 (with prefix v),
>>>> > 3.4.0 (dropping prefix v), both, or any?*
>>>> >
>>>> > Starting with my preferences:
>>>> >
>>>> > 1. Java 8 or Java 17 is also OK with me (mainly considering the Java
>>>> > maintenance cycle). BTW, other Apache projects: flink (8/11, 11 as default
>>>> > <https://github.com/docker-library/official-images/blob/93270eb07fb448fe7756b28af5495428242dcd6b/library/flink#L10>),
>>>> > solr (11 as default
>>>> > <https://github.com/apache/solr-docker/blob/989825ee6dce2f6bf7b31051f1ba053b6c4426f2/8.11/Dockerfile#L4>
>>>> > for 8.x, 17 as default
>>>> > <https://github.com/apache/solr-docker/blob/989825ee6dce2f6bf7b31051f1ba053b6c4426f2/9.2/Dockerfile#L17>
>>>> > since solr9), zookeeper (11 as default
>>>> > <https://github.com/31z4/zookeeper-docker/blob/181e5862c85b517e4599d79eb5c2c7339e60a4aa/3.8.1/Dockerfile#L1>)
>>>> >
>>>> > 2. Only 3.4.0 (dropping prefix v). It will help us transition to the new
>>>> > tags with less confusion and also takes the DOI suggestions
>>>> > <https://github.com/docker-library/official-images/issues/14506> into
>>>> > account.
>>>> >
>>>> > Please feel free to share your ideas.
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>
>>>>
