Hi, all

Last month the vote of "Support Docker Official Image for Spark <https://issues.apache.org/jira/browse/SPARK-40513>" passed.

# Progress of SPIP:

## Completed:

- A new GitHub repo was created: https://github.com/apache/spark-docker
- Added the "Spark Docker <https://issues.apache.org/jira/browse/SPARK-40969?jql=project%20%3D%20SPARK%20AND%20component%20%3D%20%22Spark%20Docker%22>" component label in JIRA
- Uploaded 3.3.0/3.3.1 Dockerfiles: spark-docker#2 <https://github.com/apache/spark-docker/pull/2> spark-docker#20 <https://github.com/apache/spark-docker/pull/20>
- Some fixes applied to the Dockerfiles to meet the DOI quality requirements:
  * spark-docker#11 <https://github.com/apache/spark-docker/pull/11> Use spark as the username in the official image (instead of the magic number 185)
  * spark-docker#14 <https://github.com/apache/spark-docker/pull/14> Clean up the OS download list cache to reduce image size
  * spark-docker#17 <https://github.com/apache/spark-docker/pull/17> Remove the dynamic pip/setuptools upgrade to ensure the image's repeatability
- Support a Dockerfile template to help generate all kinds of Dockerfiles for a specific version: spark-docker#12 <https://github.com/apache/spark-docker/pull/12>
- Added workflows to build/test the Dockerfiles to ensure their quality:
  * K8s integration test spark-docker#9 <https://github.com/apache/spark-docker/pull/9>
  * Standalone test spark-docker#21 <https://github.com/apache/spark-docker/pull/21> (Great job by @dcoliversun)
- spark-website#424 <https://github.com/apache/spark-website/pull/424> Use the docker image in the SQL/Scala/Java examples
- INFRA-23882 <https://issues.apache.org/jira/browse/INFRA-23882> Add Docker Hub secrets to the spark-docker repo to help publish the Docker Hub image

## Not merged yet:

- spark-docker#23 <https://github.com/apache/spark-docker/pull/23> One click to publish the "apache/spark" image instead of the current Spark Docker Images publish step <https://github.com/wangyum/spark-website/blob/1c6b2ee13a1e22748ed416c5cc260c33795a76c8/release-process.md#create-and-upload-spark-docker-images>. It will also run the K8s IT/standalone tests first, then publish.
- docker-library/official-images#13089 <https://github.com/docker-library/official-images/pull/13089> Add Apache Spark Docker Official Image, waiting for review from the Docker side.

After the above work, I think we have almost reached DOI quality (there may be some small fixes later, depending on the Docker side's review), but we are limited by the Docker side's review bandwidth. The good news is that the PR is at the top of the review queue, according to the review history.

# Next step?

Should we publish the apache/spark image (3.3.0/3.3.1) according to the new rules now?

After publishing, apache/spark will get several new tags for v3.3.0 and v3.3.1, like:

- apache/spark:python3
- apache/spark:scala
- apache/spark:r
- apache/spark all in one

* You can see the complete tag info here <https://github.com/apache/spark-docker/pull/23/files#diff-2b39d33506bc7a34cef4b9ebf4cf8b1e3a5532f2131ceb37011b94261cec5f8c>.

WDYT?

Regards,
Yikun
