Hi all,
These docker images are created on Spark 3.1.1. The reason I have chosen this version for now is that most production offerings (for example Google Dataproc) are based on 3.1.1. So with 3.1.1 we have Scala 2.12 with Java 11-jre-slim or Java 8-jre-slim, all currently on Debian buster OS docker images:

REPOSITORY         TAG                                   IMAGE ID       CREATED          SIZE
sparkpy/spark-py   3.1.1-scala_2.12-11-jre-slim-buster   96a7ec29967e   2 hours ago      1.01GB
sparkpy/spark      3.1.1-scala_2.12-11-jre-slim-buster   769913b63a03   2 hours ago      635MB
sparkpy/spark-py   3.1.1-scala_2.12-8-jre-slim-buster    115f1be1d64c   23 minutes ago   979MB
spark/spark        3.1.1-scala_2.12-8-jre-slim-buster    92ffcf407889   27 minutes ago   602MB
openjdk            8-jre-slim                            0d0a85fdf642   4 days ago       187MB
openjdk            11-jre-slim                           eb77da2ec13c   4 weeks ago      221MB

These are all standard builds. spark-py has no additional Python packages added, only the default ones shown below:

$ pip list
Package       Version
------------- -------
asn1crypto    0.24.0
cryptography  2.6.1
entrypoints   0.3
keyring       17.1.1
keyrings.alt  3.1.1
pip           21.2.4
pycrypto      2.6.1
PyGObject     3.30.4
pyxdg         0.25
SecretStorage 2.3.1
setuptools    57.4.0
six           1.12.0
wheel         0.32.3

Now it is time to decide where to put these docker images for download and testing, plus guidelines on how to deploy them. For test purposes minikube can be used. I can push these images to my local GitHub repo, but I think it is best to use the Spark GitHub for this purpose. Also note that there is absolutely no issue building images based on 3.1.2, or any other version for that matter.

HTH

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
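[Editor's note] A minimal smoke-test sketch for minikube, assuming the image above has first been loaded into the cluster (e.g. with `minikube image load`) and that a `spark` namespace and service account exist; the API server address, namespace, and example job are illustrative assumptions, not from the thread. The script only composes and prints the spark-submit command, so it can be read and run without a live cluster:

```shell
# Image built above; tag follows the <spark_version>-<scala>-<java>-<os> scheme.
IMAGE="sparkpy/spark-py:3.1.1-scala_2.12-11-jre-slim-buster"
# minikube API server endpoint -- the address and port will vary per machine.
K8S_MASTER="k8s://https://127.0.0.1:8443"

# Compose (not execute) a SparkPi smoke-test submission against minikube.
# The spark namespace/serviceAccountName values are assumed to pre-exist.
SUBMIT_CMD="spark-submit \
  --master ${K8S_MASTER} \
  --deploy-mode cluster \
  --name spark-pi-smoke-test \
  --conf spark.kubernetes.container.image=${IMAGE} \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"

# Dry run: print the command for inspection instead of submitting it.
printf '%s\n' "${SUBMIT_CMD}"
```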
On Sat, 21 Aug 2021 at 17:00, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Apologies, ignore the first line spark/spark-py (that is redundant). These
> are correct:
>
> REPOSITORY    TAG                                   IMAGE ID       CREATED         SIZE
> spark/spark   3.1.1-scala_2.12-11-jre-slim-buster   71ff5ed3ca03   9 seconds ago   635MB
> openjdk       8-jre-slim                            0d0a85fdf642   4 days ago      187MB
> openjdk       11-jre-slim                           eb77da2ec13c   4 weeks ago     221MB
>
> On Sat, 21 Aug 2021 at 16:50, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Sorry, there was a typo:
>>
>> BASE_OS="buster"
>> SPARK_VERSION="3.1.1"
>> SCALA_VERSION="scala_2.12"
>> DOCKERFILE="Dockerfile"
>> DOCKERIMAGETAG="11-jre-slim"
>>
>> # Building Docker image from provided Dockerfile base 11
>> cd $SPARK_HOME
>> /opt/spark/bin/docker-image-tool.sh \
>>   -r spark -t ${SPARK_VERSION}-${SCALA_VERSION}-${DOCKERIMAGETAG}-${BASE_OS} \
>>   -b java_image_tag=${DOCKERIMAGETAG} \
>>   -p ./kubernetes/dockerfiles/spark/${DOCKERFILE} \
>>   build
>>
>> REPOSITORY       TAG                                   IMAGE ID       CREATED         SIZE
>> spark/spark-py   3.1.1-scala_2.12-11-jre-slim-buster   71ff5ed3ca03   9 seconds ago   635MB
>> spark/spark      3.1.1-scala_2.12-11-jre-slim-buster   71ff5ed3ca03   9 seconds ago   635MB
>> openjdk          8-jre-slim                            0d0a85fdf642   4 days ago      187MB
>> openjdk          11-jre-slim                           eb77da2ec13c   4 weeks ago     221MB
>>
>> On Sat, 21 Aug 2021 at 16:38, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Hi Ankit,
>>>
>>> Sure, I suppose that elaboration on the OS base can be added:
>>>
>>> BASE_OS="buster"
>>> SPARK_VERSION="3.1.1"
>>> SCALA_VERSION="scala_2.12"
>>> DOCKERFILE="Dockerfile"
>>> DOCKERIMAGETAG="11-jre-slim"
>>>
>>> # Building Docker image from provided Dockerfile base 11
>>> cd $SPARK_HOME
>>> /opt/spark/bin/docker-image-tool.sh \
>>>   -r spark -t ${SPARK_VERSION}-${SCALA_VERSION}-${DOCKERIMAGETAG}-${BASE_OS} \
>>>   -b java_image_tag=${DOCKERIMAGETAG} \
>>>   -p ./kubernetes/dockerfiles/spark/${Dockerfile} \
>>>   build
>>>
>>> and with that we get:
>>>
>>> REPOSITORY    TAG                                   IMAGE ID       CREATED         SIZE
>>> spark/spark   3.1.1-scala_2.12-11-jre-slim-buster   6ef051218938   2 minutes ago   635MB
>>> openjdk       8-jre-slim                            0d0a85fdf642   4 days ago      187MB
>>> openjdk       11-jre-slim                           eb77da2ec13c   4 weeks ago     221MB
>>>
>>> I guess that is descriptive enough, with a README file as well.
>>>
>>> HTH
>>> On Sat, 21 Aug 2021 at 15:58, Ankit Gupta <info.ank...@gmail.com> wrote:
>>>
>>>> Hey Mich,
>>>>
>>>> Adding to what Mich is suggesting, how about having the base OS version
>>>> in the image tag as well, like:
>>>>
>>>> <PRODUCT_VERSION>-<SCALA_VERSION>-<JAVA_VERSION>-<BASE_OS>
>>>>
>>>> 3.1.2-scala_2.12-java11-slim
>>>> 3.1.2_sparkpy-scala_2.12-java11-buster
>>>> 3.1.2_sparkR-scala_2.12-java11-slim
>>>>
>>>> Regards,
>>>>
>>>> Ankit Prakash Gupta
>>>> info.ank...@gmail.com
>>>> LinkedIn: https://www.linkedin.com/in/infoankitp/
>>>> Medium: https://medium.com/@info.ankitp
>>>>
>>>> On Mon, Aug 16, 2021 at 5:39 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I propose that for Spark docker images we follow the following
>>>>> convention, similar to flink <https://hub.docker.com/_/flink>, as shown
>>>>> in the attached file.
>>>>>
>>>>> So for Spark we will have:
>>>>>
>>>>> <PRODUCT_VERSION>-<SCALA_VERSION>-<JAVA_VERSION>
>>>>>
>>>>> 3.1.2-scala_2.12-java11
>>>>> 3.1.2_sparkpy-scala_2.12-java11
>>>>> 3.1.2_sparkR-scala_2.12-java11
>>>>>
>>>>> If this makes sense please respond, otherwise state your preference.
>>>>>
>>>>> HTH
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
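
[Editor's note] The tag convention proposed in the thread, including Ankit's optional base-OS suffix, could be captured as a small shell helper. This is a sketch of that idea only; the function name `make_tag` and the empty-argument handling are my own illustration, not anything from the thread:

```shell
# Compose an image tag per the proposed convention:
#   <PRODUCT_VERSION>-<SCALA_VERSION>-<JAVA_VERSION>[-<BASE_OS>]
# The fourth argument (base OS) is optional; when empty, no suffix is added.
make_tag() {
  tag="$1-$2-$3"
  if [ -n "$4" ]; then
    tag="${tag}-$4"
  fi
  printf '%s\n' "${tag}"
}

make_tag "3.1.2" "scala_2.12" "java11" ""
# -> 3.1.2-scala_2.12-java11
make_tag "3.1.2_sparkpy" "scala_2.12" "java11" "buster"
# -> 3.1.2_sparkpy-scala_2.12-java11-buster
```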