Yikun Jiang created SPARK-43365:
-----------------------------------

             Summary: Refactor Dockerfile and workflow based on base image
                 Key: SPARK-43365
                 URL: https://issues.apache.org/jira/browse/SPARK-43365
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Docker
    Affects Versions: 3.5.0
            Reporter: Yikun Jiang


https://github.com/docker-library/official-images/pull/13089#issuecomment-1533540388

Would it be useful to save space by sharing layers, building one image on top
of another? 🤔 Something like using the *java11-ubuntu image as the "base",
with the r and python variants built FROM it, and the r-python variant built
FROM, probably, the larger of those two?

Rough example Dockerfiles:

{code:java}
FROM eclipse-temurin:11-jre-focal
# user stuff, install common deps, etc
...
# download/extract spark (maybe keeping python and R files too? they seem
# relatively small compared to the rest)
{code}


{code:java}
# other images in separate Dockerfiles
FROM spark:3.3.0-scala2.12-java11-ubuntu
# get "/opt/spark/{python,R}/" contents if not kept in base
# install python or R (and things like R_HOME)
{code}
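
As a slightly more concrete sketch of the second case, a python variant layered on the base image might look roughly like the following. The package names, the switch to root, and the non-root user name "spark" are assumptions for illustration only, not what the final Dockerfiles will necessarily contain.

{code:java}
# Hypothetical python variant: reuses every layer of the base image
FROM spark:3.3.0-scala2.12-java11-ubuntu

# Assumption: the base image ends as a non-root user, so switch to root to install packages
USER root

# Install only what this variant adds on top of the shared base layers
RUN set -ex; \
    apt-get update; \
    apt-get install -y --no-install-recommends python3 python3-pip; \
    rm -rf /var/lib/apt/lists/*

# Assumption: "spark" is the name of the non-root user defined in the base image
USER spark
{code}

With this layout, pulling both the base and the python image should only download the python layer once the base layers are present locally.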




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
