Sookyung Park created SPARK-39565:
-------------------------------------

             Summary: “SPARK_HOME” usage in bin/docker-image-tool.sh should be explained
                 Key: SPARK-39565
                 URL: https://issues.apache.org/jira/browse/SPARK-39565
             Project: Spark
          Issue Type: Improvement
          Components: Build
    Affects Versions: 3.3.0
            Reporter: Sookyung Park


Hello, I am using docker-image-tool.sh to build Docker images for k8s, 
as described here:
[https://spark.apache.org/docs/latest/running-on-kubernetes.html#user-identity] 

 

I found that the “SPARK_HOME” environment variable is used in the build step, 
like this:
{code:java}
// https://github.com/apache/spark/blob/master/bin/docker-image-tool.sh#L177-L179

(cd $(img_ctx_dir base) && docker build $NOCACHEARG "${BUILD_ARGS[@]}" \
  -t $(image_ref spark) \
  -f "$BASEDOCKERFILE" .)
{code}
 

This means that if the user has SPARK_HOME set to some other path on the local 
machine and runs this script from a different Spark directory, the build runs 
against the directory that SPARK_HOME points to, so it does not work as 
intended.
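
The effect can be reproduced in isolation. The following is only a minimal sketch, assuming the script uses SPARK_HOME when it is already set and otherwise falls back to the parent of its own bin/ directory. The helper name resolve_spark_home and the /opt/spark-a and /opt/spark-b paths are hypothetical and are not taken from docker-image-tool.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of docker-image-tool.sh): mimics a script that
# trusts a pre-set SPARK_HOME and otherwise derives it from its own location.
resolve_spark_home() {
  local script_path="$1"
  if [ -n "${SPARK_HOME:-}" ]; then
    # A pre-set value wins, even if the script lives in another checkout.
    printf '%s\n' "$SPARK_HOME"
  else
    # Fall back to the directory one level above the script's bin/ directory.
    dirname "$(dirname "$script_path")"
  fi
}

# With SPARK_HOME unset, the script's own checkout is used.
unset SPARK_HOME
resolve_spark_home /opt/spark-b/bin/docker-image-tool.sh   # prints /opt/spark-b

# With SPARK_HOME set elsewhere, that path silently takes over.
(SPARK_HOME=/opt/spark-a; resolve_spark_home /opt/spark-b/bin/docker-image-tool.sh)   # prints /opt/spark-a
```

In the second call the script is started from the spark-b checkout, yet the result is spark-a, which is the surprising case described above.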

 

Since this can be hard to debug, and the public documentation recommends this 
script to ordinary users, I think the script's usage text should explain 
“SPARK_HOME” or “is_dev_build” further, like this:
{code:java}
// docker-image-tool.sh
 
Usage: $0 [options] [command]
Builds or pushes the built-in Spark Docker image.
…
Examples:
- Build image in current directory
  env SPARK_HOME=. $0 build {code}
Thank you.


