Github user liyinan926 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20059#discussion_r158568089
  
    --- Diff: docs/running-on-kubernetes.md ---
    @@ -120,6 +120,23 @@ by their appropriate remote URIs. Also, application dependencies can be pre-moun
     Those dependencies can be added to the classpath by referencing them with `local://` URIs and/or setting the
     `SPARK_EXTRA_CLASSPATH` environment variable in your Dockerfiles.
     
    +### Using Remote Dependencies
    +When there are application dependencies hosted in remote locations like HDFS or HTTP servers, the driver and executor pods need a Kubernetes [init-container](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) for downloading the dependencies so the driver and executor containers can use them locally. This requires users to specify the container image for the init-container using the configuration property `spark.kubernetes.initContainer.image`. For example, users can add the following option to the `spark-submit` command to specify the init-container image:
    --- End diff --
    
    Regarding examples, I can add one `spark-submit` example showing how to use remote jars/files on HTTP/HTTPS and HDFS. But GCS requires the GCS connector to be present in the init-container image, which is non-trivial, and I'm not sure about S3. I think we should avoid covering those.


---
