koert kuipers created SPARK-31726:
-------------------------------------

             Summary: Make spark.files available in driver with cluster deploy mode on kubernetes
                 Key: SPARK-31726
                 URL: https://issues.apache.org/jira/browse/SPARK-31726
             Project: Spark
          Issue Type: Improvement
          Components: Kubernetes
    Affects Versions: 3.0.0
            Reporter: koert kuipers


Currently on YARN with cluster deploy mode, --files makes the files available 
to the driver and executors and also puts them on the classpath of both.

On k8s with cluster deploy mode, --files makes the files available on the 
executors, but they are not on the executor classpath. It does not make the 
files available on the driver, and they are not on the driver classpath.

It would be nice if the k8s behavior were consistent with YARN, or at least 
made the files available on the driver. Once the files are available there, a 
simple workaround gets them on the classpath: 
spark.driver.extraClassPath="./"
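
For illustration, here is a minimal sketch of the submission this request has 
in mind, written as a small Python helper that assembles the spark-submit 
arguments. The helper name build_submit_command, the API server address, the 
jar path and the main class are all hypothetical placeholders; only --files, 
--deploy-mode cluster and spark.driver.extraClassPath come from the 
description above. The classpath setting only helps once the files actually 
reach the driver's working directory, which is what this issue asks for.

```python
import shlex


def build_submit_command(master, files, app_jar, main_class):
    """Assemble a spark-submit argument list for cluster deploy mode.

    Hypothetical helper for illustration only; it builds the command
    but does not run it.
    """
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",
        "--class", main_class,
        # Ship the local config files along with the job.
        "--files", ",".join(files),
        # Workaround from this issue: add the working directory to the
        # driver classpath. This assumes the shipped files end up there.
        "--conf", "spark.driver.extraClassPath=./",
        app_jar,
    ]


cmd = build_submit_command(
    master="k8s://https://example-apiserver:6443",  # hypothetical address
    files=["application.conf", "log4j.properties"],
    app_jar="local:///opt/app/app.jar",  # hypothetical path
    main_class="com.example.Main",  # hypothetical class
)
print(shlex.join(cmd))
```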

background:

We recently started testing Kubernetes for Spark. Our main platform is YARN, on 
which we use client deploy mode. Our first experience was that client deploy 
mode was difficult to use on k8s (we don't launch from inside a pod), so we 
switched to cluster deploy mode, which seems to behave well on k8s. But then we 
realized that our programs rely on reading files from the classpath 
(application.conf, log4j.properties, etc.) that are on the client but are no 
longer on the driver (since the driver no longer runs on the client). An easy 
fix seems to be to ship the files using --files to make them available on the 
driver, but we could not get this to work.

 



