[ 
https://issues.apache.org/jira/browse/SPARK-30985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Sharma reassigned SPARK-30985:
---------------------------------------

    Assignee:     (was: Prashant Sharma)

> Propagate SPARK_CONF_DIR files to driver and exec pods.
> -------------------------------------------------------
>
>                 Key: SPARK-30985
>                 URL: https://issues.apache.org/jira/browse/SPARK-30985
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.0.0
>            Reporter: Prashant Sharma
>            Priority: Major
>
> SPARK_CONF_DIR hosts configuration files such as (a sample snippet follows 
> this list):
> 1) spark-defaults.conf - containing all the Spark properties.
> 2) log4j.properties - Logger configuration.
> 3) spark-env.sh - Environment variables to be set up on the driver and 
> executors.
> 4) core-site.xml - Hadoop-related configuration.
> 5) fairscheduler.xml - Spark's fair scheduling policy at the job level.
> 6) metrics.properties - Spark metrics configuration.
> 7) Any other user-, library-, or framework-specific configuration files.
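> For illustration only, a minimal spark-defaults.conf might contain entries 
> like the following (the image name and values are hypothetical, not from 
> this issue):
> {code}
> # Hypothetical sample entries for $SPARK_CONF_DIR/spark-defaults.conf
> spark.master                      k8s://https://kubernetes.default.svc
> spark.executor.instances          2
> spark.kubernetes.container.image  example.com/spark:3.0.0
> {code}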
> Traditionally, SPARK_CONF_DIR has been the home of all user-specific 
> configuration files, and in YARN or standalone mode the default behaviour 
> is that users copy these configuration files to the worker nodes themselves 
> as required. In other words, they are not auto-copied.
> But in the case of Spark on Kubernetes, we use Spark images, and generally 
> these images are approved or undergo some kind of standardisation. The user 
> cannot simply copy these files into the SPARK_CONF_DIR of the running 
> driver and executor pods.
> So, at the moment each supported configuration is special-cased, and for 
> any other user-specific configuration files the process is more involved: 
> for example, one can build a custom Spark image with the configuration 
> files pre-installed.
> Examples of special casing (a spark-submit sketch follows this list) are:
> 1. Hadoop configuration via spark.kubernetes.hadoop.configMapName
> 2. spark-env.sh via spark.kubernetes.driverEnv.[EnvironmentVariableName]
> 3. log4j.properties as in https://github.com/apache/spark/pull/26193
> ... And where such special casing does not exist, users are simply out of 
> luck.
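> As a rough sketch, the special cases above are each wired up through 
> dedicated spark-submit flags; the ConfigMap name, environment variable, 
> and jar path below are illustrative placeholders:
> {code}
> spark-submit \
>   --master k8s://https://kubernetes.default.svc \
>   --deploy-mode cluster \
>   --conf spark.kubernetes.hadoop.configMapName=my-hadoop-config \
>   --conf spark.kubernetes.driverEnv.MY_ENV_VAR=some-value \
>   --class org.apache.spark.examples.SparkPi \
>   local:///opt/spark/examples/jars/spark-examples.jar
> {code}
> Each new kind of configuration file needs its own such mechanism, which is 
> what this issue aims to avoid.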
> So this feature will let user-specific configuration files be mounted into 
> the driver and executor pods' SPARK_CONF_DIR (see the sketch below).
> At the moment it is not clear whether there is a need to let the user 
> specify which config files to propagate, and whether to the driver, the 
> executors, or both. But if that feature proves helpful, we can increase 
> the scope of this work or create another JIRA issue to track it.
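> To make the intent concrete, here is a hedged sketch of what the feature 
> would automate. Today a user would have to do something like the following 
> by hand (the ConfigMap name, template file name, and mount path are 
> assumptions for illustration):
> {code}
> # Bundle the local SPARK_CONF_DIR into a ConfigMap.
> kubectl create configmap spark-conf --from-file=$SPARK_CONF_DIR
>
> # Then reference a pod template that mounts the ConfigMap at the
> # image's conf dir (commonly /opt/spark/conf), via:
> #   --conf spark.kubernetes.driver.podTemplateFile=template.yaml
> #   --conf spark.kubernetes.executor.podTemplateFile=template.yaml
> {code}
> The proposed feature would instead ship the SPARK_CONF_DIR contents to the 
> pods automatically at submission time.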



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
