Prashant Sharma created SPARK-30985:
---------------------------------------
Summary: Propagate SPARK_CONF_DIR files to driver and exec pods.
Key: SPARK-30985
URL: https://issues.apache.org/jira/browse/SPARK-30985
Project: Spark
Issue Type: Improvement
Components: Kubernetes
Affects Versions: 3.0.0
Reporter: Prashant Sharma
Assignee: Prashant Sharma

SPARK_CONF_DIR hosts configuration files such as:
1) spark-defaults.conf - all the Spark properties.
2) log4j.properties - logger configuration.
3) spark-env.sh - environment variables to be set up on the driver and executors.
4) core-site.xml - Hadoop-related configuration.
5) fairscheduler.xml - Spark's fair scheduling policy at the job level.
6) metrics.properties - Spark metrics configuration.
7) Any other user-, library- or framework-specific configuration file.

Traditionally, SPARK_CONF_DIR has been the home of all user-specific configuration files, and in YARN or standalone mode the default behaviour is that users copy these files to the worker nodes themselves as required. In other words, they are not auto-copied. In the case of Spark on Kubernetes, however, we use Spark images, and these images are generally approved or undergo some kind of standardisation. The user cannot simply copy these files into the SPARK_CONF_DIR of the running driver and executor pods. So, at the moment, each configuration is special-cased, and for any other user-specific configuration file the process is more involved - e.g. one can build a custom Spark image with the configuration files pre-installed.

Examples of such special casing are:
1. Hadoop configuration via spark.kubernetes.hadoop.configMapName
2. spark-env.sh via spark.kubernetes.driverEnv.[EnvironmentVariableName]
3. log4j.properties as in https://github.com/apache/spark/pull/26193
...

For files with no such special casing, users are simply out of luck. This feature will let user-specific configuration files be mounted into SPARK_CONF_DIR on the driver and executor pods.
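The core of the proposed mechanism could look roughly like the sketch below: collect the regular files found in SPARK_CONF_DIR into the data section of a single Kubernetes ConfigMap, which could then be mounted at SPARK_CONF_DIR inside the driver and executor pods. This is only an illustration under assumed names (the helper conf_dir_to_configmap is hypothetical); it is not the actual Spark implementation.

```python
import json
import os
import tempfile

def conf_dir_to_configmap(conf_dir, name="spark-conf"):
    """Hypothetical helper: bundle regular files from a SPARK_CONF_DIR-style
    directory into the `data` section of a Kubernetes ConfigMap manifest."""
    data = {}
    for entry in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, entry)
        if os.path.isfile(path):
            with open(path) as f:
                data[entry] = f.read()
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }

# Demo with a throwaway conf dir containing two typical files.
conf_dir = tempfile.mkdtemp()
with open(os.path.join(conf_dir, "spark-defaults.conf"), "w") as f:
    f.write("spark.executor.memory 2g\n")
with open(os.path.join(conf_dir, "log4j.properties"), "w") as f:
    f.write("log4j.rootCategory=INFO, console\n")

manifest = conf_dir_to_configmap(conf_dir)
print(json.dumps(manifest, indent=2))
```

Mounting such a ConfigMap as a volume at SPARK_CONF_DIR would make every file available to the pods without per-file special casing, which is the gap this issue aims to close.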
At the moment it is not clear whether users need a way to specify which configuration files to propagate to the driver and/or executor pods. If that turns out to be useful, we can increase the scope of this work or create another JIRA issue to track it.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org