Github user ssaavedra commented on the issue: https://github.com/apache/spark/pull/20383

Sorry, I hadn't answered yet because my patch does not apply cleanly on 2.3. Many names were changed as part of the merge, and some of the logic for how the executor pods look up the ConfigMap has changed, so I'll have to take a closer look at it. I have already changed all occurrences of `initcontainer` to `initContainer` and so on, following the parameters in the Kubernetes `Config.scala`, but to no avail so far.

@foxish, maybe you have a hint on where to look? It seems that the new containers are still looking for the old ConfigMap. That must happen because some property in the Checkpoint gets restored by the driver: the driver itself gets the correct ConfigMap, since it is created by spark-submit, but the executors don't, because the driver restores the Checkpoint and from then on the old property value is used to build the ConfigMap names (the executors themselves, however, are named correctly).

This is an example execution in my test environment:
```
$ kubectl -nssaavedraspark get pod \
    spark-pi-2-5081f5d7a88332da955417b6582f22f5-driver \
    spark-pi-2-5081f5d7a88332da955417b6582f22f5-exec-1 \
    -o json | jq '.items[] | {
      "configMap": (.spec.volumes[] | select(.configMap?).configMap.name),
      "appselector": .metadata.labels."spark-app-selector",
      "name": .metadata.name}'
```
```
{
  "configMap": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-init-config",
  "appselector": "spark-8be5e27c750e4384964fbcb93d7f4b1c",
  "name": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-driver"
}
{
  "configMap": "spark-pi-2-59025c48a8483e749e02894b70fd371f-init-config",
  "appselector": "spark-application-1517424700542",
  "name": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-exec-1"
}
```

Note that the executor pod references a ConfigMap from a different (stale) application ID than the driver pod. I have already made these changes (besides what's in the PR):

```
--- a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala
@@ -48,25 +48,27 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time)

   // Reload properties for the checkpoint application since user wants to set a reload property
   // or spark had changed its value and user wants to set it back.
   val propertiesToReload = List(
     "spark.yarn.app.id",
     "spark.yarn.app.attemptId",
     "spark.driver.host",
     "spark.driver.bindAddress",
     "spark.driver.port",
     "spark.kubernetes.driver.pod.name",
     "spark.kubernetes.executor.podNamePrefix",
-    "spark.kubernetes.initcontainer.configMapName",
-    "spark.kubernetes.initcontainer.configMapKey",
-    "spark.kubernetes.initcontainer.downloadJarsResourceIdentifier",
-    "spark.kubernetes.initcontainer.downloadJarsSecretLocation",
-    "spark.kubernetes.initcontainer.downloadFilesResourceIdentifier",
-    "spark.kubernetes.initcontainer.downloadFilesSecretLocation",
-    "spark.kubernetes.initcontainer.remoteJars",
-    "spark.kubernetes.initcontainer.remoteFiles",
-    "spark.kubernetes.mountdependencies.jarsDownloadDir",
-    "spark.kubernetes.mountdependencies.filesDownloadDir",
+    "spark.kubernetes.initContainer.configMapName",
+    "spark.kubernetes.initContainer.configMapKey",
+    // "spark.kubernetes.initContainer.remoteJars",
+    // "spark.kubernetes.initContainer.remoteFiles",
+    // "spark.kubernetes.mountDependencies.jarsDownloadDir",
+    // "spark.kubernetes.mountDependencies.filesDownloadDir",
+    // "spark.kubernetes.mountDependencies.timeout",
+    // "spark.kubernetes.mountDependencies.maxSimultaneousDownloads",
     "spark.master",
     "spark.yarn.jars",
     "spark.yarn.keytab",
```
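To illustrate why `propertiesToReload` matters here, below is a minimal, simplified sketch (not the actual Spark code; the object name `ReloadSketch` and the `restore` helper are hypothetical) of the restore behavior: properties listed in `propertiesToReload` are overridden with the values from the freshly submitted application, while everything else keeps its checkpointed value. If a key is missing from the list, the stale checkpointed value survives the restart, which matches the symptom above.

```scala
// Hedged sketch of Checkpoint-style property reloading.
// `propertiesToReload` mirrors the idea in Checkpoint.scala, but the
// maps and helper below are simplified stand-ins, not Spark internals.
object ReloadSketch {
  // Keys whose checkpointed values must be replaced by the values
  // set for the current (re-submitted) application.
  val propertiesToReload: Set[String] = Set(
    "spark.kubernetes.initContainer.configMapName",
    "spark.kubernetes.initContainer.configMapKey")

  // Start from the checkpointed properties, then overlay the current
  // value for every key that is marked for reloading.
  def restore(checkpointed: Map[String, String],
              current: Map[String, String]): Map[String, String] =
    checkpointed ++ propertiesToReload.flatMap { key =>
      current.get(key).map(key -> _)
    }

  def main(args: Array[String]): Unit = {
    val checkpointed = Map(
      "spark.kubernetes.initContainer.configMapName" -> "old-app-init-config",
      "spark.app.name" -> "spark-pi-2")
    val current = Map(
      "spark.kubernetes.initContainer.configMapName" -> "new-app-init-config")

    val restored = restore(checkpointed, current)
    // Because the key is in propertiesToReload, the fresh ConfigMap name wins;
    // without it, "old-app-init-config" would leak into the executor pod specs.
    println(restored("spark.kubernetes.initContainer.configMapName"))
    // prints: new-app-init-config
  }
}
```

This is also why renaming the keys from `initcontainer` to `initContainer` in the reload list matters: a list entry that no longer matches the real property name silently stops overriding the checkpointed value.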