[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user ssaavedra commented on the issue: https://github.com/apache/spark/pull/20383 Yes, sorry for the misunderstanding I was also probably too eager with this. However, if the changes I'm stating up there don't work, I am not sure what I'm missing now. I'll take a further look at it tomorrow. If any of you is at FOSDEM this weekend we could take a look at it in there. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 I agree. sorry to merge it so quickly, let me revert it. @ssaavedra would you please submit PR again when everything is done, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 my take is not that it doesn't work but some names are out of date because it was done for the k8s fork. I think we should revert the commit and wait till it is tested out complete. WDYT? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 @ssaavedra Do you mean the your current patch doesn't work even for master branch? In case of that do we need to revert the current patch? CC @felixcheung @foxish . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user ssaavedra commented on the issue: https://github.com/apache/spark/pull/20383 Sorry, I hadn't answered yet because it seems my patch does not work cleanly on 2.3. Many names were rewritten as part of the merge and some logic on how the executor pods look up the configMap had changed. I'll have to take a better look at it. I already changed all `initcontainer` appearances for `initContainer` and so on, according to the parameters at the Config.scala for kubernetes, but to no avail yet. @foxish maybe you have a hint on where to look? It seems that the new containers are still looking for the old ConfigMap. That must happen due to some property in the Checkpoint getting restored by the driver. Thus, the driver gets the correct ConfigMap as it is created by spark-submit, but the executors don't because the driver restores the Checkpoint and thereon the old property value is used to put in the ConfigMap names (however the executors get named correctly). This is an example execution on my test environment. ``` $ kubectl -nssaavedraspark get pod spark-pi-2-5081f5d7a88332da955417b6582f22f5-driver spark-pi-2-5081f5d7a88332da955417b6582f22f5-exec-1 -o json | jq '.items[]| {"configMap": (.spec.volumes[] | select(.configMap?).configMap.name), "appselector": .metadata.labels."spark-app-selector", "name": .metadata.name}' ``` ``` { "configMap": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-init-config", "appselector": "spark-8be5e27c750e4384964fbcb93d7f4b1c", "name": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-driver" } { "configMap": "spark-pi-2-59025c48a8483e749e02894b70fd371f-init-config", "appselector": "spark-application-1517424700542", "name": "spark-pi-2-5081f5d7a88332da955417b6582f22f5-exec-1" } ``` I have already made these changes (besides what's in the PR): ``` --- a/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala +++ b/streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala @@ -48,25 +48,27 @@ class Checkpoint(ssc: StreamingContext, val checkpointTime: Time) // Reload properties for the checkpoint application since user wants to set a reload property // or spark had changed its value and user wants to set it back. val propertiesToReload = List( "spark.yarn.app.id", "spark.yarn.app.attemptId", "spark.driver.host", "spark.driver.bindAddress", "spark.driver.port", "spark.kubernetes.driver.pod.name", "spark.kubernetes.executor.podNamePrefix", - "spark.kubernetes.initcontainer.configMapName", - "spark.kubernetes.initcontainer.configMapKey", - "spark.kubernetes.initcontainer.downloadJarsResourceIdentifier", - "spark.kubernetes.initcontainer.downloadJarsSecretLocation", - "spark.kubernetes.initcontainer.downloadFilesResourceIdentifier", - "spark.kubernetes.initcontainer.downloadFilesSecretLocation", - "spark.kubernetes.initcontainer.remoteJars", - "spark.kubernetes.initcontainer.remoteFiles", - "spark.kubernetes.mountdependencies.jarsDownloadDir", - "spark.kubernetes.mountdependencies.filesDownloadDir", + "spark.kubernetes.initContainer.configMapName", + "spark.kubernetes.initContainer.configMapKey", + // "spark.kubernetes.initContainer.remoteJars", + // "spark.kubernetes.initContainer.remoteFiles", + // "spark.kubernetes.mountDependencies.jarsDownloadDir", + // "spark.kubernetes.mountDependencies.filesDownloadDir", + // "spark.kubernetes.mountDependencies.timeout", + // "spark.kubernetes.mountDependencies.maxSimultaneousDownloads", "spark.master", "spark.yarn.jars", "spark.yarn.keytab", ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user foxish commented on the issue: https://github.com/apache/spark/pull/20383 That plan LGTM - we can merge into 2.3 after removing the non-existent config, and getting a clean test run against the 2.3 branch. Should be low risk. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 @ssaavedra why don't you submit a follow-up PR to remove the inexistent configuration as mentioned above? If we agreed to backport to 2.3, then you can create another backport PR against branch 2.3. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user ssaavedra commented on the issue: https://github.com/apache/spark/pull/20383 I can probably take a look at testing this over 2.3.0-rc2 on Monday. I did not test this on a clean 2.3.0-ish branch. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 So you have tested this on latest Spark 2.3.0 bit? Test aside, do people think it is useful to include this fix in the 2.3.0 release? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user ssaavedra commented on the issue: https://github.com/apache/spark/pull/20383 spark-integration was created much later. I originally opened this as https://github.com/apache-spark-on-k8s/spark/pull/516 last September. However, the integration tests repo exists since December 1st. It would be interesting to have a test, but we'd need to set up checkpointing. In my test setup I used AWS S3 for the checkpoint storage but in order not to depend on that, we'd have to deploy a distributed filesystem supported by Hadoop in minikube. Although configuring it should be an orthogonal problem to this test scenario, but we may run into other integration issues. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 without this fix streaming app can't properly recovery from checkpoint, correct? that seems fairly important to me. was apache-spark-on-k8s/spark-integration done before this PR was opened? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user foxish commented on the issue: https://github.com/apache/spark/pull/20383 I'm ok with backporting it once the non-existent config is removed and we're confident we're covering all the requisite config. Also would make sense to have a test under https://github.com/apache-spark-on-k8s/spark-integration @ssaavedra --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user ssaavedra commented on the issue: https://github.com/apache/spark/pull/20383 However, Spark Streaming should always be used with checkpoint enabled if you are using at least `updateStateByKey` or `reduceByKeyAndWindow` and you don't want to lose data, or miscalculate results. I'm not sure why foxish tagged this as 2.4.0 when reported the issue, probably because we were already in 2.3.0-rc2, but if there's going to be another RC (which it seems per the latest vote in spark-dev) then I'm in favor of the backport. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 This is not a big issue unless when we run Spark Streaming with checkpoint enabled. I'm not sure for now it is OK to add to 2.3.0 release (as this is not a block issue). @felixcheung up to you to decide whether this needs to be backported ð . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 We are going to cut another RC? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 hmmm.. The target version in the JIRA is 2.4.0, and we about to release 2.3.0, are we sure we want to backport? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 actually, can we backport this to branch-2.3? k8s is added in 2.3.0 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 Merging to Master, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/20383 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20383 LGTM. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20383 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86596/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20383 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20383 **[Test build #86596 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86596/testReport)** for PR 20383 at commit [`cd9b73a`](https://github.com/apache/spark/commit/cd9b73a1e85c45886243ae7176a1179de7329c16). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 @zsxwing @jerryshao --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20383 **[Test build #86596 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86596/testReport)** for PR 20383 at commit [`cd9b73a`](https://github.com/apache/spark/commit/cd9b73a1e85c45886243ae7176a1179de7329c16). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20383 Jenkins test this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20383 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20383: [SPARK-23200] Reset Kubernetes-specific config on Checkp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20383 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org