svn commit: r30589 - /dev/spark/v2.3.2-rc6-docs/
Author: wenchen
Date: Fri Nov 2 05:02:56 2018
New Revision: 30589

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.3.2-rc6-docs/

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
svn commit: r30588 - /dev/spark/v2.3.2-rc5-bin/
Author: wenchen
Date: Fri Nov 2 05:02:31 2018
New Revision: 30588

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.3.2-rc5-bin/
svn commit: r30587 - /dev/spark/v2.3.2-rc5-docs/
Author: wenchen
Date: Fri Nov 2 05:02:01 2018
New Revision: 30587

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.3.2-rc5-docs/
svn commit: r30584 - in /dev/spark: v2.3.2-rc1-bin/ v2.3.2-rc1-docs/
Author: wenchen
Date: Fri Nov 2 05:00:50 2018
New Revision: 30584

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.3.2-rc1-bin/
    dev/spark/v2.3.2-rc1-docs/
svn commit: r30586 - in /dev/spark: v2.3.2-rc4-bin/ v2.3.2-rc4-docs/
Author: wenchen
Date: Fri Nov 2 05:01:23 2018
New Revision: 30586

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.3.2-rc4-bin/
    dev/spark/v2.3.2-rc4-docs/
svn commit: r30585 - in /dev/spark: v2.3.2-rc3-bin/ v2.3.2-rc3-docs/
Author: wenchen
Date: Fri Nov 2 05:01:13 2018
New Revision: 30585

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.3.2-rc3-bin/
    dev/spark/v2.3.2-rc3-docs/
svn commit: r30583 - /dev/spark/v2.4.0-rc5-docs/
Author: wenchen
Date: Fri Nov 2 04:59:27 2018
New Revision: 30583

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.4.0-rc5-docs/
svn commit: r30580 - in /dev/spark: v2.4.0-rc2-bin/ v2.4.0-rc2-docs/
Author: wenchen
Date: Fri Nov 2 04:58:01 2018
New Revision: 30580

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.4.0-rc2-bin/
    dev/spark/v2.4.0-rc2-docs/
svn commit: r30582 - in /dev/spark: v2.4.0-rc4-bin/ v2.4.0-rc4-docs/
Author: wenchen
Date: Fri Nov 2 04:58:21 2018
New Revision: 30582

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.4.0-rc4-bin/
    dev/spark/v2.4.0-rc4-docs/
svn commit: r30579 - in /dev/spark: v2.4.0-rc1-bin/ v2.4.0-rc1-docs/
Author: wenchen
Date: Fri Nov 2 04:57:28 2018
New Revision: 30579

Log:
Removing RC artifacts.

Removed:
    dev/spark/v2.4.0-rc1-bin/
    dev/spark/v2.4.0-rc1-docs/
[spark] Git Push Summary
Repository: spark
Updated Tags:
  refs/tags/v2.4.0 [created] 075447b39
svn commit: r30578 - /dev/spark/v2.4.0-rc5-bin/ /release/spark/spark-2.4.0/
Author: wenchen
Date: Fri Nov 2 04:25:06 2018
New Revision: 30578

Log: (empty)

Added:
    release/spark/spark-2.4.0/
      - copied from r30577, dev/spark/v2.4.0-rc5-bin/
Removed:
    dev/spark/v2.4.0-rc5-bin/
svn commit: r30576 - in /dev/spark/3.0.0-SNAPSHOT-2018_11_01_12_05-e9d3ca0-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Nov 1 19:19:34 2018
New Revision: 30576

Log:
Apache Spark 3.0.0-SNAPSHOT-2018_11_01_12_05-e9d3ca0 docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r30575 - in /dev/spark/2.4.1-SNAPSHOT-2018_11_01_10_05-7389446-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Nov 1 17:22:03 2018
New Revision: 30575

Log:
Apache Spark 2.4.1-SNAPSHOT-2018_11_01_10_05-7389446 docs

[This commit notification would consist of 1477 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r30574 - in /dev/spark/2.3.3-SNAPSHOT-2018_11_01_10_05-49e1eb8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Nov 1 17:20:18 2018
New Revision: 30574

Log:
Apache Spark 2.3.3-SNAPSHOT-2018_11_01_10_05-49e1eb8 docs

[This commit notification would consist of 1443 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
spark git commit: [SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when cleaning up stages
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 632c0d911 -> 49e1eb8bd

[SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when cleaning up stages

## What changes were proposed in this pull request?

* Update `AppStatusListener` `cleanupStages` method to remove tasks for those stages in a single pass instead of 1 for each stage.
* This fixes an issue where the cleanupStages method would get backed up, causing a backup in the executor in ElementTrackingStore, resulting in stages and jobs not getting cleaned up properly.

Tasks seem most susceptible to this as there are a lot of them, however a similar issue could arise in other locations the `KVStore` `view` method is used. A broader fix might involve updates to `KVStoreView` and `InMemoryView` as it appears this interface and implementation can lead to multiple and inefficient traversals of the stored data.

## How was this patch tested?

Using existing tests in AppStatusListenerSuite.

This is my original work and I license the work to the project under the project's open source license.

Closes #22883 from patrickbrownsync/cleanup-stages-fix.
Authored-by: Patrick Brown
Signed-off-by: Marcelo Vanzin
(cherry picked from commit e9d3ca0b7993995f24f5c555a570bc2521119e12)
Signed-off-by: Marcelo Vanzin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/49e1eb8b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/49e1eb8b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/49e1eb8b

Branch: refs/heads/branch-2.3
Commit: 49e1eb8bdeff2ea13e235ed3a82173887c48643e
Parents: 632c0d9
Author: Patrick Brown
Authored: Thu Nov 1 09:34:29 2018 -0700
Committer: Marcelo Vanzin
Committed: Thu Nov 1 09:38:21 2018 -0700

----------------------------------------------------------------------
 .../apache/spark/status/AppStatusListener.scala | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/49e1eb8b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
index d57c977..3164dc7 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
@@ -950,16 +950,6 @@ private[spark] class AppStatusListener(
       kvstore.delete(e.getClass(), e.id)
     }
 
-    val tasks = kvstore.view(classOf[TaskDataWrapper])
-      .index("stage")
-      .first(key)
-      .last(key)
-      .asScala
-
-    tasks.foreach { t =>
-      kvstore.delete(t.getClass(), t.taskId)
-    }
-
     // Check whether there are remaining attempts for the same stage. If there aren't, then
     // also delete the RDD graph data.
     val remainingAttempts = kvstore.view(classOf[StageDataWrapper])
@@ -982,6 +972,15 @@ private[spark] class AppStatusListener(
       cleanupCachedQuantiles(key)
     }
+
+    // Delete tasks for all stages in one pass, as deleting them for each stage individually is slow
+    val tasks = kvstore.view(classOf[TaskDataWrapper]).asScala
+    val keys = stages.map { s => (s.info.stageId, s.info.attemptId) }.toSet
+    tasks.foreach { t =>
+      if (keys.contains((t.stageId, t.stageAttemptId))) {
+        kvstore.delete(t.getClass(), t.taskId)
+      }
+    }
   }
 
   private def cleanupTasks(stage: LiveStage): Unit = {
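The key idea in the patch above can be sketched outside Spark as follows. This is a minimal, self-contained illustration, not the actual Spark code: `Task` and the plain `Map` stand in for Spark's `TaskDataWrapper` and `KVStore`, and `cleanup` stands in for the relevant part of `cleanupStages`. One traversal of all tasks against a precomputed set of doomed `(stageId, attemptId)` keys replaces one store scan per stage.

```scala
object SinglePassCleanup {
  // Simplified stand-in for Spark's TaskDataWrapper
  case class Task(taskId: Long, stageId: Int, attemptId: Int)

  // Single pass: build the set of (stageId, attemptId) keys to drop, then
  // test every task against it, instead of scanning the store once per stage.
  def cleanup(store: Map[Long, Task], stagesToDelete: Set[(Int, Int)]): Map[Long, Task] =
    store.filterNot { case (_, t) => stagesToDelete.contains((t.stageId, t.attemptId)) }

  def main(args: Array[String]): Unit = {
    // Ten tasks spread round-robin over stages 0, 1 and 2 (attempt 0 each)
    val store = (0L until 10L).map(id => id -> Task(id, (id % 3).toInt, 0)).toMap
    // Clean up stages 0 and 2; only stage 1's tasks (ids 1, 4, 7) survive
    val remaining = cleanup(store, Set((0, 0), (2, 0)))
    println(remaining.size) // prints 3
  }
}
```

Because membership in the set is a constant-time check, the whole cleanup costs one traversal of the task data regardless of how many stages are being removed, which is what prevents the backlog in `ElementTrackingStore` described in the commit message.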
spark git commit: [SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when cleaning up stages
Repository: spark
Updated Branches:
  refs/heads/branch-2.4 3d2fce5a3 -> 73894462c

[SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when cleaning up stages

## What changes were proposed in this pull request?

* Update `AppStatusListener` `cleanupStages` method to remove tasks for those stages in a single pass instead of 1 for each stage.
* This fixes an issue where the cleanupStages method would get backed up, causing a backup in the executor in ElementTrackingStore, resulting in stages and jobs not getting cleaned up properly.

Tasks seem most susceptible to this as there are a lot of them, however a similar issue could arise in other locations the `KVStore` `view` method is used. A broader fix might involve updates to `KVStoreView` and `InMemoryView` as it appears this interface and implementation can lead to multiple and inefficient traversals of the stored data.

## How was this patch tested?

Using existing tests in AppStatusListenerSuite.

This is my original work and I license the work to the project under the project's open source license.

Closes #22883 from patrickbrownsync/cleanup-stages-fix.
Authored-by: Patrick Brown
Signed-off-by: Marcelo Vanzin
(cherry picked from commit e9d3ca0b7993995f24f5c555a570bc2521119e12)
Signed-off-by: Marcelo Vanzin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/73894462
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/73894462
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/73894462

Branch: refs/heads/branch-2.4
Commit: 73894462cfb80b7c3e61c743b5a2f3be5d2282dd
Parents: 3d2fce5
Author: Patrick Brown
Authored: Thu Nov 1 09:34:29 2018 -0700
Committer: Marcelo Vanzin
Committed: Thu Nov 1 09:34:45 2018 -0700

----------------------------------------------------------------------
 .../apache/spark/status/AppStatusListener.scala | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/73894462/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
index 513c929..fdbef6f 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
@@ -1002,16 +1002,6 @@ private[spark] class AppStatusListener(
       kvstore.delete(e.getClass(), e.id)
     }
 
-    val tasks = kvstore.view(classOf[TaskDataWrapper])
-      .index("stage")
-      .first(key)
-      .last(key)
-      .asScala
-
-    tasks.foreach { t =>
-      kvstore.delete(t.getClass(), t.taskId)
-    }
-
     // Check whether there are remaining attempts for the same stage. If there aren't, then
     // also delete the RDD graph data.
     val remainingAttempts = kvstore.view(classOf[StageDataWrapper])
@@ -1034,6 +1024,15 @@ private[spark] class AppStatusListener(
       cleanupCachedQuantiles(key)
     }
+
+    // Delete tasks for all stages in one pass, as deleting them for each stage individually is slow
+    val tasks = kvstore.view(classOf[TaskDataWrapper]).asScala
+    val keys = stages.map { s => (s.info.stageId, s.info.attemptId) }.toSet
+    tasks.foreach { t =>
+      if (keys.contains((t.stageId, t.stageAttemptId))) {
+        kvstore.delete(t.getClass(), t.taskId)
+      }
+    }
   }
 
   private def cleanupTasks(stage: LiveStage): Unit = {
spark git commit: [SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when cleaning up stages
Repository: spark
Updated Branches:
  refs/heads/master fc898 -> e9d3ca0b7

[SPARK-25837][CORE] Fix potential slowdown in AppStatusListener when cleaning up stages

## What changes were proposed in this pull request?

* Update `AppStatusListener` `cleanupStages` method to remove tasks for those stages in a single pass instead of 1 for each stage.
* This fixes an issue where the cleanupStages method would get backed up, causing a backup in the executor in ElementTrackingStore, resulting in stages and jobs not getting cleaned up properly.

Tasks seem most susceptible to this as there are a lot of them, however a similar issue could arise in other locations the `KVStore` `view` method is used. A broader fix might involve updates to `KVStoreView` and `InMemoryView` as it appears this interface and implementation can lead to multiple and inefficient traversals of the stored data.

## How was this patch tested?

Using existing tests in AppStatusListenerSuite.

This is my original work and I license the work to the project under the project's open source license.

Closes #22883 from patrickbrownsync/cleanup-stages-fix.
Authored-by: Patrick Brown
Signed-off-by: Marcelo Vanzin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e9d3ca0b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e9d3ca0b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e9d3ca0b

Branch: refs/heads/master
Commit: e9d3ca0b7993995f24f5c555a570bc2521119e12
Parents: fc8
Author: Patrick Brown
Authored: Thu Nov 1 09:34:29 2018 -0700
Committer: Marcelo Vanzin
Committed: Thu Nov 1 09:34:29 2018 -0700

----------------------------------------------------------------------
 .../apache/spark/status/AppStatusListener.scala | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/e9d3ca0b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
index d52b7e8..e2c190e 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
@@ -1073,16 +1073,6 @@ private[spark] class AppStatusListener(
       kvstore.delete(e.getClass(), e.id)
     }
 
-    val tasks = kvstore.view(classOf[TaskDataWrapper])
-      .index("stage")
-      .first(key)
-      .last(key)
-      .asScala
-
-    tasks.foreach { t =>
-      kvstore.delete(t.getClass(), t.taskId)
-    }
-
    // Check whether there are remaining attempts for the same stage. If there aren't, then
    // also delete the RDD graph data.
     val remainingAttempts = kvstore.view(classOf[StageDataWrapper])
@@ -1105,6 +1095,15 @@ private[spark] class AppStatusListener(
       cleanupCachedQuantiles(key)
     }
+
+    // Delete tasks for all stages in one pass, as deleting them for each stage individually is slow
+    val tasks = kvstore.view(classOf[TaskDataWrapper]).asScala
+    val keys = stages.map { s => (s.info.stageId, s.info.attemptId) }.toSet
+    tasks.foreach { t =>
+      if (keys.contains((t.stageId, t.stageAttemptId))) {
+        kvstore.delete(t.getClass(), t.taskId)
+      }
+    }
   }
 
   private def cleanupTasks(stage: LiveStage): Unit = {
spark git commit: [SPARK-25809][K8S][TEST] New K8S integration testing backends
Repository: spark
Updated Branches:
  refs/heads/master cd92f25be -> fc898

[SPARK-25809][K8S][TEST] New K8S integration testing backends

## What changes were proposed in this pull request?

Currently K8S integration tests are hardcoded to use a `minikube` based backend. `minikube` is VM based, so it can be resource hungry, and it also doesn't cope well with certain networking setups (for example, with Cisco AnyConnect software VPN, `minikube` is unusable as it detects its own IP incorrectly).

This PR adds a new K8S integration testing backend that allows for using the Kubernetes support in [Docker for Desktop](https://blog.docker.com/2018/07/kubernetes-is-now-available-in-docker-desktop-stable-channel/). It also generalises the framework to be able to run the integration tests against an arbitrary Kubernetes cluster.

To Do:
- [x] General Kubernetes cluster backend
- [x] Documentation on Kubernetes integration testing
- [x] Testing of general K8S backend
- [x] Check whether change from timestamps being `Time` to `String` in Fabric 8 upgrade needs additional fix up

## How was this patch tested?

Ran integration tests with Docker for Desktop and all passed:

![screen shot 2018-10-23 at 14 19 56](https://user-images.githubusercontent.com/2104864/47363460-c5816a00-d6ce-11e8-9c15-56b34698e797.png)

Suggested Reviewers: ifilonenko srowen

Author: Rob Vesse

Closes #22805 from rvesse/SPARK-25809.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc89
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fc89
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fc89

Branch: refs/heads/master
Commit: fc898e26d9e4bb9ea1c0baa48cadba8ca673
Parents: cd92f25
Author: Rob Vesse
Authored: Thu Nov 1 09:33:55 2018 -0700
Committer: mcheah
Committed: Thu Nov 1 09:33:55 2018 -0700

----------------------------------------------------------------------
 .../k8s/SparkKubernetesClientFactory.scala      |   5 +
 .../k8s/submit/LoggingPodStatusWatcher.scala    |   3 -
 .../kubernetes/integration-tests/README.md      | 183 +--
 .../dev/dev-run-integration-tests.sh            |  10 +
 .../kubernetes/integration-tests/pom.xml        |  10 +
 .../scripts/setup-integration-test-env.sh       |  43 ++---
 .../k8s/integrationtest/KubernetesSuite.scala   |   3 +-
 .../KubernetesTestComponents.scala              |   5 +-
 .../k8s/integrationtest/ProcessUtils.scala      |   5 +-
 .../deploy/k8s/integrationtest/TestConfig.scala |   6 +-
 .../k8s/integrationtest/TestConstants.scala     |  15 +-
 .../backend/IntegrationTestBackend.scala        |  21 ++-
 .../backend/cloud/KubeConfigBackend.scala       |  70 +++
 .../docker/DockerForDesktopBackend.scala        |  25 +++
 14 files changed, 356 insertions(+), 48 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/fc89/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
----------------------------------------------------------------------
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
index c47e78c..77bd66b 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
@@ -42,6 +42,9 @@ private[spark] object SparkKubernetesClientFactory {
       sparkConf: SparkConf,
       defaultServiceAccountToken: Option[File],
       defaultServiceAccountCaCert: Option[File]): KubernetesClient = {
+
+    // TODO [SPARK-25887] Support configurable context
+
     val oauthTokenFileConf = s"$kubernetesAuthConfPrefix.$OAUTH_TOKEN_FILE_CONF_SUFFIX"
     val oauthTokenConf = s"$kubernetesAuthConfPrefix.$OAUTH_TOKEN_CONF_SUFFIX"
     val oauthTokenFile = sparkConf.getOption(oauthTokenFileConf)
@@ -63,6 +66,8 @@ private[spark] object SparkKubernetesClientFactory {
       .getOption(s"$kubernetesAuthConfPrefix.$CLIENT_CERT_FILE_CONF_SUFFIX")
     val dispatcher = new Dispatcher(
       ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
+
+    // TODO [SPARK-25887] Create builder in a way that respects configurable context
     val config = new ConfigBuilder()
       .withApiVersion("v1")
       .withMasterUrl(master)

http://git-wip-us.apache.org/repos/asf/spark/blob/fc89/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/LoggingPodStatusWatcher.scala
----------------------------------------------------------------------
diff --git
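The backend generalisation described in this commit can be illustrated with a rough, self-contained sketch. This is an assumption-laden stand-in, not the actual code from `backend/IntegrationTestBackend.scala`: the trait members, the selection function, and the configuration values are simplified for illustration, with only the backend names (`minikube`, Docker for Desktop) taken from the commit message.

```scala
object BackendDemo {
  // Simplified stand-in for the integration-test backend abstraction
  trait IntegrationTestBackend {
    def name: String
  }

  object MinikubeBackend extends IntegrationTestBackend {
    val name = "minikube" // VM-based backend, the historical default
  }

  object DockerForDesktopBackend extends IntegrationTestBackend {
    val name = "docker-for-desktop" // reuses Docker for Desktop's built-in cluster
  }

  // Pick a backend from a configuration value, falling back to the
  // minikube default when nothing is specified.
  def select(setting: Option[String]): IntegrationTestBackend = setting match {
    case Some("docker-for-desktop") => DockerForDesktopBackend
    case _                          => MinikubeBackend
  }

  def main(args: Array[String]): Unit = {
    println(select(Some("docker-for-desktop")).name) // prints docker-for-desktop
    println(select(None).name)                       // prints minikube
  }
}
```

The point of the pattern is that the test suites depend only on the trait, so a new cluster flavour (such as the `KubeConfigBackend` added here for arbitrary clusters) only has to supply another implementation.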
svn commit: r30565 - in /dev/spark/3.0.0-SNAPSHOT-2018_11_01_00_06-cd92f25-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s
Author: pwendell
Date: Thu Nov 1 07:21:19 2018
New Revision: 30565

Log:
Apache Spark 3.0.0-SNAPSHOT-2018_11_01_00_06-cd92f25 docs

[This commit notification would consist of 1472 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]