[spark] branch master updated (3d98c9f -> be867e8)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 3d98c9f [SPARK-30179][SQL][TESTS] Improve test in SingleSessionSuite
     add be867e8 [SPARK-30196][BUILD] Bump lz4-java version to 1.7.0

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-1.2 | 2 +-
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (36fa198 -> 3d98c9f)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 36fa198 [SPARK-30158][SQL][CORE] Seq -> Array for sc.parallelize for 2.13 compatibility; remove WrappedArray
     add 3d98c9f [SPARK-30179][SQL][TESTS] Improve test in SingleSessionSuite

No new revisions were added by this update.

Summary of changes:
 .../sql/hive/thriftserver/HiveThriftServer2Suites.scala | 12
 1 file changed, 12 insertions(+)
[spark] branch master updated (8a9cccf -> 36fa198)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8a9cccf [SPARK-30146][ML][PYSPARK] Add setWeightCol to GBTs in PySpark
     add 36fa198 [SPARK-30158][SQL][CORE] Seq -> Array for sc.parallelize for 2.13 compatibility; remove WrappedArray

No new revisions were added by this update.

Summary of changes:
 .../examples/mllib/ElementwiseProductExample.scala | 2 +-
 .../apache/spark/mllib/pmml/PMMLExportable.scala | 2 +-
 .../spark/ml/clustering/BisectingKMeansSuite.scala | 4 +-
 .../apache/spark/ml/clustering/KMeansSuite.scala | 4 +-
 .../apache/spark/ml/recommendation/ALSSuite.scala | 4 +-
 .../mllib/clustering/GaussianMixtureSuite.scala | 4 +-
 .../spark/mllib/clustering/KMeansSuite.scala | 4 +-
 .../apache/spark/mllib/clustering/LDASuite.scala | 4 +-
 .../org/apache/spark/mllib/feature/PCASuite.scala | 2 +-
 .../spark/sql/JavaHigherOrderFunctionsSuite.java | 73 --
 .../test/org/apache/spark/sql/JavaTestUtils.java | 47 --
 .../parquet/ParquetPartitionDiscoverySuite.scala | 16 ++---
 .../datasources/parquet/ParquetQuerySuite.scala | 2 +-
 13 files changed, 78 insertions(+), 90 deletions(-)
 delete mode 100644 sql/core/src/test/java/test/org/apache/spark/sql/JavaTestUtils.java
[spark] branch master updated (538b8d1 -> 8a9cccf)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 538b8d1 [SPARK-30159][SQL][FOLLOWUP] Fix lint-java via removing unnecessary imports
     add 8a9cccf [SPARK-30146][ML][PYSPARK] Add setWeightCol to GBTs in PySpark

No new revisions were added by this update.

Summary of changes:
 python/pyspark/ml/classification.py | 27 +++
 python/pyspark/ml/regression.py | 28
 2 files changed, 47 insertions(+), 8 deletions(-)
[spark] branch master updated (729f43f -> 538b8d1)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 729f43f [SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation to the metrics system
     add 538b8d1 [SPARK-30159][SQL][FOLLOWUP] Fix lint-java via removing unnecessary imports

No new revisions were added by this update.

Summary of changes:
 .../src/test/java/org/apache/spark/sql/avro/JavaAvroFunctionsSuite.java | 1 -
 sql/core/src/test/java/test/org/apache/spark/sql/JavaSaveLoadSuite.java | 1 -
 sql/hive/src/test/java/org/apache/spark/sql/hive/JavaDataFrameSuite.java | 1 -
 .../java/org/apache/spark/sql/hive/JavaMetastoreDataSourcesSuite.java | 1 -
 4 files changed, 4 deletions(-)
[spark] branch master updated: [SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation to the metrics system
This is an automated email from the ASF dual-hosted git repository.

irashid pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 729f43f [SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation to the metrics system
729f43f is described below

commit 729f43f499f3dd2718c0b28d73f2ca29cc811eac
Author: Luca Canali
AuthorDate: Mon Dec 9 08:55:30 2019 -0600

    [SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation to the metrics system

    ## What changes were proposed in this pull request?

    This PR proposes to add instrumentation of memory usage via the Spark Dropwizard/Codahale metrics system. Memory usage metrics are available via the Executor metrics, recently implemented as detailed in https://issues.apache.org/jira/browse/SPARK-23206.
    Additional notes: This takes advantage of the metrics poller introduced in #23767.

    ## Why are the changes needed?

    Executor metrics provide many useful insights into memory usage, in particular the usage of storage memory and executor memory. This is useful for troubleshooting. Having the information in the metrics system makes it possible to add those metrics to Spark performance dashboards and study memory usage as a function of time, as in the example graph https://issues.apache.org/jira/secure/attachment/12962810/Example_dashboard_Spark_Memory_Metrics.PNG

    ## Does this PR introduce any user-facing change?

    Adds `ExecutorMetrics` source to publish executor metrics via the Dropwizard metrics system. Details of the available metrics are in docs/monitoring.md.
    Adds configuration parameter `spark.metrics.executormetrics.source.enabled`.

    ## How was this patch tested?

    Tested on a YARN cluster and with an existing setup for a Spark dashboard based on InfluxDB and Grafana.

    Closes #24132 from LucaCanali/memoryMetricsSource.

    Authored-by: Luca Canali
    Signed-off-by: Imran Rashid
---
 .../main/scala/org/apache/spark/SparkContext.scala | 16 +-
 .../scala/org/apache/spark/executor/Executor.scala | 11 +++-
 .../spark/executor/ExecutorMetricsPoller.scala | 4 +-
 .../spark/executor/ExecutorMetricsSource.scala | 65 ++
 .../org/apache/spark/internal/config/package.scala | 6 ++
 .../spark/metrics/source/SourceConfigSuite.scala | 30 +-
 docs/monitoring.md | 41 ++
 7 files changed, 167 insertions(+), 6 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 0694501..96ca12b 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -42,7 +42,7 @@ import org.apache.spark.annotation.DeveloperApi
 import org.apache.spark.broadcast.Broadcast
 import org.apache.spark.deploy.{LocalSparkCluster, SparkHadoopUtil}
 import org.apache.spark.deploy.StandaloneResourceUtils._
-import org.apache.spark.executor.ExecutorMetrics
+import org.apache.spark.executor.{ExecutorMetrics, ExecutorMetricsSource}
 import org.apache.spark.input.{FixedLengthBinaryInputFormat, PortableDataStream, StreamInputFormat, WholeTextFileInputFormat}
 import org.apache.spark.internal.Logging
 import org.apache.spark.internal.config._
@@ -551,9 +551,16 @@ class SparkContext(config: SparkConf) extends Logging {
     _dagScheduler = new DAGScheduler(this)
     _heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)
 
+    val _executorMetricsSource =
+      if (_conf.get(METRICS_EXECUTORMETRICS_SOURCE_ENABLED)) {
+        Some(new ExecutorMetricsSource)
+      } else {
+        None
+      }
+
     // create and start the heartbeater for collecting memory metrics
     _heartbeater = new Heartbeater(
-      () => SparkContext.this.reportHeartBeat(),
+      () => SparkContext.this.reportHeartBeat(_executorMetricsSource),
       "driver-heartbeater",
       conf.get(EXECUTOR_HEARTBEAT_INTERVAL))
     _heartbeater.start()
@@ -622,6 +629,7 @@ class SparkContext(config: SparkConf) extends Logging {
     _env.metricsSystem.registerSource(_dagScheduler.metricsSource)
     _env.metricsSystem.registerSource(new BlockManagerSource(_env.blockManager))
     _env.metricsSystem.registerSource(new JVMCPUSource())
+    _executorMetricsSource.foreach(_.register(_env.metricsSystem))
     _executorAllocationManager.foreach { e =>
       _env.metricsSystem.registerSource(e.executorAllocationManagerSource)
     }
@@ -2473,8 +2481,10 @@ class SparkContext(config: SparkConf) extends Logging {
   }
 
   /** Reports heartbeat metrics for the driver. */
-  private def reportHeartBeat(): Unit = {
+  private def reportHeartBeat(executorMetricsSource: Option[ExecutorMetricsSource]): Unit = {
     val currentMetrics = ExecutorMetrics.getCurrentMetrics(env.memoryManager)
+ex
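The SparkContext hunks above follow a common pattern: build the metrics source only when a boolean setting is on, hold it as an `Option`, and let `foreach` turn both registration and the per-heartbeat update into no-ops when it is absent. The sketch below illustrates that pattern in plain Scala with no Spark dependency. Every name in it (`ToyMetricsSystem`, `ToySource`, `ToyExecutorMetricsSource`, `update`, `heapUsed`) is a hypothetical stand-in, not Spark's API. In particular, the diff is truncated before it shows what `reportHeartBeat` does with the source, so the `update` call here is only a guess at that step.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a metrics source; not Spark's Source trait.
trait ToySource {
  def name: String
  def values: Map[String, Long]
}

// Hypothetical stand-in for a metrics system that keeps registered sources by name.
final class ToyMetricsSystem {
  private val sources = mutable.LinkedHashMap.empty[String, ToySource]

  def registerSource(source: ToySource): Unit = sources.put(source.name, source)

  def sourceNames: List[String] = sources.keys.toList

  // Returns the latest values of a source, or an empty map if it was never registered.
  def snapshot(name: String): Map[String, Long] =
    sources.get(name).map(_.values).getOrElse(Map.empty)
}

// Hypothetical source that holds the most recent metrics snapshot.
final class ToyExecutorMetricsSource extends ToySource {
  @volatile private var latest: Map[String, Long] = Map.empty

  val name: String = "ExecutorMetrics"

  def values: Map[String, Long] = latest

  // Assumed update step; the truncated diff does not show the real one.
  def update(current: Map[String, Long]): Unit = latest = current

  def register(system: ToyMetricsSystem): Unit = system.registerSource(this)
}

object OptionalSourceSketch {
  // Creates the source only when the flag is on, then registers it if present.
  def build(enabled: Boolean): (ToyMetricsSystem, Option[ToyExecutorMetricsSource]) = {
    val system = new ToyMetricsSystem
    val source = if (enabled) Some(new ToyExecutorMetricsSource) else None
    source.foreach(_.register(system))
    (system, source)
  }

  // With None this does nothing, so callers need no separate "is it enabled" check.
  def reportHeartBeat(
      source: Option[ToyExecutorMetricsSource],
      current: Map[String, Long]): Unit =
    source.foreach(_.update(current))

  def main(args: Array[String]): Unit = {
    val (system, source) = build(enabled = true)
    reportHeartBeat(source, Map("heapUsed" -> 42L))
    println(system.sourceNames)
    println(system.snapshot("ExecutorMetrics"))
  }
}
```

Passing the `Option` into the heartbeat callback, as the diff does with `reportHeartBeat(_executorMetricsSource)`, keeps the disabled case free of any extra state: nothing is registered and nothing is updated.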
[spark] branch master updated (c2f29d5 -> a717d21)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c2f29d5 [SPARK-30138][SQL] Separate configuration key of max iterations for analyzer and optimizer
     add a717d21 [SPARK-30159][SQL][TESTS] Fix the method calls of `QueryTest.checkAnswer`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/avro/JavaAvroFunctionsSuite.java | 9 +
 .../org/apache/spark/sql/JavaSaveLoadSuite.java | 5 +--
 .../scala/org/apache/spark/sql/QueryTest.scala | 40 +++---
 .../ReduceNumShufflePartitionsSuite.scala | 25 +-
 .../binaryfile/BinaryFileFormatSuite.scala | 6 ++--
 .../apache/spark/sql/streaming/StreamSuite.scala | 6 ++--
 .../apache/spark/sql/hive/JavaDataFrameSuite.java | 5 +--
 .../sql/hive/JavaMetastoreDataSourcesSuite.java | 9 +
 .../sql/hive/execution/AggregationQuerySuite.scala | 2 +-
 9 files changed, 47 insertions(+), 60 deletions(-)
[spark] branch master updated (dcea7a4 -> c2f29d5)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from dcea7a4 [SPARK-29883][SQL] Implement a helper method for aliasing bool_and() and bool_or()
     add c2f29d5 [SPARK-30138][SQL] Separate configuration key of max iterations for analyzer and optimizer

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala | 4 ++--
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 10 +-
 2 files changed, 11 insertions(+), 3 deletions(-)