[spark] branch master updated (3d98c9f -> be867e8)

2019-12-09 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 3d98c9f  [SPARK-30179][SQL][TESTS] Improve test in SingleSessionSuite
 add be867e8  [SPARK-30196][BUILD] Bump lz4-java version to 1.7.0

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-1.2 | 2 +-
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (36fa198 -> 3d98c9f)

2019-12-09 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 36fa198  [SPARK-30158][SQL][CORE] Seq -> Array for sc.parallelize for 
2.13 compatibility; remove WrappedArray
 add 3d98c9f  [SPARK-30179][SQL][TESTS] Improve test in SingleSessionSuite

No new revisions were added by this update.

Summary of changes:
 .../sql/hive/thriftserver/HiveThriftServer2Suites.scala  | 12 
 1 file changed, 12 insertions(+)





[spark] branch master updated (8a9cccf -> 36fa198)

2019-12-09 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8a9cccf  [SPARK-30146][ML][PYSPARK] Add setWeightCol to GBTs in PySpark
 add 36fa198  [SPARK-30158][SQL][CORE] Seq -> Array for sc.parallelize for 
2.13 compatibility; remove WrappedArray

No new revisions were added by this update.

Summary of changes:
 .../examples/mllib/ElementwiseProductExample.scala |  2 +-
 .../apache/spark/mllib/pmml/PMMLExportable.scala   |  2 +-
 .../spark/ml/clustering/BisectingKMeansSuite.scala |  4 +-
 .../apache/spark/ml/clustering/KMeansSuite.scala   |  4 +-
 .../apache/spark/ml/recommendation/ALSSuite.scala  |  4 +-
 .../mllib/clustering/GaussianMixtureSuite.scala|  4 +-
 .../spark/mllib/clustering/KMeansSuite.scala   |  4 +-
 .../apache/spark/mllib/clustering/LDASuite.scala   |  4 +-
 .../org/apache/spark/mllib/feature/PCASuite.scala  |  2 +-
 .../spark/sql/JavaHigherOrderFunctionsSuite.java   | 73 --
 .../test/org/apache/spark/sql/JavaTestUtils.java   | 47 --
 .../parquet/ParquetPartitionDiscoverySuite.scala   | 16 ++---
 .../datasources/parquet/ParquetQuerySuite.scala|  2 +-
 13 files changed, 78 insertions(+), 90 deletions(-)
 delete mode 100644 sql/core/src/test/java/test/org/apache/spark/sql/JavaTestUtils.java





[spark] branch master updated (538b8d1 -> 8a9cccf)

2019-12-09 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 538b8d1  [SPARK-30159][SQL][FOLLOWUP] Fix lint-java via removing 
unnecessary imports
 add 8a9cccf  [SPARK-30146][ML][PYSPARK] Add setWeightCol to GBTs in PySpark

No new revisions were added by this update.

Summary of changes:
 python/pyspark/ml/classification.py | 27 +++
 python/pyspark/ml/regression.py | 28 
 2 files changed, 47 insertions(+), 8 deletions(-)





[spark] branch master updated (729f43f -> 538b8d1)

2019-12-09 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 729f43f  [SPARK-27189][CORE] Add Executor metrics and memory usage 
instrumentation to the metrics system
 add 538b8d1  [SPARK-30159][SQL][FOLLOWUP] Fix lint-java via removing 
unnecessary imports

No new revisions were added by this update.

Summary of changes:
 .../src/test/java/org/apache/spark/sql/avro/JavaAvroFunctionsSuite.java  | 1 -
 sql/core/src/test/java/test/org/apache/spark/sql/JavaSaveLoadSuite.java  | 1 -
 sql/hive/src/test/java/org/apache/spark/sql/hive/JavaDataFrameSuite.java | 1 -
 .../java/org/apache/spark/sql/hive/JavaMetastoreDataSourcesSuite.java| 1 -
 4 files changed, 4 deletions(-)





[spark] branch master updated: [SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation to the metrics system

2019-12-09 Thread irashid
This is an automated email from the ASF dual-hosted git repository.

irashid pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 729f43f  [SPARK-27189][CORE] Add Executor metrics and memory usage 
instrumentation to the metrics system
729f43f is described below

commit 729f43f499f3dd2718c0b28d73f2ca29cc811eac
Author: Luca Canali 
AuthorDate: Mon Dec 9 08:55:30 2019 -0600

[SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation 
to the metrics system

## What changes were proposed in this pull request?

This PR proposes to add instrumentation of memory usage via the Spark 
Dropwizard/Codahale metrics system. Memory usage metrics are available via the 
Executor metrics, recently implemented as detailed in 
https://issues.apache.org/jira/browse/SPARK-23206.
Additional notes: This takes advantage of the metrics poller introduced in 
#23767.

## Why are the changes needed?
Executor metrics bring many useful insights on memory usage, in 
particular on the usage of storage memory and executor memory. This is useful 
for troubleshooting. Having the information in the metrics system makes it 
possible to add those metrics to Spark performance dashboards and study memory 
usage as a function of time, as in the example graph 
https://issues.apache.org/jira/secure/attachment/12962810/Example_dashboard_Spark_Memory_Metrics.PNG

## Does this PR introduce any user-facing change?
Adds `ExecutorMetrics` source to publish executor metrics via the 
Dropwizard metrics system. Details of the available metrics are in 
docs/monitoring.md.
Adds configuration parameter `spark.metrics.executormetrics.source.enabled`
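
As an illustration of the configuration parameter named above, a minimal sketch of a `spark-defaults.conf` entry follows. The key is taken from this commit message; the value `true` is an assumption chosen for illustration, and the parameter's default is not stated here.

```properties
# spark-defaults.conf (illustrative fragment)
# Key as named in the commit message above; the value shown is an assumption.
spark.metrics.executormetrics.source.enabled   true
```

The same key can also be passed per application, for example with `spark-submit --conf spark.metrics.executormetrics.source.enabled=true`.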

## How was this patch tested?

Tested on YARN cluster and with an existing setup for a Spark dashboard 
based on InfluxDB and Grafana.

Closes #24132 from LucaCanali/memoryMetricsSource.

Authored-by: Luca Canali 
Signed-off-by: Imran Rashid 
---
 .../main/scala/org/apache/spark/SparkContext.scala | 16 +-
 .../scala/org/apache/spark/executor/Executor.scala | 11 +++-
 .../spark/executor/ExecutorMetricsPoller.scala |  4 +-
 .../spark/executor/ExecutorMetricsSource.scala | 65 ++
 .../org/apache/spark/internal/config/package.scala |  6 ++
 .../spark/metrics/source/SourceConfigSuite.scala   | 30 +-
 docs/monitoring.md | 41 ++
 7 files changed, 167 insertions(+), 6 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala 
b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 0694501..96ca12b 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -42,7 +42,7 @@ import org.apache.spark.annotation.DeveloperApi
 import org.apache.spark.broadcast.Broadcast
 import org.apache.spark.deploy.{LocalSparkCluster, SparkHadoopUtil}
 import org.apache.spark.deploy.StandaloneResourceUtils._
-import org.apache.spark.executor.ExecutorMetrics
+import org.apache.spark.executor.{ExecutorMetrics, ExecutorMetricsSource}
 import org.apache.spark.input.{FixedLengthBinaryInputFormat, 
PortableDataStream, StreamInputFormat, WholeTextFileInputFormat}
 import org.apache.spark.internal.Logging
 import org.apache.spark.internal.config._
@@ -551,9 +551,16 @@ class SparkContext(config: SparkConf) extends Logging {
 _dagScheduler = new DAGScheduler(this)
 _heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)
 
+val _executorMetricsSource =
+  if (_conf.get(METRICS_EXECUTORMETRICS_SOURCE_ENABLED)) {
+Some(new ExecutorMetricsSource)
+  } else {
+None
+  }
+
 // create and start the heartbeater for collecting memory metrics
 _heartbeater = new Heartbeater(
-  () => SparkContext.this.reportHeartBeat(),
+  () => SparkContext.this.reportHeartBeat(_executorMetricsSource),
   "driver-heartbeater",
   conf.get(EXECUTOR_HEARTBEAT_INTERVAL))
 _heartbeater.start()
@@ -622,6 +629,7 @@ class SparkContext(config: SparkConf) extends Logging {
 _env.metricsSystem.registerSource(_dagScheduler.metricsSource)
 _env.metricsSystem.registerSource(new 
BlockManagerSource(_env.blockManager))
 _env.metricsSystem.registerSource(new JVMCPUSource())
+_executorMetricsSource.foreach(_.register(_env.metricsSystem))
 _executorAllocationManager.foreach { e =>
   _env.metricsSystem.registerSource(e.executorAllocationManagerSource)
 }
@@ -2473,8 +2481,10 @@ class SparkContext(config: SparkConf) extends Logging {
   }
 
   /** Reports heartbeat metrics for the driver. */
-  private def reportHeartBeat(): Unit = {
+  private def reportHeartBeat(executorMetricsSource: 
Option[ExecutorMetricsSource]): Unit = {
 val currentMetrics = ExecutorMetrics.getCurrentMetrics(env.memoryManager)
+ex

[spark] branch master updated (c2f29d5 -> a717d21)

2019-12-09 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from c2f29d5  [SPARK-30138][SQL] Separate configuration key of max 
iterations for analyzer and optimizer
 add a717d21  [SPARK-30159][SQL][TESTS] Fix the method calls of 
`QueryTest.checkAnswer`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/avro/JavaAvroFunctionsSuite.java |  9 +
 .../org/apache/spark/sql/JavaSaveLoadSuite.java|  5 +--
 .../scala/org/apache/spark/sql/QueryTest.scala | 40 +++---
 .../ReduceNumShufflePartitionsSuite.scala  | 25 +-
 .../binaryfile/BinaryFileFormatSuite.scala |  6 ++--
 .../apache/spark/sql/streaming/StreamSuite.scala   |  6 ++--
 .../apache/spark/sql/hive/JavaDataFrameSuite.java  |  5 +--
 .../sql/hive/JavaMetastoreDataSourcesSuite.java|  9 +
 .../sql/hive/execution/AggregationQuerySuite.scala |  2 +-
 9 files changed, 47 insertions(+), 60 deletions(-)





[spark] branch master updated (dcea7a4 -> c2f29d5)

2019-12-09 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from dcea7a4  [SPARK-29883][SQL] Implement a helper method for aliasing 
bool_and() and bool_or()
 add c2f29d5  [SPARK-30138][SQL] Separate configuration key of max 
iterations for analyzer and optimizer

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala  |  4 ++--
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 10 +-
 2 files changed, 11 insertions(+), 3 deletions(-)




