[spark] branch master updated (534f5d4 -> da32d1e)

2020-01-31 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 534f5d4  [SPARK-29138][PYTHON][TEST] Increase timeout of 
StreamingLogisticRegressionWithSGDTests.test_parameter_accuracy
 add da32d1e  [SPARK-30700][ML] NaiveBayesModel predict optimization

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/ml/classification/NaiveBayes.scala   | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (3538095 -> 534f5d4)

2020-01-31 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 3538095  [SPARK-30698][BUILD] Bumps checkstyle from 8.25 to 8.29
 add 534f5d4  [SPARK-29138][PYTHON][TEST] Increase timeout of 
StreamingLogisticRegressionWithSGDTests.test_parameter_accuracy

No new revisions were added by this update.

Summary of changes:
 python/pyspark/mllib/tests/test_streaming_algorithms.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (878094f -> 3538095)

2020-01-31 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 878094f  [SPARK-30689][CORE][YARN] Add resource discovery plugin api 
to support YARN versions with resource scheduling
 add 3538095  [SPARK-30698][BUILD] Bumps checkstyle from 8.25 to 8.29

No new revisions were added by this update.

Summary of changes:
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-30689][CORE][YARN] Add resource discovery plugin api to support YARN versions with resource scheduling

2020-01-31 Thread tgraves
This is an automated email from the ASF dual-hosted git repository.

tgraves pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 878094f  [SPARK-30689][CORE][YARN] Add resource discovery plugin api 
to support YARN versions with resource scheduling
878094f is described below

commit 878094f9720d3c1866cbc01fb24c9794fe34edd9
Author: Thomas Graves 
AuthorDate: Fri Jan 31 22:20:28 2020 -0600

[SPARK-30689][CORE][YARN] Add resource discovery plugin api to support YARN 
versions with resource scheduling

### What changes were proposed in this pull request?

This change makes custom resource (GPUs, FPGAs, etc.) discovery more flexible. Users are asking for it to work with Hadoop 2.x versions that do not support resource scheduling in YARN, and they may also not run in an isolated environment.
This change creates a plugin API so that users can write their own resource discovery class, which allows a lot more flexibility. The user can chain plugins for different resource types. The user-specified plugins execute in the order given and fall back to the discovery-script plugin if they don't return information for a particular resource.

I had to open up a few of the classes to be public, change them to not be case classes, and mark them as developer API in order for the plugin to get the information it needs.

I also relaxed the YARN side so that if YARN isn't configured for resource scheduling we just warn and go on. This helps users who have YARN 3.1 but haven't configured resource scheduling on their cluster yet, or aren't running in an isolated environment.

The user would configure this like:
--conf spark.resources.discovery.plugin="org.apache.spark.resource.ResourceDiscoveryFPGAPlugin, org.apache.spark.resource.ResourceDiscoveryGPUPlugin"

Note the executor side had to be wrapped with a classloader to make sure we 
include the user classpath for jars they specified on submission.

Note this is more flexible because the discovery script has limitations, such as being spawned in a separate process: if you try to allocate resources in that process, they might be released when the script returns. A plugin class also makes it easier to integrate with existing systems and solutions for assigning resources.

### Why are the changes needed?

To more easily use Spark resource scheduling with older versions of Hadoop or in non-isolated environments.

### Does this PR introduce any user-facing change?

Yes, a plugin API.

### How was this patch tested?

Unit tests added, and manual testing done in YARN and standalone modes.

Closes #27410 from tgravescs/hadoop27spark3.

Lead-authored-by: Thomas Graves 
Co-authored-by: Thomas Graves 
Signed-off-by: Thomas Graves 
---
 .../api/resource/ResourceDiscoveryPlugin.java  |  63 +++
 .../main/scala/org/apache/spark/SparkContext.scala |   8 +-
 .../spark/deploy/StandaloneResourceUtils.scala |   4 +-
 .../executor/CoarseGrainedExecutorBackend.scala|  36 +++-
 .../org/apache/spark/internal/config/package.scala |  12 ++
 .../resource/ResourceDiscoveryScriptPlugin.scala   |  62 +++
 .../apache/spark/resource/ResourceProfile.scala|   4 +-
 .../org/apache/spark/resource/ResourceUtils.scala  | 136 +--
 .../scala/org/apache/spark/SparkConfSuite.scala|   2 +-
 .../CoarseGrainedExecutorBackendSuite.scala|   3 +-
 .../resource/ResourceDiscoveryPluginSuite.scala| 194 +
 .../apache/spark/resource/ResourceUtilsSuite.scala |  65 ---
 .../apache/spark/resource/TestResourceIDs.scala|  16 +-
 docs/configuration.md  |  12 ++
 .../apache/spark/deploy/k8s/KubernetesUtils.scala  |   8 +-
 .../k8s/features/BasicDriverFeatureStepSuite.scala |   2 +-
 .../features/BasicExecutorFeatureStepSuite.scala   |   4 +-
 .../spark/deploy/yarn/ResourceRequestHelper.scala  |  31 +++-
 .../org/apache/spark/deploy/yarn/ClientSuite.scala |   6 +-
 19 files changed, 560 insertions(+), 108 deletions(-)

diff --git 
a/core/src/main/java/org/apache/spark/api/resource/ResourceDiscoveryPlugin.java 
b/core/src/main/java/org/apache/spark/api/resource/ResourceDiscoveryPlugin.java
new file mode 100644
index 000..ffd2f83
--- /dev/null
+++ 
b/core/src/main/java/org/apache/spark/api/resource/ResourceDiscoveryPlugin.java
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not 

[spark] branch master updated (d0c3e9f -> 8eecc20)

2020-01-31 Thread lixiao
This is an automated email from the ASF dual-hosted git repository.

lixiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d0c3e9f  [SPARK-30660][ML][PYSPARK] LinearRegression blockify input 
vectors
 add 8eecc20  [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING 
"show create table"

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md|   2 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4|   2 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala |   2 +-
 .../sql/catalyst/plans/logical/statements.scala|   4 +-
 .../catalyst/analysis/ResolveSessionCatalog.scala  |   6 +-
 .../spark/sql/execution/command/tables.scala   | 285 --
 .../org/apache/spark/sql/internal/HiveSerDe.scala  |  16 +
 .../sql-tests/inputs/show-create-table.sql |  11 +-
 .../sql-tests/results/show-create-table.sql.out|  34 ++-
 .../apache/spark/sql/ShowCreateTableSuite.scala|  16 +-
 .../spark/sql/hive/HiveShowCreateTableSuite.scala  | 327 -
 11 files changed, 581 insertions(+), 124 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (2fd15a2 -> d0c3e9f)

2020-01-31 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2fd15a2  [SPARK-30695][BUILD] Upgrade Apache ORC to 1.5.9
 add d0c3e9f  [SPARK-30660][ML][PYSPARK] LinearRegression blockify input 
vectors

No new revisions were added by this update.

Summary of changes:
 .../ml/optim/aggregator/HuberAggregator.scala  | 103 -
 .../optim/aggregator/LeastSquaresAggregator.scala  |  74 +++
 .../spark/ml/regression/LinearRegression.scala |  49 ++
 .../ml/optim/aggregator/HuberAggregatorSuite.scala |  61 +---
 .../aggregator/LeastSquaresAggregatorSuite.scala   |  62 ++---
 python/pyspark/ml/regression.py|  22 +++--
 6 files changed, 289 insertions(+), 82 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (82b4f75 -> 2fd15a2)

2020-01-31 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 82b4f75  [SPARK-30508][SQL] Add SparkSession.executeCommand API for 
external datasource
 add 2fd15a2  [SPARK-30695][BUILD] Upgrade Apache ORC to 1.5.9

No new revisions were added by this update.

Summary of changes:
 LICENSE-binary  | 1 +
 dev/deps/spark-deps-hadoop-2.7-hive-1.2 | 7 ---
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 9 +
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 9 +
 pom.xml | 6 --
 5 files changed, 19 insertions(+), 13 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] dongjoon-hyun commented on issue #258: Add target version of blockers

2020-01-31 Thread GitBox
dongjoon-hyun commented on issue #258: Add target version of blockers
URL: https://github.com/apache/spark-website/pull/258#issuecomment-580968165
 
 
   Thank you all!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] srowen closed pull request #258: Add target version of blockers

2020-01-31 Thread GitBox
srowen closed pull request #258: Add target version of blockers
URL: https://github.com/apache/spark-website/pull/258
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark-website] branch asf-site updated: Add target version of blockers

2020-01-31 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 86f060b  Add target version of blockers
86f060b is described below

commit 86f060b345abe82ff6ae21600de916c7263e5614
Author: Dongjoon Hyun 
AuthorDate: Fri Jan 31 18:33:24 2020 -0600

Add target version of blockers

Author: Dongjoon Hyun 

Closes #258 from dongjoon-hyun/target.
---
 contributing.md| 2 +-
 site/contributing.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/contributing.md b/contributing.md
index 9cde6ce..3016a26 100644
--- a/contributing.md
+++ b/contributing.md
@@ -268,7 +268,7 @@ Example: `Fix typos in Foo scaladoc`
 Blockers. JIRA tends to unfortunately conflate "size" and "importance" 
in its 
 Priority field values. Their meaning is roughly:
  1. Blocker: pointless to release without this change as the 
release would be unusable 
- to a large minority of users. Correctness and data loss issues 
should be considered Blockers.
+ to a large minority of users. Correctness and data loss issues 
should be considered Blockers for their target versions.
  1. Critical: a large minority of users are missing important 
functionality without 
  this, and/or a workaround is difficult
  1. Major: a small minority of users are missing important 
functionality without this, 
diff --git a/site/contributing.html b/site/contributing.html
index b29d381..98f7ed9 100644
--- a/site/contributing.html
+++ b/site/contributing.html
@@ -488,7 +488,7 @@ Example: Fix typos in Foo 
scaladoc
  Priority field values. Their meaning is roughly:
 
   Blocker: pointless to release without this change as the 
release would be unusable 
-  to a large minority of users. Correctness and data loss issues should be 
considered Blockers.
+  to a large minority of users. Correctness and data loss issues should be 
considered Blockers for their target versions.
   Critical: a large minority of users are missing important 
functionality without 
   this, and/or a workaround is difficult
   Major: a small minority of users are missing important 
functionality without this, 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (2d4b5ea -> 82b4f75)

2020-01-31 Thread lixiao
This is an automated email from the ASF dual-hosted git repository.

lixiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2d4b5ea  [SPARK-30676][CORE][TESTS] Eliminate warnings from deprecated 
constructors of java.lang.Integer and java.lang.Double
 add 82b4f75  [SPARK-30508][SQL] Add SparkSession.executeCommand API for 
external datasource

No new revisions were added by this update.

Summary of changes:
 ...upportsRead.java => ExternalCommandRunner.java} | 30 +++--
 .../scala/org/apache/spark/sql/SparkSession.scala  | 31 +-
 .../spark/sql/execution/command/commands.scala | 30 ++---
 .../sql/sources/ExternalCommandRunnerSuite.scala   | 50 ++
 4 files changed, 120 insertions(+), 21 deletions(-)
 copy 
sql/catalyst/src/main/java/org/apache/spark/sql/connector/{catalog/SupportsRead.java
 => ExternalCommandRunner.java} (51%)
 create mode 100644 
sql/core/src/test/scala/org/apache/spark/sql/sources/ExternalCommandRunnerSuite.scala


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (387ce89 -> 2d4b5ea)

2020-01-31 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 387ce89  [SPARK-27324][DOC][CORE] Document configurations related to 
executor metrics and modify a configuration
 add 2d4b5ea  [SPARK-30676][CORE][TESTS] Eliminate warnings from deprecated 
constructors of java.lang.Integer and java.lang.Double

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/rdd/RDD.scala  | 6 +++---
 .../spark/sql/catalyst/expressions/MutableProjectionSuite.scala | 2 +-
 sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala | 4 ++--
 .../scala/org/apache/spark/sql/hive/HiveUserDefinedTypeSuite.scala  | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-27324][DOC][CORE] Document configurations related to executor metrics and modify a configuration

2020-01-31 Thread irashid
This is an automated email from the ASF dual-hosted git repository.

irashid pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 387ce89  [SPARK-27324][DOC][CORE] Document configurations related to 
executor metrics and modify a configuration
387ce89 is described below

commit 387ce89a0631f1a4c6668b90ff2a7bbcf11919cd
Author: Wing Yew Poon 
AuthorDate: Fri Jan 31 14:28:02 2020 -0600

[SPARK-27324][DOC][CORE] Document configurations related to executor 
metrics and modify a configuration

### What changes were proposed in this pull request?

Add a section to the Configuration page to document configurations for 
executor metrics.
At the same time, rename 
spark.eventLog.logStageExecutorProcessTreeMetrics.enabled to 
spark.executor.processTreeMetrics.enabled and make it independent of 
spark.eventLog.logStageExecutorMetrics.enabled.
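
For illustration only, a job enabling both features independently after this rename might be submitted with (the option names come from this commit; combining them this way is just an example):
--conf spark.executor.processTreeMetrics.enabled=true \
--conf spark.eventLog.logStageExecutorMetrics.enabled=true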

### Why are the changes needed?

Executor metrics are new in Spark 3.0. They lack documentation.
Memory metrics as a whole are always collected, but the ones obtained from 
the process tree have to be optionally enabled. Making this depend on a single 
configuration makes for more intuitive behavior. Given this, the configuration 
property is renamed to better reflect its meaning.

### Does this PR introduce any user-facing change?

Yes, only in that the configurations are all new to 3.0.

### How was this patch tested?

Not necessary.

Closes #27329 from wypoon/SPARK-27324.

Authored-by: Wing Yew Poon 
Signed-off-by: Imran Rashid 
---
 .../spark/executor/ExecutorMetricsSource.scala |  3 +-
 .../spark/executor/ProcfsMetricsGetter.scala   |  8 ++---
 .../org/apache/spark/internal/config/package.scala | 17 +++---
 .../spark/deploy/history/HistoryServerSuite.scala  |  2 +-
 docs/configuration.md  | 37 ++
 docs/monitoring.md | 20 ++--
 6 files changed, 65 insertions(+), 22 deletions(-)

diff --git 
a/core/src/main/scala/org/apache/spark/executor/ExecutorMetricsSource.scala 
b/core/src/main/scala/org/apache/spark/executor/ExecutorMetricsSource.scala
index b052e43..14645f7 100644
--- a/core/src/main/scala/org/apache/spark/executor/ExecutorMetricsSource.scala
+++ b/core/src/main/scala/org/apache/spark/executor/ExecutorMetricsSource.scala
@@ -32,8 +32,7 @@ import org.apache.spark.metrics.source.Source
  * spark.executor.metrics.pollingInterval=.
  * (2) Procfs metrics are gathered all in one-go and only conditionally:
  * if the /proc filesystem exists
- * and spark.eventLog.logStageExecutorProcessTreeMetrics.enabled=true
- * and spark.eventLog.logStageExecutorMetrics.enabled=true.
+ * and spark.executor.processTreeMetrics.enabled=true.
  */
 private[spark] class ExecutorMetricsSource extends Source {
 
diff --git 
a/core/src/main/scala/org/apache/spark/executor/ProcfsMetricsGetter.scala 
b/core/src/main/scala/org/apache/spark/executor/ProcfsMetricsGetter.scala
index 0d5dcfb4..80ef757 100644
--- a/core/src/main/scala/org/apache/spark/executor/ProcfsMetricsGetter.scala
+++ b/core/src/main/scala/org/apache/spark/executor/ProcfsMetricsGetter.scala
@@ -58,11 +58,9 @@ private[spark] class ProcfsMetricsGetter(procfsDir: String = 
"/proc/") extends L
   logWarning("Exception checking for procfs dir", ioe)
   false
   }
-  val shouldLogStageExecutorMetrics =
-SparkEnv.get.conf.get(config.EVENT_LOG_STAGE_EXECUTOR_METRICS)
-  val shouldLogStageExecutorProcessTreeMetrics =
-SparkEnv.get.conf.get(config.EVENT_LOG_PROCESS_TREE_METRICS)
-  procDirExists.get && shouldLogStageExecutorProcessTreeMetrics && 
shouldLogStageExecutorMetrics
+  val shouldPollProcessTreeMetrics =
+SparkEnv.get.conf.get(config.EXECUTOR_PROCESS_TREE_METRICS_ENABLED)
+  procDirExists.get && shouldPollProcessTreeMetrics
 }
   }
 
diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala 
b/core/src/main/scala/org/apache/spark/internal/config/package.scala
index 40b05cf..e68368f 100644
--- a/core/src/main/scala/org/apache/spark/internal/config/package.scala
+++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -148,11 +148,8 @@ package object config {
 
   private[spark] val EVENT_LOG_STAGE_EXECUTOR_METRICS =
 ConfigBuilder("spark.eventLog.logStageExecutorMetrics.enabled")
-  .booleanConf
-  .createWithDefault(false)
-
-  private[spark] val EVENT_LOG_PROCESS_TREE_METRICS =
-ConfigBuilder("spark.eventLog.logStageExecutorProcessTreeMetrics.enabled")
+  .doc("Whether to write per-stage peaks of executor metrics (for each 
executor) " +
+"to the event log.")
   .booleanConf
   .createWithDefault(false)
 
@@ -215,8 +212,18 @@ package object config {
   private[spark] val 

[spark] branch master updated (33546d6 -> 18bc4e5)

2020-01-31 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 33546d6  Revert "[SPARK-30036][SQL] Fix: REPARTITION hint does not 
work with order by"
 add 18bc4e5  [SPARK-30684][WEBUI] Show the description of metrics for WholeStageCodegen in DAG viz

No new revisions were added by this update.

Summary of changes:
 .../sql/execution/ui/static/spark-sql-viz.css  |  7 -
 .../spark/sql/execution/ui/static/spark-sql-viz.js | 30 ++
 .../spark/sql/execution/ui/SparkPlanGraph.scala| 17 +++-
 3 files changed, 47 insertions(+), 7 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5eac2dc -> 33546d6)

2020-01-31 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5eac2dc  [SPARK-30691][SQL][DOC] Add a few main pages to SQL Reference
 add 33546d6  Revert "[SPARK-30036][SQL] Fix: REPARTITION hint does not 
work with order by"

No new revisions were added by this update.

Summary of changes:
 .../execution/exchange/EnsureRequirements.scala|  2 -
 .../org/apache/spark/sql/ConfigBehaviorSuite.scala |  8 ++--
 .../apache/spark/sql/execution/PlannerSuite.scala  | 50 --
 3 files changed, 5 insertions(+), 55 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5e0faf9 -> 5eac2dc)

2020-01-31 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5e0faf9  [SPARK-29779][SPARK-30479][CORE][SQL][FOLLOWUP] Reflect 
review comments on post-hoc review
 add 5eac2dc  [SPARK-30691][SQL][DOC] Add a few main pages to SQL Reference

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml | 17 -
 docs/sql-ref-functions-builtin.md|  2 +-
 docs/sql-ref-functions-udf.md|  2 +-
 docs/sql-ref-functions.md|  2 +-
 docs/sql-ref-syntax-aux-analyze.md   |  4 ++--
 docs/sql-ref-syntax-aux-cache.md |  4 ++--
 docs/sql-ref-syntax-aux-conf-mgmt.md | 10 --
 docs/sql-ref-syntax-aux-describe.md  | 12 ++--
 docs/sql-ref-syntax-aux-resource-mgmt.md | 12 ++--
 docs/sql-ref-syntax-aux-show.md  | 17 ++---
 docs/sql-ref-syntax-aux.md   | 16 ++--
 docs/sql-ref-syntax-ddl.md   | 25 +++--
 docs/sql-ref-syntax-dml.md   |  8 
 docs/sql-ref-syntax.md   |  9 +++--
 docs/sql-ref.md  |  2 +-
 15 files changed, 70 insertions(+), 72 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (ff0f636 -> 5e0faf9)

2020-01-31 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ff0f636  [SPARK-30638][CORE][FOLLOWUP] Fix a spacing issue and use 
UTF-8 instead of ASCII
 add 5e0faf9  [SPARK-29779][SPARK-30479][CORE][SQL][FOLLOWUP] Reflect 
review comments on post-hoc review

No new revisions were added by this update.

Summary of changes:
 .../deploy/history/BasicEventFilterBuilder.scala   | 52 +++---
 .../org/apache/spark/internal/config/History.scala |  3 ++
 .../execution/history/SQLEventFilterBuilder.scala  | 50 ++---
 3 files changed, 54 insertions(+), 51 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (481e521 -> ff0f636)

2020-01-31 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 481e521  [SPARK-30657][SPARK-30658][SS] Fixed two bugs in streaming 
limits
 add ff0f636  [SPARK-30638][CORE][FOLLOWUP] Fix a spacing issue and use 
UTF-8 instead of ASCII

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/internal/plugin/PluginContainerSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-30657][SPARK-30658][SS] Fixed two bugs in streaming limits

2020-01-31 Thread zsxwing
This is an automated email from the ASF dual-hosted git repository.

zsxwing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 481e521  [SPARK-30657][SPARK-30658][SS] Fixed two bugs in streaming 
limits
481e521 is described below

commit 481e5211d237173ea0fb7c0b292eb7abd2b8a3fe
Author: Tathagata Das 
AuthorDate: Fri Jan 31 09:26:03 2020 -0800

[SPARK-30657][SPARK-30658][SS] Fixed two bugs in streaming limits

This PR solves two bugs related to streaming limits

**Bug 1 (SPARK-30658)**: Limit before a streaming aggregate (i.e. 
`df.limit(5).groupBy().count()`) in complete mode was not being planned as a 
stateful streaming limit. The planner rule planned a logical limit with a 
stateful streaming limit plan only if the query is in append mode. As a result, 
instead of allowing max 5 rows across batches, the planned streaming query was 
allowing 5 rows in every batch thus producing incorrect results.

**Solution**: Change the planner rule to plan the logical limit with a 
streaming limit plan even when the query is in complete mode if the logical 
limit has no stateful operator before it.

**Bug 2 (SPARK-30657)**: `LocalLimitExec` does not consume the iterator of the child plan. So if there is a limit after a stateful operator like streaming dedup in append mode (e.g. `df.dropDuplicates().limit(5)`), the state changes of the streaming dedup may not be committed (most stateful ops commit state changes only after the generated iterator is fully consumed).

**Solution**: Change the planner rule to always use a new 
`StreamingLocalLimitExec` which always fully consumes the iterator. This is the 
safest thing to do. However, this will introduce a performance regression as 
consuming the iterator is extra work. To minimize this performance impact, add 
an additional post-planner optimization rule to replace 
`StreamingLocalLimitExec` with `LocalLimitExec` when there is no stateful 
operator before the limit that could be affected by it.
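
For reference, the two query shapes described above could be reproduced with a sketch along these lines (the rate source, memory sink, and query names are illustrative, not part of this patch):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("streaming-limits-repro").getOrCreate()
val stream = spark.readStream.format("rate").load()

// Bug 1 (SPARK-30658) shape: a limit before a streaming aggregate, complete mode.
stream.limit(5).groupBy().count()
  .writeStream.outputMode("complete").format("memory").queryName("q1").start()

// Bug 2 (SPARK-30657) shape: a limit after a stateful dedup, append mode.
stream.dropDuplicates("value").limit(5)
  .writeStream.outputMode("append").format("memory").queryName("q2").start()
```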

No

Updated incorrect unit tests and added new ones

Closes #27373 from tdas/SPARK-30657.

Authored-by: Tathagata Das 
Signed-off-by: Shixiong Zhu 
---
 .../spark/sql/execution/SparkStrategies.scala  |  38 ---
 .../execution/streaming/IncrementalExecution.scala |  34 ++-
 ...GlobalLimitExec.scala => streamingLimits.scala} |  55 --
 .../apache/spark/sql/streaming/StreamSuite.scala   | 112 -
 4 files changed, 211 insertions(+), 28 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
index 00ad4e0..bd2684d 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@@ -451,21 +451,35 @@ abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
* Used to plan the streaming global limit operator for streams in append 
mode.
* We need to check for either a direct Limit or a Limit wrapped in a 
ReturnAnswer operator,
* following the example of the SpecialLimits Strategy above.
-   * Streams with limit in Append mode use the stateful 
StreamingGlobalLimitExec.
-   * Streams with limit in Complete mode use the stateless CollectLimitExec 
operator.
-   * Limit is unsupported for streams in Update mode.
*/
   case class StreamingGlobalLimitStrategy(outputMode: OutputMode) extends 
Strategy {
-override def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
-  case ReturnAnswer(rootPlan) => rootPlan match {
-case Limit(IntegerLiteral(limit), child)
-if plan.isStreaming && outputMode == InternalOutputModes.Append =>
-  StreamingGlobalLimitExec(limit, LocalLimitExec(limit, 
planLater(child))) :: Nil
-case _ => Nil
+
+private def generatesStreamingAppends(plan: LogicalPlan): Boolean = {
+
+  /** Ensures that this plan does not have a streaming aggregate in it. */
+  def hasNoStreamingAgg: Boolean = {
+plan.collectFirst { case a: Aggregate if a.isStreaming => a }.isEmpty
   }
-  case Limit(IntegerLiteral(limit), child)
-  if plan.isStreaming && outputMode == InternalOutputModes.Append =>
-StreamingGlobalLimitExec(limit, LocalLimitExec(limit, 
planLater(child))) :: Nil
+
+  // The following cases of limits on a streaming plan has to be executed 
with a stateful
+  // streaming plan.
+  // 1. When the query is in append mode (that is, all logical plan 
operate on appended data).
+  // 2. When the plan does not contain any streaming aggregate (that is, 
plan has only
+  //operators that operate on appended data). This must be executed 
with a stateful
+  //streaming 

[spark] branch master updated (21bc047 -> 5ccbb38)

2020-01-31 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 21bc047  [SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed 
speculative tasks as pending in ExecutorAllocationManager
 add 5ccbb38  [SPARK-29938][SQL][FOLLOW-UP] Improve 
AlterTableAddPartitionCommand

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/internal/SQLConf.scala| 10 
 .../command/AnalyzePartitionCommand.scala  |  2 +-
 .../spark/sql/execution/command/CommandUtils.scala | 64 --
 .../apache/spark/sql/execution/command/ddl.scala   | 15 ++---
 4 files changed, 63 insertions(+), 28 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[GitHub] [spark-website] dongjoon-hyun opened a new pull request #258: Add target version of blockers

2020-01-31 Thread GitBox
dongjoon-hyun opened a new pull request #258: Add target version of blockers
URL: https://github.com/apache/spark-website/pull/258
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed speculative tasks as pending in ExecutorAllocationManager

2020-01-31 Thread tgraves
This is an automated email from the ASF dual-hosted git repository.

tgraves pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 21bc047  [SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed 
speculative tasks as pending in ExecutorAllocationManager
21bc047 is described below

commit 21bc0474bbb16c7648aed40f25a2945d98d2a167
Author: zebi...@fb.com 
AuthorDate: Fri Jan 31 08:49:34 2020 -0600

[SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed speculative 
tasks as pending in ExecutorAllocationManager

### What changes were proposed in this pull request?

Currently, when speculative tasks fail/get killed, they are still 
considered as pending and count towards the calculation of number of needed 
executors. To be more accurate: 
`stageAttemptToNumSpeculativeTasks(stageAttempt)` is incremented on 
onSpeculativeTaskSubmitted, but never decremented.  
`stageAttemptToNumSpeculativeTasks -= stageAttempt` is performed on stage completion. **This means Spark is marking ended speculative tasks as pending, which leads Spark to hold more executors [...]

This PR fixes the issue by updating `stageAttemptToSpeculativeTaskIndices` and `stageAttemptToNumSpeculativeTasks` on speculative task completion. This PR also addresses some other minor issues: scheduler behavior after receiving an intentionally killed task event, and it tries to address [SPARK-28403](https://issues.apache.org/jira/browse/SPARK-28403).

### Why are the changes needed?

This has caused resource wastage in our production with speculation enabled. With aggressive speculation, we found data-skewed jobs can hold hundreds of idle executors with fewer than 10 tasks running.

An easy repro of the issue (`--conf spark.speculation=true --conf 
spark.executor.cores=4 --conf spark.dynamicAllocation.maxExecutors=1000` in 
cluster mode):
```
val n = 4000
val someRDD = sc.parallelize(1 to n, n)
someRDD.mapPartitionsWithIndex( (index: Int, it: Iterator[Int]) => {
  if (index < 300 && index >= 150) {
    Thread.sleep(index * 1000) // Fake running tasks
  } else if (index == 300) {
    Thread.sleep(1000 * 1000) // Fake long running tasks
  }
  it.toList.map(x => index + ", " + x).iterator
}).collect
```
You will see that when running the last task, we would be holding 38 executors (see below), which is exactly (152 + 3) / 4 = 38.

![image](https://user-images.githubusercontent.com/9404831/72469112-9a7fac00-3793-11ea-8f50-74d0ab7325a4.png)

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Added a comprehensive unit test.

Test with the above repro shows that we are holding 2 executors at the end

![image](https://user-images.githubusercontent.com/9404831/72469177-bbe09800-3793-11ea-850f-4a2c67142899.png)

Closes #27223 from linzebing/speculation_fix.

Authored-by: zebi...@fb.com 
Signed-off-by: Thomas Graves 
---
 .../apache/spark/ExecutorAllocationManager.scala   |  61 ++
 .../spark/ExecutorAllocationManagerSuite.scala | 135 +
 2 files changed, 172 insertions(+), 24 deletions(-)

diff --git 
a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala 
b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
index bff854a..677386c 100644
--- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
+++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
@@ -263,9 +263,16 @@ private[spark] class ExecutorAllocationManager(
*/
   private def maxNumExecutorsNeeded(): Int = {
 val numRunningOrPendingTasks = listener.totalPendingTasks + 
listener.totalRunningTasks
-math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-  tasksPerExecutorForFullParallelism)
-  .toInt
+val maxNeeded = math.ceil(numRunningOrPendingTasks * 
executorAllocationRatio /
+  tasksPerExecutorForFullParallelism).toInt
+if (tasksPerExecutorForFullParallelism > 1 && maxNeeded == 1 &&
+  listener.pendingSpeculativeTasks > 0) {
+  // If we have pending speculative tasks and only need a single executor, 
allocate one more
+  // to satisfy the locality requirements of speculation
+  maxNeeded + 1
+} else {
+  maxNeeded
+}
   }
 
   private def totalRunningTasks(): Int = synchronized {
@@ -377,14 +384,8 @@ private[spark] class ExecutorAllocationManager(
 // If our target has not changed, do not send a message
 // to the cluster manager and reset our exponential growth
 if (delta == 0) {
-  // Check if there is any speculative jobs pending
-  if (listener.pendingTasks == 0 && listener.pendingSpeculativeTasks > 0) {
-numExecutorsTarget =
-  math.max(math.min(maxNumExecutorsNeeded + 

[spark] branch master updated: [SPARK-30638][CORE] Add resources allocated to PluginContext

2020-01-31 Thread tgraves
This is an automated email from the ASF dual-hosted git repository.

tgraves pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3d2b8d8  [SPARK-30638][CORE] Add resources allocated to PluginContext
3d2b8d8 is described below

commit 3d2b8d8b13eff0faa02316542a343e7a64873b8a
Author: Thomas Graves 
AuthorDate: Fri Jan 31 08:25:32 2020 -0600

[SPARK-30638][CORE] Add resources allocated to PluginContext

### What changes were proposed in this pull request?

Add the allocated resources as parameters to the PluginContext so that any plugin in the driver or executor can use this information to initialize devices or otherwise use it in a useful manner.
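
As a rough sketch of how a plugin could consume this, the Scala example below reads the resources handed to an executor at init time. The SparkPlugin/ExecutorPlugin wiring and the init(ctx, extraConf) signature are assumptions based on the existing plugin API; ctx.resources() and ctx.hostname() are the accessors visible in this commit's diff, and the class name and "gpu" key are hypothetical:

```
import java.util.{Map => JMap}

import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}
import org.apache.spark.resource.ResourceInformation

// Hypothetical executor plugin that inspects allocated devices once, before any task runs.
class GpuInitPlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = null  // driver side not needed for this sketch

  override def executorPlugin(): ExecutorPlugin = new ExecutorPlugin {
    override def init(ctx: PluginContext, extraConf: JMap[String, String]): Unit = {
      // resources() is the accessor added by this commit; "gpu" is an example key.
      val gpus: Option[ResourceInformation] = Option(ctx.resources().get("gpu"))
      gpus.foreach { info =>
        println(s"${ctx.hostname()} sees GPUs: ${info.addresses.mkString(",")}")
      }
    }
  }
}
```

Such a plugin would typically be enabled through the existing spark.plugins configuration.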

### Why are the changes needed?

To allow users to initialize/track devices once at the executor level, before any task runs, so that tasks can then use them.

### Does this PR introduce any user-facing change?

Yes, for people using the Executor/Driver plugin interface.

### How was this patch tested?

Unit tests, and manual testing by writing a plugin that initialized GPUs using this interface.

Closes #27367 from tgravescs/pluginWithResources.

Lead-authored-by: Thomas Graves 
Co-authored-by: Thomas Graves 
Signed-off-by: Thomas Graves 
---
 .../org/apache/spark/api/plugin/PluginContext.java |   5 +
 .../main/scala/org/apache/spark/SparkContext.scala |   2 +-
 .../executor/CoarseGrainedExecutorBackend.scala|  10 +-
 .../scala/org/apache/spark/executor/Executor.scala |   7 +-
 .../spark/internal/plugin/PluginContainer.scala|  36 +--
 .../spark/internal/plugin/PluginContextImpl.scala  |   6 +-
 .../scheduler/local/LocalSchedulerBackend.scala|   5 +-
 .../org/apache/spark/executor/ExecutorSuite.scala  |  12 ++-
 .../internal/plugin/PluginContainerSuite.scala | 109 +++--
 .../spark/executor/MesosExecutorBackend.scala  |   4 +-
 10 files changed, 167 insertions(+), 29 deletions(-)

diff --git a/core/src/main/java/org/apache/spark/api/plugin/PluginContext.java 
b/core/src/main/java/org/apache/spark/api/plugin/PluginContext.java
index b9413cf..36d8275 100644
--- a/core/src/main/java/org/apache/spark/api/plugin/PluginContext.java
+++ b/core/src/main/java/org/apache/spark/api/plugin/PluginContext.java
@@ -18,11 +18,13 @@
 package org.apache.spark.api.plugin;
 
 import java.io.IOException;
+import java.util.Map;
 
 import com.codahale.metrics.MetricRegistry;
 
 import org.apache.spark.SparkConf;
 import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.resource.ResourceInformation;
 
 /**
  * :: DeveloperApi ::
@@ -54,6 +56,9 @@ public interface PluginContext {
   /** The host name which is being used by the Spark process for 
communication. */
   String hostname();
 
+  /** The custom resources (GPUs, FPGAs, etc) allocated to driver or executor. 
*/
+  Map<String, ResourceInformation> resources();
+
   /**
* Send a message to the plugin's driver-side component.
* 
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala 
b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 3262631..6e0c7ac 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -542,7 +542,7 @@ class SparkContext(config: SparkConf) extends Logging {
   HeartbeatReceiver.ENDPOINT_NAME, new HeartbeatReceiver(this))
 
 // Initialize any plugins before the task scheduler is initialized.
-_plugins = PluginContainer(this)
+_plugins = PluginContainer(this, _resources.asJava)
 
 // Create and start the scheduler
 val (sched, ts) = SparkContext.createTaskScheduler(this, master, 
deployMode)
diff --git 
a/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
 
b/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
index 511c63a..ce211ce 100644
--- 
a/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
+++ 
b/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
@@ -69,6 +69,8 @@ private[spark] class CoarseGrainedExecutorBackend(
   // to be changed so that we don't share the serializer instance across 
threads
   private[this] val ser: SerializerInstance = 
env.closureSerializer.newInstance()
 
+  private var _resources = Map.empty[String, ResourceInformation]
+
   /**
* Map each taskId to the information about the resource allocated to it, 
Please refer to
* [[ResourceInformation]] for specifics.
@@ -78,9 +80,8 @@ private[spark] class CoarseGrainedExecutorBackend(
 
   override def onStart(): Unit = {
 logInfo("Connecting to driver: " + driverUrl)
-var resources = Map.empty[String, ResourceInformation]
 try {
-  resources = parseOrFindResources(resourcesFileOpt)
+  _resources = parseOrFindResources(resourcesFileOpt)
 } 

[spark] branch master updated (290a528 -> 6f4703e)

2020-01-31 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 290a528  [SPARK-30615][SQL] Introduce Analyzer rule for V2 AlterTable 
column change resolution
 add 6f4703e  [SPARK-30690][DOCS][BUILD] Add CalendarInterval into API 
documentation

No new revisions were added by this update.

Summary of changes:
 .../src/main/java/org/apache/spark/unsafe/types/CalendarInterval.java | 2 ++
 project/SparkBuild.scala  | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (6fac411 -> 290a528)

2020-01-31 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 6fac411  [SPARK-29093][ML][PYSPARK][FOLLOW-UP] Remove duplicate setter
 add 290a528  [SPARK-30615][SQL] Introduce Analyzer rule for V2 AlterTable 
column change resolution

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 156 ++
 .../sql/catalyst/analysis/CheckAnalysis.scala  |  60 +-
 .../org/apache/spark/sql/types/StructType.scala|  85 +---
 .../spark/sql/catalyst/analysis/AnalysisTest.scala |   9 +-
 .../CreateTablePartitioningValidationSuite.scala   |   4 +-
 .../spark/sql/connector/AlterTableTests.scala  |  71 ++-
 .../connector/V2CommandsCaseSensitivitySuite.scala | 227 +
 .../execution/command/PlanResolutionSuite.scala|  26 ++-
 8 files changed, 583 insertions(+), 55 deletions(-)
 create mode 100644 
sql/core/src/test/scala/org/apache/spark/sql/connector/V2CommandsCaseSensitivitySuite.scala


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org