[spark] branch master updated: [SPARK-27193][SQL] CodeFormatter should format multiple comment lines correctly

2019-03-18 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a8af23d  [SPARK-27193][SQL] CodeFormatter should format multiple comment lines correctly
a8af23d is described below

commit a8af23d7aba62d22b257c94d38a50cba05d0fcfd
Author: wuyi 
AuthorDate: Tue Mar 19 14:47:51 2019 +0800

[SPARK-27193][SQL] CodeFormatter should format multiple comment lines correctly

## What changes were proposed in this pull request?

When `spark.sql.codegen.comments` is enabled, the generated code can contain multi-line comments. However, CodeFormatter currently cannot handle multiple comment lines:

```
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIteratorForCodegenStage1(references);
/* 003 */ }
/* 004 */
/* 005 */ /**
 * Codegend pipeline for stage (id=1)
 * *(1) Project [(id#0L + 1) AS (id + 1)#3L]
 * +- *(1) Filter (id#0L = 1)
 *+- *(1) Range (0, 10, step=1, splits=4)
 */
/* 006 */ // codegenStageId=1
/* 007 */ final class GeneratedIteratorForCodegenStage1 extends org.apache.spark.sql.execution.BufferedRowIterator {
```

After applying this pr:

```
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIteratorForCodegenStage1(references);
/* 003 */ }
/* 004 */
/* 005 */ /**
/* 006 */  * Codegend pipeline for stage (id=1)
/* 007 */  * *(1) Project [(id#0L + 1) AS (id + 1)#4L]
/* 008 */  * +- *(1) Filter (id#0L = 1)
/* 009 */  *+- *(1) Range (0, 10, step=1, splits=2)
/* 010 */  */
/* 011 */ // codegenStageId=1
/* 012 */ final class GeneratedIteratorForCodegenStage1 extends org.apache.spark.sql.execution.BufferedRowIterator {
```
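
The essence of the fix is to split the substituted comment text on newlines before handing it to the formatter, so every physical line gets its own `/* NNN */` prefix. A minimal, self-contained sketch of that idea (illustrative only; `format` here is a stand-in for CodeFormatter's internal loop, and the `/*wsc*/` placeholder is made up, not Spark's actual comment-holder syntax):

```
object CommentSplitSketch {
  // Replace comment placeholders with (possibly multi-line) comment text, then
  // number every physical line -- the split is the fix in spirit.
  def format(lines: Seq[String], comments: Map[String, String]): String = {
    var n = 0
    val out = Seq.newBuilder[String]
    for (line <- lines) {
      val replaced = comments.foldLeft(line) { case (l, (k, v)) => l.replace(k, v) }
      // Add each physical line separately instead of one multi-line string.
      replaced.split("\n").foreach { l =>
        n += 1
        out += f"/* $n%03d */ $l"
      }
    }
    out.result().mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    val comments = Map("/*wsc*/" -> "/**\n * Codegend pipeline for stage (id=1)\n */")
    println(format(Seq("/*wsc*/", "// codegenStageId=1", "final class Gen {}"), comments))
  }
}
```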

## How was this patch tested?

Tested Manually.

Closes #24133 from Ngone51/fix-codeformatter-for-multi-comment-lines.

Authored-by: wuyi 
Signed-off-by: Wenchen Fan 
---
 .../apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala
index ea1bb87..2ec3145 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeFormatter.scala
@@ -41,7 +41,8 @@ object CodeFormatter {
   val commentReplaced = commentHolder.replaceAllIn(
 line.trim,
 m => code.comment.get(m.group(1)).map(Matcher.quoteReplacement).getOrElse(m.group(0)))
-  formatter.addLine(commentReplaced)
+  val comments = commentReplaced.split("\n")
+  comments.foreach(formatter.addLine)
 }
 if (needToTruncate) {
   formatter.addLine(s"[truncated to $maxLines lines (total lines is ${lines.length})]")



[spark] branch master updated: [SPARK-27162][SQL] Add new method asCaseSensitiveMap in CaseInsensitiveStringMap

2019-03-18 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 28d35c8  [SPARK-27162][SQL] Add new method asCaseSensitiveMap in CaseInsensitiveStringMap
28d35c8 is described below

commit 28d35c85789edb060dda47a984b5b3f0a27d8bf0
Author: Gengliang Wang 
AuthorDate: Tue Mar 19 13:35:47 2019 +0800

[SPARK-27162][SQL] Add new method asCaseSensitiveMap in CaseInsensitiveStringMap

## What changes were proposed in this pull request?

Currently, DataFrameReader/DataFrameWriter supports setting Hadoop configurations via the `.option()` method.
E.g., the following test case should pass in both ORC V1 and V2:
```
  class TestFileFilter extends PathFilter {
    override def accept(path: Path): Boolean = path.getParent.getName != "p=2"
  }

  withTempPath { dir =>
    val path = dir.getCanonicalPath

    val df = spark.range(2)
    df.write.orc(path + "/p=1")
    df.write.orc(path + "/p=2")
    val extraOptions = Map(
      "mapred.input.pathFilter.class" -> classOf[TestFileFilter].getName,
      "mapreduce.input.pathFilter.class" -> classOf[TestFileFilter].getName
    )
    assert(spark.read.options(extraOptions).orc(path).count() === 2)
  }
```
While Hadoop configurations are case sensitive, the current data source V2 APIs use `CaseInsensitiveStringMap` in the top-level entry point `TableProvider`.
To create Hadoop configurations correctly, I suggest:
1. Adding a new method `asCaseSensitiveMap` in `CaseInsensitiveStringMap`.
2. Making `CaseInsensitiveStringMap` read-only to avoid ambiguous conversions in `asCaseSensitiveMap` (see the usage sketch below).
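
For illustration, here is a hedged sketch of how a file-based source could use the new method to rebuild a Hadoop `Configuration` from the case-preserving view of its options; the `toHadoopConf` helper is made up for this example, only `asCaseSensitiveMap` comes from the patch:

```
import scala.collection.JavaConverters._

import org.apache.hadoop.conf.Configuration
import org.apache.spark.sql.util.CaseInsensitiveStringMap

object HadoopConfFromOptions {
  // Rebuild a Hadoop Configuration from the case-preserving view of the options,
  // so keys like "mapreduce.input.pathFilter.class" keep their original casing.
  def toHadoopConf(options: CaseInsensitiveStringMap): Configuration = {
    val conf = new Configuration()
    // asCaseSensitiveMap exposes the original, case-sensitive key/value pairs.
    options.asCaseSensitiveMap.asScala.foreach { case (k, v) => conf.set(k, v) }
    conf
  }
}
```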

## How was this patch tested?

Unit test

Closes #24094 from gengliangwang/originalMap.

Authored-by: Gengliang Wang 
Signed-off-by: Wenchen Fan 
---
 .../org/apache/spark/sql/catalog/v2/Catalogs.java  |  5 +--
 .../spark/sql/util/CaseInsensitiveStringMap.java   | 37 +-
 .../sql/util/CaseInsensitiveStringMapSuite.scala   | 31 --
 .../sql/execution/datasources/v2/FileTable.scala   |  7 ++--
 .../datasources/v2/FileWriteBuilder.scala  |  8 ++---
 .../datasources/v2/orc/OrcScanBuilder.scala|  6 +++-
 .../orc/OrcPartitionDiscoverySuite.scala   | 23 ++
 7 files changed, 96 insertions(+), 21 deletions(-)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Catalogs.java b/sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Catalogs.java
index efae266..aa4cbfc 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Catalogs.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/Catalogs.java
@@ -23,6 +23,7 @@ import org.apache.spark.sql.internal.SQLConf;
 import org.apache.spark.sql.util.CaseInsensitiveStringMap;
 import org.apache.spark.util.Utils;
 
+import java.util.HashMap;
 import java.util.Map;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
@@ -96,7 +97,7 @@ public class Catalogs {
    Map<String, String> allConfs = mapAsJavaMapConverter(conf.getAllConfs()).asJava();
    Pattern prefix = Pattern.compile("^spark\\.sql\\.catalog\\." + name + "\\.(.+)");
 
-    CaseInsensitiveStringMap options = CaseInsensitiveStringMap.empty();
+    HashMap<String, String> options = new HashMap<>();
    for (Map.Entry<String, String> entry : allConfs.entrySet()) {
   Matcher matcher = prefix.matcher(entry.getKey());
   if (matcher.matches() && matcher.groupCount() > 0) {
@@ -104,6 +105,6 @@ public class Catalogs {
   }
 }
 
-return options;
+return new CaseInsensitiveStringMap(options);
   }
 }
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java b/sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java
index 704d90e..da41346 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/util/CaseInsensitiveStringMap.java
@@ -18,8 +18,11 @@
 package org.apache.spark.sql.util;
 
 import org.apache.spark.annotation.Experimental;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import java.util.Collection;
+import java.util.Collections;
 import java.util.HashMap;
 import java.util.Locale;
 import java.util.Map;
@@ -35,16 +38,29 @@ import java.util.Set;
  */
 @Experimental
 public class CaseInsensitiveStringMap implements Map<String, String> {
+  private final Logger logger = LoggerFactory.getLogger(CaseInsensitiveStringMap.class);
+
+  private String unsupportedOperationMsg = "CaseInsensitiveStringMap is read-only.";
 
   public static CaseInsensitiveStringMap empty() {
 return new CaseInsensitiveStringMap(new HashMap<>(0));
   }
 
+  private final Map<String, String> original;
+
   priv

[spark] branch master updated: [SPARK-27195][SQL][TEST] Add AvroReadSchemaSuite

2019-03-18 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 26e9849  [SPARK-27195][SQL][TEST] Add AvroReadSchemaSuite
26e9849 is described below

commit 26e9849cb49e4f83656d482d0a979a4cb14cbefc
Author: Dongjoon Hyun 
AuthorDate: Mon Mar 18 20:10:30 2019 -0700

[SPARK-27195][SQL][TEST] Add AvroReadSchemaSuite

## What changes were proposed in this pull request?

The reader schema is said to be evolved (or projected) when it changes after the data has been written. Apache Spark's file-based data sources have test coverage for this in [ReadSchemaSuite.scala](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaSuite.scala). This PR aims to add `AvroReadSchemaSuite` to ensure minimal consistency among file-based data sources and to prevent future regressions in the Avro data source.
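
As a concrete illustration of the kind of evolution the suite exercises, a hedged sketch of the "add a column in the middle" case (paths, column names, and the local SparkSession setup are made up; it assumes the `spark-avro` module is on the classpath and that, as in `ReadSchemaTest`, files written before the evolution surface the new column as null):

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

object AvroReadSchemaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("avro-read-schema").getOrCreate()
    import spark.implicits._

    val dir = "/tmp/avro-read-schema-sketch"  // illustrative path
    // Old data: two columns.
    Seq((1, "a"), (2, "b")).toDF("col1", "col3")
      .write.mode("overwrite").format("avro").save(s"$dir/old")
    // New data: a column added in the middle.
    Seq((3, "x", "c")).toDF("col1", "col2", "col3")
      .write.mode("overwrite").format("avro").save(s"$dir/new")

    // Read everything back with the evolved (wider) schema; rows written before
    // the evolution should show the new column as null.
    val evolved = new StructType()
      .add("col1", IntegerType)
      .add("col2", StringType)
      .add("col3", StringType)
    spark.read.schema(evolved).format("avro").load(s"$dir/old", s"$dir/new").show()
    spark.stop()
  }
}
```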

## How was this patch tested?

Pass the Jenkins with the newly added test suite.

Closes #24135 from dongjoon-hyun/SPARK-27195.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../datasources/AvroReadSchemaSuite.scala  | 27 ++
 .../execution/datasources/ReadSchemaSuite.scala|  2 ++
 .../sql/execution/datasources/ReadSchemaTest.scala |  6 -
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/external/avro/src/test/scala/org/apache/spark/sql/execution/datasources/AvroReadSchemaSuite.scala b/external/avro/src/test/scala/org/apache/spark/sql/execution/datasources/AvroReadSchemaSuite.scala
new file mode 100644
index 000..a480bb9
--- /dev/null
+++ b/external/avro/src/test/scala/org/apache/spark/sql/execution/datasources/AvroReadSchemaSuite.scala
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+class AvroReadSchemaSuite
+  extends ReadSchemaSuite
+  with AddColumnIntoTheMiddleTest
+  with HideColumnInTheMiddleTest
+  with ChangePositionTest {
+
+  override val format: String = "avro"
+}
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaSuite.scala
index de234c1..802fda8 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaSuite.scala
@@ -36,6 +36,8 @@ import org.apache.spark.sql.internal.SQLConf
  * -> ParquetReadSchemaSuite
  * -> VectorizedParquetReadSchemaSuite
  * -> MergedParquetReadSchemaSuite
+ *
+ * -> AvroReadSchemaSuite
  */
 
 /**
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaTest.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaTest.scala
index 17d9d43..d428095 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaTest.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/ReadSchemaTest.scala
@@ -46,6 +46,7 @@ import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils}
 *   | JSON     | 1, 2, 3, 4   |                                                        |
 *   | ORC      | 1, 2, 3, 4   | Native vectorized ORC reader has the widest coverage.  |
 *   | PARQUET  | 1, 2, 3      |                                                        |
+ *   | AVRO     | 1, 2, 3      |                                                        |
  *
 * This aims to provide an explicit test coverage for reader schema change on file-based data
 * sources. Since a file format has its own coverage, we need a test suite for each file-based
@@ -55,9 +56,12 @@ import org.apache.spark.sql.test.{SharedSQLContext, SQLTestUtils}
  *
  *   ReadSchemaTest
  * -> AddColumnTest
- * -> HideColumnTest
+ * -> AddColumnIntoTheMiddleTest
+ * -> HideColumnAtTheEndTest
+ * -> HideC

[spark] branch branch-2.4 updated: [SPARK-27178][K8S][BRANCH-2.4] adding nss package to fix tests

2019-03-18 Thread vanzin
This is an automated email from the ASF dual-hosted git repository.

vanzin pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 342e91f  [SPARK-27178][K8S][BRANCH-2.4] adding nss package to fix tests
342e91f is described below

commit 342e91fdfa4e6ce5cc3a0da085d1fe723184021b
Author: shane knapp 
AuthorDate: Mon Mar 18 16:51:57 2019 -0700

[SPARK-27178][K8S][BRANCH-2.4] adding nss package to fix tests

## What changes were proposed in this pull request?

see also:  https://github.com/apache/spark/pull/24111

While performing some tests on our existing minikube and k8s infrastructure, I noticed that the integration tests were failing. I dug in and discovered the following message buried at the end of the stacktrace:

```
  Caused by: java.io.FileNotFoundException: /usr/lib/libnss3.so
    at sun.security.pkcs11.Secmod.initialize(Secmod.java:193)
    at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
    ... 81 more
```

After I added the `nss` package to `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile`, everything worked.

This is also impacting current builds. See: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8959/console

## How was this patch tested?

I tested locally before pushing, and the build system will test the rest.

Closes #24137 from shaneknapp/add-nss-package.

Authored-by: shane knapp 
Signed-off-by: Marcelo Vanzin 
---
 .../kubernetes/docker/src/main/dockerfiles/spark/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
index 1c4dcd5..196b74f 100644
--- a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
+++ b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
@@ -30,7 +30,7 @@ ARG k8s_tests=kubernetes/tests
 
 RUN set -ex && \
 apk upgrade --no-cache && \
-apk add --no-cache bash tini libc6-compat linux-pam && \
+apk add --no-cache bash tini libc6-compat linux-pam nss && \
 mkdir -p /opt/spark && \
 mkdir -p /opt/spark/work-dir && \
 touch /opt/spark/RELEASE && \



[spark] branch master updated: [SPARK-27178][K8S] add nss to the spark/k8s Dockerfile

2019-03-18 Thread shaneknapp
This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5564fe5  [SPARK-27178][K8S] add nss to the spark/k8s Dockerfile
5564fe5 is described below

commit 5564fe51513f725d2526dbf9e25a2f2c40d19afc
Author: shane knapp 
AuthorDate: Mon Mar 18 16:38:42 2019 -0700

[SPARK-27178][K8S] add nss to the spark/k8s Dockerfile

## What changes were proposed in this pull request?

While performing some tests on our existing minikube and k8s infrastructure, I noticed that the integration tests were failing. I dug in and discovered the following message buried at the end of the stacktrace:

```
  Caused by: java.io.FileNotFoundException: /usr/lib/libnss3.so
    at sun.security.pkcs11.Secmod.initialize(Secmod.java:193)
    at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218)
    ... 81 more
```

After I added the `nss` package to `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile`, everything worked.

This is also impacting current builds. See: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8959/console

## How was this patch tested?

I tested locally before pushing, and the build system will test the rest.

Closes #24111 from shaneknapp/add-nss-package-to-dockerfile.

Authored-by: shane knapp 
Signed-off-by: shane knapp 
---
 .../kubernetes/docker/src/main/dockerfiles/spark/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
index 1d8ac3c..871d34b 100644
--- a/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
+++ b/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
@@ -29,7 +29,7 @@ ARG spark_uid=185
 RUN set -ex && \
 apk upgrade --no-cache && \
 ln -s /lib /lib64 && \
-apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs && \
+apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs nss && \
 mkdir -p /opt/spark && \
 mkdir -p /opt/spark/examples && \
 mkdir -p /opt/spark/work-dir && \



[spark] branch master updated: [SPARK-27112][CORE] : Create a resource ordering between threads to resolve the deadlocks encountered …

2019-03-18 Thread irashid
This is an automated email from the ASF dual-hosted git repository.

irashid pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7043aee  [SPARK-27112][CORE] : Create a resource ordering between threads to resolve the deadlocks encountered …
7043aee is described below

commit 7043aee1ba95e92e1cbd0ebafcc5b09b69ee3082
Author: pgandhi 
AuthorDate: Mon Mar 18 10:33:51 2019 -0500

[SPARK-27112][CORE] : Create a resource ordering between threads to resolve the deadlocks encountered …

…when trying to kill executors either due to dynamic allocation or 
blacklisting

## What changes were proposed in this pull request?

There are two deadlocks as a result of the interplay between three 
different threads:

**task-result-getter thread**

**spark-dynamic-executor-allocation thread**

**dispatcher-event-loop thread(makeOffers())**

The fix enforces a lock-ordering constraint by acquiring the lock on `TaskSchedulerImpl` before the lock on `CoarseGrainedSchedulerBackend` in both `makeOffers()` and `killExecutors()`. This consistent resource ordering between the threads fixes the deadlocks.
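
In code, the idiom amounts to funnelling every path that needs both monitors through one helper that always takes the scheduler's lock first. A minimal sketch under that assumption (class and member names below are illustrative, not Spark's):

```
class Scheduler
class Backend(scheduler: Scheduler) {
  private var executorsPendingToRemove = Set.empty[String]

  // Single entry point for "take both locks", mirroring withLock in the patch:
  // always the scheduler first, then the backend itself.
  private def withLock[T](fn: => T): T = scheduler.synchronized {
    this.synchronized { fn }
  }

  def makeOffers(): Unit = withLock {
    // build resource offers while holding both locks, so no executor can be
    // killed concurrently by killExecutors()
  }

  def killExecutors(ids: Seq[String]): Unit = withLock {
    executorsPendingToRemove ++= ids  // safe: same lock order as makeOffers()
  }
}
```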

## How was this patch tested?

Manual Tests

Closes #24072 from pgandhi999/SPARK-27112-2.

Authored-by: pgandhi 
Signed-off-by: Imran Rashid 
---
 .../scheduler/cluster/CoarseGrainedSchedulerBackend.scala   | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
index dc0f21c..808ef08 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
@@ -258,7 +258,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
 // Make fake resource offers on all executors
 private def makeOffers() {
   // Make sure no executor is killed while some task is launching on it
-  val taskDescs = CoarseGrainedSchedulerBackend.this.synchronized {
+  val taskDescs = withLock {
 // Filter out executors under killing
 val activeExecutors = executorDataMap.filterKeys(executorIsAlive)
 val workOffers = activeExecutors.map {
@@ -284,7 +284,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
 // Make fake resource offers on just one executor
 private def makeOffers(executorId: String) {
   // Make sure no executor is killed while some task is launching on it
-  val taskDescs = CoarseGrainedSchedulerBackend.this.synchronized {
+  val taskDescs = withLock {
 // Filter out executors under killing
 if (executorIsAlive(executorId)) {
   val executorData = executorDataMap(executorId)
@@ -631,7 +631,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
   force: Boolean): Seq[String] = {
 logInfo(s"Requesting to kill executor(s) ${executorIds.mkString(", ")}")
 
-val response = synchronized {
+val response = withLock {
   val (knownExecutors, unknownExecutors) = executorIds.partition(executorDataMap.contains)
   unknownExecutors.foreach { id =>
 logWarning(s"Executor to kill $id does not exist!")
@@ -730,6 +730,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
 
   protected def currentDelegationTokens: Array[Byte] = delegationTokens.get()
 
+  // SPARK-27112: We need to ensure that there is ordering of lock acquisition
+  // between TaskSchedulerImpl and CoarseGrainedSchedulerBackend objects in order to fix
+  // the deadlock issue exposed in SPARK-27112
+  private def withLock[T](fn: => T): T = scheduler.synchronized {
+    CoarseGrainedSchedulerBackend.this.synchronized { fn }
+  }
+
 }
 
 private[spark] object CoarseGrainedSchedulerBackend {



[spark] branch master updated: [SPARK-26811][SQL] Add capabilities to v2.Table

2019-03-18 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e348f14  [SPARK-26811][SQL] Add capabilities to v2.Table
e348f14 is described below

commit e348f14259d3e8699319e7c2fe220902de255f44
Author: Ryan Blue 
AuthorDate: Mon Mar 18 18:25:11 2019 +0800

[SPARK-26811][SQL] Add capabilities to v2.Table

## What changes were proposed in this pull request?

This adds a new method, `capabilities`, to `v2.Table` that returns a set of `TableCapability`. Capabilities are used to fail queries during analysis checks such as `V2WriteSupportCheck` when the table does not support an operation, like truncation.
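
For illustration, a hedged sketch of a table declaring its capabilities; the class is made up, and the specific `TableCapability` values shown (`BATCH_READ`, `BATCH_WRITE`) are assumptions about the enum introduced here:

```
import java.util.{Set => JSet}
import scala.collection.JavaConverters._

import org.apache.spark.sql.sources.v2.{Table, TableCapability}
import org.apache.spark.sql.types.StructType

// Illustrative table that supports batch read and write but not truncation.
class AppendOnlyTable extends Table {
  override def name(): String = "append_only"
  override def schema(): StructType = new StructType().add("id", "long")

  // Analysis checks such as V2WriteSupportCheck can reject e.g. TRUNCATE
  // because it is not listed here (assumed capability names).
  override def capabilities(): JSet[TableCapability] =
    Set(TableCapability.BATCH_READ, TableCapability.BATCH_WRITE).asJava
}
```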

## How was this patch tested?

Existing tests for regressions; added a new analysis suite, `V2WriteSupportCheckSuite`, for the new capability checks.

Closes #24012 from rdblue/SPARK-26811-add-capabilities.

Authored-by: Ryan Blue 
Signed-off-by: Wenchen Fan 
---
 .../spark/sql/kafka010/KafkaSourceProvider.scala   |   4 +-
 .../spark/sql/sources/v2/SupportsBatchRead.java|  34 -
 .../spark/sql/sources/v2/SupportsBatchWrite.java   |  33 -
 .../apache/spark/sql/sources/v2/SupportsRead.java  |   2 +-
 .../apache/spark/sql/sources/v2/SupportsWrite.java |   2 +-
 .../org/apache/spark/sql/sources/v2/Table.java |  14 +-
 .../spark/sql/sources/v2/TableCapability.java  |  69 ++
 .../apache/spark/sql/sources/v2/reader/Scan.java   |   7 +-
 .../spark/sql/sources/v2/writer/WriteBuilder.java  |   4 +-
 .../org/apache/spark/sql/DataFrameReader.scala |   6 +-
 .../org/apache/spark/sql/DataFrameWriter.scala |   6 +-
 .../datasources/noop/NoopDataSource.scala  |   7 +-
 .../datasources/v2/DataSourceV2Implicits.scala |  18 ++-
 .../datasources/v2/DataSourceV2Relation.scala  |   2 +-
 .../datasources/v2/DataSourceV2Strategy.scala  |   6 +-
 .../sql/execution/datasources/v2/FileTable.scala   |  11 +-
 .../datasources/v2/V2WriteSupportCheck.scala   |  56 
 .../datasources/v2/WriteToDataSourceV2Exec.scala   |  12 +-
 .../spark/sql/execution/streaming/console.scala|   5 +
 .../spark/sql/execution/streaming/memory.scala |   4 +
 .../streaming/sources/ForeachWriterTable.scala |   7 +-
 .../streaming/sources/RateStreamProvider.scala |   5 +
 .../sources/TextSocketSourceProvider.scala |   5 +-
 .../sql/execution/streaming/sources/memoryV2.scala |   6 +-
 .../sql/internal/BaseSessionStateBuilder.scala |   2 +
 .../spark/sql/sources/v2/JavaSimpleBatchTable.java |  17 ++-
 .../spark/sql/sources/v2/DataSourceV2Suite.scala   |   8 +-
 .../sources/v2/FileDataSourceV2FallBackSuite.scala |  12 +-
 .../sql/sources/v2/SimpleWritableDataSource.scala  |   7 +-
 .../sql/sources/v2/V2WriteSupportCheckSuite.scala  | 149 +
 .../sources/StreamingDataSourceV2Suite.scala   |   8 ++
 .../spark/sql/hive/HiveSessionStateBuilder.scala   |   2 +
 32 files changed, 416 insertions(+), 114 deletions(-)

diff --git a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
index 8496cbd..a8eff6b 100644
--- a/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
+++ b/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
@@ -18,7 +18,7 @@
 package org.apache.spark.sql.kafka010
 
 import java.{util => ju}
-import java.util.{Locale, UUID}
+import java.util.{Collections, Locale, UUID}
 
 import scala.collection.JavaConverters._
 
@@ -359,6 +359,8 @@ private[kafka010] class KafkaSourceProvider extends DataSourceRegister
 
 override def schema(): StructType = KafkaOffsetReader.kafkaSchema
 
+    override def capabilities(): ju.Set[TableCapability] = Collections.emptySet()
+
    override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder = new ScanBuilder {
   override def build(): Scan = new KafkaScan(options)
 }
diff --git a/sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchRead.java b/sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchRead.java
deleted file mode 100644
index ea7c5d2..000
--- a/sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchRead.java
+++ /dev/null
@@ -1,34 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *