[spark] branch branch-3.2 updated (e55bab5 -> 90b7ee0)

2021-11-07 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e55bab5  [SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers
 add 90b7ee0  [SPARK-37238][BUILD][3.2] Upgrade ORC to 1.6.12

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 6 +++---
 pom.xml | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (7ef6a2e -> e29c4e1)

2021-11-07 Thread sarutak

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 7ef6a2e  [SPARK-37231][SQL] Dynamic writes/reads of ANSI interval partitions
 add e29c4e1  [SPARK-37211][INFRA] Added descriptions and an image to the guide for enabling GitHub Actions in notify_test_workflow.yml

No new revisions were added by this update.

Summary of changes:
 .github/workflows/images/workflow-enable-button.png | Bin 0 -> 79807 bytes
 .github/workflows/notify_test_workflow.yml  |  10 ++++++++--
 2 files changed, 8 insertions(+), 2 deletions(-)
 create mode 100644 .github/workflows/images/workflow-enable-button.png




[GitHub] [spark-website] MaxGekk commented on pull request #367: Add Chao Sun to committers

2021-11-07 Thread GitBox


MaxGekk commented on pull request #367:
URL: https://github.com/apache/spark-website/pull/367#issuecomment-962850694


   Congratulations @sunchao !





[spark] branch master updated (8ab9d63 -> 7ef6a2e)

2021-11-07 Thread sarutak

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8ab9d63  [SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers
 add 7ef6a2e  [SPARK-37231][SQL] Dynamic writes/reads of ANSI interval partitions

No new revisions were added by this update.

Summary of changes:
 .../execution/datasources/PartitioningUtils.scala  |  2 ++
 .../spark/sql/sources/PartitionedWriteSuite.scala  | 40 ++
 2 files changed, 36 insertions(+), 6 deletions(-)




[spark] branch branch-3.2 updated: [SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers

2021-11-07 Thread wenchen

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new e55bab5  [SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers
e55bab5 is described below

commit e55bab5267b066fb78921ef6828924c32adbc637
Author: Wenchen Fan 
AuthorDate: Mon Nov 8 13:33:30 2021 +0800

[SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers

### What changes were proposed in this pull request?

This is a follow-up of #31427, which introduced two issues:
1. When we look up `spark_catalog.t`, we used to fail early with `The namespace in session catalog must have exactly one name part` before that PR; now we fail very late, in `CheckAnalysis`, with `NoSuchTableException`.
2. The error message is a bit confusing now: we report `Table t not found` even though table `t` exists.

This PR fixes both issues.

### Why are the changes needed?

Save analysis time and improve the error message.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Updated tests.

Closes #34490 from cloud-fan/table.

Authored-by: Wenchen Fan 
Signed-off-by: Wenchen Fan 
(cherry picked from commit 8ab9d6327d7db20a4257f9fe6d0b17919576be5e)
Signed-off-by: Wenchen Fan 
---
 .../sql/connector/catalog/LookupCatalog.scala  |  4 +-
 .../spark/sql/errors/QueryCompilationErrors.scala  | 10 +---
 .../spark/sql/catalyst/parser/DDLParserSuite.scala |  3 +
 .../catalyst/analysis/ResolveSessionCatalog.scala  |  2 +-
 .../datasources/v2/V2SessionCatalog.scala  |  4 +-
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 66 +-
 .../spark/sql/execution/command/DDLSuite.scala |  3 +-
 7 files changed, 27 insertions(+), 65 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
index 0635859..0362caf 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
@@ -191,8 +191,8 @@ private[sql] trait LookupCatalog extends Logging {
 } else {
   ident.namespace match {
 case Array(db) => FunctionIdentifier(ident.name, Some(db))
-case _ =>
-  throw QueryCompilationErrors.unsupportedFunctionNameError(ident.toString)
+case other =>
+  throw QueryCompilationErrors.requiresSinglePartNamespaceError(other)
   }
 }
 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index e7af006..7c2780a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -519,10 +519,6 @@ object QueryCompilationErrors {
   "SHOW VIEWS, only SessionCatalog supports this command.")
   }
 
-  def unsupportedFunctionNameError(quoted: String): Throwable = {
-new AnalysisException(s"Unsupported function name '$quoted'")
-  }
-
   def sqlOnlySupportedWithV1TablesError(sql: String): Throwable = {
 new AnalysisException(s"$sql is only supported with v1 tables.")
   }
@@ -850,9 +846,9 @@ object QueryCompilationErrors {
 new TableAlreadyExistsException(ident)
   }
 
-  def requiresSinglePartNamespaceError(ident: Identifier): Throwable = {
-new NoSuchTableException(
-  s"V2 session catalog requires a single-part namespace: ${ident.quoted}")
+  def requiresSinglePartNamespaceError(ns: Seq[String]): Throwable = {
+new AnalysisException(CatalogManager.SESSION_CATALOG_NAME +
+  " requires a single-part namespace, but got " + ns.mkString("[", ", ", 
"]"))
   }
 
   def namespaceAlreadyExistsError(namespace: Array[String]): Throwable = {
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala
index a1d9f89..886c9a6 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DDLParserSuite.scala
@@ -2237,6 +2237,9 @@ class DDLParserSuite extends AnalysisTest {
   false,
   LocalTempView)
 comparePlans(parsed2, expected2)
+
+val v3 = "CREATE TEMPORARY VIEW a.b AS SELECT 1"
+intercept(v3, "It is not allowed to add database prefix")
   }
 
   test("create view - full") {
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
index 80063cd..b73ccbb 100644
--- 

[spark] branch master updated: [SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers

2021-11-07 Thread wenchen

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 8ab9d63  [SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers
8ab9d63 is described below

commit 8ab9d6327d7db20a4257f9fe6d0b17919576be5e
Author: Wenchen Fan 
AuthorDate: Mon Nov 8 13:33:30 2021 +0800

[SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers

### What changes were proposed in this pull request?

This is a follow-up of #31427, which introduced two issues:
1. When we look up `spark_catalog.t`, we used to fail early with `The namespace in session catalog must have exactly one name part` before that PR; now we fail very late, in `CheckAnalysis`, with `NoSuchTableException`.
2. The error message is a bit confusing now: we report `Table t not found` even though table `t` exists.

This PR fixes both issues.
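
For illustration, a minimal sketch of the behavior change (the SparkSession `spark` and the table `t` are assumed; the error text comes from the diff below, `CatalogManager.SESSION_CATALOG_NAME` being `spark_catalog`):

```scala
import org.apache.spark.sql.AnalysisException

// Sketch only: assumes a running SparkSession `spark` and a table `t`
// registered in the session catalog.
spark.sql("CREATE TABLE t(id INT) USING parquet")

try {
  // A multi-part namespace under spark_catalog is invalid. Previously this
  // failed late, in CheckAnalysis, with a confusing NoSuchTableException.
  spark.sql("SELECT * FROM spark_catalog.a.b.t")
} catch {
  case e: AnalysisException =>
    // With this change, analysis fails early with a message like:
    //   spark_catalog requires a single-part namespace, but got [a, b]
    println(e.getMessage)
}
```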

### Why are the changes needed?

Save analysis time and improve the error message.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Updated tests.

Closes #34490 from cloud-fan/table.

Authored-by: Wenchen Fan 
Signed-off-by: Wenchen Fan 
---
 .../sql/connector/catalog/LookupCatalog.scala  |  4 +-
 .../spark/sql/errors/QueryCompilationErrors.scala  | 10 +---
 .../catalyst/analysis/ResolveSessionCatalog.scala  |  2 +-
 .../datasources/v2/V2SessionCatalog.scala  |  4 +-
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 66 +-
 .../sql/execution/command/DDLParserSuite.scala |  3 +
 .../spark/sql/execution/command/DDLSuite.scala |  3 +-
 7 files changed, 27 insertions(+), 65 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
index 0635859..0362caf 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
@@ -191,8 +191,8 @@ private[sql] trait LookupCatalog extends Logging {
 } else {
   ident.namespace match {
 case Array(db) => FunctionIdentifier(ident.name, Some(db))
-case _ =>
-  throw QueryCompilationErrors.unsupportedFunctionNameError(ident.toString)
+case other =>
+  throw QueryCompilationErrors.requiresSinglePartNamespaceError(other)
   }
 }
 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 527a2b9..b7f4cce 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -530,10 +530,6 @@ object QueryCompilationErrors {
   "SHOW VIEWS, only SessionCatalog supports this command.")
   }
 
-  def unsupportedFunctionNameError(quoted: String): Throwable = {
-new AnalysisException(s"Unsupported function name '$quoted'")
-  }
-
   def sqlOnlySupportedWithV1TablesError(sql: String): Throwable = {
 new AnalysisException(s"$sql is only supported with v1 tables.")
   }
@@ -861,9 +857,9 @@ object QueryCompilationErrors {
 new TableAlreadyExistsException(ident)
   }
 
-  def requiresSinglePartNamespaceError(ident: Identifier): Throwable = {
-new NoSuchTableException(
-  s"V2 session catalog requires a single-part namespace: ${ident.quoted}")
+  def requiresSinglePartNamespaceError(ns: Seq[String]): Throwable = {
+new AnalysisException(CatalogManager.SESSION_CATALOG_NAME +
+  " requires a single-part namespace, but got " + ns.mkString("[", ", ", 
"]"))
   }
 
   def namespaceAlreadyExistsError(namespace: Array[String]): Throwable = {
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
index f211054..e5be7f4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
@@ -430,7 +430,7 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager)
 className, resources, ignoreIfExists, replace) =>
   if (isSessionCatalog(catalog)) {
 val database = if (nameParts.length > 2) {
-  throw QueryCompilationErrors.unsupportedFunctionNameError(nameParts.quoted)
+  throw QueryCompilationErrors.requiresSinglePartNamespaceError(nameParts)
 } else if 

[spark] branch master updated (5cb0fb3 -> fe41d18)

2021-11-07 Thread wenchen

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5cb0fb3  [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side
 add fe41d18  [SPARK-37199][SQL] Add deterministic field to QueryPlan

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/subquery.scala  |  3 +++
 .../spark/sql/catalyst/optimizer/InlineCTE.scala   |  2 +-
 .../spark/sql/catalyst/plans/QueryPlan.scala   |  7 +
 .../spark/sql/catalyst/plans/QueryPlanSuite.scala  | 30 +-
 .../scala/org/apache/spark/sql/SubquerySuite.scala | 11 
 5 files changed, 51 insertions(+), 2 deletions(-)




[spark] branch master updated (e2ea690 -> 5cb0fb3)

2021-11-07 Thread sunchao

sunchao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from e2ea690  [SPARK-37221][SQL] The collect-like API in SparkPlan should support columnar output
 add 5cb0fb3  [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side

No new revisions were added by this update.

Summary of changes:
 .../catalyst/catalog/ExternalCatalogUtils.scala| 42 ++-
 .../org/apache/spark/sql/internal/SQLConf.scala| 14 
 .../spark/sql/hive/client/HiveClientImpl.scala |  2 +-
 .../apache/spark/sql/hive/client/HiveShim.scala| 85 +++---
 .../hive/client/HivePartitionFilteringSuite.scala  | 67 -
 5 files changed, 176 insertions(+), 34 deletions(-)




[spark] branch master updated (442dedb -> e2ea690)

2021-11-07 Thread viirya

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 442dedb  [SPARK-37120][BUILD] Add Daily GitHub Action jobs for Java11/17
 add e2ea690  [SPARK-37221][SQL] The collect-like API in SparkPlan should support columnar output

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/SparkPlan.scala |  7 +-
 .../spark/sql/execution/SparkPlanSuite.scala   | 25 ++
 2 files changed, 31 insertions(+), 1 deletion(-)




[spark] branch branch-3.2 updated: Revert "[SPARK-36998][CORE] Handle concurrent eviction of same application in SHS"

2021-11-07 Thread dongjoon

dongjoon pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 6ecdde1  Revert "[SPARK-36998][CORE] Handle concurrent eviction of same application in SHS"
6ecdde1 is described below

commit 6ecdde189a61ea07125a3bddca6ec1ddd6a1c866
Author: Dongjoon Hyun 
AuthorDate: Sun Nov 7 17:12:51 2021 -0800

Revert "[SPARK-36998][CORE] Handle concurrent eviction of same application 
in SHS"

This reverts commit 248e07b49187bc7082e6cb2b0d9daa4b48ffe3cb.
---
 .../deploy/history/HistoryServerDiskManager.scala  | 10 ++--
 .../history/HistoryServerDiskManagerSuite.scala| 30 +++---
 2 files changed, 5 insertions(+), 35 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServerDiskManager.scala b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServerDiskManager.scala
index 8a5b285..31f9d18 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServerDiskManager.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServerDiskManager.scala
@@ -17,7 +17,7 @@
 
 package org.apache.spark.deploy.history
 
-import java.io.{File, IOException}
+import java.io.File
 import java.util.concurrent.atomic.AtomicLong
 
 import scala.collection.JavaConverters._
@@ -210,13 +210,7 @@ private class HistoryServerDiskManager(
   def committed(): Long = committedUsage.get()
 
   private def deleteStore(path: File): Unit = {
-try {
-  FileUtils.deleteDirectory(path)
-} catch {
-  // Handle simultaneous eviction of the same app
-  case e: IOException =>
-if (path.exists()) throw e
-}
+FileUtils.deleteDirectory(path)
 listing.delete(classOf[ApplicationStoreInfo], path.getAbsolutePath())
   }
 
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerDiskManagerSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerDiskManagerSuite.scala
index fecf905..9004e86 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerDiskManagerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerDiskManagerSuite.scala
@@ -21,9 +21,8 @@ import java.io.File
 
 import org.mockito.AdditionalAnswers
 import org.mockito.ArgumentMatchers.{anyBoolean, anyLong, eq => meq}
-import org.mockito.Mockito.{doAnswer, spy, when}
-import org.scalatest.{BeforeAndAfter, PrivateMethodTester}
-import org.scalatestplus.mockito.MockitoSugar.mock
+import org.mockito.Mockito.{doAnswer, spy}
+import org.scalatest.BeforeAndAfter
 
 import org.apache.spark.{SparkConf, SparkFunSuite}
 import org.apache.spark.internal.config.History._
@@ -31,8 +30,7 @@ import org.apache.spark.status.KVUtils
 import org.apache.spark.util.{ManualClock, Utils}
 import org.apache.spark.util.kvstore.KVStore
 
-class HistoryServerDiskManagerSuite extends SparkFunSuite
-  with PrivateMethodTester with BeforeAndAfter {
+class HistoryServerDiskManagerSuite extends SparkFunSuite with BeforeAndAfter {
 
   private def doReturn(value: Any) = org.mockito.Mockito.doReturn(value, Seq.empty: _*)
 
@@ -160,28 +158,6 @@ class HistoryServerDiskManagerSuite extends SparkFunSuite
 assert(manager.approximateSize(50L, true) > 50L)
   }
 
-  test("SPARK-36998: Should be able to delete a store") {
-val manager = mockManager()
-val tempDir = Utils.createTempDir()
-tempDir.delete()
-Seq(true, false).foreach { exists =>
-  val file = mock[File]
-  when(file.exists()).thenReturn(true).thenReturn(true).thenReturn(exists)
-  when(file.isDirectory).thenReturn(true)
-  when(file.toPath).thenReturn(tempDir.toPath)
-  when(file.getAbsolutePath).thenReturn(tempDir.getAbsolutePath)
-  val deleteStore = PrivateMethod[Unit]('deleteStore)
-  if (exists) {
-val m = intercept[Exception] {
-  manager invokePrivate deleteStore(file)
-}.getMessage
-assert(m.contains("Unknown I/O error"))
-  } else {
-manager invokePrivate deleteStore(file)
-  }
-}
-  }
-
   test("SPARK-32024: update ApplicationStoreInfo.size during initializing") {
 val manager = mockManager()
 val leaseA = manager.lease(2)




[spark] branch master updated: [SPARK-37120][BUILD] Add Daily GitHub Action jobs for Java11/17

2021-11-07 Thread gurwls223

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 442dedb  [SPARK-37120][BUILD] Add Daily GitHub Action jobs for Java11/17
442dedb is described below

commit 442dedba835f43532d049adb8b56ba05bf675f3d
Author: Dongjoon Hyun 
AuthorDate: Mon Nov 8 09:30:07 2021 +0900

[SPARK-37120][BUILD] Add Daily GitHub Action jobs for Java11/17

### What changes were proposed in this pull request?

This PR aims to add Daily GitHub Action jobs for Java 11/17.

### Why are the changes needed?

To add test coverage on Java 11/17.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

N/A.

Closes #34508 from dongjoon-hyun/SPARK-37120.

Authored-by: Dongjoon Hyun 
Signed-off-by: Hyukjin Kwon 
---
 .github/workflows/build_and_test.yml | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 3f2d500..9f375b3 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -33,12 +33,17 @@ on:
 - cron: '0 7 * * *'
 # PySpark coverage for master branch
 - cron: '0 10 * * *'
+# Java 11
+- cron: '0 13 * * *'
+# Java 17
+- cron: '0 16 * * *'
 
 jobs:
   configure-jobs:
 name: Configure jobs
 runs-on: ubuntu-20.04
 outputs:
+  java: ${{ steps.set-outputs.outputs.java }}
   branch: ${{ steps.set-outputs.outputs.branch }}
   hadoop: ${{ steps.set-outputs.outputs.hadoop }}
   type: ${{ steps.set-outputs.outputs.type }}
@@ -48,26 +53,43 @@ jobs:
   id: set-outputs
   run: |
 if [ "${{ github.event.schedule }}" = "0 1 * * *" ]; then
+  echo '::set-output name=java::8'
   echo '::set-output name=branch::master'
   echo '::set-output name=type::scheduled'
   echo '::set-output name=envs::{}'
   echo '::set-output name=hadoop::hadoop2.7'
 elif [ "${{ github.event.schedule }}" = "0 4 * * *" ]; then
+  echo '::set-output name=java::8'
   echo '::set-output name=branch::master'
   echo '::set-output name=type::scheduled'
   echo '::set-output name=envs::{"SCALA_PROFILE": "scala2.13"}'
   echo '::set-output name=hadoop::hadoop3.2'
 elif [ "${{ github.event.schedule }}" = "0 7 * * *" ]; then
+  echo '::set-output name=java::8'
   echo '::set-output name=branch::branch-3.2'
   echo '::set-output name=type::scheduled'
   echo '::set-output name=envs::{"SCALA_PROFILE": "scala2.13"}'
   echo '::set-output name=hadoop::hadoop3.2'
 elif [ "${{ github.event.schedule }}" = "0 10 * * *" ]; then
+  echo '::set-output name=java::8'
   echo '::set-output name=branch::master'
   echo '::set-output name=type::pyspark-coverage-scheduled'
   echo '::set-output name=envs::{"PYSPARK_CODECOV": "true"}'
   echo '::set-output name=hadoop::hadoop3.2'
+elif [ "${{ github.event.schedule }}" = "0 13 * * *" ]; then
+  echo '::set-output name=java::11'
+  echo '::set-output name=branch::branch-3.2'
+  echo '::set-output name=type::scheduled'
+  echo '::set-output name=envs::{}'
+  echo '::set-output name=hadoop::hadoop3.2'
+elif [ "${{ github.event.schedule }}" = "0 16 * * *" ]; then
+  echo '::set-output name=java::17'
+  echo '::set-output name=branch::branch-3.2'
+  echo '::set-output name=type::scheduled'
+  echo '::set-output name=envs::{}'
+  echo '::set-output name=hadoop::hadoop3.2'
 else
+  echo '::set-output name=java::8'
  echo '::set-output name=branch::master' # Default branch to run on. CHANGE here when a branch is cut out.
   echo '::set-output name=type::regular'
   echo '::set-output name=envs::{}'
@@ -89,7 +111,7 @@ jobs:
   fail-fast: false
   matrix:
 java:
-  - 8
+  - ${{ needs.configure-jobs.outputs.java }}
 hadoop:
   - ${{ needs.configure-jobs.outputs.hadoop }}
 hive:




[spark] branch master updated: [SPARK-37232][BUILD] Upgrade ORC to 1.7.1

2021-11-07 Thread dongjoon

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a3cb70a  [SPARK-37232][BUILD] Upgrade ORC to 1.7.1
a3cb70a is described below

commit a3cb70aed6a075c580b9f5c4afcb6e2859f636d7
Author: William Hyun 
AuthorDate: Sun Nov 7 13:16:42 2021 -0800

[SPARK-37232][BUILD] Upgrade ORC to 1.7.1

### What changes were proposed in this pull request?
This PR aims to upgrade ORC to 1.7.1.
- http://orc.apache.org/news/2021/11/07/ORC-1.7.1/

### Why are the changes needed?
This will bring the latest bug fixes.
- https://github.com/apache/orc/milestone/1?closed=1

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass the CIs.

Closes #34507 from williamhyun/orc171.

Authored-by: William Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 6 +++---
 pom.xml | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2.7-hive-2.3 b/dev/deps/spark-deps-hadoop-2.7-hive-2.3
index 90e6304..59b2b9b 100644
--- a/dev/deps/spark-deps-hadoop-2.7-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2.7-hive-2.3
@@ -202,9 +202,9 @@ objenesis/2.6//objenesis-2.6.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.14.0//okio-1.14.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.7.0//orc-core-1.7.0.jar
-orc-mapreduce/1.7.0//orc-mapreduce-1.7.0.jar
-orc-shims/1.7.0//orc-shims-1.7.0.jar
+orc-core/1.7.1//orc-core-1.7.1.jar
+orc-mapreduce/1.7.1//orc-mapreduce-1.7.1.jar
+orc-shims/1.7.1//orc-shims-1.7.1.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/dev/deps/spark-deps-hadoop-3.2-hive-2.3 b/dev/deps/spark-deps-hadoop-3.2-hive-2.3
index 7f45c5c..b5b8406 100644
--- a/dev/deps/spark-deps-hadoop-3.2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3.2-hive-2.3
@@ -189,9 +189,9 @@ objenesis/2.6//objenesis-2.6.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.14.0//okio-1.14.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.7.0//orc-core-1.7.0.jar
-orc-mapreduce/1.7.0//orc-mapreduce-1.7.0.jar
-orc-shims/1.7.0//orc-shims-1.7.0.jar
+orc-core/1.7.1//orc-core-1.7.1.jar
+orc-mapreduce/1.7.1//orc-mapreduce-1.7.1.jar
+orc-shims/1.7.1//orc-shims-1.7.1.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 45ff0ee..d7bab23 100644
--- a/pom.xml
+++ b/pom.xml
@@ -137,7 +137,7 @@
 
 10.14.2.0
 1.12.2
-1.7.0
+1.7.1
 9.4.43.v20210629
 4.0.3
 0.10.0




[spark] branch master updated (ddf27bd -> 1047708)

2021-11-07 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ddf27bd  [SPARK-37223][SQL][TESTS] Fix unit test check in JoinHintSuite
 add 1047708  [SPARK-37207][SQL][PYTHON] Add isEmpty method for the Python DataFrame API

No new revisions were added by this update.

Summary of changes:
 python/docs/source/reference/pyspark.sql.rst |  1 +
 python/pyspark/sql/dataframe.py  | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)




[spark] branch master updated: [SPARK-37223][SQL][TESTS] Fix unit test check in JoinHintSuite

2021-11-07 Thread srowen

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ddf27bd  [SPARK-37223][SQL][TESTS] Fix unit test check in JoinHintSuite
ddf27bd is described below

commit ddf27bd3af4cee733b8303c9cde386861e87c449
Author: Cheng Su 
AuthorDate: Sun Nov 7 07:48:31 2021 -0600

[SPARK-37223][SQL][TESTS] Fix unit test check in JoinHintSuite

### What changes were proposed in this pull request?

This fixes unit tests in `JoinHintSuite` that should assert on the content of the logs: the old code called `logs.forall(...)` and discarded the resulting Boolean, so a non-matching message could never fail the test.
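
The gist of the change, as a standalone sketch (the `logs` value here is hypothetical and stands in for the collected warning messages, not the suite's actual setup):

```scala
// Hypothetical sample of collected warning messages.
val logs = Seq("hint is not supported in the query: no equi-join keys")

// Old pattern: forall returns a Boolean that was never asserted on,
// so a non-matching message could not fail the test.
logs.forall(_.contains("no equi-join keys"))   // result silently discarded

// Fixed pattern: assert on each message, so any mismatch fails the test.
logs.foreach(log => assert(log.contains("no equi-join keys")))
```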

### Why are the changes needed?

Improve test.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Changed test itself.

Closes #34501 from c21/test-fix.

Authored-by: Cheng Su 
Signed-off-by: Sean Owen 
---
 .../test/scala/org/apache/spark/sql/JoinHintSuite.scala| 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JoinHintSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/JoinHintSuite.scala
index 91cad85..99bad40 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/JoinHintSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/JoinHintSuite.scala
@@ -612,8 +612,9 @@ class JoinHintSuite extends PlanTest with SharedSparkSession with AdaptiveSparkP
 
 val logs = hintAppender.loggingEvents.map(_.getRenderedMessage)
   .filter(_.contains("is not supported in the query:"))
-assert(logs.size == 2)
-logs.forall(_.contains(s"build left for ${joinType.split("_").mkString(" ")} join."))
+assert(logs.size === 2)
+logs.foreach(log =>
+  assert(log.contains(s"build left for ${joinType.split("_").mkString(" ")} join.")))
   }
 
   Seq("left_outer", "left_semi", "left_anti").foreach { joinType =>
@@ -640,8 +641,9 @@ class JoinHintSuite extends PlanTest with SharedSparkSession with AdaptiveSparkP
 }
 val logs = hintAppender.loggingEvents.map(_.getRenderedMessage)
   .filter(_.contains("is not supported in the query:"))
-assert(logs.size == 2)
-logs.forall(_.contains(s"build right for ${joinType.split("_").mkString(" ")} join."))
+assert(logs.size === 2)
+logs.foreach(log =>
+  assert(log.contains(s"build right for ${joinType.split("_").mkString(" ")} join.")))
   }
 
   Seq("right_outer").foreach { joinType =>
@@ -689,8 +691,8 @@ class JoinHintSuite extends PlanTest with SharedSparkSession with AdaptiveSparkP
 }
 val logs = hintAppender.loggingEvents.map(_.getRenderedMessage)
   .filter(_.contains("is not supported in the query:"))
-assert(logs.size == 2)
-logs.forall(_.contains("no equi-join keys"))
+assert(logs.size === 2)
+logs.foreach(log => assert(log.contains("no equi-join keys")))
   }
 
   test("SPARK-36652: AQE dynamic join selection should not apply to non-equi 
join") {




[spark] branch master updated: [SPARK-36665][SQL] Add more Not operator simplifications

2021-11-07 Thread viirya

viirya pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 977dd05  [SPARK-36665][SQL] Add more Not operator simplifications
977dd05 is described below

commit 977dd054ed0946b62e62d2d480dbf25598545a5e
Author: Kazuyuki Tanimura 
AuthorDate: Sun Nov 7 01:17:58 2021 -0700

[SPARK-36665][SQL] Add more Not operator simplifications

### What changes were proposed in this pull request?
This PR proposes to add more `Not` operator simplifications in `BooleanSimplification` by applying the following rules:
  - Not(null) == null
    - e.g. IsNull(Not(...)) can be IsNull(...)
  - (Not(a) = b) == (a = Not(b))
    - e.g. Not(...) = true can be (...) = false
  - (a != b) == (a = Not(b))
    - e.g. (...) != true can be (...) = false

### Why are the changes needed?
This PR simplifies SQL statements that include `Not` operators.
In addition, the following query does not push down the filter in the current implementation:
```
SELECT * FROM t WHERE (not boolean_col) <=> null
```
while the following equivalent query pushes down the filter as expected:
```
SELECT * FROM t WHERE boolean_col <=> null
```
That is because the current implementation turns the first query into `IsNull(Not(boolean_col))`, which should be simplified further to `IsNull(boolean_col)`.
This PR helps optimize such cases.
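
A minimal way to observe the effect (a sketch, assuming a running SparkSession `spark` and the table/column from the queries above):

```scala
// Sketch: inspect the optimized plan to see the simplified predicate.
spark.sql("CREATE TABLE t(boolean_col BOOLEAN) USING parquet")

// Without this change the filter compiles to IsNull(Not(boolean_col)), which
// blocks pushdown; with the new NotPropagation/NullDownPropagation rules it
// simplifies to IsNull(boolean_col), like the pushdown-friendly query above.
spark.sql("SELECT * FROM t WHERE (not boolean_col) <=> null").explain(true)
```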

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added unit tests
```
build/sbt "testOnly *BooleanSimplificationSuite  -- -z SPARK-36665"
```

Closes #33930 from kazuyukitanimura/SPARK-36665.

Authored-by: Kazuyuki Tanimura 
Signed-off-by: Liang-Chi Hsieh 
---
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |   2 +
 .../spark/sql/catalyst/optimizer/expressions.scala |  80 ++
 .../sql/catalyst/rules/RuleIdCollection.scala  |   2 +
 .../catalyst/optimizer/NotPropagationSuite.scala   | 176 +
 .../optimizer/NullDownPropagationSuite.scala   |  59 +++
 5 files changed, 319 insertions(+)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
index 298da4f..5386907 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
@@ -99,6 +99,7 @@ abstract class Optimizer(catalogManager: CatalogManager)
 OptimizeRepartition,
 TransposeWindow,
 NullPropagation,
+NullDownPropagation,
 ConstantPropagation,
 FoldablePropagation,
 OptimizeIn,
@@ -106,6 +107,7 @@ abstract class Optimizer(catalogManager: CatalogManager)
 EliminateAggregateFilter,
 ReorderAssociativeOperator,
 LikeSimplification,
+NotPropagation,
 BooleanSimplification,
 SimplifyConditionals,
 PushFoldableIntoBranches,
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
index 0ec8bad..a32306f 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
@@ -447,6 +447,53 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
 
 
+/**
+ * Move/Push `Not` operator if it's beneficial.
+ */
+object NotPropagation extends Rule[LogicalPlan] {
+  // Given argument x, return true if expression Not(x) can be simplified
+  // E.g. let x == Not(y), then canSimplifyNot(x) == true because Not(x) == Not(Not(y)) == y
+  // For the case of x = EqualTo(a, b), recursively check each child expression
+  // Extra nullable check is required for EqualNullSafe because
+  // Not(EqualNullSafe(e, null)) is different from EqualNullSafe(e, Not(null))
+  private def canSimplifyNot(x: Expression): Boolean = x match {
+case Literal(_, BooleanType) | Literal(_, NullType) => true
+case _: Not | _: IsNull | _: IsNotNull | _: And | _: Or => true
+case _: GreaterThan | _: GreaterThanOrEqual | _: LessThan | _: LessThanOrEqual => true
+case EqualTo(a, b) if canSimplifyNot(a) || canSimplifyNot(b) => true
+case EqualNullSafe(a, b)
+  if !a.nullable && !b.nullable && (canSimplifyNot(a) || canSimplifyNot(b)) => true
+case _ => false
+  }
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformWithPruning(
+_.containsPattern(NOT), ruleId) {
+case q: LogicalPlan =>