[spark] branch master updated (afc508722e0 -> 5d03950b358)

2023-05-20 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from afc508722e0 [SPARK-43595][BUILD] Update some maven plugins to newest version
 add 5d03950b358 [SPARK-43534][BUILD] Add log4j-1.2-api and log4j-slf4j2-impl to classpath if active hadoop-provided

No new revisions were added by this update.

Summary of changes:
 pom.xml | 2 --
 1 file changed, 2 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-43595][BUILD] Update some maven plugins to newest version

2023-05-20 Thread srowen

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new afc508722e0 [SPARK-43595][BUILD] Update some maven plugins to newest version
afc508722e0 is described below

commit afc508722e07cf8fceb24204f538e51c6192c3e4
Author: panbingkun 
AuthorDate: Sat May 20 08:50:16 2023 -0500

[SPARK-43595][BUILD] Update some maven plugins to newest version

### What changes were proposed in this pull request?
This PR updates several Maven plugins to their newest versions:
- exec-maven-plugin from 1.6.0 to 3.1.0
- scala-maven-plugin from 4.8.0 to 4.8.1
- maven-antrun-plugin from 1.8 to 3.1.0
- maven-enforcer-plugin from 3.2.1 to 3.3.0
- build-helper-maven-plugin from 3.3.0 to 3.4.0
- maven-surefire-plugin from 3.0.0 to 3.1.0
- maven-assembly-plugin from 3.1.0 to 3.6.0
- maven-install-plugin from 3.1.0 to 3.1.1
- maven-deploy-plugin from 3.1.0 to 3.1.1
- maven-checkstyle-plugin from 3.2.1 to 3.2.2

### Why are the changes needed?
Routine upgrade.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA (GitHub Actions).

Closes #41228 from panbingkun/maven_plugin_upgrade.

Authored-by: panbingkun 
Signed-off-by: Sean Owen 
---
 pom.xml | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/pom.xml b/pom.xml
index dfc54b25705..1c4c4eb0fa6 100644
--- a/pom.xml
+++ b/pom.xml
@@ -115,7 +115,7 @@
 ${java.version}
 ${java.version}
 3.8.8
-1.6.0
+3.1.0
 spark
 9.5
 2.0.7
@@ -175,7 +175,7 @@
   errors building different Hadoop versions.
   See: SPARK-36547, SPARK-38394.
-->
-4.8.0
+4.8.1
 false
 2.15.0
 
@@ -210,7 +210,7 @@
 4.7.2
 4.7.2
 2.67.0
-1.8
+3.1.0
 1.1.0
 1.5.0
 1.60
@@ -2744,7 +2744,7 @@
 
   org.apache.maven.plugins
   maven-enforcer-plugin
-  3.2.1
+  3.3.0
   
 
   enforce-versions
@@ -2787,7 +2787,7 @@
 
   org.codehaus.mojo
   build-helper-maven-plugin
-  3.3.0
+  3.4.0
   
 
   module-timestamp-property
@@ -2907,7 +2907,7 @@
 
   org.apache.maven.plugins
   maven-surefire-plugin
-  3.0.0
+  3.1.0
   
   
 
@@ -3118,7 +3118,7 @@
 
   org.apache.maven.plugins
   maven-assembly-plugin
-  3.1.0
+  3.6.0
   
 posix
   
@@ -3143,12 +3143,12 @@
 
   org.apache.maven.plugins
   maven-install-plugin
-  3.1.0
+  3.1.1
 
 
   org.apache.maven.plugins
   maven-deploy-plugin
-  3.1.0
+  3.1.1
 
 
   org.apache.maven.plugins
@@ -3293,7 +3293,7 @@
   
 org.apache.maven.plugins
 maven-checkstyle-plugin
-3.2.1
+3.2.2
 
   false
   true





[spark] branch master updated (37b9c532d69 -> f55fdca10b1)

2023-05-20 Thread srowen

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 37b9c532d69 [SPARK-43542][SS] Define a new error class and apply for the case where streaming query fails due to concurrent run of streaming query with same checkpoint
 add f55fdca10b1 [MINOR][INFRA] Deduplicate `scikit-learn` in Dockerfile

No new revisions were added by this update.

Summary of changes:
 dev/infra/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





[spark] branch master updated: [SPARK-43542][SS] Define a new error class and apply for the case where streaming query fails due to concurrent run of streaming query with same checkpoint

2023-05-20 Thread maxgekk

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 37b9c532d69 [SPARK-43542][SS] Define a new error class and apply for the case where streaming query fails due to concurrent run of streaming query with same checkpoint
37b9c532d69 is described below

commit 37b9c532d698a35d2f577a8fd85ba31b4529f5ea
Author: Eric Marnadi 
AuthorDate: Sat May 20 10:33:12 2023 +0300

[SPARK-43542][SS] Define a new error class and apply for the case where streaming query fails due to concurrent run of streaming query with same checkpoint

### What changes were proposed in this pull request?

We are migrating to a new error framework in order to surface errors to
customers in a friendlier way. This PR defines a new error class specifically
for concurrent updates to the log for the same batch ID.
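
As a rough illustration of what an error class with a parameterized message looks like, here is a minimal, Spark-free sketch. The `ErrorTemplates` object and its `render` method are hypothetical names invented for this example and are not part of Spark's API. The sketch also assumes that the message placeholder is `<batchId>`, matching the `batchId` key passed in `messageParameters` in the diff below, and that template lines are joined with newlines.

```scala
// Hypothetical, Spark-free model of an error-class registry. The names
// ErrorTemplates and render are illustrative only and do not exist in Spark.
object ErrorTemplates {
  // Message lines modelled on the CONCURRENT_STREAM_LOG_UPDATE entry in this
  // commit; the <batchId> placeholder is an assumption (see the note above).
  private val templates: Map[String, Seq[String]] = Map(
    "CONCURRENT_STREAM_LOG_UPDATE" -> Seq(
      "Concurrent update to the log. Multiple streaming jobs detected for <batchId>.",
      "Please make sure only one streaming job runs on a specific checkpoint location at a time."
    )
  )

  // Joins the template lines with newlines (an assumption of this sketch) and
  // replaces each <name> placeholder with the corresponding parameter value.
  def render(errorClass: String, params: Map[String, String]): String = {
    val template = templates(errorClass).mkString("\n")
    params.foldLeft(template) { case (msg, (name, value)) =>
      msg.replace(s"<$name>", value)
    }
  }
}
```

For instance, `ErrorTemplates.render("CONCURRENT_STREAM_LOG_UPDATE", Map("batchId" -> "42"))` substitutes `42` for the placeholder in the first message line.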

### Why are the changes needed?

This gives customers more information and allows them to filter errors by
error class.
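
The kind of filtering this enables might look like the sketch below, which matches on an error class instead of parsing message text. `ClassifiedException` and `ErrorFilter` are hypothetical stand-ins so that the example runs without Spark on the classpath; real code would inspect the exception type that Spark actually throws.

```scala
// Hypothetical stand-in for an exception that carries an error class.
// Spark's real SparkException is deliberately not used, so the sketch is
// self-contained.
final class ClassifiedException(val errorClass: String, message: String)
  extends Exception(message)

object ErrorFilter {
  // Walks the cause chain and reports whether any link carries the given
  // error class. Assumes the cause chain is finite (no cycles).
  @annotation.tailrec
  def hasErrorClass(t: Throwable, errorClass: String): Boolean = t match {
    case null => false
    case e: ClassifiedException if e.errorClass == errorClass => true
    case other => hasErrorClass(other.getCause, errorClass)
  }
}
```

A caller could then branch on `ErrorFilter.hasErrorClass(e, "CONCURRENT_STREAM_LOG_UPDATE")`, which keeps working even if the message wording changes later.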

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

There is an existing test to check the error message upon this condition. 
Because we are only changing the error type, and not the error message, this 
test is sufficient.

Closes #41205 from ericm-db/SC-130782.

Authored-by: Eric Marnadi 
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json   |  7 +++
 .../org/apache/spark/sql/errors/QueryExecutionErrors.scala |  7 +++
 .../spark/sql/execution/streaming/AsyncCommitLog.scala |  5 ++---
 .../spark/sql/execution/streaming/AsyncOffsetSeqLog.scala  |  5 ++---
 .../AsyncProgressTrackingMicroBatchExecution.scala |  5 ++---
 .../sql/execution/streaming/MicroBatchExecution.scala  | 14 --
 6 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 3c7c29f7532..b130f6f6c93 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -212,6 +212,13 @@
   "Another instance of this query was just started by a concurrent session."
 ]
   },
+  "CONCURRENT_STREAM_LOG_UPDATE" : {
+"message" : [
+  "Concurrent update to the log. Multiple streaming jobs detected for <batchId>.",
+  "Please make sure only one streaming job runs on a specific checkpoint location at a time."
+],
+"sqlState" : "4"
+  },
   "CONNECT" : {
 "message" : [
   "Generic Spark Connect error."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 99f7489e8bc..67c5fa54732 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -1409,6 +1409,13 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
   messageParameters = Map.empty[String, String])
   }
 
+  def concurrentStreamLogUpdate(batchId: Long): Throwable = {
+new SparkException(
+  errorClass = "CONCURRENT_STREAM_LOG_UPDATE",
+  messageParameters = Map("batchId" -> batchId.toString),
+  cause = null)
+  }
+
   def cannotParseJsonArraysAsStructsError(): SparkRuntimeException = {
 new SparkRuntimeException(
   errorClass = "_LEGACY_ERROR_TEMP_2132",
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncCommitLog.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncCommitLog.scala
index e9ad8bed27c..495f2f7ac0b 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncCommitLog.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncCommitLog.scala
@@ -23,6 +23,7 @@ import java.util.concurrent.{CompletableFuture, ConcurrentLinkedDeque, ThreadPoo
 import scala.collection.JavaConverters._
 
 import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.errors.QueryExecutionErrors
 
 /**
  * Implementation of CommitLog to perform asynchronous writes to storage
@@ -54,9 +55,7 @@ class AsyncCommitLog(sparkSession: SparkSession, path: String, executorService:
   if (ret) {
 batchId
   } else {
-throw new IllegalStateException(
-  s"Concurrent update to the log. Multiple streaming jobs detected for $batchId"
-)
+throw QueryExecutionErrors.concurrentStreamLogUpdate(batchId)
   }
 })
 
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncOffsetSeqLog.scala