[spark] branch master updated: [SPARK-41937][R] Fix error in R (>= 4.2.0) for SparkR datetime column comparing with Sys.time()

2023-01-08 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e90b60743c7 [SPARK-41937][R] Fix error in R (>= 4.2.0) for SparkR 
datetime column comparing with Sys.time()
e90b60743c7 is described below

commit e90b60743c7887eeda2742bbf398340dd2b106c7
Author: Vivek Atal 
AuthorDate: Mon Jan 9 09:40:23 2023 +0900

[SPARK-41937][R] Fix error in R (>= 4.2.0) for SparkR datetime column 
comparing with Sys.time()

### What changes were proposed in this pull request?
1. Added the base R `all()` function (or `any()`, as appropriate) to the checks of whether `class(.)` is `Column`; i.e., `if (all(class(.) == "Column"))`, or `if (inherits(., "Column"))`, is proposed instead of the current `if (class(.) == "Column")` (see the short base-R sketch after this list).
2. Added relevant SparkR test cases to verify that the updated functions raise no warning or error.
3. Only R scripts have been modified.
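
A minimal base-R sketch of why the proposed checks are safe for multi-class objects (illustrative only; `x` stands in for whatever value is being tested, not an actual SparkR `Column`):

```r
x <- Sys.time()              # class(x) is c("POSIXct", "POSIXt"), i.e. length 2

class(x) == "Column"         # length-2 logical (FALSE FALSE): invalid `if` condition in R >= 4.2.0
inherits(x, "Column")        # single FALSE: always a valid length-1 `if` condition
all(class(x) == "Column")    # single FALSE: always a valid length-1 `if` condition
```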

### Why are the changes needed?
Base R 4.2.0 introduced a change ([[Rd] R 4.2.0 is 
released](https://stat.ethz.ch/pipermail/r-announce/2022/000683.html)), 
"Calling if() or while() with a condition of length greater than one gives an 
error rather than a warning."

The code below is a reproducible example of the issue. Executed in R >= 4.2.0 it generates an error; on older R versions it only emits a warning. `Sys.time()` is a multi-class object in R, and throughout the SparkR repository `if` statements are written as `if (class(.) == "Column")`, which causes an error in R >= 4.2.0. Note that R allows an object to have multiple 'class' names as a character vector ([R: Object Classes](https://stat.ethz.ch/R-manual/R-devel/library/base [...]

{
 SparkR::sparkR.session()
 t <- Sys.time()
 sdf <- SparkR::createDataFrame(data.frame(x = t + c(-1, 1, -1, 1, -1)))
 SparkR::collect(SparkR::filter(sdf, SparkR::column('x') > t))
}
#> Warning in if (class(e2) == 'Column') {: the condition has length > 1
#> and only the first element will be used
#> x
#> 1 2023-01-07 20:40:20
#> 2 2023-01-07 20:40:20

{
 Sys.setenv(`_R_CHECK_LENGTH_1_CONDITION_` = "true")
 SparkR::sparkR.session()
 t <- Sys.time()
 sdf <- SparkR::createDataFrame(data.frame(x = t + c(-1, 1, -1, 1, -1)))
 SparkR::collect(SparkR::filter(sdf, SparkR::column('x') > t))
}
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x'
#> in selecting a method for function 'collect': error in evaluating the
#> argument 'condition' in selecting a method for function 'filter': the
#> condition has length > 1
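
The same failure can be reproduced in plain base R without SparkR, which isolates the root cause as the length-2 `if` condition (a hedged sketch; the exact error wording may vary slightly across R versions):

```r
Sys.setenv(`_R_CHECK_LENGTH_1_CONDITION_` = "true")  # mimic R >= 4.2.0 behaviour on older R
x <- Sys.time()
if (class(x) == "Column") "is a Column" else "not a Column"
#> Error in if (class(x) == "Column") ... : the condition has length > 1
```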

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Added multiple relevant SparkR unit test cases to verify the fix; the Jira 
ticket number is added in the new test_that block.
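
A hedged sketch of what one such test_that block could look like (illustrative only, not the exact test added in `R/pkg/tests/fulltests/test_sparkSQL.R`; it assumes an active SparkR session with SparkR and testthat attached):

```r
test_that("SPARK-41937: comparing a datetime column with Sys.time() works", {
  t <- Sys.time()
  sdf <- createDataFrame(data.frame(x = t + c(-1, 1, -1, 1, -1)))
  # Before the fix this raised "the condition has length > 1" in R >= 4.2.0;
  # expect_error(expr, NA) asserts that no error is raised.
  expect_error(collect(filter(sdf, column("x") > t)), NA)
})
```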

Closes #39454 from atalv/r42-comparetime-fixclass.

Authored-by: Vivek Atal 
Signed-off-by: Hyukjin Kwon 
---
 R/pkg/R/DataFrame.R   |  4 +--
 R/pkg/R/column.R  |  6 ++---
 R/pkg/R/functions.R   |  8 +++---
 R/pkg/tests/fulltests/test_sparkSQL.R | 48 +++
 4 files changed, 57 insertions(+), 9 deletions(-)

diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 456e3d9509f..3f9bc9cb6d0 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -3366,7 +3366,7 @@ setMethod("na.omit",
 setMethod("fillna",
   signature(x = "SparkDataFrame"),
   function(x, value, cols = NULL) {
-if (!(class(value) %in% c("integer", "numeric", "character", "list"))) {
+if (!(inherits(value, c("integer", "numeric", "character", "list")))) {
   stop("value should be an integer, numeric, character or named list.")
 }
 
@@ -3378,7 +3378,7 @@ setMethod("fillna",
   }
   # Check each item in the named list is of valid type
   lapply(value, function(v) {
-if (!(class(v) %in% c("integer", "numeric", "character"))) {
+if (!(inherits(v, c("integer", "numeric", "character")))) {
   stop("Each item in value should be an integer, numeric or character.")
 }
   })
diff --git a/R/pkg/R/column.R b/R/pkg/R/column.R
index f1fd30e144b..e4865056f58 100644
--- a/R/pkg/R/column.R
+++ b/R/pkg/R/column.R
@@ -85,7 +85,7 @@ createOperator <- function(op) {
   callJMethod(e1@jc, operators[[op]])
 }
   } else {
-if (class(e2) == "Column") {
+if (inherits(e2, "Column")) {
   e2 <- e2@jc
 }
 if (op == "^") {
@@ -110,7 +110,7 @@ createColumnF

[spark] branch master updated: [SPARK-41938][BUILD] Upgrade sbt to 1.8.2

2023-01-08 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1ba37740a13 [SPARK-41938][BUILD] Upgrade sbt to 1.8.2
1ba37740a13 is described below

commit 1ba37740a13ffa7fb10efeef91ea76c6d5737ad0
Author: panbingkun 
AuthorDate: Mon Jan 9 09:42:06 2023 +0900

[SPARK-41938][BUILD] Upgrade sbt to 1.8.2

### What changes were proposed in this pull request?
This PR aims to upgrade sbt from 1.8.0 to 1.8.2.

### Why are the changes needed?
Release notes:
- https://github.com/sbt/sbt/releases/tag/v1.8.2
- https://github.com/sbt/sbt/releases/tag/v1.8.1
Screenshots:
- https://user-images.githubusercontent.com/15246973/211179459-00c846b4-ba1e-4367-9192-e85e85ad15bc.png
- https://user-images.githubusercontent.com/15246973/211179482-f0a40a1e-6213-48de-9b77-6596140b38c0.png

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes #39453 from panbingkun/SPARK-41938.

Authored-by: panbingkun 
Signed-off-by: Hyukjin Kwon 
---
 dev/appveyor-install-dependencies.ps1 | 2 +-
 project/build.properties  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/dev/appveyor-install-dependencies.ps1 
b/dev/appveyor-install-dependencies.ps1
index caa281d5c9c..a369e9285a0 100644
--- a/dev/appveyor-install-dependencies.ps1
+++ b/dev/appveyor-install-dependencies.ps1
@@ -97,7 +97,7 @@ if (!(Test-Path $tools)) {
 # == SBT
 Push-Location $tools
 
-$sbtVer = "1.8.0"
+$sbtVer = "1.8.2"
 Start-FileDownload "https://github.com/sbt/sbt/releases/download/v$sbtVer/sbt-$sbtVer.zip" "sbt.zip"
 
 # extract
diff --git a/project/build.properties b/project/build.properties
index 33236b9f48d..04e6352b12c 100644
--- a/project/build.properties
+++ b/project/build.properties
@@ -15,4 +15,4 @@
 # limitations under the License.
 #
 # Please update the version in appveyor-install-dependencies.ps1 together.
-sbt.version=1.8.0
+sbt.version=1.8.2


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-28764][CORE][TEST] Remove writePartitionedFile in ExternalSorter

2023-01-08 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 379ef50d142 [SPARK-28764][CORE][TEST] Remove writePartitionedFile in 
ExternalSorter
379ef50d142 is described below

commit 379ef50d142537d01e31dc1c1402632de09388ee
Author: smallzhongfeng 
AuthorDate: Mon Jan 9 09:51:13 2023 +0900

[SPARK-28764][CORE][TEST] Remove writePartitionedFile in ExternalSorter

### What changes were proposed in this pull request?

Remove `writePartitionedFile` as this is only used by 
UnsafeRowSerializerSuite in the SQL project.

### Why are the changes needed?

`writePartitionedFile` in `ExternalSorter` is only used by `UnsafeRowSerializerSuite` in the SQL project, so it can be removed and the suite updated accordingly.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Updated the existing unit tests.

Closes #39368 from smallzhongfeng/SPARK-28764.

Authored-by: smallzhongfeng 
Signed-off-by: Hyukjin Kwon 
---
 .../spark/util/collection/ExternalSorter.scala | 49 +-
 .../sql/execution/UnsafeRowSerializerSuite.scala   | 17 +---
 2 files changed, 12 insertions(+), 54 deletions(-)

diff --git 
a/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala 
b/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
index 284e70e2b05..4ca838b7655 100644
--- a/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
+++ b/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala
@@ -66,7 +66,7 @@ import org.apache.spark.util.{CompletionIterator, Utils => 
TryUtils}
  *
  * 3. Request an iterator() back to traverse sorted/aggregated records.
  * - or -
- *Invoke writePartitionedFile() to create a file containing 
sorted/aggregated outputs
+ *Invoke writePartitionedMapOutput() to create a file containing 
sorted/aggregated outputs
  *that can be used in Spark's sort shuffle.
  *
  * At a high level, this class works internally as follows:
@@ -687,53 +687,6 @@ private[spark] class ExternalSorter[K, V, C](
 CompletionIterator[Product2[K, C], Iterator[Product2[K, C]]](iterator, 
stop())
   }
 
-  /**
-   * TODO(SPARK-28764): remove this, as this is only used by 
UnsafeRowSerializerSuite in the SQL
-   * project. We should figure out an alternative way to test that so that we 
can remove this
-   * otherwise unused code path.
-   */
-  def writePartitionedFile(
-  blockId: BlockId,
-  outputFile: File): Array[Long] = {
-
-// Track location of each range in the output file
-val lengths = new Array[Long](numPartitions)
-val writer = blockManager.getDiskWriter(blockId, outputFile, serInstance, 
fileBufferSize,
-  context.taskMetrics().shuffleWriteMetrics)
-
-if (spills.isEmpty) {
-  // Case where we only have in-memory data
-  val collection = if (aggregator.isDefined) map else buffer
-  val it = 
collection.destructiveSortedWritablePartitionedIterator(comparator)
-  while (it.hasNext) {
-val partitionId = it.nextPartition()
-while (it.hasNext && it.nextPartition() == partitionId) {
-  it.writeNext(writer)
-}
-val segment = writer.commitAndGet()
-lengths(partitionId) = segment.length
-  }
-} else {
-  // We must perform merge-sort; get an iterator by partition and write 
everything directly.
-  for ((id, elements) <- this.partitionedIterator) {
-if (elements.hasNext) {
-  for (elem <- elements) {
-writer.write(elem._1, elem._2)
-  }
-  val segment = writer.commitAndGet()
-  lengths(id) = segment.length
-}
-  }
-}
-
-writer.close()
-context.taskMetrics().incMemoryBytesSpilled(memoryBytesSpilled)
-context.taskMetrics().incDiskBytesSpilled(diskBytesSpilled)
-context.taskMetrics().incPeakExecutionMemory(peakMemoryUsedBytes)
-
-lengths
-  }
-
   /**
* Write all the data added into this ExternalSorter into a map output 
writer that pushes bytes
* to some arbitrary backing store. This is called by the SortShuffleWriter.
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeRowSerializerSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeRowSerializerSuite.scala
index 3b9984a312e..d9493421061 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeRowSerializerSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeRowSerializerSuite.scala
@@ -17,20 +17,20 @@
 
 package org.apache.spark.sql.execution
 
-import java.io.{ByteArrayInputStream, ByteArrayOutputStream, File}
-import java.util.Properties
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
+import java.util.{HashMap, Properties}
 
 import org.apache.spark

[spark] branch master updated: [SPARK-41894][SS][TESTS] Restore the write permission of `commitDir` after run `testAsyncWriteErrorsPermissionsIssue`

2023-01-08 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 90da0baf5ab [SPARK-41894][SS][TESTS] Restore the write permission of 
`commitDir` after run `testAsyncWriteErrorsPermissionsIssue`
90da0baf5ab is described below

commit 90da0baf5ab6319f3219f5270f7ecb5e6501adcc
Author: yangjie01 
AuthorDate: Mon Jan 9 11:22:20 2023 +0900

[SPARK-41894][SS][TESTS] Restore the write permission of `commitDir` after 
run `testAsyncWriteErrorsPermissionsIssue`

### What changes were proposed in this pull request?
This PR aims to restore the write permission of `commitDir` after running 
`testAsyncWriteErrorsPermissionsIssue` in 
`AsyncProgressTrackingMicroBatchExecutionSuite`, so that `mvn clean` can run 
successfully.

### Why are the changes needed?
Make `mvn clean` run successfully after running `mvn test` with 
`AsyncProgressTrackingMicroBatchExecutionSuite`.

### Does this PR introduce _any_ user-facing change?
No, this is a test-only change.

### How was this patch tested?

- Pass Github Actions
- Manual test

```
build/mvn clean install -pl sql/core -am -DskipTests
build/mvn clean test -pl sql/core -Dtest=none 
-DwildcardSuites=org.apache.spark.sql.execution.streaming.AsyncProgressTrackingMicroBatchExecutionSuite
build/mvn clean -pl sql/core
```

**Before**

`build/mvn clean -pl sql/core` run failed

```
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-clean-plugin:3.1.0:clean (default-clean) on 
project spark-sql_2.12: Failed to clean project: Failed to delete 
/${basedir}/sql/core/target/tmp/streaming.metadata-4d41fbcc-d517-4159-961c-95688dadd2c8/offsets/0
 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, 
please read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
```

**After**

All three commands run successfully.

Closes #39406 from LuciferYang/SPARK-41894.

Authored-by: yangjie01 
Signed-off-by: Jungtaek Lim 
---
 ...cProgressTrackingMicroBatchExecutionSuite.scala | 59 --
 1 file changed, 32 insertions(+), 27 deletions(-)

diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala
index 6b51367f207..d083cac48ff 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecutionSuite.scala
@@ -722,36 +722,41 @@ class AsyncProgressTrackingMicroBatchExecutionSuite
 val inputData = new MemoryStream[Int](id = 0, sqlContext = sqlContext)
 val ds = inputData.toDS()
 val checkpointLocation = Utils.createTempDir(namePrefix = 
"streaming.metadata").getCanonicalPath
+val commitDir = new File(checkpointLocation + path)
 
-testStream(
-  ds,
-  extraOptions = Map(
-ASYNC_PROGRESS_TRACKING_ENABLED -> "true",
-ASYNC_PROGRESS_TRACKING_CHECKPOINTING_INTERVAL_MS -> "0"
-  )
-)(
-  StartStream(checkpointLocation = checkpointLocation),
-  AddData(inputData, 0),
-  CheckAnswer(0),
-  Execute { q =>
-waitPendingOffsetWrites(q)
-// to simulate write error
-import java.io._
-val commitDir = new File(checkpointLocation + path)
-commitDir.setReadOnly()
+try {
+  testStream(
+ds,
+extraOptions = Map(
+  ASYNC_PROGRESS_TRACKING_ENABLED -> "true",
+  ASYNC_PROGRESS_TRACKING_CHECKPOINTING_INTERVAL_MS -> "0"
+)
+  )(
+StartStream(checkpointLocation = checkpointLocation),
+AddData(inputData, 0),
+CheckAnswer(0),
+Execute { q =>
+  waitPendingOffsetWrites(q)
+  // to simulate write error
+  commitDir.setReadOnly()
 
-  },
-  AddData(inputData, 1),
-  Execute {
-q =>
-  eventually(timeout(Span(5, Seconds))) {
-val e = intercept[StreamingQueryException] {
-  q.processAllAvailable()
+},
+AddData(inputData, 1),
+Execute {
+  q =>
+eventually(timeout(Span(5, Seconds))) {
+  val e = intercept[StreamingQueryException] {
+q.processAllAvailable()
+  }
+  e.getCause.getCause.getMessage sho

[spark] branch master updated: [SPARK-40711][SQL] Add spill size metrics for window

2023-01-08 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new c6788e5c2fc [SPARK-40711][SQL] Add spill size metrics for window
c6788e5c2fc is described below

commit c6788e5c2fc3ea6e65f9d49f98dec1cd5f2b820d
Author: ulysses-you 
AuthorDate: Mon Jan 9 12:44:23 2023 +0800

[SPARK-40711][SQL] Add spill size metrics for window

### What changes were proposed in this pull request?

Window may spill if a single partition is too large to hold in memory. This PR 
makes the window operator report a spill size metric.

### Why are the changes needed?

Helps users get window spill information, i.e. track how much data is spilled.

### Does this PR introduce _any_ user-facing change?

Yes, a new metric; people can see it in the UI.

### How was this patch tested?

Added a test for window and a manual test for `WindowInPandasExec`:

Screenshot: https://user-images.githubusercontent.com/12025282/194706054-91c75f5f-e513-40fb-a148-6493d97f8c51.png

Closes #38163 from ulysses-you/window-metrics.

Authored-by: ulysses-you 
Signed-off-by: Wenchen Fan 
---
 .../spark/sql/execution/python/WindowInPandasExec.scala  |  6 ++
 .../apache/spark/sql/execution/window/WindowExec.scala   |  6 ++
 .../spark/sql/execution/metric/SQLMetricsSuite.scala | 16 
 3 files changed, 28 insertions(+)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala
index dcaffed89cc..5e903aa991d 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala
@@ -29,6 +29,7 @@ import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.execution.{ExternalAppendOnlyUnsafeRowArray, 
SparkPlan}
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.sql.execution.window._
 import org.apache.spark.sql.types._
 import org.apache.spark.sql.util.ArrowUtils
@@ -85,6 +86,9 @@ case class WindowInPandasExec(
 orderSpec: Seq[SortOrder],
 child: SparkPlan)
   extends WindowExecBase with PythonSQLMetrics {
+  override lazy val metrics: Map[String, SQLMetric] = pythonMetrics ++ Map(
+"spillSize" -> SQLMetrics.createSizeMetric(sparkContext, "spill size")
+  )
 
   /**
* Helper functions and data structures for window bounds
@@ -245,6 +249,7 @@ case class WindowInPandasExec(
 
 val allInputs = windowBoundsInput ++ dataInputs
 val allInputTypes = allInputs.map(_.dataType)
+val spillSize = longMetric("spillSize")
 
 // Start processing.
 child.execute().mapPartitions { iter =>
@@ -337,6 +342,7 @@ case class WindowInPandasExec(
   if (!found) {
 // clear final partition
 buffer.clear()
+spillSize += buffer.spillSize
   }
   found
 }
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
index dc85585b13d..dda5da6c9e9 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala
@@ -21,6 +21,7 @@ import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.execution.{ExternalAppendOnlyUnsafeRowArray, 
SparkPlan}
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 
 /**
  * This class calculates and outputs (windowed) aggregates over the rows in a 
single (sorted)
@@ -89,6 +90,9 @@ case class WindowExec(
 orderSpec: Seq[SortOrder],
 child: SparkPlan)
   extends WindowExecBase {
+  override lazy val metrics: Map[String, SQLMetric] = Map(
+"spillSize" -> SQLMetrics.createSizeMetric(sparkContext, "spill size")
+  )
 
   protected override def doExecute(): RDD[InternalRow] = {
 // Unwrap the window expressions and window frame factories from the map.
@@ -96,6 +100,7 @@ case class WindowExec(
 val factories = windowFrameExpressionFactoryPairs.map(_._2).toArray
 val inMemoryThreshold = conf.windowExecBufferInMemoryThreshold
 val spillThreshold = conf.windowExecBufferSpillThreshold
+val spillSize = longMetric("spillSize")
 
 // Start processing.
 child.execute().mapPartitions { stream =>
@@ -163,6 +168,7 @@ case class WindowExec(
   if (!found) {
 // clear final par

[spark] branch master updated: [SPARK-41941][BUILD] Upgrade `scalatest` related test dependencies to 3.2.15

2023-01-08 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3f381bf270a [SPARK-41941][BUILD] Upgrade `scalatest` related test 
dependencies to 3.2.15
3f381bf270a is described below

commit 3f381bf270a6968d1a37b00802bec4f98710cad0
Author: yangjie01 
AuthorDate: Sun Jan 8 21:21:50 2023 -0800

[SPARK-41941][BUILD] Upgrade `scalatest` related test dependencies to 3.2.15

### What changes were proposed in this pull request?
This PR aims to upgrade `scalatest`-related test dependencies to 3.2.15:

- scalatest: upgrade scalatest to 3.2.15
- scalatestplus
   - scalacheck: upgrade `scalacheck-1-17` to 3.2.15.0
   - mockito: upgrade `mockito-4-6` to 3.2.15.0
   - selenium: upgrade `selenium-4-7` to 3.2.15.0, `selenium-java` to 4.7.2, and `htmlunit-driver` to 4.7.2

### Why are the changes needed?
The release notes are as follows:

- scalatest: https://github.com/scalatest/scalatest/releases/tag/release-3.2.15
- scalatestplus
   - scalacheck-1-17: 
https://github.com/scalatest/scalatestplus-scalacheck/releases/tag/release-3.2.15.0-for-scalacheck-1.17
   - mockito-4-6: 
https://github.com/scalatest/scalatestplus-mockito/releases/tag/release-3.2.15.0-for-mockito-4.6
   - selenium-4-7: 
https://github.com/scalatest/scalatestplus-selenium/releases/tag/release-3.2.15.0-for-selenium-4.7

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Pass GitHub Actions
- Manual test:
   - ChromeUISeleniumSuite
   - RocksDBBackendChromeUIHistoryServerSuite

```
build/sbt -Dguava.version=31.1-jre 
-Dspark.test.webdriver.chrome.driver=/path/to/chromedriver 
-Dtest.default.exclude.tags="" -Phive -Phive-thriftserver "core/testOnly 
org.apache.spark.ui.ChromeUISeleniumSuite"

build/sbt -Dguava.version=31.1-jre 
-Dspark.test.webdriver.chrome.driver=/path/to/chromedriver 
-Dtest.default.exclude.tags="" -Phive -Phive-thriftserver "core/testOnly 
org.apache.spark.deploy.history.RocksDBBackendChromeUIHistoryServerSuite"
```

```
ChromeDriver was started successfully.
[info] - SPARK-31534: text for tooltip should be escaped (3 seconds, 421 
milliseconds)
[info] - SPARK-31882: Link URL for Stage DAGs should not depend on paged 
table. (945 milliseconds)
[info] - SPARK-31886: Color barrier execution mode RDD correctly (310 
milliseconds)
[info] - Search text for paged tables should not be saved (1 second, 761 
milliseconds)
[info] Run completed in 10 seconds, 809 milliseconds.
[info] Total number of tests run: 4
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 123 s (02:03), completed 2023-1-8 21:33:56
```

```
ChromeDriver was started successfully.
[info] - ajax rendered relative links are prefixed with uiRoot 
(spark.ui.proxyBase) (2 seconds, 341 milliseconds)
[info] Run completed in 8 seconds, 792 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 23 s, completed 2023-1-8 21:34:48
```

Closes #39458 from LuciferYang/SPARK-41941.

Authored-by: yangjie01 
Signed-off-by: Dongjoon Hyun 
---
 pom.xml | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/pom.xml b/pom.xml
index 5e28fd4edfe..4e68c69f01a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -207,8 +207,8 @@
 
 4.9.3
 1.1
-4.7.1
-4.7.0
+4.7.2
+4.7.2
 2.67.0
 1.8
 1.1.0
@@ -1129,25 +1129,25 @@
   
 org.scalatest
 scalatest_${scala.binary.version}
-3.2.14
+3.2.15
 test
   
   
 org.scalatestplus
 scalacheck-1-17_${scala.binary.version}
-3.2.14.0
+3.2.15.0
 test
   
   
 org.scalatestplus
 mockito-4-6_${scala.binary.version}
-3.2.14.0
+3.2.15.0
 test
   
   
 org.scalatestplus
 selenium-4-7_${scala.binary.version}
-3.2.14.0
+3.2.15.0
 test
   
   


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (6f31566094a -> 5ab2cfa865b)

2023-01-08 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 6f31566094a [SPARK-41805][SQL] Reuse expressions in 
WindowSpecDefinition
 add 5ab2cfa865b [SPARK-41354][CONNECT][PYTHON] Implement 
RepartitionByExpression

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/connect/dataframe.py| 87 --
 python/pyspark/sql/connect/plan.py | 35 +
 .../sql/tests/connect/test_connect_basic.py| 52 +
 .../pyspark/sql/tests/connect/test_connect_plan.py | 26 +++
 4 files changed, 193 insertions(+), 7 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-41581][SQL] Update `_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR`

2023-01-08 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 6b92cda04e6 [SPARK-41581][SQL] Update `_LEGACY_ERROR_TEMP_1230` as 
`INTERNAL_ERROR`
6b92cda04e6 is described below

commit 6b92cda04e618f82711587d027fa20601e094418
Author: itholic 
AuthorDate: Mon Jan 9 10:41:49 2023 +0300

[SPARK-41581][SQL] Update `_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR`

### What changes were proposed in this pull request?

This PR proposes to update `_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR`.

### Why are the changes needed?

We should assign a proper name to `_LEGACY_ERROR_TEMP_*` error classes.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

`./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"`

Closes #39282 from itholic/LEGACY_1230.

Authored-by: itholic 
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json|  5 -
 .../apache/spark/sql/errors/QueryCompilationErrors.scala| 10 --
 .../scala/org/apache/spark/sql/types/DecimalSuite.scala | 13 -
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json 
b/core/src/main/resources/error/error-classes.json
index 5409507c3c8..a3acb940585 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -2944,11 +2944,6 @@
   " can only support precision up to ."
 ]
   },
-  "_LEGACY_ERROR_TEMP_1230" : {
-"message" : [
-  "Negative scale is not allowed: . You can use =true to 
enable legacy mode to allow it."
-]
-  },
   "_LEGACY_ERROR_TEMP_1231" : {
 "message" : [
   " is not a valid partition column in table ."
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 2ced0b8ac7a..25005a1f609 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -21,7 +21,7 @@ import scala.collection.mutable
 
 import org.apache.hadoop.fs.Path
 
-import org.apache.spark.{SparkThrowable, SparkThrowableHelper}
+import org.apache.spark.{SparkException, SparkThrowable, SparkThrowableHelper}
 import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.{FunctionIdentifier, QualifiedTableName, 
TableIdentifier}
 import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
FunctionAlreadyExistsException, NamespaceAlreadyExistsException, 
NoSuchFunctionException, NoSuchNamespaceException, NoSuchPartitionException, 
NoSuchTableException, ResolvedTable, Star, TableAlreadyExistsException, 
UnresolvedRegex}
@@ -2242,11 +2242,9 @@ private[sql] object QueryCompilationErrors extends 
QueryErrorsBase {
   }
 
   def negativeScaleNotAllowedError(scale: Int): Throwable = {
-new AnalysisException(
-  errorClass = "_LEGACY_ERROR_TEMP_1230",
-  messageParameters = Map(
-"scale" -> scale.toString,
-"config" -> LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED.key))
+SparkException.internalError(s"Negative scale is not allowed: 
${scale.toString}." +
+  s" Set the config 
${toSQLConf(LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED.key)}" +
+  " to \"true\" to allow it.")
   }
 
   def invalidPartitionColumnKeyInTableError(key: String, tblName: String): 
Throwable = {
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
index 73944d9dff9..465c25118fa 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala
@@ -19,8 +19,7 @@ package org.apache.spark.sql.types
 
 import org.scalatest.PrivateMethodTester
 
-import org.apache.spark.{SparkArithmeticException, SparkFunSuite, 
SparkNumberFormatException}
-import org.apache.spark.sql.AnalysisException
+import org.apache.spark.{SparkArithmeticException, SparkException, 
SparkFunSuite, SparkNumberFormatException}
 import org.apache.spark.sql.catalyst.plans.SQLHelper
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.Decimal._
@@ -111,9 +110,13 @@ class DecimalSuite extends SparkFunSuite with 
PrivateMethodTester with SQLHelper
 
   test("SPARK-30252: Negative scale is not allowed by default") {
 def checkNegativeScaleDecimal(d: => Decimal): Unit = {
-  intercept[AnalysisException](d)
-.getMessage
-.contains("Negative scale is not all