[spark] branch master updated: [SPARK-45333][CORE] Fix one unit mistake related to spark.eventLog.buffer.kb
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 91b76070b05f [SPARK-45333][CORE] Fix one unit mistake related to spark.eventLog.buffer.kb
91b76070b05f is described below

commit 91b76070b05fdf026a41f22c12e404ee99bd8cc3
Author: lanmengran1
AuthorDate: Tue Sep 26 14:56:51 2023 +0900

    [SPARK-45333][CORE] Fix one unit mistake related to spark.eventLog.buffer.kb

    ### What changes were proposed in this pull request?
    Fix a unit mistake in the usage of the configuration "spark.eventLog.buffer.kb": its value is specified in KiB but was being passed to the output buffer as a byte count.

    ### Why are the changes needed?
    So that the event log output buffer has the expected size.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    No new tests; the change is a one-line unit conversion.

    Closes #42294 from amoylan2/fix_unit_mistake_in_eventLogOutputBufferSize.

    Authored-by: lanmengran1
    Signed-off-by: Hyukjin Kwon
---
 .../scala/org/apache/spark/deploy/history/EventLogFileWriters.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileWriters.scala b/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileWriters.scala
index 144dadf29bc3..e7eb05c85367 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileWriters.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/EventLogFileWriters.scala
@@ -57,7 +57,7 @@ abstract class EventLogFileWriter(
   protected val shouldCompress = sparkConf.get(EVENT_LOG_COMPRESS) &&
     !sparkConf.get(EVENT_LOG_COMPRESSION_CODEC).equalsIgnoreCase("none")
   protected val shouldOverwrite = sparkConf.get(EVENT_LOG_OVERWRITE)
-  protected val outputBufferSize = sparkConf.get(EVENT_LOG_OUTPUT_BUFFER_SIZE).toInt
+  protected val outputBufferSize = sparkConf.get(EVENT_LOG_OUTPUT_BUFFER_SIZE).toInt * 1024
   protected val fileSystem = Utils.getHadoopFileSystem(logBaseDir, hadoopConf)
   protected val compressionCodec =
     if (shouldCompress) {
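The mistake is the classic KiB-versus-bytes mix-up: the config value is denominated in kibibytes, but the buffer API takes bytes. A minimal standalone sketch of the pattern (illustrative values, not the actual Spark source):

```scala
import java.io.{BufferedOutputStream, ByteArrayOutputStream}

val bufferKb = 100                      // e.g. spark.eventLog.buffer.kb = 100 (KiB)
val sink = new ByteArrayOutputStream()

// Before the fix: the KiB value was used directly, creating a 100-*byte* buffer.
val tooSmall = new BufferedOutputStream(sink, bufferKb)

// After the fix: convert KiB to bytes before sizing the buffer.
val asExpected = new BufferedOutputStream(sink, bufferKb * 1024)
```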
svn commit: r64200 - /release/spark/spark-3.2.4/
Author: dongjoon
Date: Tue Sep 26 04:26:46 2023
New Revision: 64200

Log:
Remove Apache Spark 3.2.4, which has reached end of life

Removed:
    release/spark/spark-3.2.4/
svn commit: r64199 - /release/spark/spark-3.4.0/
Author: dongjoon
Date: Tue Sep 26 04:25:22 2023
New Revision: 64199

Log:
Remove Apache Spark 3.4.0 after uploading 3.4.1

Removed:
    release/spark/spark-3.4.0/
svn commit: r64198 - in /dev/spark: v3.5.0-rc1-bin/ v3.5.0-rc1-docs/ v3.5.0-rc2-bin/ v3.5.0-rc2-docs/ v3.5.0-rc3-bin/ v3.5.0-rc3-docs/ v3.5.0-rc4-bin/ v3.5.0-rc4-docs/
Author: dongjoon
Date: Tue Sep 26 04:24:47 2023
New Revision: 64198

Log:
Remove Apache Spark 3.5.0 RC artifacts after releasing

Removed:
    dev/spark/v3.5.0-rc1-bin/
    dev/spark/v3.5.0-rc1-docs/
    dev/spark/v3.5.0-rc2-bin/
    dev/spark/v3.5.0-rc2-docs/
    dev/spark/v3.5.0-rc3-bin/
    dev/spark/v3.5.0-rc3-docs/
    dev/spark/v3.5.0-rc4-bin/
    dev/spark/v3.5.0-rc4-docs/
svn commit: r64197 - /dev/spark/v3.4.1-rc1-docs/
Author: dongjoon
Date: Tue Sep 26 04:23:28 2023
New Revision: 64197

Log:
Remove Apache Spark 3.4.1 RC1 docs after releasing

Removed:
    dev/spark/v3.4.1-rc1-docs/
[spark] branch master updated: [SPARK-45248][CORE] Set the timeout for spark ui server
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 273a375cd314 [SPARK-45248][CORE] Set the timeout for spark ui server
273a375cd314 is described below

commit 273a375cd314fbf52b5f2538526374f6b24fb2cf
Author: chenyu <119398199+chenyu-opensou...@users.noreply.github.com>
AuthorDate: Mon Sep 25 22:38:27 2023 -0500

    [SPARK-45248][CORE] Set the timeout for spark ui server

    **What changes were proposed in this pull request?**
    This PR sets an idle timeout for the Spark UI server.

    **Why are the changes needed?**
    It mitigates slow-HTTP denial-of-service attacks; the Jetty server's idle timeout is 30 seconds by default.

    **Does this PR introduce any user-facing change?**
    No

    **How was this patch tested?**
    Manual review

    **Was this patch authored or co-authored using generative AI tooling?**
    No

    Closes #43078 from chenyu-opensource/branch-SPARK-45248-new.

    Authored-by: chenyu <119398199+chenyu-opensou...@users.noreply.github.com>
    Signed-off-by: Sean Owen
---
 core/src/main/scala/org/apache/spark/ui/JettyUtils.scala | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala b/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
index 9582bdbf5264..22adcbc32ed8 100644
--- a/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
+++ b/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
@@ -296,6 +296,8 @@ private[spark] object JettyUtils extends Logging {
       connector.setPort(port)
       connector.setHost(hostName)
       connector.setReuseAddress(!Utils.isWindows)
+      // spark-45248: set the idle timeout to prevent slow DoS
+      connector.setIdleTimeout(8000)

       // Currently we only use "SelectChannelConnector"
       // Limit the max acceptor number to 8 so that we don't waste a lot of threads
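The idle timeout is a standard knob on Jetty's `ServerConnector`. A minimal sketch outside of Spark's `JettyUtils` wiring (assumes Jetty 9+ on the classpath; the port is illustrative):

```scala
import org.eclipse.jetty.server.{Server, ServerConnector}

val server = new Server()
val connector = new ServerConnector(server)
connector.setPort(4040)
// Drop connections that stay idle for more than 8 seconds, so a slow-HTTP
// (Slowloris-style) client cannot hold sockets open indefinitely.
connector.setIdleTimeout(8000L)
server.addConnector(connector)
```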
[spark] branch master updated: [SPARK-45299][TESTS] Remove JDK 8 workaround in UtilsSuite
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new d49780037b51 [SPARK-45299][TESTS] Remove JDK 8 workaround in UtilsSuite
d49780037b51 is described below

commit d49780037b5169d478a505ce9e637234f3eadb67
Author: Hyukjin Kwon
AuthorDate: Mon Sep 25 20:17:43 2023 -0700

    [SPARK-45299][TESTS] Remove JDK 8 workaround in UtilsSuite

    ### What changes were proposed in this pull request?
    This PR removes the legacy workaround for JDK 7 and below added in SPARK-12486. The main code was cleaned up in SPARK-16182, but the test code was not.

    ### Why are the changes needed?
    To remove a legacy workaround.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Fixed unit tests.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43084 from HyukjinKwon/SPARK-45299.

    Lead-authored-by: Hyukjin Kwon
    Co-authored-by: Hyukjin Kwon
    Signed-off-by: Dongjoon Hyun
---
 .../scala/org/apache/spark/util/UtilsSuite.scala | 75 ++
 1 file changed, 33 insertions(+), 42 deletions(-)

diff --git a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
index 2a91b45ef5b7..58ce15cfaf81 100644
--- a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
@@ -32,7 +32,7 @@ import scala.util.Random
 import com.google.common.io.Files
 import org.apache.commons.io.IOUtils
-import org.apache.commons.lang3.{JavaVersion, SystemUtils}
+import org.apache.commons.lang3.SystemUtils
 import org.apache.commons.math3.stat.inference.ChiSquareTest
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.fs.Path
@@ -983,19 +983,13 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
     // Verify that we can terminate a process even if it is in a bad state. This is only run
     // on UNIX since it does some OS specific things to verify the correct behavior.
     if (SystemUtils.IS_OS_UNIX) {
-      def getPid(p: Process): Int = {
-        val f = p.getClass().getDeclaredField("pid")
-        f.setAccessible(true)
-        f.get(p).asInstanceOf[Int]
-      }
-
-      def pidExists(pid: Int): Boolean = {
+      def pidExists(pid: Long): Boolean = {
         val p = Runtime.getRuntime.exec(Array("kill", "-0", s"$pid"))
         p.waitFor()
         p.exitValue() == 0
       }

-      def signal(pid: Int, s: String): Unit = {
+      def signal(pid: Long, s: String): Unit = {
         val p = Runtime.getRuntime.exec(Array("kill", s"-$s", s"$pid"))
         p.waitFor()
       }
@@ -1003,8 +997,8 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
       // Start up a process that runs 'sleep 10'. Terminate the process and assert it takes
       // less time and the process is no longer there.
       val startTimeNs = System.nanoTime()
-      val process = new ProcessBuilder("sleep", "10").start()
-      val pid = getPid(process)
+      var process = new ProcessBuilder("sleep", "10").start()
+      var pid = process.toHandle.pid()
       try {
         assert(pidExists(pid))
         val terminated = Utils.terminateProcess(process, 5000)
@@ -1018,37 +1012,34 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
         signal(pid, "SIGKILL")
       }

-      if (SystemUtils.isJavaVersionAtLeast(JavaVersion.JAVA_1_8)) {
-        // We'll make sure that forcibly terminating a process works by
-        // creating a very misbehaving process. It ignores SIGTERM and has been SIGSTOPed. On
-        // older versions of java, this will *not* terminate.
-        val file = File.createTempFile("temp-file-name", ".tmp")
-        file.deleteOnExit()
-        val cmd =
-          s"""
-             |#!/usr/bin/env bash
-             |trap "" SIGTERM
-             |sleep 10
-          """.stripMargin
-        Files.write(cmd.getBytes(UTF_8), file)
-        file.getAbsoluteFile.setExecutable(true)
-
-        val process = new ProcessBuilder(file.getAbsolutePath).start()
-        val pid = getPid(process)
-        assert(pidExists(pid))
-        try {
-          signal(pid, "SIGSTOP")
-          val startNs = System.nanoTime()
-          val terminated = Utils.terminateProcess(process, 5000)
-          assert(terminated.isDefined)
-          process.waitFor(5, TimeUnit.SECONDS)
-          val duration = System.nanoTime() - startNs
-          // add a little extra time to allow a force kill to finish
-          assert(duration < TimeUnit.SECONDS.toNanos(6))
-          assert(!pidExists(pid))
-        } finally {
-          signal(pid, "SIGKILL")
-        }
+      //
[spark] branch master updated: [SPARK-45300][SQL][TESTS] Remove JDK 8 workaround in TimestampFormatterSuite
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 933f17d13f6e [SPARK-45300][SQL][TESTS] Remove JDK 8 workaround in TimestampFormatterSuite
933f17d13f6e is described below

commit 933f17d13f6e72a687a383b8fc1797a9ba700a98
Author: Hyukjin Kwon
AuthorDate: Mon Sep 25 20:15:03 2023 -0700

    [SPARK-45300][SQL][TESTS] Remove JDK 8 workaround in TimestampFormatterSuite

    ### What changes were proposed in this pull request?
    This PR removes the legacy workaround for JDK 8 added in https://github.com/apache/spark/pull/28736.

    ### Why are the changes needed?
    - The main code is kept for completeness, in case future JDK versions diverge again; this PR only fixes the tests.
    - JDK 11/8 support was dropped in SPARK-44112.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Fixed unit tests.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43085 from HyukjinKwon/SPARK-45300.

    Authored-by: Hyukjin Kwon
    Signed-off-by: Dongjoon Hyun
---
 .../sql/catalyst/util/TimestampFormatterSuite.scala | 20 ++--
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala
index eb173bc7f8c8..ecd849dd3af9 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala
@@ -19,8 +19,6 @@ package org.apache.spark.sql.catalyst.util

 import java.time.{DateTimeException, LocalDateTime}

-import org.apache.commons.lang3.{JavaVersion, SystemUtils}
-
 import org.apache.spark.SparkUpgradeException
 import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._
 import org.apache.spark.sql.catalyst.util.DateTimeUtils._
@@ -333,14 +331,8 @@ class TimestampFormatterSuite extends DatetimeFormatterSuite {
     val micros1 = formatter.parse("2009-12-12 00 am")
     assert(micros1 === date(2009, 12, 12))

-    // JDK-8223773: DateTimeFormatter Fails to throw an Exception on Invalid HOUR_OF_AMPM
     // For `KK`, "12:00:00 am" is the same as "00:00:00 pm".
-    if (SystemUtils.isJavaVersionAtLeast(JavaVersion.JAVA_13)) {
-      intercept[DateTimeException](formatter.parse("2009-12-12 12 am"))
-    } else {
-      val micros2 = formatter.parse("2009-12-12 12 am")
-      assert(micros2 === date(2009, 12, 12, 12))
-    }
+    intercept[DateTimeException](formatter.parse("2009-12-12 12 am"))

     val micros3 = formatter.parse("2009-12-12 00 pm")
     assert(micros3 === date(2009, 12, 12, 12))
@@ -410,15 +402,7 @@ class TimestampFormatterSuite extends DatetimeFormatterSuite {
     val formatter = TimestampFormatter("DD", UTC, isParsing = false)
     assert(formatter.format(date(1970, 1, 3)) == "03")
     assert(formatter.format(date(1970, 4, 9)) == "99")
-
-    if (SystemUtils.isJavaVersionAtMost(JavaVersion.JAVA_1_8)) {
-      // https://bugs.openjdk.java.net/browse/JDK-8079628
-      intercept[SparkUpgradeException] {
-        formatter.format(date(1970, 4, 10))
-      }
-    } else {
-      assert(formatter.format(date(1970, 4, 10)) == "100")
-    }
+    assert(formatter.format(date(1970, 4, 10)) == "100")
   }

   test("SPARK-32424: avoid silent data change when timestamp overflows") {
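The two JDK behaviors behind the removed branches can be reproduced with plain `java.time` (sketch assumes JDK 13 or later, which all supported Spark builds now require):

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.Locale

// JDK-8079628: pattern "DD" used to fail on JDK 8 for three-digit
// day-of-year values; modern JDKs simply print all three digits.
val dd = DateTimeFormatter.ofPattern("DD")
println(LocalDate.of(1970, 4, 10).format(dd))  // "100"

// JDK-8223773: since JDK 13, parsing hour-of-am-pm "12" with pattern "KK"
// throws, because KK only admits values 00-11.
val kk = DateTimeFormatter.ofPattern("KK a", Locale.US)
try kk.parse("12 AM")
catch { case e: java.time.DateTimeException => println(s"rejected: ${e.getMessage}") }
```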
[spark] branch branch-3.5 updated: [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new 252970bab65d [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages
252970bab65d is described below

commit 252970bab65d80020bae5f86f35d29a75fe54804
Author: mayurb
AuthorDate: Tue Sep 26 11:04:07 2023 +0800

    [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages

    ### What changes were proposed in this pull request?
    [SPARK-25342](https://issues.apache.org/jira/browse/SPARK-25342) added support for rolling back a shuffle map stage so that all tasks of the stage can be retried when the stage output is indeterminate. This is done by clearing all map outputs at the time of stage submission. The approach works well except in the following case.

    Assume both Shuffle 1 and 2 are indeterminate:

    ShuffleMapStage1 --> Shuffle 1 --> ShuffleMapStage2 --> Shuffle 2 --> ResultStage

    - ShuffleMapStage1 is complete
    - A task from ShuffleMapStage2 fails with FetchFailed. Other tasks are still running
    - Both ShuffleMapStage1 and ShuffleMapStage2 are retried
    - ShuffleMapStage1 is retried and completes
    - The ShuffleMapStage2 reattempt is scheduled for execution
    - Before all tasks of the ShuffleMapStage2 reattempt can finish, one or more laggard tasks from the original attempt of ShuffleMapStage2 finish, and ShuffleMapStage2 also gets marked as complete
    - The Result Stage gets scheduled and finishes

    After this change, such laggard tasks from the old attempt of the indeterminate stage are ignored.

    ### Why are the changes needed?
    This can give wrong results when an indeterminate stage needs to be retried under the circumstances above.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    A new test case

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #42950 from mayurdb/rollbackFix.

    Authored-by: mayurb
    Signed-off-by: Wenchen Fan
    (cherry picked from commit 7ffc0b71aa3e416a9b21e0975a169b2a8a8403a8)
    Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/scheduler/DAGScheduler.scala  |  29 +++---
 .../apache/spark/scheduler/DAGSchedulerSuite.scala | 104 +
 2 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index fc83439454dc..d73bb6339015 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -1903,19 +1903,26 @@ private[spark] class DAGScheduler(

           case smt: ShuffleMapTask =>
             val shuffleStage = stage.asInstanceOf[ShuffleMapStage]
-            shuffleStage.pendingPartitions -= task.partitionId
-            val status = event.result.asInstanceOf[MapStatus]
-            val execId = status.location.executorId
-            logDebug("ShuffleMapTask finished on " + execId)
-            if (executorFailureEpoch.contains(execId) &&
+            // Ignore task completion for old attempt of indeterminate stage
+            val ignoreIndeterminate = stage.isIndeterminate &&
+              task.stageAttemptId < stage.latestInfo.attemptNumber()
+            if (!ignoreIndeterminate) {
+              shuffleStage.pendingPartitions -= task.partitionId
+              val status = event.result.asInstanceOf[MapStatus]
+              val execId = status.location.executorId
+              logDebug("ShuffleMapTask finished on " + execId)
+              if (executorFailureEpoch.contains(execId) &&
                 smt.epoch <= executorFailureEpoch(execId)) {
-              logInfo(s"Ignoring possibly bogus $smt completion from executor $execId")
+                logInfo(s"Ignoring possibly bogus $smt completion from executor $execId")
+              } else {
+                // The epoch of the task is acceptable (i.e., the task was launched after the most
+                // recent failure we're aware of for the executor), so mark the task's output as
+                // available.
+                mapOutputTracker.registerMapOutput(
+                  shuffleStage.shuffleDep.shuffleId, smt.partitionId, status)
+              }
             } else {
-              // The epoch of the task is acceptable (i.e., the task was launched after the most
-              // recent failure we're aware of for the executor), so mark the task's output as
-              // available.
-              mapOutputTracker.registerMapOutput(
-                shuffleStage.shuffleDep.shuffleId, smt.partitionId, status)
+
[spark] branch master updated: [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 7ffc0b71aa3e [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages
7ffc0b71aa3e is described below

commit 7ffc0b71aa3e416a9b21e0975a169b2a8a8403a8
Author: mayurb
AuthorDate: Tue Sep 26 11:04:07 2023 +0800

    [SPARK-45182][CORE] Ignore task completion from old stage after retrying indeterminate stages

    ### What changes were proposed in this pull request?
    [SPARK-25342](https://issues.apache.org/jira/browse/SPARK-25342) added support for rolling back a shuffle map stage so that all tasks of the stage can be retried when the stage output is indeterminate. This is done by clearing all map outputs at the time of stage submission. The approach works well except in the following case.

    Assume both Shuffle 1 and 2 are indeterminate:

    ShuffleMapStage1 --> Shuffle 1 --> ShuffleMapStage2 --> Shuffle 2 --> ResultStage

    - ShuffleMapStage1 is complete
    - A task from ShuffleMapStage2 fails with FetchFailed. Other tasks are still running
    - Both ShuffleMapStage1 and ShuffleMapStage2 are retried
    - ShuffleMapStage1 is retried and completes
    - The ShuffleMapStage2 reattempt is scheduled for execution
    - Before all tasks of the ShuffleMapStage2 reattempt can finish, one or more laggard tasks from the original attempt of ShuffleMapStage2 finish, and ShuffleMapStage2 also gets marked as complete
    - The Result Stage gets scheduled and finishes

    After this change, such laggard tasks from the old attempt of the indeterminate stage are ignored.

    ### Why are the changes needed?
    This can give wrong results when an indeterminate stage needs to be retried under the circumstances above.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    A new test case

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #42950 from mayurdb/rollbackFix.

    Authored-by: mayurb
    Signed-off-by: Wenchen Fan
---
 .../org/apache/spark/scheduler/DAGScheduler.scala  |  29 +++---
 .../apache/spark/scheduler/DAGSchedulerSuite.scala | 104 +
 2 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index 8a1480fd2100..a456f91d4c96 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -1903,19 +1903,26 @@ private[spark] class DAGScheduler(

           case smt: ShuffleMapTask =>
             val shuffleStage = stage.asInstanceOf[ShuffleMapStage]
-            shuffleStage.pendingPartitions -= task.partitionId
-            val status = event.result.asInstanceOf[MapStatus]
-            val execId = status.location.executorId
-            logDebug("ShuffleMapTask finished on " + execId)
-            if (executorFailureEpoch.contains(execId) &&
+            // Ignore task completion for old attempt of indeterminate stage
+            val ignoreIndeterminate = stage.isIndeterminate &&
+              task.stageAttemptId < stage.latestInfo.attemptNumber()
+            if (!ignoreIndeterminate) {
+              shuffleStage.pendingPartitions -= task.partitionId
+              val status = event.result.asInstanceOf[MapStatus]
+              val execId = status.location.executorId
+              logDebug("ShuffleMapTask finished on " + execId)
+              if (executorFailureEpoch.contains(execId) &&
                 smt.epoch <= executorFailureEpoch(execId)) {
-              logInfo(s"Ignoring possibly bogus $smt completion from executor $execId")
+                logInfo(s"Ignoring possibly bogus $smt completion from executor $execId")
+              } else {
+                // The epoch of the task is acceptable (i.e., the task was launched after the most
+                // recent failure we're aware of for the executor), so mark the task's output as
+                // available.
+                mapOutputTracker.registerMapOutput(
+                  shuffleStage.shuffleDep.shuffleId, smt.partitionId, status)
+              }
             } else {
-              // The epoch of the task is acceptable (i.e., the task was launched after the most
-              // recent failure we're aware of for the executor), so mark the task's output as
-              // available.
-              mapOutputTracker.registerMapOutput(
-                shuffleStage.shuffleDep.shuffleId, smt.partitionId, status)
+              logInfo(s"Ignoring $smt completion from an older attempt of indeterminate stage")
             }
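The heart of the fix is a small predicate: a `ShuffleMapTask` completion only counts if it comes from the latest attempt of an indeterminate stage. A simplified standalone sketch of that guard (names abbreviated; the real check lives inside `DAGScheduler`'s task-completion handling):

```scala
// Returns true when a finished task belongs to a stale attempt of an
// indeterminate stage, so its map output must be discarded.
def ignoreStaleCompletion(
    stageIsIndeterminate: Boolean,
    taskStageAttemptId: Int,
    latestAttemptNumber: Int): Boolean =
  stageIsIndeterminate && taskStageAttemptId < latestAttemptNumber

// A laggard task from attempt 0 arriving after the stage was retried as
// attempt 1 is ignored; a task from the current attempt is registered.
assert(ignoreStaleCompletion(stageIsIndeterminate = true, taskStageAttemptId = 0, latestAttemptNumber = 1))
assert(!ignoreStaleCompletion(stageIsIndeterminate = true, taskStageAttemptId = 1, latestAttemptNumber = 1))
```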
[spark] branch master updated: [SPARK-45318][SHELL][TESTS] Merge test cases from `SingletonRepl2Suite/Repl2Suite` back into `SingletonReplSuite/ReplSuite`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 96b591abdba5 [SPARK-45318][SHELL][TESTS] Merge test cases from `SingletonRepl2Suite/Repl2Suite` back into `SingletonReplSuite/ReplSuite`
96b591abdba5 is described below

commit 96b591abdba58be1e3cb38c8c19885f6ceb17fd1
Author: yangjie01
AuthorDate: Mon Sep 25 20:02:10 2023 -0700

    [SPARK-45318][SHELL][TESTS] Merge test cases from `SingletonRepl2Suite/Repl2Suite` back into `SingletonReplSuite/ReplSuite`

    ### What changes were proposed in this pull request?
    This PR merges the test cases from `SingletonRepl2Suite/Repl2Suite` back into `SingletonReplSuite/ReplSuite` to reduce duplicate code.

    ### Why are the changes needed?
    https://github.com/apache/spark/pull/28545 split these test cases out of `SingletonReplSuite/ReplSuite` into `SingletonRepl2Suite/Repl2Suite` to distinguish the Scala 2.12 and Scala 2.13 test versions. Spark 4.0 no longer supports Scala 2.12, so they can be merged back into the original files.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Pass GitHub Actions

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #43104 from LuciferYang/SPARK-45318.

    Authored-by: yangjie01
    Signed-off-by: Dongjoon Hyun
---
 .../scala/org/apache/spark/repl/Repl2Suite.scala   |  46 --
 .../scala/org/apache/spark/repl/ReplSuite.scala    |  24 ++-
 .../apache/spark/repl/SingletonRepl2Suite.scala    | 171 -
 .../org/apache/spark/repl/SingletonReplSuite.scala |  65
 4 files changed, 88 insertions(+), 218 deletions(-)

diff --git a/repl/src/test/scala/org/apache/spark/repl/Repl2Suite.scala b/repl/src/test/scala/org/apache/spark/repl/Repl2Suite.scala
deleted file mode 100644
index d55ac91e466f..
--- a/repl/src/test/scala/org/apache/spark/repl/Repl2Suite.scala
+++ /dev/null
@@ -1,46 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.repl
-
-import java.io._
-
-import org.apache.spark.{SparkContext, SparkFunSuite}
-
-class Repl2Suite extends SparkFunSuite {
-  test("propagation of local properties") {
-    // A mock ILoop that doesn't install the SIGINT handler.
-    class ILoop(out: PrintWriter) extends SparkILoop(null, out)
-
-    val out = new StringWriter()
-    Main.interp = new ILoop(new PrintWriter(out))
-    Main.sparkContext = new SparkContext("local", "repl-test")
-    val settings = new scala.tools.nsc.Settings
-    settings.usejavacp.value = true
-    Main.interp.createInterpreter(settings)
-
-    Main.sparkContext.setLocalProperty("someKey", "someValue")
-
-    // Make sure the value we set in the caller to interpret is propagated in the thread that
-    // interprets the command.
-    Main.interp.interpret("org.apache.spark.repl.Main.sparkContext.getLocalProperty(\"someKey\")")
-    assert(out.toString.contains("someValue"))
-
-    Main.sparkContext.stop()
-    System.clearProperty("spark.driver.port")
-  }
-}
diff --git a/repl/src/test/scala/org/apache/spark/repl/ReplSuite.scala b/repl/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index bb2a85cfa0de..b9f44a707465 100644
--- a/repl/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -23,7 +23,7 @@ import java.nio.file.Files
 import org.apache.logging.log4j.{Level, LogManager}
 import org.apache.logging.log4j.core.{Logger, LoggerContext}

-import org.apache.spark.SparkFunSuite
+import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.internal.StaticSQLConf.CATALOG_IMPLEMENTATION
@@ -398,4 +398,26 @@ class ReplSuite extends SparkFunSuite {
     assertContains(infoLogMessage2, out)
     assertContains(debugLogMessage1, out)
   }
+
+  test("propagation of local
[spark] branch master updated: [SPARK-45312][SQL][UI] Support toggle display/hide plan svg on execution page
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new b1a0d6703bf0 [SPARK-45312][SQL][UI] Support toggle display/hide plan svg on execution page
b1a0d6703bf0 is described below

commit b1a0d6703bf0eb7609b394a41463ef5d02937223
Author: Kent Yao
AuthorDate: Mon Sep 25 19:59:58 2023 -0700

    [SPARK-45312][SQL][UI] Support toggle display/hide plan svg on execution page

    ### What changes were proposed in this pull request?
    This PR adds a toggle to show/hide the plan SVG on the execution page.

    ### Why are the changes needed?
    Improve UX for the execution page, especially for large plans.

    ### Does this PR introduce _any_ user-facing change?
    Yes, UI changes.

    ### How was this patch tested?
    Tested locally: https://github.com/apache/spark/assets/8326978/e8b7573a-20b6-4a7d-9542-b1dd62bb04db

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #43099 from yaooqinn/SPARK-45312.

    Authored-by: Kent Yao
    Signed-off-by: Dongjoon Hyun
---
 .../spark/sql/execution/ui/static/spark-sql-viz.js    | 12 ++++++++++++
 .../apache/spark/sql/execution/ui/ExecutionPage.scala | 18 +++++++++++++-----
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
index ea42877924d4..8999d6ff1fed 100644
--- a/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
+++ b/sql/core/src/main/resources/org/apache/spark/sql/execution/ui/static/spark-sql-viz.js
@@ -257,3 +257,15 @@ function onClickAdditionalMetricsCheckbox(checkboxNode) {
   }
   window.localStorage.setItem("stageId-and-taskId-checked", isChecked);
 }
+
+function togglePlanViz() {
+  const arrow = d3.select("#plan-viz-graph-arrow");
+  arrow.each(function () {
+    $(this).toggleClass("arrow-open").toggleClass("arrow-closed")
+  });
+  if (arrow.classed("arrow-open")) {
+    planVizContainer().style("display", "block");
+  } else {
+    planVizContainer().style("display", "none");
+  }
+}
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
index aa8fd261c58f..d1aefdb3463f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
@@ -77,10 +77,6 @@ class ExecutionPage(parent: SQLTab) extends WebUIPage("execution") with Logging
         {jobLinks(JobExecutionStatus.FAILED, "Failed Jobs:")}
-
-        Show the Stage ID and Task ID that corresponds to the max metric
-
     val metrics = sqlStore.executionMetrics(executionId)
     val graph = sqlStore.planGraph(executionId)
@@ -117,7 +113,19 @@ class ExecutionPage(parent: SQLTab) extends WebUIPage("execution") with Logging
       graph: SparkPlanGraph): Seq[Node] = {
-
+        Plan Visualization
+
+        Show the Stage ID and Task ID that corresponds to the max metric
+
       {graph.makeDotFile(metrics)}

[The XML/HTML node literals in the ExecutionPage.scala hunks were stripped by the mail archiver; only the visible label text survives. The change removes the checkbox from the summary section and adds a clickable "Plan Visualization" header with a toggle arrow, relocating the checkbox next to it.]
[spark] branch master updated (fc5342314c8d -> 47bad35a4da5)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from fc5342314c8d [SPARK-44126][CORE] Shuffle migration failure count should not increase when target executor decommissioned
     add 47bad35a4da5 [SPARK-45325][BUILD] Upgrade Avro to 1.11.3

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml                               | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)
[spark] branch master updated: [SPARK-44126][CORE] Shuffle migration failure count should not increase when target executor decommissioned
This is an automated email from the ASF dual-hosted git repository.

wuyi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new fc5342314c8d [SPARK-44126][CORE] Shuffle migration failure count should not increase when target executor decommissioned
fc5342314c8d is described below

commit fc5342314c8d0890cf97a808bcf5fdf3720a5864
Author: Warren Zhu
AuthorDate: Tue Sep 26 10:52:40 2023 +0800

    [SPARK-44126][CORE] Shuffle migration failure count should not increase when target executor decommissioned

    ### What changes were proposed in this pull request?
    Do not increase the shuffle migration failure count when the target executor is decommissioned.

    ### Why are the changes needed?
    The block manager decommissioner only syncs live peers with the block manager master every `spark.storage.cachedPeersTtl` (default 60s). If a block manager is decommissioned in between, the decommissioner still tries to migrate shuffle blocks to the decommissioned block manager. The migration then fails with RuntimeException("BlockSavedOnDecommissionedBlockManagerException"). Detailed stack trace:

    ```
    org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
        at org.apache.spark.network.BlockTransferService.uploadBlockSync(BlockTransferService.scala:122)
        at org.apache.spark.storage.BlockManagerDecommissioner$ShuffleMigrationRunnable.$anonfun$run$5(BlockManagerDecommissioner.scala:127)
        at org.apache.spark.storage.BlockManagerDecommissioner$ShuffleMigrationRunnable.$anonfun$run$5$adapted(BlockManagerDecommissioner.scala:118)
        at scala.collection.immutable.List.foreach(List.scala:431)
        at org.apache.spark.storage.BlockManagerDecommissioner$ShuffleMigrationRunnable.run(BlockManagerDecommissioner.scala:118)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
    Caused by: java.lang.RuntimeException: org.apache.spark.storage.BlockSavedOnDecommissionedBlockManagerException: Block shuffle_2_6429_0.data cannot be saved on decommissioned executor
        at org.apache.spark.errors.SparkCoreErrors$.cannotSaveBlockOnDecommissionedExecutorError(SparkCoreErrors.scala:238)
        at org.apache.spark.storage.BlockManager.checkShouldStore(BlockManager.scala:277)
        at org.apache.spark.storage.BlockManager.putBlockDataAsStream(BlockManager.scala:741)
        at org.apache.spark.network.netty.NettyBlockRpcServer.receiveStream(NettyBlockRpcServer.scala:174)
    ```

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Added UT in `BlockManagerDecommissionUnitSuite`

    Closes #41905 from warrenzhu25/migrate-decom.

    Authored-by: Warren Zhu
    Signed-off-by: Yi Wu
---
 .../spark/storage/BlockManagerDecommissioner.scala | 16 -
 .../BlockManagerDecommissionUnitSuite.scala        | 40 ++
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala b/core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala
index 59d1f3b4c4ba..cbac3fd1a994 100644
--- a/core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala
+++ b/core/src/main/scala/org/apache/spark/storage/BlockManagerDecommissioner.scala
@@ -43,6 +43,8 @@ private[storage] class BlockManagerDecommissioner(
   private val fallbackStorage = FallbackStorage.getFallbackStorage(conf)
   private val maxReplicationFailuresForDecommission =
     conf.get(config.STORAGE_DECOMMISSION_MAX_REPLICATION_FAILURE_PER_BLOCK)
+  private val blockSavedOnDecommissionedBlockManagerException =
+    classOf[BlockSavedOnDecommissionedBlockManagerException].getSimpleName

   // Used for tracking if our migrations are complete. Readable for testing
   @volatile private[storage] var lastRDDMigrationTime: Long = 0
@@ -101,6 +103,7 @@ private[storage] class BlockManagerDecommissioner(
         try {
           val (shuffleBlockInfo, retryCount) = nextShuffleBlockToMigrate()
           val blocks = bm.migratableResolver.getMigrationBlocks(shuffleBlockInfo)
+          var isTargetDecommissioned = false
           // We only migrate a shuffle block when both index file and data file exist.
           if (blocks.isEmpty) {
             logInfo(s"Ignore deleted shuffle block $shuffleBlockInfo")
@@ -143,6 +146,11 @@ private[storage] class
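The patch's classification logic boils down to inspecting the failure cause before charging the block's retry budget. A hedged sketch of the idea (helper names are illustrative, not Spark's):

```scala
val decommissionedMarker = "BlockSavedOnDecommissionedBlockManagerException"

// The remote side wraps the error in a RuntimeException whose message
// carries the exception class name, so matching on the message is the signal.
def isTargetDecommissioned(failure: Throwable): Boolean =
  Option(failure.getMessage).exists(_.contains(decommissionedMarker))

// Only count failures that are *not* explained by the target having been
// decommissioned; those blocks should simply be retried against a new peer.
def nextRetryCount(failure: Throwable, retryCount: Int): Int =
  if (isTargetDecommissioned(failure)) retryCount else retryCount + 1
```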
[spark] branch master updated: [SPARK-45317][SQL][CONNECT] Handle null filename in stack traces of exceptions
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 8230f16164b1 [SPARK-45317][SQL][CONNECT] Handle null filename in stack traces of exceptions
8230f16164b1 is described below

commit 8230f16164b1cbd20ca0cb052c28c9fdb8d892d1
Author: Yihong He
AuthorDate: Tue Sep 26 11:04:14 2023 +0900

    [SPARK-45317][SQL][CONNECT] Handle null filename in stack traces of exceptions

    ### What changes were proposed in this pull request?
    - Handle a null filename in the stack traces of exceptions
    - Change the filename field in the protobuf definition to optional

    ### Why are the changes needed?
    - In Java exceptions, the filename is the only stack-trace field that can be null, and a null filename may cause NullPointerException.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    - `build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite"`
    - `build/sbt "connect-client-jvm/testOnly *ClientStreamingQuerySuite"`

    ### Was this patch authored or co-authored using generative AI tooling?

    Closes #43103 from heyihong/SPARK-45317.

    Authored-by: Yihong He
    Signed-off-by: Hyukjin Kwon
---
 .../org/apache/spark/sql/ClientE2ETestSuite.scala  | 28 ++
 .../src/main/protobuf/spark/connect/base.proto     |  2 +-
 .../connect/client/GrpcExceptionConverter.scala    |  2 +-
 .../spark/sql/connect/utils/ErrorUtils.scala       | 10 +---
 .../service/FetchErrorDetailsHandlerSuite.scala    | 25 +++
 python/pyspark/sql/connect/proto/base_pb2.py       | 14 +--
 python/pyspark/sql/connect/proto/base_pb2.pyi      | 13 +-
 7 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala b/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
index ec9b1698a4ee..55718ed9c0be 100644
--- a/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
+++ b/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
@@ -45,6 +45,34 @@ import org.apache.spark.sql.types._

 class ClientE2ETestSuite extends RemoteSparkSession with SQLHelper with PrivateMethodTester {

+  test(s"throw SparkException with null filename in stack trace elements") {
+    withSQLConf("spark.sql.connect.enrichError.enabled" -> "true") {
+      val session = spark
+      import session.implicits._
+
+      val throwException =
+        udf((_: String) => {
+          val testError = new SparkException("test")
+          val stackTrace = testError.getStackTrace()
+          stackTrace(0) = new StackTraceElement(
+            stackTrace(0).getClassName,
+            stackTrace(0).getMethodName,
+            null,
+            stackTrace(0).getLineNumber)
+          testError.setStackTrace(stackTrace)
+          throw testError
+        })
+
+      val ex = intercept[SparkException] {
+        Seq("1").toDS.withColumn("udf_val", throwException($"value")).collect()
+      }
+
+      assert(ex.getCause.isInstanceOf[SparkException])
+      assert(ex.getCause.getStackTrace().length > 0)
+      assert(ex.getCause.getStackTrace()(0).getFileName == null)
+    }
+  }
+
   for (enrichErrorEnabled <- Seq(false, true)) {
     test(s"cause exception - ${enrichErrorEnabled}") {
       withSQLConf("spark.sql.connect.enrichError.enabled" -> enrichErrorEnabled.toString) {
diff --git a/connector/connect/common/src/main/protobuf/spark/connect/base.proto b/connector/connect/common/src/main/protobuf/spark/connect/base.proto
index e5317cae6dc8..b30c578421c2 100644
--- a/connector/connect/common/src/main/protobuf/spark/connect/base.proto
+++ b/connector/connect/common/src/main/protobuf/spark/connect/base.proto
@@ -808,7 +808,7 @@ message FetchErrorDetailsResponse {
       string method_name = 2;

       // The name of the file containing the execution point.
-      string file_name = 3;
+      optional string file_name = 3;

       // The line number of the source line containing the execution point.
       int32 line_number = 4;
diff --git a/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala b/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala
index edbc434ef964..2d86e8c1e417 100644
--- a/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala
+++ b/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala
@@ -222,7 +222,7 @@ private object GrpcExceptionConverter {
         new StackTraceElement(
           stackTraceElement.getDeclaringClass,
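The hazard being guarded against is small but real: `StackTraceElement.getFileName` is the one stack-frame field that may be null, so copying it into a protobuf builder with an unconditional setter throws `NullPointerException`. A minimal sketch of the null-safe pattern (the class and method names are illustrative):

```scala
// A frame whose source file is unknown; the filename slot is legitimately null.
val frame = new StackTraceElement("com.example.MyClass", "myMethod", null, 42)

// Wrap the nullable field and only copy it when present, matching the
// now-`optional` file_name field in the protobuf definition.
val fileName: Option[String] = Option(frame.getFileName)
fileName.foreach(name => println(s"file_name = $name"))  // skipped for this frame
```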
[spark] branch master updated: [SPARK-44550][SQL] Enable correctness fixes for `null IN (empty list)` under ANSI
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 7741fe79e20 [SPARK-44550][SQL] Enable correctness fixes for `null IN (empty list)` under ANSI
7741fe79e20 is described below

commit 7741fe79e201eea42ceefedd10013680736fbbea
Author: Jack Chen
AuthorDate: Tue Sep 26 09:40:38 2023 +0800

    [SPARK-44550][SQL] Enable correctness fixes for `null IN (empty list)` under ANSI

    ### What changes were proposed in this pull request?
    Enables the correctness fixes for `null IN (empty list)` expressions by default when ANSI is enabled. Under non-ANSI, the old behavior remains the default for now. After soaking for some time under ANSI, the new behavior should become the default in both cases.

    Prior to this, `null IN (empty list)` incorrectly evaluated to null when it should evaluate to false. (It should be false because `a IN (b1, b2)` is defined as `a = b1 OR a = b2`, and an empty IN list is treated as an empty OR, which is false. This is specified by ANSI SQL.) Many places in Spark execution (In, InSet, InSubquery) and optimization (OptimizeIn, NullPropagation) implemented this wrong behavior. It is a longstanding correctness issue that has existed since null support for IN expressions was first added to Spark.

    See the previous PRs where the fixes were implemented: https://github.com/apache/spark/pull/42007 and https://github.com/apache/spark/pull/42163. See [this doc](https://docs.google.com/document/d/1k8AY8oyT-GI04SnP7eXttPDnDj-Ek-c3luF2zL6DPNU/edit) for more information.

    ### Why are the changes needed?
    Fix wrong SQL semantics.

    ### Does this PR introduce _any_ user-facing change?
    Yes, fix wrong SQL semantics.

    ### How was this patch tested?
    Unit tests

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #43068 from jchen5/null-in-empty-enable.

    Authored-by: Jack Chen
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/catalyst/expressions/predicates.scala   |  4 ++--
 .../spark/sql/catalyst/optimizer/expressions.scala    | 10 +-
 .../scala/org/apache/spark/sql/internal/SQLConf.scala |  8 +++-
 .../scala/org/apache/spark/sql/EmptyInSuite.scala     | 19 +++
 4 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
index 31b872e04ce..419d11b13a2 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
@@ -469,7 +469,7 @@ case class In(value: Expression, list: Seq[Expression]) extends Predicate {
   final override val nodePatterns: Seq[TreePattern] = Seq(IN)

   private val legacyNullInEmptyBehavior =
-    SQLConf.get.getConf(SQLConf.LEGACY_NULL_IN_EMPTY_LIST_BEHAVIOR)
+    SQLConf.get.legacyNullInEmptyBehavior

   override lazy val canonicalized: Expression = {
     val basic = withNewChildren(children.map(_.canonicalized)).asInstanceOf[In]
@@ -626,7 +626,7 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   final override val nodePatterns: Seq[TreePattern] = Seq(INSET)

   private val legacyNullInEmptyBehavior =
-    SQLConf.get.getConf(SQLConf.LEGACY_NULL_IN_EMPTY_LIST_BEHAVIOR)
+    SQLConf.get.legacyNullInEmptyBehavior

   override def eval(input: InternalRow): Any = {
     if (hset.isEmpty && !legacyNullInEmptyBehavior) {
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
index 8a7f54093d5..90773a1eb86 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
@@ -283,7 +283,7 @@ object OptimizeIn extends Rule[LogicalPlan] {
       case In(v, list) if list.isEmpty =>
         // IN (empty list) is always false under current behavior.
         // Under legacy behavior it's null if the left side is null, otherwise false (SPARK-44550).
-        if (!SQLConf.get.getConf(SQLConf.LEGACY_NULL_IN_EMPTY_LIST_BEHAVIOR)) {
+        if (!SQLConf.get.legacyNullInEmptyBehavior) {
           FalseLiteral
         } else {
           If(IsNotNull(v), FalseLiteral, Literal(null, BooleanType))
@@ -845,20 +845,20 @@ object NullPropagation extends Rule[LogicalPlan] {
     // If the list is empty, transform the In expression to false literal.
     case In(_, list)
       if list.isEmpty &&
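The corrected semantics follow directly from the definition: `a IN (b1, b2)` is `a = b1 OR a = b2`, so an empty IN list is an empty OR, which is false no matter what `a` is. A plain-Scala sketch of the three-valued evaluation (simplified; it ignores NULL elements inside a non-empty list):

```scala
// None models SQL NULL; Some(true/false) model definite booleans.
def inList(value: Option[Int], list: Seq[Int]): Option[Boolean] =
  if (list.isEmpty) Some(false)          // ANSI-correct: empty OR is false
  else if (value.isEmpty) None           // NULL IN (non-empty list) is NULL
  else Some(list.contains(value.get))

assert(inList(None, Nil) == Some(false)) // previously (incorrectly) NULL
assert(inList(None, Seq(1)).isEmpty)
assert(inList(Some(1), Seq(1, 2)) == Some(true))
```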
[spark] branch master updated (36e626bc60a -> c1b12bd5642)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 36e626bc60a [SPARK-43711][SPARK-44372][CONNECT][PS][TESTS] Clear message for Spark ML dependent tests
     add c1b12bd5642 [SPARK-45322][CORE] Use ProcessHandle to get pid directly

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
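`ProcessHandle` (available since JDK 9) makes the old reflective read of the private `pid` field unnecessary. A quick sketch:

```scala
// The JVM's own pid, and the pid of a child process via Process#toHandle.
val selfPid: Long = ProcessHandle.current().pid()
val child = new ProcessBuilder("sleep", "10").start()
val childPid: Long = child.toHandle.pid()
println(s"self=$selfPid child=$childPid alive=${child.toHandle.isAlive}")
```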
[spark] branch dependabot/maven/org.xerial.snappy-snappy-java-1.1.10.4 created (now 2b18d0c7daa)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/maven/org.xerial.snappy-snappy-java-1.1.10.4
in repository https://gitbox.apache.org/repos/asf/spark.git

      at 2b18d0c7daa Bump org.xerial.snappy:snappy-java from 1.1.10.3 to 1.1.10.4

No new revisions were added by this update.
[spark] branch master updated: [SPARK-43711][SPARK-44372][CONNECT][PS][TESTS] Clear message for Spark ML dependent tests
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 36e626bc60a [SPARK-43711][SPARK-44372][CONNECT][PS][TESTS] Clear message for Spark ML dependent tests
36e626bc60a is described below

commit 36e626bc60af4ce94a1ca304e05390418b965135
Author: Haejoon Lee
AuthorDate: Mon Sep 25 11:23:51 2023 -0700

    [SPARK-43711][SPARK-44372][CONNECT][PS][TESTS] Clear message for Spark ML dependent tests

    ### What changes were proposed in this pull request?
    Similar to https://github.com/apache/spark/pull/42955, this PR corrects the skip message for Spark-ML-only tests under Spark Connect.

    ### Why are the changes needed?
    Among the Spark ML dependent tests are some edge cases that can only be exercised with Spark ML features; the skip messages should be clearer about why they cannot run under Spark Connect.

    ### Does this PR introduce _any_ user-facing change?
    No, it's test-only.

    ### How was this patch tested?
    Updated the existing tests.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43051 from itholic/ml_dependent.

    Authored-by: Haejoon Lee
    Signed-off-by: Dongjoon Hyun
---
 .../pandas/tests/connect/plot/test_parity_frame_plot.py           | 4 ++--
 .../tests/connect/plot/test_parity_frame_plot_matplotlib.py       | 4 ++--
 .../pandas/tests/connect/plot/test_parity_frame_plot_plotly.py    | 6 +++---
 .../tests/connect/plot/test_parity_series_plot_matplotlib.py      | 8
 .../pandas/tests/connect/plot/test_parity_series_plot_plotly.py   | 4 ++--
 python/pyspark/pandas/tests/connect/test_parity_default_index.py  | 4 +---
 6 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot.py b/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot.py
index 24392eaf27c..10054f58501 100644
--- a/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot.py
+++ b/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot.py
@@ -24,11 +24,11 @@ from pyspark.testing.pandasutils import PandasOnSparkTestUtils
 class DataFramePlotParityTests(
     DataFramePlotTestsMixin, PandasOnSparkTestUtils, ReusedConnectTestCase
 ):
-    @unittest.skip("TODO(SPARK-43711): Fix Transformer.transform to work with Spark Connect.")
+    @unittest.skip("Test depends on Spark ML which is not supported from Spark Connect.")
     def test_compute_hist_multi_columns(self):
         super().test_compute_hist_multi_columns()

-    @unittest.skip("TODO(SPARK-43711): Fix Transformer.transform to work with Spark Connect.")
+    @unittest.skip("Test depends on Spark ML which is not supported from Spark Connect.")
     def test_compute_hist_single_column(self):
         super().test_compute_hist_single_column()
diff --git a/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_matplotlib.py b/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_matplotlib.py
index 3f615326f2b..9fec1c57c02 100644
--- a/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_matplotlib.py
+++ b/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_matplotlib.py
@@ -24,11 +24,11 @@ from pyspark.testing.pandasutils import PandasOnSparkTestUtils, TestUtils
 class DataFramePlotMatplotlibParityTests(
     DataFramePlotMatplotlibTestsMixin, PandasOnSparkTestUtils, TestUtils, ReusedConnectTestCase
 ):
-    @unittest.skip("TODO(SPARK-43711): Fix Transformer.transform to work with Spark Connect.")
+    @unittest.skip("Test depends on Spark ML which is not supported from Spark Connect.")
     def test_hist_plot(self):
         super().test_hist_plot()

-    @unittest.skip("TODO(SPARK-44372): Enable KernelDensity within Spark Connect.")
+    @unittest.skip("Test depends on Spark ML which is not supported from Spark Connect.")
     def test_kde_plot(self):
         super().test_kde_plot()
diff --git a/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_plotly.py b/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_plotly.py
index 16b97d6814e..452962d8135 100644
--- a/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_plotly.py
+++ b/python/pyspark/pandas/tests/connect/plot/test_parity_frame_plot_plotly.py
@@ -24,15 +24,15 @@ from pyspark.testing.pandasutils import PandasOnSparkTestUtils, TestUtils
 class DataFramePlotPlotlyParityTests(
     DataFramePlotPlotlyTestsMixin, PandasOnSparkTestUtils, TestUtils, ReusedConnectTestCase
 ):
-    @unittest.skip("TODO(SPARK-43711): Fix Transformer.transform to work with Spark Connect.")
+    @unittest.skip("Test depends on Spark ML which is not supported from Spark Connect.")
     def test_hist_layout_kwargs(self):
[spark] branch master updated (42a6557172c -> 0edeb605f2f)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 42a6557172c [MINOR][SQL][DOCS] Remove JDK 8 related information in the comments for aes_encrypt and aes_decrypt
     add 0edeb605f2f [SPARK-45304][BUILD] Remove test classloader workaround for SBT build

No new revisions were added by this update.

Summary of changes:
 project/SparkBuild.scala | 4 ----
 1 file changed, 4 deletions(-)
[spark] branch master updated: [MINOR][SQL][DOCS] Remove JDK 8 related information in the comments for aes_encrypt and aes_decrypt
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 42a6557172c [MINOR][SQL][DOCS] Remove JDK 8 related information in the comments for aes_encrypt and aes_decrypt
42a6557172c is described below

commit 42a6557172cd5c7cfcd049ff95a9ad1bc8fdeaa5
Author: Hyukjin Kwon
AuthorDate: Mon Sep 25 10:25:29 2023 -0700

    [MINOR][SQL][DOCS] Remove JDK 8 related information in the comments for aes_encrypt and aes_decrypt

    ### What changes were proposed in this pull request?
    This PR fixes the comments on both `aes_encrypt` and `aes_decrypt`. A quick check of the Scala/Python/R APIs suggests this is the only place to fix.

    ### Why are the changes needed?
    We dropped JDK 8 in SPARK-44112.

    ### Does this PR introduce _any_ user-facing change?
    No, it's Scaladoc, and the doc is not user-facing.

    ### How was this patch tested?
    The build in CI should check them.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43091 from HyukjinKwon/minor-doc-AesEncrypt.

    Authored-by: Hyukjin Kwon
    Signed-off-by: Dongjoon Hyun
---
 .../main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala
index 92ed0843521..6a7f841c324 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala
@@ -304,8 +304,6 @@ case class CurrentUser() extends LeafExpression with Unevaluable {

 /**
  * A function that encrypts input using AES. Key lengths of 128, 192 or 256 bits can be used.
- * For versions prior to JDK 8u161, 192 and 256 bits keys can be used
- * if Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files are installed.
  * If either argument is NULL or the key length is not one of the permitted values,
  * the return value is NULL.
  */
@@ -388,8 +386,6 @@ case class AesEncrypt(

 /**
  * A function that decrypts input using AES. Key lengths of 128, 192 or 256 bits can be used.
- * For versions prior to JDK 8u161, 192 and 256 bits keys can be used
- * if Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files are installed.
  * If either argument is NULL or the key length is not one of the permitted values,
  * the return value is NULL.
  */
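Usage of the documented functions is unchanged; only the JDK caveat in the docs is gone. A hedged round-trip sketch (requires an active `SparkSession` named `spark`; the 16-byte key is illustrative, and 24/32-byte keys work on all supported JDKs without extra JCE policy files):

```scala
val df = spark.sql(
  """SELECT CAST(aes_decrypt(aes_encrypt('Spark', '0000111122223333'),
    |                        '0000111122223333') AS STRING) AS roundtrip
    |""".stripMargin)
df.show()  // prints a single row containing "Spark"
```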
[spark] branch master updated (c1c1c9fa98f -> 772802d1165)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from c1c1c9fa98f [SPARK-45305][SQL][TESTS] Remove JDK 8 workaround added in TreeNodeSuite
     add 772802d1165 [SPARK-45301][BUILD] Remove org.scala-lang scala-library added for JDK 11 workaround

No new revisions were added by this update.

Summary of changes:
 common/network-common/pom.xml | 6 ------
 1 file changed, 6 deletions(-)
[spark] branch master updated: [SPARK-45305][SQL][TESTS] Remove JDK 8 workaround added in TreeNodeSuite
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c1c1c9fa98f [SPARK-45305][SQL][TESTS] Remove JDK 8 workaround added in TreeNodeSuite
c1c1c9fa98f is described below

commit c1c1c9fa98f74fcc646214d39d2fec9dad6b5cc5
Author: Hyukjin Kwon
AuthorDate: Mon Sep 25 10:08:32 2023 -0700

    [SPARK-45305][SQL][TESTS] Remove JDK 8 workaround added in TreeNodeSuite

    ### What changes were proposed in this pull request?
    In theory we no longer need https://github.com/apache/spark/pull/29875 because JDK 8 was dropped (per that PR's description); `Utils.getSimpleName` still handles malformed class names in any event, so the main-code handling is safe to keep.

    ### Why are the changes needed?
    To remove a test that never runs. JDK 11/8 support was dropped in SPARK-44112.

    ### Does this PR introduce _any_ user-facing change?
    No, test-only.

    ### How was this patch tested?
    CI in this PR should test them out.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43092 from HyukjinKwon/SPARK-45305.

    Authored-by: Hyukjin Kwon
    Signed-off-by: Dongjoon Hyun
---
 .../spark/sql/catalyst/trees/TreeNodeSuite.scala | 29 --
 1 file changed, 29 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala
index 3411415bbb6..c2f7287758d 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala
@@ -864,35 +864,6 @@ class TreeNodeSuite extends SparkFunSuite with SQLHelper {
     assert(getStateful(withNestedStatefulBefore) ne getStateful(withNestedStatefulAfter))
   }

-  object MalformedClassObject extends Serializable {
-    case class MalformedNameExpression(child: Expression) extends TaggingExpression {
-      override protected def withNewChildInternal(newChild: Expression): Expression =
-        copy(child = newChild)
-    }
-  }
-
-  test("SPARK-32999: TreeNode.nodeName should not throw malformed class name error") {
-    val testTriggersExpectedError = try {
-      classOf[MalformedClassObject.MalformedNameExpression].getSimpleName
-      false
-    } catch {
-      case ex: java.lang.InternalError if ex.getMessage.contains("Malformed class name") =>
-        true
-      case ex: Throwable => throw ex
-    }
-    // This test case only applies on older JDK versions (e.g. JDK8u), and doesn't trigger the
-    // issue on newer JDK versions (e.g. JDK11u).
-    assume(testTriggersExpectedError, "the test case didn't trigger malformed class name error")
-
-    val expr = MalformedClassObject.MalformedNameExpression(Literal(1))
-    try {
-      expr.nodeName
-    } catch {
-      case ex: java.lang.InternalError if ex.getMessage.contains("Malformed class name") =>
-        fail("TreeNode.nodeName should not throw malformed class name error")
-    }
-  }
-
   test("SPARK-37800: TreeNode.argString incorrectly formats arguments of type Set[_]") {
     case class Node(set: Set[String], nested: Seq[Set[Int]]) extends LeafNode {
       val output: Seq[Attribute] = Nil
[spark] branch master updated: [SPARK-45303][CORE] Remove JDK 8/11 workaround in KryoSerializerBenchmark
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4eb5577ece2 [SPARK-45303][CORE] Remove JDK 8/11 workaround in KryoSerializerBenchmark 4eb5577ece2 is described below commit 4eb5577ece2449676c804e358a4a07fcc52ce670 Author: Hyukjin Kwon AuthorDate: Mon Sep 25 09:58:58 2023 -0700 [SPARK-45303][CORE] Remove JDK 8/11 workaround in KryoSerializerBenchmark ### What changes were proposed in this pull request? This PR removes the legacy workaround for JDK 8/11 in SPARK-29282. They were already removed in SPARK-37293. This is the leftover. ### Why are the changes needed? For consistency. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Fixed unittests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43088 from HyukjinKwon/SPARK-45303. Authored-by: Hyukjin Kwon Signed-off-by: Dongjoon Hyun --- core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala| 2 +- .../scala/org/apache/spark/serializer/KryoSerializerBenchmark.scala | 4 2 files changed, 1 insertion(+), 5 deletions(-) diff --git a/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala b/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala index 99620fc9757..5eb22032a5e 100644 --- a/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala +++ b/core/src/test/scala/org/apache/spark/benchmark/BenchmarkBase.scala @@ -23,7 +23,7 @@ import org.apache.spark.internal.config.Tests.IS_TESTING /** * A base class for generate benchmark results to a file. - * For JDK9+, JDK major version number is added to the file names to distinguish the results. + * For JDK 21+, JDK major version number is added to the file names to distinguish the results. */ abstract class BenchmarkBase { var output: Option[OutputStream] = None diff --git a/core/src/test/scala/org/apache/spark/serializer/KryoSerializerBenchmark.scala b/core/src/test/scala/org/apache/spark/serializer/KryoSerializerBenchmark.scala index e1e4c218e9c..97051e375cf 100644 --- a/core/src/test/scala/org/apache/spark/serializer/KryoSerializerBenchmark.scala +++ b/core/src/test/scala/org/apache/spark/serializer/KryoSerializerBenchmark.scala @@ -28,7 +28,6 @@ import org.apache.spark.benchmark.{Benchmark, BenchmarkBase} import org.apache.spark.internal.config._ import org.apache.spark.internal.config.Kryo._ import org.apache.spark.internal.config.Tests.IS_TESTING -import org.apache.spark.launcher.SparkLauncher.EXECUTOR_EXTRA_JAVA_OPTIONS import org.apache.spark.serializer.KryoTest._ import org.apache.spark.util.ThreadUtils @@ -76,9 +75,6 @@ object KryoSerializerBenchmark extends BenchmarkBase { def createSparkContext(usePool: Boolean): SparkContext = { val conf = new SparkConf() -// SPARK-29282 This is for consistency between JDK8 and JDK11. -conf.set(EXECUTOR_EXTRA_JAVA_OPTIONS, - "-XX:+UseParallelGC -XX:-UseDynamicNumberOfGCThreads") conf.set(SERIALIZER, "org.apache.spark.serializer.KryoSerializer") conf.set(KRYO_USER_REGISTRATORS, Seq(classOf[MyRegistrator].getName)) conf.set(KRYO_USE_POOL, usePool) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
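The deleted lines pinned the executor GC so that JDK 8 and JDK 11 benchmark runs were comparable. For reference, a sketch of the same pin expressed against the public configuration keys (not part of the patch, and unnecessary on JDK 17+, where the benchmark now runs with the JVM's default GC):

```scala
import org.apache.spark.SparkConf

// Sketch of the removed SPARK-29282 workaround via public config keys.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
    "-XX:+UseParallelGC -XX:-UseDynamicNumberOfGCThreads") // the deleted pin
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.pool", "true") // the usePool dimension the benchmark varies
```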
[spark] branch master updated: [SPARK-45313][CORE] Inline `Iterators#size` and remove `Iterators.scala`
This is an automated email from the ASF dual-hosted git repository. yangjie01 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 30c73f6e103 [SPARK-45313][CORE] Inline `Iterators#size` and remove `Iterators.scala` 30c73f6e103 is described below commit 30c73f6e103851ee1f3ce012572455ab3c9d5625 Author: yangjie01 AuthorDate: Tue Sep 26 00:40:29 2023 +0800 [SPARK-45313][CORE] Inline `Iterators#size` and remove `Iterators.scala` ### What changes were proposed in this pull request? This pr inlined the code of `Iterators#size` and remove `Iterators.scala`. ### Why are the changes needed? https://github.com/apache/spark/pull/37353 introduced optimizations based on Scala 2.13 for the `Utils.getIteratorSize` function, hence there exist different versions of `Iterators.scala` for Scala 2.12 and Scala 2.13. Currently, Apache Spark 4.0 no longer supports Scala 2.12, so the corresponding code simplification can be performed. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions ### Was this patch authored or co-authored using generative AI tooling? No Closes #43100 from LuciferYang/SPARK-45313. Authored-by: yangjie01 Signed-off-by: yangjie01 --- .../scala/org/apache/spark/util/Iterators.scala| 40 -- .../main/scala/org/apache/spark/util/Utils.scala | 12 ++- 2 files changed, 11 insertions(+), 41 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/util/Iterators.scala b/core/src/main/scala/org/apache/spark/util/Iterators.scala deleted file mode 100644 index 9756cf49b95..000 --- a/core/src/main/scala/org/apache/spark/util/Iterators.scala +++ /dev/null @@ -1,40 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - *http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.spark.util - -private[util] object Iterators { - - /** - * Counts the number of elements of an iterator. - * This method is slower than `iterator.size` when using Scala 2.13, - * but it can avoid overflowing problem. - */ - def size(iterator: Iterator[_]): Long = { -// SPARK-39928: For Scala 2.13, add check of `iterator.knownSize` refer to -// `IterableOnceOps#size` to reduce the performance gap with `iterator.size`. -if (iterator.knownSize > 0) iterator.knownSize.toLong -else { - var count = 0L - while (iterator.hasNext) { -count += 1L -iterator.next() - } - count -} - } -} diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala index 149071ee1b6..b9f7eccdfe1 100644 --- a/core/src/main/scala/org/apache/spark/util/Utils.scala +++ b/core/src/main/scala/org/apache/spark/util/Utils.scala @@ -1785,7 +1785,17 @@ private[spark] object Utils /** * Counts the number of elements of an iterator. 
*/ - def getIteratorSize(iterator: Iterator[_]): Long = Iterators.size(iterator) + def getIteratorSize(iterator: Iterator[_]): Long = { +if (iterator.knownSize >= 0) iterator.knownSize.toLong +else { + var count = 0L + while (iterator.hasNext) { +count += 1L +iterator.next() + } + count +} + } /** * Generate a zipWithIndex iterator, avoid index value overflowing problem - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
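Two details of the inlined logic are easy to miss: Scala's `Iterator#size` returns an `Int` and would overflow past 2^31 - 1 elements, hence the `Long` counter, and `Iterator.knownSize` is `-1` when the size is not cheaply known, hence the guard (note the inlined version also loosens the old `knownSize > 0` check to `>= 0`, so empty iterators now take the fast path too). A self-contained sketch of the same logic, assuming Scala 2.13+:

```scala
// Stand-alone copy of the inlined counting logic (Scala 2.13+ only).
def countElements(iterator: Iterator[_]): Long = {
  if (iterator.knownSize >= 0) {
    // O(1) fast path when the iterator can report its size cheaply.
    iterator.knownSize.toLong
  } else {
    // Long counter: Iterator#size returns Int and overflows for huge inputs.
    var count = 0L
    while (iterator.hasNext) {
      count += 1L
      iterator.next()
    }
    count
  }
}

// Correct on both paths, whichever one a given iterator takes:
assert(countElements(Iterator.range(0, 1000)) == 1000L)
assert(countElements(Vector(1, 2, 3).iterator) == 3L)
```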
[spark] branch master updated: [SPARK-45320][SQL][TESTS] Update benchmark result for `InMemoryColumnarBenchmark`
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3ff8977bf65 [SPARK-45320][SQL][TESTS] Update benchmark result for `InMemoryColumnarBenchmark` 3ff8977bf65 is described below commit 3ff8977bf6584699fb4e7c24912a3a2e8405fca0 Author: yangjie01 AuthorDate: Mon Sep 25 09:17:41 2023 -0700 [SPARK-45320][SQL][TESTS] Update benchmark result for `InMemoryColumnarBenchmark` ### What changes were proposed in this pull request? This pr aims to update the `InMemoryColumnarBenchmark` benchmark result for Java 17 and to add a benchmark result for Java 21 ### Why are the changes needed? Track the performance of Java 17/21 through micro-benchmark testing. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions ### Was this patch authored or co-authored using generative AI tooling? No Closes #43105 from LuciferYang/SPARK-45320. Authored-by: yangjie01 Signed-off-by: Dongjoon Hyun --- ...rk-results.txt => InMemoryColumnarBenchmark-jdk21-results.txt} | 8 sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt | 6 +++--- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt b/sql/core/benchmarks/InMemoryColumnarBenchmark-jdk21-results.txt similarity index 57% copy from sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt copy to sql/core/benchmarks/InMemoryColumnarBenchmark-jdk21-results.txt index fee34039a3d..f757ce2d707 100644 --- a/sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt +++ b/sql/core/benchmarks/InMemoryColumnarBenchmark-jdk21-results.txt @@ -2,11 +2,11 @@ Int In-memory with 100 rows -OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure -Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz +OpenJDK 64-Bit Server VM 21+35 on Linux 5.15.0-1046-azure +Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz Int In-Memory scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative -- -columnar deserialization + columnar-to-row 216 235 27 4.6 215.9 1.0X -row-based deserialization 179 182 3 5.6 178.8 1.2X +columnar deserialization + columnar-to-row 336 413 67 3.0 336.2 1.0X +row-based deserialization 215 285 61 4.7 214.7 1.6X diff --git a/sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt b/sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt index fee34039a3d..e2da2eed94e 100644 --- a/sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt +++ b/sql/core/benchmarks/InMemoryColumnarBenchmark-results.txt @@ -2,11 +2,11 @@ Int In-memory with 100 rows -OpenJDK 64-Bit Server VM 1.8.0_362-b09 on Linux 5.15.0-1031-azure +OpenJDK 64-Bit Server VM 17.0.8+7-LTS on Linux 5.15.0-1046-azure Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz Int In-Memory scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative -- -columnar deserialization + columnar-to-row 216 235 27 4.6 215.9 1.0X -row-based deserialization 179 182 3 5.6 178.8 1.2X +columnar deserialization + columnar-to-row 274 437 146 3.7 273.8 1.0X +row-based deserialization 263 308 39 3.8 263.2 1.0X - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-45307][INFRA] Use Zulu JDK in `benchmark` GitHub Action and Java 21
This is an automated email from the ASF dual-hosted git repository. yangjie01 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6e16401fbff [SPARK-45307][INFRA] Use Zulu JDK in `benchmark` GitHub Action and Java 21 6e16401fbff is described below commit 6e16401fbff9b25df39daf3b1060c8e21e92c55d Author: panbingkun AuthorDate: Mon Sep 25 22:39:31 2023 +0800 [SPARK-45307][INFRA] Use Zulu JDK in `benchmark` GitHub Action and Java 21 ### What changes were proposed in this pull request? The pr aims to use Zulu JDK in benchmark GitHub Action and Java 21. ### Why are the changes needed? When I was preparing to obtain the results of `org.apache.spark.MapStatusesConvertBenchmark` benchmark running on JDK21 in GA, the following error occurred: https://github.com/panbingkun/spark/actions/runs/6293925655/job/17085885694 (screenshot: https://github.com/apache/spark/assets/15246973/36e293e0-cae8-4764-a93a-93a139b6eaaa) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually test. https://github.com/panbingkun/spark/actions/runs/6295588793 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43094 from panbingkun/SPARK-45307. Authored-by: panbingkun Signed-off-by: yangjie01 --- .github/workflows/benchmark.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/benchmark.yml b/.github/workflows/benchmark.yml index 79afb46f301..11ebb8ae7e2 100644 --- a/.github/workflows/benchmark.yml +++ b/.github/workflows/benchmark.yml @@ -107,7 +107,7 @@ jobs: if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true' uses: actions/setup-java@v3 with: - distribution: temurin + distribution: zulu java-version: ${{ github.event.inputs.jdk }} - name: Generate TPC-DS (SF=1) table data if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true' @@ -159,7 +159,7 @@ jobs: - name: Install Java ${{ github.event.inputs.jdk }} uses: actions/setup-java@v3 with: -distribution: temurin +distribution: zulu java-version: ${{ github.event.inputs.jdk }} - name: Cache TPC-DS generated data if: contains(github.event.inputs.class, 'TPCDSQueryBenchmark') || contains(github.event.inputs.class, '*') - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-45298][SPARK-31959][SQL][TESTS] Remove the workaround for JDK-8228469 in test
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a975a086093 [SPARK-45298][SPARK-31959][SQL][TESTS] Remove the workaround for JDK-8228469 in test a975a086093 is described below commit a975a086093a63075cc3a2d7e944a7075e3f185e Author: Hyukjin Kwon AuthorDate: Mon Sep 25 23:22:35 2023 +0900 [SPARK-45298][SPARK-31959][SQL][TESTS] Remove the workaround for JDK-8228469 in test ### What changes were proposed in this pull request? This PR removes the legacy workaround for old JDK added at SPARK-31959 ### Why are the changes needed? To remove legacy workaround. We dropped JDK 8/11 at SPARK-44112 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing unittest/docs added in SPARK-31959 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43083 from HyukjinKwon/SPARK-45298. Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon --- .../sql/catalyst/util/RebaseDateTimeSuite.scala| 22 ++ 1 file changed, 6 insertions(+), 16 deletions(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/RebaseDateTimeSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/RebaseDateTimeSuite.scala index a17ca2358de..0a44db5a699 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/RebaseDateTimeSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/RebaseDateTimeSuite.scala @@ -417,22 +417,12 @@ class RebaseDateTimeSuite extends SparkFunSuite with Matchers with SQLHelper { // clocks were moved backward to become Sunday, 18 November, 1945 01:00:00 AM. // In this way, the overlap happened w/o Daylight Saving Time. val hkZid = getZoneId("Asia/Hong_Kong") -var expected = "1945-11-18 01:30:00.0" -var ldt = LocalDateTime.of(1945, 11, 18, 1, 30, 0) -var earlierMicros = instantToMicros(ldt.atZone(hkZid).withEarlierOffsetAtOverlap().toInstant) -var laterMicros = instantToMicros(ldt.atZone(hkZid).withLaterOffsetAtOverlap().toInstant) -var overlapInterval = MICROS_PER_HOUR -if (earlierMicros + overlapInterval != laterMicros) { - // Old JDK might have an outdated time zone database. - // See https://bugs.openjdk.java.net/browse/JDK-8228469: "Hong Kong ... Its 1945 transition - // from JST to HKT was on 11-18 at 02:00, not 09-15 at 00:00" - expected = "1945-09-14 23:30:00.0" - ldt = LocalDateTime.of(1945, 9, 14, 23, 30, 0) - earlierMicros = instantToMicros(ldt.atZone(hkZid).withEarlierOffsetAtOverlap().toInstant) - laterMicros = instantToMicros(ldt.atZone(hkZid).withLaterOffsetAtOverlap().toInstant) - // If time zone db doesn't have overlapping at all, set the overlap interval to zero. 
- overlapInterval = laterMicros - earlierMicros -} +val expected = "1945-11-18 01:30:00.0" +val ldt = LocalDateTime.of(1945, 11, 18, 1, 30, 0) +val earlierMicros = instantToMicros(ldt.atZone(hkZid).withEarlierOffsetAtOverlap().toInstant) +val laterMicros = instantToMicros(ldt.atZone(hkZid).withLaterOffsetAtOverlap().toInstant) +val overlapInterval = MICROS_PER_HOUR +assert(earlierMicros + overlapInterval == laterMicros) val hkTz = TimeZone.getTimeZone(hkZid) val rebasedEarlierMicros = rebaseGregorianToJulianMicros(hkTz, earlierMicros) val rebasedLaterMicros = rebaseGregorianToJulianMicros(hkTz, laterMicros) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
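The simplified test leans on java.time's overlap semantics: when clocks are set back, one wall-clock `LocalDateTime` maps to two instants, disambiguated with `withEarlierOffsetAtOverlap`/`withLaterOffsetAtOverlap`. A stand-alone illustration using a different, well-known overlap (a hypothetical example, not Spark code):

```scala
import java.time.{LocalDateTime, ZoneId}
import java.time.temporal.ChronoUnit

// US DST ended on 2021-11-07 at 02:00, so 01:30 occurred twice that night.
val zone = ZoneId.of("America/New_York")
val ambiguous = LocalDateTime.of(2021, 11, 7, 1, 30)
val earlier = ambiguous.atZone(zone).withEarlierOffsetAtOverlap().toInstant // -04:00
val later   = ambiguous.atZone(zone).withLaterOffsetAtOverlap().toInstant  // -05:00
// One hour apart: the analogue of the test's overlapInterval.
assert(ChronoUnit.HOURS.between(earlier, later) == 1)
```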
[spark] branch master updated: [SPARK-45247][PYTHON][TESTS][FOLLOW-UP] Deduplicate FrameReidexingTests.test_filter test
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6e327858d3c [SPARK-45247][PYTHON][TESTS][FOLLOW-UP] Deduplicate FrameReidexingTests.test_filter test 6e327858d3c is described below commit 6e327858d3ccf5cb34297147c7a9c6e54f7218a2 Author: Haejoon Lee AuthorDate: Mon Sep 25 22:19:50 2023 +0900 [SPARK-45247][PYTHON][TESTS][FOLLOW-UP] Deduplicate FrameReidexingTests.test_filter test ### What changes were proposed in this pull request? This is followup PR for https://github.com/apache/spark/pull/43025 to cleanup the duplicated tests in the code. ### Why are the changes needed? Cleanup ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No test needed / the existing CI should pass. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43101 from itholic/pandas_2.1.1_followup. Authored-by: Haejoon Lee Signed-off-by: Hyukjin Kwon --- python/pyspark/pandas/tests/frame/test_reindexing.py | 3 --- 1 file changed, 3 deletions(-) diff --git a/python/pyspark/pandas/tests/frame/test_reindexing.py b/python/pyspark/pandas/tests/frame/test_reindexing.py index 3e40c35edd6..606efd95188 100644 --- a/python/pyspark/pandas/tests/frame/test_reindexing.py +++ b/python/pyspark/pandas/tests/frame/test_reindexing.py @@ -856,9 +856,6 @@ class FrameReindexingMixin: class FrameReidexingTests(FrameReindexingMixin, ComparisonTestBase, SQLTestUtils): -def test_filter(self): -super().test_filter() - pass - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-45295][CORE][SQL] Remove Utils.isMemberClass workaround for JDK 8
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f94cc9f7e08 [SPARK-45295][CORE][SQL] Remove Utils.isMemberClass workaround for JDK 8 f94cc9f7e08 is described below commit f94cc9f7e0858697f04486bf52f34fbaa4b0106e Author: Hyukjin Kwon AuthorDate: Mon Sep 25 22:18:32 2023 +0900 [SPARK-45295][CORE][SQL] Remove Utils.isMemberClass workaround for JDK 8 ### What changes were proposed in this pull request? This PR removes the legacy workaround for JDK 8 added at SPARK-34607 ### Why are the changes needed? To remove legacy workaround. We dropped JDK 8 at SPARK-44112 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing unittest added in SPARK-34607 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43080 from HyukjinKwon/SPARK-45295. Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon --- .../org/apache/spark/util/SparkClassUtils.scala| 28 -- .../spark/sql/catalyst/encoders/OuterScopes.scala | 2 +- .../sql/catalyst/expressions/objects/objects.scala | 2 +- 3 files changed, 2 insertions(+), 30 deletions(-) diff --git a/common/utils/src/main/scala/org/apache/spark/util/SparkClassUtils.scala b/common/utils/src/main/scala/org/apache/spark/util/SparkClassUtils.scala index 5984eaee42e..42d6d9fb421 100644 --- a/common/utils/src/main/scala/org/apache/spark/util/SparkClassUtils.scala +++ b/common/utils/src/main/scala/org/apache/spark/util/SparkClassUtils.scala @@ -50,34 +50,6 @@ private[spark] trait SparkClassUtils { def classIsLoadable(clazz: String): Boolean = { Try { classForName(clazz, initialize = false) }.isSuccess } - - /** - * Returns true if and only if the underlying class is a member class. - * - * Note: jdk8u throws a "Malformed class name" error if a given class is a deeply-nested - * inner class (See SPARK-34607 for details). This issue has already been fixed in jdk9+, so - * we can remove this helper method safely if we drop the support of jdk8u. - */ - def isMemberClass(cls: Class[_]): Boolean = { -try { - cls.isMemberClass -} catch { - case _: InternalError => -// We emulate jdk8u `Class.isMemberClass` below: -// public boolean isMemberClass() { -// return getSimpleBinaryName() != null && !isLocalOrAnonymousClass(); -// } -// `getSimpleBinaryName()` returns null if a given class is a top-level class, -// so we replace it with `cls.getEnclosingClass != null`. The second condition checks -// if a given class is not a local or an anonymous class, so we replace it with -// `cls.getEnclosingMethod == null` because `cls.getEnclosingMethod()` return a value -// only in either case (JVM Spec 4.8.6). -// -// Note: The newer jdk evaluates `!isLocalOrAnonymousClass()` first, -// we reorder the conditions to follow it. -cls.getEnclosingMethod == null && cls.getEnclosingClass != null -} - } } private[spark] object SparkClassUtils extends SparkClassUtils diff --git a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala index b497cd3f386..85876889569 100644 --- a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala +++ b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/encoders/OuterScopes.scala @@ -70,7 +70,7 @@ object OuterScopes { * useful for inner class defined in REPL. 
*/ def getOuterScope(innerCls: Class[_]): () => AnyRef = { -if (!SparkClassUtils.isMemberClass(innerCls)) { +if (!innerCls.isMemberClass) { return null } val outerClass = innerCls.getDeclaringClass diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala index 32bcdaf8609..beb07259384 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala @@ -557,7 +557,7 @@ case class NewInstance( // Note that static inner classes (e.g., inner classes within Scala objects) don't need // outer pointer registration. val needOuterPointer = - outerPointer.isEmpty && Utils.isMemberClass(cls) && !Modifier.isStatic(cls.getModifiers) + outerPointer.isEmpty && cls.isMemberClass && !Modifier.isStatic(cls.getModifiers) childrenResolved && !needOuterPointer }
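With JDK 8 gone, `Class.isMemberClass` can simply be called directly. The deleted emulation and the real method agree on modern JDKs; a quick sketch (hypothetical classes, not from the patch) showing both, including the local-class case the emulation had to exclude:

```scala
object MemberClassCheck {
  class Member // member class: has an enclosing class, no enclosing method

  def localClass(): Class[_] = { class Local; classOf[Local] }

  // The deleted JDK 8u emulation, reproduced only to illustrate its logic.
  def emulatedIsMemberClass(cls: Class[_]): Boolean =
    cls.getEnclosingMethod == null && cls.getEnclosingClass != null

  def main(args: Array[String]): Unit = {
    assert(classOf[Member].isMemberClass)
    assert(emulatedIsMemberClass(classOf[Member]))
    // Local classes have an enclosing method, so both agree they are
    // not member classes.
    assert(!localClass().isMemberClass)
    assert(!emulatedIsMemberClass(localClass()))
  }
}
```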
[spark] branch master updated: [SPARK-45297][SQL] Remove workaround for dateformatter added in SPARK-31827
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e9cc76a27f6 [SPARK-45297][SQL] Remove workaround for dateformatter added in SPARK-31827 e9cc76a27f6 is described below commit e9cc76a27f6f372cde1c885055cfe0bdd4fd4e7d Author: Hyukjin Kwon AuthorDate: Mon Sep 25 22:15:37 2023 +0900 [SPARK-45297][SQL] Remove workaround for dateformatter added in SPARK-31827 ### What changes were proposed in this pull request? This PR removes the legacy workaround for JDK 8 added at SPARK-31827 ### Why are the changes needed? To remove legacy workaround. We dropped JDK 8 at SPARK-44112 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing unittest/docs added in SPARK-31827 ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43082 from HyukjinKwon/SPARK-45297. Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon --- .../spark/sql/catalyst/util/DateTimeFormatterHelper.scala | 15 --- 1 file changed, 15 deletions(-) diff --git a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala index 43701d1d8ff..e2a897a3211 100644 --- a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala +++ b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala @@ -262,15 +262,6 @@ private object DateTimeFormatterHelper { toFormatter(builder, TimestampFormatter.defaultLocale) } - private final val bugInStandAloneForm = { -// Java 8 has a bug for stand-alone form. See https://bugs.openjdk.java.net/browse/JDK-8114833 -// Note: we only check the US locale so that it's a static check. It can produce false-negative -// as some locales are not affected by the bug. Since `L`/`q` is rarely used, we choose to not -// complicate the check here. -// TODO: remove it when we drop Java 8 support. -val formatter = DateTimeFormatter.ofPattern("LLL qqq", Locale.US) -formatter.format(LocalDate.of(2000, 1, 1)) == "1 1" - } // SPARK-31892: The week-based date fields are rarely used and really confusing for parsing values // to datetime, especially when they are mixed with other non-week-based ones; // SPARK-31879: It's also difficult for us to restore the behavior of week-based date fields @@ -328,12 +319,6 @@ private object DateTimeFormatterHelper { for (style <- unsupportedPatternLengths if patternPart.contains(style)) { throw new IllegalArgumentException(s"Too many pattern letters: ${style.head}") } - if (bugInStandAloneForm && (patternPart.contains("LLL") || patternPart.contains("qqq"))) { -throw new IllegalArgumentException("Java 8 has a bug to support stand-alone " + - "form (3 or more 'L' or 'q' in the pattern string). Please use 'M' or 'Q' instead, " + - "or upgrade your Java version. For more details, please read " + - "https://bugs.openjdk.java.net/browse/JDK-8114833;) - } // In DateTimeFormatter, 'u' supports negative years. We substitute 'y' to 'u' here for // keeping the support in Spark 3.0. If parse failed in Spark 3.0, fall back to 'y'. // We only do this substitution when there is no era designator found in the pattern. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
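The removed guard detected JDK-8114833 by formatting a probe date with the stand-alone patterns; on any JDK Spark 4.0 supports, that probe renders text rather than the buggy `1 1`, so the `IllegalArgumentException` branch was dead code. The probe, as a stand-alone sketch:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.Locale

// Same probe the deleted bugInStandAloneForm check performed.
val formatter = DateTimeFormatter.ofPattern("LLL qqq", Locale.US)
val rendered = formatter.format(LocalDate.of(2000, 1, 1))
// On a fixed JDK this prints something like "Jan Q1", never "1 1".
println(rendered)
assert(rendered != "1 1")
```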
[spark] branch master updated: [SPARK-45296][INFRA][BUILD] Comment out unused JDK 11 related in dev/run-tests.py
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c326928edde [SPARK-45296][INFRA][BUILD] Comment out unused JDK 11 related in dev/run-tests.py c326928edde is described below commit c326928edde319c0d8ff3ff723c7711f8596ca3f Author: Hyukjin Kwon AuthorDate: Mon Sep 25 20:27:36 2023 +0900 [SPARK-45296][INFRA][BUILD] Comment out unused JDK 11 related in dev/run-tests.py ### What changes were proposed in this pull request? This PR proposes to comment unused JDK 11 related in `dev/run-tests.py`. ### Why are the changes needed? For readability, and commenting out unused code. I added some explanation inlined. ### Does this PR introduce _any_ user-facing change? No, dev-only. ### How was this patch tested? No. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #43081 from HyukjinKwon/SPARK-45296. Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon --- dev/run-tests.py | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/dev/run-tests.py b/dev/run-tests.py index 57fe1de811d..cf0db66fba1 100755 --- a/dev/run-tests.py +++ b/dev/run-tests.py @@ -361,12 +361,13 @@ def run_scala_tests(build_tool, extra_profiles, test_modules, excluded_tags, inc if excluded_tags: test_profiles += ["-Dtest.exclude.tags=" + ",".join(excluded_tags)] -# set up java11 env if this is a pull request build with 'test-java11' in the title -if "ghprbPullTitle" in os.environ: -if "test-java11" in os.environ["ghprbPullTitle"].lower(): -os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1" -os.environ["PATH"] = "%s/bin:%s" % (os.environ["JAVA_HOME"], os.environ["PATH"]) -test_profiles += ["-Djava.version=11"] +# SPARK-45296: legacy code for Jenkins. If we move to Jenkins, we should +# revive this logic with a different combination of JDK. +# if "ghprbPullTitle" in os.environ: +# if "test-java11" in os.environ["ghprbPullTitle"].lower(): +# os.environ["JAVA_HOME"] = "/usr/java/jdk-11.0.1" +# os.environ["PATH"] = "%s/bin:%s" % (os.environ["JAVA_HOME"], os.environ["PATH"]) +# test_profiles += ["-Djava.version=11"] if build_tool == "maven": run_scala_tests_maven(test_profiles) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.5 updated: [SPARK-45306][SQL][TESTS] Make `InMemoryColumnarBenchmark` use AQE-aware utils to collect plans
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch branch-3.5 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.5 by this push: new 3551f8b89f1 [SPARK-45306][SQL][TESTS] Make `InMemoryColumnarBenchmark` use AQE-aware utils to collect plans 3551f8b89f1 is described below commit 3551f8b89f1d70a9218b8c0331bddc06c5020e95 Author: yangjie01 AuthorDate: Mon Sep 25 19:06:02 2023 +0800 [SPARK-45306][SQL][TESTS] Make `InMemoryColumnarBenchmark` use AQE-aware utils to collect plans ### What changes were proposed in this pull request? This pr makes `InMemoryColumnarBenchmark` inherit from AdaptiveSparkPlanHelper and use the `AdaptiveSparkPlanHelper#collect` function to collect plans, enabling `InMemoryColumnarBenchmark` to run successfully. ### Why are the changes needed? After SPARK-42768 merged, the default value of `spark.sql.optimizer.canChangeCachedPlanOutputPartitioning` has changed from false to true, so `InMemoryColumnarBenchmark ` should use AQE-aware utils to collect plans. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manual verification. run `build/sbt "sql/Test/runMain org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark"` **Before** ``` [error] Exception in thread "main" java.lang.IndexOutOfBoundsException: 0 [error] at scala.collection.LinearSeqOps.apply(LinearSeq.scala:131) [error] at scala.collection.LinearSeqOps.apply$(LinearSeq.scala:128) [error] at scala.collection.immutable.List.apply(List.scala:79) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark$.intCache(InMemoryColumnarBenchmark.scala:47) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark$.$anonfun$runBenchmarkSuite$1(InMemoryColumnarBenchmark.scala:68) [error] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) [error] at org.apache.spark.benchmark.BenchmarkBase.runBenchmark(BenchmarkBase.scala:42) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark$.runBenchmarkSuite(InMemoryColumnarBenchmark.scala:68) [error] at org.apache.spark.benchmark.BenchmarkBase.main(BenchmarkBase.scala:72) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark.main(InMemoryColumnarBenchmark.scala) [error] Nonzero exit code returned from runner: 1 [error] (sql / Test / runMain) Nonzero exit code returned from runner: 1 ``` **After** ``` [info] OpenJDK 64-Bit Server VM 17.0.8+7-LTS on Mac OS X 13.5.2 [info] Apple M2 Max [info] Int In-Memory scan: Best Time(ms) Avg Time(ms) Stdev(ms)Rate(M/s) Per Row(ns) Relative [info] -- [info] columnar deserialization + columnar-to-row 95 116 34 10.5 95.4 1.0X [info] row-based deserialization 85 99 22 11.8 85.1 1.1X ``` ### Was this patch authored or co-authored using generative AI tooling? No Closes #43093 from LuciferYang/fix-InMemoryColumnarBenchmark. 
Authored-by: yangjie01 Signed-off-by: Wenchen Fan (cherry picked from commit 7e9666be15b5210db00231faacd3cfa15ed71907) Signed-off-by: Wenchen Fan --- .../spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala index 55d9fb27317..1f132dabd28 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql.execution.columnar import org.apache.spark.benchmark.Benchmark import org.apache.spark.sql.execution.ColumnarToRowExec +import org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanHelper import org.apache.spark.sql.execution.benchmark.SqlBasedBenchmark /** @@ -33,11 +34,11 @@ import org.apache.spark.sql.execution.benchmark.SqlBasedBenchmark * Results will be written to "benchmarks/InMemoryColumnarBenchmark-results.txt". * }}} */ -object InMemoryColumnarBenchmark extends SqlBasedBenchmark { +object InMemoryColumnarBenchmark extends SqlBasedBenchmark with AdaptiveSparkPlanHelper { def
[spark] branch master updated: [SPARK-45306][SQL][TESTS] Make `InMemoryColumnarBenchmark` use AQE-aware utils to collect plans
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 7e9666be15b [SPARK-45306][SQL][TESTS] Make `InMemoryColumnarBenchmark` use AQE-aware utils to collect plans 7e9666be15b is described below commit 7e9666be15b5210db00231faacd3cfa15ed71907 Author: yangjie01 AuthorDate: Mon Sep 25 19:06:02 2023 +0800 [SPARK-45306][SQL][TESTS] Make `InMemoryColumnarBenchmark` use AQE-aware utils to collect plans ### What changes were proposed in this pull request? This pr makes `InMemoryColumnarBenchmark` inherit from AdaptiveSparkPlanHelper and use the `AdaptiveSparkPlanHelper#collect` function to collect plans, enabling `InMemoryColumnarBenchmark` to run successfully. ### Why are the changes needed? After SPARK-42768 merged, the default value of `spark.sql.optimizer.canChangeCachedPlanOutputPartitioning` has changed from false to true, so `InMemoryColumnarBenchmark ` should use AQE-aware utils to collect plans. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manual verification. run `build/sbt "sql/Test/runMain org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark"` **Before** ``` [error] Exception in thread "main" java.lang.IndexOutOfBoundsException: 0 [error] at scala.collection.LinearSeqOps.apply(LinearSeq.scala:131) [error] at scala.collection.LinearSeqOps.apply$(LinearSeq.scala:128) [error] at scala.collection.immutable.List.apply(List.scala:79) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark$.intCache(InMemoryColumnarBenchmark.scala:47) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark$.$anonfun$runBenchmarkSuite$1(InMemoryColumnarBenchmark.scala:68) [error] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18) [error] at org.apache.spark.benchmark.BenchmarkBase.runBenchmark(BenchmarkBase.scala:42) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark$.runBenchmarkSuite(InMemoryColumnarBenchmark.scala:68) [error] at org.apache.spark.benchmark.BenchmarkBase.main(BenchmarkBase.scala:72) [error] at org.apache.spark.sql.execution.columnar.InMemoryColumnarBenchmark.main(InMemoryColumnarBenchmark.scala) [error] Nonzero exit code returned from runner: 1 [error] (sql / Test / runMain) Nonzero exit code returned from runner: 1 ``` **After** ``` [info] OpenJDK 64-Bit Server VM 17.0.8+7-LTS on Mac OS X 13.5.2 [info] Apple M2 Max [info] Int In-Memory scan: Best Time(ms) Avg Time(ms) Stdev(ms)Rate(M/s) Per Row(ns) Relative [info] -- [info] columnar deserialization + columnar-to-row 95 116 34 10.5 95.4 1.0X [info] row-based deserialization 85 99 22 11.8 85.1 1.1X ``` ### Was this patch authored or co-authored using generative AI tooling? No Closes #43093 from LuciferYang/fix-InMemoryColumnarBenchmark. 
Authored-by: yangjie01 Signed-off-by: Wenchen Fan --- .../spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala index 55d9fb27317..1f132dabd28 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/columnar/InMemoryColumnarBenchmark.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql.execution.columnar import org.apache.spark.benchmark.Benchmark import org.apache.spark.sql.execution.ColumnarToRowExec +import org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanHelper import org.apache.spark.sql.execution.benchmark.SqlBasedBenchmark /** @@ -33,11 +34,11 @@ import org.apache.spark.sql.execution.benchmark.SqlBasedBenchmark * Results will be written to "benchmarks/InMemoryColumnarBenchmark-results.txt". * }}} */ -object InMemoryColumnarBenchmark extends SqlBasedBenchmark { +object InMemoryColumnarBenchmark extends SqlBasedBenchmark with AdaptiveSparkPlanHelper { def intCache(rowsNum: Long, numIters: Int): Unit = { val data = spark.range(0, rowsNum, 1, 1).toDF("i").cache() -
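The root cause generalizes beyond this benchmark: `AdaptiveSparkPlanExec` is a leaf node as far as `TreeNode` traversal is concerned, so a plain `plan.collect` never reaches the operators nested inside it, while `AdaptiveSparkPlanHelper#collect` recurses into adaptive subplans. A sketch of the difference, assuming a local `SparkSession` (not code from the patch):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanHelper
import org.apache.spark.sql.execution.columnar.InMemoryTableScanExec

object CollectSketch extends AdaptiveSparkPlanHelper {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    val df = spark.range(0, 1000, 1, 1).toDF("i").cache()
    df.count() // materialize the cache
    val plan = df.queryExecution.executedPlan
    // The AQE-aware collect descends into AdaptiveSparkPlanExec, where a
    // plain plan.collect { ... } would stop at the wrapper and find nothing.
    val scans = collect(plan) { case s: InMemoryTableScanExec => s }
    println(s"found ${scans.length} in-memory scan(s)")
    spark.stop()
  }
}
```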
[GitHub] [spark-website] panbingkun commented on pull request #480: Fix UI issue for `published` docs about Switch languages consistently across docs for all code snippets
panbingkun commented on PR #480: URL: https://github.com/apache/spark-website/pull/480#issuecomment-1733085912 cc @HyukjinKwon @allisonwang-db -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] panbingkun opened a new pull request, #480: Fix UI issue for `published` docs about Switch languages consistently across docs for all code snippets
panbingkun opened a new pull request, #480: URL: https://github.com/apache/spark-website/pull/480 The pr aims to fix a UI issue in the `published` docs about Switch languages consistently across docs for all code snippets (see https://github.com/apache/spark-website/pull/474#issuecomment-1731741985; screenshot: https://github.com/apache/spark-website/assets/15246973/50b86a6f-4bde--93bf-cb85b66e962c) As discussed, we aim to fix the aforementioned issues by directly repairing the files of versions that have already been released. Included versions: 3.1.1, 3.1.2, 3.1.3, 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4, 3.3.0, 3.3.1, 3.3.2, 3.3.3, 3.4.0, 3.4.1 Manual test: ``` bundle exec jekyll serve --watch ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org

[spark] branch branch-3.5 updated: [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid
This is an automated email from the ASF dual-hosted git repository. yao pushed a commit to branch branch-3.5 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.5 by this push: new 94661758c30 [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid 94661758c30 is described below commit 94661758c3072a279a29d0c493ce419af0414d3a Author: Kent Yao AuthorDate: Mon Sep 25 14:23:46 2023 +0800 [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid ### What changes were proposed in this pull request? This PR fixes `/api/v1/applications/{appId}/sql/{executionId}` API when the executionId is invalid. Before this, we get `no such app: $appId`; after this, we get `unknown query execution id: $executionId` ### Why are the changes needed? bugfix ### Does this PR introduce _any_ user-facing change? no, bugfix ### How was this patch tested? new test ### Was this patch authored or co-authored using generative AI tooling? no Closes #43073 from yaooqinn/SPARK-45291. Authored-by: Kent Yao Signed-off-by: Kent Yao (cherry picked from commit 5d422155f1dae09f1631375d09e2f3c8dffba9a5) Signed-off-by: Kent Yao --- .../scala/org/apache/spark/status/api/v1/sql/SqlResource.scala | 3 +-- .../status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala| 9 + 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala index 3c96f612da6..fa5bea5f9bb 100644 --- a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala +++ b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala @@ -56,10 +56,9 @@ private[v1] class SqlResource extends BaseAppResource { planDescription: Boolean): ExecutionData = { withUI { ui => val sqlStore = new SQLAppStatusStore(ui.store.store) - val graph = sqlStore.planGraph(execId) sqlStore .execution(execId) -.map(prepareExecutionData(_, graph, details, planDescription)) +.map(prepareExecutionData(_, sqlStore.planGraph(execId), details, planDescription)) .getOrElse(throw new NotFoundException("unknown query execution id: " + execId)) } } diff --git a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala index 658f79fc289..c63c748953f 100644 --- a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala @@ -19,6 +19,7 @@ package org.apache.spark.status.api.v1.sql import java.net.URL import java.text.SimpleDateFormat +import javax.servlet.http.HttpServletResponse import org.json4s.DefaultFormats import org.json4s.jackson.JsonMethods @@ -148,4 +149,12 @@ class SqlResourceWithActualMetricsSuite } } + test("SPARK-45291: Use unknown query execution id instead of no such app when id is invalid") { +val url = new URL(spark.sparkContext.ui.get.webUrl + + s"/api/v1/applications/${spark.sparkContext.applicationId}/sql/${Long.MaxValue}") +val (code, resultOpt, error) = getContentAndCode(url) +assert(code === HttpServletResponse.SC_NOT_FOUND) +assert(resultOpt.isEmpty) +assert(error.get === s"unknown query execution id: ${Long.MaxValue}") + } } - To unsubscribe, e-mail: 
commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid
This is an automated email from the ASF dual-hosted git repository. yao pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5d422155f1d [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid 5d422155f1d is described below commit 5d422155f1dae09f1631375d09e2f3c8dffba9a5 Author: Kent Yao AuthorDate: Mon Sep 25 14:23:46 2023 +0800 [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid ### What changes were proposed in this pull request? This PR fixes `/api/v1/applications/{appId}/sql/{executionId}` API when the executionId is invalid. Before this, we get `no such app: $appId`; after this, we get `unknown query execution id: $executionId` ### Why are the changes needed? bugfix ### Does this PR introduce _any_ user-facing change? no, bugfix ### How was this patch tested? new test ### Was this patch authored or co-authored using generative AI tooling? no Closes #43073 from yaooqinn/SPARK-45291. Authored-by: Kent Yao Signed-off-by: Kent Yao --- .../scala/org/apache/spark/status/api/v1/sql/SqlResource.scala | 3 +-- .../status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala| 9 + 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala index 3c96f612da6..fa5bea5f9bb 100644 --- a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala +++ b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala @@ -56,10 +56,9 @@ private[v1] class SqlResource extends BaseAppResource { planDescription: Boolean): ExecutionData = { withUI { ui => val sqlStore = new SQLAppStatusStore(ui.store.store) - val graph = sqlStore.planGraph(execId) sqlStore .execution(execId) -.map(prepareExecutionData(_, graph, details, planDescription)) +.map(prepareExecutionData(_, sqlStore.planGraph(execId), details, planDescription)) .getOrElse(throw new NotFoundException("unknown query execution id: " + execId)) } } diff --git a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala index 658f79fc289..c63c748953f 100644 --- a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala @@ -19,6 +19,7 @@ package org.apache.spark.status.api.v1.sql import java.net.URL import java.text.SimpleDateFormat +import javax.servlet.http.HttpServletResponse import org.json4s.DefaultFormats import org.json4s.jackson.JsonMethods @@ -148,4 +149,12 @@ class SqlResourceWithActualMetricsSuite } } + test("SPARK-45291: Use unknown query execution id instead of no such app when id is invalid") { +val url = new URL(spark.sparkContext.ui.get.webUrl + + s"/api/v1/applications/${spark.sparkContext.applicationId}/sql/${Long.MaxValue}") +val (code, resultOpt, error) = getContentAndCode(url) +assert(code === HttpServletResponse.SC_NOT_FOUND) +assert(resultOpt.isEmpty) +assert(error.get === s"unknown query execution id: ${Long.MaxValue}") + } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
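For anyone probing the fix by hand, the new test exercises the REST endpoint directly; an equivalent ad-hoc probe (hypothetical host, port, and application id) would be:

```scala
import java.net.{HttpURLConnection, URL}

// After this fix: HTTP 404 with "unknown query execution id: ...";
// before it, the same request surfaced "no such app: <appId>".
val appId = "local-1695600000000" // substitute your running app's id
val url = new URL(
  s"http://localhost:4040/api/v1/applications/$appId/sql/${Long.MaxValue}")
val conn = url.openConnection().asInstanceOf[HttpURLConnection]
conn.setRequestMethod("GET")
println(s"HTTP ${conn.getResponseCode}")
```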
[spark] branch master updated: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fb2bee37c96 [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0 fb2bee37c96 is described below commit fb2bee37c964bf2164fc89a0a55085dd0c840b56 Author: zhyhimont AuthorDate: Mon Sep 25 15:22:32 2023 +0900 [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0 ### What changes were proposed in this pull request? Support `isocalendar` from the pandas 2.0.0 ### Why are the changes needed? When pandas 2.0.0 is released, we should match the behavior in pandas API on Spark. ### Does this PR introduce _any_ user-facing change? Added new method `DatetimeIndex.isocalendar` and removed two depreceted `DatetimeIndex.week` and `DatetimeIndex.weekofyear` ``` dfs = ps.from_pandas(pd.date_range(start='2019-12-29', freq='D', periods=4).to_series()) dfs.dt.isocalendar() year week day 2019-12-29 2019527 2019-12-30 2020 11 2019-12-31 2020 12 2020-01-01 2020 13 dfs.dt.isocalendar().week 2019-12-2952 2019-12-30 1 2019-12-31 1 2020-01-01 1 ``` ### How was this patch tested? UT was updated Closes #40420 from dzhigimont/SPARK-42617_ZH. Lead-authored-by: zhyhimont Co-authored-by: Zhyhimont Dmitry Co-authored-by: Dmitry Zhyhimont Co-authored-by: Zhyhimont Dmitry Signed-off-by: Hyukjin Kwon --- .../source/reference/pyspark.pandas/indexing.rst | 3 +- .../source/reference/pyspark.pandas/series.rst | 3 +- python/pyspark/pandas/datetimes.py | 70 -- python/pyspark/pandas/indexes/base.py | 4 +- python/pyspark/pandas/indexes/datetimes.py | 49 +-- python/pyspark/pandas/namespace.py | 3 +- .../pyspark/pandas/tests/indexes/test_datetime.py | 28 ++--- .../pandas/tests/indexes/test_datetime_property.py | 19 +- .../pyspark/pandas/tests/test_series_datetime.py | 17 +- 9 files changed, 100 insertions(+), 96 deletions(-) diff --git a/python/docs/source/reference/pyspark.pandas/indexing.rst b/python/docs/source/reference/pyspark.pandas/indexing.rst index 70d463c052a..d6be57ee9c8 100644 --- a/python/docs/source/reference/pyspark.pandas/indexing.rst +++ b/python/docs/source/reference/pyspark.pandas/indexing.rst @@ -338,8 +338,7 @@ Time/date components DatetimeIndex.minute DatetimeIndex.second DatetimeIndex.microsecond - DatetimeIndex.week - DatetimeIndex.weekofyear + DatetimeIndex.isocalendar DatetimeIndex.dayofweek DatetimeIndex.day_of_week DatetimeIndex.weekday diff --git a/python/docs/source/reference/pyspark.pandas/series.rst b/python/docs/source/reference/pyspark.pandas/series.rst index 552acec096f..7b658d45d4b 100644 --- a/python/docs/source/reference/pyspark.pandas/series.rst +++ b/python/docs/source/reference/pyspark.pandas/series.rst @@ -313,8 +313,7 @@ Datetime Properties Series.dt.minute Series.dt.second Series.dt.microsecond - Series.dt.week - Series.dt.weekofyear + Series.dt.isocalendar Series.dt.dayofweek Series.dt.weekday Series.dt.dayofyear diff --git a/python/pyspark/pandas/datetimes.py b/python/pyspark/pandas/datetimes.py index b0649cf5761..4b6e23fae7a 100644 --- a/python/pyspark/pandas/datetimes.py +++ b/python/pyspark/pandas/datetimes.py @@ -18,7 +18,6 @@ """ Date/Time related functions on pandas-on-Spark Series """ -import warnings from typing import Any, Optional, Union, no_type_check import numpy as np @@ -27,7 +26,9 @@ from pandas.tseries.offsets import DateOffset import pyspark.pandas as ps import pyspark.sql.functions as F -from pyspark.sql.types import 
DateType, TimestampType, TimestampNTZType, LongType, IntegerType +from pyspark.sql.types import DateType, TimestampType, TimestampNTZType, IntegerType +from pyspark.pandas import DataFrame +from pyspark.pandas.config import option_context class DatetimeMethods: @@ -116,26 +117,59 @@ class DatetimeMethods: def nanosecond(self) -> "ps.Series": raise NotImplementedError() -# TODO(SPARK-42617): Support isocalendar.week and replace it. -# See also https://github.com/pandas-dev/pandas/pull/33595. -@property -def week(self) -> "ps.Series": +def isocalendar(self) -> "ps.DataFrame": """ -The week ordinal of the year. +Calculate year, week, and day according to the ISO 8601 standard. -.. deprecated:: 3.4.0 -""" -warnings.warn( -"weekofyear and week have been deprecated.", -FutureWarning, -) -