JoshRosen commented on code in PR #47374: URL: https://github.com/apache/spark/pull/47374#discussion_r1688632697
########## project/MimaExcludes.scala: ##########
@@ -100,7 +100,13 @@ object MimaExcludes {
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLContext#implicits._sqlContext"),
   ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SQLImplicits._sqlContext"),
   ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.sql.SQLImplicits.session"),
-  ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SparkSession#implicits._sqlContext")
+  ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.SparkSession#implicits._sqlContext"),
+  // SPARK-48900: Add `reason` string to all job / stage / job group cancellation calls
+  ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.scheduler.JobWaiter#cancel"),

Review Comment:
   `ReversedMissingMethodProblem` is okay to suppress here, since that one warns that an interface / abstract class has gained an abstract method, such that old implementations might hit an error when linked against the updated interface / abstract class. I don't think we intend to support _user-defined implementations of FutureAction_, so it's okay if Spark 4.0 imposes new requirements on this.

   But `DirectMissingMethodProblem` is **not** okay to ignore: that's saying that we're _removing_ a method which used to be present in older versions, which will cause callers to fail. We should add back a zero-argument overload so that calls of the old signature will continue to work.

########## connect/server/src/main/scala/org/apache/spark/sql/connect/execution/ExecuteThreadRunner.scala: ##########
@@ -115,7 +116,8 @@ private[connect] class ExecuteThreadRunner(executeHolder: ExecuteHolder) extends
       case e: Throwable =>
         logDebug(s"Exception in execute: $e")
         // Always cancel all remaining execution after error.
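To make the distinction concrete, the `MimaExcludes.scala` entry would keep only the first of the two new filters (a sketch based on the diff above, not a verbatim patch):

```scala
// Suppress only the "interface gained an abstract method" warning,
// which would affect hypothetical user implementations of FutureAction.
ProblemFilters.exclude[ReversedMissingMethodProblem](
  "org.apache.spark.scheduler.JobWaiter#cancel")
// Deliberately no DirectMissingMethodProblem exclude here: the
// zero-argument cancel() overload should be restored, not suppressed,
// so MiMa keeps guarding against removed methods.
```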
-      executeHolder.sessionHolder.session.sparkContext.cancelJobsWithTag(executeHolder.jobTag)
+      executeHolder.sessionHolder.session.sparkContext.cancelJobsWithTag(executeHolder.jobTag,
+        "Exception in execute:" + ExceptionUtils.getStackTrace(e))

Review Comment:
   Huh, looks like `ExceptionUtils.getStackTrace` is approximately equivalent to Spark's own `Utils.exceptionString`, but it seems that we're not very consistent about using one vs. the other.

   I think you should add an extra space after `execute:` so this concatenates properly.

########## core/src/main/scala/org/apache/spark/FutureAction.scala: ##########
@@ -39,9 +39,9 @@ trait FutureAction[T] extends Future[T] {
   // documentation (with reference to the word "action").

   /**
-   * Cancels the execution of this action.
+   * Cancels the execution of this action with an optional reason.
    */
-  def cancel(): Unit
+  def cancel(reason: Option[String]): Unit

Review Comment:
   As I also commented on MimaExcludes, it is okay to _add_ a new overload which accepts an optional reason, but we should not _remove_ the old overload. Instead, I think we should keep the zero-argument `def cancel(): Unit` and provide a default implementation which simply calls `cancel(None)`. We don't need to override the zero-argument `cancel()` in subclasses; those can remain unchanged.

   If we do that then we won't have to add explicit `cancel(reason = None)` calls in a bunch of places, which will help to shrink the size of this PR.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
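The overload-preserving shape described in the review could look roughly like this. It is a self-contained sketch, not the actual Spark source: the real `FutureAction` extends `scala.concurrent.Future[T]`, which is dropped here for brevity, and `RecordingAction` is a hypothetical implementation used only to demonstrate the behavior.

```scala
// Sketch: keep the old zero-argument cancel() as a concrete method that
// forwards to the new overload, so callers compiled against the old
// signature keep linking (no DirectMissingMethodProblem from MiMa).
trait FutureAction[T] {
  /** Cancels the execution of this action (old overload, kept for
   *  binary compatibility; defaults to cancelling with no reason). */
  def cancel(): Unit = cancel(None)

  /** Cancels the execution of this action with an optional reason. */
  def cancel(reason: Option[String]): Unit
}

// Subclasses only need to implement the new overload; they do not have
// to override the zero-argument cancel(), matching the review's point.
class RecordingAction extends FutureAction[Int] {
  var lastReason: Option[String] = None
  override def cancel(reason: Option[String]): Unit = {
    lastReason = reason
  }
}
```

With this shape, an old call site such as `action.cancel()` still compiles and runs (it ends up in `cancel(None)`), while new code can pass `Some("reason")` explicitly.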