[ https://issues.apache.org/jira/browse/SPARK-40106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-40106:
------------------------------------

    Assignee: Apache Spark

> Task failure handlers should always run if the task failed
> ----------------------------------------------------------
>
>                 Key: SPARK-40106
>                 URL: https://issues.apache.org/jira/browse/SPARK-40106
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Ryan Johnson
>            Assignee: Apache Spark
>            Priority: Major
>
> Today, if a task body succeeds but a task completion listener fails, task
> failure listeners are not called, even though the task has indeed failed at
> that point.
> If a completion listener fails and failure listeners were not previously
> invoked, we should invoke them before running the remaining completion
> listeners.
> Such a change would increase the utility of task listeners, especially ones
> intended to assist with task cleanup.
> To give one arbitrary example, code like this appears in several places in
> the codebase (taken from the {{executeTask}} method of FileFormatWriter.scala):
> {code:java}
> try {
>   Utils.tryWithSafeFinallyAndFailureCallbacks(block = {
>     // Execute the task to write rows out and commit the task.
>     dataWriter.writeWithIterator(iterator)
>     dataWriter.commit()
>   })(catchBlock = {
>     // If there is an error, abort the task
>     dataWriter.abort()
>     logError(s"Job $jobId aborted.")
>   }, finallyBlock = {
>     dataWriter.close()
>   })
> } catch {
>   case e: FetchFailedException =>
>     throw e
>   case f: FileAlreadyExistsException if SQLConf.get.fastFailFileFormatOutput =>
>     // If any output file to write already exists, it does not make sense to re-run this task.
>     // We throw the exception and let Executor throw ExceptionFailure to abort the job.
>     throw new TaskOutputFileAlreadyExistException(f)
>   case t: Throwable =>
>     throw QueryExecutionErrors.taskFailedWhileWritingRowsError(t)
> }{code}
> If failure listeners were reliably called, the above idiom could potentially
> be factored out as two failure listeners plus a completion listener, and
> reused rather than duplicated.
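
The behaviour the issue asks for can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: ToyTaskContext, its method names, and the demo object are hypothetical and are not Spark's TaskContext implementation. The toy also runs listeners in registration order and swallows errors thrown by failure listeners, both of which are simplifications rather than claims about Spark.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical toy model of a task context; NOT Spark's actual implementation.
final class ToyTaskContext {
  private val completionListeners = ArrayBuffer.empty[() => Unit]
  private val failureListeners = ArrayBuffer.empty[Throwable => Unit]
  private var failureListenersInvoked = false

  def addCompletionListener(f: () => Unit): Unit = completionListeners += f
  def addFailureListener(f: Throwable => Unit): Unit = failureListeners += f

  // Runs the failure listeners at most once per task.
  def markTaskFailed(error: Throwable): Unit = {
    if (!failureListenersInvoked) {
      failureListenersInvoked = true
      failureListeners.foreach { listener =>
        // Toy simplification: errors thrown by failure listeners are swallowed.
        try listener(error) catch { case NonFatal(_) => () }
      }
    }
  }

  // Runs every completion listener. If one throws and the failure listeners
  // have not run yet, they run before the remaining completion listeners.
  def markTaskCompleted(): Unit = {
    var firstError: Option[Throwable] = None
    completionListeners.foreach { listener =>
      try listener() catch {
        case NonFatal(t) =>
          if (firstError.isEmpty) firstError = Some(t)
          markTaskFailed(t)
      }
    }
    // Surface the first completion-listener error to the caller.
    firstError.foreach(e => throw e)
  }
}

object ToyTaskContextDemo {
  // Records the order in which listeners fire when the second completion listener fails.
  def run(): List[String] = {
    val events = ArrayBuffer.empty[String]
    val ctx = new ToyTaskContext
    ctx.addCompletionListener(() => events += "close")
    ctx.addCompletionListener(() => throw new IllegalStateException("commit failed"))
    ctx.addCompletionListener(() => events += "release")
    ctx.addFailureListener(e => events += s"abort: ${e.getMessage}")
    try ctx.markTaskCompleted() catch { case NonFatal(_) => events += "task failed" }
    events.toList
  }
}
```

In this model, ToyTaskContextDemo.run() records "close", then "abort: commit failed", then "release", then "task failed": the failure listener fires as soon as the completion listener throws, and the remaining completion listener still runs afterwards.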