Eyal Zituny created SPARK-27330: ----------------------------------- Summary: ForeachWriter is not being closed once a batch is aborted Key: SPARK-27330 URL: https://issues.apache.org/jira/browse/SPARK-27330 Project: Spark Issue Type: Bug Components: Structured Streaming Affects Versions: 2.4.0 Reporter: Eyal Zituny
in cases where a micro batch is being killed (interrupted), not during actual processing done by the ForeachDataWriter (for example when iterating the iterator), DataWritingSparkTask will catch the interrupted exception and call dataWriter.abort() the problem is that ForeachDataWriter has an empty implementation for the abort method. as a result of that, i have encountered issues in connections which were opened in the "open" method when the writer has been created but never closed. this wasn't the behavior pre spark 2.4 my suggestion is to call ForeachWriter.close() when DataWriter.abort() is called, and exception should also be provided in order to notify the foreach writer that this task has failed stack trace from the exception i have encountered: org.apache.spark.TaskKilledException: null at org.apache.spark.TaskContextImpl.killTaskIfInterrupted(TaskContextImpl.scala:149) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:36) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:117) at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394) at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146) at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67) at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66) -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org