[ https://issues.apache.org/jira/browse/SPARK-42485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690847#comment-17690847 ]
Mich Talebzadeh commented on SPARK-42485:
-----------------------------------------

How about Target Version?

> SPIP: Shutting down spark structured streaming when the streaming process completed current process
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42485
>                 URL: https://issues.apache.org/jira/browse/SPARK-42485
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 3.3.2
>            Reporter: Mich Talebzadeh
>            Priority: Major
>              Labels: SPIP
>
> Spark Structured Streaming is a very useful tool for building an Event Driven Architecture. In an Event Driven Architecture there is generally a main loop that listens for events and triggers a call-back function when one of those events is detected. A streaming application waits for the source messages, either at a set interval or whenever they arrive, and reacts accordingly.
> There are occasions when you may want to stop the Spark program gracefully. Gracefully here means that the Spark application handles the last streaming message completely and then terminates. This is different from invoking an interrupt such as CTRL-C.
> Of course, one can terminate the process with one of the following:
> # query.awaitTermination(), which waits for the termination of the query, either via stop() or with an error
> # query.awaitTermination(timeoutMs), which returns true if the query terminates within the timeout in milliseconds
> The first call above waits until an interrupt signal is received. The second counts down the timeout and exits when the timeout in milliseconds is reached.
> The issue is that one needs to predict how long the streaming job will run. Clearly, any interrupt at the terminal or OS level (killing the process) may terminate the processing without proper completion of the streaming work.
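The two termination calls quoted above can be combined into a polling loop: instead of blocking indefinitely on awaitTermination(), one can poll awaitTermination(timeoutMs) and check an external shutdown signal between polls, calling stop() when the signal is seen. The sketch below illustrates only this control flow; FakeStreamingQuery, run_until_shutdown, and the flag dictionary are stand-ins invented for this illustration and are not Spark APIs, although awaitTermination(timeoutMs) and stop() mirror the real StreamingQuery method names.

```python
import time
import threading

class FakeStreamingQuery:
    """Minimal stand-in for Spark's StreamingQuery, used only to
    illustrate the control flow; not part of any Spark API."""
    def __init__(self):
        self._active = True

    @property
    def isActive(self):
        return self._active

    def awaitTermination(self, timeoutMs):
        # Mirrors the real semantics: returns True if the query
        # terminated within the timeout, False otherwise.
        deadline = time.time() + timeoutMs / 1000.0
        while time.time() < deadline:
            if not self._active:
                return True
            time.sleep(0.01)
        return not self._active

    def stop(self):
        # Note: in real Spark, stop() does not guarantee the in-flight
        # micro-batch finishes cleanly; that gap is what this SPIP targets.
        self._active = False

def run_until_shutdown(query, should_shut_down, poll_ms=200):
    """Poll awaitTermination(timeoutMs) in a loop instead of blocking
    forever, stopping the query once a shutdown signal is observed."""
    while query.isActive:
        if query.awaitTermination(poll_ms):
            break                       # query ended on its own (or errored)
        if should_shut_down():
            query.stop()                # request shutdown between polls
            query.awaitTermination(poll_ms)
            break

flag = {"stop": False}
q = FakeStreamingQuery()
# Simulate an external shutdown request arriving shortly after start-up.
threading.Timer(0.3, lambda: flag.update(stop=True)).start()
run_until_shutdown(q, lambda: flag["stop"])
print(q.isActive)   # the query is no longer active after the requested stop
```

This avoids having to predict the job's total runtime up front, but it still relies on stop(), which is why the proposal below argues for a built-in mechanism that completes the current message before shutting down.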
> I have devised a method that allows one to terminate the Spark application internally after processing the last received message. Within, say, 2 seconds of the confirmation of shutdown, the process invokes a graceful shutdown.
> This new feature proposes a solution that completes the work for the message currently being processed, waits for it to finish, and then shuts down the streaming process for a given topic without loss of data or orphaned transactions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org