Hello all,

I am using Spark 2.x Streaming with Kafka. I noticed that Spark Streaming keeps processing subsequent micro-batches when a micro-batch fails, because it takes a while for the error to reach the driver and for the streaming job-executor thread to be interrupted. This is a problem for us because we checkpoint the offsets ourselves.
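To make the problem concrete, our setup resembles the standard direct-stream offset pattern sketched below (illustrative only, not our exact code; `stream` stands for the DStream returned by KafkaUtils.createDirectStream). If batch N fails but batch N+1 still runs and succeeds, the offsets of N+1 get checkpointed anyway, so the failed batch is effectively skipped after a restart.

import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

stream.foreachRDD { rdd =>
  // offset ranges covered by this micro-batch
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... business logic for this micro-batch ...
  // persist the offsets only after the batch succeeded
  // (here an async commit back to Kafka; our real code writes to its own store)
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}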
To avoid this, we wanted to catch the exception inside the foreachRDD block and stop Spark Streaming immediately:

streamRDD.foreachRDD { (rdd, microBatchTime) =>
  try {
    // business logic
  } catch {
    case ex: Exception =>
      // stop Spark Streaming
      streamingContext.stop(stopSparkContext = true, stopGracefully = false)
  }
}

However, the Spark application state is then set to Completed, so the application is not restarted automatically by Spark (even with the max-attempts config). I looked for a way to report the error during shutdown so that the application status is set to Failed, but ContextWaiter#notifyError is package-private to the streaming package, and I couldn't find any other interface for propagating the error/exception while stopping the process.

My questions:
1. How can I tell Spark Streaming to stop processing subsequent micro-batches once a micro-batch throws an exception?
2. Is it possible to configure Spark to create only one micro-batch RDD at a time?
3. How can I stop the streaming context with an error?

Any help would be appreciated. Thanks in advance.

Regards.
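P.S. One workaround I am experimenting with (not verified yet; `fatalError` is just a name I made up): record the first exception on the driver, stop the streaming context without stopping the SparkContext, and rethrow from the main thread once awaitTermination() returns, in the hope that an uncaught exception from the driver's main method makes the cluster manager (YARN in our case) mark the attempt as Failed and retry it:

import java.util.concurrent.atomic.AtomicReference

// holds the first fatal error seen in a micro-batch so the main thread can act on it
val fatalError = new AtomicReference[Throwable](null)

streamRDD.foreachRDD { (rdd, microBatchTime) =>
  try {
    // business logic + offset checkpointing
  } catch {
    case ex: Exception =>
      fatalError.compareAndSet(null, ex)
      // stop scheduling further micro-batches, but keep the SparkContext alive for now
      streamingContext.stop(stopSparkContext = false, stopGracefully = false)
  }
}

streamingContext.start()
streamingContext.awaitTermination()

// if a batch failed, fail the whole application so that YARN can restart it
// (subject to the max-attempts settings)
Option(fatalError.get()).foreach { ex =>
  streamingContext.sparkContext.stop()
  throw new RuntimeException("Stopping application because a micro-batch failed", ex)
}

Does this look like a reasonable approach, or is there a supported way to do this?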