Hi all,

What is the correct way to schedule multiple jobs inside the foreachRDD method and, importantly, to await their results to ensure that those jobs have completed successfully? E.g.:

    kafkaDStream.foreachRDD { rdd =>
      val rdd1 = rdd.map(...)
      val rdd2 = rdd1.map(...)

      // Run both actions concurrently (needs an implicit ExecutionContext in scope)
      val job1Future = Future { rdd1.saveToCassandra(...) }
      val job2Future = Future { rdd1.foreachPartition(iter => /* save to Kafka */) }

      // Future.sequence takes a collection of futures, hence the Seq
      Await.result(Future.sequence(Seq(job1Future, job2Future)), Duration.Inf)

      // commit Kafka offsets
    }

In this code I'm scheduling two actions in futures and awaiting them. When I commit the Kafka offsets at the end of batch processing, I need to be sure that job1 and job2 have actually executed successfully. Does this approach provide that guarantee? I.e., if one of the jobs fails, will the entire batch be marked as failed too?

Thanks,
Andrii
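
P.S. To make the question concrete, here is a minimal sketch of the plain Scala behaviour I am relying on, with Spark taken out of the picture. `runJob` and `processBatch` are hypothetical stand-ins for the two actions and the batch body; nothing in it is Spark API. It only shows that Await.result on Future.sequence rethrows when either future fails, so any code placed after it (the offset commit in my case) would not be reached. Whether Spark Streaming then marks the batch as failed is the part I am unsure about.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.util.Try

object AwaitBothJobs {
  // Hypothetical stand-in for a Spark action; throws when `fail` is true.
  def runJob(name: String, fail: Boolean): Unit =
    if (fail) throw new RuntimeException(s"$name failed")

  // Starts both jobs in futures and blocks until the combined future completes.
  // Returns true only if both jobs succeeded, i.e. when a commit would be safe.
  def processBatch(failJob2: Boolean): Boolean = {
    val job1 = Future { runJob("job1", fail = false) }
    val job2 = Future { runJob("job2", fail = failJob2) }
    // Await.result rethrows the failure of the combined future; Try captures it here.
    Try(Await.result(Future.sequence(Seq(job1, job2)), Duration.Inf)).isSuccess
  }

  def main(args: Array[String]): Unit = {
    println(processBatch(failJob2 = false))
    println(processBatch(failJob2 = true))
  }
}
```

In the real foreachRDD body there would be no Try: the exception would simply propagate out of the closure before the commit line.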