Hello, I have a Spark Streaming process on a cluster ingesting a real-time data stream from Kafka. The aggregated, processed output is written to Cassandra and also used for dashboard display.
My question is: if the node running the driver program fails, I am guessing that the entire process fails and has to be restarted. Is there any way to avoid this? Is my understanding correct that spark-submit, in its current form, is a single point of failure, much akin to the NameNode in HDFS?

Regards,

Sivakumaran S
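[For context, a commonly cited mitigation, assuming a Spark standalone cluster, is to submit the driver in cluster deploy mode with the `--supervise` flag so the cluster manager restarts it on failure, combined with Spark Streaming checkpointing so the restarted driver can recover state. A hedged sketch of such a submission; the master URL, jar path, and checkpoint directory are placeholders:]

```shell
# Run the driver on a worker node (cluster mode) rather than on the
# submitting machine, and ask the standalone master to restart it
# automatically if it exits with a non-zero status.
spark-submit \
  --master spark://master-host:7077 \   # placeholder master URL
  --deploy-mode cluster \
  --supervise \
  my-streaming-app.jar                  # placeholder application jar

# Inside the application, the driver should create its StreamingContext
# via StreamingContext.getOrCreate(checkpointDir, createFn) so that a
# supervised restart recovers from the checkpoint instead of starting fresh.
```

[This does not make the driver a non-issue, but it removes the need for a manual restart; recovery semantics for the Kafka ingestion and the Cassandra writes still depend on checkpointing and idempotent/at-least-once output handling.]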