Rohit Agarwal created SPARK-16338:
-------------------------------------

             Summary: Streaming driver running on standalone cluster mode with supervise goes into bad state when application is killed from the UI
                 Key: SPARK-16338
                 URL: https://issues.apache.org/jira/browse/SPARK-16338
             Project: Spark
          Issue Type: Bug
          Components: Deploy, Streaming, Web UI
    Affects Versions: 1.6.1
            Reporter: Rohit Agarwal

We are going to start using Spark Streaming in production, and I was testing various failure scenarios. I noticed one case where the Spark Streaming driver got into a bad state.

Steps to reproduce:
1. Create a Spark Streaming application with Direct Kafka Streams and checkpointing enabled.
2. Deploy the application to a Spark standalone cluster in cluster mode with --supervise.
3. Let it run for some time.
4. Kill the application (but not the driver) from the Spark Master UI.
5. The driver keeps running but doesn't restart the application.

What's worse is that the driver keeps updating the checkpoint every batch duration, so when you do restart the driver, it resumes from a later point and the data from the intervening batches is lost.
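
To make the data-loss symptom above concrete, here is a toy model in Python. It is not Spark code and does not claim to reflect Spark's internals. The names ToyDriver, run_batch, and lost_records are hypothetical, and the model simply assumes what the report observes: the checkpointed offset advances every batch interval even after the application has been killed.

```python
from dataclasses import dataclass


@dataclass
class ToyDriver:
    """Toy model of the reported symptom; hypothetical, not Spark internals."""
    checkpoint_offset: int = 0  # offset recorded in the checkpoint
    processed_offset: int = 0   # last offset whose records were actually processed
    app_alive: bool = True      # set to False once the application is killed

    def run_batch(self, records_per_batch: int) -> None:
        # Assumption taken from the report: the checkpoint advances on every
        # batch interval, whether or not the application is still running.
        self.checkpoint_offset += records_per_batch
        # Records are only processed while the application is alive.
        if self.app_alive:
            self.processed_offset = self.checkpoint_offset


def lost_records(driver: ToyDriver) -> int:
    """Records skipped if the driver restarts from the checkpointed offset."""
    return driver.checkpoint_offset - driver.processed_offset


driver = ToyDriver()
for _ in range(3):          # three healthy batches
    driver.run_batch(100)
driver.app_alive = False    # application killed, driver left running
for _ in range(5):          # five batch intervals pass before anyone notices
    driver.run_batch(100)
print(lost_records(driver))  # 500
```

In this model the loss grows linearly with the time the driver is left running after the kill, which matches the observation that a later restart resumes from a later point.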