If you would like an overview of Spark Streaming and fault tolerance, these slides are great (slides 24+ focus on fault tolerance; slide 52 is on resilience to traffic spikes): http://www.lightbend.com/blog/four-things-to-know-about-reliable-spark-streaming-typesafe-databricks
This recent Spark Summit talk is all about backpressure and dynamic scaling: https://spark-summit.org/east-2016/events/building-robust-scalable-and-adaptive-applications-on-spark-streaming/

From the Spark docs, backpressure works by placing a limit on the receiving rate, and this limit is adjusted dynamically based on processing times. If there is a burst and the data source generates events at a higher rate, those extra events get backed up in the data source. So, how much buffering is available in the data source? Kafka, for instance, persists events to disk and can act as a huge buffer, with capacity to absorb traffic spikes. Spark itself doesn't buffer unprocessed events, so in some cases Kafka (or some other storage) is placed between the data source and Spark to provide that buffer. (I've appended a rough config sketch and a Kafka example below the quoted message.)

Xinh

On Mon, Mar 7, 2016 at 2:10 PM, Andy Davidson <a...@santacruzintegration.com> wrote:

> One of the challenges we need to prepare for with streaming apps is bursty
> data. Typically we need to estimate our worst-case data load and make sure
> we have enough capacity.
>
> It's not obvious what the best practices are with Spark Streaming.
>
> - We have implemented checkpointing as described in the programming guide
> - We use the standalone cluster manager and spark-submit
> - We use the management console to kill drivers when needed
>
> - We plan to enable the write-ahead log and set
>   spark.streaming.backpressure.enabled to true
>
> - Our application runs a single unreliable receiver
> - We run multiple instances, each configured to receive a partition of the input
>
> As long as our processing time is < our window time, everything is fine.
>
> In the streaming systems I have worked on in the past, we scaled out by
> using load balancers and proxy farms to create buffering capacity. It's not
> clear how to scale out Spark.
>
> In our limited testing, it seems we have a single app configured to receive
> a predefined portion of the data. Once it is started, we cannot add
> additional resources. Adding cores and memory does not seem to increase our
> capacity.
>
> Kind regards
>
> Andy
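
For concreteness, here is a minimal Scala sketch of the settings discussed above (backpressure, the receiver write-ahead log, and checkpointing). The property names are standard Spark Streaming settings, but the app name, rate limits, batch interval, and checkpoint path are placeholders you would tune for your own workload.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///checkpoints/bursty-stream" // hypothetical path

val conf = new SparkConf()
  .setAppName("bursty-stream-demo") // hypothetical app name
  // Let Spark adjust the ingestion rate based on observed batch processing times.
  .set("spark.streaming.backpressure.enabled", "true")
  // Optional hard ceilings while the backpressure controller warms up.
  .set("spark.streaming.receiver.maxRate", "10000")          // records/sec per receiver
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")  // records/sec per Kafka partition
  // Write-ahead log so records buffered by a receiver survive driver failure.
  .set("spark.streaming.receiver.writeAheadLog.enable", "true")

def createContext(): StreamingContext = {
  val ssc = new StreamingContext(conf, Seconds(10)) // batch interval; processing time should stay below this
  ssc.checkpoint(checkpointDir)
  // ... define your DStreams here ...
  ssc
}

// Recover from the checkpoint if it exists; otherwise build a fresh context.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()
```

With backpressure enabled, the receiving rate is throttled so batches keep finishing within the batch interval; the maxRate settings act as safety caps until the rate controller has enough batch history to work with.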
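And here is a rough sketch of putting Kafka in front of Spark as the buffer, using the direct stream from the Kafka 0.8 integration (spark-streaming-kafka, as of Spark 1.6). The broker list and topic name are made up for illustration.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf()
  .setAppName("kafka-buffered-stream") // hypothetical app name
  .set("spark.streaming.backpressure.enabled", "true")
  .set("spark.streaming.kafka.maxRatePerPartition", "1000") // cap records/sec pulled per partition

val ssc = new StreamingContext(conf, Seconds(10))

// Hypothetical broker list and topic. The direct stream pulls batches from Kafka,
// so a traffic spike simply accumulates on the brokers until Spark catches up.
val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
val topics = Set("events")

val lines = KafkaUtils
  .createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
  .map { case (_, value) => value }

lines.count().print() // per-batch record count, just to show the pipeline runs

ssc.start()
ssc.awaitTermination()
```

A side benefit of the direct approach for the scale-out question: each Kafka partition maps to a Spark partition, so adding Kafka partitions (and executors to consume them) is the usual way to add read parallelism, rather than adding more receivers.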