Hi I was going through the Spark Streaming BackPressure feature documentation and wanted to understand how I can ensure my custom receiver is able to handle rate limiting. I have a custom receiver similar to the TwitterInputDStream, but there is no obvious way to throttle what is being read from the source, in response to rate limit events.
class MyReceiver(storageLevel: StorageLevel) extends NetworkReceiver[String](storageLevel) {def onStart() { // Setup stuff (start threads, open sockets, etc.) to start receiving data. // Must start new thread to receive data, as onStart() must be non-blocking. // Call store(...) in those threads to store received data into Spark's memory. // Call stop(...), restart(...) or reportError(...) on any thread based on how // different errors needs to be handled. // See corresponding method documentation for more details } def onStop() { // Cleanup stuff (stop threads, close sockets, etc.) to stop receiving data. } } http://spark.apache.org/docs/latest/streaming-custom-receivers.html The back pressure design documentation states the following. I am unable to figure out how this works for the TwitterInputDStream either. My receiver is similar to the TwitterInputDStream one. - TwitterInputDStream - no changes required. The receiver-based mechanism will handle rate limiting References https://issues.apache.org/jira/browse/SPARK-7398 and https://docs.google.com/document/d/1ls_g5fFmfbbSTIfQQpUxH56d0f3OksF567zwA00zK9E/edit# Regards Deenar