Hi

I was going through the Spark Streaming BackPressure feature documentation and
wanted to understand how I can ensure my custom receiver is able to handle
rate limiting. I have a custom receiver similar to the TwitterInputDStream,
but there is no obvious way to throttle what is being read from the source,
in response to rate limit events.

class MyReceiver(storageLevel: StorageLevel) extends
NetworkReceiver[String](storageLevel) {def onStart() {
    // Setup stuff (start threads, open sockets, etc.) to start receiving data.
    // Must start new thread to receive data, as onStart() must be non-blocking.

    // Call store(...) in those threads to store received data into
Spark's memory.

    // Call stop(...), restart(...) or reportError(...) on any thread
based on how
    // different errors needs to be handled.

    // See corresponding method documentation for more details
}
def onStop() {
    // Cleanup stuff (stop threads, close sockets, etc.) to stop receiving data.
}
}


http://spark.apache.org/docs/latest/streaming-custom-receivers.html

The back pressure design documentation states the following. I am unable to
figure out how this works for the TwitterInputDStream either. My receiver
is similar to the TwitterInputDStream one.

   -

   TwitterInputDStream
   - no changes required. The receiver-based mechanism will handle rate
      limiting


References https://issues.apache.org/jira/browse/SPARK-7398 and
https://docs.google.com/document/d/1ls_g5fFmfbbSTIfQQpUxH56d0f3OksF567zwA00zK9E/edit#

Regards
Deenar

Reply via email to