You hit block not found issues when you processing time exceeds the batch
duration (this happens with receiver oriented streaming). If you are
consuming messages from Kafka then try to use the directStream or you can
also set StorageLevel to MEMORY_AND_DISK with receiver oriented consumer.
(This
Some background on what we're trying to do:
We have four Kinesis receivers with varying amounts of data coming through
them. Ultimately we work on a unioned stream that is getting about 11
MB/second of data. We use a batch size of 5 seconds.
We create four distinct DStreams from this data that