Re: Spark Streaming: Some issues (Could not compute split, block —— not found) and questions

2015-08-25 Thread Akhil Das
You hit block-not-found issues when your processing time exceeds the batch duration (this happens with receiver-oriented streaming). If you are consuming messages from Kafka, try using directStream; alternatively, set the StorageLevel to MEMORY_AND_DISK with the receiver-oriented consumer. (This
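
To make the point about processing time versus batch duration concrete, here is a rough sketch in plain Python (not Spark code) of a toy model in which batches are processed one at a time, in order. The 5-second batch interval and the 11 MB/second ingest rate are taken from the original post in this thread; the 4-second and 6-second processing times and the function names are made up purely for illustration. The link to the error is my reading of the reply: once processing falls behind, received blocks have to sit in memory longer and longer, and with a memory-only storage level they may be dropped before the job that needs them runs.

```python
def scheduling_delays(batch_interval_s, processing_time_s, num_batches):
    """Return how long each batch waits before processing starts (toy model).

    Batch i becomes ready at (i + 1) * batch_interval_s, and batches are
    processed strictly one after another.
    """
    delays = []
    worker_free_at = 0.0
    for i in range(num_batches):
        ready_at = (i + 1) * batch_interval_s
        start_at = max(ready_at, worker_free_at)
        delays.append(start_at - ready_at)
        worker_free_at = start_at + processing_time_s
    return delays


def buffered_mb(delay_s, ingest_mb_per_s):
    """Data that arrives while a batch is waiting and must stay buffered."""
    return delay_s * ingest_mb_per_s


# Hypothetical processing times: one below and one above the 5 s interval.
keeping_up = scheduling_delays(5.0, 4.0, 10)
falling_behind = scheduling_delays(5.0, 6.0, 10)

# Extra data held by the receivers when the tenth batch finally starts.
backlog_mb = buffered_mb(falling_behind[-1], 11.0)
```

In this model the 4-second case never waits, while the 6-second case falls one further second behind with every batch, so the amount of data waiting to be processed keeps growing for as long as the job runs.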

Spark Streaming: Some issues (Could not compute split, block —— not found) and questions

2015-08-19 Thread jlg
Some background on what we're trying to do: We have four Kinesis receivers with varying amounts of data coming through them. Ultimately we work on a unioned stream that is getting about 11 MB/second of data. We use a batch size of 5 seconds. We create four distinct DStreams from this data that