Hi Spark users,

I have been seeing an issue where receivers enter a "stuck" state after
encountering the following exception: "Error in block pushing thread -
java.util.concurrent.TimeoutException: Futures timed out".
I am running the application on spark-1.4.1 and using kinesis-asl-1.4.

When this happens, the Kinesis.ProcessTask.shardxxxx.MillisBehindLatest
metric stops being published to CloudWatch, which indicates that the
workers associated with the receiver are no longer checkpointing for the
shards they were reading from.

This looks like a bug in the BlockGenerator code, here -
https://github.com/apache/spark/blob/branch-1.4/streaming/src/main/scala/org/apache/spark/streaming/receiver/BlockGenerator.scala#L171
When pushBlock encounters an exception, in this case the TimeoutException,
it stops pushing blocks entirely. Is this really expected behavior?
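To illustrate what I mean, here is a minimal sketch (not the actual Spark code, just my reading of its control flow) of how a try/catch that wraps the whole pushing loop makes a single exception terminate the thread for good, rather than skipping one block. The names PushLoopSketch, pushed, and the failing block index are hypothetical:

```scala
// Hypothetical sketch of the control flow I believe keepPushingBlocks has:
// the try wraps the entire loop, so one exception ends all future pushes.
object PushLoopSketch {
  var pushed = 0

  // Stand-in for pushBlock; block 3 simulates the TimeoutException.
  def pushBlock(i: Int): Unit = {
    if (i == 3) throw new java.util.concurrent.TimeoutException("Futures timed out")
    pushed += 1
  }

  def keepPushingBlocks(blocks: Seq[Int]): Unit = {
    try {
      // Loop is inside the try: the first failure exits the loop permanently.
      for (b <- blocks) pushBlock(b)
    } catch {
      case e: Exception =>
        // Analogous to reportError("Error in block pushing thread", e);
        // the loop never resumes, so later blocks are never pushed.
        println(s"Error in block pushing thread - $e")
    }
  }

  def main(args: Array[String]): Unit = {
    keepPushingBlocks(1 to 6)
    println(s"pushed=$pushed") // blocks after the failure are silently dropped
  }
}
```

With this structure, every block after the first failure is dropped even though the receiver thread itself is still alive, which matches the "stuck" behavior I'm observing.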

Has anyone else seen this error, and have you also seen receivers stop
receiving records? I'm also trying to find the root cause of the
TimeoutException; if anyone has an idea, please share.

Thanks,

Bharath
