Re: Error in block pushing thread puts the KinesisReceiver in a stuck state

2015-12-02 Thread Akhil Das
Did you go through the executor logs completely? Futures timed out
exception can occur mostly when one of the task/job spend way too much time
and fails to respond, this happens when there's a GC pause or memory
overhead.

Thanks
Best Regards

On Tue, Dec 1, 2015 at 12:09 AM, Spark Newbie 
wrote:

> Pinging again to see if anyone has any thoughts or prior experience with
> this issue.
>
> On Wed, Nov 25, 2015 at 3:56 PM, Spark Newbie 
> wrote:
>
>> Hi Spark users,
>>
>> I have been seeing this issue where receivers enter a "stuck" state after
>> it encounters a the following exception "Error in block pushing thread -
>> java.util.concurrent.TimeoutException: Futures timed out".
>> I am running the application on spark-1.4.1 and using kinesis-asl-1.4.
>>
>> When this happens, the observation is that the
>> Kinesis.ProcessTask.shard.MillisBehindLatest metric does not get
>> published anymore, when I look at cloudwatch, which indicates that the
>> workers associated with the receiver are not checkpointing any more for the
>> shards that they were reading from.
>>
>> This seems like a bug in to BlockGenerator code , here -
>> https://github.com/apache/spark/blob/branch-1.4/streaming/src/main/scala/org/apache/spark/streaming/receiver/BlockGenerator.scala#L171
>> when pushBlock encounters an exception, in this case the TimeoutException,
>> it stops pushing blocks. Is this really expected behavior?
>>
>> Has anyone else seen this error and have you also seen the issue where
>> receivers stop receiving records? I'm also trying to find the root cause
>> for the TimeoutException. If anyone has an idea on this please share.
>>
>> Thanks,
>>
>> Bharath
>>
>
>


Error in block pushing thread puts the KinesisReceiver in a stuck state

2015-11-25 Thread Spark Newbie
Hi Spark users,

I have been seeing this issue where receivers enter a "stuck" state after
it encounters a the following exception "Error in block pushing thread -
java.util.concurrent.TimeoutException: Futures timed out".
I am running the application on spark-1.4.1 and using kinesis-asl-1.4.

When this happens, the observation is that the
Kinesis.ProcessTask.shard.MillisBehindLatest metric does not get
published anymore, when I look at cloudwatch, which indicates that the
workers associated with the receiver are not checkpointing any more for the
shards that they were reading from.

This seems like a bug in to BlockGenerator code , here -
https://github.com/apache/spark/blob/branch-1.4/streaming/src/main/scala/org/apache/spark/streaming/receiver/BlockGenerator.scala#L171
when pushBlock encounters an exception, in this case the TimeoutException,
it stops pushing blocks. Is this really expected behavior?

Has anyone else seen this error and have you also seen the issue where
receivers stop receiving records? I'm also trying to find the root cause
for the TimeoutException. If anyone has an idea on this please share.

Thanks,

Bharath