Re: "java.lang.AssertionError: assertion failed: Failed to get records for **** after polling for 180000" error

2019-03-06 Thread JF Chen
Hi
The max bytes setting should be enough, because after the tasks fail, they
read the data from kafka very fast, as normal.
The request.timeout.ms I set is 180 seconds.
I think it should be a timeout setting or a max bandwidth setting, because
the job recovers and reads the same partition very fast after the tasks are
marked failed.
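As a rough sanity check on the bandwidth theory, the numbers mentioned in this thread (a backlog of about 1 GB and the 180 000 ms poll window) imply only a modest sustained read rate. This is illustrative arithmetic only; the figures come from the messages in this thread, and the class name is made up:

```java
public class BacklogThroughput {
    public static void main(String[] args) {
        // Figures from the thread: roughly 1 GB of backlog,
        // and a 180 000 ms (3 minute) poll window.
        double backlogBytes = 1_000_000_000d;
        double pollWindowSec = 180d;

        // Sustained throughput needed to drain the backlog inside one window.
        double requiredMBps = backlogBytes / pollWindowSec / 1_000_000d;
        System.out.printf("required throughput: %.1f MB/s%n", requiredMBps);
    }
}
```

At roughly 5–6 MB/s this is well within what a healthy consumer should manage, which supports the suspicion that the stall is a timeout or throttling issue rather than sheer data volume.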

Regards,
Junfeng Chen


On Wed, Mar 6, 2019 at 4:01 PM Akshay Bhardwaj <
akshay.bhardwaj1...@gmail.com> wrote:

> Sorry message sent as incomplete.
>
> To better debug the issue, please check the below config properties:
>
>- At Kafka consumer properties
>   - max.partition.fetch.bytes within the spark kafka consumer; if not
>   set for the consumer, the global config at broker level applies.
>   - request.timeout.ms
>- At spark's configurations
>   - spark.streaming.kafka.consumer.poll.ms
>   - spark.network.timeout (if the above is not set, poll.ms defaults
>   to spark.network.timeout)
>
>
> Generally I have faced this issue when
> spark.streaming.kafka.consumer.poll.ms is less than request.timeout.ms.
>
> Also, what is the average kafka record message size in bytes?
>
>
>
> Akshay Bhardwaj
> +91-97111-33849
>
>
> On Wed, Mar 6, 2019 at 1:26 PM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
>
>> Hi,
>>
>> To better debug the issue, please check the below config properties:
>>
>>- max.partition.fetch.bytes within the spark kafka consumer; if not set
>>for the consumer, the global config at broker level applies.
>>- spark.streaming.kafka.consumer.poll.ms
>>   - spark.network.timeout (if the above is not set, poll.ms defaults
>>   to spark.network.timeout)
>>
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>>
>> On Wed, Mar 6, 2019 at 8:39 AM JF Chen  wrote:
>>
>>> When my kafka executor reads data from kafka, sometimes it throws the
>>> error "java.lang.AssertionError: assertion failed: Failed to get records
>>> for **** after polling for 180000", which happens after 3 minutes of
>>> executing. The data waiting to be read is not huge, about 1 GB. Other
>>> partitions read by other tasks are very fast, and the error always
>>> occurs on some specific executors.
>>>
>>> Regards,
>>> Junfeng Chen
>>>
>>


Re: "java.lang.AssertionError: assertion failed: Failed to get records for **** after polling for 180000" error

2019-03-06 Thread Akshay Bhardwaj
Sorry message sent as incomplete.

To better debug the issue, please check the below config properties:

   - At Kafka consumer properties
      - max.partition.fetch.bytes within the spark kafka consumer; if not
      set for the consumer, the global config at broker level applies.
      - request.timeout.ms
   - At spark's configurations
      - spark.streaming.kafka.consumer.poll.ms
      - spark.network.timeout (if the above is not set, poll.ms defaults
      to spark.network.timeout)


Generally I have faced this issue when
spark.streaming.kafka.consumer.poll.ms is less than request.timeout.ms.
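A minimal sketch of keeping the two sides consistent. The config key names (request.timeout.ms, max.partition.fetch.bytes, spark.streaming.kafka.consumer.poll.ms, spark.network.timeout) are the real ones discussed above; the numeric values and the class name are illustrative assumptions only:

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaPollConfigCheck {
    public static void main(String[] args) {
        // Kafka consumer side: how long a single broker request may take,
        // and how much one partition fetch may return.
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("request.timeout.ms", 120000);          // 2 minutes (example value)
        kafkaParams.put("max.partition.fetch.bytes", 1048576);  // 1 MB (example value)

        // Spark side: how long the cached consumer polls before the
        // "Failed to get records ... after polling" assertion fires.
        // If unset, it falls back to spark.network.timeout.
        Map<String, String> sparkConf = new HashMap<>();
        sparkConf.put("spark.streaming.kafka.consumer.poll.ms", "180000"); // 3 minutes

        long pollMs = Long.parseLong(
                sparkConf.get("spark.streaming.kafka.consumer.poll.ms"));
        long requestTimeoutMs =
                ((Integer) kafkaParams.get("request.timeout.ms")).longValue();

        // The advice above: keep poll.ms comfortably above request.timeout.ms,
        // otherwise Spark gives up before the Kafka client does.
        if (pollMs <= requestTimeoutMs) {
            throw new IllegalStateException(
                    "poll.ms should exceed request.timeout.ms");
        }
        System.out.println("ok: poll.ms=" + pollMs
                + " > request.timeout.ms=" + requestTimeoutMs);
    }
}
```

In practice these maps would be passed to the SparkConf and to the Kafka direct stream respectively; the point of the sketch is only the ordering check between the two timeouts.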

Also, what is the average kafka record message size in bytes?



Akshay Bhardwaj
+91-97111-33849


On Wed, Mar 6, 2019 at 1:26 PM Akshay Bhardwaj <
akshay.bhardwaj1...@gmail.com> wrote:

> Hi,
>
> To better debug the issue, please check the below config properties:
>
>- max.partition.fetch.bytes within the spark kafka consumer; if not set
>for the consumer, the global config at broker level applies.
>- spark.streaming.kafka.consumer.poll.ms
>   - spark.network.timeout (if the above is not set, poll.ms defaults
>   to spark.network.timeout)
>
> Akshay Bhardwaj
> +91-97111-33849
>
>
> On Wed, Mar 6, 2019 at 8:39 AM JF Chen  wrote:
>
>> When my kafka executor reads data from kafka, sometimes it throws the
>> error "java.lang.AssertionError: assertion failed: Failed to get records
>> for **** after polling for 180000", which happens after 3 minutes of
>> executing. The data waiting to be read is not huge, about 1 GB. Other
>> partitions read by other tasks are very fast, and the error always
>> occurs on some specific executors.
>>
>> Regards,
>> Junfeng Chen
>>
>


Re: "java.lang.AssertionError: assertion failed: Failed to get records for **** after polling for 180000" error

2019-03-05 Thread Akshay Bhardwaj
Hi,

To better debug the issue, please check the below config properties:

   - max.partition.fetch.bytes within the spark kafka consumer; if not set
   for the consumer, the global config at broker level applies.
   - spark.streaming.kafka.consumer.poll.ms
  - spark.network.timeout (if the above is not set, poll.ms defaults
  to spark.network.timeout)

Akshay Bhardwaj
+91-97111-33849


On Wed, Mar 6, 2019 at 8:39 AM JF Chen  wrote:

> When my kafka executor reads data from kafka, sometimes it throws the
> error "java.lang.AssertionError: assertion failed: Failed to get records
> for **** after polling for 180000", which happens after 3 minutes of
> executing. The data waiting to be read is not huge, about 1 GB. Other
> partitions read by other tasks are very fast, and the error always
> occurs on some specific executors.
>
> Regards,
> Junfeng Chen
>


Re: "java.lang.AssertionError: assertion failed: Failed to get records for **** after polling for 180000" error

2019-03-05 Thread Shyam P
It would be better if you shared some code so we can understand it better.

Otherwise it would be difficult to provide an answer.

~Shyam

On Wed, Mar 6, 2019 at 8:38 AM JF Chen  wrote:

> When my kafka executor reads data from kafka, sometimes it throws the
> error "java.lang.AssertionError: assertion failed: Failed to get records
> for **** after polling for 180000", which happens after 3 minutes of
> executing. The data waiting to be read is not huge, about 1 GB. Other
> partitions read by other tasks are very fast, and the error always
> occurs on some specific executors.
>
> Regards,
> Junfeng Chen
>


"java.lang.AssertionError: assertion failed: Failed to get records for **** after polling for 180000" error

2019-03-05 Thread JF Chen
When my kafka executor reads data from kafka, sometimes it throws the error
"java.lang.AssertionError: assertion failed: Failed to get records for ****
after polling for 180000", which happens after 3 minutes of executing.
The data waiting to be read is not huge, about 1 GB. Other partitions read
by other tasks are very fast, and the error always occurs on some specific
executors.

Regards,
Junfeng Chen