I have enabled the spark.streaming.backpressure.enabled setting and also
set spark.streaming.backpressure.initialRate to 15000, but my Spark job
is not respecting these settings when reading from Kafka after a failure.
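
For reference, this is roughly how the configuration is set up (a minimal
sketch; the app name and the 10-second batch interval are assumptions, not
details from the actual job):

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  // Backpressure settings as described above.
  val conf = new SparkConf()
    .setAppName("kafka-backpressure-test")  // hypothetical app name
    .set("spark.streaming.backpressure.enabled", "true")
    .set("spark.streaming.backpressure.initialRate", "15000")

  // Assumed 10-second batch interval, not stated in this post.
  val ssc = new StreamingContext(conf, Seconds(10))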

In my Kafka topic, around 500k records are waiting to be processed, and
they are all taken in one huge batch, which ultimately takes a long time
and fails with an executor failure exception. We don't have more
resources to give in our test cluster, and we expect backpressure to
kick in and take smaller batches.

What could I be doing wrong?


Thanks & Regards
Biplob Biswas