Greetings all,

I’ve recently started hitting the following error in Spark Streaming with 
Kafka. Adjusting maxRetries and raising spark.streaming.kafka.consumer.poll.ms 
even to five minutes doesn’t seem to help. The problem only started 
manifesting in the last few days; restarting with a new consumer group 
remedies the issue for a few hours (less than our retention, which is 12 
hours).

Error:
Caused by: java.lang.AssertionError: assertion failed: Got wrong record for spark-executor-<consumergrouphere> <topichere> 76 even after seeking to offset 1759148155
    at scala.Predef$.assert(Predef.scala:170)
    at org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:85)
    at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:223)
    at org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:189)
    at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
I guess my questions are: why is that assertion a job killer rather than a 
warning, and is there anything I can tweak settings-wise that might keep it 
at bay?
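
For reference, my reading of CachedKafkaConsumer.get (paraphrased below from 
memory of the 2.x sources, so details may be off) is that after one re-seek 
and re-poll it simply asserts that the fetched offset matches; the resulting 
AssertionError fails the task, and once task retries are exhausted the whole 
job goes down:

    // Rough paraphrase of org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get,
    // not a verbatim copy of the Spark sources.
    def get(offset: Long, timeout: Long): ConsumerRecord[K, V] = {
      if (offset != nextOffset) {
        seek(offset)    // reposition the cached consumer
        poll(timeout)   // refill the buffer, bounded by spark.streaming.kafka.consumer.poll.ms
      }
      if (!buffer.hasNext()) { poll(timeout) }
      assert(buffer.hasNext(),
        s"Failed to get records for $groupId $topic $partition $offset after polling for $timeout")
      var record = buffer.next()
      if (record.offset != offset) {
        // Buffer miss: try exactly one more seek + poll, then give up hard.
        seek(offset)
        poll(timeout)
        record = buffer.next()
        assert(record.offset == offset,
          s"Got wrong record for $groupId $topic $partition even after seeking to offset $offset")
      }
      nextOffset = offset + 1
      record
    }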

I wouldn’t be surprised if this issue were exacerbated by the volume we push 
through our Kafka topics (~150k messages/sec on the persister that’s crashing).

Thank you!
Justin

