Hello Spark community,

Please let me know if this is the appropriate place to ask this question; I will happily move it if not. I haven't been able to find an answer through the usual outlets.

I am currently implementing two custom readers for our projects (JMS / SQS) and have hit a problem whose root cause I can't pin down. I can't share the code right now, but I used this as boilerplate: https://github.com/hienluu/wikiedit-streaming/blob/master/streaming-receiver/src/main/scala/org/twitterstreaming/receiver/TwitterStreamingSource.scala

The problem is in my commit implementation: after the job has been running for about 30-60 minutes, the committed offsets appear to go out of order and the query fails with:

Caused by: java.lang.RuntimeException: Offsets committed out of order: 608799 followed by 2982

which is raised from line 206 of the file above (https://github.com/hienluu/wikiedit-streaming/blob/master/streaming-receiver/src/main/scala/org/twitterstreaming/receiver/TwitterStreamingSource.scala#L206).

I have a vague suspicion that this is related to Spark reloading checkpoints, but I have nothing concrete to confirm it. Has anyone else encountered this issue, or does anyone have guidance on what I may be doing wrong?

Thanks,
Taylor Cressy
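
P.S. For reference, my commit bookkeeping follows the boilerplate above, which I believe mirrors the pattern used by Spark's own socket/memory sources. Below is a simplified, hypothetical sketch of that pattern for the Spark 2.x V1 Source API (class and field names are mine, event ingestion and getBatch are stubbed out); it is not my actual code, just to show where the failing check lives:

import scala.collection.mutable.ListBuffer

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.{LongOffset, Offset, Source}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical sketch of the offset/commit bookkeeping in a custom V1 streaming Source.
class SketchSource(sqlContext: SQLContext) extends Source {

  // Offset assigned to the most recently buffered event (ingestion omitted here).
  @volatile private var currentOffset: LongOffset = LongOffset(-1)
  // Last offset the engine has told us it finished processing.
  @volatile private var lastOffsetCommitted: LongOffset = LongOffset(-1)
  // In-memory buffer of events that have not been committed yet.
  private val batches = new ListBuffer[String]

  override def schema: StructType = StructType(StructField("value", StringType) :: Nil)

  override def getOffset: Option[Offset] =
    if (currentOffset.offset == -1) None else Some(currentOffset)

  override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
    // Omitted: the real source returns the buffered rows in the (start, end] range.
    sqlContext.emptyDataFrame
  }

  override def commit(end: Offset): Unit = synchronized {
    val newOffset = LongOffset.convert(end).getOrElse(
      sys.error(s"Unexpected offset type: ${end.getClass}"))

    val offsetDiff = (newOffset.offset - lastOffsetCommitted.offset).toInt
    if (offsetDiff < 0) {
      // This is the check that fires in my case: the engine asked to commit an
      // offset smaller than one that was already committed.
      sys.error(s"Offsets committed out of order: $lastOffsetCommitted followed by $end")
    }

    // Drop the committed events and remember how far the engine has gotten.
    batches.trimStart(offsetDiff)
    lastOffsetCommitted = newOffset
  }

  override def stop(): Unit = {}
}

As far as I can tell, that check can only fire if lastOffsetCommitted has gotten ahead of the offset the engine is committing, which is why I suspect something around checkpoint recovery (e.g. the source being re-created with its counters reset while the engine replays commits from the checkpoint) rather than the bookkeeping arithmetic itself, but I may well be reading it wrong.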
I am currently implementing two custom readers for our projects (JMS / SQS) and am experiencing a problem where I can’t determine the root cause. I can’t share the code out as of right now, but I followed this as boiler plate: https://github.com/hienluu/wikiedit-streaming/blob/master/streaming-receiver/src/main/scala/org/twitterstreaming/receiver/TwitterStreamingSource.scala The problem I am encountering is within my commit implementation – I seem to be getting commit ids out of order after the job runs for about 30-60 minutes. I am getting: Caused by: java.lang.RuntimeException: Offsets committed out of order: 608799 followed by 2982 From line 206 (https://github.com/hienluu/wikiedit-streaming/blob/master/streaming-receiver/src/main/scala/org/twitterstreaming/receiver/TwitterStreamingSource.scala#L206) I have a vague suspicion that it is related to Spark reloading checkpoints? But don’t have anything concrete to confirm my suspicion. Has anyone else encountered this issue? Or have any guidance on what I may be doing wrong? Thanks, Taylor Cressy