Github user askprasanna commented on the issue:
https://github.com/apache/storm/pull/2104
I want to add the following note. This issue can also occur with
non-compacted topics. We had multiple topologies get stuck because of it,
including one consuming a topic with the "delete" config. Based on what
@vivekmittal and I learned while debugging, even "delete" topics can
encounter this issue under the following circumstances:
- A Kafka topic with relatively short retention period
- A Storm topology that consumes from such a topic with a slow processing
rate
- A fallback strategy to earliest offset OR uncommitted_earliest
Given the above, it is possible that while the spout/consumer is busy
processing messages fetched from a particular partition, the async cleaner
runs and deletes expired logs on the Kafka broker managing that partition.
When the spout then fetches the next batch in a subsequent poll, it is likely
to see message offsets that are not contiguous with the ones it received in
the previous batch.
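
To make the scenario concrete, here is a minimal sketch of how such a gap
could be detected between two polls. It is plain Java with no Kafka or Storm
dependency; the class name `OffsetGapCheck`, the method `findGaps`, and the
offset values are hypothetical and only illustrate the idea. This is not the
code from the pull request.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetGapCheck {
    /** A half-open range [from, to) of offsets that were expected but not delivered. */
    public static final class Gap {
        public final long from;
        public final long to;

        Gap(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + ")";
        }
    }

    /**
     * Returns the offset ranges missing from a fetched batch.
     *
     * @param nextExpected the offset the consumer expected to receive next
     *                     (last offset of the previous batch + 1)
     * @param fetched      offsets of the newly fetched batch, in ascending order
     */
    public static List<Gap> findGaps(long nextExpected, long[] fetched) {
        List<Gap> gaps = new ArrayList<>();
        long expected = nextExpected;
        for (long offset : fetched) {
            if (offset > expected) {
                // Offsets in [expected, offset) were never delivered, for example
                // because retention removed them before they could be fetched.
                gaps.add(new Gap(expected, offset));
            }
            expected = Math.max(expected, offset + 1);
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: the previous batch ended at offset 102, and by
        // the next poll everything before offset 250 had been deleted.
        long[] secondBatch = {250, 251, 252};
        System.out.println(findGaps(103, secondBatch)); // prints [[103, 250)]
    }
}
```

A spout that tracks offsets and assumes every offset in such a range will
eventually be emitted and acked would wait on them indefinitely, which matches
the stuck topologies described above.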