Hi, I'm working on the Kafka connector for Apache Storm, which pulls messages from Kafka and emits them into a Storm topology. The connector uses manual offset control since message processing happens asynchronously to pulling messages from Kafka, and we hit an issue a while back related to topic compaction. I think we can solve it, but I'd like confirmation that the way we're going about it isn't wrong.
The connector keeps track of which offsets have been emitted into the topology, along with other information such as how many times they've been retried. When an offset should be retried the connector fetches the message from Kafka again (it is not kept in-memory once emitted). We only clean up the state for an offset once it is fully processed. The issue we hit is that if topic compaction is enabled, we need to know that the offset is no longer available so we can delete the corresponding state. Would the approach described here https://issues.apache.org/ jira/browse/STORM-2546?focusedCommentId=16151172&page=com.atlassian.jira. plugin.system.issuetabpanels:comment-tabpanel#comment-16151172 be reasonable for this, or is there another way to check if an offset has been deleted? Thanks.