[ https://issues.apache.org/jira/browse/KAFKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703755#comment-13703755 ]
Joel Koshy commented on KAFKA-966:
----------------------------------

One way to accomplish this is to turn off autocommit and checkpoint offsets only after a message (or batch of messages) has been written to the DB. One caveat, though, is that rebalances (e.g., if a new consumer instance shows up) will result in offsets being committed, so there would be an issue if the DB is unavailable, a rebalance occurs simultaneously, and there are unprocessed messages that have already been pulled out of the iterator.

> Allow high level consumer to 'nak' a message and force Kafka to close the
> KafkaStream without losing that message
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-966
>                 URL: https://issues.apache.org/jira/browse/KAFKA-966
>             Project: Kafka
>          Issue Type: Improvement
>          Components: consumer
>    Affects Versions: 0.8
>            Reporter: Chris Curtin
>            Assignee: Neha Narkhede
>            Priority: Minor
>
> Enhancement request.
> The high level consumer is very close to handling a lot of situations a
> 'typical' client would need, except for when the message received from Kafka
> is valid but the business logic that wants to consume it has a problem.
> For example, suppose I want to write the value to a MongoDB or Cassandra
> database and the database is not available. I won't know until I go to do
> the write that the database isn't available, but by then it is too late to
> NOT read the message from Kafka. Thus if I call shutdown() to stop reading,
> that message is lost, since the offset Kafka writes to ZooKeeper is the next
> offset.
> Ideally I'd like to be able to tell Kafka: close the KafkaStream, but set
> the next offset to read for this partition to this message when I start up
> again. And if there are any messages in the BlockingQueue for other
> partitions, find the lowest offset and use it for that partition's offset,
> since I haven't consumed them yet.
> Thus I can cleanly shut down my processing, resolve whatever the issue is,
> and restart the process.
> Another idea might be to allow a 'peek' into the next message and, if I
> succeed in writing to the database, call 'next' to remove it from the queue.
> I understand this won't deal with a 'kill -9' or hard failure of the JVM
> leading to the latest offsets not being written to ZooKeeper, but it
> addresses a likely common scenario for consumers. Nor will it add true
> transactional support, since the ZK update could fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
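A minimal sketch of Joel's suggestion (disable autocommit, commit only after the DB write succeeds). This is a self-contained simulation, not the real 0.8 consumer API: the `List<String>` stands in for the KafkaStream iterator, `dbWrite` for the Mongo/Cassandra write, and the names are illustrative. With the actual high-level consumer you would set `auto.commit.enable=false` in the consumer properties and call `ConsumerConnector.commitOffsets()` at the marked point.

```java
import java.util.List;

public class CommitAfterWrite {
    // Stand-in for the DB write; 'dbUp' models availability (illustrative only).
    static boolean dbWrite(String msg, boolean dbUp) {
        return dbUp; // a real write would fail/throw when the DB is down
    }

    /**
     * Consume messages, checkpointing the offset only after a successful DB
     * write. Returns the last committed offset (-1 if nothing was committed).
     * 'dbFailsAt' is the offset at which the simulated DB becomes unavailable.
     */
    static long consume(List<String> messages, int dbFailsAt) {
        long committedOffset = -1;
        for (int offset = 0; offset < messages.size(); offset++) {
            boolean dbUp = offset < dbFailsAt;
            if (!dbWrite(messages.get(offset), dbUp)) {
                // DB unavailable: stop consuming WITHOUT committing this
                // offset, so the message is re-read on restart.
                break;
            }
            // With the real consumer: consumerConnector.commitOffsets();
            committedOffset = offset;
        }
        return committedOffset;
    }

    public static void main(String[] args) {
        List<String> msgs = List.of("m0", "m1", "m2", "m3");
        System.out.println(consume(msgs, 2)); // prints 1: m2 failed, so only m0/m1 committed
    }
}
```

Because the offset of a failed message is never committed, a restarted consumer resumes from the last checkpoint in ZooKeeper and re-reads that message (at-least-once delivery). The caveat Joel notes still applies: a rebalance can commit offsets for messages already pulled from the iterator but not yet written to the DB.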