[jira] [Commented] (KAFKA-966) Allow high level consumer to 'nak' a message and force Kafka to close the KafkaStream without losing that message

Joel Koshy (JIRA) Tue, 09 Jul 2013 13:38:02 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703755#comment-13703755
 ]


Joel Koshy commented on KAFKA-966:
----------------------------------

One way to accomplish this is to turn off autocommit and checkpoint offsets 
only after a message (or batch of messages) have been written to the DB.

One caveat though is that rebalances (e.g., if a new consumer instance shows 
up) will result in offsets being committed so there would be an issue if the DB 
is unavailable and a rebalance occurs simultaneously and there are unprocessed 
messages that have already been pulled out of the iterator.

                
> Allow high level consumer to 'nak' a message and force Kafka to close the 
> KafkaStream without losing that message
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-966
>                 URL: https://issues.apache.org/jira/browse/KAFKA-966
>             Project: Kafka
>          Issue Type: Improvement
>          Components: consumer
>    Affects Versions: 0.8
>            Reporter: Chris Curtin
>            Assignee: Neha Narkhede
>            Priority: Minor
>
> Enhancement request.
> The high level consumer is very close to handling a lot of situations a 
> 'typical' client would need. Except for when the message received from Kafka 
> is valid, but the business logic that wants to consume it has a problem.
> For example if I want to write the value to a MongoDB or Cassandra database 
> and the database is not available. I won't know until I go to do the write 
> that the database isn't available, but by then it is too late to NOT read the 
> message from Kafka. Thus if I call shutdown() to stop reading, that message 
> is lost since the offset Kafka writes to ZooKeeper is the next offset.
> Ideally I'd like to be able to tell Kafka: close the KafkaStream but set the 
> next offset to read for this partition to this message when I start up again. 
> And if there are any messages in the BlockingQueue for other partitions, find 
> the lowest # and use it for that partitions offset since I haven't consumed 
> them yet.
> Thus I can cleanly shutdown my processing, resolve whatever the issue is and 
> restart the process.
> Another idea might be to allow a 'peek' into the next message and if I 
> succeed in writing to the database call 'next' to remove it from the queue. 
> I understand this won't deal with a 'kill -9' or hard failure of the JVM 
> leading to the latest offsets not being written to ZooKeeper but it addresses 
> a likely common scenario for consumers. Nor will it add true transactional 
> support since the ZK update could fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (KAFKA-966) Allow high level consumer to 'nak' a message and force Kafka to close the KafkaStream without losing that message

Reply via email to