I think I know the answer to this already but I wanted to check my assumptions 
before proceeding. 

We are using Kafka as a queueing mechanism for receiving messages from 
stateless producers. We are operating in a legal framework where we can never 
lose a committed message, but we can reject a write if Kafka is unavailable and 
it will be retried in the future. We are operating all of our servers in one 
rack so we are vulnerable if a whole rack goes out. We will have 3-4 Kafka 
brokers and have RF=3

To guarantee that we never (to the greatest extent possible) lose a message 
that we have acknowledged, it seems like we need to have 
request.required.acks=-1 and log.flush.interval.messages = 1, i.e. fsync on 
every message and wait for all brokers in ISR to reply before returning 
successfully. This would guard against the failure scenario where all servers 
in our rack go down simultaneously.

Is my understanding correct?  

Thanks, Daniel.

Reply via email to