Hello Kafka Users,

We've been using the old Scala Producer for quite some time now. In the Scala Producer, the send() call throws an exception when Kafka is down. We catch this exception and handle it by spooling the message locally; another thread drains the spool and retries with an increasing backoff.
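For context, that handling has roughly the following shape (a minimal sketch: the in-memory queue stands in for our on-disk spool, and doSend() is a placeholder for the actual producer call):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SpoolingRetrySender {
    private final BlockingQueue<String> spool = new LinkedBlockingQueue<>();

    // Call site: on any send failure, spool the message for later retry.
    public void send(String message) {
        try {
            doSend(message);
        } catch (Exception e) {     // old producer throws when Kafka is down
            spool.add(message);     // hand off to the retry thread
        }
    }

    // Background thread: drain the spool, retrying with increasing backoff.
    public void startRetryThread() {
        Thread t = new Thread(() -> {
            long backoffMs = 1000;  // initial backoff
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    String message = spool.take();
                    try {
                        doSend(message);
                        backoffMs = 1000;                // success: reset backoff
                    } catch (Exception e) {
                        spool.add(message);              // keep it for the next attempt
                        Thread.sleep(backoffMs);
                        backoffMs = Math.min(backoffMs * 2, 60_000); // cap at 1 min
                    }
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Placeholder for the actual (old Scala) producer send.
    private void doSend(String message) throws Exception {
        /* producer.send(...); */
    }
}
```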
However, with the new Java producer and Kafka 0.9, I have noticed the following behaviour in these cases.

1. Send when Kafka is down:

Configuration:
  max.block.ms              500
  request.timeout.ms        1000
  metadata.fetch.timeout.ms 10000

Sequence of events:
- Kafka is down.
- Send a message.
- An exception is thrown.

Here, we attempt to send a message while Kafka is down. As it is the first send to the topic, the producer attempts to update the metadata, and when it hits `metadata.fetch.timeout.ms` we get an exception with the message `TimeoutException: Failed to update metadata after 10000 ms.` (A repro sketch is in the P.S. below.)

2. Send multiple messages, simulating a Kafka failure before the second message:

Configuration:
  max.block.ms              500
  request.timeout.ms        1000
  metadata.fetch.timeout.ms 10000

Sequence of events:
- Kafka is up.
- Send a message.
- Sleep for 10 seconds, during which Kafka is manually shut down.
- An attempt to send another message blocks indefinitely.

`metadata.fetch.timeout.ms` does not kick in here, as it only applies to the first send to a topic. However, I'm confused as to why `max.block.ms` or `request.timeout.ms` do /not/ take effect either.

Which parameter should I configure to get the exception-throwing behaviour of the earlier Scala producer when Kafka is down? Any advice will be greatly appreciated. :-)

Samuel
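P.S. Here is a minimal repro of scenario 1 as I understand it; the broker address and topic name are placeholders for our setup:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.TimeoutException;

public class FirstSendWhileDown {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // broker is down in this test
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("max.block.ms", "500");
        props.put("request.timeout.ms", "1000");
        props.put("metadata.fetch.timeout.ms", "10000");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // First send to the topic: the producer has no metadata yet, so it
            // blocks fetching it, and with the broker down the exception surfaces
            // after metadata.fetch.timeout.ms (10 s), not max.block.ms (500 ms).
            producer.send(new ProducerRecord<>("test-topic", "hello"));
        } catch (TimeoutException e) {
            // "Failed to update metadata after 10000 ms."
            System.err.println("Send failed: " + e.getMessage());
        }
    }
}
```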
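P.P.S. For scenario 2, the only stopgap I've come up with is to bound send() from the outside: run it on its own thread and abandon it after a timeout. Below is a sketch of that idea (my own workaround, not anything from the Kafka docs); I'd much rather have the producer itself honour a timeout, hence the question.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class BoundedSender {
    private final ExecutorService sender = Executors.newSingleThreadExecutor();
    private final KafkaProducer<String, String> producer;

    public BoundedSender(Properties props) {
        this.producer = new KafkaProducer<>(props);
    }

    /** Run send() on a separate thread so an internal block cannot stall the caller. */
    public void sendOrSpool(String topic, String value, long timeoutMs) {
        Future<Future<RecordMetadata>> handoff =
                sender.submit(() -> producer.send(new ProducerRecord<>(topic, value)));
        try {
            // Wait only for send() itself to return (i.e. for the record to be
            // accepted into the producer's buffer), not for the broker ack.
            handoff.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            spool(value); // send() is still blocked internally; spool and move on
        } catch (Exception e) {
            spool(value); // any other failure goes down the same retry path
        }
    }

    private void spool(String value) {
        // Hand off to the local spool / retry thread (same as with the old producer).
        System.err.println("Spooling for retry: " + value);
    }
}
```

The obvious drawback is that the abandoned send() keeps the sender thread blocked until the producer recovers or is closed, and later sends queue up behind it, so this only papers over the problem.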