Mike Pedersen created KAFKA-16651:
-------------------------------------

             Summary: KafkaProducer.send does not throw TimeoutException as 
documented
                 Key: KAFKA-16651
                 URL: https://issues.apache.org/jira/browse/KAFKA-16651
             Project: Kafka
          Issue Type: Bug
          Components: producer 
    Affects Versions: 3.6.2
            Reporter: Mike Pedersen


The JavaDoc for {{KafkaProducer#send(ProducerRecord, Callback)}} claims that 
the method throws a {{TimeoutException}} if blocking on fetching metadata or 
allocating memory exceeds {{max.block.ms}}.

bq. Throws:
bq. {{TimeoutException}} - If the time taken for fetching metadata or 
allocating memory for the record has surpassed max.block.ms.

([link|https://kafka.apache.org/36/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html#send(org.apache.kafka.clients.producer.ProducerRecord,org.apache.kafka.clients.producer.Callback)])

But this is not the case. Because {{TimeoutException}} is an {{ApiException}}, 
it hits [this 
catch|https://github.com/a0x8o/kafka/blob/54eff6af115ee647f60129f2ce6a044cb17215d0/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L1073-L1084],
 which results in a failed future being returned instead of the exception 
being thrown.
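
To make the control flow concrete, here is a minimal, self-contained sketch of 
the behavior described above. It is a simplified model and not the Kafka 
source: the nested {{ApiException}}, {{TimeoutException}} and {{Callback}} 
types, the {{waitOnMetadata}} helper and the {{SendPathSketch}} class are 
hypothetical stand-ins, chosen so that the example runs without the Kafka 
client on the classpath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Simplified model of the send path described in this issue; NOT the Kafka source.
final class SendPathSketch {

    // Hypothetical stand-ins for Kafka's ApiException and TimeoutException.
    static class ApiException extends RuntimeException {
        ApiException(String message) { super(message); }
    }

    static class TimeoutException extends ApiException {
        TimeoutException(String message) { super(message); }
    }

    // Hypothetical stand-in for the producer callback.
    interface Callback {
        void onCompletion(Object metadata, Exception exception);
    }

    // Models the blocking step (metadata fetch or buffer allocation) timing out.
    static void waitOnMetadata() {
        throw new TimeoutException("blocked longer than max.block.ms");
    }

    static Future<Object> send(Callback callback) {
        try {
            waitOnMetadata();
            return CompletableFuture.completedFuture(new Object());
        } catch (ApiException e) {
            // The timeout is caught here: the callback is notified and a
            // failed future is returned, so the caller never sees a throw.
            if (callback != null) {
                callback.onCompletion(null, e);
            }
            CompletableFuture<Object> failed = new CompletableFuture<>();
            failed.completeExceptionally(e);
            return failed;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // send() returns normally even though the blocking step timed out.
        Future<Object> future = send((metadata, e) -> System.out.println("callback got: " + e));
        try {
            future.get();
        } catch (ExecutionException e) {
            System.out.println("future failed with: " + e.getCause());
        }
    }
}
```

In this model a caller that wraps {{send}} in a try/catch for 
{{TimeoutException}} never enters the catch block; the timeout is only visible 
through the callback or through {{Future#get}}.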

The "allocating memory" part likely changed as part of 
[KAFKA-3720|https://github.com/apache/kafka/pull/8399/files#diff-43491ffa1e0f8d28db071d8c23f1a76b54f1f20ea98cf6921bfd1c77a90446abR29]
 which changed the base exception for buffer exhaustion exceptions to 
{{TimeoutException}}. Timing out waiting on metadata suffers the same issue, 
but it is not clear whether this has always been the case.

This is basically a discrepancy between documentation and behavior, so it's a 
question of which one should be adjusted.

On that point, being able to differentiate between synchronous timeouts 
(caused by waiting on metadata or allocating memory) and asynchronous timeouts 
(e.g. timing out waiting for acks) is useful. In the former case we _know_ that 
the broker has not received the event, but in the latter it _may_ be that the 
broker has received it but the ack could not be delivered, and our actions 
might vary because of this. The current behavior makes the two hard to 
tell apart, since both result in a {{TimeoutException}} being delivered via 
the callback. Currently, we are relying on the exception message, but that is 
an implementation detail that may change at any time. 
Therefore I would suggest either:

* Revert to the documented behavior of throwing in case of synchronous timeouts
* Correct the javadoc and introduce an exception base class/interface for 
synchronous timeouts
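
As a rough illustration of the second option, the sketch below uses a marker 
interface to separate the two kinds of timeout inside a callback. Every name 
in it is an assumption made for illustration: {{SynchronousTimeout}}, 
{{MetadataTimeoutException}}, the {{Outcome}} enum and the local 
{{TimeoutException}} stand-in do not exist in Kafka under these names.

```java
// Illustrative only: none of these types exist in Kafka under these names.
final class SyncTimeoutSketch {

    // Hypothetical stand-in for Kafka's TimeoutException.
    static class TimeoutException extends RuntimeException {
        TimeoutException(String message) { super(message); }
    }

    // Hypothetical marker for timeouts raised before the record leaves the client.
    interface SynchronousTimeout {}

    // Hypothetical exception for a metadata wait that exceeded max.block.ms.
    static class MetadataTimeoutException extends TimeoutException implements SynchronousTimeout {
        MetadataTimeoutException(String message) { super(message); }
    }

    enum Outcome { SUCCESS, NOT_SENT, POSSIBLY_SENT, OTHER_FAILURE }

    // Classifies what a callback received without inspecting the exception message.
    static Outcome classify(Exception e) {
        if (e == null) {
            return Outcome.SUCCESS;
        }
        if (e instanceof SynchronousTimeout) {
            // The broker is known not to have received the record.
            return Outcome.NOT_SENT;
        }
        if (e instanceof TimeoutException) {
            // The broker may have received the record even though no ack arrived.
            return Outcome.POSSIBLY_SENT;
        }
        return Outcome.OTHER_FAILURE;
    }

    public static void main(String[] args) {
        System.out.println(classify(new MetadataTimeoutException("metadata wait timed out")));
        System.out.println(classify(new TimeoutException("no ack received")));
    }
}
```

With a type like this, callers could branch on {{instanceof}} rather than on 
message text, which is the fragile workaround described above.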



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
