-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33417/#review81097
-----------------------------------------------------------


This piece of logic has been quite complex and awkward to me now, for example 
in ready() a node will only not be considered if ALL of its partitions are 
either not sendable or are in the backoff period, and the reason we want to get 
ready nodes before drain is to check if they are "really" ready or not. This is 
mainly because 1) we need to be careful when calling client.poll() later about 
the timeout value in order to avoid busy waiting, 2) we need to make sure if 
metadata refresh is needed, it needs to be sent as higher priority than other 
requests.

I suggest re-writing this fraction of code to make it clearer, in the following 
process:

0. while handle metadata response and update the metadata, check for ANY 
partitions if their leader is not known; if there is set 
metadata.requestUpdate. So we do not need to do this step anymore at the start 
of run().

1. get all the ready nodes based on their connection state only (i.e. no 
peeking in RecordAccumulator), and record the node_backoff as min 
(reconnection_backoff - time_waited) of all nodes; if one of these node is 
connected or connecting, this backoff should be 0.

2. for each of ready nodes, try to drain their corresponding partitions in 
RecordAccumulator while considering or kinds of conditions (full, expired, 
exhausted, etc...), and record the data_backoff as min (retry_backoff - 
time_waited) of all partitions; if one of the partitions is immediately 
sendable, this backoff should be 0.

3. formulate produce request and call client.poll() with timeout = 
reconnection_backoff > 0 ? recconection_backoff : retry_backoff.

4. in NetworkClient.poll(), the logic of "maybeUpdateMetadata" while update 
metadataTimeout can also be simplified.

This may contain some flaw, Jiangjie / Ewen let me know if you see any issues.

- Guozhang Wang


On April 21, 2015, 10:51 p.m., Jiangjie Qin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33417/
> -----------------------------------------------------------
> 
> (Updated April 21, 2015, 10:51 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2138
>     https://issues.apache.org/jira/browse/KAFKA-2138
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> Patch for KAFKA-2138 honor retry backoff in KafkaProducer
> 
> 
> Diffs
> -----
> 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java
>  0e7ab29a07301309aa8ca5b75b32b8e05ddc3a94 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/RecordAccumulatorTest.java
>  05e2929c0a9fc8cf2a8ebe0776ca62d5f3962d5c 
> 
> Diff: https://reviews.apache.org/r/33417/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jiangjie Qin
> 
>

Reply via email to