d8tltanc edited a comment on pull request #8421:
URL: https://github.com/apache/kafka/pull/8421#issuecomment-637300554


   @skaundinya15 @ijuma @abbccdda Thanks for all the feedback and comments. 
This patch was made when I was new to Kafka. It's kind of naive to me at this 
time as I gained more insights into Kafka. Let me talk about two of my major 
concerns and thoughts about implementing the universal client exponential 
backoff.
   
   **AdminClient logic redundancy**
   
   NetworkClient has request timeout handlers. Producer / Consumer are using 
NetworkClient to help handle timeout but AdminClient doesn’t. The reason, to my 
understanding, is that AdminClient is implementing the per-request timeout.
   
   For example,
   
   1. Wrapping the request builder into a new class `Call`, (the construction 
lambda adds tons of lines into the AdminClient.java, which should probably have 
been living in each AbstractRequest implementation classes files)
   2. Re-writing the request queues for different request status, while normal 
clients are fully using the NetworkClient.
   
   After we add support to the per-request retry backoff to all clients, we can 
implement the per-request timeout together by the way. Thus we can clean up the 
redundant request handling logic in AdminClient.
   
   Are we considering refactoring the AdminClient further and remove all the 
redundant logic which should have belonged to the networking layer and the 
AbstractRequest implementation classes?
   
   **Flexible backoff modes**
   
   Let's analyze the request backoff demands of all the types of clients at 
this point. In my opinion, there are simply two:
   
   1. Requests do not need exponential backoff. These requests need to be sent 
ASAP to avoid dataflow performance degradation, such as the `ProduceRequest` 
and its related/preceding metadata requests.
   
   2. Request do need exponential backoff. These requests are “second-class 
citizens” and can be throttled to avoid request storms on the broker side. Such 
as metadata related requests in AdminClient.
   
   Now the question comes. Even when two requests are of the same request type, 
one may have to get sent ASAP while the other one may wait, depending on the 
use case. We need to think deeper about how to make a classification.
   
   But the implementation would be simple. We can utilize the existing builder 
pattern AbstractRequest and build the request flexibly upon a given 
retry_backoff mode. For example, 
   
   1. AbstractRequest.Builder will interact with a new abstract class 
specifying the retry_backoff option, static or exponential. 
   2. AbstractRequest will have some new interfaces controlling the backoff. 
   
   Then, we can control if the request should have a static backoff or an 
exponential backoff when we construct each implementation instance of 
AbstractRequest.Builder. 
   
   
   I'll include more details in the Jira ticket and rewrite this PR. Before we 
talk more about the code details and start the new implementation, please let 
me know what you think about the AdminClient refactor and static/exponential 
retry_backoff classification rule. 
   
   As @abbccdda suggests, let's re-direct our further discussion to 
[Jira](https://issues.apache.org/jira/browse/KAFKA-9800) Thanks.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to