gharris1727 commented on PR #13913:
URL: https://github.com/apache/kafka/pull/13913#issuecomment-1639053363

   The original MM2 KIPs describe very large parts of MM2's functionality in 
very few words, often leaving them under-specified, and I think that is the 
case here. I don't think the original proposal gives us enough to decide for 
or against this change.
   
   Personally, I think that if a user:
   1. Has ACL sync enabled
   2. Can externally observe some difference between the source and target
   3. Sees that the difference has persisted for longer than (2x) the sync 
interval
   
   then it is reasonable for them to conclude that the system is misbehaving: 
either the source or target system is unhealthy, MM2 is unhealthy, or MM2 has 
a bug. If caching can cause the above situation, I don't think caching is a 
viable solution.
   
   I'd be interested in trying other strategies such as:
   1. Intentionally un-batching these requests so as to spread them evenly 
across the poll interval
   2. Performing a target read-before-write to replace (potentially expensive?) 
write calls with read calls
   3. Waiting for previous requests to finish before initiating subsequent ones
   4. Exponentially backing off after failures
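
   As a rough illustration of strategies (1) and (4) above, the timing logic 
could look something like the sketch below. It is written in Python for 
brevity and is not MM2 code: `spread_offsets`, `backoff_delay`, and the 
`base_s`/`cap_s` defaults are hypothetical names and values chosen only for 
this example.

```python
from typing import List


def spread_offsets(num_requests: int, interval_s: float) -> List[float]:
    """Start offsets (seconds) that space requests evenly across one interval."""
    if num_requests <= 0:
        return []
    step = interval_s / num_requests
    return [i * step for i in range(num_requests)]


def backoff_delay(failures: int, base_s: float = 1.0, cap_s: float = 60.0) -> float:
    """Delay before the next attempt after `failures` consecutive failures.

    The delay doubles with each failure and is capped at `cap_s`.
    The defaults are assumptions for illustration, not MM2 settings.
    """
    if failures <= 0:
        return 0.0
    return min(cap_s, base_s * 2 ** (failures - 1))
```

   With four requests and a 60 second interval, `spread_offsets(4, 60.0)` 
places one request every 15 seconds instead of issuing all four at once.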
   
   @hudeqi In your environment, are you seeing noticeable load on the source 
system from the ACL reads? Do you have more MM2s connected to the target 
cluster or to the source cluster? I'm wondering if (2) would actually be 
helpful, or if reads and writes have approximately the same cost.
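
   For concreteness, the read-before-write idea in (2) could be sketched as 
below. This is a Python illustration under simplifying assumptions, not the 
MM2 implementation: the `Acl` tuple, `sync_acls`, `read_target`, and 
`write_target` are hypothetical stand-ins for whatever describe and create 
calls the real connector would make.

```python
from typing import Callable, Iterable, Set, Tuple

# Hypothetical, simplified ACL binding: (resource, principal, operation).
Acl = Tuple[str, str, str]


def sync_acls(
    source_acls: Iterable[Acl],
    read_target: Callable[[], Set[Acl]],
    write_target: Callable[[Set[Acl]], None],
) -> Set[Acl]:
    """Read the target first and write only the bindings it is missing.

    Returns the set of bindings that were written (empty if none were).
    """
    missing = set(source_acls) - read_target()
    if missing:
        # Skip the write entirely when the target is already up to date.
        write_target(missing)
    return missing
```

   Once the target has converged, each sync costs one read and no writes, so 
this only helps if a read is meaningfully cheaper than the write it replaces.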


