[
https://issues.apache.org/jira/browse/KAFKA-9729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manikumar resolved KAFKA-9729.
------------------------------
Resolution: Fixed
> Shrink inWriteLock time in SimpleAuthorizer
> -------------------------------------------
>
> Key: KAFKA-9729
> URL: https://issues.apache.org/jira/browse/KAFKA-9729
> Project: Kafka
> Issue Type: Improvement
> Components: security
> Affects Versions: 1.1.0
> Reporter: Jiao Zhang
> Assignee: Lucas Bradstreet
> Priority: Minor
>
> Current SimpleAuthorizer needs 'inWriteLock' when processing add/remove acls
> requests, while getAcls in authorize() needs 'inReadLock'.
> That means handling add/remove acls requests would block all other requests
> for example produce and fetch requests.
> When processing add/remove acls, updateResourceAcls() access zk to update
> acls, which could be long in the case like network glitch.
> We did the simulation for zk delay.
> When adding 100ms delay on zk side, 'inWriteLock' in addAcls()/removeAcls
> lasts for 400ms~500ms.
> When adding 500ms delay on zk side, 'inWriteLock' in addAcls()/removeAcls
> lasts for 2000ms~2500ms.
> {code:java}
> override def addAcls(acls: Set[Acl], resource: Resource) {
> if (acls != null && acls.nonEmpty) {
> inWriteLock(lock) {
> val startMs = Time.SYSTEM.milliseconds()
> updateResourceAcls(resource) { currentAcls =>
> currentAcls ++ acls
> }
> warn(s"inWriteLock in addAcls consumes ${Time.SYSTEM.milliseconds() -
> startMs} milliseconds.")
> }
> }
> }{code}
> Blocking produce/fetch requests for 2s would cause apparent performance
> degradation for the whole cluster.
> So considering is it possible to remove 'inWriteLock' in addAcls/removeAcls
> and only put 'inWriteLock' inside updateCache, which is called by
> addAcls/removeAcls.
> {code:java}
> // code placeholder
> private def updateCache(resource: Resource, versionedAcls: VersionedAcls) {
> if (versionedAcls.acls.nonEmpty) {
> aclCache.put(resource, versionedAcls)
> } else {
> aclCache.remove(resource)
> }
> }
> {code}
> If do this, block time is only the time for updating local cache, which isn't
> influenced by network glitch. But don't know if there were special concerns
> to have current strict write lock and not sure if there are side effects if
> only put lock to updateCache.
> Btw, the latest version uses 'inWriteLock' at same places as version 1.1.0.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)