[jira] [Commented] (KAFKA-5261) Performance improvement of SimpleAclAuthorizer
[ https://issues.apache.org/jira/browse/KAFKA-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014945#comment-16014945 ] Stephane Maarek commented on KAFKA-5261: this and possibly {code} private def aclMatch(session: Session, operations: Operation, resource: Resource, principal: KafkaPrincipal, host: String, permissionType: PermissionType, acls: Set[Acl]): Boolean = { acls.find { acl => acl.permissionType == permissionType && (acl.principal == principal || acl.principal == Acl.WildCardPrincipal) && (operations == acl.operation || acl.operation == All) && (acl.host == host || acl.host == Acl.WildCardHost) }.exists { acl => authorizerLogger.debug(s"operation = $operations on resource = $resource from host = $host is $permissionType based on acl = $acl") true } } {code} In case acls is big, that could be costly to do over and over again > Performance improvement of SimpleAclAuthorizer > -- > > Key: KAFKA-5261 > URL: https://issues.apache.org/jira/browse/KAFKA-5261 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.10.2.1 >Reporter: Stephane Maarek > > Currently, looking at the KafkaApis class, it seems that every request going > through Kafka is also going through an authorize check: > {code} > private def authorize(session: Session, operation: Operation, resource: > Resource): Boolean = > authorizer.forall(_.authorize(session, operation, resource)) > {code} > The SimpleAclAuthorizer logic runs through checks which all look to be done > in linear time (except on first run) proportional to the number of acls on a > specific resource. This operation is re-run every time a client tries to use > a Kafka Api, especially on the very often called `handleProducerRequest` and > `handleFetchRequest` > I believe a cache could be built to store the result of the authorize call, > possibly allowing more expensive authorize() calls to happen, and reducing > greatly the CPU usage in the long run. The cache would be invalidated every > time a change happens to aclCache > Thoughts before I try giving it a go with a PR? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5261) Performance improvement of SimpleAclAuthorizer
[ https://issues.apache.org/jira/browse/KAFKA-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013872#comment-16013872 ] Ismael Juma commented on KAFKA-5261: Is this a bottleneck? We have an in-memory cache: {code} override def getAcls(resource: Resource): Set[Acl] = { inReadLock(lock) { aclCache.get(resource).map(_.acls).getOrElse(Set.empty[Acl]) } } {code} Your concern is that a resource may have a large number of ACLs causing this check to be a bottleneck. It would be good to have some numbers to confirm this. > Performance improvement of SimpleAclAuthorizer > -- > > Key: KAFKA-5261 > URL: https://issues.apache.org/jira/browse/KAFKA-5261 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.10.2.1 >Reporter: Stephane Maarek > > Currently, looking at the KafkaApis class, it seems that every request going > through Kafka is also going through an authorize check: > {code} > private def authorize(session: Session, operation: Operation, resource: > Resource): Boolean = > authorizer.forall(_.authorize(session, operation, resource)) > {code} > The SimpleAclAuthorizer logic runs through checks which all look to be done > in linear time (except on first run) proportional to the number of acls on a > specific resource. This operation is re-run every time a client tries to use > a Kafka Api, especially on the very often called `handleProducerRequest` and > `handleFetchRequest` > I believe a cache could be built to store the result of the authorize call, > possibly allowing more expensive authorize() calls to happen, and reducing > greatly the CPU usage in the long run. The cache would be invalidated every > time a change happens to aclCache > Thoughts before I try giving it a go with a PR? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KAFKA-5261) Performance improvement of SimpleAclAuthorizer
[ https://issues.apache.org/jira/browse/KAFKA-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013665#comment-16013665 ] Onur Karaman commented on KAFKA-5261: - Would one alternative be to just make a CachedAclAuthorizer which decorates the SimpleAclAuthorizer with a cache? > Performance improvement of SimpleAclAuthorizer > -- > > Key: KAFKA-5261 > URL: https://issues.apache.org/jira/browse/KAFKA-5261 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.10.2.1 >Reporter: Stephane Maarek > > Currently, looking at the KafkaApis class, it seems that every request going > through Kafka is also going through an authorize check: > {code} > private def authorize(session: Session, operation: Operation, resource: > Resource): Boolean = > authorizer.forall(_.authorize(session, operation, resource)) > {code} > The SimpleAclAuthorizer logic runs through checks which all look to be done > in linear time (except on first run) proportional to the number of acls on a > specific resource. This operation is re-run every time a client tries to use > a Kafka Api, especially on the very often called `handleProducerRequest` and > `handleFetchRequest` > I believe a cache could be built to store the result of the authorize call, > possibly allowing more expensive authorize() calls to happen, and reducing > greatly the CPU usage in the long run. The cache would be invalidated every > time a change happens to aclCache > Thoughts before I try giving it a go with a PR? -- This message was sent by Atlassian JIRA (v6.3.15#6346)