[jira] [Commented] (KAFKA-5261) Performance improvement of SimpleAclAuthorizer

2017-05-17 Thread Stephane Maarek (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014945#comment-16014945
 ] 

Stephane Maarek commented on KAFKA-5261:


this and possibly
{code}

  private def aclMatch(session: Session, operations: Operation, resource: 
Resource, principal: KafkaPrincipal, host: String, permissionType: 
PermissionType, acls: Set[Acl]): Boolean = {
acls.find { acl =>
  acl.permissionType == permissionType &&
(acl.principal == principal || acl.principal == Acl.WildCardPrincipal) 
&&
(operations == acl.operation || acl.operation == All) &&
(acl.host == host || acl.host == Acl.WildCardHost)
}.exists { acl =>
  authorizerLogger.debug(s"operation = $operations on resource = $resource 
from host = $host is $permissionType based on acl = $acl")
  true
}
  }
{code}

In case acls is big, that could be costly to do over and over again

> Performance improvement of SimpleAclAuthorizer
> --
>
> Key: KAFKA-5261
> URL: https://issues.apache.org/jira/browse/KAFKA-5261
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.2.1
>Reporter: Stephane Maarek
>
> Currently, looking at the KafkaApis class, it seems that every request going 
> through Kafka is also going through an authorize check:
> {code}
>   private def authorize(session: Session, operation: Operation, resource: 
> Resource): Boolean =
> authorizer.forall(_.authorize(session, operation, resource))
> {code}
> The SimpleAclAuthorizer logic runs through checks which all look to be done 
> in linear time (except on first run) proportional to the number of acls on a 
> specific resource. This operation is re-run every time a client tries to use 
> a Kafka Api, especially on the very often called `handleProducerRequest` and  
> `handleFetchRequest`
> I believe a cache could be built to store the result of the authorize call, 
> possibly allowing more expensive authorize() calls to happen, and reducing 
> greatly the CPU usage in the long run. The cache would be invalidated every 
> time a change happens to aclCache
> Thoughts before I try giving it a go with a PR? 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5261) Performance improvement of SimpleAclAuthorizer

2017-05-17 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013872#comment-16013872
 ] 

Ismael Juma commented on KAFKA-5261:


Is this a bottleneck? We have an in-memory cache:

{code}
override def getAcls(resource: Resource): Set[Acl] = {
inReadLock(lock) {
  aclCache.get(resource).map(_.acls).getOrElse(Set.empty[Acl])
}
  }
{code}

Your concern is that a resource may have a large number of ACLs causing this 
check to be a bottleneck. It would be good to have some numbers to confirm this.

> Performance improvement of SimpleAclAuthorizer
> --
>
> Key: KAFKA-5261
> URL: https://issues.apache.org/jira/browse/KAFKA-5261
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.2.1
>Reporter: Stephane Maarek
>
> Currently, looking at the KafkaApis class, it seems that every request going 
> through Kafka is also going through an authorize check:
> {code}
>   private def authorize(session: Session, operation: Operation, resource: 
> Resource): Boolean =
> authorizer.forall(_.authorize(session, operation, resource))
> {code}
> The SimpleAclAuthorizer logic runs through checks which all look to be done 
> in linear time (except on first run) proportional to the number of acls on a 
> specific resource. This operation is re-run every time a client tries to use 
> a Kafka Api, especially on the very often called `handleProducerRequest` and  
> `handleFetchRequest`
> I believe a cache could be built to store the result of the authorize call, 
> possibly allowing more expensive authorize() calls to happen, and reducing 
> greatly the CPU usage in the long run. The cache would be invalidated every 
> time a change happens to aclCache
> Thoughts before I try giving it a go with a PR? 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5261) Performance improvement of SimpleAclAuthorizer

2017-05-17 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013665#comment-16013665
 ] 

Onur Karaman commented on KAFKA-5261:
-

Would one alternative be to just make a CachedAclAuthorizer which decorates the 
SimpleAclAuthorizer with a cache?

> Performance improvement of SimpleAclAuthorizer
> --
>
> Key: KAFKA-5261
> URL: https://issues.apache.org/jira/browse/KAFKA-5261
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.2.1
>Reporter: Stephane Maarek
>
> Currently, looking at the KafkaApis class, it seems that every request going 
> through Kafka is also going through an authorize check:
> {code}
>   private def authorize(session: Session, operation: Operation, resource: 
> Resource): Boolean =
> authorizer.forall(_.authorize(session, operation, resource))
> {code}
> The SimpleAclAuthorizer logic runs through checks which all look to be done 
> in linear time (except on first run) proportional to the number of acls on a 
> specific resource. This operation is re-run every time a client tries to use 
> a Kafka Api, especially on the very often called `handleProducerRequest` and  
> `handleFetchRequest`
> I believe a cache could be built to store the result of the authorize call, 
> possibly allowing more expensive authorize() calls to happen, and reducing 
> greatly the CPU usage in the long run. The cache would be invalidated every 
> time a change happens to aclCache
> Thoughts before I try giving it a go with a PR? 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)