Re: Performance Issues

Les Hazlewood Thu, 10 Jan 2013 18:23:45 -0800

On Thu, Jan 10, 2013 at 9:39 AM, gwhite <[email protected]> wrote:


> We have been running several performance tests using Shiro in a desktop
> application, with the following scenario:
> •       An user has 25500 permissions directly assigned.
> •       The user has 11 roles with 25500 permissions each one, i.e. 280500
> roles
> indirectly assigned.
>

This is an extremely large number of permissions, not something suitable
for a most Shiro Realms' default implementation.  The Shiro Realms default
permission mechanisms probably work for 80-90% of use cases.  But make no
mistake - Shiro can handle the number of permissions you have and even more
- you'll just have to modify your Realm implementation a bit because you're
probably in the 10% of use cases.  I'll elaborate below.

•       The stored permissions have the following format:
> entityX:attributeY:a1,a2,…,aN
> where
> entityX: name of an entity (001 <= X <= 500)
> attributeY: name of an entity’s attribute (01 <= Y <= 50)
> A: actions permitted on the attribute (1 <= N <= 4)
>  Example: entity054:attribute28:1,2,3
>

The comma-delimited approach is a convenience mechanism so you don't need
too many permission assignments.  However, due to the nature of strings and
parsing, and token comparison, if you have a very large number of
permissions assigned to a Subject (or indirectly via groups) like you do,
this would indeed have a performance impact:

Each permission String is, by default, converted to a WildcardPermission
instance.  The WildcardPermission looks at the tokens and chops them up
into respective parts for internal storage for later comparison when a
permission 'implies' method is executed.  More on that in a minute.

When performing a permission check, the Shiro Realm's default
implementation will do the following:

- Iterate over all permissions assigned to the Subject (or assigned to its
Roles via a RolePermissionResolver).
- For each permission, execute its implies method (i.e.
assignedPerm.implies(permToCheck)).  If true, short circuit and return true
immediately.
- If false, continue to the next loop iteration.

The reason it does this is because the implies method is for _implication_
logic - it's not a direct equality check, i.e. perm 1 might imply more
rights than perm 2 and therefore encompass perm 2's rights, and computation
needs to occur to perform this logic.

When a WildcardPermission instance's implies method is called, more
iteration occurs:

- Iterate over the tokens in the WildcardPermission instance (parsed from
the permission string).
- For each token, if the token is equal or surpass the corresponding token
in the permission being checked, continue to the next token.  If not equal
or greater, short circuit and return false.
If greater than or equal, check the next token until you run out of tokens.

Also, if using commas in a token, those also need to be parsed out and
compared against an incoming permission (more computation).

This logic can be seen here:
http://svn.apache.org/repos/asf/shiro/trunk/core/src/main/java/org/apache/shiro/authz/permission/WildcardPermission.java

The point is, is that there is a lot of iteration going on: iteration for
each permission, and then more iteration _within_ each implies method call.
 For most applications that have, say, under 100 permissions per subject,
the 'implies' logic cost is negligible and you don't have to worry about
it.  But in your situation, you'll need another technique.

For me, the best way to solve this is using a constant time operation via
an hash index.  That is, don't have any implication logic for permission
checks - use a direct equality check.  You might lose some of the
expression semantics of normal WildcardPermissions, but because of the
quantity of permissions you have, your app will be _much_ faster.  But this
is common in computing - the larger data sets we have, we often have to
change our algorithms accordingly.  It's the good 'ol CS time vs space
tradeoff.

So, one solution is to not use commas in your stored permissions or in the
permissions you use when executing a check, and expand them into the more
verbose form and assign them that way.  Then you do a straight equality
check/lookup against the stored permissions and you'll receive a very fast
yes/no answer.

For example, consider your previously mentioned assigned permission string:

entity054:attribute28:1,2,3

Assigning this to a user or group is effectively the same thing as
individually assigning the following three permissions:

entity054:attribute28:1
entity054:attribute28:2
entity054:attribute28:3

Doing this, you'll of course have even more permission assignments, but you
can now perform a direct equality check (we're using more space to reduce
time).  You no longer need to 'imply' anything - either they have an exact
permission or they dont.  The perm checks based on equality can now be
super fast.

For example, you might change your AuthorizingRealm implementation's
isPermitted method to be similar to the following (assuming you have
Account and Group concepts in your Realm's data model):

boolean isPermitted(PrincipalCollection principals, String permToCheck) {
    Account account = getAccountByShiroPrincipals(principals); //lookup
from your DB or cache
    Set<String> permissions = account.getPermissions(); //or get from cache
    if (permissions.contains(permToCheck)) {
        return true;
    }

    Set<Group> groups = account.getGroups(); //or get from db or cache
    for( Group group : groups ) {
        permissions = group.getPermissions();
        if (permissions.contains(permToCheck) {
            return true;
        }
    }

    return false;
}

This is a simplified example of course - maybe you can get the collections
from a cache somewhere.  Since JDK Set implementations tend to use a
backing HashMap, the contains method should be a constant time operation,
i.e. O(1).  Also, instead of looping in the isPermitted implementation, you
could also use a Fork/Join technique (JDK 7 or JD6 via
http://g.oswego.edu/dl/concurrency-interest/) to execute the 'contains'
method concurrently across all of the Permission sets that might exist,
giving you an answer even faster.  Finally, because we're not performing a
Permission implies check (implication logic), there is no need to
instantiate or invoke Permission objects, cutting down computation time
significantly.

I hope this gives you ideas - without having visibility into your
particular permission assignment/storage techniques, this should at least
get you pointed in the right direction.

HTH!

Best regards,

--
Les Hazlewood | @lhazlewood
CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282
Stormpath wins GigaOM Structure Launchpad Award! http://bit.ly/MvZkMk

Re: Performance Issues

Reply via email to