[jira] [Comment Edited] (OAK-2807) Improve getSize performance for "public" content

angela (JIRA) Thu, 18 Jun 2015 06:26:05 -0700

    [ 
https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761
 ]


angela edited comment on OAK-2807 at 6/18/15 1:24 PM:
------------------------------------------------------

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by 
everyone (anonymous). 
{quote}

unfortunately it only looks as if it were readable by everyone, because the 
access checks filter out all 'meta' data that is not readable. only the 
'regular' content is readable. special items such as the policy that opens up 
read access for everyone are not world-readable, nor is other special data 
such as analytics data.

{quote}
If we mark an index on that subtree as "readable by everyone" on index creation 
then we could skip ACL check on the result set or precompute/cache certain 
query results.
{quote}

the problem is that you don't know if it is really readable by everyone for the 
following reasons:
- any policy _above_ your target tree denying access for a user-principal will 
take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your 
test for 'readable by everyone' become wrong
- with OAK-1268, where the latter is covered by dedicated policies, looking at 
a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} 
will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and 
there will be _no_ simple shortcut to determine 
'readable-for-everyone-for-the-whole-subtree' any more (if it ever was 
possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the 
short-cut you are proposing. the reason why it only returns {{true}} for 
administrative access is that i didn't see a scalable way to predict this! 
in the multiplexing setup this would mean that _all_ authorization models 
plugged into the system must return {{true}} from 
{{TreePermission.canReadAll()}} (you wouldn't need to do that yourself as the 
multiplexer would hide the different implementations for you; this is just 
meant to illustrate that it might be tricky to ever get {{true}} for 
non-administrative sessions).
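
to illustrate the last point, here is a minimal sketch of how a multiplexing 
check would have to combine the answers of the individual authorization 
models. it is not Oak code: {{ReadAllCheck}} and {{compositeCanReadAll}} are 
hypothetical names, and the interface is reduced to the single method that 
matters here. the only thing it is meant to show is that one model answering 
{{false}} is enough to lose the shortcut.

```java
import java.util.List;

public class CompositeReadAllSketch {

    /** Hypothetical stand-in for one authorization model (not the Oak interface). */
    interface ReadAllCheck {
        boolean canReadAll();
    }

    /**
     * The shortcut may only be taken if every plugged-in model grants it.
     * An empty list is treated as "no guarantee" and yields false.
     */
    static boolean compositeCanReadAll(List<ReadAllCheck> models) {
        if (models.isEmpty()) {
            return false;
        }
        for (ReadAllCheck model : models) {
            if (!model.canReadAll()) {
                return false; // a single model without the guarantee is enough
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ReadAllCheck grantsAll = () -> true;      // e.g. an administrative session
        ReadAllCheck cannotPredict = () -> false; // a model that cannot predict the subtree

        System.out.println(compositeCanReadAll(List.of(grantsAll, grantsAll)));
        System.out.println(compositeCanReadAll(List.of(grantsAll, cannotPredict)));
    }
}
```

the more models are plugged in, the less likely it becomes that all of them 
can give this guarantee for a non-administrative session.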

{quote}
In order to avoid information leakage the index would have to be marked 
"invalid" as soon as one node in that sub-tree is not readable by everyone 
anymore. (could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this 
would work reliably for _every_ combination of principals, policies and special 
node types (like e.g. access control content). 
as you can see in {{TreePermission}} I added this shortcut because I thought 
that it might be doable... but so far I haven't found a solution for this 
except for the trivial case.
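
a toy model shows why a check at the subtree root is not enough. this is a 
sketch under strong assumptions: {{Node}}, {{naiveCheck}} and {{fullCheck}} 
are made-up names, each node carries a single boolean instead of real 
policies, and policies _above_ the subtree, multiple principals and special 
node types are ignored completely.

```java
import java.util.ArrayList;
import java.util.List;

public class SubtreeReadableSketch {

    /** Hypothetical in-memory node; the flag replaces real permission evaluation. */
    static final class Node {
        final String name;
        final boolean readableByEveryone;
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean readableByEveryone) {
            this.name = name;
            this.readableByEveryone = readableByEveryone;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Looks at the subtree root only; cheap, but misses restrictions below. */
    static boolean naiveCheck(Node root) {
        return root.readableByEveryone;
    }

    /** Visits every node of the subtree; correct for this toy model, but costly. */
    static boolean fullCheck(Node node) {
        if (!node.readableByEveryone) {
            return false;
        }
        for (Node child : node.children) {
            if (!fullCheck(child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "public" subtree with one nested node that locks down access again
        Node root = new Node("content", true)
                .add(new Node("page", true)
                        .add(new Node("restricted", false)));

        System.out.println("naive: " + naiveCheck(root));
        System.out.println("full:  " + fullCheck(root));
    }
}
```

even in this simplified form the correct answer requires walking the whole 
subtree, which is exactly the cost the shortcut was supposed to avoid.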

{quote}
Maybe this concept could even be generalized later to work with other 
principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a 
concept that _really_ works, it doesn't matter whether it's everyone or some 
other principal.

btw: the same suggestion was made by david ages ago, because he made the same 
assumption that this is easy to achieve. but unfortunately it's not if you 
look at it from a repository point of view and not from a demo-hack point of 
view.

kind regards
angela


was (Author: anchela):
hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by 
everyone (anonymous). 
{quote}

unfortunately it only looks like being readable to everyone because the 
access-checks will filter out all 'meta' data that is not readable. it's only 
the 'regular' content that is readable. special things like e.g. the policy 
that opens up the read-access for everyone is not world-readable, nor are other 
special data like e.g. analytics data.

{quote}
If we mark an index on that subtree as "readable by everyone" on index creation 
then we could skip ACL check on the result set or precompute/cache certain 
query results.
{quote}

the problem is that you don't know if it is really readable by everyone for the 
following reasons:
- any policy _above_ your target tree denying access for a user-principal will 
take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your 
test for 'readable by everyone' become wrong
- with OAK-1268, where the latter is covered by dedicated policies, looking at 
a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} 
will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and 
there will be simple shortcut to determine 
'readable-for-everyone-for-the-whole-subtree' any more (if it ever was 
possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the 
short-cut you are proposing. the reason why it only returns {{true}} for 
administrative access is that i didn't see a scalable way to predict this!

{quote}
In order to avoid information leakage the index would have to be marked 
"invalid" as soon as one node in that sub-tree is not readable by everyone 
anymore. (could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this 
would work reliably for _every_ combination of principals, policies and special 
node types (like e.g. access control content). 
as you can see in {{TreePermission}} I added this shortcut because I thought 
that it might be doable.... but so far I didn't find a solution for this except 
for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other 
principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a 
concept that _really_ works, it doesn't matter whether it's everyone or some 
other principal.

btw: the same suggestion has been made by david ages ago, because he just made 
the same assumptions that this is easy to achieve. but unfortunately it's not 
if you look at it from a repository point of view and not from a demo-hack 
point of view.

kind regards
angela

> Improve getSize performance for "public" content
> ------------------------------------------------
>
>                 Key: OAK-2807
>                 URL: https://issues.apache.org/jira/browse/OAK-2807
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query, security
>    Affects Versions: 1.0.13, 1.2
>            Reporter: Michael Marth
>
> Certain operations in the query engine like getting the size of a result set 
> or facets are expensive to compute due to the fact that ACLs need to be 
> computed on the entire result set. This issue is to discuss an idea how we 
> could improve this:
> There is a very common special case: content (a subtree) that is readable by 
> everyone (anonymous). If we mark an index on that subtree as "readable by 
> everyone" on index creation then we could skip ACL check on the result set or 
> precompute/cache certain query results.
> In order to avoid information leakage the index would have to be marked 
> "invalid" as soon as one node in that sub-tree is not readable by everyone 
> anymore. (could be checked through a commit hook)
> Maybe this concept could even be generalized later to work with other 
> principals than everyone.
> Just an idea - feel free to poke holes and shoot it down :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
