[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content

2015-06-18 Thread angela (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761
 ] 

angela edited comment on OAK-2807 at 6/18/15 1:19 PM:
--

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by 
everyone (anonymous). 
{quote}

unfortunately it only looks as if it were readable by everyone because the 
access-checks will filter out all 'meta' data that is not readable. it's only 
the 'regular' content that is readable. special things like e.g. the policy 
that opens up the read-access for everyone are not world-readable, nor is other 
special data like e.g. analytics data.

{quote}
If we mark an index on that subtree as readable by everyone on index creation 
then we could skip ACL check on the result set or precompute/cache certain 
query results.
{quote}

the problem is that you don't know if it is really readable by everyone for the 
following reasons:
- any policy _above_ your target tree denying access for a user-principal will 
take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your 
test for 'readable by everyone' become wrong
- with OAK-1268 and having the latter covered by dedicated policies, looking at 
a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} 
will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and 
there will be no simple shortcut to determine 
'readable-for-everyone-for-the-whole-subtree' any more (if it ever was 
possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the 
short-cut you are proposing and the reason why this only returns {{true}} for 
the administrative access is that i didn't see a scalable way to predict this!

{quote}
In order to avoid information leakage the index would have to be marked 
invalid as soon as one node in that sub-tree is not readable by everyone 
anymore. (could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this 
would work reliably for _every_ combination of principals, policies and special 
node types (like e.g. access control content). 
as you can see in {{TreePermission}} I added this shortcut because I thought 
that it might be doable but so far I didn't find a solution for this except 
for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other 
principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a 
concept that _really_ works, it doesn't matter whether it's everyone or some 
other principal.

btw: the same suggestion was made by david ages ago, because he made the 
same assumption that this is easy to achieve. but unfortunately it's not 
if you look at it from a repository point of view and not from a demo-hack 
point of view.

kind regards
angela



[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content

2015-06-18 Thread angela (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761
 ] 

angela edited comment on OAK-2807 at 6/18/15 1:24 PM:
--

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by 
everyone (anonymous). 
{quote}

unfortunately it only looks as if it were readable by everyone because the 
access-checks will filter out all 'meta' data that is not readable. it's only 
the 'regular' content that is readable. special things like e.g. the policy 
that opens up the read-access for everyone are not world-readable, nor is other 
special data like e.g. analytics data.

{quote}
If we mark an index on that subtree as readable by everyone on index creation 
then we could skip ACL check on the result set or precompute/cache certain 
query results.
{quote}

the problem is that you don't know if it is really readable by everyone for the 
following reasons:
- any policy _above_ your target tree denying access for a user-principal will 
take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your 
test for 'readable by everyone' become wrong
- with OAK-1268 and having the latter covered by dedicated policies, looking at 
a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} 
will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and 
there will be _no_ simple shortcut to determine 
'readable-for-everyone-for-the-whole-subtree' any more (if it ever was 
possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the 
short-cut you are proposing and the reason why this only returns {{true}} for 
the administrative access is that i didn't see a scalable way to predict this! 
in the multiplexing setup this would mean that {{TreePermission.canReadAll()}} 
must return {{true}} for _all_ authorization models plugged into the system 
(you wouldn't need to check that yourself as the multiplexer would hide the 
different implementations for you; just meant to illustrate that it might be 
tricky to ever get {{true}} for non-administrative sessions).

{quote}
In order to avoid information leakage the index would have to be marked 
invalid as soon as one node in that sub-tree is not readable by everyone 
anymore. (could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this 
would work reliably for _every_ combination of principals, policies and special 
node types (like e.g. access control content). 
as you can see in {{TreePermission}} I added this shortcut because I thought 
that it might be doable but so far I didn't find a solution for this except 
for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other 
principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a 
concept that _really_ works, it doesn't matter whether it's everyone or some 
other principal.

btw: the same suggestion was made by david ages ago, because he made the 
same assumption that this is easy to achieve. but unfortunately it's not 
if you look at it from a repository point of view and not from a demo-hack 
point of view.

kind regards
angela



[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content

2015-06-18 Thread angela (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761
 ] 

angela edited comment on OAK-2807 at 6/18/15 1:32 PM:
--

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by 
everyone (anonymous). 
{quote}

unfortunately it only looks as if it were readable by everyone because the 
access-checks will filter out all 'meta' data that is not readable. it's only 
the 'regular' content that is readable. special things like e.g. the policy 
that opens up the read-access for everyone are not world-readable, nor is other 
special data like e.g. analytics data.

{quote}
If we mark an index on that subtree as readable by everyone on index creation 
then we could skip ACL check on the result set or precompute/cache certain 
query results.
{quote}

the problem is that you don't know if it is really readable by everyone for the 
following reasons:
- any policy _above_ your target tree denying access for a user-principal will 
take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your 
test for 'readable by everyone' become wrong
- with OAK-1268 and having the latter covered by dedicated policies, looking at 
a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} 
will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and 
there will be _no_ simple shortcut to determine 
'readable-for-everyone-for-the-whole-subtree' any more (if it ever was 
possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the 
short-cut you are proposing and the reason why this only returns {{true}} for 
the administrative access is that i didn't see a scalable way to predict this! 
in the multiplexing setup this would mean that {{TreePermission.canReadAll()}} 
must return {{true}} for _all_ authorization models plugged into the system 
(you wouldn't need to check that yourself as the multiplexer would hide the 
different implementations for you; just meant to illustrate that it might be 
tricky to ever get {{true}} for non-administrative sessions).
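
to illustrate two of the points above (a policy _below_ the target tree, and the 
multiplexing case), here is a minimal sketch under a deliberately simplified toy 
model. it assumes that the nearest entry on a path or one of its ancestors 
decides, and that each authorization model reports a single boolean. the class 
{{ReadableByEveryoneSketch}} and all of its method names are hypothetical; this 
is not oak's actual permission evaluation, and it ignores per-principal denies 
from policies _above_ the tree.

```java
import java.util.List;
import java.util.Map;

public class ReadableByEveryoneSketch {

    /**
     * Toy rule (an assumption, not Oak's algorithm): the nearest entry on the
     * path or one of its ancestors decides. TRUE allows read for everyone,
     * FALSE denies it; no entry at all means default deny.
     */
    static boolean everyoneCanRead(String path, Map<String, Boolean> entries) {
        String p = path;
        while (true) {
            Boolean decision = entries.get(p);
            if (decision != null) {
                return decision;
            }
            if (p.equals("/")) {
                return false; // reached the root without finding any entry
            }
            int idx = p.lastIndexOf('/');
            p = (idx == 0) ? "/" : p.substring(0, idx); // step up to the parent
        }
    }

    /** The honest but expensive check: every path in the subtree is tested. */
    static boolean wholeSubtreeReadable(List<String> subtreePaths, Map<String, Boolean> entries) {
        for (String path : subtreePaths) {
            if (!everyoneCanRead(path, entries)) {
                return false;
            }
        }
        return true;
    }

    /** Multiplexed setup: the shortcut holds only if every model agrees. */
    static boolean compositeCanReadAll(List<Boolean> perModelCanReadAll) {
        return !perModelCanReadAll.isEmpty() && !perModelCanReadAll.contains(Boolean.FALSE);
    }

    public static void main(String[] args) {
        Map<String, Boolean> entries = Map.of(
                "/content/mysite", true,            // opens read access for everyone
                "/content/mysite/members", false);  // CUG-like lock-down further below
        List<String> subtree = List.of(
                "/content/mysite",
                "/content/mysite/en",
                "/content/mysite/members",
                "/content/mysite/members/page");

        System.out.println(everyoneCanRead("/content/mysite", entries)); // prints "true"
        System.out.println(wholeSubtreeReadable(subtree, entries));      // prints "false"
        System.out.println(compositeCanReadAll(List.of(true, false)));   // prints "false"
    }
}
```

the root of the subtree looks public, yet the subtree as a whole is not, and a 
single disagreeing model is enough to disable the shortcut. even in this toy 
model the reliable answer requires visiting every path, which is exactly the 
cost the proposal wants to avoid.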

{quote}
In order to avoid information leakage the index would have to be marked 
invalid as soon as one node in that sub-tree is not readable by everyone 
anymore. (could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this 
would work reliably for _every_ combination of principals, policies and special 
node types (like e.g. access control content). 
as you can see in {{TreePermission}} I added this shortcut because I thought 
that it might be doable but so far I didn't find a solution for this except 
for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other 
principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a 
concept that _really_ works, it doesn't matter whether it's everyone or some 
other principal.

btw: the same suggestion was made by david ages ago, because he made the 
same assumption that this is easy to achieve. but unfortunately it's not 
if you look at it from a repository point of view and not from a demo-hack 
point of view.

having said that: it's definitely worth some additional investigation but 
please be assured that it's not as simple as it might look. i would 
suggest that you use {{TreePermission.canReadAll()}} for any kind of 
optimization, where you can easily verify any possible approach instead of 
being misled by simplistic assumptions. :-)
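
a rough sketch of how a size computation could be guarded by such a check. the 
interface {{ReadAllCheck}} is a hypothetical stand-in that exposes only a 
{{canReadAll()}} method; it is not oak's {{TreePermission}}, and the class and 
method names are assumptions made for illustration only.

```java
import java.util.List;
import java.util.function.Predicate;

public class GetSizeShortcutSketch {

    /** Hypothetical stand-in exposing only the one method discussed above. */
    interface ReadAllCheck {
        boolean canReadAll();
    }

    /**
     * Counts the results visible to the current session. The per-item access
     * check is skipped only when the permission layer itself vouches for the
     * whole subtree; otherwise every result is filtered as before.
     */
    static long visibleSize(List<String> resultPaths, ReadAllCheck check, Predicate<String> canRead) {
        if (check.canReadAll()) {
            return resultPaths.size(); // fast path: no per-item check needed
        }
        return resultPaths.stream().filter(canRead).count(); // slow but safe path
    }

    public static void main(String[] args) {
        List<String> results = List.of(
                "/content/mysite/en",
                "/content/mysite/de",
                "/content/mysite/members/page");
        Predicate<String> canRead = p -> !p.startsWith("/content/mysite/members");

        System.out.println(visibleSize(results, () -> true, canRead));  // prints "3"
        System.out.println(visibleSize(results, () -> false, canRead)); // prints "2"
    }
}
```

the point of this shape is that the decision stays with the permission 
evaluation: the query side never guesses whether a subtree is public, it only 
asks, and falls back to the per-item check whenever the answer is {{false}}.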

kind regards
angela



[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content

2015-04-24 Thread Alexander Klimetschek (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511865#comment-14511865
 ] 

Alexander Klimetschek edited comment on OAK-2807 at 4/24/15 10:03 PM:
--

Sounds great. The security folks will argue that the invalidation is extremely 
important, so it should work well, though in reality it would rarely occur. The 
common case of a public site at say /content/mysite that would always be 
public, especially on a published environment, should benefit greatly from 
that. The prerequisite of separate indexes with one for say /content/mysite in 
particular was done with Oak already, it's time to make use of it :)


was (Author: alexander.klimetschek):
Sounds great. The security folks will argue that the invalidation is extremely 
important, though in reality it would never occur. The common case of a public 
site at say /content/mysite that would always be public, especially on a 
published environment, should benefit greatly from that. The prerequisite of 
separate indexes with one for say /content/mysite in particular was done with 
Oak already, it's time to make use of it :)

 Improve getSize performance for public content
 

 Key: OAK-2807
 URL: https://issues.apache.org/jira/browse/OAK-2807
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: query, security
Affects Versions: 1.0.13, 1.2
Reporter: Michael Marth

 Certain operations in the query engine like getting the size of a result set 
 or facets are expensive to compute due to the fact that ACLs need to be 
 computed on the entire result set. This issue is to discuss an idea how we 
 could improve this:
 There is a very common special case: content (a subtree) that is readable by 
 everyone (anonymous). If we mark an index on that subtree as readable by 
 everyone on index creation then we could skip ACL check on the result set or 
  precompute/cache certain query results.
 In order to avoid information leakage the index would have to be marked 
 invalid as soon as one node in that sub-tree is not readable by everyone 
 anymore. (could be checked through a commit hook)
 Maybe this concept could even be generalized later to work with other 
 principals than everyone.
 Just an idea - feel free to poke holes and shoot it down :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)