[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content
[ https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761 ]

angela edited comment on OAK-2807 at 6/18/15 1:19 PM:
------------------------------------------------------

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by everyone (anonymous).
{quote}

unfortunately it only looks as if it were readable by everyone, because the access checks filter out all 'meta' data that is not readable. it's only the 'regular' content that is readable. special things, e.g. the policy that opens up read access for everyone, are not world-readable, nor is other special data such as analytics data.

{quote}
If we mark an index on that subtree as readable by everyone on index creation then we could skip ACL check on the result set or precompute/cache certain query results.
{quote}

the problem is that you don't know if it is really readable by everyone, for the following reasons:
- any policy _above_ your target tree denying access for a user-principal will take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your test for 'readable by everyone' wrong
- with OAK-1268 and the latter being covered by dedicated policies, looking at a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and there will be no simple shortcut to determine 'readable-for-everyone-for-the-whole-subtree' any more (if it ever was possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the short-cut you are proposing, and the reason why it only returns {{true}} for administrative access is that i didn't see a scalable way to predict this!

{quote}
In order to avoid information leakage the index would have to be marked invalid as soon as one node in that sub-tree is not readable by everyone anymore.
(could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this would work reliably for _every_ combination of principals, policies and special node types (e.g. access control content). as you can see in {{TreePermission}}, I added this shortcut because I thought that it might be doable, but so far I haven't found a solution for this except for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a concept that _really_ works, it doesn't matter whether it's everyone or some other principal.

btw: the same suggestion was made by david ages ago, because he made the same assumption that this is easy to achieve. but unfortunately it's not, if you look at it from a repository point of view and not from a demo-hack point of view.

kind regards
angela
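To make the point about policies _below_ the target tree concrete, here is a minimal sketch in Java. It is a deliberately simplified toy model and not Oak's permission evaluation: it assumes a single principal (everyone), a plain map from path to an allow/deny decision, and a "nearest entry on the way up wins" rule. The class and method names ({{SubtreeReadCheck}}, {{canRead}}, {{canReadAll}}) are hypothetical and only borrow the naming idea from {{TreePermission}}.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SubtreeReadCheck {

    // Toy rule (an assumption, not Oak's algorithm): the closest entry at or above
    // the path decides; without any entry, read access is denied.
    public static boolean canRead(Map<String, Boolean> entries, String path) {
        String p = path;
        while (true) {
            Boolean decision = entries.get(p);
            if (decision != null) {
                return decision;
            }
            if (p.equals("/")) {
                return false;
            }
            int idx = p.lastIndexOf('/');
            p = (idx == 0) ? "/" : p.substring(0, idx);
        }
    }

    // The subtree is readable as a whole only if every node at or below root is readable.
    // Assumes root is not "/" to keep the prefix test trivial.
    public static boolean canReadAll(Map<String, Boolean> entries, String root,
                                     Collection<String> allPaths) {
        for (String path : allPaths) {
            boolean inside = path.equals(root) || path.startsWith(root + "/");
            if (inside && !canRead(entries, path)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Boolean> entries = new HashMap<>();
        entries.put("/content/mysite", true); // read access opened at the subtree root
        List<String> paths = Arrays.asList(
                "/content/mysite", "/content/mysite/en", "/content/mysite/members/page");

        System.out.println(canReadAll(entries, "/content/mysite", paths)); // true

        // A policy further down locks access again; the check at the root did not change.
        entries.put("/content/mysite/members", false);
        System.out.println(canReadAll(entries, "/content/mysite", paths)); // false
    }
}
```

Even in this toy model, answering "readable for the whole subtree" means visiting every node below the root, which is the scalability concern raised above. User-principal entries above the tree and non-readable access control content are not modelled here at all.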
[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content
[ https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761 ]

angela edited comment on OAK-2807 at 6/18/15 1:24 PM:
------------------------------------------------------

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by everyone (anonymous).
{quote}

unfortunately it only looks as if it were readable by everyone, because the access checks filter out all 'meta' data that is not readable. it's only the 'regular' content that is readable. special things, e.g. the policy that opens up read access for everyone, are not world-readable, nor is other special data such as analytics data.

{quote}
If we mark an index on that subtree as readable by everyone on index creation then we could skip ACL check on the result set or precompute/cache certain query results.
{quote}

the problem is that you don't know if it is really readable by everyone, for the following reasons:
- any policy _above_ your target tree denying access for a user-principal will take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your test for 'readable by everyone' wrong
- with OAK-1268 and the latter being covered by dedicated policies, looking at a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and there will be _no_ simple shortcut to determine 'readable-for-everyone-for-the-whole-subtree' any more (if it ever was possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the short-cut you are proposing, and the reason why it only returns {{true}} for administrative access is that i didn't see a scalable way to predict this!

in the multiplexing setup this would mean that _all_ authorization models plugged into the system must return {{true}} from {{TreePermission.canReadAll()}} (you wouldn't need to do that yourself as the multiplexer would hide the different implementations for you; this is just meant to illustrate that it might be tricky to ever get {{true}} for non-administrative sessions).

{quote}
In order to avoid information leakage the index would have to be marked invalid as soon as one node in that sub-tree is not readable by everyone anymore.
(could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this would work reliably for _every_ combination of principals, policies and special node types (e.g. access control content). as you can see in {{TreePermission}}, I added this shortcut because I thought that it might be doable, but so far I haven't found a solution for this except for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a concept that _really_ works, it doesn't matter whether it's everyone or some other principal.

btw: the same suggestion was made by david ages ago, because he made the same assumption that this is easy to achieve. but unfortunately it's not, if you look at it from a repository point of view and not from a demo-hack point of view.

kind regards
angela
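The remark about the multiplexing setup can be illustrated with a short Java sketch. It assumes a hypothetical {{ReadAllCheck}} interface standing in for the {{canReadAll()}} answer of one authorization model; it is not the Oak multiplexer, only the "all models must agree" rule described above.

```java
import java.util.Arrays;
import java.util.List;

public class CompositeReadAll {

    /** Hypothetical stand-in for one authorization model's canReadAll() answer. */
    public interface ReadAllCheck {
        boolean canReadAll();
    }

    /**
     * The combined answer is true only if every plugged-in model says true.
     * An empty list is treated as "not readable" (a conservative assumption of this sketch).
     */
    public static boolean canReadAll(List<ReadAllCheck> models) {
        if (models.isEmpty()) {
            return false;
        }
        for (ReadAllCheck model : models) {
            if (!model.canReadAll()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ReadAllCheck grantsAll = () -> true;
        ReadAllCheck cannotTell = () -> false; // e.g. a model that cannot predict the whole subtree

        System.out.println(canReadAll(Arrays.asList(grantsAll, grantsAll)));  // true
        System.out.println(canReadAll(Arrays.asList(grantsAll, cannotTell))); // false
    }
}
```

A single model that cannot cheaply predict readability for the whole subtree is enough to turn the combined answer into {{false}}, which is why getting {{true}} for non-administrative sessions looks hard.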
[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content
[ https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591761#comment-14591761 ]

angela edited comment on OAK-2807 at 6/18/15 1:32 PM:
------------------------------------------------------

hi michael

{quote}
There is a very common special case: content (a subtree) that is readable by everyone (anonymous).
{quote}

unfortunately it only looks as if it were readable by everyone, because the access checks filter out all 'meta' data that is not readable. it's only the 'regular' content that is readable. special things, e.g. the policy that opens up read access for everyone, are not world-readable, nor is other special data such as analytics data.

{quote}
If we mark an index on that subtree as readable by everyone on index creation then we could skip ACL check on the result set or precompute/cache certain query results.
{quote}

the problem is that you don't know if it is really readable by everyone, for the following reasons:
- any policy _above_ your target tree denying access for a user-principal will take precedence
- any policy _below_ (e.g. CUG) that locks down access again will make your test for 'readable by everyone' wrong
- with OAK-1268 and the latter being covered by dedicated policies, looking at a given implementation of {{AccessControlPolicy}} and {{AccessControlEntry}} will no longer reflect the complete picture
- with OAK-2008 the permission evaluation will become a multiplexing one and there will be _no_ simple shortcut to determine 'readable-for-everyone-for-the-whole-subtree' any more (if it ever was possible).
- {{TreePermission.canReadAll()}} is already intended to provide exactly the short-cut you are proposing, and the reason why it only returns {{true}} for administrative access is that i didn't see a scalable way to predict this!

in the multiplexing setup this would mean that _all_ authorization models plugged into the system must return {{true}} from {{TreePermission.canReadAll()}} (you wouldn't need to do that yourself as the multiplexer would hide the different implementations for you; this is just meant to illustrate that it might be tricky to ever get {{true}} for non-administrative sessions).

{quote}
In order to avoid information leakage the index would have to be marked invalid as soon as one node in that sub-tree is not readable by everyone anymore.
(could be checked through a commit hook)
{quote}

if that was _really_ feasible... why not... but so far i don't see how this would work reliably for _every_ combination of principals, policies and special node types (e.g. access control content). as you can see in {{TreePermission}}, I added this shortcut because I thought that it might be doable, but so far I haven't found a solution for this except for the trivial case.

{quote}
Maybe this concept could even be generalized later to work with other principals than everyone.
{quote}

the problem is not _everyone_ versus some other principals... if we have a concept that _really_ works, it doesn't matter whether it's everyone or some other principal.

btw: the same suggestion was made by david ages ago, because he made the same assumption that this is easy to achieve. but unfortunately it's not, if you look at it from a repository point of view and not from a demo-hack point of view.

having said that: it's definitely worth some additional investigation, but please be assured that it's not as simple as it might look. i would suggest that we use {{TreePermission.canReadAll()}} for any kind of optimization, where we can easily verify any possible approach instead of being misled by simplistic assumptions. :-)

kind regards
angela
[jira] [Comment Edited] (OAK-2807) Improve getSize performance for public content
[ https://issues.apache.org/jira/browse/OAK-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511865#comment-14511865 ]

Alexander Klimetschek edited comment on OAK-2807 at 4/24/15 10:03 PM:
----------------------------------------------------------------------

Sounds great. The security folks will argue that the invalidation is extremely important, so it should work well, though in reality it would rarely occur. The common case of a public site at, say, /content/mysite that would always be public, especially on a published environment, should benefit greatly from that.

The prerequisite of separate indexes, with one for, say, /content/mysite in particular, was done with Oak already; it's time to make use of it :)

was (Author: alexander.klimetschek):
Sounds great. The security folks will argue that the invalidation is extremely important, though in reality it would never occur. The common case of a public site at, say, /content/mysite that would always be public, especially on a published environment, should benefit greatly from that.

The prerequisite of separate indexes, with one for, say, /content/mysite in particular, was done with Oak already; it's time to make use of it :)

Improve getSize performance for public content

Key: OAK-2807
URL: https://issues.apache.org/jira/browse/OAK-2807
Project: Jackrabbit Oak
Issue Type: Improvement
Components: query, security
Affects Versions: 1.0.13, 1.2
Reporter: Michael Marth

Certain operations in the query engine, like getting the size of a result set or facets, are expensive to compute because ACLs need to be computed on the entire result set. This issue is to discuss an idea of how we could improve this:

There is a very common special case: content (a subtree) that is readable by everyone (anonymous). If we mark an index on that subtree as readable by everyone on index creation, then we could skip the ACL check on the result set or precompute/cache certain query results.

In order to avoid information leakage the index would have to be marked invalid as soon as one node in that sub-tree is not readable by everyone anymore. (could be checked through a commit hook)

Maybe this concept could even be generalized later to work with other principals than everyone.

Just an idea - feel free to poke holes and shoot it down :)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
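The "mark as public, invalidate on change" idea from the issue description could look roughly like the following Java sketch. All names here ({{PublicSubtreeMarker}}, {{nodeChanged}}, {{canSkipAclCheck}}) are hypothetical and not part of Oak; the sketch also assumes that something else (for example a commit hook) can tell for each changed node whether it is still readable by everyone, which is exactly the part the thread identifies as hard.

```java
public class PublicSubtreeMarker {

    private final String subtreeRoot;
    // Once false, the marker stays invalid; re-validation would need a full re-check.
    private volatile boolean valid = true;

    public PublicSubtreeMarker(String subtreeRoot) {
        this.subtreeRoot = subtreeRoot;
    }

    /**
     * Intended to be called for every changed node, e.g. from a commit hook.
     * readableByEveryone is supplied by the caller; computing it reliably is
     * outside the scope of this sketch.
     */
    public void nodeChanged(String path, boolean readableByEveryone) {
        boolean inside = path.equals(subtreeRoot) || path.startsWith(subtreeRoot + "/");
        if (inside && !readableByEveryone) {
            valid = false;
        }
    }

    /** True while no non-public node has been seen inside the subtree. */
    public boolean canSkipAclCheck() {
        return valid;
    }

    public static void main(String[] args) {
        PublicSubtreeMarker marker = new PublicSubtreeMarker("/content/mysite");
        marker.nodeChanged("/content/other/secret", false);  // outside the subtree, ignored
        System.out.println(marker.canSkipAclCheck());        // true
        marker.nodeChanged("/content/mysite/members", false); // restricted node inside
        System.out.println(marker.canSkipAclCheck());        // false
    }
}
```

The marker is deliberately one-way: once a restricted node has been seen, the fast path stays off, which errs on the side of avoiding information leakage.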