[ 
https://issues.apache.org/jira/browse/OAK-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16824155#comment-16824155
 ] 

Thomas Mueller commented on OAK-8167:
-------------------------------------

I updated the documentation of facets in http://svn.apache.org/r1858008. Now 
(in my view) the security aspects are clearly documented. See 
http://jackrabbit.apache.org/oak/docs/query/lucene.html#facets "Warning: this 
setting potentially leaks repository information the user that runs the query 
may not see." [~anchela], do you think this is sufficient? 

I also documented the unfortunate drawback of the sampling method that is the 
motivation for this issue, [~kexu]: "Do note that the beauty of sampling is 
that a sample size of 1000 has an error rate of 3% with 95% confidence, if ACLs 
are evenly distributed over the sampled data. However, often ACLs are not 
evenly distributed." (Technically, for the low error rate, the ACLs would also 
need to be _independent_ of the PRNG used for sampling, but in practice I don't 
think that's an issue.)
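
For reference, the "3% with 95% confidence" figure matches the usual 
normal-approximation margin of error for a sampled proportion. The sketch 
below is only a back-of-the-envelope check, not code from Oak; the function 
name and the worst-case proportion of 0.5 are assumptions made for 
illustration.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval.

    z=1.96 corresponds to 95% confidence; proportion=0.5 is the worst
    case (assumed here, it maximizes the error).
    """
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)

# A sample of 1000 gives roughly a 3% margin of error.
print(f"{margin_of_error(1000):.3f}")  # prints 0.031
```

This bound only holds when each sampled document is equally likely to be 
accessible, which is exactly the "evenly distributed" condition above.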

That done, I don't see a way to improve the situation if ACLs are _not_ evenly 
distributed. So I'm afraid we will have to close this issue as "Won't fix".


> With uneven distribution of ACL restriction across facet labels statistical 
> facet count become too inaccurate
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: OAK-8167
>                 URL: https://issues.apache.org/jira/browse/OAK-8167
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene, query
>    Affects Versions: 1.6.16
>            Reporter: Kelvin Xu
>            Priority: Major
>              Labels: vulnerability
>
> With the statistical mode, the facet count is updated in proportion to the 
> percentage of accessible samples, which works when secured content is 
> scattered across different facets. In the edge case where a whole facet 
> (all of its results) is not accessible, the count still shows a number after 
> the sampling percentage is applied. Even if the number is small, the user 
> experience is misleading/inaccurate, as nothing is returned when the facet 
> is clicked (applied as a query condition).
> For example, consider a "private" folder guarded by ACLs/CUGs, in which all 
> the assets are tagged with the same facet value. An unauthorized user may 
> still see this facet with a count but gets nothing when clicking on the facet.
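
The edge case described above can be reproduced with a simplified model of 
proportional scaling. This is a sketch under stated assumptions, not Oak's 
actual implementation: the function name, the document tuples, and the 
`can_read` callback are all hypothetical, and the whole result set is used 
as the sample so that the numbers are deterministic.

```python
def estimate_facet_counts(raw_counts, sampled_docs, can_read):
    """Scale every raw facet count by the share of readable sampled documents."""
    accessible = sum(1 for doc in sampled_docs if can_read(doc))
    ratio = accessible / len(sampled_docs)
    return {label: round(count * ratio) for label, count in raw_counts.items()}

# Hypothetical data: 900 readable "public" assets, 100 unreadable "private" ones.
docs = [("public", True)] * 900 + [("private", False)] * 100
raw_counts = {"public": 900, "private": 100}

estimates = estimate_facet_counts(raw_counts, docs, can_read=lambda d: d[1])
print(estimates)  # {'public': 810, 'private': 90}
```

The true accessible counts in this model are 900 and 0, so the "private" 
facet is reported with a count of 90 even though the query behind it 
returns nothing for this user.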



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
