On Fri, Mar 23, 2018 at 4:06 PM, mdladakos <mdlada...@gmail.com> wrote:
> Keith, thanks for your quick response!
>
> Maybe I wasn't clear enough or I am not understanding your explanation.
>
> What I was exploring was performing a scan with a large number of
> authorizations. While I did use tables with thousands of rows, I also ran
> scans against empty tables, and those still took ~25 seconds. So shouldn't
> VisibilityEvaluator not be involved?
>
> I don't think the actual filtering is the problem. Is there some work done
> by the tablet servers when receiving the scan request, specifically in
> regard to user authorizations?
>
> Again, if I used -s to pass a subset of authorizations for the user with
> 100000 authorizations, the return time would be equivalent to that for a
> user with that number of authorizations (i.e., if I scanned with 100
> authorizations out of the 100000, it would run at the normal, fast speed).

I think the following code may be the problem. The collection userauths is a
list, so performance will be O(M*N), where M is the number of authorizations
passed with the scan and N is the number the user holds. If M and N are both
100K, that's not good. If userauths were a set, this would be much faster for
the case you are testing.

https://github.com/apache/accumulo/blob/17bc708dcabd17824a8378597e0542002470ed18/server/base/src/main/java/org/apache/accumulo/server/security/handler/ZKAuthorizor.java#L166
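
To illustrate the difference, here is a minimal sketch of the two lookup strategies. It is not the actual ZKAuthorizor code: the class AuthCheckSketch and its methods are hypothetical names made up for this example, and it assumes authorizations are held as ByteBuffer values. The first method does a linear contains() on a list for every requested authorization, and the second copies the user's authorizations into a HashSet first.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch for illustration only; not code from Accumulo.
class AuthCheckSketch {

  // List.contains() is a linear scan, so checking M requested
  // authorizations against N user authorizations costs O(M*N).
  static boolean isValidWithList(List<ByteBuffer> userAuths,
                                 Collection<ByteBuffer> requested) {
    for (ByteBuffer auth : requested) {
      if (!userAuths.contains(auth)) {
        return false;
      }
    }
    return true;
  }

  // Building a HashSet costs O(N) once; each lookup is then O(1) on
  // average, so the whole check is roughly O(M + N).
  static boolean isValidWithSet(Collection<ByteBuffer> userAuths,
                                Collection<ByteBuffer> requested) {
    Set<ByteBuffer> lookup = new HashSet<>(userAuths);
    for (ByteBuffer auth : requested) {
      if (!lookup.contains(auth)) {
        return false;
      }
    }
    return true;
  }

  // Generates authorizations named auth0, auth1, ... for the demo.
  static List<ByteBuffer> makeAuths(int count) {
    List<ByteBuffer> auths = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      auths.add(ByteBuffer.wrap(("auth" + i).getBytes(StandardCharsets.UTF_8)));
    }
    return auths;
  }

  public static void main(String[] args) {
    // Kept small so the list version finishes quickly; raise the count
    // to see the quadratic growth.
    List<ByteBuffer> userAuths = makeAuths(5_000);
    List<ByteBuffer> requested = makeAuths(5_000);

    long t0 = System.nanoTime();
    boolean viaList = isValidWithList(userAuths, requested);
    long t1 = System.nanoTime();
    boolean viaSet = isValidWithSet(userAuths, requested);
    long t2 = System.nanoTime();

    System.out.printf("list: %b in %d ms%n", viaList, (t1 - t0) / 1_000_000);
    System.out.printf("set:  %b in %d ms%n", viaSet, (t2 - t1) / 1_000_000);
  }
}
```

Both methods return the same answer; only the cost differs. This would also fit what you saw with -s: with 100 requested authorizations the list scan only does on the order of 100 * N comparisons, which stays fast.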