Hi,
I looked a bit into how MongoDB selects indexes (query plans) and think we
could take some inspiration.
So, the way MongoDB does it afaiu:
* query gets parsed into Abstract Syntax Tree (so that parameters can get
stripped out)
* the first time this query is performed then the query is
hi jukka
this is not quite true. as i will explain below.
first i would strongly recommend not to rely on the current implementation.
if we have the requirement to evaluate permissions based on the path
we may extend the permissionprovider which IMO is the key API for these
cases; not the
Hi,
Can't we do the ACL check lazily?
That's what we do right now.
Regards,
Thomas
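
A minimal sketch of what a lazy ACL check over index results could look like, assuming the index hands back an iterator of paths and access is decided by a predicate. The names `LazyAclFilter`, `filterLazily` and `canRead` are hypothetical and are not Oak's API; the point is only that the access check runs when a row is actually requested, not up front for the whole result set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

class LazyAclFilter {

    /**
     * Wraps an iterator of paths so that the (hypothetical) access check
     * is performed only when the caller asks for the next row.
     */
    static Iterator<String> filterLazily(Iterator<String> paths, Predicate<String> canRead) {
        return new Iterator<String>() {
            private String next; // next readable path, or null if none is buffered

            @Override
            public boolean hasNext() {
                // Advance only as far as needed to find one readable path.
                while (next == null && paths.hasNext()) {
                    String candidate = paths.next();
                    if (canRead.test(candidate)) {
                        next = candidate;
                    }
                }
                return next != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String result = next;
                next = null;
                return result;
            }
        };
    }

    public static void main(String[] args) {
        List<String> indexResult = Arrays.asList("/content/a", "/secret/b", "/content/c");
        List<String> checked = new ArrayList<>();
        // Assumed access rule for illustration: everything under /secret/ is hidden.
        Predicate<String> canRead = p -> {
            checked.add(p);
            return !p.startsWith("/secret/");
        };

        Iterator<String> rows = filterLazily(indexResult.iterator(), canRead);
        System.out.println(rows.next()); // prints "/content/a"
        System.out.println(checked);     // prints "[/content/a]": only one path was checked so far
    }
}
```

A client that reads only the first few rows pays for only the access checks on the paths it actually consumes.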
Hi,
On Thu, Jun 26, 2014 at 4:10 AM, Angela Schreiber anch...@adobe.com wrote:
however, please be aware that one key feature of oak (compared to
jackrabbit which only allowed permission evaluation by path) is that
it always needs to be clear if the target for the permission evaluation
is a
Hi,
On Thu, Jun 26, 2014 at 2:55 AM, Davide Giannella dav...@apache.org wrote:
Can't we do the ACL check lazily? Instead of the query engine looping
through the nodes and checking, if there's no need to do so already (i.e.
sorting), why not return the set and then filter out the ACLs while
Hi,
On Mon, Jun 23, 2014 at 4:23 PM, Thomas Mueller muel...@adobe.com wrote:
Sorry, sure, the condition is verified again. But this might be an
in-memory operation. The index may return the property value for each
entry as part of running the query (QueryIndex - Cursor - IndexRow). I
think
Hi,
But getting
to that point may be a bit tricky, especially because of access
control.
Yes, we would need to use a different access control API. The ability to
check whether a session has access to a path/node/property, without
actually loading the node from the storage backend. Maybe that API
Hi,
On Wed, Jun 25, 2014 at 10:16 AM, Thomas Mueller muel...@adobe.com wrote:
Yes, we would need to use a different access control API. The ability to
check whether a session has access to a path/node/property, without
actually loading the node from the storage backend. Maybe that API is
Hi,
should we just return the number of estimated entries for the cost?
For Lucene, the property index, the ordered index, and the node type
index: yes.
For Solr, the cost per index lookup (not per entry) is probably a bit
higher, because there is a network round trip. Especially if Solr is
Hi,
On Mon, Jun 23, 2014 at 3:30 AM, Thomas Mueller muel...@adobe.com wrote:
Right. I don't believe the cost of the index lookup is significant (at
least in the asymptotic sense) compared to the overall cost of
executing a query.
Sorry, I don't understand. The cost of the index lookup *is*
Hi,
The problem with that assumption is that typically a single disk read
to the index would return n paths, whereas loading those n nodes might
well take n more disk reads.
Ideally, the cost returned by the index would reflect that. For
single-property indexes (all property indexes are single
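
The trade-off discussed in this thread (a fixed cost per index lookup, plus roughly one node load per returned path) can be sketched as a simple formula. This is an illustration under stated assumptions, not the contract or implementation of `QueryIndex.getCost`; the class `IndexCostSketch`, the method `estimateCost`, its parameters, and all the numbers are hypothetical.

```java
class IndexCostSketch {

    /**
     * Hypothetical cost model: one fixed cost for the index lookup
     * (higher for a remote index because of the network round trip),
     * plus one node load for every path the index is expected to return.
     */
    static double estimateCost(double costPerLookup, long estimatedEntries, double costPerNodeLoad) {
        return costPerLookup + estimatedEntries * costPerNodeLoad;
    }

    public static void main(String[] args) {
        // Made-up numbers: a local index with a cheap lookup but many matching entries...
        double local = estimateCost(1.0, 1000, 1.0);
        // ...versus a remote index with an expensive lookup but few matching entries.
        double remote = estimateCost(50.0, 100, 1.0);

        System.out.println("local  = " + local);  // prints "local  = 1001.0"
        System.out.println("remote = " + remote); // prints "remote = 150.0"
    }
}
```

With these made-up numbers the remote index still wins despite its lookup overhead, because the per-entry node loads dominate the total.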
Hi,
On Mon, Jun 23, 2014 at 11:18 AM, Thomas Mueller muel...@adobe.com wrote:
Sure, but we don't use a covered index.
Yes, we are not there yet. The node is currently loaded to check access
rights, but that's an implementation detail of the access control part. And
it's not needed for the admin.
Hi,
It's more than access control. The query engine needs to double-check
the constraints of the query for each matching path before passing
that node to the client (see the constraint.evaluate() call in [1]). I
don't see any easy way to avoid that step without major refactoring.
If there is no
Hi,
On Mon, Jun 23, 2014 at 1:58 PM, Thomas Mueller muel...@adobe.com wrote:
It's more than access control. The query engine needs to double-check
the constraints of the query for each matching path before passing
that node to the client (see the constraint.evaluate() call in [1]). I
don't see
2014-06-04 9:36 GMT+02:00 Thomas Mueller muel...@adobe.com:
Hi,
QueryIndex.getCost: this is actually quite well documented (see the
Javadocs). But the implementations might not fully follow the contract :-)
this is probably just my opinion but the contract is not very clear; to me
finding
On 18/06/2014 10:26, Tommaso Teofili wrote:
it would be ok for me to either deprecate it or improve the semantics
of the cost calculation (e.g. explicitly introduce other metrics to be
taken into account in the cost calculation: local / remote index,
With the IndexPlan.isDelayed() we instruct
Hi,
QueryIndex.getCost
my doubt is what
this heuristic function to estimate the traversed entries should look
like in general
Relational databases typically know the number of entries in the index
(total indexed entries), plus the selectivity of a column. See also
Hi,
On Wed, Jun 18, 2014 at 4:26 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
should we just return the number of estimated entries for the cost?
Yes, that's what I think the contract should be.
My other concern on this point is that it's not guaranteed, in my opinion,
that the index
Hi,
On Wed, Jun 18, 2014 at 7:44 AM, Thomas Mueller muel...@adobe.com wrote:
My other concern on this point is that it's not guaranteed, in my opinion,
that the index returning fewer entries would be the faster one.
Yes, it's not that much about fewer entries or more entries, it's about
lower or higher
ok, thanks Davide for the pointers.
Regards,
Tommaso
2014-06-18 13:36 GMT+02:00 Davide Giannella giannella.dav...@gmail.com:
On 18/06/2014 10:26, Tommaso Teofili wrote:
it would be ok for me to either deprecate it or improve the semantics
of the cost calculation (e.g. explicitly introduce
Hi,
2014-06-18 13:44 GMT+02:00 Thomas Mueller muel...@adobe.com:
Hi,
QueryIndex.getCost
my doubt is what
this heuristic function to estimate the traversed entries should look
like in general
Relational databases typically know the number of entries in the index
(total indexed entries),
Hi,
2014-06-18 16:02 GMT+02:00 Jukka Zitting jukka.zitt...@gmail.com:
Hi,
On Wed, Jun 18, 2014 at 4:26 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
should we just return the number of estimated entries for the cost?
Yes, that's what I think the contract should be.
ok, that's
Hi,
On Wed, Jun 18, 2014 at 11:31 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
2014-06-18 16:02 GMT+02:00 Jukka Zitting jukka.zitt...@gmail.com:
On Wed, Jun 18, 2014 at 4:26 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
should we just return the number of estimated entries for
We could let the
user decide if using an asynchronous index is OK or not.
Another option is if there is no synch index available but an asynch
index is available then QueryEngine should use that instead of
resorting to traversal.
Well, this is the current behavior. The query engine doesn't