Re: New Query API - in a distinct bundle? (was [jira] [Commented] (SLING-4752) New resource query API)

Justin Edelson Tue, 23 Jun 2015 15:24:00 -0700

Hi,

On Tue, Jun 23, 2015 at 8:49 PM Alexander Klimetschek <aklim...@adobe.com>
wrote:


> > On 22.06.2015, at 15:49, Justin Edelson <jus...@justinedelson.com>
> wrote:
> > IIUC, the core problem we are trying to solve is to provide a query
> syntax
> > independent of any particular ResourceResolver implementation. While, to be
> > honest, this is not a problem I have personally run into using Sling for
> > the past 6 years,
>
> That's my main concern as well. An edge case creating a ton of complexity
> with probably leaky abstractions (inevitable little tricks to pass through
> query language/resource provider specific stuff, the AEM query builder
> experienced this already with the @orderby statement and fn: functions).
>
> > One thing which concerns me about the current Query API is that it
> appears
> > to be completely non-extensible. How, for example, would one implement
> > something like
> >
> https://docs.adobe.com/docs/en/cq/5-6-1/javadoc/com/day/cq/search/eval/RelativeDateRangePredicateEvaluator.html
>
> This is not the Query API. This is the SPI.


Yes, I know this is the SPI of the QueryBuilder. My point is that because
the current Sling Query API is all strongly typed, there's no way to extend
it with custom predicates like this. In order to add this, the Query API
itself would need to be modified.

Perhaps this extensibility is not desired in Sling, but IMHO it certainly
is one advantage of the (AEM) QueryBuilder.

And if we don't have it in Sling, it only makes the developer's decision
about which query abstraction to use that much more complicated.
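
To make the extensibility point concrete, here is a rough sketch of a string-keyed predicate registry. Every type and method name below is hypothetical; none of this is the AEM QueryBuilder or the Sling Query API. The offset parser only handles "d" and "h" suffixes, and the XPath fragment it emits is illustrative rather than a verified JCR query.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Sling or AEM.
final class PredicateRegistrySketch {

    // An evaluator turns a predicate's parameters into a native query fragment.
    interface Evaluator {
        String toFragment(Map<String, String> params, Instant now);
    }

    private final Map<String, Evaluator> evaluators = new HashMap<>();

    // New predicate types are added by registration, not by changing the API.
    void register(String type, Evaluator evaluator) {
        evaluators.put(type, evaluator);
    }

    String evaluate(String type, Map<String, String> params, Instant now) {
        Evaluator evaluator = evaluators.get(type);
        if (evaluator == null) {
            throw new IllegalArgumentException("No evaluator for predicate type: " + type);
        }
        return evaluator.toFragment(params, now);
    }

    // Applies a simplified relative offset such as "-1d" or "-12h" to "now".
    static Instant applyOffset(Instant now, String offset) {
        long amount = Long.parseLong(offset.substring(0, offset.length() - 1));
        char unit = offset.charAt(offset.length() - 1);
        switch (unit) {
            case 'd': return now.plus(Duration.ofDays(amount));
            case 'h': return now.plus(Duration.ofHours(amount));
            default: throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    // Registers a custom "relativedaterange" predicate so callers do no date math.
    static PredicateRegistrySketch withRelativeDateRange() {
        PredicateRegistrySketch registry = new PredicateRegistrySketch();
        registry.register("relativedaterange", (params, now) ->
            "@" + params.get("property") + " >= xs:dateTime('"
                + applyOffset(now, params.get("lowerBound")) + "')");
        return registry;
    }
}
```

With a strongly typed API, the equivalent of the register() call would require a change to the API itself, which is the concern raised above.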


> But yes, you would need a way to have a different SPI per resource
> provider. Currently a PredicateEvaluator [1] has a single
> getXpathExpression().
>
> [1]
> https://docs.adobe.com/docs/en/aem/6-1/ref/javadoc/com/day/cq/search/eval/PredicateEvaluator.html
>
> > ? If I'm reading this correctly, the date math has to be done by the
> > caller. Which isn't that problematic at first, but the code would be
> > significantly more verbose than
> >
> > relativedaterange.property=jcr:lastModified
> > relativedaterange.lowerBound=-1d
>
> Some of that common parsing logic should be shared, of course, used by the
> different SPIs.


> > What is potentially problematic about not having this type of
> extensibility
> > is that it prevents specific implementations from providing the best
> > implementation possible.
>
> Yep, the AEM querybuilder so far was not designed for different underlying
> query languages / engines, this would be something to look into.
>
> Its design goal was to allow customers to plug in their own predicate
> evaluators, mainly to keep client-side queries short and descriptive, and
> to have them "expanded" into the full, possibly more complex XPath query
> involving multiple predicates or some custom parsing, as in the date example.
>
> Taken plain, this would lead to a matrix of things, predicate evaluators X
> query languages (resource providers). Not sure if this is desirable.
>
> > Here's a better example: JCR is unable to compare two properties, i.e.
> give
> > me all nodes where property foo equals the value of property bar. But
> > MongoDB *can* do this (it isn't super-efficient, but it is possible). I
> can
> > almost see how you would do this with the new Query API, but it would be
> > ugly at best. Or, more broadly, how would the MongoDB $where operator be
> > supported?
>
> Could be a new predicate:
>
> compare.left=jcr:title
> compare.right=jcr:description
>

It could be in the AEM QueryBuilder, but this isn't something the Sling
Query API can support.


>
> > 1) A map of key/value pairs is turned into a PredicateGroup object.
> > 2) The PredicateGroup (which is a nested tree) at this point represents
> the
> > query statement.
> > 3) Each ResourceProvider analyzes the predicates and decides whether or
> not
> > it knows how to evaluate all of them. If it can't, it should return no
> > results (this is debatable, but I think it makes sense). The only
> exception
> > is where you had an or clause, i.e. this query:
> >
> > fulltext=Management
> > group.p.or=true
> > group.1_jcrType=dam:Asset
> > group.2_resourceType=some/resource/type
>
> Yep, these tend to be joins.
>
> Once you join/merge results across different resource providers, you will
> never be able to get acceptable performance. And the implementation is no
> longer resource provider specific, since you need someone on the resource
> resolver level to understand the query.
>

I'm not sure why the performance would be suboptimal in this case unless
sorting was involved. This predicate list would map to three queries (in
the JCR + Mongo use case):

//element(*, dam:Asset)[jcr:contains(., 'Management')]
//element(*, nt:base)[@sling:resourceType='some/resource/type' and
jcr:contains(., 'Management')]
{ 'sling:resourceType' : { $eq : 'some/resource/type' }, $text : { $search :
'Management' } }

And you wouldn't actually need to execute all three queries at once (unless
you needed sizing information) - just return some kind of lazy executor
that exhausts each result set before executing the next query.

The performance for this would be as good as could be expected.
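
A minimal sketch of such a lazy executor follows, assuming each query can be wrapped in a Supplier that runs it on demand. The class name LazyQueryChain is made up for illustration and is not a Sling API.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical sketch, not a Sling API: chains several queries and only
// executes a query once every earlier result set has been exhausted.
final class LazyQueryChain<T> implements Iterator<T> {

    private final Iterator<Supplier<Iterator<T>>> pending;
    private Iterator<T> current = Collections.emptyIterator();

    LazyQueryChain(List<Supplier<Iterator<T>>> queries) {
        this.pending = queries.iterator();
    }

    @Override
    public boolean hasNext() {
        // Move on to the next query only when the current results run out.
        while (!current.hasNext() && pending.hasNext()) {
            current = pending.next().get(); // the query is executed here
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

A caller that stops after the first page of results never triggers the later queries. This sketch does not de-duplicate, so a resource matching more than one of the queries would be returned more than once.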

But let's be clear - query is always going to be a highly leaky
abstraction. Even querying against the JCR API directly is very leaky at
this point in Oak because you really need to know the indexes available in
the system in order to know that a query is going to perform well. Ditto
with MongoDB or any other queryable system.


>
> Here a central search index (Solr, ElasticSearch etc.) is the right
> solution anyway. And that's what I am preaching, anyone who actually has
> the use case of searching across multiple resource providers with the same
> query language should do this.
>

I don't disagree that a centralized index would be a better functional
match, albeit with additional operational complexity. I don't think there's
anything in the model I proposed which would preclude the ResourceResolver
from handing the query off directly to Solr instead of passing it down to
the ResourceProviders.


>
> If the use case is "one resource provider" only, then IMO you can live
> with rp specific query languages, and the current findResources() is fine
> (as long as you can put the query statement in a single string).
>
> > 4) The ResourceProvider uses PredicateEvaluators to map each predicate to
> > its native query syntax. For this to work, each ResourceProvider would
> > expose its own PredicateEvaluator interface (in theory,
> > a ResourceProvider doesn't need to do this if the evaluation process
> isn't
> > intended to be pluggable).
>
> The PredicateEvaluator SPI could be rp specific and not part of the sling
> resource query API.
>

Yes, this is exactly what I'm thinking.
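
Roughly, and with entirely hypothetical names (nothing below is part of Sling or AEM), a provider-specific SPI along these lines could look like the following sketch. Each provider keeps its own evaluators and contributes no query at all when it meets a predicate it does not understand. Nested groups and "or" handling are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; these types are illustrative and are not part of Sling.
final class ProviderQuerySketch {

    // A flat predicate: a type name plus its value.
    static final class Predicate {
        final String type;
        final String value;

        Predicate(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    // Provider-specific SPI: maps one predicate to the provider's native syntax.
    private final Map<String, Function<Predicate, String>> evaluators = new HashMap<>();
    private final String joiner;

    ProviderQuerySketch(String joiner) {
        this.joiner = joiner;
    }

    ProviderQuerySketch register(String type, Function<Predicate, String> evaluator) {
        evaluators.put(type, evaluator);
        return this;
    }

    // Returns the native query, or empty if any predicate is not understood,
    // in which case this provider would return no results.
    Optional<String> translate(List<Predicate> predicates) {
        List<String> fragments = new ArrayList<>();
        for (Predicate p : predicates) {
            Function<Predicate, String> evaluator = evaluators.get(p.type);
            if (evaluator == null) {
                return Optional.empty();
            }
            fragments.add(evaluator.apply(p));
        }
        return Optional.of(String.join(joiner, fragments));
    }
}
```

A JCR-backed provider might register "fulltext" and "resourceType" evaluators that emit XPath fragments, while a MongoDB-backed one registers its own set, so the evaluator interface never has to appear in the shared query API.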

Regards,
Justin


>
> Cheers,
> Alex
