Re: New Query API - in a distinct bundle? (was [jira] [Commented] (SLING-4752) New resource query API)

Justin Edelson Mon, 22 Jun 2015 15:56:50 -0700

Hi,

On Mon, Jun 22, 2015 at 10:57 PM Alexander Klimetschek <aklim...@adobe.com>
wrote:

> On 15.06.2015, at 02:23, Carsten Ziegeler <cziege...@apache.org> wrote:
> >
> > It really seems that people who are not convinced have never felt the
> > current pain - while people who are on the pro side exactly felt this
> > pain and ran into the problems which this is trying to solve. I'm
> > absolutely unsure on how to solve that situation.
>
> I was asking this before: what are the pains and specific use cases?
>
> (Apart from the paging of results)
>

Apologies for not tracking this discussion, but I wanted to weigh in before
things got much further.

IIUC, the core problem we are trying to solve is to provide a query syntax
independent of any particular ResourceResolver implementation. While, to be
honest, this is not a problem I have personally run into using Sling for
the past 6 years, I can certainly see why it is one.

But I do think we have a good answer available which was Alex's original
proposal to have Adobe donate the QueryBuilder code to Sling. Now the
QueryBuilder code as-is wouldn't solve this problem; it would require a
refactoring, but I believe this refactoring is manageable. This would have
the following benefits:

1) Adopt a syntax many (but certainly not all) Sling developers are
familiar with.
2) Provide a path to avoid YAQL. While yes, in the near term we will have
"Sling QueryBuilder" and "AEM QueryBuilder", the AEM QueryBuilder could be
deprecated (obviously up to AEM Product Management) and eventually removed.
3) An opportunity to fix some of the issues with QueryBuilder (granted,
this isn't necessarily Sling's problem to solve).

One thing which concerns me about the current Query API is that it appears
to be completely non-extensible. How, for example, would one implement
something like
https://docs.adobe.com/docs/en/cq/5-6-1/javadoc/com/day/cq/search/eval/RelativeDateRangePredicateEvaluator.html
? If I'm reading this correctly, the date math has to be done by the
caller. Which isn't that problematic at first, but the code would be
significantly more verbose than

relativedaterange.property=jcr:lastModified
relativedaterange.lowerBound=-1d

What is potentially problematic about not having this type of extensibility
is that it prevents specific implementations from providing the best
implementation possible. For example, let's say that MongoDB has a really
efficient way to query for documents modified in the last day. If I do the
date math in Java code, I'm making it that much harder for the MongoDB
ResourceProvider to optimize this query (sorry, this isn't a great
example, but it's late and I'm getting tired). Plus, the query isn't really
expressing what I want -- I want to find resources modified in the last
day, not from some absolute date. So someone reading my code later has to
figure out what the calls to Calendar.add(Calendar.DAY_OF_MONTH, -1) are
there for.
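
To make the verbosity concrete, here is a minimal sketch of the caller-side
date math. The class and method names (RelativeDateBound, daysBefore) are
made up for illustration and are not part of any proposed Sling API; only
java.util.Calendar and java.util.Date are real.

```java
import java.util.Calendar;
import java.util.Date;

// Hypothetical helper: what the caller has to write when the query API
// only accepts absolute dates.
class RelativeDateBound {

    /** Returns the instant the given number of days before 'now'. */
    static Date daysBefore(Date now, int days) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(now);
        // Calendar arithmetic in the default time zone
        cal.add(Calendar.DAY_OF_MONTH, -days);
        return cal.getTime();
    }

    public static void main(String[] args) {
        // The absolute lower bound the caller would hand to the query
        Date lowerBound = daysBefore(new Date(), 1);
        System.out.println("jcr:lastModified >= " + lowerBound);
    }
}
```

By the time the query reaches the provider, the "-1d" intent has been
flattened into an absolute timestamp.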

Here's a better example: JCR is unable to compare two properties, i.e. give
me all nodes where property foo equals the value of property bar. But
MongoDB *can* do this (it isn't super-efficient, but it is possible). I can
almost see how you would do this with the new Query API, but it would be
ugly at best. Or, more broadly, how would the MongoDB $where operator be
supported?

The advantage of the AEM QueryBuilder's model is that figuring all of this
stuff out isn't the responsibility of the platform developer. We just need
to provide a solid basis and then let downstream users add their own hooks.
As soon as you say that these are the only 8 operations anyone is ever
going to do on a property or the 4 operations anyone is ever going to do on
a resource, you're into "640k should be enough memory for anyone" territory.

So how specifically would the Sling QueryBuilder be different than the AEM
QueryBuilder?

I think of QueryBuilder queries being processed in these separate steps
(FWIW, none of this is proprietary information, it is based on public
documentation):

1) A map of key/value pairs is turned into a PredicateGroup object. While
technically this step is optional (you can build a PredicateGroup by hand),
it is pretty common. This would be common functionality across all
ResourceResolvers and the code from AEM could probably be brought over
as-is.
2) The PredicateGroup (which is a nested tree) at this point represents the
query statement. It is then passed to the ResourceResolver (this part is
somewhat different than the AEM QueryBuilder).
3) Each ResourceProvider analyzes the predicates and decides whether or not
it knows how to evaluate all of them. If it can't, it should return no
results (this is debatable, but I think it makes sense). The only exception
is an OR clause, as in this query:

fulltext=Management
group.p.or=true
group.1_jcrType=dam:Asset
group.2_resourceType=some/resource/type

If a non-JCR provider didn't know how to evaluate the jcrType predicate
type, it could still evaluate the query because it is OR'd with a
resourceType predicate (which let's say it does know how to evaluate). But
if it didn't know how to evaluate the fulltext predicate type, it shouldn't
return any results.

4) The ResourceProvider uses PredicateEvaluators to map each predicate to
its native query syntax. For this to work, each ResourceProvider would
expose its own PredicateEvaluator interface (in theory,
a ResourceProvider doesn't need to do this if the evaluation process isn't
intended to be pluggable). IOW, the current AEM PredicateEvaluator
interface would be renamed JcrPredicateEvaluator.
5) At least in JCR (based on current functionality), some Predicates can't
be evaluated in a native query (i.e. XPath) and will need to be handled as
filters on the result set, but this is an implementation detail left to
the ResourceProvider.
6) The ResourceProvider returns results to the ResourceResolver.
7) Sorting is handled (or not) as currently proposed.
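
The decision in step 3 could be sketched roughly as follows. This is only
an illustration under assumptions: Node is a simplified stand-in for a
predicate tree, not the real PredicateGroup class, and canEvaluate and
exampleQuery are hypothetical names. The rule it encodes is the one
described above: an AND group needs every child to be supported, while an
OR group needs at least one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PredicateSupport {

    /** Hypothetical stand-in for one node of the predicate tree. */
    static class Node {
        final String type;          // predicate type; null for a group
        final boolean or;           // true if the group's children are OR'd
        final List<Node> children;  // empty for a leaf predicate

        private Node(String type, boolean or, List<Node> children) {
            this.type = type;
            this.or = or;
            this.children = children;
        }
        static Node leaf(String type) {
            return new Node(type, false, Collections.<Node>emptyList());
        }
        static Node and(Node... c) { return new Node(null, false, Arrays.asList(c)); }
        static Node or(Node... c)  { return new Node(null, true, Arrays.asList(c)); }
        boolean isGroup() { return type == null; }
    }

    /** True if a provider knowing the given predicate types can answer the query. */
    static boolean canEvaluate(Node node, Set<String> known) {
        if (!node.isGroup()) {
            return known.contains(node.type);
        }
        if (node.or) {
            // One supported branch is enough (an empty OR group yields false)
            return node.children.stream().anyMatch(c -> canEvaluate(c, known));
        }
        // AND: every child must be supported
        return node.children.stream().allMatch(c -> canEvaluate(c, known));
    }

    /** The fulltext / jcrType / resourceType query from step 3. */
    static Node exampleQuery() {
        return Node.and(
            Node.leaf("fulltext"),
            Node.or(Node.leaf("jcrType"), Node.leaf("resourceType")));
    }

    public static void main(String[] args) {
        Set<String> nonJcr = new HashSet<>(Arrays.asList("fulltext", "resourceType"));
        System.out.println(canEvaluate(exampleQuery(), nonJcr)); // true
        Set<String> noFulltext = new HashSet<>(Arrays.asList("resourceType"));
        System.out.println(canEvaluate(exampleQuery(), noFulltext)); // false
    }
}
```

A real provider would also have to drop the unsupported OR branches before
translating the rest to its native syntax; the sketch only covers the
yes/no decision.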

To be clear, I don't have a concrete proposal for how to replicate (or not)
AEM QueryBuilder's facet support. Alex might...

Regards,
Justin

>
> Cheers,
> Alex
>
