Re: Simulate facet.exists for json query facets

Michael Gibney Fri, 30 Oct 2020 06:59:25 -0700

> If all of those facet queries are _known_ to be a performance hit,
> you might be able to do something custom. That would require
> custom code though and I wouldn’t go there unless you can
> demonstrate need.


Yeah ... indeed if those facet queries are relatively static (and thus
cacheable ... even if there are a lot of them), an appropriately-sized
filterCache would allow them to be cached to good effect and then the
performance hit should be negligible. Knowing what the queries are up
front, you could even add them to your warming queries.
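To make the warming idea concrete, here is a minimal sketch that turns a known set of facet queries into warming requests. It assumes each facet query can be run as an fq so that its DocSet lands in the filterCache, and that the fq and the facet q parse to equal Query objects; the FACET_QUERIES subset and the warming_params helper are hypothetical names for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical subset of the static facet queries discussed in this thread.
FACET_QUERIES = {
    "tour": "+categoryId:(21450 21453)",
    "story": "+categoryId:21515",
}


def warming_params(facet_queries):
    """Return one parameter list per facet query, each passing it as an fq.

    Assumption: default filter caching applies (no {!cache=false} local
    param), so executing the fq populates the filterCache.
    """
    return [
        [("q", "*:*"), ("rows", "0"), ("fq", fq)]
        for fq in facet_queries.values()
    ]


# Print URL-encoded parameter strings, one per warming request.
for params in warming_params(FACET_QUERIES):
    print(urlencode(params))
```

The same parameter sets could be listed under a newSearcher/firstSearcher listener in solrconfig.xml instead of being sent from a client.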

It'd also be unusual (though possible, sure?) to run these kinds of
facet queries with no intention of ever conditionally following up in
a way that would want the actual results/docSet -- even if the
initial/more common query only cares about boolean existence.
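For the boolean-only case with the query facets that exist today, a rough client-side sketch might look like the following. It assumes the JSON Facet response reports a "count" per query facet; build_query_facets and facet_exists are hypothetical helper names, and the sample response is made up for illustration.

```python
import json


def build_query_facets(categories_by_facet):
    """Map {facet name: [category ids]} to a JSON Facet "facet" object."""
    facets = {}
    for name, ids in categories_by_facet.items():
        joined = " ".join(str(i) for i in ids)
        facets[name] = {"type": "query", "q": f"+categoryId:({joined})"}
    return facets


def facet_exists(facets_response, names):
    """Reduce each query facet's count to a boolean (missing means False)."""
    return {
        name: facets_response.get(name, {}).get("count", 0) > 0
        for name in names
    }


# Request body for the JSON Request API; the ids are taken from the thread.
body = {
    "query": "*:*",
    "limit": 0,
    "facet": build_query_facets({"tour": [21450, 21453], "hut": [8510]}),
}
print(json.dumps(body, indent=2))

# Made-up "facets" section of a response, only to show the reduction.
sample = {"count": 10, "tour": {"count": 3}, "hut": {"count": 0}}
print(facet_exists(sample, ["tour", "hut"]))
```

This still pays for the full count on the server; it only hides the number from the UI.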

The case in which this type of functionality really might be indicated is:
1. only care about boolean result (obvious, ok)
2. dynamic (i.e., not-particularly-cacheable) queries
3. never intend to follow up with a request that calls for full results

If both of the first two conditions hold, and especially if the third
also holds, there would in principle definitely be efficiency to be
gained by early termination (and avoiding the creation of a DocSet,
which at the moment happens unconditionally for every facet query).
I'm also thinking about this through the lens of bringing the JSON
Facet API to parity with the legacy facet API, fwiw ...
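As a toy illustration of where the saving would come from (this is not Solr code, just the general idea of early termination), compare counting every match with stopping at the first one. The function names and the predicate are made up for the example.

```python
def count_matches(doc_ids, matches):
    """Visit every document and count the hits, as a full count must."""
    return sum(1 for doc in doc_ids if matches(doc))


def exists_match(doc_ids, matches):
    """Stop at the first hit; any() short-circuits on the first True."""
    return any(matches(doc) for doc in doc_ids)


visited = []


def is_tour(doc):
    """Hypothetical predicate that also records which docs were visited."""
    visited.append(doc)
    return doc % 1000 == 7


docs = range(100_000)

count_matches(docs, is_tour)
visited_for_count = len(visited)

visited.clear()
exists_match(docs, is_tour)
visited_for_exists = len(visited)

print(visited_for_count, visited_for_exists)
```

The count has to touch every candidate, while the existence check touches only the documents up to the first match.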

On Fri, Oct 30, 2020 at 9:02 AM Erick Erickson <erickerick...@gmail.com> wrote:
>
> I don’t think there’s anything to do what you’re asking OOB.
>
> If all of those facet queries are _known_ to be a performance hit,
> you might be able to do something custom. That would require
> custom code though and I wouldn’t go there unless you can
> demonstrate need.
>
> If you issue a debug=timing you’ll see the time each component
> takes, and there’s a separate entry for faceting so that’ll give you
> a clue whether it’s worth the effort.
>
> Best,
> Erick
>
> > On Oct 30, 2020, at 8:10 AM, Michael Gibney <mich...@michaelgibney.net> wrote:
> >
> > Michael, sorry for the confusion; I was positing a *hypothetical*
> > "exists()" function that doesn't currently exist, that *is* an
> > aggregate function, and that *does* stop early. I didn't account for
> > the fact that there's already an "exists()" function *query* that
> > behaves very differently. So yes, definitely confusing :-). I guess
> > choosing a different name for the proposed aggregate function would
> > make sense. I was suggesting it mostly as an alternative to extending
> > the syntax of JSON Facet "query" facet type, and to say that I think
> > the implementation of such an aggregate function would be pretty
> > straightforward.
> >
> > On Fri, Oct 30, 2020 at 3:44 AM michael dürr <due...@gmail.com> wrote:
> >>
> >> @Erick
> >>
> >> Sorry! I chose a simple example as I wanted to reduce complexity.
> >> In detail:
> >> * We have distinct contents like tours, offers, events, etc which
> >> themselves may be categorized: A tour may be a hiking tour, a
> >> mountaineering tour, ...
> >> * We have hundreds of customers that want to facet their searches on those
> >> content types, but often with distinct combinations of categories, i.e.
> >> customer A wants his facet "tours" to only count hiking tours, customer B
> >> only mountaineering tours, customer C a combination of both, etc.
> >> * We use "query" facets as each facet request will be built dynamically (it
> >> is not feasible to aggregate certain categories and add them as an
> >> additional Solr schema field as we have hundreds of different
> >> combinations).
> >> * Anyway, our UI only requires adding a toggle to filter for (for example)
> >> "tours" in case a facet result is present. We do not care about the number
> >> of tours.
> >> * As we have millions of content items and dozens of content types (and dozens
> >> of categories per content type), such queries may take a very long time.
> >>
> >> A complex example may look like this:
> >>
> >> q=*:*&json.facet={
> >>   tour:      { type: query, q: "+categoryId:(21450 21453)" },
> >>   guide:     { type: query, q: "+categoryId:(21105 21401 21301 21302 21303 21304 21305 21403 21404)" },
> >>   story:     { type: query, q: "+categoryId:21515" },
> >>   condition: { type: query, q: "+categoryId:21514" },
> >>   hut:       { type: query, q: "+categoryId:8510" },
> >>   skiresort: { type: query, q: "+categoryId:21493" },
> >>   offer:     { type: query, q: "+categoryId:21462" },
> >>   lodging:   { type: query, q: "+categoryId:6061" },
> >>   event:     { type: query, q: "+categoryId:21465" },
> >>   poi:       { type: query, q: "+(+categoryId:6000 -categoryId:(6061 21493 8510))" },
> >>   authors:   { type: query, q: "+categoryId:(21205 21206)" },
> >>   partners:  { type: query, q: "+categoryId:21200" },
> >>   list:      { type: query, q: "+categoryId:21481" }
> >> }&rows=0
> >>
> >> @Michael
> >>
> >> Thanks for your suggestion but this does not work as
> >> * the facet module expects an aggregate function (which I simply added by
> >> wrapping your call in sum(...))
> >> * and (please correct me if I am wrong) the exists() function does not stop on
> >> the first match, but counts the number of documents that the query
> >> matches.
>
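
On the debug=timing suggestion quoted above: a small sketch for pulling per-component times out of the parsed debug output, to see how much of a request goes to faceting. The nested layout and component names such as "facet_module" in the sample are my assumption about the response shape, so check them against your own output; component_times is a hypothetical helper.

```python
def component_times(debug_timing, phase="process"):
    """Return {component: time} for one phase of the debug timing section.

    Assumes a layout like {"process": {"time": ..., "<component>":
    {"time": ...}}}; entries that are not dicts (the phase total) are skipped.
    """
    phase_info = debug_timing.get(phase, {})
    return {
        name: entry["time"]
        for name, entry in phase_info.items()
        if isinstance(entry, dict) and "time" in entry
    }


# Made-up timing section, only to show the extraction.
sample_timing = {
    "time": 120.0,
    "prepare": {"time": 1.0, "query": {"time": 1.0}},
    "process": {
        "time": 119.0,
        "query": {"time": 4.0},
        "facet_module": {"time": 115.0},
    },
}

print(component_times(sample_timing))
```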
