I agree it would make long-term sense to consolidate the backend
implementation. I think leaving the "classic" user-facing facet API (with
JSON Facet module as a backend) would be a good idea. Either way, I think a
first step would be checking for parity between existing backend
implementations -- possibly in terms of features [1], but certainly in
terms of performance for common use cases [2].

I think removal of the "classic" user-facing API would cause a lot of
consternation in the user community. I can even see a
non-backward-compatibility argument for preserving the "classic"
user-facing API: it's simpler for simple use cases. _If_ the ultimate goal
is removal of the "classic" user-facing API (not presuming that it is),
that approach could be facilitated in the short term by enticing users
towards "JSON Facet" API ... basically with a "feature freeze" on the
legacy implementation. No new features [3], no new optimizations [4] for
"classic"; concentrate such efforts on JSON Facet. This seems to already be
the de facto case, but it could be a more intentional decision -- e.g. in
[3] it's straightforward to extend the proposed "facet cache" to the
"classic" impl ... but I could see an argument for intentionally not doing
so.
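
To make the "simpler for simple use cases" point concrete, here is a rough
sketch of the same terms facet expressed through both APIs. The field name
"category" and the helper function names are made up for illustration; the
request parameters themselves (`facet`, `facet.field`, `facet.limit`,
`json.facet`) are the standard ones as far as I know.

```python
import json
from urllib.parse import urlencode


def classic_facet_params(field, limit=10):
    """Query parameters for a terms facet via the "classic" API."""
    return {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
    }


def json_facet_params(field, limit=10):
    """Query parameters for the equivalent facet via the JSON Facet API."""
    # The facet is labeled with the field name here; any label would do.
    facet = {field: {"type": "terms", "field": field, "limit": limit}}
    return {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}


# "category" is a hypothetical field name.
print(urlencode(classic_facet_params("category")))
print(urlencode(json_facet_params("category")))
```

For the single-field case the "classic" form is a couple of flat parameters,
while the JSON form needs a nested structure; the JSON form pays off once you
want nested or more complex facets.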

Robert, I think your concerns about UninvertedField could be addressed by
the `uninvertible="false"` property. It currently defaults to "true" for
backward compatibility iiuc, but the default could be changed to "false",
or we could at least provide a way to set the default for all fields to
"false" at the node level in solr.xml (I know I've wished for the latter!).
Also fwiw I'm not aware of any JSON Facet processors that work with string
values in RAM ... I do think all JSON Facet processors use OrdinalMap now,
where relevant.
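
For reference, a minimal sketch of opting a single field out of uninversion
today, by building a Schema API `replace-field` command. The field name
"category", the type "string", and the helper name are assumptions for
illustration; a real `replace-field` command needs the field's full
definition, and turning on docValues for an existing field would require
reindexing.

```python
import json


def replace_field_command(name, field_type):
    """Build a Schema API command that sets uninvertible=false on a field.

    Assumes the field should use docValues for faceting instead of
    falling back to uninversion on the heap.
    """
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "docValues": True,
            "uninvertible": False,
        }
    }


# Hypothetical field; the JSON body would be POSTed to
# /solr/<collection>/schema on a node that manages the schema.
payload = json.dumps(replace_field_command("category", "string"), indent=2)
print(payload)
```

The sketch only builds the request body and does not contact Solr.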

[1] https://issues.apache.org/jira/browse/SOLR-14921
[2] https://issues.apache.org/jira/browse/SOLR-14764
[3] https://issues.apache.org/jira/browse/SOLR-13807
[4] https://issues.apache.org/jira/browse/SOLR-10732

On Fri, Jan 22, 2021 at 12:46 AM Robert Muir <rcm...@gmail.com> wrote:

> Do these two options conflate concerns of input format vs. actual
> algorithm? That was always my disappointment.
>
> I feel like the java apis are off here at the lower level, and it
> hurts the user.
> I don't talk about the input format from the user, instead I mean the
> execution of the faceting query.
>
> IMO: building top-level caches (e.g. uninvertedfield) or
> on-the-fly-caches (e.g. fieldcache) is totally trappy already.
> But with the uninvertedfield of json facets it does its own thing,
> even if you went thru the trouble to enable docvalues at index time:
> that's sad.
>
> the code by default should not give the user jvm
> heap/garbage-collector hell. If you want to do that to yourself, for a
> totally static index, IMO that should be opt-in.
>
> But for the record, it is no longer just two shitty choices like
> "top-level vs per-segment". There are different field types, e.g.
> numeric types where the per-segment approach works efficiently.
> Then you have the strings, but there is a newish middle ground for
> Strings: OrdinalMap (lucene Multi* interfaces do it) which builds
> top-level integers structures to speed up string-faceting, but doesnt
> need *string values* in ram.
> It is just integers and mostly compresses as deltas. Adrien compresses
> the shit out of it.
>
> So I'd hate for the user to lose the option here of using docvalues to
> keep faceting out of heap memory, which should not be hassling them
> already in 2021.
> Maybe better to refactor the code such that all these concerns aren't
> unexpectedly tied together.
>
> On Thu, Jan 21, 2021 at 10:08 PM David Smiley <dsmi...@apache.org> wrote:
> >
> > There's a JIRA issue about this from 5 years ago:
> https://issues.apache.org/jira/browse/SOLR-7296
> > I don't recall seeing any resistance to the idea of having the JSON
> Faceting module act as a back-end to the front-end (API surface) of Solr's
> common/classic/original/whatever faceting API.  I don't think that simple
> API should go away; its strength is simple/common cases that are
> comparatively verbose in the JSON one.
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> >
> > On Thu, Jan 21, 2021 at 9:57 PM Marcus Eagan <marcusea...@gmail.com>
> wrote:
> >>
> >> Hi all,
> >>
> >> Sorry to spam the list. I am querying the list in such quick succession
> because of a realization I came to while on Twitter. Is it time to
> deprecate the Legacy Facet API?
> >>
> >> I understood in the past that they behaved slightly differently. Now,
> I'm wondering if it makes sense to keep the legacy facets package as it
> adds a burden of maintenance to the project. If some activists really want
> it, I will abandon the effort. If the interest is very light, I suppose
> they can package it up in a plugin. In fact, I would help if they run into
> trouble and I am able to help.
> >>
> >> Anyway, let me know what you think. If it's a good idea, I will head
> over to the chopping block.
> >>
> >> --
> >> Marcus Eagan
> >>
>
