On Thu, Dec 16, 2021 at 3:53 PM Greg Miller <[email protected]> wrote:
>

> TaxonomyReader was recently updated
> to support bulk ordinal resolution (LUCENE-9476), but SSDV faceting is
> stuck looking up paths one-at-a-time via SSDV#lookupOrd(ord). This
> results in a separate TermsEnum#seekExact() call down in
> Lucene90DocValuesProducer for each ordinal being returned.
>

I'm confused: where do we do gazillions of lookupOrd() calls? We should
not be doing that. The ordinals should be used for all the heavy-duty
work, and only at the very end should the top 10 (or whatever) be
resolved back to strings with lookupOrd(). Think of it kinda like the
stored fields :)
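
To make the pattern concrete, here is a minimal sketch in plain Java with
no Lucene dependency. Everything in it is an assumption for illustration:
the class name is hypothetical, `counts` is a per-ordinal count array that
the faceting code is presumed to have built already, and the `lookupOrd`
parameter is a stand-in for SortedSetDocValues#lookupOrd (which in Lucene
returns a BytesRef and can throw IOException).

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.IntFunction;

/** Hypothetical helper, not part of Lucene: work on ordinals, resolve only the top N. */
final class TopNOrdinals {

  /** Returns the labels of the (at most) topN ordinals with the highest non-zero counts, best first. */
  static List<String> topLabels(int[] counts, int topN, IntFunction<String> lookupOrd) {
    // Min-heap keyed on count, so the weakest candidate is always at the head.
    // On equal counts the higher ordinal is treated as weaker.
    PriorityQueue<Integer> heap = new PriorityQueue<>(
        (a, b) -> counts[a] != counts[b]
            ? Integer.compare(counts[a], counts[b])
            : Integer.compare(b, a));

    // Heavy-duty part: ordinals and counts only, no string lookups.
    for (int ord = 0; ord < counts.length; ord++) {
      if (counts[ord] == 0) {
        continue;
      }
      heap.offer(ord);
      if (heap.size() > topN) {
        heap.poll(); // drop the current weakest
      }
    }

    // Only now resolve the survivors to strings: at most topN lookups.
    // The heap drains weakest-first, so fill the result from the back.
    String[] labels = new String[heap.size()];
    for (int i = labels.length - 1; i >= 0; i--) {
      labels[i] = lookupOrd.apply(heap.poll());
    }
    return Arrays.asList(labels);
  }
}
```

The number of lookups is bounded by topN no matter how many distinct
ordinals were counted.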

> Having no knowledge about the actual data representation behind the
> TermsDict in an SSDV field, I'm wondering if someone here can provide
> a high-level sense of whether-or-not there might be an advantage to
> looking up ordinals in bulk. I'm going to dig into the code anyway
> (curious!), but thought I'd raise the idea/question here as well
> regarding whether-or-not a bulk lookup might be advantageous in
> general for SSDV fields. Any thoughts?

I don't think we should provide such an API, because the operation is
slow and should not be done in "bulk" anyway. The number of lookups
should be low (e.g. 10, 50, whatever the user's top-N is). If you want
to optimize it, sort the ordinals in ascending order and look them up
in that order, but honestly, in most cases that probably isn't even
worth it.
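
If someone did want to try the ascending-order idea, a rough sketch could
look like the following. It is plain Java and makes no claim about how the
terms dictionary behaves internally: the class name is hypothetical, and
`lookupOrd` is again just a stand-in for SortedSetDocValues#lookupOrd.
Whether visiting ordinals in ascending order helps at all would need to be
measured.

```java
import java.util.Arrays;
import java.util.function.IntFunction;

/** Hypothetical helper, not part of Lucene: look up ordinals in ascending order. */
final class SortedOrdLookup {

  /**
   * Resolves each ordinal in ords, visiting them in ascending ordinal order,
   * and returns the labels in the same positions as the input (e.g. rank order).
   */
  static String[] resolve(int[] ords, IntFunction<String> lookupOrd) {
    // Sort positions rather than ords themselves, so the caller's order is preserved.
    Integer[] positions = new Integer[ords.length];
    for (int i = 0; i < positions.length; i++) {
      positions[i] = i;
    }
    Arrays.sort(positions, (a, b) -> Integer.compare(ords[a], ords[b]));

    String[] labels = new String[ords.length];
    for (int pos : positions) {
      labels[pos] = lookupOrd.apply(ords[pos]); // lookups happen in ascending ordinal order
    }
    return labels;
  }
}
```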
