On Thu, Dec 16, 2021 at 3:53 PM Greg Miller <[email protected]> wrote:
>
> TaxonomyReader was recently updated
> to support bulk ordinal resolution (LUCENE-9476), but SSDV faceting is
> stuck looking up paths one-at-a-time via SSDV#lookupOrd(ord). This
> results in a separate TermsEnum#seekExact() call down in
> Lucene90DocValuesProducer for each ordinal being returned.
>

I'm confused: where do we do gazillions of lookupOrd() calls? We should
not be doing that. The ordinals should be used for all the heavy-duty
work, and at the very end, only the top-10 or whatever resolved back to
strings with lookupOrd. Think of it kinda like the stored fields :)

> Having no knowledge about the actual data representation behind the
> TermsDict in an SSDV field, I'm wondering if someone here can provide
> a high-level sense of whether-or-not there might be an advantage to
> looking up ordinals in bulk. I'm going to dig into the code anyway
> (curious!), but thought I'd raise the idea/question here as well
> regarding whether-or-not a bulk lookup might be advantageous in
> general for SSDV fields. Any thoughts?

I don't think we should provide such an API, because the operation is
slow and should not be done in "bulk" anyway. The number of lookups should
be low (e.g. 10, 50, whatever the user's top-N is).

If you want to optimize it, sort the ordinals in ascending order and look
them up in that order, but honestly in most cases, that probably isn't
even worth it.
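
To make the pattern concrete, here is a minimal sketch of "count by ordinal, resolve only the top-N at the end". It is an illustration under assumptions, not Lucene's faceting code: `TopNFacetSketch`, `OrdLookup` and `topN` are hypothetical names, and `OrdLookup` is only a stand-in for a call like SortedSetDocValues#lookupOrd (which returns a BytesRef, not a String). The per-ordinal counts are assumed to have been accumulated already.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopNFacetSketch {

  /** Hypothetical stand-in for an ordinal-to-term lookup (the slow operation). */
  interface OrdLookup {
    String lookupOrd(long ord);
  }

  /**
   * Picks the n ordinals with the highest counts using ordinals only, then
   * resolves just those winners to labels, visiting them in ascending order.
   * Results are returned by descending count (ties: lower ordinal first).
   */
  static List<Map.Entry<String, Integer>> topN(int[] counts, int n, OrdLookup lookup) {
    List<Map.Entry<String, Integer>> result = new ArrayList<>();
    if (n <= 0) {
      return result;
    }

    // Min-heap: the weakest candidate (lowest count, then highest ordinal)
    // sits on top so it is the one evicted when the heap overflows.
    Comparator<Integer> weakestFirst = (a, b) -> {
      int byCount = Integer.compare(counts[a], counts[b]);
      return byCount != 0 ? byCount : Integer.compare(b, a);
    };
    PriorityQueue<Integer> heap = new PriorityQueue<>(weakestFirst);
    for (int ord = 0; ord < counts.length; ord++) {
      if (counts[ord] == 0) {
        continue; // ordinal never seen in the hits
      }
      heap.offer(ord);
      if (heap.size() > n) {
        heap.poll();
      }
    }

    // Sort the surviving ordinals ascending before touching the lookup.
    int[] winners = new int[heap.size()];
    int i = 0;
    for (int ord : heap) {
      winners[i++] = ord;
    }
    Arrays.sort(winners);

    // The only lookups performed: at most n of them.
    String[] labels = new String[winners.length];
    for (int k = 0; k < winners.length; k++) {
      labels[k] = lookup.lookupOrd(winners[k]);
    }

    // Reorder for presentation: descending count; the stable sort keeps
    // ascending ordinal order among equal counts.
    List<Integer> order = new ArrayList<>();
    for (int k = 0; k < winners.length; k++) {
      order.add(k);
    }
    order.sort((a, b) -> Integer.compare(counts[winners[b]], counts[winners[a]]));
    for (int k : order) {
      result.add(new AbstractMap.SimpleImmutableEntry<>(labels[k], counts[winners[k]]));
    }
    return result;
  }
}
```

With this shape the number of lookups is bounded by the requested top-N rather than by the number of distinct ordinals, and the ascending sort is the optional extra step mentioned above.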
