Hey folks,

I've been chatting with Marc D'Mello a bit about the SSDV faceting he's working on (LUCENE-10250) (disclaimer: we both work on Amazon's Product Search engine). We're trying to figure out where taxonomy-based faceting has a performance advantage over SSDV, and it occurred to me that the two approaches resolve the paths for given ordinals differently. TaxonomyReader was recently updated to support bulk ordinal resolution (LUCENE-9476), but SSDV faceting is stuck looking up paths one at a time via SSDV#lookupOrd(ord). This results in a separate TermsEnum#seekExact() call down in Lucene90DocValuesProducer for each ordinal being returned.
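
To make the idea concrete, here is a toy sketch of why a bulk lookup might help. It assumes, purely for illustration, that terms are stored in fixed-size blocks and that reaching a term means seeking to the start of its block; I have not confirmed that this matches the real TermsDict layout. The class name, the lookupOrds method, and the block size are all hypothetical and are not Lucene APIs.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Toy model only: NOT Lucene's actual TermsDict format or API. */
public class BulkOrdLookupSketch {
  // Hypothetical block size, kept tiny so the effect is visible.
  static final int BLOCK_SIZE = 4;

  final String[] sortedTerms; // term at index i has ordinal i
  int seeks = 0;              // simulated "seek to block start" operations

  BulkOrdLookupSketch(String[] sortedTerms) {
    this.sortedTerms = sortedTerms;
  }

  /** One-at-a-time lookup: every call pays for a fresh seek to the block start. */
  String lookupOrd(int ord) {
    seeks++;
    int blockStart = (ord / BLOCK_SIZE) * BLOCK_SIZE;
    String term = null;
    // Scan forward within the block, standing in for decoding terms in sequence.
    for (int i = blockStart; i <= ord; i++) {
      term = sortedTerms[i];
    }
    return term;
  }

  /** Hypothetical bulk lookup: visit ordinals in sorted order, seek once per block. */
  String[] lookupOrds(int[] ords) {
    Integer[] positions = new Integer[ords.length];
    for (int i = 0; i < ords.length; i++) {
      positions[i] = i;
    }
    // Sort positions by ordinal so results can be written back in input order.
    Arrays.sort(positions, Comparator.comparingInt(p -> ords[p]));

    String[] result = new String[ords.length];
    int currentBlock = -1;
    for (int p : positions) {
      int block = ords[p] / BLOCK_SIZE;
      if (block != currentBlock) {
        seeks++; // only seek when we cross into a new block
        currentBlock = block;
      }
      result[p] = sortedTerms[ords[p]];
    }
    return result;
  }

  public static void main(String[] args) {
    String[] terms = {"a", "b", "c", "d", "e", "f", "g", "h"};
    int[] ords = {5, 1, 4, 0, 6};

    BulkOrdLookupSketch single = new BulkOrdLookupSketch(terms);
    for (int ord : ords) {
      single.lookupOrd(ord);
    }
    BulkOrdLookupSketch bulk = new BulkOrdLookupSketch(terms);
    String[] paths = bulk.lookupOrds(ords);

    System.out.println("one-at-a-time seeks: " + single.seeks);
    System.out.println("bulk seeks: " + bulk.seeks);
    System.out.println("bulk result: " + Arrays.toString(paths));
  }
}
```

In this toy model the five ordinals fall into two blocks, so the one-at-a-time path counts five seeks and the bulk path counts two. Whether anything like that carries over to real SSDV fields depends entirely on the actual on-disk representation, which is the open question.
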

I don't know anything about the actual data representation behind the TermsDict in an SSDV field, so I'm wondering if someone here can give a high-level sense of whether or not there might be an advantage to looking up ordinals in bulk for SSDV fields in general. I'm going to dig into the code anyway (curious!), but thought I'd raise the question here as well.

Any thoughts?

Cheers,
-Greg
