gsmiller commented on pull request #509: URL: https://github.com/apache/lucene/pull/509#issuecomment-992752034
Ah, I think I understand the need for retaining the tree structure now. With taxonomy faceting, we generally "roll up" ancestry counts at construction time, but with this approach we can't do that because ordinals may not exist for some of the ancestors, right? So we need this structure to essentially "roll up" counts when doing the top-n or specific-value retrieval.

Have you considered the multi-value scenario? What happens if a document has multiple values for a faceting field that share common ancestry? We'll double-count ancestors, right? Taxonomy-based faceting handles this at indexing time based on the `DimConfig` specified for the field. So if a field is configured as multi-valued, we explicitly index all of the ancestry values and skip the "roll up", which avoids the double-counting problem. I wonder if we need to do something similar here as well?
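
To make the double-counting concern concrete, here is a minimal sketch in plain Java. It does not use Lucene's API; the class, the method names, and the `/`-separated path encoding are all assumptions made for illustration. One document carries two values, `a/b` and `a/c`, that share the ancestor `a`. Rolling leaf counts up into ancestors counts `a` twice, while collecting each document's ancestors into a set first (roughly what indexing all ancestry values for a multi-valued field achieves) counts it once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RollUpDoubleCount {

    // Hypothetical helper: proper ancestors of a '/'-separated path,
    // e.g. "a/b/c" -> ["a", "a/b"].
    public static List<String> ancestors(String path) {
        List<String> out = new ArrayList<>();
        int idx = path.indexOf('/');
        while (idx >= 0) {
            out.add(path.substring(0, idx));
            idx = path.indexOf('/', idx + 1);
        }
        return out;
    }

    // Count each indexed value, then "roll up" by adding every value's
    // count to all of its ancestors. Shared ancestors are counted once
    // per value, not once per document.
    public static Map<String, Integer> countWithRollUp(List<List<String>> docs) {
        Map<String, Integer> valueCounts = new HashMap<>();
        for (List<String> doc : docs) {
            for (String value : doc) {
                valueCounts.merge(value, 1, Integer::sum);
            }
        }
        Map<String, Integer> counts = new HashMap<>(valueCounts);
        for (Map.Entry<String, Integer> e : valueCounts.entrySet()) {
            for (String ancestor : ancestors(e.getKey())) {
                counts.merge(ancestor, e.getValue(), Integer::sum);
            }
        }
        return counts;
    }

    // Gather each document's values and ancestors into a set first, so
    // every path is counted at most once per document. No roll-up step.
    public static Map<String, Integer> countPerDocument(List<List<String>> docs) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> seen = new HashSet<>();
            for (String value : doc) {
                seen.add(value);
                seen.addAll(ancestors(value));
            }
            for (String path : seen) {
                counts.merge(path, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // One document with two values that share the ancestor "a".
        List<List<String>> docs = List.of(List.of("a/b", "a/c"));
        System.out.println("roll-up count for a: " + countWithRollUp(docs).get("a"));
        System.out.println("per-document count for a: " + countPerDocument(docs).get("a"));
    }
}
```

With a single matching document, the roll-up variant reports 2 for `a` and the per-document variant reports 1. Whether the fix belongs at indexing time or at count time for this PR is exactly the open question above; the sketch only shows why the two strategies diverge.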