Hi,

In OpenSearch, we've been taking advantage of the O(1) Weight#count() implementations to compute some aggregations quickly, without iterating over all the matching documents (at least when the top-level query is functionally a match-all at the segment level). From what I've seen, though, every clever Weight#count() implementation falls apart (returns -1) as soon as the segment has deletes.

I was thinking that we could still handle small numbers of deletes efficiently if only we could get a DocIdSetIterator over the deleted docs. For example, suppose you're doing a date histogram aggregation: you could get the counts for each bucket from the points tree (ignoring deletes), then iterate through the deleted docs and subtract each one from the relevant bucket (determined by a docvalues lookup). Assuming the number of deleted docs is small, it should be cheap, right?

The current LiveDocs implementation is just a FixedBitSet, so AFAIK it's not great for iteration. I'm imagining adding a supplementary "deleted docs iterator" that would sit next to the FixedBitSet if and only if the number of deletes is "small".

Is there a better way that I should be thinking about this?

Thanks,
Froh
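
P.S. Here is a rough sketch of the idea in plain Java, not real Lucene code. Everything in it is a stand-in: java.util.BitSet plays the role of the FixedBitSet behind LiveDocs, a long[] plays the role of the docvalues lookup, and the class and method names (DeleteAdjustedCounts, smallDeletedDocs, subtractDeletes) are made up for illustration.

```java
import java.util.BitSet;

/** Hypothetical sketch only; none of these names exist in Lucene. */
final class DeleteAdjustedCounts {

    /**
     * Builds the supplementary "deleted docs" list: sorted doc IDs whose live bit is clear.
     * Returns null when the segment has more than maxDeletes deletes, in which case
     * the caller would fall back to the normal per-document path.
     * Assumes liveDocs has no bits set at or beyond maxDoc.
     */
    static int[] smallDeletedDocs(BitSet liveDocs, int maxDoc, int maxDeletes) {
        int numDeletes = maxDoc - liveDocs.cardinality();
        if (numDeletes > maxDeletes) {
            return null;
        }
        int[] deleted = new int[numDeletes];
        int i = 0;
        for (int doc = liveDocs.nextClearBit(0); doc < maxDoc; doc = liveDocs.nextClearBit(doc + 1)) {
            deleted[i++] = doc;
        }
        return deleted;
    }

    /**
     * Starts from per-bucket counts that ignore deletes (e.g. as read from the points tree)
     * and subtracts one for each deleted doc from the bucket its timestamp falls in.
     * timestamps[doc] stands in for a docvalues lookup; every timestamp is assumed to be
     * at or after minTimestamp and to fall inside the range covered by rawCounts.
     */
    static long[] subtractDeletes(long[] rawCounts, int[] deletedDocs,
                                  long[] timestamps, long minTimestamp, long interval) {
        long[] counts = rawCounts.clone();
        for (int doc : deletedDocs) {
            int bucket = (int) ((timestamps[doc] - minTimestamp) / interval);
            counts[bucket]--;
        }
        return counts;
    }
}
```

The cost of subtractDeletes is proportional to the number of deleted docs rather than to maxDoc, which is the whole point. Building the list still scans the bit set once, so it would have to be built once per segment and reused, not rebuilt per query.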