benwtrent opened a new issue, #15561:
URL: https://github.com/apache/lucene/issues/15561

   ### Description
   
   I have been digging into the `AcceptDocs` API and noticed the following in the Javadoc:
   
   ```
     /**
      * Return an approximation of the number of accepted documents. This is typically useful to decide
      * whether to consume these accept docs using random access ({@link #bits()}) or sequential access
      * ({@link #iterator()}).
      *
      * <p><b>NOTE</b>: This must not be called after {@link #iterator()}.
      *
      * @return approximate cost
      */
     public abstract int cost() throws IOException;
   ```
   
   However, the implementation for the most common non-cached iterator is:
   
   ```
       public int cost() throws IOException {
         createBitSetAcceptDocsIfNecessary();
         return acceptBitSet.cardinality();
       }
   ```
   
   This fully consumes the iterator and then just calls `cardinality()` (nothing approximate at all...).
   
   Why are we doing that? Why aren't we relying on `DocIdSetIterator#cost` or at least `acceptBitSet.approximateCardinality()`?
   
   It seems to me the main idea behind AcceptDocs is the ability to bypass 
realizing the bitset and to just iterate as normal when the filter is very 
restrictive...
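
To make the suggestion concrete, here is a minimal sketch of what a lazy `cost()` could look like. It is not Lucene code: `LazyAcceptDocsSketch`, `CostedIterator`, `overSortedDocs`, `preferSequentialAccess` and the `maxDoc / 100` threshold are all hypothetical names and values made up for illustration, and `java.util.BitSet` stands in for Lucene's own bit set classes. The point is only that `cost()` can return the iterator's estimate and leave the bit set unrealized until `bits()` is actually requested.

```java
import java.util.BitSet;

/** Hypothetical stand-in for an accept-docs filter; NOT Lucene's AcceptDocs. */
public class LazyAcceptDocsSketch {
  public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical iterator that exposes a cheap cost estimate. */
  public interface CostedIterator {
    long cost();   // estimate of the number of matches; never advances the iterator
    int nextDoc(); // next doc id, or NO_MORE_DOCS when exhausted
  }

  /** Test helper (assumption): an iterator over sorted doc ids whose cost is the array length. */
  public static CostedIterator overSortedDocs(int... docs) {
    return new CostedIterator() {
      private int pos = 0;

      @Override
      public long cost() {
        return docs.length;
      }

      @Override
      public int nextDoc() {
        return pos < docs.length ? docs[pos++] : NO_MORE_DOCS;
      }
    };
  }

  private final CostedIterator iterator;
  private final int maxDoc;
  private BitSet bits; // stays null until random access is requested

  public LazyAcceptDocsSketch(CostedIterator iterator, int maxDoc) {
    this.iterator = iterator;
    this.maxDoc = maxDoc;
  }

  /** Approximate cost: uses the iterator's estimate unless the bit set already exists. */
  public int cost() {
    if (bits != null) {
      return bits.cardinality(); // already paid for, so the exact count is free to use
    }
    return (int) Math.min(iterator.cost(), (long) maxDoc);
  }

  /** Random access: this is the only place where the iterator is fully consumed. */
  public BitSet bits() {
    if (bits == null) {
      bits = new BitSet(maxDoc);
      for (int doc = iterator.nextDoc(); doc != NO_MORE_DOCS; doc = iterator.nextDoc()) {
        bits.set(doc);
      }
    }
    return bits;
  }

  public boolean isMaterialized() {
    return bits != null;
  }

  /** Arbitrary illustrative threshold: prefer iteration when under ~1% of docs match. */
  public boolean preferSequentialAccess() {
    return cost() < maxDoc / 100;
  }
}
```

With a restrictive filter, a caller of this sketch can check `preferSequentialAccess()` and go straight to iteration without ever building the bit set. The trade-off is that the estimate is only as good as the underlying iterator's cost.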
   
   
   //cc @shubhamvishu 
   
   ### Version and environment details
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

