On Fri, Apr 30, 2010 at 1:15 PM, Burton-West, Tom <[email protected]> wrote:
> I’m a bit confused about the DocsEnum.read() in the flex API. I have three
> questions:
>
> DocsEnum.read() currently delegates to nextDoc() in the base class and there
> is a note that subclasses may do this more efficiently. Is there currently
> a more efficient implementation in a subclass? I didn’t see one in
> MultiDocsEnum or MappingMultiDocsEnum, but perhaps I’m not understanding the
> code.

Yes, the standard codec does so (StandardPostingsReaderImpl.java).

MultiDocsEnum doesn't... but you should not use that (if performance
is important). Instead you should go segment by segment.

> DocsEnum.read reads 64 docs/freqs at a time as set up in initBulkResult().
> Would it make sense to have this configurable as an argument somewhere?
> I’m looking at very large indexes where a common term might occur in 100,000
> or more docs.

We could do that... maybe .getBulkResult should take a "suggested size"?

It'd just be a suggestion though, since e.g. block-based codecs would
presumably return to you a direct slice into their underlying int[]
buffers.

> At the very top of the JavaDoc there is a warning “you must first call
> nextDoc” It seems that this applies to calling DocsEnum.docID() or
> DocsEnum.freq() but not to DocsEnum.read(). Is that correct?

That's right -- I just committed a small fix to the jdoc to clarify this.

Mike
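
For readers following the thread, here is a minimal, self-contained sketch of the pattern discussed above: go segment by segment, call a bulk read() without calling nextDoc() first, and pass a buffer size as a hint. It does not use Lucene. SegmentDocs, initBulkResult(int), bulkFreqs() and totalFreq() are hypothetical stand-ins invented for illustration, and the "suggested size" argument is only the idea floated in the reply, not an existing API.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkReadSketch {

    /** Hypothetical stand-in for one segment's postings; NOT Lucene's DocsEnum. */
    static class SegmentDocs {
        private final int[] docs;   // doc IDs for one term in this segment
        private final int[] freqs;  // term frequency in each of those docs
        private int pos = 0;        // next posting to hand out
        private int[] bulkDocs;
        private int[] bulkFreqs;

        SegmentDocs(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        /** Allocates the bulk buffers. The size is a hint; this sketch honors it (minimum 1). */
        void initBulkResult(int suggestedSize) {
            int size = Math.max(1, suggestedSize);
            bulkDocs = new int[size];
            bulkFreqs = new int[size];
        }

        int[] bulkDocs() { return bulkDocs; }
        int[] bulkFreqs() { return bulkFreqs; }

        /** Fills the buffers with the next postings; returns the count, or 0 when exhausted. */
        int read() {
            int n = Math.min(bulkDocs.length, docs.length - pos);
            System.arraycopy(docs, pos, bulkDocs, 0, n);
            System.arraycopy(freqs, pos, bulkFreqs, 0, n);
            pos += n;
            return n;
        }
    }

    /** Sums term frequencies one segment at a time, using bulk reads only. */
    static long totalFreq(List<SegmentDocs> segments, int suggestedSize) {
        long total = 0;
        for (SegmentDocs seg : segments) {
            seg.initBulkResult(suggestedSize);
            int[] freqs = seg.bulkFreqs();
            int count;
            // No nextDoc() call is needed before read() in this sketch.
            while ((count = seg.read()) > 0) {
                for (int i = 0; i < count; i++) {
                    total += freqs[i];
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<SegmentDocs> segments = new ArrayList<>();
        segments.add(new SegmentDocs(new int[] {1, 5, 9}, new int[] {2, 1, 3}));
        segments.add(new SegmentDocs(new int[] {0, 4}, new int[] {4, 1}));
        System.out.println(totalFreq(segments, 64));
    }
}
```

In this sketch a larger hint only reduces the number of read() calls. A block-based codec, as noted in the reply, might ignore the hint and hand back a slice of its own buffer instead.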
