[
https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592366#comment-13592366
]
Adrien Grand commented on LUCENE-4752:
--------------------------------------
bq. How can you early terminate a query for a single segment? [...] To early
terminate efficiently, you must have the segments in a consistent order, e.g.
S1 > S2 > S3.
I think this is just an API limitation? Since segments are processed
independently, we should be able to terminate collection on a per-segment basis.
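To illustrate the idea, here is a minimal sketch of per-segment termination. It
assumes each segment already stores its documents in the order used for ranking,
so the first n matches of a segment are that segment's best n. SortedSegment and
PerSegmentTopN are hypothetical stand-ins, not Lucene classes, and each document
is reduced to a single long sort key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a segment whose matching documents are already
// stored in ranking order (smaller key = better). Not a Lucene class.
final class SortedSegment {
    final long[] keys; // sort keys of the matching docs, in ascending order

    SortedSegment(long... keys) {
        this.keys = keys;
    }
}

final class PerSegmentTopN {
    // Returns the n smallest keys across all segments, visiting at most n
    // documents in each segment.
    static List<Long> topN(List<SortedSegment> segments, int n) {
        List<Long> candidates = new ArrayList<>();
        for (SortedSegment segment : segments) {
            int collected = 0;
            for (long key : segment.keys) {
                if (collected == n) {
                    break; // terminate this segment only; others still run
                }
                candidates.add(key);
                collected++;
            }
        }
        // Merge step: at most n candidates per segment need to be compared.
        Collections.sort(candidates);
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }
}
```

Each segment stops after n documents regardless of how the segments relate to
one another, so no consistent ordering such as S1 > S2 > S3 is needed; the price
is a final merge over at most n candidates per segment.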
bq. instead of stuffing into IWC what seems like a random setting
(pick-segments-for-sorting), we should have something more generic, like
AtomicReaderFactory
I didn't mean this should be a boolean. Of course it should be something more
flexible/configurable! I'm very bad at picking names, but following your naming
suggestion, we could have something like
{code}
abstract class AtomicReaderFactory {
abstract List<AtomicReader> reorder(List<SegmentReader> segmentReaders);
}
{code}?
The default impl would be the identity whereas the sorting impl would return a
singleton containing a sorted view over the segment readers?
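A rough sketch of the two implementations, under loud assumptions: DocReader,
ArrayReader, ReaderFactory, IdentityFactory and SortingFactory are hypothetical
names that only mirror the shape of the proposal above. They are not Lucene's
AtomicReader or SegmentReader, and a document is modeled as nothing more than a
long sort key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a reader: a document is just a long sort key.
interface DocReader {
    int maxDoc();
    long key(int docId);
}

// Simple array-backed reader, standing in for one segment.
final class ArrayReader implements DocReader {
    private final long[] keys;

    ArrayReader(long... keys) {
        this.keys = keys;
    }

    @Override public int maxDoc() { return keys.length; }
    @Override public long key(int docId) { return keys[docId]; }
}

// Mirrors the shape of the proposed factory, using the stand-in types.
abstract class ReaderFactory {
    abstract List<DocReader> reorder(List<DocReader> segmentReaders);
}

// Default behavior: hand the segment readers back unchanged.
final class IdentityFactory extends ReaderFactory {
    @Override
    List<DocReader> reorder(List<DocReader> segmentReaders) {
        return segmentReaders;
    }
}

// Sorting behavior: a single view exposing every document in key order.
final class SortingFactory extends ReaderFactory {
    @Override
    List<DocReader> reorder(List<DocReader> segmentReaders) {
        // Each entry is {reader index, doc id within that reader}.
        List<int[]> docs = new ArrayList<>();
        for (int r = 0; r < segmentReaders.size(); r++) {
            for (int d = 0; d < segmentReaders.get(r).maxDoc(); d++) {
                docs.add(new int[] {r, d});
            }
        }
        docs.sort(Comparator.comparingLong(
            (int[] ref) -> segmentReaders.get(ref[0]).key(ref[1])));

        // The view maps new doc ids onto the underlying readers; no data is copied.
        DocReader view = new DocReader() {
            @Override public int maxDoc() { return docs.size(); }
            @Override public long key(int docId) {
                int[] ref = docs.get(docId);
                return segmentReaders.get(ref[0]).key(ref[1]);
            }
        };
        return Collections.singletonList(view);
    }
}
```

With this shape, the merge code only ever iterates over whatever list the
factory returns, so it would not need to know whether sorting is enabled.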
bq. Also, a custom SegmentMerger to implement the zig-zag merge would help too.
This is another option. I actually started exploring this option when David
opened this issue, but it can become complicated, especially for postings lists
merging, whereas reusing the sorted view from LUCENE-3918 would make merging
trivial.
> Merge segments to sort them
> ---------------------------
>
> Key: LUCENE-4752
> URL: https://issues.apache.org/jira/browse/LUCENE-4752
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/index
> Reporter: David Smiley
> Assignee: Adrien Grand
>
> It would be awesome if Lucene could write the documents out in a segment
> based on a configurable order. This of course applies to merging segments
> too. The benefit is increased locality on disk of documents that are likely to
> be accessed together. This often applies to documents near each other in
> time, but also spatially.