[ 
https://issues.apache.org/jira/browse/LUCENE-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15255402#comment-15255402
 ] 

Michael McCandless commented on LUCENE-6766:
--------------------------------------------

bq. I think a challenge to sorting flushed segments is how we write stored 
fields and term vectors directly to the directory at index time. We should 
somehow buffer them in memory and sort on flush when a non-default sort order 
is configured? Or do you see an easier way?

Hmm tricky.  Yeah, we could buffer in heap if IWC.indexSort is set, or ... we 
could just write as we do today, but then ask the codec for a stored fields 
(and term vectors) reader to do the sorting at flush time.
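
As a rough illustration of the second option (write as today, then reorder at flush), the sketch below computes the sorted document order from per-document sort keys and rewrites already-written stored fields in that order. It is a minimal sketch under assumptions: {{FlushSortSketch}}, {{sortedOrder}} and {{reorder}} are hypothetical names, a single {{long}} sort key per document is assumed, and a {{String[]}} stands in for whatever a codec's stored fields reader would return. None of this is Lucene API.

```java
import java.util.Arrays;

/** Hypothetical sketch of sorting a flushed segment; not Lucene API. */
public class FlushSortSketch {

    /**
     * Returns newToOld, where newToOld[i] is the original (write-order) doc ID
     * of the document that ends up at position i after sorting.
     */
    static int[] sortedOrder(long[] sortKeys) {
        Integer[] docs = new Integer[sortKeys.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        // Arrays.sort on objects is stable, so equal keys keep write order.
        Arrays.sort(docs, (a, b) -> Long.compare(sortKeys[a], sortKeys[b]));
        int[] newToOld = new int[docs.length];
        for (int i = 0; i < docs.length; i++) {
            newToOld[i] = docs[i];
        }
        return newToOld;
    }

    /** Re-reads documents written in index order and emits them in sorted order. */
    static String[] reorder(String[] storedFields, int[] newToOld) {
        String[] sorted = new String[storedFields.length];
        for (int newDoc = 0; newDoc < newToOld.length; newDoc++) {
            sorted[newDoc] = storedFields[newToOld[newDoc]];
        }
        return sorted;
    }

    public static void main(String[] args) {
        long[] keys = {30, 10, 20};
        String[] written = {"c", "a", "b"};
        System.out.println(Arrays.toString(reorder(written, sortedOrder(keys))));
    }
}
```

The heap-buffering option would use the same permutation; the difference is only whether the documents being reordered come from memory or are read back through the codec.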

Or we separate "sorting on flushed segments" out for the future, keeping 
{{SortingLeafReader}}, since the rest of this is already plenty hard, and focus 
here on making merging more efficient (don't use 
{{SlowCompositeReaderWrapper}})?  I think it would mean fixing the default merge 
impls ... today they all assume they concatenate each segment's documents 
sequentially (mapping around deletions), but with indexSort in use, they just 
need to merge sort instead.  Maybe we can abstract "concat vs merge sort" away 
so that all default merge impls could re-use it ... seems like it could be 
fairly clean.
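
One way to picture the "concat vs merge sort" abstraction is sketched below. It is only an illustration under assumptions: {{MergeOrderSketch}}, {{Segment}}, {{MergeOrder}}, {{CONCAT}} and {{MERGE_SORT}} are hypothetical names, not existing Lucene classes, and each segment is assumed to expose one {{long}} sort key per document, already in sorted order within the segment. Both strategies return the order in which (segment, doc) pairs should be visited, skipping deleted documents, so a merge impl written against {{MergeOrder}} would not need to know which one is in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a shared "concat vs merge sort" helper; not Lucene API. */
public class MergeOrderSketch {

    /** One segment: a sort key per doc (sorted within the segment) and live flags. */
    static final class Segment {
        final long[] sortKeys;
        final boolean[] live; // false means the doc is deleted

        Segment(long[] sortKeys, boolean[] live) {
            this.sortKeys = sortKeys;
            this.live = live;
        }

        /** First live doc at or after {@code from}, or live.length if none. */
        int nextLive(int from) {
            int doc = from;
            while (doc < live.length && !live[doc]) {
                doc++;
            }
            return doc;
        }
    }

    /** Yields {segment, doc} pairs in the order the merged segment should contain them. */
    interface MergeOrder {
        List<int[]> order(List<Segment> segments);
    }

    /** Today's behavior: segments one after another, mapping around deletions. */
    static final MergeOrder CONCAT = segments -> {
        List<int[]> out = new ArrayList<>();
        for (int s = 0; s < segments.size(); s++) {
            Segment seg = segments.get(s);
            for (int d = seg.nextLive(0); d < seg.live.length; d = seg.nextLive(d + 1)) {
                out.add(new int[] {s, d});
            }
        }
        return out;
    };

    /** With an index sort: k-way merge of the segments by sort key. */
    static final MergeOrder MERGE_SORT = segments -> {
        // Ties are broken by segment index so the result is deterministic.
        Comparator<int[]> byKeyThenSegment = (a, b) -> {
            int c = Long.compare(segments.get(a[0]).sortKeys[a[1]],
                                 segments.get(b[0]).sortKeys[b[1]]);
            return c != 0 ? c : Integer.compare(a[0], b[0]);
        };
        PriorityQueue<int[]> heads = new PriorityQueue<>(byKeyThenSegment);
        for (int s = 0; s < segments.size(); s++) {
            int d = segments.get(s).nextLive(0);
            if (d < segments.get(s).live.length) {
                heads.add(new int[] {s, d});
            }
        }
        List<int[]> out = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();
            out.add(head);
            Segment seg = segments.get(head[0]);
            int next = seg.nextLive(head[1] + 1);
            if (next < seg.live.length) {
                heads.add(new int[] {head[0], next});
            }
        }
        return out;
    };

    /** Formats pairs as "segment:doc" separated by spaces. */
    static String describe(List<int[]> pairs) {
        StringBuilder sb = new StringBuilder();
        for (int[] p : pairs) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(p[0]).append(':').append(p[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Segment> segs = Arrays.asList(
            new Segment(new long[] {1, 5, 9}, new boolean[] {true, false, true}),
            new Segment(new long[] {2, 3, 10}, new boolean[] {true, true, true}));
        System.out.println(describe(CONCAT.order(segs)));     // 0:0 0:2 1:0 1:1 1:2
        System.out.println(describe(MERGE_SORT.order(segs))); // 0:0 1:0 1:1 0:2 1:2
    }
}
```

A real implementation would presumably stream this order rather than materialize a list, but the split between "how documents are ordered" and "how each format is written" is the point of the sketch.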

> Make index sorting a first-class citizen
> ----------------------------------------
>
>                 Key: LUCENE-6766
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6766
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6766.patch
>
>
> Today index sorting is a very expert feature. You need to use a custom merge 
> policy, custom collectors, etc. I would like to explore making it a 
> first-class citizen so that:
>  - the sort order could be configured on IndexWriterConfig
>  - segments would record the sort order that was used to write them
>  - IndexSearcher could automatically early terminate when computing top docs 
> on a sort order that is a prefix of the sort order of a segment (and if the 
> user is not interested in totalHits).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

