SortingAtomicReader uses the TimSort algorithm, which performs well when its input already consists of sorted runs, as it does here where each of the two segments is already sorted. So that's still the way to do it, even if it looks like it does more work than it should.
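To make the TimSort point concrete, here is a small standalone Java sketch. It is not Lucene code, and the class and method names are made up for the example. It uses the JDK's List.sort (a TimSort for object lists in OpenJDK) as a stand-in and counts comparator calls when sorting two already-sorted "segments" laid end to end, versus the same keys shuffled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Standalone illustration only; not Lucene code. All names here are hypothetical.
class TimSortRunsDemo {

    /** Sorts a copy of the input with List.sort and returns how many comparisons it took. */
    static long countComparisons(List<Integer> input) {
        long[] count = {0};
        Comparator<Integer> counting = (a, b) -> {
            count[0]++;
            return Integer.compare(a, b);
        };
        List<Integer> copy = new ArrayList<>(input);
        copy.sort(counting); // TimSort in OpenJDK for object lists
        return count[0];
    }

    /** Two already-sorted "segments" laid end to end: all evens, then all odds. */
    static List<Integer> twoSortedRuns(int perSegment) {
        List<Integer> keys = new ArrayList<>(2 * perSegment);
        for (int i = 0; i < perSegment; i++) keys.add(2 * i);
        for (int i = 0; i < perSegment; i++) keys.add(2 * i + 1);
        return keys;
    }

    /** The same keys in random order, using a fixed seed for repeatability. */
    static List<Integer> shuffledCopy(List<Integer> keys, long seed) {
        List<Integer> copy = new ArrayList<>(keys);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> runs = twoSortedRuns(50_000);
        System.out.println("two sorted runs: " + countComparisons(runs) + " comparisons");
        System.out.println("shuffled:        " + countComparisons(shuffledCopy(runs, 42L)) + " comparisons");
    }
}
```

I would expect the two-run input to need far fewer comparisons, since TimSort detects the two existing runs and merges them instead of sorting from scratch. The exact numbers depend on the JDK, so treat this as an illustration of the behaviour rather than a benchmark of SortingAtomicReader itself.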
Shai

On Wed, Oct 23, 2013 at 10:46 PM, Arvind Kalyan <bas...@gmail.com> wrote:

> Thanks, my understanding is that SortingMergePolicy performs sorting after
> wrapping the 2 segments, correct?
>
> As I mentioned in my original email I would like to avoid the re-sorting
> and exploit the fact that the input segments are already sorted.
>
> On Wed, Oct 23, 2013 at 11:02 AM, Shai Erera <ser...@gmail.com> wrote:
>
> > Hi
> >
> > You can use SortingMergePolicy and SortingAtomicReader to achieve that.
> > You can read more about index sorting here:
> > http://shaierera.blogspot.com/2013/04/index-sorting-with-lucene.html
> >
> > Shai
> >
> > On Wed, Oct 23, 2013 at 8:13 PM, Arvind Kalyan <bas...@gmail.com> wrote:
> >
> > > Hi there, I'm looking for pointers, suggestions on how to approach
> > > this in Lucene 4.5.
> > >
> > > Say I am creating an index using a sequence of addDocument() calls
> > > and end up with segments that each contain documents in a specified
> > > ordering. It is guaranteed that there won't be updates/deletes/reads
> > > etc happening on the index -- this is an offline index building task
> > > for a read-only index.
> > >
> > > I create the index in the above mentioned fashion using
> > > LogByteSizeMergePolicy and finally do a forceMerge(1) to get a single
> > > segment in the ordering I want.
> > >
> > > Now my requirement is that I need to be able to merge this single
> > > segment with another such segment (say from yesterday's index) and
> > > guarantee some ordering -- say I have a comparator which looks at
> > > some field values in the 2 given docs and defines the ordering.
> > >
> > > Index 1 with segment X:
> > > (a,1)
> > > (b,2)
> > > (e,10)
> > >
> > > Index 2 (say from yesterday) with some segment Y:
> > > (c,4)
> > > (d,6)
> > >
> > > Essentially we have 2 ordered segments, and I'm looking to 'merge'
> > > them (literally) using the value of some field, without having to
> > > re-sort them which would be too time & resource consuming.
> > >
> > > Output Index, with some segment Z:
> > > (a,1)
> > > (b,2)
> > > (c,4)
> > > (d,6)
> > > (e,10)
> > >
> > > Is this already possible? If not, any tips on how I can approach
> > > implementing this requirement?
> > >
> > > Thanks,
> > >
> > > --
> > > Arvind Kalyan
>
> --
> Arvind Kalyan
> http://www.linkedin.com/in/base16
> cell: (408) 761-2030