loadSortTerm is your method, right? In the current Sorter.sort
implementation, I see this code:
    boolean sorted = true;
    for (int i = 1; i < maxDoc; ++i) {
      if (comparator.compare(i-1, i) > 0) {
        sorted = false;
        break;
      }
    }
    if (sorted) {
      return null;
    }
Perhaps you can write similar code?
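Something along these lines, as a minimal sketch only: DocComparator below is a hypothetical stand-in for Sorter.DocComparator so that the snippet runs without Lucene on the classpath, and the String[] of sort keys is made up for illustration.

```java
final class SortedCheck {

    /** Hypothetical stand-in for Sorter.DocComparator (not the Lucene class). */
    interface DocComparator {
        int compare(int docID1, int docID2);
    }

    /**
     * Returns true if docs 0..maxDoc-1 are already in non-decreasing order.
     * Stops at the first out-of-order adjacent pair.
     */
    static boolean isSorted(int maxDoc, DocComparator comparator) {
        for (int i = 1; i < maxDoc; ++i) {
            if (comparator.compare(i - 1, i) > 0) {
                return false;
            }
        }
        return true;
    }

    /** Builds a comparator over an in-memory array of sort keys (illustrative only). */
    static DocComparator byKey(String[] keys) {
        return (d1, d2) -> keys[d1].compareTo(keys[d2]);
    }

    public static void main(String[] args) {
        String[] sorted = {"a", "b", "b", "d"};
        String[] unsorted = {"a", "c", "b", "d"};
        System.out.println(isSorted(sorted.length, byKey(sorted)));     // prints "true"
        System.out.println(isSorted(unsorted.length, byKey(unsorted))); // prints "false"
    }
}
```

In your sort() you would run a check like this first and return null when it passes, and only build the full mapping otherwise.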
Also note that the sorting interface has changed (in 4.8, I think), so you
no longer really need to implement a Sorter; you can pass a SortField
instead, if that works for you.
Shai
On Tue, Jun 17, 2014 at 9:41 AM, Ravikumar Govindarajan <
[email protected]> wrote:
> Shai,
>
> This is the code snippet I use inside my class...
>
> public class MySorter extends Sorter {
>
>   @Override
>   public DocMap sort(AtomicReader reader) throws IOException {
>     final Map<Integer, BytesRef> docVsId = loadSortTerm(reader);
>
>     final Sorter.DocComparator comparator = new Sorter.DocComparator() {
>       @Override
>       public int compare(int docID1, int docID2) {
>         BytesRef v1 = docVsId.get(docID1);
>         BytesRef v2 = docVsId.get(docID2);
>         return v1.compareTo(v2);
>       }
>     };
>
>     return sort(reader.maxDoc(), comparator);
>   }
> }
>
> My problem is that the "AtomicReader" passed to the Sorter.sort method is
> actually a SlowCompositeReader, composed of a list of AtomicReaders, each
> of which is already sorted.
>
> I find this "loadSortTerm(compositeReader)" call a bit heavy, since it
> tries to load all the doc-to-term mappings eagerly...
>
> Are there some alternatives for this?
>
> --
> Ravi
>
>
> On Tue, Jun 17, 2014 at 10:58 AM, Shai Erera <[email protected]> wrote:
>
> > I'm not sure that I follow ... where do you see DocMap being loaded up
> > front? Specifically, Sorter.sort may return null if the readers are
> > already sorted ... I think we already optimized for the case where the
> > readers are sorted.
> >
> > Shai
> >
> >
> > On Tue, Jun 17, 2014 at 4:04 AM, Ravikumar Govindarajan <
> > [email protected]> wrote:
> >
> > > I am planning to use SortingMergePolicy where all the
> > > merge-participating segments are already sorted... I understand that I
> > > need to define a DocMap with old-new doc-id mappings.
> > >
> > > Is it possible to optimize the eager loading of DocMap and make it
> > > kind of lazy-load on demand?
> > >
> > > Ex: Pass List<AtomicReader> to the caller and ask for next new-old doc
> > > mapping..
> > >
> > > Since my segments are already sorted, I could save a little memory
> > > this way, instead of loading the full DocMap upfront.
> > >
> > > --
> > > Ravi
> > >
> >
>