I was planning to use EarlyTerminatingSortingCollector (ETSC) in conjunction
with SortingMergePolicy and got stuck.

In ETSC, we have

  @Override
  public void collect(int doc) throws IOException {
    in.collect(doc);
    if (++numCollected >= numDocsToCollect) {
      throw new CollectionTerminatedException();
    }
  }

I understand this collector is per-segment. I have one doubt about it.

Since a global sort order is difficult, I collect hits for each segment
and return the final "numDocsToCollect" results using a PQ (priority queue).

If my "numDocsToCollect" = 50 and the number of segments = 15, then
collector.collect() will be called up to 750 times.
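
To make the counting concrete, here is a minimal sketch of the per-segment
approach described above. It does not use Lucene at all: each "segment" is
just a list of sort keys assumed to be already in index-sort order, and the
class name PerSegmentMergeSketch, the method mergeTopN and the counter
collectCalls are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for the real Lucene classes: each "segment" is a list
// of sort keys assumed to be already sorted, as after a sorting merge.
class PerSegmentMergeSketch {

    // Number of simulated collect() calls made by the last mergeTopN run.
    static int collectCalls = 0;

    // Takes at most numDocsToCollect hits from each sorted segment (the early
    // termination step), then merges them through a priority queue.
    static List<Integer> mergeTopN(List<List<Integer>> segments, int numDocsToCollect) {
        collectCalls = 0;
        PriorityQueue<Integer> pq = new PriorityQueue<>(); // min-heap on the sort key
        for (List<Integer> segment : segments) {
            int numCollected = 0;
            for (int key : segment) {
                pq.add(key);   // stands in for in.collect(doc)
                collectCalls++;
                if (++numCollected >= numDocsToCollect) {
                    break;     // stands in for CollectionTerminatedException
                }
            }
        }
        // Drain the smallest keys to get the final global top hits.
        List<Integer> top = new ArrayList<>();
        while (top.size() < numDocsToCollect && !pq.isEmpty()) {
            top.add(pq.poll());
        }
        return top;
    }

    public static void main(String[] args) {
        int numSegments = 15, numDocsToCollect = 50, docsPerSegment = 1000;
        List<List<Integer>> segments = new ArrayList<>();
        for (int s = 0; s < numSegments; s++) {
            List<Integer> segment = new ArrayList<>();
            for (int d = 0; d < docsPerSegment; d++) {
                segment.add(d * numSegments + s); // ascending within each segment
            }
            segments.add(segment);
        }
        List<Integer> top = mergeTopN(segments, numDocsToCollect);
        System.out.println("collect() calls: " + collectCalls); // 15 * 50 = 750
        System.out.println("hits returned:   " + top.size());   // 50
    }
}
```

With 15 segments of 1000 matching docs each, the sketch makes 750 simulated
collect() calls instead of 15000, but still 15 times more than the 50 hits
that are finally returned.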

When I use a SortField instead, TopFieldDocs does the sorting for all
segments and collector.collect() will be called only 50 times.

Assuming a stored-field seek for every collector.collect() call, is it
still advisable to stick with ETSC? Was it introduced as a trade-off between
memory and disk?

Any help is much appreciated.

--

Ravi
