Hi all,
Is it possible to page through results when using an
EarlyTerminatingSortingCollector?
I'm using the following code with Lucene 5.5.0 to read results in pages of
10 documents each:
/** The Lucene field name */
private static final String FIELD_NAME = "id";

/** The Lucene field type */
private static final FieldType FIELD_TYPE = new FieldType();
static {
    FIELD_TYPE.setTokenized(true);
    FIELD_TYPE.setOmitNorms(true);
    FIELD_TYPE.setIndexOptions(IndexOptions.DOCS);
    FIELD_TYPE.setNumericType(FieldType.NumericType.INT);
    FIELD_TYPE.setDocValuesType(DocValuesType.NUMERIC);
    FIELD_TYPE.setStored(true);
    FIELD_TYPE.freeze();
}
public static void main(String[] args) throws Exception {

    // Sort to be used both with merge policy and queries
    Sort sort = new Sort(new SortedNumericSortField(FIELD_NAME, SortField.Type.INT));

    // Create directory
    RAMDirectory directory = new RAMDirectory();

    // Setup merge policy
    TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
    SortingMergePolicy sortingMergePolicy = new SortingMergePolicy(tieredMergePolicy, sort);

    // Setup index writer
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new SimpleAnalyzer());
    indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    indexWriterConfig.setMergePolicy(sortingMergePolicy);
    IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

    // Index values
    for (int i = 1; i <= 1000; i++) {
        Document document = new Document();
        document.add(new IntField(FIELD_NAME, i, FIELD_TYPE));
        indexWriter.addDocument(document);
    }

    // Force index merge to ensure early termination
    indexWriter.forceMerge(1, true);
    indexWriter.commit();

    // Create index searcher
    IndexReader reader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);

    // Paginated read
    int pageSize = 10;
    FieldDoc pageStart = null;
    while (true) {
        System.out.println(String.format("\nCollecting page starting at: %s", pageStart));
        Query query = new MatchAllDocsQuery();
        TopFieldCollector tfc = TopFieldCollector.create(sort, pageSize, pageStart, true, false, false);
        EarlyTerminatingSortingCollector collector =
                new EarlyTerminatingSortingCollector(tfc, sort, pageSize, sort);
        searcher.search(query, collector);
        ScoreDoc[] scoreDocs = tfc.topDocs().scoreDocs;
        for (ScoreDoc scoreDoc : scoreDocs) {
            // Remember the last hit so the next page starts after it
            pageStart = (FieldDoc) scoreDoc;
            Document document = searcher.doc(scoreDoc.doc);
            System.out.println(String.format("FOUND %s -> %s", document, scoreDoc));
        }
        System.out.println(String.format("Terminated early: %s", collector.terminatedEarly()));
        if (scoreDocs.length < pageSize) {
            break;
        }
    }

    // Close
    reader.close();
    indexWriter.close();
    directory.close();
}
However, the query for the second page returns no results. I do get the
expected results when I don't wrap the TopFieldCollector with the
EarlyTerminatingSortingCollector.
Is there something I am missing? Is EarlyTerminatingSortingCollector not
compatible with paging?
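
To make the question concrete, here is a library-free Java sketch of one possible explanation. It assumes, without my having confirmed it in the Lucene source, that the early-terminating wrapper counts every document it visits in a segment rather than every hit the paging collector accepts. The class PagingSketch and its page method are hypothetical names for illustration only and use no Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

/** Library-free model of search-after paging combined with early termination. */
class PagingSketch {

    /**
     * Returns one page from a segment whose values are already in sort order.
     *
     * @param sortedValues   values in index sort order
     * @param after          last value of the previous page, or null for the first page
     * @param pageSize       number of hits wanted
     * @param earlyTerminate if true, stop once pageSize values have been visited,
     *                       whether or not the paging filter accepted them
     *                       (this is the assumed behavior, not a confirmed one)
     */
    static List<Integer> page(List<Integer> sortedValues, Integer after,
                              int pageSize, boolean earlyTerminate) {
        List<Integer> hits = new ArrayList<>();
        int visited = 0;
        for (int value : sortedValues) {
            // Paging filter: keep only values strictly after the previous page.
            if ((after == null || value > after) && hits.size() < pageSize) {
                hits.add(value);
            }
            // Assumed wrapper behavior: count visited values, not accepted hits.
            if (earlyTerminate && ++visited >= pageSize) {
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            values.add(i);
        }
        System.out.println(page(values, null, 10, true));  // first page
        System.out.println(page(values, 10, 10, true));    // second page, early termination
        System.out.println(page(values, 10, 10, false));   // second page, no early termination
    }
}
```

In this model the second page comes back empty with early termination and contains 11 to 20 without it, which matches what I observe. If the assumption holds, the wrapper stops after visiting the first 10 documents of the segment, all of which belong to the previous page.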
Thanks in advance,
--
Andrés de la Peña