Toke, I just looked through the code, and we are already using this method:

    IndexSearcher indexSearcher = getIndexSearcher(searchResult);
    TopDocs topDocs;
    // Cursor: the last ScoreDoc seen so far; each page resumes after it.
    ScoreDoc currectScoreDoc = p.startScoreDoc;
    for (int page = 1; page < pages - 1; page++) {
        topDocs = indexSearcher.searchAfter(currectScoreDoc,
                query, queryFilter, searchResult.getPageSize(), sort);
        int endpos = topDocs.scoreDocs.length - 1;
        if (endpos > 0) {
            // Advance the running offset and move the cursor to the page's last hit.
            startIdx += topDocs.scoreDocs.length;
            currectScoreDoc = topDocs.scoreDocs[endpos];
            searchResult.setPage(currectScoreDoc, startIdx);
        }
        topDocs = null; // drop the page so it can be garbage collected
        if (searchResult.getCancelled()) {
            return searchResult;
        }
    }

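For reference, here is a minimal, self-contained sketch of the same cursor pattern, with the «to» counting folded in. It is an in-memory stand-in only: CursorPagingSketch, Doc, pageAfter and countRecipients are hypothetical names, not Lucene API, and pageAfter merely imitates the contract of searchAfter (return the next hits that sort strictly after the cursor). The loop stops on the first empty page and still advances the cursor when a page holds a single hit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for cursor-style paging; none of these names are Lucene API. */
final class CursorPagingSketch {

    /** Hypothetical document: sort key (archive date), tie-breaking id, «to» address. */
    static final class Doc {
        final long archiveDate;
        final int docId;
        final String to;

        Doc(long archiveDate, int docId, String to) {
            this.archiveDate = archiveDate;
            this.docId = docId;
            this.to = to;
        }
    }

    /**
     * Imitates searchAfter: returns up to pageSize docs that sort strictly after
     * the cursor (a null cursor means "start from the top"). The list must already
     * be sorted by (archiveDate, docId). The linear scan is for illustration only.
     */
    static List<Doc> pageAfter(List<Doc> sorted, Doc after, int pageSize) {
        List<Doc> page = new ArrayList<>();
        for (Doc d : sorted) {
            boolean pastCursor = after == null
                    || d.archiveDate > after.archiveDate
                    || (d.archiveDate == after.archiveDate && d.docId > after.docId);
            if (pastCursor) {
                page.add(d);
                if (page.size() == pageSize) {
                    break;
                }
            }
        }
        return page;
    }

    /** Walks all pages with a cursor and counts documents per «to» address. */
    static Map<String, Integer> countRecipients(List<Doc> sorted, int pageSize) {
        Map<String, Integer> counts = new HashMap<>();
        Doc cursor = null;
        while (true) {
            List<Doc> page = pageAfter(sorted, cursor, pageSize);
            if (page.isEmpty()) {
                break; // results exhausted, no need to request further pages
            }
            for (Doc d : page) {
                counts.merge(d.to, 1, Integer::sum);
            }
            cursor = page.get(page.size() - 1); // last hit becomes the new cursor
        }
        return counts;
    }
}
```

Calling CursorPagingSketch.countRecipients(docs, pageSize) on a list sorted by archive date returns one counter per «to» address; the real code would feed it from topDocs.scoreDocs instead.
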
> On 12 Nov 2015, at 20:42, Toke Eskildsen <[email protected]>
> wrote:
>
> Valentin Popov <[email protected]> wrote:
>
>> We have ~10 indexes holding 500M documents. Each document
>> has an «archive date» and a «to» address. One of our tasks is
>> to calculate statistics on «to» for the last year. Right now we
>> search for archive_date:(current_date - 1 year) and paginate
>> the results at 50k records per page. The bottleneck of that
>> approach is pagination: even on a powerful server the job takes
>> ~20 days to run, which is far too long.
>
> Lucene does not like deep page requests due to the way the internal Priority
> Queue works. Solr has CursorMark, which should be fairly simple to emulate in
> your Lucene handling code:
>
> http://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
>
> - Toke Eskildsen
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
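On the priority-queue point quoted above, a rough back-of-the-envelope sketch may help. As I understand it, an offset-style request for page k has to keep the first k * pageSize hits in the queue, while a cursor-style request only keeps pageSize hits per page. DeepPagingCost and its methods are hypothetical names, and the page count used in the note below is an assumed figure, not a measurement from our indexes.

```java
/** Back-of-the-envelope queue sizes for deep paging; all inputs are assumptions. */
final class DeepPagingCost {

    /** Offset paging: page k needs a queue of k * pageSize hits, summed over all pages. */
    static long offsetQueueSlots(long pages, long pageSize) {
        return pageSize * pages * (pages + 1) / 2;
    }

    /** Cursor paging: every page needs a queue of only pageSize hits. */
    static long cursorQueueSlots(long pages, long pageSize) {
        return pageSize * pages;
    }
}
```

With an assumed 1,000 pages of 50k records, offsetQueueSlots(1000, 50_000) gives 25,025,000,000 queue slots in total, against 50,000,000 for cursorQueueSlots(1000, 50_000). This only covers the queue: as far as I know, each page still re-runs the query over all matching documents in both styles.
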
Regards,
Valentin Popov