Hi,

Additionally, since the latest 3.x version (not sure if it's already in 3.4), there is a new searchAfter method in IndexSearcher that allows deep paging. As MultiSearcher is deprecated, searchAfter is not supported there, so use a MultiReader with a single IndexSearcher.
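To make that concrete, here is a minimal sketch of searchAfter-based paging over a MultiReader. It assumes a 3.x release that already has IndexSearcher.searchAfter (the message above is unsure whether 3.4 ships it); the method name pageThrough and the two Directory parameters are just placeholders for your own setup, not Lucene API.

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.MultiReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;

    public class SearchAfterPaging {
      // Walks the full result set one page at a time without re-collecting
      // the earlier pages on every request.
      static void pageThrough(Directory dir1, Directory dir2,
                              Query query, int pageSize) throws Exception {
        IndexReader reader = new MultiReader(
            IndexReader.open(dir1), IndexReader.open(dir2));
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
          ScoreDoc after = null;                      // bottom entry of the previous page
          while (true) {
            TopDocs page = (after == null)
                ? searcher.search(query, pageSize)
                : searcher.searchAfter(after, query, pageSize);
            if (page.scoreDocs.length == 0) {
              break;                                  // no more results
            }
            for (ScoreDoc sd : page.scoreDocs) {
              // process searcher.doc(sd.doc) here
            }
            after = page.scoreDocs[page.scoreDocs.length - 1];
          }
        } finally {
          searcher.close();
          reader.close();
        }
      }
    }

The key point is that searchAfter only collects hits that sort after the handed-in ScoreDoc, so later pages stay as cheap as the first one.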
Uwe

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

Uwe Schindler <u...@thetaphi.de> wrote:

Hi,

MultiReader is the way to go. MultiSearcher is broken and therefore deprecated. See the javadocs since Lucene 3.1.

Uwe

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

Alexander Devine <alex.dev...@gmail.com> wrote:

Hi all,

I'm trying to provide a way to efficiently let a client page over all of the documents in multiple Lucene indexes that I'm querying with a MultiSearcher (~1-2 million docs). Unfortunately, I can't use the standard paging algorithm of fetching a TopDocs up to the last record needed and then skipping all of the preceding pages, because the queries get extremely slow and memory usage becomes prohibitive as the client requests higher and higher page numbers.

My workaround was to run a search with an index-order sort (that is, sort by document ID); the client can then page over the results by running a query that says "get me all the documents whose doc ID is greater than the last doc ID of the previous page". This way the client only ever asks for a TopDocs the size of a single page, yet it can still walk forward and eventually get all the documents in the index.

While this works when searching over a single IndexReader, it fails when using a MultiSearcher for two reasons:

1. Sorting by doc ID doesn't really work in a MultiSearcher because of the way the searcher munges the IDs. For example, if there are 2 indexes each with 3 docs #1, #2 and #3, the MultiSearcher will return results that look like "1, 4, 2, 5, 3, 6".

2. The "MinimumDocIdQuery" I wrote only works when you pass it the ORIGINAL doc ID that is local to the index reader, not the one that was munged by the MultiSearcher.

Does anyone have advice on how to work around this? I was thinking that if I could somehow get the "local" document ID back from the MultiSearcher, that would work, since I could return it with my search results (and sorted by that ID things would look right, e.g. "1, 1, 2, 2, 3, 3").

If anyone has advice on how to better solve my original problem, that is, running over all of the documents in a potentially very large index with time- and memory-efficient paging, that would also be appreciated.

Thanks,
Alex
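For what it's worth, once the indexes sit behind a single MultiReader the doc IDs are contiguous and already in index order, so the "minimum doc ID" idea can be expressed as a plain Collector instead of a custom Query. Below is a minimal sketch against the Lucene 3.x Collector API; the class name MinDocIdCollector and the way the page is handed back are made up for illustration and are not part of Lucene.

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.Collector;
    import org.apache.lucene.search.Scorer;

    // Collects up to pageSize global doc IDs that are greater than the
    // last doc ID of the previous page (index-order paging).
    public class MinDocIdCollector extends Collector {
      private final int minGlobalDocId;   // last doc ID returned on the previous page
      private final int pageSize;
      private final List<Integer> page = new ArrayList<Integer>();
      private int docBase;                // offset of the current sub-reader

      public MinDocIdCollector(int minGlobalDocId, int pageSize) {
        this.minGlobalDocId = minGlobalDocId;
        this.pageSize = pageSize;
      }

      @Override
      public void setScorer(Scorer scorer) {
        // scores are not needed for index-order paging
      }

      @Override
      public void setNextReader(IndexReader reader, int docBase) {
        // global (MultiReader) doc ID = docBase + sub-reader-local doc ID
        this.docBase = docBase;
      }

      @Override
      public void collect(int doc) {
        int globalDoc = docBase + doc;
        if (globalDoc > minGlobalDocId && page.size() < pageSize) {
          page.add(globalDoc);
        }
      }

      @Override
      public boolean acceptsDocsOutOfOrder() {
        return false;                     // we rely on index order
      }

      public List<Integer> getPage() {
        return page;
      }
    }

You would drive it with something like searcher.search(query, new MinDocIdCollector(lastDocIdOfPreviousPage, pageSize)) on the IndexSearcher that wraps the MultiReader, and remember the last global doc ID of each page for the next request. Note that global doc IDs can change after merges, so this only stays stable as long as the reader is not reopened mid-pagination.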