Robert

Hmm... why did Mike go to all the trouble of implementing NRT search if we are not supposed to use it?

The user simply wants the latest result set. To me, this doesn't appear out of scope for the Lucene project.

Jamie

On 2014/06/03, 1:17 PM, Robert Muir wrote:
No, you are incorrect. The point of a search engine is to return the top-N
most relevant documents.

If you insist you need to open an indexreader on every single search,
and then return huge amounts of docs, maybe you should use a database
instead.

On Tue, Jun 3, 2014 at 6:42 AM, Jamie <ja...@mailarchiva.com> wrote:
Vitality / Robert

I wouldn't go so far as to call our pagination naive! Sub-optimal, yes.
Unless I am mistaken, the Lucene library's pagination mechanism assumes
that you will cache the ScoreDocs for the entire result set. This is not
practical when you have a result set that exceeds 60M. As stated earlier,
in any case, it is the first query that is slow.

We do open index readers, since we are using NRT search and documents
are being added to the indexes on a continuous basis. When the user clicks
on the Search button, the user will expect to see the latest result set.
With regard to NRT search, my understanding is that we do need to open the
index readers on each search operation to see the latest changes.

Thus, on each search, we open a reader from each corresponding writer and
combine the readers into a MultiReader.

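// Assumes each element of writers is a Lucene IndexWriter (Lucene 4.x API).
protected IndexReader initIndexReader() throws IOException {
     List<IndexReader> readers = new LinkedList<>();
     for (IndexWriter writer : writers) {
         // NRT reader: sees changes the writer has not yet committed
         readers.add(DirectoryReader.open(writer, true));
     }
     // true: closing the MultiReader also closes the sub-readers
     return new MultiReader(readers.toArray(new IndexReader[0]), true);
}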

Thank you for your ideas/suggestions.

Regards

Jamie

