Isn't this a use case for SolrJ with queryAndStreamResponse?

http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/client/solrj/SolrServer.html#queryAndStreamResponse%28org.apache.solr.common.params.SolrParams,%20org.apache.solr.client.solrj.StreamingResponseCallback%29

André

On 01/15/2013 04:49 PM, Mikhail Khludnev wrote:
It's a well-known limitation of search engines. This post will help you get
to the core of the problem:
http://www.searchworkings.org/blog/-/blogs/lucene-solr-and-deep-paging
It seems that a solution has been contributed to Lucene, but not yet to Solr.


On Tue, Jan 15, 2013 at 6:36 PM, Upayavira <u...@odoko.co.uk> wrote:

You are setting yourself up for disaster.

If you ask Solr for documents 1000 to 1010, it needs to sort documents 1
to 1010, and discard the first 1000, which causes horrible performance.
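To make the cost concrete, here is a minimal Java sketch of that "rank start+rows, discard start" behaviour. It is a simplified in-memory model, not Solr's actual code: the class name DeepPagingCost and the method page are made up for illustration, and plain integers stand in for document ids sorted in descending order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only: models the work done for one deep page.
final class DeepPagingCost {

    /**
     * Returns the ids ranked start .. start+rows-1 in descending order.
     * A bounded heap keeps the best (start + rows) ids seen so far, so
     * memory and ordering work grow with start, not just with rows.
     */
    static List<Integer> page(int[] ids, int start, int rows) {
        int needed = start + rows;
        // Min-heap: the smallest retained id sits on top and is evicted first.
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int id : ids) {
            heap.offer(id);
            if (heap.size() > needed) {
                heap.poll();
            }
        }
        List<Integer> ranked = new ArrayList<>(heap);
        Collections.sort(ranked, Collections.reverseOrder());
        // Throw away the first `start` entries; only the tail is returned.
        int from = Math.min(start, ranked.size());
        return new ArrayList<>(ranked.subList(from, ranked.size()));
    }

    public static void main(String[] args) {
        int[] ids = new int[20];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i + 1;
        }
        // Ranks 5..7 of the descending order 20, 19, 18, ...
        System.out.println(page(ids, 5, 3)); // prints [15, 14, 13]
    }
}
```

In this model, a request with start=1 000 000 and rows=1000 keeps 1 001 000 entries in the heap just to return the last 1000.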

I'm curious to hear if others have strategies to extract content
sequentially from an index. I suspect a new SearchComponent could really
help here.
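One strategy that is sometimes used for sequential export, and is not proposed anywhere in this thread, is to page by the sort key instead of by offset: always ask for start=0 and restrict each request to ids strictly below the last id already seen. The Java sketch below only models the idea in memory. It assumes a unique id field whose range order matches its sort order; the names KeysetPaging, nextPage and exportAll are hypothetical, and a sorted set stands in for the index. Against a real Solr this would be an exclusive range filter on id, whose exact syntax should be checked for the Solr version in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only: the TreeSet stands in for the index.
final class KeysetPaging {

    /** Next page of ids strictly below lastSeen (null means first page), descending. */
    static List<String> nextPage(NavigableSet<String> index, String lastSeen, int rows) {
        NavigableSet<String> remaining = (lastSeen == null)
                ? index.descendingSet()
                : index.headSet(lastSeen, false).descendingSet();
        List<String> page = new ArrayList<>();
        for (String id : remaining) {
            if (page.size() == rows) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    /** Walks the whole index page by page, never skipping over earlier results. */
    static List<String> exportAll(NavigableSet<String> index, int rows) {
        List<String> all = new ArrayList<>();
        String lastSeen = null;
        while (true) {
            List<String> page = nextPage(index, lastSeen, rows);
            if (page.isEmpty()) {
                break;
            }
            all.addAll(page);
            lastSeen = page.get(page.size() - 1);
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableSet<String> index = new TreeSet<>();
        for (char c = 'a'; c <= 'g'; c++) {
            index.add(String.valueOf(c));
        }
        System.out.println(exportAll(index, 3));
    }
}
```

Each request then costs about the same regardless of how far the export has progressed, because nothing before the last seen id has to be ranked and discarded.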

I suspect it would work better if you don't sort at all, in which case
you'll get the documents back in index order. The issue is that a commit
or a background merge could change the index order, which would mess up
your export.

Sorry, no clearer answers.

Upayavira

On Tue, Jan 15, 2013, at 02:07 PM, elisabeth benoit wrote:
Hello,

I have a Solr instance (Solr 3.6.1) with around 3 000 000 documents. I want
to read all my documents (in a Java test application), but not in one shot
(because it takes too much memory).

So I send the same request, over and over, with

q=*:*
rows=1000
sort=id desc  =>  to be sure I always get the same ordering
and the start parameter increased by 1000 at each iteration
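The request pattern described above can be sketched in Java as follows. This only builds the sequence of parameter strings; it sends nothing to Solr. The class name PagedRequests and the method queryStrings are made up for illustration, and the values are left unencoded for readability, so they would need URL encoding before being sent over HTTP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: lists the requests, does not execute them.
final class PagedRequests {

    /** One parameter string per page, with start advancing by `rows` each time. */
    static List<String> queryStrings(long totalDocs, int rows) {
        List<String> out = new ArrayList<>();
        for (long start = 0; start < totalDocs; start += rows) {
            // Same query and sort on every iteration; only start changes.
            out.add("q=*:*&rows=" + rows + "&sort=id desc&start=" + start);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> requests = queryStrings(3_000_000L, 1000);
        System.out.println(requests.size() + " requests");
        System.out.println("first: " + requests.get(0));
        System.out.println("last:  " + requests.get(requests.size() - 1));
    }
}
```

For 3 000 000 documents and rows=1000 this comes to 3000 requests, the last of which has start=2999000.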


Checking the Solr logs, I realized that the query response time increases
as the start parameter gets bigger.

For instance:

with start < 500 000, it takes about 500 ms
with start > 1 100 000 and < 1 200 000, it takes between 5000 and 5200 ms
with start > 1 250 000 and < 1 320 000, it takes between 6100 and 6400 ms


Does someone have an idea how to optimize this query?

Thanks,
Elisabeth



--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/
