You are setting yourself up for disaster.

If you ask Solr for documents 1000 to 1010, it has to collect and sort
the top 1010 matches, then discard the first 1000. The deeper you page,
the larger that sorted set gets, which is why performance degrades so
badly as start grows.
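
To make the cost concrete, here is a minimal SolrJ sketch (the endpoint
and class names are assumed, purely for illustration; not your code):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class DeepPageCost {
        public static void main(String[] args) throws Exception {
            // Assumed local endpoint; adjust to your instance.
            SolrServer server = new HttpSolrServer("http://localhost:8983/solr");

            SolrQuery q = new SolrQuery("*:*");
            q.setStart(1000); // Solr still ranks the top start + rows = 1010 matches...
            q.setRows(10);    // ...then throws away the first 1000 of them.
            System.out.println(server.query(q).getResults().getNumFound());
        }
    }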

I'm curious to hear if others have strategies to extract content
sequentially from an index. I suspect a new SearchComponent could really
help here.

I suspect it would work better if you don't sort at all, in which case
Solr will return the documents in index order. The catch is that a
commit or a background merge could change the index order, which would
mess up your export.
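
As a rough, untested sketch of that idea (SolrJ again, names assumed,
and with the caveat above that a commit or merge mid-export can reorder
results):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrDocumentList;

    public class IndexOrderExport {
        public static void main(String[] args) throws Exception {
            SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("*:*"); // no sort param: index order
            q.setRows(1000);
            for (int start = 0; ; start += 1000) {
                q.setStart(start);
                SolrDocumentList page = server.query(q).getResults();
                if (page.isEmpty()) break; // walked past the last document
                // process the page here...
            }
        }
    }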

Sorry, no clearer answers.

Upayavira

On Tue, Jan 15, 2013, at 02:07 PM, elisabeth benoit wrote:
> Hello,
> 
> I have a Solr instance (Solr 3.6.1) with around 3 000 000 documents. I
> want to read (in a Java test application) all my documents, but not in
> one shot (because it takes too much memory).
> 
> So I send the same request over and over, with
> 
> q=*:*
> rows=1000
> sort=id desc  => to be sure I always get the same ordering
> and the start parameter increased by 1000 at each iteration
> 
> 
> Checking the Solr logs, I realized that the query response time
> increases as the start parameter gets bigger.
> 
> for instance
> 
> with start < 500 000, it takes about 500 ms
> with start > 1 100 000 and < 1 200 000, it takes between 5000 and 5200 ms
> with start > 1 250 000 and < 1 320 000, it takes between 6100 and 6400 ms
> 
> 
> Does someone have an idea how to optimize this query?
> 
> Thanks,
> Elisabeth
