You are setting yourself up for disaster. If you ask Solr for documents 1000 to 1010, it has to collect and sort the top 1010 matching documents and then throw away the first 1000. That work grows with the start offset, so the deeper you page the slower every query gets, which is exactly the slowdown you are seeing.
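One thing you could try instead, assuming id is your unique key and sorts consistently: keep start at 0 on every request and use a filter query on the last id you have already seen, so Solr never has to skip over anything. A rough SolrJ sketch follows -- untested, so treat the URL, the field name, and the exclusive range syntax as things to verify against 3.6.1:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.client.solrj.util.ClientUtils;
    import org.apache.solr.common.SolrDocument;

    public class SequentialExport {
        public static void main(String[] args) throws Exception {
            // URL is an assumption -- point this at your own instance
            SolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
            String lastId = null;
            while (true) {
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(1000);
                q.setStart(0); // start never grows, so nothing is collected and discarded
                q.setSortField("id", SolrQuery.ORDER.asc);
                if (lastId != null) {
                    // exclusive lower bound: only ids strictly after the last one seen;
                    // escaping matters if your ids contain query syntax characters
                    q.addFilterQuery("id:{" + ClientUtils.escapeQueryChars(lastId) + " TO *}");
                }
                QueryResponse rsp = solr.query(q);
                if (rsp.getResults().isEmpty()) break;
                for (SolrDocument doc : rsp.getResults()) {
                    lastId = (String) doc.getFieldValue("id"); // assumes id is a string field
                    // process the document here
                }
            }
        }
    }

Each request then only collects the top 1000 documents that survive the filter, so response times should stay roughly flat no matter how deep into the index you are.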
I'm curious to hear if others have strategies to extract content
sequentially from an index. I suspect a new SearchComponent could really
help here. It might also work better if you don't sort at all, in which
case documents come back in index order. The catch is that a commit or a
background merge can change index order, which would mess up your export.

Sorry, no clearer answers.

Upayavira

On Tue, Jan 15, 2013, at 02:07 PM, elisabeth benoit wrote:
> Hello,
>
> I have a Solr instance (solr 3.6.1) with around 3 000 000 documents. I
> want to read (in a java test application) all my documents, but not in
> one shot (because it takes too much memory).
>
> So I send the same request, over and over, with
>
> q=*:*
> rows=1000
> sort=id desc => to be sure I always get the same ordering
> and the start parameter increased by 1000 at each iteration
>
> Checking the solr logs, I realized that the query response time
> increases as the start parameter gets bigger.
>
> For instance:
>
> with start < 500 000, it takes about 500 ms
> with start > 1 100 000 and < 1 200 000, it takes between 5000 and 5200 ms
> with start > 1 250 000 and < 1 320 000, it takes between 6100 and 6400 ms
>
> Does someone have an idea how to optimize this query?
>
> Thanks,
> Elisabeth
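P.S. For anyone finding this thread later, the loop described above boils down to something like this in SolrJ (a sketch; the URL is an assumption), and the growing setStart() value is where the time goes:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class PagedExport {
        public static void main(String[] args) throws Exception {
            SolrServer solr = new HttpSolrServer("http://localhost:8983/solr"); // assumed URL
            int rows = 1000;
            for (int start = 0; ; start += rows) {
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(rows);
                q.setStart(start); // Solr must collect start+rows sorted docs, then discard the first start
                q.setSortField("id", SolrQuery.ORDER.desc);
                QueryResponse rsp = solr.query(q);
                if (rsp.getResults().isEmpty()) break;
                // process rsp.getResults() here
            }
        }
    }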