Performance of start= and rows= parameters are exponentially slow with large 
data sets
--------------------------------------------------------------------------------------

                 Key: SOLR-2218
                 URL: https://issues.apache.org/jira/browse/SOLR-2218
             Project: Solr
          Issue Type: Improvement
          Components: Build
    Affects Versions: 1.4.1
            Reporter: Bill Bell


With large data sets (more than 10M rows), setting start=<large number> and 
rows=<large number> is slow, and it gets slower 
the farther you get from start=0 with a complex query. Random also makes this 
slower.
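
One plausible reason for the slowdown, offered as an assumption rather than a confirmed diagnosis: to return a page at offset start, the searcher has to rank the top start + rows matches and then discard the first start of them. A minimal Python sketch of that cost model follows; the function name page_of_hits and the toy scores are hypothetical and are not Solr code.

```python
import heapq

def page_of_hits(scored_docs, start, rows):
    """Return one page of (score, doc_id) pairs, best score first.

    Toy model only: it ranks the top start + rows hits and throws away
    the first `start`, so the work grows with the offset.
    """
    top = heapq.nlargest(start + rows, scored_docs)  # queue size grows with start
    return top[start:start + rows]

# Hypothetical scores for ten documents, stored as (score, doc_id).
docs = [(score, doc_id) for doc_id, score in enumerate([5, 9, 1, 7, 3, 8, 2, 6, 4, 0])]
first_page = page_of_hits(docs, start=0, rows=3)
later_page = page_of_hits(docs, start=6, rows=3)
```

In this model the later page ranks nine hits to return three, which is the pattern described above: the deeper the offset, the more work per request.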

I would like to make looping through large data sets faster. It would be nice 
if we could pass a pointer to the result set to loop over, or support very 
large rows=<number> values.

Something like:
rows=1000
start=0
spointer=string_my_query_1

Then, within some interval (say 5 minutes), I can reference this loop again:
rows=1000
start=1000
spointer=string_my_query_1
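
The requests above could be generated by a client loop along these lines. This is only a sketch of the proposal: spointer is the parameter suggested in this issue and does not exist in Solr, and the helper name paged_queries, the query "solr", and the total of 2500 hits are assumptions made for illustration.

```python
from urllib.parse import urlencode

def paged_queries(query, total, rows=1000, spointer="string_my_query_1"):
    """Yield query strings that walk a result set page by page.

    `spointer` is the parameter proposed in this issue; it is hypothetical
    and is passed along unchanged so every page refers to the same result set.
    """
    for start in range(0, total, rows):
        yield urlencode({"q": query, "rows": rows, "start": start, "spointer": spointer})

# Three requests cover a hypothetical 2500-row result set.
pages = list(paged_queries("solr", total=2500))
```

Each generated string would be appended to the select URL; only start changes between requests, while spointer stays fixed.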

What do you think? The data set is too large for the cache to help.



