[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882481#comment-13882481
 ] 

Yonik Seeley commented on SOLR-5463:
------------------------------------

Nice work guys!

Some further thoughts:

We should consider allowing non-zero "start" parameters with cursorMark.  The 
primary use case is when someone is skipping pages (perhaps trying to get to a 
different section of results, or trying to get much later in a time-based 
search, or just viewing the long tail).

For example, a user at page 50 clicks on page 60.  It would be nice to support 
this by just specifying start=90 (i.e. 600-510) assuming 10 docs per page, 
along with the normal cursorMark (that would have started at page 51 / doc 
510).  Currently, the prohibition on non-zero start parameters would mean that 
we would either have to abandon cursoring altogether, or we would have to 
actually retrieve 100 documents to continue it.
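
The arithmetic in that example can be sketched as below.  This assumes the 
proposal were adopted (at present a non-zero start is rejected when cursorMark 
is used), and the function names are hypothetical.  Pages are numbered from 0, 
as in the example.

```python
def skip_offset(cursor_page: int, target_page: int, rows: int) -> int:
    """Number of docs to skip past the cursor so results begin at target_page.

    cursor_page is the page the current cursorMark would return next.
    """
    if target_page < cursor_page:
        raise ValueError("a cursor can only be used to move forward")
    return (target_page - cursor_page) * rows


def jump_params(cursor_mark: str, cursor_page: int, target_page: int,
                rows: int, sort: str) -> dict:
    """Build request params for the PROPOSED start + cursorMark combination."""
    return {
        "sort": sort,
        "rows": str(rows),
        "cursorMark": cursor_mark,
        # Hypothetical: only valid if non-zero start is allowed with a cursor.
        "start": str(skip_offset(cursor_page, target_page, rows)),
    }
```

With the cursor positioned at page 51 (doc 510) and 10 docs per page, a jump to 
page 60 gives start=90, so only 10 documents come back instead of 100.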

The other thought is around how to do reverse paging efficiently.  One way is 
to save previous cursorMarks on the client side and just return to them if one 
wants to page backwards.  The other potential way is to reverse the sort 
parameters and use the current cursorMark.  The only pitfall to this approach 
is that you don't get the document the mark currently points at (because we 
"searchAfter", which excludes that document).

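Both reverse-paging ideas can be sketched on the client side.  This is only an 
illustration: CursorHistory and reverse_sort are hypothetical names, and 
reverse_sort assumes a simple sort made of "field direction" clauses separated 
by commas (no function queries containing commas).

```python
class CursorHistory:
    """Remembers the cursorMark that was used to request each page."""

    def __init__(self, first_mark: str = "*"):
        self._marks = [first_mark]  # _marks[i] is the mark that fetches page i
        self._pos = 0

    @property
    def current(self) -> str:
        return self._marks[self._pos]

    def advance(self, next_cursor_mark: str) -> str:
        """Store the nextCursorMark returned for the current page, move forward."""
        self._marks[self._pos + 1:] = [next_cursor_mark]
        self._pos += 1
        return self.current

    def back(self) -> str:
        """Return the mark that fetched the previous page."""
        if self._pos == 0:
            raise IndexError("already at the first page")
        self._pos -= 1
        return self.current


def reverse_sort(sort: str) -> str:
    """Flip asc/desc on every clause of a simple sort string."""
    flipped = []
    for clause in sort.split(","):
        field, direction = clause.split()
        direction = direction.lower()
        if direction not in ("asc", "desc"):
            raise ValueError("unexpected sort direction: " + direction)
        flipped.append(field + " " + ("asc" if direction == "desc" else "desc"))
    return ", ".join(flipped)
```

The history approach costs one stored string per page visited.  The reversed 
sort needs no client state, but the document at the mark is not returned.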




> Provide cursor/token based "searchAfter" support that works with arbitrary 
> sorting (ie: "deep paging")
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5463
>                 URL: https://issues.apache.org/jira/browse/SOLR-5463
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>             Fix For: 5.0, 4.7
>
>         Attachments: SOLR-5463-randomized-faceting-test.patch, 
> SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, 
> SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man__MissingStringLastComparatorSource.patch
>
>
> I'd like to revisit a solution to the problem of "deep paging" in Solr, 
> leveraging an HTTP based API similar to how IndexSearcher.searchAfter works 
> at the Lucene level: require the clients to provide back a token indicating 
> the sort values of the last document seen on the previous "page".  This is 
> similar to the "cursor" model I've seen in several other REST APIs that 
> support "pagination" over large sets of results (notably the Twitter API and 
> its "since_id" param) except that we'll want something that works with 
> arbitrary multi-level sort criteria that can be either ascending or descending.
> SOLR-1726 laid some initial groundwork here and was committed quite a while 
> ago, but the key bit of argument parsing to leverage it was commented out due 
> to some problems (see comments in that issue).  It's also somewhat out of 
> date at this point: at the time it was committed, IndexSearcher only supported 
> searchAfter for simple scores, not arbitrary field sorts; and the params 
> added in SOLR-1726 suffer from this limitation as well.
> ---
> I think it would make sense to start fresh with a new issue with a focus on 
> ensuring that we have deep paging which:
> * supports arbitrary field sorts in addition to sorting by score
> * works in distributed mode
> {panel:title=Basic Usage}
> * send a request with {{sort=X&start=0&rows=N&cursorMark=*}}
> ** sort can be anything, but must include the uniqueKey field (as a tie 
> breaker) 
> ** "N" can be any number you want per page
> ** start must be "0"
> ** "\*" denotes you want to use a cursor starting at the beginning mark
> * parse the response body and extract the (String) {{nextCursorMark}} value
> * Replace the "\*" value in your initial request params with the 
> {{nextCursorMark}} value from the response in the subsequent request
> * repeat until the {{nextCursorMark}} value stops changing, or you have 
> collected as many docs as you need
> {panel}

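The "Basic Usage" steps quoted above can be sketched as a client loop.  This is 
a minimal sketch, not the patch's own client code: the fetch callable is a 
hypothetical stand-in for whatever sends the HTTP request and returns the 
page's docs together with the nextCursorMark string, and fake_fetch is an 
in-memory substitute so the loop can be run without a Solr server.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Fetch = Callable[[Dict[str, str]], Tuple[List[dict], str]]


def iter_docs(fetch: Fetch, sort: str, rows: int) -> Iterator[dict]:
    """Yield every doc by following nextCursorMark until it stops changing."""
    # The sort must include the uniqueKey field as a tie breaker; start stays 0.
    params = {"sort": sort, "start": "0", "rows": str(rows), "cursorMark": "*"}
    while True:
        docs, next_mark = fetch(params)
        yield from docs
        if next_mark == params["cursorMark"]:
            return  # the mark did not change: no more results
        params["cursorMark"] = next_mark


def fake_fetch(params: Dict[str, str]) -> Tuple[List[dict], str]:
    """Hypothetical stand-in for a server: 25 docs, mark is just an offset."""
    mark = params["cursorMark"]
    begin = 0 if mark == "*" else int(mark)
    page = list(range(25))[begin:begin + int(params["rows"])]
    next_mark = str(begin + len(page)) if page else mark
    return [{"id": i} for i in page], next_mark
```

A caller that only needs the first few docs can simply stop iterating, which 
corresponds to the "collected as many docs as you need" exit condition.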


--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
