[ 
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-5463:
-----------------------------

    Attachment: SOLR-5463-randomized-faceting-test.patch

Patch adding a randomized faceting test to {{CursorPagingTest}} to validate 
that aggregating field value counts via a full deep-paging walk arrives at the 
same results as faceting.  It also checks that the facet results are identical 
on every page.  

I'm running this test in a loop 100 times - once that finishes with no failures 
(none yet at ~75 iterations), I'll commit.

> Provide cursor/token based "searchAfter" support that works with arbitrary 
> sorting (ie: "deep paging")
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5463
>                 URL: https://issues.apache.org/jira/browse/SOLR-5463
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>             Fix For: 5.0, 4.7
>
>         Attachments: SOLR-5463-randomized-faceting-test.patch, 
> SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, 
> SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, 
> SOLR-5463__straw_man__MissingStringLastComparatorSource.patch
>
>
> I'd like to revisit a solution to the problem of "deep paging" in Solr, 
> leveraging an HTTP based API similar to how IndexSearcher.searchAfter works 
> at the Lucene level: require the clients to provide back a token indicating 
> the sort values of the last document seen on the previous "page".  This is 
> similar to the "cursor" model I've seen in several other REST APIs that 
> support "pagination" over large sets of results (notably the Twitter API and 
> its "since_id" param) except that we'll want something that works with 
> arbitrary multi-level sort criteria that can be either ascending or descending.
> SOLR-1726 laid some initial groundwork here and was committed quite a while 
> ago, but the key bit of argument parsing to leverage it was commented out due 
> to some problems (see comments in that issue).  It's also somewhat out of 
> date at this point: at the time it was committed, IndexSearcher only supported 
> searchAfter for simple scores, not arbitrary field sorts; and the params 
> added in SOLR-1726 suffer from this limitation as well.
> ---
> I think it would make sense to start fresh with a new issue with a focus on 
> ensuring that we have deep paging which:
> * supports arbitrary field sorts in addition to sorting by score
> * works in distributed mode
> {panel:title=Basic Usage}
> * send a request with {{sort=X&start=0&rows=N&cursorMark=*}}
> ** sort can be anything, but must include the uniqueKey field (as a tie 
> breaker) 
> ** "N" can be any number you want per page
> ** start must be "0"
> ** "\*" denotes you want to use a cursor starting at the beginning mark
> * parse the response body and extract the (String) {{nextCursorMark}} value
> * Replace the "\*" value in your initial request params with the 
> {{nextCursorMark}} value from the response in the subsequent request
> * repeat until the {{nextCursorMark}} value stops changing, or you have 
> collected as many docs as you need
> {panel}
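
The "Basic Usage" steps above can be sketched as a small client loop. This is 
only an illustration, not the patch's code: {{fetch_page}} is a hypothetical 
callable standing in for whatever HTTP client sends the request and returns 
the page's documents plus the {{nextCursorMark}} string, and 
{{make_fake_backend}} is an in-memory stand-in that assumes the mark is simply 
the last id seen (real cursor marks should be treated as opaque).

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical transport: takes request params, returns (docs, nextCursorMark).
FetchPage = Callable[[Dict[str, str]], Tuple[List[dict], str]]


def walk_cursor(fetch_page: FetchPage, sort: str, rows: int) -> Iterator[dict]:
    """Yield every document, following nextCursorMark until it stops changing."""
    cursor_mark = "*"  # "*" requests a cursor starting at the beginning
    while True:
        params = {
            "sort": sort,            # must include the uniqueKey field as a tie breaker
            "start": "0",            # start must be 0 when a cursor is used
            "rows": str(rows),
            "cursorMark": cursor_mark,
        }
        docs, next_mark = fetch_page(params)
        yield from docs
        if next_mark == cursor_mark:  # mark did not change: nothing left to read
            return
        cursor_mark = next_mark


def make_fake_backend(ids: List[str]) -> FetchPage:
    """In-memory stand-in for a server, sorted by id ascending (assumption)."""
    ordered = sorted(ids)

    def fetch_page(params: Dict[str, str]) -> Tuple[List[dict], str]:
        mark = params["cursorMark"]
        rows = int(params["rows"])
        remaining = ordered if mark == "*" else [i for i in ordered if i > mark]
        page = remaining[:rows]
        # Echo the incoming mark back when the page is empty.
        next_mark = page[-1] if page else mark
        return [{"id": i} for i in page], next_mark

    return fetch_page
```

A caller that only needs the first few documents can stop iterating early, 
which corresponds to the "or you have collected as many docs as you need" 
condition in the last step.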


