[
https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844820#comment-13844820
]
Hoss Man commented on SOLR-5463:
--------------------------------
bq. I think that error message should include the param name (cursor) that
couldn't be parsed.
Agreed ... the current error text is basically just a placeholder; ideally it
should be something like...
{code}
Unable to parse cursor param: value must either be '*' or the cursorContinue
value from a previous search: NOK
{code}
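As a rough illustration of that message format (not the code in the patch), a validation step might look like the sketch below. The function name {{parse_cursor}} and the use of plain Base64 decoding as the validity check are assumptions made for the example; only the message text comes from the proposal above.

```python
import base64
import binascii


def parse_cursor(param_name, value):
    """Hypothetical helper: return None for the '*' start marker,
    otherwise the decoded bytes of a previously returned cursor value."""
    if value == "*":
        return None
    try:
        # Assumption: a valid cursor is a Base64 string; the real
        # parsing in the patch may check more than this.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        # Name the offending param and echo the bad value back.
        raise ValueError(
            f"Unable to parse {param_name} param: value must either be '*' "
            f"or the cursorContinue value from a previous search: {value}"
        ) from None
```

Calling {{parse_cursor("cursor", "NOK")}} raises an error whose text matches the example message above.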
bq. Also, maybe it would be useful to include a prefix that will (probably)
never be used in unique ids, to visually identify the cursor as such: like
always prepending '*'?
Hmmm, I'm not sure if that's really worth the added bytes & parsing.
If folks really felt like the param name should be "searchAfter" then i could
certainly see the value in having some clear prefix, since the param name might
lead folks to assume they know what the input should be; but with "cursor" i
don't think we need to worry as much about people assuming they know what to
put there, and with a clear error message instructing people how to get a valid
cursor (from cursorContinue), that seems good enough. (right?)
bq. the Base64-encoded text is used verbatim, including the trailing padding
'=' characters - these could be stripped out for external use (since they're
there just to make the string length divisible by four), and then added back
before Base64-decoding. In a URL non-metacharacter '='-s look weird, since
they're already used to separate param names and values.
Interesting idea ... again: i'm not sure how i feel about the added overhead to
the parsing just to shorten the totem -- especially since clients will always
need to safely url encode anyway since Base64 strings can also include "+".
However....
In the current patch, I used the base64 utility class Solr already had (used by
BinaryField and a few other places). But your suggestion reminds me that
commons codec's Base64 class (jar already used by solr) supports a "url safe"
variant of base64 (which looks like it's defined in RFC 4648?)...
https://commons.apache.org/proper/commons-codec/javadocs/api-release/org/apache/commons/codec/binary/Base64.html#encodeBase64URLSafeString(byte[])
...something to consider.
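To make the trade-off concrete, here is a small sketch of both ideas from this exchange: the URL-safe Base64 alphabet from RFC 4648, and stripping the '=' padding for external use and restoring it before decoding. It uses Python's standard {{base64}} module as a stand-in, not the commons codec class linked above; the helper names {{encode_token}} and {{decode_token}} and the sample bytes are made up for the example.

```python
import base64
from urllib.parse import quote


def encode_token(raw: bytes) -> str:
    """URL-safe Base64 ('-' and '_' replace '+' and '/'), padding stripped."""
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_token(token: str) -> bytes:
    """Restore the padding so the length is divisible by four, then decode."""
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded)


# Sample bytes chosen so the standard encoding contains '+', '/' and '='.
raw = b"\xfb\xff\xfe\x01"
standard = base64.b64encode(raw).decode("ascii")
compact = encode_token(raw)

print(standard, quote(standard, safe=""))  # standard form needs escaping
print(compact, quote(compact, safe=""))    # url-safe form passes through
```

With the standard alphabet the client has to percent-encode the value before putting it in a URL; the URL-safe, unpadded form survives unchanged, at the cost of the extra padding step when decoding.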
----
One other comment i got from a coworker offline was why I liked
{{cursorContinue}} instead of {{nextCursor}} or {{cursorNext}}. My thinking
was that since 'cursor' (as a concept) is a noun, "next cursor" might suggest
that it was a (different) cursor than the one currently in use. I don't want
people to think these strings are _names_ of cursors, and that they re-use the
same name until they are done with it. I want to make it clear that to _continue_
fetching results from this cursor, you have to specify the new value.
Would "{{cursorAdvance}}" convey that better than {{cursorContinue}} ?
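For anyone trying to picture the intended usage, the client loop would look roughly like the sketch below. Everything here is illustrative: the {{search}} callable, the shape of the response dict, and the "stop when a page comes back empty" rule are assumptions for the example, and the fake backend just uses an offset as its token so the loop can be run without Solr.

```python
def fetch_all(search, rows=2):
    """Walk a result set by feeding each cursorContinue value back in."""
    cursor = "*"  # '*' asks for the first page
    docs = []
    while True:
        response = search(cursor=cursor, rows=rows)
        if not response["docs"]:
            break  # assumed stop condition: an empty page
        docs.extend(response["docs"])
        # The value changes on every request; it is not a reusable name.
        cursor = response["cursorContinue"]
    return docs


def make_fake_search(all_docs):
    """Hypothetical in-memory backend whose token is just an offset."""
    def search(cursor, rows):
        start = 0 if cursor == "*" else int(cursor)
        page = all_docs[start:start + rows]
        return {"docs": page, "cursorContinue": str(start + len(page))}
    return search


print(fetch_all(make_fake_search(["a", "b", "c", "d", "e"])))
```

The point of the naming question is visible in the loop body: the client never holds on to one cursor string, it always replaces it with the value from the latest response.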
> Provide cursor/token based "searchAfter" support that works with arbitrary
> sorting (ie: "deep paging")
> ------------------------------------------------------------------------------------------------------
>
> Key: SOLR-5463
> URL: https://issues.apache.org/jira/browse/SOLR-5463
> Project: Solr
> Issue Type: New Feature
> Reporter: Hoss Man
> Assignee: Hoss Man
> Attachments: SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
> SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
> SOLR-5463__straw_man.patch
>
>
> I'd like to revisit a solution to the problem of "deep paging" in Solr,
> leveraging an HTTP based API similar to how IndexSearcher.searchAfter works
> at the lucene level: require the clients to provide back a token indicating
> the sort values of the last document seen on the previous "page". This is
> similar to the "cursor" model I've seen in several other REST APIs that
> support "pagination" over large sets of results (notably the twitter API and
> its "since_id" param) except that we'll want something that works with
> arbitrary multi-level sort criteria that can be either ascending or descending.
> SOLR-1726 laid some initial groundwork here and was committed quite a while
> ago, but the key bit of argument parsing to leverage it was commented out due
> to some problems (see comments in that issue). It's also somewhat out of
> date at this point: at the time it was committed, IndexSearcher only supported
> searchAfter for simple scores, not arbitrary field sorts; and the params
> added in SOLR-1726 suffer from this limitation as well.
> ---
> I think it would make sense to start fresh with a new issue with a focus on
> ensuring that we have deep paging which:
> * supports arbitrary field sorts in addition to sorting by score
> * works in distributed mode
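The idea in the issue description (a token carrying the sort values of the last document seen, applied to a multi-level sort with mixed directions) can be sketched in a few lines over an in-memory list. This is only a toy model of the concept, not how IndexSearcher.searchAfter or the patch implements it; the function names, the {{(field, direction)}} sort format, and the use of a unique {{id}} field as the final tie-breaker are assumptions for the example.

```python
from functools import cmp_to_key


def compare_sort_values(a, b, descending_flags):
    """Compare two tuples of sort values level by level."""
    for x, y, descending in zip(a, b, descending_flags):
        if x != y:
            result = -1 if x < y else 1
            return -result if descending else result
    return 0


def search_after(docs, sort, rows, after=None):
    """Return (page, token); token holds the last doc's sort values."""
    fields = [field for field, _ in sort]
    flags = [direction == "desc" for _, direction in sort]

    def values(doc):
        return tuple(doc[f] for f in fields)

    ordered = sorted(docs, key=cmp_to_key(
        lambda a, b: compare_sort_values(values(a), values(b), flags)))
    if after is not None:
        # Keep only documents that sort strictly after the token.
        ordered = [d for d in ordered
                   if compare_sort_values(values(d), after, flags) > 0]
    page = ordered[:rows]
    return page, (values(page[-1]) if page else after)


docs = [{"id": "a", "price": 10}, {"id": "b", "price": 30},
        {"id": "c", "price": 10}, {"id": "d", "price": 20},
        {"id": "e", "price": 30}]
sort = [("price", "desc"), ("id", "asc")]  # id breaks ties
page1, token = search_after(docs, sort, rows=2)
page2, token = search_after(docs, sort, rows=2, after=token)
print([d["id"] for d in page1], [d["id"] for d in page2])
```

Because each request filters on sort values instead of skipping an offset, the cost of a page does not grow with how deep into the result set the client already is, which is the property the issue is after.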