Hoss,

Good point. We didn't know about cursorMark when we designed this a year
ago :(

Small potatoes: I assume a cursor mark breaks when the number of shards
changes, whereas keeping the original sort values doesn't, since the relative
position is encoded per shard... But that's an edge case.
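
For the archive, here is a minimal sketch in Python of the polling loop Hoss
describes below. It is an illustration under stated assumptions, not what we
run: fetch_page is a hypothetical callable (not part of Solr or of any client
library) that sends the given parameters to the old collection's select
handler and returns the parsed JSON response. The field names timestamp and
id come from the sort in his mail, and id is assumed to be the uniqueKey.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical helper type: takes Solr query parameters and returns the
# parsed JSON response as a dict. How it talks to Solr is up to the caller.
FetchPage = Callable[[Dict[str, str]], dict]


def read_new_docs(
    fetch_page: FetchPage, cursor_mark: str = "*", rows: int = 100
) -> Tuple[List[dict], str]:
    """Read all docs after cursor_mark.

    Returns the docs plus the mark to store and pass in on the next poll.
    """
    docs: List[dict] = []
    while True:
        params = {
            "q": "*:*",
            # Cursors need a sort that includes the uniqueKey field as a
            # tie-breaker (assumed here to be "id").
            "sort": "timestamp asc,id asc",
            "rows": str(rows),
            "cursorMark": cursor_mark,  # "*" means start from the beginning
        }
        response = fetch_page(params)
        docs.extend(response["response"]["docs"])
        next_mark = response["nextCursorMark"]
        if next_mark == cursor_mark:
            # Solr hands back the same mark when there is nothing further.
            return docs, cursor_mark
        cursor_mark = next_mark
```

The only state to persist between polls is the returned mark, which replaces
the last timestamp and id values we currently track.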

Looking forward to http://yonik.com/solr-cross-data-center-replication/

On Tue, Sep 29, 2015 at 10:20 PM, Chris Hostetter <hossman_luc...@fucit.org>
wrote:

>
>
> You're basically re-implementing Solr's cursors.
>
> you can change your system of reading docs from the old collection to
> use...
>
> cursorMark=*&sort=timestamp+asc,id+asc
>
> ...and then instead of keeping track of the last timestamp & id values and
> constructing a filter, you can just keep track of the nextCursorMark and
> pass it the next time you want to check for newer documents...
>
> https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
>
>
>
>
>
> : Date: Mon, 21 Sep 2015 21:32:33 +0300
> : From: Gili Nachum <gilinac...@gmail.com>
> : Reply-To: solr-user@lucene.apache.org
> : To: solr-user@lucene.apache.org
> : Subject: Re: How can I get a monotonically increasing field value for docs?
> :
> : Thanks for the in-depth explanation!
> :
> : The secondary sort by uuid would allow me to read a series of docs with
> : identical time over multiple batches by specifying filtering
> : time>timeOnLastReadDoc or (time=timeOnLastReadDoc and
> : uuid>uuidOnLastReadDoc) which essentially creates a unique sorted value to
> : track progress over.
> : On Sep 21, 2015 19:56, "Shawn Heisey" <apa...@elyograg.org> wrote:
> :
> : > On 9/21/2015 9:01 AM, Gili Nachum wrote:
> : > > TimestampUpdateProcessorFactory takes place only on the leader shard, or on
> : > > each shard replica?
> : > > if on each replica then I would get different values on each replica.
> : > >
> : > > My alternative would be to perform secondary sort on a UUID to ensure
> : > > order.
> : >
> : > If the update chain is configured properly, it runs on the leader, so
> : > all replicas get the same timestamp.
> : >
> : > Without SolrCloud, the way to create an "indexed at" time field is in
> : > the schema -- specify a default value of NOW on the field definition and
> : > don't send the field when indexing.  The old master/slave replication
> : > copies the actual index contents, so the indexed values in all replicas
> : > are the same.
> : >
> : > The problem with NOW in the schema when running SolrCloud is that each
> : > replica indexes the document independently, so each replica can have a
> : > different timestamp.  This is why the timestamp update processor exists
> : > -- to set the timestamp to a specific value before the document is
> : > duplicated to each replica, eliminating the problem.
> : >
> : > FYI, secondary sort parameters affect the order when the primary sort
> : > field is identical between two documents.  It may not do what you are
> : > intending because of that.
> : >
> : > Thanks,
> : > Shawn
> : >
> : >
> :
>
> -Hoss
> http://www.lucidworks.com/
>
