Re: Replicas for same shard not in sync

David Smith Mon, 25 Apr 2016 11:42:01 -0700

Erick,

To check my understanding: if one or more replicas are down, updates sent 
to the leader still succeed, right?  If so, tedsolr is correct that the Solr 
client app needs to re-issue updates if it wants stronger guarantees on 
replica consistency than Solr provides.


The “Write Fault Tolerance” section of the Solr Wiki makes what I believe is 
the same point:

"On the client side, if the achieved replication factor is less than the 
acceptable level, then the client application can take additional measures to 
handle the degraded state. For instance, a client application may want to keep 
a log of which update requests were sent while the state of the collection was 
degraded and then resend the updates once the problem has been resolved."


https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
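
A minimal sketch of the client-side log that passage describes, in Python. 
It assumes the client sends each update with the min_rf parameter and can 
read the achieved replication factor back from the response, as that wiki 
page describes. The class name DegradedUpdateLog and the send callable are 
hypothetical, not part of Solr or SolrJ; send stands in for whatever code 
actually posts the batch and returns the achieved factor.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Doc = Dict[str, object]


@dataclass
class DegradedUpdateLog:
    """Remembers batches whose achieved replication factor was too low."""

    # Hypothetical hook: posts a batch to Solr and returns the achieved
    # replication factor reported for that request.
    send: Callable[[List[Doc]], int]
    # Lowest replication factor the application is willing to accept.
    min_rf: int
    # Batches sent while the collection was degraded, oldest first.
    pending: List[List[Doc]] = field(default_factory=list)

    def index(self, batch: List[Doc]) -> int:
        """Send a batch and log it if it was under-replicated."""
        rf = self.send(batch)
        if rf < self.min_rf:
            self.pending.append(batch)
        return rf

    def resend_pending(self) -> int:
        """Resend logged batches; keep any that are still under-replicated.

        Returns the number of batches that reached min_rf on this pass.
        """
        still_pending: List[List[Doc]] = []
        recovered = 0
        for batch in self.pending:
            if self.send(batch) >= self.min_rf:
                recovered += 1
            else:
                still_pending.append(batch)
        self.pending = still_pending
        return recovered
```

The app would call index() for every batch and resend_pending() once the 
replicas are back. Resending should be harmless as long as each doc carries 
its uniqueKey, since a re-added doc overwrites the earlier copy by default.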


Kind Regards,

David




On 4/25/16, 11:57 AM, "Erick Erickson" <erickerick...@gmail.com> wrote:

>bq: I also read that it's up to the
>client to keep track of updates in case commits don't happen on all the
>replicas.
>
>This is not true. Or if it is, it's a bug.
>
>The update cycle is this:
>1> updates get to the leader
>2> updates are sent to all followers and indexed on the leader as well
>3> each replica writes the updates to the local transaction log
>4> all the replicas ack back to the leader
>5> the leader responds to the client.
>
>At this point, all the replicas for the shard have the docs locally
>and can take over as leader.
>
>You may be confusing indexing in batches and having errors with
>updates getting to replicas. When you send a batch of docs to Solr,
>if one of them fails indexing, some of the rest of the docs may not
>be indexed. See SOLR-445 for some work on this front.
>
>That said, bouncing servers willy-nilly during heavy indexing, especially
>if the indexer doesn't know enough to retry when an indexing attempt fails,
>may be the root cause here. Have you verified that your indexing program
>retries in the event of failure?
>
>Best,
>Erick
>
>On Mon, Apr 25, 2016 at 6:13 AM, tedsolr <tsm...@sciquest.com> wrote:
>> I've done a bit of reading - found some other posts with similar questions.
>> So I gather "Optimizing" a collection is rarely a good idea. It does not
>> need to be condensed to a single segment. I also read that it's up to the
>> client to keep track of updates in case commits don't happen on all the
>> replicas. Solr will commit and return success as long as one replica gets
>> the update.
>>
>> I have a state where the two replicas for one collection are out of sync.
>> One has some updates that the other does not. And I don't have log data to
>> tell me what the differences are. This happened during a maintenance window
>> when the servers got restarted while a large index job was running. Normally
>> this doesn't cause a problem, but it did last Thursday.
>>
>> What I plan to do is select the replica I believe is incomplete and delete
>> it. Then add a new one. I was just hoping Solr had a solution for this -
>> maybe using the ZK transaction logs to replay some updates, or force a
>> resync between the replicas.
>>
>> I will also implement a fix to prevent Solr from restarting unless one of
>> its config files has changed. No need to bounce Solr just for kicks.
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Replicas-for-same-shard-not-in-sync-tp4272236p4272602.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
