[ 
https://issues.apache.org/jira/browse/SOLR-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435980#comment-13435980
 ] 

Mark Miller commented on SOLR-3721:
-----------------------------------

{quote}What if two Solrs, respectively running leader and replica for the same 
slice (only one replica), lose their ZK connection at about the same time? Then 
there will be no active shard that either of them can recover from. Could it be 
in such scenarios that multiple concurrent recoveries of the same shard somehow 
get started?{quote}

No recovery will be started if a node cannot talk to ZooKeeper. So nothing 
would happen until one or both of the nodes reconnected to ZooKeeper. That 
would trigger a leader election; the elected leader would attempt to sync up 
with all the other nodes for that shard, and any recoveries would proceed 
against it.
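
Conceptually, that gating looks something like this minimal sketch (the class 
and method names are made up for illustration; this is not Solr's actual 
RecoveryStrategy code):

{code:java}
import java.util.function.BooleanSupplier;

// Illustrative sketch: recovery is only kicked off while the node's
// ZooKeeper session is connected.
public class RecoveryGate {
  private final BooleanSupplier zkConnected; // e.g. () -> zkClient.isConnected()

  public RecoveryGate(BooleanSupplier zkConnected) {
    this.zkConnected = zkConnected;
  }

  /** Starts recovery only if we can currently talk to ZooKeeper. */
  public boolean maybeStartRecovery(Runnable recovery) {
    if (!zkConnected.getAsBoolean()) {
      // Disconnected: do nothing. Reconnecting triggers a leader election,
      // and recovery then proceeds against whichever node wins.
      return false;
    }
    recovery.run();
    return true;
  }
}
{code}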

There is a lock for each core that allows only one recovery for that core to 
happen at a time. I'm not saying there is no bug in this, but that is the 
intention.
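
A minimal illustration of that intent (hypothetical names, not the actual Solr 
implementation; it only shows the "one recovery per core at a time" guard):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of a per-core recovery lock.
public class PerCoreRecoveryLocks {
  private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  /** Runs the recovery only if no other recovery for this core is in progress. */
  public boolean runExclusively(String coreName, Runnable recovery) {
    ReentrantLock lock = locks.computeIfAbsent(coreName, k -> new ReentrantLock());
    if (!lock.tryLock()) {
      return false; // another recovery for this core is already running
    }
    try {
      recovery.run();
      return true;
    } finally {
      lock.unlock();
    }
  }
}
{code}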

{quote}BTW, the scenario above shouldn't end in a situation where the slice is 
just dead. The two shards in the same slice ought to find out who has the 
newest version of the shard-data (will probably be the one that was leader 
last), make that shard the leader (without recovering) and let the other shard 
recover from it. Is this scenario handled (in the way I suggest or in another 
way) already in Solr 4.0 (beta - tip of branch) or is that a future thing (e.g. 
in 4.1 or 5.0)?{quote}

It happens as I mentioned above. A little more detail on the "leader attempts 
to sync up" step:

When a new node is elected as leader by ZooKeeper, it first tries to do a peer 
sync against every other live node. So let's say the first node in your 
two-node situation comes back, and it is behind the other node, but it comes 
back first and is elected leader. The second node has the latest updates, but 
is second in line to be leader and a few updates ahead. The potential leader 
will try to peer sync with the other node and pull in the missing updates if it 
is fewer than 100 updates behind, or fail because the other node is too far 
ahead. If the peer sync fails, the potential leader will give up its leader 
role, realizing there seems to be a better candidate. The other node, being 
next in line to be leader, will now try to peer sync with the other nodes in 
the shard. In this case that will succeed, since it is ahead of the first node. 
It will then ask the other nodes to peer sync to it. If they are fewer than 100 
documents behind, that will succeed. If any sync-back attempt fails, the leader 
asks those nodes to recover, and they will replicate from it. Only after this 
sync process is completed does the leader advertise that it is now the leader 
in the cloud state.
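
The flow above, condensed into a rough sketch (the interfaces and sync calls 
here are hypothetical; the real logic lives in Solr's leader election and 
PeerSync code, and only the 100-update threshold is taken from the description 
above):

{code:java}
import java.util.List;

// Rough sketch of the leader-election sync flow described above.
public class LeaderSyncSketch {

  interface Replica {
    boolean peerSyncFrom(Replica other);      // pull missing updates if close enough
    void requestRecoveryFrom(Replica leader); // full replication fallback
  }

  /** Returns true if this candidate may advertise itself as leader. */
  static boolean tryBecomeLeader(Replica candidate, List<Replica> others) {
    // Step 1: the candidate tries to peer sync FROM every other live replica.
    for (Replica other : others) {
      if (!candidate.peerSyncFrom(other)) {
        // Too far behind (more than ~100 updates): give up the leader role so
        // the next node in line, which is further ahead, can try instead.
        return false;
      }
    }
    // Step 2: ask the others to peer sync TO the candidate; any replica that
    // cannot sync back is asked to recover (full replication) from the leader.
    for (Replica other : others) {
      if (!other.peerSyncFrom(candidate)) {
        other.requestRecoveryFrom(candidate);
      }
    }
    // Only now is the candidate advertised as leader in the cloud state.
    return true;
  }
}
{code}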

That is the current process; I'm sure we will continue hardening and improving 
it.
                
> Multiple concurrent recoveries of same shard?
> ---------------------------------------------
>
>                 Key: SOLR-3721
>                 URL: https://issues.apache.org/jira/browse/SOLR-3721
>             Project: Solr
>          Issue Type: Bug
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Using our own Solr release based on Apache revision 
> 1355667 from the 4.x branch. Our changes to the Solr version are our 
> solutions to TLT-3178 etc., and should have no effect on this issue.
>            Reporter: Per Steffensen
>              Labels: concurrency, multicore, recovery, solrcloud
>             Fix For: 4.0
>
>         Attachments: recovery_in_progress.png, recovery_start_finish.log
>
>
> We run a performance/endurance test on a 7 Solr instance SolrCloud setup, and 
> eventually the Solrs lose their ZK connections and go into recovery. BTW, the 
> recovery often never succeeds, but we are looking into that. While doing that 
> I noticed that, according to the logs, multiple recoveries are in progress at 
> the same time for the same shard. That cannot be intended, and I can certainly 
> imagine that it will cause some problems.
> Is it just the logs that are wrong, did I make some mistake, or is this a 
> real bug?
> See the attached grep from the logs, matching only the "Finished recovery" 
> and "Starting recovery" lines.
> {code}
> grep -B 1 "Finished recovery\|Starting recovery" solr9.log solr8.log 
> solr7.log solr6.log solr5.log solr4.log solr3.log solr2.log solr1.log 
> solr0.log > recovery_start_finish.log
> {code}
> It can be hard to get an overview of the log, but I have generated a graph 
> showing (based solely on the "Starting recovery" and "Finished recovery" 
> logs) how many recoveries are in progress at any time for the different 
> shards. See the attached recovery_in_progress.png. The graph is also a little 
> hard to get an overview of (due to the many shards), but it is clear that for 
> several shards there are multiple recoveries going on at the same time, and 
> that several recoveries never succeed.
> Regards, Per Steffensen
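
A rough sketch of how such an "in progress" count could be derived from the 
grepped lines (this assumes the literal "Starting recovery" / "Finished 
recovery" messages and skips the per-shard grouping used for the actual graph):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch only: counts how many recoveries are in progress at once by scanning
// the grepped log. Per-shard grouping is omitted since the exact log-line
// format is not shown here.
public class RecoveryOverlapCounter {
  public static void main(String[] args) throws IOException {
    int inProgress = 0;
    int maxInProgress = 0;
    for (String line : Files.readAllLines(Paths.get("recovery_start_finish.log"))) {
      if (line.contains("Starting recovery")) inProgress++;
      if (line.contains("Finished recovery")) inProgress = Math.max(0, inProgress - 1);
      maxInProgress = Math.max(maxInProgress, inProgress);
    }
    System.out.println("Max concurrent recoveries observed: " + maxInProgress);
  }
}
{code}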
