[ https://issues.apache.org/jira/browse/SOLR-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436154#comment-13436154 ]

Per Steffensen commented on SOLR-3721:
--------------------------------------

{quote} No recovery will be started if a node cannot talk to zookeeper. {quote}

Well, I knew that. I meant that the two Solrs were disconnected from ZK at the 
same time, but of course both got their connections reestablished - after the 
session timeout (I believe (and kind of hope) that a session timeout has to 
have happened before Solr needs to go into recovery after a ZK connection 
loss).

When it gets prioritized on my side, I will try to investigate further what 
causes the log to claim that many recoveries go on for the same shard 
concurrently.

{quote} When a new node is elected as a leader by ZooKeeper it first tries to 
do a peer sync against every other live node. So lets say the first node in 
your two node situation comes back and he is behind the other node, but he 
comes back first and is elected leader. The second node has the latest updates, 
but is second in line to be leader and a few updates ahead. The potential 
leader will try and peer sync with the other node and get those missing updates 
if it's fewer than 100 or fail because the other node is ahead by too much. 
{quote}
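
OK - so, as I read that, the leader candidate's eligibility check is roughly as 
follows (a sketch with made-up names; only the ~100-update limit comes from 
your description, and this is not the actual PeerSync code):

{code}
// Sketch (made-up names, not the actual PeerSync API) of the peer-sync
// decision described in the quote above.
import java.util.List;

public class PeerSyncSketch {
    // Only this ~100-update window comes from the description above.
    static final int MAX_MISSING_UPDATES = 100;

    /** True if the leader candidate can catch up to the other replica via peer sync. */
    static boolean canPeerSync(List<Long> candidateVersions, List<Long> otherVersions) {
        // Updates the other replica has that the candidate is missing.
        long missing = otherVersions.stream()
                .filter(v -> !candidateVersions.contains(v))
                .count();
        // Few enough: fetch them and the sync succeeds.
        // Too many: peer sync fails and full (file-copy) replication is needed.
        return missing <= MAX_MISSING_UPDATES;
    }
}
{code}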

Well, we shouldn't let this issue (SOLR-3721) become about many other issues, 
but when the "behind" node has reconnected and become leader, and the one with 
the latest updates does not come back live right away, isn't the new leader 
(which is behind) allowed to start handling update requests? If yes, then it 
will be possible that both shards have documents/updates that the other one 
doesn't. E.g. the new leader accepts an update X while the other node is still 
down, but the other node already holds an update Y that the leader never saw; 
if X and Y touch the same document, neither ordering is obviously correct. In 
general it is possible to come up with scenarios where there is no good 
algorithm for generating the "correct" merged union of the data in both 
shards. So what to do when the other shard (which used to have a later version 
than the current leader) comes live? I believe there is nothing solid to do!

How to avoid that? I was thinking about keeping the latest version for every 
slice in ZK, so that a "behind" shard will know whether it has the latest 
version of a slice, and therefore whether it is allowed to take the role as 
leader. Of course the write of this "latest version" to ZK and the write of 
the corresponding update to the leader's transaction log would have to be 
atomic (like the A in ACID) as far as possible. It would also be nice if the 
write of the update to the replica's transaction log were atomic with the 
leader write and the ZK write, in order to increase the chance that a replica 
is actually allowed to take over the leader role if the leader dies (or both 
die and the replica comes back first). But all that is just an idea off the 
top of my head - see the rough sketch below. 
Do you already have a solution implemented, or a solution on the drawing 
board, or how do you/we prevent such a problem? As far as I understand "the 
drill" during leader election/recovery (whether it is peer sync or file-copy 
replication), from the little code reading I have done and from what you 
explain, there is no current solution. But I might be wrong?
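
To make the idea a bit more concrete, here is a rough sketch (pure 
brainstorming, not existing Solr code - the znode path and the gating logic 
are made up) of how a per-slice "latest version" znode could be consulted and 
bumped using plain ZooKeeper conditional writes:

{code}
// Brainstorming sketch (NOT existing Solr code): gate leader eligibility on a
// per-slice "latest version" znode, bumped with a conditional write.
import java.nio.charset.StandardCharsets;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class LatestVersionGate {
    private final ZooKeeper zk;
    private final String path; // e.g. /collections/coll1/slice1/latestVersion (made up)

    public LatestVersionGate(ZooKeeper zk, String path) {
        this.zk = zk;
        this.path = path;
    }

    /** A candidate may become leader only if it holds the latest recorded version. */
    public boolean mayBecomeLeader(long localVersion)
            throws KeeperException, InterruptedException {
        byte[] data = zk.getData(path, false, new Stat());
        long recorded = Long.parseLong(new String(data, StandardCharsets.UTF_8));
        return localVersion >= recorded;
    }

    /** The leader records each update's version with compare-and-set semantics. */
    public void recordUpdate(long newVersion)
            throws KeeperException, InterruptedException {
        Stat stat = new Stat();
        zk.getData(path, false, stat);
        // setData with the expected znode version fails with BadVersionException
        // if another writer raced us, so two "leaders" cannot both bump it.
        zk.setData(path, Long.toString(newVersion).getBytes(StandardCharsets.UTF_8),
                stat.getVersion());
    }
}
{code}

The conditional setData is what gives the (best-effort) atomicity I am after; 
making it atomic with the transaction-log write as well is the hard part.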

{quote} The other node, being the next in line to be leader, will now try and 
peer sync with the other nodes in the shard {quote}

Guess/hope you mean "...with the other shards (running on different nodes) in 
the slice". As I understand Solr terminology, a logical chunk of the "entire 
data" (a collection in Solr) is a "slice", and the data in a slice might 
physically exist in more than one place (in several shards, if replication is 
used). Back when I started my interest in Solr, I spent a considerable amount 
of time understanding Solr terminology - mainly because it is different from 
what I was used to (in my pre-Solr world a "shard" is what you call a 
"slice") - so now please don't tell me that I misunderstood :-)

{quote} That is the current process - we will continually be hardening and 
improving it I'm sure. {quote}

I will probably stick around for that. The correctness and robustness of this 
live-replication feature is (currently) very important to us.

                
> Multiple concurrent recoveries of same shard?
> ---------------------------------------------
>
>                 Key: SOLR-3721
>                 URL: https://issues.apache.org/jira/browse/SOLR-3721
>             Project: Solr
>          Issue Type: Bug
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Using our own Solr release based on Apache revision 
> 1355667 from the 4.x branch. Our changes to the Solr version are our 
> solutions to TLT-3178 etc., and should have no effect on this issue.
>            Reporter: Per Steffensen
>              Labels: concurrency, multicore, recovery, solrcloud
>             Fix For: 4.0
>
>         Attachments: recovery_in_progress.png, recovery_start_finish.log
>
>
> We run a performance/endurance test on a 7-Solr-instance SolrCloud setup, 
> and eventually the Solrs lose their ZK connections and go into recovery. BTW 
> the recoveries often never succeed, but we are looking into that. While 
> doing that, I noticed that, according to the logs, multiple recoveries are 
> in progress at the same time for the same shard. That cannot be intended, 
> and I can certainly imagine that it will cause some problems.
> Is it just the logs that are wrong, did I make some mistake, or is this a 
> real bug?
> See the attached grep from the logs, matching only the "Finished recovery" 
> and "Starting recovery" lines:
> {code}
> grep -B 1 "Finished recovery\|Starting recovery" solr9.log solr8.log 
> solr7.log solr6.log solr5.log solr4.log solr3.log solr2.log solr1.log 
> solr0.log > recovery_start_finish.log
> {code}
> It can be hard to get an overview of the log, but I have generated a graph 
> showing (based solely on the "Starting recovery" and "Finished recovery" 
> logs) how many recoveries are in progress at any time for the different 
> shards. See the attached recovery_in_progress.png. The graph is also a 
> little hard to get an overview of (due to the many shards), but it is clear 
> that for several shards there are multiple recoveries going on at the same 
> time, and that several recoveries never succeed. A rough sketch of the 
> counting is shown below.
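> A minimal sketch of the counting (the "core=" extraction is an assumption - 
> adjust it to the real log layout):
> {code}
> // Count in-progress recoveries per core from "Starting recovery" /
> // "Finished recovery" log lines, flagging overlaps.
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Paths;
> import java.util.HashMap;
> import java.util.Map;
> 
> public class RecoveryCounter {
>     public static void main(String[] args) throws IOException {
>         Map<String, Integer> inProgress = new HashMap<>();
>         for (String line : Files.readAllLines(Paths.get("recovery_start_finish.log"))) {
>             // Assumed: the core name appears as "core=<name>" in the log line.
>             String core = line.replaceAll(".*core=([^ ,]+).*", "$1");
>             if (line.contains("Starting recovery")) {
>                 int n = inProgress.merge(core, 1, Integer::sum);
>                 if (n > 1) System.out.println(n + " concurrent recoveries of " + core);
>             } else if (line.contains("Finished recovery")) {
>                 inProgress.merge(core, -1, Integer::sum);
>             }
>         }
>     }
> }
> {code}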
> Regards, Per Steffensen
