[ 
https://issues.apache.org/jira/browse/SOLR-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864475#comment-13864475
 ] 

Mark Miller commented on SOLR-5615:
-----------------------------------

Even with the other changes, I like the idea of using a background thread. 
I don't think it's right that we run the whole reconnect process before 
we mark ourselves as connected to ZK and return from the connection manager. I 
really don't think that process should hold up the connection manager at all; 
the connection manager is only meant to trigger it.
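
A rough sketch of the background-thread idea, assuming nothing about the 
actual patch: ReconnectSketch, onSessionReestablished and the 
"reconnect-worker" thread are hypothetical names, and a single-thread 
executor stands in for the ZK event thread. The reconnect callback is only 
handed off, so the event thread stays free to deliver the watch update that 
the callback is waiting for.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a connection manager; not Solr's actual class. */
final class ReconnectSketch {
    private volatile boolean connected = false;

    // Dedicated worker so slow reconnect work never runs on the event thread.
    private final ExecutorService reconnectWorker =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "reconnect-worker");
            t.setDaemon(true);
            return t;
        });

    /** Called on the single event thread once the session is re-established. */
    void onSessionReestablished(Runnable onReconnect) {
        reconnectWorker.submit(onReconnect); // only trigger the work
        connected = true;                    // mark connected and return at once
    }

    boolean isConnected() {
        return connected;
    }

    /** Simulates the scenario; returns true if the reconnect work finished in time. */
    static boolean demo() {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        ReconnectSketch manager = new ReconnectSketch();
        CountDownLatch watchFired = new CountDownLatch(1);
        CountDownLatch reconnectDone = new CountDownLatch(1);

        // The reconnect work waits for a watch update, as registering a replica does.
        Runnable reconnectWork = () -> {
            try {
                if (watchFired.await(5, TimeUnit.SECONDS)) {
                    reconnectDone.countDown();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Runnable reconnectEvent = () -> manager.onSessionReestablished(reconnectWork);
        Runnable watchEvent = () -> watchFired.countDown();

        try {
            eventThread.submit(reconnectEvent); // event 1: session re-established
            eventThread.submit(watchEvent);     // event 2: watch update, same thread
            return reconnectDone.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            eventThread.shutdownNow();
            manager.reconnectWorker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("reconnect completed: " + demo());
    }
}
```

The trade-off in this sketch is that the connected flag is set before the 
reconnect work has finished, so callers cannot treat "connected" as meaning 
"fully re-registered".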

> Deadlock while trying to recover after a ZK session expiry
> ----------------------------------------------------------
>
>                 Key: SOLR-5615
>                 URL: https://issues.apache.org/jira/browse/SOLR-5615
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.4, 4.5, 4.6
>            Reporter: Ramkumar Aiyengar
>            Assignee: Mark Miller
>             Fix For: 5.0, 4.7, 4.6.1
>
>         Attachments: SOLR-5615.patch, SOLR-5615.patch
>
>
> The sequence of events which might trigger this is as follows:
>  - Leader of a shard, say OL, has a ZK expiry
>  - The new leader, NL, starts the election process
>  - NL, through Overseer, clears the current leader (OL) for the shard from 
> the cluster state
>  - OL reconnects to ZK, calls onReconnect from event thread (main-EventThread)
>  - OL marks itself down
>  - OL sets up watches for cluster state, and then retrieves it (with no 
> leader for this shard)
>  - NL, through Overseer, updates cluster state to mark itself leader for the 
> shard
>  - OL tries to register itself as a replica and, still on the event thread, 
> waits until the cluster state is updated with the new leader
>  - ZK sends a watch update to OL, but it cannot be delivered because the 
> event thread is blocked waiting for that very update.
> Oops. OL only breaks out of this when its attempt to register itself as a 
> replica times out after 20 mins.
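
The stall in the sequence above can be reproduced in miniature. This is only 
a simulation under stated assumptions: EventThreadStallSketch and 
reconnectSawWatch are hypothetical names, no ZooKeeper or Solr classes are 
involved, a single-thread executor plays the role of main-EventThread, and a 
short timeout replaces the 20 minute one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical simulation of the stall; not taken from Solr. */
final class EventThreadStallSketch {

    /**
     * Runs the "onReconnect" work inline on a single event thread and reports
     * whether it saw the watch update before the timeout expired.
     */
    static boolean reconnectSawWatch(long timeoutMillis) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch leaderPublished = new CountDownLatch(1);

        // Event 1: onReconnect blocks the event thread waiting for the new leader.
        Callable<Boolean> onReconnect =
            () -> leaderPublished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        // Event 2: the watch update that would release the wait above.
        Runnable watchUpdate = () -> leaderPublished.countDown();

        try {
            Future<Boolean> result = eventThread.submit(onReconnect);
            // Queued behind event 1 on the same thread, so it cannot run
            // until the wait has already timed out.
            eventThread.submit(watchUpdate);
            return result.get();
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            return false;
        } finally {
            eventThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("saw watch update: " + reconnectSawWatch(200));
    }
}
```

Because both events share one thread, the wait can only end by timing out, 
which matches the behaviour described in the report.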



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
