[ 
https://issues.apache.org/jira/browse/SOLR-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885127#comment-17885127
 ] 

Christos Malliaridis commented on SOLR-5945:
--------------------------------------------

This ticket seems to be related to the SolrZooKeeper class, which has since 
been removed (see SOLR-16114); ZooKeeper is now used directly instead. Should 
this be resolved as "won't fix" or "abandoned"?

> Add retry for zookeeper reconnect failure
> -----------------------------------------
>
>                 Key: SOLR-5945
>                 URL: https://issues.apache.org/jira/browse/SOLR-5945
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 4.7
>            Reporter: Jessica Cheng Mallet
>            Priority: Major
>              Labels: solrcloud, zookeeper
>         Attachments: solr_6_6-5945.patch
>
>
> We had a network issue in which we temporarily lost connectivity and DNS. 
> The ZooKeeper client properly triggered the watcher. However, on the 
> reconnect attempt, the following exception was thrown:
> 2014-03-27 17:24:46,882 ERROR [main-EventThread] SolrException.java (line 121) :java.net.UnknownHostException: <host name (scrubbed)>: Name or service not known
>         at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
>         at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:866)
>         at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1258)
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1211)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1063)
>         at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
>         at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
>         at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
>         at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41)
>         at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53)
>         at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:147)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> I looked at the code and it seems that there are no further retries to 
> connect to ZooKeeper, so the node is basically left in a bad state and 
> will not recover on its own. (Please correct me if I'm reading this wrong.) 
> Thinking about it, this is probably fair, since normally you wouldn't expect 
> retries to fix an "unknown host" issue (even though in our case it would 
> have) but I'm wondering what we should do to handle this situation if it 
> happens again in the future.
> Any advice is appreciated.
> From Mark Miller:
> We don’t currently retry, but I don’t think it would hurt much if we did - at 
> least briefly.
> If you want to file a JIRA issue, that would be the best way to get it in a 
> future release.
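
For illustration, the brief retry that Mark Miller describes could look 
roughly like the sketch below. It is not taken from Solr or from the attached 
patch: the class name ReconnectRetry, the method withRetry, and its parameters 
are all hypothetical. It assumes the connect action fails with 
java.net.UnknownHostException, as in the trace above, and retries only that 
exception a bounded number of times with a capped, doubling delay.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Solr): briefly retries a connect action. */
public final class ReconnectRetry {

    private ReconnectRetry() {}

    /**
     * Calls connect up to maxAttempts times. After each failed attempt except
     * the last, sleeps for a delay that starts at initialDelayMs and doubles
     * each time, capped at maxDelayMs. Only UnknownHostException is retried;
     * any other exception propagates immediately.
     */
    public static <T> T withRetry(Callable<T> connect, int maxAttempts,
                                  long initialDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (UnknownHostException e) {
                last = e; // remember the failure in case every attempt fails
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs);
                    delayMs = Math.min(delayMs * 2, maxDelayMs);
                }
            }
        }
        // Every attempt failed with UnknownHostException; surface the last one.
        throw last;
    }
}
```

Under these assumptions, the client construction seen in the trace could be 
wrapped as ReconnectRetry.withRetry(() -> new ZooKeeper(connectString, 
sessionTimeoutMs, watcher), 5, 200, 2000), where connectString, 
sessionTimeoutMs, and watcher stand in for whatever the caller already has. 
Keeping the attempt count small matches the "at least briefly" suggestion and 
avoids blocking the event thread indefinitely when the host name really is wrong.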



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
