[ 
https://issues.apache.org/jira/browse/YARN-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950642#comment-14950642
 ] 

Junping Du commented on YARN-4243:
----------------------------------

Thanks for reporting the issue and delivering the patch, [~xgong]! 
The patch makes sense overall. Some minor comments:
1. I think we are adding a new configuration here, so we may want to add it to 
yarn-default.xml as well. This is only for documentation purposes, so we don't 
have to specify a default value.
2. Do we need to add another configuration for the sleep interval between 
retries? Hard-coding it to 5 seconds seems inflexible.
3. If the connection still fails after the maximum number of retries, shall we 
include the retry count in the error message as well? For example: "Can not 
establish Zookeeper Connection... after retrying x times".
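
To illustrate comments 2 and 3, here is a minimal sketch of a retry loop with a 
configurable sleep interval and the retry count in the final error message. It 
is not the code in the patch: the class name ZkConnectRetrySketch, the method 
connectWithRetry, and the parameters maxRetries and retryIntervalMs are all 
hypothetical, and the Callable stands in for whatever call actually 
establishes the ZooKeeper connection.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical illustration only; none of these names come from the patch.
final class ZkConnectRetrySketch {

  /**
   * Tries connect once, then retries up to maxRetries more times,
   * sleeping retryIntervalMs between attempts. Both values are assumed
   * to come from configuration rather than being hard coded.
   */
  static <T> T connectWithRetry(Callable<T> connect, int maxRetries,
      long retryIntervalMs) throws IOException, InterruptedException {
    Exception lastFailure = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return connect.call();
      } catch (InterruptedException e) {
        // Do not swallow interruption; let the caller handle it.
        throw e;
      } catch (Exception e) {
        lastFailure = e;
        if (attempt < maxRetries) {
          // Wait before the next attempt, but not after the last one.
          Thread.sleep(retryIntervalMs);
        }
      }
    }
    // Report how many retries were made, as suggested in comment 3.
    throw new IOException("Can not establish Zookeeper connection after "
        + "retrying " + maxRetries + " times", lastFailure);
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated connection that fails twice and then succeeds.
    String result = connectWithRetry(() -> {
      if (++calls[0] < 3) {
        throw new IOException("simulated connection failure");
      }
      return "connected";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

Passing the interval in as a parameter keeps the loop independent of where the 
value is read from, so it could come from a new configuration key or fall back 
to the current 5 seconds.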

> Add retry on establishing Zookeeper connection in 
> EmbeddedElectorService#serviceInit
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-4243
>                 URL: https://issues.apache.org/jira/browse/YARN-4243
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-4243.1.patch
>
>
> Right now, the RM shuts down if the ZK connection is down while the RM is 
> initializing. We need to add retry logic to this part.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)