[ https://issues.apache.org/jira/browse/HDFS-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525133#comment-14525133 ]
Hadoop QA commented on HDFS-4101:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 51s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | javac | 1m 1s | The patch appears to cause the build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12550528/HDFS-4101-2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10712/console |

This message was automatically generated.

> ZKFC should implement zookeeper.recovery.retry like HBase to connect to ZooKeeper
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-4101
>                 URL: https://issues.apache.org/jira/browse/HDFS-4101
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: auto-failover, ha
>    Affects Versions: 2.0.0-alpha, 3.0.0
>         Environment: running CDH4.1.1
>            Reporter: Damien Hardy
>            Assignee: Damien Hardy
>            Priority: Minor
>              Labels: newbie
>         Attachments: HDFS-4101-2.patch
>
>
> When ZKFC starts and ZooKeeper is not yet running, ZKFC fails and stops immediately.
> Maybe ZKFC should allow some retries against the ZooKeeper service, as HBase does
> with zookeeper.recovery.retry.
> This happens in particular when I start my whole cluster on VirtualBox, for
> example (every component at nearly the same time): ZKFC is the only one that fails
> and stops.
> All the others (NameNode/DataNode/JournalNode/ZooKeeper/HBaseMaster/HBaseRS) can
> wait for each other for some time, independently of the start order, so that the
> system is up and stable within a few seconds.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)