[ https://issues.apache.org/jira/browse/AMBARI-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Magyari Sandor Szilard resolved AMBARI-17901.
---------------------------------------------
    Resolution: Won't Fix

> Make HDFS operations resilient to namenode safemode
> ---------------------------------------------------
>
>                 Key: AMBARI-17901
>                 URL: https://issues.apache.org/jira/browse/AMBARI-17901
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.4.0
>            Reporter: Magyari Sandor Szilard
>            Assignee: Magyari Sandor Szilard
>             Fix For: 3.0.0
>
>
> HdfsResourceJar and HdfsResourceWebHDFS (WebHDFSUtil) are the classes that
> carry out the HDFS operations. All retryable operations (e.g. SETPERMISSION)
> should be guarded with retry logic that retries the operation until a
> given timeout expires before giving up and bailing out.
> Determining which HDFS operations are retryable might be as easy as
> looking at the returned status/error code or the type of the exception (e.g.
> "RetriableException"), though it needs to be verified that this is
> consistent across both the webhdfs and hdfsresource jar.
> This problem came up in https://issues.apache.org/jira/browse/AMBARI-17182
> when starting all services after enabling HA.
> The retry count and timeout should be clarified, as it may sometimes take a
> long time for the namenode to exit safemode.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
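
The retry guard described in the issue could be sketched roughly as follows. This is only an illustration, not Ambari code: the class RetryUntilTimeout, its methods, and the pause parameter are hypothetical names. Matching on the simple class name "RetriableException" is an assumption taken from the issue text, which itself notes that this still has to be verified for both the webhdfs and the hdfsresource jar paths.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper, not part of Ambari or Hadoop.
final class RetryUntilTimeout {
    private RetryUntilTimeout() {}

    /**
     * Runs the operation, retrying while the failure is classified as
     * retryable and the overall timeout has not yet elapsed.
     */
    static <T> T call(Callable<T> operation,
                      Predicate<Exception> isRetryable,
                      Duration timeout,
                      Duration pause) throws Exception {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return operation.call();
            } catch (Exception e) {
                long remainingNanos = deadline - System.nanoTime();
                // Give up on non-retryable failures or once the timeout is spent.
                if (!isRetryable.test(e) || remainingNanos <= 0) {
                    throw e;
                }
                // Sleep for the configured pause, but never past the deadline.
                long remainingMillis = Math.max(1L, remainingNanos / 1_000_000L);
                Thread.sleep(Math.min(pause.toMillis(), remainingMillis));
            }
        }
    }

    /**
     * Assumed classifier: treats a failure as retryable if any exception in
     * the cause chain is named "RetriableException". Whether this holds for
     * both HDFS access paths is exactly what the issue says must be checked.
     */
    static boolean looksRetryable(Exception e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            if ("RetriableException".equals(t.getClass().getSimpleName())) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would wrap each retryable operation (e.g. SETPERMISSION) in RetryUntilTimeout.call(...) and pass RetryUntilTimeout::looksRetryable as the predicate. The sketch bounds the wait with a deadline rather than a fixed retry count because the issue points out that the namenode may take a long time to exit safemode. Suitable timeout and pause values are left open here, as they are in the issue.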