Hi Subru and Arun. Thanks for driving the 2.9 release. Great work!
I installed a cluster built from source.
- Ran a few MR jobs with application priority enabled. They run fine.
- Accessed the new UI, and it also seems fine.

However, I am also getting the same issue that Rohith reported.
- Started an HA cluster
- Pushed RM to standby
- Pushed RM back to active, then saw an exception:

org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
        at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
        at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)

Will check and post more details.

- Sunil

On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <rohithsharm...@apache.org> wrote:

> Thanks Subru/Arun for the great work!
>
> Downloaded source and built from it. Deployed an RM HA non-secured cluster
> along with the new YARN UI and ATSv2.
>
> I am facing a basic RM HA switch issue after the first successful start.
> *Is anyone else facing this issue?*
>
> When RM is switched from ACTIVE to STANDBY to ACTIVE, RM never switches to
> active successfully.
> The exception trace I see in the log is:
>
> 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector:
> Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>         at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
>         at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>         at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
>         at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
>         at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
>         ... 4 more
> Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>         at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167)
>         at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
>         ... 5 more
> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>         at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
>         at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
>         at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
>         at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
>         at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
>         at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
>         at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>         at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
>         at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403)
>         at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>         ... 13 more
>
> Thanks & Regards
> Rohith Sharma K S
>
> On 4 November 2017 at 04:20, Arun Suresh <asur...@apache.org> wrote:
>
> > Hi folks,
> >
> > Apache Hadoop 2.9.0 is the first stable release of the Hadoop 2.9 line and
> > will be the latest stable/production release for Apache Hadoop. It
> > includes 30 New Features with 500+ subtasks, 407 Improvements, and 787 Bug
> > fixes, all newly fixed issues since 2.8.2.
> >
> > More information about the 2.9.0 release plan can be found here:
> > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> >
> > New RC is available at:
> > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
> >
> > The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
> >
> > The maven artifacts are available via repository.apache.org at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> >
> > Please try the release and vote; the vote will run for the usual 5
> > days, ending on 11/10/2017 4pm PST time.
> >
> > Thanks,
> >
> > Arun/Subru