Thanks for testing, Rohith and Sunil.

Can you please confirm that this is not a config issue on your end?
We (Jonathan and I) just tried testing this on a fresh cluster
(both automatic and manual) and were not able to reproduce it. I've
updated the YARN-7453 <https://issues.apache.org/jira/browse/YARN-7453> JIRA
with the details of our testing.
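
For anyone triaging the trace quoted below: the outer ServiceFailedException
only wraps the real failure, which is the innermost "Caused by" entry
(a ZooKeeper NoAuthException, i.e. ZooKeeper rejected the operation on ACL
grounds). Here is a minimal sketch of walking a cause chain to its root.
The class name CauseChain and the stand-in exceptions in main are
hypothetical; this is not Hadoop or ZooKeeper code, only plain JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop or ZooKeeper.
final class CauseChain {

    private CauseChain() {}

    /** Returns the throwable followed by each nested cause, outermost first. */
    static List<Throwable> causes(Throwable t) {
        List<Throwable> chain = new ArrayList<>();
        // Stop at the end of the chain, or if a cause loops back on itself.
        while (t != null && !containsSame(chain, t)) {
            chain.add(t);
            t = t.getCause();
        }
        return chain;
    }

    /** Returns the innermost cause, or null when given null. */
    static Throwable rootCause(Throwable t) {
        List<Throwable> chain = causes(t);
        return chain.isEmpty() ? null : chain.get(chain.size() - 1);
    }

    // Identity comparison, so distinct exceptions with equal messages differ.
    private static boolean containsSame(List<Throwable> chain, Throwable t) {
        for (Throwable seen : chain) {
            if (seen == t) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in exceptions that mirror the nesting in the reported trace.
        Exception zk = new IllegalStateException("KeeperErrorCode = NoAuth");
        Exception start =
            new RuntimeException("Error when transitioning to Active mode", zk);
        Exception top =
            new RuntimeException("RM could not transition to Active", start);
        System.out.println(rootCause(top).getMessage());
    }
}
```

In the reported trace the root frame sits under
ZKRMStateStore.getAndIncrementEpoch, so the ACLs on the state-store znodes
are a reasonable first thing to compare between clusters.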

Cheers
-Arun/Subru

On Tue, Nov 7, 2017 at 3:17 AM, Rohith Sharma K S <rohithsharm...@apache.org> wrote:

> Thanks for the confirmation, Sunil. By the way, I have raised the YARN-7453
> <https://issues.apache.org/jira/browse/YARN-7453> JIRA to track this
> issue.
>
> - Rohith Sharma K S
>
> On 7 November 2017 at 16:44, Sunil G <sun...@apache.org> wrote:
>
>> Hi Subru and Arun.
>>
>> Thanks for driving the 2.9 release. Great work!
>>
>> I installed a cluster built from source.
>> - Ran a few MR jobs with application priority enabled. They run fine.
>> - Accessed the new UI, and it also seems fine.
>>
>> However, I am also hitting the same issue that Rohith reported.
>> - Started an HA cluster
>> - Pushed the RM to standby
>> - Pushed the RM back to active, and then saw an exception.
>>
>> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>>         at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
>>         at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>>
>> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>>         at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
>>
>> I will check and post more details.
>>
>> - Sunil
>>
>>
>> On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <rohithsharm...@apache.org> wrote:
>>
>> > Thanks Subru/Arun for the great work!
>> >
>> > Downloaded the source and built from it. Deployed a non-secure RM HA cluster
>> > along with the new YARN UI and ATSv2.
>> >
>> > I am facing a basic RM HA switch issue after the first successful start.
>> > *Is anyone else facing this issue?*
>> >
>> > When the RM is switched from ACTIVE to STANDBY and back to ACTIVE, it never
>> > switches to active successfully. The exception trace I see in the log is:
>> >
>> > 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
>> > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
>> >     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>> >     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
>> >     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
>> >     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
>> > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
>> >     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
>> >     ... 4 more
>> > Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>> >     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>> >     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167)
>> >     at java.security.AccessController.doPrivileged(Native Method)
>> >     at javax.security.auth.Subject.doAs(Subject.java:422)
>> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
>> >     ... 5 more
>> > Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>> >     at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
>> >     at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
>> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
>> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
>> >     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
>> >     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
>> >     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
>> >     at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403)
>> >     at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
>> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754)
>> >     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>> >     ... 13 more
>> >
>> > Thanks & Regards
>> > Rohith Sharma K S
>> >
>> > On 4 November 2017 at 04:20, Arun Suresh <asur...@apache.org> wrote:
>> >
>> > > Hi folks,
>> > >
>> > >      Apache Hadoop 2.9.0 is the first stable release of the Hadoop 2.9
>> > > line and will be the latest stable/production release for Apache Hadoop.
>> > > It includes 30 new features with 500+ subtasks, 407 improvements, and
>> > > 787 bug fixes, all issues newly fixed since 2.8.2.
>> > >
>> > >       More information about the 2.9.0 release plan can be found here:
>> > > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>> > >
>> > >       New RC is available at:
>> > > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
>> > >
>> > >       The RC tag in git is release-2.9.0-RC0, and the latest commit id is:
>> > > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
>> > >
>> > >       The maven artifacts are available via repository.apache.org at:
>> > > https://repository.apache.org/content/repositories/orgapachehadoop-1065/
>> > >
>> > >       Please try the release and vote; the vote will run for the usual
>> > > 5 days, ending on 11/10/2017 at 4pm PST.
>> > >
>> > > Thanks,
>> > >
>> > > Arun/Subru
>> > >
>> >
>>
>
>
