Hi Gour,

I added the following properties to /etc/hadoop/conf/yarn-site.xml, emptied
/data/slider/conf/slider-client.xml, and restarted both RMs:

- hadoop.registry.zk.quorum
- hadoop.registry.zk.root
- slider.yarn.queue

Now there are no issues in creating or destroying the cluster. This helps, as
it keeps all configs in one location - thanks for the update.

I am still hitting the original issue - starting the application with RM1
active and then failing over from RM1 to RM2 leads to the slider AM getting
"Client cannot authenticate via:[TOKEN]" errors. I will upload the config
files soon ...

Thanks,

On Thu, Jul 28, 2016 at 5:28 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:

> Thanks. I will test with the updated config and then upload the latest
> ones ...
>
> Thanks,
>
> Manoj
>
> On Thu, Jul 28, 2016 at 5:21 PM, Gour Saha <gs...@hortonworks.com> wrote:
>
>> slider.zookeeper.quorum is deprecated and should not be used.
>> hadoop.registry.zk.quorum is used instead, and is typically defined in
>> yarn-site.xml. So is hadoop.registry.zk.root.
>>
>> It is not encouraged to specify slider.yarn.queue at the cluster config
>> level. Ideally it is best to specify the queue during application
>> submission, so you can use the --queue option with the slider create cmd.
>> You can also set it on the command line using -D slider.yarn.queue=<>
>> during the create call. If indeed all slider apps should go to one and
>> only one queue, then this prop can be specified in any one of the
>> existing site xml files under /etc/hadoop/conf.
>>
>> -Gour
>>
>> On 7/28/16, 4:43 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>>
>> >The following slider-specific properties are at present added in
>> >/data/slider/conf/slider-client.xml. If you think they should be picked
>> >up from HADOOP_CONF_DIR (/etc/hadoop/conf), which file in
>> >HADOOP_CONF_DIR should these be added to?
>> >
>> > - slider.zookeeper.quorum
>> > - hadoop.registry.zk.quorum
>> > - hadoop.registry.zk.root
>> > - slider.yarn.queue
>> >
>> >On Thu, Jul 28, 2016 at 4:37 PM, Gour Saha <gs...@hortonworks.com> wrote:
>> >
>> >> That is strange, since it is indeed not required to contain anything
>> >> in slider-client.xml (except <configuration></configuration>) if
>> >> HADOOP_CONF_DIR has everything that Slider needs. This probably gives
>> >> an indication that there might be some issue with the cluster
>> >> configuration based on files solely under HADOOP_CONF_DIR to begin
>> >> with.
>> >>
>> >> Suggest you upload all the config files to the jira to help debug
>> >> this further.
>> >>
>> >> -Gour
>> >>
>> >> On 7/28/16, 4:27 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >>
>> >> >Thanks Gour for the prompt reply.
>> >> >
>> >> >BTW - creating an empty slider-client.xml (with just
>> >> ><configuration></configuration>) does not work. The AM starts but
>> >> >fails to create any components and shows errors like
>> >> >
>> >> >2016-07-28 23:18:46,018 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> >> >java.net.ConnectException: Connection refused
>> >> >    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>> >> >    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>> >> >    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>> >> >    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
>> >> >
>> >> >Also, the command "slider destroy <app>" fails with zookeeper
>> >> >errors ...
>> >> >
>> >> >I had to keep a minimal slider-client.xml. It does not have any RM
>> >> >info etc., but does contain slider ZK-related properties like
>> >> >"slider.zookeeper.quorum", "hadoop.registry.zk.quorum", and
>> >> >"hadoop.registry.zk.root". I haven't yet distilled the absolute
>> >> >minimal set of properties required, but this should suffice for now.
>> >> >All RM / HDFS properties will be read from the HADOOP_CONF_DIR files.
>> >> >
>> >> >Let me know if this could cause any issues.
>> >> >
>> >> >On Thu, Jul 28, 2016 at 3:36 PM, Gour Saha <gs...@hortonworks.com> wrote:
>> >> >
>> >> >> No need to copy any files. Pointing HADOOP_CONF_DIR to
>> >> >> /etc/hadoop/conf is good.
>> >> >>
>> >> >> -Gour
>> >> >>
>> >> >> On 7/28/16, 3:24 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >> >>
>> >> >> >A follow-up question regarding Gour's comment in the earlier
>> >> >> >thread -
>> >> >> >
>> >> >> >Slider is installed on one of the hadoop nodes. The
>> >> >> >SLIDER_HOME/conf directory (say /data/slider/conf) is different
>> >> >> >from HADOOP_CONF_DIR (/etc/hadoop/conf). Is it
>> >> >> >required/recommended that files in HADOOP_CONF_DIR be copied to
>> >> >> >SLIDER_HOME/conf and the slider-env.sh script set HADOOP_CONF_DIR
>> >> >> >to /data/slider/conf?
>> >> >> >
>> >> >> >Or can slider-env.sh set HADOOP_CONF_DIR to /etc/hadoop/conf,
>> >> >> >without copying the files?
>> >> >> >
>> >> >> >Using slider .80 for now, but would like to know the
>> >> >> >recommendation for this and future versions as well.
>> >> >> >
>> >> >> >Thanks in advance,
>> >> >> >
>> >> >> >Manoj
>> >> >> >
>> >> >> >On Tue, Jul 26, 2016 at 3:27 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>> >> >> >
>> >> >> >> Filed https://issues.apache.org/jira/browse/SLIDER-1158 with
>> >> >> >> logs and my analysis of the logs.
>> >> >> >>
>> >> >> >> On Tue, Jul 26, 2016 at 10:36 AM, Gour Saha <gs...@hortonworks.com> wrote:
>> >> >> >>
>> >> >> >> > Please file a JIRA and upload the logs to it.
>> >> >> >> >
>> >> >> >> > On 7/26/16, 10:21 AM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >> >> >> >
>> >> >> >> >> Hi Gour,
>> >> >> >> >>
>> >> >> >> >> Can you please reach me using your own email-id? I will then
>> >> >> >> >> send the logs to you, along with my analysis - I don't want
>> >> >> >> >> to send logs on the public list.
>> >> >> >> >>
>> >> >> >> >> Thanks,
>> >> >> >> >>
>> >> >> >> >> On Mon, Jul 25, 2016 at 5:39 PM, Gour Saha <gs...@hortonworks.com> wrote:
>> >> >> >> >>
>> >> >> >> >> > Ok, so this node is not a gateway. It is part of the
>> >> >> >> >> > cluster, which means you don't need slider-client.xml at
>> >> >> >> >> > all. Just have HADOOP_CONF_DIR pointing to /etc/hadoop/conf
>> >> >> >> >> > in slider-env.sh and that should be it.
>> >> >> >> >> >
>> >> >> >> >> > So the above simplifies your config setup. It will not
>> >> >> >> >> > solve either of the 2 problems you are facing.
>> >> >> >> >> >
>> >> >> >> >> > Now coming to the 2 issues you are facing, you have to
>> >> >> >> >> > provide additional logs for us to understand better. Let's
>> >> >> >> >> > start with -
>> >> >> >> >> > 1. RM logs (specifically between the time when the
>> >> >> >> >> > rm1->rm2 failover is simulated)
>> >> >> >> >> > 2. Slider App logs
>> >> >> >> >> >
>> >> >> >> >> > -Gour
>> >> >> >> >> >
>> >> >> >> >> > On 7/25/16, 5:16 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >> >> >> >> >
>> >> >> >> >> >> 1. Not clear about your question on the "gateway" node.
>> >> >> >> >> >> The node running slider is part of the hadoop cluster,
>> >> >> >> >> >> and there are other services like Oozie that run on this
>> >> >> >> >> >> node and utilize hdfs and yarn. So if your question is
>> >> >> >> >> >> whether the node is otherwise working for HDFS and Yarn
>> >> >> >> >> >> configuration, it is working.
>> >> >> >> >> >> 2. I copied all files from HADOOP_CONF_DIR (say
>> >> >> >> >> >> /etc/hadoop/conf) to the directory containing
>> >> >> >> >> >> slider-client.xml (say /data/latest/conf).
>> >> >> >> >> >> 3. In an earlier email, I had made a mistake where the
>> >> >> >> >> >> slider-env.sh HADOOP_CONF_DIR was pointing to the
>> >> >> >> >> >> original directory /etc/hadoop/conf. I edited it to point
>> >> >> >> >> >> to the same directory containing slider-client.xml &
>> >> >> >> >> >> slider-env.sh, i.e. /data/latest/conf.
>> >> >> >> >> >> 4. I emptied slider-client.xml. It just had
>> >> >> >> >> >> <configuration></configuration>. The creation of apps
>> >> >> >> >> >> worked, but the Slider AM still shows the same issue,
>> >> >> >> >> >> i.e. when RM1 goes from active to standby, the slider AM
>> >> >> >> >> >> goes from RUNNING to ACCEPTED state with the same error
>> >> >> >> >> >> about TOKEN. Also NOTE that when slider-client.xml is
>> >> >> >> >> >> empty, the "slider destroy xxx" command still fails with
>> >> >> >> >> >> Zookeeper connection errors.
>> >> >> >> >> >> 5. I then added the same parameters (as in my last email
>> >> >> >> >> >> - except HADOOP_CONF_DIR) to slider-client.xml and ran.
>> >> >> >> >> >> This time slider-env.sh has HADOOP_CONF_DIR pointing to
>> >> >> >> >> >> /data/latest/conf and slider-client.xml does not have
>> >> >> >> >> >> HADOOP_CONF_DIR. The same issue exists (but "slider
>> >> >> >> >> >> destroy" does not fail).
>> >> >> >> >> >> 6. Could you explain what you expect to pick up from the
>> >> >> >> >> >> Hadoop configurations that will help with the RM token?
>> >> >> >> >> >> If slider has a token from RM1, and it switches to RM2,
>> >> >> >> >> >> it is not clear what slider does to get a delegation
>> >> >> >> >> >> token for RM2 communication.
>> >> >> >> >> >> 7. It is worth repeating that the issue happens only when
>> >> >> >> >> >> RM1 was active when the slider app was created and RM1
>> >> >> >> >> >> then becomes standby. If RM2 was active when the slider
>> >> >> >> >> >> app was created, then the slider AM keeps running for any
>> >> >> >> >> >> number of switches between RM2 and RM1, back and forth ...
>> >> >> >> >> >>
>> >> >> >> >> >> On Mon, Jul 25, 2016 at 4:21 PM, Gour Saha <gs...@hortonworks.com> wrote:
>> >> >> >> >> >>
>> >> >> >> >> >> > The node you are running slider from, is that a gateway
>> >> >> >> >> >> > node? Sorry for not being explicit. I meant copy
>> >> >> >> >> >> > everything under /etc/hadoop/conf from your cluster
>> >> >> >> >> >> > into some temp directory (say /tmp/hadoop_conf) on your
>> >> >> >> >> >> > gateway node or local or whichever node you are running
>> >> >> >> >> >> > slider from. Then set HADOOP_CONF_DIR to
>> >> >> >> >> >> > /tmp/hadoop_conf and clear everything out from
>> >> >> >> >> >> > slider-client.xml.
>> >> >> >> >> >> >
>> >> >> >> >> >> > On 7/25/16, 4:12 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >> >> >> >> >> >
>> >> >> >> >> >> >> Hi Gour,
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Thanks for your prompt reply.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> FYI, the issue happens when I create the slider app
>> >> >> >> >> >> >> while rm1 is active and rm1 then fails over to rm2. As
>> >> >> >> >> >> >> soon as rm2 becomes active, the slider AM goes from
>> >> >> >> >> >> >> RUNNING to ACCEPTED state with the above error.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> For your suggestion, I did the following:
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> 1) Copied core-site, hdfs-site, yarn-site, and
>> >> >> >> >> >> >> mapred-site from HADOOP_CONF_DIR to the slider conf
>> >> >> >> >> >> >> directory.
>> >> >> >> >> >> >> 2) Our slider-env.sh already had HADOOP_CONF_DIR set.
>> >> >> >> >> >> >> 3) I removed all properties from slider-client.xml
>> >> >> >> >> >> >> EXCEPT the following:
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> - HADOOP_CONF_DIR
>> >> >> >> >> >> >> - slider.yarn.queue
>> >> >> >> >> >> >> - slider.zookeeper.quorum
>> >> >> >> >> >> >> - hadoop.registry.zk.quorum
>> >> >> >> >> >> >> - hadoop.registry.zk.root
>> >> >> >> >> >> >> - hadoop.security.authorization
>> >> >> >> >> >> >> - hadoop.security.authentication
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Then I made rm1 active, installed and created the
>> >> >> >> >> >> >> slider app, and restarted rm1 (to make rm2 active).
>> >> >> >> >> >> >> The slider-am again went from RUNNING to ACCEPTED
>> >> >> >> >> >> >> state.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Let me know if you want me to try further changes.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> If I make slider-client.xml completely empty per your
>> >> >> >> >> >> >> suggestion, only the slider AM comes up, but it fails
>> >> >> >> >> >> >> to start components. The AM log shows errors trying to
>> >> >> >> >> >> >> connect to zookeeper like below:
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> 2016-07-25 23:07:41,532 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> >> >> >> >> >> >> java.net.ConnectException: Connection refused
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Hence I kept minimal info in slider-client.xml.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> FYI, this is slider version 0.80.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Thanks,
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Manoj
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> On Mon, Jul 25, 2016 at 2:54 PM, Gour Saha <gs...@hortonworks.com> wrote:
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> > If possible, can you copy the entire content of the
>> >> >> >> >> >> >> > directory /etc/hadoop/conf and then set
>> >> >> >> >> >> >> > HADOOP_CONF_DIR in slider-env.sh to it. Keep
>> >> >> >> >> >> >> > slider-client.xml empty.
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > Now when you do the same rm1->rm2 and then the
>> >> >> >> >> >> >> > reverse failovers, do you see the same behaviors?
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > -Gour
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> > On 7/25/16, 2:28 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> Another observation (for whatever it is worth):
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> If the slider app is created and started when rm2
>> >> >> >> >> >> >> >> was active, then it seems to survive switches
>> >> >> >> >> >> >> >> between rm2 and rm1 (and back). I.e.
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> * rm2 is active
>> >> >> >> >> >> >> >> * create and start the slider application
>> >> >> >> >> >> >> >> * fail over to rm1. Now the Slider AM keeps running
>> >> >> >> >> >> >> >> * fail over to rm2 again. The Slider AM still keeps
>> >> >> >> >> >> >> >> running
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> So, it seems that if it starts with rm1 active, the
>> >> >> >> >> >> >> >> AM goes to "ACCEPTED" state when the RM fails over
>> >> >> >> >> >> >> >> to rm2. If it starts with rm2 active, it runs fine
>> >> >> >> >> >> >> >> with any switches between rm1 and rm2.
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> Any feedback?
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> Thanks,
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> Manoj
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> On Mon, Jul 25, 2016 at 12:25 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>> >> >> >> >> >> >> >>
>> >> >> >> >> >> >> >> > Setup
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > - Hadoop 2.6 with RM HA, Kerberos enabled
>> >> >> >> >> >> >> >> > - Slider 0.80
>> >> >> >> >> >> >> >> > - In my slider-client.xml, I have added all RM HA
>> >> >> >> >> >> >> >> > properties, including the ones mentioned in
>> >> >> >> >> >> >> >> > http://markmail.org/message/wnhpp2zn6ixo65e3.
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > Following is the issue:
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > * rm1 is active, rm2 is standby
>> >> >> >> >> >> >> >> > * deploy and start the slider application; it
>> >> >> >> >> >> >> >> > runs fine
>> >> >> >> >> >> >> >> > * restart rm1; rm2 is now active
>> >> >> >> >> >> >> >> > * the slider-am now goes from running into
>> >> >> >> >> >> >> >> > "ACCEPTED" mode. It stays there till rm1 is made
>> >> >> >> >> >> >> >> > active again.
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > In the slider-am log, it tries to connect to RM2
>> >> >> >> >> >> >> >> > and the connection fails due to
>> >> >> >> >> >> >> >> > org.apache.hadoop.security.AccessControlException:
>> >> >> >> >> >> >> >> > Client cannot authenticate via:[TOKEN]. See the
>> >> >> >> >> >> >> >> > detailed log below.
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > It seems it has some token (delegation token?)
>> >> >> >> >> >> >> >> > for RM1 but tries to use the same(?) for RM2 and
>> >> >> >> >> >> >> >> > fails. Am I missing some configuration???
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > Thanks,
>> >> >> >> >> >> >> >> >
>> >> >> >> >> >> >> >> > 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
>> >> >> >> >> >> >> >> > 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >> >> >> >> >> > 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >> >> >> >> >> > 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >> >> >> >> >> > 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO retry.RetryInvocationHandler - Exception while invoking allocate of class ApplicationMasterProtocolPBClientImpl over rm2 after 287 fail over attempts. Trying to fail over immediately.
>> >> >> >> >> >> >> >> > java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]; Host Details : local host is: "<SliderAM HOST>/<slider AM Host IP>"; destination host is: "<RM2 HOST>":23130;
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>> >> >> >> >> >> >> >> >     at com.sun.proxy.$Proxy23.allocate(Unknown Source)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>> >> >> >> >> >> >> >> >     at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>> >> >> >> >> >> >> >> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> >> >> >> >> >> >> >     at java.lang.reflect.Method.invoke(Method.java:497)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>> >> >> >> >> >> >> >> >     at com.sun.proxy.$Proxy24.allocate(Unknown Source)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
>> >> >> >> >> >> >> >> > Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
>> >> >> >> >> >> >> >> >     at java.security.AccessController.doPrivileged(Native Method)
>> >> >> >> >> >> >> >> >     at javax.security.auth.Subject.doAs(Subject.java:422)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client.call(Client.java:1442)
>> >> >> >> >> >> >> >> >     ... 12 more
>> >> >> >> >> >> >> >> > Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
>> >> >> >> >> >> >> >> >     at java.security.AccessController.doPrivileged(Native Method)
>> >> >> >> >> >> >> >> >     at javax.security.auth.Subject.doAs(Subject.java:422)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>> >> >> >> >> >> >> >> >     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
>> >> >> >> >> >> >> >> >     ... 15 more
>> >> >> >> >> >> >> >> > 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
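For readers landing on this thread: the registry properties discussed above
would typically appear in yarn-site.xml along these lines. This is a sketch
only; the quorum host names, port, and root path below are placeholders, not
values taken from this thread.

```xml
<!-- Sketch only: quorum hosts and root path are placeholder values. -->
<property>
  <name>hadoop.registry.zk.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
<property>
  <name>hadoop.registry.zk.root</name>
  <value>/registry</value>
</property>
```

Per Gour's advice earlier in the thread, the queue is best passed at
submission time (the --queue option of slider create, or
-D slider.yarn.queue=<>) rather than set in cluster-level config.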