Thanks. I will test with the updated config and then upload the latest ones ...
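For reference, the registry settings Gour suggests below would look roughly like the
following sketch in yarn-site.xml; the ZooKeeper host:port list and the registry root are
placeholders, not the actual cluster values:

  <property>
    <name>hadoop.registry.zk.quorum</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  </property>
  <property>
    <name>hadoop.registry.zk.root</name>
    <value>/registry</value>
  </property>

The queue would then be specified at submission time instead, e.g. via the --queue option of
the slider create command or with -D slider.yarn.queue=<queue>.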
Thanks,
Manoj

On Thu, Jul 28, 2016 at 5:21 PM, Gour Saha <gs...@hortonworks.com> wrote:

slider.zookeeper.quorum is deprecated and should not be used. hadoop.registry.zk.quorum is
used instead and is typically defined in yarn-site.xml. So is hadoop.registry.zk.root.

It is not encouraged to specify slider.yarn.queue at the cluster config level. Ideally it is
best to specify the queue during application submission, so you can use the --queue option
with the slider create command. You can also set it on the command line with
-D slider.yarn.queue=<> during the create call. If indeed all slider apps should go to one
and only one queue, then this property can be specified in any one of the existing site xml
files under /etc/hadoop/conf.

-Gour

On 7/28/16, 4:43 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

The following Slider-specific properties are at present added in
/data/slider/conf/slider-client.xml. If they should instead be picked up from
HADOOP_CONF_DIR (/etc/hadoop/conf), which file in HADOOP_CONF_DIR should they be added to?

  - slider.zookeeper.quorum
  - hadoop.registry.zk.quorum
  - hadoop.registry.zk.root
  - slider.yarn.queue

On Thu, Jul 28, 2016 at 4:37 PM, Gour Saha <gs...@hortonworks.com> wrote:

That is strange, since it is indeed not required to have anything in slider-client.xml
(except <configuration></configuration>) if HADOOP_CONF_DIR has everything that Slider
needs. This probably indicates that there is some issue with the cluster configuration based
solely on the files under HADOOP_CONF_DIR to begin with.

I suggest you upload all the config files to the JIRA to help debug this further.

-Gour

On 7/28/16, 4:27 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

Thanks Gour for the prompt reply.

BTW - creating an empty slider-client.xml (with just <configuration></configuration>) does
not work. The AM starts but fails to create any components and shows errors like:

2016-07-28 23:18:46,018 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN  zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

Also, the command "slider destroy <app>" fails with zookeeper errors ...

I had to keep a minimal slider-client.xml. It does not have any RM info etc. but does
contain the Slider ZK related properties like "slider.zookeeper.quorum",
"hadoop.registry.zk.quorum", and "hadoop.registry.zk.root". I haven't yet distilled the
absolute minimal set of properties required, but this should suffice for now. All RM / HDFS
properties will be read from the HADOOP_CONF_DIR files.

Let me know if this could cause any issues.
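For reference, the kind of minimal slider-client.xml described above would look roughly like
the sketch below; the ZooKeeper quorum and registry root values are placeholders, not the
actual ones from this cluster:

  <configuration>
    <property>
      <name>slider.zookeeper.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>
    <property>
      <name>hadoop.registry.zk.quorum</name>
      <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
    </property>
    <property>
      <name>hadoop.registry.zk.root</name>
      <value>/registry</value>
    </property>
  </configuration>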
On Thu, Jul 28, 2016 at 3:36 PM, Gour Saha <gs...@hortonworks.com> wrote:

No need to copy any files. Pointing HADOOP_CONF_DIR to /etc/hadoop/conf is good.

-Gour

On 7/28/16, 3:24 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

A follow-up question regarding Gour's comment in the earlier thread -

Slider is installed on one of the hadoop nodes. The SLIDER_HOME/conf directory (say
/data/slider/conf) is different from HADOOP_CONF_DIR (/etc/hadoop/conf). Is it
required/recommended that the files in HADOOP_CONF_DIR be copied to SLIDER_HOME/conf and
that the slider-env.sh script set HADOOP_CONF_DIR to /data/slider/conf?

Or can slider-env.sh set HADOOP_CONF_DIR to /etc/hadoop/conf, without copying the files?

Using slider 0.80 for now, but I would like to know the recommendation for this and future
versions as well.

Thanks in advance,

Manoj

On Tue, Jul 26, 2016 at 3:27 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:

Filed https://issues.apache.org/jira/browse/SLIDER-1158 with the logs and my analysis of the
logs.

On Tue, Jul 26, 2016 at 10:36 AM, Gour Saha <gs...@hortonworks.com> wrote:

Please file a JIRA and upload the logs to it.

On 7/26/16, 10:21 AM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

Hi Gour,

Can you please reach me using your own email-id? I will then send the logs to you, along
with my analysis - I don't want to send logs on the public list.

Thanks,

On Mon, Jul 25, 2016 at 5:39 PM, Gour Saha <gs...@hortonworks.com> wrote:

Ok, so this node is not a gateway. It is part of the cluster, which means you don't need
slider-client.xml at all. Just have HADOOP_CONF_DIR pointing to /etc/hadoop/conf in
slider-env.sh and that should be it.

So the above simplifies your config setup. It will not solve either of the 2 problems you
are facing.

Now coming to the 2 issues you are facing, you have to provide additional logs for us to
understand better. Let's start with -
1. RM logs (specifically between the time when the rm1->rm2 failover is simulated)
2. Slider App logs

-Gour
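For reference, in slider-env.sh that amounts to roughly a one-line setting; the JAVA_HOME
path below is only a placeholder, not a value from this thread:

  # slider-env.sh (sketch)
  export JAVA_HOME=/usr/java/default        # placeholder path
  export HADOOP_CONF_DIR=/etc/hadoop/conf   # let Slider pick up the cluster's Hadoop config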
On 7/25/16, 5:16 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

1. Not clear about your question on the "gateway" node. The node running slider is part of
   the hadoop cluster, and other services like Oozie that use HDFS and YARN also run on this
   node. So if your question is whether the node is otherwise working with the HDFS and YARN
   configuration, it is working.
2. I copied all files from HADOOP_CONF_DIR (say /etc/hadoop/conf) to the directory
   containing slider-client.xml (say /data/latest/conf).
3. In an earlier email I had made a mistake where the slider-env.sh HADOOP_CONF_DIR was
   pointing to the original directory /etc/hadoop/conf. I edited it to point to the same
   directory containing slider-client.xml & slider-env.sh, i.e. /data/latest/conf.
4. I emptied slider-client.xml so it just had <configuration></configuration>. The creation
   of the app worked, but the Slider AM still shows the same issue, i.e. when RM1 goes from
   active to standby, the slider AM goes from RUNNING to ACCEPTED state with the same error
   about TOKEN. Also NOTE that when slider-client.xml is empty, the "slider destroy xxx"
   command still fails with Zookeeper connection errors.
5. I then added the same parameters (as in my last email - except HADOOP_CONF_DIR) to
   slider-client.xml and ran again. This time slider-env.sh has HADOOP_CONF_DIR pointing to
   /data/latest/conf and slider-client.xml does not have HADOOP_CONF_DIR. The same issue
   exists (but "slider destroy" does not fail).
6. Could you explain what you expect to pick up from the Hadoop configurations that will
   help with the RM token? If slider has a token from RM1, and it switches to RM2, it is not
   clear what slider does to get a delegation token for RM2 communication.
7. It is worth repeating that the issue happens only when RM1 was active when the slider app
   was created and RM1 then becomes standby. If RM2 was active when the slider app was
   created, then the slider AM keeps running for any number of switches between RM2 and RM1
   back and forth ...

On Mon, Jul 25, 2016 at 4:21 PM, Gour Saha <gs...@hortonworks.com> wrote:

The node you are running slider from, is that a gateway node? Sorry for not being explicit.
I meant copy everything under /etc/hadoop/conf from your cluster into some temp directory
(say /tmp/hadoop_conf) on your gateway node or local or whichever node you are running
slider from. Then set HADOOP_CONF_DIR to /tmp/hadoop_conf and clear everything out from
slider-client.xml.
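For reference, the suggested experiment amounts to roughly the following; /tmp/hadoop_conf
is just the example name used above:

  cp -r /etc/hadoop/conf /tmp/hadoop_conf
  # in slider-env.sh:
  export HADOOP_CONF_DIR=/tmp/hadoop_conf
  # and reduce slider-client.xml to an empty <configuration></configuration> element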
On 7/25/16, 4:12 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

Hi Gour,

Thanks for your prompt reply.

FYI, the issue happens when I create the slider app while rm1 is active and rm1 then fails
over to rm2. As soon as rm2 becomes active, the slider AM goes from RUNNING to ACCEPTED
state with the above error.

For your suggestion, I did the following:

1) Copied core-site, hdfs-site, yarn-site, and mapred-site from HADOOP_CONF_DIR to the
   slider conf directory.
2) Our slider-env.sh already had HADOOP_CONF_DIR set.
3) I removed all properties from slider-client.xml EXCEPT the following:

   - HADOOP_CONF_DIR
   - slider.yarn.queue
   - slider.zookeeper.quorum
   - hadoop.registry.zk.quorum
   - hadoop.registry.zk.root
   - hadoop.security.authorization
   - hadoop.security.authentication

Then I made rm1 active, installed and created the slider app, and restarted rm1 (to make rm2
active). The slider-am again went from RUNNING to ACCEPTED state.

Let me know if you want me to try further changes.

If I make slider-client.xml completely empty per your suggestion, only the slider AM comes
up but it fails to start components. The AM log shows errors trying to connect to zookeeper
like the one below:

2016-07-25 23:07:41,532 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

Hence I kept minimal info in slider-client.xml.

FYI, this is slider version 0.80.

Thanks,

Manoj

On Mon, Jul 25, 2016 at 2:54 PM, Gour Saha <gs...@hortonworks.com> wrote:

If possible, can you copy the entire content of the directory /etc/hadoop/conf and then set
HADOOP_CONF_DIR in slider-env.sh to it. Keep slider-client.xml empty.

Now when you do the same rm1->rm2 and then the reverse failovers, do you see the same
behaviors?

-Gour

On 7/25/16, 2:28 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

Another observation (for whatever it is worth):

If the slider app is created and started when rm2 is active, then it seems to survive
switches between rm2 and rm1 (and back). I.e.:

* rm2 is active
* create and start the slider application
* fail over to rm1. The Slider AM keeps running
* fail over to rm2 again. The Slider AM still keeps running

So it seems that if it starts with rm1 active, the AM goes to "ACCEPTED" state when the RM
fails over to rm2. If it starts with rm2 active, then it runs fine through any switches
between rm1 and rm2.

Any feedback?

Thanks,

Manoj

On Mon, Jul 25, 2016 at 12:25 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:

Setup

- Hadoop 2.6 with RM HA, Kerberos enabled
- Slider 0.80
- In my slider-client.xml, I have added all the RM HA properties, including the ones
  mentioned in http://markmail.org/message/wnhpp2zn6ixo65e3.

Following is the issue:

* rm1 is active, rm2 is standby
* deploy and start the slider application; it runs fine
* restart rm1; rm2 is now active
* The slider-am now goes from RUNNING into "ACCEPTED" mode. It stays there till rm1 is made
  active again.

In the slider-am log, it tries to connect to RM2 and the connection fails due to
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN].
See the detailed log below.

It seems it has some token (delegation token?) for RM1 but tries to use the same(?) for RM2
and fails. Am I missing some configuration ???
Thanks,


2016-07-25 19:06:48,088 [AMRM Heartbeater thread] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO  retry.RetryInvocationHandler - Exception while invoking allocate of class ApplicationMasterProtocolPBClientImpl over rm2 after 287 fail over attempts. Trying to fail over immediately.
java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]; Host Details : local host is: "<SliderAM HOST>/<slider AM Host IP>"; destination host is: "<RM2 HOST>":23130;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1476)
        at org.apache.hadoop.ipc.Client.call(Client.java:1403)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy23.allocate(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy24.allocate(Unknown Source)
        at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278)
        at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
        at org.apache.hadoop.ipc.Client.call(Client.java:1442)
        ... 12 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
        at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
        at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
        ... 15 more
2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
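For reference, the client-side RM HA properties referred to above (the markmail link) are
typically of the following shape in yarn-site.xml; the cluster id, rm ids, and hostnames
below are placeholders, and this sketch is not necessarily the exact list from that thread:

  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yarn-cluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>rm1.example.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>rm2.example.com</value>
  </property>
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>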