Hi Gour,

I added the following properties to /etc/hadoop/conf/yarn-site.xml, emptied
/data/slider/conf/slider-client.xml, and restarted both RMs:

   - hadoop.registry.zk.quorum
   - hadoop.registry.zk.root
   - slider.yarn.queue

Now there are no issues creating or destroying the cluster. This also helps
by keeping all configs in one location - thanks for the update.
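For reference, the three entries moved into yarn-site.xml take the usual Hadoop property form. The values below are placeholders, not this cluster's actual quorum, root, or queue:

```xml
<!-- Sketch of the yarn-site.xml additions; hostnames, port, ZK root,
     and queue name are placeholder values. -->
<property>
  <name>hadoop.registry.zk.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
<property>
  <name>hadoop.registry.zk.root</name>
  <value>/registry</value>
</property>
<property>
  <name>slider.yarn.queue</name>
  <value>default</value>
</property>
```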

I am still hitting the original issue - starting the application with RM1
active and then failing over from RM1 to RM2 leads to the Slider AM getting
"Client cannot authenticate via:[TOKEN]" errors.

I will upload the config files soon ...

Thanks,

On Thu, Jul 28, 2016 at 5:28 PM, Manoj Samel <manojsamelt...@gmail.com>
wrote:

> Thanks. I will test with the updated config and then upload the latest
> ones ...
>
> Thanks,
>
> Manoj
>
> On Thu, Jul 28, 2016 at 5:21 PM, Gour Saha <gs...@hortonworks.com> wrote:
>
>> slider.zookeeper.quorum is deprecated and should not be used.
>> hadoop.registry.zk.quorum is used instead and is typically defined in
>> yarn-site.xml. So is hadoop.registry.zk.root.
>>
>> It is not encouraged to specify slider.yarn.queue at the cluster config
>> level. Ideally it is best to specify the queue during application
>> submission, so you can use the --queue option with the slider create cmd.
>> You can also set it on the command line using -D slider.yarn.queue=<>
>> during the create call. If indeed all slider apps should go to one and
>> only one queue, then this prop can be specified in any one of the
>> existing site xml files under /etc/hadoop/conf.
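For illustration, the two submission-time variants described above might look like this; the application name, template/resources file names, and queue name are placeholders:

```
# Pass the queue with the --queue option of slider create:
slider create app1 --template appConfig.json --resources resources.json --queue myqueue

# Or define slider.yarn.queue on the command line during the create call:
slider create app1 --template appConfig.json --resources resources.json -D slider.yarn.queue=myqueue
```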
>>
>> -Gour
>>
>> On 7/28/16, 4:43 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>>
>> >The following Slider-specific properties are at present added in
>> >/data/slider/conf/slider-client.xml. If you think they should be picked
>> >up from HADOOP_CONF_DIR (/etc/hadoop/conf), which file in
>> >HADOOP_CONF_DIR should these be added to?
>> >
>> >   - slider.zookeeper.quorum
>> >   - hadoop.registry.zk.quorum
>> >   - hadoop.registry.zk.root
>> >   - slider.yarn.queue
>> >
>> >
>> >On Thu, Jul 28, 2016 at 4:37 PM, Gour Saha <gs...@hortonworks.com>
>> wrote:
>> >
>> >> That is strange, since slider-client.xml is indeed not required to
>> >> contain anything (except <configuration></configuration>) if
>> >> HADOOP_CONF_DIR has everything that Slider needs. This probably
>> >> indicates that there is some issue with the cluster configuration based
>> >> solely on the files under HADOOP_CONF_DIR to begin with.
>> >>
>> >> I suggest you upload all the config files to the JIRA to help debug
>> >> this further.
>> >>
>> >> -Gour
>> >>
>> >> On 7/28/16, 4:27 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>> >>
>> >> >Thanks Gour for the prompt reply.
>> >> >
>> >> >BTW - Creating an empty slider-client.xml (with just
>> >> ><configuration></configuration>) does not work. The AM starts but
>> >> >fails to create any components and shows errors like
>> >> >
>> >> >2016-07-28 23:18:46,018 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN  zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> >> >java.net.ConnectException: Connection refused
>> >> >        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>> >> >        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>> >> >        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>> >> >        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
>> >> >
>> >> >Also, the command "slider destroy <app>" fails with zookeeper errors ...
>> >> >
>> >> >I had to keep a minimal slider-client.xml. It does not have any RM
>> >> >info etc., but does contain the Slider ZK-related properties
>> >> >"slider.zookeeper.quorum", "hadoop.registry.zk.quorum", and
>> >> >"hadoop.registry.zk.root". I haven't yet distilled the absolute
>> >> >minimal set of properties required, but this should suffice for now.
>> >> >All RM / HDFS properties will be read from the HADOOP_CONF_DIR files.
>> >> >
>> >> >Let me know if this could cause any issues.
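A sketch of the minimal slider-client.xml described above; the quorum hosts and registry root are placeholder values, not the cluster's actual settings:

```xml
<!-- Minimal slider-client.xml sketch; all values are placeholders. -->
<configuration>
  <property>
    <name>slider.zookeeper.quorum</name>
    <value>zk1.example.com:2181,zk2.example.com:2181</value>
  </property>
  <property>
    <name>hadoop.registry.zk.quorum</name>
    <value>zk1.example.com:2181,zk2.example.com:2181</value>
  </property>
  <property>
    <name>hadoop.registry.zk.root</name>
    <value>/registry</value>
  </property>
</configuration>
```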
>> >> >
>> >> >On Thu, Jul 28, 2016 at 3:36 PM, Gour Saha <gs...@hortonworks.com>
>> >>wrote:
>> >> >
>> >> >> No need to copy any files. Pointing HADOOP_CONF_DIR to
>> >>/etc/hadoop/conf
>> >> >>is
>> >> >> good.
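The suggestion above amounts to a single line in slider-env.sh, using the path given in this thread:

```shell
# slider-env.sh: point Slider at the cluster-wide Hadoop configuration
export HADOOP_CONF_DIR=/etc/hadoop/conf
```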
>> >> >>
>> >> >> -Gour
>> >> >>
>> >> >> On 7/28/16, 3:24 PM, "Manoj Samel" <manojsamelt...@gmail.com>
>> wrote:
>> >> >>
>> >> >> >Follow-up question regarding Gour's comment in the earlier thread -
>> >> >> >
>> >> >> >Slider is installed on one of the hadoop nodes. The SLIDER_HOME/conf
>> >> >> >directory (say /data/slider/conf) is different from HADOOP_CONF_DIR
>> >> >> >(/etc/hadoop/conf). Is it required/recommended that the files in
>> >> >> >HADOOP_CONF_DIR be copied to SLIDER_HOME/conf and that the
>> >> >> >slider-env.sh script set HADOOP_CONF_DIR to /data/slider/conf?
>> >> >> >
>> >> >> >Or can slider-env.sh set HADOOP_CONF_DIR to /etc/hadoop/conf,
>> >> >> >without copying the files?
>> >> >> >
>> >> >> >Using Slider 0.80 for now, but would like to know the recommendation
>> >> >> >for this and future versions as well.
>> >> >> >
>> >> >> >Thanks in advance,
>> >> >> >
>> >> >> >Manoj
>> >> >> >
>> >> >> >On Tue, Jul 26, 2016 at 3:27 PM, Manoj Samel
>> >><manojsamelt...@gmail.com
>> >> >
>> >> >> >wrote:
>> >> >> >
>> >> >> >> Filed https://issues.apache.org/jira/browse/SLIDER-1158 with
>> logs
>> >> and
>> >> >> my
>> >> >> >> analysis of logs.
>> >> >> >>
>> >> >> >> On Tue, Jul 26, 2016 at 10:36 AM, Gour Saha
>> >><gs...@hortonworks.com>
>> >> >> >>wrote:
>> >> >> >>
>> >> >> >>> Please file a JIRA and upload the logs to it.
>> >> >> >>>
>> >> >> >>> On 7/26/16, 10:21 AM, "Manoj Samel" <manojsamelt...@gmail.com>
>> >> >>wrote:
>> >> >> >>>
>> >> >> >>> >Hi Gour,
>> >> >> >>> >
>> >> >> >>> >Can you please reach me using your own email id? I will then
>> >> >> >>> >send the logs to you, along with my analysis - I don't want to
>> >> >> >>> >send logs on the public list.
>> >> >> >>> >
>> >> >> >>> >Thanks,
>> >> >> >>> >
>> >> >> >>> >On Mon, Jul 25, 2016 at 5:39 PM, Gour Saha
>> >><gs...@hortonworks.com>
>> >> >> >>> wrote:
>> >> >> >>> >
>> >> >> >>> >> Ok, so this node is not a gateway. It is part of the cluster,
>> >> >> >>> >> which means you don't need slider-client.xml at all. Just have
>> >> >> >>> >> HADOOP_CONF_DIR pointing to /etc/hadoop/conf in slider-env.sh
>> >> >> >>> >> and that should be it.
>> >> >> >>> >>
>> >> >> >>> >> So the above simplifies your config setup. It will not solve
>> >> >> >>> >> either of the 2 problems you are facing.
>> >> >> >>> >>
>> >> >> >>> >> Now coming to the 2 issues you are facing, you have to provide
>> >> >> >>> >> additional logs for us to understand better. Let's start with -
>> >> >> >>> >> 1. RM logs (specifically between the time when the rm1->rm2
>> >> >> >>> >> failover is simulated)
>> >> >> >>> >> 2. Slider App logs
>> >> >> >>> >>
>> >> >> >>> >> -Gour
>> >> >> >>> >>
>> >> >> >>> >> On 7/25/16, 5:16 PM, "Manoj Samel" <manojsamelt...@gmail.com
>> >
>> >> >> wrote:
>> >> >> >>> >>
>> >> >> >>> >> >   1. Not clear about your question on the "gateway" node. The
>> >> >> >>> >> >   node running slider is part of the hadoop cluster, and other
>> >> >> >>> >> >   services like Oozie run on this node and utilize hdfs and
>> >> >> >>> >> >   yarn. So if your question is whether the node is otherwise
>> >> >> >>> >> >   working for the HDFS and Yarn configuration, it is working.
>> >> >> >>> >> >   2. I copied all files from HADOOP_CONF_DIR (say
>> >> >> >>> >> >   /etc/hadoop/conf) to the directory containing
>> >> >> >>> >> >   slider-client.xml (say /data/latest/conf).
>> >> >> >>> >> >   3. In the earlier email, I had made a mistake: the
>> >> >> >>> >> >   slider-env.sh HADOOP_CONF_DIR was pointing to the original
>> >> >> >>> >> >   directory /etc/hadoop/conf. I edited it to point to the same
>> >> >> >>> >> >   directory containing slider-client.xml and slider-env.sh,
>> >> >> >>> >> >   i.e. /data/latest/conf.
>> >> >> >>> >> >   4. I emptied slider-client.xml; it just had
>> >> >> >>> >> >   <configuration></configuration>. The creation of apps
>> >> >> >>> >> >   worked, but the Slider AM still shows the same issue, i.e.
>> >> >> >>> >> >   when RM1 goes from active to standby, the slider AM goes
>> >> >> >>> >> >   from RUNNING to ACCEPTED state with the same error about
>> >> >> >>> >> >   TOKEN. Also NOTE that when slider-client.xml is empty, the
>> >> >> >>> >> >   "slider destroy xxx" command still fails with Zookeeper
>> >> >> >>> >> >   connection errors.
>> >> >> >>> >> >   5. I then added the same parameters (as in my last email -
>> >> >> >>> >> >   except HADOOP_CONF_DIR) to slider-client.xml and ran again.
>> >> >> >>> >> >   This time slider-env.sh has HADOOP_CONF_DIR pointing to
>> >> >> >>> >> >   /data/latest/conf and slider-client.xml does not have
>> >> >> >>> >> >   HADOOP_CONF_DIR. The same issue exists (but "slider destroy"
>> >> >> >>> >> >   does not fail).
>> >> >> >>> >> >   6. Could you explain what you expect to pick up from the
>> >> >> >>> >> >   Hadoop configurations that will help with the RM token? If
>> >> >> >>> >> >   slider has a token from RM1, and it switches to RM2, it is
>> >> >> >>> >> >   not clear what slider does to get a delegation token for RM2
>> >> >> >>> >> >   communication.
>> >> >> >>> >> >   7. It is worth repeating that the issue happens only when
>> >> >> >>> >> >   RM1 was active when the slider app was created and then RM1
>> >> >> >>> >> >   becomes standby. If RM2 was active when the slider app was
>> >> >> >>> >> >   created, then the slider AM keeps running for any number of
>> >> >> >>> >> >   switches between RM2 and RM1 back and forth ...
>> >> >> >>> >> >
>> >> >> >>> >> >
>> >> >> >>> >> >On Mon, Jul 25, 2016 at 4:21 PM, Gour Saha
>> >> >><gs...@hortonworks.com>
>> >> >> >>> >>wrote:
>> >> >> >>> >> >
>> >> >> >>> >> >> The node you are running slider from, is that a gateway
>> >> >> >>> >> >> node? Sorry for not being explicit. I meant copy everything
>> >> >> >>> >> >> under /etc/hadoop/conf from your cluster into some temp
>> >> >> >>> >> >> directory (say /tmp/hadoop_conf) on your gateway node or
>> >> >> >>> >> >> local or whichever node you are running slider from. Then set
>> >> >> >>> >> >> HADOOP_CONF_DIR to /tmp/hadoop_conf and clear everything out
>> >> >> >>> >> >> from slider-client.xml.
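Sketched as commands, using the directory names given above; this assumes the node has the cluster's /etc/hadoop/conf available:

```
# Copy the cluster's Hadoop config into a temp directory
mkdir -p /tmp/hadoop_conf
cp -r /etc/hadoop/conf/. /tmp/hadoop_conf/

# Then, in slider-env.sh:
export HADOOP_CONF_DIR=/tmp/hadoop_conf
```

with slider-client.xml reduced to an empty <configuration></configuration> element.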
>> >> >> >>> >> >>
>> >> >> >>> >> >> On 7/25/16, 4:12 PM, "Manoj Samel"
>> >><manojsamelt...@gmail.com>
>> >> >> >>> wrote:
>> >> >> >>> >> >>
>> >> >> >>> >> >> >Hi Gour,
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >Thanks for your prompt reply.
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >FYI, the issue happens when I create the slider app while
>> >> >> >>> >> >> >rm1 is active and rm1 then fails over to rm2. As soon as rm2
>> >> >> >>> >> >> >becomes active, the slider AM goes from RUNNING to ACCEPTED
>> >> >> >>> >> >> >state with the above error.
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >For your suggestion, I did the following:
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >1) Copied core-site, hdfs-site, yarn-site, and mapred-site
>> >> >> >>> >> >> >from HADOOP_CONF_DIR to the slider conf directory.
>> >> >> >>> >> >> >2) Our slider-env.sh already had HADOOP_CONF_DIR set.
>> >> >> >>> >> >> >3) I removed all properties from slider-client.xml EXCEPT
>> >> >> >>> >> >> >the following:
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >   - HADOOP_CONF_DIR
>> >> >> >>> >> >> >   - slider.yarn.queue
>> >> >> >>> >> >> >   - slider.zookeeper.quorum
>> >> >> >>> >> >> >   - hadoop.registry.zk.quorum
>> >> >> >>> >> >> >   - hadoop.registry.zk.root
>> >> >> >>> >> >> >   - hadoop.security.authorization
>> >> >> >>> >> >> >   - hadoop.security.authentication
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >Then I made rm1 active, installed and created the slider
>> >> >> >>> >> >> >app, and restarted rm1 (to make rm2 active). The slider AM
>> >> >> >>> >> >> >again went from RUNNING to ACCEPTED state.
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >Let me know if you want me to try further changes.
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >If I make the slider-client.xml completely empty per your
>> >> >> >>> >> >> >suggestion, the slider AM comes up but it fails to start
>> >> >> >>> >> >> >components. The AM log shows errors trying to connect to
>> >> >> >>> >> >> >zookeeper like below.
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >2016-07-25 23:07:41,532 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>> >> >> >>> >> >> >java.net.ConnectException: Connection refused
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >Hence I kept minimal info in slider-client.xml.
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >FYI This is slider version 0.80
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >Thanks,
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >Manoj
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >On Mon, Jul 25, 2016 at 2:54 PM, Gour Saha
>> >> >> >>><gs...@hortonworks.com>
>> >> >> >>> >> >>wrote:
>> >> >> >>> >> >> >
>> >> >> >>> >> >> >> If possible, can you copy the entire content of the
>> >> >>directory
>> >> >> >>> >> >> >> /etc/hadoop/conf and then set HADOOP_CONF_DIR in
>> >> >> >>>slider-env.sh to
>> >> >> >>> >>it.
>> >> >> >>> >> >> >>Keep
>> >> >> >>> >> >> >> slider-client.xml empty.
>> >> >> >>> >> >> >>
>> >> >> >>> >> >> >> Now when you do the same rm1->rm2 and then the reverse
>> >> >> >>>failovers,
>> >> >> >>> >>do
>> >> >> >>> >> >>you
>> >> >> >>> >> >> >> see the same behaviors?
>> >> >> >>> >> >> >>
>> >> >> >>> >> >> >> -Gour
>> >> >> >>> >> >> >>
>> >> >> >>> >> >> >> On 7/25/16, 2:28 PM, "Manoj Samel"
>> >> >><manojsamelt...@gmail.com>
>> >> >> >>> >>wrote:
>> >> >> >>> >> >> >>
>> >> >> >>> >> >> >> >Another observation (for whatever it is worth):
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >If the slider app is created and started when rm2 was
>> >> >> >>> >> >> >> >active, then it seems to survive switches between rm2 and
>> >> >> >>> >> >> >> >rm1 (and back). I.e.
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >* rm2 is active
>> >> >> >>> >> >> >> >* create and start the slider application
>> >> >> >>> >> >> >> >* fail over to rm1. Now the Slider AM keeps running
>> >> >> >>> >> >> >> >* fail over to rm2 again. The Slider AM still keeps running
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >So, it seems if it starts with rm1 active, then the AM
>> >> >> >>> >> >> >> >goes to "ACCEPTED" state when the RM fails over to rm2.
>> >> >> >>> >> >> >> >If it starts with rm2 active, then it runs fine with any
>> >> >> >>> >> >> >> >switches between rm1 and rm2.
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >Any feedback?
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >Thanks,
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >Manoj
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >On Mon, Jul 25, 2016 at 12:25 PM, Manoj Samel
>> >> >> >>> >> >> >><manojsamelt...@gmail.com>
>> >> >> >>> >> >> >> >wrote:
>> >> >> >>> >> >> >> >
>> >> >> >>> >> >> >> >> Setup
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> - Hadoop 2.6 with RM HA, Kerberos enabled
>> >> >> >>> >> >> >> >> - Slider 0.80
>> >> >> >>> >> >> >> >> - In my slider-client.xml, I have added all RM HA
>> >> >> >>>properties,
>> >> >> >>> >> >> >>including
>> >> >> >>> >> >> >> >> the ones mentioned in
>> >> >> >>> >> >>http://markmail.org/message/wnhpp2zn6ixo65e3.
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> Following is the issue:
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> * rm1 is active, rm2 is standby
>> >> >> >>> >> >> >> >> * deploy and start the slider application; it runs fine
>> >> >> >>> >> >> >> >> * restart rm1; rm2 is now active
>> >> >> >>> >> >> >> >> * The slider AM now goes from RUNNING into "ACCEPTED"
>> >> >> >>> >> >> >> >> state. It stays there till rm1 is made active again.
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> In the slider AM log, it tries to connect to RM2 and the
>> >> >> >>> >> >> >> >> connection fails due to
>> >> >> >>> >> >> >> >> org.apache.hadoop.security.AccessControlException: Client
>> >> >> >>> >> >> >> >> cannot authenticate via:[TOKEN]. See the detailed log below.
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> It seems it has some token (delegation token?) for RM1 but
>> >> >> >>> >> >> >> >> tries to use the same(?) for RM2 and fails. Am I missing
>> >> >> >>> >> >> >> >> some configuration ???
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> Thanks,
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
>> >> >> >>> >> >> >> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >>> >> >> >> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >>> >> >> >> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >>> >> >> >> >> 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO  retry.RetryInvocationHandler - Exception while invoking allocate of class ApplicationMasterProtocolPBClientImpl over rm2 after 287 fail over attempts. Trying to fail over immediately.
>> >> >> >>> >> >> >> >> java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]; Host Details : local host is: "<SliderAM HOST>/<slider AM Host IP>"; destination host is: "<RM2 HOST>":23130;
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>> >> >> >>> >> >> >> >>         at com.sun.proxy.$Proxy23.allocate(Unknown Source)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>> >> >> >>> >> >> >> >>         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>> >> >> >>> >> >> >> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> >> >>> >> >> >> >>         at java.lang.reflect.Method.invoke(Method.java:497)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>> >> >> >>> >> >> >> >>         at com.sun.proxy.$Proxy24.allocate(Unknown Source)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
>> >> >> >>> >> >> >> >> Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
>> >> >> >>> >> >> >> >>         at java.security.AccessController.doPrivileged(Native Method)
>> >> >> >>> >> >> >> >>         at javax.security.auth.Subject.doAs(Subject.java:422)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1442)
>> >> >> >>> >> >> >> >>         ... 12 more
>> >> >> >>> >> >> >> >> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
>> >> >> >>> >> >> >> >>         at java.security.AccessController.doPrivileged(Native Method)
>> >> >> >>> >> >> >> >>         at javax.security.auth.Subject.doAs(Subject.java:422)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>> >> >> >>> >> >> >> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
>> >> >> >>> >> >> >> >>         ... 15 more
>> >> >> >>> >> >> >> >> 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
>> >> >> >>> >> >> >> >>
>> >> >> >>> >> >> >>
>> >> >> >>> >> >> >>
>> >> >> >>> >> >>
>> >> >> >>> >> >>
>> >> >> >>> >>
>> >> >> >>> >>
>> >> >> >>>
>> >> >> >>>
>> >> >> >>
>> >> >>
>> >> >>
>> >>
>> >>
>>
>>
>
