Is the node you are running slider from a gateway node? Sorry for not
being explicit. I meant: copy everything under /etc/hadoop/conf from your
cluster into a temp directory (say /tmp/hadoop_conf) on your gateway node,
your local machine, or whichever node you are running slider from. Then
set HADOOP_CONF_DIR to /tmp/hadoop_conf and clear everything out of
slider-client.xml.
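
For anyone who wants to script the steps above, here is a minimal sketch in Python. The function names stage_hadoop_conf and empty_slider_client_xml are hypothetical helpers, not part of Slider or Hadoop, and the sketch assumes that an empty <configuration> element is an acceptable "cleared" slider-client.xml. The location of slider-client.xml depends on your install, so it is passed in rather than guessed.

```python
import os
import shutil
from pathlib import Path


def stage_hadoop_conf(src, dst):
    """Copy every file under src into dst.

    Returns a copy of the current environment with HADOOP_CONF_DIR
    pointing at dst, suitable for passing to a child process.
    """
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"no such configuration directory: {src}")
    # dirs_exist_ok (Python 3.8+) lets a rerun refresh an existing copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    env = dict(os.environ)
    env["HADOOP_CONF_DIR"] = str(dst)
    return env


def empty_slider_client_xml(path):
    """Overwrite the given slider-client.xml with an empty configuration.

    Assumption: an empty <configuration> element counts as "cleared".
    """
    Path(path).write_text(
        '<?xml version="1.0"?>\n<configuration>\n</configuration>\n'
    )


# Example call with the paths from this thread (not executed here):
#   env = stage_hadoop_conf("/etc/hadoop/conf", "/tmp/hadoop_conf")
#   empty_slider_client_xml("<your slider conf dir>/slider-client.xml")
```

The returned environment can be handed to subprocess.run(..., env=env) when launching the slider client, or HADOOP_CONF_DIR can simply be set in slider-env.sh as discussed.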

On 7/25/16, 4:12 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

>Hi Gour,
>
>Thanks for your prompt reply.
>
>FYI, the issue happens when I create the slider app while rm1 is active
>and rm1 then fails over to rm2. As soon as rm2 becomes active, the slider
>AM goes from RUNNING to ACCEPTED state with the above error.
>
>Following your suggestion, I did the following:
>
>1) Copied core-site, hdfs-site, yarn-site, and mapred-site from
>HADOOP_CONF_DIR to the slider conf directory.
>2) Our slider-env.sh already had HADOOP_CONF_DIR set.
>3) I removed all properties from slider-client.xml EXCEPT the following:
>
>   - HADOOP_CONF_DIR
>   - slider.yarn.queue
>   - slider.zookeeper.quorum
>   - hadoop.registry.zk.quorum
>   - hadoop.registry.zk.root
>   - hadoop.security.authorization
>   - hadoop.security.authentication
>
>Then I made rm1 active, installed and created the slider app, and
>restarted rm1 (to make rm2 active). The slider-am again went from RUNNING
>to ACCEPTED state.
>
>Let me know if you want me to try further changes.
>
>If I make slider-client.xml completely empty per your suggestion, the
>slider AM comes up but fails to start components. The AM log shows errors
>trying to connect to zookeeper, like below.
>2016-07-25 23:07:41,532 [AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>java.net.ConnectException: Connection refused
>
>Hence I kept minimal info in slider-client.xml
>
>FYI This is slider version 0.80
>
>Thanks,
>
>Manoj
>
>On Mon, Jul 25, 2016 at 2:54 PM, Gour Saha <gs...@hortonworks.com> wrote:
>
>> If possible, can you copy the entire content of the directory
>> /etc/hadoop/conf and then set HADOOP_CONF_DIR in slider-env.sh to it?
>> Keep slider-client.xml empty.
>>
>> Now when you do the same rm1->rm2 and then the reverse failovers, do you
>> see the same behaviors?
>>
>> -Gour
>>
>> On 7/25/16, 2:28 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>>
>> >Another observation (for whatever it is worth):
>> >
>> >If the slider app is created and started when rm2 is active, then it
>> >seems to survive switches between rm2 and rm1 (and back). I.e.
>> >
>> >* rm2 is active
>> >* create and start slider application
>> >* fail over to rm1. Now the Slider AM keeps running
>> >* fail over to rm2 again. Slider AM still keeps running
>> >
>> >So it seems that if it starts with rm1 active, the AM goes to "ACCEPTED"
>> >state when the RM fails over to rm2. If it starts with rm2 active, it
>> >runs fine with any switches between rm1 and rm2.
>> >
>> >Any feedback ?
>> >
>> >Thanks,
>> >
>> >Manoj
>> >
>> >On Mon, Jul 25, 2016 at 12:25 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>> >
>> >> Setup
>> >>
>> >> - Hadoop 2.6 with RM HA, Kerberos enabled
>> >> - Slider 0.80
>> >> - In my slider-client.xml, I have added all RM HA properties, including
>> >> the ones mentioned in http://markmail.org/message/wnhpp2zn6ixo65e3.
>> >>
>> >> Following is the issue
>> >>
>> >> * rm1 is active, rm2 is standby
>> >> * deploy and start slider application, it runs fine
>> >> * restart rm1, rm2 is now active.
>> >> * The slider-am now goes from running into "ACCEPTED" mode. It stays
>> >> there till rm1 is made active again.
>> >>
>> >> In the slider-am log, it tries to connect to RM2 and the connection
>> >> fails due to org.apache.hadoop.security.AccessControlException: Client
>> >> cannot authenticate via:[TOKEN]. See the detailed log below.
>> >>
>> >> It seems it has some token (delegation token?) for RM1 but tries to use
>> >> the same(?) for RM2 and fails. Am I missing some configuration?
>> >>
>> >> Thanks,
>> >>
>> >>
>> >>
>> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
>> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  ipc.Client - Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ (auth:KERBEROS) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >> 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO  retry.RetryInvocationHandler - Exception while invoking allocate of class ApplicationMasterProtocolPBClientImpl over rm2 after 287 fail over attempts. Trying to fail over immediately.
>> >> java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]; Host Details : local host is: "<SliderAM HOST>/<slider AM Host IP>"; destination host is: "<RM2 HOST>":23130;
>> >>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>> >>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>> >>         at com.sun.proxy.$Proxy23.allocate(Unknown Source)
>> >>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>> >>         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
>> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >>         at java.lang.reflect.Method.invoke(Method.java:497)
>> >>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
>> >>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>> >>         at com.sun.proxy.$Proxy24.allocate(Unknown Source)
>> >>         at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278)
>> >>         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
>> >> Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >>         at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
>> >>         at java.security.AccessController.doPrivileged(Native Method)
>> >>         at javax.security.auth.Subject.doAs(Subject.java:422)
>> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>> >>         at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
>> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
>> >>         at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>> >>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
>> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1442)
>> >>         ... 12 more
>> >> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
>> >>         at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
>> >>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
>> >>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
>> >>         at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
>> >>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
>> >>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
>> >>         at java.security.AccessController.doPrivileged(Native Method)
>> >>         at javax.security.auth.Subject.doAs(Subject.java:422)
>> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
>> >>         ... 15 more
>> >> 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO  client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
>> >>
>>
>>
