Hi Gour,

Thanks for your prompt reply.

FYI, the issue happens when I create the Slider app while rm1 is active and
rm1 then fails over to rm2. As soon as rm2 becomes active, the Slider AM goes
from RUNNING to ACCEPTED state with the error reported earlier.

Following your suggestion, I did the following:

1) Copied core-site, hdfs-site, yarn-site, and mapred-site from
HADOOP_CONF_DIR to the Slider conf directory.
2) Our slider-env.sh already had HADOOP_CONF_DIR set.
3) I removed all properties from slider-client.xml EXCEPT the following:

   - HADOOP_CONF_DIR
   - slider.yarn.queue
   - slider.zookeeper.quorum
   - hadoop.registry.zk.quorum
   - hadoop.registry.zk.root
   - hadoop.security.authorization
   - hadoop.security.authentication
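
For reference, a minimal sketch of how the trimmed slider-client.xml could be checked against the list above. It assumes the file uses the standard Hadoop `<configuration>/<property>/<name>` layout; the function names and the sample XML are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Property names kept in slider-client.xml, as listed above.
EXPECTED = {
    "HADOOP_CONF_DIR",
    "slider.yarn.queue",
    "slider.zookeeper.quorum",
    "hadoop.registry.zk.quorum",
    "hadoop.registry.zk.root",
    "hadoop.security.authorization",
    "hadoop.security.authentication",
}


def property_names(xml_text):
    """Return the set of non-empty <name> values in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    names = set()
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name:
            names.add(name)
    return names


def unexpected_properties(xml_text, expected=EXPECTED):
    """Return, sorted, the property names that are not in the expected set."""
    return sorted(property_names(xml_text) - set(expected))


if __name__ == "__main__":
    # Hypothetical sample: one kept property and one leftover RM HA property.
    sample = """
    <configuration>
      <property><name>slider.yarn.queue</name><value>default</value></property>
      <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
    </configuration>
    """
    print(unexpected_properties(sample))
```

Running this against the real file (read with `open(path).read()`) would list any RM HA properties that were left behind by mistake.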

Then I made rm1 active, installed and created the Slider app, and restarted
rm1 (to make rm2 active). The slider-am again went from RUNNING to ACCEPTED
state.

Let me know if you want me to try further changes.

If I make slider-client.xml completely empty per your suggestion, only the
Slider AM comes up, but it fails to start components. The AM log shows
errors trying to connect to ZooKeeper, like below:
2016-07-25 23:07:41,532
[AmExecutor-006-SendThread(localhost.localdomain:2181)] WARN
zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error,
closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

Hence I kept minimal info in slider-client.xml.
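
The log above suggests the AM was trying ZooKeeper on localhost.localdomain:2181 once the quorum settings were gone. As a rough sketch, the quorum string kept in slider-client.xml could be checked for reachability from the AM host like this. The host names in the example are hypothetical, and falling back to port 2181 (ZooKeeper's default client port) when none is given is an assumption of this sketch.

```python
import socket


def parse_quorum(quorum, default_port=2181):
    """Split a 'host1:port,host2:port' quorum string into (host, port) pairs."""
    pairs = []
    for entry in quorum.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No port given: assume the default ZooKeeper client port.
            host, port = entry, default_port
        pairs.append((host, int(port)))
    return pairs


def reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical quorum value; substitute the one from slider-client.xml.
    for host, port in parse_quorum("zk1.example.com:2181,zk2.example.com:2181"):
        print(host, port, "reachable" if reachable(host, port) else "NOT reachable")
```

This only checks TCP connectivity; it does not verify Kerberos/SASL authentication against ZooKeeper.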

FYI, this is Slider version 0.80.

Thanks,

Manoj

On Mon, Jul 25, 2016 at 2:54 PM, Gour Saha <gs...@hortonworks.com> wrote:

> If possible, can you copy the entire content of the directory
> /etc/hadoop/conf and then set HADOOP_CONF_DIR in slider-env.sh to it. Keep
> slider-client.xml empty.
>
> Now when you do the same rm1->rm2 and then the reverse failovers, do you
> see the same behaviors?
>
> -Gour
>
> On 7/25/16, 2:28 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
>
> >Another observation (for whatever it is worth):
> >
> >If the Slider app is created and started while rm2 is active, it seems to
> >survive switches between rm2 and rm1 (and back), i.e.:
> >
> >* rm2 is active
> >* create and start slider application
> >* fail over to rm1. Now the Slider AM keeps running
> >* fail over to rm2 again. Slider AM still keeps running
> >
> >So it seems that if the app starts with rm1 active, the AM goes to the
> >"ACCEPTED" state when the RM fails over to rm2. If it starts with rm2
> >active, it runs fine with any switches between rm1 and rm2.
> >
> >Any feedback?
> >
> >Thanks,
> >
> >Manoj
> >
> >On Mon, Jul 25, 2016 at 12:25 PM, Manoj Samel <manojsamelt...@gmail.com>
> >wrote:
> >
> >> Setup
> >>
> >> - Hadoop 2.6 with RM HA, Kerberos enabled
> >> - Slider 0.80
> >> - In my slider-client.xml, I have added all RM HA properties, including
> >> the ones mentioned in http://markmail.org/message/wnhpp2zn6ixo65e3.
> >>
> >> Following is the issue
> >>
> >> * rm1 is active, rm2 is standby
> >> * deploy and start slider application, it runs fine
> >> * restart rm1, rm2 is now active.
> >> * The slider-am now goes from RUNNING into the "ACCEPTED" state. It stays
> >> there until rm1 is made active again.
> >>
> >> In the slider-am log, it tries to connect to RM2 and the connection fails
> >> due to org.apache.hadoop.security.AccessControlException: Client cannot
> >> authenticate via:[TOKEN]. See the detailed log below.
> >>
> >> It seems it has some token (delegation token?) for RM1 but tries to use
> >> the same(?) token for RM2 and fails. Am I missing some configuration?
> >>
> >> Thanks,
> >>
> >>
> >>
> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] INFO
> >>  client.ConfiguredRMFailoverProxyProvider - Failing over to rm2
> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN
> >>  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ
> >> (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException:
> >> Client cannot authenticate via:[TOKEN]
> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN  ipc.Client -
> >> Exception encountered while connecting to the server :
> >> org.apache.hadoop.security.AccessControlException: Client cannot
> >> authenticate via:[TOKEN]
> >> 2016-07-25 19:06:48,088 [AMRM Heartbeater thread] WARN
> >>  security.UserGroupInformation - PriviledgedActionException as:abc@XYZ
> >> (auth:KERBEROS) cause:java.io.IOException:
> >> org.apache.hadoop.security.AccessControlException: Client cannot
> >> authenticate via:[TOKEN]
> >> 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO
> >>  retry.RetryInvocationHandler - Exception while invoking allocate of
> >>class
> >> ApplicationMasterProtocolPBClientImpl over rm2 after 287 fail over
> >> attempts. Trying to fail over immediately.
> >> java.io.IOException: Failed on local exception: java.io.IOException:
> >> org.apache.hadoop.security.AccessControlException: Client cannot
> >> authenticate via:[TOKEN]; Host Details : local host is: "<SliderAM
> >> HOST>/<slider AM Host IP>"; destination host is: "<RM2 HOST>":23130;
> >>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1476)
> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1403)
> >>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> >>         at com.sun.proxy.$Proxy23.allocate(Unknown Source)
> >>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
> >>         at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>         at java.lang.reflect.Method.invoke(Method.java:497)
> >>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
> >>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> >>         at com.sun.proxy.$Proxy24.allocate(Unknown Source)
> >>         at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:278)
> >>         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:224)
> >> Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
> >>         at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
> >>         at java.security.AccessController.doPrivileged(Native Method)
> >>         at javax.security.auth.Subject.doAs(Subject.java:422)
> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> >>         at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
> >>         at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
> >>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
> >>         at org.apache.hadoop.ipc.Client.call(Client.java:1442)
> >>         ... 12 more
> >> Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN]
> >>         at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
> >>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
> >>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
> >>         at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
> >>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
> >>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
> >>         at java.security.AccessController.doPrivileged(Native Method)
> >>         at javax.security.auth.Subject.doAs(Subject.java:422)
> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
> >>         ... 15 more
> >> 2016-07-25 19:06:48,089 [AMRM Heartbeater thread] INFO
> >>  client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
> >>
>
>
