Environment: CDH 5.5.1 cluster with Kerberos enabled, Slider version 0.80.

Sometimes Slider commands hang and eventually fail with a socket timeout, for example:

slider list <app> --containers

[r...@s-76zyl02.sys.az1.eng.pdx.wd ~]# slider list spas --containers
2017-09-18 21:44:45,659 [main] INFO  tools.SliderUtils - JVM initialized into secure mode with kerberos realm BIGDATA
Exception: Call From <host running command>/<host_ip> to <slider_AM_HOST> failed on socket timeout exception: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<slider command_host>:46777 remote=<host_running_slider_am>/<IP of host running slider am>:32120]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:750)
    at org.apache.hadoop.ipc.Client.call(Client.java:1476)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy19.getLiveContainers(Unknown Source)
    at org.apache.slider.server.appmaster.rpc.SliderClusterProtocolProxy.getLiveContainers(SliderClusterProtocolProxy.java:229)
    at org.apache.slider.client.ipc.SliderClusterOperations.getContainers(SliderClusterOperations.java:458)
    at org.apache.slider.client.SliderClient.getContainers(SliderClient.java:2763)
    at org.apache.slider.client.SliderClient.actionList(SliderClient.java:2735)
    at org.apache.slider.client.SliderClient.exec(SliderClient.java:510)
    at org.apache.slider.client.SliderClient.runService(SliderClient.java:424)
    at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
    at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
    at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
    at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
    at org.apache.slider.Slider.main(Slider.java:49)
Caused by: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<Local_IP>:46777 remote=<Slider_AM_HOST>/<slider_am_host_ip>:32120]
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:515)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1075)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:970)
2017-09-18 21:45:01,499 [main] INFO  util.ExitUtil - Exiting with status 56


The Slider AM log shows no errors. The only warning I can see is from the TGT
renewer:

2017-09-18 15:40:57,009 [TGT Renewer for xyz@mydomain] WARN  security.UserGroupInformation - Exception encountered while running the renewal command. Aborting renew thread. ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials
2017-09-18 15:43:29,536 [Socket Reader #1 for port 32120] INFO  ipc.Server - Auth successful for xyz@mydomain (auth:SIMPLE)
2017-09-18 15:43:29,537 [Socket Reader #1 for port 32120] INFO  authorize.ServiceAuthorizationManager - Authorization successful for xyz@mydomain (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB
2017-09-18 15:48:29,569 [Socket Reader #1 for port 32120] INFO  ipc.Server - Auth successful for xyz@mydomain (auth:SIMPLE)
2017-09-18 15:48:29,570 [Socket Reader #1 for port 32120] INFO  authorize.ServiceAuthorizationManager - Authorization successful for xyz@mydomain (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB
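
To line up the AM-side warning with the client-side timeout, a small script can pull the relevant timestamps out of the AM log. This is only a rough sketch: the helper names (parse_ts, entries, renewal_aborts) are made up for illustration, it assumes the timestamp layout seen in the excerpts here, and it assumes the client and AM hosts use the same clock and time zone, which may not hold. It shows how far apart the two events are; it does not establish that the expired ticket is the cause of the hang.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the excerpts, e.g. "2017-09-18 15:40:57,009 ".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) ")


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    m = TS.match(line)
    if not m:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(m.group(2)) * 1000)


def entries(log_text):
    """Yield (timestamp, text) pairs, folding continuation lines (lines
    without a timestamp) into the entry that precedes them."""
    current = None
    for line in log_text.splitlines():
        ts = parse_ts(line)
        if ts is not None:
            if current is not None:
                yield current
            current = (ts, line.strip())
        elif current is not None and line.strip():
            current = (current[0], current[1] + " " + line.strip())
    if current is not None:
        yield current


def renewal_aborts(log_text):
    """Timestamps of entries reporting that the TGT renew thread gave up."""
    return [ts for ts, text in entries(log_text) if "Aborting renew thread" in text]


# Sample input copied from the AM log in this post (hypothetical variable name).
SAMPLE = """\
2017-09-18 15:40:57,009 [TGT Renewer for xyz@mydomain] WARN
 security.UserGroupInformation - Exception encountered while running the
renewal command. Aborting renew thread. ExitCodeException exitCode=1:
kinit: Ticket expired while renewing credentials
2017-09-18 15:43:29,536 [Socket Reader #1 for port 32120] INFO  ipc.Server
- Auth successful for xyz@mydomain (auth:SIMPLE)
"""

if __name__ == "__main__":
    # First line of the failing client command's output, from the console above.
    client_start = parse_ts("2017-09-18 21:44:45,659 [main] INFO")
    for ts in renewal_aborts(SAMPLE):
        print("renew thread aborted at", ts, "| gap to client command:", client_start - ts)
```

Running the same function over the full AM log (rather than the short sample) would also show whether the renewer warning appears once or repeatedly before the commands start to hang.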
