Are you able to go to the RM UI and load the ApplicationMaster web UI for
this app?

-Gour

On 9/21/17, 11:00 AM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:

>Any thoughts?
>
>On Mon, Sep 18, 2017 at 3:22 PM, Manoj Samel <manojsamelt...@gmail.com>
>wrote:
>
>>
>> We run a CDH 5.5.1 cluster with Kerberos and Slider version 0.80.
>>
>> Sometimes Slider commands start hanging, for example:
>>
>> slider list <app> --containers
>>
>> [r...@s-76zyl02.sys.az1.eng.pdx.wd ~]# slider list spas --containers
>> 2017-09-18 21:44:45,659 [main] INFO  tools.SliderUtils - JVM initialized into secure mode with kerberos realm BIGDATA
>> Exception: Call From <host running command>/<host_ip> to <slider_AM_HOST> failed on socket timeout exception: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<slider command_host>:46777 remote=<host_running_slider_am>/<IP of host running slider am>:32120]; For more details see:  http://wiki.apache.org/hadoop/SocketTimeout
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:750)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>>     at com.sun.proxy.$Proxy19.getLiveContainers(Unknown Source)
>>     at org.apache.slider.server.appmaster.rpc.SliderClusterProtocolProxy.getLiveContainers(SliderClusterProtocolProxy.java:229)
>>     at org.apache.slider.client.ipc.SliderClusterOperations.getContainers(SliderClusterOperations.java:458)
>>     at org.apache.slider.client.SliderClient.getContainers(SliderClient.java:2763)
>>     at org.apache.slider.client.SliderClient.actionList(SliderClient.java:2735)
>>     at org.apache.slider.client.SliderClient.exec(SliderClient.java:510)
>>     at org.apache.slider.client.SliderClient.runService(SliderClient.java:424)
>>     at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
>>     at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
>>     at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
>>     at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
>>     at org.apache.slider.Slider.main(Slider.java:49)
>> Caused by: java.net.SocketTimeoutException: 15000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<Local_IP>:46777 remote=<Slider_AM_HOST>/<slider_am_host_ip>:32120]
>>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>     at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:515)
>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>>     at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1075)
>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:970)
>> 2017-09-18 21:45:01,499 [main] INFO  util.ExitUtil - Exiting with status 56
>>
>>
>> The Slider AM log shows no errors. The only warning I can see is from
>> the TGT renewer:
>>
>> 2017-09-18 15:40:57,009 [TGT Renewer for xyz@mydomain] WARN  security.UserGroupInformation - Exception encountered while running the renewal command. Aborting renew thread. ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials
>> 2017-09-18 15:43:29,536 [Socket Reader #1 for port 32120] INFO  ipc.Server - Auth successful for xyz@mydomain (auth:SIMPLE)
>> 2017-09-18 15:43:29,537 [Socket Reader #1 for port 32120] INFO  authorize.ServiceAuthorizationManager - Authorization successful for xyz@mydomain (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB
>> 2017-09-18 15:48:29,569 [Socket Reader #1 for port 32120] INFO  ipc.Server - Auth successful for xyz@mydomain (auth:SIMPLE)
>> 2017-09-18 15:48:29,570 [Socket Reader #1 for port 32120] INFO  authorize.ServiceAuthorizationManager - Authorization successful for xyz@mydomain (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB
>>
