[ https://issues.apache.org/jira/browse/YARN-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15675413#comment-15675413 ]
gaoyanfu commented on YARN-5898:
--------------------------------

The AppMaster did not restart. Usually, after the AppMaster has been running for some time, some containers start to hit this exception.

> Container can not stop, because the call stopContainer NMClient method appears DIGEST-MD5 exception, onGetContainerStatusError NMClientAsync method is also the same
> ------------------------------------------------------------------------
>
>                 Key: YARN-5898
>                 URL: https://issues.apache.org/jira/browse/YARN-5898
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: api
>   Affects Versions: 2.6.0
>         Environment: cdh5.5,java 7
>            Reporter: gaoyanfu
>              Labels: DIGEST-MD5, getContainerStatuses, onGetContainerStatusError, stopContainer
>             Fix For: 2.6.0
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Calling the NMClientAsync getContainerStatusAsync method triggers the
> corresponding onGetContainerStatusError callback with a DIGEST-MD5
> SaslException, so the ContainerStatus cannot be obtained. Calling the
> nmClient stopContainer method throws the same exception, so the Container
> is not stopped.
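For anyone trying to tell this failure apart from other errors delivered to onGetContainerStatusError, a minimal sketch follows. It only walks the cause chain of a Throwable and looks for the "DIGEST-MD5" marker seen in the log for this issue. The class name DigestMd5Check and the method isDigestMd5Failure are hypothetical and not part of Hadoop; this is a diagnostic aid under those assumptions, not a fix for the underlying problem.

```java
import javax.security.sasl.SaslException;

// Hypothetical helper, not part of Hadoop: classifies a Throwable passed to an
// error callback by looking for the DIGEST-MD5 marker anywhere in its cause chain.
public class DigestMd5Check {

    // Returns true if t or one of its causes carries a message containing "DIGEST-MD5".
    // The depth limit guards against a cyclic cause chain.
    public static boolean isDigestMd5Failure(Throwable t) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains("DIGEST-MD5")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-ins for the exceptions in the report: a SaslException wrapping a remote cause.
        String text = "DIGEST-MD5: digest response format violation. Mismatched response.";
        Throwable remote = new RuntimeException(text);
        Throwable sasl = new SaslException(text, remote);

        System.out.println(isDigestMd5Failure(sasl));
        System.out.println(isDigestMd5Failure(new IllegalStateException("container not found")));
    }
}
```

Inside a callback handler, the check could be applied to the Throwable argument before deciding whether to log, retry, or escalate; which of those is appropriate is left open here.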
> ---------------------------REST API-------------------------------
> request:
> http://server3.xdpp.boco:8042/ws/v1/node/containers
> response:
> {"containers":{"container":[
> {"id":"container_e07_1477704520017_0001_01_000004","state":"RUNNING","exitCode":-1000,"diagnostics":"","user":"xdpp","totalMemoryNeededMB":8704,"totalVCoresNeeded":1,"containerLogsLink":"http://server3.xdpp.boco:8042/node/containerlogs/container_e07_1477704520017_0001_01_000004/xdpp","nodeId":"server3.xdpp.boco:8041"},
> {"id":"container_e09_1477719748865_0003_01_000025","state":"RUNNING","exitCode":-1000,"diagnostics":"","user":"xdpp","totalMemoryNeededMB":1536,"totalVCoresNeeded":1,"containerLogsLink":"http://server3.xdpp.boco:8042/node/containerlogs/container_e09_1477719748865_0003_01_000025/xdpp","nodeId":"server3.xdpp.boco:8041"},
> {"id":"container_e09_1477719748865_0004_02_000103","state":"RUNNING","exitCode":-1000,"diagnostics":"","user":"xdpp","totalMemoryNeededMB":6656,"totalVCoresNeeded":1,"containerLogsLink":"http://server3.xdpp.boco:8042/node/containerlogs/container_e09_1477719748865_0004_02_000103/xdpp","nodeId":"server3.xdpp.boco:8041"}
> ]}}
> -----------------------exception----------------------------------
> 2016-11-14 11:17:12.725 ERROR containerStatusLogger [ContainerManager.java:484] *********Container onGetContainerStatusError deal begin.containerId:container_e09_1477719748865_0003_01_000025
> javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response.
>       at sun.reflect.GeneratedConstructorAccessor59.newInstance(Unknown Source) ~[na:na]
>       at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.7.0_79]
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:526) ~[na:1.7.0_79]
>       at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) ~[hadoop-yarn-common-2.6.0.jar:na]
>       at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104) ~[hadoop-yarn-common-2.6.0.jar:na]
>       at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.getContainerStatuses(ContainerManagementProtocolPBClientImpl.java:127) ~[hadoop-yarn-common-2.6.0.jar:na]
>       at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source) ~[na:na]
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_79]
>       at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_79]
>       at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[hadoop-common-2.6.0.jar:na]
>       at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[hadoop-common-2.6.0.jar:na]
>       at com.sun.proxy.$Proxy23.getContainerStatuses(Unknown Source) ~[na:na]
>       at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.getContainerStatus(NMClientImpl.java:267) ~[hadoop-yarn-client-2.6.0.jar:na]
>       at org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl$ContainerEventProcessor.run(NMClientAsyncImpl.java:534) ~[hadoop-yarn-client-2.6.0.jar:na]
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
>       at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> Caused by: org.apache.hadoop.ipc.RemoteException: DIGEST-MD5: digest response format violation. Mismatched response.
>       at org.apache.hadoop.ipc.Client.call(Client.java:1468) ~[hadoop-common-2.6.0.jar:na]
>       at org.apache.hadoop.ipc.Client.call(Client.java:1399) ~[hadoop-common-2.6.0.jar:na]
>       at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[hadoop-common-2.6.0.jar:na]
>       at com.sun.proxy.$Proxy22.getContainerStatuses(Unknown Source) ~[na:na]
>       at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.getContainerStatuses(ContainerManagementProtocolPBClientImpl.java:124) ~[hadoop-yarn-common-2.6.0.jar:na]
>       ... 11 common frames omitted