[ https://issues.apache.org/jira/browse/YARN-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542271#comment-14542271 ]
Junping Du commented on YARN-3634:
----------------------------------

Thanks [~sjlee0] for updating the patch! Latest patch LGTM. +1 pending on Jenkins' result.

> TestMRTimelineEventHandling and TestApplication are broken
> ----------------------------------------------------------
>
>                 Key: YARN-3634
>                 URL: https://issues.apache.org/jira/browse/YARN-3634
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>         Attachments: YARN-3634-YARN-2928.001.patch, YARN-3634-YARN-2928.002.patch,
>                      YARN-3634-YARN-2928.003.patch, YARN-3634-YARN-2928.004.patch
>
>
> TestMRTimelineEventHandling is broken. Relevant error message:
> {noformat}
> 2015-05-12 06:28:56,415 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:28:57,416 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:28:58,416 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:28:59,417 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:00,418 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:01,419 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:02,420 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:03,420 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:04,421 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:05,422 INFO [AsyncDispatcher event handler] ipc.Client (Client.java:handleConnectionFailure(882)) - Retrying connect to server: asf904.gq1.ygridcore.net/67.195.81.148:0. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 2015-05-12 06:29:05,424 ERROR [AsyncDispatcher event handler] collector.NodeTimelineCollectorManager (NodeTimelineCollectorManager.java:postPut(121)) - Failed to communicate with NM Collector Service for application_1431412130291_0001
> 2015-05-12 06:29:05,425 WARN [AsyncDispatcher event handler] containermanager.AuxServices (AuxServices.java:logWarningWhenAuxServiceThrowExceptions(261)) - The auxService name is timeline_collector and it got an error at event: CONTAINER_INIT
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From asf904.gq1.ygridcore.net/67.195.81.148 to asf904.gq1.ygridcore.net:0 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.putIfAbsent(TimelineCollectorManager.java:97)
>     at org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.addApplication(PerNodeTimelineCollectorsAuxService.java:99)
>     at org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.initializeContainer(PerNodeTimelineCollectorsAuxService.java:126)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.handle(AuxServices.java:226)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.handle(AuxServices.java:49)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From asf904.gq1.ygridcore.net/67.195.81.148 to asf904.gq1.ygridcore.net:0 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.postPut(NodeTimelineCollectorManager.java:122)
>     at org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.putIfAbsent(TimelineCollectorManager.java:95)
>     ... 7 more
> Caused by: java.net.ConnectException: Call From asf904.gq1.ygridcore.net/67.195.81.148 to asf904.gq1.ygridcore.net:0 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1496)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1423)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>     at com.sun.proxy.$Proxy108.getTimelineCollectorContext(Unknown Source)
>     at org.apache.hadoop.yarn.server.api.impl.pb.client.CollectorNodemanagerProtocolPBClientImpl.getTimelineCollectorContext(CollectorNodemanagerProtocolPBClientImpl.java:99)
>     at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.updateTimelineCollectorContext(NodeTimelineCollectorManager.java:188)
>     at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.postPut(NodeTimelineCollectorManager.java:116)
>     ... 8 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:625)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:723)
>     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1545)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1462)
>     ... 14 more
> {noformat}
> This surfaced when we switched to use port ":0" for the mini-YARN cluster for
> the node collector service.
> Also, TestApplication tests are broken because the mocked context does not
> have the configuration object which ApplicationImpl depends on.
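
For anyone reading the port ":0" failure above without the patch at hand, the general behavior can be reproduced with plain java.net sockets. This is only a minimal sketch and not code from the patch: the class and method names (EphemeralPortDemo, canConnect, probe) are made up for illustration, and it assumes nothing about how the node collector service resolves its address. A server bound to port 0 gets an OS-assigned port, so a client that still dials the configured port 0 cannot reach it, while a client that uses the actual bound port can.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Hadoop.
public class EphemeralPortDemo {

    /** Returns true if a TCP connect to the loopback address on the given port succeeds. */
    static boolean canConnect(int port) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 1000);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Covers connection refused and any platform-specific rejection of port 0.
            return false;
        }
    }

    /**
     * Binds a server to port 0, then probes both the configured port (0)
     * and the port the OS actually assigned.
     * Returns {reachableViaConfiguredPort, reachableViaBoundPort}.
     */
    static boolean[] probe() {
        int configuredPort = 0; // what a ":0" address in the configuration says
        try (ServerSocket server =
                 new ServerSocket(configuredPort, 50, InetAddress.getLoopbackAddress())) {
            int boundPort = server.getLocalPort(); // the port the OS picked
            return new boolean[] { canConnect(configuredPort), canConnect(boundPort) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = probe();
        System.out.println("reachable via configured port 0: " + result[0]);
        System.out.println("reachable via actual bound port: " + result[1]);
    }
}
```

The takeaway for tests that start services on port 0 is that the address handed to clients has to be read back from the server after it binds, rather than taken from the configured value.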