[ 
https://issues.apache.org/jira/browse/YARN-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947583#comment-16947583
 ] 

Zoltan Siegl commented on YARN-9541:
------------------------------------

TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV2Enabled
TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled
TestSystemMetricsPublisherForV2.testPublishApplicationMetrics
TestSystemMetricsPublisherForV2.testPublishAppAttemptMetrics
TestSystemMetricsPublisherForV2.testPublishContainerMetrics
These fail together consistently, and seemingly for the same root cause.

> TestCombinedSystemMetricsPublisher and TestSystemMetricsPublisherForV2 fail 
> intermittent
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-9541
>                 URL: https://issues.apache.org/jira/browse/YARN-9541
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: ATSv2
>    Affects Versions: 3.2.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Minor
>
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled
> {code}
> Failing for the past 1 build (Since Failed#24071 )
> Took 0.19 sec.
> Error Message
> java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> Stacktrace
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
>       at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:66)
>       at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:55)
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.serviceStart(ApplicationHistoryClientService.java:94)
>       at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>       at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:120)
>       at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testSetup(TestCombinedSystemMetricsPublisher.java:123)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.runTest(TestCombinedSystemMetricsPublisher.java:242)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled(TestCombinedSystemMetricsPublisher.java:252)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>       at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:738)
>       at org.apache.hadoop.ipc.Server.bind(Server.java:599)
>       at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1121)
>       at org.apache.hadoop.ipc.Server.<init>(Server.java:2976)
>       at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:427)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
>       at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173)
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
>       ... 22 more
> Caused by: java.net.BindException: Address already in use
>       at sun.nio.ch.Net.bind0(Native Method)
>       at sun.nio.ch.Net.bind(Net.java:433)
>       at sun.nio.ch.Net.bind(Net.java:425)
>       at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>       at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>       at org.apache.hadoop.ipc.Server.bind(Server.java:582)
>       ... 30 more
> Standard Output
> 2019-05-09 12:38:27,124 INFO  [Time-limited test] 
> collector.TimelineCollectorManager 
> (TimelineCollectorManager.java:createTimelineWriter(78)) - Using 
> TimelineWriter: 
> org.apache.hadoop.yarn.server.timelineservice.storage.FileSystemTimelineWriterImpl
> 2019-05-09 12:38:27,194 INFO  [Time-limited test] impl.MetricsConfig 
> (MetricsConfig.java:loadFirst(118)) - Loaded properties from 
> hadoop-metrics2.properties
> 2019-05-09 12:38:27,198 INFO  [Time-limited test] impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:startTimer(374)) - Scheduled Metric snapshot period 
> at 0 second(s).
> 2019-05-09 12:38:27,198 INFO  [Time-limited test] impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:start(191)) - ApplicationHistoryServer metrics system 
> started
> 2019-05-09 12:38:27,200 INFO  [Time-limited test] 
> delegation.AbstractDelegationTokenSecretManager 
> (AbstractDelegationTokenSecretManager.java:updateCurrentKey(354)) - Updating 
> the current master key for generating delegation tokens
> 2019-05-09 12:38:27,202 INFO  [Time-limited test] ipc.CallQueueManager 
> (CallQueueManager.java:<init>(84)) - Using callQueue: class 
> java.util.concurrent.LinkedBlockingQueue, queueCapacity: 1000, scheduler: 
> class org.apache.hadoop.ipc.DefaultRpcScheduler, ipcBackoff: false.
> 2019-05-09 12:38:27,205 INFO  [Thread[Thread-50,5,FailOnTimeoutGroup]] 
> delegation.AbstractDelegationTokenSecretManager 
> (AbstractDelegationTokenSecretManager.java:run(686)) - Starting expired 
> delegation token remover thread, tokenRemoverScanInterval=60 min(s)
> 2019-05-09 12:38:27,217 INFO  [Thread[Thread-50,5,FailOnTimeoutGroup]] 
> delegation.AbstractDelegationTokenSecretManager 
> (AbstractDelegationTokenSecretManager.java:updateCurrentKey(354)) - Updating 
> the current master key for generating delegation tokens
> 2019-05-09 12:38:27,217 INFO  [Time-limited test] service.AbstractService 
> (AbstractService.java:noteFailure(267)) - Service 
> ApplicationHistoryClientService failed in state STARTED
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
>       at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:66)
>       at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:55)
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.serviceStart(ApplicationHistoryClientService.java:94)
>       at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>       at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:120)
>       at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testSetup(TestCombinedSystemMetricsPublisher.java:123)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.runTest(TestCombinedSystemMetricsPublisher.java:242)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled(TestCombinedSystemMetricsPublisher.java:252)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>       at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:738)
>       at org.apache.hadoop.ipc.Server.bind(Server.java:599)
>       at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1121)
>       at org.apache.hadoop.ipc.Server.<init>(Server.java:2976)
>       at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:427)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
>       at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173)
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
>       ... 22 more
> Caused by: java.net.BindException: Address already in use
>       at sun.nio.ch.Net.bind0(Native Method)
>       at sun.nio.ch.Net.bind(Net.java:433)
>       at sun.nio.ch.Net.bind(Net.java:425)
>       at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>       at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>       at org.apache.hadoop.ipc.Server.bind(Server.java:582)
>       ... 30 more
> 2019-05-09 12:38:27,231 INFO  [Time-limited test] service.AbstractService 
> (AbstractService.java:noteFailure(267)) - Service 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
>  failed in state STARTED
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
>       at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:66)
>       at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:55)
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.serviceStart(ApplicationHistoryClientService.java:94)
>       at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>       at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:120)
>       at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testSetup(TestCombinedSystemMetricsPublisher.java:123)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.runTest(TestCombinedSystemMetricsPublisher.java:242)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled(TestCombinedSystemMetricsPublisher.java:252)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:10200] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>       at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:738)
>       at org.apache.hadoop.ipc.Server.bind(Server.java:599)
>       at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1121)
>       at org.apache.hadoop.ipc.Server.<init>(Server.java:2976)
>       at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:427)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
>       at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173)
>       at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
>       ... 22 more
> Caused by: java.net.BindException: Address already in use
>       at sun.nio.ch.Net.bind0(Native Method)
>       at sun.nio.ch.Net.bind(Net.java:433)
>       at sun.nio.ch.Net.bind(Net.java:425)
>       at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>       at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>       at org.apache.hadoop.ipc.Server.bind(Server.java:582)
>       ... 30 more
> 2019-05-09 12:38:27,232 INFO  [Time-limited test] impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:stop(210)) - Stopping ApplicationHistoryServer 
> metrics system...
> 2019-05-09 12:38:27,232 INFO  [Time-limited test] impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:stop(216)) - ApplicationHistoryServer metrics system 
> stopped.
> 2019-05-09 12:38:27,233 INFO  [Time-limited test] impl.MetricsSystemImpl 
> (MetricsSystemImpl.java:shutdown(607)) - ApplicationHistoryServer metrics 
> system shutdown complete.
> 2019-05-09 12:38:27,234 ERROR [Thread[Thread-50,5,FailOnTimeoutGroup]] 
> delegation.AbstractDelegationTokenSecretManager 
> (AbstractDelegationTokenSecretManager.java:run(707)) - ExpiredTokenRemover 
> received java.lang.InterruptedException: sleep interrupted
> {code}
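
The trace above shows ApplicationHistoryClientService failing to bind to the fixed address 0.0.0.0:10200 because something else on the build host already holds that port. As a minimal sketch of that failure mode (not the fix adopted for this issue), the standalone Java below reproduces "Address already in use" with plain java.net sockets and shows that asking the OS for port 0 yields a free ephemeral port instead. The class name PortBindingSketch and the helper bindFails are hypothetical names used only for illustration; nothing here is Hadoop API.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

public class PortBindingSketch {

    /**
     * Hypothetical helper: tries to open a second listener on the given port
     * and reports whether the attempt was rejected with a BindException.
     */
    static boolean bindFails(int port) throws IOException {
        try (ServerSocket second = new ServerSocket(port)) {
            // The bind succeeded, so nothing else was listening on this port.
            return false;
        } catch (BindException e) {
            // Same condition as in the trace: the port is already taken.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free ephemeral port.
        try (ServerSocket first = new ServerSocket(0)) {
            int port = first.getLocalPort();

            // While 'first' is still listening, a second bind to the same port is rejected.
            System.out.println("second bind rejected: " + bindFails(port));

            // A second listener that also asks for port 0 gets a different free port.
            try (ServerSocket other = new ServerSocket(0)) {
                System.out.println("ephemeral ports differ: "
                        + (other.getLocalPort() != port));
            }
        }
    }
}
```

If the five tests listed in the comment all start a server on the same hard-coded port, a collision of this kind in any one of them would plausibly take the others down as well, which would be consistent with them failing together.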



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

