[ 
https://issues.apache.org/jira/browse/HDDS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821021#comment-16821021
 ] 

Elek, Marton commented on HDDS-1384:
------------------------------------

Thanks to the comment from [~shashikant] at HDDS-1282, I learned that problem 
can be a result of a concurrency problem. There could be a short time between 
identifying a free port in RATIS and the usage. So it's possible that the port 
was free at the time of the decision but it's not free any more when somebody 
starts to use it.

I am trying to address this issue to use fixed incremental ports instead of 
random ports (but choose the next port if a port is not available from the 
range). 

> TestBlockOutputStreamWithFailures is failing
> --------------------------------------------
>
>                 Key: HDDS-1384
>                 URL: https://issues.apache.org/jira/browse/HDDS-1384
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: test
>            Reporter: Nanda kumar
>            Assignee: Elek, Marton
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> TestBlockOutputStreamWithFailures is failing with the following error
> {noformat}
> 2019-04-04 18:52:43,240 INFO  volume.ThrottledAsyncChecker 
> (ThrottledAsyncChecker.java:schedule(140)) - Scheduling a check for 
> org.apache.hadoop.ozone.container.common.volume.HddsVolume@1f6c0e8a
> 2019-04-04 18:52:43,240 INFO  volume.HddsVolumeChecker 
> (HddsVolumeChecker.java:checkAllVolumes(203)) - Scheduled health check for 
> volume org.apache.hadoop.ozone.container.common.volume.HddsVolume@1f6c0e8a
> 2019-04-04 18:52:43,241 ERROR server.GrpcService 
> (ExitUtils.java:terminate(133)) - Terminating with exit status 1: Failed to 
> start Grpc server
> java.io.IOException: Failed to bind
>   at 
> org.apache.ratis.thirdparty.io.grpc.netty.NettyServer.start(NettyServer.java:253)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:166)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.start(ServerImpl.java:81)
>   at org.apache.ratis.grpc.server.GrpcService.startImpl(GrpcService.java:144)
>   at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:202)
>   at 
> org.apache.ratis.server.impl.RaftServerRpcWithProxy.start(RaftServerRpcWithProxy.java:69)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.lambda$start$3(RaftServerProxy.java:300)
>   at org.apache.ratis.util.LifeCycle.startAndTransition(LifeCycle.java:202)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:298)
>   at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:419)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:186)
>   at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:169)
>   at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:338)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.BindException: Address already in use
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:130)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:558)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1358)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:501)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:486)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:1019)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.AbstractChannel.bind(AbstractChannel.java:254)
>   at 
> org.apache.ratis.thirdparty.io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:366)
>   at 
> org.apache.ratis.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>   at 
> org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
>   at 
> org.apache.ratis.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
>   at 
> org.apache.ratis.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
>   at 
> org.apache.ratis.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   ... 1 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to