[ 
https://issues.apache.org/jira/browse/HDDS-2107?focusedWorklogId=310275&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310275
 ]

ASF GitHub Bot logged work on HDDS-2107:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Sep/19 03:54
            Start Date: 11/Sep/19 03:54
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus commented on issue #1424: HDDS-2107. 
Datanodes should retry forever to connect to SCM in an…
URL: https://github.com/apache/hadoop/pull/1424#issuecomment-530208630
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 41 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 1 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 67 | Maven dependency ordering for branch |
   | +1 | mvninstall | 589 | trunk passed |
   | +1 | compile | 381 | trunk passed |
   | +1 | checkstyle | 83 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 868 | branch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 178 | trunk passed |
   | 0 | spotbugs | 417 | Used deprecated FindBugs config; considering 
switching to SpotBugs. |
   | +1 | findbugs | 615 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 41 | Maven dependency ordering for patch |
   | +1 | mvninstall | 536 | the patch passed |
   | +1 | compile | 387 | the patch passed |
   | +1 | javac | 387 | the patch passed |
   | +1 | checkstyle | 90 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 678 | patch has no errors when building and testing 
our client artifacts. |
   | +1 | javadoc | 175 | the patch passed |
   | +1 | findbugs | 631 | the patch passed |
   ||| _ Other Tests _ |
   | -1 | unit | 280 | hadoop-hdds in the patch failed. |
   | -1 | unit | 2824 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 55 | The patch does not generate ASF License warnings. |
   | | | 8715 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | 
hadoop.hdds.scm.container.placement.algorithms.TestSCMContainerPlacementRackAware
 |
   |   | hadoop.ozone.container.TestContainerReplication |
   |   | hadoop.ozone.client.rpc.TestCloseContainerHandlingByClient |
   |   | 
hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
   |   | hadoop.ozone.client.rpc.TestContainerStateMachineFailures |
   |   | hadoop.ozone.client.rpc.Test2WayCommitInRatis |
   |   | hadoop.ozone.TestSecureOzoneCluster |
   |   | hadoop.ozone.scm.TestContainerSmallFile |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStream |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   |   | hadoop.ozone.om.TestOzoneManagerHA |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1424/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1424 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 2f98f8163e51 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / f8f8598 |
   | Default Java | 1.8.0_222 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1424/1/artifact/out/patch-unit-hadoop-hdds.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1424/1/artifact/out/patch-unit-hadoop-ozone.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1424/1/testReport/ |
   | Max. process+thread count | 5408 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/container-service hadoop-ozone/ozone-manager U: . 
|
   | Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-1424/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 310275)
    Time Spent: 40m  (was: 0.5h)

> Datanodes should retry forever to connect to SCM in an unsecure environment
> ---------------------------------------------------------------------------
>
>                 Key: HDDS-2107
>                 URL: https://issues.apache.org/jira/browse/HDDS-2107
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.4.1
>            Reporter: Vivek Ratnavel Subramanian
>            Assignee: Vivek Ratnavel Subramanian
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> In an unsecure environment, the datanodes try upto 10 times after waiting for 
> 1000 milliseconds each time before throwing this error:
> {code:java}
> Unable to communicate to SCM server at scm:9861 for past 0 seconds.
> java.net.ConnectException: Call From scm/10.65.36.118 to scm:9861 failed on 
> connection exception: java.net.ConnectException: Connection refused; For more 
> details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>       at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:755)
>       at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1457)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1367)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>       at com.sun.proxy.$Proxy33.getVersion(Unknown Source)
>       at 
> org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolClientSideTranslatorPB.getVersion(StorageContainerDatanodeProtocolClientSideTranslatorPB.java:112)
>       at 
> org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:70)
>       at 
> org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.ConnectException: Connection refused
>       at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>       at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>       at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>       at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
>       at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
>       at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
>       at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1403)
>       ... 13 more
> {code}
> The datanodes should try forever to connect with SCM and not fail immediately 
> after 10 retries.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to