[ https://issues.apache.org/jira/browse/HADOOP-16385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876250#comment-16876250 ]
Hadoop QA commented on HADOOP-16385:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 0s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 37s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 38s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}104m 32s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.5 Server=18.09.5 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HADOOP-16385 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973338/HADOOP-16385-03.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 99325212fb40 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1e727cf |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/16363/testReport/ |
| Max. process+thread count | 1364 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/16363/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Namenode crashes with "RedundancyMonitor thread received Runtime exception"
> ---------------------------------------------------------------------------
>
> Key: HADOOP-16385
> URL: https://issues.apache.org/jira/browse/HADOOP-16385
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 3.1.1
> Reporter: krishna reddy
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HADOOP-16385-01.patch, HADOOP-16385-02.patch, HADOOP-16385-03.patch, HADOOP-16385.branch-3.1.001.patch
>
>
> *Description:* While removing dead nodes, the NameNode went down with the error "RedundancyMonitor thread received Runtime exception".
> *Environment:*
> Server OS: Ubuntu
> Number of cluster nodes: 1 NN / 225 DNs / 3 ZK / 2 RM / 4850 NMs
> 240 machines in total, each running 21 Docker containers (1 DN and 20 NMs)
> *Steps:*
> 1. Total number of containers in running state: ~53000
> 2. Because of the load, the machine was running out of memory and restarting, and all of its Docker containers, including the NMs and DNs, were started again.
> 3. At some point the NameNode threw the error below while removing a node and went down.
> {noformat}
> 2019-06-19 05:54:07,262 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /rack-1550/255.255.117.195:23735
> 2019-06-19 05:54:07,263 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* removeDeadDatanode: lost heartbeat from 255.255.117.151:23735, removeBlocksFromBlockMap true
> 2019-06-19 05:54:07,281 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /rack-4097/255.255.117.151:23735
> 2019-06-19 05:54:07,282 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* removeDeadDatanode: lost heartbeat from 255.255.116.213:23735, removeBlocksFromBlockMap true
> 2019-06-19 05:54:07,290 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: RedundancyMonitor thread received Runtime exception.
> java.lang.IllegalArgumentException: 247 should >= 248, and both should be positive.
>     at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>     at org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:575)
>     at org.apache.hadoop.net.NetworkTopology.chooseRandom(NetworkTopology.java:552)
>     at org.apache.hadoop.hdfs.net.DFSNetworkTopology.chooseRandomWithStorageTypeTwoTrial(DFSNetworkTopology.java:122)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseDataNode(BlockPlacementPolicyDefault.java:873)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:770)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:712)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:507)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:425)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargets(BlockPlacementPolicyDefault.java:311)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:290)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:143)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:103)
>     at org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:51)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1902)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1854)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4842)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4709)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-06-19 05:54:07,296 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.IllegalArgumentException: 247 should >= 248, and both should be positive.
> 2019-06-19 05:54:07,298 INFO org.apache.hadoop.hdfs.server.common.HadoopAuditLogger.audit: process=Namenode operation=shutdown result=invoked
> 2019-06-19 05:54:07,298 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at namenode/255.255.182.104
> ************************************************************/
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
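
For readers following the stack trace: the message "247 should >= 248, and both should be positive." reads like an argument check on two node counts, raised from Preconditions.checkArgument. The following is only a minimal sketch of a guard of that shape, not the Hadoop source. The class name PreconditionSketch, the method validateCounts, and the parameter names totalInScopeNodes and availableNodes are hypothetical, and the local checkArgument is a stand-in for the Guava call named in the trace.

```java
public class PreconditionSketch {

    // Hypothetical stand-in for the argument check named in the stack trace:
    // throws IllegalArgumentException with the given message when the
    // condition is false.
    static void checkArgument(boolean expression, String message) {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    // Hypothetical helper (assumption, not Hadoop code): requires that the
    // first count is at least the second and that the second is positive,
    // which is what the logged message text suggests.
    static void validateCounts(int totalInScopeNodes, int availableNodes) {
        checkArgument(
            totalInScopeNodes >= availableNodes && availableNodes > 0,
            String.format("%d should >= %d, and both should be positive.",
                totalInScopeNodes, availableNodes));
    }

    public static void main(String[] args) {
        // Consistent counts pass silently.
        validateCounts(248, 247);

        // Inconsistent counts, matching the numbers in the log above,
        // fail with the same message text.
        try {
            validateCounts(247, 248);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the log above the exception is not caught on the RedundancyMonitor thread, so the process exits with status 1. The sketch only illustrates how two counts that disagree by one node produce that message; it makes no claim about why the counts diverged.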