[jira] [Commented] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots
[ https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984274#comment-14984274 ] Harsh J commented on HDFS-8986: --- [~jagadesh.kiran] - What's the precise opposition w.r.t. adding a new flag to explicitly exclude snapshot counts? It does not break compatibility, and is an added feature/improvement. > Add option to -du to calculate directory space usage excluding snapshots > > > Key: HDFS-8986 > URL: https://issues.apache.org/jira/browse/HDFS-8986 > Project: Hadoop HDFS > Issue Type: Improvement > Components: snapshots >Reporter: Gautam Gopalakrishnan >Assignee: Jagadesh Kiran N > > When running {{hadoop fs -du}} on a snapshotted directory (or one of its > children), the report includes space consumed by blocks that are only present > in the snapshots. This is confusing for end users. > {noformat} > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 799.7 M 2.3 G /tmp/parent > 799.7 M 2.3 G /tmp/parent/sub1 > $ hdfs dfs -createSnapshot /tmp/parent snap1 > Created snapshot /tmp/parent/.snapshot/snap1 > $ hadoop fs -rm -skipTrash /tmp/parent/sub1/* > ... > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 799.7 M 2.3 G /tmp/parent > 799.7 M 2.3 G /tmp/parent/sub1 > $ hdfs dfs -deleteSnapshot /tmp/parent snap1 > $ hadoop fs -du -h -s /tmp/parent /tmp/parent/* > 0 0 /tmp/parent > 0 0 /tmp/parent/sub1 > {noformat} > It would be helpful if we had a flag, say -X, to exclude any snapshot related > disk usage in the output -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-9352) Hadoop-Trunk test cases start failing from morning
[ https://issues.apache.org/jira/browse/HDFS-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula resolved HDFS-9352. Resolution: Won't Fix Closed as won't fix. I think we should better revert HDFS-4937 > Hadoop-Trunk test cases start failing from morning > -- > > Key: HDFS-9352 > URL: https://issues.apache.org/jira/browse/HDFS-9352 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Blocker > > {noformat} > Stack Trace: > org.apache.hadoop.ipc.RemoteException: File /TestAbandonBlock_test_quota1 > could only be replicated to 0 nodes instead of minReplication (=1). There > are 2 datanode(s) running and 1 node(s) are excluded in this operation. > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1736) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:299) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2457) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:796) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:976) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2305) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2301) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2301) > {noformat} > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/558/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983864#comment-14983864 ] Brahma Reddy Battula commented on HDFS-4937: I don't know if I can give a -1. But shall we revert this? A low of tests are broken because of it. {code} 662 int refreshCounter = numOfAvailableNodes; ... 671 while(numOfReplicas > 0 && numOfAvailableNodes > 0) { 672 DatanodeDescriptor chosenNode = chooseDataNode(scope); 673 if (excludedNodes.add(chosenNode)) { //was not in the excluded list 674 if (LOG.isDebugEnabled()) { 675 builder.append("\nNode ").append(NodeBase.getPath(chosenNode)).append(" ["); 676 } 677 numOfAvailableNodes--; 678 DatanodeStorageInfo storage = null; 679 if (isGoodDatanode(chosenNode, maxNodesPerRack, considerLoad, ... 711 } 712 // Refresh the node count. If the live node count became smaller, 713 // but it is not reflected in this loop, it may loop forever in case 714 // the replicas/rack cannot be satisfied. 715 if (--refreshCounter == 0) { 716 refreshCounter = clusterMap.countNumOfAvailableNodes(scope, 717 excludedNodes); 718 // It has already gone through enough number of nodes. 719 if (refreshCounter <= excludedNodes.size()) { 720 break; 721 } 722 } 723 } {code} line 672 {{chooseDataNode(scope)}} is random, if {{chosenNode}} happens to be a excluded one, it won't go to line 674. But {{refreshCounter}} is still decreased. If we out of luck, too many times of {{chooseDataNode(scope)}} return a already excluded one, we go inside line 716, and break at line 720. Then we end up with choosing not enough {{numOfReplicas}}. In fact we could have. > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery
[ https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983869#comment-14983869 ] Yongjun Zhang commented on HDFS-9236: - Thanks [~walter.k.su], that makes sense. > Missing sanity check for block size during block recovery > - > > Key: HDFS-9236 > URL: https://issues.apache.org/jira/browse/HDFS-9236 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu > Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, > HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch > > > Ran into an issue while running test against faulty data-node code. > Currently in DataNode.java: > {code:java} > /** Block synchronization */ > void syncBlock(RecoveringBlock rBlock, > List syncList) throws IOException { > … > // Calculate the best available replica state. > ReplicaState bestState = ReplicaState.RWR; > … > // Calculate list of nodes that will participate in the recovery > // and the new block size > List participatingList = new ArrayList(); > final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId, > -1, recoveryId); > switch(bestState) { > … > case RBW: > case RWR: > long minLength = Long.MAX_VALUE; > for(BlockRecord r : syncList) { > ReplicaState rState = r.rInfo.getOriginalReplicaState(); > if(rState == bestState) { > minLength = Math.min(minLength, r.rInfo.getNumBytes()); > participatingList.add(r); > } > } > newBlock.setNumBytes(minLength); > break; > … > } > … > nn.commitBlockSynchronization(block, > newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false, > datanodes, storages); > } > {code} > This code is called by the DN coordinating the block recovery. In the above > case, it is possible for none of the rState (reported by DNs with copies of > the replica being recovered) to match the bestState. This can either be > caused by faulty DN code or stale/modified/corrupted files on DN. When this > happens the DN will end up reporting the minLengh of Long.MAX_VALUE. > Unfortunately there is no check on the NN for replica length. See > FSNamesystem.java: > {code:java} > void commitBlockSynchronization(ExtendedBlock oldBlock, > long newgenerationstamp, long newlength, > boolean closeFile, boolean deleteblock, DatanodeID[] newtargets, > String[] newtargetstorages) throws IOException { > … > if (deleteblock) { > Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock); > boolean remove = iFile.removeLastBlock(blockToDel) != null; > if (remove) { > blockManager.removeBlock(storedBlock); > } > } else { > // update last block > if(!copyTruncate) { > storedBlock.setGenerationStamp(newgenerationstamp); > > // XXX block length is updated without any check <<< storedBlock.setNumBytes(newlength); > } > … > if (closeFile) { > LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock > + ", file=" + src > + (copyTruncate ? ", newBlock=" + truncatedBlock > : ", newgenerationstamp=" + newgenerationstamp) > + ", newlength=" + newlength > + ", newtargets=" + Arrays.asList(newtargets) + ") successful"); > } else { > LOG.info("commitBlockSynchronization(" + oldBlock + ") successful"); > } > } > {code} > After this point the block length becomes Long.MAX_VALUE. Any subsequent > block report (even with correct length) will cause the block to be marked as > corrupted. Since this is block could be the last block of the file. If this > happens and the client goes away, NN won’t be able to recover the lease and > close the file because the last block is under-replicated. > I believe we need to have a sanity check for block size on both DN and NN to > prevent such case from happening. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983905#comment-14983905 ] Yi Liu edited comment on HDFS-4937 at 10/31/15 8:46 AM: Revert from trunk, branch-2, branch-2.7. Thanks Vinay and Brahma. I thought the tests passed... But actually the jenkins doesn't include the tests result. was (Author: hitliuyi): Revert from trunk, branch-2, branch-2.7. Thanks Brahma. I thought the tests passed... But actually the jenkins doesn't include the tests result. > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983924#comment-14983924 ] Hudson commented on HDFS-4937: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #612 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/612/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983923#comment-14983923 ] Hudson commented on HDFS-4937: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #624 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/624/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9354) Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows
[ https://issues.apache.org/jira/browse/HDFS-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983975#comment-14983975 ] Hadoop QA commented on HDFS-9354: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 5s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 44s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 52s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s {color} | {color:red} Patch generated 56 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 157m 31s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.server.mover.TestMover | | | hadoop.hdfs.server.datanode.TestBlockScanner | | | hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.0 Server=1.7.0 Image:test-patch-base-hadoop-date2015-10-31 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12769933/HDFS-9354.00.patch | | JIRA Issue | HDFS-9354 | | Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile | | uname | Linux a4146648b3cc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983954#comment-14983954 ] Hudson commented on HDFS-4937: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1347 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1347/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9354) Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows
Xiaoyu Yao created HDFS-9354: Summary: Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows Key: HDFS-9354 URL: https://issues.apache.org/jira/browse/HDFS-9354 Project: Hadoop HDFS Issue Type: Test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao This negative test expect HadoopIllegalArgumentException on illegal configuration. It uses JUnit (expected=HadoopIllegalArgumentException.class) and passed fine on Linux. On windows, this test passes as well. But it left open handles on NN metadata directories used by MiniDFSCluster. As a result, quite a few of subsequent TestBalancer unit tests can't start MiniDFSCluster. The open handles prevents them from cleaning up NN metadata directories on Windows. This JIRA is opened to explicitly catch the Exception and ensure the test cluster is properly shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9338) in webhdfs Null pointer exception will be thrown when xattrname is not given
[ https://issues.apache.org/jira/browse/HDFS-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadesh Kiran N updated HDFS-9338: --- Status: Patch Available (was: Open) > in webhdfs Null pointer exception will be thrown when xattrname is not given > > > Key: HDFS-9338 > URL: https://issues.apache.org/jira/browse/HDFS-9338 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-9338-00.patch > > > {code} > curl -i -X PUT > "http://10.19.92.127:50070/webhdfs/v1/kiran/?op=REMOVEXATTR=; > or > curl -i -X PUT "http://10.19.92.127:50070/webhdfs/v1/kiran/?op=REMOVEXATTR; > {code} > {code} > {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":"XAttr > name cannot be null."}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9338) in webhdfs Null pointer exception will be thrown when xattrname is not given
[ https://issues.apache.org/jira/browse/HDFS-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadesh Kiran N updated HDFS-9338: --- Attachment: HDFS-9338-00.patch Attached the patch please review > in webhdfs Null pointer exception will be thrown when xattrname is not given > > > Key: HDFS-9338 > URL: https://issues.apache.org/jira/browse/HDFS-9338 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-9338-00.patch > > > {code} > curl -i -X PUT > "http://10.19.92.127:50070/webhdfs/v1/kiran/?op=REMOVEXATTR=; > or > curl -i -X PUT "http://10.19.92.127:50070/webhdfs/v1/kiran/?op=REMOVEXATTR; > {code} > {code} > {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":"XAttr > name cannot be null."}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983973#comment-14983973 ] Hudson commented on HDFS-4937: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2554 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2554/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9254) HDFS Secure Mode Documentation updates
[ https://issues.apache.org/jira/browse/HDFS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983960#comment-14983960 ] Steve Loughran commented on HDFS-9254: -- # check the indentation on some of the XML examples # doesn't my little ebook merit a reference? # you can paste in it's references too, or at least a subset of them: https://github.com/steveloughran/kerberos_and_hadoop/blob/master/sections/biblography.md > HDFS Secure Mode Documentation updates > -- > > Key: HDFS-9254 > URL: https://issues.apache.org/jira/browse/HDFS-9254 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.1 >Reporter: Arpit Agarwal >Assignee: Arpit Agarwal > Attachments: HDFS-9254.01.patch, HDFS-9254.02.patch, > HDFS-9254.03.patch > > > Some Kerberos configuration parameters are not documented well enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9355) Support colocation in HDFS.
Surendra Singh Lilhore created HDFS-9355: Summary: Support colocation in HDFS. Key: HDFS-9355 URL: https://issues.apache.org/jira/browse/HDFS-9355 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Through this feature client can give suggestion to HDFS to write his all the blocks on same set of datanodes. Currently this we can achieve through HDFS-2576. HDFS-2576 give option to hint namenode about favored nodes, but in heterogeneous cluster this will not work out. Support client wants to write his data in directory which have COLD policy, but he don't know which DN have ARCHIVE storage, So he will not able to give favoredNodes list. *Implementation* Colocation can enable by setting "dfs.colocation.enable" true in client configuration. If colocation is enable and favoredNodes list is empty then {{DataStreamer}} will set first set of datanodes as favoredNodes which is chosen for first block and subsequent block will use the same datanodes for write. Before closing file client can get the favoredNodes list and same he can use for writing new file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983986#comment-14983986 ] Hudson commented on HDFS-4937: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #560 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/560/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9308) Add truncateMeta() and deleteMeta() to MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983881#comment-14983881 ] Hadoop QA commented on HDFS-9308: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 14s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 2s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 56s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s {color} | {color:red} Patch generated 56 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 162m 2s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 | | | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 | | | hadoop.hdfs.TestDFSStripedOutputStream | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 | | | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks | | | hadoop.hdfs.TestCrcCorruption | | | hadoop.hdfs.server.blockmanagement.TestNodeCount | | | hadoop.hdfs.TestGetFileChecksum | | | hadoop.hdfs.TestSafeModeWithStripedFile | | | hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | |
[jira] [Commented] (HDFS-9344) QUEUE_WITH_CORRUPT_BLOCKS is no longer needed
[ https://issues.apache.org/jira/browse/HDFS-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983902#comment-14983902 ] Jagadesh Kiran N commented on HDFS-9344: No issues [~xiaochen] , Good Analysis > QUEUE_WITH_CORRUPT_BLOCKS is no longer needed > - > > Key: HDFS-9344 > URL: https://issues.apache.org/jira/browse/HDFS-9344 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.1 >Reporter: Zhe Zhang >Assignee: Xiao Chen >Priority: Minor > > After the change of HDFS-9205, the {{QUEUE_WITH_CORRUPT_BLOCKS}} queue in > {{UnderReplicatedBlocks}} is no longer needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu reopened HDFS-4937: -- > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983905#comment-14983905 ] Yi Liu commented on HDFS-4937: -- Revert from trunk, branch-2, branch-2.7. Thanks Brahma. I thought the tests passed... But actually the jenkins doesn't include the tests result. > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983907#comment-14983907 ] Hudson commented on HDFS-4937: -- FAILURE: Integrated in Hadoop-trunk-Commit #8738 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8738/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983906#comment-14983906 ] Yi Liu edited comment on HDFS-4937 at 10/31/15 8:45 AM: I did consider the situation you mentioned, But I thought in real env the NN could find other racks/DNs if it has gone through enough (not all) number of nodes. But I missed the fact that many tests may only contain few available DNs, and {{refreshCounter <= excludedNodes.size()}} will be true, also in real env this also may happen if total number of DNs is few. So the patch should not be correct for these cases, revert them. was (Author: hitliuyi): I did consider the situation you mentioned, But I thought in real env the NN could find other racks/DNs if it has gone through enough number of nodes. But I missed the fact that many tests may only contain few available DNs, and {{refreshCounter <= excludedNodes.size()}} will be true, also in real env this also may happen if total number of DNs is few. So the patch should not be correct for these cases, revert them. > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9354) Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows
[ https://issues.apache.org/jira/browse/HDFS-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-9354: - Attachment: HDFS-9354.00.patch > Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows > -- > > Key: HDFS-9354 > URL: https://issues.apache.org/jira/browse/HDFS-9354 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Attachments: HDFS-9354.00.patch > > > This negative test expect HadoopIllegalArgumentException on illegal > configuration. It uses JUnit (expected=HadoopIllegalArgumentException.class) > and passed fine on Linux. > On windows, this test passes as well. But it left open handles on NN metadata > directories used by MiniDFSCluster. As a result, quite a few of subsequent > TestBalancer unit tests can't start MiniDFSCluster. The open handles prevents > them from cleaning up NN metadata directories on Windows. > This JIRA is opened to explicitly catch the Exception and ensure the test > cluster is properly shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983906#comment-14983906 ] Yi Liu commented on HDFS-4937: -- I did consider the situation you mentioned, But I thought in real env the NN could find other racks/DNs if it has gone through enough number of nodes. But I missed the fact that many tests may only contain few available DNs, and {{refreshCounter <= excludedNodes.size()}} will be true, also in real env this also may happen if total number of DNs is few. So the patch should not be correct for these cases, revert them. > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9354) Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows
[ https://issues.apache.org/jira/browse/HDFS-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-9354: - Status: Patch Available (was: Open) > Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows > -- > > Key: HDFS-9354 > URL: https://issues.apache.org/jira/browse/HDFS-9354 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Attachments: HDFS-9354.00.patch > > > This negative test expect HadoopIllegalArgumentException on illegal > configuration. It uses JUnit (expected=HadoopIllegalArgumentException.class) > and passed fine on Linux. > On windows, this test passes as well. But it left open handles on NN metadata > directories used by MiniDFSCluster. As a result, quite a few of subsequent > TestBalancer unit tests can't start MiniDFSCluster. The open handles prevents > them from cleaning up NN metadata directories on Windows. > This JIRA is opened to explicitly catch the Exception and ensure the test > cluster is properly shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9337) In webhdfs Nullpoint exception will be thrown in renamesnapshot when oldsnapshotname is not given
[ https://issues.apache.org/jira/browse/HDFS-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadesh Kiran N updated HDFS-9337: --- Attachment: HDFS-9337_00.patch Attached the patch please review > In webhdfs Nullpoint exception will be thrown in renamesnapshot when > oldsnapshotname is not given > - > > Key: HDFS-9337 > URL: https://issues.apache.org/jira/browse/HDFS-9337 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-9337_00.patch > > > {code} > curl -i -X PUT > "http://10.19.92.127:50070/webhdfs/v1/kiran/sreenu?op=RENAMESNAPSHOT=SNAPSHOTNAME; > {code} > Null point exception will be thrown > {code} > {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9337) In webhdfs Nullpoint exception will be thrown in renamesnapshot when oldsnapshotname is not given
[ https://issues.apache.org/jira/browse/HDFS-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadesh Kiran N updated HDFS-9337: --- Status: Patch Available (was: Open) > In webhdfs Nullpoint exception will be thrown in renamesnapshot when > oldsnapshotname is not given > - > > Key: HDFS-9337 > URL: https://issues.apache.org/jira/browse/HDFS-9337 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-9337_00.patch > > > {code} > curl -i -X PUT > "http://10.19.92.127:50070/webhdfs/v1/kiran/sreenu?op=RENAMESNAPSHOT=SNAPSHOTNAME; > {code} > Null point exception will be thrown > {code} > {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9337) In webhdfs Nullpoint exception will be thrown in renamesnapshot when oldsnapshotname is not given
[ https://issues.apache.org/jira/browse/HDFS-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983981#comment-14983981 ] Hadoop QA commented on HDFS-9337: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 45s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 51s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 18s {color} | {color:red} Patch generated 56 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 121m 15s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 | | | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport | | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.TestDecommission | | | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-10-31 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12769935/HDFS-9337_00.patch | | JIRA Issue | HDFS-9337 | | Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile | | uname | Linux ca05e8403277 3.13.0-36-lowlatency #63-Ubuntu SMP
[jira] [Commented] (HDFS-9338) in webhdfs Null pointer exception will be thrown when xattrname is not given
[ https://issues.apache.org/jira/browse/HDFS-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983983#comment-14983983 ] Hadoop QA commented on HDFS-9338: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 10s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 3s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 37s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s {color} | {color:red} Patch generated 58 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 143m 18s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.server.datanode.TestBlockScanner | | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.TestDFSShell | | | hadoop.hdfs.server.blockmanagement.TestNodeCount | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport | | | hadoop.hdfs.TestBlockStoragePolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-10-31 | | JIRA Patch URL |
[jira] [Commented] (HDFS-9348) DFS GetErasureCodingPolicy API on a non-existent file should be handled properly
[ https://issues.apache.org/jira/browse/HDFS-9348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984063#comment-14984063 ] Hadoop QA commented on HDFS-9348: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 49s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-hdfs-project (total was 82, now 82). {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 8s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 45s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s {color} | {color:green} hadoop-hdfs-client in the patch passed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s {color} | {color:red} Patch generated 56 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 77m 5s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-10-31 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12769855/HDFS-9348-00.patch | | JIRA Issue | HDFS-9348 | | Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile | | uname | Linux 821f9b477c10 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/apache-yetus-e77b1ce/precommit/personality/hadoop.sh | | git revision | trunk / 7fd6416 | | Default Java | 1.7.0_79 | |
[jira] [Commented] (HDFS-8777) Erasure Coding: add tests for taking snapshots on EC files
[ https://issues.apache.org/jira/browse/HDFS-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984109#comment-14984109 ] Hadoop QA commented on HDFS-8777: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 29s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 27s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 27s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 28s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s {color} | {color:red} Patch generated 58 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 158m 40s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | | hadoop.hdfs.TestSafeModeWithStripedFile | | | hadoop.hdfs.server.datanode.TestBlockScanner | | | hadoop.hdfs.TestDataTransferKeepalive | | JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | | | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | |
[jira] [Commented] (HDFS-9354) Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows
[ https://issues.apache.org/jira/browse/HDFS-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984141#comment-14984141 ] Chris Nauroth commented on HDFS-9354: - Hi [~xyao]. Thanks for tracking down these file handle leaks. I have a couple of thoughts that might help {{TestBalancer}} become more resilient to these kinds of problems in the future. # We could add a JUnit {{@After}} method that always shuts down {{cluster}} if it is non-null. Then, the individual tests wouldn't need to do try-finally, and any new tests that get added over time will get the automatic shutdown for free. This would require a bigger patch though. # The check for {{HadoopIllegalArgumentException}} could be simplified by using JUnit's {{ExpectedException}} rule. If you'd like to see a simple example of this, I recommend looking at {{TestAclConfigFlag}}. Please let me know your thoughts on this. Thanks again! > Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows > -- > > Key: HDFS-9354 > URL: https://issues.apache.org/jira/browse/HDFS-9354 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Xiaoyu Yao >Assignee: Xiaoyu Yao > Attachments: HDFS-9354.00.patch > > > This negative test expect HadoopIllegalArgumentException on illegal > configuration. It uses JUnit (expected=HadoopIllegalArgumentException.class) > and passed fine on Linux. > On windows, this test passes as well. But it left open handles on NN metadata > directories used by MiniDFSCluster. As a result, quite a few of subsequent > TestBalancer unit tests can't start MiniDFSCluster. The open handles prevents > them from cleaning up NN metadata directories on Windows. > This JIRA is opened to explicitly catch the Exception and ensure the test > cluster is properly shutdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983996#comment-14983996 ] Hudson commented on HDFS-4937: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2497 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2497/]) Revert "HDFS-4937. ReplicationMonitor can infinite-loop in (yliu: rev 7fd6416759cbb202ed21b47d28c1587e04a5cdc6) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > ReplicationMonitor can infinite-loop in > BlockPlacementPolicyDefault#chooseRandom() > -- > > Key: HDFS-4937 > URL: https://issues.apache.org/jira/browse/HDFS-4937 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.4-alpha, 0.23.8 >Reporter: Kihwal Lee >Assignee: Kihwal Lee > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.7.2 > > Attachments: HDFS-4937.patch, HDFS-4937.v1.patch, HDFS-4937.v2.patch > > > When a large number of nodes are removed by refreshing node lists, the > network topology is updated. If the refresh happens at the right moment, the > replication monitor thread may stuck in the while loop of {{chooseRandom()}}. > This is because the cached cluster size is used in the terminal condition > check of the loop. This usually happens when a block with a high replication > factor is being processed. Since replicas/rack is also calculated beforehand, > no node choice may satisfy the goodness criteria if refreshing removed racks. > All nodes will end up in the excluded list, but the size will still be less > than the cached cluster size, so it will loop infinitely. This was observed > in a production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9337) In webhdfs Nullpoint exception will be thrown in renamesnapshot when oldsnapshotname is not given
[ https://issues.apache.org/jira/browse/HDFS-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983999#comment-14983999 ] Surendra Singh Lilhore commented on HDFS-9337: -- Thanks [~jagadesh.kiran] for patch... > I feel we should add same check for {{snapshotName.getValue()}} Check {{setSnapshotNewName()}} in {{ClientNamenodeProtocolProtos}}, it will throw NPE if value is null. {code} public Builder setSnapshotNewName( java.lang.String value) { if (value == null) { throw new NullPointerException(); } {code} > In webhdfs Nullpoint exception will be thrown in renamesnapshot when > oldsnapshotname is not given > - > > Key: HDFS-9337 > URL: https://issues.apache.org/jira/browse/HDFS-9337 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-9337_00.patch > > > {code} > curl -i -X PUT > "http://10.19.92.127:50070/webhdfs/v1/kiran/sreenu?op=RENAMESNAPSHOT=SNAPSHOTNAME; > {code} > Null point exception will be thrown > {code} > {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9337) In webhdfs Nullpoint exception will be thrown in renamesnapshot when oldsnapshotname is not given
[ https://issues.apache.org/jira/browse/HDFS-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984005#comment-14984005 ] Jagadesh Kiran N commented on HDFS-9337: Thanks [~surendrasingh] for your comments,i will update the patch soon as per your comment > In webhdfs Nullpoint exception will be thrown in renamesnapshot when > oldsnapshotname is not given > - > > Key: HDFS-9337 > URL: https://issues.apache.org/jira/browse/HDFS-9337 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jagadesh Kiran N >Assignee: Jagadesh Kiran N > Attachments: HDFS-9337_00.patch > > > {code} > curl -i -X PUT > "http://10.19.92.127:50070/webhdfs/v1/kiran/sreenu?op=RENAMESNAPSHOT=SNAPSHOTNAME; > {code} > Null point exception will be thrown > {code} > {"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}} > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9236) Missing sanity check for block size during block recovery
[ https://issues.apache.org/jira/browse/HDFS-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984040#comment-14984040 ] Tony Wu commented on HDFS-9236: --- Thanks [~walter.k.su] and [~yzhangal] for your comments. I'll post a new patch which will exclude RURs from syncList. > Missing sanity check for block size during block recovery > - > > Key: HDFS-9236 > URL: https://issues.apache.org/jira/browse/HDFS-9236 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Tony Wu >Assignee: Tony Wu > Attachments: HDFS-9236.001.patch, HDFS-9236.002.patch, > HDFS-9236.003.patch, HDFS-9236.004.patch, HDFS-9236.005.patch > > > Ran into an issue while running test against faulty data-node code. > Currently in DataNode.java: > {code:java} > /** Block synchronization */ > void syncBlock(RecoveringBlock rBlock, > List syncList) throws IOException { > … > // Calculate the best available replica state. > ReplicaState bestState = ReplicaState.RWR; > … > // Calculate list of nodes that will participate in the recovery > // and the new block size > List participatingList = new ArrayList(); > final ExtendedBlock newBlock = new ExtendedBlock(bpid, blockId, > -1, recoveryId); > switch(bestState) { > … > case RBW: > case RWR: > long minLength = Long.MAX_VALUE; > for(BlockRecord r : syncList) { > ReplicaState rState = r.rInfo.getOriginalReplicaState(); > if(rState == bestState) { > minLength = Math.min(minLength, r.rInfo.getNumBytes()); > participatingList.add(r); > } > } > newBlock.setNumBytes(minLength); > break; > … > } > … > nn.commitBlockSynchronization(block, > newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false, > datanodes, storages); > } > {code} > This code is called by the DN coordinating the block recovery. In the above > case, it is possible for none of the rState (reported by DNs with copies of > the replica being recovered) to match the bestState. This can either be > caused by faulty DN code or stale/modified/corrupted files on DN. When this > happens the DN will end up reporting the minLengh of Long.MAX_VALUE. > Unfortunately there is no check on the NN for replica length. See > FSNamesystem.java: > {code:java} > void commitBlockSynchronization(ExtendedBlock oldBlock, > long newgenerationstamp, long newlength, > boolean closeFile, boolean deleteblock, DatanodeID[] newtargets, > String[] newtargetstorages) throws IOException { > … > if (deleteblock) { > Block blockToDel = ExtendedBlock.getLocalBlock(oldBlock); > boolean remove = iFile.removeLastBlock(blockToDel) != null; > if (remove) { > blockManager.removeBlock(storedBlock); > } > } else { > // update last block > if(!copyTruncate) { > storedBlock.setGenerationStamp(newgenerationstamp); > > // XXX block length is updated without any check <<< storedBlock.setNumBytes(newlength); > } > … > if (closeFile) { > LOG.info("commitBlockSynchronization(oldBlock=" + oldBlock > + ", file=" + src > + (copyTruncate ? ", newBlock=" + truncatedBlock > : ", newgenerationstamp=" + newgenerationstamp) > + ", newlength=" + newlength > + ", newtargets=" + Arrays.asList(newtargets) + ") successful"); > } else { > LOG.info("commitBlockSynchronization(" + oldBlock + ") successful"); > } > } > {code} > After this point the block length becomes Long.MAX_VALUE. Any subsequent > block report (even with correct length) will cause the block to be marked as > corrupted. Since this is block could be the last block of the file. If this > happens and the client goes away, NN won’t be able to recover the lease and > close the file because the last block is under-replicated. > I believe we need to have a sanity check for block size on both DN and NN to > prevent such case from happening. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9343) Empty caller context considered invalid
[ https://issues.apache.org/jira/browse/HDFS-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14984231#comment-14984231 ] Hadoop QA commented on HDFS-9343: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 6s {color} | {color:blue} docker + precommit patch detected. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 0s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 47s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 10s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 21s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 12s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 58s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 47s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 10s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 33s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 19s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_60. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 50s {color} | {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 40s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_79. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 22s {color} | {color:red} Patch generated 58 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 200m 52s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_60 Failed junit tests | hadoop.ipc.TestDecayRpcScheduler | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | JDK v1.8.0_60 Timed out junit tests | org.apache.hadoop.http.TestHttpServerLifecycle | | JDK v1.7.0_79 Failed junit tests | hadoop.ipc.TestIPC | | | hadoop.metrics2.impl.TestGangliaMetrics | | | hadoop.hdfs.server.datanode.TestFsDatasetCache | | | hadoop.hdfs.server.blockmanagement.TestNodeCount | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure140 | \\ \\ || Subsystem ||
[jira] [Updated] (HDFS-9343) Empty caller context considered invalid
[ https://issues.apache.org/jira/browse/HDFS-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9343: Attachment: HDFS-9343.004.patch The v4 patch rebases from {{trunk}} code and will trigger the Jenkins again. > Empty caller context considered invalid > --- > > Key: HDFS-9343 > URL: https://issues.apache.org/jira/browse/HDFS-9343 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9343.000.patch, HDFS-9343.001.patch, > HDFS-9343.002.patch, HDFS-9343.003.patch, HDFS-9343.004.patch > > > The caller context with empty context string is considered invalid, and it > should not appear in the audit log. > Meanwhile, too long signature will not be written to audit log. -- This message was sent by Atlassian JIRA (v6.3.4#6332)