[jira] [Created] (HDFS-9234) WebHdfs : getContentSummary() should give quota for storage types
Surendra Singh Lilhore created HDFS-9234: Summary: WebHdfs : getContentSummary() should give quota for storage types Key: HDFS-9234 URL: https://issues.apache.org/jira/browse/HDFS-9234 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Currently the WebHDFS API for ContentSummary gives only the name quota and space quota, but it does not give the storage-type quotas. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
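For context, clients retrieve a content summary over the WebHDFS REST interface via the standard GETCONTENTSUMMARY operation, whose JSON response carries the quota and spaceQuota fields this issue refers to. Below is a minimal sketch of how such a request URL is formed; the host, port, and path are placeholder values, not anything from this issue:

```java
// Sketch: forming a WebHDFS GETCONTENTSUMMARY request URL.
// The /webhdfs/v1 prefix and op=GETCONTENTSUMMARY are the standard REST form;
// host, port, and path are placeholders for illustration only.
class WebHdfsContentSummaryUrl {
    static String contentSummaryUrl(String host, int port, String path) {
        return "http://" + host + ":" + port + "/webhdfs/v1" + path
                + "?op=GETCONTENTSUMMARY";
    }
}
```

The improvement would extend this operation's response with per-storage-type entries alongside the existing quota and spaceQuota fields; the exact field names would be defined by the patch.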
[jira] [Commented] (HDFS-9046) Any error during BPOfferService run can lead to a missing DN.
[ https://issues.apache.org/jira/browse/HDFS-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954510#comment-14954510 ] nijel commented on HDFS-9046: - thanks [~vinayrpet] for your time. [~cnauroth], please review this change. > Any error during BPOfferService run can lead to a missing DN. > > > Key: HDFS-9046 > URL: https://issues.apache.org/jira/browse/HDFS-9046 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: nijel >Assignee: nijel > Attachments: HDFS-9046_1.patch, HDFS-9046_2.patch, HDFS-9046_3.patch > > > The cluster is in HA mode and each DN has only one block pool. > The issue is that after a switchover, one DN is missing from the current active NN. > Upon analysis I found that there is one exception in BPOfferService.run() > {noformat} > 2015-08-21 09:02:11,190 | WARN | DataNode: > [[[DISK]file:/srv/BigData/hadoop/data5/dn/ > [DISK]file:/srv/BigData/hadoop/data4/dn/]] heartbeating to > 160-149-0-114/160.149.0.114:25000 | Unexpected exception in block pool Block > pool BP-284203724-160.149.0.114-1438774011693 (Datanode Uuid > 15ce1dd7-227f-4fd2-9682-091aa6bc2b89) service to > 160-149-0-114/160.149.0.114:25000 | BPServiceActor.java:830 > java.lang.OutOfMemoryError: unable to create new native thread > at java.lang.Thread.start0(Native Method) > at java.lang.Thread.start(Thread.java:714) > at > java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1357) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.execute(FsDatasetAsyncDiskService.java:172) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService.deleteAsync(FsDatasetAsyncDiskService.java:221) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.invalidate(FsDatasetImpl.java:1887) > at > 
org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:669) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:616) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:856) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:671) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822) > at java.lang.Thread.run(Thread.java:745) > {noformat} > After this, the particular BPOfferService is down for the rest of the runtime, > and this particular NN will not have the details of this DN. > Similar issues are discussed in the following JIRAs: > https://issues.apache.org/jira/browse/HDFS-2882 > https://issues.apache.org/jira/browse/HDFS-7714 > Can we retry in this case too, with a larger interval, instead of shutting > down this BPOfferService? > Since these exceptions can occur randomly in a DN, I think it is not good to keep > the DN running while some NN does not have its info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
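The retry-with-a-larger-interval idea suggested in the report can be sketched as a capped exponential backoff. The names and constants below are illustrative only, not actual BPServiceActor code:

```java
// Sketch of the proposed behaviour: instead of ending the actor thread on an
// unexpected Error, back off and retry with a growing sleep interval, capped
// at a maximum. baseMs/maxMs/attempt are illustrative parameters.
class RetryBackoff {
    static long nextDelayMs(long baseMs, long maxMs, int attempt) {
        long delay = baseMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            delay *= 2;                 // double the wait after each failure
        }
        return Math.min(delay, maxMs);  // never exceed the configured cap
    }
}
```

A real fix would also have to decide which Throwables are worth retrying (an OutOfMemoryError may or may not be recoverable), which is part of what the attached patches debate.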
[jira] [Commented] (HDFS-9224) TestFileTruncate fails intermittently with BindException
[ https://issues.apache.org/jira/browse/HDFS-9224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954508#comment-14954508 ] Hudson commented on HDFS-9224: -- FAILURE: Integrated in Hadoop-trunk-Commit #8615 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8615/]) HDFS-9224. TestFileTruncate fails intermittently with BindException (vinayakumarb: rev 69b025dbbaa44395e49d1c04b90e1f65f0fc1132) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > TestFileTruncate fails intermittently with BindException > > > Key: HDFS-9224 > URL: https://issues.apache.org/jira/browse/HDFS-9224 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-9224-002.patch, HDFS-9224.patch > > > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/478/#showFailuresLink > {noformat} > java.net.BindException: Problem binding to [localhost:8020] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:414) > at sun.nio.ch.Net.bind(Net.java:406) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at org.apache.hadoop.ipc.Server.bind(Server.java:469) > at org.apache.hadoop.ipc.Server$Listener.(Server.java:646) > at org.apache.hadoop.ipc.Server.(Server.java:2399) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:945) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:358) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:692) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:630) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:833) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:812) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1505) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1248) > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1017) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:889) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:821) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:480) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:439) > at > org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.setUp(TestFileTruncate.java:107) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8575) Support User level Quota for space and Name (count)
[ https://issues.apache.org/jira/browse/HDFS-8575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nijel reassigned HDFS-8575: --- Assignee: (was: nijel) keeping it unassigned as no work is planned. > Support User level Quota for space and Name (count) > --- > > Key: HDFS-8575 > URL: https://issues.apache.org/jira/browse/HDFS-8575 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: nijel > > I would like to propose a feature in HDFS for quota management at the user > level. > Background: > When a customer uses a multi-tenant solution, it will have many Hadoop ecosystem > components like Hive, HBase, YARN, etc. The base folders of these > components are different, like /hive for Hive and /hbase for HBase. > Now if a user creates some file or table, it will be under the folder > specific to that component. If the user name is taken into account, it looks like > {code} > /hive/user1/table1 > /hive/user2/table1 > /hbase/user1/Htable1 > /hbase/user2/Htable1 > > Same for yarn/map-reduce data and logs > {code} > > In this case, restricting the user to a certain amount of disk/files is > very difficult, since the current quota management is at the folder level. > > Requirement: user-level quota for space and name (count). Say user1 can have > 100G irrespective of the folder or location used. > > The idea here is to consider the file owner as the key and attribute the quota > to it. The current quota system can then have an initial check for the user > quota, if defined, before validating the folder quota. > Note: > This needs a change in fsimage to store the user and quota information. > Please have a look at this scenario. If it sounds good, I will create the > tasks and update the design and prototype. > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
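The proposed check order (per-user quota first, if one is defined, then fall through to the existing folder quota) could look roughly like this. The class and method names are hypothetical, not HDFS APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the proposed user-level space quota: an initial
// per-user check, consulted before the existing directory quota.
// All names here are hypothetical.
class UserQuotaCheck {
    private final Map<String, Long> userSpaceQuota = new HashMap<>();
    private final Map<String, Long> userSpaceUsed = new HashMap<>();

    void setQuota(String user, long bytes) { userSpaceQuota.put(user, bytes); }

    boolean canConsume(String user, long bytes) {
        Long quota = userSpaceQuota.get(user);
        if (quota == null) {
            return true; // no user-level quota defined; directory quota still applies
        }
        long used = userSpaceUsed.getOrDefault(user, 0L);
        return used + bytes <= quota;
    }

    void consume(String user, long bytes) {
        userSpaceUsed.merge(user, bytes, Long::sum);
    }
}
```

As the note in the description says, the real feature would persist this per-user state in the fsimage rather than in memory.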
[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication
[ https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954494#comment-14954494 ] Hadoop QA commented on HDFS-9205: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 23s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 30s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 22s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 25s | The applied patch generated 7 new checkstyle issues (total was 202, now 205). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 12s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 187m 5s | Tests failed in hadoop-hdfs. 
| | | | 234m 5s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | | hadoop.hdfs.web.TestWebHDFSOAuth2 | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766253/h9205_20151013.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c60a16f | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12946/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12946/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12946/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12946/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12946/console | This message was automatically generated. > Do not schedule corrupt blocks for replication > -- > > Key: HDFS-9205 > URL: https://issues.apache.org/jira/browse/HDFS-9205 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > Attachments: h9205_20151007.patch, h9205_20151007b.patch, > h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, > h9205_20151013.patch > > > Corrupted blocks are, by definition, blocks that cannot be read. As a consequence, > they cannot be replicated. In UnderReplicatedBlocks, there is a queue for > QUEUE_WITH_CORRUPT_BLOCKS and chooseUnderReplicatedBlocks may choose blocks > from it. Scheduling corrupted blocks for replication wastes > resources and potentially slows down replication for the higher-priority blocks.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
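The HDFS-9205 improvement above amounts to skipping the corrupt-blocks queue when choosing under-replicated blocks to schedule. A simplified sketch, with the priority queues modeled as plain lists rather than the real UnderReplicatedBlocks structure:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed behaviour: iterate the priority queues in order, but
// skip the corrupt-blocks queue entirely, since a block with no readable
// replica cannot be copied anyway. The queue layout here is a simplification.
class ChooseBlocksSketch {
    static List<String> choose(List<List<String>> priorityQueues,
                               int corruptQueueIndex, int limit) {
        List<String> chosen = new ArrayList<>();
        for (int p = 0; p < priorityQueues.size() && chosen.size() < limit; p++) {
            if (p == corruptQueueIndex) {
                continue; // corrupt blocks: nothing readable to replicate from
            }
            for (String block : priorityQueues.get(p)) {
                if (chosen.size() >= limit) break;
                chosen.add(block);
            }
        }
        return chosen;
    }
}
```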
[jira] [Commented] (HDFS-9219) Even if permissions are enabled in an environment, there is no permission check while resolving reserved paths.
[ https://issues.apache.org/jira/browse/HDFS-9219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954480#comment-14954480 ] J.Andreina commented on HDFS-9219: -- Thanks [~liuml07] for the comments. The check on SuperUserPrivilege is required even before resolving a reserved path. There are two resolvePath methods (static and non-static) in FSDirectory. Only the non-static method checks permissions. Some places invoke the static resolvePath method (which does not have a permission check) even though it is required. > Even if permissions are enabled in an environment, there is no permission > check while resolving reserved paths. > > > Key: HDFS-9219 > URL: https://issues.apache.org/jira/browse/HDFS-9219 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: J.Andreina >Assignee: J.Andreina > > Currently, at a few instances, reserved paths are resolved without checking for > permission, even if "dfs.permissions.enabled" is set to true. > {code} > FSPermissionChecker pc = fsd.getPermissionChecker(); > byte[][] pathComponents = > FSDirectory.getPathComponentsForReservedPath(src); > INodesInPath iip; > fsd.writeLock(); > try { > src = *FSDirectory.resolvePath(src, pathComponents, fsd);* > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
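A rough sketch of the ordering being argued for: reject a reserved-path access from a non-superuser before resolving the path, mirroring what the non-static resolvePath does. The helper below is illustrative only and is not the actual FSDirectory API:

```java
// Illustrative sketch of checking privilege before resolving a /.reserved
// path. The class, method, and flags are hypothetical stand-ins for the
// FSDirectory/FSPermissionChecker machinery discussed in the comment.
class ReservedPathCheck {
    static final String RESERVED_PREFIX = "/.reserved";

    static String resolve(String src, boolean permissionsEnabled, boolean isSuperUser) {
        if (src.startsWith(RESERVED_PREFIX) && permissionsEnabled && !isSuperUser) {
            // mirror the non-static resolvePath behaviour: reject before resolving
            throw new SecurityException("Access to reserved path requires superuser: " + src);
        }
        // ... actual resolution of the reserved path would happen here ...
        return src;
    }
}
```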
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954478#comment-14954478 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2425 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2425/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, 
HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesystemLock for too long
[ https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954475#comment-14954475 ] Mingliang Liu commented on HDFS-9145: - The failing tests seem unrelated. Specifically, we consider the failures in {{hadoop.hdfs.TestEncryptionZonesWithKMS}} to be a data race bug and filed a JIRA about this [HADOOP-12474]. > Tracking methods that hold FSNamesystemLock for too long > --- > > Key: HDFS-9145 > URL: https://issues.apache.org/jira/browse/HDFS-9145 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Jing Zhao >Assignee: Mingliang Liu > Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, > HDFS-9145.002.patch, HDFS-9145.003.patch > > > It would be helpful if we had a way to track (or at least log a msg) > when some operation is holding the FSNamesystem lock for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9224) TestFileTruncate fails intermittently with BindException
[ https://issues.apache.org/jira/browse/HDFS-9224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-9224: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks [~brahmareddy] for the contribution > TestFileTruncate fails intermittently with BindException > > > Key: HDFS-9224 > URL: https://issues.apache.org/jira/browse/HDFS-9224 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-9224-002.patch, HDFS-9224.patch > > > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/478/#showFailuresLink > {noformat} > java.net.BindException: Problem binding to [localhost:8020] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:414) > at sun.nio.ch.Net.bind(Net.java:406) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at org.apache.hadoop.ipc.Server.bind(Server.java:469) > at org.apache.hadoop.ipc.Server$Listener.(Server.java:646) > at org.apache.hadoop.ipc.Server.(Server.java:2399) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:945) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:358) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:692) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:630) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:833) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:812) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1505) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1248) > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1017) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:889) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:821) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:480) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:439) > at > org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.setUp(TestFileTruncate.java:107) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954462#comment-14954462 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #487 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/487/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, 
HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9173) Erasure Coding: Lease recovery for striped file
[ https://issues.apache.org/jira/browse/HDFS-9173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954458#comment-14954458 ] Zhe Zhang commented on HDFS-9173: - Thanks for the update Walter. One quick question on {{getSafeLength}}: since we are already calculating it as "the smallest length that covers at least 6 internal blocks", can we just sort all replica lengths and take the 6th shortest one? > Erasure Coding: Lease recovery for striped file > --- > > Key: HDFS-9173 > URL: https://issues.apache.org/jira/browse/HDFS-9173 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-9173.00.wip.patch, HDFS-9173.01.patch, > HDFS-9173.02.step125.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
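Assuming an RS-6-3 layout (6 data blocks per block group), one way to read "the smallest length that covers at least 6 internal blocks" is: sort the reported internal-block lengths and pick the 6th-largest value, which at least 6 blocks are guaranteed to reach. The sketch below shows only that selection step and deliberately ignores the stripe-cell rounding a real getSafeLength would need:

```java
import java.util.Arrays;

// Sketch of the sorting idea from the comment above, assuming an RS-6-3
// layout. This is only an interpretation for illustration, not the actual
// getSafeLength implementation from the patch.
class SafeLengthSketch {
    static long candidateSafeLength(long[] blockLengths, int dataBlkNum) {
        long[] sorted = blockLengths.clone();
        Arrays.sort(sorted);  // ascending
        // dataBlkNum-th largest value = index (length - dataBlkNum) ascending;
        // at least dataBlkNum blocks have length >= this value
        return sorted[sorted.length - dataBlkNum];
    }
}
```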
[jira] [Updated] (HDFS-9224) TestFileTruncate fails intermittently with BindException
[ https://issues.apache.org/jira/browse/HDFS-9224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-9224: Summary: TestFileTruncate fails intermittently with BindException (was: TestFileTruncate fails intermittently) > TestFileTruncate fails intermittently with BindException > > > Key: HDFS-9224 > URL: https://issues.apache.org/jira/browse/HDFS-9224 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-9224-002.patch, HDFS-9224.patch > > > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/478/#showFailuresLink > {noformat} > java.net.BindException: Problem binding to [localhost:8020] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:414) > at sun.nio.ch.Net.bind(Net.java:406) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at org.apache.hadoop.ipc.Server.bind(Server.java:469) > at org.apache.hadoop.ipc.Server$Listener.(Server.java:646) > at org.apache.hadoop.ipc.Server.(Server.java:2399) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:945) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:358) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:692) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:630) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:833) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:812) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1505) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1248) > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1017) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:889) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:821) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:480) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:439) > at > org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.setUp(TestFileTruncate.java:107) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954454#comment-14954454 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1252 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1252/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, 
HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9224) TestFileTruncate fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-9224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954452#comment-14954452 ] Vinayakumar B commented on HDFS-9224: - Thanks [~brahmareddy] +1 for latest patch, Will commit shortly > TestFileTruncate fails intermittently > - > > Key: HDFS-9224 > URL: https://issues.apache.org/jira/browse/HDFS-9224 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-9224-002.patch, HDFS-9224.patch > > > https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/478/#showFailuresLink > {noformat} > java.net.BindException: Problem binding to [localhost:8020] > java.net.BindException: Address already in use; For more details see: > http://wiki.apache.org/hadoop/BindException > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:414) > at sun.nio.ch.Net.bind(Net.java:406) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at org.apache.hadoop.ipc.Server.bind(Server.java:469) > at org.apache.hadoop.ipc.Server$Listener.(Server.java:646) > at org.apache.hadoop.ipc.Server.(Server.java:2399) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:945) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:787) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:358) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:692) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:630) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:833) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:812) 
> at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1505) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1248) > at > org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1017) > at > org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:889) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:821) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:480) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:439) > at > org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.setUp(TestFileTruncate.java:107) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7264) The last datanode in a pipeline should send a heartbeat when there is no traffic
[ https://issues.apache.org/jira/browse/HDFS-7264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7264: Attachment: h7264_20151012.patch Rebasing the patch against current trunk. I'm trying to add a test to illustrate the scenario in the JIRA description: a DN in the middle of a pipeline is too busy to serve regular write requests, but can forward downstream DN heartbeats. > The last datanode in a pipeline should send a heartbeat when there is no > traffic > > > Key: HDFS-7264 > URL: https://issues.apache.org/jira/browse/HDFS-7264 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Labels: BB2015-05-TBR > Attachments: h7264_20141017.patch, h7264_20141020.patch, > h7264_20151012.patch > > > When the client is writing slowly, the client will send a heartbeat to signal > that the connection is still alive. This case works fine. > However, when a client is writing fast but some of the datanodes in the > pipeline are busy, a PacketResponder may get a timeout since no ack is sent > from the upstream datanode. We suggest that the last datanode in a pipeline > should send a heartbeat when there is no traffic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
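The proposal above can be sketched as a small responder loop: when no real ack arrives within the responder timeout, the last datanode emits a heartbeat ack upstream instead of letting the upstream PacketResponder time out. This is an illustrative sketch, not the actual patch; the class and method names are invented, and the seqno value -1 is used here as the heartbeat marker in the spirit of HDFS pipeline acks.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/**
 * Illustrative sketch (not Hadoop code) of the HDFS-7264 idea: the last
 * datanode's responder waits for the next ack to send upstream; if the
 * pipeline is idle past the timeout, it sends a heartbeat ack instead,
 * keeping the upstream PacketResponder from timing out.
 */
public class IdleHeartbeatResponder {
  /** Sentinel seqno marking a heartbeat rather than a real packet ack. */
  public static final long HEARTBEAT_SEQNO = -1L;

  private final BlockingQueue<Long> ackQueue = new LinkedBlockingQueue<>();

  /** Called once a packet has been received and persisted locally. */
  public void enqueueAck(long seqno) {
    ackQueue.offer(seqno);
  }

  /**
   * Returns the next seqno to ack upstream, or HEARTBEAT_SEQNO if no real
   * ack arrived within timeoutMs (i.e. there is no traffic).
   */
  public long nextAck(long timeoutMs) throws InterruptedException {
    Long seqno = ackQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    return seqno == null ? HEARTBEAT_SEQNO : seqno;
  }
}
```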
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954402#comment-14954402 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #527 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/527/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, 
HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
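For illustration, the core of an upgrade-domain-aware placement check might look like the following. This is a hedged sketch: the class and method names are invented (the real logic lives in classes such as BlockPlacementStatusWithUpgradeDomain from the committed file list), and the exact rule in the patch may differ. The intent is that replicas should span enough distinct upgrade domains that taking one domain offline for an upgrade cannot remove all replicas of a block.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Simplified, hypothetical upgrade-domain placement check: placement is
 * considered good when the replicas cover at least
 * min(replication, totalDomains) distinct upgrade domains.
 */
public class UpgradeDomainPlacementCheck {
  /**
   * @param replicaDomains the upgrade domain of each replica
   * @param replication    configured replication factor
   * @param totalDomains   number of upgrade domains in the cluster
   */
  public static boolean isPlacementSatisfied(
      List<String> replicaDomains, int replication, int totalDomains) {
    Set<String> distinct = new HashSet<>(replicaDomains);
    // We cannot demand more distinct domains than exist or than replicas.
    int required = Math.min(replication, totalDomains);
    return distinct.size() >= required;
  }
}
```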
[jira] [Assigned] (HDFS-9233) Create LICENSE.txt and NOTICES files for libhdfs++
[ https://issues.apache.org/jira/browse/HDFS-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu reassigned HDFS-9233: --- Assignee: Mingliang Liu > Create LICENSE.txt and NOTICES files for libhdfs++ > -- > > Key: HDFS-9233 > URL: https://issues.apache.org/jira/browse/HDFS-9233 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Bob Hansen >Assignee: Mingliang Liu > > We use third-party libraries that are Apache and Google licensed, and may be > adding an MIT-licensed third-party library. We need to include the > appropriate license files for inclusion into Apache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9233) Create LICENSE.txt and NOTICES files for libhdfs++
Bob Hansen created HDFS-9233: Summary: Create LICENSE.txt and NOTICES files for libhdfs++ Key: HDFS-9233 URL: https://issues.apache.org/jira/browse/HDFS-9233 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Bob Hansen We use third-party libraries that are Apache and Google licensed, and may be adding an MIT-licensed third-party library. We need to include the appropriate license files for inclusion into Apache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954364#comment-14954364 ] Hadoop QA commented on HDFS-9184: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 4s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 51s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 50s | The applied patch generated 9 new checkstyle issues (total was 225, now 233). | | {color:red}-1{color} | checkstyle | 2m 31s | The applied patch generated 4 new checkstyle issues (total was 651, now 652). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 4m 31s | The patch appears to introduce 2 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 6m 41s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 77m 35s | Tests failed in hadoop-hdfs. 
| | | | 132m 25s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.ipc.TestRPC | | | hadoop.net.TestDNS | | | hadoop.hdfs.server.namenode.TestINodeFile | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart | | | hadoop.hdfs.TestFileCreationDelete | | | hadoop.hdfs.server.namenode.ha.TestHASafeMode | | | hadoop.hdfs.TestDFSShell | | | hadoop.hdfs.server.namenode.TestNameNodeXAttr | | | hadoop.hdfs.shortcircuit.TestShortCircuitCache | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages | | | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.server.namenode.TestParallelImageWrite | | | hadoop.hdfs.server.namenode.TestSaveNamespace | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.server.namenode.TestQuotaWithStripedBlocks | | | hadoop.hdfs.server.namenode.TestFsck | | | hadoop.hdfs.server.namenode.ha.TestHarFileSystemWithHA | | | hadoop.hdfs.server.datanode.TestDeleteBlockPool | | | hadoop.hdfs.server.namenode.TestStorageRestore | | | hadoop.hdfs.server.namenode.TestFileLimit | | | hadoop.hdfs.server.blockmanagement.TestNodeCount | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.hdfs.server.namenode.snapshot.TestCheckpointsWithSnapshots | | | hadoop.hdfs.qjournal.TestNNWithQJM | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | | hadoop.hdfs.TestDFSFinalize | | | hadoop.hdfs.server.namenode.TestSecureNameNode | | | hadoop.hdfs.server.namenode.TestFileContextAcl | | | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA | | | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 | | | hadoop.hdfs.TestFsShellPermission | | | hadoop.hdfs.TestDisableConnCache | | | hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotNameWithInvalidCharacters | | | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes 
| | | hadoop.hdfs.server.datanode.TestTransferRbw | | | hadoop.hdfs.TestGetFileChecksum | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.server.namenode.TestFSImageWithAcl | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.TestParallelRead | | | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks | | | hadoop.hdfs.server.namenode.TestAddBlock | | | hadoop.hdfs.server.datanode.TestDnRespectsBlockReportSplitThreshold | | | hadoop.hdfs.server.namenode.TestMetaSave | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.web.TestHttpsFileSystem | | | hadoop.hdfs.TestDFSStripedInputStream | | | hadoop.hdfs.server.datanode.TestTriggerBlockReport | | | hadoop.hdfs.server.namenode.TestEditLog | | | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.server.namenode.TestHDFSConcat | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.namenode.snapshot.TestAclWithSnapshot | | | hadoop.hdfs.server.blockmanagement.TestBlockToken
[jira] [Commented] (HDFS-9011) Support splitting BlockReport of a storage into multiple RPC
[ https://issues.apache.org/jira/browse/HDFS-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954355#comment-14954355 ] Tsz Wo Nicholas Sze commented on HDFS-9011: --- Here is a new idea -- we may partition the block ID space so that datanodes can send multiple small full block reports, one per partition. The partitions need not be fixed. - When a full block report is larger than a threshold, the report is split into two reports, one for blocks with odd IDs and one for blocks with even IDs. If these reports are still too large, split them into four reports with ID suffixes 00, 01, 10 and 11. The process continues until the reports are smaller than the threshold. The datanode sends each partitioned report with its suffix. - Since the block ID space is partitioned, the Namenode can process each partitioned report without knowing the remaining partitioned reports. > Support splitting BlockReport of a storage into multiple RPC > > > Key: HDFS-9011 > URL: https://issues.apache.org/jira/browse/HDFS-9011 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9011.000.patch, HDFS-9011.001.patch, > HDFS-9011.002.patch > > > Currently if a DataNode has too many blocks (more than 1m by default), it > sends multiple RPCs to the NameNode for the block report, each RPC containing > the report for a single storage. However, in practice we've seen that sometimes even a > single storage can contain a large number of blocks and the report even > exceeds the max RPC data length. It may be helpful to support sending > multiple RPCs for the block report of a storage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
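The odd/even splitting described in the comment above is equivalent to recursively partitioning on the low bits of the block ID until every partition fits under the threshold. A minimal sketch of that recursion (illustrative only, not Hadoop code; names are invented):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the block-ID-space partitioning idea: if a report exceeds the
 * threshold, split it on the next-lowest bit of the block ID and recurse.
 * Because the ID space itself is partitioned, each partial report can be
 * processed independently of the others.
 */
public class BlockReportPartitioner {
  public static List<List<Long>> partition(List<Long> blockIds, int threshold) {
    List<List<Long>> result = new ArrayList<>();
    split(blockIds, threshold, 0, result);
    return result;
  }

  private static void split(List<Long> ids, int threshold, int bit,
                            List<List<Long>> out) {
    // Stop when the partition is small enough (or we run out of bits).
    if (ids.size() <= threshold || bit >= 63) {
      out.add(ids);
      return;
    }
    List<Long> zeros = new ArrayList<>();
    List<Long> ones = new ArrayList<>();
    for (long id : ids) {
      // Inspect the bit just above the suffix bits already fixed.
      if (((id >>> bit) & 1L) == 0) {
        zeros.add(id);
      } else {
        ones.add(id);
      }
    }
    split(zeros, threshold, bit + 1, out);
    split(ones, threshold, bit + 1, out);
  }
}
```

With a threshold of 3, for example, ten consecutive block IDs split into partitions that all fit under the threshold, while every ID still appears in exactly one partition.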
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954347#comment-14954347 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #486 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/486/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954342#comment-14954342 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2461 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2461/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, 
HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8630) WebHDFS : Support get/setStoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-8630: - Status: Patch Available (was: Open) Resubmitting patch for QA results. > WebHDFS : Support get/setStoragePolicy > --- > > Key: HDFS-8630 > URL: https://issues.apache.org/jira/browse/HDFS-8630 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: nijel >Assignee: Surendra Singh Lilhore > Attachments: HDFS-8630.001.patch, HDFS-8630.002.patch, > HDFS-8630.003.patch, HDFS-8630.patch > > > Users can set and get the storage policy from the filesystem object. The same > operations can be allowed through the REST API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8630) WebHDFS : Support get/setStoragePolicy
[ https://issues.apache.org/jira/browse/HDFS-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-8630: - Status: Open (was: Patch Available) > WebHDFS : Support get/setStoragePolicy > --- > > Key: HDFS-8630 > URL: https://issues.apache.org/jira/browse/HDFS-8630 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: nijel >Assignee: Surendra Singh Lilhore > Attachments: HDFS-8630.001.patch, HDFS-8630.002.patch, > HDFS-8630.003.patch, HDFS-8630.patch > > > Users can set and get the storage policy from the filesystem object. The same > operations can be allowed through the REST API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954317#comment-14954317 ] Ming Ma commented on HDFS-8647: --- [~brahmareddy], regarding the {{verifyBlockPlacement}} part, maybe it is ok to replace {{LocatedBlock}} with {{DatanodeInfo[]}} as in your earlier patch. If we need to add additional information for new scenarios, we can update the API again. In addition, such a change might allow the balancer to ask the block placement policy whether a move would violate the policy, as in HDFS-9007. Also, many changes around block management and block placement policy have gone in, so I understand it requires some effort to rebase your patch. Thanks for your work. It is close. > Abstract BlockManager's rack policy into BlockPlacementPolicy > - > > Key: HDFS-8647 > URL: https://issues.apache.org/jira/browse/HDFS-8647 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma >Assignee: Brahma Reddy Battula > Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, > HDFS-8647-003.patch, HDFS-8647-004.patch, HDFS-8647-004.patch > > > Sometimes we want to have the namenode use an alternative block placement policy > such as upgrade domains in HDFS-7541. > BlockManager has built-in assumptions about rack policy in functions such as > useDelHint and blockHasEnoughRacks. That means when we have a new block placement > policy, we need to modify BlockManager to account for the new policy. Ideally > BlockManager should ask the BlockPlacementPolicy object instead. That would allow > us to provide a new BlockPlacementPolicy without changing BlockManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long
[ https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954312#comment-14954312 ] Mingliang Liu commented on HDFS-9145: - The failing tests pass locally and seem unrelated. > Tracking methods that hold FSNamesytemLock for too long > --- > > Key: HDFS-9145 > URL: https://issues.apache.org/jira/browse/HDFS-9145 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Jing Zhao >Assignee: Mingliang Liu > Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, > HDFS-9145.002.patch, HDFS-9145.003.patch > > > It would be helpful if we had a way to track (or at least log a message) > when some operation holds the FSNamesystem lock for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
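The idea behind HDFS-9145 can be sketched as a lock wrapper that timestamps each acquisition and reports holds longer than a threshold on release. This is an assumed design for illustration, not the patch's actual implementation; the class name and method shapes are invented.

```java
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch of a "timed" namesystem lock: record when the current
 * thread acquired the lock and, on release, report any hold that exceeded
 * the configured threshold.
 */
public class TimedLock {
  private final ReentrantLock lock = new ReentrantLock();
  private final long thresholdMs;
  private final ThreadLocal<Long> acquiredAt = new ThreadLocal<>();

  public TimedLock(long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  public void lock() {
    lock.lock();
    acquiredAt.set(System.currentTimeMillis());
  }

  /** Releases the lock and returns the hold time in ms for inspection. */
  public long unlock() {
    long heldMs = System.currentTimeMillis() - acquiredAt.get();
    lock.unlock();
    if (heldMs > thresholdMs) {
      // A real implementation would log the holder's stack trace here.
      System.err.println("Lock held for " + heldMs + " ms (threshold "
          + thresholdMs + " ms)");
    }
    return heldMs;
  }
}
```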
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954307#comment-14954307 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #515 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/515/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, 
HDFS-9006-3.patch, HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9232) Shouldn't start block recovery if block has no enough replicas
[ https://issues.apache.org/jira/browse/HDFS-9232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9232: Attachment: HDFS-9232.01.patch > Shouldn't start block recovery if block has no enough replicas > -- > > Key: HDFS-9232 > URL: https://issues.apache.org/jira/browse/HDFS-9232 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-9232.01.patch > > > from HDFS-8406: > {quote} > Before primary DN calls commitBlockSynchronization, it synchronized 2 RBW > replicas, and make them finalized. Then primary DN calls > commitBlockSynchronization, to complete the lastBlock and close the file. The > question is, your dfs.namenode.replication.min is 3, the last block can't be > completed. NameNode shouldn't issue blockRecovery in the first place because > lastBlock can't be completed anyway. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9232) Shouldn't start block recovery if block has no enough replicas
[ https://issues.apache.org/jira/browse/HDFS-9232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-9232: Status: Patch Available (was: Open) > Shouldn't start block recovery if block has no enough replicas > -- > > Key: HDFS-9232 > URL: https://issues.apache.org/jira/browse/HDFS-9232 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Walter Su >Assignee: Walter Su > Attachments: HDFS-9232.01.patch > > > from HDFS-8406: > {quote} > Before primary DN calls commitBlockSynchronization, it synchronized 2 RBW > replicas, and make them finalized. Then primary DN calls > commitBlockSynchronization, to complete the lastBlock and close the file. The > question is, your dfs.namenode.replication.min is 3, the last block can't be > completed. NameNode shouldn't issue blockRecovery in the first place because > lastBlock can't be completed anyway. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long
[ https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954302#comment-14954302 ] Hadoop QA commented on HDFS-9145: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 30s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 7s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 16s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 6m 59s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 190m 33s | Tests failed in hadoop-hdfs. 
| | | | 246m 5s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestEncryptionZonesWithKMS | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766189/HDFS-9145.003.patch | | Optional Tests | javac unit findbugs checkstyle javadoc | | git revision | trunk / c60a16f | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12942/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12942/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12942/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12942/console | This message was automatically generated. > Tracking methods that hold FSNamesytemLock for too long > --- > > Key: HDFS-9145 > URL: https://issues.apache.org/jira/browse/HDFS-9145 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Jing Zhao >Assignee: Mingliang Liu > Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, > HDFS-9145.002.patch, HDFS-9145.003.patch > > > It will be helpful that if we can have a way to track (or at least log a msg) > if some operation is holding the FSNamesystem lock for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954298#comment-14954298 ] Hudson commented on HDFS-9006: -- FAILURE: Integrated in Hadoop-trunk-Commit #8614 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8614/]) HDFS-9006. Provide BlockPlacementPolicy that supports upgrade domain. (lei: rev 0f5f9846edab3ea7e80f3572136f998bcd46) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementStatusWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyConsiderLoad.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/BaseReplicationPolicyTest.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithUpgradeDomain.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicyWithNodeGroup.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, 
HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954296#comment-14954296 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2424 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2424/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-9232) Shouldn't start block recovery if block has no enough replicas
[ https://issues.apache.org/jira/browse/HDFS-9232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su reassigned HDFS-9232: --- Assignee: Walter Su > Shouldn't start block recovery if block has no enough replicas > -- > > Key: HDFS-9232 > URL: https://issues.apache.org/jira/browse/HDFS-9232 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Walter Su >Assignee: Walter Su > > from HDFS-8406: > {quote} > Before primary DN calls commitBlockSynchronization, it synchronized 2 RBW > replicas, and make them finalized. Then primary DN calls > commitBlockSynchronization, to complete the lastBlock and close the file. The > question is, your dfs.namenode.replication.min is 3, the last block can't be > completed. NameNode shouldn't issue blockRecovery in the first place because > lastBlock can't be completed anyway. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9232) Shouldn't start block recovery if block has no enough replicas
Walter Su created HDFS-9232: --- Summary: Shouldn't start block recovery if block has no enough replicas Key: HDFS-9232 URL: https://issues.apache.org/jira/browse/HDFS-9232 Project: Hadoop HDFS Issue Type: Bug Reporter: Walter Su from HDFS-8406: {quote} Before primary DN calls commitBlockSynchronization, it synchronized 2 RBW replicas, and make them finalized. Then primary DN calls commitBlockSynchronization, to complete the lastBlock and close the file. The question is, your dfs.namenode.replication.min is 3, the last block can't be completed. NameNode shouldn't issue blockRecovery in the first place because lastBlock can't be completed anyway. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8406) Lease recovery continually failed
[ https://issues.apache.org/jira/browse/HDFS-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954291#comment-14954291 ] Walter Su commented on HDFS-8406: - Before the primary DN calls commitBlockSynchronization, it synchronized 2 RBW replicas and made them finalized. Then the primary DN calls commitBlockSynchronization to complete the lastBlock and close the file. The question is, your {{dfs.namenode.replication.min}} is 3, so the last block can't be completed. NameNode shouldn't issue blockRecovery in the first place because lastBlock can't be completed anyway. If your {{dfs.namenode.replication.min}} is 3, you should make sure you write to 3 DNs when you set up the pipeline. You can increase the replication number, or set up the {{ReplaceDatanodeOnFailure}} policy. The default policy is {noformat} /** // ReplaceDatanodeOnFailure.java * DEFAULT condition: * Let r be the replication number. * Let n be the number of existing datanodes. * Add a new datanode only if r >= 3 and either * (1) floor(r/2) >= n; or * (2) r > n and the block is hflushed/appended. */ {noformat} It's likely you end up with 2 replicas using the default policy. You can try the {{always}} policy to replace the failed DataNode, to make sure you have 3 RBW replicas. If the client accidentally fails, blockRecovery can go on. > Lease recovery continually failed > - > > Key: HDFS-8406 > URL: https://issues.apache.org/jira/browse/HDFS-8406 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Keith Turner > Labels: Accumulo, HBase, SolrCloud > > While testing Accumulo on a cluster and killing processes, I ran into a > situation where the lease on an Accumulo write-ahead log in HDFS could not be > recovered. Even restarting HDFS and Accumulo would not fix the problem. > The following message was seen in an Accumulo tablet server log immediately > before the tablet server was killed.
> {noformat} > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : DFSOutputStream > ResponseProcessor exception for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 > java.io.IOException: Bad response ERROR for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 from datanode > 10.1.5.9:50010 > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:897) > 2015-05-14 17:12:37,466 [hdfs.DFSClient] WARN : Error Recovery for block > BP-802741494-10.1.5.6-1431557089849:blk_1073932823_192060 in pipeline > 10.1.5.55:50010, 10.1.5.9:5 > {noformat} > Before recovering data from a write ahead log, the Accumulo master attempts > to recover the lease. This repeatedly failed with messages like the > following. > {noformat} > 2015-05-14 17:14:54,301 [recovery.HadoopLogCloser] WARN : Error recovering > lease on > hdfs://10.1.5.6:1/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): > failed to create file > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_950713214_16 for client 10.1.5.158 because > pendingCreates is non-null but no leases found. > {noformat} > Below is some info from the NN logs for the problematic file. > {noformat} > [ec2-user@leader2 logs]$ grep 3a731759-3594-4535-8086-245 > hadoop-ec2-user-namenode-leader2.log > 2015-05-14 17:10:46,299 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > allocateBlock: > /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2. 
> BP-802741494-10.1.5.6-1431557089849 > blk_1073932823_192060{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, > replicas=[ReplicaUnderConstruction[[DISK]DS-ffe07d7d-0e68-45b8-b3d5-c976f1716481:NORMAL:10.1.5.55:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-6efec702-3f1f-4ec0-a31f-de947e7e6097:NORMAL:10.1.5.9:50010|RBW], > > ReplicaUnderConstruction[[DISK]DS-5e27df17-abf8-47df-b4bc-c38d0cd426ea:NORMAL:10.1.5.45:50010|RBW]]} > 2015-05-14 17:10:46,628 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > fsync: /accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 for > DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: recoverLease: [Lease. > Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 from > client DFSClient_NONMAPREDUCE_-1128465883_16 > 2015-05-14 17:14:49,288 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease. > Holder: DFSClient_NONMAPREDUCE_-1128465883_16, pendingcreates: 1], > src=/accumulo/wal/worker11+9997/3a731759-3594-4535-8086-245eed7cd4c2 > 2015-05-14
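The DEFAULT {{ReplaceDatanodeOnFailure}} condition quoted in the comment above boils down to a small predicate. A minimal sketch follows; the class and method names are ours for illustration, not the HDFS API:

```java
/**
 * Sketch of the DEFAULT ReplaceDatanodeOnFailure condition quoted above.
 * r = configured replication number, n = datanodes still in the pipeline.
 */
public class ReplaceDatanodePolicySketch {
    public static boolean shouldAddReplacement(int r, int n,
                                               boolean hflushedOrAppended) {
        if (r < 3) {
            return false;  // never add a replacement for small replication factors
        }
        // (1) half or more of the replicas are already gone, or
        // (2) at least one replica is gone and the block was hflushed/appended
        return (r / 2 >= n) || (r > n && hflushedOrAppended);
    }
}
```

With r = 3 and n = 2 on a plain (non-hflushed) write, the predicate is false, which matches the observation that the default policy can leave you with 2 replicas.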
[jira] [Commented] (HDFS-9223) Code cleanup for DatanodeDescriptor and HeartbeatManager
[ https://issues.apache.org/jira/browse/HDFS-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954290#comment-14954290 ] Tsz Wo Nicholas Sze commented on HDFS-9223: --- Since getCapacityUsedPercent, getCapacityRemainingPercent, getPercentBlockPoolUsed, getCapacityUsedNonDFS call multiple stats methods, we should also add them to the new class and synchronize them. > ... I will check this code to confirm and maybe we can do this change > separately. Sure, please check the code and do it separately. > Code cleanup for DatanodeDescriptor and HeartbeatManager > > > Key: HDFS-9223 > URL: https://issues.apache.org/jira/browse/HDFS-9223 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-9223.000.patch, HDFS-9223.001.patch > > > Some code cleanup for {{DatanodeDescriptor}} and {{HeartbeatManager}}. The > changes include: > # Change {{DataDescriptor#isAlive}} and {{DatanodeDescriptor#needKeyUpdate}} > from public to private > # Use EnumMap for {{HeartbeatManager#storageTypeStatesMap}} > # Move the {{isInStartupSafeMode}} out of the namesystem lock in > {{heartbeatCheck}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
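The point about synchronizing getCapacityUsedPercent and friends is that each of them reads several counters, so without a lock a concurrent heartbeat update could be observed half-applied. A toy illustration of the idea (not the actual HeartbeatManager code; the class and field names are ours):

```java
/**
 * Illustrative sketch of why percent-style getters that read multiple
 * counters should be synchronized: they must observe one consistent
 * snapshot of the underlying stats, not a half-applied update.
 */
public class StatsSketch {
    private long capacityTotal;
    private long capacityUsed;

    /** Updates both counters atomically with respect to the getters. */
    public synchronized void update(long total, long used) {
        this.capacityTotal = total;
        this.capacityUsed = used;
    }

    /** Reads two fields under the same lock as update(). */
    public synchronized float getCapacityUsedPercent() {
        return capacityTotal == 0 ? 0f : 100f * capacityUsed / capacityTotal;
    }
}
```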
[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954282#comment-14954282 ] Hadoop QA commented on HDFS-9184: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 59s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 6s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 29s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 51s | The applied patch generated 9 new checkstyle issues (total was 225, now 233). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 4m 31s | The patch appears to introduce 2 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 6m 40s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 62m 49s | Tests failed in hadoop-hdfs. 
| | | | 117m 23s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.ipc.TestRPC | | | hadoop.net.TestDNS | | | hadoop.hdfs.web.TestWebHDFSOAuth2 | | Timed out tests | org.apache.hadoop.hdfs.TestDatanodeDeath | | | org.apache.hadoop.hdfs.TestSafeMode | | | org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits | | | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766202/HDFS-9184.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c60a16f | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12943/artifact/patchprocess/diffcheckstylehadoop-common.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12943/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12943/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12943/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12943/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12943/console | This message was automatically generated. > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, > HDFS-9184.005.patch > > > For a given HDFS operation (e.g. 
delete file), it's very helpful to track > which upper-level job issued it. The upper-level callers may be specific > Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode > (NN) is abused/spammed and the operator wants to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the user of the operation, which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top-level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g. Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places, interconnected > like a tree structure, which relies on offline analysis across the RPC boundary. > For this use case, {{htrace}} has to be enabled at a 100% sampling rate, which > introduces significant
[jira] [Updated] (HDFS-9205) Do not schedule corrupt blocks for replication
[ https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-9205: -- Attachment: h9205_20151013.patch h9205_20151013.patch: addresses Jing's comments. > Do not schedule corrupt blocks for replication > -- > > Key: HDFS-9205 > URL: https://issues.apache.org/jira/browse/HDFS-9205 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > Attachments: h9205_20151007.patch, h9205_20151007b.patch, > h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch, > h9205_20151013.patch > > > Corrupted blocks by definition are blocks cannot be read. As a consequence, > they cannot be replicated. In UnderReplicatedBlocks, there is a queue for > QUEUE_WITH_CORRUPT_BLOCKS and chooseUnderReplicatedBlocks may choose blocks > from it. It seems that scheduling corrupted block for replication is wasting > resource and potentially slow down replication for the higher priority blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
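The proposed behavior can be sketched as a scheduler that walks the priority queues but skips the corrupt-blocks queue entirely, since a corrupt block cannot be read and therefore cannot be re-replicated. The queue layout and names below are ours for illustration, not the UnderReplicatedBlocks API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy sketch: choose blocks to replicate, skipping the corrupt queue. */
public class ReplicationQueueSketch {
    // In HDFS the corrupt queue is the lowest-priority queue.
    static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

    public static List<String> chooseBlocks(List<Queue<String>> priorityQueues,
                                            int limit) {
        List<String> chosen = new ArrayList<>();
        for (int prio = 0;
             prio < priorityQueues.size() && chosen.size() < limit; prio++) {
            if (prio == QUEUE_WITH_CORRUPT_BLOCKS) {
                continue;  // replicating an unreadable block wastes resources
            }
            Queue<String> q = priorityQueues.get(prio);
            while (!q.isEmpty() && chosen.size() < limit) {
                chosen.add(q.poll());
            }
        }
        return chosen;
    }

    /** Tiny demo: one urgent block and one corrupt block. */
    public static List<String> demo() {
        List<Queue<String>> qs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            qs.add(new ArrayDeque<>());
        }
        qs.get(0).add("blk_urgent");
        qs.get(QUEUE_WITH_CORRUPT_BLOCKS).add("blk_corrupt");
        return chooseBlocks(qs, 10);
    }
}
```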
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954235#comment-14954235 ] Hadoop QA commented on HDFS-8287: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12761597/HDFS-8287-HDFS-7285.11.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 6c17d31 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12944/console | This message was automatically generated. > DFSStripedOutputStream.writeChunk should not wait for writing parity > - > > Key: HDFS-8287 > URL: https://issues.apache.org/jira/browse/HDFS-8287 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Kai Sasaki > Attachments: HDFS-8287-HDFS-7285.00.patch, > HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, > HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, > HDFS-8287-HDFS-7285.05.patch, HDFS-8287-HDFS-7285.06.patch, > HDFS-8287-HDFS-7285.07.patch, HDFS-8287-HDFS-7285.08.patch, > HDFS-8287-HDFS-7285.09.patch, HDFS-8287-HDFS-7285.10.patch, > HDFS-8287-HDFS-7285.11.patch, HDFS-8287-HDFS-7285.WIP.patch, > HDFS-8287-performance-report.pdf, h8287_20150911.patch, jstack-dump.txt > > > When a stripping cell is full, writeChunk computes and generates parity > packets. It sequentially calls waitAndQueuePacket so that user client cannot > continue to write data until it finishes. > We should allow user client to continue writing instead but not blocking it > when writing parity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9226) MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954230#comment-14954230 ] Hadoop QA commented on HDFS-9226: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 8m 9s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 19 new or modified test files. | | {color:green}+1{color} | javac | 8m 9s | There were no new javac warning messages. | | {color:red}-1{color} | release audit | 0m 18s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 26s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 15s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 234m 7s | Tests failed in hadoop-hdfs. 
| | | | 258m 3s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestAddBlockRetry | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.web.TestWebHDFSXAttr | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766164/HDFS-9226.004.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 9849c8b | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12935/artifact/patchprocess/patchReleaseAuditProblems.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12935/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12935/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12935/console | This message was automatically generated. 
> MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils > - > > Key: HDFS-9226 > URL: https://issues.apache.org/jira/browse/HDFS-9226 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS, test >Reporter: Josh Elser >Assignee: Josh Elser > Attachments: HDFS-9226.001.patch, HDFS-9226.002.patch, > HDFS-9226.003.patch, HDFS-9226.004.patch > > > Noticed a test failure when attempting to run Accumulo unit tests against > 2.8.0-SNAPSHOT: > {noformat} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2421) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2323) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2367) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1529) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:841) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:479) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:438) > at > org.apache.accumulo.start.test.AccumuloDFSBase.miniDfsClusterSetup(AccumuloDFSBase.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) 
> at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283) > at > org.apache.maven.surefire.junit4.JUnit4Provide
[jira] [Commented] (HDFS-9223) Code cleanup for DatanodeDescriptor and HeartbeatManager
[ https://issues.apache.org/jira/browse/HDFS-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954228#comment-14954228 ] Hadoop QA commented on HDFS-9223: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 24m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 11m 9s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 45s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 21s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 46s | The applied patch generated 4 new checkstyle issues (total was 330, now 320). | | {color:red}-1{color} | whitespace | 0m 2s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 35s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 197m 42s | Tests failed in hadoop-hdfs. 
| | | | 256m 1s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.util.TestByteArrayManager | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766169/HDFS-9223.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 9849c8b | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12936/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12936/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12936/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12936/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12936/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12936/console | This message was automatically generated. > Code cleanup for DatanodeDescriptor and HeartbeatManager > > > Key: HDFS-9223 > URL: https://issues.apache.org/jira/browse/HDFS-9223 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-9223.000.patch, HDFS-9223.001.patch > > > Some code cleanup for {{DatanodeDescriptor}} and {{HeartbeatManager}}. The > changes include: > # Change {{DataDescriptor#isAlive}} and {{DatanodeDescriptor#needKeyUpdate}} > from public to private > # Use EnumMap for {{HeartbeatManager#storageTypeStatesMap}} > # Move the {{isInStartupSafeMode}} out of the namesystem lock in > {{heartbeatCheck}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954199#comment-14954199 ] Hadoop QA commented on HDFS-9184: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 33s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 18s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 40s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 18s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 59s | The applied patch generated 9 new checkstyle issues (total was 225, now 233). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 4m 31s | The patch appears to introduce 2 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 7m 0s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 96m 36s | Tests failed in hadoop-hdfs. 
| | | | 152m 26s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestReservedRawPaths | | | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots | | | hadoop.hdfs.TestModTime | | | hadoop.fs.TestUrlStreamHandler | | | hadoop.hdfs.security.TestDelegationToken | | | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead | | | hadoop.hdfs.server.namenode.TestFileLimit | | | hadoop.hdfs.TestParallelShortCircuitRead | | | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot | | | hadoop.hdfs.TestDisableConnCache | | | hadoop.hdfs.server.namenode.TestEditLogAutoroll | | | hadoop.TestRefreshCallQueue | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | | hadoop.cli.TestCryptoAdminCLI | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.TestSetrepDecreasing | | | hadoop.hdfs.server.datanode.TestDiskError | | | hadoop.fs.viewfs.TestViewFsWithAcls | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.server.namenode.TestAddStripedBlocks | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.namenode.TestHostsFiles | | | hadoop.hdfs.server.datanode.TestTransferRbw | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy | | | hadoop.fs.contract.hdfs.TestHDFSContractDelete | | | hadoop.hdfs.server.namenode.TestFileContextAcl | | | hadoop.hdfs.TestSafeModeWithStripedFile | | | hadoop.fs.TestFcHdfsSetUMask | | | hadoop.fs.TestUnbuffer | | | hadoop.hdfs.server.namenode.TestDeleteRace | | | hadoop.hdfs.TestPread | | | hadoop.hdfs.server.namenode.TestFSDirectory | | | hadoop.fs.contract.hdfs.TestHDFSContractOpen | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotListing | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.TestReadWhileWriting | | | hadoop.fs.contract.hdfs.TestHDFSContractMkdir | | | hadoop.fs.contract.hdfs.TestHDFSContractAppend | | | 
hadoop.hdfs.server.datanode.TestFsDatasetCache | | | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock | | | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.namenode.TestAuditLogger | | | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | | | hadoop.hdfs.TestWriteBlockGetsBlockLengthHint | | | hadoop.hdfs.TestDatanodeLayoutUpgrade | | | hadoop.hdfs.server.namenode.TestHDFSConcat | | | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer | | | hadoop.hdfs.server.datanode.TestCachingStrategy | | | hadoop.fs.TestSymlinkHdfsFileSystem | | | hadoop.fs.viewfs.TestViewFsDefaultValue | | | hadoop.fs.TestSymlinkHdfsFileContext | | | hadoop.hdfs.TestClientProtocolForPipelineRecovery | | | hadoop.hdfs.TestFSInputChecker | | | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes | | | hadoop.hdfs.server.mover.TestStorageMover | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory | | | hadoop.hdfs.server.datanode.TestBlockReplacement | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeP
[jira] [Commented] (HDFS-9170) Move libhdfs / fuse-dfs / libwebhdfs to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954188#comment-14954188 ] Andrew Wang commented on HDFS-9170: --- Okay, if this is just changing 2.8 to 2.6, let's get the trivial patch up. [~eepayne] if you're willing to file a new JIRA and do this, I'm sure myself or Haohui can quickly verify and commit it. Thanks for the testing efforts thus far too. > Move libhdfs / fuse-dfs / libwebhdfs to hdfs-client > --- > > Key: HDFS-9170 > URL: https://issues.apache.org/jira/browse/HDFS-9170 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.8.0 > > Attachments: HDFS-9170.000.patch, HDFS-9170.001.patch, > HDFS-9170.002.patch, HDFS-9170.003.patch, HDFS-9170.004.patch, > native-package-build-fails-with-cmake-2.5.log > > > After HDFS-6200 the Java implementation of hdfs-client has been moved to a > separate hadoop-hdfs-client module. > libhdfs, fuse-dfs and libwebhdfs still reside in the hadoop-hdfs module. > Ideally these modules should reside in hadoop-hdfs-client. However, to > write unit tests for these components, it is often necessary to run > MiniDFSCluster, which resides in the hadoop-hdfs module. > This jira is to discuss how these native modules should be laid out after > HDFS-6200. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954187#comment-14954187 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2460 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2460/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9184: Attachment: HDFS-9184.005.patch Thanks for your comment [~leftnoteasy]. It makes perfect sense to me that the caller context is inherited in the child thread, as we don't support caller context hierarchy (which was by design). Please note that the child thread is free to override its own current caller context. The v5 patch addresses this with an updated unit test. > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, > HDFS-9184.005.patch > > > For a given HDFS operation (e.g. delete file), it's very helpful to track > which upper-level job issued it. The upper-level callers may be specific > Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode > (NN) is abused/spammed and the operator wants to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the user of the operation, which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top-level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g. Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places, interconnected > like a tree structure, which relies on offline analysis across the RPC boundary. 
> For this use case, {{htrace}} has to be enabled at 100% sampling rate which > introduces significant overhead. Moreover, passing additional information > (via annotations) other than span id from root of the tree to leaf is a > significant additional work. > 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there > are some related discussion on this topic. The final patch implemented the > tracking id as a part of delegation token. This protects the tracking > information from being changed or impersonated. However, kerberos > authenticated connections or insecure connections don't have tokens. > [HADOOP-8779] proposes to use tokens in all the scenarios, but that might > mean changes to several upstream projects and is a major change in their > security implementation. > We propose another approach to address this problem. We also treat HDFS audit > log as a good place for after-the-fact root cause analysis. We propose to put > the caller id (e.g. Hive query id) in threadlocals. Specially, on client side > the threadlocal object is passed to NN as a part of RPC header (optional), > while on sever side NN retrieves it from header and put it to {{Handler}}'s > threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the > caller context for each operation. In this way, the existing code is not > affected. > It is still challenging to keep "lying" client from abusing the caller > context. Our proposal is to add a {{signature}} field to the caller context. > The client choose to provide its signature along with the caller id. The > operator may need to validate the signature at the time of offline analysis. > The NN is not responsible for validating the signature online. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
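The client-side shape of the proposal above can be sketched in a few lines of Java. All names here ({{CallerContextSketch}}, {{headerValue}}) are illustrative placeholders, not the actual HDFS-9184 API: a thread-local caller context holds an application-dependent id plus an optional signature, and the RPC layer would serialize the id into an optional header field.

```java
// Hedged sketch of the HDFS-9184 proposal (illustrative names, not the real API):
// a thread-local caller context whose id rides along in the RPC request header.
public class CallerContextSketch {
    private static final ThreadLocal<CallerContextSketch> CURRENT = new ThreadLocal<>();

    private final String id;        // e.g. a Hive query id ("tracking id")
    private final byte[] signature; // optional; validated offline, not by the NN

    private CallerContextSketch(String id, byte[] signature) {
        this.id = id;
        this.signature = signature;
    }

    /** Set once per request thread on the client side. */
    public static void set(String id, byte[] signature) {
        CURRENT.set(new CallerContextSketch(id, signature));
    }

    /** What the RPC layer would place into the optional header field. */
    public static String headerValue() {
        CallerContextSketch c = CURRENT.get();
        return c == null ? null : c.id;
    }

    public static void main(String[] args) {
        set("hive_query_7f3a", null);          // hypothetical query id
        System.out.println(headerValue());
    }
}
```

On the server side, the symmetric step would read the header and re-populate a handler-thread local before the audit logger runs, so the existing audit path needs no signature changes.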
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954169#comment-14954169 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #526 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/526/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954156#comment-14954156 ] Jing Zhao commented on HDFS-8287: - I'm very sorry, this jira slipped off my radar. Could you please rebase the patch in order to run Jenkins, [~lewuathe]? > DFSStripedOutputStream.writeChunk should not wait for writing parity > - > > Key: HDFS-8287 > URL: https://issues.apache.org/jira/browse/HDFS-8287 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Kai Sasaki > Attachments: HDFS-8287-HDFS-7285.00.patch, > HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, > HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, > HDFS-8287-HDFS-7285.05.patch, HDFS-8287-HDFS-7285.06.patch, > HDFS-8287-HDFS-7285.07.patch, HDFS-8287-HDFS-7285.08.patch, > HDFS-8287-HDFS-7285.09.patch, HDFS-8287-HDFS-7285.10.patch, > HDFS-8287-HDFS-7285.11.patch, HDFS-8287-HDFS-7285.WIP.patch, > HDFS-8287-performance-report.pdf, h8287_20150911.patch, jstack-dump.txt > > > When a striping cell is full, writeChunk computes and generates parity > packets. It sequentially calls waitAndQueuePacket, so the user client cannot > continue to write data until it finishes. > We should instead allow the user client to continue writing rather than > blocking it while parity is written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
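The non-blocking behavior the issue asks for can be pictured with a toy sketch. This is not the DFSStripedOutputStream code: the XOR "parity", the method name, and the single-thread pool are all stand-ins. The point is only the shape of the fix, i.e. the writer thread submits the full cell to a background executor and returns immediately instead of waiting for parity packets to be queued.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy sketch of "don't block writeChunk on parity" (not the real HDFS code).
public class AsyncParityDemo {
    // Daemon thread so the demo pool never prevents JVM exit.
    static final ExecutorService PARITY_POOL = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "parity-writer");
        t.setDaemon(true);
        return t;
    });

    /** Called when a cell fills up; the caller keeps writing immediately. */
    static Future<Integer> onCellFull(byte[] cell) {
        return PARITY_POOL.submit(() -> {
            int parity = 0;
            for (byte b : cell) parity ^= b;   // toy stand-in for RS encoding
            return parity;
        });
    }

    /** Convenience: wait for the background parity result. */
    static int parityNow(byte[] cell) throws Exception {
        return onCellFull(cell).get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parityNow(new byte[]{1, 2, 3})); // 1 ^ 2 ^ 3 == 0
    }
}
```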
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954158#comment-14954158 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1251 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1251/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9231) fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
Xiao Chen created HDFS-9231: --- Summary: fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot Key: HDFS-9231 URL: https://issues.apache.org/jira/browse/HDFS-9231 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Reporter: Xiao Chen Assignee: Xiao Chen For snapshot files, fsck shows corrupt blocks with the original file dir instead of the snapshot dir. This can be confusing since even when the original file is deleted, a new fsck run will still show that file as corrupted although what's actually corrupted is the snapshot. This is true even when given the -includeSnapshots option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9205) Do not schedule corrupt blocks for replication
[ https://issues.apache.org/jira/browse/HDFS-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954147#comment-14954147 ] Jing Zhao commented on HDFS-9205: -
# Nit: the javadoc of {{UnderReplicatedBlocks}} needs a fix: "getPriority(BlockInfo, int, int, int)" should be updated to "getPriority(BlockInfo, int, int, int, int)".
# Minor: since the iterator of the LightWeightLinkedSet already correctly throws NoSuchElementException when there is no next element, the hasNext check may be unnecessary:
{code}
public BlockInfo next() {
  if (!hasNext()) {
    throw new NoSuchElementException();
  }
  return b.next();
}
{code}
+1 after addressing these.
> Do not schedule corrupt blocks for replication > -- > > Key: HDFS-9205 > URL: https://issues.apache.org/jira/browse/HDFS-9205 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > Attachments: h9205_20151007.patch, h9205_20151007b.patch, > h9205_20151008.patch, h9205_20151009.patch, h9205_20151009b.patch > > > Corrupt blocks are, by definition, blocks that cannot be read. As a consequence, > they cannot be replicated. In UnderReplicatedBlocks, there is a queue for > QUEUE_WITH_CORRUPT_BLOCKS, and chooseUnderReplicatedBlocks may choose blocks > from it. Scheduling corrupt blocks for replication wastes resources and > potentially slows down replication of the higher-priority blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
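The second review point above generalizes beyond Hadoop's LightWeightLinkedSet. Here is a self-contained sketch using a plain java.util iterator (the wrapper class is hypothetical, not the patch's code): a delegating {{next()}} can simply forward to the inner iterator, whose own NoSuchElementException makes an extra {{hasNext()}} guard redundant.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch: a delegating iterator does not need its own hasNext() guard,
// because the Iterator contract requires next() to throw
// NoSuchElementException when exhausted.
public class DelegatingIteratorDemo {
    static <T> Iterator<T> wrap(final Iterator<T> inner) {
        return new Iterator<T>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            // No "if (!hasNext()) throw ..." needed: inner.next() throws itself.
            @Override public T next() { return inner.next(); }
        };
    }

    /** Returns true iff next() on an empty wrapped iterator throws as required. */
    static boolean throwsOnEmpty() {
        Iterator<String> it = wrap(Collections.<String>emptyList().iterator());
        try {
            it.next();
            return false;
        } catch (NoSuchElementException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOnEmpty());
    }
}
```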
[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954104#comment-14954104 ] Wangda Tan commented on HDFS-9184: -- Thanks [~liuml07], instead of {{static final ThreadLocal context = new ThreadLocal<>();}}, could you use InheritableThreadLocal instead? With the {{InheritableThreadLocal}}, we don't need to set the context at every thread. For example, MR can set the Context at main thread so all threads will have this value automatically. > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch > > > For a given HDFS operation (e.g. delete file), it's very helpful to track > which upper level job issues it. The upper level callers may be specific > Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode > (NN) is abused/spammed, the operator may want to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the users of the the operation which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g.Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places interconnected > like a tree structure which relies on offline analysis across RPC boundary. 
> For this use case, {{htrace}} has to be enabled at 100% sampling rate which > introduces significant overhead. Moreover, passing additional information > (via annotations) other than span id from root of the tree to leaf is a > significant additional work. > 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there > are some related discussion on this topic. The final patch implemented the > tracking id as a part of delegation token. This protects the tracking > information from being changed or impersonated. However, kerberos > authenticated connections or insecure connections don't have tokens. > [HADOOP-8779] proposes to use tokens in all the scenarios, but that might > mean changes to several upstream projects and is a major change in their > security implementation. > We propose another approach to address this problem. We also treat HDFS audit > log as a good place for after-the-fact root cause analysis. We propose to put > the caller id (e.g. Hive query id) in threadlocals. Specially, on client side > the threadlocal object is passed to NN as a part of RPC header (optional), > while on sever side NN retrieves it from header and put it to {{Handler}}'s > threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the > caller context for each operation. In this way, the existing code is not > affected. > It is still challenging to keep "lying" client from abusing the caller > context. Our proposal is to add a {{signature}} field to the caller context. > The client choose to provide its signature along with the caller id. The > operator may need to validate the signature at the time of offline analysis. > The NN is not responsible for validating the signature online. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
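The behavior [~leftnoteasy] is suggesting can be seen in a minimal, self-contained sketch (the context value and class name are made up): a value set in an {{InheritableThreadLocal}} by the parent thread is visible in a child thread without any per-thread set call, which a plain {{ThreadLocal}} would not provide.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch: InheritableThreadLocal copies the parent's value into each child
// thread at thread-creation time, so e.g. an MR main thread can set the
// caller context once for all worker threads.
public class InheritableContextDemo {
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    /** Sets the context in the parent, then reads it back from a child thread. */
    public static String childSees() throws InterruptedException {
        CONTEXT.set("mr_job_42");                 // hypothetical context value
        AtomicReference<String> seen = new AtomicReference<>();
        Thread child = new Thread(() -> seen.set(CONTEXT.get()));
        child.start();
        child.join();
        return seen.get();                        // inherited, no set() in the child
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(childSees());          // prints mr_job_42
    }
}
```

Note the inheritance happens only at thread creation; as the v5 comment above says, the child remains free to override its own copy afterwards.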
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954091#comment-14954091 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #514 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/514/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954023#comment-14954023 ] Ming Ma commented on HDFS-9006: --- Thanks [~eddyxu] and [~ctrezzo]! > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9167) Update pom.xml in other modules to depend on hdfs-client instead of hdfs
[ https://issues.apache.org/jira/browse/HDFS-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954020#comment-14954020 ] Mingliang Liu commented on HDFS-9167: - Since "findbugs" runs cleanly and the unit tests pass for me locally, I don't have any clue about the broken findbugs and failing tests here. Any input is highly appreciated. > Update pom.xml in other modules to depend on hdfs-client instead of hdfs > > > Key: HDFS-9167 > URL: https://issues.apache.org/jira/browse/HDFS-9167 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Mingliang Liu > Attachments: HDFS-9167.000.patch, HDFS-9167.001.patch, > HDFS-9167.002.patch > > > Now that the implementation of the client has been moved to > hadoop-hdfs-client, we should update the poms of other modules in hadoop to > use hdfs-client instead of hdfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953998#comment-14953998 ] Hudson commented on HDFS-9215: -- FAILURE: Integrated in Hadoop-trunk-Commit #8613 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8613/]) Addendum patch for HDFS-9215. (wheat9: rev c60a16fceae258f360f7b382e0a2a1ec9bdccad3) * hadoop-hdfs-project/hadoop-hdfs-native-client/pom.xml > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9184: Attachment: HDFS-9184.004.patch The v4 patch fixes the audit log format, separating the caller context and signature with colon : > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch > > > For a given HDFS operation (e.g. delete file), it's very helpful to track > which upper level job issues it. The upper level callers may be specific > Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode > (NN) is abused/spammed, the operator may want to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the users of the the operation which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g.Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places interconnected > like a tree structure which relies on offline analysis across RPC boundary. > For this use case, {{htrace}} has to be enabled at 100% sampling rate which > introduces significant overhead. Moreover, passing additional information > (via annotations) other than span id from root of the tree to leaf is a > significant additional work. 
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there > are some related discussion on this topic. The final patch implemented the > tracking id as a part of delegation token. This protects the tracking > information from being changed or impersonated. However, kerberos > authenticated connections or insecure connections don't have tokens. > [HADOOP-8779] proposes to use tokens in all the scenarios, but that might > mean changes to several upstream projects and is a major change in their > security implementation. > We propose another approach to address this problem. We also treat HDFS audit > log as a good place for after-the-fact root cause analysis. We propose to put > the caller id (e.g. Hive query id) in threadlocals. Specially, on client side > the threadlocal object is passed to NN as a part of RPC header (optional), > while on sever side NN retrieves it from header and put it to {{Handler}}'s > threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the > caller context for each operation. In this way, the existing code is not > affected. > It is still challenging to keep "lying" client from abusing the caller > context. Our proposal is to add a {{signature}} field to the caller context. > The client choose to provide its signature along with the caller id. The > operator may need to validate the signature at the time of offline analysis. > The NN is not responsible for validating the signature online. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9006) Provide BlockPlacementPolicy that supports upgrade domain
[ https://issues.apache.org/jira/browse/HDFS-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-9006: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 3.0.0 Target Version/s: 3.0.0, 2.8.0 Status: Resolved (was: Patch Available) Nice work, thanks Ming! LGTM, +1. Committed to trunk and branch-2. > Provide BlockPlacementPolicy that supports upgrade domain > - > > Key: HDFS-9006 > URL: https://issues.apache.org/jira/browse/HDFS-9006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.8.0 > > Attachments: HDFS-9006-2.patch, HDFS-9006-3.patch, HDFS-9006.patch > > > As part of the upgrade domain feature, we need to provide the actual upgrade > domain block placement. > Namenode provides a mechanism to specify custom block placement policy. We > can use that to implement BlockPlacementPolicy with upgrade domain support. > {noformat} > <property> > <name>dfs.block.replicator.classname</name> > <value>org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyWithUpgradeDomain</value> > </property> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9167) Update pom.xml in other modules to depend on hdfs-client instead of hdfs
[ https://issues.apache.org/jira/browse/HDFS-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953970#comment-14953970 ] Hadoop QA commented on HDFS-9167: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 25m 25s | Findbugs (version 3.0.0) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 34s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 20s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 4m 56s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 0m 14s | Post-patch findbugs hadoop-client compilation is broken. |
| {color:red}-1{color} | findbugs | 0m 27s | Post-patch findbugs hadoop-dist compilation is broken. |
| {color:red}-1{color} | findbugs | 0m 40s | Post-patch findbugs hadoop-hdfs-project/hadoop-hdfs-nfs compilation is broken. |
| {color:red}-1{color} | findbugs | 0m 53s | Post-patch findbugs hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal compilation is broken. |
| {color:red}-1{color} | findbugs | 1m 7s | Post-patch findbugs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs compilation is broken. |
| {color:red}-1{color} | findbugs | 1m 21s | Post-patch findbugs hadoop-mapreduce-project/hadoop-mapreduce-examples compilation is broken. |
| {color:red}-1{color} | findbugs | 1m 34s | Post-patch findbugs hadoop-tools/hadoop-ant compilation is broken. |
| {color:red}-1{color} | findbugs | 1m 47s | Post-patch findbugs hadoop-tools/hadoop-archives compilation is broken. |
| {color:red}-1{color} | findbugs | 2m 1s | Post-patch findbugs hadoop-tools/hadoop-datajoin compilation is broken. |
| {color:red}-1{color} | findbugs | 2m 15s | Post-patch findbugs hadoop-tools/hadoop-distcp compilation is broken. |
| {color:red}-1{color} | findbugs | 2m 28s | Post-patch findbugs hadoop-tools/hadoop-extras compilation is broken. |
| {color:red}-1{color} | findbugs | 2m 42s | Post-patch findbugs hadoop-tools/hadoop-gridmix compilation is broken. |
| {color:red}-1{color} | findbugs | 2m 55s | Post-patch findbugs hadoop-tools/hadoop-rumen compilation is broken. |
| {color:red}-1{color} | findbugs | 3m 9s | Post-patch findbugs hadoop-tools/hadoop-streaming compilation is broken. |
| {color:green}+1{color} | findbugs | 3m 9s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 0m 10s | Pre-build of native portion |
| {color:green}+1{color} | client tests | 0m 12s | Tests passed in hadoop-client. |
| {color:green}+1{color} | dist tests | 0m 15s | Tests passed in hadoop-dist. |
| {color:green}+1{color} | mapreduce tests | 5m 55s | Tests passed in hadoop-mapreduce-client-hs. |
| {color:green}+1{color} | mapreduce tests | 0m 37s | Tests passed in hadoop-mapreduce-examples. |
| {color:green}+1{color} | tools/hadoop tests | 0m 13s | Tests passed in hadoop-ant. |
| {color:green}+1{color} | tools/hadoop tests | 0m 53s | Tests passed in hadoop-archives. |
| {color:green}+1{color} | tools/hadoop tests | 0m 25s | Tests passed in hadoop-datajoin. |
| {color:green}+1{color} | tools/hadoop tests | 6m 29s | Tests passed in hadoop-distcp. |
| {color:green}+1{color} | tools/hadoop tests | 0m 54s | Tests passed in hadoop-extras. |
| {color:green}+1{color} | tools/hadoop tests | 14m 49s | Tests passed in hadoop-gridmix. |
| {color:green}+1{color} | tools/hadoop tests | 0m 20s | Tests passed in hadoop-rumen. |
| {color:red}-1{color} | tools/hadoop tests | 6m 20s | Tests failed in hadoop-streaming. |
| {color:red}-1{color} | hdfs tests | 1m 32s | Tests failed in hadoop-hdfs-nfs. |
| {color:green}+1{color} | hdfs tests | 2m 47s | Tests passed in bkjournal. |
| | | 96m 25s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.typedbytes.TestTypedBytesWritable |
| | hadoop.hdfs.nfs.nfs3.TestWrites |
| | hadoop.hdfs.nfs.nfs3.TestExportsTable |
| Timed out tests | org.apache.hadoop.streaming.TestMultipleArchiveFiles |
|| Subsystem
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953947#comment-14953947 ] Jitendra Nath Pandey commented on HDFS-8855: +1 for the latest patch. I will commit today, if there are no objections. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, > HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
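The "SoftReference reaping" hypothesis at the end of the report above can be pictured with a toy cache. This is purely illustrative, not the actual webhdfs client code: a connection cache keyed by address holding SoftReferences re-uses entries only while the connection is still strongly reachable; once the references are cleared under memory pressure, every request opens a fresh connection and the old sockets linger until GC, which matches the observed build-up-then-recover pattern.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Toy model of a SoftReference-based connection cache (not the real client).
public class SoftRefCacheDemo {
    static final Map<String, SoftReference<Object>> CACHE = new HashMap<>();

    /** Returns the cached "connection" if still reachable, else opens a new one. */
    static Object getOrOpen(String addr) {
        SoftReference<Object> ref = CACHE.get(addr);
        Object conn = (ref == null) ? null : ref.get();
        if (conn == null) {
            conn = new Object();                     // stands in for a new NN connection
            CACHE.put(addr, new SoftReference<>(conn));
        }
        return conn;
    }

    public static void main(String[] args) {
        Object first = getOrOpen("nn:8020");         // hypothetical NameNode address
        Object second = getOrOpen("nn:8020");
        // Re-used while strongly reachable; under memory pressure the
        // SoftReference would be cleared and a fresh connection opened instead.
        System.out.println(first == second);
    }
}
```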
[jira] [Updated] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9215: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed the addendum patch to branch-2 and trunk. > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8766) Implement a libhdfs(3) compatible API
[ https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953933#comment-14953933 ] Hadoop QA commented on HDFS-8766: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766186/HDFS-8766.HDFS-8707.005.patch | | Optional Tests | javac unit | | git revision | HDFS-8707 / 1b05389 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12940/console | This message was automatically generated. > Implement a libhdfs(3) compatible API > - > > Key: HDFS-8766 > URL: https://issues.apache.org/jira/browse/HDFS-8766 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-8766.HDFS-8707.000.patch, > HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, > HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch, > HDFS-8766.HDFS-8707.005.patch > > > Add a synchronous API that is compatible with the hdfs.h header used in > libhdfs and libhdfs3. This will make it possible for projects using > libhdfs/libhdfs3 to relink against libhdfspp with minimal changes. > This also provides a pure C interface that can be linked against projects > that aren't built in C++11 mode for various reasons but use the same > compiler. It also allows many other programming languages to access > libhdfspp through builtin FFI interfaces. > The libhdfs API is very similar to the posix file API which makes it easier > for programs built using posix filesystem calls to be modified to access HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953935#comment-14953935 ] Haohui Mai commented on HDFS-9215: -- Sorry for the typos in the earlier fixes. Thanks Andrew again for the reviews. > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953910#comment-14953910 ] Hadoop QA commented on HDFS-9215: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 13s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 9m 20s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 3m 40s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 0m 53s | Tests passed in hadoop-hdfs-native-client. 
| | | | 44m 40s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766174/HDFS-9215.addendum.004.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 9849c8b | | hadoop-hdfs-native-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12939/artifact/patchprocess/testrun_hadoop-hdfs-native-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12939/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12939/console | This message was automatically generated. > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long
[ https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9145: Attachment: HDFS-9145.003.patch Thank you [~jingzhao] for your review. The v3 patch addresses the two comments. > Tracking methods that hold FSNamesytemLock for too long > --- > > Key: HDFS-9145 > URL: https://issues.apache.org/jira/browse/HDFS-9145 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Jing Zhao >Assignee: Mingliang Liu > Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, > HDFS-9145.002.patch, HDFS-9145.003.patch > > > It will be helpful that if we can have a way to track (or at least log a msg) > if some operation is holding the FSNamesystem lock for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8766) Implement a libhdfs(3) compatible API
[ https://issues.apache.org/jira/browse/HDFS-8766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-8766: -- Attachment: HDFS-8766.HDFS-8707.005.patch Added basic error-checking tests in gmock. They use the code from hdfs.cc to check the status and determine whether a read failure should return 0 (resource unavailable) or -1 (all other errors). > Implement a libhdfs(3) compatible API > - > > Key: HDFS-8766 > URL: https://issues.apache.org/jira/browse/HDFS-8766 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-8766.HDFS-8707.000.patch, > HDFS-8766.HDFS-8707.001.patch, HDFS-8766.HDFS-8707.002.patch, > HDFS-8766.HDFS-8707.003.patch, HDFS-8766.HDFS-8707.004.patch, > HDFS-8766.HDFS-8707.005.patch > > > Add a synchronous API that is compatible with the hdfs.h header used in > libhdfs and libhdfs3. This will make it possible for projects using > libhdfs/libhdfs3 to relink against libhdfspp with minimal changes. > This also provides a pure C interface that can be linked against projects > that aren't built in C++11 mode for various reasons but use the same > compiler. It also allows many other programming languages to access > libhdfspp through builtin FFI interfaces. > The libhdfs API is very similar to the posix file API which makes it easier > for programs built using posix filesystem calls to be modified to access HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
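The return-code convention described above (0 for a resource-unavailable read failure, -1 for all other errors) can be sketched as a small Java model. This is illustrative only, not the actual hdfs.cc/libhdfspp code; the Status names are assumptions:

```java
// Illustrative model of the libhdfs-style read return codes: on success the
// byte count is returned; on failure, 0 means "resource temporarily
// unavailable, caller may retry" and -1 means any other error.
// The Status enum values are hypothetical, not libhdfspp's actual types.
public class ReadErrorMapping {
    public enum Status { OK, RESOURCE_UNAVAILABLE, INVALID_ARGUMENT, EXCEPTION }

    // Map a completed read's status to the value a libhdfs-compatible
    // hdfsRead would return.
    public static int toReadReturnCode(Status s, int bytesRead) {
        if (s == Status.OK) {
            return bytesRead;
        }
        return (s == Status.RESOURCE_UNAVAILABLE) ? 0 : -1;
    }
}
```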
[jira] [Created] (HDFS-9230) Report space overhead of unfinalized upgrade/rollingUpgrade
Xiaoyu Yao created HDFS-9230: Summary: Report space overhead of unfinalized upgrade/rollingUpgrade Key: HDFS-9230 URL: https://issues.apache.org/jira/browse/HDFS-9230 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Reporter: Xiaoyu Yao DataNodes do not delete block files during upgrades, to allow rollback. This is often confusing to administrators: they sometimes delete files before finalizing the upgrade but don't see the DFS used space decrease. Ideally, HDFS should report the unfinalized upgrade overhead along with its "Upgrade in progress. Not yet finalized." message on the NN UI. Alternatively, this can be improved with a better NN UI message and documentation stating that space won't be reclaimed for deletions until the upgrade is finalized. For a non-rolling upgrade, this is not easy to track because of hard links. Say the NN initialized the upgrade at T1: the block files on DNs that existed before T1 are still under the 'current' directory but are just hard links into the 'previous' directory. When those files are deleted after T1, the space they occupy on the DN won't be reclaimed until the upgrade is finalized. So we need to bookkeep files created before T1 but deleted after T1 as the unfinalized upgrade overhead. For a rolling upgrade, tracking the space overhead is relatively easy since we are not using hard links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
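The bookkeeping idea in the HDFS-9230 description (count blocks that existed before the upgrade start T1 but were deleted afterwards) can be sketched as follows. This is a hypothetical illustration, not Hadoop code; the class and method names are invented for the example:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: blocks whose files existed before the upgrade
// started (T1) are hard-linked into 'previous', so deleting them afterwards
// frees no space until finalization. Summing their sizes gives the
// "unfinalized upgrade overhead" the NN UI could report.
public class UpgradeOverheadTracker {
    private final long upgradeStartTime;                        // T1
    private final Map<Long, Long> preUpgradeBlocks = new HashMap<>(); // blockId -> size
    private long overheadBytes = 0;

    public UpgradeOverheadTracker(long upgradeStartTime) {
        this.upgradeStartTime = upgradeStartTime;
    }

    // Record a block; only blocks that predate T1 can contribute overhead.
    public void blockCreated(long blockId, long size, long createTime) {
        if (createTime < upgradeStartTime) {
            preUpgradeBlocks.put(blockId, size);
        }
    }

    // A deleted pre-T1 block's space stays occupied until finalize.
    public void blockDeleted(long blockId) {
        Long size = preUpgradeBlocks.remove(blockId);
        if (size != null) {
            overheadBytes += size;
        }
    }

    public long getOverheadBytes() { return overheadBytes; }
}
```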
[jira] [Updated] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-9184: Attachment: HDFS-9184.003.patch The failing tests seem unrelated. Since the "cancel patch" and "submit patch" trick did not trigger Jenkins again, I simply rebased the v2 patch on the {{trunk}} branch and submitted it again as v3. Thanks [~jnp] for reviewing the code. > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch, HDFS-9184.003.patch > > > For a given HDFS operation (e.g. delete file), it's very helpful to track > which upper level job issues it. The upper level callers may be specific > Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode > (NN) is abused/spammed, the operator may want to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the users of the the operation which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g.Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places interconnected > like a tree structure which relies on offline analysis across RPC boundary. > For this use case, {{htrace}} has to be enabled at 100% sampling rate which > introduces significant overhead. 
Moreover, passing additional information > (via annotations) other than span id from root of the tree to leaf is a > significant additional work. > 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there > are some related discussion on this topic. The final patch implemented the > tracking id as a part of delegation token. This protects the tracking > information from being changed or impersonated. However, kerberos > authenticated connections or insecure connections don't have tokens. > [HADOOP-8779] proposes to use tokens in all the scenarios, but that might > mean changes to several upstream projects and is a major change in their > security implementation. > We propose another approach to address this problem. We also treat HDFS audit > log as a good place for after-the-fact root cause analysis. We propose to put > the caller id (e.g. Hive query id) in threadlocals. Specially, on client side > the threadlocal object is passed to NN as a part of RPC header (optional), > while on sever side NN retrieves it from header and put it to {{Handler}}'s > threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the > caller context for each operation. In this way, the existing code is not > affected. > It is still challenging to keep "lying" client from abusing the caller > context. Our proposal is to add a {{signature}} field to the caller context. > The client choose to provide its signature along with the caller id. The > operator may need to validate the signature at the time of offline analysis. > The NN is not responsible for validating the signature online. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
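The thread-local caller-context idea proposed above (client sets a context, the RPC layer carries it in the request header, and the NN handler restores it for the audit logger) can be sketched as a minimal Java class. The names below are illustrative, not Hadoop's actual API:

```java
// Minimal sketch of the thread-local caller-context proposal. A real
// implementation would serialize this into the optional RPC header on the
// client and restore it in the server-side Handler's thread-local before
// the audit log entry is written.
public class CallerContext {
    private static final ThreadLocal<CallerContext> CURRENT = new ThreadLocal<>();

    private final String id;         // e.g. a Hive query id ("tracking id")
    private final byte[] signature;  // optional; validated offline, not by the NN

    public CallerContext(String id, byte[] signature) {
        this.id = id;
        this.signature = signature;
    }

    public String getId() { return id; }
    public byte[] getSignature() { return signature; }

    public static void set(CallerContext ctx) { CURRENT.set(ctx); }
    public static CallerContext get() { return CURRENT.get(); }
    public static void clear() { CURRENT.remove(); }
}
```

Because the context lives in a thread-local, existing call signatures are untouched, which matches the "existing code is not affected" point in the proposal.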
[jira] [Commented] (HDFS-7087) Ability to list /.reserved
[ https://issues.apache.org/jira/browse/HDFS-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953815#comment-14953815 ] Xiao Chen commented on HDFS-7087: - Thanks a lot Andrew for the review and comments! I have attached patch 002 to address your comments. {quote} Would appreciate javadoc and comments explaining the new methods, and also rationale for the new fake FileStatuses. {quote} Added to {{FSDirectory#createReservedStatuses}}, please see if they make sense to you. {quote} Any reason you set the sticky bit? {quote} I initially thought we should set it to indicate the dir is not intended for unprivileged deletion. On second thought, this is not needed and confusing, so I removed it. {quote} The isExactReservedName checks, noticed there's no check for rename. This seems brittle in general since we have to make sure it's present in everything besides list. Is the intent of these checks to provide consistent behavior? Do we think this is necessary, or can we simply rely on the existing behavior? {quote} I agree that it's fragile because we have to protect everything besides list. But we can't rely on the existing behavior, because currently it's protected by {{getFileInfo}} while parsing from the shell - before this patch, getFileInfo returned null for /.reserved, so the command didn't get processed and hence never reached FSOp (except mkdirs). Now that we have to enable {{getFileInfo}} in order for ls to work, I think we have to protect the other commands ourselves. Below is a sample stack trace of the getFileInfo call. 
{noformat} at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:80) at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source:-1) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1667) at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1269) at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1265) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1265) at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:62) at org.apache.hadoop.fs.Globber.doGlob(Globber.java:270) at org.apache.hadoop.fs.Globber.glob(Globber.java:149) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1670) at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326) at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:239) at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:222) at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:103) at org.apache.hadoop.fs.shell.Command.run(Command.java:166) at org.apache.hadoop.fs.FsShell.run(FsShell.java:310) at org.apache.hadoop.hdfs.TestDFSShell.runCmd(TestDFSShell.java:971) {noformat} {quote} This enables listing /.reserved/raw/.reserved/, which is a little weird. {quote} Agree and good catch! Fixed. {quote} There are some acrobatics to get the cTime. Is this something we could set during FSImage loading? Would let us avoid the null check. {quote} Good idea, updated. 
> Ability to list /.reserved > -- > > Key: HDFS-7087 > URL: https://issues.apache.org/jira/browse/HDFS-7087 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Andrew Wang >Assignee: Xiao Chen > Attachments: HDFS-7087.001.patch, HDFS-7087.002.patch, > HDFS-7087.draft.patch > > > We have two special paths within /.reserved now, /.reserved/.inodes and > /.reserved/raw. It seems like we should be able to list /.reserved to see > them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7087) Ability to list /.reserved
[ https://issues.apache.org/jira/browse/HDFS-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-7087: Status: Open (was: Patch Available) > Ability to list /.reserved > -- > > Key: HDFS-7087 > URL: https://issues.apache.org/jira/browse/HDFS-7087 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Andrew Wang >Assignee: Xiao Chen > Attachments: HDFS-7087.001.patch, HDFS-7087.002.patch, > HDFS-7087.draft.patch > > > We have two special paths within /.reserved now, /.reserved/.inodes and > /.reserved/raw. It seems like we should be able to list /.reserved to see > them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953786#comment-14953786 ] Andrew Wang commented on HDFS-9215: --- My earlier +1 was pending Jenkins, this is another +1 pending Jenkins. I expect test-patch's RAT check to come back clean. > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7087) Ability to list /.reserved
[ https://issues.apache.org/jira/browse/HDFS-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-7087: Attachment: HDFS-7087.002.patch > Ability to list /.reserved > -- > > Key: HDFS-7087 > URL: https://issues.apache.org/jira/browse/HDFS-7087 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.6.0 >Reporter: Andrew Wang >Assignee: Xiao Chen > Attachments: HDFS-7087.001.patch, HDFS-7087.002.patch, > HDFS-7087.draft.patch > > > We have two special paths within /.reserved now, /.reserved/.inodes and > /.reserved/raw. It seems like we should be able to list /.reserved to see > them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long
[ https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953755#comment-14953755 ] Jing Zhao commented on HDFS-9145: - The patch looks good to me. One minor optional improvement: use a boolean to capture whether we need to log while still holding the lock, and do the actual logging outside the lock. Another nit: in the warning message, we can add a newline before the stack trace. Other than that, +1. > Tracking methods that hold FSNamesytemLock for too long > --- > > Key: HDFS-9145 > URL: https://issues.apache.org/jira/browse/HDFS-9145 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Jing Zhao >Assignee: Mingliang Liu > Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, > HDFS-9145.002.patch > > > It will be helpful that if we can have a way to track (or at least log a msg) > if some operation is holding the FSNamesystem lock for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
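The review suggestion above (decide whether to warn while still holding the lock, but build and emit the warning only after releasing it) can be sketched like this. It is an illustrative sketch with an assumed threshold, not the actual HDFS-9145 patch:

```java
// Sketch of the "boolean first, log out of the lock" pattern: the cheap
// threshold check happens before unlock, and the (potentially slow) message
// formatting/logging happens after, so logging never extends the hold time.
// THRESHOLD_MS is a hypothetical value, not HDFS's configured default.
public class FsLockLogger {
    static final long THRESHOLD_MS = 5000;

    // Checked while the lock is still held.
    public static boolean exceedsThreshold(long heldMs) {
        return heldMs > THRESHOLD_MS;
    }

    // Called after unlock; puts the stack trace on its own line, per the
    // "newline before the stack trace" nit in the review.
    public static String buildWarning(long heldMs, String stackTrace) {
        return "FSNamesystem lock held for " + heldMs + " ms\n" + stackTrace;
    }
}
```

A caller would record the start time before locking, compute `heldMs` in the `finally` block, stash `exceedsThreshold(heldMs)` in a local boolean, unlock, and only then call `buildWarning` and log it.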
[jira] [Updated] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9215: - Status: Patch Available (was: Reopened) > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9215) Suppress the RAT warnings in hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9215: - Attachment: HDFS-9215.addendum.004.patch > Suppress the RAT warnings in hdfs-native-client module > -- > > Key: HDFS-9215 > URL: https://issues.apache.org/jira/browse/HDFS-9215 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-9215.000.patch, HDFS-9215.001.patch, > HDFS-9215.002.patch, HDFS-9215.003.patch, HDFS-9215.addendum.004.patch > > > HDFS-9170 moves the native client implementation to the hdfs-native-client > module. This is a follow-up jira to suppress the RAT warning that was > suppressed in the original hadoop-hdfs module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-9184: --- Status: Patch Available (was: Open) > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch > > > For a given HDFS operation (e.g. delete file), it's very helpful to track > which upper level job issues it. The upper level callers may be specific > Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode > (NN) is abused/spammed, the operator may want to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the users of the the operation which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g.Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places interconnected > like a tree structure which relies on offline analysis across RPC boundary. > For this use case, {{htrace}} has to be enabled at 100% sampling rate which > introduces significant overhead. Moreover, passing additional information > (via annotations) other than span id from root of the tree to leaf is a > significant additional work. > 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there > are some related discussion on this topic. 
The final patch implemented the > tracking id as a part of delegation token. This protects the tracking > information from being changed or impersonated. However, kerberos > authenticated connections or insecure connections don't have tokens. > [HADOOP-8779] proposes to use tokens in all the scenarios, but that might > mean changes to several upstream projects and is a major change in their > security implementation. > We propose another approach to address this problem. We also treat HDFS audit > log as a good place for after-the-fact root cause analysis. We propose to put > the caller id (e.g. Hive query id) in threadlocals. Specially, on client side > the threadlocal object is passed to NN as a part of RPC header (optional), > while on sever side NN retrieves it from header and put it to {{Handler}}'s > threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the > caller context for each operation. In this way, the existing code is not > affected. > It is still challenging to keep "lying" client from abusing the caller > context. Our proposal is to add a {{signature}} field to the caller context. > The client choose to provide its signature along with the caller id. The > operator may need to validate the signature at the time of offline analysis. > The NN is not responsible for validating the signature online. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9139) Enable parallel JUnit tests for HDFS Pre-commit
[ https://issues.apache.org/jira/browse/HDFS-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953734#comment-14953734 ] Haohui Mai commented on HDFS-9139: -- bq. I think its okay to commit current changes and fix further random issues in follow on jiras. Sounds good to me. > Enable parallel JUnit tests for HDFS Pre-commit > > > Key: HDFS-9139 > URL: https://issues.apache.org/jira/browse/HDFS-9139 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Vinayakumar B >Assignee: Vinayakumar B > Attachments: HDFS-9139.01.patch, HDFS-9139.02.patch, > HDFS-9139.03.patch, HDFS-9139.04.patch > > > Forked from HADOOP-11984, > With the initial and significant work from [~cnauroth], this Jira is to track > and support parallel tests' run for HDFS Precommit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9184) Logging HDFS operation's caller context into audit logs
[ https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-9184: --- Status: Open (was: Patch Available) > Logging HDFS operation's caller context into audit logs > --- > > Key: HDFS-9184 > URL: https://issues.apache.org/jira/browse/HDFS-9184 > Project: Hadoop HDFS > Issue Type: Task >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, > HDFS-9184.002.patch > > > For a given HDFS operation (e.g. delete file), it's very helpful to track > which upper level job issues it. The upper level callers may be specific > Oozie tasks, MR jobs, and hive queries. One scenario is that the namenode > (NN) is abused/spammed, the operator may want to know immediately which MR > job should be blamed so that she can kill it. To this end, the caller context > contains at least the application-dependent "tracking id". > There are several existing techniques that may be related to this problem. > 1. Currently the HDFS audit log tracks the users of the the operation which > is obviously not enough. It's common that the same user issues multiple jobs > at the same time. Even for a single top level task, tracking back to a > specific caller in a chain of operations of the whole workflow (e.g.Oozie -> > Hive -> Yarn) is hard, if not impossible. > 2. HDFS integrated {{htrace}} support for providing tracing information > across multiple layers. The span is created in many places interconnected > like a tree structure which relies on offline analysis across RPC boundary. > For this use case, {{htrace}} has to be enabled at 100% sampling rate which > introduces significant overhead. Moreover, passing additional information > (via annotations) other than span id from root of the tree to leaf is a > significant additional work. > 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there > are some related discussion on this topic. 
The final patch implemented the > tracking id as a part of delegation token. This protects the tracking > information from being changed or impersonated. However, kerberos > authenticated connections or insecure connections don't have tokens. > [HADOOP-8779] proposes to use tokens in all the scenarios, but that might > mean changes to several upstream projects and is a major change in their > security implementation. > We propose another approach to address this problem. We also treat HDFS audit > log as a good place for after-the-fact root cause analysis. We propose to put > the caller id (e.g. Hive query id) in threadlocals. Specially, on client side > the threadlocal object is passed to NN as a part of RPC header (optional), > while on sever side NN retrieves it from header and put it to {{Handler}}'s > threadlocals. Finally in {{FSNamesystem}}, HDFS audit logger will record the > caller context for each operation. In this way, the existing code is not > affected. > It is still challenging to keep "lying" client from abusing the caller > context. Our proposal is to add a {{signature}} field to the caller context. > The client choose to provide its signature along with the caller id. The > operator may need to validate the signature at the time of offline analysis. > The NN is not responsible for validating the signature online. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9207) Move the implementation to the hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953725#comment-14953725 ] Haohui Mai commented on HDFS-9207: -- bq. If we can have HDFS-8766 landed before we do the rebase, it will save the re-work on that code that has been kicked around for a month. If James Clampffer can get the gmock stuff for HDFS-8766 done by tomorrow, let's get it reviewed and landed, then land HDFS-9207, then continue to progress on the rest of the outstanding issues. I'm trying to understand why rebasing is an issue at all. It should be a simple {{mv}} command. Am I missing something? I don't think it is a good idea to commit HDFS-8766 with a copy of {{hdfs.h}}. The main motivation of this jira is to avoid duplicating such a copy. bq. I think there is a legitimate need to have the library itself (as opposed to the tests) build without a dependency on the JNI code, but we can leave that as-is for now and make another Jira to capture that work. Just to clarify: the library does not require JNI at all (that is the main purpose of building this library). > Move the implementation to the hdfs-native-client module > > > Key: HDFS-9207 > URL: https://issues.apache.org/jira/browse/HDFS-9207 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9207.000.patch > > > The implementation of libhdfspp should be moved to the new hdfs-native-client > module as HDFS-9170 has landed in trunk and branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly
[ https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953717#comment-14953717 ] Jing Zhao commented on HDFS-1172: - +1 for the 014 patch. I will commit it later today or early tomorrow if no objections. > Blocks in newly completed files are considered under-replicated too quickly > --- > > Key: HDFS-1172 > URL: https://issues.apache.org/jira/browse/HDFS-1172 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 0.21.0 >Reporter: Todd Lipcon >Assignee: Masatake Iwasaki > Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch, > HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.011.patch, > HDFS-1172.012.patch, HDFS-1172.013.patch, HDFS-1172.014.patch, > HDFS-1172.014.patch, HDFS-1172.patch, hdfs-1172.txt, hdfs-1172.txt, > replicateBlocksFUC.patch, replicateBlocksFUC1.patch, replicateBlocksFUC1.patch > > > I've seen this for a long time, and imagine it's a known issue, but couldn't > find an existing JIRA. It often happens that we see the NN schedule > replication on the last block of files very quickly after they're completed, > before the other DNs in the pipeline have a chance to report the new block. > This results in a lot of extra replication work on the cluster, as we > replicate the block and then end up with multiple excess replicas which are > very quickly deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9223) Code cleanup for DatanodeDescriptor and HeartbeatManager
[ https://issues.apache.org/jira/browse/HDFS-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953715#comment-14953715 ] Jing Zhao commented on HDFS-9223: - For {{heartbeatCheck}}, the {{dead}} node object is currently retrieved from the heartbeatManager's own datanode list, and I'm not sure whether this list is always consistent with the datanodeManager's datanodeMap. I will check the code to confirm; maybe we can make this change separately. > Code cleanup for DatanodeDescriptor and HeartbeatManager > > > Key: HDFS-9223 > URL: https://issues.apache.org/jira/browse/HDFS-9223 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-9223.000.patch, HDFS-9223.001.patch > > > Some code cleanup for {{DatanodeDescriptor}} and {{HeartbeatManager}}. The > changes include: > # Change {{DataDescriptor#isAlive}} and {{DatanodeDescriptor#needKeyUpdate}} > from public to private > # Use EnumMap for {{HeartbeatManager#storageTypeStatesMap}} > # Move the {{isInStartupSafeMode}} out of the namesystem lock in > {{heartbeatCheck}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9223) Code cleanup for DatanodeDescriptor and HeartbeatManager
[ https://issues.apache.org/jira/browse/HDFS-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-9223: Attachment: HDFS-9223.001.patch Thanks for the review, Nicholas! Updated the patch to address your comments. Also I moved the stats-related code into a separate class and made it thread-safe. > Code cleanup for DatanodeDescriptor and HeartbeatManager > > > Key: HDFS-9223 > URL: https://issues.apache.org/jira/browse/HDFS-9223 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-9223.000.patch, HDFS-9223.001.patch > > > Some code cleanup for {{DatanodeDescriptor}} and {{HeartbeatManager}}. The > changes include: > # Change {{DatanodeDescriptor#isAlive}} and {{DatanodeDescriptor#needKeyUpdate}} > from public to private > # Use EnumMap for {{HeartbeatManager#storageTypeStatesMap}} > # Move the {{isInStartupSafeMode}} out of the namesystem lock in > {{heartbeatCheck}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9226) MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HDFS-9226: - Attachment: HDFS-9226.004.patch > MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils > - > > Key: HDFS-9226 > URL: https://issues.apache.org/jira/browse/HDFS-9226 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS, test >Reporter: Josh Elser >Assignee: Josh Elser > Attachments: HDFS-9226.001.patch, HDFS-9226.002.patch, > HDFS-9226.003.patch, HDFS-9226.004.patch > > > Noticed a test failure when attempting to run Accumulo unit tests against > 2.8.0-SNAPSHOT: > {noformat} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2421) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2323) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2367) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1529) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:841) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:479) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:438) > at > org.apache.accumulo.start.test.AccumuloDFSBase.miniDfsClusterSetup(AccumuloDFSBase.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) > Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:425) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:358) > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2421) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2323) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2367) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1529) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:841) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:479) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:438) > at > 
org.apache.accumulo.start.test.AccumuloDFSBase.miniDfsClusterSetup(AccumuloDFSBase.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
[jira] [Commented] (HDFS-9119) Discrepancy between edit log tailing interval and RPC timeout for transitionToActive
[ https://issues.apache.org/jira/browse/HDFS-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953672#comment-14953672 ] Hadoop QA commented on HDFS-9119: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 23m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 29s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 0s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 19s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 35s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 35s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 25s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 237m 14s | Tests failed in hadoop-hdfs. 
| | | | 290m 51s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766125/HDFS-9119.00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e617cf6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12933/artifact/patchprocess/patchReleaseAuditProblems.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12933/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12933/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12933/console | This message was automatically generated. > Discrepancy between edit log tailing interval and RPC timeout for > transitionToActive > > > Key: HDFS-9119 > URL: https://issues.apache.org/jira/browse/HDFS-9119 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.7.1 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-9119.00.patch > > > {{EditLogTailer}} on standby NameNode tails edits from active NameNode every > 2 minutes. But the {{transitionToActive}} RPC call has a timeout of 1 minute. > If active NameNode encounters very intensive metadata workload (in > particular, a lot of {{AddOp}} and {{MkDir}} operations to create new files > and directories), the amount of updates accumulated in the 2 mins edit log > tailing interval is hard for the standby NameNode to catch up in the 1 min > timeout window. If that happens, the FailoverController will timeout and give > up trying to transition the standby to active. The old ANN will resume adding > more edits. 
When the SbNN finally finishes catching up on the edits and tries to > become active, it will crash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
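[Editor's note] The mismatch described above lives in two independently tuned settings. A hedged sketch of the relevant configuration follows; the key names are taken from hdfs-default.xml and core-default.xml as we understand them, but defaults vary by release, so verify against your Hadoop version. Tailing more frequently bounds the edit backlog the standby must replay inside the transitionToActive timeout window.

```xml
<!-- hdfs-site.xml: how often the standby's EditLogTailer pulls new edits.
     Lowering this bounds the backlog the standby must replay at failover. -->
<property>
  <name>dfs.ha.tail-edits.period</name>
  <value>60</value> <!-- seconds -->
</property>

<!-- core-site.xml: how long the FailoverController waits for the
     transitionToActive RPC before giving up. -->
<property>
  <name>ha.failover-controller.new-active.rpc-timeout.ms</name>
  <value>60000</value>
</property>
```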
[jira] [Commented] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers
[ https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953652#comment-14953652 ] Zhe Zhang commented on HDFS-9079: - Thanks for the comments Walter! It's a very good point that the current patch doesn't handle failures of the streamer threads. Since the change is already quite large, maybe we can leave that as a separate JIRA, if we at least agree on the basic direction of this JIRA? I'll try to rev the patch to complete the handling of DN failures, and try to add some basic handling of streamer thread failures. I'm currently debugging the patch against {{TestDFSStripedOutputStreamWithFailure}}. I think the logic of allocating multiple genStamps goes against some assumptions in {{runTest}}. Whenever I run a single configuration of the below parameter set the test passes (e.g., if I change {{runTestWithMultipleFailure}} to only test a single entry in {{dnIndexSuite}}). But for multiple configurations it fails. {code} private void runTest(final int length, final int[] killPos, final int[] dnIndex, final boolean tokenExpire) throws Exception { {code} > Erasure coding: preallocate multiple generation stamps and serialize updates > from data streamers > > > Key: HDFS-9079 > URL: https://issues.apache.org/jira/browse/HDFS-9079 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch > > > A non-striped DataStreamer goes through the following steps in error handling: > {code} > 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) > Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) > Updates block on NN > {code} > To simplify the above we can preallocate GS when NN creates a new striped > block group ({{FSN#createNewBlock}}). 
For each new striped block group we can > reserve {{NUM_PARITY_BLOCKS}} GS's. Then steps 1~3 in the above sequence can > be saved. If more than {{NUM_PARITY_BLOCKS}} errors have happened we > shouldn't try to further recover anyway. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
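[Editor's note] The preallocation idea in the description can be sketched as follows. Every name here is hypothetical and not taken from the attached patches: the NN reserves {{NUM_PARITY_BLOCKS}} consecutive generation stamps when a striped block group is created, and a streamer draws the next reserved GS locally on error instead of making a round trip to the NN (saving steps 1~3 above). Once the reservation is exhausted, more failures than parity blocks have occurred and recovery is abandoned.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of preallocating generation stamps for one striped
// block group. The NN would create this at block-group allocation time.
public class ReservedGenStamps {
    static final int NUM_PARITY_BLOCKS = 3; // e.g. an RS-6-3 schema
    private final long firstReserved;       // first GS in the reserved range
    private final AtomicLong next = new AtomicLong();

    ReservedGenStamps(long currentGenStamp) {
        // Reserve [currentGenStamp + 1, currentGenStamp + NUM_PARITY_BLOCKS].
        this.firstReserved = currentGenStamp + 1;
        this.next.set(firstReserved);
    }

    /**
     * Returns the next preallocated GS, or -1 once the reservation is
     * exhausted (more failures than parity blocks: stop recovering).
     * AtomicLong serializes concurrent draws from multiple streamers.
     */
    long nextGenStamp() {
        long gs = next.getAndIncrement();
        return gs < firstReserved + NUM_PARITY_BLOCKS ? gs : -1;
    }

    public static void main(String[] args) {
        ReservedGenStamps r = new ReservedGenStamps(100);
        System.out.println(r.nextGenStamp()); // prints 101
    }
}
```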
[jira] [Commented] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
[ https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953640#comment-14953640 ] Rushabh S Shah commented on HDFS-7916: -- [~yzhangal] {quote}We deployed this fix to one of our clusters and unfortunately the datanodes were still spamming the namenode with the same stack trace as before. We debugged the issue and found out that the DataNodes were receiving StandbyException wrapped in RemoteException. And the patch was checking for StandbyException and not RemoteException. {quote} Initially we were catching specifically StandbyException. At that time we decided not to catch StandbyException in ErrorReportAction. But then we discovered that the namenode was throwing StandbyException wrapped in RemoteException. So we chose to ignore all RemoteExceptions in both classes and just log them as WARN. Hope this helps. > 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for > infinite loop > -- > > Key: HDFS-7916 > URL: https://issues.apache.org/jira/browse/HDFS-7916 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.0 >Reporter: Vinayakumar B >Assignee: Rushabh S Shah >Priority: Critical > Fix For: 2.7.1 > > Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch > > > if any badblock found, then BPSA for StandbyNode will go for infinite times > to report it. 
> {noformat}2015-03-11 19:43:41,528 WARN > org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block > BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: > stobdtserver3/10.224.54.70:18010 > org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed > to report bad block > BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: > at > org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
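[Editor's note] The failure mode Rushabh describes above is worth sketching: the NN throws StandbyException, but it reaches the DN wrapped in a RemoteException, so a catch clause that tests only for StandbyException never fires and the action is requeued forever. The sketch below uses stand-in classes, not the Hadoop IPC types or the actual patch; it only illustrates the chosen remedy of catching the wrapper, logging a WARN, and dropping the action.

```java
import java.io.IOException;

// Hypothetical sketch of the HDFS-7916 handling: catch the wrapped
// exception at the wrapper level and give up instead of retrying forever.
public class ReportBadBlockSketch {
    // Stand-in for org.apache.hadoop.ipc.RemoteException: the client-side
    // wrapper around any exception thrown on the server.
    static class RemoteException extends IOException {
        final String wrappedClassName;
        RemoteException(String wrappedClassName, String msg) {
            super(msg);
            this.wrappedClassName = wrappedClassName;
        }
    }

    static String reportTo(boolean nnIsStandby) {
        try {
            if (nnIsStandby) {
                // What the DN actually sees over IPC: the server-side
                // StandbyException arrives as a RemoteException.
                throw new RemoteException("org.apache.hadoop.ipc.StandbyException",
                    "Operation category WRITE is not supported in state standby");
            }
            return "reported";
        } catch (RemoteException re) {
            // Log and drop the action rather than requeueing it forever.
            return "WARN: dropping report (" + re.wrappedClassName + "): "
                + re.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reportTo(true));
        System.out.println(reportTo(false)); // prints reported
    }
}
```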
[jira] [Commented] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly
[ https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953604#comment-14953604 ] Hadoop QA commented on HDFS-1172: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 21m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 10m 51s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 12m 53s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 20s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 37s | The applied patch generated 1 new checkstyle issues (total was 164, now 163). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 42s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 11s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 46s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 102m 9s | Tests failed in hadoop-hdfs. 
| | | | 158m 43s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 | | Timed out tests | org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults | | | org.apache.hadoop.hdfs.server.namenode.TestNNStorageRetentionFunctional | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12766138/HDFS-1172.014.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / e617cf6 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12934/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12934/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12934/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12934/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12934/console | This message was automatically generated. 
> Blocks in newly completed files are considered under-replicated too quickly > --- > > Key: HDFS-1172 > URL: https://issues.apache.org/jira/browse/HDFS-1172 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 0.21.0 >Reporter: Todd Lipcon >Assignee: Masatake Iwasaki > Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch, > HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.011.patch, > HDFS-1172.012.patch, HDFS-1172.013.patch, HDFS-1172.014.patch, > HDFS-1172.014.patch, HDFS-1172.patch, hdfs-1172.txt, hdfs-1172.txt, > replicateBlocksFUC.patch, replicateBlocksFUC1.patch, replicateBlocksFUC1.patch > > > I've seen this for a long time, and imagine it's a known issue, but couldn't > find an existing JIRA. It often happens that we see the NN schedule > replication on the last block of files very quickly after they're completed, > before the other DNs in the pipeline have a chance to report the new block. > This results in a lot of extra replication work on the cluster, as we > replicate the block and then end up with multiple excess replicas which are > very quickly deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9226) MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils
[ https://issues.apache.org/jira/browse/HDFS-9226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953601#comment-14953601 ] Josh Elser commented on HDFS-9226: -- bq. The class name InternalDataNodeTestUtils is not clear because test artifacts are originally internal. It should be DataNodeMockUtils or something? I am very open to name suggestions. My only concern was naming it something that wasn't specifically tied to Mockito. {{DataNodeMockUtils}} works for me. Let me get a new patch up. Thanks again, Masatake! > MiniDFSCluster leaks dependency Mockito via DataNodeTestUtils > - > > Key: HDFS-9226 > URL: https://issues.apache.org/jira/browse/HDFS-9226 > Project: Hadoop HDFS > Issue Type: Bug > Components: HDFS, test >Reporter: Josh Elser >Assignee: Josh Elser > Attachments: HDFS-9226.001.patch, HDFS-9226.002.patch, > HDFS-9226.003.patch > > > Noticed a test failure when attempting to run Accumulo unit tests against > 2.8.0-SNAPSHOT: > {noformat} > java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2421) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2323) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2367) > at > org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1529) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:841) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:479) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:438) > at > org.apache.accumulo.start.test.AccumuloDFSBase.miniDfsClusterSetup(AccumuloDFSBase.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:606) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) > Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:425) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:358) > at > org.apache.hadoop.hdfs.MiniDFSCluster.shouldWait(MiniDFSCluster.java:2421) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2323) > at > org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2367) > at > 
org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1529) > at > org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:841) > at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:479) > at > org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:438) > at > org.apache.accumulo.start.test.AccumuloDFSBase.miniDfsClusterSetup(AccumuloDFSBase.java:67) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
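[Editor's note] The NoClassDefFoundError above arises because the JVM resolves the types referenced by a class when that class is linked, so a single Mockito-typed helper in a widely used test utility forces every caller to carry the Mockito jar. A minimal illustration of the isolation the patches apply — moving optional-dependency-typed helpers into their own class so the common utility loads cleanly — follows; all names are stand-ins, not the actual HDFS classes.

```java
// Hypothetical sketch of dependency isolation: callers that never touch the
// optional dependency load only the Mockito-free utility class.
public class DependencyIsolationSketch {
    // Mockito-free utility: safe for any caller (e.g. MiniDFSCluster) to load.
    static class DataNodeTestUtilsSketch {
        static boolean isRegistered(Object dn) { return dn != null; }
    }

    // Helpers whose signatures reference the optional dependency would live
    // in a class like this (the patch's InternalDataNodeTestUtils plays this
    // role); only tests that already use Mockito ever trigger loading it.
    static class InternalDataNodeTestUtilsSketch {
        // e.g. static void mockHeartbeat(Object dn, /* Answer<?> */ Object a) { }
    }

    public static void main(String[] args) {
        // Nothing on this path references the optional dependency.
        System.out.println(DataNodeTestUtilsSketch.isRegistered(new Object()));
    }
}
```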
[jira] [Assigned] (HDFS-9117) Config file reader / options classes for libhdfs++
[ https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bob Hansen reassigned HDFS-9117: Assignee: Bob Hansen > Config file reader / options classes for libhdfs++ > -- > > Key: HDFS-9117 > URL: https://issues.apache.org/jira/browse/HDFS-9117 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Affects Versions: HDFS-8707 >Reporter: Bob Hansen >Assignee: Bob Hansen > > For environmental compatibility with HDFS installations, libhdfs++ should be > able to read the configurations from Hadoop XML files and behave in line with > the Java implementation. > Most notably, machine names and ports should be readable from Hadoop XML > configuration files. > Similarly, an internal Options architecture for libhdfs++ should be developed > to efficiently transport the configuration information within the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9207) Move the implementation to the hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953510#comment-14953510 ] Bob Hansen commented on HDFS-9207: -- If we can have HDFS-8766 landed before we do the rebase, it will save the re-work on that code that has been kicked around for a month. If [~James Clampffer] can get the gmock stuff for HDFS-8766 done by tomorrow, let's get it reviewed and landed, then land HDFS-9207, then continue to progress on the rest of the outstanding issues. I think there is a legitimate need to have the library itself (as opposed to the tests) build without a dependency on the JNI code, but we can leave that as-is for now and make another Jira to capture that work. > Move the implementation to the hdfs-native-client module > > > Key: HDFS-9207 > URL: https://issues.apache.org/jira/browse/HDFS-9207 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9207.000.patch > > > The implementation of libhdfspp should be moved to the new hdfs-native-client > module as HDFS-9170 has landed in trunk and branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
[ https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953493#comment-14953493 ] Yongjun Zhang commented on HDFS-7916: - Hi [~shahrs87] and [~vinayrpet], Thanks for your earlier work on this. I saw that the latest patch does handle ErrorReportAction the same way, but there was a comment below: {quote} Vinayakumar B added a comment - 01/Apr/15 15:03 Yes. Rushabh S Shah is right. I had already checked about this. ErrorReportAction doesn't need to be handled. {quote} This is related to [~hitliuyi]'s comment (thanks Yi). It seems reasonable to handle both, but because of the above comment, I have a question: did we discover something new such that we made the change in ErrorReportAction? At least this seems to be a discrepancy between the comments and the patch. Thanks. > 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for > infinite loop > -- > > Key: HDFS-7916 > URL: https://issues.apache.org/jira/browse/HDFS-7916 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.7.0 >Reporter: Vinayakumar B >Assignee: Rushabh S Shah >Priority: Critical > Fix For: 2.7.1 > > Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch > > > if any badblock found, then BPSA for StandbyNode will go for infinite times > to report it. 
> {noformat}2015-03-11 19:43:41,528 WARN > org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block > BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: > stobdtserver3/10.224.54.70:18010 > org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed > to report bad block > BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: > at > org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8941) DistributedFileSystem listCorruptFileBlocks API should resolve relative path
[ https://issues.apache.org/jira/browse/HDFS-8941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953483#comment-14953483 ] Rakesh R commented on HDFS-8941: Thank you [~andrew.wang] for reviewing and committing the patch! > DistributedFileSystem listCorruptFileBlocks API should resolve relative path > > > Key: HDFS-8941 > URL: https://issues.apache.org/jira/browse/HDFS-8941 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Reporter: Rakesh R >Assignee: Rakesh R > Fix For: 2.8.0 > > Attachments: HDFS-8941-00.patch, HDFS-8941-01.patch, > HDFS-8941-02.patch, HDFS-8941-03.patch, HDFS-8941-04.patch > > > Presently {{DFS#listCorruptFileBlocks(path)}} API is not resolving the given > path relative to the workingDir. This jira is to discuss and provide the > implementation of the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks
[ https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953475#comment-14953475 ] Anu Engineer commented on HDFS-4015: Release audit, checkstyle and test failures are not related to this patch > Safemode should count and report orphaned blocks > > > Key: HDFS-4015 > URL: https://issues.apache.org/jira/browse/HDFS-4015 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.0.0 >Reporter: Todd Lipcon >Assignee: Anu Engineer > Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, > HDFS-4015.003.patch, HDFS-4015.004.patch > > > The safemode status currently reports the number of unique reported blocks > compared to the total number of blocks referenced by the namespace. However, > it does not report the inverse: blocks which are reported by datanodes but > not referenced by the namespace. > In the case that an admin accidentally starts up from an old image, this can > be confusing: safemode and fsck will show "corrupt files", which are the > files which actually have been deleted but got resurrected by restarting from > the old image. This will convince them that they can safely force leave > safemode and remove these files -- after all, they know that those files > should really have been deleted. However, they're not aware that leaving > safemode will also unrecoverably delete a bunch of other block files which > have been orphaned due to the namespace rollback. > I'd like to consider reporting something like: "90 of expected 100 > blocks have been reported. Additionally, 1 blocks have been reported > which do not correspond to any file in the namespace. Forcing exit of > safemode will unrecoverably remove those data blocks" > Whether this statistic is also used for some kind of "inverse safe mode" is > the logical next step, but just reporting it as a warning seems easy enough > to accomplish and worth doing. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9207) Move the implementation to the hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953464#comment-14953464 ] Haohui Mai commented on HDFS-9207: -- It'll be done via rebase. bq. Will there be a simple way to disable "include(HadoopJNI)" for builds on machines that don't have JNI headers? I don't think so. I believe that some of the integration tests will eventually require them. You can simply do a ``ldd`` to verify the build artifacts. > Move the implementation to the hdfs-native-client module > > > Key: HDFS-9207 > URL: https://issues.apache.org/jira/browse/HDFS-9207 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9207.000.patch > > > The implementation of libhdfspp should be moved to the new hdfs-native-client > module as HDFS-9170 has landed in trunk and branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9167) Update pom.xml in other modules to depend on hdfs-client instead of hdfs
[ https://issues.apache.org/jira/browse/HDFS-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HDFS-9167:
--------------------------------
    Status: Patch Available  (was: Open)

> Update pom.xml in other modules to depend on hdfs-client instead of hdfs
> ------------------------------------------------------------------------
>
>                 Key: HDFS-9167
>                 URL: https://issues.apache.org/jira/browse/HDFS-9167
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: build
>            Reporter: Haohui Mai
>            Assignee: Mingliang Liu
>         Attachments: HDFS-9167.000.patch, HDFS-9167.001.patch, HDFS-9167.002.patch
>
>
> Now that the implementation of the client has been moved to hadoop-hdfs-client, we should update the poms of other modules in hadoop to use hdfs-client instead of hdfs.
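As a sketch, the pom.xml change described above would look roughly like the following in each dependent module. The groupId/artifactId shown are the expected Hadoop coordinates, but the exact coordinates and any version/scope elements should be checked against the actual patch.

```xml
<!-- Before: depending on the full hadoop-hdfs server module -->
<!--
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
</dependency>
-->
<!-- After: client-side code only needs the slimmer client artifact -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs-client</artifactId>
</dependency>
```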
[jira] [Updated] (HDFS-9167) Update pom.xml in other modules to depend on hdfs-client instead of hdfs
[ https://issues.apache.org/jira/browse/HDFS-9167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mingliang Liu updated HDFS-9167:
--------------------------------
    Status: Open  (was: Patch Available)

> Update pom.xml in other modules to depend on hdfs-client instead of hdfs
> ------------------------------------------------------------------------
>
>                 Key: HDFS-9167
>                 URL: https://issues.apache.org/jira/browse/HDFS-9167
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: build
>            Reporter: Haohui Mai
>            Assignee: Mingliang Liu
>         Attachments: HDFS-9167.000.patch, HDFS-9167.001.patch, HDFS-9167.002.patch
>
>
> Now that the implementation of the client has been moved to hadoop-hdfs-client, we should update the poms of other modules in hadoop to use hdfs-client instead of hdfs.
[jira] [Commented] (HDFS-9207) Move the implementation to the hdfs-native-client module
[ https://issues.apache.org/jira/browse/HDFS-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953424#comment-14953424 ]

James Clampffer commented on HDFS-9207:
---------------------------------------

Looks like a pretty simple patch to me, just a couple of questions:

Are you doing the rebase just to get the directory structure right, or is it to squash out commits? I'd like to avoid squashing HDFS-8707 commits for now just so it's easier for us to keep track of changes and ramp people up. Similarly, is the failure to compile after rebasing just due to relative paths that no longer match up?

Will there be a simple way to disable "include(HadoopJNI)" for builds on machines that don't have JNI headers? It's great for testing, but I want to make sure that this can be built without anything Java related. It doesn't look like anything depends on it now, so it might be a good time to add a "compile-only" flag to make sure that separation is shown in the build system early on.

> Move the implementation to the hdfs-native-client module
> --------------------------------------------------------
>
>                 Key: HDFS-9207
>                 URL: https://issues.apache.org/jira/browse/HDFS-9207
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-9207.000.patch
>
>
> The implementation of libhdfspp should be moved to the new hdfs-native-client module as HDFS-9170 has landed in trunk and branch-2.
[jira] [Commented] (HDFS-9229) Expose size of NameNode directory as a metric
[ https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953388#comment-14953388 ]

Zhe Zhang commented on HDFS-9229:
---------------------------------

[~surendrasingh] Thanks for the interest in this issue! I haven't started any work.

> Expose size of NameNode directory as a metric
> ---------------------------------------------
>
>                 Key: HDFS-9229
>                 URL: https://issues.apache.org/jira/browse/HDFS-9229
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.7.1
>            Reporter: Zhe Zhang
>            Assignee: Surendra Singh Lilhore
>            Priority: Minor
>             Fix For: 2.8.0
>
>
> Useful for admins in reserving / managing NN local file system space. Also useful when transferring NN backups.
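Until such a metric exists, the same number can be approximated from the shell. A minimal sketch, assuming the NameNode metadata directory is at the hypothetical path below (use your configured dfs.namenode.name.dir value):

```shell
#!/bin/sh
# Sketch: report the on-disk size of the NameNode metadata directory.
# NN_DIR is a hypothetical default; override with your actual
# dfs.namenode.name.dir path.
NN_DIR="${NN_DIR:-/var/lib/hadoop-hdfs/namenode}"
if [ -d "$NN_DIR" ]; then
  du -sh "$NN_DIR"   # human-readable total, useful for backup sizing
else
  echo "NameNode dir $NN_DIR not found" >&2
fi
```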
[jira] [Commented] (HDFS-9229) Expose size of NameNode directory as a metric
[ https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953382#comment-14953382 ]

Surendra Singh Lilhore commented on HDFS-9229:
----------------------------------------------

Please feel free to reassign if you have started working on this :)

> Expose size of NameNode directory as a metric
> ---------------------------------------------
>
>                 Key: HDFS-9229
>                 URL: https://issues.apache.org/jira/browse/HDFS-9229
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.7.1
>            Reporter: Zhe Zhang
>            Assignee: Surendra Singh Lilhore
>            Priority: Minor
>             Fix For: 2.8.0
>
>
> Useful for admins in reserving / managing NN local file system space. Also useful when transferring NN backups.
[jira] [Assigned] (HDFS-9229) Expose size of NameNode directory as a metric
[ https://issues.apache.org/jira/browse/HDFS-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Surendra Singh Lilhore reassigned HDFS-9229:
--------------------------------------------
    Assignee: Surendra Singh Lilhore

> Expose size of NameNode directory as a metric
> ---------------------------------------------
>
>                 Key: HDFS-9229
>                 URL: https://issues.apache.org/jira/browse/HDFS-9229
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.7.1
>            Reporter: Zhe Zhang
>            Assignee: Surendra Singh Lilhore
>            Priority: Minor
>             Fix For: 2.8.0
>
>
> Useful for admins in reserving / managing NN local file system space. Also useful when transferring NN backups.
[jira] [Updated] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly
[ https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HDFS-1172:
-----------------------------------
    Attachment: HDFS-1172.014.patch

Attaching the same file again to kick Jenkins.

> Blocks in newly completed files are considered under-replicated too quickly
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-1172
>                 URL: https://issues.apache.org/jira/browse/HDFS-1172
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Masatake Iwasaki
>         Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch, HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.011.patch, HDFS-1172.012.patch, HDFS-1172.013.patch, HDFS-1172.014.patch, HDFS-1172.014.patch, HDFS-1172.patch, hdfs-1172.txt, hdfs-1172.txt, replicateBlocksFUC.patch, replicateBlocksFUC1.patch, replicateBlocksFUC1.patch
>
>
> I've seen this for a long time, and imagine it's a known issue, but couldn't find an existing JIRA. It often happens that we see the NN schedule replication on the last block of files very quickly after they're completed, before the other DNs in the pipeline have a chance to report the new block. This results in a lot of extra replication work on the cluster, as we replicate the block and then end up with multiple excess replicas which are very quickly deleted.