[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot
[ https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215551#comment-15215551 ] Yufei Gu commented on HDFS-9820: Hi Yongjun, thanks a lot for working on it. I have some comments here.
1. Why not use {{getSnapshotDiffReport(target, "", s1)}} to get the diff report, instead of manually twiddling between delete/create?
2. Since this one is also trying to restore HDFS to a previous snapshot, should the test cases cover that? E.g., provide another set of test cases in which the source and target are the same.
3. It would be nice to delete the line {{// syncAndVerify();}} in function {{testSync5}}.
> Improve distcp to support efficient restore to an earlier snapshot
> --
>
> Key: HDFS-9820
> URL: https://issues.apache.org/jira/browse/HDFS-9820
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: distcp
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch
>
> HDFS-4167 intends to restore HDFS to the most recent snapshot; there are
> some complexities and challenges there.
> HDFS-7535 improved distcp performance by avoiding copying files whose names
> changed since the last backup.
> On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data
> from the source to the target cluster by copying only the files that changed
> since the last backup. The way it works is to use the snapshot diff to find
> all changed files, and copy only the changed files.
> See
> https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/
> This jira proposes a variation of HDFS-8828: find the files changed in the
> target cluster since the last snapshot sx, and copy them from the source
> cluster's same snapshot sx, to restore the target cluster to sx.
> If a file/dir is
> - renamed, rename it back
> - created in the target cluster, delete it
> - modified, put it on the copy list
> Then run distcp with the copy list, copying from the source cluster's
> corresponding snapshot.
> This could be a new command line switch, -rdiff, in distcp.
> HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820
> would hopefully be easier to implement.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
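To illustrate the restore pass proposed above, here is a hedged sketch (the class name and helper methods are invented for this example; it assumes the diff is taken on the target from snapshot sx to its current state, denoted here by an empty snapshot name):
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffReportEntry;

public class RdiffSketch {
  /** Undo target-side changes since sx; collect paths to re-copy from the source's sx. */
  static void restoreToSnapshot(DistributedFileSystem targetFs, Path target,
      String sx) throws IOException {
    // Diff on the TARGET, from snapshot sx to its current state.
    SnapshotDiffReport report = targetFs.getSnapshotDiffReport(target, sx, "");
    List<DiffReportEntry> copyList = new ArrayList<>();
    for (DiffReportEntry entry : report.getDiffList()) {
      switch (entry.getType()) {
      case RENAME:                        // renamed since sx: rename it back
        renameBack(targetFs, entry);
        break;
      case CREATE:                        // created after sx: delete it
        deleteFromTarget(targetFs, entry);
        break;
      case MODIFY:                        // modified or deleted since sx:
      case DELETE:                        // re-copy from the source's snapshot sx
        copyList.add(entry);
        break;
      }
    }
    runDistCpFromSourceSnapshot(copyList, sx);
  }

  // Hypothetical helpers, elided.
  static void renameBack(DistributedFileSystem fs, DiffReportEntry e) { }
  static void deleteFromTarget(DistributedFileSystem fs, DiffReportEntry e) { }
  static void runDistCpFromSourceSnapshot(List<DiffReportEntry> l, String sx) { }
}
{code}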
[jira] [Commented] (HDFS-9599) TestDecommissioningStatus.testDecommissionStatus occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215547#comment-15215547 ] Hadoop QA commented on HDFS-9599: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 1s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 53s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 20s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 145m 43s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | | JDK v1.8.0_77 Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795755/HDFS-9599.002.patch | | JIRA Issue | HDFS-9599 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d386b550605c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed
[jira] [Created] (HDFS-10227) BlockManager should decrease blocksScheduled count for timeout replication
Lin Yiqun created HDFS-10227:
Summary: BlockManager should decrease blocksScheduled count for timeout replication
Key: HDFS-10227
URL: https://issues.apache.org/jira/browse/HDFS-10227
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Lin Yiqun
Assignee: Lin Yiqun
In {{BlockManager#processPendingReplications}}, the comment suggests that we could invoke decBlocksScheduled() for timed-out replications.
{code}
  /**
   * If there were any replication requests that timed out, reap them
   * and put them back into the neededReplication queue
   */
  private void processPendingReplications() {
    BlockInfo[] timedOutItems = pendingReplications.getTimedOutBlocks();
    if (timedOutItems != null) {
      namesystem.writeLock();
      try {
        for (int i = 0; i < timedOutItems.length; i++) {
          /*
           * Use the blockinfo from the blocksmap to be certain we're working
           * with the most up-to-date block information (e.g. genstamp).
           */
          BlockInfo bi = blocksMap.getStoredBlock(timedOutItems[i]);
          if (bi == null) {
            continue;
          }
          NumberReplicas num = countNodes(timedOutItems[i]);
          if (isNeededReconstruction(bi, num.liveReplicas())) {
            neededReconstruction.add(bi, num.liveReplicas(),
                num.readOnlyReplicas(), num.decommissionedAndDecommissioning(),
                getReplication(bi));
          }
        }
      } finally {
        namesystem.writeUnlock();
      }
      /* If we know the target datanodes where the replication timedout,
       * we could invoke decBlocksScheduled() on it. Its ok for now. */
    }
  }
{code}
The comment seems right. After the timed-out items are added to {{neededReplications}}, the blocksScheduled count will be increased a second time when these timed-out replications are moved from {{neededReplications}} back to {{pendingReplications}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
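To make that suggestion concrete, a minimal hedged sketch (not a committed patch; {{targetsOfTimedOutItem}} is a hypothetical stand-in, since the current pending item does not remember its chosen targets):
{code}
// Sketch: undo the scheduled-block increment for each target of a timed-out
// replication before the block is re-queued into neededReconstruction, so
// the counter is not bumped a second time when the work is rescheduled.
for (DatanodeDescriptor target : targetsOfTimedOutItem) { // hypothetical
  target.decBlocksScheduled(); // mirrors incBlocksScheduled() at schedule time
}
{code}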
[jira] [Updated] (HDFS-10226) Track and use BlockScheduled size for DatanodeDescriptor instead of count.
[ https://issues.apache.org/jira/browse/HDFS-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-10226:
Description: Tracking a count will result in inaccurate estimation of the remaining space when files with different block sizes are being written. This issue can happen when parallel writes happen with different block sizes.
*For Example:*
Datanode capacity is 10GB, available is 2GB.
ClientA wants to write 2 blocks with block size 1GB.
ClientB wants to write 2 blocks with block size 128MB.
Here ClientB computes the scheduled size as 128MB * 2 = 256MB, so its write succeeds, whereas ClientA's write will fail.
was: Tracking a count will result in inaccurate estimation of the remaining space when files with different block sizes are being written.
*For Example:*
1. Datanode capacity is 10GB, available is 2GB.
2. For NNBench testing, a low block size might be used (such as 1MB), and currently 20 blocks are being written to the DN, so the scheduled counter will be 20.
3. This counter causes no issue for NNBench blocks with a 1MB block size, but for normal files with a 128MB block size the remaining space will be seen as 0, because it is calculated from the current file's block size, not the originally scheduled size: 20 * 128MB = 2.5GB, which is greater than the available space, so the remaining space for a normal block will be 0. Here we'll get the exception "Could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation".
> Track and use BlockScheduled size for DatanodeDescriptor instead of count.
> --
>
> Key: HDFS-10226
> URL: https://issues.apache.org/jira/browse/HDFS-10226
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Brahma Reddy Battula
> Assignee: Brahma Reddy Battula
>
> Tracking a count will result in inaccurate estimation of the remaining space
> when files with different block sizes are being written.
> This issue can happen when parallel writes happen with different block sizes.
> *For Example:*
> Datanode capacity is 10GB, available is 2GB.
> ClientA wants to write 2 blocks with block size 1GB.
> ClientB wants to write 2 blocks with block size 128MB.
> Here ClientB computes the scheduled size as 128MB * 2 = 256MB, so its write
> succeeds, whereas ClientA's write will fail.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
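A back-of-the-envelope sketch of the arithmetic above (illustrative Java only; all names are made up):
{code}
public class ScheduledSizeExample {
  public static void main(String[] args) {
    long availableBytes = 2L << 30;         // 2GB actually free on the DN
    long scheduledByA   = 2 * (1L << 30);   // ClientA: 2 x 1GB   = 2GB
    long scheduledByB   = 2 * (128L << 20); // ClientB: 2 x 128MB = 256MB
    // Count-based view: ClientB converts the count with its OWN block size,
    // sees only 256MB scheduled, and proceeds; ClientA sees 2GB and fails.
    // Size-based view (proposed): one consistent number for everyone.
    long scheduledBytes = scheduledByA + scheduledByB;     // 2.25GB
    System.out.println(scheduledBytes < availableBytes);   // false -> reject early
  }
}
{code}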
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue construction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215482#comment-15215482 ] Hadoop QA commented on HDFS-9478: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 40s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 50s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 27s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 10s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 35s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 1s {color} | {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 38s {color} | {color:red} Patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 113m 33s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.ipc.TestRPCWaitForProxy | | | hadoop.fs.shell.find.TestIname | | | hadoop.fs.shell.find.TestPrint0 | | | hadoop.ipc.TestIPC | | | hadoop.fs.shell.find.TestPrint | | | hadoop.fs.shell.find.TestName | | JDK v1.8.0_74 Timed out junit tests | org.apache.hadoop.util.TestNativeLibraryChecker | | JDK v1.7.0_95 Failed junit tests | hadoop.fs.shell.find.TestIname | | | hadoop.fs.shell.find.TestPrint0 | | | hadoop.fs.shell.find.TestName | | JDK v1.7.0_95 Timed out junit tests | org.apache.hadoop.util.TestNativeLibraryChecker | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA
[jira] [Commented] (HDFS-7060) Avoid taking locks when sending heartbeats from the DataNode
[ https://issues.apache.org/jira/browse/HDFS-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215462#comment-15215462 ] Hadoop QA commented on HDFS-7060: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HDFS-7060 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12708051/HDFS-7060-002.patch | | JIRA Issue | HDFS-7060 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14976/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated.
> Avoid taking locks when sending heartbeats from the DataNode
>
> Key: HDFS-7060
> URL: https://issues.apache.org/jira/browse/HDFS-7060
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haohui Mai
> Assignee: Xinwei Qin
> Labels: BB2015-05-TBR
> Attachments: HDFS-7060-002.patch, HDFS-7060.000.patch, HDFS-7060.001.patch
>
> We're seeing the heartbeat is blocked by the monitor of {{FsDatasetImpl}}
> when the DN is under heavy load of writes:
> {noformat}
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:115)
> - waiting to lock <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:91)
> - locked <0x000780612fd8> (a java.lang.Object)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:563)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:668)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:827)
> at java.lang.Thread.run(Thread.java:744)
>
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:743)
> - waiting to lock <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.run(Thread.java:744)
>
> java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:1006)
> at org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:59)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:244)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:195)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:753)
> - locked <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
[ https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215459#comment-15215459 ] Hadoop QA commented on HDFS-10225: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 21s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 333 unchanged - 0 fixed = 335 total (was 333) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 30s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 109m 31s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 39s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 30s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 238m 14s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.server.namenode.ha.TestEditLogTailer | | | hadoop.hdfs.shortcircuit.TestShortCircuitCache | | | hadoop.hdfs.server.datanode.TestDataNodeMetrics | | | hadoop.hdfs.security.TestDelegationTokenForProxyUser | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.namenode.TestEditLog | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.qjournal.TestSecureNNWithQJM | | | hadoop.hdfs.TestEncryption
[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units
[ https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215449#comment-15215449 ] Chris Douglas commented on HDFS-9847: Have you tried to set units on these variables, then format and bring up a cluster? Unless the unit tests are using a separate path, the patch will break this basic functionality... The deprecation logic should be able to handle migrating variables, but this needs to wrap up. Minimally, all the changes to variables in the main source tree should understand the new type assigned to them here. Please file a followup ticket to complete the work in the test classes deferred from this JIRA.
> HDFS configuration without time unit name should accept friendly time units
> ---
>
> Key: HDFS-9847
> URL: https://issues.apache.org/jira/browse/HDFS-9847
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: 2.7.1
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Attachments: HDFS-9847-branch-2.001.patch,
> HDFS-9847-branch-2.002.patch, HDFS-9847-nothrow.001.patch,
> HDFS-9847-nothrow.002.patch, HDFS-9847.001.patch, HDFS-9847.002.patch,
> HDFS-9847.003.patch, HDFS-9847.004.patch, HDFS-9847.005.patch,
> HDFS-9847.006.patch, branch-2-delta.002.txt, timeduration-w-y.patch
>
> HDFS-9821 talks about the issue of letting existing keys use friendly
> units, e.g. 60s, 5m, 1d, 6w, etc. But some configuration key names contain a
> time unit name, like {{dfs.blockreport.intervalMsec}}, so we can make the
> other configurations, whose names do not contain a time unit name, accept
> friendly time units. The time unit {{seconds}} is frequently used in HDFS,
> so we can update those configurations first.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
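For context, the friendly-unit parsing discussed here rides on {{Configuration.getTimeDuration()}}; a minimal usage sketch (the configuration key below is made up):
{code}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

public class TimeDurationExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Suffixed values such as "30s", "5m", or "1d" are parsed; a bare
    // number is interpreted in the unit passed to getTimeDuration().
    conf.set("dfs.example.interval", "5m"); // made-up key for illustration
    long seconds = conf.getTimeDuration("dfs.example.interval",
        60, TimeUnit.SECONDS);
    System.out.println(seconds); // prints 300
  }
}
{code}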
[jira] [Updated] (HDFS-8356) Document missing properties in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated HDFS-8356: - Attachment: HDFS-8356.011.patch - Clean up whitespace and see how a test re-run looks > Document missing properties in hdfs-default.xml > --- > > Key: HDFS-8356 > URL: https://issues.apache.org/jira/browse/HDFS-8356 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability, test > Attachments: HDFS-8356.001.patch, HDFS-8356.002.patch, > HDFS-8356.003.patch, HDFS-8356.004.patch, HDFS-8356.005.patch, > HDFS-8356.006.patch, HDFS-8356.007.patch, HDFS-8356.008.patch, > HDFS-8356.009.patch, HDFS-8356.010.patch, HDFS-8356.011.patch > > > The following properties are currently not defined in hdfs-default.xml. These > properties should either be > A) documented in hdfs-default.xml OR > B) listed as an exception (with comments, e.g. for internal use) in the > TestHdfsConfigFields unit test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline
[ https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215424#comment-15215424 ] Hadoop QA commented on HDFS-9805: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 264 unchanged - 1 fixed = 265 total (was 265) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 47s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 0s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 22s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 136m 3s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795741/HDFS-9805.002.patch | | JIRA Issue | HDFS-9805 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit
[jira] [Commented] (HDFS-9871) "Bytes Being Moved" -ve(-1 B) when cluster was already balanced.
[ https://issues.apache.org/jira/browse/HDFS-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215420#comment-15215420 ] Hudson commented on HDFS-9871: FAILURE: Integrated in Hadoop-trunk-Commit #9517 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9517/]) HDFS-9871. "Bytes Being Moved" -ve(-1 B) when cluster was already (vinayakumarb: rev 1f004b3367c57de9e8a67040a57efc31c9ba8ee2) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
> "Bytes Being Moved" -ve(-1 B) when cluster was already balanced.
>
> Key: HDFS-9871
> URL: https://issues.apache.org/jira/browse/HDFS-9871
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Brahma Reddy Battula
> Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-9871-002.patch, HDFS-9871.patch
>
> Run the balancer when there are no {{over}}- and {{under}}-utilized nodes.
> {noformat}
> 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: /default-rack/**.120:50076
> 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: /default-rack/**.121:50076
> 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: /default-rack/**.122:50076
> 16/02/29 02:39:41 INFO balancer.Balancer: 0 over-utilized: []
> 16/02/29 02:39:41 INFO balancer.Balancer: 0 underutilized: []
> The cluster is balanced. Exiting...
> Feb 29, 2016 2:40:57 AM 0 0 B 0 B -1 B
> {noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7060) Avoid taking locks when sending heartbeats from the DataNode
[ https://issues.apache.org/jira/browse/HDFS-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215413#comment-15215413 ] John Zhuge commented on HDFS-7060:
If we assume the dataset lock protects {{bpSlices}}, then the following functions access {{bpSlices}} but neither they nor their callers lock the dataset. Did I miss anything? Is this a bug?
{code}
FsVolumeImpl#addBlockPool (this is a write access to bpSlices)
  FsVolumeImpl#addBlockPool
    FsDatasetImpl#addVolume
FsVolumeImpl#getBlockPoolUsed
  FsVolumeList#getBlockPoolUsed
    FsDatasetImpl#getBlockPoolUsed
FsVolumeImpl#getBlockPoolSlice (too many callers, hard to track down!)
FsVolumeImpl#getBlockPoolList
  DirectoryScanner.ReportCompiler#call
FsVolumeImpl#checkDirs
  FsVolumeImpl#checkDirs
    FsDatasetImpl#checkDataDir
...
{code}
A few suggestions regarding lock usage:
* All locks, including intrinsic locks, should include docs that clearly state what resources they protect.
* Locks should be used in the classes where they are defined. It is very hard to reason about them if their usages are scattered around.
If we want to protect {{bpSlices}} with a lock, can we just use the {{bpSlices}} intrinsic lock instead of the dataset lock?
> Avoid taking locks when sending heartbeats from the DataNode
>
> Key: HDFS-7060
> URL: https://issues.apache.org/jira/browse/HDFS-7060
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haohui Mai
> Assignee: Xinwei Qin
> Labels: BB2015-05-TBR
> Attachments: HDFS-7060-002.patch, HDFS-7060.000.patch, HDFS-7060.001.patch
>
> We're seeing the heartbeat is blocked by the monitor of {{FsDatasetImpl}}
> when the DN is under heavy load of writes:
> {noformat}
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:115)
> - waiting to lock <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:91)
> - locked <0x000780612fd8> (a java.lang.Object)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:563)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:668)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:827)
> at java.lang.Thread.run(Thread.java:744)
>
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:743)
> - waiting to lock <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
> at java.lang.Thread.run(Thread.java:744)
>
> java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:1006)
> at org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:59)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:244)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:195)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:753)
> - locked <0x000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
> at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
> at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at org.apache.hado
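To make the intrinsic-lock idea concrete, a simplified hedged sketch (stand-in types, not the real {{FsVolumeImpl}}):
{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class VolumeSketch {
  /** Guarded by its own monitor; never by the coarse dataset lock. */
  private final Map<String, BlockPoolSliceSketch> bpSlices = new HashMap<>();

  long getBlockPoolUsed(String bpid) throws IOException {
    synchronized (bpSlices) { // reader: heartbeat path
      BlockPoolSliceSketch s = bpSlices.get(bpid);
      if (s == null) {
        throw new IOException("block pool " + bpid + " is not found");
      }
      return s.getDfsUsed();
    }
  }

  void addBlockPool(String bpid, BlockPoolSliceSketch s) {
    synchronized (bpSlices) { // writer
      bpSlices.put(bpid, s);
    }
  }

  /** Stand-in for the real BlockPoolSlice. */
  static class BlockPoolSliceSketch {
    long getDfsUsed() { return 0L; }
  }
}
{code}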
[jira] [Commented] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
[ https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215404#comment-15215404 ] Hadoop QA commented on HDFS-10225: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 332 unchanged - 0 fixed = 334 total (was 332) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 50s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 12s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 22s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 136m 48s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl | | | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795742/HDFS-10225.000.patch | | JIRA Issue | HDFS-10225 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 9fd31ef7
[jira] [Created] (HDFS-10226) Track and use BlockScheduled size for DatanodeDescriptor instead of count.
Brahma Reddy Battula created HDFS-10226:
Summary: Track and use BlockScheduled size for DatanodeDescriptor instead of count.
Key: HDFS-10226
URL: https://issues.apache.org/jira/browse/HDFS-10226
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Tracking a count will result in inaccurate estimation of the remaining space when files with different block sizes are being written.
*For Example:*
1. Datanode capacity is 10GB, available is 2GB.
2. For NNBench testing, a low block size might be used (such as 1MB), and currently 20 blocks are being written to the DN, so the scheduled counter will be 20.
3. This counter causes no issue for NNBench blocks with a 1MB block size, but for normal files with a 128MB block size the remaining space will be seen as 0, because it is calculated from the current file's block size, not the originally scheduled size: 20 * 128MB = 2.5GB, which is greater than the available space, so the remaining space for a normal block will be 0. Here we'll get the exception "Could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation".
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8356) Document missing properties in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215387#comment-15215387 ] Hadoop QA commented on HDFS-8356: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 43s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 48s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 59s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. 
{color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 46s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 18s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 25s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 27s {color} | {color:red} hadoop-common in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 2s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 26s {color} | {color:red} Patch generated 3 ASF License warnings. {color} | | {color:black}{color}
[jira] [Commented] (HDFS-5177) blocksScheduled count should be decremented for abandoned blocks
[ https://issues.apache.org/jira/browse/HDFS-5177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215385#comment-15215385 ] Vinayakumar B commented on HDFS-5177: The test failures are not related.
> blocksScheduled count should be decremented for abandoned blocks
> -
>
> Key: HDFS-5177
> URL: https://issues.apache.org/jira/browse/HDFS-5177
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.0.0, 2.1.0-beta
> Reporter: Vinayakumar B
> Assignee: Vinayakumar B
> Attachments: HDFS-5177-04.patch, HDFS-5177.patch, HDFS-5177.patch, HDFS-5177.patch
>
> DatanodeDescriptor#incBlocksScheduled() will be called for all datanodes of
> the block on each allocation, but the same should be decremented for
> abandoned blocks.
> When one of the datanodes is down and is allocated for the block along with
> other live datanodes, the block will be abandoned, but the scheduled count
> will make the other live datanodes look loaded, while in reality they may
> not be.
> Anyway, this scheduled count will be rolled every 20 mins.
> The problem arises if the rate of file creation is high: due to the inflated
> scheduled count, there is a chance of missing the local datanode to write
> to, and sometimes writes can even fail in small clusters.
> So we need to decrement the unnecessary count in the abandonBlock call.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
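A hedged sketch of that decrement (simplified; method names are approximate and the surrounding NameNode plumbing is omitted):
{code}
// Sketch: when a client abandons a block, undo the per-datanode scheduled
// increment made at allocation time, instead of waiting up to ~20 minutes
// for the counter roll to correct it.
void onAbandonBlock(DatanodeStorageInfo[] chosenTargets) {
  for (DatanodeStorageInfo storage : chosenTargets) {
    // Mirrors DatanodeDescriptor#incBlocksScheduled() from allocation.
    storage.getDatanodeDescriptor().decrementBlocksScheduled(
        storage.getStorageType());
  }
}
{code}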
[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215380#comment-15215380 ] Hadoop QA commented on HDFS-9349: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 201 unchanged - 0 fixed = 202 total (was 201) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 2s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 12s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 167m 24s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | | JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush | | | hadoop.cli.TestHDFSCLI | | | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | | hadoop.hdfs.server.balancer.TestBalancer | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795737/HDFS-9349-HDFS-9000.008.patch | | JIRA Issue | HDFS-9349 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite
[jira] [Commented] (HDFS-9885) In distcp cmd output, Display name should be given for org.apache.hadoop.tools.mapred.CopyMapper$Counter.
[ https://issues.apache.org/jira/browse/HDFS-9885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215376#comment-15215376 ] Surendra Singh Lilhore commented on HDFS-9885: -- Hi [~yzhangal], can you please review? > In distcp cmd output, Display name should be given for > org.apache.hadoop.tools.mapred.CopyMapper$Counter. > > > Key: HDFS-9885 > URL: https://issues.apache.org/jira/browse/HDFS-9885 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Affects Versions: 2.7.1 >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Minor > Attachments: HDFS-9885.001.patch, HDFS-9885.002.patch > > > In distcp cmd output, > hadoop distcp hdfs://NN1:port/file1 hdfs://NN2:port/file2 > 16/02/29 07:05:55 INFO tools.DistCp: DistCp job-id: job_1456729398560_0002 > 16/02/29 07:05:55 INFO mapreduce.Job: Running job: job_1456729398560_0002 > 16/02/29 07:06:01 INFO mapreduce.Job: Job job_1456729398560_0002 running in > uber mode : false > 16/02/29 07:06:01 INFO mapreduce.Job: map 0% reduce 0% > 16/02/29 07:06:06 INFO mapreduce.Job: map 100% reduce 0% > 16/02/29 07:06:07 INFO mapreduce.Job: Job job_1456729398560_0002 completed > successfully > ... > ... > File Input Format Counters > Bytes Read=212 > File Output Format Counters > Bytes Written=0{color:red} > org.apache.hadoop.tools.mapred.CopyMapper$Counter > {color} > BANDWIDTH_IN_BYTES=12418 > BYTESCOPIED=12418 > BYTESEXPECTED=12418 > COPY=1 > Expected: > A display name can be shown instead of > {color:red}"org.apache.hadoop.tools.mapred.CopyMapper$Counter"{color} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
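For HDFS-9885, one plausible shape for such a fix, assuming MapReduce's standard counter ResourceBundle mechanism (the file name and display strings below are illustrative, not the committed patch): a properties file on the classpath next to the counter enum supplies a group display name and per-counter names.
{code}
# Hypothetical bundle: org/apache/hadoop/tools/mapred/CopyMapper$Counter.properties
# The group display name replaces the raw enum class name in job output.
CounterGroupName=DistCp Counters
# Per-counter display names, keyed as <ENUM_CONSTANT>.name
BYTESCOPIED.name=Bytes Copied
BYTESEXPECTED.name=Bytes Expected
BANDWIDTH_IN_BYTES.name=Bandwidth in Bytes
COPY.name=Files Copied
{code}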
[jira] [Commented] (HDFS-9871) "Bytes Being Moved" -ve(-1 B) when cluster was already balanced.
[ https://issues.apache.org/jira/browse/HDFS-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215366#comment-15215366 ] Brahma Reddy Battula commented on HDFS-9871: Thanks [~vinayrpet] for the review and commit, and thanks to the others for reviewing. > "Bytes Being Moved" -ve(-1 B) when cluster was already balanced. > > > Key: HDFS-9871 > URL: https://issues.apache.org/jira/browse/HDFS-9871 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-9871-002.patch, HDFS-9871.patch > > > Run the balancer when there are no {{over}}- or {{under}}-utilized nodes. > {noformat} > 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: > /default-rack/**.120:50076 > 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: > /default-rack/**.121:50076 > 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: > /default-rack/**.122:50076 > 16/02/29 02:39:41 INFO balancer.Balancer: 0 over-utilized: [] > 16/02/29 02:39:41 INFO balancer.Balancer: 0 underutilized: [] > The cluster is balanced. Exiting... > Feb 29, 2016 2:40:57 AM 0 0 B 0 B > -1 B > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue construction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215356#comment-15215356 ] Arpit Agarwal commented on HDFS-9478: - +1 pending Jenkins. > Reason for failing ipc.FairCallQueue construction should be thrown > - > > Key: HDFS-9478 > URL: https://issues.apache.org/jira/browse/HDFS-9478 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Ajith S >Priority: Minor > Attachments: HDFS-9478.2.patch, HDFS-9478.3.patch, HDFS-9478.patch > > > When FairCallQueue construction fails, the NN fails to start, throwing a > RuntimeException without giving any reason why it fails. > 2015-11-30 17:45:26,661 INFO org.apache.hadoop.ipc.FairCallQueue: > FairCallQueue is in use with 4 queues. > 2015-11-30 17:45:26,665 DEBUG org.apache.hadoop.metrics2.util.MBeans: > Registered Hadoop:service=ipc.65110,name=DecayRpcScheduler > 2015-11-30 17:45:26,666 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode. > java.lang.RuntimeException: org.apache.hadoop.ipc.FairCallQueue could not be > constructed. > at > org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:96) > at org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:55) > at org.apache.hadoop.ipc.Server.(Server.java:2241) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:942) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:534) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:784) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:346) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:750) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:687) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:889) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:872) > Example: the reason for the above failure could have been: > 1. the weights were not equal to the number of queues configured. > 2. decay-scheduler.thresholds not in sync with number of queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
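A minimal sketch of the direction such a fix could take (constructor arity simplified, names illustrative, not the committed patch): chain the underlying cause into the RuntimeException so the NameNode log shows why FairCallQueue construction failed.
{code}
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.util.concurrent.BlockingQueue;
import org.apache.hadoop.conf.Configuration;

class CallQueueSketch {
  // Reflectively build the configured queue, preserving the real failure
  // (e.g. weights not matching the queue count) as the exception cause.
  static <E> BlockingQueue<E> create(Class<? extends BlockingQueue<E>> clazz,
      int maxQueueSize, String namespace, Configuration conf) {
    try {
      Constructor<? extends BlockingQueue<E>> ctor =
          clazz.getDeclaredConstructor(int.class, String.class, Configuration.class);
      return ctor.newInstance(maxQueueSize, namespace, conf);
    } catch (InvocationTargetException e) {
      // Surface the constructor's own exception instead of swallowing it.
      throw new RuntimeException(
          clazz.getName() + " could not be constructed.", e.getCause());
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(
          clazz.getName() + " could not be constructed.", e);
    }
  }
}
{code}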
[jira] [Updated] (HDFS-9871) "Bytes Being Moved" -ve(-1 B) when cluster was already balanced.
[ https://issues.apache.org/jira/browse/HDFS-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-9871: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Committed to branch-2.8 and above. Thanks [~brahmareddy] for the contribution. Thanks [~szetszwo], [~ajisakaa] and [~shahrs87]. > "Bytes Being Moved" -ve(-1 B) when cluster was already balanced. > > > Key: HDFS-9871 > URL: https://issues.apache.org/jira/browse/HDFS-9871 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: HDFS-9871-002.patch, HDFS-9871.patch > > > Run the balancer when there are no {{over}}- or {{under}}-utilized nodes. > {noformat} > 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: > /default-rack/**.120:50076 > 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: > /default-rack/**.121:50076 > 16/02/29 02:39:40 INFO net.NetworkTopology: Adding a new node: > /default-rack/**.122:50076 > 16/02/29 02:39:41 INFO balancer.Balancer: 0 over-utilized: [] > 16/02/29 02:39:41 INFO balancer.Balancer: 0 underutilized: [] > The cluster is balanced. Exiting... > Feb 29, 2016 2:40:57 AM 0 0 B 0 B > -1 B > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
[ https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215302#comment-15215302 ] Brahma Reddy Battula commented on HDFS-9579: OK, you can handle it as part of HDFS-10208. > Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level > - > > Key: HDFS-9579 > URL: https://issues.apache.org/jira/browse/HDFS-9579 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.9.0 > > Attachments: HDFS-9579-10.patch, HDFS-9579-2.patch, > HDFS-9579-3.patch, HDFS-9579-4.patch, HDFS-9579-5.patch, HDFS-9579-6.patch, > HDFS-9579-7.patch, HDFS-9579-8.patch, HDFS-9579-9.patch, > HDFS-9579-branch-2.patch, HDFS-9579.patch, MR job counters.png > > > For cross-DC distcp or other applications, it is useful to have insight > into the traffic volume for each network distance, to distinguish cross-DC > traffic, local-DC-remote-rack traffic, etc. > FileSystem's existing {{bytesRead}} metric tracks all the bytes read. To > provide additional metrics for each network distance, we can add metrics > at the FileSystem level and have {{DFSInputStream}} update the values > based on the network distance between the client and the datanode. > {{DFSClient}} will resolve the client machine's network location as part of its > initialization. It doesn't need to resolve the datanode's network location for > each read as {{DatanodeInfo}} already has the info. > There are existing HDFS-specific metrics such as {{ReadStatistics}} and > {{DFSHedgedReadMetrics}}, but these metrics are only accessible via > {{DFSClient}} or {{DFSInputStream}}, not something that application frameworks > such as MR and Tez can get to. That is the benefit of storing these new > metrics in FileSystem.Statistics. > This jira only includes metrics generation by HDFS. The consumption of these > metrics at MR and Tez will be tracked by separate jiras. > We can add similar metrics for the HDFS write scenario later if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
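A minimal, self-contained sketch of the bookkeeping HDFS-9579 describes (names illustrative, not the committed API): bytes read are bucketed by the topology distance between client and datanode, where 0 is node-local, 2 is same rack and 4+ is off rack in a three-level topology.
{code}
import java.util.concurrent.atomic.AtomicLongArray;

class BytesReadByDistance {
  // One bucket per distance; a real implementation would size this from the
  // topology depth rather than hard-coding eight levels.
  private final AtomicLongArray bytesByDistance = new AtomicLongArray(8);

  // Called from the read path with the precomputed client<->datanode distance.
  void increment(int distance, long bytes) {
    bytesByDistance.addAndGet(Math.min(distance, 7), bytes);
  }

  long get(int distance) {
    return bytesByDistance.get(distance);
  }
}
{code}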
[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf
[ https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhouyingchao updated HDFS-10182: Attachment: HDFS-10182-branch26.patch Patch for branch-2.6. > Hedged read might overwrite user's buf > -- > > Key: HDFS-10182 > URL: https://issues.apache.org/jira/browse/HDFS-10182 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: zhouyingchao >Assignee: zhouyingchao > Fix For: 2.7.3 > > Attachments: HDFS-10182-001.patch, HDFS-10182-branch26.patch > > > In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the > passed-in buf from the caller is passed to another thread to fill. If the > first attempt times out, a second attempt is issued with another temp > ByteBuffer. Now suppose the second attempt wins and the first attempt is > blocked somewhere in the IO path. The second attempt's result would be > copied to the buf provided by the caller and then the caller would think the > pread is all set. Later the caller might use the buf to do something else > (e.g. read another chunk of data); however, the first attempt from the earlier > hedgedFetchBlockByteRange might get some data and fill it into the buf ... > If this happens, the caller's buf would then be corrupted. > To fix the issue, we should allocate a temp buf for the first attempt too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
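A conceptual sketch of the proposed fix for HDFS-10182 (hypothetical helper names, not the exact patch): every hedged attempt, including the first, fills its own scratch buffer, and only the winner's bytes are copied into the caller's array, exactly once.
{code}
import java.nio.ByteBuffer;
import java.util.concurrent.*;

class HedgedReadSketch {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  // Each Callable fills and returns its own ByteBuffer; a late-finishing
  // loser can only scribble on its private buffer, never on userBuf.
  void hedgedRead(byte[] userBuf, int offset, int len,
      Callable<ByteBuffer> firstAttempt, Callable<ByteBuffer> secondAttempt)
      throws InterruptedException, ExecutionException {
    CompletionService<ByteBuffer> cs = new ExecutorCompletionService<>(pool);
    cs.submit(firstAttempt);              // was: wrapping the caller's buf
    cs.submit(secondAttempt);             // already used a temp buffer before
    ByteBuffer winner = cs.take().get();  // first attempt to finish wins
    winner.flip();
    winner.get(userBuf, offset, len);     // the lone write into caller's buf
  }
}
{code}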
[jira] [Updated] (HDFS-9478) Reason for failing ipc.FairCallQueue construction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajith S updated HDFS-9478: -- Attachment: HDFS-9478.3.patch Thanks for the input, I have updated the patch per the review comments. > Reason for failing ipc.FairCallQueue construction should be thrown > - > > Key: HDFS-9478 > URL: https://issues.apache.org/jira/browse/HDFS-9478 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Ajith S >Priority: Minor > Attachments: HDFS-9478.2.patch, HDFS-9478.3.patch, HDFS-9478.patch > > > When FairCallQueue construction fails, the NN fails to start, throwing a > RuntimeException without giving any reason why it fails. > 2015-11-30 17:45:26,661 INFO org.apache.hadoop.ipc.FairCallQueue: > FairCallQueue is in use with 4 queues. > 2015-11-30 17:45:26,665 DEBUG org.apache.hadoop.metrics2.util.MBeans: > Registered Hadoop:service=ipc.65110,name=DecayRpcScheduler > 2015-11-30 17:45:26,666 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode. > java.lang.RuntimeException: org.apache.hadoop.ipc.FairCallQueue could not be > constructed. > at > org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:96) > at org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:55) > at org.apache.hadoop.ipc.Server.(Server.java:2241) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:942) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:534) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:784) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:346) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:750) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:687) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:889) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:872) > Example: the reason for the above failure could have been: > 1. the weights were not equal to the number of queues configured. > 2. decay-scheduler.thresholds not in sync with number of queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9599) TestDecommissioningStatus.testDecommissionStatus occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Yiqun updated HDFS-9599: Attachment: HDFS-9599.002.patch Thanks [~jojochuang] for the comments. Updated the patch to address them. > TestDecommissioningStatus.testDecommissionStatus occasionally fails > --- > > Key: HDFS-9599 > URL: https://issues.apache.org/jira/browse/HDFS-9599 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Lin Yiqun > Attachments: HDFS-9599.001.patch, HDFS-9599.002.patch > > > From the test result of a recent jenkins nightly > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2663/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestDecommissioningStatus/testDecommissionStatus/ > The test failed because the number of under-replicated blocks is 4, instead > of 3. > Looking at the log, there is a stray block, which might have caused the > failure: > {noformat} > 2015-12-23 00:42:05,820 [Block report processor] INFO BlockStateChange > (BlockManager.java:processReport(2131)) - BLOCK* processReport: > blk_1073741825_1001 on node 127.0.0.1:57382 size 16384 does not belong to any > file > {noformat} > The block size 16384 suggests this is left over from the sibling test case > testDecommissionStatusAfterDNRestart. This can happen because the same > minidfs cluster is reused between tests. > The test implementation should do a better job isolating tests. > Another case of failure is when the load factor comes into play, and a block > cannot find sufficient datanodes to place a replica. In this test, the > runtime should not consider the load factor: > {noformat} > conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, > false); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
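A minimal sketch of the isolation idea (a test-class fragment, assuming JUnit 4 and the usual MiniDFSCluster imports): each test gets a fresh cluster, so blocks left behind by a sibling test cannot skew the under-replication counts, and the load factor is disabled as the description suggests.
{code}
@Before
public void setUp() throws IOException {
  conf = new HdfsConfiguration();
  // Don't let datanode load factor into replica placement during the test.
  conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, false);
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(2).build();
  cluster.waitActive();
}

@After
public void tearDown() throws IOException {
  if (cluster != null) {
    cluster.shutdown();  // no state leaks into the next test
  }
}
{code}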
[jira] [Commented] (HDFS-7651) [ NN Bench ] Refactor nnbench as a Tool implementation.
[ https://issues.apache.org/jira/browse/HDFS-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215272#comment-15215272 ] Brahma Reddy Battula commented on HDFS-7651: bq.Patch generated 17 ASF License warnings. Only one is related to this jira; I fixed it and uploaded the patch. The rest are covered in MAPREDUCE-6662. The test failures are unrelated. > [ NN Bench ] Refactor nnbench as a Tool implementation. > --- > > Key: HDFS-7651 > URL: https://issues.apache.org/jira/browse/HDFS-7651 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-7651-001.patch, HDFS-7651-002.patch, > HDFS-7651-003.patch, HDFS-7651-004.patch > > > {code} > public class NNBench { > private static final Log LOG = LogFactory.getLog( > "org.apache.hadoop.hdfs.NNBench"); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
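The refactoring direction in HDFS-7651, sketched minimally (illustrative, not the final patch): NNBench extends Configured and implements Tool, so ToolRunner parses the generic options (-D, -conf, -fs, ...) and injects the Configuration instead of the benchmark creating one ad hoc.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class NNBench extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // parse benchmark-specific arguments using getConf(), then run the job
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new NNBench(), args));
  }
}
{code}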
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215270#comment-15215270 ] Jitendra Nath Pandey commented on HDFS-10175: - This proposal is a very simple and lightweight way to collect statistics for various file system operations, compared to HTrace, which is significantly more complicated to set up and operate. The counters in mapreduce have been in use for a long time and are very useful for gathering job-level information, which is critical for analyzing job behavior. The proposed map has only 48 entries. Being an enum map, it will be significantly optimized. I think this is a very simple patch and good low-hanging fruit for more detailed analysis of job behavior. > add per-operation stats to FileSystem.Statistics > > > Key: HDFS-10175 > URL: https://issues.apache.org/jira/browse/HDFS-10175 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Ram Venkatesh >Assignee: Mingliang Liu > Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, > HDFS-10175.002.patch, HDFS-10175.003.patch > > > Currently FileSystem.Statistics exposes the following statistics: > BytesRead > BytesWritten > ReadOps > LargeReadOps > WriteOps > These are in turn exposed as job counters by MapReduce and other frameworks. > The logic within DfsClient that maps operations to these counters can > be confusing; for instance, mkdirs counts as a writeOp. > Proposed enhancement: > Add a statistic for each DfsClient operation including create, append, > createSymlink, delete, exists, mkdirs, rename and expose them as new > properties on the Statistics object. The operation-specific counters can be > used for analyzing the load imposed by a particular job on HDFS. > For example, we can use them to identify jobs that end up creating a large > number of files. > Once this information is available in the Statistics object, the app > frameworks like MapReduce can expose them as additional counters to be > aggregated and recorded as part of job summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
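A hypothetical sketch of the enum-map bookkeeping being argued over (names illustrative, not the patch under review): one AtomicLong per operation in an EnumMap, so the hot-path increment is an array-indexed lookup plus an atomic add.
{code}
import java.util.EnumMap;
import java.util.concurrent.atomic.AtomicLong;

class PerOpStatistics {
  enum Op { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME }

  private final EnumMap<Op, AtomicLong> counts = new EnumMap<>(Op.class);

  PerOpStatistics() {
    // Pre-populate so increment() never mutates the map structure.
    for (Op op : Op.values()) {
      counts.put(op, new AtomicLong());
    }
  }

  void increment(Op op) {
    counts.get(op).incrementAndGet();
  }

  long get(Op op) {
    return counts.get(op).get();
  }
}
{code}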
[jira] [Updated] (HDFS-7651) [ NN Bench ] Refactor nnbench as a Tool implementation.
[ https://issues.apache.org/jira/browse/HDFS-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-7651: --- Attachment: HDFS-7651-004.patch > [ NN Bench ] Refactor nnbench as a Tool implementation. > --- > > Key: HDFS-7651 > URL: https://issues.apache.org/jira/browse/HDFS-7651 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula > Attachments: HDFS-7651-001.patch, HDFS-7651-002.patch, > HDFS-7651-003.patch, HDFS-7651-004.patch > > > {code} > public class NNBench { > private static final Log LOG = LogFactory.getLog( > "org.apache.hadoop.hdfs.NNBench"); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10213) there is a corrupt/missing block report, but fsck result is HEALTHY
[ https://issues.apache.org/jira/browse/HDFS-10213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215264#comment-15215264 ] Wataru Yukawa commented on HDFS-10213: -- Thank you for the comment. I'll check it out, but I think this is a strange situation: there is a corrupt/missing block report, yet a few minutes later there is no corrupt/missing block report. It's weird... > there is a corrupt/missing block report, but fsck result is HEALTHY > --- > > Key: HDFS-10213 > URL: https://issues.apache.org/jira/browse/HDFS-10213 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.7.1 > Environment: HDP2.4.0 > CentOS6 > JDK1.8 >Reporter: Wataru Yukawa > > I monitor HDFS by checking http://namenode.host:port/jmx > Sometimes there is a corrupt/missing block report, but the fsck result is > HEALTHY. > https://gyazo.com/00988be63b6b1e910994ae26e8519b4d > It's weird; I am confused... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10197) TestFsDatasetCache failing intermittently due to timeout
[ https://issues.apache.org/jira/browse/HDFS-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215232#comment-15215232 ] Lin Yiqun commented on HDFS-10197: -- Thanks [~andrew.wang] for reviewing the patch. I think we can commit v002 first. I'm not sure that reusing the minidfs cluster is the main reason {{TestFsDatasetCache}} times out. If the timeout still happens, I'm glad to do further optimization. > TestFsDatasetCache failing intermittently due to timeout > > > Key: HDFS-10197 > URL: https://issues.apache.org/jira/browse/HDFS-10197 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-10197.001.patch, HDFS-10197.002.patch > > > In {{TestFsDatasetCache}}, the unit tests sometimes fail. I collected some > failure reasons from recent jenkins reports. They are all timeout errors. > {code} > Tests in error: > TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 ? Timeout Timed out > wait... > TestFsDatasetCache.tearDown:149 ? Timeout Timed out waiting for condition. > Thr... > {code} > {code} > Tests in error: > TestFsDatasetCache.testPageRounder:474 ? test timed out after 6 > milliseco... > TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 ? > test ... > {code} > But there is a small difference between these failures. > * The first is because the total blocked time exceeded > {{waitTimeMillis}} (60s here), which throws the timeout exception and prints a > thread diagnostic string in {{DFSTestUtil#verifyExpectedCacheUsage}}. > {code} > long st = Time.now(); > do { > boolean result = check.get(); > if (result) { > return; > } > > Thread.sleep(checkEveryMillis); > } while (Time.now() - st < waitForMillis); > > throw new TimeoutException("Timed out waiting for condition. " + > "Thread diagnostics:\n" + > TimedOutTestsListener.buildThreadDiagnosticString()); > {code} > * The second is because the test's elapsed time exceeded the timeout setting, as > in {{TestFsDatasetCache#testPageRounder}}. > We should adjust the timeouts for these unit tests that sometimes fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units
[ https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215223#comment-15215223 ] Lin Yiqun commented on HDFS-9847: - So [~chris.douglas], what can we do next for this jira? > HDFS configuration without time unit name should accept friendly time units > --- > > Key: HDFS-9847 > URL: https://issues.apache.org/jira/browse/HDFS-9847 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9847-branch-2.001.patch, > HDFS-9847-branch-2.002.patch, HDFS-9847-nothrow.001.patch, > HDFS-9847-nothrow.002.patch, HDFS-9847.001.patch, HDFS-9847.002.patch, > HDFS-9847.003.patch, HDFS-9847.004.patch, HDFS-9847.005.patch, > HDFS-9847.006.patch, branch-2-delta.002.txt, timeduration-w-y.patch > > > HDFS-9821 talks about letting existing keys use friendly units, e.g. 60s, > 5m, 1d, 6w etc. But some configuration key names contain a time unit, like > {{dfs.blockreport.intervalMsec}}, so we can make the other configurations, > whose names carry no time unit, accept friendly time units. The time unit > {{seconds}} is frequently used in hdfs. We can update these configurations first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
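A minimal sketch of the friendly-unit behavior, using the existing Configuration#getTimeDuration API ({{dfs.example.interval}} is a made-up key): values like 60s, 5m or 1d parse to the requested unit, and a bare number falls back to the default unit.
{code}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

public class TimeUnitDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("dfs.example.interval", "5m");        // friendly unit suffix
    long seconds = conf.getTimeDuration(
        "dfs.example.interval", 30, TimeUnit.SECONDS);
    System.out.println(seconds);                   // prints 300
  }
}
{code}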
[jira] [Updated] (HDFS-10206) getBlockLocations might not sort datanodes properly by distance
[ https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-10206: --- Description: If the DFSClient machine is not a datanode, but it shares its rack with some datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} might not put the local-rack datanodes at the beginning of the sorted list. That is because the function didn't call {{networktopology.add(client);}} to properly set the node's parent node; something required by {{networktopology.sortByDistance}} to compute distance between two nodes in the same topology tree. Another issue with {{networktopology.sortByDistance}} is it only distinguishes local rack from remote rack, but it doesn't support general distance calculation to tell how remote the rack is. {noformat} NetworkTopology.java protected int getWeight(Node reader, Node node) { // 0 is local, 1 is same rack, 2 is off rack // Start off by initializing to off rack int weight = 2; if (reader != null) { if (reader.equals(node)) { weight = 0; } else if (isOnSameRack(reader, node)) { weight = 1; } } return weight; } {noformat} HDFS-10203 has suggested moving the sorting from namenode to DFSClient to address another issue. Regardless of where we do the sorting, we still need to fix the issues outlined here. Note that BlockPlacementPolicyDefault shares the same NetworkTopology object used by DatanodeManager and requires Nodes stored in the topology to be {{DatanodeDescriptor}} for block placement. So we need to make sure we don't pollute the NetworkTopology if we plan to fix it on the server side. was: If the DFSClient machine is not a datanode, but it shares its rack with some datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} might not put the local-rack datanodes at the beginning of the sorted list. That is because the function didn't call {{networktopology.add(client);}} to properly set the node's parent node; something required by {{networktopology.sortByDistance}} to compute distance between two nodes in the same topology tree. Another issue with {{networktopology.sortByDistance}} is it only distinguishes local rack from remote rack, but it doesn't support general distance calculation to tell how remote the rack is. {noformat} NetworkTopology.java protected int getWeight(Node reader, Node node) { // 0 is local, 1 is same rack, 2 is off rack // Start off by initializing to off rack int weight = 2; if (reader != null) { if (reader.equals(node)) { weight = 0; } else if (isOnSameRack(reader, node)) { weight = 1; } } return weight; } {noformat} HDFS-10203 has suggested moving the sorting from namenode to DFSClient to address another issue. Regardless of where we do the sorting, we still fix the issues outline here. > getBlockLocations might not sort datanodes properly by distance > --- > > Key: HDFS-10206 > URL: https://issues.apache.org/jira/browse/HDFS-10206 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma > > If the DFSClient machine is not a datanode, but it shares its rack with some > datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} > might not put the local-rack datanodes at the beginning of the sorted list. > That is because the function didn't call {{networktopology.add(client);}} to > properly set the node's parent node; something required by > {{networktopology.sortByDistance}} to compute distance between two nodes in > the same topology tree. 
> Another issue with {{networktopology.sortByDistance}} is it only > distinguishes local rack from remote rack, but it doesn't support general > distance calculation to tell how remote the rack is. > {noformat} > NetworkTopology.java > protected int getWeight(Node reader, Node node) { > // 0 is local, 1 is same rack, 2 is off rack > // Start off by initializing to off rack > int weight = 2; > if (reader != null) { > if (reader.equals(node)) { > weight = 0; > } else if (isOnSameRack(reader, node)) { > weight = 1; > } > } > return weight; > } > {noformat} > HDFS-10203 has suggested moving the sorting from namenode to DFSClient to > address another issue. Regardless of where we do the sorting, we still need > to fix the issues outlined here. > Note that BlockPlacementPolicyDefault shares the same NetworkTopology object > used by DatanodeManager and requires Nodes stored in the topology to be > {{DatanodeDescriptor}} for block placement. So we need to make sure we don't > pollute the NetworkTopology if we plan to fix it on the server side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
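A hypothetical generalization of getWeight (not a committed fix), assuming it sits inside NetworkTopology where getDistance() is available and the reader has been add()-ed to the topology so its parent pointers are set:
{code}
protected int getWeight(Node reader, Node node) {
  if (reader == null) {
    return Integer.MAX_VALUE;  // unknown reader: no locality preference
  }
  if (reader.equals(node)) {
    return 0;                  // node-local read
  }
  // getDistance() counts hops to the closest common ancestor, so a same-rack
  // node (distance 2) ranks ahead of cross-rack (4) or cross-DC (6) nodes.
  return getDistance(reader, node);
}
{code}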
[jira] [Updated] (HDFS-5442) Zero loss HDFS data replication for multiple datacenters
[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5442: --- Target Version/s: (was: 3.0.0) > Zero loss HDFS data replication for multiple datacenters > > > Key: HDFS-5442 > URL: https://issues.apache.org/jira/browse/HDFS-5442 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Avik Dey >Assignee: Dian Fu > Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster > Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf > > > Hadoop is architected to operate efficiently at scale for normal hardware > failures within a datacenter. Hadoop is not designed today to handle > datacenter failures. Although HDFS is not designed for nor deployed in > configurations spanning multiple datacenters, replicating data from one > location to another is common practice for disaster recovery and global > service availability. There are current solutions available for batch > replication using data copy/export tools. However, while providing some > backup capability for HDFS data, they do not provide the capability to > recover all your HDFS data from a datacenter failure and be up and running > again with a fully operational Hadoop cluster in another datacenter in a > matter of minutes. For disaster recovery from a datacenter failure, we should > provide a fully distributed, zero data loss, low latency, high throughput and > secure HDFS data replication solution for multiple datacenter setup. > Design and code for Phase-1 to follow soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215187#comment-15215187 ] Colin Patrick McCabe commented on HDFS-10175: - I'm still -1 on this change. v3 creates a very large map per FileSystem per thread. For an application with lots of threads, this overhead is unacceptable. HTrace seems like a better way to get this information. I don't see a clear use-case for this. > add per-operation stats to FileSystem.Statistics > > > Key: HDFS-10175 > URL: https://issues.apache.org/jira/browse/HDFS-10175 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Ram Venkatesh >Assignee: Mingliang Liu > Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, > HDFS-10175.002.patch, HDFS-10175.003.patch > > > Currently FileSystem.Statistics exposes the following statistics: > BytesRead > BytesWritten > ReadOps > LargeReadOps > WriteOps > These are in turn exposed as job counters by MapReduce and other frameworks. > The logic within DfsClient that maps operations to these counters can > be confusing; for instance, mkdirs counts as a writeOp. > Proposed enhancement: > Add a statistic for each DfsClient operation including create, append, > createSymlink, delete, exists, mkdirs, rename and expose them as new > properties on the Statistics object. The operation-specific counters can be > used for analyzing the load imposed by a particular job on HDFS. > For example, we can use them to identify jobs that end up creating a large > number of files. > Once this information is available in the Statistics object, the app > frameworks like MapReduce can expose them as additional counters to be > aggregated and recorded as part of job summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215182#comment-15215182 ] Rakesh R commented on HDFS-9918: Thanks [~zhz], attached another patch fixing the comment. > Erasure Coding: Sort located striped blocks based on decommissioned states > -- > > Key: HDFS-9918 > URL: https://issues.apache.org/jira/browse/HDFS-9918 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, > HDFS-9918-003.patch, HDFS-9918-004.patch, HDFS-9918-005.patch, > HDFS-9918-006.patch > > > This jira is a follow-on work of HDFS-8786, where we do decommissioning of > datanodes having striped blocks. > Now, after decommissioning, the ordering of the storage list needs to change > so that decommissioned datanodes are placed last in the list. > For example, assume we have a block group with storage list:- > d0, d1, d2, d3, d4, d5, d6, d7, d8, d9 > mapping to indices > 0, 1, 2, 3, 4, 5, 6, 7, 8, 2 > Here the internal block b2 is duplicated, located in d2 and d9. If d2 is a > decommissioning node then we should switch d2 and d9 in the storage list. > Thanks [~jingzhao] for the > [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
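A minimal sketch of the reordering idea (assumed accessor names from DatanodeInfo, not the exact patch): a stable sort keeps live replicas in their original order while decommissioned and decommissioning nodes sink to the end of the list.
{code}
// Arrays.sort is stable for objects, so ties keep their original order.
Arrays.sort(storages, new Comparator<DatanodeStorageInfo>() {
  @Override
  public int compare(DatanodeStorageInfo a, DatanodeStorageInfo b) {
    return Boolean.compare(isDecom(a), isDecom(b));  // live (false) first
  }

  private boolean isDecom(DatanodeStorageInfo s) {
    return s.getDatanodeDescriptor().isDecommissionInProgress()
        || s.getDatanodeDescriptor().isDecommissioned();
  }
});
{code}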
[jira] [Updated] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-9918: --- Attachment: HDFS-9918-006.patch > Erasure Coding: Sort located striped blocks based on decommissioned states > -- > > Key: HDFS-9918 > URL: https://issues.apache.org/jira/browse/HDFS-9918 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, > HDFS-9918-003.patch, HDFS-9918-004.patch, HDFS-9918-005.patch, > HDFS-9918-006.patch > > > This jira is a follow-on work of HDFS-8786, where we do decommissioning of > datanodes having striped blocks. > Now, after decommissioning, the ordering of the storage list needs to change > so that decommissioned datanodes are placed last in the list. > For example, assume we have a block group with storage list:- > d0, d1, d2, d3, d4, d5, d6, d7, d8, d9 > mapping to indices > 0, 1, 2, 3, 4, 5, 6, 7, 8, 2 > Here the internal block b2 is duplicated, located in d2 and d9. If d2 is a > decommissioning node then we should switch d2 and d9 in the storage list. > Thanks [~jingzhao] for the > [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5442) Zero loss HDFS data replication for multiple datacenters
[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215175#comment-15215175 ] Konstantin Boudnik commented on HDFS-5442: -- Don't think this goes anywhere > Zero loss HDFS data replication for multiple datacenters > > > Key: HDFS-5442 > URL: https://issues.apache.org/jira/browse/HDFS-5442 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Avik Dey >Assignee: Dian Fu > Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster > Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf > > > Hadoop is architected to operate efficiently at scale for normal hardware > failures within a datacenter. Hadoop is not designed today to handle > datacenter failures. Although HDFS is not designed for nor deployed in > configurations spanning multiple datacenters, replicating data from one > location to another is common practice for disaster recovery and global > service availability. There are current solutions available for batch > replication using data copy/export tools. However, while providing some > backup capability for HDFS data, they do not provide the capability to > recover all your HDFS data from a datacenter failure and be up and running > again with a fully operational Hadoop cluster in another datacenter in a > matter of minutes. For disaster recovery from a datacenter failure, we should > provide a fully distributed, zero data loss, low latency, high throughput and > secure HDFS data replication solution for multiple datacenter setup. > Design and code for Phase-1 to follow soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5442) Zero loss HDFS data replication for multiple datacenters
[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215174#comment-15215174 ] Liang Dianpeng commented on HDFS-5442: -- What's the status of this issue? Is it under development or is it suspended? Is it still on schedule to be released as a big feature in Hadoop 3.0? > Zero loss HDFS data replication for multiple datacenters > > > Key: HDFS-5442 > URL: https://issues.apache.org/jira/browse/HDFS-5442 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Avik Dey >Assignee: Dian Fu > Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster > Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf > > > Hadoop is architected to operate efficiently at scale for normal hardware > failures within a datacenter. Hadoop is not designed today to handle > datacenter failures. Although HDFS is not designed for nor deployed in > configurations spanning multiple datacenters, replicating data from one > location to another is common practice for disaster recovery and global > service availability. There are current solutions available for batch > replication using data copy/export tools. However, while providing some > backup capability for HDFS data, they do not provide the capability to > recover all your HDFS data from a datacenter failure and be up and running > again with a fully operational Hadoop cluster in another datacenter in a > matter of minutes. For disaster recovery from a datacenter failure, we should > provide a fully distributed, zero data loss, low latency, high throughput and > secure HDFS data replication solution for multiple datacenter setup. > Design and code for Phase-1 to follow soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215168#comment-15215168 ] Colin Patrick McCabe commented on HDFS-10223: - Thanks for the reviews, [~cnauroth] and [~busbey]. > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch, HDFS-10223.002.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
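The shape of the HDFS-10223 fix, sketched with simplified helper names (a fragment, not the committed code): apply the read/write timeouts to the Peer before the SASL handshake runs over it, so a stuck peer fails in seconds instead of after the system-wide TCP timeout.
{code}
static Peer newSaslPeer(Socket socket, int socketTimeoutMs,
    SaslDataTransferClient saslClient, DataEncryptionKeyFactory keyFactory,
    Token<BlockTokenIdentifier> blockToken, DatanodeID datanodeId)
    throws IOException {
  Peer peer = peerFromSocket(socket);      // wrap the raw socket first
  peer.setReadTimeout(socketTimeoutMs);    // moved ahead of the handshake
  peer.setWriteTimeout(socketTimeoutMs);
  return saslClient.peerSend(peer, keyFactory, blockToken, datanodeId);
}
{code}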
[jira] [Updated] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
[ https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10225: - Attachment: HDFS-10225.000.patch > DataNode hot swap drives should recognize storage type tags. > - > > Key: HDFS-10225 > URL: https://issues.apache.org/jira/browse/HDFS-10225 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.2 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-10225.000.patch > > > The current hot swap code only differentiates data dirs by their paths. People > might want to change the types of certain data dirs from the default value in > an existing cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
[ https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10225: - Attachment: (was: HDFS-10225.000.patch) > DataNode hot swap drives should recognize storage type tags. > - > > Key: HDFS-10225 > URL: https://issues.apache.org/jira/browse/HDFS-10225 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.2 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-10225.000.patch > > > The current hot swap code only differentiates data dirs by their paths. People > might want to change the types of certain data dirs from the default value in > an existing cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline
[ https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HDFS-9805: Attachment: HDFS-9805.002.patch Here's a new patch rebased against the latest trunk. > TCP_NODELAY not set before SASL handshake in data transfer pipeline > --- > > Key: HDFS-9805 > URL: https://issues.apache.org/jira/browse/HDFS-9805 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Gary Helmling >Assignee: Gary Helmling > Attachments: HDFS-9805.002.patch > > > There are a few places in the DN -> DN block transfer pipeline where > TCP_NODELAY is not set before doing a SASL handshake: > * in {{DataNode.DataTransfer::run()}} > * in {{DataXceiver::replaceBlock()}} > * in {{DataXceiver::writeBlock()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
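A fragment sketching the shape of the HDFS-9805 change (helper and variable names simplified): TCP_NODELAY goes on the socket before any SASL bytes are exchanged, so the small handshake messages are not held back by Nagle's algorithm.
{code}
Socket sock = socketFactory.createSocket();
NetUtils.connect(sock, targetAddress, socketTimeout);
sock.setTcpNoDelay(true);                      // set ahead of the handshake
sock.setSoTimeout(socketTimeout);
OutputStream unbufOut = NetUtils.getOutputStream(sock, writeTimeout);
InputStream unbufIn = NetUtils.getInputStream(sock, readTimeout);
IOStreamPair saslStreams = saslClient.socketSend(
    sock, unbufOut, unbufIn, keyFactory, blockToken, targetDatanode);
{code}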
[jira] [Updated] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-9349: Attachment: HDFS-9349-HDFS-9000.008.patch v008 fixed one unit test failure. > Support reconfiguring fs.protected.directories without NN restart > - > > Key: HDFS-9349 > URL: https://issues.apache.org/jira/browse/HDFS-9349 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9349-HDFS-9000.003.patch, > HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, > HDFS-9349-HDFS-9000.006.patch, HDFS-9349-HDFS-9000.007.patch, > HDFS-9349-HDFS-9000.008.patch, HDFS-9349.001.patch, HDFS-9349.002.patch > > > This is to reconfigure > {code} > fs.protected.directories > {code} > without restarting NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
[ https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10225: - Status: Patch Available (was: Open) > DataNode hot swap drives should recognize storage type tags. > - > > Key: HDFS-10225 > URL: https://issues.apache.org/jira/browse/HDFS-10225 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.2 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-10225.000.patch > > > The current hot swap code only differentiates data dirs by their paths. People > might want to change the types of certain data dirs from the default value in > an existing cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
[ https://issues.apache.org/jira/browse/HDFS-10225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-10225: - Attachment: HDFS-10225.000.patch Uploaded an initial patch to set {{FsVolumeSpi#storageType}} directly from the reconfig code. > DataNode hot swap drives should recognize storage type tags. > - > > Key: HDFS-10225 > URL: https://issues.apache.org/jira/browse/HDFS-10225 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.2 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-10225.000.patch > > > The current hot swap code only differentiates data dirs by their paths. People > might want to change the types of certain data dirs from the default value in > an existing cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10225) DataNode hot swap drives should recognize storage type tags.
Lei (Eddy) Xu created HDFS-10225: Summary: DataNode hot swap drives should recognize storage type tags. Key: HDFS-10225 URL: https://issues.apache.org/jira/browse/HDFS-10225 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.7.2 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu The current hot swap code only differentiates data dirs by their paths. People might want to change the types of certain data dirs from the default value in an existing cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9939) Increase DecompressorStream skip buffer size
[ https://issues.apache.org/jira/browse/HDFS-9939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215119#comment-15215119 ] John Zhuge commented on HDFS-9939: -- Thanks. That could have been caught by HADOOP-12701 since I ran test-patch. John Zhuge > Increase DecompressorStream skip buffer size > > > Key: HDFS-9939 > URL: https://issues.apache.org/jira/browse/HDFS-9939 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Yongjun Zhang >Assignee: John Zhuge > Fix For: 2.8.0 > > Attachments: HDFS-9939.001.patch > > > See ACCUMULO-2353 for details. > Filing this jira to investigate performance difference and possibly make the > buf size change accordingly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215101#comment-15215101 ] Xiaobing Zhou commented on HDFS-9349: - I posted patch v007. It's hard to make that change directly in the original parseProtectedDirectories, but the overload/refactoring is possible. Thanks, [~arpiagariu]. > Support reconfiguring fs.protected.directories without NN restart > - > > Key: HDFS-9349 > URL: https://issues.apache.org/jira/browse/HDFS-9349 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9349-HDFS-9000.003.patch, > HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, > HDFS-9349-HDFS-9000.006.patch, HDFS-9349-HDFS-9000.007.patch, > HDFS-9349.001.patch, HDFS-9349.002.patch > > > This is to reconfigure > {code} > fs.protected.directories > {code} > without restarting NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart
[ https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-9349: Attachment: HDFS-9349-HDFS-9000.007.patch > Support reconfiguring fs.protected.directories without NN restart > - > > Key: HDFS-9349 > URL: https://issues.apache.org/jira/browse/HDFS-9349 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-9349-HDFS-9000.003.patch, > HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, > HDFS-9349-HDFS-9000.006.patch, HDFS-9349-HDFS-9000.007.patch, > HDFS-9349.001.patch, HDFS-9349.002.patch > > > This is to reconfigure > {code} > fs.protected.directories > {code} > without restarting NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8356) Document missing properties in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated HDFS-8356: - Attachment: HDFS-8356.010.patch - Updated with latest feedback. > Document missing properties in hdfs-default.xml > --- > > Key: HDFS-8356 > URL: https://issues.apache.org/jira/browse/HDFS-8356 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang > Labels: supportability, test > Attachments: HDFS-8356.001.patch, HDFS-8356.002.patch, > HDFS-8356.003.patch, HDFS-8356.004.patch, HDFS-8356.005.patch, > HDFS-8356.006.patch, HDFS-8356.007.patch, HDFS-8356.008.patch, > HDFS-8356.009.patch, HDFS-8356.010.patch > > > The following properties are currently not defined in hdfs-default.xml. These > properties should either be > A) documented in hdfs-default.xml OR > B) listed as an exception (with comments, e.g. for internal use) in the > TestHdfsConfigFields unit test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10175) add per-operation stats to FileSystem.Statistics
[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215055#comment-15215055 ] Jitendra Nath Pandey commented on HDFS-10175: - [~cmccabe] and [~andrew.wang], the latest patch addresses the concerns about synchronization. Do you think it is OK to commit now? > add per-operation stats to FileSystem.Statistics > > > Key: HDFS-10175 > URL: https://issues.apache.org/jira/browse/HDFS-10175 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Ram Venkatesh >Assignee: Mingliang Liu > Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, > HDFS-10175.002.patch, HDFS-10175.003.patch > > > Currently FileSystem.Statistics exposes the following statistics: > BytesRead > BytesWritten > ReadOps > LargeReadOps > WriteOps > These are in turn exposed as job counters by MapReduce and other frameworks. > The logic within DfsClient that maps operations to these counters can > be confusing; for instance, mkdirs counts as a writeOp. > Proposed enhancement: > Add a statistic for each DfsClient operation including create, append, > createSymlink, delete, exists, mkdirs, rename and expose them as new > properties on the Statistics object. The operation-specific counters can be > used for analyzing the load imposed by a particular job on HDFS. > For example, we can use them to identify jobs that end up creating a large > number of files. > Once this information is available in the Statistics object, the app > frameworks like MapReduce can expose them as additional counters to be > aggregated and recorded as part of job summary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9939) Increase DecompressorStream skip buffer size
[ https://issues.apache.org/jira/browse/HDFS-9939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215032#comment-15215032 ] Andrew Wang commented on HDFS-9939: --- Only one little nit, unused variable {{buf}} in testSkip. Otherwise LGTM, thanks John! > Increase DecompressorStream skip buffer size > > > Key: HDFS-9939 > URL: https://issues.apache.org/jira/browse/HDFS-9939 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Yongjun Zhang >Assignee: John Zhuge > Fix For: 2.8.0 > > Attachments: HDFS-9939.001.patch > > > See ACCUMULO-2353 for details. > Filing this jira to investigate performance difference and possibly make the > buf size change accordingly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9478) Reason for failing ipc.FairCallQueue construction should be thrown
[ https://issues.apache.org/jira/browse/HDFS-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215022#comment-15215022 ] Arpit Agarwal commented on HDFS-9478: - Thanks for updating the patch [~ajithshetty]. Nitpick for the test case - you can use Assert.fail instead of assertTrue(false). +1 otherwise. I tried this out with FairCallQueue and bad parameters and verified that the original exception is logged. > Reason for failing ipc.FairCallQueue construction should be thrown > - > > Key: HDFS-9478 > URL: https://issues.apache.org/jira/browse/HDFS-9478 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Ajith S >Priority: Minor > Attachments: HDFS-9478.2.patch, HDFS-9478.patch > > > When FairCallQueue construction fails, the NN fails to start, throwing a > RuntimeException without giving any reason why it fails. > 2015-11-30 17:45:26,661 INFO org.apache.hadoop.ipc.FairCallQueue: > FairCallQueue is in use with 4 queues. > 2015-11-30 17:45:26,665 DEBUG org.apache.hadoop.metrics2.util.MBeans: > Registered Hadoop:service=ipc.65110,name=DecayRpcScheduler > 2015-11-30 17:45:26,666 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode. > java.lang.RuntimeException: org.apache.hadoop.ipc.FairCallQueue could not be > constructed. > at > org.apache.hadoop.ipc.CallQueueManager.createCallQueueInstance(CallQueueManager.java:96) > at org.apache.hadoop.ipc.CallQueueManager.(CallQueueManager.java:55) > at org.apache.hadoop.ipc.Server.(Server.java:2241) > at org.apache.hadoop.ipc.RPC$Server.(RPC.java:942) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:534) > at > org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509) > at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:784) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.(NameNodeRpcServer.java:346) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:750) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:687) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:889) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:872) > Example: the reason for the above failure could have been: > 1. the weights were not equal to the number of queues configured. > 2. decay-scheduler.thresholds not in sync with number of queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10224) Augment DistributedFileSystem to support asynchronous HDFS access
[ https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215016#comment-15215016 ] Xiaobing Zhou commented on HDFS-10224: -- I posted the HDFS-10224 patch (i.e. HDFS-10224-HDFS-9924.000.patch) and also the combo patch (i.e. HDFS-10224-and-HADOOP-12909.000.patch), which contains HDFS-10224 and the patch it depends on, HADOOP-12909. Please review, thanks. > Augment DistributedFileSystem to support asynchronous HDFS access > - > > Key: HDFS-10224 > URL: https://issues.apache.org/jira/browse/HDFS-10224 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10224-HDFS-9924.000.patch, > HDFS-10224-and-HADOOP-12909.000.patch > > > Instead of pushing an asynchronous HDFS API up to FileSystem, this scopes > the changes to DistributedFileSystem only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10224) Augment DistributedFileSystem to support asynchronous HDFS access
[ https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10224: - Status: Patch Available (was: Open) > Augment DistributedFileSystem to support asynchronous HDFS access > - > > Key: HDFS-10224 > URL: https://issues.apache.org/jira/browse/HDFS-10224 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10224-HDFS-9924.000.patch, > HDFS-10224-and-HADOOP-12909.000.patch > > > Instead of pushing an asynchronous HDFS API up to FileSystem, this scopes > the changes to DistributedFileSystem only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10224) Augment DistributedFileSystem to support asynchronous HDFS access
[ https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10224: - Attachment: HDFS-10224-HDFS-9924.000.patch > Augment DistributedFileSystem to support asynchronous HDFS access > - > > Key: HDFS-10224 > URL: https://issues.apache.org/jira/browse/HDFS-10224 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10224-HDFS-9924.000.patch, > HDFS-10224-and-HADOOP-12909.000.patch > > > Instead of pushing asynchronous HDFS API up to FileSystem, this is to scope > changes to only DistributedFileSystem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10224) Augment DistributedFileSystem to support asynchronous HDFS access
[ https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10224: - Attachment: HDFS-10224-and-HADOOP-12909.000.patch > Augment DistributedFileSystem to support asynchronous HDFS access > - > > Key: HDFS-10224 > URL: https://issues.apache.org/jira/browse/HDFS-10224 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-10224-HDFS-9924.000.patch, > HDFS-10224-and-HADOOP-12909.000.patch > > > Instead of pushing asynchronous HDFS API up to FileSystem, this is to scope > changes to only DistributedFileSystem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10224) Augment DistributedFileSystem to support asynchronous HDFS access
[ https://issues.apache.org/jira/browse/HDFS-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-10224: - Description: Instead of pushing asynchronous HDFS API up to FileSystem, this is to scope changes to only DistributedFileSystem. > Augment DistributedFileSystem to support asynchronous HDFS access > - > > Key: HDFS-10224 > URL: https://issues.apache.org/jira/browse/HDFS-10224 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: fs >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > > Instead of pushing asynchronous HDFS API up to FileSystem, this is to scope > changes to only DistributedFileSystem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10224) Augment DistributedFileSystem to support asynchronous HDFS access
Xiaobing Zhou created HDFS-10224: Summary: Augment DistributedFileSystem to support asynchronous HDFS access Key: HDFS-10224 URL: https://issues.apache.org/jira/browse/HDFS-10224 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6520) hdfs fsck -move not working
[ https://issues.apache.org/jira/browse/HDFS-6520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214966#comment-15214966 ] John Zhuge commented on HDFS-6520: -- Wrote the unit test that reproduced the problem. Got checksum exception in {{readNextPacket}}. > hdfs fsck -move not working > --- > > Key: HDFS-6520 > URL: https://issues.apache.org/jira/browse/HDFS-6520 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Shengjun Xin >Assignee: John Zhuge > > I met some errors when I ran fsck -move. > My steps are as follows: > 1. Set up a pseudo cluster > 2. Copy a file to hdfs > 3. Corrupt a block of the file > 4. Run fsck to check: > {code} > Connecting to namenode via http://localhost:50070 > FSCK started by hadoop (auth:SIMPLE) from /127.0.0.1 for path /user/hadoop at > Wed Jun 11 15:58:38 CST 2014 > . > /user/hadoop/fsck-test: CORRUPT blockpool > BP-654596295-10.37.7.84-1402466764642 block blk_1073741825 > /user/hadoop/fsck-test: MISSING 1 blocks of total size 1048576 B.Status: > CORRUPT > Total size:4104304 B > Total dirs:1 > Total files: 1 > Total symlinks:0 > Total blocks (validated): 4 (avg. block size 1026076 B) > > CORRUPT FILES:1 > MISSING BLOCKS: 1 > MISSING SIZE: 1048576 B > CORRUPT BLOCKS: 1 > > Minimally replicated blocks: 3 (75.0 %) > Over-replicated blocks:0 (0.0 %) > Under-replicated blocks: 0 (0.0 %) > Mis-replicated blocks: 0 (0.0 %) > Default replication factor:1 > Average block replication: 0.75 > Corrupt blocks:1 > Missing replicas: 0 (0.0 %) > Number of data-nodes: 1 > Number of racks: 1 > FSCK ended at Wed Jun 11 15:58:38 CST 2014 in 1 milliseconds > The filesystem under path '/user/hadoop' is CORRUPT > {code} > 5. Run fsck -move to move the corrupted file to /lost+found; the error > message in the namenode log: > {code} > 2014-06-11 15:48:16,686 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > FSCK started by hadoop (auth:SIMPLE) from /127.0.0.1 for path /user/hadoop at > Wed Jun 11 15:48:16 CST 2014 > 2014-06-11 15:48:16,894 INFO > org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 35 > Total time for transactions(ms): 9 Number of transactions batched in Syncs: 0 > Number of syncs: 25 SyncTimes(ms): 73 > 2014-06-11 15:48:16,991 ERROR > org.apache.hadoop.hdfs.server.namenode.NameNode: Error reading block > java.io.IOException: Expected empty end-of-read packet! 
Header: PacketHeader > with packetLen=66048 header data: offsetInBlock: 65536 > seqno: 1 > lastPacketInBlock: false > dataLen: 65536 > at > org.apache.hadoop.hdfs.RemoteBlockReader2.readTrailingEmptyPacket(RemoteBlockReader2.java:259) > at > org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:220) > at > org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:138) > at > org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.copyBlock(NamenodeFsck.java:649) > at > org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.copyBlocksToLostFound(NamenodeFsck.java:543) > at > org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:460) > at > org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.check(NamenodeFsck.java:324) > at > org.apache.hadoop.hdfs.server.namenode.NamenodeFsck.fsck(NamenodeFsck.java:233) > at > org.apache.hadoop.hdfs.server.namenode.FsckServlet$1.run(FsckServlet.java:67) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at > org.apache.hadoop.hdfs.server.namenode.FsckServlet.doGet(FsckServlet.java:58) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1192) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilte
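A rough sketch of how such a unit test can set up the failure, assuming the usual MiniDFSCluster/DFSTestUtil helpers and a {{runFsck}} helper like the one in TestFsck; this is not the actual test written for this JIRA.

{code}
// Hedged reproduction sketch: create a single-replica file, corrupt one
// replica, let the NameNode notice, then drive fsck with -move.
Path file = new Path("/fsck-test");
DFSTestUtil.createFile(fs, file, 4 * blockSize, (short) 1, 0L);
ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, file);
cluster.corruptBlockOnDataNodes(block);
// Reading the file surfaces the checksum failure so the block is reported
// corrupt, after which -move should copy salvageable blocks to /lost+found.
String out = runFsck(conf, 1, true, "/", "-move");
{code}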
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-10223: - Hadoop Flags: Reviewed +1 for patch v002, pending pre-commit. Thank you for making the update. > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch, HDFS-10223.002.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
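The ordering the fix establishes, sketched against the client-side helpers; the timeout setters exist on {{Peer}}, but treat the surrounding context as assumed rather than the literal patch.

{code}
// Apply socket timeouts BEFORE any SASL I/O, so a stalled handshake fails
// after socketTimeout milliseconds instead of hanging for the OS-level TCP
// timeout, which can be hours.
Peer peer = DFSUtilClient.peerFromSocket(socket);
peer.setReadTimeout(socketTimeout);
peer.setWriteTimeout(socketTimeout);
// Only now run the SASL exchange over the already-bounded connection.
peer = saslClient.peerSend(peer, keyFactory, blockToken, datanodeId);
{code}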
[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214940#comment-15214940 ] Zhe Zhang commented on HDFS-9918: - Thanks for the update Rakesh. {code} // FSNamesystem#getBlockLocations } else if (blkList.get(0) instanceof LocatedStripedBlock) { // sort located striped blocks based on decommissioned states blockManager.getDatanodeManager().sortLocatedStripedBlocks(blkList); return blocks; } {code} Seems that for a set of striped blocks, the method will just return there without calling the added logic of sorting the last striped block? > Erasure Coding: Sort located striped blocks based on decommissioned states > -- > > Key: HDFS-9918 > URL: https://issues.apache.org/jira/browse/HDFS-9918 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, > HDFS-9918-003.patch, HDFS-9918-004.patch, HDFS-9918-005.patch > > > This jira is a follow-on work of HDFS-8786, where we do decommissioning of > datanodes having striped blocks. > Now, after decommissioning it requires to change the ordering of the storage > list so that the decommissioned datanodes should only be last node in list. > For example, assume we have a block group with storage list:- > d0, d1, d2, d3, d4, d5, d6, d7, d8, d9 > mapping to indices > 0, 1, 2, 3, 4, 5, 6, 7, 8, 2 > Here the internal block b2 is duplicated, locating in d2 and d9. If d2 is a > decommissioning node then should switch d2 and d9 in the storage list. > Thanks [~jingzhao] for the > [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
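If the early return is indeed the gap, one hedged way to close it (helper usage assumed, not taken from the patch) is to run the same sort over the last located block before returning:

{code}
} else if (blkList.get(0) instanceof LocatedStripedBlock) {
  // sort located striped blocks based on decommissioned states
  blockManager.getDatanodeManager().sortLocatedStripedBlocks(blkList);
  // Also sort the last located block rather than returning past it. Wrapping
  // it in a fresh list is illustrative, assuming the sorter takes a List.
  LocatedBlock last = blocks.getLastLocatedBlock();
  if (last instanceof LocatedStripedBlock) {
    List<LocatedBlock> lastList = new ArrayList<>();
    lastList.add(last);
    blockManager.getDatanodeManager().sortLocatedStripedBlocks(lastList);
  }
  return blocks;
}
{code}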
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-10223: Attachment: HDFS-10223.002.patch * Add {{IOUtils#cleanup}} in the unit test in a finally block > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch, HDFS-10223.002.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10220) Namenode failover due to too long locking in LeaseManager.Monitor
[ https://issues.apache.org/jira/browse/HDFS-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214934#comment-15214934 ] Ravi Prakash commented on HDFS-10220: - Thank you for the report Nicolas! Which version of Hadoop did you experience this on? > Namenode failover due to too long locking in LeaseManager.Monitor > > > Key: HDFS-10220 > URL: https://issues.apache.org/jira/browse/HDFS-10220 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Nicolas Fraison >Priority: Minor > Attachments: HADOOP-10220.001.patch, threaddump_zkfc.txt > > > I have faced a namenode failover due to an unresponsive namenode detected by the > zkfc, with lots of WARN messages (5 million) like this one: > _org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All > existing blocks are COMPLETE, lease removed, file closed._ > In the threaddump taken by the zkfc there are lots of threads blocked due to a > lock. > Looking at the code, there is a lock taken by LeaseManager.Monitor when > leases must be released. Due to the really big number of leases to be > released, the namenode took too long to release them, blocking all > other tasks and making the zkfc think that the namenode was not > available/stuck. > The idea of this patch is to limit the number of leases released each time we > check for leases, so the lock won't be held for too long a period. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
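A hedged sketch of the idea in the description: cap the work done per {{LeaseManager.Monitor}} pass so the lock is dropped periodically. The constant and control flow are illustrative; the actual limit and plumbing are whatever the patch defines.

{code}
// Hypothetical cap on releases per checkLeases() invocation.
private static final int MAX_LEASE_RELEASES_PER_CHECK = 1000;

synchronized boolean checkLeases() {
  int released = 0;
  while (!sortedLeases.isEmpty() && sortedLeases.first().expiredHardLimit()) {
    // ... internalReleaseLease(...) for the paths under this lease, which
    // removes the lease from sortedLeases ...
    if (++released >= MAX_LEASE_RELEASES_PER_CHECK) {
      return true; // yield the lock; the Monitor resumes next iteration
    }
  }
  return false;
}
{code}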
[jira] [Commented] (HDFS-10197) TestFsDatasetCache failing intermittently due to timeout
[ https://issues.apache.org/jira/browse/HDFS-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214920#comment-15214920 ] Andrew Wang commented on HDFS-10197: Hi [~linyiqun], thanks for doing the analysis. I'm +1 on your v002 patch. If you want to do further optimization on this JIRA, I'm happy to wait for another rev. If not, we can just commit v002 as-is. > TestFsDatasetCache failing intermittently due to timeout > > > Key: HDFS-10197 > URL: https://issues.apache.org/jira/browse/HDFS-10197 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-10197.001.patch, HDFS-10197.002.patch > > > In {{TestFsDatasetCache}}, the unit tests fail sometimes. I collected some > failure reasons from recent jenkins reports. They are all timeout errors. > {code} > Tests in error: > TestFsDatasetCache.testFilesExceedMaxLockedMemory:378 ? Timeout Timed out > wait... > TestFsDatasetCache.tearDown:149 ? Timeout Timed out waiting for condition. > Thr... > {code} > {code} > Tests in error: > TestFsDatasetCache.testPageRounder:474 ? test timed out after 6 > milliseco... > TestBalancer.testUnknownDatanodeSimple:1040->testUnknownDatanode:1098 ? > test ... > {code} > But there are small differences between these failures. > * The first is because the total blocked time exceeded {{waitForMillis}} (60s > here), which throws the timeout exception and prints a thread diagnostic > string in method {{DFSTestUtil#verifyExpectedCacheUsage}}: > {code} > long st = Time.now(); > do { > boolean result = check.get(); > if (result) { > return; > } > > Thread.sleep(checkEveryMillis); > } while (Time.now() - st < waitForMillis); > > throw new TimeoutException("Timed out waiting for condition. " + > "Thread diagnostics:\n" + > TimedOutTestsListener.buildThreadDiagnosticString()); > {code} > * The second is due to the test's elapsed time exceeding the timeout setting, > as in {{TestFsDatasetCache#testPageRounder}}. > We should adjust the timeouts for these unit tests that sometimes fail due to > timeouts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214913#comment-15214913 ] Chris Nauroth commented on HDFS-10223: -- Thank you, Colin. This is a good catch. In the test, would you please close {{serverSocket}} and {{socket}} in a {{finally}} block, or use try-with-resources? > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
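A sketch of the review suggestion for the test, using try-with-resources so both sockets are closed even when an assertion fails; the loopback setup shown is illustrative.

{code}
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

try (ServerSocket server =
         new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
     Socket socket =
         new Socket(server.getInetAddress(), server.getLocalPort())) {
  // ... exercise peerFromSocketAndKey and assert on the configured timeouts;
  // both sockets close automatically, pass or fail.
}
{code}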
[jira] [Updated] (HDFS-10222) libhdfs++: Shutdown sockets to avoid "Connection reset by peer"
[ https://issues.apache.org/jira/browse/HDFS-10222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10222: --- Status: Patch Available (was: Open) > libhdfs++: Shutdown sockets to avoid "Connection reset by peer" > --- > > Key: HDFS-10222 > URL: https://issues.apache.org/jira/browse/HDFS-10222 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10222.HDFS-8707.000.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10222) libhdfs++: Shutdown sockets to avoid "Connection reset by peer"
[ https://issues.apache.org/jira/browse/HDFS-10222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Clampffer updated HDFS-10222: --- Attachment: HDFS-10222.HDFS-8707.000.patch Added patch. The fix has two parts: 1) Always call shutdown() before closing a socket 2) Catch the exceptions that shutdown() may throw. asio converts error codes from posix functions into std::exceptions; the error codes happening here aren't really worth caring about. These show up when "delete this" is called in the continuation after a failed connect. The socket isn't open or closed as far as asio is concerned, so weird things happen. We should plan resource management better so we can remove "delete this". > libhdfs++: Shutdown sockets to avoid "Connection reset by peer" > --- > > Key: HDFS-10222 > URL: https://issues.apache.org/jira/browse/HDFS-10222 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: James Clampffer >Assignee: James Clampffer > Attachments: HDFS-10222.HDFS-8707.000.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
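The patch itself is C++/asio; in Java terms (the language used for the other sketches in this digest) the analogous TCP-level pattern looks like the following, where the shutdown lets the peer see an orderly FIN instead of an RST and errors from half-open sockets are deliberately swallowed.

{code}
import java.io.IOException;
import java.net.Socket;

// Analogous pattern only -- not the libhdfs++ code.
static void shutdownThenClose(Socket socket) {
  try {
    socket.shutdownOutput(); // may throw if the socket never finished connecting
  } catch (IOException ignored) {
    // Part 2 of the fix: these error codes aren't worth caring about.
  }
  try {
    socket.close();
  } catch (IOException ignored) {
  }
}
{code}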
[jira] [Commented] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214867#comment-15214867 ] Sean Busbey commented on HDFS-10223: +1 non-binding, pending non-surprising buildbot feedback > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline
[ https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214854#comment-15214854 ] Colin Patrick McCabe commented on HDFS-9805: Hi [~ghelmling], thanks for looking at this. Can you update the patch? > TCP_NODELAY not set before SASL handshake in data transfer pipeline > --- > > Key: HDFS-9805 > URL: https://issues.apache.org/jira/browse/HDFS-9805 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Gary Helmling >Assignee: Gary Helmling > > There are a few places in the DN -> DN block transfer pipeline where > TCP_NODELAY is not set before doing a SASL handshake: > * in {{DataNode.DataTransfer::run()}} > * in {{DataXceiver::replaceBlock()}} > * in {{DataXceiver::writeBlock()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
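For the three call sites listed in the description the shape of the fix is the same; a hedged sketch, with the surrounding variables assumed from the {{DataNode.DataTransfer}} context:

{code}
Socket sock = datanode.newSocket();
NetUtils.connect(sock, curTarget, dnConf.socketTimeout);
// Disable Nagle BEFORE the SASL handshake, so its small request/response
// messages aren't held back waiting for more data to coalesce.
sock.setTcpNoDelay(true);
// ... then proceed with the SASL negotiation and the block transfer ...
{code}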
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-10223: Attachment: HDFS-10223.001.patch > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-10223: Attachment: (was: HDFS-10223.001.patch) > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline
[ https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9805: --- Attachment: (was: HDFS-10223.001.patch) > TCP_NODELAY not set before SASL handshake in data transfer pipeline > --- > > Key: HDFS-9805 > URL: https://issues.apache.org/jira/browse/HDFS-9805 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Gary Helmling >Assignee: Gary Helmling > > There are a few places in the DN -> DN block transfer pipeline where > TCP_NODELAY is not set before doing a SASL handshake: > * in {{DataNode.DataTransfer::run()}} > * in {{DataXceiver::replaceBlock()}} > * in {{DataXceiver::writeBlock()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline
[ https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9805: --- Attachment: (was: HDFS-9805.001.patch) > TCP_NODELAY not set before SASL handshake in data transfer pipeline > --- > > Key: HDFS-9805 > URL: https://issues.apache.org/jira/browse/HDFS-9805 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Gary Helmling >Assignee: Gary Helmling > Attachments: HDFS-10223.001.patch > > > There are a few places in the DN -> DN block transfer pipeline where > TCP_NODELAY is not set before doing a SASL handshake: > * in {{DataNode.DataTransfer::run()}} > * in {{DataXceiver::replaceBlock()}} > * in {{DataXceiver::writeBlock()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline
[ https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9805: --- Attachment: HDFS-10223.001.patch > TCP_NODELAY not set before SASL handshake in data transfer pipeline > --- > > Key: HDFS-9805 > URL: https://issues.apache.org/jira/browse/HDFS-9805 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Gary Helmling >Assignee: Gary Helmling > Attachments: HDFS-10223.001.patch > > > There are a few places in the DN -> DN block transfer pipeline where > TCP_NODELAY is not set before doing a SASL handshake: > * in {{DataNode.DataTransfer::run()}} > * in {{DataXceiver::replaceBlock()}} > * in {{DataXceiver::writeBlock()}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-10223: Attachment: HDFS-10223.001.patch > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
[ https://issues.apache.org/jira/browse/HDFS-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-10223: Status: Patch Available (was: Open) > peerFromSocketAndKey performs SASL exchange before setting connection timeouts > -- > > Key: HDFS-10223 > URL: https://issues.apache.org/jira/browse/HDFS-10223 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-10223.001.patch > > > {{peerFromSocketAndKey}} performs the SASL exchange before setting up > connection timeouts. Because of this, the timeout used for setting up SASL > connections is the default system-wide TCP timeout, which is usually several > hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-10223) peerFromSocketAndKey performs SASL exchange before setting connection timeouts
Colin Patrick McCabe created HDFS-10223: --- Summary: peerFromSocketAndKey performs SASL exchange before setting connection timeouts Key: HDFS-10223 URL: https://issues.apache.org/jira/browse/HDFS-10223 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe {{peerFromSocketAndKey}} performs the SASL exchange before setting up connection timeouts. Because of this, the timeout used for setting up SASL connections is the default system-wide TCP timeout, which is usually several hours long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9820) Improve distcp to support efficient restore to an earlier snapshot
[ https://issues.apache.org/jira/browse/HDFS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214680#comment-15214680 ] Jing Zhao commented on HDFS-9820: - Thanks for working on this, [~yzhangal]! Some comments on the current patch: # For the rdiff, we can switch the order of from-snapshot and to-snapshot before computing the diff report on the target cluster. In this way we can reuse the original sync code. # Currently rdiff is a standalone option for distcp. This means we're using distcp to do the restore. To restore a directory back to a snapshot, this may not be the most efficient way compared with a local restoring solution (HDFS-4167), which can avoid most of the unnecessary data copying and can provide copy-on-write semantics when restoring appended/truncated files. # But before we finish the work in HDFS-4167, maybe we can augment the current diff-based distcp by allowing the admin to choose to restore the target back to the latest snapshot. We can still use the implementation in the current patch, but instead of adding a new rdiff option for distcp, we add a "--force" option to the current diff-based distcp. What do you think? > Improve distcp to support efficient restore to an earlier snapshot > -- > > Key: HDFS-9820 > URL: https://issues.apache.org/jira/browse/HDFS-9820 > Project: Hadoop HDFS > Issue Type: New Feature > Components: distcp >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-9820.001.patch, HDFS-9820.002.patch > > > HDFS-4167 intends to restore HDFS to the most recent snapshot, and there are > some complexity and challenges. > HDFS-7535 improved distcp performance by avoiding copying files that changed > name since last backup. > On top of HDFS-7535, HDFS-8828 improved distcp performance when copying data > from source to target cluster, by only copying changed files since last > backup. The way it works is use snapshot diff to find out all files changed, > and copy the changed files only. > See > https://blog.cloudera.com/blog/2015/12/distcp-performance-improvements-in-apache-hadoop/ > This jira is to propose a variation of HDFS-8828, to find out the files > changed in target cluster since last snapshot sx, and copy these from the > source target's same snapshot sx, to restore target cluster to sx. > If a file/dir is > - renamed, rename it back > - created in target cluster, delete it > - modified, put it to the copy list > - run distcp with the copy list, copy from the source cluster's corresponding > snapshot > This could be a new command line switch -rdiff in distcp. > HDFS-4167 would still be nice to have. It just seems to me that HDFS-9820 > would hopefully be easier to implement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
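On comment #1, the mechanics are available from the existing client API; a hedged sketch (method placement illustrative), where the empty string denotes the directory's current state:

{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;

// Diff from the target's current state back to snapshot sx; walking the
// report in this direction lets the existing sync code undo the changes.
static SnapshotDiffReport reverseDiff(DistributedFileSystem targetFs,
    Path targetDir, String snapshotName) throws IOException {
  return targetFs.getSnapshotDiffReport(targetDir, "", snapshotName);
}
{code}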
[jira] [Created] (HDFS-10222) libhdfs++: Shutdown sockets to avoid "Connection reset by peer"
James Clampffer created HDFS-10222: -- Summary: libhdfs++: Shutdown sockets to avoid "Connection reset by peer" Key: HDFS-10222 URL: https://issues.apache.org/jira/browse/HDFS-10222 Project: Hadoop HDFS Issue Type: Sub-task Reporter: James Clampffer Assignee: James Clampffer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7651) [ NN Bench ] Refactor nnbench as a Tool implementation.
[ https://issues.apache.org/jira/browse/HDFS-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214590#comment-15214590 ] Hadoop QA commented on HDFS-7651: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 51s {color} | {color:green} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_74 with JDK v1.8.0_74 generated 0 new + 188 unchanged - 1 fixed = 188 total (was 189) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 12s {color} | {color:green} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.7.0_95 with JDK v1.7.0_95 generated 0 new + 188 unchanged - 1 fixed = 188 total (was 189) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed with JDK v1.7.0_95. 
{color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 26s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_74. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 11s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 25s {color} | {color:red} Patch generated 17 ASF License warnings. {color} | | {color:black}{col
[jira] [Updated] (HDFS-4949) Centralized cache management in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-4949: -- Fix Version/s: 2.3.0 > Centralized cache management in HDFS > > > Key: HDFS-4949 > URL: https://issues.apache.org/jira/browse/HDFS-4949 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: 3.0.0, 2.3.0 >Reporter: Andrew Wang >Assignee: Andrew Wang > Fix For: 2.3.0 > > Attachments: HDFS-4949-consolidated.patch, > caching-design-doc-2013-07-02.pdf, caching-design-doc-2013-08-09.pdf, > caching-design-doc-2013-10-24.pdf, caching-testplan.pdf, > hdfs-4949-branch-2.patch > > > HDFS currently has no support for managing or exposing in-memory caches at > datanodes. This makes it harder for higher level application frameworks like > Hive, Pig, and Impala to effectively use cluster memory, because they cannot > explicitly cache important datasets or place their tasks for memory locality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9720) DiskBalancer : Add configuration parameters
[ https://issues.apache.org/jira/browse/HDFS-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214559#comment-15214559 ] Hadoop QA commented on HDFS-9720: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s {color} | {color:red} HDFS-9720 does not apply to HDFS-1312. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12785042/HDFS-9720-HDFS-1312.001.patch | | JIRA Issue | HDFS-9720 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14960/console | | Powered by | Apache Yetus 0.2.0 http://yetus.apache.org | This message was automatically generated. > DiskBalancer : Add configuration parameters > --- > > Key: HDFS-9720 > URL: https://issues.apache.org/jira/browse/HDFS-9720 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Affects Versions: HDFS-1312 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-1312 > > Attachments: HDFS-9720-HDFS-1312.001.patch > > > Add the following config params: > # Max Disk Throughput - This allows users to control how much disk I/O is > generated while the disk balancer is running. > # Max Disk Errors - For each move operation, this allows the user to specify how > many I/O failures are tolerated before we declare the operation as failed. > # Block Tolerance - Specifies how much movement of data is good enough. Since > the datanodes are active while the copy is going on, tolerance lets the user > specify that if we reach something like 5% of the projected ideal storage on > the destination disk, we can consider that move to be successful. > Each of these parameters can be specified for each MoveStep. If not specified, > the config parameters are used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
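A sketch of how the three parameters could be consumed; every key name and default below is hypothetical, standing in for whatever the patch actually introduces, with per-MoveStep values overriding them.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

Configuration conf = new HdfsConfiguration();
// Hypothetical keys and defaults, for illustration only.
long maxThroughputMBps =
    conf.getLong("dfs.disk.balancer.max.disk.throughputInMBperSec", 10);
int maxDiskErrors =
    conf.getInt("dfs.disk.balancer.max.disk.errors", 5);
int blockTolerancePct =
    conf.getInt("dfs.disk.balancer.block.tolerance.percent", 5);
{code}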
[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units
[ https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214540#comment-15214540 ] Chris Douglas commented on HDFS-9847: - bq. Are you okay with making this an INFO? Sure. bq. Should we just keep the defaults as they are to keep the patch size sane? If it was backwards-compatible test code, then we could file a followup to fix it. But _it is not just test code_. Tests are failing in core classes: {noformat} java.lang.NumberFormatException: For input string: "3s" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:441) at java.lang.Long.parseLong(Long.java:483) at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1375) at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.<init>(DatanodeManager.java:234) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.<init>(BlockManager.java:329) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:738) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:676) {noformat} Even if this constructor is only invoked by test code, it's reasonable to expect that all occurrences have had their type changed in the main classes. > HDFS configuration without time unit name should accept friendly time units > --- > > Key: HDFS-9847 > URL: https://issues.apache.org/jira/browse/HDFS-9847 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.7.1 >Reporter: Lin Yiqun >Assignee: Lin Yiqun > Attachments: HDFS-9847-branch-2.001.patch, > HDFS-9847-branch-2.002.patch, HDFS-9847-nothrow.001.patch, > HDFS-9847-nothrow.002.patch, HDFS-9847.001.patch, HDFS-9847.002.patch, > HDFS-9847.003.patch, HDFS-9847.004.patch, HDFS-9847.005.patch, > HDFS-9847.006.patch, branch-2-delta.002.txt, timeduration-w-y.patch > > > HDFS-9821 talks about the issue of letting existing keys use friendly > units, e.g. 60s, 5m, 1d, 6w etc. But some configuration key names > contain a time unit name, like {{dfs.blockreport.intervalMsec}}, so we can make > the other configurations, whose names carry no time unit, accept friendly > time units. The time unit {{seconds}} is frequently used in hdfs. We can > update those configurations first. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
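The mechanism that makes both "3s" and bare numbers parse is {{Configuration#getTimeDuration}}; the failing trace above is a call site still on {{Configuration#getLong}}. Given the "3s", that read is likely the heartbeat interval, so the required change looks roughly like:

{code}
import java.util.concurrent.TimeUnit;

// Accepts "3s", "5m", "1d", etc., and interprets a bare number in the given
// unit, so existing numeric configs keep working.
long heartbeatIntervalSeconds = conf.getTimeDuration(
    DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
    DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT,
    TimeUnit.SECONDS);
{code}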
[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214520#comment-15214520 ] Rakesh R commented on HDFS-9918: Attached a new patch; modified the sort logic for the last located block to fix the test case failure - {{TestWriteReadStripedFile.testFileMoreThanABlockGroup3}} > Erasure Coding: Sort located striped blocks based on decommissioned states > -- > > Key: HDFS-9918 > URL: https://issues.apache.org/jira/browse/HDFS-9918 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, > HDFS-9918-003.patch, HDFS-9918-004.patch, HDFS-9918-005.patch > > > This jira is a follow-on work of HDFS-8786, where we do decommissioning of > datanodes having striped blocks. > Now, after decommissioning it requires to change the ordering of the storage > list so that the decommissioned datanodes should only be last node in list. > For example, assume we have a block group with storage list:- > d0, d1, d2, d3, d4, d5, d6, d7, d8, d9 > mapping to indices > 0, 1, 2, 3, 4, 5, 6, 7, 8, 2 > Here the internal block b2 is duplicated, locating in d2 and d9. If d2 is a > decommissioning node then should switch d2 and d9 in the storage list. > Thanks [~jingzhao] for the > [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
[ https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214518#comment-15214518 ] Ming Ma commented on HDFS-9579: --- [~brahmareddy], that is right for the case where ScriptBasedMapping is used but the topology script isn't set. Should we handle the special case where DEFAULT_RACK is returned from {{DNSToSwitchMapping#resolve}} similarly to HDFS-10208? > Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level > - > > Key: HDFS-9579 > URL: https://issues.apache.org/jira/browse/HDFS-9579 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma >Assignee: Ming Ma > Fix For: 3.0.0, 2.9.0 > > Attachments: HDFS-9579-10.patch, HDFS-9579-2.patch, > HDFS-9579-3.patch, HDFS-9579-4.patch, HDFS-9579-5.patch, HDFS-9579-6.patch, > HDFS-9579-7.patch, HDFS-9579-8.patch, HDFS-9579-9.patch, > HDFS-9579-branch-2.patch, HDFS-9579.patch, MR job counters.png > > > For cross DC distcp or other applications, it becomes useful to have insight > as to the traffic volume for each network distance to distinguish cross-DC > traffic, local-DC-remote-rack, etc. > FileSystem's existing {{bytesRead}} metric tracks all the bytes read. To > provide additional metrics for each network distance, we can add additional > metrics at the FileSystem level and have {{DFSInputStream}} update the value > based on the network distance between the client and the datanode. > {{DFSClient}} will resolve the client machine's network location as part of its > initialization. It doesn't need to resolve the datanode's network location for > each read as {{DatanodeInfo}} already has the info. > There are existing HDFS specific metrics such as {{ReadStatistics}} and > {{DFSHedgedReadMetrics}}. But these metrics are only accessible via > {{DFSClient}} or {{DFSInputStream}}. Not something that application frameworks > such as MR and Tez can get to. That is the benefit of storing these new > metrics in FileSystem.Statistics. > This jira only includes metrics generation by HDFS. The consumption of these > metrics at MR and Tez will be tracked by separate jiras. > We can add similar metrics for the HDFS write scenario later if it is necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states
[ https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-9918: --- Attachment: HDFS-9918-005.patch > Erasure Coding: Sort located striped blocks based on decommissioned states > -- > > Key: HDFS-9918 > URL: https://issues.apache.org/jira/browse/HDFS-9918 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, > HDFS-9918-003.patch, HDFS-9918-004.patch, HDFS-9918-005.patch > > > This jira is a follow-on work of HDFS-8786, where we do decommissioning of > datanodes having striped blocks. > Now, after decommissioning it requires to change the ordering of the storage > list so that the decommissioned datanodes should only be last node in list. > For example, assume we have a block group with storage list:- > d0, d1, d2, d3, d4, d5, d6, d7, d8, d9 > mapping to indices > 0, 1, 2, 3, 4, 5, 6, 7, 8, 2 > Here the internal block b2 is duplicated, locating in d2 and d9. If d2 is a > decommissioning node then should switch d2 and d9 in the storage list. > Thanks [~jingzhao] for the > [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9720) DiskBalancer : Add configuration parameters
[ https://issues.apache.org/jira/browse/HDFS-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-9720: Status: Patch Available (was: Open) > DiskBalancer : Add configuration parameters > --- > > Key: HDFS-9720 > URL: https://issues.apache.org/jira/browse/HDFS-9720 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Affects Versions: HDFS-1312 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-1312 > > Attachments: HDFS-9720-HDFS-1312.001.patch > > > Add the following config params: > # Max Disk Throughput - This allows users to control how much disk I/O is > generated while disk balancer is running > # Max Disk Errors - For each move operation, this allows user to specify how > many I/O failures are tolerated before we declare the operation as failed. > #Block Tolerance - Specifies how much movement of data is good enough. since > the datanodes are active while the copy is going on, tolerance lets the user > specify that if we reached something like 5% of projected Ideal storage on > the destination disk, we can consider that move to be successful. > Each of these parameter can be specified for each MoveStep. if not specified > then the config parameters are used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9599) TestDecommissioningStatus.testDecommissionStatus occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214475#comment-15214475 ] Wei-Chiu Chuang commented on HDFS-9599: --- Hi [~linyiqun], thanks for the contribution. Your patch makes sense to me. However, instead of starting/shutting down the cluster explicitly in each test method, what about changing the annotation of {{setUp}} from {{@BeforeClass}} to {{@Before}}, and the annotation of {{tearDown}} from {{@AfterClass}} to {{@After}}? This will make sure that the cluster is shut down properly even if an exception is thrown in a test method, while keeping the tests isolated. > TestDecommissioningStatus.testDecommissionStatus occasionally fails > --- > > Key: HDFS-9599 > URL: https://issues.apache.org/jira/browse/HDFS-9599 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode > Environment: Jenkins >Reporter: Wei-Chiu Chuang >Assignee: Lin Yiqun > Attachments: HDFS-9599.001.patch > > > From test result of a recent jenkins nightly > https://builds.apache.org/job/Hadoop-Hdfs-trunk/2663/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestDecommissioningStatus/testDecommissionStatus/ > The test failed because the number of under replicated blocks is 4, instead > of 3. > Looking at the log, there is a stray block, which might have caused the > failure: > {noformat} > 2015-12-23 00:42:05,820 [Block report processor] INFO BlockStateChange > (BlockManager.java:processReport(2131)) - BLOCK* processReport: > blk_1073741825_1001 on node 127.0.0.1:57382 size 16384 does not belong to any > file > {noformat} > The block size 16384 suggests this is left over from the sibling test case > testDecommissionStatusAfterDNRestart. This can happen because the same > minidfs cluster is reused between tests. > The test implementation should do a better job isolating tests. > Another case of failure is when the load factor comes into play, and a block > cannot find sufficient datanodes to place replicas. In this test, the > runtime should not consider the load factor: > {noformat} > conf.setBoolean(DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, > false); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
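A sketch of the suggested lifecycle; the cluster size and config are illustrative of this test's needs.

{code}
@Before
public void setUp() throws Exception {
  conf.setBoolean(
      DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, false);
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(2).build();
  cluster.waitActive();
}

@After
public void tearDown() throws Exception {
  if (cluster != null) {
    cluster.shutdown(); // runs even when a test method throws
  }
}
{code}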
[jira] [Commented] (HDFS-10208) Addendum for HDFS-9579: to handle the case when client machine can't resolve network path
[ https://issues.apache.org/jira/browse/HDFS-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214460#comment-15214460 ] Ming Ma commented on HDFS-10208: The ASF license warnings will be taken care of by HDFS-10221. The unit test failures aren't related. > Addendum for HDFS-9579: to handle the case when client machine can't resolve > network path > - > > Key: HDFS-10208 > URL: https://issues.apache.org/jira/browse/HDFS-10208 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-10208.patch > > > If DFSClient runs on a machine that can't resolve the network path, > e.g. {{dnsToSwitchMapping.resolve}} returns null, that will cause an exception > when it tries to create {{clientNode}}. In that case, there is no need to > create {{clientNode}}, as a null {{clientNode}} means its network distance to > any datanode is Integer.MAX_VALUE, which is what we want. > {noformat} > clientNode = new NodeBase(clientHostName, > dnsToSwitchMapping.resolve(nodes).get(0)); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
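The guard the description calls for, sketched around the quoted snippet; treat the exact condition as illustrative.

{code}
List<String> networkPaths = dnsToSwitchMapping.resolve(nodes);
if (networkPaths != null && !networkPaths.isEmpty()) {
  clientNode = new NodeBase(clientHostName, networkPaths.get(0));
}
// else: leave clientNode null, which already means the network distance to
// any datanode is Integer.MAX_VALUE -- exactly the intended behavior.
{code}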
[jira] [Updated] (HDFS-9309) Tests that use KeyStoreUtil must call KeyStoreUtil.cleanupSSLConfig()
[ https://issues.apache.org/jira/browse/HDFS-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9309: -- Assignee: (was: Wei-Chiu Chuang) > Tests that use KeyStoreUtil must call KeyStoreUtil.cleanupSSLConfig() > - > > Key: HDFS-9309 > URL: https://issues.apache.org/jira/browse/HDFS-9309 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Priority: Minor > Labels: unittest > Attachments: HDFS-9309.001.patch, HDFS-9309.002.patch > > > When KeyStoreTestUtil.setupSSLConfig() is called, several files are created > (ssl-server.xml, ssl-client.xml, trustKS.jks, clientKS.jks, serverKS.jks). > However, if they are not deleted upon exit, weird things can happen to any > subsequent tests. > For example, if ssl-client.xml is not deleted, but trustKS.jks is deleted, > TestWebHDFSOAuth2.listStatusReturnsAsExpected will fail with the message: > {noformat} > java.io.IOException: Unable to load OAuth2 connection factory. > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:146) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:164) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:81) > at > org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:215) > at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newSslConnConfigurator(URLConnectionFactory.java:138) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newOAuth2URLConnectionFactory(URLConnectionFactory.java:112) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:163) > at > org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2.listStatusReturnsAsExpected(TestWebHDFSOAuth2.java:147) > {noformat} > There are currently several tests that do not clean up: > {noformat} > 130 ✗ weichiu@weichiu ~/trunk (trunk) $ grep -rnw . -e > 'KeyStoreTestUtil\.setupSSLConfig' | cut -d: -f1 |xargs grep -L > "KeyStoreTestUtil\.cleanupSSLConfig" > ./hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java > ./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServicesWithSSL.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferTestCase.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/TestSecureNNWithQJM.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRespectsBindHostKeys.java > ./hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/TestHttpFSFWithSWebhdfsFileSystem.java > {noformat} > This JIRA is the effort to fix the bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
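For each test on the list, the missing pairing looks roughly like this; the directory variables are whatever the test already passes to {{setupSSLConfig}}, and the class literal is a placeholder.

{code}
@BeforeClass
public static void setUp() throws Exception {
  keystoresDir = new File(BASEDIR).getAbsolutePath();
  sslConfDir = KeyStoreTestUtil.getClasspathDir(TestFoo.class); // placeholder
  KeyStoreTestUtil.setupSSLConfig(keystoresDir, sslConfDir, conf, false);
}

@AfterClass
public static void cleanUp() throws Exception {
  // The point of this JIRA: remove ssl-server.xml, ssl-client.xml and the
  // *.jks files so they cannot poison subsequent tests.
  KeyStoreTestUtil.cleanupSSLConfig(keystoresDir, sslConfDir);
}
{code}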
[jira] [Updated] (HDFS-9277) IOException "Unable to load OAuth2 connection factory." in TestWebHDFSOAuth2.listStatusReturnsAsExpected
[ https://issues.apache.org/jira/browse/HDFS-9277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9277: -- Assignee: (was: Wei-Chiu Chuang) > IOException "Unable to load OAuth2 connection factory." in > TestWebHDFSOAuth2.listStatusReturnsAsExpected > > > Key: HDFS-9277 > URL: https://issues.apache.org/jira/browse/HDFS-9277 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang > Attachments: HDFS-9277.001.patch > > > This test is failing consistently in Hadoop-hdfs-trunk and > Hadoop-hdfs-trunk-Java8 since September 22. > REGRESSION: > org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2.listStatusReturnsAsExpected > Error Message: > Unable to load OAuth2 connection factory. > Stack Trace: > java.io.IOException: Unable to load OAuth2 connection factory. > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:146) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:164) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:81) > at > org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:215) > at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newSslConnConfigurator(URLConnectionFactory.java:135) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newOAuth2URLConnectionFactory(URLConnectionFactory.java:110) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:158) > at > org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2.listStatusReturnsAsExpected(TestWebHDFSOAuth2.java:147) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9309) Tests that use KeyStoreUtil must call KeyStoreUtil.cleanupSSLConfig()
[ https://issues.apache.org/jira/browse/HDFS-9309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9309: -- Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) The issue does not seem to occur anymore. > Tests that use KeyStoreUtil must call KeyStoreUtil.cleanupSSLConfig() > - > > Key: HDFS-9309 > URL: https://issues.apache.org/jira/browse/HDFS-9309 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > Labels: unittest > Attachments: HDFS-9309.001.patch, HDFS-9309.002.patch > > > When KeyStoreTestUtil.setupSSLConfig() is called, several files are created > (ssl-server.xml, ssl-client.xml, trustKS.jks, clientKS.jks, serverKS.jks). > However, if they are not deleted upon exit, weird things can happen to any > subsequent tests. > For example, if ssl-client.xml is not deleted, but trustKS.jks is deleted, > TestWebHDFSOAuth2.listStatusReturnsAsExpected will fail with the message: > {noformat} > java.io.IOException: Unable to load OAuth2 connection factory. > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:146) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:164) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:81) > at > org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:215) > at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newSslConnConfigurator(URLConnectionFactory.java:138) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newOAuth2URLConnectionFactory(URLConnectionFactory.java:112) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:163) > at > org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2.listStatusReturnsAsExpected(TestWebHDFSOAuth2.java:147) > {noformat} > There are currently several tests that do not clean up: > {noformat} > 130 ✗ weichiu@weichiu ~/trunk (trunk) $ grep -rnw . -e > 'KeyStoreTestUtil\.setupSSLConfig' | cut -d: -f1 |xargs grep -L > "KeyStoreTestUtil\.cleanupSSLConfig" > ./hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java > ./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServicesWithSSL.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsTokens.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferTestCase.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/TestSecureNNWithQJM.java > ./hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeRespectsBindHostKeys.java > ./hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/TestHttpFSFWithSWebhdfsFileSystem.java > {noformat} > This JIRA is an effort to fix the bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
[ https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9466: -- Attachment: org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache-output.txt Attaching the log of a failed run. [~cmccabe] sorry for the long delay ... The hypothesis here is that after {{TestCleanupFailureInjector#injectRequestFileDescriptorsFailure}} throws an exception to inject failure, it takes some time for that exception to propagate, so the test occasionally verifies that the number of slots is one before the exception is caught and the slot is removed. This race between removing the slot and checking the slot count is what fails the test. Adding a {{waitFor}} seems to remove the race. > TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky > > > Key: HDFS-9466 > URL: https://issues.apache.org/jira/browse/HDFS-9466 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs, hdfs-client >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch, > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache-output.txt > > > This test is flaky and fails quite frequently in trunk. > Error Message > expected:<1> but was:<2> > Stacktrace > {noformat} > java.lang.AssertionError: expected:<1> but was:<2> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636) > at > org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631) > at > org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684) > {noformat} > Thanks to [~xiaochen] for identifying the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
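For reference, a minimal sketch of the {{waitFor}} approach described above, using {{GenericTestUtils.waitFor}} from the Hadoop test utilities. The fragment is meant to sit inside the test body; {{countSlots()}} is a hypothetical helper standing in for the {{ShortCircuitRegistry}} visitor the real test uses, and the timeout values are illustrative.
{code}
import java.util.concurrent.TimeoutException;
import com.google.common.base.Supplier;
import org.apache.hadoop.test.GenericTestUtils;

// Instead of asserting the slot count once (and racing the exception
// propagation), poll until the expected count is observed or we time out.
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    return countSlots(registry) == 1;  // hypothetical helper
  }
}, 10 /* recheck every 10 ms */, 60000 /* give up after 60 s */);
{code}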
[jira] [Commented] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214422#comment-15214422 ] Kihwal Lee commented on HDFS-8101: -- Fixed CHANGES.txt. > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > Fix For: 2.7.3 > > Attachments: HDFS-8101.1.patch.txt > > > Previously, all references to DFSConfigKeys in DFSClient were compile-time > constants, which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at runtime. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't a compile-time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS; > however, if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal, then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
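To illustrate the change the description proposes, a before/after sketch of the field in question; the literal value assumes {{AuthFilter}}'s fully qualified name is {{org.apache.hadoop.hdfs.web.AuthFilter}}.
{code}
// Before: initializing the DFSConfigKeys class evaluates this field, which
// loads AuthFilter and therefore requires javax.servlet.Filter (servlet-api)
// to be on the classpath at runtime.
public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT =
    AuthFilter.class.getName();

// After: the equivalent String literal yields the same value without
// loading any class, so DFSClient users can drop servlet-api while the
// constant's value stays compatible.
public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT =
    "org.apache.hadoop.hdfs.web.AuthFilter";
{code}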
[jira] [Comment Edited] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214417#comment-15214417 ] Kihwal Lee edited comment on HDFS-8101 at 3/28/16 4:28 PM: --- The CHANGES.txt entry was added to the 2.7.2 (released) section. It should be in 2.7.3. was (Author: kihwal): The CHANGES.txt entry was added to 2.7.2 (released) section. > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > Fix For: 2.7.3 > > Attachments: HDFS-8101.1.patch.txt > > > Previously, all references to DFSConfigKeys in DFSClient were compile-time > constants, which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at runtime. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't a compile-time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS; > however, if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal, then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10221) Add test resource dfs.hosts.json to the rat exclusions
[ https://issues.apache.org/jira/browse/HDFS-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-10221: --- Assignee: Ming Ma Status: Patch Available (was: Open) > Add test resource dfs.hosts.json to the rat exclusions > -- > > Key: HDFS-10221 > URL: https://issues.apache.org/jira/browse/HDFS-10221 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-10221.patch > > > A new test resource dfs.hosts.json was added in HDFS-9005 for better > readability. Given that the JSON format doesn't allow comments, the file cannot carry an ASF license header and so fails the rat license check. > To address this, we can add the file to the rat exclusion list in pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8101) DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at runtime
[ https://issues.apache.org/jira/browse/HDFS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214417#comment-15214417 ] Kihwal Lee commented on HDFS-8101: -- The CHANGES.txt entry was added to the 2.7.2 (released) section. > DFSClient use of non-constant DFSConfigKeys pulls in WebHDFS classes at > runtime > --- > > Key: HDFS-8101 > URL: https://issues.apache.org/jira/browse/HDFS-8101 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Minor > Fix For: 2.7.3 > > Attachments: HDFS-8101.1.patch.txt > > > Previously, all references to DFSConfigKeys in DFSClient were compile-time > constants, which meant that normal users of DFSClient wouldn't resolve > DFSConfigKeys at runtime. As of HDFS-7718, DFSClient has a reference to a > member of DFSConfigKeys that isn't a compile-time constant > (DFS_CLIENT_KEY_PROVIDER_CACHE_EXPIRY_DEFAULT). > Since the class must be resolved now, this particular member > {code} > public static final String DFS_WEBHDFS_AUTHENTICATION_FILTER_DEFAULT = > AuthFilter.class.getName(); > {code} > means that javax.servlet.Filter needs to be on the classpath. > javax-servlet-api is one of the properly listed dependencies for HDFS; > however, if we replace {{AuthFilter.class.getName()}} with the equivalent > String literal, then downstream folks can avoid including it while maintaining > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-10221) Add test resource dfs.hosts.json to the rat exclusions
[ https://issues.apache.org/jira/browse/HDFS-10221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-10221: --- Attachment: HDFS-10221.patch Here is the patch. > Add test resource dfs.hosts.json to the rat exclusions > -- > > Key: HDFS-10221 > URL: https://issues.apache.org/jira/browse/HDFS-10221 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma > Attachments: HDFS-10221.patch > > > A new test resource dfs.hosts.json was added in HDFS-9005 for better > readability. Given that the JSON format doesn't allow comments, the file cannot carry an ASF license header and so fails the rat license check. > To address this, we can add the file to the rat exclusion list in pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9277) IOException "Unable to load OAuth2 connection factory." in TestWebHDFSOAuth2.listStatusReturnsAsExpected
[ https://issues.apache.org/jira/browse/HDFS-9277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-9277: -- Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) Resolving as "cannot reproduce". I think it is still worth noting that, as a good practice, the directories created for SSL tests should be removed at the end of each test, but since it is no longer an issue, let's close it. > IOException "Unable to load OAuth2 connection factory." in > TestWebHDFSOAuth2.listStatusReturnsAsExpected > > > Key: HDFS-9277 > URL: https://issues.apache.org/jira/browse/HDFS-9277 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang > Attachments: HDFS-9277.001.patch > > > This test is failing consistently in Hadoop-hdfs-trunk and > Hadoop-hdfs-trunk-Java8 since September 22. > REGRESSION: > org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2.listStatusReturnsAsExpected > Error Message: > Unable to load OAuth2 connection factory. > Stack Trace: > java.io.IOException: Unable to load OAuth2 connection factory. > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:146) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:164) > at > org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:81) > at > org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:215) > at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newSslConnConfigurator(URLConnectionFactory.java:135) > at > org.apache.hadoop.hdfs.web.URLConnectionFactory.newOAuth2URLConnectionFactory(URLConnectionFactory.java:110) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.initialize(WebHdfsFileSystem.java:158) > at > org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2.listStatusReturnsAsExpected(TestWebHDFSOAuth2.java:147) -- This message was sent by Atlassian JIRA (v6.3.4#6332)