[jira] [Commented] (HDFS-8925) Move BlockReader to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704396#comment-14704396 ] Zhe Zhang commented on HDFS-8925: - Thanks Haohui for the feedback. With the current {{ErasureCodingWorker}} design I think we are almost certain to move {{BlockReader}} out of the client module again. I'll leave it to [~hitliuyi] to comment on whether it makes sense to reimplement a block reader for the DN. > Move BlockReader to hdfs-client > --- > > Key: HDFS-8925 > URL: https://issues.apache.org/jira/browse/HDFS-8925 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > > This jira tracks the effort of moving the {{BlockReader}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8927) CredentialsSys is not unix/linux compatible
Haim Helman created HDFS-8927: - Summary: CredentialsSys is not unix/linux compatible Key: HDFS-8927 URL: https://issues.apache.org/jira/browse/HDFS-8927 Project: Hadoop HDFS Issue Type: Bug Components: nfs Reporter: Haim Helman Priority: Minor When trying to connect to a Linux NFS server using AUTH_SYS and a 33-byte hostname, I get: bad auth_len gid 0 str 36 auth 53 Looking at the Unix/Linux code in svc_auth_unix.c, it looks like the hostname length is rounded up to the nearest multiple of 4: str_len = RNDUP(str_len); Perhaps CredentialsSys should do that too? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
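The rounding the reporter points at is the standard XDR 4-byte alignment of opaque/string data. A minimal sketch of that rounding (the class and method names are illustrative helpers, not the actual CredentialsSys code):

```java
// Hypothetical helper illustrating the suggested fix: XDR encodes
// opaque/string data padded to a 4-byte boundary, which is exactly the
// rounding RNDUP() performs in svc_auth_unix.c.
public class XdrPadding {

    // Round len up to the nearest multiple of 4.
    static int rndup(int len) {
        return (len + 3) & ~3;
    }

    public static void main(String[] args) {
        // A 33-byte hostname occupies 36 bytes on the wire once padded,
        // consistent with the "str 36" in the reported error message.
        System.out.println(rndup(33)); // 36
    }
}
```

Applying the same rounding when CredentialsSys serializes the machine name would make the credential length match what svc_auth_unix.c computes on the server side.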
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704384#comment-14704384 ] Zhe Zhang commented on HDFS-7285: - I just finished steps #1 and #2 above, and pushed the result of "git merge" to the {{HDFS-7285-merge}} branch. A Jenkins [job | https://builds.apache.org/job/Hadoop-HDFS-7285-Merge/] has been triggered. Because HDFS-8801 requires major changes to the branch, this "git merge" was against HDFS-6407, which immediately precedes HDFS-8801 in trunk. [~jingzhao] has created a patch under HDFS-8909 which should be able to merge HDFS-8801 to the branch (I think we should keep it as a separate JIRA because it does more than just merging). > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: Consolidated-20150707.patch, > Consolidated-20150806.patch, Consolidated-20150810.patch, ECAnalyzer.py, > ECParser.py, HDFS-7285-initial-PoC.patch, > HDFS-7285-merge-consolidated-01.patch, > HDFS-7285-merge-consolidated-trunk-01.patch, > HDFS-7285-merge-consolidated.trunk.03.patch, > HDFS-7285-merge-consolidated.trunk.04.patch, > HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, > HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, > HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, > HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, > fsimage-analysis-20150105.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing > data reliability, compared to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, > with the storage overhead being only 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. 
> Facebook had a related open source project called HDFS-RAID. It used to be > one of the contrib packages in HDFS but has been removed since Hadoop 2.0 > for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are not intended to be appended anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use. For these reasons, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > is free of external dependencies, self-contained, and > independently maintained. This design lays the EC feature on top of the storage-type > support and aims to be compatible with existing HDFS features such as caching, > snapshots, encryption, and high availability. This design will also > support different EC coding schemes, implementations and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. the Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and make the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
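The overhead figures quoted in the description are easy to sanity-check: for a (data, parity) Reed-Solomon scheme the extra storage is parity/data, and any `parity` blocks can be lost. Illustrative arithmetic only, not Hadoop code:

```java
// Illustrative arithmetic for the storage-overhead claims above; not Hadoop code.
public class EcOverhead {

    // Extra storage as a fraction of the raw data size.
    static double overhead(int dataBlocks, int parityBlocks) {
        return (double) parityBlocks / dataBlocks;
    }

    public static void main(String[] args) {
        // RS(10,4): tolerates any 4 lost blocks at 40% overhead.
        System.out.println(overhead(10, 4)); // 0.4
        // 3-way replication: tolerates 2 lost replicas at 200% overhead.
        System.out.println(overhead(1, 2)); // 2.0
    }
}
```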
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704343#comment-14704343 ] Yongjun Zhang commented on HDFS-8828: - +1 on rev 011. Will commit tomorrow morning. Thanks [~yufeigu] and [~jingzhao]! > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch > > > Some users reported a huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only the files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file-copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp, so it still relies on default distcp to build a complete > list of files under the source dir. This patch puts only created and > modified files into the copy list, based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704321#comment-14704321 ] Hadoop QA commented on HDFS-8924: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 42s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 54s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 55s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 29s | The applied patch generated 22 new checkstyle issues (total was 40, now 62). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 17s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 203m 24s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 28s | Tests passed in hadoop-hdfs-client. 
| | | | 254m 26s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger | | | hadoop.hdfs.server.namenode.TestFileTruncate | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751359/HDFS-8924.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4e14f79 | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/diffJavacWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12052/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12052/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12052/console | This message was automatically generated. > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8287: --- Target Version/s: HDFS-7285 > DFSStripedOutputStream.writeChunk should not wait for writing parity > - > > Key: HDFS-8287 > URL: https://issues.apache.org/jira/browse/HDFS-8287 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Kai Sasaki > Attachments: HDFS-8287-HDFS-7285.00.patch, > HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, > HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, > HDFS-8287-HDFS-7285.05.patch > > > When a striping cell is full, writeChunk computes and generates parity > packets. It sequentially calls waitAndQueuePacket, so the user client cannot > continue to write data until it finishes. > We should instead allow the user client to continue writing rather than blocking it > while parity is being written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704315#comment-14704315 ] Rakesh R commented on HDFS-8287: bq. I think it might need some more rewrite, so it is better to do in separate JIRA. Is that okay? OK, it makes sense to me. Thanks [~kaisasak], +1 the latest patch looks good. > DFSStripedOutputStream.writeChunk should not wait for writing parity > - > > Key: HDFS-8287 > URL: https://issues.apache.org/jira/browse/HDFS-8287 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Kai Sasaki > Attachments: HDFS-8287-HDFS-7285.00.patch, > HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, > HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, > HDFS-8287-HDFS-7285.05.patch > > > When a striping cell is full, writeChunk computes and generates parity > packets. It sequentially calls waitAndQueuePacket, so the user client cannot > continue to write data until it finishes. > We should instead allow the user client to continue writing rather than blocking it > while parity is being written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
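The idea the patch pursues can be sketched as handing parity work to a background executor so the write path returns immediately. The class below illustrates that pattern under stated assumptions; it is not the real DFSStripedOutputStream code, and the names are invented:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration of deferring parity generation to a background thread so the
// writer is not blocked; not the actual DFSStripedOutputStream implementation.
public class AsyncParityWriter {
    private final ExecutorService parityPool =
        Executors.newSingleThreadExecutor();
    final AtomicInteger parityCellsQueued = new AtomicInteger();

    // Called when a striping cell fills up; returns without waiting for
    // parity computation, so the caller can keep writing data.
    void onCellFull(byte[] cell) {
        parityPool.submit(() -> {
            // ... compute parity for the full stripe and queue packets ...
            parityCellsQueued.incrementAndGet();
        });
    }

    // Drain outstanding parity work before closing the stream.
    void close() {
        parityPool.shutdown();
        try {
            parityPool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncParityWriter w = new AsyncParityWriter();
        for (int i = 0; i < 3; i++) {
            w.onCellFull(new byte[64 * 1024]); // returns immediately
        }
        w.close(); // blocks only once, at stream close
        System.out.println(w.parityCellsQueued.get()); // 3
    }
}
```

Ordering of parity packets is preserved here because a single-threaded executor runs tasks in submission order; the real patch still has to coordinate with the data streamers' packet queues.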
[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs
[ https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704288#comment-14704288 ] Akira AJISAKA commented on HDFS-8306: - Thanks [~eddyxu] for creating the patch. Could you skip outputting the contents of a feature (e.g. quota by storage type) instead of throwing IOException if the layout version of the fsimage does not support the feature? > Generate ACL and Xattr outputs in OIV XML outputs > - > > Key: HDFS-8306 > URL: https://issues.apache.org/jira/browse/HDFS-8306 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.7.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-8306.000.patch, HDFS-8306.001.patch, > HDFS-8306.002.patch, HDFS-8306.003.patch, HDFS-8306.004.patch, > HDFS-8306.005.patch, HDFS-8306.006.patch, HDFS-8306.007.patch, > HDFS-8306.008.patch, HDFS-8306.debug0.patch, HDFS-8306.debug1.patch > > > Currently, in the {{hdfs oiv}} XML outputs, not all fields of the fsimage are > output. This makes inspecting a {{fsimage}} via its XML output less practical. > It also prevents recovering a fsimage from an XML file. > This JIRA adds ACLs and XAttrs to the XML outputs as the first step toward > the goal described in HDFS-8061. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704275#comment-14704275 ] Hadoop QA commented on HDFS-8823: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 57s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | javac | 7m 58s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 52s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 2 new checkstyle issues (total was 648, now 645). | | {color:green}+1{color} | whitespace | 0m 7s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 38s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 13s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 176m 27s | Tests failed in hadoop-hdfs. 
| | | | 222m 10s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751373/HDFS-8823.006.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4e14f79 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12051/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12051/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12051/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12051/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12051/console | This message was automatically generated. > Move replication factor into individual blocks > -- > > Key: HDFS-8823 > URL: https://issues.apache.org/jira/browse/HDFS-8823 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, > HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, > HDFS-8823.005.patch, HDFS-8823.006.patch > > > This jira proposes to record the replication factor in the {{BlockInfo}} > class. The changes have two advantages: > * Decoupling the namespace and the block management layer. It is a > prerequisite step to move block management off the heap or to a separate > process. > * Increased flexibility on replicating blocks. Currently the replication > factors of all blocks have to be the same. 
The replication factors of these > blocks are equal to the highest replication factor across all snapshots. The > changes will allow blocks in a file to have different replication factors, > potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704270#comment-14704270 ] Hadoop QA commented on HDFS-8828: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 56s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 26s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 10s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 6m 26s | Tests passed in hadoop-distcp. 
| | | | 43m 32s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751374/HDFS-8828.011.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 36b1a1e | | hadoop-distcp test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12053/artifact/patchprocess/testrun_hadoop-distcp.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12053/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12053/console | This message was automatically generated. > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch > > > Some users reported a huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only the files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file-copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. 
So it still relies on default distcp to build a complete > list of files under the source dir. This patch puts only created and > modified files into the copy list, based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8829) DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning
[ https://issues.apache.org/jira/browse/HDFS-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704266#comment-14704266 ] He Tianyi commented on HDFS-8829: - Completely agree with you, [~cmccabe]. > DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning > --- > > Key: HDFS-8829 > URL: https://issues.apache.org/jira/browse/HDFS-8829 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.3.0, 2.6.0 >Reporter: He Tianyi >Assignee: kanaka kumar avvaru > > {code:java} > private void initDataXceiver(Configuration conf) throws IOException { > // find free port or use privileged port provided > TcpPeerServer tcpPeerServer; > if (secureResources != null) { > tcpPeerServer = new TcpPeerServer(secureResources); > } else { > tcpPeerServer = new TcpPeerServer(dnConf.socketWriteTimeout, > DataNode.getStreamingAddr(conf)); > } > > tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); > {code} > The last line sets SO_RCVBUF explicitly, thus disabling TCP auto-tuning on > some systems. > Shall we make this behavior configurable? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
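A configurable version of the behavior requested above could look like the sketch below, where a non-positive value means "leave the kernel default (and auto-tuning) alone". The helper name is hypothetical, not the actual DataNode change:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Sketch of making the receive buffer configurable: a non-positive size skips
// setReceiveBufferSize entirely, preserving TCP auto-tuning on systems that
// support it. Hypothetical helper, not the actual patch.
public class RecvBufferConfig {

    // Opens a server socket, optionally pinning SO_RCVBUF, and reports the
    // receive-buffer size actually in effect.
    static int openAndGetRecvBuffer(int configuredSize) {
        try (ServerSocket ss = new ServerSocket()) {
            if (configuredSize > 0) {
                // Explicit size: predictable, but disables auto-tuning.
                ss.setReceiveBufferSize(configuredSize);
            }
            // configuredSize <= 0: no call is made, so the OS default
            // (and auto-tuning, where available) applies.
            return ss.getReceiveBufferSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(openAndGetRecvBuffer(0) > 0);          // true
        System.out.println(openAndGetRecvBuffer(128 * 1024) > 0); // true
    }
}
```

Note that setReceiveBufferSize is only a hint to the kernel, so the effective size may be clamped; that is another reason to prefer leaving it unset unless an explicit value is required.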
[jira] [Assigned] (HDFS-247) A tool to plot the locations of the blocks of a directory
[ https://issues.apache.org/jira/browse/HDFS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avinash Desireddy reassigned HDFS-247: -- Assignee: Avinash Desireddy > A tool to plot the locations of the blocks of a directory > - > > Key: HDFS-247 > URL: https://issues.apache.org/jira/browse/HDFS-247 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Owen O'Malley >Assignee: Avinash Desireddy > Labels: newbie > > It would be very useful to have a command that we could give a hdfs directory > to, that would use fsck to find the block locations of the data files in that > directory, group them by host, and display the distribution graphically. We > did this by hand and it was very useful for finding a skewed distribution that was > causing performance problems. The tool should also be able to group by rack > id and generate a similar plot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page
[ https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704264#comment-14704264 ] Surendra Singh Lilhore commented on HDFS-8388: -- Yes, I will do this and update the patch here soon. > Time and Date format need to be in sync in Namenode UI page > --- > > Key: HDFS-8388 > URL: https://issues.apache.org/jira/browse/HDFS-8388 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Minor > Attachments: HDFS-8388-002.patch, HDFS-8388-003.patch, > HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png > > > In the NameNode UI page, the date and time formats displayed are currently not in > sync: > Started:Wed May 13 12:28:02 IST 2015 > Compiled:23 Apr 2015 12:22:59 > Block Deletion Start Time 13 May 2015 12:28:02 > We can keep a common format in all the above places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page
[ https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704184#comment-14704184 ] Akira AJISAKA commented on HDFS-8388: - bq. I didn't see any document for NNStarted, so I didn't added for new metric. I'm thinking {{NNStarted}} should be documented as well. Would you document both {{NNStarted}} and the new metric? > Time and Date format need to be in sync in Namenode UI page > --- > > Key: HDFS-8388 > URL: https://issues.apache.org/jira/browse/HDFS-8388 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Archana T >Assignee: Surendra Singh Lilhore >Priority: Minor > Attachments: HDFS-8388-002.patch, HDFS-8388-003.patch, > HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png > > > In NameNode UI Page, Date and Time FORMAT displayed on the page are not in > sync currently. > Started:Wed May 13 12:28:02 IST 2015 > Compiled:23 Apr 2015 12:22:59 > Block Deletion Start Time 13 May 2015 12:28:02 > We can keep a common format in all the above places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
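The "common format" idea amounts to routing every timestamp on the page through one shared formatter. A minimal sketch; the "dd MMM yyyy HH:mm:ss" pattern is an assumption taken from the "Compiled" / "Block Deletion Start Time" rows, not the actual fix:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Sketch of one shared formatter for "Started", "Compiled" and
// "Block Deletion Start Time". Pattern choice is an assumption, not the
// actual patch.
public class UiDateFormat {

    static String format(long epochMillis, TimeZone tz) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("dd MMM yyyy HH:mm:ss", Locale.ENGLISH);
        fmt.setTimeZone(tz);
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // With a fixed zone and locale the rendering is deterministic, so
        // all three rows end up in the same shape.
        System.out.println(format(0L, TimeZone.getTimeZone("UTC")));
        // prints: 01 Jan 1970 00:00:00
    }
}
```

Pinning the Locale matters here: the month abbreviation would otherwise vary with the server's default locale, reintroducing the inconsistency.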
[jira] [Commented] (HDFS-7116) Add a command to get the bandwidth of balancer
[ https://issues.apache.org/jira/browse/HDFS-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704182#comment-14704182 ] Akira AJISAKA commented on HDFS-7116: - bq. How about exposing balancerBandwidth value as a Datanode metric? I'm fine with your suggestion. Let's start this first. > Add a command to get the bandwidth of balancer > -- > > Key: HDFS-7116 > URL: https://issues.apache.org/jira/browse/HDFS-7116 > Project: Hadoop HDFS > Issue Type: New Feature > Components: balancer & mover >Reporter: Akira AJISAKA >Assignee: Rakesh R > Attachments: HDFS-7116-00.patch, HDFS-7116-01.patch > > > Now reading logs is the only way to check how the balancer bandwidth is set. > It would be useful for administrators if they can get the parameter via CLI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
[ https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704181#comment-14704181 ] Colin Patrick McCabe commented on HDFS-8892: bq. I assume slot-invalidation will happen during block-invalidation/deletes {Primarily triggered by compaction/shard-takeover etc..} Yes. I guess the good thing about this patch is that it might reduce fd consumption in some scenarios. The bad thing about this approach is that instead of only checking the older replicas, we would have to iterate over all replicas in order to check every slot. Also, the Runnable currently runs less often when the replica timeout is longer, so this logic would have to be changed. > ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too > - > > Key: HDFS-8892 > URL: https://issues.apache.org/jira/browse/HDFS-8892 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.7.1 >Reporter: Ravikumar >Assignee: kanaka kumar avvaru >Priority: Minor > > Currently the CacheCleaner thread checks only for cache-expiry times. It would be > nice if it handled an invalid slot too in an extra pass over the evictable map… > for(ShortCircuitReplica replica : evictable.values()) { > if(!replica.getSlot().isValid()) { > purge(replica); > } > } > //Existing code... > int numDemoted = demoteOldEvictableMmaped(curMs); > int numPurged = 0; > Long evictionTimeNs = Long.valueOf(0); > …. > ….. > Apps like HBase can tweak the expiry/staleness/cache-size params in the > DFS-Client, so that a ShortCircuitReplica will never be closed except when its Slot > is declared invalid. > I assume slot-invalidation will happen during block-invalidation/deletes > {Primarily triggered by compaction/shard-takeover etc..} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8922) IBM Java requires libdl for linking in native_mini_dfs
[ https://issues.apache.org/jira/browse/HDFS-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704149#comment-14704149 ] Colin Patrick McCabe commented on HDFS-8922: +1 > IBM Java requires libdl for linking in native_mini_dfs > -- > > Key: HDFS-8922 > URL: https://issues.apache.org/jira/browse/HDFS-8922 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 2.7.1 > Environment: IBM Java RHEL7.1 >Reporter: Ayappan > Attachments: HDFS-8922.patch > > > Building hadoop-hdfs-project with -Pnative option using IBM Java fails with > the following error > [exec] Linking C executable test_native_mini_dfs > [exec] /usr/bin/cmake -E cmake_link_script > CMakeFiles/test_native_mini_dfs.dir/link.txt --verbose=1 > [exec] /usr/bin/cc -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE > -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fvisibility=hidden > CMakeFiles/test_native_mini_dfs.dir/main/native/libhdfs/test_native_mini_dfs.c.o > -o test_native_mini_dfs -rdynamic libnative_mini_dfs.a > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so -lpthread > -Wl,-rpath,/home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic > [exec] make[2]: Leaving directory > `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native' > [exec] make[1]: Leaving directory > `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlopen' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlclose' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlerror' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlsym' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference 
to `dladdr' > [exec] collect2: error: ld returned 1 exit status > [exec] make[2]: *** [test_native_mini_dfs] Error 1 > [exec] make[1]: *** [CMakeFiles/test_native_mini_dfs.dir/all] Error 2 > [exec] make: *** [all] Error 2 > It seems like the IBM jvm requires libdl for linking in native_mini_dfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704147#comment-14704147 ] Lei (Eddy) Xu commented on HDFS-8924: - This patch mostly proposes an interface, and it looks good to me. Only a few minor comments: * Could you remove the change in {{ClientContext.java}}? * Could {{ReplicaAccessor}} and {{ReplicaAccessorBuilder}} be made {{interface}}s? +1 once these are addressed. > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8920) Erasure Coding: when recovering lost blocks, logs can be too verbose and hurt performance
[ https://issues.apache.org/jira/browse/HDFS-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704148#comment-14704148 ] Colin Patrick McCabe commented on HDFS-8920: These log messages reflect possible data loss. I do not think we should change the log level here, although maybe there is another change we could make. > Erasure Coding: when recovering lost blocks, logs can be too verbose and hurt > performance > - > > Key: HDFS-8920 > URL: https://issues.apache.org/jira/browse/HDFS-8920 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rui Li >Assignee: Rui Li > > When we test reading data with datanodes killed, > {{DFSInputStream::getBestNodeDNAddrPair}} becomes a hot spot method and > effectively blocks the client JVM. This log seems too verbose: > {code} > if (chosenNode == null) { > DFSClient.LOG.warn("No live nodes contain block " + block.getBlock() + > " after checking nodes = " + Arrays.toString(nodes) + > ", ignoredNodes = " + ignoredNodes); > return null; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
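One middle ground between keeping and dropping the warning is to rate-limit it, so the hot path stays cheap while occasional evidence of data loss still reaches the log. A minimal sketch with hypothetical names (this helper is not part of DFSClient):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: allows at most one log emission per interval.
// Callers wrap the expensive warn (which stringifies the node arrays)
// in an if (throttle.shouldLog()) { ... } guard.
class LogThrottle {
    private final long intervalNanos;
    private final AtomicLong lastEmit;

    LogThrottle(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
        // Start "one interval in the past" so the first call logs.
        this.lastEmit = new AtomicLong(System.nanoTime() - intervalNanos);
    }

    /** Returns true if the caller should emit the message now. */
    boolean shouldLog() {
        long now = System.nanoTime();
        long last = lastEmit.get();
        // compareAndSet ensures only one racing thread wins the slot.
        return now - last >= intervalNanos && lastEmit.compareAndSet(last, now);
    }
}
```

The warning itself is unchanged; only its frequency is bounded, which addresses the hot-spot concern without hiding the data-loss signal.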
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704102#comment-14704102 ] Xiaobing Zhou commented on HDFS-8855: - Thanks [~wheat9] and [~bobhansen] for the review. Tracing down into ProtobufRpcEngine shows that it calls Client.getConnection, which fetches the connection from a cache (com.google.common.cache.Cache) keyed by ConnectionId. The ConnectionId is different for every webhdfs request even when the URL and user are the same, which is why the DN keeps creating new NN connections in this case. We need to refactor ConnectionId (its hashCode or equals) to satisfy the comparison invariant the cache assumes, so that connections are properly reused. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs > Environment: HDP 2.2 >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
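The invariant Xiaobing describes can be illustrated with a simplified stand-in for ConnectionId (the real class carries more state, such as the protocol and retry policy; this is an illustration, not the actual Hadoop code): two keys describing the same (address, user) pair must compare equal and hash alike, or every request misses the cache and opens a fresh NN connection.

```java
import java.util.Objects;

// Simplified stand-in for o.a.h.ipc.Client.ConnectionId.
final class ConnectionKey {
    private final String address; // NN host:port
    private final String user;    // effective user

    ConnectionKey(String address, String user) {
        this.address = address;
        this.user = user;
    }

    // Without these overrides, each webhdfs request builds a distinct
    // key object, the cache never hits, and NN connections pile up.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ConnectionKey)) {
            return false;
        }
        ConnectionKey that = (ConnectionKey) o;
        return address.equals(that.address) && user.equals(that.user);
    }

    @Override
    public int hashCode() {
        return Objects.hash(address, user);
    }
}
```

With value-based equality in place, a map or Guava cache keyed on such an object returns the existing connection for repeated requests from the same user to the same NameNode.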
[jira] [Commented] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs
[ https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704101#comment-14704101 ] Hadoop QA commented on HDFS-6481: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. | | {color:blue}0{color} | pre-patch | 18m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 9s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 27s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 46s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 181m 21s | Tests failed in hadoop-hdfs. 
| | | | 228m 58s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestNameNodeMetricsLogger | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12648186/hdfs-6481-v1.txt | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3aac475 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12047/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12047/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12047/console | This message was automatically generated. > DatanodeManager#getDatanodeStorageInfos() should check the length of > storageIDs > --- > > Key: HDFS-6481 > URL: https://issues.apache.org/jira/browse/HDFS-6481 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Ted Yu >Assignee: Ted Yu > Labels: BB2015-05-TBR > Attachments: hdfs-6481-v1.txt > > > Ian Brooks reported the following stack trace: > {code} > 2014-06-03 13:05:03,915 WARN [DataStreamer for file > /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200 > block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] > hdfs.DFSClient: DataStreamer Exception > org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): > 0 > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security
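The failure mode in the trace above — indexing past the end of the storageIDs array — suggests validating the array lengths up front and failing with a descriptive message instead of an ArrayIndexOutOfBoundsException. A hedged sketch (class and method names are illustrative, not the actual DatanodeManager code):

```java
// Illustrative guard: a getDatanodeStorageInfos-style method that zips
// a datanode-ID array with a storage-ID array should verify that both
// arrays line up before indexing either one.
class StorageArgsCheck {
    static void checkLengths(String[] datanodeIDs, String[] storageIDs) {
        if (datanodeIDs == null || storageIDs == null
                || datanodeIDs.length != storageIDs.length) {
            // A precise message beats an opaque index-out-of-bounds
            // surfacing to the client as a RemoteException.
            throw new IllegalArgumentException(
                "Mismatched datanodeIDs/storageIDs lengths: "
                + (datanodeIDs == null ? "null"
                    : String.valueOf(datanodeIDs.length))
                + " vs "
                + (storageIDs == null ? "null"
                    : String.valueOf(storageIDs.length)));
        }
    }
}
```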
[jira] [Commented] (HDFS-8809) HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing blocks) when HBase is running
[ https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704088#comment-14704088 ] Hadoop QA commented on HDFS-8809: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 22s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 16s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 19s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 34s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 23s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 177m 26s | Tests failed in hadoop-hdfs. 
| | | | 221m 53s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751321/HDFS-8809.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3aac475 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12048/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12048/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12048/console | This message was automatically generated. > HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing > blocks) when HBase is running > --- > > Key: HDFS-8809 > URL: https://issues.apache.org/jira/browse/HDFS-8809 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.7.0 > Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other > Linuxes not tested, probably not platform-dependent). This did NOT happen > with Hadoop 2.4 and HBase 0.98. >Reporter: Sudhir Prakash >Assignee: Jing Zhao > Attachments: HDFS-8809.000.patch > > > Whenever HBase is running, the "hdfs fsck /" reports four hbase-related > files in the path "hbase/data/WALs/" as CORRUPT. Even after letting the > cluster sit idle for a couple hours, it is still in the corrupt state. If > HBase is shut down, the problem goes away. If HBase is then restarted, the > problem recurs. This was observed with Hadoop 2.7.1 and HBase 1.1.1, and did > NOT happen with Hadoop 2.4 and HBase 0.98. 
> {code} > hades1:/var/opt/teradata/packages # su hdfs > hdfs@hades1:/var/opt/teradata/packages> hdfs fsck / > Connecting to namenode via > http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F > FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 > 20:40:17 GMT 2015 > ... > /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301: > MISSING 1 blocks of total size 83 > B.. > > .
[jira] [Updated] (HDFS-8900) Optimize XAttr memory footprint.
[ https://issues.apache.org/jira/browse/HDFS-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8900: - Summary: Optimize XAttr memory footprint. (was: Improve XAttr memory footprint.) > Optimize XAttr memory footprint. > > > Key: HDFS-8900 > URL: https://issues.apache.org/jira/browse/HDFS-8900 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Yi Liu >Assignee: Yi Liu > > {code} > private final ImmutableList<XAttr> xAttrs; > {code} > Currently we use the above in XAttrFeature; it's not efficient from a memory point > of view, since {{ImmutableList}} and {{XAttr}} have per-object memory overhead, > and each object incurs memory-alignment padding. > We can use a {{byte[]}} in XAttrFeature and do some compaction in {{XAttr}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
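The direction Yi describes — replacing an {{ImmutableList<XAttr>}} with one packed {{byte[]}} — trades N small objects (each with header and alignment overhead) for a single array. A toy sketch of such an encoding (single-byte lengths for brevity; a real implementation would use varints plus a namespace prefix, and this is not the actual HDFS code):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Toy layout per entry: [nameLen][name bytes][valueLen][value bytes],
// with each length stored in a single unsigned byte.
class XAttrPacker {
    static byte[] pack(List<String[]> xattrs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] kv : xattrs) {
            byte[] name = kv[0].getBytes(StandardCharsets.UTF_8);
            byte[] value = kv[1].getBytes(StandardCharsets.UTF_8);
            out.write(name.length);
            out.write(name, 0, name.length);
            out.write(value.length);
            out.write(value, 0, value.length);
        }
        return out.toByteArray();
    }

    /** Linear scan for a name; returns its value, or null if absent. */
    static String getValue(byte[] packed, String wanted) {
        int i = 0;
        while (i < packed.length) {
            int nameLen = packed[i++] & 0xff;
            String name =
                new String(packed, i, nameLen, StandardCharsets.UTF_8);
            i += nameLen;
            int valueLen = packed[i++] & 0xff;
            if (name.equals(wanted)) {
                return new String(packed, i, valueLen,
                    StandardCharsets.UTF_8);
            }
            i += valueLen;
        }
        return null;
    }
}
```

Lookup becomes a scan over one array instead of a walk over N objects, which is the memory/CPU trade the JIRA is proposing.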
[jira] [Updated] (HDFS-8926) Update the distcp document for new improvements by using snapshot diff report
[ https://issues.apache.org/jira/browse/HDFS-8926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated HDFS-8926: --- Description: HDFS-8828 utilizes the snapshot diff report to build the incremental copy list in distcp. We should update the DistCp document to describe how to use this feature. > Update the distcp document for new improvements by using snapshot diff report > - > > Key: HDFS-8926 > URL: https://issues.apache.org/jira/browse/HDFS-8926 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, documentation >Reporter: Yufei Gu >Assignee: Yufei Gu > > HDFS-8828 utilizes the snapshot diff report to build the incremental copy list in > distcp. We should update the DistCp document to describe how to use this > feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8926) Update the distcp document for new improvements by using snapshot diff report
Yufei Gu created HDFS-8926: -- Summary: Update the distcp document for new improvements by using snapshot diff report Key: HDFS-8926 URL: https://issues.apache.org/jira/browse/HDFS-8926 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, documentation Reporter: Yufei Gu Assignee: Yufei Gu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704039#comment-14704039 ] Yufei Gu commented on HDFS-8828: Thank you very much, [~jingzhao]! Thank you very much, [~yzhangal]! I've uploaded a new patch 011 addressing all your comments. I will create a follow-up jira for the document. Thanks. > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch > > > Some users reported huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list that includes only the files/dirs changed between two snapshots > (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots, or between a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. So it still relies on the default distcp to build the complete > list of files under the source dir. This patch only puts created and > modified files into the copy list based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated HDFS-8828: --- Attachment: HDFS-8828.011.patch > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch, HDFS-8828.011.patch > > > Some users reported huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list that includes only the files/dirs changed between two snapshots > (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots, or between a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. So it still relies on the default distcp to build the complete > list of files under the source dir. This patch only puts created and > modified files into the copy list based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8823: - Attachment: HDFS-8823.006.patch > Move replication factor into individual blocks > -- > > Key: HDFS-8823 > URL: https://issues.apache.org/jira/browse/HDFS-8823 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, > HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, > HDFS-8823.005.patch, HDFS-8823.006.patch > > > This jira proposes to record the replication factor in the {{BlockInfo}} > class. The changes have two advantages: > * Decoupling the namespace and the block management layer. It is a > prerequisite step to move block management off the heap or to a separate > process. > * Increased flexibility on replicating blocks. Currently the replication > factors of all blocks have to be the same. The replication factors of these > blocks are equal to the highest replication factor across all snapshots. The > changes will allow blocks in a file to have different replication factor, > potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704021#comment-14704021 ] Yongjun Zhang commented on HDFS-8828: - Thanks a lot [~jingzhao]! Hi [~yufeigu], Would you please work out a new rev to address our latest comments? I will commit it after Jenkins passes. And would you please create a follow-up jira for the document update afterwards? Thanks. > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list that includes only the files/dirs changed between two snapshots > (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots, or between a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. So it still relies on the default distcp to build the complete > list of files under the source dir. This patch only puts created and > modified files into the copy list based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated HDFS-8287: - Attachment: HDFS-8287-HDFS-7285.05.patch > DFSStripedOutputStream.writeChunk should not wait for writing parity > - > > Key: HDFS-8287 > URL: https://issues.apache.org/jira/browse/HDFS-8287 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Kai Sasaki > Attachments: HDFS-8287-HDFS-7285.00.patch, > HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, > HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch, > HDFS-8287-HDFS-7285.05.patch > > > When a striping cell is full, writeChunk computes and generates parity > packets. It sequentially calls waitAndQueuePacket, so the user client cannot > continue to write data until it finishes. We should instead allow the user client to > continue writing rather than blocking it while parity is written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
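The non-blocking behavior requested here can be sketched with an executor that computes and queues parity off the data path, so writeChunk returns to the user immediately; the stream would then sync on the outstanding futures at flush/close. The names and the trivial XOR "encoder" below are simplifications, not the actual DFSStripedOutputStream code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: hand parity generation to a background thread so the
// data-writing path is not blocked while parity packets are queued.
class AsyncParityWriter {
    private final ExecutorService parityPool =
        Executors.newSingleThreadExecutor();

    /** Kick off parity generation for a full cell without blocking the
     *  caller; the stream can await the future at flush/close time. */
    Future<byte[]> writeParityAsync(byte[] cellData) {
        return parityPool.submit(() -> computeParity(cellData));
    }

    /** Stand-in for the real Reed-Solomon encoder: XOR over the cell. */
    static byte[] computeParity(byte[] cell) {
        byte parity = 0;
        for (byte b : cell) {
            parity ^= b;
        }
        return new byte[] { parity };
    }

    void shutdown() {
        parityPool.shutdown();
    }
}
```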
[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704010#comment-14704010 ] Jing Zhao commented on HDFS-8823: - The latest patch looks good to me. One comment is that for {{setReplication}}, instead of checking/updating quota block by block, we should calculate the total delta first and check if it breaks the quota limit. {code} // Ensure the quota does not exceed if (oldBR < replication) { for (BlockInfo b : file.getBlocks()) { fsd.updateCount(iip, 0L, b.getNumBytes(), oldBR, replication, true); } } {code} [~daryn], do you want to take a look at the patch? > Move replication factor into individual blocks > -- > > Key: HDFS-8823 > URL: https://issues.apache.org/jira/browse/HDFS-8823 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch, > HDFS-8823.002.patch, HDFS-8823.003.patch, HDFS-8823.004.patch, > HDFS-8823.005.patch > > > This jira proposes to record the replication factor in the {{BlockInfo}} > class. The changes have two advantages: > * Decoupling the namespace and the block management layer. It is a > prerequisite step to move block management off the heap or to a separate > process. > * Increased flexibility on replicating blocks. Currently the replication > factors of all blocks have to be the same. The replication factors of these > blocks are equal to the highest replication factor across all snapshots. The > changes will allow blocks in a file to have different replication factor, > potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
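Jing's suggestion — compute one aggregate delta and perform a single quota check, rather than checking and updating block by block — can be sketched as follows (names are hypothetical; the real code paths go through FSDirectory and quota counts):

```java
// Sketch of an all-or-nothing quota check for a replication change.
class QuotaCheckSketch {
    /** Total additional bytes implied by raising the replication
     *  factor from oldBR to newBR across all of a file's blocks. */
    static long totalDelta(long[] blockSizes, int oldBR, int newBR) {
        long delta = 0;
        for (long size : blockSizes) {
            delta += size * (newBR - oldBR);
        }
        return delta;
    }

    /** One check for the whole file. Checking per block could partially
     *  update counts before discovering the quota is exceeded. */
    static void checkQuota(long remainingQuota, long[] blockSizes,
                           int oldBR, int newBR) {
        long delta = totalDelta(blockSizes, oldBR, newBR);
        if (delta > remainingQuota) {
            throw new IllegalStateException("Quota exceeded by "
                + (delta - remainingQuota) + " bytes");
        }
    }
}
```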
[jira] [Commented] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703974#comment-14703974 ] Andrew Wang commented on HDFS-: --- I see encryption zones as the closest thing semantically to volumes right now because of the rename restriction, and they've been incompatible with some applications like Hive (which we fixed) and HDFS trash (which we haven't). Right now that pain is restricted to the subset of HDFS users who are also using encryption, but volumes as a first-class citizen will bring this into the spotlight. Volumes might be compelling enough to revisit the various rename assumptions in our app stack, but we need to think hard about the app changes that would be required. The motivations you've listed for the first phase of development reference simpler implementation and management. Regarding implementation, we've already implemented the additional complexity of doing it at the directory level, so what's the advantage of changing it up now? Management-wise, I don't quite understand why it's easier to manage volumes vs. folders. You can treat some folders as you would volumes and get the same properties, right? The scalability motivations are more compelling to me since it's something we can't do now, but there's still more vertical scalability work we can do first that preserves existing semantics. There's also the question of whether we want to pursue volumes vs. a true distributed namespace implementation, which might preserve existing semantics. Finally, is this going to be linked with viewfs improvements? If volumes are a first-class citizen and are being added and removed all the time, it'd be nice to have a centralized mount table rather than having to push out new client configs each time. We'd also need it to be able to, say, list the set of volumes, or automatically choose a NN when provisioning or rebalancing volumes. 
> Support volumes in HDFS > --- > > Key: HDFS- > URL: https://issues.apache.org/jira/browse/HDFS- > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai > > There are multiple types of zones (e.g., snapshottable directories, > encryption zones, directories with quotas) which are conceptually close to > namespace volumes in traditional file systems. > This jira proposes to introduce the concept of volume to simplify the > implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703956#comment-14703956 ] Colin Patrick McCabe commented on HDFS-8924: One example is a storage appliance which wanted to create a custom short-circuit read implementation. Creating a new BlockReader does not satisfy this use-case because this block reader would have to have hardware-specific code which does not belong in upstream. For example, it would have dependencies on specific JNI and other libraries for interfacing with the hardware. > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
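To make the plugin shape concrete, here is a hedged sketch of what such an interface pair could look like. The names mirror those discussed in this issue, but the signatures are illustrative assumptions, not the committed HDFS-8924 API:

```java
// Illustrative plugin surface; not the actual committed API.
abstract class ReplicaAccessorSketch {
    /** Read up to len bytes at the given offset within the replica;
     *  returns bytes read, or -1 at end of replica. */
    public abstract int read(long pos, byte[] buf, int off, int len);

    public abstract void close();
}

// Builder the DFSClient would instantiate (e.g. via reflection from a
// configured class name) and consult before each block read.
abstract class ReplicaAccessorBuilderSketch {
    protected String fileName;
    protected long blockId;

    public ReplicaAccessorBuilderSketch setFileName(String fileName) {
        this.fileName = fileName;
        return this;
    }

    public ReplicaAccessorBuilderSketch setBlockId(long blockId) {
        this.blockId = blockId;
        return this;
    }

    /** Return null to fall back to the normal BlockReader path. */
    public abstract ReplicaAccessorSketch build();
}
```

A vendor's hardware-specific implementation (JNI and all) then lives outside the Hadoop tree, which is the point of the issue.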
[jira] [Commented] (HDFS-8925) Move BlockReader to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703957#comment-14703957 ] Haohui Mai commented on HDFS-8925: -- In general the development of feature branches does not block changes in trunk. Feature branches need to continuously stay in sync with trunk. Please feel free to create a jira if you think you need one to track the work of merging things back. > Move BlockReader to hdfs-client > --- > > Key: HDFS-8925 > URL: https://issues.apache.org/jira/browse/HDFS-8925 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > > This jira tracks the effort of moving the {{BlockReader}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length
[ https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8883: Reporter: Arpit Agarwal (was: Anu Engineer) > NameNode Metrics : Add FSNameSystem lock Queue Length > - > > Key: HDFS-8883 > URL: https://issues.apache.org/jira/browse/HDFS-8883 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Arpit Agarwal > Fix For: 2.8.0 > > Attachments: HDFS-8883.001.patch > > > FSNameSystemLock can have contention when NameNode is under load. This patch > adds LockQueueLength -- the number of threads waiting on FSNameSystemLock -- > as a metric in NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length
[ https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8883: Assignee: Anu Engineer > NameNode Metrics : Add FSNameSystem lock Queue Length > - > > Key: HDFS-8883 > URL: https://issues.apache.org/jira/browse/HDFS-8883 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Arpit Agarwal >Assignee: Anu Engineer > Fix For: 2.8.0 > > Attachments: HDFS-8883.001.patch > > > FSNameSystemLock can have contention when NameNode is under load. This patch > adds LockQueueLength -- the number of threads waiting on FSNameSystemLock -- > as a metric in NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length
[ https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8883: Assignee: (was: Anu Engineer) > NameNode Metrics : Add FSNameSystem lock Queue Length > - > > Key: HDFS-8883 > URL: https://issues.apache.org/jira/browse/HDFS-8883 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Anu Engineer > Fix For: 2.8.0 > > Attachments: HDFS-8883.001.patch > > > FSNameSystemLock can have contention when NameNode is under load. This patch > adds LockQueueLength -- the number of threads waiting on FSNameSystemLock -- > as a metric in NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests
[ https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703954#comment-14703954 ] Hudson commented on HDFS-8917: -- FAILURE: Integrated in Hadoop-trunk-Commit #8324 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8324/]) HDFS-8917. Cleanup BlockInfoUnderConstruction from comments and tests. Contributed by Zhe Zhang. (jing9: rev 4e14f7982a6e57bf08deb3b266806c2b779a157d) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotTestHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileUnderConstructionFeature.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfoUnderConstruction.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockUnderConstructionFeature.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java > Cleanup BlockInfoUnderConstruction from comments and tests > -- > > Key: HDFS-8917 > URL: https://issues.apache.org/jira/browse/HDFS-8917 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Minor > Fix For: 2.8.0 > > 
Attachments: HDFS-8917.00.patch > > > HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a > follow-on to cleanup comments and tests which refer to the class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length
[ https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8883: Reporter: Anu Engineer (was: Arpit Agarwal) > NameNode Metrics : Add FSNameSystem lock Queue Length > - > > Key: HDFS-8883 > URL: https://issues.apache.org/jira/browse/HDFS-8883 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: 2.8.0 > > Attachments: HDFS-8883.001.patch > > > FSNameSystemLock can have contention when NameNode is under load. This patch > adds LockQueueLength -- the number of threads waiting on FSNameSystemLock -- > as a metric in NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8925) Move BlockReader to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703952#comment-14703952 ] Zhe Zhang commented on HDFS-8925: - While {{BlockReader}} is only used on the client side in trunk, the HDFS-7285 branch uses it in DN. Maybe we should reconsider this refactor? Thanks. > Move BlockReader to hdfs-client > --- > > Key: HDFS-8925 > URL: https://issues.apache.org/jira/browse/HDFS-8925 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > > This jira tracks the effort of moving the {{BlockReader}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703946#comment-14703946 ] Kai Sasaki commented on HDFS-8287: -- [~rakeshr] Thank you for reviewing! {quote} We should improve this by notifying the writers, isn't it? {quote} That's reasonable. Currently neither the client side nor the streamer itself can handle exceptions thrown by {{ParityGenerator}}. There are two things we can do to improve this. 1. Consolidate the handling logic into {{UncaughtExceptionHandler}} to improve maintainability and readability. 2. Notify the client side of the exception from {{UncaughtExceptionHandler}}. I think this needs some more rewriting, so it is better done in a separate JIRA. Is that okay? I'll address the other points you raised in this JIRA. Thank you. > DFSStripedOutputStream.writeChunk should not wait for writing parity > - > > Key: HDFS-8287 > URL: https://issues.apache.org/jira/browse/HDFS-8287 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Kai Sasaki > Attachments: HDFS-8287-HDFS-7285.00.patch, > HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, > HDFS-8287-HDFS-7285.03.patch, HDFS-8287-HDFS-7285.04.patch > > > When a striping cell is full, writeChunk computes and generates parity > packets. It sequentially calls waitAndQueuePacket so that the user client cannot > continue to write data until it finishes. > We should allow the user client to continue writing instead of blocking it > while parity is being written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
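[Editor's note] The two improvements Kai describes, centralizing handling in an {{UncaughtExceptionHandler}} and surfacing the exception to the client thread, can be sketched in plain Java. This is a minimal illustration of the pattern only, not the HDFS-8287 patch; the thread name and exception are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StreamerExceptionDemo {
    /**
     * Runs a task on a background "streamer" thread. Any uncaught exception
     * is recorded by a single UncaughtExceptionHandler (point 1) so the
     * calling thread can observe it after join() (point 2).
     */
    public static Throwable runAndCapture(Runnable task) throws InterruptedException {
        AtomicReference<Throwable> captured = new AtomicReference<>();
        Thread streamer = new Thread(task, "parity-generator");
        // One central handler: records the failure instead of losing it.
        streamer.setUncaughtExceptionHandler((thread, error) -> captured.set(error));
        streamer.start();
        streamer.join();
        return captured.get(); // null if the task completed normally
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable t = runAndCapture(() -> {
            throw new IllegalStateException("parity write failed");
        });
        System.out.println(t.getMessage()); // parity write failed
    }
}
```

In a real streamer the captured throwable would be re-thrown to the writer on its next write call rather than printed.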
[jira] [Commented] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703943#comment-14703943 ] Hadoop QA commented on HDFS-6244: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751344/HDFS-6244.v6.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4e14f79 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12050/console | This message was automatically generated. > Make Trash Interval configurable for each of the namespaces > --- > > Key: HDFS-6244 > URL: https://issues.apache.org/jira/browse/HDFS-6244 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Labels: BB2015-05-TBR > Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, > HDFS-6244.v3.patch, HDFS-6244.v4.patch, HDFS-6244.v5.patch, HDFS-6244.v6.patch > > > Somehow we need to avoid the cluster filling up. > One solution is to have a different trash policy per namespace. However, if > we can simply make the property configurable per namespace, then the same > config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on
[ https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703942#comment-14703942 ] Hadoop QA commented on HDFS-8890: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 12s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 10s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 11s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 39s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 39s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 20s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 173m 56s | Tests failed in hadoop-hdfs. 
| | | | 217m 45s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestWriteBlockGetsBlockLengthHint | | Timed out tests | org.apache.hadoop.hdfs.server.balancer.TestBalancer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751316/HDFS-8890-trunk-v2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3aac475 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12046/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12046/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12046/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12046/console | This message was automatically generated. > Allow admin to specify which blockpools the balancer should run on > -- > > Key: HDFS-8890 > URL: https://issues.apache.org/jira/browse/HDFS-8890 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo > Attachments: HDFS-8890-trunk-v1.patch, HDFS-8890-trunk-v2.patch > > > Currently the balancer runs on all blockpools. Allow an admin to run the > balancer on a set of blockpools. This will enable the balancer to skip > blockpools that should not be balanced. For example, a tmp blockpool that has > a large amount of churn. > An example of the command line interface would be an additional flag that > specifies the blockpools by id: > -blockpools > BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8925) Move BlockReader to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8925: Description: This jira tracks the effort of moving the {{BlockReader}} class into the hdfs-client module. (was: This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module.) > Move BlockReader to hdfs-client > --- > > Key: HDFS-8925 > URL: https://issues.apache.org/jira/browse/HDFS-8925 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > > This jira tracks the effort of moving the {{BlockReader}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8925) Move BlockReader to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-8925: Hadoop Flags: (was: Reviewed) > Move BlockReader to hdfs-client > --- > > Key: HDFS-8925 > URL: https://issues.apache.org/jira/browse/HDFS-8925 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > > This jira tracks the effort of moving the {{DfsClientConf}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8925) Move BlockReader to hdfs-client
Mingliang Liu created HDFS-8925: --- Summary: Move BlockReader to hdfs-client Key: HDFS-8925 URL: https://issues.apache.org/jira/browse/HDFS-8925 Project: Hadoop HDFS Issue Type: Sub-task Components: build Reporter: Mingliang Liu Assignee: Mingliang Liu Fix For: 2.8.0 This jira tracks the effort of moving the {{DfsClientConf}} class into the hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-6290) File is not closed in OfflineImageViewerPB#run()
[ https://issues.apache.org/jira/browse/HDFS-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai resolved HDFS-6290. -- Resolution: Won't Fix I don't think this is worth fixing as the life cycle of the file closely matches the life cycle of the process. The file will be automatically closed when the process exits. > File is not closed in OfflineImageViewerPB#run() > > > Key: HDFS-6290 > URL: https://issues.apache.org/jira/browse/HDFS-6290 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Ted Yu >Priority: Minor > > {code} > } else if (processor.equals("XML")) { > new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile, > "r")); > {code} > The RandomAccessFile instance should be closed before the method returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
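[Editor's note] For reference, the leak Ted flags would be handled automatically by try-with-resources. A minimal sketch under stated assumptions: reading the file length stands in for the real visitor call, since the point here is only the resource handling, not the {{PBImageXmlWriter}} API.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseDemo {
    /**
     * Opens the file in a try-with-resources block; close() is guaranteed
     * to run even if the body throws, so nothing depends on process exit.
     */
    public static long lengthOf(String inputFile) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(inputFile, "r")) {
            return file.length(); // stand-in for a visitor consuming the file
        } // file.close() runs here automatically
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fsimage", ".bin");
        Files.write(tmp, new byte[] {1, 2, 3});
        System.out.println(lengthOf(tmp.toString())); // 3
        Files.delete(tmp);
    }
}
```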
[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703926#comment-14703926 ] Haohui Mai commented on HDFS-8924: -- Do you have any specific use cases in mind? Will creating a new {{BlockReader}} satisfy your use case? > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703923#comment-14703923 ] Hadoop QA commented on HDFS-8828: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 53s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 27s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 6s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 6m 24s | Tests passed in hadoop-distcp. 
| | | | 43m 40s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751292/HDFS-8828.010.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 4e14f79 | | hadoop-distcp test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12049/artifact/patchprocess/testrun_hadoop-distcp.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12049/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12049/console | This message was automatically generated. > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported a huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less time building the copy list. 2. fewer file-copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or between a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. So it still relies on default distcp to build the complete > list of files under the source dir.
This patch only puts creation and > modification files into the copy list based on snapshot diff report. We can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703916#comment-14703916 ] Colin Patrick McCabe commented on HDFS-8924: This patch adds a pluggable {{ReplicaAccessorBuilder}} class which can be used to create {{ReplicaAccessor}} objects. Unlike {{BlockReader}}, {{ReplicaAccessor}} is a stable API which is decoupled from internal implementation details and non-public classes. {{BlockReaderFactory}} will ask all of the configured {{ReplicaAccessorBuilder}} objects to create a new {{ReplicaAccessor}}. If none are configured, or none can create one, we use the existing block reader code. Otherwise, we create an {{ExternalBlockReader}} wrapping the {{ReplicaAccessor}}. I also added a reserved {{DataTransferProtocol}} opcode (127) in {{Op.java}}. This will ensure that anyone adding a custom opcode will not conflict with other new opcodes added upstream. > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
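[Editor's note] The fallback chain Colin describes, asking each configured builder for an accessor and using the existing reader path only when no plugin can serve the read, can be sketched as below. The interface names follow the comment; the method signatures here are illustrative assumptions, not the actual HDFS-8924 API.

```java
import java.util.Arrays;
import java.util.List;

public class AccessorChainDemo {
    public interface ReplicaAccessor { String name(); }
    public interface ReplicaAccessorBuilder { ReplicaAccessor build(); } // may return null

    /** First builder that can produce an accessor wins; otherwise fall back. */
    public static ReplicaAccessor choose(List<ReplicaAccessorBuilder> builders,
                                         ReplicaAccessor defaultReader) {
        for (ReplicaAccessorBuilder builder : builders) {
            ReplicaAccessor accessor = builder.build();
            if (accessor != null) {
                return accessor; // a configured plugin handles this replica
            }
        }
        return defaultReader; // none configured or none applied: existing reader code
    }

    public static void main(String[] args) {
        ReplicaAccessor fallback = () -> "default";
        // One builder declines (returns null), the next succeeds.
        List<ReplicaAccessorBuilder> builders =
            Arrays.asList(() -> null, () -> (ReplicaAccessor) () -> "plugin");
        System.out.println(choose(builders, fallback).name());        // plugin
        System.out.println(choose(Arrays.asList(), fallback).name()); // default
    }
}
```

The design choice worth noting is that plugins decline by returning null, so a misconfigured or inapplicable plugin degrades gracefully to the stock reader instead of failing the read.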
[jira] [Updated] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8924: --- Attachment: HDFS-8924.001.patch > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8924: --- Status: Patch Available (was: Open) > Add pluggable interface for reading replicas in DFSClient > - > > Key: HDFS-8924 > URL: https://issues.apache.org/jira/browse/HDFS-8924 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-8924.001.patch > > > We should add a pluggable interface for reading replicas in the DFSClient. > This could be used to implement short-circuit reads on systems without file > descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8924) Add pluggable interface for reading replicas in DFSClient
Colin Patrick McCabe created HDFS-8924: -- Summary: Add pluggable interface for reading replicas in DFSClient Key: HDFS-8924 URL: https://issues.apache.org/jira/browse/HDFS-8924 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.8.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe We should add a pluggable interface for reading replicas in the DFSClient. This could be used to implement short-circuit reads on systems without file descriptors, or for other optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703863#comment-14703863 ] Jing Zhao commented on HDFS-8828: - Thanks for updating the patch, Yufei! The latest patch looks good to me. One nit is that the two "=null" initializations can be skipped. {code} DistCpSync(DistCpOptions options, Configuration conf) { this.inputOptions = options; this.conf = conf; this.diffMap = null; this.renameDiffs = null; } {code} +1 after addressing this and Yongjun's comments. Also looks like we need to update the distcp doc for this functionality? Please feel free to do it in a separate jira. > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported a huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less time building the copy list. 2. fewer file-copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or between a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp. So it still relies on default distcp to build the complete > list of files under the source dir. 
We can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8923) Add -source flag to balancer usage message
[ https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703855#comment-14703855 ] Chris Trezzo commented on HDFS-8923: Test failure is unrelated. Patch should be good to go. > Add -source flag to balancer usage message > -- > > Key: HDFS-8923 > URL: https://issues.apache.org/jira/browse/HDFS-8923 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Trivial > Attachments: HDFS-8923-trunk-v1.patch > > > HDFS-8826 added a -source flag to the balancer, but the usage message still > needs to be updated. See current usage message in trunk: > {code} >private static final String USAGE = "Usage: hdfs balancer" >+ "\n\t[-policy <policy>]\tthe balancing policy: " >+ BalancingPolicy.Node.INSTANCE.getName() + " or " >+ BalancingPolicy.Pool.INSTANCE.getName() >+ "\n\t[-threshold <threshold>]\tPercentage of disk capacity" >+ "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tExcludes the specified datanodes." >+ "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tIncludes only the specified datanodes." >+ "\n\t[-idleiterations <idleiterations>]" >+ "\tNumber of consecutive idle iterations (-1 for Infinite) before " >+ "exit." >+ "\n\t[-runDuringUpgrade]" >+ "\tWhether to run the balancer during an ongoing HDFS upgrade." >+ "This is usually not desired since it will not affect used space " >+ "on over-utilized machines."; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests
[ https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703851#comment-14703851 ] Zhe Zhang commented on HDFS-8917: - Thanks Jing for reviewing! > Cleanup BlockInfoUnderConstruction from comments and tests > -- > > Key: HDFS-8917 > URL: https://issues.apache.org/jira/browse/HDFS-8917 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8917.00.patch > > > HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a > follow-on to cleanup comments and tests which refer to the class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8918) Convert BlockUnderConstructionFeature#replicas from list to array
[ https://issues.apache.org/jira/browse/HDFS-8918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang resolved HDFS-8918. - Resolution: Duplicate > Convert BlockUnderConstructionFeature#replicas from list to array > - > > Key: HDFS-8918 > URL: https://issues.apache.org/jira/browse/HDFS-8918 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > > {{BlockInfoUnderConstruction}} / {{BlockUnderConstructionFeature}} uses a > List to store its {{replicas}}. To reduce memory usage, we can use an array > instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
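[Editor's note] The memory argument in HDFS-8918 can be made concrete with a small sketch. This is illustrative only, not the actual BlockUnderConstructionFeature code: an exact-size array avoids both the ArrayList wrapper object and its slack capacity, which matters when the NameNode holds one such structure per under-construction block.

```java
import java.util.Arrays;

public class ReplicaArrayDemo {
    // Exact-size array instead of an ArrayList with spare capacity.
    private Object[] replicas = new Object[0];

    public void addReplica(Object replica) {
        // Grow by exactly one slot. Replica counts are small, so the O(n)
        // copy per add is cheaper than keeping unused list capacity resident.
        replicas = Arrays.copyOf(replicas, replicas.length + 1);
        replicas[replicas.length - 1] = replica;
    }

    public int getNumReplicas() {
        return replicas.length;
    }

    public static void main(String[] args) {
        ReplicaArrayDemo demo = new ReplicaArrayDemo();
        demo.addReplica("dn1");
        demo.addReplica("dn2");
        System.out.println(demo.getNumReplicas()); // 2
    }
}
```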
[jira] [Updated] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests
[ https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8917: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed this. Thanks Zhe for the contribution! > Cleanup BlockInfoUnderConstruction from comments and tests > -- > > Key: HDFS-8917 > URL: https://issues.apache.org/jira/browse/HDFS-8917 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Minor > Fix For: 2.8.0 > > Attachments: HDFS-8917.00.patch > > > HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a > follow-on to cleanup comments and tests which refer to the class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8917) Cleanup BlockInfoUnderConstruction from comments and tests
[ https://issues.apache.org/jira/browse/HDFS-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703819#comment-14703819 ] Jing Zhao commented on HDFS-8917: - +1. I will commit it shortly. > Cleanup BlockInfoUnderConstruction from comments and tests > -- > > Key: HDFS-8917 > URL: https://issues.apache.org/jira/browse/HDFS-8917 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 2.8.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang >Priority: Minor > Attachments: HDFS-8917.00.patch > > > HDFS-8801 eliminates the {{BlockInfoUnderConstruction}} class. This JIRA is a > follow-on to cleanup comments and tests which refer to the class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8909) Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use BlockUnderConstructionFeature
[ https://issues.apache.org/jira/browse/HDFS-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8909: Attachment: HDFS-8909.001.patch Thanks for the review, Zhe! Updated the patch to address your comments. bq. separate ReplicaUnderConstruction Currently I prefer to separate it out as a standalone class. Maybe we can also do this in trunk. bq. a contiguous block should not have different IDs reported. Should we add some assertion to be more clear Here the issue is that the BlockUnderConstructionFeature itself does not know whether the block is striped or not, so there is no way to do extra verification. But since the reported block is added as a replica only if its block ID maps to the BlockInfo object, I do not think we need an extra assertion here. bq. simplify BlockInfo#convertToBlockUnderConstruction Currently I want to make sure that when we create a BlockUCFeature, the expected locations are already passed in. Thus I left the storage array in the constructor parameter list. But please let me know if you have strong feelings about it. > Erasure coding: update BlockInfoContiguousUC and BlockInfoStripedUC to use > BlockUnderConstructionFeature > > > Key: HDFS-8909 > URL: https://issues.apache.org/jira/browse/HDFS-8909 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Jing Zhao > Attachments: HDFS-8909.000.patch, HDFS-8909.001.patch > > > HDFS-8801 converts {{BlockInfoUC}} as a feature. We should consolidate > {{BlockInfoContiguousUC}} and {{BlockInfoStripedUC}} logics to use this > feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8923) Add -source flag to balancer usage message
[ https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703778#comment-14703778 ] Hadoop QA commented on HDFS-8923: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 35s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 31s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 32s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 45s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 20s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 93m 2s | Tests failed in hadoop-hdfs. 
| | | | 140m 52s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.hdfs.TestPread | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751312/HDFS-8923-trunk-v1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3aac475 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12045/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12045/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12045/console | This message was automatically generated. > Add -source flag to balancer usage message > -- > > Key: HDFS-8923 > URL: https://issues.apache.org/jira/browse/HDFS-8923 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Trivial > Attachments: HDFS-8923-trunk-v1.patch > > > HDFS-8826 added a -source flag to the balancer, but the usage message still > needs to be updated. See the current usage message in trunk: > {code} >private static final String USAGE = "Usage: hdfs balancer" >+ "\n\t[-policy <policy>]\tthe balancing policy: " >+ BalancingPolicy.Node.INSTANCE.getName() + " or " >+ BalancingPolicy.Pool.INSTANCE.getName() >+ "\n\t[-threshold <threshold>]\tPercentage of disk capacity" >+ "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tExcludes the specified datanodes." >+ "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tIncludes only the specified datanodes." >+ "\n\t[-idleiterations <idleiterations>]" >+ "\tNumber of consecutive idle iterations (-1 for Infinite) before " >+ "exit." >+ "\n\t[-runDuringUpgrade]" >+ "\tWhether to run the balancer during an ongoing HDFS upgrade." 
>+ "This is usually not desired since it will not affect used space " >+ "on over-utilized machines."; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
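A hedged sketch of the usage entry the patch would add for the -source flag, mirroring the existing -exclude/-include entries; the exact wording and argument form are assumptions, not the text of HDFS-8923-trunk-v1.patch:

```java
public class BalancerUsageSketch {
    // Hypothetical usage line for -source, modeled on the -exclude/-include
    // entries above; the committed wording may differ.
    static final String SOURCE_USAGE =
        "\n\t[-source [-f <hosts-file> | <comma-separated list of hosts>]]"
        + "\tPick only the specified datanodes as source nodes.";

    public static void main(String[] args) {
        // The new line would simply be concatenated into USAGE.
        System.out.println(SOURCE_USAGE.contains("[-source")); // prints "true"
    }
}
```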
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703777#comment-14703777 ] Yongjun Zhang commented on HDFS-8828: - The build failure may be an intermittent one. I manually kicked off a new run at https://builds.apache.org/job/PreCommit-HDFS-Build/12049/ > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported a huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp, so it still relies on default distcp to build a complete > list of files under the source dir. This patch puts only created and > modified files into the copy list based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8922) IBM Java requires libdl for linking in native_mini_dfs
[ https://issues.apache.org/jira/browse/HDFS-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703754#comment-14703754 ] Hadoop QA commented on HDFS-8922: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 44s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 13s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 25s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 1m 1s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 176m 59s | Tests failed in hadoop-hdfs. 
| | | | 194m 21s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751299/HDFS-8922.patch | | Optional Tests | javac unit | | git revision | trunk / f61120d | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12043/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12043/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12043/console | This message was automatically generated. > IBM Java requires libdl for linking in native_mini_dfs > -- > > Key: HDFS-8922 > URL: https://issues.apache.org/jira/browse/HDFS-8922 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 2.7.1 > Environment: IBM Java RHEL7.1 >Reporter: Ayappan > Attachments: HDFS-8922.patch > > > Building hadoop-hdfs-project with -Pnative option using IBM Java fails with > the following error > [exec] Linking C executable test_native_mini_dfs > [exec] /usr/bin/cmake -E cmake_link_script > CMakeFiles/test_native_mini_dfs.dir/link.txt --verbose=1 > [exec] /usr/bin/cc -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE > -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fvisibility=hidden > CMakeFiles/test_native_mini_dfs.dir/main/native/libhdfs/test_native_mini_dfs.c.o > -o test_native_mini_dfs -rdynamic libnative_mini_dfs.a > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so -lpthread > -Wl,-rpath,/home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic > [exec] make[2]: Leaving directory > `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native' > 
[exec] make[1]: Leaving directory > `/home/ayappan/hadoop_2.7.1_new/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlopen' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlclose' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlerror' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dlsym' > [exec] > /home/ayappan/ibm-java-ppc64le-71/jre/lib/ppc64le/classic/libjvm.so: > undefined reference to `dladdr' > [exec] collect2: error: ld returned 1 exit status > [exec] make[2]: *** [test_native_mini_dfs] Error 1 > [exec] make[1]: *** [CMakeFiles/test_native_mini_dfs.dir/all] Error 2 > [exec] make: *** [all] Error 2 > It seems like the IBM jvm requires libdl for linking in native_mini_dfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
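The undefined `dlopen`/`dlsym`/`dladdr` references come from `libjvm.so`, so the usual remedy is to add libdl to the link line for the affected targets. A hedged sketch of what the CMakeLists change might look like; the target and variable names follow the hadoop-hdfs native build, but the exact placement is an assumption about the attached patch:

```cmake
# Link libdl so the dlopen/dlclose/dlerror/dlsym/dladdr symbols referenced by
# IBM Java's libjvm.so resolve at link time. CMAKE_DL_LIBS expands to "dl" on
# Linux and stays empty on platforms where dlopen lives in libc, so this is
# safe for the other JVMs too.
target_link_libraries(test_native_mini_dfs
    native_mini_dfs
    ${JAVA_JVM_LIBRARY}
    ${CMAKE_DL_LIBS}
    pthread
)
```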
[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6244: -- Attachment: HDFS-6244.v6.patch > Make Trash Interval configurable for each of the namespaces > --- > > Key: HDFS-6244 > URL: https://issues.apache.org/jira/browse/HDFS-6244 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Labels: BB2015-05-TBR > Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, > HDFS-6244.v3.patch, HDFS-6244.v4.patch, HDFS-6244.v5.patch, HDFS-6244.v6.patch > > > Somehow we need to avoid the cluster filling up. > One solution is to have a different trash policy per namespace. However, if > we can simply make the property configurable per namespace, then the same > config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6244: -- Status: Patch Available (was: Open) > Make Trash Interval configurable for each of the namespaces > --- > > Key: HDFS-6244 > URL: https://issues.apache.org/jira/browse/HDFS-6244 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.5-alpha >Reporter: Siqi Li >Assignee: Siqi Li > Labels: BB2015-05-TBR > Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, > HDFS-6244.v3.patch, HDFS-6244.v4.patch, HDFS-6244.v5.patch, HDFS-6244.v6.patch > > > Somehow we need to avoid the cluster filling up. > One solution is to have a different trash policy per namespace. However, if > we can simply make the property configurable per namespace, then the same > config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
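One way a per-namespace trash interval could be expressed in hdfs-site.xml: a cluster-wide default plus a nameservice-suffixed override. The key format below (`.ns1` suffix) is purely hypothetical; the actual property naming is defined by the attached patch:

```xml
<!-- Cluster-wide default: trash retained for 1 day (value is in minutes). -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
<!-- Hypothetical per-namespace override for nameservice "ns1": 3 days. -->
<property>
  <name>fs.trash.interval.ns1</name>
  <value>4320</value>
</property>
```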
[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703732#comment-14703732 ] Hudson commented on HDFS-8803: -- FAILURE: Integrated in Hadoop-trunk-Commit #8323 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8323/]) HDFS-8803. Move DfsClientConf to hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 3aac4758b007a56e3d66998d457b2156effca528) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java * hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol2.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDisableConnCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRead.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyPersistTestCase.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestCachingStrategy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeDeath.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/ByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/FileAppendTest4.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsDefaultValue.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ClientContext.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestDatanodeRestart.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderFactory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend4.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestUnbuffer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestConnCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/TestFiPipelines.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelShortCircuitLegacyRead.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.ja
[jira] [Commented] (HDFS-8867) Enable optimized block reports
[ https://issues.apache.org/jira/browse/HDFS-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703731#comment-14703731 ] Hudson commented on HDFS-8867: -- FAILURE: Integrated in Hadoop-trunk-Commit #8323 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8323/]) HDFS-8867. Enable optimized block reports. Contributed by Daryn Sharp. (jing9: rev f61120d964a609ae5eabeb5c4d6c9afe0a15cad8) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java > Enable optimized block reports > -- > > Key: HDFS-8867 > URL: https://issues.apache.org/jira/browse/HDFS-8867 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Rushabh S Shah >Assignee: Daryn Sharp > Fix For: 2.7.2 > > Attachments: HDFS-8867.patch > > > Opening this ticket on behalf of [~daryn] > HDFS-7435 introduced a more efficiently encoded block report format designed > to improve performance and reduce GC load on the NN and DNs. The NN is not > advertising this capability to the DNs so old-style reports are still being > used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build incremental copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-8828: Summary: Utilize Snapshot diff report to build incremental copy list in distcp (was: Utilize Snapshot diff report to build copy list in distcp) > Utilize Snapshot diff report to build incremental copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported a huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only files/dirs which changed between two snapshots > (or a snapshot and a normal dir). It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp, so it still relies on default distcp to build a complete > list of files under the source dir. This patch puts only created and > modified files into the copy list based on the snapshot diff report, so we can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6481) DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs
[ https://issues.apache.org/jira/browse/HDFS-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-6481: - Description: Ian Brooks reported the following stack trace: {code} 2014-06-03 13:05:03,915 WARN [DataStreamer for file /user/hbase/WALs/,16020,1401716790638/%2C16020%2C1401716790638.1401796562200 block BP-2121456822-10.143.38.149-1396953188241:blk_1074073683_332932] hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0 at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956) at org.apache.hadoop.ipc.Client.call(Client.java:1347) at org.apache.hadoop.ipc.Client.call(Client.java:1300) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy13.getAdditionalDatanode(Unknown Source) at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolTranslatorPB.java:352) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy14.getAdditionalDatanode(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266) at com.sun.proxy.$Proxy15.getAdditionalDatanode(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:919) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475) 2014-06-03 13:05:48,489 ERROR [RpcServer.handler=22,port=16020] wal.FSHLog: syncer encountered error, will retry. 
txid=211 org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0 at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:467) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalDatanode(FSNamesystem.java:2779) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getAdditionalDatanode(NameNodeRpcServer.java:594) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getAdditionalDatanode(ClientNamenodeProtocolServerSideTranslatorPB.java:430) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apach
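The `ArrayIndexOutOfBoundsException: 0` in `getDatanodeStorageInfos` suggests indexing a `storageIDs` array shorter than the datanode array. A hedged sketch of the kind of length check the title asks for; the method and types are illustrative stand-ins, not the actual DatanodeManager code:

```java
public class StorageInfoGuard {
    // Illustrative guard: reject a storage-ID array that is shorter than the
    // datanode-ID array before any element is dereferenced, turning a bare
    // ArrayIndexOutOfBoundsException into a diagnosable error.
    static void checkArrays(Object[] datanodeIDs, String[] storageIDs) {
        if (storageIDs.length < datanodeIDs.length) {
            throw new IllegalArgumentException("Expected " + datanodeIDs.length
                + " storage IDs but got " + storageIDs.length);
        }
    }

    public static void main(String[] args) {
        checkArrays(new Object[] {"dn1"}, new String[] {"storage1"}); // ok
        try {
            checkArrays(new Object[] {"dn1"}, new String[0]); // mismatched
        } catch (IllegalArgumentException e) {
            System.out.println("guarded: " + e.getMessage());
        }
    }
}
```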
[jira] [Commented] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703624#comment-14703624 ] Colin Patrick McCabe commented on HDFS-8888: bq. Each volume could become its own RW lock within the NN. This would improve parallelism within NN without much additional effort. Given the problems we already have with large NN heaps, perhaps we would be better off running multiple Namenode processes than trying to manage multiple independent subtrees in a single process. I am also worried that a lot of the changes here seem incompatible. If we are going to break backwards compatibility, why wouldn't we push people towards something like ozone, which does have a better horizontal scalability story? It seems like we should have a design meeting about this before we do any work in this direction. > Support volumes in HDFS > --- > > Key: HDFS-8888 > URL: https://issues.apache.org/jira/browse/HDFS-8888 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haohui Mai > > There are multiple types of zones (e.g., snapshottable directories, > encryption zones, directories with quotas) which are conceptually close to > namespace volumes in traditional file systems. > This jira proposes to introduce the concept of volume to simplify the > implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6264) Provide FileSystem#create() variant which throws exception if parent directory doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-6264: - Description: FileSystem#createNonRecursive() is deprecated. However, there is no DistributedFileSystem#create() implementation which throws exception if parent directory doesn't exist. This limits clients' migration away from the deprecated method. For HBase, IO fencing relies on the behavior of FileSystem#createNonRecursive(). Variant of create() method should be added which throws exception if parent directory doesn't exist. was: FileSystem#createNonRecursive() is deprecated. However, there is no DistributedFileSystem#create() implementation which throws exception if parent directory doesn't exist. This limits clients' migration away from the deprecated method. For HBase, IO fencing relies on the behavior of FileSystem#createNonRecursive(). Variant of create() method should be added which throws exception if parent directory doesn't exist. > Provide FileSystem#create() variant which throws exception if parent > directory doesn't exist > > > Key: HDFS-6264 > URL: https://issues.apache.org/jira/browse/HDFS-6264 > Project: Hadoop HDFS > Issue Type: Task > Components: namenode >Affects Versions: 2.4.0 >Reporter: Ted Yu > Labels: hbase > Attachments: hdfs-6264-v1.txt > > > FileSystem#createNonRecursive() is deprecated. > However, there is no DistributedFileSystem#create() implementation which > throws exception if parent directory doesn't exist. > This limits clients' migration away from the deprecated method. > For HBase, IO fencing relies on the behavior of > FileSystem#createNonRecursive(). > Variant of create() method should be added which throws exception if parent > directory doesn't exist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6290) File is not closed in OfflineImageViewerPB#run()
[ https://issues.apache.org/jira/browse/HDFS-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HDFS-6290: - Description: {code} } else if (processor.equals("XML")) { new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile, "r")); {code} The RandomAccessFile instance should be closed before the method returns. was: {code} } else if (processor.equals("XML")) { new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile, "r")); {code} The RandomAccessFile instance should be closed before the method returns. > File is not closed in OfflineImageViewerPB#run() > > > Key: HDFS-6290 > URL: https://issues.apache.org/jira/browse/HDFS-6290 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Reporter: Ted Yu >Priority: Minor > > {code} > } else if (processor.equals("XML")) { > new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile, > "r")); > {code} > The RandomAccessFile instance should be closed before the method returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
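The standard fix for this leak is try-with-resources. A minimal runnable sketch of the pattern; in `OfflineImageViewerPB#run()` the try body would be the `PBImageXmlWriter.visit(...)` call rather than the length read used here:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class CloseSketch {
    // try-with-resources guarantees the RandomAccessFile is closed on every
    // exit path, including when the body throws -- which the current
    // OfflineImageViewerPB#run() code does not.
    static long visitAndClose(File imageFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(imageFile, "r")) {
            return raf.length();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("fsimage", ".bin");
        f.deleteOnExit();
        System.out.println(visitAndClose(f)); // prints 0: the temp file is empty
    }
}
```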
[jira] [Commented] (HDFS-8829) DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning
[ https://issues.apache.org/jira/browse/HDFS-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703617#comment-14703617 ] Colin Patrick McCabe commented on HDFS-8829: I agree that we might not want to default to auto-tuning. But we should at least make it available. I think if {{dfs.data.socket.size}} is -1, we should use auto-tuning. > DataNode sets SO_RCVBUF explicitly is disabling tcp auto-tuning > --- > > Key: HDFS-8829 > URL: https://issues.apache.org/jira/browse/HDFS-8829 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.3.0, 2.6.0 >Reporter: He Tianyi >Assignee: kanaka kumar avvaru > > {code:java} > private void initDataXceiver(Configuration conf) throws IOException { > // find free port or use privileged port provided > TcpPeerServer tcpPeerServer; > if (secureResources != null) { > tcpPeerServer = new TcpPeerServer(secureResources); > } else { > tcpPeerServer = new TcpPeerServer(dnConf.socketWriteTimeout, > DataNode.getStreamingAddr(conf)); > } > > tcpPeerServer.setReceiveBufferSize(HdfsConstants.DEFAULT_DATA_SOCKET_SIZE); > {code} > The last line sets SO_RCVBUF explicitly, thus disabling tcp auto-tuning on > some system. > Shall we make this behavior configurable? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
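The -1-means-auto-tune proposal amounts to skipping the explicit `setReceiveBufferSize` call when the configured size is not positive. A hedged, self-contained sketch using `java.net.ServerSocket` directly (the real change would sit in `initDataXceiver` and read `dfs.data.socket.size` from the Configuration):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class RcvBufSketch {
    // If the configured size is <= 0, leave SO_RCVBUF untouched so the
    // kernel's TCP auto-tuning stays in effect; otherwise pin the buffer.
    static void maybeSetReceiveBufferSize(ServerSocket ss, int configured)
            throws IOException {
        if (configured > 0) {
            ss.setReceiveBufferSize(configured);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket ss = new ServerSocket(0)) {
            maybeSetReceiveBufferSize(ss, -1);         // auto-tuning preserved
            maybeSetReceiveBufferSize(ss, 128 * 1024); // explicit 128 KB buffer
        }
    }
}
```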
[jira] [Commented] (HDFS-8862) BlockManager#excessReplicateMap should use a HashMap
[ https://issues.apache.org/jira/browse/HDFS-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703610#comment-14703610 ] Colin Patrick McCabe commented on HDFS-8862: Thanks, [~hitliuyi]. > BlockManager#excessReplicateMap should use a HashMap > > > Key: HDFS-8862 > URL: https://issues.apache.org/jira/browse/HDFS-8862 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: 2.8.0 > > Attachments: HDFS-8862.001.patch > > > Per [~cmccabe]'s comments in HDFS-8792, this JIRA is to discuss improving > {{BlockManager#excessReplicateMap}}. > That's right, a HashMap doesn't ever shrink when elements are removed, but a > TreeMap entry needs to store more references (left, right, parent) than a > HashMap entry (only one next reference); even when removals leave some entries > empty, an empty HashMap slot is just a {{null}} reference (4 bytes), so the two > are close on this point. On the other > hand, the key of {{excessReplicateMap}} is the datanode UUID, so the number of > entries is almost fixed, and HashMap memory use is better than TreeMap's in this > case. I think the most important factor is search/insert/remove performance, > where HashMap is clearly better than TreeMap. Because we don't need sorting, > we should use HashMap instead of TreeMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
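The swap argued for above is essentially a one-line change of map implementation. A toy illustration (types are stand-ins, not the actual BlockManager field, which maps datanode UUIDs to sets of excess blocks):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ExcessMapSketch {
    // Keyed by datanode UUID: no ordering is ever needed, so HashMap gives
    // O(1) lookup vs TreeMap's O(log n), with lighter entries (one 'next'
    // reference instead of left/right/parent).
    static final Map<String, Set<Long>> excessReplicateMap = new HashMap<>();

    public static void main(String[] args) {
        excessReplicateMap
            .computeIfAbsent("dn-uuid-1", k -> new HashSet<>())
            .add(1074073683L);
        System.out.println(excessReplicateMap.get("dn-uuid-1").size()); // prints 1
    }
}
```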
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703591#comment-14703591 ] Yongjun Zhang commented on HDFS-8828: - Hi [~jingzhao], Thanks a lot for your review and comments. I discussed with [~yufeigu] and he worked out the new revs to address your comments. Hi [~yufeigu], thanks for the new rev, some nits: * put the following code into its own method, like createInputFileListingWithDiff {code} Path fileListingPath = getFileListingPath(); CopyListing copyListing = new SimpleCopyListing(job.getConfiguration(), job.getCredentials(), distCpSync); copyListing.buildListing(fileListingPath, inputOptions); {code} so this can be in parallel with the existing method {{createInputFileListing(Job job)}} * you accidentally changed {{* http://www.apache.org/licenses/LICENSE-2.0}}, please revert this change * In comments, "//xyz" should be "// xyz", notice the space between "//" and the text Please consider addressing them together with whatever comments Jing might have. Hi [~jingzhao], it looks good to me once the above nits are addressed. Would you mind taking another look, so Yufei can address everything together if you have any more comments? Thanks. > Utilize Snapshot diff report to build copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported huge time cost to build the file copy list in distcp (30 > hours for 1.6M files). We can leverage the snapshot diff report to build a file > copy list including only files/dirs which changed between two snapshots > (or a snapshot and a normal dir). 
It speeds up the process in two ways: 1. > less copy-list building time. 2. fewer file-copy MR jobs. > The HDFS snapshot diff report provides information about file/directory creation, > deletion, rename and modification between two snapshots, or between a snapshot and a > normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to > the default distcp, so it still relies on the default distcp to build a complete > list of files under the source dir. This patch puts only created and > modified files into the copy list based on the snapshot diff report, minimizing > the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
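The selection logic described in the issue can be sketched as below. The {{DiffEntry}}/{{DiffType}} names are hypothetical stand-ins, not the actual SnapshotDiffReport API: the point is simply that deletes and renames are assumed to have been synchronized on the target already (per HDFS-7535), so only created and modified paths enter the distcp copy list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DiffCopyList {
    enum DiffType { CREATE, MODIFY, DELETE, RENAME }

    static class DiffEntry {
        final DiffType type;
        final String path;
        DiffEntry(DiffType type, String path) { this.type = type; this.path = path; }
    }

    // Keep only creations and modifications; deletes and renames have already
    // been applied to the target, so they need no data copy.
    static List<String> buildCopyList(List<DiffEntry> report) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : report) {
            if (e.type == DiffType.CREATE || e.type == DiffType.MODIFY) {
                copyList.add(e.path);
            }
        }
        return copyList;
    }

    public static void main(String[] args) {
        List<DiffEntry> report = Arrays.asList(
            new DiffEntry(DiffType.CREATE, "/src/new.txt"),
            new DiffEntry(DiffType.DELETE, "/src/old.txt"),
            new DiffEntry(DiffType.MODIFY, "/src/changed.txt"));
        System.out.println(buildCopyList(report));
    }
}
```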
[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703575#comment-14703575 ] Hudson commented on HDFS-8803: -- FAILURE: Integrated in Hadoop-trunk-Commit #8322 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8322/]) HDFS-8803. Move DfsClientConf to hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 3aac4758b007a56e3d66998d457b2156effca528) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelShortCircuitLegacyRead.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/package-info.java * hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/ByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelShortCircuitReadUnCached.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java * hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/TestFiDataTransferProtocol2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend4.java * hadoop-hdfs-project/hadoop-hdfs/src/test/aop/org/apache/hadoop/hdfs/TestFiPipelines.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestCachingStrategy.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/shortcircuit/TestShortCircuitLocalRead.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/TestEnhancedByteBufferAccess.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRead.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockTokenWithDFS.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/package-info.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsDefaultValue.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/ByteArrayManager.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRemoteBlockReader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDisableConnCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderFactory.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPread.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend2.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apac
[jira] [Commented] (HDFS-8867) Enable optimized block reports
[ https://issues.apache.org/jira/browse/HDFS-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703574#comment-14703574 ] Hudson commented on HDFS-8867: -- FAILURE: Integrated in Hadoop-trunk-Commit #8322 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8322/]) HDFS-8867. Enable optimized block reports. Contributed by Daryn Sharp. (jing9: rev f61120d964a609ae5eabeb5c4d6c9afe0a15cad8) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamespaceInfo.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java > Enable optimized block reports > -- > > Key: HDFS-8867 > URL: https://issues.apache.org/jira/browse/HDFS-8867 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Rushabh S Shah >Assignee: Daryn Sharp > Fix For: 2.7.2 > > Attachments: HDFS-8867.patch > > > Opening this ticket on behalf of [~daryn] > HDFS-7435 introduced a more efficiently encoded block report format designed > to improve performance and reduce GC load on the NN and DNs. The NN is not > advertising this capability to the DNs so old-style reports are still being > used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
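Conceptually, the fix makes the NN advertise the new report capability as a bit in a bitmask that DNs check before using the compact encoding. The sketch below is illustrative only — the constant name mirrors the capability added around HDFS-7435/HDFS-8867, but the bit layout and helper are assumptions, not the real NamespaceInfo code.

```java
public class CapabilityDemo {
    // Hypothetical capability bit, mirroring how NamespaceInfo advertises
    // optional features to datanodes as a bitmask.
    static final long STORAGE_BLOCK_REPORT_BUFFERS = 1L << 0;

    static boolean supports(long advertised, long capability) {
        return (advertised & capability) != 0;
    }

    public static void main(String[] args) {
        long nnCapabilities = STORAGE_BLOCK_REPORT_BUFFERS; // NN now sets the bit
        // The DN sends the new, compact report format only when the NN
        // advertises it; an NN that advertises nothing gets old-style reports.
        System.out.println(supports(nnCapabilities, STORAGE_BLOCK_REPORT_BUFFERS));
        System.out.println(supports(0L, STORAGE_BLOCK_REPORT_BUFFERS));
    }
}
```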
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703563#comment-14703563 ] Hadoop QA commented on HDFS-8828: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | javac | 0m 6s | The patch appears to cause the build to fail. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751292/HDFS-8828.010.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f61120d | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12044/console | This message was automatically generated. > Utilize Snapshot diff report to build copy list in distcp > - > > Key: HDFS-8828 > URL: https://issues.apache.org/jira/browse/HDFS-8828 > Project: Hadoop HDFS > Issue Type: Improvement > Components: distcp, snapshots >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, > HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, > HDFS-8828.006.patch, HDFS-8828.007.patch, HDFS-8828.008.patch, > HDFS-8828.009.patch, HDFS-8828.010.patch > > > Some users reported huge time cost to build file copy list in distcp. (30 > hours for 1.6M files). We can leverage snapshot diff report to build file > copy list including files/dirs which are changes only between two snapshots > (or a snapshot and a normal dir). It speed up the process in two folds: 1. > less copy list building time. 2. less file copy MR jobs. > HDFS snapshot diff report provide information about file/directory creation, > deletion, rename and modification between two snapshots or a snapshot and a > normal directory. HDFS-7535 synchronize deletion and rename, then fallback to > the default distcp. So it still relies on default distcp to building complete > list of files under the source dir. 
This patch only puts creation and > modification files into the copy list based on snapshot diff report. We can > minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8809) HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing blocks) when HBase is running
[ https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8809: Attachment: HDFS-8809.000.patch There is another issue here, which I think exists before HDFS-8215. When "-OPENFORWRITE" is enabled, an UC block is still treated as missing/corrupted, since {{countNodes}} only checks the triplets inside of the BlockInfo and thus the liveReplicas is usually 0 for a UC block. The fix can be to ignore the check if the block is the last one and it's UC. Theoretically the penultimate block can also be in the committed state and with 0 reported replica yet, but maybe we do not need to handle this part here. > HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing > blocks) when HBase is running > --- > > Key: HDFS-8809 > URL: https://issues.apache.org/jira/browse/HDFS-8809 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.7.0 > Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other > Linuxes not tested, probably not platform-dependent). This did NOT happen > with Hadoop 2.4 and HBase 0.98. >Reporter: Sudhir Prakash >Assignee: Jing Zhao > Attachments: HDFS-8809.000.patch > > > Whenever HBase is running, the "hdfs fsck /" reports four hbase-related > files in the path "hbase/data/WALs/" as CORRUPT. Even after letting the > cluster sit idle for a couple hours, it is still in the corrupt state. If > HBase is shut down, the problem goes away. If HBase is then restarted, the > problem recurs. This was observed with Hadoop 2.7.1 and HBase 1.1.1, and did > NOT happen with Hadoop 2.4 and HBase 0.98. > {code} > hades1:/var/opt/teradata/packages # su hdfs > hdfs@hades1:/var/opt/teradata/packages> hdfs fsck / > Connecting to namenode via > http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F > FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 > 20:40:17 GMT 2015 > ... 
> /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301: > MISSING 1 blocks of total size 83 > B.. > > > Status: > CORRUPT > Total size:723977553 B (Total open files size: 332 B) > Total dirs:79 > Total files: 388 > Total symlinks:0 (Files currently being written: 5) > Total blocks (validated): 387 (avg. block size 1870743 B) (Total open > file blocks (not validated): 4) > > UNDER MIN REPL'D BLOCKS: 4 (1.0335917 %) > dfs.namenode.replication.min: 1 > CORRUPT FILES:4 > MISSING BLOCKS: 4 > MISSING SIZE: 332 B > > Minimally replicated blocks: 387 (100.0 %) > Over-replicated blocks:0 (0.0 %) > Under-replicated blocks: 0 (0.0 %) > Mis-replicated blocks: 0 (0.0 %) > Default replication factor:3 > Average block replication: 3.0 > Corrupt blocks:0 > Missing replicas: 0 (0.0 %) > Number of data-nodes: 3 > Number of racks: 1 > FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds > The filesystem under path '/' is CORRUPT > hdfs@hades1:/var/opt/teradata/packages> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
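Jing's proposed fix can be sketched roughly as below. {{BlockInfo}} and the method names here are simplified stand-ins for the real NamenodeFsck/BlockManager code: since {{countNodes}} reports no live replicas for a block still being written, fsck should skip the missing-block check when the block is the last one and under construction.

```java
public class FsckUcCheck {
    static class BlockInfo {
        final boolean underConstruction;
        final int liveReplicas;
        BlockInfo(boolean uc, int live) {
            this.underConstruction = uc;
            this.liveReplicas = live;
        }
    }

    // countNodes() sees no replicas for a block that is still being written,
    // so fsck would flag it as missing; skip the check for the last,
    // under-construction block of an open file.
    static boolean isMissing(BlockInfo block, boolean isLastBlock) {
        if (isLastBlock && block.underConstruction) {
            return false; // still being written: not a corruption
        }
        return block.liveReplicas == 0;
    }

    public static void main(String[] args) {
        System.out.println(isMissing(new BlockInfo(true, 0), true));   // open WAL tail
        System.out.println(isMissing(new BlockInfo(false, 0), false)); // truly missing
    }
}
```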
[jira] [Updated] (HDFS-8809) HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing blocks) when HBase is running
[ https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8809: Status: Patch Available (was: Open) > HDFS fsck reports HBase WALs files (under construction) as "CORRUPT" (missing > blocks) when HBase is running > --- > > Key: HDFS-8809 > URL: https://issues.apache.org/jira/browse/HDFS-8809 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.7.0 > Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other > Linuxes not tested, probably not platform-dependent). This did NOT happen > with Hadoop 2.4 and HBase 0.98. >Reporter: Sudhir Prakash >Assignee: Jing Zhao > Attachments: HDFS-8809.000.patch > > > Whenever HBase is running, the "hdfs fsck /" reports four hbase-related > files in the path "hbase/data/WALs/" as CORRUPT. Even after letting the > cluster sit idle for a couple hours, it is still in the corrupt state. If > HBase is shut down, the problem goes away. If HBase is then restarted, the > problem recurs. This was observed with Hadoop 2.7.1 and HBase 1.1.1, and did > NOT happen with Hadoop 2.4 and HBase 0.98. > {code} > hades1:/var/opt/teradata/packages # su hdfs > hdfs@hades1:/var/opt/teradata/packages> hdfs fsck / > Connecting to namenode via > http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F > FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 > 20:40:17 GMT 2015 > ... > /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta: > MISSING 1 blocks of total size 83 B. > /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500: > MISSING 1 blocks of total size 83 B. 
> /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301: > MISSING 1 blocks of total size 83 > B.. > > > Status: > CORRUPT > Total size:723977553 B (Total open files size: 332 B) > Total dirs:79 > Total files: 388 > Total symlinks:0 (Files currently being written: 5) > Total blocks (validated): 387 (avg. block size 1870743 B) (Total open > file blocks (not validated): 4) > > UNDER MIN REPL'D BLOCKS: 4 (1.0335917 %) > dfs.namenode.replication.min: 1 > CORRUPT FILES:4 > MISSING BLOCKS: 4 > MISSING SIZE: 332 B > > Minimally replicated blocks: 387 (100.0 %) > Over-replicated blocks:0 (0.0 %) > Under-replicated blocks: 0 (0.0 %) > Mis-replicated blocks: 0 (0.0 %) > Default replication factor:3 > Average block replication: 3.0 > Corrupt blocks:0 > Missing replicas: 0 (0.0 %) > Number of data-nodes: 3 > Number of racks: 1 > FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds > The filesystem under path '/' is CORRUPT > hdfs@hades1:/var/opt/teradata/packages> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8803: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the contribution. > Move DfsClientConf to hdfs-client > - > > Key: HDFS-8803 > URL: https://issues.apache.org/jira/browse/HDFS-8803 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, > HDFS-8803.002.patch, HDFS-8803.003.patch > > > This jira tracks the effort of moving the {{DfsClientConf}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8803: - Assignee: Mingliang Liu (was: Haohui Mai) > Move DfsClientConf to hdfs-client > - > > Key: HDFS-8803 > URL: https://issues.apache.org/jira/browse/HDFS-8803 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Mingliang Liu > Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, > HDFS-8803.002.patch, HDFS-8803.003.patch > > > This jira tracks the effort of moving the {{DfsClientConf}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on
[ https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HDFS-8890: --- Attachment: HDFS-8890-trunk-v2.patch V2 attached. > Allow admin to specify which blockpools the balancer should run on > -- > > Key: HDFS-8890 > URL: https://issues.apache.org/jira/browse/HDFS-8890 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo > Attachments: HDFS-8890-trunk-v1.patch, HDFS-8890-trunk-v2.patch > > > Currently the balancer runs on all blockpools. Allow an admin to run the > balancer on a set of blockpools. This will enable the balancer to skip > blockpools that should not be balanced. For example, a tmp blockpool that has > a large amount of churn. > An example of the command line interface would be an additional flag that > specifies the blockpools by id: > -blockpools > BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
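A rough sketch of how the proposed {{-blockpools}} flag could be parsed and applied — the method and class names are hypothetical, not the patch's actual code. An empty set means "balance every blockpool", which matches the current behavior when the flag is omitted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class BlockpoolFilter {
    // Parse the comma-separated -blockpools argument into a set of ids.
    static Set<String> parseBlockpools(String arg) {
        Set<String> pools = new HashSet<>();
        if (arg != null && !arg.isEmpty()) {
            pools.addAll(Arrays.asList(arg.split(",")));
        }
        return pools;
    }

    // Skip blockpools not in the admin's list; an empty list balances all.
    static boolean shouldBalance(Set<String> pools, String blockpoolId) {
        return pools.isEmpty() || pools.contains(blockpoolId);
    }

    public static void main(String[] args) {
        Set<String> pools = parseBlockpools("BP-1,BP-2");
        System.out.println(shouldBalance(pools, "BP-1"));
        System.out.println(shouldBalance(pools, "BP-3"));
    }
}
```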
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703505#comment-14703505 ] Bob Hansen commented on HDFS-8855: -- Agreed that the RPC cache not working is a bug that should be fixed independently. It can be argued that caching the whole client object is an additional optimization that has some value here. But yes, we should track down why the RPC cache is failing us. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs > Environment: HDP 2.2 >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8923) Add -source flag to balancer usage message
[ https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HDFS-8923: --- Status: Patch Available (was: In Progress) > Add -source flag to balancer usage message > -- > > Key: HDFS-8923 > URL: https://issues.apache.org/jira/browse/HDFS-8923 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Trivial > Attachments: HDFS-8923-trunk-v1.patch > > > HDFS-8826 added a -source flag to the balancer, but the usage message still > needs to be updated. See current usage message in trunk: > {code} >private static final String USAGE = "Usage: hdfs balancer" >+ "\n\t[-policy <policy>]\tthe balancing policy: " >+ BalancingPolicy.Node.INSTANCE.getName() + " or " >+ BalancingPolicy.Pool.INSTANCE.getName() >+ "\n\t[-threshold <threshold>]\tPercentage of disk capacity" >+ "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tExcludes the specified datanodes." >+ "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tIncludes only the specified datanodes." >+ "\n\t[-idleiterations <idleiterations>]" >+ "\tNumber of consecutive idle iterations (-1 for Infinite) before " >+ "exit." >+ "\n\t[-runDuringUpgrade]" >+ "\tWhether to run the balancer during an ongoing HDFS upgrade." >+ "This is usually not desired since it will not affect used space " >+ "on over-utilized machines."; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703480#comment-14703480 ] Haohui Mai commented on HDFS-8855: -- This basically shows that the RPC connection cache is not working, but again this is the wrong place to fix. We should dig into why the RPC connection cache is not working in this case instead of putting a band-aid on WebHDFS. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs > Environment: HDP 2.2 >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fail. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8923) Add -source flag to balancer usage message
[ https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HDFS-8923: --- Attachment: HDFS-8923-trunk-v1.patch [~szetszwo] [~arpitagarwal] V1 patch attached. > Add -source flag to balancer usage message > -- > > Key: HDFS-8923 > URL: https://issues.apache.org/jira/browse/HDFS-8923 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Trivial > Attachments: HDFS-8923-trunk-v1.patch > > > HDFS-8826 added a -source flag to the balancer, but the usage message still > needs to be updated. See current usage message in trunk: > {code} >private static final String USAGE = "Usage: hdfs balancer" >+ "\n\t[-policy <policy>]\tthe balancing policy: " >+ BalancingPolicy.Node.INSTANCE.getName() + " or " >+ BalancingPolicy.Pool.INSTANCE.getName() >+ "\n\t[-threshold <threshold>]\tPercentage of disk capacity" >+ "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tExcludes the specified datanodes." >+ "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tIncludes only the specified datanodes." >+ "\n\t[-idleiterations <idleiterations>]" >+ "\tNumber of consecutive idle iterations (-1 for Infinite) before " >+ "exit." >+ "\n\t[-runDuringUpgrade]" >+ "\tWhether to run the balancer during an ongoing HDFS upgrade." >+ "This is usually not desired since it will not affect used space " >+ "on over-utilized machines."; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703461#comment-14703461 ] Chris Trezzo commented on HDFS-8826: Woops. Meant HDFS-8923. > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8826_20150811.patch, h8826_20150816.patch, > h8826_20150818.patch > > > Balancer is inefficient in the following case: > || Datanode || Utilization || Rack || > | D1 | 95% | A | > | D2 | 30% | B | > | D3, D4, D5 | 0% | B | > The average utilization is 25% so that D2 is within 10% threshold. However, > Balancer currently will first move blocks from D2 to D3, D4 and D5 since they > are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703456#comment-14703456 ] Chris Trezzo commented on HDFS-8826: HDFS-8826 filled. Posting patch there. > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8826_20150811.patch, h8826_20150816.patch, > h8826_20150818.patch > > > Balancer is inefficient in the following case: > || Datanode || Utilization || Rack || > | D1 | 95% | A | > | D2 | 30% | B | > | D3, D4, D5 | 0% | B | > The average utilization is 25% so that D2 is within 10% threshold. However, > Balancer currently will first move blocks from D2 to D3, D4 and D5 since they > are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-8923) Add -source flag to balancer usage message
[ https://issues.apache.org/jira/browse/HDFS-8923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8923 started by Chris Trezzo. -- > Add -source flag to balancer usage message > -- > > Key: HDFS-8923 > URL: https://issues.apache.org/jira/browse/HDFS-8923 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Chris Trezzo >Assignee: Chris Trezzo >Priority: Trivial > > HDFS-8826 added a -source flag to the balancer, but the usage message still > needs to be updated. See current usage message in trunk: > {code} >private static final String USAGE = "Usage: hdfs balancer" >+ "\n\t[-policy <policy>]\tthe balancing policy: " >+ BalancingPolicy.Node.INSTANCE.getName() + " or " >+ BalancingPolicy.Pool.INSTANCE.getName() >+ "\n\t[-threshold <threshold>]\tPercentage of disk capacity" >+ "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tExcludes the specified datanodes." >+ "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]" >+ "\tIncludes only the specified datanodes." >+ "\n\t[-idleiterations <idleiterations>]" >+ "\tNumber of consecutive idle iterations (-1 for Infinite) before " >+ "exit." >+ "\n\t[-runDuringUpgrade]" >+ "\tWhether to run the balancer during an ongoing HDFS upgrade." >+ "This is usually not desired since it will not affect used space " >+ "on over-utilized machines."; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8923) Add -source flag to balancer usage message
Chris Trezzo created HDFS-8923: -- Summary: Add -source flag to balancer usage message Key: HDFS-8923 URL: https://issues.apache.org/jira/browse/HDFS-8923 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Chris Trezzo Assignee: Chris Trezzo Priority: Trivial HDFS-8826 added a -source flag to the balancer, but the usage message still needs to be updated. See current usage message in trunk: {code} private static final String USAGE = "Usage: hdfs balancer" + "\n\t[-policy <policy>]\tthe balancing policy: " + BalancingPolicy.Node.INSTANCE.getName() + " or " + BalancingPolicy.Pool.INSTANCE.getName() + "\n\t[-threshold <threshold>]\tPercentage of disk capacity" + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]" + "\tExcludes the specified datanodes." + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]" + "\tIncludes only the specified datanodes." + "\n\t[-idleiterations <idleiterations>]" + "\tNumber of consecutive idle iterations (-1 for Infinite) before " + "exit." + "\n\t[-runDuringUpgrade]" + "\tWhether to run the balancer during an ongoing HDFS upgrade." + "This is usually not desired since it will not affect used space " + "on over-utilized machines."; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8803) Move DfsClientConf to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703438#comment-14703438 ] Jing Zhao commented on HDFS-8803: - The test failure should be unrelated. +1. > Move DfsClientConf to hdfs-client > - > > Key: HDFS-8803 > URL: https://issues.apache.org/jira/browse/HDFS-8803 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: build >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8803.000.patch, HDFS-8803.001.patch, > HDFS-8803.002.patch, HDFS-8803.003.patch > > > This jira tracks the effort of moving the {{DfsClientConf}} class into the > hdfs-client module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8867) Enable optimized block reports
[ https://issues.apache.org/jira/browse/HDFS-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8867: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.7.2 Status: Resolved (was: Patch Available) Thanks for the fix, Daryn. I've committed this to trunk, branch-2 and branch-2.7. > Enable optimized block reports > -- > > Key: HDFS-8867 > URL: https://issues.apache.org/jira/browse/HDFS-8867 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Rushabh S Shah >Assignee: Daryn Sharp > Fix For: 2.7.2 > > Attachments: HDFS-8867.patch > > > Opening this ticket on behalf of [~daryn] > HDFS-7435 introduced a more efficiently encoded block report format designed > to improve performance and reduce GC load on the NN and DNs. The NN is not > advertising this capability to the DNs so old-style reports are still being > used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703369#comment-14703369 ] Hudson commented on HDFS-8826: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/]) HDFS-8826. In Balancer, add an option to specify the source node list so that balancer only selects blocks to move from those nodes. (szetszwo: rev 7ecbfd44aa57f5f54c214b7fdedda2500be76f51)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/StringUtils.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
> Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8826_20150811.patch, h8826_20150816.patch, > h8826_20150818.patch > > > Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25% so that D2 is within 10% threshold. However, Balancer currently will first move blocks from D2 to D3, D4 and D5 since they are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
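The arithmetic behind the example in that issue description is worth spelling out. A short self-contained sketch — assuming equal-capacity datanodes and the default 10% threshold, neither of which the JIRA states explicitly — that classifies each node the way the balancer's threshold band does:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BalancerExample {
    // Classify a node relative to the cluster average: more than
    // `threshold` percentage points away from the average is out of band.
    static String classify(double util, double avg, double threshold) {
        if (util > avg + threshold) return "over-utilized";
        if (util < avg - threshold) return "under-utilized";
        return "within threshold";
    }

    public static void main(String[] args) {
        // Utilizations from the HDFS-8826 example (percent of capacity used).
        Map<String, Double> util = new LinkedHashMap<>();
        util.put("D1", 95.0);
        util.put("D2", 30.0);
        util.put("D3", 0.0);
        util.put("D4", 0.0);
        util.put("D5", 0.0);

        double avg = util.values().stream()
            .mapToDouble(Double::doubleValue).average().getAsDouble(); // 25.0
        double threshold = 10.0; // the balancer's default threshold

        System.out.printf("average utilization = %.1f%%%n", avg);
        util.forEach((name, u) ->
            System.out.println(name + ": " + classify(u, avg, threshold)));
        // Only D1 is over-utilized; D2 sits inside the [15%, 35%] band, so a
        // good plan moves blocks off D1 first -- yet the pre-HDFS-8826
        // balancer preferred the intra-rack moves from D2.
    }
}
```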
[jira] [Commented] (HDFS-8435) Support CreateFlag in WebHdfs
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703371#comment-14703371 ] Hudson commented on HDFS-8435: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/]) HDFS-8435. Support CreateFlag in WebHDFS. Contributed by Jakob Homan (cdouglas: rev 30e342a5d32be5efffeb472cce76d4ed43642608) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/WebHDFS.md * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/CreateFlagParam.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/resources/CreateParentParam.java > Support CreateFlag in WebHdfs > - > > Key: HDFS-8435 > URL: https://issues.apache.org/jira/browse/HDFS-8435 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Vinoth Sathappan >Assignee: Jakob Homan > Fix For: 2.8.0 > > Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, > HDFS-8435.002.patch, HDFS-8435.003.patch, HDFS-8435.004.patch, > HDFS-8435.005.patch > > > The WebHdfsFileSystem 
implementation doesn't support createNonRecursive. > HBase extensively depends on that for proper functioning. Currently, when the > region servers are started over web hdfs, they crash with:
> createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
> at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
> at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
> at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
> at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support
[ https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703370#comment-14703370 ] Hudson commented on HDFS-8852: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/]) HDFS-8852. HDFS architecture documentation of version 2.x is outdated about append write support. Contributed by Ajith S. (aajisaka: rev fc509f66d814e7a5ed81d5d73b23c400625d573b) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > HDFS architecture documentation of version 2.x is outdated about append write > support > - > > Key: HDFS-8852 > URL: https://issues.apache.org/jira/browse/HDFS-8852 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Reporter: Hong Dai Thanh >Assignee: Ajith S > Labels: newbie > Fix For: 2.7.2 > > Attachments: HDFS-8852.2.patch, HDFS-8852.patch > > > In the [latest version of the > documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model], > and also documentation for all releases with version 2, it’s mentioned that > “A file once created, written, and closed need not be changed. “ and “There > is a plan to support appending-writes to files in the future.” > > However, as far as I know, HDFS has supported append write since 0.21, based > on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old > version of the documentation in > 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs] > Various posts on the Internet also suggests that append write has been > available in HDFS, and will always be available in Hadoop version 2 branch. > > Can we update the documentation to reflect the current status? 
> (Please also review whether the documentation should also be updated for > version 0.21 and above, and the version 1.x branch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703372#comment-14703372 ] Hudson commented on HDFS-8908: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #289 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/289/]) HDFS-8908. TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode. (Tsz Wo Nicholas Sze via yliu) (yliu: rev 2da5aaab334d0d6a7dee244cac603aa35c9b0134) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAppendSnapshotTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad > datanode > -- > > Key: HDFS-8908 > URL: https://issues.apache.org/jira/browse/HDFS-8908 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > Fix For: 2.8.0 > > Attachments: h8908_20150817.patch > > > See > https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8911) NameNode Metric : Add Editlog counters as a JMX metric
[ https://issues.apache.org/jira/browse/HDFS-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703363#comment-14703363 ] Hudson commented on HDFS-8911: -- FAILURE: Integrated in Hadoop-trunk-Commit #8321 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8321/]) HDFS-8911. NameNode Metric : Add Editlog counters as a JMX metric. (Contributed by Anu Engineer) (arp: rev 9c3571ea607f0953487464844ed0d46fdb3e9f90) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/FSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystemMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-common-project/hadoop-common/src/site/markdown/Metrics.md > NameNode Metric : Add Editlog counters as a JMX metric > -- > > Key: HDFS-8911 > URL: https://issues.apache.org/jira/browse/HDFS-8911 > Project: Hadoop HDFS > Issue Type: Improvement > Components: HDFS >Affects Versions: 2.7.1 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: 2.8.0 > > Attachments: HDFS-8911.001.patch > > > Today we log editlog metrics in the log. This JIRA proposes to expose those > metrics via JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8908) TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode
[ https://issues.apache.org/jira/browse/HDFS-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703362#comment-14703362 ] Hudson commented on HDFS-8908: -- FAILURE: Integrated in Hadoop-trunk-Commit #8321 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8321/]) HDFS-8908. TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad datanode. (Tsz Wo Nicholas Sze via yliu) (yliu: rev 2da5aaab334d0d6a7dee244cac603aa35c9b0134) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAppendSnapshotTruncate.java > TestAppendSnapshotTruncate may fail with IOException: Failed to replace a bad > datanode > -- > > Key: HDFS-8908 > URL: https://issues.apache.org/jira/browse/HDFS-8908 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > Fix For: 2.8.0 > > Attachments: h8908_20150817.patch > > > See > https://builds.apache.org/job/PreCommit-HDFS-Build/12005/testReport/org.apache.hadoop.hdfs/TestAppendSnapshotTruncate/testAST/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8826) Balancer may not move blocks efficiently in some cases
[ https://issues.apache.org/jira/browse/HDFS-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703350#comment-14703350 ] Chris Trezzo commented on HDFS-8826: [~arpitagarwal] [~szetszwo] The applied patch is currently missing the -source flag in the usage message in the balancer. See current usage message in trunk:
{code}
private static final String USAGE = "Usage: hdfs balancer"
    + "\n\t[-policy <policy>]\tthe balancing policy: "
    + BalancingPolicy.Node.INSTANCE.getName() + " or "
    + BalancingPolicy.Pool.INSTANCE.getName()
    + "\n\t[-threshold <threshold>]\tPercentage of disk capacity"
    + "\n\t[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]"
    + "\tExcludes the specified datanodes."
    + "\n\t[-include [-f <hosts-file> | <comma-separated list of hosts>]]"
    + "\tIncludes only the specified datanodes."
    + "\n\t[-idleiterations <idleiterations>]"
    + "\tNumber of consecutive idle iterations (-1 for Infinite) before "
    + "exit."
    + "\n\t[-runDuringUpgrade]"
    + "\tWhether to run the balancer during an ongoing HDFS upgrade."
    + "This is usually not desired since it will not affect used space "
    + "on over-utilized machines.";
{code}
Should I file a jira or do you guys just want to post an amendment patch? Thanks! > Balancer may not move blocks efficiently in some cases > -- > > Key: HDFS-8826 > URL: https://issues.apache.org/jira/browse/HDFS-8826 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: balancer & mover >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Fix For: 2.8.0 > > Attachments: h8826_20150811.patch, h8826_20150816.patch, > h8826_20150818.patch > > > Balancer is inefficient in the following case:
> || Datanode || Utilization || Rack ||
> | D1 | 95% | A |
> | D2 | 30% | B |
> | D3, D4, D5 | 0% | B |
> The average utilization is 25% so that D2 is within 10% threshold. However, Balancer currently will first move blocks from D2 to D3, D4 and D5 since they are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)