[jira] [Commented] (HDFS-10549) Memory leak if exception happens when closing DFSOutputStream or DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338922#comment-15338922 ] Hadoop QA commented on HDFS-10549: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 1 new + 32 unchanged - 0 fixed = 33 total (was 32) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s {color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 19m 26s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:e2f6409 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12811742/HDFS-10549.001.patch | | JIRA Issue | HDFS-10549 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 21bf0eff356c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d0162f2 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15829/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/15829/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/15829/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Memory leak if exception happens when closing DFSOutputStream or > DFSStripedOutputStream > --- > > Key: HDFS-10549 > URL:
[jira] [Updated] (HDFS-10549) Memory leak if exception happens when closing DFSOutputStream or DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-10549: - Description: As HADOOP-13264 mentioned, the code dfsClient.endFileLease(fileId) in {{DFSOutputStream}} will not be executed when an IOException happens in {{closeImpl()}}.
{code}
public void close() throws IOException {
  synchronized (this) {
    try (TraceScope ignored =
        dfsClient.newPathTraceScope("DFSOutputStream#close", src)) {
      closeImpl();
    }
  }
  dfsClient.endFileLease(fileId);
}
{code}
This causes the files to never be closed in {{DFSClient}} and eventually leads to a memory leak. The same problem exists in {{DFSStripedOutputStream}}. was: As HADOOP-13264 mentioned, the code dfsClient.endFileLease(fileId) in {{ DFSOutputStream}} will not be executed when an IOException happens in {{closeImpl()}}.
{code}
public void close() throws IOException {
  synchronized (this) {
    try (TraceScope ignored =
        dfsClient.newPathTraceScope("DFSOutputStream#close", src)) {
      closeImpl();
    }
  }
  dfsClient.endFileLease(fileId);
}
{code}
This causes the files to never be closed in {{DFSClient}} and eventually leads to a memory leak. The same problem exists in {{DFSStripedOutputStream}}.
> Memory leak if exception happens when closing DFSOutputStream or
> DFSStripedOutputStream
> ---
>
> Key: HDFS-10549
> URL: https://issues.apache.org/jira/browse/HDFS-10549
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.7.1
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-10549.001.patch
>
>
> As HADOOP-13264 mentioned, the code dfsClient.endFileLease(fileId) in
> {{DFSOutputStream}} will not be executed when an IOException happens in
> {{closeImpl()}}.
> {code}
> public void close() throws IOException {
>   synchronized (this) {
>     try (TraceScope ignored =
>         dfsClient.newPathTraceScope("DFSOutputStream#close", src)) {
>       closeImpl();
>     }
>   }
>   dfsClient.endFileLease(fileId);
> }
> {code}
> This causes the files to never be closed in {{DFSClient}} and eventually
> leads to a memory leak. The same problem exists in {{DFSStripedOutputStream}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10549) Memory leak if exception happens when closing DFSOutputStream or DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-10549: - Attachment: HDFS-10549.001.patch
> Memory leak if exception happens when closing DFSOutputStream or
> DFSStripedOutputStream
> ---
>
> Key: HDFS-10549
> URL: https://issues.apache.org/jira/browse/HDFS-10549
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.7.1
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-10549.001.patch
>
>
> As HADOOP-13264 mentioned, the code dfsClient.endFileLease(fileId) in
> {{DFSOutputStream}} will not be executed when an IOException happens in
> {{closeImpl()}}.
> {code}
> public void close() throws IOException {
>   synchronized (this) {
>     try (TraceScope ignored =
>         dfsClient.newPathTraceScope("DFSOutputStream#close", src)) {
>       closeImpl();
>     }
>   }
>   dfsClient.endFileLease(fileId);
> }
> {code}
> This causes the files to never be closed in {{DFSClient}} and eventually
> leads to a memory leak. The same problem exists in {{DFSStripedOutputStream}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10549) Memory leak if exception happens when closing DFSOutputStream or DFSStripedOutputStream
[ https://issues.apache.org/jira/browse/HDFS-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yiqun Lin updated HDFS-10549: - Status: Patch Available (was: Open) Attached an initial patch.
> Memory leak if exception happens when closing DFSOutputStream or
> DFSStripedOutputStream
> ---
>
> Key: HDFS-10549
> URL: https://issues.apache.org/jira/browse/HDFS-10549
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.7.1
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Attachments: HDFS-10549.001.patch
>
>
> As HADOOP-13264 mentioned, the code dfsClient.endFileLease(fileId) in
> {{DFSOutputStream}} will not be executed when an IOException happens in
> {{closeImpl()}}.
> {code}
> public void close() throws IOException {
>   synchronized (this) {
>     try (TraceScope ignored =
>         dfsClient.newPathTraceScope("DFSOutputStream#close", src)) {
>       closeImpl();
>     }
>   }
>   dfsClient.endFileLease(fileId);
> }
> {code}
> This causes the files to never be closed in {{DFSClient}} and eventually
> leads to a memory leak. The same problem exists in {{DFSStripedOutputStream}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10549) Memory leak if exception happens when closing DFSOutputStream or DFSStripedOutputStream
Yiqun Lin created HDFS-10549: Summary: Memory leak if exception happens when closing DFSOutputStream or DFSStripedOutputStream Key: HDFS-10549 URL: https://issues.apache.org/jira/browse/HDFS-10549 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.7.1 Reporter: Yiqun Lin Assignee: Yiqun Lin As HADOOP-13264 mentioned, the code dfsClient.endFileLease(fileId) in {{DFSOutputStream}} will not be executed when an IOException happens in {{closeImpl()}}.
{code}
public void close() throws IOException {
  synchronized (this) {
    try (TraceScope ignored =
        dfsClient.newPathTraceScope("DFSOutputStream#close", src)) {
      closeImpl();
    }
  }
  dfsClient.endFileLease(fileId);
}
{code}
This causes the files to never be closed in {{DFSClient}} and eventually leads to a memory leak. The same problem exists in {{DFSStripedOutputStream}}.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
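[Editor's note] The failure mode described above, and the obvious shape of a remedy, can be sketched in isolation. This is a minimal illustration, not the attached patch: the DFSClient interaction is reduced to a boolean flag, and all names below are illustrative.

```java
import java.io.IOException;

public class CloseSketch {
    static boolean leaseEnded;

    // Stand-in for DFSOutputStream#closeImpl() failing mid-close.
    static void closeImpl() throws IOException {
        throw new IOException("failure while closing");
    }

    static void endFileLease() {
        leaseEnded = true;
    }

    // Buggy shape from the description: when closeImpl() throws,
    // endFileLease() is never reached, so the lease (and the open-file
    // entry in DFSClient) leaks.
    static boolean leaseEndsWithBuggyClose() {
        leaseEnded = false;
        try {
            closeImpl();
            endFileLease();
        } catch (IOException ignored) {
            // caller would normally see this exception
        }
        return leaseEnded;
    }

    // One possible remedy: try/finally guarantees the lease is released
    // even when closeImpl() throws.
    static boolean leaseEndsWithFixedClose() {
        leaseEnded = false;
        try {
            try {
                closeImpl();
            } finally {
                endFileLease();
            }
        } catch (IOException ignored) {
            // the exception still propagates to the caller in real code
        }
        return leaseEnded;
    }
}
```

The real patch may release the lease differently (for instance only for specific exception types); the sketch only shows why the straight-line call order leaks.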
[jira] [Commented] (HDFS-9924) [umbrella] Nonblocking HDFS Access
[ https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338904#comment-15338904 ] Tsz Wo Nicholas Sze commented on HDFS-9924: --- Sure, let's have a branch for the API development. It seems that we still need a lot of discussion. > [umbrella] Nonblocking HDFS Access > -- > > Key: HDFS-9924 > URL: https://issues.apache.org/jira/browse/HDFS-9924 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Reporter: Tsz Wo Nicholas Sze >Assignee: Xiaobing Zhou > Attachments: Async-HDFS-Performance-Report.pdf, AsyncHdfs20160510.pdf > > > This is an umbrella JIRA for supporting Nonblocking HDFS Access. > Currently, all the API methods are blocking calls -- the caller is blocked > until the method returns. It is very slow if a client makes a large number > of independent calls in a single thread since each call has to wait until the > previous call is finished. It is inefficient if a client needs to create a > large number of threads to invoke the calls. > We propose adding a new API to support nonblocking calls, i.e. the caller is > not blocked. The methods in the new API immediately return a Java Future > object. The return value can be obtained by the usual Future.get() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
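[Editor's note] The call style proposed above can be illustrated with plain {{java.util.concurrent}} primitives. This is a hypothetical sketch only; the method names and executor are stand-ins, not the HDFS-9924 API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncCallSketch {
    static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    // The proposed style: the call submits work and returns a Future
    // immediately instead of blocking until the RPC completes.
    static Future<Boolean> renameAsync(String src, String dst) {
        return POOL.submit(() -> {
            // stand-in for the actual RPC to the NameNode
            return !src.equals(dst);
        });
    }

    // Helper so callers need not handle Future.get()'s checked exceptions.
    static boolean join(Future<Boolean> f) {
        try {
            return f.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

A caller can fire many renameAsync() calls from one thread and only later collect the results via Future.get(), which is exactly the throughput win the description argues for.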
[jira] [Comment Edited] (HDFS-10536) Standby NN can not trigger log roll after EditLogTailer thread failed 3 times in EditLogTailer.triggerActiveLogRoll method.
[ https://issues.apache.org/jira/browse/HDFS-10536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338899#comment-15338899 ] XingFeng Shen edited comment on HDFS-10536 at 6/20/16 2:23 AM: --- Hi [~brahmareddy], [~vinayrpet], please help check this issue and give me some suggestions. was (Author: xingfengshen): [~brahmareddy] [~vinayrpet] please help check this issue and give me some suggestions.
> Standby NN can not trigger log roll after EditLogTailer thread failed 3 times
> in EditLogTailer.triggerActiveLogRoll method.
> ---
>
> Key: HDFS-10536
> URL: https://issues.apache.org/jira/browse/HDFS-10536
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Reporter: XingFeng Shen
> Priority: Critical
> Attachments: HDFS-10536.patch
>
>
> When all NameNodes become standby, EditLogTailer will retry 3 times to
> trigger a log roll, then fail and throw the exception "Cannot find any
> valid remote NN to service request!". After one NameNode becomes active,
> the standby NN still cannot trigger a log roll because the variable
> "nnLoopCount" is still 3; it is never reset to 0.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10536) Standby NN can not trigger log roll after EditLogTailer thread failed 3 times in EditLogTailer.triggerActiveLogRoll method.
[ https://issues.apache.org/jira/browse/HDFS-10536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338899#comment-15338899 ] XingFeng Shen commented on HDFS-10536: -- [~brahmareddy] [~vinayrpet] please help check this issue and give me some suggestions.
> Standby NN can not trigger log roll after EditLogTailer thread failed 3 times
> in EditLogTailer.triggerActiveLogRoll method.
> ---
>
> Key: HDFS-10536
> URL: https://issues.apache.org/jira/browse/HDFS-10536
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: auto-failover
> Reporter: XingFeng Shen
> Priority: Critical
> Attachments: HDFS-10536.patch
>
>
> When all NameNodes become standby, EditLogTailer will retry 3 times to
> trigger a log roll, then fail and throw the exception "Cannot find any
> valid remote NN to service request!". After one NameNode becomes active,
> the standby NN still cannot trigger a log roll because the variable
> "nnLoopCount" is still 3; it is never reset to 0.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
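[Editor's note] The reported bug and one possible fix reduce to a small sketch. This is hypothetical: the real EditLogTailer issues RPCs to remote NameNodes, the candidates here are just booleans, and the attached HDFS-10536.patch may fix it differently.

```java
public class LogRollSketch {
    static final int MAX_RETRIES = 3;
    static int nnLoopCount = 0; // index of the next candidate NN to try

    // Walk the candidate NNs looking for one that accepts the log roll.
    // The reported bug: after three failures the counter saturates at 3,
    // so every later call skips the loop body and fails immediately.
    // The finally block sketches a fix: reset the counter on every exit.
    static boolean triggerActiveLogRoll(boolean[] nnActive) {
        try {
            for (; nnLoopCount < MAX_RETRIES; nnLoopCount++) {
                if (nnLoopCount < nnActive.length && nnActive[nnLoopCount]) {
                    return true; // this "NN" accepted the roll
                }
            }
            // real code throws "Cannot find any valid remote NN to
            // service request!" at this point
            return false;
        } finally {
            nnLoopCount = 0; // without this reset, later calls never retry
        }
    }
}
```

With the `finally` reset removed, a second call after a fully-failed round would start at nnLoopCount == 3 and return false even with an active NN available, which is the behavior the description reports.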
[jira] [Commented] (HDFS-10548) Remove the long deprecated BlockReaderRemote
[ https://issues.apache.org/jira/browse/HDFS-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338879#comment-15338879 ] Kai Zheng commented on HDFS-10548: -- [~cmccabe], how would you like this if we do it targeting the 3.0 release? Thanks!
> Remove the long deprecated BlockReaderRemote
> 
>
> Key: HDFS-10548
> URL: https://issues.apache.org/jira/browse/HDFS-10548
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Kai Zheng
> Assignee: Kai Zheng
>
> To lessen the maintenance burden raised in HDFS-8901, I suggest we remove
> the {{BlockReaderRemote}} class, which was deprecated a very long time ago.
> From the {{BlockReaderRemote}} header:
> {quote}
> * @deprecated this is an old implementation that is being left around
> * in case any issues spring up with the new {@link BlockReaderRemote2}
> * implementation.
> * It will be removed in the next release.
> {quote}
> From the {{BlockReaderRemote2}} class header:
> {quote}
> * This is a new implementation introduced in Hadoop 0.23 which
> * is more efficient and simpler than the older BlockReader
> * implementation. It should be renamed to BlockReaderRemote
> * once we are confident in it.
> {quote}
> Going even further, after getting rid of the old class, we could rename it
> as the comments suggest: BlockReaderRemote2 => BlockReaderRemote.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-10548) Remove the long deprecated BlockReaderRemote
Kai Zheng created HDFS-10548: Summary: Remove the long deprecated BlockReaderRemote Key: HDFS-10548 URL: https://issues.apache.org/jira/browse/HDFS-10548 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Reporter: Kai Zheng Assignee: Kai Zheng To lessen the maintenance burden raised in HDFS-8901, I suggest we remove the {{BlockReaderRemote}} class, which was deprecated a very long time ago.
From the {{BlockReaderRemote}} header:
{quote}
* @deprecated this is an old implementation that is being left around
* in case any issues spring up with the new {@link BlockReaderRemote2}
* implementation.
* It will be removed in the next release.
{quote}
From the {{BlockReaderRemote2}} class header:
{quote}
* This is a new implementation introduced in Hadoop 0.23 which
* is more efficient and simpler than the older BlockReader
* implementation. It should be renamed to BlockReaderRemote
* once we are confident in it.
{quote}
Going even further, after getting rid of the old class, we could rename it as the comments suggest: BlockReaderRemote2 => BlockReaderRemote.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8901) Use ByteBuffer in striping positional read
[ https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338872#comment-15338872 ] Kai Zheng commented on HDFS-8901: - Thanks Bo for the update and for fixing the test failure. The fix implementing the {{read(ByteBuffer buf)}} method looks good, but I wonder if we could remove the long deprecated class entirely. I will raise this separately.
{code}
diff --git a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderRemote.java b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderRemote.java
index 22d4e23..d7c5da9 100644
--- a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderRemote.java
+++ b/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderRemote.java
@@ -480,7 +480,21 @@ void sendReadResult(Peer peer, Status statusCode) {
   @Override
   public int read(ByteBuffer buf) throws IOException {
-    throw new UnsupportedOperationException("readDirect unsupported in BlockReaderRemote");
+    int toRead = buf.remaining();
+    int nRead;
+    if (buf.isDirect()) {
+      byte[] bytes = new byte[toRead];
+      nRead = read(bytes, 0, toRead);
+      if (nRead > 0) {
+        buf.put(bytes, 0, nRead);
+      }
+    } else {
+      nRead = read(buf.array(), buf.arrayOffset(), toRead);
+      if (nRead > 0) {
+        buf.position(buf.position() + nRead);
+      }
+    }
+    return nRead;
   }
{code}
> Use ByteBuffer in striping positional read
> --
>
> Key: HDFS-8901
> URL: https://issues.apache.org/jira/browse/HDFS-8901
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Attachments: HDFS-8901-v2.patch, HDFS-8901-v3.patch,
> HDFS-8901-v4.patch, HDFS-8901-v5.patch, HDFS-8901-v6.patch,
> HDFS-8901-v7.patch, HDFS-8901-v8.patch, HDFS-8901-v9.patch, initial-poc.patch
>
>
> Native erasure coders prefer direct ByteBuffers for performance
> reasons. To prepare for that, this change uses ByteBuffer throughout the
> code implementing striping positional read. It also avoids
> unnecessary data copying between striping read chunk buffers and decode input
> buffers.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
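[Editor's note] The fallback logic in the diff above can be exercised standalone. This sketch reads from an in-memory byte[] rather than a real block stream; it also adds {{buf.position()}} to the array offset so the heap-buffer path works even when the buffer has already been partially filled (the quoted patch indexes from {{arrayOffset()}} alone).

```java
import java.nio.ByteBuffer;

public class ByteBufferReadSketch {
    // Mirrors the shape of the HDFS-8901 diff: satisfy read(ByteBuffer) on
    // top of a byte[]-based read. "source"/"srcOff" stand in for the block
    // stream the real BlockReaderRemote reads from.
    static int read(ByteBuffer buf, byte[] source, int srcOff) {
        int toRead = Math.min(buf.remaining(), source.length - srcOff);
        if (toRead <= 0) {
            return -1; // nothing left to read
        }
        if (buf.isDirect()) {
            // Direct buffers expose no backing array: stage through a
            // temporary heap array; put() advances the position itself.
            byte[] bytes = new byte[toRead];
            System.arraycopy(source, srcOff, bytes, 0, toRead);
            buf.put(bytes, 0, toRead);
        } else {
            // Heap buffers: copy straight into the backing array, then
            // advance the position manually, as the patch does.
            System.arraycopy(source, srcOff, buf.array(),
                buf.arrayOffset() + buf.position(), toRead);
            buf.position(buf.position() + toRead);
        }
        return toRead;
    }
}
```

The two branches exist because only heap buffers back onto an accessible array; the direct-buffer branch trades an extra copy for correctness.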
[jira] [Comment Edited] (HDFS-10460) Erasure Coding: Recompute block checksum for a particular range less than file size on the fly by reconstructing missed block
[ https://issues.apache.org/jira/browse/HDFS-10460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338600#comment-15338600 ] Rakesh R edited comment on HDFS-10460 at 6/19/16 4:33 PM: -- Thanks [~drankye] for the review comments.
bq. 1. Could you explain why we need to add actualNumBytes for this, or elaborate a bit in the description for better understanding
I've used the {{actualNumBytes}} parameter to reconstruct the block correctly. Initially I tried the {{requestLength}} value for reconstructing the block, but that led to the following exception. IIUC this could occur in cases where the requested length conflicts with the target buffer size. You can probably reproduce this exception by commenting out the setting of the actualNumBytes value after applying my patch and running {{TestFileChecksum#testStripedFileChecksumWithMissedDataBlocksRangeQuery1}}:
{code}
BlockChecksumHelper.java line no#481
ExtendedBlock reconBlockGroup = new ExtendedBlock(blockGroup);
// reconBlockGroup.setNumBytes(actualNumBytes);
{code}
{code}
2016-06-19 21:37:34,583 [DataXceiver for client /127.0.0.1:5882 [Getting checksum for block group BP-1490511527-10.252.155.196-1466352430600:blk_-9223372036854775792_1001]] ERROR datanode.DataNode (DataXceiver.java:run(316)) - 127.0.0.1:5333:DataXceiver error processing BLOCK_GROUP_CHECKSUM operation src: /127.0.0.1:5882 dst: /127.0.0.1:5333
org.apache.hadoop.HadoopIllegalArgumentException: No enough valid inputs are provided, not recoverable
    at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkInputBuffers(ByteBufferDecodingState.java:107)
{code}
I have taken the following approach to handle a requestedLen less than cellSize. First, actualNumBytes is used to reconstruct the buffers; then a copy of the target buffer is taken using the remaining length. The checksum is then calculated using this copied buffer.
{code}
StripedBlockChecksumReconstructor.java line no#93
if (requestedLen <= toReconstructLen) {
  int remainingLen = (int) requestedLen;
  outputData = Arrays.copyOf(targetBuffer.array(), remainingLen);
{code}
bq. 1) you mean less than bytesPerCRC, but in fact you passed bytesPerCRC as the request length. 2) you could get bytesPerCRC and save it in setup method? So you can use it in other tests.
Yes, I will make these modifications in the next patch. was (Author: rakeshr): Thanks [~drankye] for the review comments.
bq. 1. Could you explain why we need to add actualNumBytes for this, or elaborate a bit in the description for better understanding
I've used the {{actualNumBytes}} parameter to reconstruct the block correctly. Initially I tried the {{requestLength}} value for reconstructing the block, but that led to the following exception. IIUC this could occur in cases where the requested length conflicts with the target buffer size. You can probably reproduce this exception by commenting out the setting of the actualNumBytes value after applying my patch and running {{TestFileChecksum#testStripedFileChecksumWithMissedDataBlocksRangeQuery1}}:
{code}
BlockChecksumHelper.java line no#481
ExtendedBlock reconBlockGroup = new ExtendedBlock(blockGroup);
// reconBlockGroup.setNumBytes(actualNumBytes);
{code}
{code}
2016-06-19 21:37:34,583 [DataXceiver for client /127.0.0.1:5882 [Getting checksum for block group BP-1490511527-10.252.155.196-1466352430600:blk_-9223372036854775792_1001]] ERROR datanode.DataNode (DataXceiver.java:run(316)) - 127.0.0.1:5333:DataXceiver error processing BLOCK_GROUP_CHECKSUM operation src: /127.0.0.1:5882 dst: /127.0.0.1:5333
org.apache.hadoop.HadoopIllegalArgumentException: No enough valid inputs are provided, not recoverable
    at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkInputBuffers(ByteBufferDecodingState.java:107)
{code}
bq. 1) you mean less than bytesPerCRC, but in fact you passed bytesPerCRC as the request length. 2) you could get bytesPerCRC and save it in setup method? So you can use it in other tests.
Yes, I will make these modifications in the next patch.
> Erasure Coding: Recompute block checksum for a particular range less than
> file size on the fly by reconstructing missed block
> -
>
> Key: HDFS-10460
> URL: https://issues.apache.org/jira/browse/HDFS-10460
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: HDFS-10460-00.patch, HDFS-10460-01.patch
>
>
> This jira is a HDFS-9833 follow-on task to address reconstructing a block and
> then recalculating the block checksum for a particular range query.
> For example,
> {code}
> // create a file 'stripedFile1' with
[jira] [Commented] (HDFS-10460) Erasure Coding: Recompute block checksum for a particular range less than file size on the fly by reconstructing missed block
[ https://issues.apache.org/jira/browse/HDFS-10460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338600#comment-15338600 ] Rakesh R commented on HDFS-10460: - Thanks [~drankye] for the review comments.
bq. 1. Could you explain why we need to add actualNumBytes for this, or elaborate a bit in the description for better understanding
I've used the {{actualNumBytes}} parameter to reconstruct the block correctly. Initially I tried the {{requestLength}} value for reconstructing the block, but that led to the following exception. IIUC this could occur in cases where the requested length conflicts with the target buffer size. You can probably reproduce this exception by commenting out the setting of the actualNumBytes value after applying my patch and running {{TestFileChecksum#testStripedFileChecksumWithMissedDataBlocksRangeQuery1}}:
{code}
BlockChecksumHelper.java line no#481
ExtendedBlock reconBlockGroup = new ExtendedBlock(blockGroup);
// reconBlockGroup.setNumBytes(actualNumBytes);
{code}
{code}
2016-06-19 21:37:34,583 [DataXceiver for client /127.0.0.1:5882 [Getting checksum for block group BP-1490511527-10.252.155.196-1466352430600:blk_-9223372036854775792_1001]] ERROR datanode.DataNode (DataXceiver.java:run(316)) - 127.0.0.1:5333:DataXceiver error processing BLOCK_GROUP_CHECKSUM operation src: /127.0.0.1:5882 dst: /127.0.0.1:5333
org.apache.hadoop.HadoopIllegalArgumentException: No enough valid inputs are provided, not recoverable
    at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkInputBuffers(ByteBufferDecodingState.java:107)
{code}
bq. 1) you mean less than bytesPerCRC, but in fact you passed bytesPerCRC as the request length. 2) you could get bytesPerCRC and save it in setup method? So you can use it in other tests.
Yes, I will make these modifications in the next patch.
> Erasure Coding: Recompute block checksum for a particular range less than
> file size on the fly by reconstructing missed block
> -
>
> Key: HDFS-10460
> URL: https://issues.apache.org/jira/browse/HDFS-10460
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Rakesh R
> Assignee: Rakesh R
> Attachments: HDFS-10460-00.patch, HDFS-10460-01.patch
>
>
> This jira is a HDFS-9833 follow-on task to address reconstructing a block and
> then recalculating the block checksum for a particular range query.
> For example,
> {code}
> // create a file 'stripedFile1' with fileSize = cellSize * numDataBlocks =
> 65536 * 6 = 393216
> FileChecksum stripedFileChecksum = getFileChecksum(stripedFile1, 10, true);
> {code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
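[Editor's note] The range-limited checksum step discussed above (reconstruct a full buffer, then truncate it to the requested length before checksumming) reduces to an {{Arrays.copyOf}} plus a checksum update. A hedged sketch follows: the reconstruction itself is elided, and CRC32 stands in for the actual block checksum machinery, so names and types here are illustrative only.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

public class RangeChecksumSketch {
    // Given a fully reconstructed cell buffer, checksum only the first
    // requestedLen bytes, mirroring the Arrays.copyOf step quoted from
    // StripedBlockChecksumReconstructor in the patch discussion.
    static long checksumOfRange(byte[] reconstructedCell, int requestedLen) {
        byte[] ranged = Arrays.copyOf(reconstructedCell, requestedLen);
        CRC32 crc = new CRC32();
        crc.update(ranged, 0, ranged.length);
        return crc.getValue();
    }
}
```

The point of the copy is that the decoder must be fed full-sized buffers (hence reconstructing with actualNumBytes), while the checksum must cover only the caller's requested range.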
[jira] [Commented] (HDFS-10534) NameNode WebUI should display DataNode usage rate with a certain percentile
[jira] [Commented] (HDFS-10534) NameNode WebUI should display DataNode usage rate with a certain percentile
[ https://issues.apache.org/jira/browse/HDFS-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338580#comment-15338580 ]

Hadoop QA commented on HDFS-10534:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 31s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 34s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 34s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 585 unchanged - 2 fixed = 587 total (was 587) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 25s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 24s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 26s {color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 18m 31s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:e2f6409 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12811711/HDFS-10534.03.patch |
| JIRA Issue | HDFS-10534 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 18dfc493fdeb 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0319d73 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/15828/artifact/patchprocess/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt |
| compile | https://builds.apache.org/job/PreCommit-HDFS-Build/15828/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/15828/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/15828/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| mvnsite | https://builds.apache.org/job/PreCommit-HDFS-Build/15828/artifact/patchprocess/patch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt |
| findbugs |
[jira] [Updated] (HDFS-10534) NameNode WebUI should display DataNode usage rate with a certain percentile
[ https://issues.apache.org/jira/browse/HDFS-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kai Sasaki updated HDFS-10534:
------------------------------
    Attachment: HDFS-10534.03.patch

> NameNode WebUI should display DataNode usage rate with a certain percentile
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-10534
>                 URL: https://issues.apache.org/jira/browse/HDFS-10534
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode, ui
>            Reporter: Zhe Zhang
>            Assignee: Kai Sasaki
>         Attachments: HDFS-10534.01.patch, HDFS-10534.02.patch, HDFS-10534.03.patch
>
> In addition to *Min/Median/Max*, another meaningful metric for cluster
> balance is the DN usage rate at a certain percentile (e.g. 90 or 95). We should
> add a config option, and another field on the NN WebUI, to display this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
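[Editor's sketch] The percentile metric proposed above could be computed with the nearest-rank method. The class and method names below are hypothetical and this is not the HDFS-10534 patch — just a minimal, self-contained illustration of what "DN usage rate at the 90th percentile" means:

```java
import java.util.Arrays;

// Hypothetical sketch (not the HDFS-10534 implementation): nearest-rank
// percentile over per-DataNode storage usage rates.
public class UsagePercentile {
    // usages: per-DataNode usage rates in percent; p in (0, 100].
    static double percentile(double[] usages, double p) {
        double[] sorted = usages.clone();
        Arrays.sort(sorted);
        // Nearest-rank: ceil(p/100 * n) is the 1-based rank of the result.
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Ten DataNodes with usage rates 10%..100%.
        double[] usages = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        System.out.println(percentile(usages, 90)); // prints 90.0
    }
}
```

With a balanced cluster the 90th-percentile usage sits close to the median; a large gap between them is exactly the imbalance signal the WebUI field would surface.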
[jira] [Commented] (HDFS-10460) Erasure Coding: Recompute block checksum for a particular range less than file size on the fly by reconstructing missed block
[ https://issues.apache.org/jira/browse/HDFS-10460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338470#comment-15338470 ]

Kai Zheng commented on HDFS-10460:
----------------------------------

Thanks [~rakeshr] for handling the hard part! A quick look gave me the following comments; I will do a careful review later.

1. Could you explain why we need to add {{actualNumBytes}} for this, or elaborate a bit in the description for better understanding? I'm thinking maybe we could use {{requestLength}} for the extra needed info, and {{actualNumBytes}} could instead be set on the block group. Not sure if this would be better.

2. The newly added tests look great! Regarding this code: 1) the comment says less than bytesPerCRC, but in fact bytesPerCRC itself is passed as the request length; 2) could you fetch {{bytesPerCRC}} once in the setup method and save it, so the other tests can use it too?

{code}
+  /**
+   * Test to verify that the checksum can be computed by giving less than
+   * bytesPerCRC length of the file range for checksum calculation. 512 is the
+   * value of bytesPerCRC.
+   */
+  @Test(timeout = 9)
+  public void testStripedFileChecksumWithMissedDataBlocksRangeQuery2()
+      throws Exception {
+    int bytesPerCRC = conf.getInt(
+        HdfsClientConfigKeys.DFS_BYTES_PER_CHECKSUM_KEY,
+        HdfsClientConfigKeys.DFS_BYTES_PER_CHECKSUM_DEFAULT);
+    testStripedFileChecksumWithMissedDataBlocksRangeQuery(stripedFile1,
+        bytesPerCRC);
+  }
{code}

> Erasure Coding: Recompute block checksum for a particular range less than
> file size on the fly by reconstructing missed block
> --------------------------------------------------------------------------
>
>                 Key: HDFS-10460
>                 URL: https://issues.apache.org/jira/browse/HDFS-10460
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-10460-00.patch, HDFS-10460-01.patch
>
> This jira is a HDFS-9833 follow-on task to address reconstructing a block and
> then recalculating the block checksum for a particular range query.
> For example,
> {code}
> // create a file 'stripedFile1' with fileSize = cellSize * numDataBlocks =
> // 65536 * 6 = 393216
> FileChecksum stripedFileChecksum = getFileChecksum(stripedFile1, 10, true);
> {code}
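[Editor's sketch] The range-query semantics discussed in this thread — checksumming only the first {{requestLength}} bytes of the data rather than the whole file — can be illustrated with plain {{java.util.zip.CRC32}}. This is illustrative only: HDFS actually composes per-chunk CRCs into an MD5-of-CRCs file checksum, and the class and method names here are hypothetical:

```java
import java.util.zip.CRC32;

// Illustrative sketch only (not HDFS's MD5-of-block-CRCs pipeline):
// checksum just the first requestLength bytes of a block, as in a
// range-limited getFileChecksum(file, requestLength, ...) query.
public class RangeChecksum {
    static long checksumRange(byte[] block, int requestLength) {
        CRC32 crc = new CRC32();
        // Only the requested prefix of the block contributes to the checksum.
        crc.update(block, 0, Math.min(requestLength, block.length));
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] block = new byte[1024];
        for (int i = 0; i < block.length; i++) {
            block[i] = (byte) i;
        }
        // A shorter request covers only the requested prefix, so the two
        // values will generally differ.
        System.out.println(checksumRange(block, 512));
        System.out.println(checksumRange(block, 1024));
    }
}
```

This is why a request length below bytesPerCRC is an interesting boundary case for the tests above: the range then ends inside the very first checksum chunk.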