[jira] Commented: (HDFS-453) XML-based metrics as JSP servlet for NameNode
[ https://issues.apache.org/jira/browse/HDFS-453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883053#action_12883053 ] Hadoop QA commented on HDFS-453: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12446660/HDFS-453.7.patch against trunk revision 957669. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 24 javac compiler warnings (more than the trunk's current 23 warnings). -1 findbugs. The patch appears to introduce 2 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/405/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/405/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/405/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/405/console This message is automatically generated. 
XML-based metrics as JSP servlet for NameNode - Key: HDFS-453 URL: https://issues.apache.org/jira/browse/HDFS-453 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.21.0, 0.22.0 Reporter: Aaron Kimball Assignee: Aaron Kimball Fix For: 0.21.0, 0.22.0 Attachments: dfshealth.xml.jspx, example-dfshealth.xml, HDFS-453.2.patch, HDFS-453.3.patch, HDFS-453.4.patch, HDFS-453.5.patch, HDFS-453.6.patch, HDFS-453.7.patch, HDFS-453.patch In HADOOP-4559, a general REST API for reporting metrics was proposed but work seems to have stalled. In the interim, we have a simple XML translation of the existing NameNode status page which provides the same metrics as the human-readable page. This is a relatively lightweight addition to provide some machine-understandable metrics reporting. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1205) FSDatasetAsyncDiskService should name its threads
[ https://issues.apache.org/jira/browse/HDFS-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883056#action_12883056 ] Hadoop QA commented on HDFS-1205: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447006/hdfs-1205-0.20.txt against trunk revision 957669. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/406/console This message is automatically generated. FSDatasetAsyncDiskService should name its threads - Key: HDFS-1205 URL: https://issues.apache.org/jira/browse/HDFS-1205 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1205-0.20.txt, hdfs-1205.txt FSDatasetAsyncService creates threads but doesn't name them. The ThreadFactory should name them with the volume they work on as well as a thread index. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
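The improvement described above can be sketched with a custom ThreadFactory. This is a minimal illustration, not the attached patch; the class name and the exact name format are assumptions:

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: name each worker thread with its volume and a per-volume index. */
public class VolumeThreadFactory implements ThreadFactory {
    private final String volume;                        // e.g. "/data/1/dfs/dn"
    private final AtomicInteger counter = new AtomicInteger(0);

    public VolumeThreadFactory(String volume) {
        this.volume = volume;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r);
        // A named thread makes jstack output and logs attributable to a volume.
        t.setName("Async disk worker #" + counter.getAndIncrement()
                + " for volume " + volume);
        return t;
    }
}
```

A factory like this would be passed to the ThreadPoolExecutor created per volume, so every thread in a stack dump identifies the disk it serves.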
[jira] Commented: (HDFS-1203) DataNode should sleep before reentering service loop after an exception
[ https://issues.apache.org/jira/browse/HDFS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883059#action_12883059 ] Hadoop QA commented on HDFS-1203: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12446925/hdfs-1203.txt against trunk revision 957669. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/197/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/197/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/197/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/197/console This message is automatically generated. 
DataNode should sleep before reentering service loop after an exception --- Key: HDFS-1203 URL: https://issues.apache.org/jira/browse/HDFS-1203 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1203.txt When the DN gets an exception in response to a heartbeat, it logs it and continues, but there is no sleep. I've occasionally seen bugs produce a case where heartbeats continuously produce exceptions, and thus the DN floods the NN with bad heartbeats. Adding a 1 second sleep at least throttles the error messages for easier debugging and error isolation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
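The proposed throttle is a one-second sleep in the error path of the service loop. The sketch below simulates it with a heartbeat stub that fails twice before succeeding; class and method names are illustrative, not the actual DataNode code:

```java
/** Sketch of a heartbeat loop that backs off after a failure. */
public class ThrottledHeartbeat {
    static final long ERROR_SLEEP_MS = 1000;  // the 1-second sleep proposed above

    private int attempts = 0;

    // Simulated heartbeat RPC that fails the first two times it is called.
    private void sendHeartbeat() throws Exception {
        attempts++;
        if (attempts < 3) {
            throw new Exception("simulated heartbeat failure");
        }
    }

    /** Runs the loop until a heartbeat succeeds; returns the attempt count. */
    public int offerService() throws InterruptedException {
        while (true) {
            try {
                sendHeartbeat();
                return attempts;
            } catch (Exception e) {
                // Without this sleep a persistent error makes the DN spin,
                // flooding the NN with bad heartbeats and the log with
                // identical stack traces; the pause throttles both.
                Thread.sleep(ERROR_SLEEP_MS);
            }
        }
    }
}
```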
[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel
[ https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883100#action_12883100 ] Hadoop QA commented on HDFS-1071: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447486/HDFS-1071.5.patch against trunk revision 957669. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/410/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/410/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/410/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/410/console This message is automatically generated. savenamespace should write the fsimage to all configured fs.name.dir in parallel Key: HDFS-1071 URL: https://issues.apache.org/jira/browse/HDFS-1071 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, HDFS-1071.5.patch, HDFS-1071.patch If you have a large number of files in HDFS, the fsimage file is very big. 
When the namenode restarts, it writes a copy of the fsimage to all directories configured in fs.name.dir. This takes a long time, especially if there are many directories in fs.name.dir. Make the NN write the fsimage to all these directories in parallel. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
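The parallel write described above amounts to one writer thread per configured directory, joined before the save is declared complete. A simplified sketch, assuming hypothetical names and ignoring the real image format:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch: write the same image bytes to every storage directory in parallel. */
public class ParallelImageSaver {
    public static void saveToAll(byte[] image, List<Path> dirs)
            throws InterruptedException {
        List<Thread> savers = new ArrayList<>();
        for (Path dir : dirs) {
            Thread t = new Thread(() -> {
                try {
                    Files.write(dir.resolve("fsimage"), image);
                } catch (IOException e) {
                    // A real implementation would mark this storage directory
                    // as failed rather than abort the whole checkpoint.
                    System.err.println("save failed for " + dir + ": " + e);
                }
            }, "FSImageSaver for " + dir);
            t.start();                // one writer thread per directory
            savers.add(t);
        }
        for (Thread t : savers) {
            t.join();                 // done only when every copy is written
        }
    }
}
```

With directories on different physical disks, the elapsed time approaches that of the slowest single write instead of the sum of all writes.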
[jira] Commented: (HDFS-1202) DataBlockScanner throws NPE when updated before initialized
[ https://issues.apache.org/jira/browse/HDFS-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883147#action_12883147 ] Hadoop QA commented on HDFS-1202: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447746/hdfs-1202.txt against trunk revision 957669. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/411/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/411/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/411/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/411/console This message is automatically generated. 
DataBlockScanner throws NPE when updated before initialized --- Key: HDFS-1202 URL: https://issues.apache.org/jira/browse/HDFS-1202 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.20-append, 0.22.0 Attachments: hdfs-1202-0.20-append.txt, hdfs-1202.txt Missing an isInitialized() check in updateScanStatusInternal -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
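The shape of the fix is a guard before the update path touches lazily initialized state. A hypothetical sketch (names modeled on, but not copied from, DataBlockScanner):

```java
/**
 * Sketch of guarding an update against use before initialization.
 * The actual fix adds an isInitialized() check in
 * DataBlockScanner.updateScanStatusInternal; everything else here is assumed.
 */
public class ScannerSketch {
    private volatile java.util.Map<String, Long> scanTimes = null;  // set by initialize()

    boolean isInitialized() {
        return scanTimes != null;
    }

    void initialize() {
        scanTimes = new java.util.concurrent.ConcurrentHashMap<>();
    }

    /** Returns false instead of throwing an NPE when called before initialize(). */
    boolean updateScanStatus(String blockId, long when) {
        if (!isInitialized()) {
            return false;  // previously: scanTimes.put(...) -> NullPointerException
        }
        scanTimes.put(blockId, when);
        return true;
    }
}
```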
[jira] Commented: (HDFS-1268) Extract blockInvalidateLimit as a separate configuration
[ https://issues.apache.org/jira/browse/HDFS-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883176#action_12883176 ] jinglong.liujl commented on HDFS-1268: -- In my case, deleting 600 blocks means waiting 6 heartbeat periods. During that time the disk may reach capacity, and the slow pace of block removal then causes write failures. The default value (100) works well in the general case, but in this extreme case it is not enough. The parameter can currently be computed from heartbeatInterval, but in the scenario above, a slower heartbeat that carries more blocks per heartbeat still cannot remove more blocks in the same period. Why not make this parameter configurable? Extract blockInvalidateLimit as a separate configuration - Key: HDFS-1268 URL: https://issues.apache.org/jira/browse/HDFS-1268 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: jinglong.liujl Attachments: patch.diff If many blocks pile up in recentInvalidateSets, only Math.max(blockInvalidateLimit, 20*(int)(heartbeatInterval/1000)) invalid blocks can be carried in a heartbeat (by default, 100). Under high write stress, the removal of invalidated blocks cannot keep up with the speed of writing. We extract blockInvalidateLimit into a separate config parameter so users can pick the right value for their cluster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1268) Extract blockInvalidateLimit as a separate configuration
[ https://issues.apache.org/jira/browse/HDFS-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883277#action_12883277 ] Konstantin Shvachko commented on HDFS-1268: --- I was actually in favor of introducing the parameter, see [here|https://issues.apache.org/jira/browse/HADOOP-774?focusedCommentId=12455413page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12455413] So this is mostly about a clear motivation, and about making sure the solution will actually work for you. You are talking about a corner case where a DN is almost full and needs to remove blocks faster in order to free space for subsequent writes, right? How does this parameter help on a running cluster? A configuration change takes effect only when you restart the name-node. Do you plan to restart the cluster when you see data-nodes getting close to full? Extract blockInvalidateLimit as a separate configuration - Key: HDFS-1268 URL: https://issues.apache.org/jira/browse/HDFS-1268 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: jinglong.liujl Attachments: patch.diff If many blocks pile up in recentInvalidateSets, only Math.max(blockInvalidateLimit, 20*(int)(heartbeatInterval/1000)) invalid blocks can be carried in a heartbeat (by default, 100). Under high write stress, the removal of invalidated blocks cannot keep up with the speed of writing. We extract blockInvalidateLimit into a separate config parameter so users can pick the right value for their cluster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
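The arithmetic behind the complaint can be checked directly. Using the formula quoted in the issue, and assuming the defaults it cites (limit 100, a 3-second heartbeat), 600 queued blocks indeed take 6 heartbeats to clear:

```java
/** Reproduces the per-heartbeat invalidation limit from the issue description. */
public class InvalidateLimitMath {
    static int perHeartbeat(int blockInvalidateLimit, long heartbeatIntervalMs) {
        // Math.max(blockInvalidateLimit, 20*(int)(heartbeatInterval/1000))
        return Math.max(blockInvalidateLimit, 20 * (int) (heartbeatIntervalMs / 1000));
    }

    static int heartbeatsToDelete(int blocks, int perHeartbeat) {
        return (blocks + perHeartbeat - 1) / perHeartbeat;  // ceiling division
    }

    public static void main(String[] args) {
        int limit = perHeartbeat(100, 3000);  // assumed defaults from the issue
        System.out.println(limit + " blocks per heartbeat; deleting 600 blocks takes "
                + heartbeatsToDelete(600, limit) + " heartbeats");
    }
}
```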
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883296#action_12883296 ] Todd Lipcon commented on HDFS-1262: --- bq. so it really is a glorified 'cleanup and close' which has the same behavior as if the lease expired--nice and tidy imo. It does have the slight delay of lease recovery, though. I think that makes sense - best to do recovery since we might have gotten halfway through creating the pipeline, for example, and this will move the blocks back to finalized state on the DNs. Performance shouldn't be a concern, since this is such a rare case. bq. While in theory it could happen on the NN side, right now, the namenode RPC for create happens and then all we do is start the streamer (hence i don't have a test case for it yet). What happens if we have a transient network error? For example, let's say the client is on the same machine as the NN, but it got partitioned from the network for a bit. When we call create(), it succeeds, but then when we actually try to write the blocks, it fails temporarily. This currently leaves a 0-length file, but does it also orphan the lease for that file? Failed pipeline creation during append leaves lease hanging on NN - Key: HDFS-1262 URL: https://issues.apache.org/jira/browse/HDFS-1262 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Critical Fix For: 0.20-append Ryan Rawson came upon this nasty bug in HBase cluster testing. 
What happened was the following: 1) The file's original writer died. 2) The recovery client tried to open the file for append - it looped for a minute or so until the soft lease expired, then the append call initiated recovery. 3) Recovery completed successfully. 4) The recovery client called append again, which succeeded on the NN. 5) For some reason, the block recovery that happens at the start of append pipeline creation failed on all datanodes 6 times, causing the append() call to throw an exception back to the HBase master. HBase assumed the file wasn't open and put it back on a queue to try later. 6) Some time later, it tried append again, but the lease was still assigned to the same DFS client, so it wasn't able to recover. The recovery failure in step 5 is a separate issue, but the problem for this JIRA is that a client can think it failed to open a file for append while the NN still thinks that client holds the lease. Since the writer keeps renewing its lease, recovery never happens, and no one can open or recover the file until the DFS client shuts down. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1202) DataBlockScanner throws NPE when updated before initialized
[ https://issues.apache.org/jira/browse/HDFS-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883311#action_12883311 ] Hadoop QA commented on HDFS-1202: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447746/hdfs-1202.txt against trunk revision 957669. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/412/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/412/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/412/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/412/console This message is automatically generated. 
DataBlockScanner throws NPE when updated before initialized --- Key: HDFS-1202 URL: https://issues.apache.org/jira/browse/HDFS-1202 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.20-append, 0.22.0 Attachments: hdfs-1202-0.20-append.txt, hdfs-1202.txt Missing an isInitialized() check in updateScanStatusInternal -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1108) ability to create a file whose newly allocated blocks are automatically persisted immediately
[ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883320#action_12883320 ] Hadoop QA commented on HDFS-1108: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12447758/HDFS-1108.patch against trunk revision 957669. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/203/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/203/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/203/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/203/console This message is automatically generated. ability to create a file whose newly allocated blocks are automatically persisted immediately - Key: HDFS-1108 URL: https://issues.apache.org/jira/browse/HDFS-1108 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-1108.patch The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the block is allocated. 
Instead, a hflush() or a close() on the file persists the blocks into the transaction log. It would be nice if we could immediately persist newly allocated blocks (as soon as they are allocated) for specific files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
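The difference between the current behavior and the proposal can be sketched with a toy edit log; the class, flag, and op strings below are hypothetical, not HDFS code:

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model contrasting persist-on-close with the proposed persist-on-allocate. */
public class BlockPersistencePolicy {
    final List<String> editLog = new ArrayList<>();  // stand-in for the NN edit log
    final boolean persistBlocksOnAllocation;         // the proposed per-file option

    BlockPersistencePolicy(boolean persistBlocksOnAllocation) {
        this.persistBlocksOnAllocation = persistBlocksOnAllocation;
    }

    void allocateBlock(String block) {
        if (persistBlocksOnAllocation) {
            // Proposed: the allocation survives an NN restart immediately.
            editLog.add("ADD_BLOCK " + block);
        }
        // Current behavior: nothing is logged until hflush()/close().
    }

    void close(List<String> blocks) {
        editLog.add("CLOSE " + blocks);  // where blocks get persisted today
    }
}
```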
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883319#action_12883319 ] sam rash commented on HDFS-1262: In the 2nd case, can't the client still call close? Or will it hang forever waiting for blocks? Either way, I've got test cases for create() + append() and the fix. It took a little longer to clean up today, but I will post the patch by end of day. Failed pipeline creation during append leaves lease hanging on NN - Key: HDFS-1262 URL: https://issues.apache.org/jira/browse/HDFS-1262 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Critical Fix For: 0.20-append Ryan Rawson came upon this nasty bug in HBase cluster testing. What happened was the following: 1) The file's original writer died. 2) The recovery client tried to open the file for append - it looped for a minute or so until the soft lease expired, then the append call initiated recovery. 3) Recovery completed successfully. 4) The recovery client called append again, which succeeded on the NN. 5) For some reason, the block recovery that happens at the start of append pipeline creation failed on all datanodes 6 times, causing the append() call to throw an exception back to the HBase master. HBase assumed the file wasn't open and put it back on a queue to try later. 6) Some time later, it tried append again, but the lease was still assigned to the same DFS client, so it wasn't able to recover. The recovery failure in step 5 is a separate issue, but the problem for this JIRA is that a client can think it failed to open a file for append while the NN still thinks that client holds the lease. Since the writer keeps renewing its lease, recovery never happens, and no one can open or recover the file until the DFS client shuts down. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1272) HDFS changes corresponding to rename of TokenStorage to Credentials
[ https://issues.apache.org/jira/browse/HDFS-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1272: --- Attachment: HDFS-1272.1.patch Patch for trunk. HDFS changes corresponding to rename of TokenStorage to Credentials --- Key: HDFS-1272 URL: https://issues.apache.org/jira/browse/HDFS-1272 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HDFS-1272.1.patch TokenStorage is renamed to Credentials as part of MAPREDUCE-1528 and HADOOP-6845. This jira tracks hdfs changes corresponding to that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-1272) HDFS changes corresponding to rename of TokenStorage to Credentials
HDFS changes corresponding to rename of TokenStorage to Credentials --- Key: HDFS-1272 URL: https://issues.apache.org/jira/browse/HDFS-1272 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey TokenStorage is renamed to Credentials as part of MAPREDUCE-1528 and HADOOP-6845. This jira tracks hdfs changes corresponding to that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1258) Clearing namespace quota on / corrupts FS image
[ https://issues.apache.org/jira/browse/HDFS-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1258: - Attachment: clear-quota.patch This patch doesn't actually solve the root problem of clearing the root directory's quota corrupting the FS image, but it will prevent people from accidentally borking their file system in the meantime, until that gets fixed. Clearing namespace quota on / corrupts FS image - Key: HDFS-1258 URL: https://issues.apache.org/jira/browse/HDFS-1258 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Aaron T. Myers Priority: Blocker Fix For: 0.20.3, 0.21.0, 0.22.0 Attachments: clear-quota.patch The HDFS root directory starts out with a default namespace quota of Integer.MAX_VALUE. If you clear this quota (using hadoop dfsadmin -clrQuota /), the fsimage gets corrupted immediately. Subsequent 2NN rolls will fail, and the NN will not come back up from a restart. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
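An interim guard of this kind simply refuses the operation on the root directory. This is a hypothetical sketch of the idea, not the attached clear-quota.patch:

```java
/** Sketch: reject clearing the namespace quota on "/" until the corruption is fixed. */
public class ClrQuotaGuard {
    static void checkClrQuota(String path) {
        if ("/".equals(path)) {
            // Clearing the root quota corrupts the fsimage (HDFS-1258),
            // so fail fast instead of letting the NN write a bad image.
            throw new IllegalArgumentException(
                    "Cannot clear namespace quota on root directory");
        }
    }
}
```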
[jira] Commented: (HDFS-1108) ability to create a file whose newly allocated blocks are automatically persisted immediately
[ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883347#action_12883347 ] Dmytro Molkov commented on HDFS-1108: - Suresh: 1. Yes, the block information will essentially be persisted twice: on each block allocation and on file close. Do you think that can be a problem for us? Since this is a configurable change and will only happen on specifically configured clusters, I do not feel it is bad. 2. This part is tricky. What can happen is: a new block is allocated, the client immediately dies without writing data, and the namenode crashes and needs a restart. When the namenode is restarted, it will have this last block as UnderConstruction, and when the NN tries to release the lease on this file it will try to recover the block and will never succeed, because the block is not present on the datanodes. However, the same case seems to exist now: even without a namenode restart, the existence of this block in memory and its absence on the datanodes leads to the same problem. Another similar case is when the client calls hflush and then the namenode, the client, and all datanodes receiving the new block crash. Please correct me if I am wrong on this one. All in all, a namenode crash may itself lead to the client dying, so the probability of this happening might be higher than in my first example of what might happen today. 3. I am not really sure what you mean by the primary flagging this to the standby, but in our case the only channel of communication between primary and standby is in fact the edits log, so this seemed like a reasonable way to go.
ability to create a file whose newly allocated blocks are automatically persisted immediately - Key: HDFS-1108 URL: https://issues.apache.org/jira/browse/HDFS-1108 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-1108.patch The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the block is allocated. Instead, a hflush() or a close() on the file persists the blocks into the transaction log. It would be nice if we can immediately persist newly allocated blocks (as soon as they are allocated) for specific files. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1140) Speedup INode.getPathComponents
[ https://issues.apache.org/jira/browse/HDFS-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov updated HDFS-1140: Status: Open (was: Patch Available) Speedup INode.getPathComponents --- Key: HDFS-1140 URL: https://issues.apache.org/jira/browse/HDFS-1140 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov Priority: Minor Attachments: HDFS-1140.2.patch, HDFS-1140.3.patch, HDFS-1140.4.patch, HDFS-1140.patch When the namenode is loading the image, a significant amount of time is spent in DFSUtil.string2Bytes. We have a very specific workload here: the path the namenode calls getPathComponents for shares N - 1 components with the previous path this method was called for (assuming the current path has N components). Hence we can improve the image load time by caching the result of the previous conversion. We thought of using a simple LRU cache for components, but in practice String.getBytes gets optimized at runtime and an LRU cache doesn't perform as well; keeping just the latest path's components and their byte translations in two arrays, however, gives quite a performance boost. I could take another 20% off the time to load the image on our cluster (30 seconds vs 24), and I wrote a simple benchmark that tests performance with and without caching. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1140) Speedup INode.getPathComponents
[ https://issues.apache.org/jira/browse/HDFS-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov updated HDFS-1140: Attachment: HDFS-1140.4.patch Thanks for your comments, Konstantin. I addressed all of them in a new version of the patch. Speedup INode.getPathComponents --- Key: HDFS-1140 URL: https://issues.apache.org/jira/browse/HDFS-1140 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov Priority: Minor Attachments: HDFS-1140.2.patch, HDFS-1140.3.patch, HDFS-1140.4.patch, HDFS-1140.patch When the namenode is loading the image, a significant amount of time is spent in DFSUtil.string2Bytes. We have a very specific workload here: the path the namenode calls getPathComponents for shares N - 1 components with the previous path this method was called for (assuming the current path has N components). Hence we can improve the image load time by caching the result of the previous conversion. We thought of using a simple LRU cache for components, but in practice String.getBytes gets optimized at runtime and an LRU cache doesn't perform as well; keeping just the latest path's components and their byte translations in two arrays, however, gives quite a performance boost. I could take another 20% off the time to load the image on our cluster (30 seconds vs 24), and I wrote a simple benchmark that tests performance with and without caching. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
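The described optimization, remembering only the previous path's components rather than keeping an LRU cache, might look like the sketch below (hypothetical names, not the patch itself):

```java
import java.nio.charset.StandardCharsets;

/** Sketch: reuse byte conversions from the previous path's components. */
public class PathComponentCache {
    private String[] lastComponents = new String[0];
    private byte[][] lastBytes = new byte[0][];

    /** Converts path components to bytes, reusing results from the last call. */
    public byte[][] getPathComponents(String path) {
        String[] components = path.split("/");
        byte[][] bytes = new byte[components.length][];
        for (int i = 0; i < components.length; i++) {
            // Image loading visits paths in order, so all but the last
            // component usually match the previous call exactly.
            if (i < lastComponents.length && components[i].equals(lastComponents[i])) {
                bytes[i] = lastBytes[i];  // cache hit: skip String.getBytes
            } else {
                bytes[i] = components[i].getBytes(StandardCharsets.UTF_8);
            }
        }
        lastComponents = components;
        lastBytes = bytes;
        return bytes;
    }
}
```

Two sibling paths like /user/a and /user/b then share every converted component except the last one.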
[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel
[ https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov updated HDFS-1071: Attachment: HDFS-1071.6.patch I added documentation for FSImageSaver that describes the initial assumptions for how the parallel writes are done. As for writing the image to multiple directories on a single disk: doing that in parallel might only hurt performance, since the disk would be seeking all the time. savenamespace should write the fsimage to all configured fs.name.dir in parallel Key: HDFS-1071 URL: https://issues.apache.org/jira/browse/HDFS-1071 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.patch If you have a large number of files in HDFS, the fsimage file is very big. When the namenode restarts, it writes a copy of the fsimage to all directories configured in fs.name.dir. This takes a long time, especially if there are many directories in fs.name.dir. Make the NN write the fsimage to all these directories in parallel. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
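The parallel-save idea can be sketched as below. This is a hypothetical illustration with invented names, not the actual FSImageSaver code: one writer thread per configured directory, and the caller joins all threads before declaring the save complete.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of writing the same image bytes to every
// configured directory in parallel (one thread per directory).
public class ParallelImageSaver {
  public static void saveToAll(byte[] image, List<Path> dirs)
      throws InterruptedException {
    List<Thread> savers = new ArrayList<>();
    for (Path dir : dirs) {
      Thread t = new Thread(() -> {
        try {
          Files.write(dir.resolve("fsimage"), image); // one writer per dir
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
      t.start();
      savers.add(t);
    }
    for (Thread t : savers) {
      t.join(); // wait for every directory before declaring success
    }
  }
}
```

This helps when the directories sit on different disks; as the comment above notes, parallel writers on the same spindle would mostly fight each other with seeks.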
[jira] Commented: (HDFS-1267) fuse-dfs does not compile
[ https://issues.apache.org/jira/browse/HDFS-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883373#action_12883373 ] Hadoop QA commented on HDFS-1267: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448022/1267-1.patch against trunk revision 957669. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/206/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/206/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/206/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/206/console This message is automatically generated. 
fuse-dfs does not compile - Key: HDFS-1267 URL: https://issues.apache.org/jira/browse/HDFS-1267 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Reporter: Tom White Priority: Critical Fix For: 0.21.0 Attachments: 1267-1.patch Looks like since libhdfs was updated to use the new UGI (HDFS-1000) fuse-dfs no longer compiles: {noformat} [exec] fuse_connect.c: In function 'doConnectAsUser': [exec] fuse_connect.c:40: error: too many arguments to function 'hdfsConnectAsUser' {noformat} Any takers to fix this please? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1212) Harmonize HDFS JAR library versions with Common
[ https://issues.apache.org/jira/browse/HDFS-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883374#action_12883374 ] Hadoop QA commented on HDFS-1212: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448147/HDFS-1212.patch against trunk revision 957669. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/413/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/413/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/413/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/413/console This message is automatically generated. Harmonize HDFS JAR library versions with Common --- Key: HDFS-1212 URL: https://issues.apache.org/jira/browse/HDFS-1212 Project: Hadoop HDFS Issue Type: Bug Components: build Reporter: Tom White Assignee: Tom White Priority: Blocker Fix For: 0.21.0 Attachments: HDFS-1212.patch, HDFS-1212.patch, HDFS-1212.patch, HDFS-1212.patch HDFS part of HADOOP-6800. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1262: --- Attachment: hdfs-1262-1.txt -test case for append and create failures. -tried to get it so both cases fail fast, but create will hit the test timeout (the default for a create that gets AlreadyBeingCreatedException is 5 retries with a 60s sleep) -the append case fails in 30s without the fix in the worst case Failed pipeline creation during append leaves lease hanging on NN - Key: HDFS-1262 URL: https://issues.apache.org/jira/browse/HDFS-1262 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Critical Fix For: 0.20-append Attachments: hdfs-1262-1.txt Ryan Rawson came upon this nasty bug in HBase cluster testing. What happened was the following: 1) The file's original writer died 2) The recovery client tried to open the file for append - it looped for a minute or so until the soft lease expired, then the append call initiated recovery 3) Recovery completed successfully 4) The recovery client called append again, which succeeded on the NN 5) For some reason, the block recovery that happens at the start of append pipeline creation failed on all datanodes 6 times, causing the append() call to throw an exception back to the HBase master. HBase assumed the file wasn't open and put it back on a queue to try later 6) Some time later, it tried append again, but the lease was still assigned to the same DFS client, so it wasn't able to recover. The recovery failure in step 5 is a separate issue, but the problem for this JIRA is that the client can think it failed to open a file for append while the NN thinks the writer holds a lease. Since the writer keeps renewing its lease, recovery never happens, and no one can open or recover the file until the DFS client shuts down. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1258) Clearing namespace quota on / corrupts FS image
[ https://issues.apache.org/jira/browse/HDFS-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883388#action_12883388 ] Todd Lipcon commented on HDFS-1258: --- Patch looks good. Can you reupload it with the --no-prefix option to git diff, and then change to Patch Available status so the Hudson QA bot runs? Clearing namespace quota on / corrupts FS image - Key: HDFS-1258 URL: https://issues.apache.org/jira/browse/HDFS-1258 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Aaron T. Myers Priority: Blocker Fix For: 0.20.3, 0.21.0, 0.22.0 Attachments: clear-quota.patch The HDFS root directory starts out with a default namespace quota of Integer.MAX_VALUE. If you clear this quota (using hadoop dfsadmin -clrQuota /), the fsimage gets corrupted immediately. Subsequent 2NN rolls will fail, and the NN will not come back up from a restart. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1258) Clearing namespace quota on / corrupts FS image
[ https://issues.apache.org/jira/browse/HDFS-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1258: - Status: Patch Available (was: Open) Patch prevents users from clearing the namespace quota on /. Clearing namespace quota on / corrupts FS image - Key: HDFS-1258 URL: https://issues.apache.org/jira/browse/HDFS-1258 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Aaron T. Myers Priority: Blocker Fix For: 0.20.3, 0.21.0, 0.22.0 Attachments: clear-quota.patch, clear-quota.patch The HDFS root directory starts out with a default namespace quota of Integer.MAX_VALUE. If you clear this quota (using hadoop dfsadmin -clrQuota /), the fsimage gets corrupted immediately. Subsequent 2NN rolls will fail, and the NN will not come back up from a restart. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1258) Clearing namespace quota on / corrupts FS image
[ https://issues.apache.org/jira/browse/HDFS-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1258: - Attachment: clear-quota.patch Same patch, but with the --no-prefix option to git diff. Clearing namespace quota on / corrupts FS image - Key: HDFS-1258 URL: https://issues.apache.org/jira/browse/HDFS-1258 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Aaron T. Myers Priority: Blocker Fix For: 0.20.3, 0.21.0, 0.22.0 Attachments: clear-quota.patch, clear-quota.patch The HDFS root directory starts out with a default namespace quota of Integer.MAX_VALUE. If you clear this quota (using hadoop dfsadmin -clrQuota /), the fsimage gets corrupted immediately. Subsequent 2NN rolls will fail, and the NN will not come back up from a restart. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
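The guard described in these HDFS-1258 updates might look something like the following sketch. Everything here is an assumption for illustration: the class and method names, the -1 "clear quota" sentinel, and the exception type are invented and do not claim to match clear-quota.patch; only the rule (refuse to clear the namespace quota on the root) comes from the issue.

```java
// Hypothetical guard: reject a "clear quota" request on the root
// directory instead of letting it corrupt the saved image.
public class QuotaGuard {
  // Invented sentinel standing in for what dfsadmin -clrQuota sends.
  public static final long QUOTA_RESET = -1;

  public static long checkQuota(String path, long nsQuota) {
    if ("/".equals(path) && nsQuota == QUOTA_RESET) {
      throw new IllegalArgumentException(
          "Cannot clear namespace quota on root: " + path);
    }
    return nsQuota; // any other path, or an explicit quota, passes through
  }
}
```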
[jira] Commented: (HDFS-1250) Namenode accepts block report from dead datanodes
[ https://issues.apache.org/jira/browse/HDFS-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883407#action_12883407 ] Hadoop QA commented on HDFS-1250: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448107/HDFS-1250.1.patch against trunk revision 957669. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/414/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/414/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/414/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/414/console This message is automatically generated. Namenode accepts block report from dead datanodes - Key: HDFS-1250 URL: https://issues.apache.org/jira/browse/HDFS-1250 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2, 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-1250.1.patch, HDFS-1250.patch When a datanode heartbeat times out namenode marks it dead. The subsequent heartbeat from the datanode is rejected with a command to datanode to re-register. 
However, the namenode accepts a block report from the datanode even though it is marked dead. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
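A minimal sketch of the missing check follows. The names are invented for illustration and do not match the NameNode's actual internals; the point is that a block report from a node already marked dead should be rejected with a re-register request, just as its heartbeats are.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the HDFS-1250 check: reports from dead or
// unknown datanodes are refused and the node is told to re-register.
public class BlockReportGate {
  public static class RegisterRequiredException extends RuntimeException {}

  private final Map<String, Boolean> alive = new HashMap<>();

  public void markAlive(String nodeId) { alive.put(nodeId, true); }
  public void markDead(String nodeId)  { alive.put(nodeId, false); }

  public void processBlockReport(String nodeId) {
    if (!alive.getOrDefault(nodeId, false)) {
      // Dead or unregistered node: refuse the report, force re-registration.
      throw new RegisterRequiredException();
    }
    // ...process the report for a live node...
  }
}
```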
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883406#action_12883406 ] Hadoop QA commented on HDFS-1057: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12448081/hdfs-1057-trunk-5.txt against trunk revision 957669. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 30 javac compiler warnings (more than the trunk's current 23 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/207/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/207/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/207/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/207/console This message is automatically generated. 
Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, HDFS-1057-0.20-append.patch, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
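The ordering fix implied by this report can be illustrated with a toy writer. This is an assumption-laden sketch, not the actual BlockReceiver code: the in-memory "disk" and method names are invented; the invariant it demonstrates is the one the issue calls for, namely that the reader-visible length advances only after flush().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical sketch: a concurrent reader consults visibleLength, so
// it must never exceed the bytes that have actually been flushed.
public class VisibleLengthWriter {
  private final ByteArrayOutputStream disk = new ByteArrayOutputStream();
  private final BufferedOutputStream out = new BufferedOutputStream(disk);
  private volatile long visibleLength = 0; // what readers may consume

  public void receivePacket(byte[] data) throws IOException {
    out.write(data);
    out.flush();                  // bytes reach "disk" first...
    visibleLength += data.length; // ...then the new length is advertised
  }

  public long getVisibleLength() { return visibleLength; }
  public int bytesOnDisk() { return disk.size(); }
}
```

Updating the length before the flush, as the bug describes, lets a reader chase the writer into buffered bytes and hit EOFs or checksum mismatches on the last, still-unstable chunk.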
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883413#action_12883413 ] sam rash commented on HDFS-1057: The one test of my new tests that failed had an fd leak; I've corrected that. The other failed tests I cannot reproduce: 1. org.apache.hadoop.hdfs.TestFileConcurrentReader.testUnfinishedBlockCRCErrorNormalTransferVerySmallWrite - had the fd leak, fixed 2. org.apache.hadoop.hdfs.security.token.block.TestBlockToken.testBlockTokenRpc [junit] Running org.apache.hadoop.hdfs.security.token.block.TestBlockToken [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.305 sec 3. org.apache.hadoop.hdfs.server.common.TestJspHelper.testGetUgi [junit] Running org.apache.hadoop.hdfs.server.common.TestJspHelper [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.309 sec I can submit the patch with the fix for #1 plus the warning fixes.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.