[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697735#comment-13697735 ]

Dave Latham commented on HDFS-1539:
-----------------------------------

Does anyone have any performance numbers for enabling this? Or does anyone have experience running this under significant production workloads (especially HBase)?

> prevent data loss when a cluster suffers a power loss
> -----------------------------------------------------
>
>         Key: HDFS-1539
>         URL: https://issues.apache.org/jira/browse/HDFS-1539
>     Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs-client, namenode
>    Reporter: dhruba borthakur
>    Assignee: dhruba borthakur
>     Fix For: 0.23.0, 1.1.1
> Attachments: syncOnClose1.txt, syncOnClose2_b-1.txt, syncOnClose2.txt
>
> We have seen an instance where an external outage caused many datanodes to reboot at around the same time. This resulted in many corrupted blocks. These were recently written blocks; the current implementation of HDFS datanodes does not sync the data of a block file when the block is closed.
>
> 1. Have a cluster-wide config setting that causes the datanode to sync a block file when a block is finalized.
> 2. Introduce a new parameter to FileSystem.create() to trigger the new behaviour, i.e. cause the datanode to sync a block file when it is finalized.
> 3. Implement FSDataOutputStream.hsync() to cause all data written to the specified file to be written to stable storage.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
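Option 1 in the issue description boils down to forcing the block file to stable storage before the datanode reports it finalized. A minimal, hypothetical sketch of that idea in Python — the function and file names are mine, not HDFS code:

```python
import os

def finalize_block(path: str, data: bytes, sync_on_close: bool = False) -> None:
    """Write a block file; when sync_on_close is enabled (option 1's
    cluster-wide setting), force the data to stable storage before
    the block is considered finalized."""
    with open(path, "wb") as f:
        f.write(data)
        if sync_on_close:
            f.flush()             # drain any user-space buffering
            os.fsync(f.fileno())  # page cache -> disk; survives power loss

finalize_block("blk_0001.tmp", b"recently written data", sync_on_close=True)
```

Without the fsync, a power loss shortly after close can leave a zero-length or truncated block file even though the close succeeded, which is the corruption pattern described above.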
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495607#comment-13495607 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1539:
----------------------------------------------

Interestingly, TestFileCreation fails in branch-1 (with and without the patch) but not in branch-1.1. I will file a JIRA for it.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495614#comment-13495614 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1539:
----------------------------------------------

I have committed this to branch-1 and branch-1.1.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494800#comment-13494800 ]

Suresh Srinivas commented on HDFS-1539:
---------------------------------------

Nicholas, I compared the backported patch with the original. It looks good. +1 for the patch. We should get this into 1.1.1.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494496#comment-13494496 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1539:
----------------------------------------------

Sure, let's backport this to branch-1.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268609#comment-13268609 ]

stack commented on HDFS-1539:
-----------------------------

Should we pull this into 1.0.3? Or 1.1.0?
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977309#action_12977309 ]

Hairong Kuang commented on HDFS-1539:
-------------------------------------

+1. The patch looks good. A minor comment: I do not think the unit test is of much use, because the bug occurs when a machine is powered off, and that is hard to simulate.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975393#action_12975393 ]

Hadoop QA commented on HDFS-1539:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12467019/syncOnClose2.txt
against trunk revision 1053203.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
        org.apache.hadoop.hdfs.TestFileConcurrentReader
    -1 contrib tests. The patch failed contrib unit tests.
    +1 system test framework. The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//testReport/
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//console

This message is automatically generated.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974258#action_12974258 ]

dhruba borthakur commented on HDFS-1539:
----------------------------------------

If there is a file with 20 blocks and each block has three replicas, then there will be a total of 60 fflush calls, regardless of the number of servers.
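The count above is per finalized block replica rather than per server, which resolves the "41 servers" question: a quick sketch of the arithmetic (the helper name is mine, not from the patch):

```python
def total_sync_calls(num_blocks: int, replication: int) -> int:
    """One fflush/sync per finalized block replica; how many distinct
    servers host those replicas does not enter into it."""
    return num_blocks * replication

# The file discussed above: 20 blocks, 3 replicas each -> 60 syncs.
assert total_sync_calls(20, 3) == 60
```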
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974400#action_12974400 ]

Hairong Kuang commented on HDFS-1539:
-------------------------------------

Yes. Should

    +      this.cout = new BufferedOutputStream(streams.checksumOut,
    +                                           SMALL_BUFFER_SIZE);

be this.cout = streams.checksumOut?
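For context on why the buffering question matters for a sync-on-close patch: bytes sitting in a user-space buffer are invisible to fsync until the wrapping stream is flushed, so the flush-then-sync order has to be preserved wherever a buffered stream is kept. A minimal, hypothetical Python illustration (the file name is mine):

```python
import os

# Write through a buffered stream, then make the bytes durable: the
# user-space buffer must be flushed to the kernel before fsync can
# push the data to stable storage.
with open("checksum_file.tmp", "wb", buffering=8192) as f:
    f.write(b"checksum payload")
    f.flush()              # user-space buffer -> kernel page cache
    os.fsync(f.fileno())   # kernel page cache -> disk
```

Dropping the intermediate buffer entirely, as suggested above, sidesteps the ordering concern at the cost of more small writes.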
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974017#action_12974017 ]

M. C. Srivas commented on HDFS-1539:
------------------------------------

Dhruba, so if there's a file with 20 blocks on 20 different servers, with 3 replicas each, we might potentially end up sync'ing 41 servers (= 1 primary + 20*2 replicas) when closing the file, correct?
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973066#action_12973066 ]

dhruba borthakur commented on HDFS-1539:
----------------------------------------

@Allen: Thanks for your comments. I have kept the default behaviour as it is now, especially because I do not want any existing installations to see bad performance when they run with this patch. (On some customer sites, it is possible that they have enough redundant power supplies that they never need to turn this setting on.)
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973068#action_12973068 ]

Todd Lipcon commented on HDFS-1539:
-----------------------------------

dhruba: do you plan to run this on your warehouse cluster, or just the scribe tiers? If so, it would be very interesting to find out whether it affects throughput. If there is no noticeable hit, I would argue for making it the default.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973093#action_12973093 ]

dhruba borthakur commented on HDFS-1539:
----------------------------------------

I could make it the default, but I would like to hear the opinion of many people who are running Hadoop clusters. Also, performance numbers could vary a lot based on the operating system (CentOS, Red Hat, Windows) and filesystem (ext4, xfs), so it would be difficult to get the default right based solely on performance. On the other hand, if the entire community thinks it is better to have a default that prevents data loss at all costs, then this could be the default. If the debate on either side is fierce, then I would like to get this in first and open another JIRA to debate the default settings. We are definitely going to deploy this first on our archival cluster, which is used purely to back up/restore data from MySQL databases.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973125#action_12973125 ]

Todd Lipcon commented on HDFS-1539:
-----------------------------------

Yep, I certainly didn't intend to block this JIRA. What you've done here is definitely prudent, and we can debate/benchmark turning it on by default in another JIRA.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971895#action_12971895 ]

dhruba borthakur commented on HDFS-1539:
----------------------------------------

We have seen this problem on a cluster that is used purely for archival purposes. I propose that we implement option 1 listed above.
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971900#action_12971900 ]

Allen Wittenauer commented on HDFS-1539:
----------------------------------------

Is there a reason why the datanode just shouldn't sync anyway? [I.e., is it really worth making this configurable?]
[ https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971935#action_12971935 ]

Todd Lipcon commented on HDFS-1539:
-----------------------------------

@Allen: on some file systems, syncing one file will essentially end up syncing *all* files. So it could be a moderately big performance hit, though it would be worth benchmarking terasort with and without the sync - it should be fairly obvious if it's a killer.