[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss

2013-07-02 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697735#comment-13697735
 ] 

Dave Latham commented on HDFS-1539:
---

Does anyone have any performance numbers for enabling this?  Or, does anyone 
just have some experience running this on significant workloads in production?  
(Especially HBase?)

 prevent data loss when a cluster suffers a power loss
 -

 Key: HDFS-1539
 URL: https://issues.apache.org/jira/browse/HDFS-1539
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs-client, namenode
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Fix For: 0.23.0, 1.1.1

 Attachments: syncOnClose1.txt, syncOnClose2_b-1.txt, syncOnClose2.txt


 We have seen an instance where an external outage caused many datanodes to 
 reboot at around the same time.  This resulted in many corrupted blocks. 
 These were recently written blocks; the current implementation of the HDFS 
 datanode does not sync the data of a block file when the block is closed.
 1. Have a cluster-wide config setting that causes the datanode to sync a 
 block file when a block is finalized.
 2. Introduce a new parameter to FileSystem.create() to trigger the new 
 behaviour, i.e. cause the datanode to sync a block file when it is finalized.
 3. Implement FSDataOutputStream.hsync() to cause all data written to the 
 specified file to be written to stable storage.
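
For illustration, a minimal client-side sketch of how options 1 and 3 could
fit together, assuming the dfs.datanode.synconclose configuration key this
change introduces (verify the key name on your version) and an hsync() method
on FSDataOutputStream as proposed in option 3; the path and payload are made
up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncOnCloseClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Option 1: cluster-wide setting; datanodes fsync each block file when
    // it is finalized. Key name taken from this patch; treat it as an
    // assumption on other versions.
    conf.setBoolean("dfs.datanode.synconclose", true);

    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream out = fs.create(new Path("/backup/mysql-dump.dat"));
    try {
      out.write("rows that must survive a power loss".getBytes("UTF-8"));
      // Option 3: force everything written so far to stable storage on the
      // datanodes before returning (hsync() as proposed above).
      out.hsync();
    } finally {
      out.close();
    }
    fs.close();
  }
}

With the setting off (the default), close() returns without forcing data to
disk, which is exactly the window the power-loss scenario above exposes.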



[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss

2012-11-12 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495607#comment-13495607
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1539:
--

Interestingly, TestFileCreation fails in branch-1 (with and without the patch) 
but not branch-1.1.  I will file a JIRA for it.



[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss

2012-11-12 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495614#comment-13495614
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1539:
--

I have committed this to branch-1 and branch-1.1.



[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss

2012-11-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494800#comment-13494800
 ] 

Suresh Srinivas commented on HDFS-1539:
---

Nicholas, I compared the backported patch with the original. It looks good. +1 
for the patch.

We should get this into 1.1.1.



[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss

2012-11-09 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494496#comment-13494496
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1539:
--

Sure, let's backport this to branch-1.



[jira] [Commented] (HDFS-1539) prevent data loss when a cluster suffers a power loss

2012-05-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268609#comment-13268609
 ] 

stack commented on HDFS-1539:
-

Should we pull this into 1.0.3?  Or 1.1.0?





[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2011-01-04 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977309#action_12977309
 ] 

Hairong Kuang commented on HDFS-1539:
-

+1. The patch looks good.

A minor comment is that I do not think the unit test is of much use, because 
the bug occurs when a machine is powered off, and that is hard to simulate.




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975393#action_12975393
 ] 

Hadoop QA commented on HDFS-1539:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12467019/syncOnClose2.txt
  against trunk revision 1053203.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.server.namenode.TestStorageRestore
  org.apache.hadoop.hdfs.TestFileConcurrentReader

-1 contrib tests.  The patch failed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//testReport/
Findbugs warnings: 
https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/49//console

This message is automatically generated.




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-22 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974258#action_12974258
 ] 

dhruba borthakur commented on HDFS-1539:


If there is a file with 20 blocks and each block has three replicas, then 
there will be a total of 60 fflush calls; the count depends on the number of 
block replicas, not on the number of servers (20 blocks x 3 replicas = 60 
block files, each synced once when it is finalized, whether those replicas 
land on 3 servers or on 60).




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-22 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974400#action_12974400
 ] 

Hairong Kuang commented on HDFS-1539:
-

Yes, should
+this.cout = new BufferedOutputStream(streams.checksumOut, 
+  SMALL_BUFFER_SIZE);
 be this.cout = streams.checksumOut?
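
For context, a minimal sketch of why the wrapping matters - this is not the 
actual BlockReceiver code, and all names here are illustrative: the fsync 
needs the FileDescriptor of the underlying FileOutputStream, and any 
BufferedOutputStream layered on top has to be flushed first, or checksum 
bytes still buffered in the JVM never reach the kernel, let alone the disk.

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class FlushThenSyncSketch {
  // Illustrative only: flush user-space buffers, then fsync the file.
  static void syncOnClose(FileOutputStream fileOut, OutputStream cout)
      throws Exception {
    cout.flush();            // push bytes buffered in the JVM down to the OS
    fileOut.getFD().sync();  // fsync: force OS buffers to stable storage
    cout.close();
  }

  public static void main(String[] args) throws Exception {
    FileOutputStream fileOut = new FileOutputStream("blk_1234.meta");
    OutputStream cout = new BufferedOutputStream(fileOut, 512);
    cout.write(new byte[] { 0, 1, 2, 3 });  // stand-in for checksum bytes
    syncOnClose(fileOut, cout);
  }
}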




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-21 Thread M. C. Srivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974017#action_12974017
 ] 

M. C. Srivas commented on HDFS-1539:


Dhruba, so if there's a file with 20 blocks on 20 different servers, with 3 
replicas each, we might potentially end up sync'ing 41 servers (= 1 primary + 
20*2 replicas) when closing the file, correct?




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-19 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973066#action_12973066
 ] 

dhruba borthakur commented on HDFS-1539:


@Allen: Thanks for your comments. I have kept the default behaviour as it is 
now, especially because I do not want any existing installations to see bad 
performance when they run with this patch. (On some customer sites, it is 
possible that they have enough redundant power supplies that they never have 
to turn this setting on.)




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973068#action_12973068
 ] 

Todd Lipcon commented on HDFS-1539:
---

dhruba: do you plan to run this on your warehouse cluster or just scribe tiers? 
If so it would be very interesting to find out whether it affects throughput. 
If there is no noticeable hit I would argue to make it the default.




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-19 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973093#action_12973093
 ] 

dhruba borthakur commented on HDFS-1539:


I could make it the default, but I would like to hear the opinion of many 
people who are running hadoop clusters. Also, performance numbers could vary a 
lot based on the operating system and filesystem (CentOS, Red Hat, Windows; 
ext4, xfs), etc., so it would be difficult to get it right based solely on 
performance. On the other hand, if the entire community thinks that it is 
better to have a default that prevents data loss at all costs, then this could 
be the default. If the debate on either side is fierce, then I would like to 
get this in first and then open another JIRA to debate the default settings.

We are definitely going to deploy this first on our archival cluster. This is 
a cluster that is used purely to backup/restore data from mySQL databases.




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973125#action_12973125
 ] 

Todd Lipcon commented on HDFS-1539:
---

Yep, I certainly didn't intend to block this JIRA. What you've done here is 
definitely prudent, and we can debate/benchmark turning it on by default in 
another JIRA.




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-15 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971895#action_12971895
 ] 

dhruba borthakur commented on HDFS-1539:


We have seen this problem on a cluster that is purely used for archival 
purposes. I propose that we implement Option 1 listed above.




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-15 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971900#action_12971900
 ] 

Allen Wittenauer commented on HDFS-1539:


Is there a reason why the datanode just shouldn't sync anyway?  [i.e., is it 
really worth it to make this configurable?]




[jira] Commented: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-15 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971935#action_12971935
 ] 

Todd Lipcon commented on HDFS-1539:
---

@Allen: on some file systems, if you sync() one file, you end up essentially 
syncing *all* dirty files (ext3 in its default ordered mode behaves this way, 
for example). So it could be a moderately big performance hit, though it would 
be worth benchmarking terasort with and without it - it should be fairly 
obvious if it's a killer.
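
A rough local micro-benchmark sketch (plain Java, not terasort; the file 
count and 1 MB size are arbitrary) to get a first feel for the per-file cost 
of an extra fsync before close on a given filesystem:

import java.io.File;
import java.io.FileOutputStream;

public class FsyncCostSketch {
  static long writeFiles(int n, byte[] payload, boolean syncOnClose)
      throws Exception {
    long start = System.nanoTime();
    for (int i = 0; i < n; i++) {
      File f = File.createTempFile("fsync-bench-", ".dat");
      f.deleteOnExit();
      FileOutputStream out = new FileOutputStream(f);
      try {
        out.write(payload);
        if (syncOnClose) {
          out.getFD().sync();  // the extra fsync being debated here
        }
      } finally {
        out.close();
      }
    }
    return (System.nanoTime() - start) / 1000000L;  // elapsed millis
  }

  public static void main(String[] args) throws Exception {
    byte[] payload = new byte[1024 * 1024];  // 1 MB per file
    writeFiles(10, payload, false);          // warm-up
    System.out.println("close only      : "
        + writeFiles(100, payload, false) + " ms");
    System.out.println("fsync then close: "
        + writeFiles(100, payload, true) + " ms");
  }
}

On a filesystem with the sync-everything behaviour described above, the gap 
should widen noticeably when other writers are active at the same time.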
