[ 
https://issues.apache.org/jira/browse/HDFS-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470016#comment-13470016
 ] 

Lars Hofhansl commented on HDFS-3979:
-------------------------------------

API4 is hflush (with change in OS buffers).

That's an interesting discussion in itself. hsync'ing every edit in HBase is 
prohibitively expensive.
I have some simple numbers in HBASE-5954.

I do need to rerun that test, though, with the sync_file_range changes in 
HDFS-2465 (which would hopefully do most of the data syncing asynchronously and 
sync only the last changes and metadata synchronously upon client request).

Many applications do not need every edit to be guaranteed on disk, but instead 
have "sync points". That is what I am aiming for in HBase. The application 
knows which specific semantics it needs.
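
To make the "sync points" idea concrete, here is a minimal sketch in Java. It 
is not HBase or HDFS code: EditStream, CountingStream and SyncPointWriter are 
hypothetical names invented for this illustration, and the interface only 
borrows the hflush/hsync method names. The assumption is that the application 
picks its own sync points (here simply every N edits), uses the cheaper flush 
for ordinary edits, and forces data to disk only at a sync point.

```java
// Hypothetical stand-in for a stream offering the two durability levels
// discussed in this thread. It borrows the method names hflush/hsync but is
// NOT the Hadoop interface; a real stream would also throw IOException.
interface EditStream {
    void write(byte[] edit);
    void hflush(); // assumed: data has left the client but may not be on disk
    void hsync();  // assumed: data has been forced to disk
}

// Test double that only counts calls, so the policy can be observed.
class CountingStream implements EditStream {
    int writes;
    int hflushes;
    int hsyncs;

    public void write(byte[] edit) { writes++; }
    public void hflush() { hflushes++; }
    public void hsync() { hsyncs++; }
}

// Applies a simple application-defined policy: every edit is flushed, but
// only every editsPerSyncPoint-th edit is a sync point that forces to disk.
class SyncPointWriter {
    private final EditStream out;
    private final int editsPerSyncPoint;
    private int editsSinceSync = 0;

    SyncPointWriter(EditStream out, int editsPerSyncPoint) {
        if (editsPerSyncPoint < 1) {
            throw new IllegalArgumentException("editsPerSyncPoint must be >= 1");
        }
        this.out = out;
        this.editsPerSyncPoint = editsPerSyncPoint;
    }

    void append(byte[] edit) {
        out.write(edit);
        editsSinceSync++;
        if (editsSinceSync >= editsPerSyncPoint) {
            out.hsync();   // sync point: pay the full cost of a disk sync
            editsSinceSync = 0;
        } else {
            out.hflush();  // ordinary edit: cheaper flush only
        }
    }

    // Closing is treated as a final sync point for any unsynced edits.
    void close() {
        if (editsSinceSync > 0) {
            out.hsync();
            editsSinceSync = 0;
        }
    }
}
```

With a sync point every 3 edits, 7 appends followed by close() result in 5 
flush calls and 3 sync calls in this sketch, instead of 7 syncs. A real policy 
would more likely be driven by application events than by a fixed count.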

What is really important for HBase (IMHO) is that every block is synced to disk 
when it is closed. HBase constantly rewrites existing data via compactions, so 
without syncing, arbitrarily old data can be lost during a rack or DC outage.

Lastly, we can play with this. For example, only one of the replicas could sync 
to disk while the others just guarantee the data is in the OS buffers (API4.5 :) ).

                
> Fix hsync and hflush semantics.
> -------------------------------
>
>                 Key: HDFS-3979
>                 URL: https://issues.apache.org/jira/browse/HDFS-3979
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0, 0.23.0, 2.0.0-alpha
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: hdfs-3979-sketch.txt, hdfs-3979-v2.txt
>
>
> See discussion in HDFS-744. The actual sync/flush operation in BlockReceiver 
> is not on a synchronous path from the DFSClient, hence it is possible that a 
> DN loses data that it has already acknowledged as persisted to a client.
> Edit: Spelling.

