[ 
https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268803#comment-13268803
 ] 

Lars Hofhansl commented on HDFS-744:
------------------------------------

As Dhruba said, we need this for HBase.

One solution would be to (1) introduce a create mode that causes every block to 
be sync'ed upon close and (2) add a new DataXceiver command to sync the 
currently outstanding block.
Together these ensure that we can guarantee sync'ed data and still avoid having 
to keep track of outstanding unsync'ed blocks, and also avoid needing to sync 
upon receipt of every packet.
The client would need to pass the file mode along with every packet. (And maybe 
we could generalize this to a Unix-like file descriptor abstraction.)

* DFSClient.create gets an extra flags argument
* The flags are passed along to the Datanode with every Packet via 
DFSClient.DFSOutputStream.Packet.writeData.
* BlockReceiver.receivePacket reads these flags. receiveBlock can then act 
accordingly on close and call force on the open FileChannel.
* A new DataTransferProtocol type OP_SYNC_BLOCK (or something) that syncs an 
outstanding block. Maybe OP_SYNC_BLOCK should only be allowed if the file 
was opened in sync-on-block-close mode.
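
To make the datanode side of this concrete, here is a minimal sketch in plain 
Java. It is not HDFS code: SyncOnCloseBlockWriter, its Flag enum, and the 
method names are hypothetical stand-ins for BlockReceiver and the proposed 
flags. The only real API it relies on is java.nio's FileChannel.force.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical stand-in for a datanode-side block writer; not HDFS code. */
final class SyncOnCloseBlockWriter implements AutoCloseable {
    /** Hypothetical per-file flag that the client would send with every packet. */
    enum Flag { SYNC_ON_BLOCK_CLOSE }

    private final FileChannel channel;
    private Set<Flag> flags = EnumSet.noneOf(Flag.class);
    private int syncCount = 0;

    SyncOnCloseBlockWriter(Path blockFile) throws IOException {
        channel = FileChannel.open(blockFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
    }

    /** Appends one packet and remembers the flags that arrived with it. */
    void receivePacket(byte[] data, Set<Flag> packetFlags) throws IOException {
        flags = packetFlags.isEmpty()
                ? EnumSet.noneOf(Flag.class)
                : EnumSet.copyOf(packetFlags);
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    /** Rough analogue of OP_SYNC_BLOCK: only allowed in sync-on-close mode. */
    void syncBlock() throws IOException {
        if (!flags.contains(Flag.SYNC_ON_BLOCK_CLOSE)) {
            throw new IllegalStateException("file not opened in sync-on-block-close mode");
        }
        channel.force(true); // flush data and metadata to the storage device
        syncCount++;
    }

    /** Number of force() calls issued so far (for illustration only). */
    int syncCount() {
        return syncCount;
    }

    /** On block close, force to disk only if the flag was set. */
    @Override
    public void close() throws IOException {
        if (flags.contains(Flag.SYNC_ON_BLOCK_CLOSE)) {
            channel.force(true);
            syncCount++;
        }
        channel.close();
    }
}
```

Because the flag travels with each packet, the writer needs no extra 
bookkeeping about unsync'ed blocks: it either forces on close or it does not.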

The next problem is pipelined syncs. Eating up the sync cost N times in a 
pipeline is not ideal for latency.
Here we could:
* (optionally) only issue a sync to disk on the last (or first) replica in the 
pipeline. That way we take the sync hit only once and the data is on physical 
disk on at least one of the replicas. (Could be made rack aware so that we 
issue the sync on at least one machine per rack.) Or
* more radically: since the order guaranteed by pipelining is only needed for 
appending to the block files, we could issue the syncs to all mirrors involved 
in parallel in a second request.
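
The second option could look roughly like the following sketch. It is plain 
Java, not HDFS code: ParallelReplicaSync and its Replica interface are 
hypothetical, and a real implementation would send a sync request to each 
datanode rather than call a local method. The point is only that the caller 
waits for the slowest sync instead of the sum of all syncs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: issue sync to all replicas in parallel, then wait. */
final class ParallelReplicaSync {
    /** Hypothetical handle for one replica in the pipeline. */
    interface Replica {
        void sync() throws Exception;
    }

    /**
     * Sends a sync to every replica concurrently and returns once all have
     * completed. Throws ExecutionException if any replica fails to sync.
     */
    static void syncAll(List<Replica> replicas)
            throws InterruptedException, ExecutionException {
        if (replicas.isEmpty()) {
            return;
        }
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (Replica r : replicas) {
                tasks.add(() -> {
                    r.sync();
                    return null;
                });
            }
            // invokeAll blocks until every task has finished or failed.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces a failure from any replica
            }
        } finally {
            pool.shutdown();
        }
    }
}
```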

Disclaimer: I work on HBase but am an HDFS noob.

                
> Support hsync in HDFS
> ---------------------
>
>                 Key: HDFS-744
>                 URL: https://issues.apache.org/jira/browse/HDFS-744
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>
> HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, 
> the real expected semantics should be "flushes out to all replicas and all 
> replicas have done posix fsync equivalent - ie the OS has flushed it to the 
> disk device (but the disk may have it in its cache)." This jira aims to 
> implement the expected behaviour.


        
