[ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268803#comment-13268803 ]
Lars Hofhansl commented on HDFS-744:
------------------------------------

As Dhruba said, we need this for HBase. One solution would be to (1) introduce a create mode that causes every block to be sync'ed upon close and (2) add a new DataXceiver command to sync a current outstanding block. Together these ensure that we can guarantee sync'ed data while avoiding both keeping track of outstanding unsync'ed blocks and syncing upon receipt of every packet. The client would need to pass the file mode along with every packet. (And maybe we could generalize this to a Unix-like file descriptor abstraction.)

* DFSClient.create gets an extra flags argument.
* The flags are passed along to the Datanode with every packet via DFSClient.DFSOutputStream.Packet.writeData.
* BlockReceiver.receivePacket reads these flags; receiveBlock can then act accordingly on close and issue force on the open FileChannel.
* A new DataTransferProtocol opcode, OP_SYNC_BLOCK (or something like it), syncs an outstanding block. Maybe OP_SYNC_BLOCK should only be allowed if the file was opened in sync-on-block-close mode.

The next problem is pipelined syncs. Paying the sync cost N times in a pipeline is not ideal for latency. Here we could:

* (optionally) only issue a sync to disk on the last (or first) replica in the pipeline. That way we take the sync hit only once and the data is on physical disk on at least one of the replicas. (This could be made rack aware so that we issue a sync on at least one machine per rack.) Or
* more radically: since the order guaranteed by pipelining is only needed for appending to the block files, we could issue the syncs to all mirrors involved in parallel in a second request.

Disclaimer: I work on HBase but am an HDFS noob.

> Support hsync in HDFS
> ---------------------
>
>                 Key: HDFS-744
>                 URL: https://issues.apache.org/jira/browse/HDFS-744
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>
> HDFS-731 implements hsync by default as hflush.
> As described in HADOOP-6313, the real expected semantics should be "flushes out to all replicas and all replicas have done posix fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in its cache)." This jira aims to implement the expected behaviour.
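The sync-on-block-close idea proposed in the comment above can be sketched in plain Java NIO. This is only an illustration, not the actual HDFS code: CreateFlag, BlockWriter, receivePacket, and syncBlock are hypothetical names standing in for the proposed create-flags argument, BlockReceiver, and OP_SYNC_BLOCK. The one real mechanism it demonstrates is FileChannel.force(true), the POSIX fsync equivalent the datanode would issue on the open file channel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.EnumSet;

// Hypothetical create-flag set; SYNC_BLOCK stands in for the proposed
// "sync every block on close" mode (illustrative, not the HDFS API).
enum CreateFlag { SYNC_BLOCK }

// Minimal stand-in for BlockReceiver: appends packet data to the block
// file and, if SYNC_BLOCK was requested at create time, issues
// FileChannel.force(true) (fsync equivalent) when the block is closed.
class BlockWriter implements AutoCloseable {
    private final FileChannel channel;
    private final boolean syncOnClose;

    BlockWriter(Path blockFile, EnumSet<CreateFlag> flags) throws IOException {
        this.channel = FileChannel.open(blockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        this.syncOnClose = flags.contains(CreateFlag.SYNC_BLOCK);
    }

    void receivePacket(byte[] data) throws IOException {
        channel.write(ByteBuffer.wrap(data));
    }

    // What the proposed OP_SYNC_BLOCK would do to an outstanding block:
    // force buffered data down to the disk device.
    void syncBlock() throws IOException {
        channel.force(true);
    }

    @Override
    public void close() throws IOException {
        if (syncOnClose) {
            channel.force(true);  // the sync-on-block-close guarantee
        }
        channel.close();
    }
}

public class SyncBlockSketch {
    public static void main(String[] args) throws IOException {
        Path block = Files.createTempFile("blk_", ".data");
        try (BlockWriter w = new BlockWriter(block,
                EnumSet.of(CreateFlag.SYNC_BLOCK))) {
            w.receivePacket("packet-1".getBytes());
            w.syncBlock();                       // explicit mid-write sync
            w.receivePacket("packet-2".getBytes());
        }                                        // close forces remaining data
        System.out.println("bytes on disk: " + Files.size(block));
        Files.delete(block);
    }
}
```

Note that force(true) only guarantees the OS has pushed the data to the disk device; as the quoted issue description says, the disk may still hold it in its own cache. The pipelined-sync variants discussed above (sync only one replica, or sync all mirrors in parallel) would change when and on which datanodes this call is issued, not the call itself.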