[
https://issues.apache.org/jira/browse/HADOOP-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573396#action_12573396
]
Doug Cutting commented on HADOOP-2657:
--------------------------------------
We should change FSOutputStream to implement Seekable, with the default
implementation of seek throwing an IOException, then use this in
ChecksumFileSystem to rewind and overwrite the checksum. Then folks will only
fail if they attempt to write more data after they've flushed on a
ChecksumFileSystem that doesn't support seek. I don't think we will have any
filesystems that both extend ChecksumFileSystem and can't support seek. Only
LocalFileSystem currently extends ChecksumFileSystem, and it does support seek.
So flush() shouldn't ever fail for existing filesystems, but seek() will fail
for most output streams (probably all except local). Does that make sense?
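The proposal above can be sketched roughly as follows. This is only an illustration of the idea, not the actual Hadoop API: the class names (FSOutputStreamSketch, LocalOutputStreamSketch) are hypothetical, and the real FSOutputStream/ChecksumFileSystem classes have more to them. The key point is that seek() fails by default and only streams that can actually seek (like the local one) override it.

```java
import java.io.IOException;

// Hypothetical sketch of the proposal: seek() throws by default, so only
// filesystems that can rewind (to overwrite a partial checksum on flush)
// override it. Names here are illustrative, not the real Hadoop classes.
interface Seekable {
    void seek(long pos) throws IOException;
}

abstract class FSOutputStreamSketch implements Seekable {
    // Default implementation: most output streams cannot seek.
    public void seek(long pos) throws IOException {
        throw new IOException("seek not supported");
    }
}

class LocalOutputStreamSketch extends FSOutputStreamSketch {
    long pos = 0;
    // The local filesystem supports seek, so it overrides the default.
    @Override
    public void seek(long newPos) { pos = newPos; }
}

public class Demo {
    public static void main(String[] args) {
        FSOutputStreamSketch local = new LocalOutputStreamSketch();
        boolean localSeekOk = true;
        try { local.seek(42); } catch (IOException e) { localSeekOk = false; }
        System.out.println("local seek ok: " + localSeekOk);     // true

        FSOutputStreamSketch generic = new FSOutputStreamSketch() {};
        boolean genericSeekOk = true;
        try { generic.seek(42); } catch (IOException e) { genericSeekOk = false; }
        System.out.println("generic seek ok: " + genericSeekOk); // false
    }
}
```

Under this scheme a flush on a checksummed stream would seek back to the start of the last partial checksum chunk and overwrite its checksum; only a later write after flush on a non-seekable stream would hit the IOException.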
> Enhancements to DFSClient to support flushing data at any point in time
> -----------------------------------------------------------------------
>
> Key: HADOOP-2657
> URL: https://issues.apache.org/jira/browse/HADOOP-2657
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: flush.patch, flush2.patch, flush3.patch
>
>
> The HDFS Append Design (HADOOP-1700) requires that there be a public API to
> flush data written to an HDFS file that can be invoked by an application. This
> API (popularly referred to as fflush(OutputStream)) will ensure that data
> written to the DFSOutputStream is flushed to the datanodes and that any required
> metadata is persisted on the Namenode.
> This API has to handle the case where the client decides to flush after
> writing data that is not an exact multiple of io.bytes.per.checksum.
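The partial-chunk case above comes down to simple arithmetic. A minimal sketch (method name is illustrative; io.bytes.per.checksum is a Hadoop config property whose value is just a byte count): the bytes written since the last chunk boundary form a partial chunk whose checksum must be written at flush time and later overwritten if more data arrives.

```java
// Illustrative arithmetic only: locating the last (possibly partial)
// checksum chunk at flush time. Not the actual DFSClient code.
public class PartialChunk {
    // Offset of the start of the last checksum chunk, given how many
    // bytes have been written and the configured chunk size.
    static long partialChunkStart(long bytesWritten, int bytesPerChecksum) {
        return bytesWritten - (bytesWritten % bytesPerChecksum);
    }

    public static void main(String[] args) {
        // e.g. 1300 bytes written with 512-byte chunks: the partial
        // chunk starts at offset 1024 and holds 276 bytes.
        long start = partialChunkStart(1300, 512);
        System.out.println(start + " " + (1300 - start)); // prints "1024 276"
    }
}
```

A flush implementation would checksum those trailing bytes and write that checksum out; a subsequent write then needs to seek back to the corresponding position in the checksum stream and overwrite it, which is exactly where the Seekable discussion in the comment above comes in.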