[ 
https://issues.apache.org/jira/browse/HDFS-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793861#action_12793861
 ] 

Sanjay Radia commented on HDFS-814:
-----------------------------------

I like Dhruba's suggestion because of its transparent behaviour, 
but it has the following issue: applications that don't care about very 
accurate file lengths will still pay the cost for files
that happen to be open for writing. For example, the cost of ls -r on a dir 
(say, an MR output dir) can go up when some of the files in the subtree are 
open for writing.

Isn't it acceptable to say that listStatus returns the last known file size, 
and that DFSDataInputStream.getVisibleLen() gives a more accurate result?
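
To make the trade-off concrete, here is a minimal self-contained sketch in Java. 
It does not use the Hadoop API: ToyFile, listStatusLength and visibleLength are 
hypothetical names that only model the idea that a listing returns the last 
length recorded in metadata, while a per-file query on a file that is open for 
writing does extra work to find the flushed length.

```java
public class VisibleLengthSketch {

    /** Toy stand-in for a file; NOT an HDFS class (assumption for illustration). */
    static final class ToyFile {
        final String name;
        long recordedLength = 0;     // length last recorded in metadata
        long flushedLength = 0;      // bytes made visible to readers by a flush
        boolean openForWrite = true;

        ToyFile(String name) { this.name = name; }

        /** Writer appends bytes and flushes them; metadata is not updated. */
        void writeAndFlush(long bytes) { flushedLength += bytes; }

        /** Closing the file records its final length in the metadata. */
        void close() { recordedLength = flushedLength; openForWrite = false; }
    }

    /** Counts the extra per-file lookups a "transparent" listing would pay. */
    static int extraLookups = 0;

    /** Cheap listing path: returns the last known length only. */
    static long listStatusLength(ToyFile f) { return f.recordedLength; }

    /** Per-file path: does an extra lookup when the file is still open. */
    static long visibleLength(ToyFile f) {
        if (!f.openForWrite) return f.recordedLength;
        extraLookups++;
        return f.flushedLength;
    }

    public static void main(String[] args) {
        ToyFile done = new ToyFile("part-00000");
        done.writeAndFlush(64);
        done.close();

        ToyFile open = new ToyFile("part-00001");
        open.writeAndFlush(100);     // flushed, but still open for writing

        for (ToyFile f : new ToyFile[] {done, open}) {
            System.out.println(f.name + " listed=" + listStatusLength(f)
                    + " visible=" + visibleLength(f));
        }
        System.out.println("extra lookups: " + extraLookups);
    }
}
```

In this model only callers that ask for the visible length pay the extra 
lookup, which is the behaviour the question above argues for.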

> Add an api to get the visible length of a DFSDataInputStream.
> -------------------------------------------------------------
>
>                 Key: HDFS-814
>                 URL: https://issues.apache.org/jira/browse/HDFS-814
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs client
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: h814_20091221.patch
>
>
> Hflush guarantees that the bytes written before are visible to the new 
> readers.  However, there is no way to get the length of the visible bytes.  
> The visible length is useful in some applications like SequenceFile.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
