[ https://issues.apache.org/jira/browse/HDFS-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794446#action_12794446 ]
dhruba borthakur commented on HDFS-814:
---------------------------------------

> but it has the following issue: Applications that don't care about very
> accurate file lengths will pay the cost for files

This will happen only if the file is being written to when somebody else does a getFileStatus on the file. This should never happen for the most typical app that runs on HDFS... a map-reduce job.

> Cost of ls -r of a dir (say MR output dir) can go up when some of the files in
> the subtree are open for writing.

I suspect that this is not a typical use-case. The MR-job output directory will typically be empty until the job is committed and all files get renamed into the out directory (from the tmp directory).

I am good for this patch because this does not introduce a FileSystem/FileContext API.

> Add an api to get the visible length of a DFSDataInputStream.
> -------------------------------------------------------------
>
>                 Key: HDFS-814
>                 URL: https://issues.apache.org/jira/browse/HDFS-814
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs client
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: h814_20091221.patch, h814_20091221_0.21.patch
>
>
> Hflush guarantees that the bytes written before are visible to the new
> readers. However, there is no way to get the length of the visible bytes.
> The visible length is useful in some applications like SequenceFile.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.