[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088000#comment-15088000
 ] 

Konstantin Shvachko commented on HDFS-9607:
-------------------------------------------

Dinesh, "random writes" would indeed be a valuable enhancement for HDFS. It 
would be a natural evolution of the update semantics for HDFS files. Currently 
we support:
# Sequential writes
# Append (HADOOP-1700, HDFS-265)
# Snapshots (HDFS-2802)
# Truncate (HDFS-3107)

Random writes have been proposed on several occasions: HDFS-214, HADOOP-5215.
The semantics are quite simple: you {{seek}} to a given offset in the file, and 
then {{write}} bytes starting from that offset, overwriting whatever was there 
previously.
Semantically, HDFS files store bytes, not records, so there is no notion of 
replacing a record of a given size with a record of the same size.
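
The seek-then-overwrite semantics can be illustrated on a local file. The sketch below is only an analogy: it uses {{java.io.RandomAccessFile}} from the JDK, not any HDFS API, and the class name {{OverwriteSketch}} and helper {{overwriteAt}} are hypothetical names made up for this example. It replays the "Hello World" to "Hello HDFS!" case from the issue description, where the file length does not change.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-file analogy only; HDFS does not currently offer this operation.
class OverwriteSketch {

    // Seek to 'offset' and overwrite the bytes found there with 'data'.
    // (Hypothetical helper; the name is not taken from any HDFS class.)
    static void overwriteAt(Path file, long offset, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(offset);
            raf.write(data);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("overwrite-sketch", ".txt");
        try {
            Files.write(file, "Hello World".getBytes(StandardCharsets.US_ASCII));
            // "World" starts at byte offset 6 and is 5 bytes long, as is "HDFS!".
            overwriteAt(file, 6, "HDFS!".getBytes(StandardCharsets.US_ASCII));
            byte[] result = Files.readAllBytes(file);
            System.out.println(new String(result, StandardCharsets.US_ASCII));
            System.out.println("length = " + result.length);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Note that the call works purely on byte offsets: nothing in it knows about records, which matches the point that HDFS files store bytes.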

Thus for the API I would just add one _positional write_ method to 
{{DFSOutputStream}}
{code}
// Proposed new method in DFSOutputStream
public void write(long position, byte[] buffer, int offset, int length)
{code}
This would be symmetrical to the positional read of {{DFSInputStream}}.
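
To make the symmetry concrete, here is a minimal in-memory sketch of a positional write next to a positional read. Everything in it is an assumption for illustration: {{PositionalBytes}} is a hypothetical stand-in class, not an HDFS class, and it assumes that a positional write, like a positional read, leaves the stream's current offset untouched and never changes the length. The actual behavior would be defined by the design.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical in-memory stand-in for a fixed-length file; not an HDFS class.
class PositionalBytes {
    private final byte[] data;
    private long cursor = 0; // offset used by ordinary sequential writes

    PositionalBytes(int length) {
        data = new byte[length];
    }

    long cursor() {
        return cursor;
    }

    // Sequential write: stores bytes at the cursor and advances it.
    void write(byte[] buffer, int offset, int length) {
        write(cursor, buffer, offset, length);
        cursor += length;
    }

    // Positional write: overwrites bytes at 'position'.
    // Assumption: the cursor is not moved and the length cannot grow.
    void write(long position, byte[] buffer, int offset, int length) {
        if (position < 0 || length < 0 || position + length > data.length) {
            throw new IndexOutOfBoundsException("write outside existing length");
        }
        System.arraycopy(buffer, offset, data, (int) position, length);
    }

    // Positional read: copies up to 'length' bytes from 'position' and
    // returns the number of bytes copied, or -1 at or past the end.
    int read(long position, byte[] buffer, int offset, int length) {
        if (position < 0 || position >= data.length) {
            return -1;
        }
        int n = (int) Math.min(length, data.length - position);
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n;
    }

    public static void main(String[] args) {
        PositionalBytes file = new PositionalBytes(11);
        byte[] hello = "Hello World".getBytes(StandardCharsets.US_ASCII);
        file.write(hello, 0, hello.length);      // sequential write, cursor moves to 11

        byte[] patch = "HDFS!".getBytes(StandardCharsets.US_ASCII);
        file.write(6L, patch, 0, patch.length);  // positional overwrite, cursor stays at 11

        byte[] out = new byte[11];
        int n = file.read(0L, out, 0, out.length);
        System.out.println(new String(out, 0, n, StandardCharsets.US_ASCII));
        System.out.println("cursor = " + file.cursor());
    }
}
```

The two positional methods take the same four arguments in the same order, which is the symmetry being suggested; replica consistency and snapshot handling are deliberately out of scope for this sketch.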

The difficulty with implementing random writes is that they must provide 
consistency across the replicas of the same block, and must support existing 
features including append, truncate, and notably snapshots. So I guess people 
here are interested in what your proposal is in this regard.
You can look up the 
[append|https://issues.apache.org/jira/secure/attachment/12445209/appendDesign3.pdf]
 and 
[truncate|https://issues.apache.org/jira/secure/attachment/12697141/HDFS_truncate.pdf]
 documents as a guide for your design.

> Advance Hadoop Architecture (AHA) - HDFS
> ----------------------------------------
>
>                 Key: HDFS-9607
>                 URL: https://issues.apache.org/jira/browse/HDFS-9607
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
