[
https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665196#action_12665196
]
Luke Lu commented on HADOOP-4379:
---------------------------------
On the design of TC3: I think that having the client inform the namenode on
every sync is too expensive and not scalable. Hypertable can easily generate 1M
transactions/s from a small cluster, which the namenode cannot possibly handle.
Since restarting HDFS incurs no data loss, I think the basic logic is probably
already fine. For most cases, I think that making new readers a bit more
expensive to create is probably the right trade-off:
1. We could append and sync a file and only inform the namenode on the first
sync, hinting that we need to do something special for the new readers of the
file.
2. Upon creating a new reader for the file, the namenode would request a block
report (an incremental one, if we try to be clever) from the involved data
nodes and return the right info to the reader.
I think that this would cover most of the use cases for append and sync
reasonably.
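
To make the two steps above concrete, here is a minimal in-memory sketch in
Java. It is only an illustration of the proposal, not HDFS code: every class
and method name (SketchDataNode, SketchNameNode, SketchWriter, firstSync,
openForRead, reportSyncedLength) is hypothetical, and the choice to expose the
minimum synced length across replicas to a new reader is an assumption of this
sketch rather than something the proposal specifies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a datanode: remembers synced bytes per file. */
class SketchDataNode {
    private final Map<String, Long> syncedBytes = new HashMap<>();

    void sync(String path, long length) {
        syncedBytes.put(path, length);
    }

    /** Stands in for an (incremental) block report restricted to one file. */
    long reportSyncedLength(String path) {
        return syncedBytes.getOrDefault(path, 0L);
    }
}

/** Hypothetical stand-in for the namenode. */
class SketchNameNode {
    private final Map<String, List<SketchDataNode>> pipelines = new HashMap<>();
    private final Set<String> beingSynced = new HashSet<>();
    int syncNotifications = 0; // how often clients contacted us about a sync

    void create(String path, List<SketchDataNode> nodes) {
        pipelines.put(path, new ArrayList<>(nodes));
    }

    /** Step 1: called once per file, on the writer's first sync only. */
    void firstSync(String path) {
        syncNotifications++;
        beingSynced.add(path);
    }

    /** Step 2: a new reader pays for a report from the involved datanodes. */
    long openForRead(String path) {
        List<SketchDataNode> nodes =
            pipelines.getOrDefault(path, new ArrayList<SketchDataNode>());
        if (!beingSynced.contains(path) || nodes.isEmpty()) {
            return 0L; // no hint received: nothing synced is known for this file
        }
        // Assumption of this sketch: expose only what every replica has synced.
        long visible = Long.MAX_VALUE;
        for (SketchDataNode node : nodes) {
            visible = Math.min(visible, node.reportSyncedLength(path));
        }
        return visible;
    }
}

/** Hypothetical writer that appends and syncs without a namenode round trip. */
class SketchWriter {
    private final String path;
    private final SketchNameNode nameNode;
    private final List<SketchDataNode> nodes;
    private long length = 0;
    private boolean nameNodeInformed = false;

    SketchWriter(String path, SketchNameNode nameNode, List<SketchDataNode> nodes) {
        this.path = path;
        this.nameNode = nameNode;
        this.nodes = nodes;
        nameNode.create(path, nodes);
    }

    void appendAndSync(long bytes) {
        length += bytes;
        for (SketchDataNode node : nodes) {
            node.sync(path, length); // every sync goes to the datanodes
        }
        if (!nameNodeInformed) {     // only the first sync goes to the namenode
            nameNode.firstSync(path);
            nameNodeInformed = true;
        }
    }
}
```

In this sketch the writer can sync as often as it likes while the namenode
sees one notification per file; the extra cost moves to openForRead, which is
the trade-off suggested above.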
> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
> Key: HADOOP-4379
> URL: https://issues.apache.org/jira/browse/HADOOP-4379
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: dhruba borthakur
> Fix For: 0.19.1
>
> Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt
>
>
> In the append design doc
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before
> the reader opened the file
> However, this feature is not yet implemented. Note that the operation
> 'flushed' is now called "sync".