[
https://issues.apache.org/jira/browse/HADOOP-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710523#action_12710523
]
Hairong Kuang commented on HADOOP-5744:
---------------------------------------
Dhruba told me that HBase depends on a client calling append to trigger the
close of a file that has lost its writer. Once the file is closed, the client
reads the file and resumes work from the state recorded in the closed file.
My question is why the file needs to be closed before it is read. The read
semantics defined in this jira guarantee that
(1) any hflushed data becomes visible to any new reader;
(2) once a byte becomes visible to a reader, it continues to be visible to that
reader unless all replicas containing the byte fail. This implies that a
reader continues to see a byte it saw before even when the replica that it read
from fails, during any error recovery, and after any error recovery, as long as
one replica containing the byte is available.
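To make guarantee (2) concrete, here is a toy model, not HDFS code. The Replica
class, the function names, and the replica lengths are all made up for
illustration; the only rule taken from the text above is that a byte stays
visible while at least one available replica contains it.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one replica of the file being written."""
    length: int          # number of bytes this replica holds
    alive: bool = True   # False once the node holding it has failed


def byte_visible(replicas, offset):
    """A byte stays visible while at least one live replica contains it."""
    return any(r.alive and offset < r.length for r in replicas)


def visible_length(replicas):
    """Length of the prefix a reader can still see across live replicas."""
    return max((r.length for r in replicas if r.alive), default=0)


replicas = [Replica(100), Replica(100), Replica(80)]
replicas[0].alive = False               # the replica the reader was using fails
assert byte_visible(replicas, 99)       # another replica still holds byte 99
replicas[1].alive = False               # last replica containing byte 99 fails
assert not byte_visible(replicas, 99)   # the exception in guarantee (2)
assert byte_visible(replicas, 79)       # bytes 0..79 are still visible
```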
Is it OK for your client to trigger the close of the file but not wait for it
to close? The idea is to read the file and resume working before the file
gets closed. When the file finally gets closed, the file
(1) may have more bytes than when it was previously read. This is the normal
case. Will this be an issue for HBase?
(2) may end up with fewer bytes, if all replicas go down between the time the
close is triggered and the time the file is actually closed. This is a rare
case. The default length of this period is 10 minutes, so the chance of losing
visible bytes is very slim. Can HBase tolerate this?
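The two outcomes above, seen from the client side, as a rough sketch. It
assumes the client remembers how many bytes it read before the close completed
and checks the file length again afterwards; reconcile and its return values
are hypothetical names, not part of HDFS or HBase.

```python
def reconcile(length_read, final_length):
    """Compare the length read before close with the length after close.

    Returns a (case, delta) pair, where delta is the size of the difference
    in bytes.
    """
    if final_length > length_read:
        # Case (1): the closed file has a tail the client has not seen yet.
        return ("grew", final_length - length_read)
    if final_length < length_read:
        # Case (2): bytes the client already acted on are gone.
        return ("shrank", length_read - final_length)
    return ("unchanged", 0)
```

In case (1) the client would only need to process the extra tail; case (2) is
the one it would have to be able to tolerate.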
> Revisit append
> --------------
>
> Key: HADOOP-5744
> URL: https://issues.apache.org/jira/browse/HADOOP-5744
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Affects Versions: 0.20.0
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.21.0
>
> Attachments: AppendSpec.pdf
>
>
> HADOOP-1700 and related issues have put a lot of effort into providing the
> first implementation of append. However, append is a complex feature. It turns
> out that there are issues that initially seemed trivial but need careful
> design. This jira revisits append, aiming for a design and implementation
> that support semantics acceptable to its users.