Zheng Shao created HDFS-14229:
---------------------------------
Summary: Nonblocking HDFS create|write
Key: HDFS-14229
URL: https://issues.apache.org/jira/browse/HDFS-14229
Project: Hadoop HDFS
Issue Type: New Feature
Components: hdfs-client
Reporter: Zheng Shao
Right now, the create call on HDFS is blocking. The write call can also
block when the write buffer reaches its limit.
However, for most applications, the only requirement is that once "close" on a
file has been called, the file is persisted and visible in HDFS. There is no
need for the file to be visible as soon as the "create" call returns.
A particular use case for this is using HDFS as a place to store shuffle data
(in Spark, MapReduce, or other loosely coupled applications).
This Jira proposes that we add a new "async-hdfs://" URI scheme that maps to a
new AsyncDistributedFileSystem class. Its create call is nonblocking but
still returns an FSOutputStream, and that stream never blocks on write (even
when the file has not been physically created on HDFS yet). The close call on
the FSOutputStream blocks until the creation and all previous writes have
completed and the file is closed.
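
The proposed semantics could be sketched roughly as follows, using only the
JDK rather than any Hadoop class. DeferredOutputStream, its constructor
argument slowCreate, and AsyncWriteDemo are hypothetical names made up for
this illustration; this is not the AsyncDistributedFileSystem implementation,
only a minimal model of "create and write return at once, close waits".

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical sketch: creation and writes are queued; close() waits for them. */
class DeferredOutputStream extends OutputStream {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Tail of the chain of pending operations; the first link is the slow create.
    private CompletableFuture<OutputStream> tail;
    private boolean closed = false;

    DeferredOutputStream(Supplier<OutputStream> slowCreate) {
        // Returns immediately; the (assumed slow) create runs on the worker thread.
        this.tail = CompletableFuture.supplyAsync(slowCreate, worker);
    }

    @Override
    public synchronized void write(int b) {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public synchronized void write(byte[] buf, int off, int len) {
        // Copy the bytes, since the caller may reuse buf once this call returns.
        byte[] copy = Arrays.copyOfRange(buf, off, off + len);
        // Chain the write after everything queued so far; do not wait for it.
        tail = tail.thenApplyAsync(out -> {
            try {
                out.write(copy);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return out;
        }, worker);
    }

    @Override
    public synchronized void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        CompletableFuture<Void> done = tail.thenAcceptAsync(out -> {
            try {
                out.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, worker);
        try {
            // The only blocking point: wait for create, all writes, and close.
            done.join();
        } catch (CompletionException e) {
            throw new IOException("deferred create/write/close failed", e.getCause());
        } finally {
            worker.shutdown();
        }
    }
}

class AsyncWriteDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DeferredOutputStream out = new DeferredOutputStream(() -> {
            try {
                Thread.sleep(200); // stand-in for the latency of a real create
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return sink;
        });
        out.write("shuffle block".getBytes(StandardCharsets.UTF_8)); // does not wait
        out.close(); // waits until the data has reached the sink
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

One consequence visible in this sketch: because write never blocks, pending
data is buffered in memory without limit, and any failure in create or write
only surfaces when close is called. A real design would presumably need to
decide how to bound that buffer and how to report such errors.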