[jira] [Created] (HDFS-10419) Building HDFS on top of Ozone's storage containers

Jing Zhao (JIRA) Tue, 17 May 2016 13:00:30 -0700

Jing Zhao created HDFS-10419:
--------------------------------

             Summary: Building HDFS on top of Ozone's storage containers
                 Key: HDFS-10419
                 URL: https://issues.apache.org/jira/browse/HDFS-10419
             Project: Hadoop HDFS
          Issue Type: New Feature
            Reporter: Jing Zhao
            Assignee: Jing Zhao



In HDFS-7240, Ozone defines storage containers to store both the data and the 
metadata. The storage container layer provides an object storage interface and 
aims to manage data/metadata in a distributed manner. More details about 
storage containers can be found in the design doc in HDFS-7240.

HDFS can adopt the storage containers to store and manage blocks. The general 
idea is:
# Each block can be treated as an object and the block ID is the object's key.
# Blocks will still be stored in DataNodes but as objects in storage containers.
# The block management work can be separated out of the NameNode and will be 
handled by the storage container layer in a more distributed way. The NameNode 
will only manage the namespace (i.e., files and directories).
# For each file, the NameNode only needs to record a list of block IDs which 
are used as keys to obtain real data from storage containers.
# A new DFSClient implementation talks to both the NameNode and the storage 
container layer to read and write data.
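
To make the division of responsibilities in the list above concrete, here is a 
minimal in-memory sketch. It is an illustration only: the class and method 
names (StorageContainerLayer, NameNode.allocate_block, Client, and so on) are 
hypothetical and are not taken from HDFS or Ozone, and replication, DataNodes, 
and RPC are left out entirely.

```python
from typing import Dict, List


class StorageContainerLayer:
    """Hypothetical stand-in for the container layer: objects keyed by block ID."""

    def __init__(self) -> None:
        self._objects: Dict[int, bytes] = {}

    def put(self, block_id: int, data: bytes) -> None:
        self._objects[block_id] = data

    def get(self, block_id: int) -> bytes:
        return self._objects[block_id]


class NameNode:
    """Hypothetical namespace-only NameNode: file path -> ordered block IDs."""

    def __init__(self) -> None:
        self._files: Dict[str, List[int]] = {}
        self._next_block_id = 1

    def create(self, path: str) -> None:
        self._files[path] = []

    def allocate_block(self, path: str) -> int:
        # The NameNode hands out an ID and records it; it never sees block data
        # and keeps no block-to-DataNode mapping.
        block_id = self._next_block_id
        self._next_block_id += 1
        self._files[path].append(block_id)
        return block_id

    def get_block_ids(self, path: str) -> List[int]:
        return list(self._files[path])


class Client:
    """Hypothetical client that talks to both the NameNode and the containers."""

    def __init__(self, namenode: NameNode, containers: StorageContainerLayer,
                 block_size: int = 4) -> None:
        self.namenode = namenode
        self.containers = containers
        self.block_size = block_size

    def write(self, path: str, data: bytes) -> None:
        self.namenode.create(path)
        for offset in range(0, len(data), self.block_size):
            block_id = self.namenode.allocate_block(path)
            self.containers.put(block_id, data[offset:offset + self.block_size])

    def read(self, path: str) -> bytes:
        # Block IDs come from the NameNode; the bytes come from the containers.
        block_ids = self.namenode.get_block_ids(path)
        return b"".join(self.containers.get(b) for b in block_ids)
```

In this sketch the NameNode's per-file state is just a list of integers, which 
is the point of the proposal: everything about where a block lives is the 
container layer's concern.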

HDFS, especially the NameNode, can scale much better with this design. 
Currently the NameNode's heaviest workload comes from block management, which 
includes maintaining the block-to-DataNode mapping, receiving full and 
incremental block reports, tracking block states (under-replicated, 
over-replicated, missing), and participating in every write pipeline to 
guarantee data consistency. This work brings a high memory footprint and makes 
the NameNode suffer from GC. HDFS-5477 already proposes to convert the 
BlockManager into a service. If we can build HDFS on top of the storage 
container layer, we not only separate the BlockManager out of the NameNode, but 
also replace it with a new distributed management scheme.
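
As a rough illustration of the bookkeeping described above, the sketch below 
keeps a block-to-DataNode mapping, applies full block reports to it, and 
classifies each block's replication state. It is a deliberately simplified 
model, not the BlockManager's actual code; the function names and the string 
state labels are assumptions made for this example.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Set

# Hypothetical in-memory mapping: block ID -> set of DataNodes holding a replica.
BlockMap = DefaultDict[int, Set[str]]


def apply_full_block_report(block_map: BlockMap, datanode: str,
                            reported: Iterable[int]) -> None:
    """Make the mapping agree with one DataNode's full list of blocks."""
    reported_set = set(reported)
    # Drop this DataNode from blocks it no longer reports.
    for block_id, nodes in block_map.items():
        if block_id not in reported_set:
            nodes.discard(datanode)
    # Record it for every block it does report.
    for block_id in reported_set:
        block_map[block_id].add(datanode)


def replication_state(block_map: BlockMap, block_id: int, target: int) -> str:
    """Classify a block by comparing live replicas with the target count."""
    live = len(block_map.get(block_id, ()))
    if live == 0:
        return "missing"
    if live < target:
        return "under-replicated"
    if live > target:
        return "over-replicated"
    return "ok"


def new_block_map() -> BlockMap:
    return defaultdict(set)
```

Even in this toy form, the mapping grows with the number of blocks times the 
replication factor and is touched by every block report, which suggests why 
holding it in a single NameNode heap becomes expensive.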

The storage container work is currently in progress in HDFS-7240, and the work 
proposed here is still at an experimental, exploratory stage. We can run this 
experiment in a feature branch so that anyone interested can get involved.

A design doc will be uploaded later explaining more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
