Jing Zhao created HDFS-10419:
--------------------------------

             Summary: Building HDFS on top of Ozone's storage containers
                 Key: HDFS-10419
                 URL: https://issues.apache.org/jira/browse/HDFS-10419
             Project: Hadoop HDFS
          Issue Type: New Feature
            Reporter: Jing Zhao
            Assignee: Jing Zhao

In HDFS-7240, Ozone defines storage containers to store both data and metadata. The storage container layer provides an object storage interface and aims to manage data and metadata in a distributed manner. More details about storage containers can be found in the design doc in HDFS-7240.

HDFS can adopt storage containers to store and manage blocks. The general idea is:
# Each block can be treated as an object, with the block ID as the object's key.
# Blocks will still be stored on DataNodes, but as objects in storage containers.
# Block management can be separated out of the NameNode and handled by the storage container layer in a more distributed way. The NameNode will only manage the namespace (i.e., files and directories).
# For each file, the NameNode only needs to record a list of block IDs, which are used as keys to obtain the real data from storage containers.
# A new DFSClient implementation talks to both the NameNode and the storage container layer to read and write.

HDFS, and especially the NameNode, can get much better scalability from this design. Currently the NameNode's heaviest workload comes from block management, which includes maintaining the block-to-DataNode mapping, receiving full and incremental block reports, tracking block states (under-/over-/mis-replicated), and participating in every write pipeline protocol to guarantee data consistency. This work creates a large memory footprint and makes the NameNode suffer from GC. HDFS-5477 already proposes turning the BlockManager into a service. If we can build HDFS on top of the storage container layer, we not only separate the BlockManager out of the NameNode, but also replace it with a new distributed management scheme.

The storage container work is currently in progress in HDFS-7240, and the work proposed here is still at an experimental/exploratory stage. We can do this experiment in a feature branch so that anyone interested can get involved.

A design doc explaining more details will be uploaded later.
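
To make the five points of the general idea a bit more concrete, here is a minimal in-memory sketch in Python. It is not Ozone's or HDFS's actual API: the class names (`ContainerLayer`, `NameNode`, `Client`), their methods, and the tiny block size are all hypothetical and chosen only for illustration. Replication, DataNode placement, and failure handling are left out entirely.

```python
from typing import Dict, List


class ContainerLayer:
    """Hypothetical stand-in for the storage container layer.

    Modeled here as a plain key-value object store: block ID -> block data.
    """

    def __init__(self) -> None:
        self._objects: Dict[int, bytes] = {}

    def put(self, block_id: int, data: bytes) -> None:
        self._objects[block_id] = data

    def get(self, block_id: int) -> bytes:
        return self._objects[block_id]


class NameNode:
    """Hypothetical NameNode that manages only the namespace.

    It records, per file, an ordered list of block IDs and nothing about
    where the blocks live or how they are replicated.
    """

    def __init__(self) -> None:
        self._files: Dict[str, List[int]] = {}
        self._next_block_id = 1

    def create(self, path: str) -> None:
        if path in self._files:
            raise FileExistsError(path)
        self._files[path] = []

    def allocate_block(self, path: str) -> int:
        # Hand out a fresh block ID and append it to the file's block list.
        block_id = self._next_block_id
        self._next_block_id += 1
        self._files[path].append(block_id)
        return block_id

    def block_ids(self, path: str) -> List[int]:
        return list(self._files[path])


class Client:
    """Hypothetical client that talks to both the NameNode and the containers."""

    def __init__(self, namenode: NameNode, containers: ContainerLayer,
                 block_size: int = 4) -> None:
        self.namenode = namenode
        self.containers = containers
        self.block_size = block_size  # unrealistically small, for illustration

    def write(self, path: str, data: bytes) -> None:
        self.namenode.create(path)
        for offset in range(0, len(data), self.block_size):
            block_id = self.namenode.allocate_block(path)
            # The block ID is the object's key in the container layer.
            self.containers.put(block_id, data[offset:offset + self.block_size])

    def read(self, path: str) -> bytes:
        # Namespace lookup first, then fetch each block by its key.
        return b"".join(self.containers.get(block_id)
                        for block_id in self.namenode.block_ids(path))
```

In this sketch the NameNode never sees block contents or locations; it only maps a path to block IDs, and the client resolves those IDs against the container layer on its own.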