[
https://issues.apache.org/jira/browse/HDFS-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Colin Patrick McCabe resolved HDFS-3290.
----------------------------------------
Resolution: Duplicate
> Use a better local directory layout for the datanode
> ----------------------------------------------------
>
> Key: HDFS-3290
> URL: https://issues.apache.org/jira/browse/HDFS-3290
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 0.23.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Minor
>
> When the HDFS DataNode stores block files in a local directory, it currently
> puts all of them into either one big directory or a collection of
> directories. However, there is no way to tell from a block's ID which
> directory that block will end up in. As the number of files increases, this
> does not scale well.
> Similar to the Git version control system, HDFS should create a fixed set of
> top-level directories keyed off a few bits of the block ID. Git uses 8 bits
> (the first two hex digits of the object hash, giving 256 directories). This
> substantially cuts down on the number of block files in any one directory
> and improves performance, while not compromising O(1) lookup of blocks.
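
For illustration only, the layout proposed above could be sketched roughly as
follows. This is not actual HDFS code: the class name BlockDirLayout, its
methods, the choice of the low 8 bits, and the "blk_" file prefix used here
are all assumptions made for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of HDFS. */
final class BlockDirLayout {
    // Assumption: use 8 bits, mirroring the Git example (256 directories).
    static final int DIR_BITS = 8;
    static final long DIR_MASK = (1L << DIR_BITS) - 1;

    /** Two-hex-digit directory name taken from the low 8 bits of the ID. */
    static String subdirName(long blockId) {
        // Masking keeps the value in 0..255 even for negative IDs.
        return String.format("%02x", blockId & DIR_MASK);
    }

    /** Full path of a block file; computed directly, with no directory scan. */
    static Path blockFile(Path root, long blockId) {
        return root.resolve(subdirName(blockId)).resolve("blk_" + blockId);
    }

    public static void main(String[] args) {
        Path root = Paths.get("data", "current"); // hypothetical storage root
        System.out.println(blockFile(root, 0x1234L));
    }
}
```

Because the directory name is a pure function of the ID, the lookup stays O(1)
while each directory holds roughly 1/256 of the files, assuming the low bits of
the IDs are evenly distributed.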
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)