[ 
https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Thomas updated HDFS-6482:
-------------------------------

    Attachment: HDFS-6482.8.patch

Fixed the new test. Also parallelized the hard link process and moved it to 
native code, so that creating 200k hard links (100k blocks, each with a block 
file and a meta file) takes just over a second on a dual-core PC. We should 
therefore be well within 60 seconds for a rolling upgrade on any DataNode. 
Added a configuration parameter that lets users specify the number of threads 
used in the hard link process. (These optimizations are applied only when 
upgrading to the block ID-based layout. In every other upgrade the directory 
structures of the old and new layouts should be the same, so we can perform 
fast batch hard links over whole directories; see HDFS-1445.)
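
To illustrate the idea of fanning the link work out over a configurable number 
of threads, here is a minimal sketch. It is not the code in the patch: the 
class name ParallelHardLinker, the method linkAll, and the numThreads 
parameter are hypothetical, and it uses java.nio's Files.createLink rather 
than the native call the patch relies on.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Hadoop: creates many hard links in parallel.
class ParallelHardLinker {

    /**
     * Creates one hard link per map entry (key = new link, value = existing file)
     * using a fixed pool of numThreads workers. Returns the number of links made.
     */
    static int linkAll(Map<Path, Path> linkToExisting, int numThreads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (Map.Entry<Path, Path> e : linkToExisting.entrySet()) {
                Callable<Void> task = () -> {
                    // The target directory may not exist yet in the new layout.
                    Files.createDirectories(e.getKey().getParent());
                    Files.createLink(e.getKey(), e.getValue());
                    return null;
                };
                pending.add(pool.submit(task));
            }
            // Wait for every link; surface the first failure as an IOException.
            for (Future<Void> f : pending) {
                try {
                    f.get();
                } catch (ExecutionException ex) {
                    throw new IOException("hard link failed", ex.getCause());
                }
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

The sketch submits one task per link for simplicity. A real implementation 
would more likely hand each thread a batch of links to reduce scheduling 
overhead.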

> Use block ID-based block layout on datanodes
> --------------------------------------------
>
>                 Key: HDFS-6482
>                 URL: https://issues.apache.org/jira/browse/HDFS-6482
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: 6482-design.doc, HDFS-6482.1.patch, HDFS-6482.2.patch, 
> HDFS-6482.3.patch, HDFS-6482.4.patch, HDFS-6482.5.patch, HDFS-6482.6.patch, 
> HDFS-6482.7.patch, HDFS-6482.8.patch, HDFS-6482.patch
>
>
> Right now blocks are placed into directories that are split into many 
> subdirectories when capacity is reached. Instead we can use a block's ID to 
> determine the path it should go in. This eliminates the need for the LDir 
> data structure that facilitates the splitting of directories when they reach 
> capacity as well as fields in ReplicaInfo that keep track of a replica's 
> location.
> An extension of the work in HDFS-3290.
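
As a rough sketch of what deriving a block's directory from its ID can look 
like: the message above does not state the mapping, so the two-level layout, 
the choice of bits, the 256-way fan-out, and the names BlockIdLayout and 
idToBlockDir below are all assumptions made for illustration.

```java
import java.io.File;

// Hypothetical illustration of a block ID-based layout, not the patch itself.
class BlockIdLayout {

    /**
     * Maps a block ID to a two-level directory under root.
     * Assumption: bits 16-23 of the ID pick the first level and
     * bits 8-15 pick the second, giving 256 x 256 possible directories.
     */
    static File idToBlockDir(File root, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);
        int d2 = (int) ((blockId >> 8) & 0xFF);
        return new File(new File(root, "subdir" + d1), "subdir" + d2);
    }

    public static void main(String[] args) {
        // 0x123456: bits 16-23 are 0x12 (18), bits 8-15 are 0x34 (52).
        System.out.println(idToBlockDir(new File("finalized"), 0x123456L));
    }
}
```

Because the path is a pure function of the block ID, a replica's location can 
be recomputed on demand rather than stored, which is the point the description 
makes about dropping LDir and the location fields in ReplicaInfo.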



--
This message was sent by Atlassian JIRA
(v6.2#6252)
