[https://issues.apache.org/jira/browse/HDFS-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345870#comment-17345870]

Konstantin Shvachko edited comment on HDFS-14703 at 7/14/21, 5:28 PM:
----------------------------------------------------------------------

I ran some performance benchmarks on a physical server (a d430 node in the 
[Utah Emulab testbed|http://www.emulab.net]), using either a RAMDISK or an SSD 
as the storage for HDFS. The RAMDISK configuration eliminates the time the SSD 
would otherwise spend making each write persistent. In the RAMDISK case we 
observed a 45% improvement from fine-grained locking; in the SSD case, 
fine-grained locking gives a 23% improvement. The SSD is an Intel drive 
(model: SSDSC2BX200G4R).

We noticed that on trunk, mkdirs ops/sec is lower with the RAMDISK than with 
the SSD. We don't know the reason for this yet; we repeated the RAMDISK 
experiment on trunk twice to confirm the number.
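For context, numbers like these come from NNThroughputBenchmark's mkdirs operation. A typical invocation looks roughly like the following; the exact thread and directory counts used for these runs are not recorded in the logs below, so the values shown are placeholders:

```shell
# Sketch of an NNThroughputBenchmark mkdirs run against a running NameNode.
# The -threads and -dirs values here are illustrative, not the ones used above.
hdfs org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark \
  -fs hdfs://localhost:9000 \
  -op mkdirs -threads 64 -dirs 10000000
```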
h2. tmpfs, hadoop.tmp.dir = /run/hadoop-utos
h3. 45% improvement fgl vs. trunk
trunk:
{noformat:nowrap}
2021-05-16 20:37:20,280 INFO namenode.NNThroughputBenchmark: # operations: 10000000
2021-05-16 20:37:20,280 INFO namenode.NNThroughputBenchmark: Elapsed Time: 663510
2021-05-16 20:37:20,280 INFO namenode.NNThroughputBenchmark:  Ops per sec: 15071.362
2021-05-16 20:37:20,280 INFO namenode.NNThroughputBenchmark: Average Time: 13
2021-05-16 22:15:13,515 INFO namenode.NNThroughputBenchmark: — mkdirs stats  —
2021-05-16 22:15:13,515 INFO namenode.NNThroughputBenchmark: # operations: 10000000
2021-05-16 22:15:13,515 INFO namenode.NNThroughputBenchmark: Elapsed Time: 710248
2021-05-16 22:15:13,515 INFO namenode.NNThroughputBenchmark:  Ops per sec: 14079.5
2021-05-16 22:15:13,515 INFO namenode.NNThroughputBenchmark: Average Time: 14
2021-05-16 22:15:13,515 INFO namenode.FSEditLog: Ending log segment 8345565, 10019540
{noformat}

fgl:
{noformat:nowrap}
2021-05-16 21:06:46,476 INFO namenode.NNThroughputBenchmark: — mkdirs stats  —
2021-05-16 21:06:46,476 INFO namenode.NNThroughputBenchmark: # operations: 10000000
2021-05-16 21:06:46,476 INFO namenode.NNThroughputBenchmark: Elapsed Time: 445980
2021-05-16 21:06:46,476 INFO namenode.NNThroughputBenchmark:  Ops per sec: 22422.530
2021-05-16 21:06:46,476 INFO namenode.NNThroughputBenchmark: Average Time: 8
{noformat}

h2. SSD, hadoop.tmp.dir=/dev/sda4
h3. 23% improvement fgl vs. trunk

trunk:
{noformat:nowrap}
2021-05-16 21:59:06,042 INFO namenode.NNThroughputBenchmark: — mkdirs stats  —
2021-05-16 21:59:06,042 INFO namenode.NNThroughputBenchmark: # operations: 10000000
2021-05-16 21:59:06,042 INFO namenode.NNThroughputBenchmark: Elapsed Time: 593839
2021-05-16 21:59:06,042 INFO namenode.NNThroughputBenchmark:  Ops per sec: 16839.581
2021-05-16 21:59:06,042 INFO namenode.NNThroughputBenchmark: Average Time: 11
{noformat}

fgl:
{noformat:nowrap}
2021-05-16 21:21:03,906 INFO namenode.NNThroughputBenchmark: — mkdirs stats  —
2021-05-16 21:21:03,906 INFO namenode.NNThroughputBenchmark: # operations: 10000000
2021-05-16 21:21:03,906 INFO namenode.NNThroughputBenchmark: Elapsed Time: 481269
2021-05-16 21:21:03,906 INFO namenode.NNThroughputBenchmark:  Ops per sec: 20778.400
2021-05-16 21:21:03,906 INFO namenode.NNThroughputBenchmark: Average Time: 9
{noformat}
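As a sanity check, the throughput and improvement figures can be recomputed from the logged "# operations" and "Elapsed Time" (milliseconds) values. A small shell sketch using the SSD numbers above:

```shell
# Recompute ops/sec from the logged totals: ops / (elapsed_ms / 1000)
ops=10000000
trunk_ms=593839   # SSD, trunk
fgl_ms=481269     # SSD, fgl
trunk=$(awk -v o="$ops" -v t="$trunk_ms" 'BEGIN { printf "%.3f", o / (t / 1000) }')
fgl=$(awk -v o="$ops" -v t="$fgl_ms" 'BEGIN { printf "%.3f", o / (t / 1000) }')
imp=$(awk -v a="$trunk" -v b="$fgl" 'BEGIN { printf "%.1f", (b / a - 1) * 100 }')
echo "trunk=$trunk ops/sec, fgl=$fgl ops/sec, improvement=${imp}%"
```

This reproduces the logged 16839.581 and 20778.400 ops/sec and an improvement of about 23%, matching the heading above. (The same arithmetic on the RAMDISK runs gives roughly 49%, a bit above the 45% quoted.)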
 
{noformat:nowrap}
/dev/sda:
ATA device, with non-removable media
Model Number:       INTEL SSDSC2BX200G4R
Serial Number:      BTHC523202RD200TGN
Firmware Revision:  G201DL2D
{noformat}


> NameNode Fine-Grained Locking via Metadata Partitioning
> -------------------------------------------------------
>
>                 Key: HDFS-14703
>                 URL: https://issues.apache.org/jira/browse/HDFS-14703
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, namenode
>            Reporter: Konstantin Shvachko
>            Priority: Major
>         Attachments: 001-partitioned-inodeMap-POC.tar.gz, 
> 002-partitioned-inodeMap-POC.tar.gz, 003-partitioned-inodeMap-POC.tar.gz, 
> NameNode Fine-Grained Locking.pdf, NameNode Fine-Grained Locking.pdf
>
>
> We target to enable fine-grained locking by splitting the in-memory namespace 
> into multiple partitions each having a separate lock. Intended to improve 
> performance of NameNode write operations.


